Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure remote hosts have our keys #679

Closed
wants to merge 1 commit into from
Closed

Conversation

simo5
Copy link
Contributor

@simo5 simo5 commented Mar 31, 2017

In complex replication setups a replica may try to obtain CA keys from a
host that is not the master we initially create the keys against.
In this case race conditions may happen due to replication. So we need
to make sure the server we are contacting to get the CA keys has our
keys in LDAP. We do this by waiting to positively fetch our encryption
public key (the last one we create) from the target host LDAP server.

Fixes: https://pagure.io/freeipa/issue/6838

Signed-off-by: Simo Sorce simo@redhat.com

@simo5 simo5 force-pushed the cakeysfix branch 2 times, most recently from cefe3df to f2835bf Compare March 31, 2017 16:30
@simo5
Copy link
Contributor Author

simo5 commented Mar 31, 2017

I haven't tested this yet ... but what could possibily go wrong? :-)

tiran
tiran previously requested changes Apr 3, 2017
Copy link
Member

@tiran tiran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ipaserver/install/custodiainstance.py:135: [W0612(unused-variable), CustodiaInstance.__wait_keys] Unused variable 'result')

ipaserver/install/custodiainstance.py:155: [E0602(undefined-variable), CustodiaInstance.__get_keys] Undefined variable 'sel')

ipaserver/install/custodiainstance.py:9: [W0611(unused-import), ] Unused ipaldap imported from ipapython)

ipaserver/install/custodiainstance.py:14: [W0611(unused-import), ] Unused replication imported from ipaserver.install)

@pvoborni
Copy link
Member

pvoborni commented Apr 4, 2017

Shouldn't the ticket number be: https://pagure.io/freeipa/issue/6838 ?

@simo5
Copy link
Contributor Author

simo5 commented Apr 4, 2017

Seem like both errors are the same problem.
Should we mark 6688 a duplicate of 6838 ?

@simo5
Copy link
Contributor Author

simo5 commented Apr 4, 2017

Nevermind they are not duplicates.
I'll fix the commit message.

Copy link
Member

@tiran tiran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I wouldn't mind more logging.

@stlaz stlaz self-assigned this Apr 12, 2017
@stlaz
Copy link
Contributor

stlaz commented Apr 12, 2017

Fails with

2017-04-12T14:16:14Z DEBUG The ipa-replica-install command failed, exception: ValueError: Incorrect number of results (0) searching forpublic key for host/vm-225.abc.idm.lab.eng.brq.redhat.com@DOM-096.ABC.IDM.LAB.ENG.BRQ.REDHAT.COM

on first replica, every try.

try:
konn.get_key(KEY_USAGE_ENC, principal)
return
except Exception:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use a more specific exception? I noticed that should not keys be retrieved, we're getting ValueError. We don't want to be waiting here if we're unable to connect to remote master.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you suggest ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that if we do not wait we fail intallation, not sure we gain anything by not waiting, and we can deal with a transient connection error if we wait.

Copy link
Contributor

@stlaz stlaz Apr 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't suggest not waiting, of course. I would have gone for just ValueError but you're probably right that the connection error might appear.
It might be worth to do this, at least:

exc = None
while int(time.time()) < deadline:
    try:
        konn.get_key(KEY_USAGE_ENC, principal)
        break
    except Exception as e:
        if not isinstance(e, ValueError):
            root_logger.debug("Failed to get keys: '{err}'".format(err=e))
        exc = e
        time.sleep(1)
else:
    if exc is not None:
        raise exc
    else:
        raise RuntimeError("Unable to obtain keys in expected time.")

Note that this is solely for debugging reasons, I just don't like so broad exceptions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I handled this (but differently) and rebased on top of current master.
Please check it out @stlaz

@pvoborni
Copy link
Member

pvoborni commented May 2, 2017

What is this PR waiting for?

@stlaz
Copy link
Contributor

stlaz commented May 2, 2017

I was expecting some action about my previous comment:

Fails with
2017-04-12T14:16:14Z DEBUG The ipa-replica-install command failed, exception: ValueError: Incorrect number of results (0) searching forpublic key for host/vm-225.abc.idm.lab.eng.brq.redhat.com@DOM-096.ABC.IDM.LAB.ENG.BRQ.REDHAT.COM
on first replica, every try.

I did not see any change in code to fix this but I can try again.

@stlaz
Copy link
Contributor

stlaz commented May 2, 2017

Still fails.

@simo5
Copy link
Contributor Author

simo5 commented May 2, 2017

Can you please attach more of the logs before the failure ?

@simo5
Copy link
Contributor Author

simo5 commented May 2, 2017

@stlaz just FYI, I am sking this info because I cannot reproduce locally with a single replica.

@simo5
Copy link
Contributor Author

simo5 commented May 2, 2017

Nevermind I finally reproduced

In complex replication setups a replica may try to obtain CA keys from a
host that is not the master we initially create the keys against.
In this case race conditions may happen due to replication. So we need
to make sure the server we are contacting to get the CA keys has our
keys in LDAP. We do this by waiting to positively fetch our encryption
public key (the last one we create) from the target host LDAP server.

Fixes: https://pagure.io/freeipa/issue/6838

Signed-off-by: Simo Sorce <simo@redhat.com>
@simo5
Copy link
Contributor Author

simo5 commented May 2, 2017

Turned out my master had some more relaxed permissions I added when developing the feature.
I now have added a new function to just check for the host keys without asking for data that cannot be read with the identity we have available.
This has been tested and seems to work correctly.
Please check @stlaz

@stlaz
Copy link
Contributor

stlaz commented May 3, 2017

@simo5 will check, sorry for not replying yesterday, I was no more at my machine.

if len(r) != 1:
raise ValueError("Incorrect number of results (%d) searching for"
"public key for %s" % (len(r), host))
return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return True seems a bit redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove it.

@stlaz
Copy link
Contributor

stlaz commented May 3, 2017

Seems to work fine against current master, but fails with

Configuring ipa-custodia
  [1/4]: Generating ipa-custodia config file
  [2/4]: Generating ipa-custodia keys
  [3/4]: starting ipa-custodia 
  [4/4]: configuring ipa-custodia to start on boot
Done configuring ipa-custodia.
Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

ipa.ipapython.install.cli.install_tool(CompatServerReplicaInstall): ERROR    503 Server Error: Service Unavailable for url: https://master.domain.com/ipa/keys/ca/caSigningCert%20cert-pki-ca?type=kem&value=eyJhbGciOiJSU0EtT0FFUCIsImVuYyI6IkEyNTZDQkMtSFM1MTIiLCJraWQiOm51bGx9.k6y2jmI8oxRIsieU93_RzG5mZU_u_DPW2XL2jjLukYPZ3oZOkLkufof0fBeH6LAR66aL9m5C9j26GmhlTqNsm2FUQT7Xql975rYR3veooDwLQlPx6k4X1J4CTEeSsf7RVj8KfLE5e4K-nW1hTyepsbm7RDAA_-tbLvWzEqCQ0I3bfpPEDmlML08FA9T_yuPb1FkT0-lSCLV5PHya4tOB3R2q5CHC2b6BpwZQtbVW8eohshEmJMTO2NMAyPlfJscgSHYmhi6oliToV_Dh90Ej1UH_S0UOkHLsvIV5IoW4EGeaGdeHwHo4GsSGHGN3exVxWk9GShhJ_WJ-dlXSGQ_9CA.SfWWO_VrqzKKX3EYSh3E1Q.n4GtjcFZOQSZmAG9MShIQVtfRv_N3jEQMS46rLGUU6xIS-BYBL0Xq1UWP6VFrZW-g96Iqe2PIBhv4m1FsuAzP_gzac1lCr2ghcVuj3rAUg81G5s8vPuYNl_Ur5UVlQ2LtWzGLc26s1z_43MF7qCl8iayvXqnweK8_kj54F1RUJ-Awp0--Z4mnK_FFrPU4BBW2_EjZ1tOR8dV7NnxnN2Gd2tiDFl6Kkbj91rf6Bo2f8telN5RJsX52PsNW2z-l78TOIAKY4qfHhSVz31RO3xgUbyu3yQ79sGIxD66hzmVisB_LnbpNHbIjCP1wKEXXSo-IPrDtXk7ZWZrEITtItzynbzBKddVLjcNMjoqGz-lhLWVNg8R8rdHEdUzhlkdM-kFfW6Fz57wSyOZnt4KvQ-lZxY62TLQB1gqJ7vhzUPUs1g7C9rsy4gTQPjuRxXnLRvqXSb3arQPkrUl_hLqRuAm8FL-ClYY9G38KVns81QTygKvkDC8E5LQBJfyzkg93AyTXNBcrdCxP8AGgaxLBlGyEX-ya0g3mVX5fz_Uj6gyKjtOS_x1AUHOMkAMRmVEzvixrz-krCMWYOQDmJi19OlNeNjb7-NUVDxPRryr7e6Po2OqSbSjP6kUSw_QbMZf8BCrqV4TUFOwndTmZ68n1TOrCqie-UO71TJnherD_3m60_t3-Li1uy6_WWX66BBEMCCtsZBJWP7OYj7c9CzWGuzUEI7g75i4TZwoM1z0SjuyoPE.ZbRawj1B943OeF6AD_W0Z3pfk13fs14rbj_Ab8n-ZXI
ipa.ipapython.install.cli.install_tool(CompatServerReplicaInstall): ERROR    The ipa-replica-install command failed. See /var/log/ipareplica-install.log for more information

against 4.4.4 master.

@simo5
Copy link
Contributor Author

simo5 commented May 3, 2017

I've seen this once but thought it was a fluke due to my "unclean" master, as the following times it did not happen.
Can you reproduce the error against 4.4.4 consistently ?

@stlaz
Copy link
Contributor

stlaz commented May 3, 2017

I was able to do it two times in a row with the same master, I can try to reinstall both the master and replica if you want. What do you mean "unclean"? It's a clean 4.4.4 master, no code changes, /etc/httpd/alias and /etc/pki/pki-tomcat/alias NSS databases seem fine, too.

edit: both have "custodia-0.3.1-1.fc25.noarch"

@simo5
Copy link
Contributor Author

simo5 commented May 3, 2017

I meant my setup was unclean.
I will try to reproduce here.
Does master w/o this patch work properly against 4.4.4 ?

@stlaz
Copy link
Contributor

stlaz commented May 3, 2017

Not sure, I will try that.

@stlaz
Copy link
Contributor

stlaz commented May 3, 2017

It seems that replica install fails even without this patch so it's OK to go with it?

@simo5
Copy link
Contributor Author

simo5 commented May 3, 2017

We need to find why it breaks though, but yeah I think we can go forward with this patch of others agree.
Can you open a separate bug for the failure you got ?

@stlaz
Copy link
Contributor

stlaz commented May 3, 2017

Will do, ACKing this in the meantime.

@stlaz stlaz added ack Pull Request approved, can be merged and removed ack Pull Request approved, can be merged labels May 3, 2017
@stlaz
Copy link
Contributor

stlaz commented May 3, 2017

Removing the ACK to retest on 4.4.4 with Fedora custodia version.

@stlaz stlaz added the ack Pull Request approved, can be merged label May 3, 2017
@tkrizek
Copy link
Contributor

tkrizek commented May 3, 2017

ipa-4-5:

  • 5f8d111 Make sure remote hosts have our keys

master:

  • 1f9f84a Make sure remote hosts have our keys

@tkrizek tkrizek added the pushed Pull Request has already been pushed label May 3, 2017
@tkrizek tkrizek closed this May 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ack Pull Request approved, can be merged pushed Pull Request has already been pushed
Projects
None yet
5 participants