Make sure remote hosts have our keys #679
cefe3df to f2835bf
|
I haven't tested this yet ... but what could possibly go wrong? :-) |
ipaserver/install/custodiainstance.py:135: [W0612(unused-variable), CustodiaInstance.__wait_keys] Unused variable 'result')
ipaserver/install/custodiainstance.py:155: [E0602(undefined-variable), CustodiaInstance.__get_keys] Undefined variable 'sel')
ipaserver/install/custodiainstance.py:9: [W0611(unused-import), ] Unused ipaldap imported from ipapython)
ipaserver/install/custodiainstance.py:14: [W0611(unused-import), ] Unused replication imported from ipaserver.install)
|
Shouldn't the ticket number be: https://pagure.io/freeipa/issue/6838 ? |
|
Seems like both errors are the same problem. |
|
Never mind, they are not duplicates. |
LGTM, but I wouldn't mind more logging.
|
Fails with on first replica, every try. |
    try:
        konn.get_key(KEY_USAGE_ENC, principal)
        return
    except Exception:
Could you use a more specific exception? I noticed that when the keys cannot be retrieved, we get a ValueError. We don't want to keep waiting here if we're unable to connect to the remote master.
What do you suggest?
Note that if we do not wait, we fail the installation. I'm not sure we gain anything by not waiting, and we can deal with a transient connection error if we wait.
I don't suggest not waiting, of course. I would have gone for just ValueError, but you're probably right that a connection error might appear.
It might be worth doing at least this:
    exc = None
    while int(time.time()) < deadline:
        try:
            konn.get_key(KEY_USAGE_ENC, principal)
            break
        except Exception as e:
            if not isinstance(e, ValueError):
                root_logger.debug("Failed to get keys: '{err}'".format(err=e))
                exc = e
            time.sleep(1)
    else:
        if exc is not None:
            raise exc
        else:
            raise RuntimeError("Unable to obtain keys in expected time.")

Note that this is solely for debugging reasons, I just don't like such broad exceptions.
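For readers who want to try the idea outside the installer, here is a minimal self-contained sketch of the same retry pattern. It is not the code that was merged: the function name `wait_for_keys`, its parameters, and the `fetch` callable (standing in for `konn.get_key(KEY_USAGE_ENC, principal)`) are all assumptions made for illustration. It treats `ValueError` as "keys not replicated yet", logs and remembers anything else, and only gives up once the deadline passes.

```python
import logging
import time

logger = logging.getLogger(__name__)


def wait_for_keys(fetch, timeout=300.0, interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it succeeds or `timeout` seconds have elapsed.

    `fetch` is a hypothetical zero-argument callable standing in for the
    remote key lookup. ValueError is taken to mean "keys not there yet"
    and is retried quietly; any other exception is logged, remembered,
    and re-raised if the deadline passes without a success.
    """
    deadline = clock() + timeout
    last_error = None
    while clock() < deadline:
        try:
            return fetch()
        except ValueError:
            # Expected while replication has not caught up; just retry.
            pass
        except Exception as e:
            logger.debug("Failed to get keys: %r", e)
            last_error = e
        sleep(interval)
    # Deadline reached: prefer the last unexpected error over a generic one.
    if last_error is not None:
        raise last_error
    raise RuntimeError("Unable to obtain keys in expected time.")
```

Passing `clock` and `sleep` as parameters is only a convenience for testing the loop without real delays; the suggestion above uses `time.time()` and `time.sleep(1)` directly.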
Ok I handled this (but differently) and rebased on top of current master.
Please check it out @stlaz
|
What is this PR waiting for? |
|
I was expecting some action about my previous comment:
I did not see any change in code to fix this but I can try again. |
|
Still fails. |
|
Can you please attach more of the logs before the failure? |
|
@stlaz just FYI, I am asking for this info because I cannot reproduce locally with a single replica. |
|
Never mind, I finally reproduced it. |
In complex replication setups a replica may try to obtain CA keys from a
host that is not the master we initially create the keys against.
In this case race conditions may happen due to replication. So we need
to make sure the server we are contacting to get the CA keys has our
keys in LDAP. We do this by waiting to positively fetch our encryption
public key (the last one we create) from the target host LDAP server.

Fixes: https://pagure.io/freeipa/issue/6838
Signed-off-by: Simo Sorce <simo@redhat.com>
|
Turned out my master had some more relaxed permissions I added when developing the feature. |
|
@simo5 will check, sorry for not replying yesterday, I was no longer at my machine. |
    if len(r) != 1:
        raise ValueError("Incorrect number of results (%d) searching for "
                         "public key for %s" % (len(r), host))
    return True
The return True seems a bit redundant.
I can remove it.
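To illustrate why the `return True` adds nothing, here is a small sketch of a check that signals failure only by raising. The name `check_single_result`, the `results` list, and the hostname are hypothetical and not taken from the patch; the point is only that a caller which wraps the call in `try`/`except` never needs to look at a return value.

```python
def check_single_result(results, host):
    """Raise ValueError unless exactly one entry was found for `host`.

    `results` is a hypothetical list of search results; the real code
    works on the result of an LDAP search.
    """
    if len(results) != 1:
        raise ValueError(
            "Incorrect number of results (%d) searching for "
            "public key for %s" % (len(results), host))
    # No `return True` here: returning normally already means success.


try:
    check_single_result([], "replica.example.test")
except ValueError as e:
    print(e)
# prints: Incorrect number of results (0) searching for public key for replica.example.test
```

With this shape, a polling loop can treat "no exception" as success and `ValueError` as "not replicated yet".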
|
Seems to work fine against current master, but fails with against 4.4.4 master. |
|
I've seen this once but thought it was a fluke due to my "unclean" master, as the following times it did not happen. |
|
I was able to do it two times in a row with the same master, I can try to reinstall both the master and replica if you want. What do you mean "unclean"? It's a clean 4.4.4 master, no code changes, edit: both have "custodia-0.3.1-1.fc25.noarch" |
|
I meant my setup was unclean. |
|
Not sure, I will try that. |
|
It seems that replica install fails even without this patch so it's OK to go with it? |
|
We need to find out why it breaks, though, but yeah, I think we can go forward with this patch if others agree. |
|
Will do, ACKing this in the meantime. |
|
Removing the ACK to retest on 4.4.4 with Fedora custodia version. |