ceph-salt can no longer bootstrap pacific as of ceph commit 5c9227bbdf987365ae90433ac4e501e1a88b7aca #463
Comments
|
The commit ceph/ceph@5c9227bbdf98 originally comes from ceph/ceph#42772, which was added to fix https://tracker.ceph.com/issues/51667. The description of that PR says: which is rather cryptic, but I did confirm that ceph-salt is not issuing any What's weird is that the error is being triggered when the host is first being added. The PR description seems to indicate it was intended to fix a bug in subsequent host modification. |
|
From what I can see, the linked PR includes a check that requires explicitly providing an address when calling "ceph orch host add" for the host the active mgr is running on. I think this was to prevent issues where the active mgr, when trying to resolve its own ip, would resolve to a loopback address rather than a proper ip. During cephadm bootstrap this typically isn't an issue because we explicitly pass the --mon-ip arg as the address for the bootstrap host (where the active mgr is at the time). If ceph-salt is doing things differently and just calling "ceph orch host add" without an explicit ip addr for the bootstrap host then it would trigger this. Perhaps that check is a bit too severe and we should only throw errors if the address actually resolves to a loopback address? |
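The narrower check suggested above (fail only when the name actually resolves to a loopback address) could be sketched roughly as follows. This is not the mgr/cephadm code: `is_loopback_address` and `resolves_to_loopback` are hypothetical helper names, and the sketch relies only on Python's standard `socket` and `ipaddress` modules.

```python
import ipaddress
import socket


def is_loopback_address(addr: str) -> bool:
    """Return True if addr is a loopback IP such as 127.0.0.1 or ::1."""
    # Drop an IPv6 zone suffix (e.g. "fe80::1%eth0") before parsing.
    return ipaddress.ip_address(addr.split("%", 1)[0]).is_loopback


def resolves_to_loopback(hostname: str) -> bool:
    """Return True if any address the name resolves to is loopback."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # An unresolvable name is not loopback; the caller decides what to do.
        return False
    return any(is_loopback_address(info[4][0]) for info in infos)
```

With a check like this, a caller would only need to reject the implicit lookup when `resolves_to_loopback(hostname)` is true, and could otherwise accept the resolved address.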
|
@sebastian-philipp This is happening post-bootstrap, when ceph-salt tries to add the cluster hosts to the orchestrator. |
How would this be done, syntactically speaking? UPDATE: nvm All clear now. But we'll need to patch ceph-salt for this. |
|
Hm, the thing is: we should not make this a requirement, as it is often just not necessary, because we can deduce the IP of many hosts properly anyway. |
|
Yeah, I think many will be surprised when they hit this error, especially since the error message:
At the very least it could say something like: "re-run the command providing the host's IP address as a second argument". |
The upstream cephadm developers recently added a commit - 0facfac91fd8f71e5a8b869d818e7c2b07b93516 - which was backported to pacific and was found to be causing our "ceph orch host add" command to fail because we weren't specifying the IP address explicitly. While the developers are now discussing whether the command really should fail in this way, it seems prudent to give our IP address here explicitly, since mgr/cephadm is going to try to resolve the hostname, anyway.
Fixes: ceph#463
Signed-off-by: Nathan Cutler <ncutler@suse.com>
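As a rough illustration of passing the address explicitly, the command line can be assembled with the IP as the second positional argument to "ceph orch host add". This is a sketch only, not the actual ceph-salt patch: `orch_host_add_cmd` is a hypothetical helper, and the hostname and IP in the usage note are made-up example values.

```python
from typing import List, Optional


def orch_host_add_cmd(hostname: str, addr: Optional[str] = None) -> List[str]:
    """Build the argv for 'ceph orch host add <hostname> [<addr>]'."""
    cmd = ["ceph", "orch", "host", "add", hostname]
    if addr:
        # Passing the address explicitly avoids relying on the mgr
        # resolving the hostname on its own.
        cmd.append(addr)
    return cmd
```

For example, `orch_host_add_cmd("master", "10.20.0.5")` yields the argv for `ceph orch host add master 10.20.0.5`, which could then be handed to `subprocess.run`.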
|
This error message will only appear when trying to re-add the host that the active mgr is running on. We do not want to have the active mgr try to resolve its own IP, as there is some weirdness with the /etc/hosts file inside containers that can result in the IP resolving to 127.0.0.1, which is not correct. I'm not sure we should be explaining the weirdness of the container /etc/hosts file in this error message, as that would make for a long, wordy error message, and the error does say the user needs to provide an IP address to make this command succeed. If ceph-salt is re-adding the host the active mgr is running on, it needs to explicitly provide the IP address.
As of ceph/ceph@5c9227bbdf98 ceph-salt can no longer bootstrap pacific clusters:
ANALYSIS:
The relevant yaml is:
which triggers the following Salt State code:
and
my_hostname is getting populated by this line:
which in turn triggers the following function:
All this seems to mean that the hostname returned by socket.gethostname() is passed to the following function, add_host:
Now, when I try socket.gethostname() in the sesdev environment, it returns the short hostname (master). And it seems that cephadm is expecting to get the short hostname here. But I couldn't find any log message to confirm that ceph-salt is really sending the short hostname.
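One quick way to see what a given node would report is to compare the raw socket.gethostname() value with its short form and with the FQDN. The snippet below is only a diagnostic sketch using the standard `socket` module; `short_hostname` is a hypothetical helper and is not part of ceph-salt.

```python
import socket


def short_hostname(name: str) -> str:
    """Return the part of a hostname before the first dot."""
    return name.split(".", 1)[0]


if __name__ == "__main__":
    raw = socket.gethostname()
    # If raw and its short form differ, the node reports an FQDN
    # rather than a short hostname.
    print("gethostname():", raw)
    print("short form:   ", short_hostname(raw))
    print("getfqdn():    ", socket.getfqdn())
```

Running this on the node in question shows directly whether the value handed on by socket.gethostname() is already the short name.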