potentially invalid ceph.conf after moving monitors #199

Closed
jschmid1 opened this issue Apr 22, 2020 · 9 comments · Fixed by #291
Labels: blocked-by-cephadm, enhancement (New feature or request)

Comments

@jschmid1
Contributor

jschmid1 commented Apr 22, 2020

I used ceph-salt to deploy the bootstrap services on a master node, then used cephadm to spin up mons on node{1..3} with ceph orch apply mon node1,node2,node3. Now the master's ceph.conf is outdated (it still points to the initial bootstrap mon), so any ceph command hangs, because it relies on ceph.conf to retrieve the mon addresses.

There is already a cephadm tracker issue for a related problem: https://tracker.ceph.com/issues/44792
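To make the failure mode concrete, here is a minimal Python sketch that checks whether the mon_host entry in a ceph.conf still matches the monitors currently in the cluster. The function name stale_mon_hosts, the sample addresses, and the simplified comma-separated mon_host format are assumptions for illustration; real files may use the bracketed [v2:...,v1:...] address form, which this sketch does not parse.

```python
import configparser


def stale_mon_hosts(conf_text, current_mons):
    """Return mon_host entries from conf_text that are not in current_mons."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Ceph accepts both "mon_host" and "mon host"; check either spelling.
    raw = ""
    for key in ("mon_host", "mon host"):
        if parser.has_option("global", key):
            raw = parser.get("global", key)
            break
    configured = {h.strip() for h in raw.split(",") if h.strip()}
    return sorted(configured - set(current_mons))


# Hypothetical ceph.conf that still points at the bootstrap mon only.
SAMPLE_CONF = """
[global]
fsid = 00000000-0000-0000-0000-000000000000
mon_host = 10.0.0.10
"""

# Hypothetical addresses of the mons after they were moved to node1..node3.
print(stale_mon_hosts(SAMPLE_CONF, ["10.0.0.11", "10.0.0.12", "10.0.0.13"]))
# prints ['10.0.0.10']
```

A non-empty result means the file names at least one address that is no longer a mon, which is the situation described above.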

@jschmid1 jschmid1 added the enhancement New feature or request label Apr 22, 2020
@ricardoasmarques ricardoasmarques self-assigned this Apr 28, 2020
ricardoasmarques added a commit to ricardoasmarques/ceph-salt that referenced this issue Apr 28, 2020
Fixes: ceph#199

Signed-off-by: Ricardo Marques <rimarques@suse.com>
@ricardoasmarques
Contributor

Should be fixed by https://tracker.ceph.com/issues/45378

@denisok

denisok commented Jun 2, 2020

We had the same issue during our testing; there is a disconnect between the two tools, ceph-salt and ceph orch.
We also had the issue of ceph-salt not being able to connect to the Ceph cluster and hanging without any error. It looks like it runs ceph orch status, which never returns (at least we weren't patient enough to see it return).

Could it at least show an error?

Even -l debug didn't help us find out what was wrong; we just noticed that it spawns that ceph orch status process.

@smithfarm
Contributor

@denisok Did you try running ceph orch status directly from the command line (not via ceph-salt)?

@denisok

denisok commented Jun 2, 2020

Sure, it was hanging because ceph.conf pointed to the wrong IP, as ceph orch apply mon changed that mon :)

smithfarm added a commit to smithfarm/ceph-salt that referenced this issue Jun 2, 2020
The "ceph orch status" command is known to hang if /etc/ceph/ceph.conf
on the host points to hosts that used to be MONs but are not anymore.

It's debatable what an appropriate timeout is (30 seconds, more, less?)
but commands that are known to hang should have an explicit timeout.

References: ceph#199
Signed-off-by: Nathan Cutler <ncutler@suse.com>
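The explicit timeout described in this commit message can be sketched in plain Python as follows. This is not the code from the commit; the helper name run_with_timeout, the returned dictionary shape (loosely modeled on what Salt's cmd.run_all returns), and the use of exit code 124 for a timeout are assumptions for illustration. A sleeping child process stands in for a hanging ceph orch status.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=30):
    """Run cmd (a list of arguments) and give up after timeout_s seconds."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left hanging.
        return {
            "retcode": 124,  # assumed convention, borrowed from GNU timeout
            "stdout": "",
            "stderr": "timed out after %ss: %s" % (timeout_s, " ".join(cmd)),
        }
    return {"retcode": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}


# Stand-in for a hanging "ceph orch status": a child process that just sleeps.
hanging = [sys.executable, "-c", "import time; time.sleep(5)"]
result = run_with_timeout(hanging, timeout_s=0.5)
print(result["retcode"], result["stderr"])
```

The caller gets a non-zero return code and a message it can log, instead of blocking indefinitely.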
@matthewoliver

OK, so when we run something like:
ceph orch apply mon node2,node3

Does this mean "run mons only on these nodes" rather than "also run mons on these nodes"? If so, that doesn't seem intuitive to me.

Changing /etc/ceph/ceph.conf to point at one of the new mons worked, however. When I ran cephadm it didn't seem to work; should that have created a new and correct cephadm?
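The behavior reported in this thread (the bootstrap mon going away after ceph orch apply mon ...) is consistent with the host list being treated as the complete desired placement rather than as an addition. The toy sketch below illustrates that interpretation only; plan_mon_changes and the host names are hypothetical, and this is not cephadm's code.

```python
def plan_mon_changes(current_hosts, requested_hosts):
    """Compare current and requested mon hosts under 'exactly these hosts' semantics."""
    current, requested = set(current_hosts), set(requested_hosts)
    return {
        "deploy": sorted(requested - current),  # hosts that gain a mon
        "remove": sorted(current - requested),  # hosts that lose their mon
        "keep": sorted(current & requested),    # hosts left unchanged
    }


# One bootstrap mon on "master", then the placement is set to node2,node3.
plan = plan_mon_changes(["master"], ["node2", "node3"])
print(plan)
# prints {'deploy': ['node2', 'node3'], 'remove': ['master'], 'keep': []}
```

Under this reading, keeping the bootstrap mon would require listing its host explicitly alongside the new ones.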

@denisok

denisok commented Jun 3, 2020

And even when we removed ceph.conf, it still hung :)

ret = __salt__['cmd.run_all']("ceph orch status")

It looks like that call is in some kind of Salt state that tries to apply it over and over again, disregarding that it fails?

We probably need some kind of info message and a timeout.
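One way to avoid retrying silently forever is to bound the number of attempts and surface the last error. The sketch below is only an illustration of that idea, not ceph-salt's implementation; check_with_retries, its attempts parameter, and the fake_status stand-in are hypothetical, and the result dictionary is assumed to carry retcode and stderr keys.

```python
def check_with_retries(run, attempts=3):
    """Call run() up to `attempts` times; raise with the last stderr on failure."""
    last = None
    for attempt in range(1, attempts + 1):
        last = run()
        if last["retcode"] == 0:
            return last
        # Report every failed attempt instead of looping silently.
        print("attempt %d/%d failed: %s" % (attempt, attempts, last["stderr"]))
    raise RuntimeError(
        "giving up after %d attempts: %s" % (attempts, last["stderr"])
    )


def fake_status():
    """Hypothetical stand-in for a status check that cannot reach any mon."""
    return {"retcode": 1, "stderr": "cannot reach any mon listed in ceph.conf"}


try:
    check_with_retries(fake_status, attempts=3)
except RuntimeError as exc:
    print("error:", exc)
```

In a real state, run would wrap the actual status command, ideally one that itself has a timeout.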

@ricardoasmarques
Contributor

I wonder if changing it to cephadm shell -- ceph orch status is an alternative to the timeout; see my comment on the following commit: smithfarm@5e7fa83. This needs to be tested.

@denisok

denisok commented Jun 3, 2020

Maybe, but an error message would also be nice.

@ricardoasmarques
Contributor

cephadm PR to automatically keep /etc/ceph/ceph.conf in sync on all hosts: ceph/ceph#35576

ricardoasmarques added a commit to ricardoasmarques/ceph-salt that referenced this issue Jul 7, 2020
Fixes: ceph#199

Signed-off-by: Ricardo Marques <rimarques@suse.com>
ricardoasmarques added a commit to ricardoasmarques/ceph-salt that referenced this issue Jul 8, 2020
Fixes: ceph#199

Signed-off-by: Ricardo Marques <rimarques@suse.com>
ricardoasmarques added a commit to ricardoasmarques/ceph-salt that referenced this issue Jul 24, 2020
Fixes: ceph#199

Signed-off-by: Ricardo Marques <rimarques@suse.com>
ricardoasmarques added a commit to ricardoasmarques/ceph-salt that referenced this issue Jul 30, 2020
Fixes: ceph#199

Signed-off-by: Ricardo Marques <rimarques@suse.com>
ricardoasmarques added a commit to ricardoasmarques/ceph-salt that referenced this issue Jul 31, 2020
Fixes: ceph#199

Signed-off-by: Ricardo Marques <rimarques@suse.com>
ricardoasmarques added a commit to ricardoasmarques/ceph-salt that referenced this issue Aug 10, 2020
Fixes: ceph#199

Signed-off-by: Ricardo Marques <rimarques@suse.com>