mgr/cephadm/schedule: fix message #41257
cc @huww98
Following up on 1bf09d1#r50566843. This still doesn't make sense to me. If some host does not belong to the mon public_network, why would this happen during an upgrade, when I haven't changed any IP addresses? And why does it resolve by itself?
I'm a bit confused by that as well. My best guess is that you'll see the same message again if you restart the mgr daemon or run 'ceph orch apply mon 5' (or whatever the current placement is).
src/pybind/mgr/cephadm/schedule.py
@@ -304,7 +304,7 @@ def get_candidates(self) -> List[DaemonPlacement]:
                 ls.append(h)
             else:
                 logger.info(
-                    f"Filtered out host {h.hostname}: could not verify host allowed virtual ips")
+                    f"Filtered out host {h.hostname}: host does not belong to mon public_network")
We're trying to invent a message here, but only self.filter_new_host can provide a proper message. Can we move this log message into the implementation of self.filter_new_host?
56b8b88 to ae9bdb8
This is now only used when scheduling mons. (Units now enable the kernel features needed instead of checking for them during placement.) Move the message to the filter itself.
Signed-off-by: Sage Weil <sage@newdream.net>
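As a rough illustration of "move the message to the filter itself", here is a minimal sketch in which the filter, not the caller, logs why a host was rejected. This is not the actual cephadm code: `make_public_network_filter`, the `host_networks` mapping, and the simplified `get_candidates` signature are all hypothetical names chosen for the example.

```python
import ipaddress
import logging
from typing import Callable, Dict, List, Optional

logger = logging.getLogger(__name__)


def make_public_network_filter(
    public_network: str,
    host_networks: Dict[str, List[str]],  # hypothetical: hostname -> list of subnets
) -> Callable[[str], bool]:
    """Build a host filter that knows (and logs) its own rejection reason."""
    wanted = ipaddress.ip_network(public_network)

    def filter_host(hostname: str) -> bool:
        for subnet in host_networks.get(hostname, []):
            if ipaddress.ip_network(subnet).overlaps(wanted):
                return True
        # Only the filter knows why the host was rejected, so it logs here.
        logger.info(
            f"Filtered out host {hostname}: "
            f"does not belong to mon public_network ({public_network})")
        return False

    return filter_host


def get_candidates(
    hosts: List[str],
    filter_new_host: Optional[Callable[[str], bool]] = None,
) -> List[str]:
    """Simplified scheduler step: the caller no longer invents a message."""
    if filter_new_host is None:
        return list(hosts)
    return [h for h in hosts if filter_new_host(h)]
```

With this shape, a different filter can log a different reason without the scheduler needing to guess what the filter checked.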
ae9bdb8 to d5aba1e
No, the log looks like this: …
The logs when upgrading look like: …
This message first appears when a new mgr is activated. Then … This check prevents new mons from being deployed and hinders the mon upgrade process. After about 10 minutes, at 17:18:16, host …
I guess that when a new manager starts, it does not have enough information, so it just blocks deployment to be safe. But what blocks it for 10 minutes?
Oh! I know what the problem is. 1897d1c changed the way we store the per-host network interface/network info. On upgrade, cephadm thinks there are no networks on each host until the device refresh happens.
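To make that failure mode concrete, here is a small sketch under stated assumptions: a hypothetical `HostNetworkCache` starts out empty (as if the old stored format could not be read after the upgrade), so every host fails the public_network check until a refresh repopulates it. The class, its methods, and the data layout are illustrative only and do not reflect cephadm's real cache.

```python
import ipaddress
from typing import Dict, List


class HostNetworkCache:
    """Hypothetical stand-in for cephadm's per-host network info."""

    def __init__(self) -> None:
        # Assumed post-upgrade state: nothing known about any host yet.
        self.networks: Dict[str, List[str]] = {}

    def refresh(self, hostname: str, subnets: List[str]) -> None:
        """Simulate the periodic refresh that rediscovers host networks."""
        self.networks[hostname] = list(subnets)

    def in_network(self, hostname: str, public_network: str) -> bool:
        wanted = ipaddress.ip_network(public_network)
        return any(
            ipaddress.ip_network(s).overlaps(wanted)
            for s in self.networks.get(hostname, [])
        )


def eligible_mon_hosts(
    cache: HostNetworkCache, hosts: List[str], public_network: str
) -> List[str]:
    return [h for h in hosts if cache.in_network(h, public_network)]


cache = HostNetworkCache()
hosts = ["host1", "host2"]

# Right after the new mgr starts: no network info, so every host is filtered out.
before = eligible_mon_hosts(cache, hosts, "10.0.0.0/16")

# Once the refresh has run, the same hosts pass the same check.
cache.refresh("host1", ["10.0.1.0/24"])
cache.refresh("host2", ["10.0.2.0/24"])
after = eligible_mon_hosts(cache, hosts, "10.0.0.0/16")
```

In this model nothing about the hosts changes between the two calls; only the cache contents do, which matches the observation that the message goes away on its own after a delay.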
jenkins, retest this please.