tentacle: mon/HealthMonitor: avoid MON_DOWN for freshly added Monitor #67323

Open
batrick wants to merge 4 commits into ceph:tentacle from batrick:wip-74043-tentacle


@batrick batrick commented Feb 12, 2026

backport tracker: https://tracker.ceph.com/issues/74043


backport of #66328
parent tracker: https://tracker.ceph.com/issues/73934

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

@github-actions

Config Diff Tool Output

+ added: mon_down_added_grace (mon.yaml.in)
! changed: mon_down_mkfs_grace: old:  (mon.yaml.in)
! changed: mon_down_mkfs_grace: new: ['runtime'] (mon.yaml.in)

The above configuration changes are found in the PR. Please update the relevant release documentation if necessary.
Ignore this comment if docs are already updated. To make the "Check ceph config changes" CI check pass, please comment /config check ok and re-run the test.


batrick commented Feb 12, 2026

/config check ok

@batrick batrick force-pushed the wip-74043-tentacle branch 4 times, most recently from c5fc774 to f468b95 on February 14, 2026 15:12

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
(cherry picked from commit 42a3791)

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
(cherry picked from commit 62c449e)

So we know when the Monitor was added to the map.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
(cherry picked from commit c5b43e9)

Conflicts:
	src/mon/MonMap.cc: generate_test_instances refactor missing

In testing, we often have the scenario where cephadm has created a
cluster but doesn't add more monitors until well past
mon_down_mkfs_grace. This causes useless MON_DOWN warnings to be thrown
which fails QA jobs. Avoid this situation entirely by giving a
reasonable grace period for a monitor added to the MonMap to join
quorum.

Fixes: https://tracker.ceph.com/issues/73934
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
(cherry picked from commit b028a41)
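
The grace-period idea described in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: `MonEntry`, `mons_to_report_down`, and the `added` field are hypothetical names, and the real HealthMonitor and MonMap logic may differ. It assumes each map entry records when the monitor was added, and that a monitor out of quorum is not reported until a grace period has elapsed since that time.

```cpp
#include <chrono>
#include <optional>
#include <string>
#include <vector>

using Clock = std::chrono::system_clock;
using TimePoint = Clock::time_point;
using Seconds = std::chrono::seconds;

// Hypothetical, simplified stand-in for a monitor map entry.
struct MonEntry {
  std::string name;
  bool in_quorum;
  // When this monitor was added to the map; empty if unknown
  // (for example, an entry recorded before the timestamp existed).
  std::optional<TimePoint> added;
};

// Returns the names of monitors that should be reported as down.
// A monitor that is out of quorum is skipped while it is still
// within `added_grace` of the time it was added to the map.
std::vector<std::string> mons_to_report_down(
    const std::vector<MonEntry>& mons, TimePoint now, Seconds added_grace) {
  std::vector<std::string> down;
  for (const auto& m : mons) {
    if (m.in_quorum) {
      continue;  // healthy, nothing to report
    }
    if (m.added && now - *m.added < added_grace) {
      continue;  // freshly added, still allowed time to join quorum
    }
    down.push_back(m.name);
  }
  return down;
}
```

In this sketch, `added_grace` plays the role that the new `mon_down_added_grace` option presumably plays in the PR; entries with no recorded add time are treated as past the grace period, which is an assumption made here for simplicity.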