mon,osd: new mechanism for managing full and nearfull OSDs for luminous #13615

Merged
liewegas merged 16 commits into ceph:master from liewegas:wip-osd-full on Mar 8, 2017

4 participants
@liewegas
Member

liewegas commented Feb 23, 2017

  • per-osd nearfull and full flags
  • full_ratio and nearfull_ratio stored in the osdmap
  • new message from osd to mon requesting a state change (by the osd)
  • mon switches cluster full behavior over from old pg-map scheme to osdmap scheme when require_luminous_osds is set
  • new mon commands to adjust the osdmap thresholds
  • some cleanup of the osd-side code
@liewegas

Member

liewegas commented Feb 26, 2017

retest this please

@liewegas

Member

liewegas commented Feb 27, 2017

retest this please

@gregsfortytwo

Member

gregsfortytwo commented Feb 27, 2017

This still has a bunch of failures and they don't all look like noise.

@liewegas

Member

liewegas commented Mar 2, 2017

@gregsfortytwo failures fixed

@liewegas

Member

liewegas commented Mar 6, 2017

passed testing, awaiting final review

liewegas added some commits Feb 23, 2017

osd: add per-osd FULL and NEARFULL state bits
Signed-off-by: Sage Weil <sage@redhat.com>
osd/OSDMap: add [near]full_ratio to OSDMap[::Incremental]
This used to live in PGMap; we're moving it here for luminous
(which makes more sense anyway!).

Signed-off-by: Sage Weil <sage@redhat.com>
osd: rename failsafe [near]full getters appropriately
...and make most of these methods private to clarify the public
interface

Signed-off-by: Sage Weil <sage@redhat.com>
    dout(7) << __func__ << " state already " << state << " for osd." << from
            << " " << m->get_orig_source_inst() << dendl;
    _reply_map(op, m->version);
    return true;

@dzafman

dzafman Mar 6, 2017

Member

goto ignore?

@liewegas

liewegas Mar 6, 2017

Member

fixed

<< " -> " << get_full_state_name(new_state) << dendl;
if (new_state == FAILSAFE) {
clog->error() << "failsafe engaged, dropping updates, now "
<< (int)(ratio * 100) << "% full";

@dzafman

dzafman Mar 6, 2017

Member

My merged change uses (int)roundf(ratio * 100), which your change drops in 2 places.

@liewegas

liewegas Mar 6, 2017

Member

fixed
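
To illustrate the rounding point in this thread, here is a minimal sketch. The helper names `percent_truncated` and `percent_rounded` are hypothetical and not from the Ceph tree. A plain `(int)` cast truncates toward zero, so a ratio such as 0.956 would be logged as 95% rather than 96%, while `roundf` reports the nearest whole percent.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for illustration only; not part of Ceph.

// Truncating conversion: the fractional part of the percentage is discarded.
int percent_truncated(float ratio) {
  assert(ratio >= 0.0f);
  return (int)(ratio * 100);
}

// Rounding conversion: the percentage is rounded to the nearest integer.
int percent_rounded(float ratio) {
  assert(ratio >= 0.0f);
  return (int)std::roundf(ratio * 100);
}
```

For a ratio of 0.956 the two helpers differ by one percentage point; for exact values such as 0.5 they agree.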

@@ -19,6 +19,8 @@
created \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ (re)
modified \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ (re)
flags
full_ratio 0
nearfull_ratio 0

@dzafman

dzafman Mar 6, 2017

Member

Why are these values 0?

@liewegas

liewegas Mar 6, 2017

Member

that's just the default osdmap value; the mon sets it to something better during mkfs, or from pgmap during upgrade

  if (!service.need_fullness_update())
    return;
  unsigned state = 0;
  if (service.is_full()) {

@dzafman

dzafman Mar 6, 2017

Member

Should this also add "|| check_failsafe_full()"? Doesn't is_full() only check cur_state == FULL?

Better yet, make this code and need_fullness_update() consistent: change is_full() to return true for FULL or FAILSAFE, and have need_fullness_update() use is_full() and is_nearfull() to compute the want value.

@liewegas

liewegas Mar 6, 2017

Member

updated
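
The suggestion in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was merged: `FullnessTracker`, `wanted()`, and the `OSD_FLAG_*` bit values are hypothetical stand-ins. Only the state names and the idea that `is_full()` covers both FULL and FAILSAFE come from the discussion.

```cpp
// Hypothetical stand-ins for illustration; names and bit values are
// assumptions, not the actual Ceph definitions.
enum FullState { NONE = 0, NEARFULL, FULL, FAILSAFE };  // ordered by severity

constexpr unsigned OSD_FLAG_NEARFULL = 1u << 0;  // assumed per-OSD map bit
constexpr unsigned OSD_FLAG_FULL = 1u << 1;      // assumed per-OSD map bit

struct FullnessTracker {
  FullState cur_state = NONE;  // what the OSD currently believes about itself

  // FAILSAFE is a more severe form of FULL, so it also counts as full.
  bool is_full() const { return cur_state >= FULL; }
  bool is_nearfull() const { return cur_state >= NEARFULL; }

  // State the OSD wants recorded in the map. Only NONE, NEARFULL and FULL
  // are representable there, so FAILSAFE collapses to FULL.
  FullState wanted() const {
    if (is_full()) return FULL;
    if (is_nearfull()) return NEARFULL;
    return NONE;
  }

  // True when the map's bits for this OSD disagree with wanted().
  bool need_fullness_update(unsigned map_bits) const {
    FullState reported = NONE;
    if (map_bits & OSD_FLAG_FULL)
      reported = FULL;
    else if (map_bits & OSD_FLAG_NEARFULL)
      reported = NEARFULL;
    return wanted() != reported;
  }
};
```

Because both the predicate and the update check go through the same `is_full()` and `is_nearfull()` helpers, an OSD in FAILSAFE asks for the FULL bit and stops asking once the map carries it.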

liewegas added some commits Feb 23, 2017

mon/OSDMonitor: handle MOSDFull messages from OSDs
Signed-off-by: Sage Weil <sage@redhat.com>
mon/OSDMonitor: set osdmap ratios on mkfs
Signed-off-by: Sage Weil <sage@redhat.com>
mon/OSDMonitor: initialize osdmap ratios from pgmap on upgrade
Signed-off-by: Sage Weil <sage@redhat.com>
qa/workunits/cephtool/test.sh: change [near]full_ratio tests
Signed-off-by: Sage Weil <sage@redhat.com>
@dzafman

dzafman approved these changes Mar 6, 2017

bool OSDService::is_nearfull()
{
  Mutex::Locker l(full_status_lock);
  return cur_state >= NEARFULL;

@dzafman

dzafman Mar 6, 2017

Member

I would have left this "== NEARFULL", because it could be argued that nearfull covers only the range from the nearfull ratio up to, but not including, the full ratio.

liewegas added some commits Feb 23, 2017

osd: restructure and simplify internal fullness checks
First, eliminate the useless nearfull failsafe--all it did was
generate a log message, which we can do based on the OSDMap
states.

Add some new helpers.

Unify the cluster nearfull/full vs failsafe states so that
failsafe is a "really" full state that is more severe than
full, so we have NONE, NEARFULL, FULL, FAILSAFE.

Pull the full/nearfull ratios out of the OSDMap (remember that
we require luminous mons, so these will be initialized).

Signed-off-by: Sage Weil <sage@redhat.com>
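
As a rough sketch of the unified state ladder described in the commit message above: the `classify` function, the `FullnessRatios` struct, the strict `>` comparisons, and the example threshold values are all assumptions for illustration. Only the four state names and the fact that the nearfull and full ratios come from the OSDMap are taken from the commit message.

```cpp
#include <cassert>

// Ordered by severity, as listed in the commit message.
enum FullState { NONE = 0, NEARFULL, FULL, FAILSAFE };

// Hypothetical container for the thresholds.
struct FullnessRatios {
  float nearfull_ratio;  // taken from the OSDMap
  float full_ratio;      // taken from the OSDMap
  float failsafe_ratio;  // assumed here to be a local OSD setting
};

// Map a utilization ratio (0.0 to 1.0) to the most severe matching state.
// Using strict '>' at each threshold is an assumption of this sketch.
FullState classify(float used_ratio, const FullnessRatios& r) {
  assert(r.nearfull_ratio <= r.full_ratio);
  assert(r.full_ratio <= r.failsafe_ratio);
  if (used_ratio > r.failsafe_ratio) return FAILSAFE;
  if (used_ratio > r.full_ratio) return FULL;
  if (used_ratio > r.nearfull_ratio) return NEARFULL;
  return NONE;
}
```

With example thresholds of 0.85, 0.95 and 0.97, a utilization of 0.90 would classify as NEARFULL and 0.99 as FAILSAFE. Checking the most severe threshold first keeps the states mutually exclusive.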
osd: require fullness state changes (as needed) before boot
This ensures that we don't have a down osd that is marked full
go up, then realize it's not actually full, and then clear its
full flag.  That would result in a cluster full blip that isn't
needed. This can easily happen if the full_ratio in the osdmap is
increased while the OSD is down.

Signed-off-by: Sage Weil <sage@redhat.com>
osd: request a fullness state change during tick if needed
Signed-off-by: Sage Weil <sage@redhat.com>
mon/OSDMonitor: set cluster flags based on osd flags (luminous)
For luminous, set cluster flags based on osd flags.  Until
require_luminous is set, stick with the old pgmap-based behavior.
Move the new check to encode_pending so that the cluster flag is
set in the same epoch that the osd state(s) change.

Signed-off-by: Sage Weil <sage@redhat.com>
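
One way to picture deriving cluster flags from per-OSD flags is the sketch below. It is an assumption-laden illustration, not the OSDMonitor code: the flag constants, their bit values, and the function `cluster_fullness_flags` are hypothetical, and the aggregation rule (any full OSD sets the cluster full flag, any nearfull OSD sets the cluster nearfull flag) is assumed rather than stated in the commit message.

```cpp
#include <vector>

// Hypothetical bit values for illustration; not the Ceph constants.
constexpr unsigned OSD_FLAG_NEARFULL = 1u << 0;
constexpr unsigned OSD_FLAG_FULL = 1u << 1;
constexpr unsigned CLUSTER_FLAG_NEARFULL = 1u << 0;
constexpr unsigned CLUSTER_FLAG_FULL = 1u << 1;

// Compute cluster-wide fullness flags from each OSD's state bits.
// Assumed rule: one full OSD is enough to flag the cluster as full, and
// one nearfull OSD is enough to flag it as nearfull.
unsigned cluster_fullness_flags(const std::vector<unsigned>& osd_states) {
  unsigned flags = 0;
  for (unsigned s : osd_states) {
    if (s & OSD_FLAG_FULL)
      flags |= CLUSTER_FLAG_FULL;
    else if (s & OSD_FLAG_NEARFULL)
      flags |= CLUSTER_FLAG_NEARFULL;
  }
  return flags;
}
```

Computing the flags at the point where the pending map is encoded, as the commit message describes, means the cluster flag and the OSD state change that caused it appear in the same epoch.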
mon/PGMonitor: stop generating health warnings with luminous
Signed-off-by: Sage Weil <sage@redhat.com>
mon/OSDMonitor: generate health warnings for luminous
Note that this tells us how many OSDs are full or nearfull; it
does not include detailed warnings telling you exactly what the
utilization is because we don't have the full osd_stat_t
available.  We leave it to ceph-mgr to generate those health
messages.

Signed-off-by: Sage Weil <sage@redhat.com>
test/cli/osdmaptool: fix osdmap output
Signed-off-by: Sage Weil <sage@redhat.com>

@liewegas liewegas merged commit a681069 into ceph:master Mar 8, 2017

3 checks passed

Signed-off-by: all commits in this PR are signed
Unmodifed Submodules: submodules for project are unmodified
default: Build finished.

@liewegas liewegas deleted the liewegas:wip-osd-full branch Mar 8, 2017

@jcsp

Contributor

jcsp commented Mar 10, 2017

Minor nit: cluster log changes like this need updates to the log whitelists in the fs suite (http://tracker.ceph.com/issues/19253)
