mgr/pg_autoscaler: treat target ratios as weights #33035

Merged: 11 commits, Feb 10, 2020

jdurgin
Member

@jdurgin jdurgin commented Feb 2, 2020

The tests quickly ran into bugs in the progress handling that crashed the module, so those are fixed too.

https://tracker.ceph.com/issues/43947

Only the first 7 commits (in --topo-order, not as shown by github) are relevant for a backport - the rest are master-specific.

Member

@liewegas liewegas left a comment

looks good! few nits around the release note and table formatting

Normalize across pools so that it's simpler to use - this way you
don't have to adjust every other pool when you add one.

Handle pools with target_bytes by taking their capacity off the top,
and dividing the rest into the pools with a target_ratio.

If both target bytes and ratio are specified, ignore bytes. This
matches the docs and makes accounting simpler.

Fixes: https://tracker.ceph.com/issues/43947

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
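
A rough illustration of the accounting described in the commit message above, not the module's actual code: pools with only target_bytes take their capacity off the top, and the remainder is split among target_ratio pools in proportion to their weights. The names `PoolTarget` and `effective_ratios` are hypothetical and introduced only for this sketch.

```python
from typing import Dict, NamedTuple


class PoolTarget(NamedTuple):
    """Hypothetical container for one pool's sizing hints."""
    target_bytes: int = 0
    target_ratio: float = 0.0


def effective_ratios(pools: Dict[str, PoolTarget], capacity: int) -> Dict[str, float]:
    """Return each pool's expected fraction of `capacity`.

    Assumed rules, taken from the commit message:
      * target_ratio values are weights, normalized across pools;
      * pools with only target_bytes are subtracted from capacity first;
      * if a pool sets both, target_bytes is ignored.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")

    # Sum of weights over every pool that sets a ratio.
    total_weight = sum(p.target_ratio for p in pools.values() if p.target_ratio > 0)

    # Bytes reserved by pools that set target_bytes and no ratio.
    reserved = sum(
        p.target_bytes
        for p in pools.values()
        if p.target_ratio <= 0 and p.target_bytes > 0
    )
    remaining = max(capacity - reserved, 0)

    result: Dict[str, float] = {}
    for name, p in pools.items():
        if p.target_ratio > 0:
            # Share of what is left after the target_bytes pools.
            result[name] = (p.target_ratio / total_weight) * remaining / capacity
        elif p.target_bytes > 0:
            result[name] = p.target_bytes / capacity
        else:
            result[name] = 0.0
    return result
```

With weights 1 and 3 on two pools, adding a third ratio pool changes everyone's share without any other pool's setting having to be edited, which is the usability point the commit makes.
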
Also check for pg_num_target being set correctly, rather than pg_num,
so the test doesn't depend on merging/splitting speed.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Since the ratios are normalized, they cannot exceed 1.0 or overcommit
combined with target_bytes.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Reset the progress each time we make an adjustment, and track progress
from that initial state to that new target. Previously we were also
using the wrong target: the current pg_num_target, not the new value
(pg_num_final) that we set.

Look up the pool by name, not id, in _maybe_adjust(), since that is how it is
retrieved by osdmap.get_pools_by_name().

Dedupe some logic into PgAdjustmentProgress to simplify things.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
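
A minimal sketch of the progress idea described in the commit message above, assuming progress is measured as the fraction of the way from the pg_num at the time of the adjustment to the newly chosen pg_num_final. The class `AdjustmentProgress` and the helper `begin_adjustment` are hypothetical names for illustration and are not the module's PgAdjustmentProgress.

```python
from typing import Dict


class AdjustmentProgress:
    """Tracks one pool's movement from a starting pg_num to a final pg_num."""

    def __init__(self, pg_num_start: int, pg_num_final: int) -> None:
        self.pg_num_start = pg_num_start
        self.pg_num_final = pg_num_final

    def fraction(self, pg_num_current: int) -> float:
        """Fraction complete in [0.0, 1.0]; works for both splits and merges."""
        span = self.pg_num_final - self.pg_num_start
        if span == 0:
            return 1.0  # nothing to do, so the adjustment is already complete
        done = (pg_num_current - self.pg_num_start) / span
        return max(0.0, min(1.0, done))


def begin_adjustment(
    tracked: Dict[str, AdjustmentProgress],
    pool_name: str,
    pg_num_current: int,
    pg_num_final: int,
) -> AdjustmentProgress:
    """Start (or restart) tracking for a pool, keyed by pool name.

    Any earlier entry is replaced, so progress is always measured from the
    state at the most recent adjustment to the most recent target.
    """
    progress = AdjustmentProgress(pg_num_current, pg_num_final)
    tracked[pool_name] = progress
    return progress
```

Replacing the entry on every adjustment is what keeps a second adjustment from being measured against a stale starting point or an old target.
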
Keep it as an int so we don't have to cast back and forth.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
@tchaikov tchaikov self-assigned this Feb 10, 2020
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
@tchaikov
Contributor

changelog

  • rebased against master to resolve conflicts.

@tchaikov tchaikov merged commit d352315 into ceph:master Feb 10, 2020
obnoxxx added a commit to obnoxxx/ocs-operator that referenced this pull request Feb 28, 2020
Currently, due to a floating point imprecision in ceph, we get a
"HEALTH_WARN 1 subtrees have overcommitted pool target_size_ratio"
after deploying with the TargetSizeRatio of .5 (0.5 + 0.5 > 1.0..).

This will be fixed in future Ceph versions
(ceph/ceph#33035).

In order to avoid the warning until we have the updated Ceph,
this patch lowers the ratio to 0.49.

https://bugzilla.redhat.com/show_bug.cgi?id=1807950

Signed-off-by: Michael Adam <obnox@redhat.com>
@jdurgin jdurgin deleted the wip-target-ratio branch March 20, 2020 10:45
3 participants