-
Notifications
You must be signed in to change notification settings - Fork 5.9k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #21421 from liewegas/wip-doc-balancer
doc/mgr/balancer: document Reviewed-by: Dan van der Ster <daniel.vanderster@cern.ch>
- Loading branch information
Showing
2 changed files
with
147 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,146 @@ | ||
Balancer plugin | ||
=============== | ||
|
||
The *balancer* plugin can optimize the placement of PGs across OSDs in | ||
order to achieve a balanced distribution, either automatically or in a | ||
supervised fashion. | ||
|
||
Enabling | ||
-------- | ||
|
||
The *balancer* module is enabled with:: | ||
|
||
ceph mgr module enable balancer | ||
|
||
(It is enabled by default.) | ||
|
||
Status | ||
------ | ||
|
||
The current status of the balancer can be check at any time with:: | ||
|
||
ceph balancer status | ||
|
||
|
||
Automatic balancing | ||
------------------- | ||
|
||
The automatic balancing can be enabled, using the default settings, with:: | ||
|
||
ceph balancer on | ||
|
||
The balancer can be turned back off again with:: | ||
|
||
ceph balancer off | ||
|
||
This will use the ``crush-compat`` mode, which is backward compatible | ||
with older clients, and will make small changes to the data | ||
distribution over time to ensure that OSDs are equally utilized. | ||
|
||
|
||
Throttling | ||
---------- | ||
|
||
No adjustments will be made to the PG distribution if the cluster is | ||
degraded (e.g., because an OSD has failed and the system has not yet | ||
healed itself). | ||
|
||
When the cluster is healthy, the balancer will throttle its changes | ||
such that the percentage of PGs that are misplaced (i.e., that need to | ||
be moved) is below a threshold of (by default) 5%. The | ||
``max_misplaced`` threshold can be adjusted with:: | ||
|
||
ceph config-key set mgr/balancer/max_misplaced .07 # 7% | ||
|
||
|
||
Modes | ||
----- | ||
|
||
There are currently two supported balancer modes: | ||
|
||
#. **crush-compat**. The CRUSH compat mode uses the compat weight-set | ||
feature (introduced in Luminous) to manage an alternative set of | ||
weights for devices in the CRUSH hierarchy. The normal weights | ||
should remain set to the size of the device to reflect the target | ||
amount of data that we want to store on the device. The balancer | ||
then optimizes the weight-set values, adjusting them up or down in | ||
small increments, in order to achieve a distribution that matches | ||
the target distribution as closely as possible. (Because PG | ||
placement is a pseudorandom process, there is a natural amount of | ||
variation in the placement; by optimizing the weights we | ||
counter-act that natural variation.) | ||
|
||
Notably, this mode is *fully backwards compatible* with older | ||
clients: when an OSDMap and CRUSH map is shared with older clients, | ||
we present the optimized weights as the "real" weights. | ||
|
||
The primary restriction of this mode is that the balancer cannot | ||
handle multiple CRUSH hierarchies with different placement rules if | ||
the subtrees of the hierarchy share any OSDs. (This is normally | ||
not the case, and is generally not a recommended configuration | ||
because it is hard to manage the space utilization on the shared | ||
OSDs.) | ||
|
||
#. **upmap**. Starting with Luminous, the OSDMap can store explicit | ||
mappings for individual OSDs as exceptions to the normal CRUSH | ||
placement calculation. These `upmap` entries provide fine-grained | ||
control over the PG mapping. This CRUSH mode will optimize the | ||
placement of individual PGs in order to achieve a balanced | ||
distribution. In most cases, this distribution is "perfect," which | ||
an equal number of PGs on each OSD (+/-1 PG, since they might not | ||
divide evenly). | ||
|
||
Note that using upmap requires that all clients be Luminous or newer. | ||
|
||
The default mode is ``crush-compat``. The mode can be adjusted with:: | ||
|
||
ceph balancer mode upmap | ||
|
||
or:: | ||
|
||
ceph balancer mode crush-compat | ||
|
||
Supervised optimization | ||
----------------------- | ||
|
||
The balancer operation is broken into a few distinct phases: | ||
|
||
#. building a *plan* | ||
#. evaluating the quality of the data distribution, either for the current PG distribution, or the PG distribution that would result after executing a *plan* | ||
#. executing the *plan* | ||
|
||
To evautate and score the current distribution,:: | ||
|
||
ceph balancer eval | ||
|
||
You can also evaluate the distribution for a single pool with:: | ||
|
||
ceph balancer eval <pool-name> | ||
|
||
Greater detail for the evaluation can be seen with:: | ||
|
||
ceph balancer eval-verbose ... | ||
The balancer can generate a plan, using the currently configured mode, with:: | ||
|
||
ceph balancer optimize <plan-name> | ||
|
||
The name is provided by the user and can be any useful identifying string. The contents of a plan can be seen with:: | ||
|
||
ceph balancer show <plan-name> | ||
|
||
Old plans can be discarded with:: | ||
|
||
ceph balancer rm <plan-name> | ||
|
||
Currently recorded plans are shown as part of the status command:: | ||
|
||
ceph balancer status | ||
|
||
The quality of the distribution that would result after executing a plan can be calculated with:: | ||
|
||
ceph balancer eval <plan-name> | ||
|
||
Assuming the plan is expected to improve the distribution (i.e., it has a lower score than the current cluster state), the user can execute that plan with:: | ||
|
||
ceph balancer execute <plan-name> |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters