doc/rados: edit placement-groups.rst (2 of x) #51991
Conversation
You know the drill
Setting a Minimum Number of PGs and a Maximum Number of PGs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If a minimum is set, then Ceph will not itself reduce (or recommend that you
s/or/nor/
Accepted.
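For readers following along: the floor and ceiling discussed in this hunk correspond to the ``pg_num_min`` and ``pg_num_max`` pool properties. A minimal sketch of setting them, assuming a hypothetical pool named ``testpool``:

    # Floor: the autoscaler will not reduce (nor recommend reducing)
    # the pool below this many PGs
    ceph osd pool set testpool pg_num_min 32

    # Ceiling: the autoscaler will not grow the pool beyond this
    ceph osd pool set testpool pg_num_max 256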
for you based on how much data is stored in the pool (see above, :ref:`pg-autoscaler`).
If you opt not to specify ``pg_num`` in this command, the cluster uses the PG
autoscaler to automatically configure the parameter in accordance with the
amount of data that is stored in the pool (see :ref:`pg-autoscaler` above).
Actually, won't it use ``osd_pool_default_pg_num``? As written this assumes that the autoscaler is enabled.
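A sketch illustrating the reviewer's point, using a hypothetical pool name: with the autoscaler disabled, a pool created without an explicit ``pg_num`` falls back to the ``osd_pool_default_pg_num`` option rather than being sized by the autoscaler.

    # Inspect the default applied when pg_num is omitted
    ceph config get mon osd_pool_default_pg_num

    # Created without pg_num: sized by the autoscaler if it is
    # enabled, otherwise by osd_pool_default_pg_num
    ceph osd pool create testpool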
the additional of the balancer (which is also enabled by default), a
value of more like 50 PGs per OSD is probably reasonable. The
challenge (which the autoscaler normally does for you), is to:
The traditional rule of thumb has held that there should be 100 PGs per OSD.
Tradition was 200, retconned to 100 a few years ago. I personally disagree with the party line here, but this isn't about me. Also, is rule of thumb a thing that might not be known to non-native English readers?
Suggest
Without the balancer, approximately 100 PG replicas on each OSD is the suggested target. With the balancer, however, an initial target of 50 PG replicas on each OSD is reasonable.
Accepted with slight modifications.
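To make the suggested targets checkable: the ``PGS`` column of ``ceph osd df`` reports how many PG replicas each OSD holds, and the balancer module can be inspected with the commands below (a sketch, not part of the patch under review).

    # PGS column: the number of PG replicas placed on each OSD
    ceph osd df

    # Verify that the balancer (enabled by default in recent
    # releases) is active, and enable it if it is not
    ceph balancer status
    ceph balancer on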
consideration
- the number of PGs per pool should be proportional to the amount of
  data in the pool
- there should be 50-100 PGs per pool, taking into account the
s/pool/OSD/
Accepted.
- the number of PGs per pool should be proportional to the amount of
  data in the pool
- there should be 50-100 PGs per pool, taking into account the
  replication overhead or erasure-coding fan-out of each PG across OSDs
each PG's replicas
Accepted.
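Taken together, these bullet points imply the familiar back-of-the-envelope sizing. A worked example under assumed values (10 OSDs, a replicated pool of size 3, a target of roughly 100 PG replicas per OSD; ``testpool`` is hypothetical):

    # total PGs ~= (OSD count * target PG replicas per OSD) / pool size
    #            = (10 * 100) / 3 ~= 333
    # rounded to a nearby power of two: 256 (or 512 to round up)
    ceph osd pool create testpool 256 256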
As long as there are one or two orders of magnitude more PGs than OSDs, the
distribution is likely to be even. For example: 256 PGs for 3 OSDs, 512 PGs for
10 OSDs, or 1024 PGs for 10 OSDs.
Maybe add a note about how ``pg_num`` not being a power of 2 can work against uniform distribution. Don't go into why; if someone really cares they can read chapter 8 of my book, so long as they pre-dose with their favorite headache medication.
This is covered later on this same page (and will be edited in the "3 of x" PR in this series).
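For readers who reach this hunk first, the short version of that later material: a ``pg_num`` that is not a power of two leaves some PGs holding roughly twice the data of others, so when raising ``pg_num`` it is conventional to step to the next power of two (``testpool`` is hypothetical):

    # Step pg_num to the next power of two rather than an
    # arbitrary intermediate value
    ceph osd pool set testpool pg_num 256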
10 OSDs, or 1024 PGs for 10 OSDs.

However, uneven data distribution can emerge due to factors other than the
ratio of OSDs to PGs. For example, since CRUSH does not take into account the
PGs to OSDs
Accepted.
However, uneven data distribution can emerge due to factors other than the
ratio of OSDs to PGs. For example, since CRUSH does not take into account the
size of the objects, the presence of a few very large objects can create an
size of RADOS objects
Accepted in all instances.
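This kind of imbalance is visible in practice as variance in the ``%USE`` and ``VAR`` columns of ``ceph osd df``; a quick sketch of how one might look for it:

    # VAR shows each OSD's utilization relative to the mean;
    # large outliers indicate uneven data distribution
    ceph osd df

    # The same figures arranged by CRUSH hierarchy
    ceph osd df tree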
at all times and even more during recovery. Sharing this overhead by
clustering objects within a placement group is one of the main reasons
they exist.
Every PG in the cluster imposes additional memory, network, and CPU demands
s/additional//
Accepted.
clustering objects within a placement group is one of the main reasons
they exist.
Every PG in the cluster imposes additional memory, network, and CPU demands
upon OSDs and MONs. These needs must be met at all times and are still more
and are increased during recovery.
Accepted.
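Related to this per-PG cost: the monitors enforce a cluster-wide guardrail, ``mon_max_pg_per_osd``, and refuse pool creation or PG increases that would exceed it. A sketch of inspecting and (cautiously) raising it:

    # Show the current ceiling on PG replicas per OSD (default 250)
    ceph config get mon mon_max_pg_per_osd

    # Raise only with care: each additional PG adds memory, network,
    # and CPU load on OSDs and MONs, especially during recovery
    ceph config set mon mon_max_pg_per_osd 300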
Edit doc/rados/operations/placement-groups.rst.

https://tracker.ceph.com/issues/58485

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
force-pushed from 04a439d to 8bdd271