Commit 56e81df
Merge pull request ceph#57022 from zdover23/wip-doc-2024-04-22-rados-…
…operations-pg-troubleshooting

doc/rados: remove redundant pg repair commands

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
zdover23 committed Apr 22, 2024
2 parents a2c682d + 3c2e8d3 commit 56e81df
Showing 4 changed files with 62 additions and 121 deletions.
4 changes: 2 additions & 2 deletions doc/rados/operations/health-checks.rst
@@ -962,15 +962,15 @@ or ``snaptrim_error`` flag set, which indicates that an earlier data scrub
operation found a problem, or (2) have the *repair* flag set, which means that
a repair for such an inconsistency is currently in progress.

For more information, see :doc:`pg-repair`.
For more information, see :doc:`../troubleshooting/troubleshooting-pg`.

OSD_SCRUB_ERRORS
________________

Recent OSD scrubs have discovered inconsistencies. This alert is generally
paired with *PG_DAMAGED* (see above).

For more information, see :doc:`pg-repair`.
For more information, see :doc:`../troubleshooting/troubleshooting-pg`.

OSD_TOO_MANY_REPAIRS
____________________
1 change: 0 additions & 1 deletion doc/rados/operations/index.rst
@@ -20,7 +20,6 @@ and, monitoring an operating cluster.
monitoring
monitoring-osd-pg
user-management
pg-repair
pgcalc/index

.. raw:: html
118 changes: 0 additions & 118 deletions doc/rados/operations/pg-repair.rst

This file was deleted.

60 changes: 60 additions & 0 deletions doc/rados/troubleshooting/troubleshooting-pg.rst
@@ -544,6 +544,12 @@ form:
.. prompt:: bash

ceph pg repair {placement-group-ID}

For example:

.. prompt:: bash #

ceph pg repair 1.4

.. warning:: This command overwrites the "bad" copies with "authoritative"
copies. In most cases, Ceph is able to choose authoritative copies from all
@@ -553,13 +559,67 @@ form:
ignored when Ceph chooses the authoritative copies. Be aware of this, and
use the above command with caution.
.. note:: PG IDs have the form ``N.xxxxx``, where ``N`` is the number of the
pool that contains the PG. The command ``ceph osd lspools`` and the
command ``ceph osd dump | grep pool`` return a list of pool numbers.
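Because the pool number is everything before the first dot in a PG ID, it can be pulled out with ordinary shell parameter expansion. A minimal sketch (the PG ID ``1.4`` is an illustrative value, not taken from a real cluster):

```shell
# A PG ID has the form N.xxxxx: the part before the first dot is the
# pool number, and the part after it is the PG suffix within that pool.
PGID="1.4"                 # illustrative PG ID
POOL_NUM="${PGID%%.*}"     # strip from the first dot onward -> "1"
PG_SUFFIX="${PGID#*.}"     # strip through the first dot   -> "4"
echo "pool=${POOL_NUM} pg=${PG_SUFFIX}"
```

Cross-check the extracted pool number against the output of ``ceph osd lspools`` before acting on a PG.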


If you receive ``active + clean + inconsistent`` states periodically due to
clock skew, consider configuring the `NTP
<https://en.wikipedia.org/wiki/Network_Time_Protocol>`_ daemons on your monitor
hosts to act as peers. See `The Network Time Protocol <http://www.ntp.org>`_
and Ceph :ref:`Clock Settings <mon-config-ref-clock>` for more information.
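With chrony, peering the monitor hosts can look roughly like the following fragment of ``/etc/chrony.conf``. This is a sketch under assumptions: the ``mon2``/``mon3`` hostnames are placeholders for your other monitor hosts, and your site policy may prefer a common set of upstream servers instead of peering.

```
# /etc/chrony.conf fragment on one monitor host (hostnames are placeholders)
peer mon2.example.com
peer mon3.example.com
```

The equivalent classic ``ntpd`` configuration uses ``peer`` lines in ``/etc/ntp.conf``.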

More Information on PG Repair
-----------------------------
Ceph stores and updates the checksums of objects stored in the cluster. When a
scrub is performed on a PG, the OSD attempts to choose an authoritative copy
from among its replicas. Only one of the possible cases is consistent. After
performing a deep scrub, Ceph calculates the checksum of an object that is read
from disk and compares it to the checksum that was previously recorded. If the
current checksum and the previously recorded checksum do not match, that
mismatch is considered to be an inconsistency. In the case of replicated pools,
any mismatch between the checksum of any replica of an object and the checksum
of the authoritative copy means that there is an inconsistency. The discovery
of these inconsistencies causes a PG's state to be set to ``inconsistent``.
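The checksum comparison that a deep scrub performs can be illustrated with ordinary shell tools. This is only an analogy, with two local files standing in for two replicas of one object and ``sha256sum`` standing in for the OSD's internal checksumming; it is not how OSDs actually store or compare their checksums:

```shell
# Analogy only: two local files stand in for two replicas of one object.
workdir=$(mktemp -d)
printf 'object payload' > "$workdir/replica_a"
printf 'object paylOad' > "$workdir/replica_b"   # one corrupted byte

sum_a=$(sha256sum "$workdir/replica_a" | cut -d' ' -f1)
sum_b=$(sha256sum "$workdir/replica_b" | cut -d' ' -f1)

# A mismatch between the recorded and recalculated checksums is what
# scrub reports as an inconsistency.
if [ "$sum_a" != "$sum_b" ]; then
    echo "inconsistent"
else
    echo "clean"
fi
rm -r "$workdir"
```

On a real cluster, ``rados list-inconsistent-obj {pgid}`` reports which objects and shards a scrub flagged.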

The ``pg repair`` command attempts to fix inconsistencies of various kinds. If
``pg repair`` finds an inconsistent PG, it attempts to overwrite the digest of
the inconsistent copy with the digest of the authoritative copy. If ``pg
repair`` finds an inconsistent replicated pool, it marks the inconsistent copy
as missing. In the case of replicated pools, recovery is beyond the scope of
``pg repair``.

In the case of erasure-coded and BlueStore pools, Ceph will automatically
perform repairs if ``osd_scrub_auto_repair`` (default ``false``) is set to
``true`` and if no more than ``osd_scrub_auto_repair_num_errors`` (default
``5``) errors are found.
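Given those defaults, enabling automatic repair amounts to flipping one option. A sketch of the relevant ``ceph.conf`` section (the error threshold is shown at its default and need not be set explicitly):

```
[osd]
osd_scrub_auto_repair = true          # default: false
osd_scrub_auto_repair_num_errors = 5  # default: 5
```

The same options can be set at runtime with ``ceph config set osd osd_scrub_auto_repair true``.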

The ``pg repair`` command will not solve every problem. Ceph does not
automatically repair PGs when they are found to contain inconsistencies.

The checksum of a RADOS object or an omap is not always available. Checksums
are calculated incrementally. If a replicated object is updated
non-sequentially, the write operation involved in the update changes the object
and invalidates its checksum. The whole object is not read while the checksum
is recalculated. The ``pg repair`` command is able to make repairs even when
checksums are not available to it, as in the case of Filestore. Users working
with replicated Filestore pools might prefer manual repair to ``ceph pg
repair``.

This material is relevant for Filestore, but not for BlueStore, which has its
own internal checksums. The matched-record checksum and the calculated checksum
cannot prove that any specific copy is in fact authoritative. If there is no
checksum available, ``pg repair`` favors the data on the primary, but this
might not be the uncorrupted replica. Because of this uncertainty, human
intervention is necessary when an inconsistency is discovered. This
intervention sometimes involves use of ``ceph-objectstore-tool``.

PG Repair Walkthrough
---------------------
https://ceph.io/geen-categorie/ceph-manually-repair-object/ - This page
contains a walkthrough of the repair of a PG. It is recommended reading if you
want to repair a PG but have never done so.

Erasure Coded PGs are not active+clean
======================================
