Commit

Merge pull request #26942 from dzafman/wip-38616
Feature: Improvements to auto repair

Reviewed-by: Neha Ojha <nojha@redhat.com>
dzafman committed Mar 26, 2019
2 parents 0247d56 + 769cdc8 commit 3335774
Showing 19 changed files with 580 additions and 40 deletions.
3 changes: 3 additions & 0 deletions doc/dev/placement-group.rst
@@ -186,6 +186,9 @@ User-visible PG States
*forced_backfill*
the PG has been marked for highest priority backfill

*failed_repair*
an attempt to repair the PG has failed. Manual intervention is required.


OMAP STATISTICS
===============
4 changes: 2 additions & 2 deletions doc/rados/configuration/osd-config-ref.rst
@@ -361,8 +361,8 @@ scrubbing operations.
``osd scrub auto repair``

:Description: Setting this to ``true`` will enable automatic pg repair when errors
-are found in deep-scrub. However, if more than ``osd scrub auto repair num errors``
-errors are found a repair is NOT performed.
+are found in scrub or deep-scrub. However, if more than
+``osd scrub auto repair num errors`` errors are found a repair is NOT performed.
:Type: Boolean
:Default: ``false``
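
Reviewer note: the threshold rule in the description above can be modelled in a few lines of shell. This is only an illustrative sketch of the documented behaviour, not code from this commit; the function name `would_auto_repair` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative model only; NOT code from this commit.
# would_auto_repair ENABLED ERRORS MAX_ERRORS
#   ENABLED     value of "osd scrub auto repair" (true/false)
#   ERRORS      number of errors the scrub or deep-scrub found
#   MAX_ERRORS  value of "osd scrub auto repair num errors"
# Prints "repair" when an automatic repair would be attempted, else "skip".
would_auto_repair() {
    local enabled=$1 errors=$2 max_errors=$3
    if [ "$enabled" != "true" ] || [ "$errors" -eq 0 ]; then
        # Feature disabled, or nothing to repair.
        echo skip
    elif [ "$errors" -gt "$max_errors" ]; then
        # More errors than the threshold: repair is NOT performed.
        echo skip
    else
        echo repair
    fi
}

# Example: 3 errors with an illustrative threshold of 5.
would_auto_repair true 3 5
```

Note that the documented wording is "more than", so an error count exactly equal to the threshold still leads to a repair in this model.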

12 changes: 12 additions & 0 deletions qa/standalone/osd/osd-rep-recov-eio.sh
@@ -110,18 +110,30 @@ function rados_get_data() {

local poolname=pool-rep
local objname=obj-$inject-$$
local pgid=$(get_pg $poolname $objname)

rados_put $dir $poolname $objname || return 1
inject_$inject rep data $poolname $objname $dir 0 || return 1
rados_get $dir $poolname $objname || return 1

COUNT=$(ceph pg $pgid query | jq '.info.stats.stat_sum.num_objects_repaired')
test "$COUNT" = "1" || return 1

inject_$inject rep data $poolname $objname $dir 0 || return 1
inject_$inject rep data $poolname $objname $dir 1 || return 1
rados_get $dir $poolname $objname || return 1

COUNT=$(ceph pg $pgid query | jq '.info.stats.stat_sum.num_objects_repaired')
test "$COUNT" = "2" || return 1

inject_$inject rep data $poolname $objname $dir 0 || return 1
inject_$inject rep data $poolname $objname $dir 1 || return 1
inject_$inject rep data $poolname $objname $dir 2 || return 1
rados_get $dir $poolname $objname hang || return 1

# After hang another repair couldn't happen, so count stays the same
COUNT=$(ceph pg $pgid query | jq '.info.stats.stat_sum.num_objects_repaired')
test "$COUNT" = "2" || return 1
}

function TEST_rados_get_with_eio() {
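
Reviewer note: the three `num_objects_repaired` checks in `rados_get_data` repeat the same two lines. One possible consolidation is sketched below. The helper name `expect_repaired` is hypothetical and not part of this commit; the `ceph pg ... query` pipeline and the jq path are copied from the diff as-is.

```shell
#!/usr/bin/env bash
# Hypothetical helper; NOT part of this commit.
# expect_repaired PGID EXPECTED
#   Succeeds only if the PG's num_objects_repaired stat equals EXPECTED.
expect_repaired() {
    local pgid=$1 expected=$2
    local count
    # Same query and jq path as the checks added in the diff.
    count=$(ceph pg "$pgid" query | jq '.info.stats.stat_sum.num_objects_repaired')
    test "$count" = "$expected"
}
```

Each check in the test would then shrink to a single line such as `expect_repaired $pgid 1 || return 1`.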
