osd: ENOENT on clone #4385

Merged
1 commit merged into ceph:firefly from xinxinsh:wip-11199-firefly on Apr 29, 2015

@xinxinsh (Member) commented Apr 17, 2015

ReplicatedPG: trim backfill intervals based on peer's last_backfill_started

Otherwise, we fail to trim the peer's last_backfill_started and get bug 11199.

1) osd 4 backfills up to 31bccdb2/mira01213209-286/head (henceforth: foo)

2) Interval change happens

3) osd.0 now finds itself backfilling to osd.4 (lb=foo) and osd.5
(lb=b6670ba2/mira01213209-160/snapdir//1, henceforth: bar)

4) recover_backfill causes both 4 and 5 to scan forward, so 4 has an interval
starting at foo, 5 has an interval starting at bar.

5) Once those have come back, recover_backfill attempts to trim off the
last_backfill_started, but 4's interval starts after that, so foo remains in
osd 4's interval (this is the bug)

7) We serve a copyfrom on foo (sent to 4 as well).

8) We eventually get to foo in the backfilling. Normally, they would have the
same version, but of course we don't update osd.4's interval from the log since
it should not have received writes in that interval. Thus, we end up trying to
recover foo on osd.4 anyway.

9) But, an interval change happens between removing foo from osd.4 and
completing the recovery, leaving osd.4 without foo, but with lb >= foo

Fixes: #11199
Backport: firefly
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 1388d6b)
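
The trimming problem in steps 4 and 5 can be sketched in a few lines of Python. This is a toy model, not the Ceph code: `BackfillInterval`, `trim_peer_intervals`, `run_scenario`, and the integer object positions `BAR` and `FOO` are hypothetical names introduced for illustration. It also assumes the per-peer trim bound is the later of the primary's last_backfill_started and the peer's own lb, which is one reading of the commit title.

```python
from dataclasses import dataclass, field

# Hypothetical object positions in sort order: bar sorts before foo.
BAR, FOO = 10, 20


@dataclass
class BackfillInterval:
    """Toy stand-in for a scanned interval: object position -> version."""
    objects: dict = field(default_factory=dict)

    def trim_to(self, bound):
        # Drop every object at or before `bound`; keep the rest.
        self.objects = {k: v for k, v in self.objects.items() if k > bound}


def trim_peer_intervals(last_backfill_started, peer_lb, intervals, per_peer):
    """Trim each peer's interval.

    per_peer=False models the buggy behaviour (one global bound).
    per_peer=True models the assumed fix (bound also honours the peer's lb).
    """
    for peer, interval in intervals.items():
        bound = last_backfill_started
        if per_peer:
            bound = max(bound, peer_lb[peer])
        interval.trim_to(bound)


def run_scenario(per_peer):
    # osd.4 already backfilled up to foo; osd.5 only up to bar.
    peer_lb = {"osd.4": FOO, "osd.5": BAR}
    # The primary restarts from the earliest peer position.
    last_backfill_started = min(peer_lb.values())
    # Each peer scans forward from its own lb (steps 4 and 5).
    intervals = {
        "osd.4": BackfillInterval({FOO: "v1", 25: "v1"}),
        "osd.5": BackfillInterval({BAR: "v1", 15: "v1", FOO: "v1"}),
    }
    trim_peer_intervals(last_backfill_started, peer_lb, intervals, per_peer)
    return intervals


if __name__ == "__main__":
    for per_peer in (False, True):
        result = run_scenario(per_peer)
        print(per_peer, {p: sorted(i.objects) for p, i in result.items()})
```

With the single global bound, foo stays in osd.4's interval even though osd.4 has already backfilled it, which is the stale entry that later steps act on. With the per-peer bound, foo is trimmed from osd.4's interval while osd.5's interval is unaffected.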

@ghost ghost added bug fix core labels Apr 17, 2015

@ghost ghost added this to the firefly milestone Apr 17, 2015

@ghost ghost assigned xinxinsh Apr 17, 2015

ghost pushed a commit that referenced this pull request Apr 29, 2015

Loic Dachary
Merge pull request #4385 from xinxinsh/wip-11199-firefly
osd: ENOENT on clone

Reviewed-by: Samuel Just <sjust@redhat.com>

@ghost ghost merged commit 07031b1 into ceph:firefly Apr 29, 2015

This issue was closed.
