Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quincy: mds: fix stray evaluation using scrub and introduce new option #50815

Merged
merged 14 commits into from Aug 17, 2023

Conversation

dparmar18
Copy link
Contributor

Backport tracker: https://tracker.ceph.com/issues/59262, https://tracker.ceph.com/issues/59265
Backport of #47649
Parent tracker: https://tracker.ceph.com/issues/53724, https://tracker.ceph.com/issues/51824

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

Scrub could evaluate strays but there was a race condition where
an inode pinned for scrub would land for purging which would indeed
lead to crashes.

Fixes: https://tracker.ceph.com/issues/51824
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit a8680d3)
it an inode is being purged, scrub should not be ran on it.

Fixes: https://tracker.ceph.com/issues/51824
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit ce12cd7)
Running scrub at CephFS root doesn't iterate over ~mdsdir,
this feature is needed but the cluster admin should have
the liberty to decide whether to evaluate strays at root
or not. Therefore a new option 'scrub_mdsdir' can be useful.

Fixes: https://tracker.ceph.com/issues/53724
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit e65eb2b)
As enqueue_scrub is invoked twice to perform stray evaluation
at root using new op scrub_mdsdir, tests fail as command
scrub start / scrub_mdsdir would return two JSON object which
JSONDecodeError cannot parse, this patch would make sure we do
not dump JSON for recursive scrub on ~mdsdir than happens along with
scrubbing root (only when using scrub_mdsdir op)

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit 4a976b5)
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit c85347f)
and make sure it does not dispose JSON for recursive scrub
on ~mdsdir while scrubbing with flag 'scrub_mdsdir'

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit b32c572)
- test_stray_evaluation_with_scrub
this assures that evaluating strays with scrub works fine and no
crash is detected.

- test_flag_scrub_mdsdir
test the new flag to scrub ~mdsdir at CephFS root

Fixes: https://tracker.ceph.com/issues/51824
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit 632c8b0)
Also documented the use of new flag introduced in this PR

Fixes: https://tracker.ceph.com/issues/53724
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit d22e60f)
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit f654de9)
@dparmar18 dparmar18 requested a review from a team March 31, 2023 10:19
@dparmar18 dparmar18 requested a review from a team as a code owner March 31, 2023 10:19
@github-actions github-actions bot added this to the quincy milestone Mar 31, 2023
@dparmar18 dparmar18 added the DNM label May 18, 2023
@dparmar18
Copy link
Contributor Author

Put on hold, #50813 (comment)

@dparmar18 dparmar18 changed the title quincy: mds: fix stray evaluation using scrub and introduce new option [DNM] quincy: mds: fix stray evaluation using scrub and introduce new option May 18, 2023
This would avoid the need to run individual scrubs for
~mdsdir and root, i.e. run both the scrubs under the
same header, this also helps to avoid edge case where
in case ~mdsdir is huge and it's taking time to scrub it,
the scrub status would report something like this until
root inodes kick in:

{
    "status": "scrub active (757 inodes in the stack)",
    "scrubs": {}
}

Fixes: https://tracker.ceph.com/issues/59350
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit f85b228)
Previouly, two individual scrubs were initiated to scrub ~mdsdir
at root where the ~mdsdir scrub wasn't provided any tag thus, it
was necessary to not dump it's values for output of 'scrub start'.
Now since mdsdir and root scrub run under single header, there is
no need for this anymore, thus removing this redundant code.

Fixes: https://tracker.ceph.com/issues/59350
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit 2f06fee)
Previously when ~mdsdir was scrubbed at CephFS root, it's header
was kept empty, thus it became necessary to not dump it's values
for 'scrub status'. Now since both the scrubs(~mdsdir and root)
run under the same header, this code is no more needed.

Fixes: https://tracker.ceph.com/issues/59350
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit 548af10)
Code has been changed, in order to scrub ~mdsdir at root,
recursive flag also needs to be provided along with
scrub_mdsdir.

Fixes: https://tracker.ceph.com/issues/59350
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit e40ca40)
for mdsdir scrub at CephFS root.

Fixes: https://tracker.ceph.com/issues/59350
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit 93dfc11)
@dparmar18
Copy link
Contributor Author

Put on hold, #50813 (comment)

Comment addressed in #51539, added its commits

@dparmar18 dparmar18 changed the title [DNM] quincy: mds: fix stray evaluation using scrub and introduce new option quincy: mds: fix stray evaluation using scrub and introduce new option Jul 18, 2023
@dparmar18 dparmar18 added needs-qa and removed DNM labels Jul 18, 2023
@vshankar
Copy link
Contributor

jenkins retest this please

@batrick
Copy link
Member

batrick commented Aug 17, 2023

jenkins test api

@batrick
Copy link
Member

batrick commented Aug 17, 2023

jenkins test make check

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuriw yuriw merged commit e418a8c into ceph:quincy Aug 17, 2023
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants