Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools/cephfs_mirror: enable multiple peers #52456

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

dparmar18
Copy link
Contributor

@dparmar18 dparmar18 commented Jul 14, 2023

Fixes: https://tracker.ceph.com/issues/61978

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@github-actions github-actions bot added cephfs Ceph File System tests labels Jul 14, 2023
@dparmar18 dparmar18 changed the title qa: test multiple peers tools/cephfs_mirror: enable multiple peers Jul 15, 2023
@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

Fixes: https://tracker.ceph.com/issues/61978
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
@dparmar18 dparmar18 marked this pull request as ready for review July 16, 2023 13:19
@dparmar18 dparmar18 requested review from a team and vshankar July 16, 2023 13:19
@dparmar18
Copy link
Contributor Author

@dparmar18
Copy link
Contributor Author

@vshankar There's just one test case added currently and I'm sure more needs to be added; can you help identify what all should I test, some test cases that I identified based on existing tests could be:

  • Test disabling mirroring with multiple peers
  • Blocklist mirror daemon; restart and make sure peers are intact
  • Test mirror daemon status with multiple peers
  • Incremental sync
  • With parent snapshot/purged snapshot

Do let me know if the above list needs any addition/removal

@@ -1305,3 +1319,23 @@ def test_local_and_remote_dir_root_mode(self):
self.disable_mirroring(self.primary_fs_name, self.primary_fs_id)
self.mount_a.run_shell(["rmdir", "l1/.snap/snap0"])
self.mount_a.run_shell(["rmdir", "l1"])

def test_mirror_multiple_peers(self):
Copy link
Contributor

@vshankar vshankar Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also check if the snapshots have been synchronized to all the peers.

I suggest, creating a separate class for

  • fan-out peer setup (site-a->site-b, site-a->site-c, site-a->site-d)
  • cascaded setup (site-a->site-b->site-c->site-d)

Mostly, you'd need to verify if the snapshots get synchronized to all peers - this would involve checks via peer status command.

@dparmar18
Copy link
Contributor Author

@vshankar local test case failure(unable to run admin-daemon cmd) https://termbin.com/9j8zf

@dparmar18
Copy link
Contributor Author

other admin-daemon cmds not working https://termbin.com/wuz5

I know these cmds are incomplete and ought to fail but it should've not said this right, should've threw some suggestions(yesterday worked fine)

@dparmar18 dparmar18 force-pushed the enable_multipeer branch 2 times, most recently from 9c61fc0 to 2e0d510 Compare July 24, 2023 11:39
@vshankar
Copy link
Contributor

@dparmar18 In my list. On it soon.

@dparmar18
Copy link
Contributor Author

@dparmar18 In my list. On it soon.

@vshankar ah i was talking about this one #52524. This PR is still wip

@vshankar
Copy link
Contributor

@dparmar18 In my list. On it soon.

@vshankar ah i was talking about this one #52524. This PR is still wip

Ah ok. Best is to tag me on the PR mentioning that it's ready for review.

@dparmar18
Copy link
Contributor Author

@vshankar the test_mirroring.py needs further more changes to accommodate the cascaded setup, it includes touching class TestMirroing's tests too. I'll need to add those changes here I'll modify the helper methods and doing this will require those test cases to be modified, i hope thats fine right?

@vshankar
Copy link
Contributor

@vshankar the test_mirroring.py needs further more changes to accommodate the cascaded setup, it includes touching class TestMirroing's tests too. I'll need to add those changes here I'll modify the helper methods and doing this will require those test cases to be modified, i hope thats fine right?

Touching which test cases? The one's in TestMirroring that exists now? You can add the fan-out/cascaded tests in a separate class, since the only test that would make sense there is to validate that the snapshots synchronize to all the peers.

@dparmar18
Copy link
Contributor Author

dparmar18 commented Jul 27, 2023

@vshankar the test_mirroring.py needs further more changes to accommodate the cascaded setup, it includes touching class TestMirroing's tests too. I'll need to add those changes here I'll modify the helper methods and doing this will require those test cases to be modified, i hope thats fine right?

Touching which test cases? The one's in TestMirroring that exists now? You can add the fan-out/cascaded tests in a separate class, since the only test that would make sense there is to validate that the snapshots synchronize to all the peers.

i have added a new class for helpers(because the classes i wrote had to use helpers present in TestMirroring); TestMirroring and other two classes(that i added for testing fan out and cascaded setup) now use those helpers. These helpers have some values hardcoded like fs_map = status.get_fsmap_byname(self.primary_fs_name) in get_peer_uuid() which doesn't work in cascaded setup as i might want to get the fsmap of fs other than primary fs. This would mean I need to make the helper get_peer_uuid() accept a fs, this would require changes to all the test cases and other helpers using get_peer_uuid()

Fixes: https://tracker.ceph.com/issues/61978
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
@dparmar18 dparmar18 force-pushed the enable_multipeer branch 2 times, most recently from c32dc5e to eab5762 Compare July 31, 2023 09:06
@dparmar18
Copy link
Contributor Author

class TestMirroring passed with my modifications - http://pulpito.front.sepia.ceph.com/dparmar-2023-07-28_10:58:32-fs:mirror-enable_multipeer-distro-default-smithi/

Please nudge when ready for review.

test classes are ready but you mentioned on IRC last week that we need more YAMLs for these classes to only run a subset of tests? Should I put them in qa/suites/fs/mirror/tasks/ as mirror_multiple_peers_fanout.yaml and mirror_multiple_peers_cascaded.yaml ?

@vshankar
Copy link
Contributor

class TestMirroring passed with my modifications - http://pulpito.front.sepia.ceph.com/dparmar-2023-07-28_10:58:32-fs:mirror-enable_multipeer-distro-default-smithi/

Please nudge when ready for review.

test classes are ready but you mentioned on IRC last week that we need more YAMLs for these classes to only run a subset of tests? Should I put them in qa/suites/fs/mirror/tasks/ as mirror_multiple_peers_fanout.yaml and mirror_multiple_peers_cascaded.yaml ?

Yeh - mirror_fanout.yaml and mirror_cascaded.yaml sounds a bit more appropriate.

Fixes: https://tracker.ceph.com/issues/61978
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
Fixes: https://tracker.ceph.com/issues/61978
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
@dparmar18 dparmar18 marked this pull request as ready for review August 2, 2023 02:54
@dparmar18
Copy link
Contributor Author

@dparmar18 dparmar18 requested a review from vshankar August 2, 2023 05:15
@dparmar18
Copy link
Contributor Author

@vshankar there are a lot of changes made to the existing code (and there are some more plans in future to the qa code), if you need help, we can go through the changes together. Do let me know


class MirroringHelpers(CephFSTestCase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is an helper class, I believe this need not be subclassed from CephFSTestCase, yes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'a a bit complicated, the actual plan is, YES make this helper class an actual helper class but it requires a lot of changes that are unrelated to this PR(making changes to all other test cases). It's a bit hard to explain you in comment, if possible do let know if/when you can join for a call or maybe discuss this post standup tomorrow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but hard to explain why this needs to be subclassed from CephFSTestCase? Could you explain a bit and I'll try my best to understand.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it's not just me who is required to review PRs - so taking time to explain here is going to benefit everyone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the thing is that the change to enable multiple peers required me to create a helper class since the existing helpers in the class TestMirroring has hard coded values that only applies to single peer but just adding those helpers into a class and use it into test classes directly is not enough because the helpers that are used in the existing class TestMirroring (that are also used in the classes I wrote) now need modifications. This forced me to make changes in this PR that are not related (to the tracker). So in order to make minimal unrelated changes, some helpers still depends on the CephFSTestCase, thus you see it being inherited.

I'm planning to make it a proper helper class but I'm not sure which way to go with; either I can make the changes here in this PR altogether or in a new PR. I'm open to suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the thing is that the change to enable multiple peers required me to create a helper class since the existing helpers in the class TestMirroring has hard coded values that only applies to single peer but just adding those helpers into a class and use it into test classes directly is not enough because the helpers that are used in the existing class TestMirroring (that are also used in the classes I wrote) now need modifications. This forced me to make changes in this PR that are not related (to the tracker). So in order to make minimal unrelated changes, some helpers still depends on the CephFSTestCase, thus you see it being inherited.

I'm planning to make it a proper helper class but I'm not sure which way to go with; either I can make the changes here in this PR altogether or in a new PR. I'm open to suggestions.

IMO having is as a separate (standalone) helper would be the better. I understand that its more work that's the current implementation, but it way better in maintenance in the longer run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so you want me to do it in this PR itself right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@@ -242,6 +258,24 @@ def get_mirror_daemon_status(self):
log.debug(f'status: {status}')
return status


class TestMirroring(MirroringHelpers):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, then this can be subclassed from MirroringHelpers and CephFSTestCase.

@vshankar
Copy link
Contributor

jenkins test make check

@dparmar18 dparmar18 requested review from vshankar and a team August 28, 2023 11:11
Copy link

github-actions bot commented Nov 7, 2023

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jan 16, 2024
Copy link

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Feb 15, 2024
@dparmar18 dparmar18 reopened this Feb 15, 2024
@github-actions github-actions bot removed the stale label Feb 15, 2024
@vshankar
Copy link
Contributor

As discussed, @dparmar18 will revisit this once other higher priority items are taken care.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants