Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DNM] multimds suite sync with fs #1114

Closed
wants to merge 10 commits into from
Closed

[DNM] multimds suite sync with fs #1114

wants to merge 10 commits into from

Conversation

batrick
Copy link
Member

@batrick batrick commented Aug 10, 2016

This PR does two things:

  • Simplifies fs sub-suites by using symlinks to common configurations wherever possible (making unique aspects of a sub-suite stand out and increasing code reuse).
  • Synchronizes the multimds suite with the fs suite (again, lots of symlinks). This adds several sub-suites/tasks that were missing in multimds.

I'm currently running a test on this branch here: http://pulpito.ceph.com/pdonnell-2016-08-10_03:03:20-multimds-master---basic-mira

@batrick batrick added the cephfs label Aug 10, 2016
@batrick
Copy link
Member Author

batrick commented Aug 22, 2016

Here's a run of the fs suite I did over the weekend:

http://pulpito.ceph.com/pdonnell-2016-08-19_22:12:32-fs-master-testing-basic-mira/

Found 7 failures... I'm looking into if those are new problems, caused by the reorg, or legit issues found due to the reorg.

@batrick
Copy link
Member Author

batrick commented Aug 22, 2016

Mostly the same failures as this nightly on master:

http://pulpito.ceph.com/teuthology-2016-08-13_17:15:02-fs-master---basic-smithi/

@gregsfortytwo
Copy link
Member

Why commit "suites/fs: unify/reuse common cluster layout"? I mean it makes the symlinks shorter but most of our clusters are common across suites and this one was a modified duplicate of one of the others in its original folder; moving it down makes them harder to find for global changes.

@@ -0,0 +1 @@
../../basic/mount/fuse.yaml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really comfortable referencing down into parallel trees -- people generally aren't prepared for that and we'll miss it in future reviews. What's the benefit of sharing these besides removing a single task line that actually does vary across some tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this particular case, it's just making the sub-suites consistent. Some sub-suites had the ceph-fuse task specified within a task (e.g. suites/fs/traceless/tasks/cfuse_workunit_suites_dbench.yaml) and others at the top-level (suites/fs/snaps/mount/ceph-fuse.yaml).

@gregsfortytwo
Copy link
Member

Okay, so overall you're trying to remove duplicates between the multimds and other suites and that makes sense. But I'm worried we're going to find that people change the standard fs suites in ways that bust up multi-mds without realizing it. If nothing else, we should add a doc explaining what's copied and why, and in the future we can pull stuff out into a common folder as needed. Ideally I'd like to see a tree of what stuff we expect to share and what we expect to be different, but I'm not sure if we have enough models right now to do it right and it may not be worth the time to start off with.

@gregsfortytwo
Copy link
Member

Oh, and are there any test runs of this on suites/fs and suites/multimds? :)

@gregsfortytwo
Copy link
Member

ping @batrick
@jcsp, do you have any thoughts on how we organize this?

@batrick
Copy link
Member Author

batrick commented Sep 1, 2016

Why commit "suites/fs: unify/reuse common cluster layout"? I mean it makes the symlinks shorter but most of our clusters are common across suites and this one was a modified duplicate of one of the others in its original folder; moving it down makes them harder to find for global changes.

The intent is just to make it clear if a sub-suite is doing something different from basic.

moving it down makes them harder to find for global changes.

I didn't get this part. Which global changes?

@batrick
Copy link
Member Author

batrick commented Sep 1, 2016

Okay, so overall you're trying to remove duplicates between the multimds and other suites and that makes sense. But I'm worried we're going to find that people change the standard fs suites in ways that bust up multi-mds without realizing it. If nothing else, we should add a doc explaining what's copied and why, and in the future we can pull stuff out into a common folder as needed. Ideally I'd like to see a tree of what stuff we expect to share and what we expect to be different, but I'm not sure if we have enough models right now to do it right and it may not be worth the time to start off with.

Well, most of the work is in identifying the differences which I've already done. I don't mind moving the common stuff to a shared directory so it's more obvious.

@batrick
Copy link
Member Author

batrick commented Sep 1, 2016

@jcsp
Copy link
Contributor

jcsp commented Sep 6, 2016

There's a failure in test_python.sh in that fs suite run. Looks like that test was run without the ceph--fuse task previously, and it's failing when run with it -- probably need to specialise that out again

@jcsp
Copy link
Contributor

jcsp commented Sep 6, 2016

Weirdly this is not showing up as having a merge conflict with the btrfs->xfs change that I just made in master, so I guess it needs updating by hand?

@jcsp
Copy link
Contributor

jcsp commented Sep 6, 2016

Please can you remove the multimds/recovery piece from here, the wip-kclient-python branch will do that (with the associated fixes that are required for those tests to actually work)

Oops, I was thinking of kclient/recovery. However, I think transplanting into multimds/recovery probably isn't useful: these python tests all run with an explicit number of MDSs (some of them already use multiple daemons in the fs suite), so running them again in the multimds suite probably isn't helpful.

@jcsp
Copy link
Contributor

jcsp commented Sep 6, 2016

From my point of view the advantage of the symlinks (as opposed to duplicating yaml fragments) is not so much the de-duplication as it is to make obvious which bits are like the rest of the tests, and which bits differ in a particular suite.

I am interested in the idea of adding a fs_common or similar folder (peer to fs, kcephfs, multimds) containing the common bits, to avoid quite so many places reaching into fs/basic (will be painful if we ever needed to modify fs/basic).

(Background: I'm not exactly in love with the whole YAML tree mechanism for specifying test permutations to begin with, so while this still isn't a super-clean thing overall, I do think it's probably an improvement and makes the tests easier to read by highlighting which bits differ)

batrick and others added 10 commits September 29, 2016 16:14
Added in 7ae1aef to help resolve http://tracker.ceph.com/issues/1737.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Goal is to reduce arbitrary differences between fs suites.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Goal is to reduce arbitrary differences between fs suites.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Goal is to reduce arbitrary differences between fs suites.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Goal is to reduce arbitrary differences between fs suites.

Note:
  o fs/{multiclient,multifs,recovery,snaps,standbyreplay} now use basic/overrides/whitelist_wrongly_marked_down.yaml.
  o fs/{snaps,standbyreplay} now use basic/overrides/debug.yaml.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Goal is to reduce arbitrary differences between fs suites.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Goal is to reduce arbitrary differences between fs suites.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Notably:

o Turn on directory fragmentation.
o Add several tests from fs/basic/tasks to multimds/basic.
o Remove libcephfs as fs/basic/tasks already contain
  multimds/basic/tasks.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This mirrors each suites/fs sub-suite except with the basic multimds
cluster configuration.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
libcephfs is sensitive to having the fuse client mounted, so disable for
these tests.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
@batrick batrick closed this Nov 23, 2016
@batrick batrick deleted the multimds-merge branch November 23, 2016 00:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants