qa: switch to use the merge fragment for fscrypt #50728

Merged
merged 3 commits into from Jun 12, 2023

Conversation

lxbsz
Member

@lxbsz lxbsz commented Mar 29, 2023

PR #48183 added support for the merge fragment. This PR switches to using that fragment to simplify the code.


@lxbsz lxbsz added the needs-qa label Mar 29, 2023
@lxbsz lxbsz requested a review from a team March 29, 2023 05:10
@github-actions github-actions bot added cephfs Ceph File System tests labels Mar 29, 2023
@lxbsz
Member Author

lxbsz commented Mar 29, 2023

All the fscrypt qa tests are running.

Will use the postmerge fragment to check this.

Fixes: https://tracker.ceph.com/issues/59195
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Currently only the upstream kclient supports fscrypt feature.

Fixes: https://tracker.ceph.com/issues/59195
Signed-off-by: Xiubo Li <xiubli@redhat.com>
@lxbsz
Member Author

lxbsz commented Mar 29, 2023

Only updated the commit comments.

Comment on lines +3 to +5
# Once can we make sure the distro kernels have included the fscrypt feature
# or the ceph-fuse have supported the fscrypt feature we can remove this
# restriction.
Contributor

We will have to keep track of whether and when this feature is added to distro kernels and/or ceph-fuse, right?

Member Author

@lxbsz lxbsz Mar 29, 2023

I added this comment as Patrick mentioned in the previous PR, to make it clear why we add this restriction here and under which conditions we can remove it. I can't foresee when that will be, though.

Contributor

Right, I mean that once the distro kernels or ceph-fuse are capable of fscrypt, we need to change it here and also here: https://github.com/ceph/ceph/pull/48183/files#diff-63be9c532a24d81b6391bb01e3d994eef8eb6a948154c227164063e6acd592f7R3-R7. So it would be better to note it down somewhere, or better still, to link these two in the fscrypt feature PR.

Member Author

I just linked this to the fscrypt feature PR #50728. Thanks.
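
For readers following the restriction discussed in this thread, here is a minimal Python sketch of the rule it describes: keep a job only when it runs the kernel client on an upstream kernel. This is only an illustration of the logic, not the actual fragment from the PR and not teuthology's API. The dictionary keys `client_type` and `kernel_flavor` and both function names are hypothetical.

```python
# Illustrative only: the key names below are assumptions made for this
# sketch, not the schema used by teuthology or by the Ceph qa suite.

def supports_fscrypt(job):
    """Return True if the job uses an upstream kernel client.

    Per the discussion, only the upstream kclient currently supports
    fscrypt; distro kernels and ceph-fuse do not (yet).
    """
    return (
        job.get("client_type") == "kclient"
        and job.get("kernel_flavor") == "upstream"
    )


def filter_jobs(jobs):
    """Drop every job that cannot exercise fscrypt."""
    return [job for job in jobs if supports_fscrypt(job)]


jobs = [
    {"name": "a", "client_type": "kclient", "kernel_flavor": "upstream"},
    {"name": "b", "client_type": "kclient", "kernel_flavor": "distro"},
    {"name": "c", "client_type": "fuse"},
]
kept = [job["name"] for job in filter_jobs(jobs)]
```

If distro kernels or ceph-fuse gain fscrypt support, the condition in `supports_fscrypt` is the single place such a sketch would need to be relaxed, which is the point of the comment under review.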

@dparmar18
Contributor

there's one job running close to 5 hours - https://pulpito.ceph.com/xiubli-2023-03-29_05:17:18-fs:fscrypt-wip-lxb-20230316-2212-unlink-revert-distro-default-smithi/7224905/, I see a lot of failures in generic/XXX that are not visible in other jobs.

@lxbsz
Member Author

lxbsz commented Mar 29, 2023

there's one job running close to 5 hours - https://pulpito.ceph.com/xiubli-2023-03-29_05:17:18-fs:fscrypt-wip-lxb-20230316-2212-unlink-revert-distro-default-smithi/7224905/, I see a lot of failures in generic/XXX that are not visible in other jobs.

Yeah, I have checked it. It's because osd6 died for some reason. I need to wait for the job to be aborted and then check the OSD logs:

[ 2045.692294] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.699532] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.706782] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.714312] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.721540] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.728845] libceph: osd6 (1)172.21.15.181:6825 socket error on read
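
When a console log contains many lines like the ones above, a small script can show which OSD the client keeps failing to reach. The following is a rough sketch that assumes the log lines have exactly the shape shown here. The function name `count_socket_errors` is made up for this example.

```python
import re
from collections import Counter

# Matches kernel log lines such as:
# [ 2045.692294] libceph: osd6 (1)172.21.15.181:6825 socket error on read
SOCKET_ERROR = re.compile(
    r"\[\s*\d+\.\d+\]\s+libceph:\s+(?P<osd>osd\d+)\s+\(\d+\)(?P<addr>\S+)"
    r"\s+socket error on (?P<op>read|write)"
)


def count_socket_errors(lines):
    """Count libceph socket errors per (osd, address) pair."""
    counts = Counter()
    for line in lines:
        # search() ignores trailing debris such as "^M" after the message.
        match = SOCKET_ERROR.search(line)
        if match:
            counts[(match["osd"], match["addr"])] += 1
    return counts


sample = [
    "[ 2045.692294] libceph: osd6 (1)172.21.15.181:6825 socket error on read^M",
    "[ 2045.699532] libceph: osd6 (1)172.21.15.181:6825 socket error on read^M",
    "[ 2045.700000] some unrelated kernel message",
]
errors = count_socket_errors(sample)
```

A single OSD dominating the counts, as osd6 does here, suggests looking at that daemon's own log rather than at the client.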

@lxbsz
Member Author

lxbsz commented Mar 29, 2023

I ran the qa tests again. This time the fscrypt-common test failed, but the failure came from:

generic/020       - output mismatch (see /tmp/tmp.jrgZljkscrxfstests-dev/results//generic/020.out.bad)
    --- tests/generic/020.out   2023-03-29 10:57:23.284330119 +0000
    +++ /tmp/tmp.jrgZljkscrxfstests-dev/results//generic/020.out.bad    2023-03-29 11:08:24.814914036 +0000
    @@ -47,9 +47,13 @@
     user.snrub="fish2\012"

     *** really long value
    -0000000 00
    -*
    -ATTRSIZE
    +attr_set: No space left on device
    +Could not set "long_attr" for <TESTFILE>
    +attr_get: No data available
    +Could not get "long_attr" for <TESTFILE>
    +0000000
    +attr_remove: No data available
    +Could not remove "long_attr" for <TESTFILE>
     *** set/get/remove really long names (expect failure)
     attr_set: Invalid argument
     Could not set "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" for <TESTFILE>

This is because Ubuntu doesn't support SELinux:

2023-03-29T10:44:58.481 INFO:teuthology.run_tasks:Running task selinux...
2023-03-29T10:44:58.530 DEBUG:teuthology.task.selinux:Excluding smithi043: OS 'ubuntu' does not support SELinux
2023-03-29T10:44:58.530 DEBUG:teuthology.task.selinux:Excluding smithi161: OS 'ubuntu' does not support SELinux
2023-03-29T10:44:58.530 DEBUG:teuthology.task.selinux:Excluding smithi174: OS 'ubuntu' does not support SELinux
2023-03-29T10:44:58.531 DEBUG:teuthology.task.selinux:Getting current SELinux state
2023-03-29T10:44:58.531 DEBUG:teuthology.task.selinux:Existing SELinux modes: {}

This is a known bug, and the patch that fixes it is already in the xfstests-dev for-next branch.
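
To see at a glance which xfstests cases failed with an output mismatch in a long teuthology log, the summary lines can be extracted with a short script. This is a sketch that assumes the line format shown in the excerpt above and handles only the "output mismatch" case. The helper name `find_output_mismatches` is hypothetical.

```python
import re

# Matches xfstests summary lines such as:
# generic/020       - output mismatch (see /tmp/.../generic/020.out.bad)
MISMATCH = re.compile(
    r"^\s*(?P<test>[\w-]+/\d+)\s+-\s+output mismatch \(see (?P<path>[^)]+)\)"
)


def find_output_mismatches(lines):
    """Return (test name, path to .out.bad file) for each mismatch line."""
    found = []
    for line in lines:
        match = MISMATCH.match(line)
        if match:
            found.append((match["test"], match["path"]))
    return found


sample = [
    "generic/020       - output mismatch (see /tmp/tmp.jrgZljkscrxfstests-dev/results//generic/020.out.bad)",
    "    --- tests/generic/020.out   2023-03-29 10:57:23.284330119 +0000",
]
mismatches = find_output_mismatches(sample)
```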

@lxbsz
Member Author

lxbsz commented Mar 29, 2023

there's one job running close to 5 hours - https://pulpito.ceph.com/xiubli-2023-03-29_05:17:18-fs:fscrypt-wip-lxb-20230316-2212-unlink-revert-distro-default-smithi/7224905/, I see a lot of failures in generic/XXX that are not visible in other jobs.

Yeah, I have checked it. It's because osd6 died for some reason. I need to wait for the job to be aborted and then check the OSD logs:

[ 2045.692294] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.699532] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.706782] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.714312] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.721540] libceph: osd6 (1)172.21.15.181:6825 socket error on read
[ 2045.728845] libceph: osd6 (1)172.21.15.181:6825 socket error on read

This is the same issue with #50728 (comment).

Contributor

@rishabh-d-dave rishabh-d-dave left a comment

LGTM. I'll give this a second look and approve it after some time.

@dparmar18
Contributor

[...]

okay so once your fix has been merged, it should go fine right

@lxbsz
Member Author

lxbsz commented Mar 30, 2023

[...]

okay so once your fix has been merged, it should go fine right

Yeah, once that fix is applied to the xfstests-dev repo's master branch it will be fetched to git.ceph.com and then this test failure will disappear.

@lxbsz
Member Author

lxbsz commented Mar 30, 2023

[...]

okay so once your fix has been merged, it should go fine right

Yeah, once that fix is applied to the xfstests-dev repo's master branch it will be fetched to git.ceph.com and then this test failure will disappear.

Or, if the job is scheduled on RHEL/CentOS, the test will pass too. It only fails on Ubuntu.

@dparmar18
Contributor

[...]

okay so once your fix has been merged, it should go fine right

Yeah, once that fix is applied to the xfstests-dev repo's master branch it will be fetched to git.ceph.com and then this test failure will disappear.

Or, if the job is scheduled on RHEL/CentOS, the test will pass too. It only fails on Ubuntu.

okay but when running full suite, it will show failures for Ubuntu, so is it feasible to merge this or wait till your fix is merged in xfstests-dev master repo? BTW the changes look good

@lxbsz
Member Author

lxbsz commented Mar 31, 2023

[...]

Or, if the job is scheduled on RHEL/CentOS, the test will pass too. It only fails on Ubuntu.

okay but when running full suite, it will show failures for Ubuntu, so is it feasible to merge this or wait till your fix is merged in xfstests-dev master repo? BTW the changes look good

This failure is not related to this PR.

@dparmar18
Contributor

[...]

Or, if the job is scheduled on RHEL/CentOS, the test will pass too. It only fails on Ubuntu.

okay but when running full suite, it will show failures for Ubuntu, so is it feasible to merge this or wait till your fix is merged in xfstests-dev master repo? BTW the changes look good

This failure is not related to this PR.

Right, I was just concerned about whether a patch whose test runs fail, even for unrelated reasons, should be merged straight away (if the code looks good) or should wait until the unrelated issue is patched. Anyway, the changes look good to me.

@rishabh-d-dave
Contributor

@lxbsz This is ready for QA, should I pick it up?

@lxbsz
Member Author

lxbsz commented Apr 3, 2023

@lxbsz This is ready for QA, should I pick it up?

Sure, please. Thanks.

@rishabh-d-dave rishabh-d-dave added the wip-rishabh-testing Rishabh's testing label label Apr 4, 2023
@lxbsz
Member Author

lxbsz commented May 15, 2023

@lxbsz This is ready for QA, should I pick it up?

Sure, please. Thanks.

@rishabh-d-dave BTW, what's the status of the tests? The xfstests-dev fix has already been applied. Thanks

@lxbsz
Member Author

lxbsz commented May 15, 2023

jenkins test make check arm64

@lxbsz
Member Author

lxbsz commented May 15, 2023

I will run all the relevant tests myself today.

@rishabh-d-dave
Contributor

@lxbsz This is ready for QA, should I pick it up?

Sure, please. Thanks.

@rishabh-d-dave BTW, what's the status of the tests? The xfstests-dev fix has already been applied. Thanks

I initiated testing a couple of times but there were lots of infra-related failures in both runs. I'll initiate it a third time today.

@lxbsz
Member Author

lxbsz commented May 15, 2023

@lxbsz This is ready for QA, should I pick it up?

Sure, please. Thanks.

@rishabh-d-dave BTW, what's the status of the tests? The xfstests-dev fix has already been applied. Thanks

I initiated testing a couple of times but there were lots of infra-related failures in both runs. I'll initiate it a third time today.

Sure, thanks.

@lxbsz
Member Author

lxbsz commented Jun 12, 2023

@rishabh-d-dave @mchangir

I have run the tests and they passed:

https://pulpito.ceph.com/xiubli-2023-06-09_04:16:04-fs:functional-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/
https://pulpito.ceph.com/xiubli-2023-06-09_03:59:14-fs:fscrypt-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/

If there is no objection I will merge this. The kclient fscrypt patches need to run these tests, and we need to push those patches forward.

Thanks

Contributor

@mchangir mchangir left a comment

LGTM

@lxbsz
Member Author

lxbsz commented Jun 12, 2023

I talked with @rishabh-d-dave and this is fine with him. Since we need to push the fscrypt patches in kclient and the tests passed, I'm merging it. Thanks!

@lxbsz lxbsz merged commit 4a60f67 into ceph:main Jun 12, 2023
4 checks passed