
Disable uuid checks on XFS #788

Merged · 1 commit merged Jun 16, 2021

Conversation

jsafrane
Contributor

@jsafrane jsafrane commented Jun 9, 2021

What type of PR is this?
/kind bug

What this PR does / why we need it:
By default, XFS refuses to mount two devices whose filesystems share the same UUID, so it is not possible to mount a volume together with its restored snapshot.

This PR disables the UUID check by passing a mount option to XFS.

Since kubernetes/kubernetes#102538 was merged, all pull-kubernetes-e2e-gce-storage-slow tests fail in Kubernetes, e.g. https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/102690/pull-kubernetes-e2e-gce-storage-slow/1402241210831081472/

Does this PR introduce a user-facing change?:

It is now possible to mount a volume with XFS filesystem and its restored snapshot.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 9, 2021
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jun 9, 2021
@mattcary
Contributor

mattcary commented Jun 9, 2021

/assign
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 9, 2021
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2021
@jsafrane
Contributor Author

Fixed NodePublish. Overwriting -o bind in the mount options was not a good idea :-).

@jsafrane
Contributor Author

/retest
The failures look unrelated.

@mattcary
Contributor

/lgtm
/retest

I agree those look like flakes, created #789.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2021
@jsafrane
Contributor Author

@mattcary, can you please also /approve? Or what are the rules here?

@mattcary
Contributor

/approve

It doesn't look like you need the approval, but I can never figure out what GitHub wants.

Unfortunately it looks like the test flake is the problem. We're looking into that.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jsafrane, mattcary

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 11, 2021
@mattcary
Contributor

/retest

It seems like the flake is pretty severe but ATM not obvious what is going on.

@jsafrane
Contributor Author

/retest

1 similar comment
@ehashman

/retest

@mattcary
Contributor

mattcary commented Jun 14, 2021 via email

@saikat-royc
Contributor

By reducing the parallelism of the tests in #790 , all the tests passed except "Kubernetes e2e suite: External Storage [Driver: csi-gcepd-sc-standard] [Testpattern: Dynamic PV (xfs)][Slow] multiVolume [Slow] should concurrently access the volume and restored snapshot from pods on the same node [LinuxOnly][Feature:VolumeSnapshotDataSource][Feature:VolumeSourceXFS]", which needs the fix in #788 . So @mattcary , @jsafrane can you please make a change similar to #790 in this PR to unblock this.

@mattcary
Contributor

mattcary commented Jun 15, 2021 via email

@mattcary
Contributor

That worked, feel the admin power coursing through my fingers :-)

/test pull-gcp-compute-persistent-disk-csi-driver-kubernetes-integration

@saikat-royc
Contributor

I think I can force-merge #790, that will be easier.


Thanks that works too!

@jsafrane
Contributor Author

/retest
?

@mattcary
Contributor

Sigh
@saikat-royc, parallel=4 didn't fix this (I confirmed in the test logs that it actually ran with parallel=4).

@saikat-royc
Contributor

/retest

@k8s-ci-robot
Contributor

@jsafrane: The following test failed, say /retest to rerun all failed tests:

Test name: pull-gcp-compute-persistent-disk-csi-driver-kubernetes-integration
Commit: b1e0f74
Rerun command: /test pull-gcp-compute-persistent-disk-csi-driver-kubernetes-integration

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@saikat-royc
Contributor

My suggestion would be to merge this PR, and continue investigation as part of #789.

The internal test grids (https://testgrid.corp.google.com/gcp-compute-persistent-disk-csi-driver#pd-master-sidecars-latest-k8s-master-on-gce) show significantly reduced flakes, except for one test failure that will be fixed by this PR.
Tagging @msau42 since @mattcary is OOO.

@saikat-royc
Contributor

/retest

@msau42 msau42 merged commit 1067990 into kubernetes-sigs:master Jun 16, 2021
mattcary added a commit that referenced this pull request Sep 10, 2021

6 participants