feat: volume clone from source volume #426

Merged: k8s-ci-robot merged 3 commits into kubernetes-csi:master from the volume_to_volume_copy branch on Mar 15, 2023

Conversation

Member

@wozniakjan wozniakjan commented Mar 9, 2023

What type of PR is this?
/kind feature

What this PR does / why we need it:
This PR introduces a first step towards supporting snapshots: cloning a volume from another volume. It is implemented as outlined in #30 (comment) and is just an invocation of cp -a [src]/. [dst], which recursively copies the content of the source volume to the destination volume while preserving soft and hard links.
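
As a rough illustration of the invocation described above, here is a minimal Go sketch. The helper name `cloneCommand` and the example paths are hypothetical and not taken from the PR; the sketch only builds the command and prints its arguments, so it has no side effects.

```go
package main

import (
	"fmt"
	"os/exec"
)

// cloneCommand is a hypothetical helper (not the PR's code) that builds the
// `cp -a <src>/. <dst>` invocation. With an existing dst directory, the
// trailing "/." makes cp copy the contents of src (dotfiles included) into
// dst instead of creating dst/<basename of src>.
func cloneCommand(srcPath, dstPath string) *exec.Cmd {
	return exec.Command("cp", "-a", srcPath+"/.", dstPath)
}

func main() {
	// Example paths are placeholders for illustration only.
	cmd := cloneCommand("/mnt/src-volume", "/mnt/dst-volume")
	// Print the argument vector rather than running the copy.
	fmt.Println(cmd.Args)
}
```

A real caller would typically run the command with `CombinedOutput()` and include the output in the returned error, so that a failed copy is visible in the logs.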

I would like to first get a consensus on the design decisions and then I can follow up with snapshots and volume copy from snapshots.

Which issue(s) this PR fixes:

first step towards implementing #31

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

feat: volume clone from source volume

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 9, 2023
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. labels Mar 9, 2023
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 9, 2023
var volCap *csi.VolumeCapability
if len(req.GetVolumeCapabilities()) > 0 {
	volCap = req.GetVolumeCapabilities()[0]
}
Member Author

I am not sure whether the request's volume capabilities apply to the internal mounts of both the source and the destination volume. I took this from here:

var volCap *csi.VolumeCapability
if len(req.GetVolumeCapabilities()) > 0 {
	volCap = req.GetVolumeCapabilities()[0]
}

@wozniakjan wozniakjan force-pushed the volume_to_volume_copy branch 2 times, most recently from 75ab0ff to 52620bf Compare March 9, 2023 16:13
@coveralls

coveralls commented Mar 9, 2023

Pull Request Test Coverage Report for Build 4417866944

  • 3 of 48 (6.25%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-4.05%) to 73.278%

Changes Missing Coverage       Covered Lines   Changed/Added Lines   %
pkg/nfs/controllerserver.go    2               47                    4.26%

Totals
Change from base Build 4370941627: -4.05%
Covered Lines: 617
Relevant Lines: 842

💛 - Coveralls

@wozniakjan wozniakjan marked this pull request as draft March 9, 2023 16:34
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 9, 2023
Member

@andyzhangx andyzhangx left a comment

This PR LGTM. Could you also write an e2e test for volume cloning? You could refer to https://github.com/kubernetes-sigs/azuredisk-csi-driver/blob/master/test/e2e/dynamic_provisioning_test.go#L539-L570

@andyzhangx
Member

there is an easier way to add a volume cloning test: just add pvcDataSource: true to the capabilities in the external e2e test:

Capabilities:
  persistence: true
  exec: true
  multipods: true
  RWX: true
  fsGroup: true

@wozniakjan wozniakjan marked this pull request as ready for review March 13, 2023 10:22
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 13, 2023
@wozniakjan
Member Author

wozniakjan commented Mar 13, 2023

/retest

maybe someone else should trigger @k8s-ci-robot, all of these tests currently fail with Unauthorized

Pod can not be created: create pod test-pod ... e5ab5e009e in cluster default: Unauthorized BaseSHA:af29ce4d9ff5396cc22cb6c1e18bddcfab23338f

@andyzhangx
Member

/retest

@andyzhangx
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Mar 13, 2023
@andyzhangx
Member

/retest

@wozniakjan
Member Author

/retest
maybe yesterday was just flaky?

@wozniakjan
Member Author

the failure in pull-csi-driver-nfs-sanity relates to the changes introduced here:

E0314 09:08:26.406775    7781 utils.go:93] GRPC error: rpc error: code = NotFound desc = could not split fake-vol-id-f600e740-a into server, baseDir and subDir with separator(/)

the volumeID the tests are trying to use was fake-vol-id-f600e740-a, which doesn't comply with the expected volumeID format {nfs-server-address}#{sub-dir-name}#{share-name}. I will see if there is a way to bend the tests to comply with our expectations (or maybe write a custom copy volume test as described in #426 (review))
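
To illustrate why an ID such as fake-vol-id-f600e740-a is rejected, here is a minimal sketch of a `#`-separated volume ID check. The function `parseVolumeID` is hypothetical and assumes only that a valid ID has at least three `#`-separated fields; the driver's actual parser and field order may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// parseVolumeID is a hypothetical parser, not the driver's actual code.
// It assumes a valid ID has at least three '#'-separated fields and
// deliberately does not assign meaning to the individual fields.
func parseVolumeID(id string) ([]string, error) {
	parts := strings.Split(id, "#")
	if len(parts) < 3 {
		return nil, fmt.Errorf("volume ID %q does not have at least 3 '#'-separated fields", id)
	}
	return parts, nil
}

func main() {
	ids := []string{
		"fake-vol-id-f600e740-a", // ID used by the sanity test
		"127.0.0.1##sanity-controller-source-vol-1BFA11DF-89190C95#", // ID seen in the logs
	}
	for _, id := range ids {
		parts, err := parseVolumeID(id)
		fmt.Println(parts, err)
	}
}
```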

for pull-csi-driver-nfs-integration, the error is

Package nfs-common is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source

E: Package 'nfs-common' has no installation candidate

seems unrelated, maybe the Debian package name changed? I will take a look

@k8s-ci-robot
Contributor

@wozniakjan: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

  • /test pull-csi-driver-nfs-e2e
  • /test pull-csi-driver-nfs-external-e2e
  • /test pull-csi-driver-nfs-integration
  • /test pull-csi-driver-nfs-sanity
  • /test pull-csi-driver-nfs-unit
  • /test pull-kubernetes-csi-csi-driver-nfs

Use /test all to run all jobs.

In response to this:

/retest pull-csi-driver-nfs-integration

this seems to work just fine, another transient glitch in the matrix?

$ docker run registry.k8s.io/build-image/debian-base:bullseye-v1.4.2 sh -c "apt update && apt upgrade -y && apt-mark unhold libcap2 && clean-install ca-certificates mount nfs-common netbase"
...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.

$ echo $?
0

@wozniakjan
Member Author

wozniakjan commented Mar 14, 2023

Package nfs-common is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source

E: Package 'nfs-common' has no installation candidate

/test pull-csi-driver-nfs-integration

this seems to work just fine, another transient glitch in the matrix?

$ docker run registry.k8s.io/build-image/debian-base:bullseye-v1.4.2 sh -c "apt update && apt upgrade -y && apt-mark unhold libcap2 && clean-install ca-certificates mount nfs-common netbase"
...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.

$ echo $?
0

@andyzhangx
Member

sanity test is failing in clone volume test:
https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-csi_csi-driver-nfs/426/pull-csi-driver-nfs-sanity/1635567645178728448/build-log.txt

I0314 09:08:26.081201    7781 utils.go:88] GRPC call: /csi.v1.Controller/CreateVolume
I0314 09:08:26.081280    7781 utils.go:89] GRPC request: {"capacity_range":{"limit_bytes":10737418240,"required_bytes":10737418240},"name":"sanity-controller-vol-from-vol-1BFA11DF-89190C95","parameters":{"server":"127.0.0.1","share":"/"},"volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{"mode":1}}],"volume_content_source":{"Type":{"Volume":{"volume_id":"127.0.0.1##sanity-controller-source-vol-1BFA11DF-89190C95#"}}}}
I0314 09:08:26.081693    7781 controllerserver.go:295] internally mounting 127.0.0.1:/ at /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95
I0314 09:08:26.081879    7781 nodeserver.go:123] NodePublishVolume: volumeID(127.0.0.1##sanity-controller-vol-from-vol-1BFA11DF-89190C95#) source(127.0.0.1:/) targetPath(/tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95) mountflags([])
I0314 09:08:26.081911    7781 mount_linux.go:220] Mounting cmd (mount) with arguments (-t nfs 127.0.0.1:/ /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95)
I0314 09:08:26.107444    7781 nodeserver.go:140] skip chmod on targetPath(/tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95) since mountPermissions is set as 0
I0314 09:08:26.107489    7781 nodeserver.go:142] volume(127.0.0.1##sanity-controller-vol-from-vol-1BFA11DF-89190C95#) mount 127.0.0.1:/ on /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95 succeeded
I0314 09:08:26.108825    7781 controllerserver.go:325] copy volume from volume /tmp/sanity-controller-source-vol-1BFA11DF-89190C95/sanity-controller-source-vol-1BFA11DF-89190C95 -> /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95/sanity-controller-vol-from-vol-1BFA11DF-89190C95
I0314 09:08:26.108859    7781 controllerserver.go:295] internally mounting 127.0.0.1:/ at /tmp/sanity-controller-source-vol-1BFA11DF-89190C95
I0314 09:08:26.108993    7781 nodeserver.go:123] NodePublishVolume: volumeID(127.0.0.1##sanity-controller-source-vol-1BFA11DF-89190C95#) source(127.0.0.1:/) targetPath(/tmp/sanity-controller-source-vol-1BFA11DF-89190C95) mountflags([])
I0314 09:08:26.109020    7781 mount_linux.go:220] Mounting cmd (mount) with arguments (-t nfs 127.0.0.1:/ /tmp/sanity-controller-source-vol-1BFA11DF-89190C95)
I0314 09:08:26.174764    7781 nodeserver.go:140] skip chmod on targetPath(/tmp/sanity-controller-source-vol-1BFA11DF-89190C95) since mountPermissions is set as 0
I0314 09:08:26.174875    7781 nodeserver.go:142] volume(127.0.0.1##sanity-controller-source-vol-1BFA11DF-89190C95#) mount 127.0.0.1:/ on /tmp/sanity-controller-source-vol-1BFA11DF-89190C95 succeeded
I0314 09:08:26.174944    7781 controllerserver.go:295] internally mounting 127.0.0.1:/ at /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95
I0314 09:08:26.178668    7781 controllerserver.go:351] copied /tmp/sanity-controller-source-vol-1BFA11DF-89190C95/sanity-controller-source-vol-1BFA11DF-89190C95 -> /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95/sanity-controller-vol-from-vol-1BFA11DF-89190C95 output: cp: cannot stat '/tmp/sanity-controller-source-vol-1BFA11DF-89190C95/sanity-controller-source-vol-1BFA11DF-89190C9547.': No such file or directory


// recursive 'cp' with '-a' to handle symlinks. Note that the source path must include trailing '/.',
// which is the reason why 'filepath.Join()' is not used as it would perform path cleaning
out, err := exec.Command("cp", "-a", fmt.Sprintf("%v%v.", srcPath, filepath.Separator), dstPath).CombinedOutput()
Member

per the sanity test failure,

controllerserver.go:351] copied /tmp/sanity-controller-source-vol-1BFA11DF-89190C95/sanity-controller-source-vol-1BFA11DF-89190C95 -> /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95/sanity-controller-vol-from-vol-1BFA11DF-89190C95 output: cp: cannot stat '/tmp/sanity-controller-source-vol-1BFA11DF-89190C95/sanity-controller-source-vol-1BFA11DF-89190C9547.': No such file or directory

it should be cp -a /tmp/sanity-controller-source-vol-1BFA11DF-89190C95/* /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95/ ?

Member Author

@wozniakjan wozniakjan Mar 14, 2023

I think it is expected given the implementation of getInternalVolumePath(), because the volume has a subdir

I0314 09:08:26.078270    7781 utils.go:95] GRPC response: {"volume":{"volume_context":{"server":"127.0.0.1","share":"/","subdir":"sanity-controller-source-vol-1BFA11DF-89190C95"},"volume_id":"127.0.0.1##sanity-controller-source-vol-1BFA11DF-89190C95#"}}

// Get internal path where the volume is created
// The reason why the internal path is "workingDir/subDir/subDir" is because:
// - the semantic is actually "workingDir/volId/subDir" and volId == subDir.
// - we need a mount directory per volId because you can have multiple
// CreateVolume calls in parallel and they may use the same underlying share.
// Instead of refcounting how many CreateVolume calls are using the same
// share, it's simpler to just do a mount per request.
func getInternalVolumePath(workingMountDir string, vol *nfsVolume) string {
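
The layout described in that comment can be sketched as follows. This is an illustration under stated assumptions, not the driver's implementation: the `volume` type and `internalVolumePath` function are hypothetical stand-ins, and the sketch assumes the mount directory is named after the subdirectory, as the comment says (volId == subDir).

```go
package main

import (
	"fmt"
	"path/filepath"
)

// volume is a hypothetical stand-in for the driver's nfsVolume type;
// only the field needed for this illustration is included.
type volume struct {
	subDir string
}

// internalVolumePath mirrors the "workingDir/volId/subDir" layout from the
// comment above, assuming volId == subDir.
func internalVolumePath(workingMountDir string, vol volume) string {
	// One mount point per volume, so parallel requests do not share a mount.
	mountDir := filepath.Join(workingMountDir, vol.subDir)
	// The volume's own directory inside the mounted share.
	return filepath.Join(mountDir, vol.subDir)
}

func main() {
	v := volume{subDir: "sanity-controller-source-vol-1BFA11DF-89190C95"}
	fmt.Println(internalVolumePath("/tmp", v))
}
```

On a Unix-like system this yields a doubled path of the same shape as the one in the sanity test log.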

Member Author

@wozniakjan wozniakjan Mar 14, 2023

actually, I think I see it now

I0314 09:08:26.178668    7781 controllerserver.go:351] copied /tmp/sanity-controller-source-vol-1BFA11DF-89190C95/sanity-controller-source-vol-1BFA11DF-89190C95 -> /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95/sanity-controller-vol-from-vol-1BFA11DF-89190C95 output: cp: cannot stat '/tmp/sanity-controller-source-vol-1BFA11DF-89190C95/sanity-controller-source-vol-1BFA11DF-89190C9547.': No such file or directory

there should not be any path terminated with just ., only with /.:

FROM TEST: /tmp/sanity-controller-source-vol-1BFA11DF-89190C95/sanity-controller-source-vol-1BFA11DF-89190C9547.
EXP SRC:   /tmp/sanity-controller-source-vol-1BFA11DF-89190C95/sanity-controller-source-vol-1BFA11DF-89190C95/.
EXP DST:   /tmp/sanity-controller-vol-from-vol-1BFA11DF-89190C95/sanity-controller-vol-from-vol-1BFA11DF-89190C95

looks like filepath.Separator was formatted as a number? It is a rune, and %v prints a rune as its integer value (47 for /) rather than as the character.

trying hardcoded trailing /. for src path and slightly improved the logging in 784da6f
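
The stray 47 in the failing path is consistent with how Go formats runes. The sketch below, which assumes a Unix-like system where the separator is `/`, contrasts the `%v` formatting from the snippet under review with `%c`, and shows why `filepath.Join` is not a drop-in alternative. The function names and the example path are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// srcWithVerbV reproduces the formatting from the snippet under review.
// filepath.Separator is a rune constant, so %v prints its integer value
// (47 for '/' on Unix-like systems) instead of the character.
func srcWithVerbV(srcPath string) string {
	return fmt.Sprintf("%v%v.", srcPath, filepath.Separator)
}

// srcWithVerbC uses %c, which formats the rune as the character it represents.
func srcWithVerbC(srcPath string) string {
	return fmt.Sprintf("%v%c.", srcPath, filepath.Separator)
}

func main() {
	src := "/tmp/src-vol" // hypothetical path, for illustration only
	fmt.Println(srcWithVerbV(src))
	fmt.Println(srcWithVerbC(src))
	// filepath.Join cleans its result, so the trailing "/." would be lost.
	fmt.Println(filepath.Join(src, "."))
}
```

Hardcoding `"/."`, as done in the follow-up commit, avoids the formatting pitfall entirely; `string(filepath.Separator)` or `%c` would be alternatives if the platform separator were wanted.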

Member Author

@wozniakjan wozniakjan Mar 14, 2023

tests are passing now, looks like using / instead of filepath.Separator did the trick.

@andyzhangx I am not sure whether that is a big issue for portability. Technically, using cp already restricts this, imho, to environments with POSIX file paths, so supporting both the POSIX and the Windows separator may not matter much from the OS portability point of view.

Member

@andyzhangx andyzhangx left a comment

/lgtm
thanks for the contribution!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 15, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx, wozniakjan

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 15, 2023
@k8s-ci-robot k8s-ci-robot merged commit cd50d48 into kubernetes-csi:master Mar 15, 2023
wozniakjan added a commit to wozniakjan/csi-driver-nfs that referenced this pull request Mar 21, 2023
wozniakjan added a commit to wozniakjan/csi-driver-nfs that referenced this pull request Mar 21, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
4 participants