Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement raw block volume support #215

Merged
merged 1 commit into from
Feb 21, 2019

Conversation

leakingtapan
Copy link
Contributor

@leakingtapan leakingtapan commented Feb 14, 2019

Is this a bug fix or adding new feature?
Ref: #101

What is this PR about? / Why do we need it?
To reach feature parity for in-tree ebs plugin and to support CSI migration, adding support for raw block volume

What testing is done?
Manually tested using:
pvc

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 10Gi

pod

apiVersion: v1
kind: Pod
metadata:
  name: pod-with-block-volume
spec:
  containers:
    - name: fc-container
      image: fedora:26
      command: ["/bin/sh", "-c"]
      args: [ "tail -f /dev/null" ]
      volumeDevices:
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: block-pvc

verified the device is accessible using dd:

>> kk exec -ti pod-with-block-volume bash
>> dd if=/dev/zero of=/dev/xvda bs=1024k count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0492386 s, 2.1 GB/s

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 14, 2019
@leakingtapan leakingtapan requested review from dkoshkin and bertinatto and removed request for dkoshkin February 14, 2019 19:52
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: leakingtapan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 14, 2019
@leakingtapan leakingtapan changed the title Implement raw block volume support [WIP] Implement raw block volume support Feb 14, 2019
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 14, 2019
@dkoshkin
Copy link
Contributor

Interesting e2e failure that I haven't seen before:

timeout expired waiting for volumes to attach or mount for pod "e2e-tests-ebs-m7qcb"/"ebs-volume-tester-9lgdt". list of unmounted volumes=[test-volume-1 test-volume-2 default-token-nnrtr]. list of unattached volumes=[test-volume-1 test-volume-2 default-token-nnrtr]                       

What's surprising is that even the token default-token-nnrtr wasn't mounted, this looks like a kubelet issues, that latency seems unusual:

Latency metrics for node ip-172-20-42-228.us-west-2.compute.internal
Feb 14 20:23:38.190: INFO: {Operation:sync Method:pod_worker_latency_microseconds Quantile:0.9 Latency:2m3.00123s}
Feb 14 20:23:38.190: INFO: {Operation:sync Method:pod_worker_latency_microseconds Quantile:0.99 Latency:2m3.00123s}
Feb 14 20:23:38.190: INFO: {Operation:sync Method:pod_worker_latency_microseconds Quantile:0.5 Latency:2m3.000821s}

@leakingtapan
Copy link
Contributor Author

leakingtapan commented Feb 14, 2019

From the following logs:

Feb 14 20:23:37.586: INFO: At 2019-02-14 20:08:00 +0000 UTC - event for ebs-volume-tester-9lgdt: {attachdetach-controller } FailedAttachVolume: AttachVolume.Attach failed for volume "pvc-1628ab22-3094-11e9-ab2f-0288477e7a78" : attachment timeout for volume vol-0dd26ed0a85b46bdf
Feb 14 20:23:37.586: INFO: At 2019-02-14 20:08:00 +0000 UTC - event for ebs-volume-tester-9lgdt: {attachdetach-controller } FailedAttachVolume: AttachVolume.Attach failed for volume "pvc-277d5486-3094-11e9-ab2f-0288477e7a78" : attachment timeout for volume vol-01a7ff99583034d52
Feb 14 20:23:37.586: INFO: At 2019-02-14 20:08:05 +0000 UTC - event for ebs-volume-tester-9lgdt: {attachdetach-controller } FailedAttachVolume: AttachVolume.Attach failed for volume "pvc-1628ab22-3094-11e9-ab2f-0288477e7a78" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
Feb 14 20:23:37.586: INFO: At 2019-02-14 20:08:09 +0000 UTC - event for ebs-volume-tester-9lgdt: {attachdetach-controller } FailedAttachVolume: AttachVolume.Attach failed for volume "pvc-277d5486-3094-11e9-ab2f-0288477e7a78" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
Feb 14 20:23:37.586: INFO: At 2019-02-14 20:09:48 +0000 UTC - event for ebs-volume-tester-9lgdt: {kubelet ip-172-20-42-228.us-west-2.compute.internal} FailedMount: Unable to mount volumes for pod "ebs-volume-tester-9lgdt_e2e-tests-ebs-m7qcb(3168775f-3094-11e9-ab2f-0288477e7a78)": timeout expired waiting for volumes to attach or mount for pod "e2e-tests-ebs-m7qcb"/"ebs-volume-tester-9lgdt". list of unmounted volumes=[test-volume-1 test-volume-2]. list of unattached volumes=[test-volume-1 test-volume-2 default-token-nnrtr]
Feb 14 20:23:37.586: INFO: At 2019-02-14 20:10:17 +0000 UTC - event for ebs-volume-tester-9lgdt: {attachdetach-controller } SuccessfulAttachVolume: AttachVolume.Attach succeeded for volume "pvc-277d5486-3094-11e9-ab2f-0288477e7a78" 
Feb 14 20:23:37.586: INFO: At 2019-02-14 20:12:06 +0000 UTC - event for ebs-volume-tester-9lgdt: {kubelet ip-172-20-42-228.us-west-2.compute.internal} FailedMount: Unable to mount volumes for pod "ebs-volume-tester-9lgdt_e2e-tests-ebs-m7qcb(3168775f-3094-11e9-ab2f-0288477e7a78)": timeout expired waiting for volumes to attach or mount for pod "e2e-tests-ebs-m7qcb"/"ebs-volume-tester-9lgdt". list of unmounted volumes=[test-volume-1]. list of unattached volumes=[test-volume-1 test-volume-2 default-token-nnrtr]
Feb 14 20:23:37.586: INFO: At 2019-02-14 20:23:24 +0000 UTC - event for ebs-volume-tester-9lgdt: {kubelet ip-172-20-42-228.us-west-2.compute.internal} FailedMount: Unable to mount volumes for pod "ebs-volume-tester-9lgdt_e2e-tests-ebs-m7qcb(3168775f-3094-11e9-ab2f-0288477e7a78)": timeout expired waiting for volumes to attach or mount for pod "e2e-tests-ebs-m7qcb"/"ebs-volume-tester-9lgdt". list of unmounted volumes=[test-volume-1 test-volume-2 default-token-nnrtr]. list of unattached volumes=[test-volume-1 test-volume-2 default-token-nnrtr]

Should be the volume pvc-1628ab22-3094-11e9-ab2f-0288477e7a78 that is taking too long to attach

Its weird that at 20:12:06 +0000 UTC it has:

list of unmounted volumes=[test-volume-1]. list of unattached volumes=[test-volume-1 test-volume-2 default-token-nnrtr]

And at 20:23:24 +0000 UTC it becomes:

list of unmounted volumes=[test-volume-1 test-volume-2 default-token-nnrtr]. list of unattached volumes=[test-volume-1 test-volume-2 default-token-nnrtr]

I also saw pvc-1628ab22-3094-11e9-ab2f-0288477e7a78 become a lingering volume at the end of the test in aws.
The attachdetach-controller log come from operation_generator in A/D controller.

@coveralls
Copy link

coveralls commented Feb 14, 2019

Pull Request Test Coverage Report for Build 460

  • 48 of 68 (70.59%) changed or added relevant lines in 1 file are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.05%) to 64.478%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/driver/node.go 48 68 70.59%
Files with Coverage Reduction New Missed Lines %
pkg/driver/node.go 2 80.79%
Totals Coverage Status
Change from base Build 459: -0.05%
Covered Lines: 933
Relevant Lines: 1447

💛 - Coveralls

@leakingtapan leakingtapan changed the title [WIP] Implement raw block volume support Implement raw block volume support Feb 19, 2019
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 19, 2019
@leakingtapan
Copy link
Contributor Author

leakingtapan commented Feb 21, 2019

Override the coverage drop (-0.05%)since it is caused by the FakeMounter doesn't support testing error cases. We will consider replacing the FakeMounter with unit testing mock stubs generated by gomock.

@leakingtapan leakingtapan merged commit 46bf6f7 into kubernetes-sigs:master Feb 21, 2019
jsafrane pushed a commit to jsafrane/aws-ebs-csi-driver that referenced this pull request Feb 3, 2023
OCPBUGS-6355: Rebase to v1.15.0 for OCP 4.13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants