New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure PD Plugin (Blob + Managed Disks) #41950

Closed
wants to merge 38 commits into
base: master
from

Conversation

@khenidak
Contributor

khenidak commented Feb 23, 2017

** Please don't use this PR. the code has moved to this PR

What this PR does / why we need it:

  1. Adds K8S support for Azure Managed Disks.
  2. Adds support for dedicated blob disks (1:1 to storage account) in addition to shared blob disks (n:1 to storage account).
  3. Automatically manages the underlying storage accounts. New storage accounts are created at 50% utilization. Max is 100 disks, 60 disks per storage account.
  4. Addresses the current issues with Blob Disks:
    ..* Significantly faster attach process. Disks are now usually available for pods on nodes under 30 sec if formatted, under a min if not formatted.
    ..* Adds support to move disks between nodes.
    ..* Adds consistent attach/detach behavior, checks if the disk is leased/attached on a different node before attempting to attach to target nodes.
    ..* Fixes a random hang behavior on Azure VMs during mount/format (for both blob + managed disks).
    ..* Fixes a potential conflict by avoiding the use of disk names for mount paths. The new plugin uses hashed disk uri for mount path.

The existing AzureDisk is used as is. Additional "kind" property was added allowing the user to decide if the pd will be shared, dedicated or managed (Azure Managed Disks are used).

Due to the change in mounting paths, existing PDs need to be recreated as PV or PVCs on the new plugin.

@k8s-ci-robot

This comment has been minimized.

Show comment
Hide comment
@k8s-ci-robot

k8s-ci-robot Feb 23, 2017

Contributor

Hi @khenidak. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Contributor

k8s-ci-robot commented Feb 23, 2017

Hi @khenidak. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot

This comment has been minimized.

Show comment
Hide comment
@k8s-ci-robot

k8s-ci-robot Feb 23, 2017

Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

Once you've signed, please reply here (e.g. "I signed it!") and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Contributor

k8s-ci-robot commented Feb 23, 2017

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

Once you've signed, please reply here (e.g. "I signed it!") and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-reviewable

This comment has been minimized.

Show comment
Hide comment
@k8s-reviewable

k8s-reviewable Feb 23, 2017

This change is Reviewable

k8s-reviewable commented Feb 23, 2017

This change is Reviewable

@k8s-merge-robot

This comment has been minimized.

Show comment
Hide comment
@k8s-merge-robot

k8s-merge-robot Feb 23, 2017

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

The following people have approved this PR: khenidak

Needs approval from an approver in each of these OWNERS Files:

We suggest the following people:
cc @brendandburns @erictune
You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

Contributor

k8s-merge-robot commented Feb 23, 2017

[APPROVALNOTIFIER] This PR is NOT APPROVED

The following people have approved this PR: khenidak

Needs approval from an approver in each of these OWNERS Files:

We suggest the following people:
cc @brendandburns @erictune
You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@khenidak

This comment has been minimized.

Show comment
Hide comment
@khenidak
Contributor

khenidak commented Feb 23, 2017

@khenidak

This comment has been minimized.

Show comment
Hide comment
@khenidak

khenidak Feb 23, 2017

Contributor

i signed it

Contributor

khenidak commented Feb 23, 2017

i signed it

@rootfs

This comment has been minimized.

Show comment
Hide comment
@rootfs

rootfs Feb 24, 2017

Member

@k8s-bot ok to test

Member

rootfs commented Feb 24, 2017

@k8s-bot ok to test

@khenidak

This comment has been minimized.

Show comment
Hide comment
@khenidak

khenidak Feb 24, 2017

Contributor

This was shared on slack. Just making sure you see it

@rootfs rather a basic question. This PR is failing tests. This is because the generated files are not included with the commits. If i included them they will have too many conflicts to fix by hand. Should i included them in a different commit within the same PR? (in this case i am assuming you merge everything except the generated files?)

Contributor

khenidak commented Feb 24, 2017

This was shared on slack. Just making sure you see it

@rootfs rather a basic question. This PR is failing tests. This is because the generated files are not included with the commits. If i included them they will have too many conflicts to fix by hand. Should i included them in a different commit within the same PR? (in this case i am assuming you merge everything except the generated files?)

@khenidak

This comment has been minimized.

Show comment
Hide comment
@khenidak

khenidak Feb 25, 2017

Contributor

@rootfs thanks for the tip. the PR now contains the generated code.

Contributor

khenidak commented Feb 25, 2017

@rootfs thanks for the tip. the PR now contains the generated code.

@khenidak

This comment has been minimized.

Show comment
Hide comment
@khenidak

khenidak Feb 25, 2017

Contributor

@k8s-bot verify test this

Contributor

khenidak commented Feb 25, 2017

@k8s-bot verify test this

@cblecker

This comment has been minimized.

Show comment
Hide comment
@cblecker

cblecker Apr 4, 2017

Member

Hi @khenidak,
The first issue to tackle is the verify test failures: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/41950/pull-kubernetes-verify/24225/

I0403 20:06:54.100] /go/src/k8s.io/kubernetes/docs/api-reference is out of date. Please run hack/update-api-reference-docs.sh
I0403 20:06:54.106] FAILED   hack/make-rules/../../hack/verify-api-reference-docs.sh	100s
I0403 20:13:37.934] /go/src/k8s.io/kubernetes/federation/apis/openapi-spec is out of date. Please run hack/update-federation-openapi-spec.sh
I0403 20:13:37.937] FAILED   hack/make-rules/../../hack/verify-federation-openapi-spec.sh	131s
I0403 20:20:30.793] /go/src/k8s.io/kubernetes/api/openapi-spec is out of date. Please run hack/update-openapi-spec.sh
I0403 20:20:30.800] FAILED   hack/make-rules/../../hack/verify-openapi-spec.sh	124s

These tests can be run on your local machine with make verify.

Once those are passing, look at the unit tests next.

Member

cblecker commented Apr 4, 2017

Hi @khenidak,
The first issue to tackle is the verify test failures: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/41950/pull-kubernetes-verify/24225/

I0403 20:06:54.100] /go/src/k8s.io/kubernetes/docs/api-reference is out of date. Please run hack/update-api-reference-docs.sh
I0403 20:06:54.106] FAILED   hack/make-rules/../../hack/verify-api-reference-docs.sh	100s
I0403 20:13:37.934] /go/src/k8s.io/kubernetes/federation/apis/openapi-spec is out of date. Please run hack/update-federation-openapi-spec.sh
I0403 20:13:37.937] FAILED   hack/make-rules/../../hack/verify-federation-openapi-spec.sh	131s
I0403 20:20:30.793] /go/src/k8s.io/kubernetes/api/openapi-spec is out of date. Please run hack/update-openapi-spec.sh
I0403 20:20:30.800] FAILED   hack/make-rules/../../hack/verify-openapi-spec.sh	124s

These tests can be run on your local machine with make verify.

Once those are passing, look at the unit tests next.

@khenidak

This comment has been minimized.

Show comment
Hide comment
@khenidak

khenidak Apr 5, 2017

Contributor

I am hitting this issue with my verification test #31129

Contributor

khenidak commented Apr 5, 2017

I am hitting this issue with my verification test #31129

@cblecker

This comment has been minimized.

Show comment
Hide comment
@cblecker

cblecker Apr 5, 2017

Member

@khenidak I believe #44109 will fix your issue, once merged.

Additionally, it may be advisable to work in a feature branch instead of your fork's master.. it may make it easier to do things like rebase and squash your commits.

Member

cblecker commented Apr 5, 2017

@khenidak I believe #44109 will fix your issue, once merged.

Additionally, it may be advisable to work in a feature branch instead of your fork's master.. it may make it easier to do things like rebase and squash your commits.

@khenidak

This comment has been minimized.

Show comment
Hide comment
@khenidak

khenidak Apr 5, 2017

Contributor

@cblecker thanks!

@brendandburns @colemickens @rootfs @ncdc

The failing tests are categorized as the following:

Failing because the test itself is faulty (i think):

  1. GCE node e2e: nodes are not coming back to due to timeout.
  2. GCE e2e: cluster is not coming up
  3. Kops: i think the test is missing a yam compiler

Issues with tests itself

  1. Bazel: round trip works only if i remove my defaulting code specifically the Kind defaulting https://github.com/kubernetes/kubernetes/pull/41950/files#diff-14b477f6f8ca5978c702f4d21641e43fR260
  2. Jenkins Verification - failed because of #31129 (please check logs, it points to "a new line at the end of file")

Please let me know how we can move from forward from here

Contributor

khenidak commented Apr 5, 2017

@cblecker thanks!

@brendandburns @colemickens @rootfs @ncdc

The failing tests are categorized as the following:

Failing because the test itself is faulty (i think):

  1. GCE node e2e: nodes are not coming back to due to timeout.
  2. GCE e2e: cluster is not coming up
  3. Kops: i think the test is missing a yam compiler

Issues with tests itself

  1. Bazel: round trip works only if i remove my defaulting code specifically the Kind defaulting https://github.com/kubernetes/kubernetes/pull/41950/files#diff-14b477f6f8ca5978c702f4d21641e43fR260
  2. Jenkins Verification - failed because of #31129 (please check logs, it points to "a new line at the end of file")

Please let me know how we can move from forward from here

@cblecker

This comment has been minimized.

Show comment
Hide comment
@cblecker

cblecker Apr 6, 2017

Member

script fix has been merged. rebase to get the changes.

Member

cblecker commented Apr 6, 2017

script fix has been merged. rebase to get the changes.

@khenidak

This comment has been minimized.

Show comment
Hide comment
@khenidak

khenidak Apr 6, 2017

Contributor

@cblecker Jenkins verification now successful, thanks!
i will create an issue for the round trip thing.

Contributor

khenidak commented Apr 6, 2017

@cblecker Jenkins verification now successful, thanks!
i will create an issue for the round trip thing.

@k8s-ci-robot

This comment has been minimized.

Show comment
Hide comment
@k8s-ci-robot

k8s-ci-robot May 3, 2017

Contributor

@khenidak: The following test(s) failed:

Test name Commit Details Rerun command
Jenkins GCI GKE smoke e2e 8fee86e link @k8s-bot gci gke e2e test this
Jenkins GKE smoke e2e 8fee86e link @k8s-bot cvm gke e2e test this
Jenkins non-CRI GCE Node e2e 6e151a5 link @k8s-bot non-cri node e2e test this
Jenkins GCE e2e 6e151a5 link @k8s-bot cvm gce e2e test this
Jenkins GCI GCE e2e 6e151a5 link @k8s-bot gci gce e2e test this
Jenkins non-CRI GCE e2e 6e151a5 link @k8s-bot non-cri e2e test this
Jenkins Bazel Build b54c268 link @k8s-bot bazel test this
Jenkins GCE Node e2e b54c268 link @k8s-bot node e2e test this
Jenkins unit/integration b54c268 link @k8s-bot unit test this
Jenkins GCE etcd3 e2e b54c268 link @k8s-bot gce etcd3 e2e test this
Jenkins kops AWS e2e b54c268 link @k8s-bot kops aws e2e test this
Jenkins Kubemark GCE e2e b54c268 link @k8s-bot kubemark e2e test this
Jenkins verification b54c268 link @k8s-bot verify test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Contributor

k8s-ci-robot commented May 3, 2017

@khenidak: The following test(s) failed:

Test name Commit Details Rerun command
Jenkins GCI GKE smoke e2e 8fee86e link @k8s-bot gci gke e2e test this
Jenkins GKE smoke e2e 8fee86e link @k8s-bot cvm gke e2e test this
Jenkins non-CRI GCE Node e2e 6e151a5 link @k8s-bot non-cri node e2e test this
Jenkins GCE e2e 6e151a5 link @k8s-bot cvm gce e2e test this
Jenkins GCI GCE e2e 6e151a5 link @k8s-bot gci gce e2e test this
Jenkins non-CRI GCE e2e 6e151a5 link @k8s-bot non-cri e2e test this
Jenkins Bazel Build b54c268 link @k8s-bot bazel test this
Jenkins GCE Node e2e b54c268 link @k8s-bot node e2e test this
Jenkins unit/integration b54c268 link @k8s-bot unit test this
Jenkins GCE etcd3 e2e b54c268 link @k8s-bot gce etcd3 e2e test this
Jenkins kops AWS e2e b54c268 link @k8s-bot kops aws e2e test this
Jenkins Kubemark GCE e2e b54c268 link @k8s-bot kubemark e2e test this
Jenkins verification b54c268 link @k8s-bot verify test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-merge-robot

This comment has been minimized.

Show comment
Hide comment
@k8s-merge-robot

k8s-merge-robot May 5, 2017

Contributor

@khenidak PR needs rebase

Contributor

k8s-merge-robot commented May 5, 2017

@khenidak PR needs rebase

@jdumars

This comment has been minimized.

Show comment
Hide comment
@jdumars

jdumars May 11, 2017

Member

@khenidak any updates on this?

Member

jdumars commented May 11, 2017

@khenidak any updates on this?

@jdumars

This comment has been minimized.

Show comment
Hide comment
@jdumars

jdumars May 30, 2017

Member

@rootfs do you have some availability to meet about this sometime this week?

Member

jdumars commented May 30, 2017

@rootfs do you have some availability to meet about this sometime this week?

@rootfs

This comment has been minimized.

Show comment
Hide comment
@rootfs

rootfs May 30, 2017

Member

@jdumars @khenidak let chat on this. ping me at hchen@redhat.com. I am in eastern time.

Member

rootfs commented May 30, 2017

@jdumars @khenidak let chat on this. ping me at hchen@redhat.com. I am in eastern time.

@khenidak khenidak closed this Jun 1, 2017

@khenidak

This comment has been minimized.

Show comment
Hide comment
@khenidak

khenidak Jun 1, 2017

Contributor

Closed - moved to #46360

Contributor

khenidak commented Jun 1, 2017

Closed - moved to #46360

k8s-merge-robot added a commit that referenced this pull request Jul 14, 2017

Merge pull request #46360 from khenidak/azure-pd-final
Automatic merge from submit-queue

Azure PD (Managed/Blob)

This is exactly the same code as this [PR](#41950). It has a clean set of generated items. We created a separate PR to accelerate the accept/merge the PR

CC @colemickens 
CC @brendandburns 

**What this PR does / why we need it**:

1. Adds K8S support for Azure Managed Disks. 
2. Adds support for dedicated blob disks (1:1 to storage account) in addition to shared blob disks (n:1 to storage account). 
3. Automatically manages the underlying storage accounts. New storage accounts are created at 50% utilization. Max is 100 disks, 60 disks per storage account.    
2. Addresses the current issues with Blob Disks:
..* Significantly faster attach process. Disks are now usually available for pods on nodes under 30 sec if formatted, under a min if not formatted. 
..* Adds support to move disks between nodes.
..* Adds consistent attach/detach behavior, checks if the disk is leased/attached on a different node before attempting to attach to target nodes.
..* Fixes a random hang behavior on Azure VMs during mount/format (for both blob + managed disks).
..* Fixes a potential conflict by avoiding the use of disk names for mount paths. The new plugin uses hashed disk uri for mount path.  

The existing AzureDisk is used as is. Additional "kind" property was added  allowing the user to decide if the pd will be shared, dedicated or managed (Azure Managed Disks are used).

Due to the change in mounting paths, existing PDs need to be recreated as PV or PVCs on the new plugin.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment