Skip to content
This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Validation failure on gp2 volume type for node pools #1170

Closed
shraykay opened this issue Mar 7, 2018 · 17 comments
Closed

Validation failure on gp2 volume type for node pools #1170

shraykay opened this issue Mar 7, 2018 · 17 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.
Milestone

Comments

@shraykay
Copy link
Contributor

shraykay commented Mar 7, 2018

it seems that AWS has changed the way their cloudformation validation occurs. when validate is ran, this error occurs during the dry-run if you're using disk type = gp2. after commenting this out in and compiling a binary myself I was able to pass validation: cluster.go line 203

$ kube-aws validate --s3-uri=s3://mybucket
Validating UserData and stack template...
Error: failed to validate node pool #0: create volume dry-run request failed: InvalidParameterCombination: The parameter iops is not supported for gp2 volumes.
    status code: 400, request id: 3fd939ea-f4aa-40a1-b26e-de9bc77e632b
@mumoshu
Copy link
Contributor

mumoshu commented Mar 7, 2018

@shraykay Thanks for the concise report!

Your fix does seem to meet the body!
If I couId ask for more, I appreciate it if you could submit a PR for that, with probably setting Iops before validation when and only when c.RootVolume.IOPS != 0.

@mumoshu mumoshu added the kind/bug Categorizes issue or PR as related to a bug. label Mar 7, 2018
@luck02
Copy link
Contributor

luck02 commented Mar 7, 2018

We ran into the same issue ourselves. I was about to reproduce independently.

@luck02
Copy link
Contributor

luck02 commented Mar 7, 2018

@shraykay I was going to submit a PR, but if you're already a contributor you can probably get it done faster. Ping me if you plan on doing this in the near future, otherwise I'll give it a shot. Can @mumoshu or @shraykay point me where a test for this ought to go? (the code is simple but I'm out of practice with Go and don't know what the current testing layout best practices are)

@luck02
Copy link
Contributor

luck02 commented Mar 7, 2018

package main

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
	"github.com/aws/aws-sdk-go/aws/awserr"
	"github.com/aws/aws-sdk-go/aws/session"
	"fmt"
)

func main() {
	awsConfig := aws.NewConfig().
		WithRegion("us-east-1").
		WithCredentialsChainVerboseErrors(true)

	awsConfig = awsConfig.WithLogLevel(aws.LogDebug)

	awsSession := session.New(awsConfig)

	ec2Svc := ec2.New(awsSession)
	az := "us-east-1a"
	workerRootVolume := ec2.CreateVolumeInput{
		DryRun:           aws.Bool(true),
		AvailabilityZone: &az,
		Iops:             aws.Int64(0),
		Size:             aws.Int64(50),
		VolumeType:       aws.String("gp2"),
	}

	if ignore, err := ec2Svc.CreateVolume(&workerRootVolume); err != nil {
		operr, ok := err.(awserr.Error)

		if !ok || (ok && operr.Code() != "DryRunOperation") {
			fmt.Print(fmt.Errorf("create volume dry-run request failed: %v", err))
		}

		print(ignore)
	}

}

results in output:

2018/03/06 18:05:10 DEBUG: Request ec2/CreateVolume Details:
---[ REQUEST POST-SIGN ]-----------------------------
POST / HTTP/1.1
Host: ec2.us-east-1.amazonaws.com
User-Agent: aws-sdk-go/1.13.9 (go1.8.3; darwin; amd64)
Content-Length: 108
Authorization: AWS4-HMAC-SHA256 Credential=redacted/us-east-1/ec2/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date, Signature=f38c5b5443cfd201273bf0a7fb32fe7a2091456c91b6c6e1e31f603bcf57fb49
Content-Type: application/x-www-form-urlencoded; charset=utf-8
X-Amz-Date: 20180307T020510Z
Accept-Encoding: gzip


-----------------------------------------------------
2018/03/06 18:05:11 DEBUG: Response ec2/CreateVolume Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 400 Bad Request
Connection: close
Transfer-Encoding: chunked
Date: Wed, 07 Mar 2018 02:05:10 GMT
Server: AmazonEC2


-----------------------------------------------------
create volume dry-run request failed: InvalidParameterCombination: The parameter iops is not supported for gp2 volumes.
0xc42007c980	status code: 400, request id: 6f3f5123-65fb-4203-8d8f-d55601591e1f
Process finished with exit code 0

@shraykay
Copy link
Contributor Author

shraykay commented Mar 7, 2018

don't set it to 0, just comment out the entire line for the iops

@mumoshu
Copy link
Contributor

mumoshu commented Mar 7, 2018

Note that commenting out the entire line is not preferable, as it throws away the useful validation when the volume type is io1, hence my previous comment 😄

I appreciate it if you could submit a PR for that, with probably setting Iops before validation when and only when c.RootVolume.IOPS != 0.

@luck02
Copy link
Contributor

luck02 commented Mar 7, 2018

That makes sense, I'm just validating something with the AWS api before dropping a PR.

Thanks!

@mumoshu
Copy link
Contributor

mumoshu commented Mar 7, 2018

@luck02 Thanks for your support!

Tests should go into core/controlplane/cluster/cluster_test.go and core/nodepool/cluster/cluster_test.go

@luck02
Copy link
Contributor

luck02 commented Mar 7, 2018

Just as an amusing point of reference: fmt.Println(aws.Int64Value(nil)) == 0 🤣

So the test needs tweaking, ahh well. that was confusing for awhile.

@luck02
Copy link
Contributor

luck02 commented Mar 7, 2018

PR here: #1171

tests pass locally but I didn't run e2e, I'm assuming the build infra does that?

@mumoshu mumoshu changed the title validation fails on GP2 volume types Validation failure on gp2 volume type for node pools Mar 10, 2018
kylehodgetts pushed a commit to HotelsDotCom/kube-aws that referenced this issue Mar 27, 2018
…retired#1171)

Improved tests and code to respect changes in EC2 validate volume API. We can no longer submit IOPS 0 as being synomous with unset field / null.

In `core/nodepool/cluster.go`, the function: `validateWorkerRootVolume` is changed to call an another function to generate EC2 volume parameters. Also tweaked tests around that.

Fixes kubernetes-retired#1170
@zachrattner
Copy link

zachrattner commented Apr 1, 2018

Is there an ETA for this fix to be released generally? I am not able to deploy my cluster using gp2 disks due to this, and io1 can incur unnecessary costs.

@shraykay
Copy link
Contributor Author

shraykay commented Apr 1, 2018

@zachrattner i believe this was merged into master in #1171

@zachrattner
Copy link

Thanks @shraykay I see the latest release was back in December (https://github.com/kubernetes-incubator/kube-aws/releases) and this fix was within the last month.

The readme mentions that master might not be stable and we should pull from release branches if we need a stable build. Where should I pull from to get a stable codebase with the fix?

@shraykay
Copy link
Contributor Author

shraykay commented Apr 1, 2018

master is stable IMO, and works with kubernetes 1.9 as well. if you are especially worried you can cherry pick the commit into a release and compile it.

@luck02
Copy link
Contributor

luck02 commented Apr 2, 2018

Is this broken again? last I checked AWS rolled back this change. We may not have noticed as we started using a locally baked kube-aws binary that had this baked in.

Let us know as we'll ping our AWS Technical Account Managers @zachrattner, also good to know for general knowledge.

@zachrattner
Copy link

@luck02 Yes, I'm seeing this issue on v0.9.9. Keep me posted! I'll try cherry picking the fix on top of the 0.9.9 release in the meantime

@zachrattner
Copy link

For anyone else running into this issue with the same problem of not having a stable build with the fix present, I can confirm the deployment with gp2 disks worked after doing the following steps:

  1. export GOPATH=$HOME/go
  2. Clone the v0.9.9 release source code:
git clone --branch v0.9.9 https://github.com/kubernetes-incubator/kube-aws.git $GOPATH/src/github.com/kubernetes-incubator/kube-aws
  1. Cherry-pick the gp2 fix commit (5e15ac9):
git cherry-pick 5e15ac9059a95c051600c378cec204621c3a722f
  1. Compile the changes: make build

  2. Use kube-aws in the bin/ folder.

Helpful background info: https://kubernetes-incubator.github.io/kube-aws/guides/developer-guide.html

SomeoneWeird pushed a commit to iflix/kube-aws that referenced this issue Apr 3, 2018
…retired#1171)

Improved tests and code to respect changes in EC2 validate volume API. We can no longer submit IOPS 0 as being synomous with unset field / null.

In `core/nodepool/cluster.go`, the function: `validateWorkerRootVolume` is changed to call an another function to generate EC2 volume parameters. Also tweaked tests around that.

Fixes kubernetes-retired#1170
SomeoneWeird pushed a commit to iflix/kube-aws that referenced this issue Apr 3, 2018
…retired#1171)

Improved tests and code to respect changes in EC2 validate volume API. We can no longer submit IOPS 0 as being synomous with unset field / null.

In `core/nodepool/cluster.go`, the function: `validateWorkerRootVolume` is changed to call an another function to generate EC2 volume parameters. Also tweaked tests around that.

Fixes kubernetes-retired#1170
cw-sakamoto pushed a commit to cw-sakamoto/kube-aws that referenced this issue May 22, 2018
…retired#1171)

Improved tests and code to respect changes in EC2 validate volume API. We can no longer submit IOPS 0 as being synomous with unset field / null.

In `core/nodepool/cluster.go`, the function: `validateWorkerRootVolume` is changed to call an another function to generate EC2 volume parameters. Also tweaked tests around that.

Fixes kubernetes-retired#1170
@cknowles cknowles added this to the v0.9.10 milestone Jul 9, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

No branches or pull requests

5 participants