New Kops 1.17 AWS StorageClass Wreaks Havoc with Existing Defaults and Provides Uncertain Upgrade Path #9336
Comments
Hey @LeeHampton, sorry for the confusion here. We are discussing in #kops-dev, but I will say that I don't anticipate this new class to change much, and therefore the version matching should not change unless we find new settings to set. We could not just use … I think the missing piece here is what to set the default as, or whether to set it at all. We discussed this a bit in the original PR but did not nail anything down in terms of a release note. In our environment, we use our own StorageClass and use Open Policy Agent to enforce it, as we want things like encryption at rest and volume expansion by default.
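The enforcement idea described above could be sketched as follows. This is an illustration in Python rather than the commenter's actual OPA/Rego policy; the field names follow the Kubernetes StorageClass schema for the in-tree AWS EBS provisioner, but the specific rules ("encryption at rest and volume expansion required") are assumptions drawn from the comment.

```python
# Illustrative admission-style check (not the commenter's real OPA policy):
# a StorageClass must enable encryption at rest and volume expansion.

def violations(storage_class: dict) -> list[str]:
    """Return policy violations for a StorageClass manifest dict."""
    problems = []
    params = storage_class.get("parameters", {})
    # EBS encryption is a string-valued parameter on the StorageClass.
    if params.get("encrypted") != "true":
        problems.append('parameters.encrypted must be "true" (encryption at rest)')
    # allowVolumeExpansion is a top-level boolean field.
    if not storage_class.get("allowVolumeExpansion", False):
        problems.append("allowVolumeExpansion must be true")
    return problems

# A gp2-style class with neither setting enabled fails both checks.
sc = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "gp2"},
    "provisioner": "kubernetes.io/aws-ebs",
    "parameters": {"type": "gp2"},
    "allowVolumeExpansion": False,
}
print(violations(sc))
```

A real deployment would express the same predicate in Rego and wire it into an admission controller, but the logic reduces to this kind of field check.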
The previous …
Put up a fix; we needed to set the default to …
I'm curious as to why, as including the …
Thanks @joshbranham, makes total sense. I think I've been bitten by the whole dropping-an-annotation-doesn't-remove-it thing before. I'm still a little concerned about this sudden change in default, though, especially because it doesn't appear to be easily modifiable in the kops YAML manifest (unless I'm missing something?). Given how transparent this change is, a developer could easily find themselves trying to debug some volume-related issue and discover that, at some point, the underlying StorageClass for new PVs became different from the one used by old deploys. I could see that becoming a debugging nightmare, especially as the behavior of the default StorageClass gets more complex/custom. I definitely get the desire to move forward with new functionality on StorageClasses, but I think this change in defaults should be accompanied by a Breaking Change notice, as well as an easy user-facing way to modify it through kops.
@LeeHampton does the default `gp2` StorageClass have the `storageclass.beta.kubernetes.io/is-default-class` annotation set on it?
It does have that annotation. Here is the full annotation set from my gp2 class:
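The actual annotation output did not survive in this transcript, but a `gp2` class still marked as default typically looks something like the following (illustrative values, not @LeeHampton's actual manifest):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
```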
Yeah, so with that said: since you modified the StorageClass at some point, it would be hard for kops to manage that StorageClass without clobbering your changes.
Is there anything in kops that would potentially result in a StorageClass being modified? The 1.16 clusters I upgraded from were all fairly new clusters and I can't find anything in my stack that edits the underlying StorageClasses. |
This commit introduces a new StorageClass called `kops-ssd-1-17`. There seem to be a couple of issues with this.

First, even though it is no longer explicitly declared in this file, the `gp2` StorageClass is set as the default volume in existing pre-1.17 clusters (i.e., it has the `"storageclass.beta.kubernetes.io/is-default-class":"true"` annotation). That's because it used to be set as the default, as seen in this commit. For whatever reason, this annotation does not get undone during the upgrade to 1.17. Now that `kops-ssd-1-17` has that annotation explicitly set, what I ended up with after upgrading from 1.16 to 1.17 is two default StorageClasses, which leads to the following error when attempting to create a PV/PVC: `Internal error occurred: 2 default StorageClasses were found`

Second, tying the name of this new StorageClass to the kops version seems pretty problematic. In my case, I'm going to need to remove the annotation on this StorageClass to get my existing infra working again. To do so, I'll add a simple `kubectl patch` operation to my init/upgrade scripts. However, with this naming scheme I'm assuming subsequent versions of kops will change the name of the storage class to e.g. `kops-ssd-1-18`, which means I'm always going to be playing whack-a-mole with these version-specific names. Why can't we just use a generic name like `ssd-ebs` (or even keep the `gp2` name, since it's still gp2)?

Is there any way to work around this behavior that I'm missing, or is `kubectl patch` my best bet for now? And does anyone have thoughts on what a good patch for 1.17 would be to fix this cluster-breaking behavior? Change the name to something more generic and remove the default status of this new StorageClass for now? Or ensure that the default status from gp2 is removed (more risky for existing clusters, IMO)? Happy to take this on if there's some consensus around it.