Remove InstanceGroup from NodeupModelContext #9294
Conversation
Force-pushed from 6b6038e to ceb2ae7.
/retest
/retest
I think we may need to rebase :)
Force-pushed from 3b87312 to 04c870b.
Force-pushed from 4772acf to 9ee7b8c.
Force-pushed from 9ee7b8c to 482807b.
Force-pushed from 482807b to b5cf082.
/retest
Force-pushed from cb7bec1 to aed047f.
/retest
Sorry about the delay in reviewing this. So I believe we had problems fitting in the 16KB limit in the past, which is why we had this split. I don't have an example to hand, though. I suspect the most likely case would be if we were adding some certificates or keys, simply because those compress so poorly.

For the nodes, we could move to fetching this data from kops-controller, which would have two advantages: no S3 access required (so fewer node permissions) and effectively no limit on the size. That doesn't really help us on the control-plane nodes, though.

I do really like the simplification of the code here, so I would like to see us get this in; we just need to satisfy ourselves that we aren't going to paint users that have large fileAssets/keys into a corner. I propose:
The only other thing I can think of is to store them in a per-launchconfiguration directory in vfs, with deletion tied into the code that deletes old launchconfigurations (/launchtemplates).
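The 16KB figure discussed above is presumably the EC2 limit on instance user data. As a minimal, hypothetical sketch (not kops code; the helper and the limit constant are illustrative), this is the kind of check being talked about: gzip the rendered payload and compare it against the limit.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// maxUserDataBytes is the assumed 16 KiB limit on instance user data.
const maxUserDataBytes = 16 * 1024

// compressedUserDataSize gzips the rendered userdata and reports its size,
// on the assumption that the payload is compressed before being attached.
func compressedUserDataSize(userData []byte) (int, error) {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	if _, err := gz.Write(userData); err != nil {
		return 0, err
	}
	if err := gz.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	userData := []byte("#!/bin/bash\n# ... rendered nodeup config ...\n")
	n, err := compressedUserDataSize(userData)
	if err != nil {
		panic(err)
	}
	fmt.Printf("compressed userdata: %d bytes (limit %d)\n", n, maxUserDataBytes)
	if n > maxUserDataBytes {
		fmt.Println("config would need to spill to the state store (or kops-controller)")
	}
}
```

Certificates and keys are high-entropy data, so gzip barely shrinks them, which is why they are the most likely content to push a payload over the limit.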
Force-pushed from ca2c33f to 1db6e31.
@johngmyers: The following test failed.

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
/retest
@@ -161,6 +161,9 @@ func DeleteAllClusterState(basePath vfs.Path) error {
	if strings.HasPrefix(relativePath, "instancegroup/") {
		continue
	}
	if strings.HasPrefix(relativePath, "igconfig/") {
How about nodeconfig or configversion or fullconfig?
It's the config specific to particular instancegroups. First I partition by IG role (because that's the granularity our IAM roles currently have), then by IG. Whether the config is "full" or versioned is secondary to this partitioning.
c.AddTask(&fitasks.ManagedFile{
	Name:      fi.String("auxconfig-" + ig.Name),
	Lifecycle: b.Lifecycle,
	Location:  fi.String("igconfig/" + strings.ToLower(string(ig.Spec.Role)) + "/" + ig.Name + "/auxconfig.yaml"),
Does strings.ToLower(string(ig.Spec.Role)) need to be in the path?
That's so we can deny the nodes' IG role read access to other IG roles' config.
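For illustration, here is a tiny sketch of the path layout being described: partition by lowercased IG role first, then by IG name, so a role-scoped IAM policy can deny nodes access to other roles' prefixes. The helper name and role value are made up for the example; this is not the kops API.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// auxConfigLocation builds the state-store location discussed above:
// igconfig/<role>/<instance-group>/auxconfig.yaml.
func auxConfigLocation(role, igName string) string {
	return path.Join("igconfig", strings.ToLower(role), igName, "auxconfig.yaml")
}

func main() {
	fmt.Println(auxConfigLocation("Node", "nodes-us-east-1a"))
	// Output: igconfig/node/nodes-us-east-1a/auxconfig.yaml
}
```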
@@ -53,6 +53,8 @@ type Config struct {

	// DefaultMachineType is the first-listed instance machine type, used if querying instance metadata fails.
	DefaultMachineType *string `json:",omitempty"`
	// EnableLifecycleHook defines whether we need to complete a lifecycle hook.
	EnableLifecycleHook bool `json:",omitempty"`
Should we put these fields (EnableLifecycleHook, UpdatePolicy) into the AuxConfig? I figure we should make this config contain only what we need to load the other configuration.
I've been leaning towards putting as much of the small stuff into the userdata as possible. If we can gain confidence that we have enough testing to catch any breakage in the AuxConfig mechanism, I'd like to have nodeup skip loading the AuxConfig in the common case where everything fits in userdata.

I think we can get to the point where, in the common case, the worker nodes don't need any access to the state store.
Ah, I see. If you have to go to the state store, you might as well get everything from there instead of going field-by-field. So we'd put everything in Config and, when it's too big, spill it to the state store and put a smaller bootstrap thing in userdata.

I'll put it on the backlog, but unblocking the v1beta1 apiVersion is higher priority.
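A rough, illustrative-only sketch of that spill idea (the type and field names below are invented for the example, not the kops schema): keep the full config inline in userdata when it fits, otherwise write it to the state store and put only a small pointer record in userdata.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const maxUserDataBytes = 16 * 1024 // assumed user-data limit

// bootstrapConfig is a hypothetical userdata payload: either the full config
// inline, or a pointer to the full config in the state store.
type bootstrapConfig struct {
	Inline         json.RawMessage `json:",omitempty"`
	ConfigLocation string          `json:",omitempty"`
}

// buildUserData inlines the full config when it fits and otherwise leaves
// only the state-store location in userdata (the caller is assumed to have
// written the full config there).
func buildUserData(fullConfig []byte, stateStoreLocation string) ([]byte, error) {
	b, err := json.Marshal(bootstrapConfig{Inline: fullConfig})
	if err != nil {
		return nil, err
	}
	if len(b) <= maxUserDataBytes {
		return b, nil
	}
	// Too big for userdata: reference the copy in the state store instead.
	return json.Marshal(bootstrapConfig{ConfigLocation: stateStoreLocation})
}

func main() {
	ud, err := buildUserData([]byte(`{"Hooks":[],"FileAssets":[]}`), "igconfig/node/nodes/auxconfig.yaml")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(ud))
}
```

Note that in the spilled case the node still needs read access to the state store (or to kops-controller) to fetch the full config.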
}

// AuxConfig is the configuration for the nodeup binary that might be too big to fit in userdata.
type AuxConfig struct {
Maybe instead of calling this AuxConfig, NodeFullConfig would be consistent with ClusterFullConfig. I see nodeup.Config as more the auxiliary configuration ... it is the small config (constrained by userdata size) that lets us locate and load the real config.
I'd prefer not to put duplicate, unused fields in this object.

I have thought of using the same schema for userdata's Config and this file in the state store. That would allow putting the hooks, etc. in the userdata when they fit. The userdata would then have metadata saying "pull these fields from the one in the state store" when things don't fit.

But that is all more complicated and should be deferred until after we get more experience and better testing around the exceptional cases.
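To make the split concrete, here is a rough sketch of the two-struct shape being discussed, based on the PR description (Hooks and FileAssets are the fields likely to overflow userdata). The field types and the AuxConfigLocation field are simplified placeholders, not the actual kops types.

```go
package nodeup

// Config is the small, userdata-sized configuration. It carries just enough
// information for nodeup to locate and load the rest of its configuration.
type Config struct {
	// AuxConfigLocation is a placeholder for however nodeup finds the
	// instance-group-specific file in the state store.
	AuxConfigLocation string `json:",omitempty"`
}

// AuxConfig holds the instance-group-specific data that may be too large for
// userdata and therefore lives in the state store.
type AuxConfig struct {
	Hooks      []string `json:",omitempty"`
	FileAssets []string `json:",omitempty"`
}
```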
A few quibbles, but I like where this is going, and I think we can discuss the quibbles separately. Going to apply a hold in case you immediately agree with any of the suggestions (and want to change them here), but otherwise please just remove the hold.

/approve
/hold
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: justinsb.
These comments are all things that can be dealt with in future PRs, possibly in future releases.
Move all data from InstanceGroup into NodeupModelContext.NodeupConfig to address some of the issues mentioned in #9229. There could be an issue if the Hooks or FileAssets are too big to fit in userdata.

Creates a new NodeupAuxConfig struct which is read from an instancegroup-specific file in the state store.