SPEC: add wording for GC #981

Merged
merged 1 commit into from Aug 7, 2023

Conversation

squeed
Member

@squeed squeed commented Mar 20, 2023

This is an initial draft of the wording for the GC verb.

TODO:

  • Need to declare that concurrent operations are not allowed (e.g. no ADD and GC at the same time)
  • Describe the execution flow (akin to deletion)
  • Final proofreading

@coveralls

coveralls commented Mar 20, 2023

Coverage Status

coverage: 72.685%. remained the same when pulling 3072cfe on squeed:spec-gc into 3bbe370 on containernetworking:main.

@MikeZappa87
Contributor

@henry118 give this a look!

SPEC.md Outdated

The runtime must provide a JSON-serialized plugin configuration object (defined below) on standard input. It contains an additional key:

- `cni.dev/attachments` (array of objects): The list of **still valid** attachments to this network:
Member

I know we've been mentioning "attachment" everywhere. I wonder whether we should formally define the term now, because this is the first time we'll pass it around as an object.

Member Author

Good point; attachments are implied by the spec but it would be good to make them clear.

Member Author

Funny that you mentioned this -- #903 has been lying around :-)
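
For readers following the thread, here is a minimal sketch of what the GC stdin configuration from the excerpt above might look like. The `cni.dev/attachments` key comes from the draft; the per-attachment fields `containerID` and `ifname`, the helper name `build_gc_config`, and all sample values are assumptions made for illustration, since the excerpt does not show the object schema.

```python
import json


def build_gc_config(base_config, valid_attachments):
    """Return the JSON a runtime could write to a plugin's stdin for GC.

    base_config: the ordinary plugin configuration as a dict.
    valid_attachments: iterable of (container_id, ifname) pairs that are
    still valid. The field names used below are assumed, not taken from
    the draft excerpt.
    """
    config = dict(base_config)  # shallow copy so the caller's dict is untouched
    config["cni.dev/attachments"] = [
        {"containerID": container_id, "ifname": ifname}
        for container_id, ifname in valid_attachments
    ]
    return json.dumps(config)


# Illustrative values only.
gc_stdin = build_gc_config(
    {"cniVersion": "1.1.0", "name": "examplenet", "type": "bridge"},
    [("container-a", "eth0"), ("container-b", "eth0")],
)
```

The helper copies the base configuration rather than mutating it, so the same dict can still be reused for ADD, DEL, and CHECK calls.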

SPEC.md Outdated
#### `GC`: Clean up any stale resources

The GC command provides a way for runtimes to specify the expected set of attachments to a network.
The network plugin may then remove any resources related to attachments that do not exist in this set.
Member

What about attachments existing in this set but not found by plugins? Do we need to define the behavior? e.g. error out vs. silently ignored?

Member Author

Hmm, interesting idea. We could certainly consider returning those as "potentially failed attachments". However, I think CHECK already covers this use case reasonably well, and I don't want to push too much of a burden on individual plugins.

Member

Got it.

SPEC.md Outdated
- IPAM reservations
- Firewall rules

A plugin SHOULD remove as many stale resources as possible. For example, a plugin should remove any IPAM reservations associated with attachments not in the provided list. The plugin MAY assume that the isolation domain (e.g. network namespace) has been deleted, and thus any resources (e.g. network interfaces) therein have been removed.
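
As an illustration of the IPAM example in that paragraph, a plugin could work out which reservations are stale with a set difference against the provided list. This is only a sketch: the in-memory `reservations` mapping, the `(containerID, ifname)` key, and the function name `find_stale_reservations` are hypothetical and not part of the draft.

```python
def find_stale_reservations(reservations, valid_attachments):
    """Return the reservations whose attachment is not in the valid list.

    reservations: dict mapping (container_id, ifname) -> reserved IP string.
    valid_attachments: list of dicts with assumed keys "containerID" and
    "ifname", as supplied by the runtime for GC.
    """
    valid = {(a["containerID"], a["ifname"]) for a in valid_attachments}
    return {key: ip for key, ip in reservations.items() if key not in valid}


# Illustrative data: container-b is no longer attached, so its IP is stale.
reservations = {
    ("container-a", "eth0"): "10.0.0.2",
    ("container-b", "eth0"): "10.0.0.3",
}
stale = find_stale_reservations(
    reservations, [{"containerID": "container-a", "ifname": "eth0"}]
)
```

An empty list of valid attachments marks every reservation as stale, which matches the idea that the runtime's list is the expected set for the network.
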
Member

What should be the return value of GC? Can the runtime assume GC always succeeds? Otherwise how does runtime know which attachments are removed and which are not?

Member Author

Good question. What sort of information would be useful for the runtime? For example, we could have the plugin return whether or not any resources were cleaned up. If we wanted to indicate potentially-misconfigured attachments, we could return them here (though I'm not yet convinced this is useful).

As for returning an error, the semantics are the same as with DEL: don't return an error if resources are missing; however, if an error condition prevents successful GC, expose that fact to the runtime.
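
The DEL-like semantics described here can be sketched as follows, assuming a hypothetical in-memory `ReservationStore` (nothing in the draft prescribes this shape). A missing resource is treated as already cleaned up, while a condition that actually prevents cleanup is raised so the caller can report it to the runtime.

```python
class ReservationStore:
    """Hypothetical store of reservations, used only to show the semantics."""

    def __init__(self, entries, writable=True):
        self.entries = dict(entries)
        self.writable = writable

    def release(self, key):
        """Release one reservation; return True if something was removed."""
        if key not in self.entries:
            return False  # already gone: not an error, same as DEL
        if not self.writable:
            # A real failure that prevents cleanup must be surfaced.
            raise PermissionError(f"cannot release {key!r}: store is read-only")
        del self.entries[key]
        return True
```

The boolean return value is one possible way to tell the caller whether any resource was cleaned up, which is the kind of information floated earlier in this comment.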

Member

Yeah I am not sure how much the runtime can do in cases of GC failures, apart from logging an error message. But I do think it would be useful for the runtimes to be aware of GC failures. So if someone were to analyze resource leaks, the log message could be a good indicator.

So we might want to define "what is a GC failure" for a CNI plugin, e.g.

  • Plugin's GC procedure cannot be started; or
  • An attachment failed to be deleted; or
  • The resources of an attachment are missing (this case is probably safe to ignore)

When CNI finishes invoking GC on all "leaked" attachments, it should return an error to the runtime if any of the attachments fails GC.
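
A rough sketch of that aggregation, under the assumption that cleanup is attempted for every stale attachment before a single error is reported. The names `GCFailed` and `gc_all`, and the `cleanup` callback, are hypothetical.

```python
class GCFailed(Exception):
    """Raised after the whole GC pass if any attachment failed cleanup."""

    def __init__(self, failures):
        self.failures = failures  # attachment -> error message
        super().__init__(f"GC failed for {len(failures)} attachment(s)")


def gc_all(stale_attachments, cleanup):
    """Run cleanup for every stale attachment, then report failures once.

    cleanup(attachment) is expected to return normally when resources are
    removed or already missing, and to raise when cleanup is prevented.
    """
    failures = {}
    for attachment in stale_attachments:
        try:
            cleanup(attachment)
        except Exception as exc:  # keep going so one failure does not block the rest
            failures[attachment] = str(exc)
    if failures:
        raise GCFailed(failures)
```

Continuing past individual failures keeps GC best-effort, and the collected `failures` mapping gives the runtime something specific to log when investigating resource leaks.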

Member Author

OK, we worked on wording for this, PTAL.

Member

Thanks.

@squeed squeed force-pushed the spec-gc branch 2 times, most recently from d3acf5b to 8cecebf Compare April 20, 2023 11:12
Member

@henry118 henry118 left a comment

LGTM in general. Just some nits, PTAL

SPEC.md Outdated
### Garbage-collecting a network
The runtime may also ask every plugin in a network to clean up any stale resources.

Garbage collection is similar to add with two exceptions:
Member

You mean similar to "del"?

@MikeZappa87
Contributor

@henry118 we made some updates

@maiqueb

maiqueb commented Jul 25, 2023

/cc

GC, or garbage collection, is intended as a way for runtimes to tell
plugins about valid attachments, enabling them to delete any leaked or
stale resources.

Signed-off-by: Casey Callendrello <c1@caseyc.net>
@dcbw
Member

dcbw commented Aug 7, 2023

LGTM

@dcbw dcbw merged commit f6506e2 into containernetworking:main Aug 7, 2023
5 checks passed