-
Notifications
You must be signed in to change notification settings - Fork 3.5k
NRI: add support for NRI with extended scope. #6019
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @klihub. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Merge Failed. This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset. |
1797e06
to
1b6b8a5
Compare
Merge Failed. This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset. |
Looks like this was opened from a branch that was outdated; can you try rebasing your branch to get rid of the unrelated commits? |
1b6b8a5
to
3b3ffc1
Compare
Rebased on top of main. |
Build succeeded.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the slack chat on this important and interesting project..
first review note: probably needs a couple md files/edits somewhere in the project for setup and use ... e.g. a small README.md in the contrib/nri path and maybe start a stub for a containerd/docs/nri.md file to help reviewers get started
Merge Failed. This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset. |
Build succeeded.
|
2352174
to
fa82f0e
Compare
Build succeeded.
|
Build succeeded.
|
Sorry for so late reply on this pull request. I like that idea - TTRPC-izes NRI, even if NRI is about binary call. But for stub of CRI, I have questions:
containerd has integrated NRI with And currently, with common OCI-Runtime-Spec implementations, the container Thanks, |
No problem. And thanks for the extra effort of taking a look at this. I really appreciate it !
One thing related to this. Based on our discussion with @mikebrow, we would like to eventually extend our integration/implementation to also handle containers created using other 'frontends' protocols than just CRI (ctr and docker in particular). We rolled our initial implementation this way, partly because the current NRI integration is NRI-only and partly because we'd like to better understand all the implications and problems related to extending it to the rest.
I assume here that by 'stub of CRI' you mean the code added in this patch set to the CRI plugin for triggering NRI processing and altering containers accordingly. Please let me know if this assumption is wrong. That stub, the new NRI-related additions/adaptation, does not maintain state of the containers on its own. It basically does the following things:
State synchronization lets NRI know about all existing pods and containers on startup. Additionally, it allows any NRI plugin that establishes a connection to NRI dynamically to synchronize itself. For request processing, the stub gathers all the necessary data for plugins, makes the call to NRI, then takes the response and applies the necessary changes to containers, also making sure that the changes are applied in the correct order. There are slight differences in how containerd and cri-o mangle and store some of the data carried by CRI into an OCI container, for instance the details of how labels and annotations get converted (if at all) to OCI annotations. We want to hide these differences as much as possible from NRI plugins, so we need to have some extra processing/logic for this in the runtime NRI adaptation.
Yes, I've tried to follow the same pattern. When containers are stopped, hooking in NRI processing is done in a similar fashion as deletion is handled for the original NRI plugins.
The primary purpose of Post* stub calls is to allow a plugin to perform any extra action outside of the realm of control of OCI if necessary. Additionally, these provide a mechanism for a plugin to acquire the final state a container ended up in, once it has been processed by all NRI plugins.
They are not just pure wrappers. Many of them, at least the ones that can result in modifications to containers, need to happen at a particular point during the processing of the corresponding CRI request. So I think these can't really be implemented as a pure add-on that would happen in addition to opaque processing of the related CRI request. I guess those requests that can't result in container modifications could in principle be implemented like that (pod-related ones and the Post* ones). But even for those I'm not sure if digging out all the necessary information for hooking in NRI would be really possible from 'outside' of the CRI plugin.
If I misunderstood some aspect of your questions, please let me know. |
a258b9e
to
ea21711
Compare
f360d32
to
15e6f73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with adding the following TODO's:
-
TODO before release: change to default enabled without experimental warning, in order to enable v010 nri plugins this needs to be reconsidered before release, by default should include the v010 adapter in the release - and it should probably be loaded by default not through the ttrpc but as a pre-registered.
-
TODO before release: having two configs to enable nri is confusing /etc/nri/nri.conf vs containerd/config.toml
on bringing down the logging plugin I get an error:
ERRO[2022-11-22T11:57:26.541929641-06:00] error receiving message error="failed to read header from trunk: EOF"
INFO[2022-11-22T11:57:26.541953042-06:00] connection to plugin "00-logger" closed
INFO[2022-11-22T11:57:26.542130540-06:00] ttrpc server for plugin "00-logger" closed (ttrpc: server closed)
TODO: is there a way to eat that error message and just close?
Yes. I updated it and now if you bring down the plugin it's logged like this:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM to merge with the following TODO's for follow ups:
- TODO: after merging this PR and Extend scope to enable common pluggable runtime extensions. nri#16 remove temporary vendor divert to github.com/klihub/nri
- TODO before release: change to default enabled without experimental warning, in order to enable v010 nri plugins this needs to be reconsidered before release, by default should include the v010 adapter in the release - and it should probably be loaded by default not through the ttrpc but as a pre-registered.
- TODO before release: having two configs to enable nri is confusing /etc/nri/nri.conf vs containerd/config.toml
I can change the initialization code in NRI itself so that the lack of a configuration file is treated as an empty configuration (file) instead of an initialization error (which then results in NRI getting disabled). With that change, NRI could be enabled by simply enabling it in the containerd config. |
Add a common NRI 'service' plugin. It takes care of relaying requests and respones to and from NRI (external NRI plugins) and the high-level containerd namespace-independent logic of applying NRI container adjustments and updates to actual CRI and other containers. The namespace-dependent details of the necessary container manipulation operations are to be implemented by namespace- specific adaptations. This NRI plugin defines the API which such adaptations need to implement. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Implement the adaptation interface required by the NRI service plugin to handle CRI sandboxes and containers. Hook the NRI service plugin into CRI request processing. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Remove direct invocation of old v0.1.0 NRI plugins. They can be enabled using the revised NRI API and the v0.1.0 adapter plugin. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Co-authored-by: Mike Brown <brownwm@us.ibm.com> Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LTGM ready to merge IMO can do TODOs in followups
Removing backport label, added in #7954 |
This PR is one in a set of 3 PRs, that propose and implement a prototype to extend the scope of NRI to enable common, pluggable runtime extensions. The other related PRs in this set are:
Please refer to the NRI PR for a general description of what this is all about.