-
Notifications
You must be signed in to change notification settings - Fork 390
Add support for drop-in configs #1710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
18681f2
to
8193205
Compare
Please note. This is a very quick first draft. |
fff1388
to
456ff8f
Compare
controllers/object_controls.go
Outdated
topLevelConfigFile = value | ||
} | ||
// TODO: This should be a sane default. | ||
dropInConfigFile := "/run/toolkit/config/99-nvidia.toml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question -- what about /run/nvidia/toolkit/config/99-nvidia.toml
? The toolkit container creates the toolkit.pid
file at /run/nvidia/toolkit/toolkit.pid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated this to /run/nvidia/toolkit/config/99-nvidia.toml
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tariq1890 and I discussed this offline. I have updated the host path to /etc/containerd/conf.d/99-nvidia.toml
controllers/object_controls.go
Outdated
case gpuv1.Containerd.String(): | ||
runtimeConfigFile = DefaultContainerdConfigFile | ||
topLevelConfigFile := DefaultContainerdConfigFile | ||
// TODO: We should also read RUNTIME_CONFIG here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Should RUNTIME_CONFIG
, if defined, take precedence over CONTAINERD_CONFIG
/CRIO_CONFIG
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the latter should have higher precedence as it is more specific than RUNTIME_CONFIG
. If you're in agreement with this, can we resolve the TODO in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Let me update this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated so we now read RUNTIME_CONFIG
as well, with the more specific envvars, e.g. CONTAINERD_CONFIG
/ CRIO_CONFIG
, taking precedence. I also added additional unit test cases for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
b3245de
to
f60dadd
Compare
aad82d3
to
c5dc4f4
Compare
Why can't we mount the parent dir of the top-level config and the drop-in config dir? It seems like a lot of the container-toolkit implementation details are spilling into this PR? Any way to minimise? |
We are mounting the parent dir for both config files. |
c5dc4f4
to
764cdb3
Compare
This change adds explicit support for drop-in configs as supported by containerd and cri-o. Signed-off-by: Evan Lezar <elezar@nvidia.com> Co-authored-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
764cdb3
to
584c4c5
Compare
🚀 🚀 🚀 |
This change adds explicit support for drop-in configs as supported by containerd and cri-o.
See: