Switch to long-lived container resolvers #1382
adalton
left a comment
All nits.
gnosek
left a comment
While the code changes look okay at a glance, I still fail to understand how that helps with locks held over a fork.
This helps with forks because some tests called exit() in a forked child, which closes all FILEs, calls the destructors of all globals, etc. Since it's a forked child, the destructors were being called twice (in parent + child), and that caused a hang due to one of the semaphores. The corresponding agent PR had a test that explicitly triggers this. Aside from this change, I could make the test fail/pass by switching between exit() and _exit() (which skips all that teardown stuff).
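For readers who want to see the exit() vs. _exit() difference in isolation, here is a minimal sketch, assuming a POSIX system. The names `global_state` and `child_runs_global_destructor` are hypothetical and are not taken from this PR or from the agent test; the sketch only shows that a forked child leaving via exit() runs the destructor of a global, while _exit() skips it.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

namespace {

// Hypothetical stand-in for a global that owns process-wide state.
// Its destructor writes one byte to report_fd, if one has been set.
struct global_state {
    int report_fd = -1;
    ~global_state() {
        if (report_fd >= 0) {
            const char byte = 'd';
            ssize_t ignored = write(report_fd, &byte, 1);
            (void)ignored;
        }
    }
};

global_state g_state;

}  // namespace

// Forks a child that terminates with exit() or _exit() and returns true
// if the child ran g_state's destructor on its way out.
bool child_runs_global_destructor(bool use_exit) {
    int fds[2];
    if (pipe(fds) != 0) {
        return false;
    }
    std::fflush(nullptr);  // avoid duplicated stdio output in the child
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {
        close(fds[0]);
        g_state.report_fd = fds[1];  // only the child reports
        if (use_exit) {
            std::exit(0);  // runs atexit handlers and static/global destructors
        }
        _exit(0);  // terminates immediately, skipping that teardown
    }
    close(fds[1]);
    char byte = 0;
    ssize_t n = read(fds[0], &byte, 1);  // 0 means EOF: nothing was written
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return n == 1;
}
```

Calling `child_runs_global_destructor(true)` should report that the destructor ran in the child, and `child_runs_global_destructor(false)` that it did not. In the parent the destructor still runs at normal exit, but `report_fd` is unset there, so it writes nothing.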
91f5561 to e9d73bf
gnosek
left a comment
I just realized exit() only calls destructors of statics/globals, not of local variables, so moving from one to the other will indeed help.
Instead of having container engine objects be short-lived, only created at the instant resolve() is called, have them be long-lived and a part of the sinsp_container_manager object. This involves:
- create a stub base class libsinsp::container_engine::resolver that defines virtual void resolve() and virtual cleanup() methods.
- make all the classes in container_engine/* derive from the base class.
- in sinsp_container_manager, create a list of container_engine::resolver objects.
- when resolving containers, iterate over the list instead of using the templated functions resolve_container_impl().

This fixes SMAGENT-1569, because there is no global state related to the container engines.
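The shape of this change can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code in the PR: the `sketch` namespace, the `docker_like` and `cri_like` engines, the `manager` class, and the `bool resolve(const std::string&)` signature are all hypothetical stand-ins for the real libsinsp types and signatures.

```cpp
#include <list>
#include <memory>
#include <string>

namespace sketch {

// Stub base class: every engine overrides resolve(); cleanup() defaults to a no-op.
class resolver {
public:
    virtual ~resolver() = default;
    // Simplified, assumed signature: true if this engine recognizes the id.
    virtual bool resolve(const std::string& container_id) = 0;
    virtual void cleanup() {}
};

// Hypothetical engine that claims ids with a "docker-" prefix.
class docker_like : public resolver {
public:
    bool resolve(const std::string& container_id) override {
        return container_id.rfind("docker-", 0) == 0;
    }
};

// Hypothetical engine that claims ids with a "cri-" prefix.
class cri_like : public resolver {
public:
    bool resolve(const std::string& container_id) override {
        return container_id.rfind("cri-", 0) == 0;
    }
};

// The manager owns the engines for its whole lifetime, so engine state
// lives in a member object rather than in globals.
class manager {
public:
    manager() {
        m_engines.push_back(std::make_unique<docker_like>());
        m_engines.push_back(std::make_unique<cri_like>());
    }

    ~manager() {
        for (auto& engine : m_engines) {
            engine->cleanup();
        }
    }

    // Iterate over the list instead of dispatching through templated helpers.
    bool resolve_container(const std::string& container_id) {
        for (auto& engine : m_engines) {
            if (engine->resolve(container_id)) {
                return true;
            }
        }
        return false;
    }

private:
    std::list<std::unique_ptr<resolver>> m_engines;
};

}  // namespace sketch
```

Because the engines are members of the manager, they are destroyed with it rather than during static teardown at exit().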
e9d73bf to 5d8eeb2
Instead of creating the container engines in the constructor, create them at the first call to resolve_container(). This gives time to set things like the cri socket, timeout, etc to alternate values.
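A rough sketch of the deferred-creation idea, under the assumption that settings are captured when the engine is first built. The `lazy_sketch` namespace, the `engine` and `manager` classes, `set_socket_path()`, and the placeholder socket paths are hypothetical and do not reflect the actual libsinsp API or defaults.

```cpp
#include <memory>
#include <string>
#include <utility>

namespace lazy_sketch {

// Hypothetical engine that captures its socket path at construction time.
class engine {
public:
    explicit engine(std::string socket_path)
        : m_socket_path(std::move(socket_path)) {}

    const std::string& socket_path() const { return m_socket_path; }

private:
    std::string m_socket_path;
};

class manager {
public:
    // May be called any time before the first resolve_container().
    void set_socket_path(const std::string& path) { m_socket_path = path; }

    bool engines_created() const { return m_engine != nullptr; }

    // The engine is built on the first call rather than in the constructor,
    // so it sees whatever settings were applied in between.
    const engine& resolve_container() {
        if (!m_engine) {
            m_engine = std::make_unique<engine>(m_socket_path);
        }
        return *m_engine;
    }

private:
    std::string m_socket_path = "/run/placeholder.sock";  // placeholder default
    std::unique_ptr<engine> m_engine;
};

}  // namespace lazy_sketch
```

In this sketch, a setting changed after the first `resolve_container()` call is not picked up, since the engine has already been created.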
Add additional debug logging and slightly modify existing logging to add a consistent "cri: xxx" or "cri (<container id>)" prefix, like we do for docker.
@adalton and @gnosek, I made some additional changes to defer creating the container engines until the first call to resolve_container. This allows time to specify an alternate cri socket path, timeout, etc. I'd like to get your feedback. I didn't go the full step of moving the static values like the socket path, timeout, etc out of libsinsp/cri.cpp, but if you feel like it's better to move them into the cri engine, let me know.