Podman socket performance issues #14941
Comments
Have you tried removing the Nomad client and flooding the socket without it?
@jwhonce any thoughts?
Hey @baude! Thanks for taking a look.
No, I haven't tried that. I am trying to work out how to reproduce the same conditions without the Nomad client running. The driver uses the socket to stream logs for each container, so I think a lot of things build up to the socket getting overloaded.
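One way to approximate the load without the Nomad client is to fire many concurrent HTTP requests at the Podman socket and record failures and latency. The following is only a minimal sketch under stated assumptions: the helper names (`UnixHTTPConnection`, `ping_once`, `flood`) are hypothetical, the `/_ping` path is assumed to be served by the Podman API service, and plain pings do not reproduce the per-container log streaming the driver performs, so it may not trigger the same behavior.

```python
import http.client
import socket
import time
from concurrent.futures import ThreadPoolExecutor


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        # The host name is only used for the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def ping_once(socket_path, path, timeout):
    """Send one GET request; return (HTTP status or None on error, seconds)."""
    start = time.monotonic()
    conn = UnixHTTPConnection(socket_path, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        status = resp.status
    except (OSError, http.client.HTTPException):
        # Connection refused, missing socket, timeout, or malformed reply.
        status = None
    finally:
        conn.close()
    return status, time.monotonic() - start


def flood(socket_path, path="/_ping", total=500, workers=100, timeout=5.0):
    """Issue `total` requests using `workers` threads and summarize them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(
            pool.map(lambda _: ping_once(socket_path, path, timeout), range(total))
        )
    latencies = sorted(elapsed for _, elapsed in results)
    return {
        "requests": total,
        "failures": sum(1 for status, _ in results if status != 200),
        "median_latency": latencies[len(latencies) // 2],
        "max_latency": latencies[-1],
    }
```

Calling `flood("/run/podman/podman.sock")` as root (the socket path is an assumption for a rootful setup) returns a small summary dict. Raising `workers` and `total` until `failures` becomes non-zero gives a rough idea of where the socket stops keeping up.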
Is it possible to exactly reproduce what you are doing? Otherwise, this is a lot to ask.
If it is just the log endpoint, it is tracked here: #14879
@baude Not without launching your own Nomad cluster and loading each client node with 200+ containers. I understand it's a lot to ask, and I am willing to do whatever I can on my end to provide more information.
@Luap99 Yeah, the driver does track the log endpoint. Here is where I believe it is doing that: https://github.com/hashicorp/nomad-driver-podman/blob/main/api/container_logs.go#L16
It looks like I can disable log collection in the Nomad Podman driver.
I am going to test that out on my client nodes and see if I have better performance when deploying a lot of containers at once.
A friendly reminder that this issue had no activity for 30 days.
Since we have heard nothing back in a month, I am guessing that the issue is resolved. Reopen if I am mistaken.
I am still seeing issues, but I haven't been able to dig into it more. I will respond once I have more info.
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
This is a cross-post to see whether anything can be done to improve the performance of the Podman socket under high concurrency.
I opened hashicorp/nomad-driver-podman#175 on the Nomad Podman driver project to track down why Podman on my Nomad client nodes becomes overwhelmed and unresponsive under high concurrency. This seems to be a common issue for other users of the Nomad Podman driver.
Is there anything that can be done to help improve the performance of the Podman socket? Are there any tips from the Podman team on how to better debug this issue to get more information?
Steps to reproduce the issue:
Launch hundreds of containers per client node with Nomad
Watch the podman socket become unavailable and my Nomad job allocations start failing
Additional information you deem important (e.g. issue happens only occasionally):
Podman is being run as root on these client nodes on Fedora CoreOS 36.20220618.3.1 on Google Compute VMs.
Output of podman version:
Output of podman info --debug:
Package info (e.g. output of rpm -q podman or apt list podman):