Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ansible-test: Set max number of open files in containers to 10240 #75498

Merged
merged 4 commits into from Oct 27, 2021

Conversation

Andersson007
Copy link
Contributor

@Andersson007 Andersson007 commented Aug 13, 2021

SUMMARY

I'm developing a collection for one service from scratch.

I wrote integration tests.

When trying to run the service with ansible-test with --docker, I got the following error from the service I want to test against:

ERROR: failed to start server: failed to create engines: hard open file descriptor limit of 1024 is under the 
minimum required 1956

I tried to run ulimit -n 9000 inside the container with and without --docker-privileged -> in both the cases i got "error setting limit (Operation not permitted)".

With this patch, the following command works now and the service starts successfully:

ansible-test integration test_service_query --docker ubuntu2004 --docker-ulimit "nofile=90000:90000" -vvv > ~/test.log

I'm also OK with if ulimits will be increased in docker files for the tests containers
However, this solution gives more flexibility and covers more possible cases

UPDATE
During discussion we decided to hardcode 10240 as max number of open files to not provide the additional option.

ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME

ansible-test

@ansibot ansibot added affects_2.12 core_review In order to be merged, this PR must follow the core review workflow. feature This issue/PR relates to a feature request. needs_triage Needs a first human triage before being processed. support:core This issue/PR relates to code supported by the Ansible Engineering Team. test This PR relates to tests. labels Aug 13, 2021
Copy link
Member

@mattclay mattclay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of requiring users to pass another option to make tests work, I think we should pick a ulimit and have ansible-test set it automatically. The question then becomes, what should the ulimit be?

@ansibot ansibot added needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. and removed core_review In order to be merged, this PR must follow the core review workflow. labels Aug 16, 2021
@Andersson007
Copy link
Contributor Author

Andersson007 commented Aug 17, 2021

Instead of requiring users to pass another option to make tests work, I think we should pick a ulimit and have ansible-test set it automatically. The question then becomes, what should the ulimit be?

XFS supports 2 ** 64 files in the system:) I don't think many services check and require higher ulimit than the standard number. We could increase it, say, twice, i.e. to 2048 for now (that's also fine for my service). Then, we can increase it more later if someone else needs it.

@mattclay if it sounds sensible, i could adjust the number in this PR.

@mattclay
Copy link
Member

@Andersson007 Yeah, a hard-coded ulimit of 2048 seems like a reasonable fix for now, since this is the first time I'm aware that this issue has come up.

@Andersson007 Andersson007 changed the title ansible-test: add the --docker-ulimit argument ansible-test: Automatically increase max number of open files in test container from 1024 to 2048 Aug 17, 2021
@ansibot ansibot added core_review In order to be merged, this PR must follow the core review workflow. and removed needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. labels Aug 17, 2021
Andersson007 and others added 2 commits August 17, 2021 09:50
Co-authored-by: Matt Clay <matt@mystile.com>
Co-authored-by: Matt Clay <matt@mystile.com>
@mattclay mattclay removed the needs_triage Needs a first human triage before being processed. label Aug 17, 2021
@ansibot ansibot added needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. and removed core_review In order to be merged, this PR must follow the core review workflow. labels Aug 17, 2021
@Andersson007
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ansibot ansibot added core_review In order to be merged, this PR must follow the core review workflow. and removed needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. labels Aug 17, 2021
@Andersson007
Copy link
Contributor Author

the tests are green now

Copy link
Member

@mattclay mattclay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking a closer look at this, I'm not sure this is what we need.

The default hard limit comes from the Docker daemon, which at least on my system is much higher than 2K. Attempting to raise that limit with --ulimit fails, so this change would only lower it.

The soft limit of 1K comes from ansible-test itself. Changing it with --ulimit will only affect what the container uses, not what ansible-test uses inside the container.

@Andersson007 Can you run ansible-test shell --docker -vv without this change and see what ansible-test reports for RLIMIT_NOFILE inside the container (the second one reported)?

@Andersson007
Copy link
Contributor Author

Andersson007 commented Aug 18, 2021

I see the following:

RLIMIT_NOFILE: (1024, 1048576)    # this is the first one
...
RLIMIT_NOFILE: (1024, 1024)   # this is at the end
...

I also ran

root@3950af2f216f:~/ansible$ ulimit -n
1024

@Andersson007
Copy link
Contributor Author

$ docker --version
Docker version 19.03.13, build 4484c46

Fedora release 33. Docker was installed via dnf, settings are default

@mattclay
Copy link
Member

@Andersson007 OK, that's what I expected. Try changing the limit to 2048 here:

Let me know if that resolves the issue you're seeing.

@Andersson007
Copy link
Contributor Author

@mattclay now the log contains
RLIMIT_NOFILE: (1024, 1048576)
RLIMIT_NOFILE: (1024, 1024)
ANSIBLE_PYTHON_MODULE_RLIMIT_NOFILE=2048

And my tests fail again with the original error:

ERROR: failed to start server: failed to create engines: hard open file descriptor limit of 1024 is un
der the minimum required 1956

So it didn't help

@mattclay
Copy link
Member

@Andersson007 My main concern with this now is that it will lower the limit for existing users if their limit is higher than 2K. Unfortunately I couldn't find any way to query the docker default without actually running a container. We could just try 2K and see if anyone reports issues, or perhaps pick a higher value to be safe. Since we're still setting the soft limit within ansible-test, a higher hard limit for docker shouldn't cause problems unless it exceeds the system's limit. Thoughts?

@Andersson007
Copy link
Contributor Author

@mattclay it's hard to imagine if anyone really need more than even 2048 files open for purposes of Ansible tests but who knows. The most important question is if the higher limit can cause security issues for testing infrastructure? If not, we could increase it to 4096 or even 10240 and see.

@mattclay
Copy link
Member

@Andersson007 Lets go with 10K and see if anyone has issues with it.

@ansibot ansibot added the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Aug 27, 2021
@ansibot ansibot added core_review In order to be merged, this PR must follow the core review workflow. and removed needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. labels Aug 27, 2021
@Andersson007 Andersson007 changed the title ansible-test: Automatically increase max number of open files in test container from 1024 to 2048 ansible-test: Increase max number of open files from default to 10240 Aug 27, 2021
@ansibot ansibot added needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. and removed core_review In order to be merged, this PR must follow the core review workflow. labels Aug 27, 2021
@Andersson007
Copy link
Contributor Author

@mattclay done, PTAL

@Andersson007
Copy link
Contributor Author

@mattclay while the collection is still in development and I think it'll take some time, right now this change is not really needed.
I can test locally running ansible-test from the branch.
I'll close this PR and when the collection is ready for submission, I'll re-open it. I think it would be better.
Thank you!

@ansible ansible locked and limited conversation to collaborators Sep 24, 2021
@Andersson007 Andersson007 reopened this Oct 26, 2021
@Andersson007
Copy link
Contributor Author

@Andersson007 Lets go with 10K and see if anyone has issues with it.

@mattclay it's been done and i've just transferred the new collection to ansible-collections.
Can we merge this to allow the collection to be tested against devel?

@ansibot ansibot added core_review In order to be merged, this PR must follow the core review workflow. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. stale_review Updates were made after the last review and the last review is more than 7 days old. and removed needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. core_review In order to be merged, this PR must follow the core review workflow. labels Oct 26, 2021
@mattclay mattclay changed the title ansible-test: Increase max number of open files from default to 10240 ansible-test: Set max number of open files in containers to 10240 Oct 27, 2021
@mattclay mattclay merged commit e50ad6f into ansible:devel Oct 27, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
affects_2.12 feature This issue/PR relates to a feature request. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. stale_review Updates were made after the last review and the last review is more than 7 days old. support:core This issue/PR relates to code supported by the Ansible Engineering Team. test This PR relates to tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants