[bug-fix] Set number of threads based on allocated CPU count in Docker containers #4471

ervteng · 2020-09-11T00:32:37Z

Proposed change(s)

By default, PyTorch sets its number of threads to half of the available cores. In Docker containers, however, the built-in Python methods return the core count of the host machine, not of the container. This PR reads the allocated CPU count from the container's cgroup and uses it to set the number of threads.
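For readers unfamiliar with the cgroup approach, here is a minimal sketch of the idea, not the code in this PR. It assumes the cgroup v1 layout, where the CPU limit is exposed as `cpu.cfs_quota_us` and `cpu.cfs_period_us` under `/sys/fs/cgroup/cpu`. The helper names `read_int_file` and `allocated_cpus` are hypothetical.

```python
import os
from typing import Optional


def read_int_file(path: str) -> Optional[int]:
    """Return the integer stored in a file, or None if it cannot be read or parsed."""
    try:
        with open(path) as f:
            return int(f.read().rstrip())
    except (OSError, ValueError):
        return None


def allocated_cpus(cgroup_dir: str = "/sys/fs/cgroup/cpu") -> Optional[int]:
    """Estimate the CPUs allocated to this process.

    Assumes cgroup v1 file names. A quota of -1 means "no limit", in which
    case (or when the files are missing) we fall back to os.cpu_count().
    """
    quota = read_int_file(os.path.join(cgroup_dir, "cpu.cfs_quota_us"))
    period = read_int_file(os.path.join(cgroup_dir, "cpu.cfs_period_us"))
    if quota is not None and period is not None and quota > 0 and period > 0:
        # Fractional allocations (e.g. 0.5 CPU) are rounded down, but never below 1.
        return max(quota // period, 1)
    return os.cpu_count()


if __name__ == "__main__":
    print(allocated_cpus())
```

The result could then be halved, clamped, and passed to `torch.set_num_threads`. On hosts without these files the sketch simply reports `os.cpu_count()`.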

Types of change(s)

Checklist

Added tests that prove my fix is effective or that my feature works
Updated the changelog (if applicable)
Updated the documentation (if applicable)
Updated the migration guide (if applicable)

Other comments

chriselion · 2020-09-11T00:42:22Z

Nice catch. I've run into this before and didn't think about it for pytorch.

chriselion · 2020-09-11T00:43:23Z

ml-agents/mlagents/torch_utils/cpu_utils.py

+def _read_in_integer_file(filename: str) -> int:
+    try:
+        with open(filename) as f:
+            return int(f.readlines()[0])


There's some code on https://bugs.python.org/issue36054 - they use the equivalent of int( f.read().rstrip() )

I like the read().rstrip() better than what I had, changed
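To illustrate the difference between the two variants, here is a small sketch with hypothetical helper names. Both parse a well-formed file identically, but on an empty file `readlines()[0]` raises `IndexError` while `int(f.read().rstrip())` raises `ValueError`, which is the more conventional error to catch around an `int(...)` parse.

```python
import os
import tempfile


def parse_first_line(path: str) -> int:
    """Original variant: parse the first line returned by readlines()."""
    with open(path) as f:
        return int(f.readlines()[0])


def parse_whole_file(path: str) -> int:
    """Suggested variant: read everything and strip trailing whitespace."""
    with open(path) as f:
        return int(f.read().rstrip())


def failure_name(parser, path: str) -> str:
    """Return 'ok' if parsing succeeds, else the name of the exception raised."""
    try:
        parser(path)
        return "ok"
    except Exception as exc:
        return type(exc).__name__


def make_temp_file(content: str) -> str:
    """Write content to a new temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        f.write(content)
    return path
```

Calling `failure_name(parse_first_line, path)` and `failure_name(parse_whole_file, path)` on an empty temporary file shows the two different exception types.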

chriselion · 2020-09-11T17:01:35Z

ml-agents/mlagents/torch_utils/cpu_utils.py

+    return max(min(num_cpus // 2, 4), 1) if num_cpus is not None else None
+
+
+def _get_num_cpus() -> Optional[int]:


nit: maybe _get_num_available_cpus()?
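As a quick check of the clamping expression in the hunk above, the sketch below wraps it in a hypothetical function `threads_for` and evaluates it for a few CPU counts. Only the expression itself comes from the diff.

```python
from typing import Optional


def threads_for(num_cpus: Optional[int]) -> Optional[int]:
    """Half the CPUs, capped at 4 and floored at 1; None passes through."""
    return max(min(num_cpus // 2, 4), 1) if num_cpus is not None else None


if __name__ == "__main__":
    for n in (1, 2, 6, 8, 64, None):
        print(n, threads_for(n))
    # 1 -> 1, 2 -> 1, 6 -> 3, 8 -> 4, 64 -> 4, None -> None
```

The upper bound of 4 means large hosts do not spawn more threads than the expression allows, and the lower bound of 1 covers single-core allocations where `num_cpus // 2` would be 0.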

…r containers (#4471) (#4478) * Set num threads properly for Docker * Pylint-friendly logic * Use f.read().rstrip() * Change function names Co-authored-by: Ervin T <ervin@unity3d.com>

Ervin Teng added 2 commits September 10, 2020 15:50

Set num threads properly for Docker

4a67c8e

Pylint-friendly logic

dab4883

ervteng requested review from chriselion and vincentpierre September 11, 2020 00:32

chriselion reviewed Sep 11, 2020

vincentpierre approved these changes Sep 11, 2020

Use f.read().rstrip()

8aee807

chriselion approved these changes Sep 11, 2020

chriselion reviewed Sep 11, 2020

Change function names

d3ccff9

chriselion approved these changes Sep 11, 2020

ervteng merged commit e49a0df into master Sep 11, 2020

delete-merged-branch bot deleted the develop-torch-docker-cpu branch September 11, 2020 21:24

chriselion pushed a commit that referenced this pull request Sep 12, 2020

[bug-fix] Set number of threads based on allocated CPU count in Docke…

3091ae3

…r containers (#4471) * Set num threads properly for Docker * Pylint-friendly logic * Use f.read().rstrip() * Change function names

chriselion mentioned this pull request Sep 12, 2020

[bug-fix] Set number of threads based on allocated CPU count in Docker containers #4478

Merged

github-actions bot locked as resolved and limited conversation to collaborators Sep 12, 2021
