
Conversation

seemethere (Member) commented Sep 15, 2020

Stack from ghstack:

Switches our MAX_JOBS from a hardcoded value to a more dynamic value so
that we can always utilize all of the cores that are available to us.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Differential Revision: D23759643

seemethere added a commit that referenced this pull request Sep 15, 2020
Switches our MAX_JOBS from a hardcoded value to a more dynamic value so
that we can always utilize all of the cores that are available to us.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

ghstack-source-id: 569a8c8
Pull Request resolved: #44729
# This script is run inside Docker.2XLarge+ container that has 20 CPU cores
# But ncpu will return total number of cores on the system
-export MAX_JOBS=18
+export MAX_JOBS=$(( $(nproc) - 2 ))
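
As a worked illustration of the new line (a minimal sketch; the variable name and the echo are illustrative, not from the PR): nproc prints the number of processing units available to the calling process, and the arithmetic expansion reserves two cores for everything else running on the machine.

# Sketch: compute MAX_JOBS dynamically instead of hardcoding it.
# On a 20-core docker.2XLarge+ container this yields 18.
NUM_CORES=$(nproc)                    # processing units visible to this process
export MAX_JOBS=$(( NUM_CORES - 2 ))  # leave two cores for other work
echo "building with MAX_JOBS=${MAX_JOBS}"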
walterddr (Contributor) commented:
I thought nproc was supposed to return the number of available CPUs on the host rather than in the container? @malfet

seemethere (Member, Author) replied:
We had a conversation about it, and we think the issue with the jobs stemmed from the use of stdbuf rather than from the nproc invocation.

Contributor:

@seemethere can you please delete the comments on lines 8-9 then?

malfet:

@walterddr yes, that is correct, but spawning more build jobs than the number of cores is actually fine.
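
To illustrate the oversubscription point above (a hedged example, not from this PR): asking make for more jobs than there are cores does not break anything; the kernel scheduler time-slices the surplus jobs, at worst costing some context-switch overhead.

# Illustration only: a job count above the core count is safe for make;
# the scheduler simply interleaves the extra jobs.
make -j "$(( $(nproc) + 2 ))"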

malfet (Contributor) left a comment:

LGTM, provided that the build time across Linux platforms ends up lower than the current build time.


seemethere added a commit that referenced this pull request Sep 15, 2020
Switches our MAX_JOBS from a hardcoded value to a more dynamic value so
that we can always utilize all of the cores that are available to us.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

ghstack-source-id: 92ae195
Pull Request resolved: #44729

malfet (Contributor) commented Sep 15, 2020

@seemethere alternatively, perhaps we could try installing ninja on the nightly docker images and benefit from its autoscale feature.
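
For reference, a sketch of what that suggestion could look like (assumes pip is available in the image and a ninja-generated build directory exists; neither is from this PR): when no -j flag is given, ninja chooses its own parallelism based on the number of available CPUs, so no MAX_JOBS computation would be needed.

# Sketch: with ninja installed, no explicit job count is required.
pip install ninja   # assumption: pip exists in the nightly image
ninja -C build      # ninja autoscales its parallelism by default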

xuzhao9 (Contributor) left a comment:

LGTM. According to this issue, the available CPUs within a Docker container can be read from the file /sys/fs/cgroup/cpuset/cpuset.cpus.
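
A minimal sketch of reading that file (cgroup v1 path, as cited above; illustrative, not part of the PR): depending on how the container is launched, nproc may still report the host's full core count while the cpuset shows what the container was actually granted.

# Sketch: inspect the cpuset granted to this container (cgroup v1).
# A value like "0-19" means 20 usable cores.
if [ -r /sys/fs/cgroup/cpuset/cpuset.cpus ]; then
  cat /sys/fs/cgroup/cpuset/cpuset.cpus
fi
nproc   # compare: may still show the host's total core count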

codecov bot commented Sep 16, 2020

Codecov Report

Merging #44729 into gh/seemethere/23/base will decrease coverage by 0.00%.
The diff coverage is n/a.


@@                    Coverage Diff                    @@
##           gh/seemethere/23/base   #44729      +/-   ##
=========================================================
- Coverage                  68.08%   68.08%   -0.01%     
=========================================================
  Files                        384      384              
  Lines                      49768    49768              
=========================================================
- Hits                       33885    33883       -2     
- Misses                     15883    15885       +2     
Impacted Files                             Coverage         Δ
torch/utils/_benchmark/utils/common.py     77.68% <0.00%>   (-1.66%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 07cba8b...4da2c3a. Read the comment docs.

facebook-github-bot (Contributor) commented:
@seemethere merged this pull request in 6006e45.

xuzhao9 pushed a commit that referenced this pull request Sep 18, 2020
Summary:
Pull Request resolved: #44729

Switches our MAX_JOBS from a hardcoded value to a more dynamic value so
that we can always utilize all of the cores that are available to us.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: walterddr

Differential Revision: D23759643

Pulled By: seemethere

fbshipit-source-id: ad26480cb0359c988ae6f994e26a09f601b728e3
facebook-github-bot deleted the gh/seemethere/23/head branch September 21, 2020 14:14