Skip to content

Allow bin/BuildPackages.sh to use as many cores as possible to compile.#4879

Closed
hongyi-zhao wants to merge 1 commit into
gap-system:masterfrom
hongyi-zhao:master
Closed

Allow bin/BuildPackages.sh to use as many cores as possible to compile.#4879
hongyi-zhao wants to merge 1 commit into
gap-system:masterfrom
hongyi-zhao:master

Conversation

@hongyi-zhao
Copy link
Copy Markdown

Currently, the bin/BuildPackages.sh script only use 3 cores by default to do perform parallel compilation, which is a fairly time-consuming and inefficient setting. So, this PR tries to allow bin/BuildPackages.sh to use as many cores as possible to compile the GAP packages.

Copy link
Copy Markdown
Member

@fingolfin fingolfin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution! However I am afraid this can't be merged in its current form.

Comment thread bin/BuildPackages.sh
STRICT=no # exit with non-zero exit code when encountering any failures
PARALLEL=no
PACKAGES=()
NCORE=$(sudo dmidecode -t 4 | grep 'Core Enabled:' | awk '{a+=$NF}END{print a}')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not portable (e.g. no dmidecode on macOS), and requires root access, both of which are an absolute blocker, I am afraid.

A better way would be to offer a command line option to override the number of parallel tasks to use.

Comment thread bin/BuildPackages.sh
@@ -321,7 +322,7 @@ do
if [ "x$PARALLEL" = "xyes" ]; then
# If more than 4 background jobs are running, wait for one to finish (if
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment should match the code :-)

Comment thread bin/BuildPackages.sh
# If more than 4 background jobs are running, wait for one to finish (if
# <wait -n> is available) or for all to finish (if only <wait> is available)
if [[ $(jobs -r -p | wc -l) -gt 4 ]]; then
if [[ $(jobs -r -p | wc -l) -gt $((NCORE +1)) ]]; then
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that by telling both make and jobs to use that many cores, you are in fact instructing the system to use ~NCORE^2 cores.

One could possibly rearrange things so that a single make job server is shared across all jobs, then it's be "only" 2*NCORE jobs.

Copy link
Copy Markdown
Author

@hongyi-zhao hongyi-zhao Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to tweak the script bin/BuildPackages.sh a lot. If you have a more feasible implementation, I will be glad to see your improved code.

For related discussions, see here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, I don't -- there's a reason why we implemented the script the way it is right now: it's a compromise between being portable and still using parallelism somewhat. In practice I can build all packages in under 2 minutes with this, an operation most people only perform once a year, if at all -- so this seems like an adequate compromise.

If you feel you need to run this often enough so that the time you spend on improving the script merits the effort, then I think the best way to go about it is to do what I suggested above, and add a command line option to the script that allows overriding these limits.

@ChrisJefferson
Copy link
Copy Markdown
Contributor

One other problem with parallelising by number of cores is that there exist machines (we have one) with 128 cores, but insufficient memory to build every packages in GAP at the same time.

This parallelization code isn't the best -- but as @fingolfin says I designed it to work "well enough" on every operating system. The better option would probably be to add something like your "NCORES", and make it it a command line option.

In the past I did try a "max parallelisation" option, that builds everything in parallel, with make -j, but I found even on a fairly modern machine that basically "broke the machine", as it tries to build too many large C++ files at the same time (in particular in ferret and semigroups).

@hongyi-zhao
Copy link
Copy Markdown
Author

hongyi-zhao commented Apr 25, 2022

@ChrisJefferson If so, the most feasible practice is to leave the user to set the corresponding cores used by parallel compilation on the fly, and give a relatively small default value of it, say, 3 to a predefined variable NCORE, if the --parallel option is not specified by the user at all. In short, I suggest enabling the --parallel option by default with relative small cores.

@fingolfin fingolfin closed this May 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants