Multi-threading use of cores #322

Closed
andylaw opened this Issue Jul 27, 2018 · 1 comment

@andylaw

andylaw commented Jul 27, 2018

Less of an issue, more of a plea for consideration before implementing the proposed new feature of 'greedy' core usage.

The trouble is that we don't always have the rights to use all the cores on the machine that we are using. If I am running cutadapt in a shared HPC environment, I emphatically do NOT want it to use all the cores that the OS reports. I want it to use just as many as I have told it that it can use. If my 4-core job suddenly decides that it can use all 40 cores on the node that it has been allocated to, then a large number of other users are going to be cross that they are now contending for cores with my badly-behaved process. I would strongly argue that the default behaviour should be single core, with an option to specify all as an argument to the --cores flag.

@marcelm

Owner

marcelm commented Aug 16, 2018

Hi, and thanks for the feedback! Yes, you’re absolutely right that using all existing cores would be bad behavior. My idea was to use the number of available cores. Some cluster systems use the cpuset(7) mechanism to reduce the number of cores available to a single process, and in that case the process can detect this by reading /proc/self/status (the bit mask in the Cpus_allowed: line, see http://stackoverflow.com/a/1006301/715090 , where I got the code for this). So your 4-core job on a 40-core machine would actually use only 4 cores.
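
For readers who want to see what reading the Cpus_allowed mask can look like, here is a minimal Python sketch. It is not cutadapt's actual code: the function names count_allowed_cpus and available_cpu_count and the fallback parameter are made up for illustration, and it assumes the Linux format in which the mask is written as hexadecimal digits, possibly split into comma-separated groups.

```python
import re


def count_allowed_cpus(status_text):
    """Count the set bits in the Cpus_allowed mask of a /proc/<pid>/status dump.

    Returns None if there is no Cpus_allowed line.
    """
    # The trailing colon keeps this from matching the Cpus_allowed_list: line.
    match = re.search(
        r"^Cpus_allowed:\s*([0-9a-fA-F,]+)\s*$", status_text, re.MULTILINE
    )
    if match is None:
        return None
    # The mask is hexadecimal; commas only separate groups of digits.
    mask = int(match.group(1).replace(",", ""), 16)
    return bin(mask).count("1")


def available_cpu_count(fallback=1):
    """Return the number of cores this process may run on.

    Falls back to `fallback` (an assumption of this sketch, chosen to be
    conservative) when /proc/self/status is missing or has no usable mask.
    """
    try:
        with open("/proc/self/status") as status_file:
            count = count_allowed_cpus(status_file.read())
    except OSError:
        count = None
    return count if count else fallback
```

On a system without /proc, or where the mask is absent or zero, available_cpu_count() returns the fallback, so the sketch never silently claims every core on the machine.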

I would really like for cutadapt to have sane defaults, and using all available cores would be one of them, but I can see that this might be a problem on a system where Cpus_allowed doesn’t exist or does not contain useful information. I’ll definitely implement a -j all option (possibly named -j 0 or something like that) that works like this, but I’ll think more about whether I will make this the default.
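
One way the option discussed here could be wired up is sketched below. This is a hypothetical illustration, not the interface that was eventually implemented: it assumes a single-core default, treats 0 as "auto-detect", and the names detect_available_cores, resolve_cores and build_parser are invented for the example. Where os.sched_getaffinity exists (Linux), it reports the set of CPUs the process is allowed to run on.

```python
import argparse
import os


def detect_available_cores():
    """Return how many cores this process is allowed to use."""
    if hasattr(os, "sched_getaffinity"):
        # Respects CPU affinity restrictions such as those set via cpusets.
        return len(os.sched_getaffinity(0))
    # No affinity information on this platform: fall back to the raw count.
    return os.cpu_count() or 1


def resolve_cores(requested):
    """Map the command-line value to a concrete core count (0 means auto)."""
    if requested < 0:
        raise ValueError("number of cores must be >= 0")
    if requested == 0:
        return detect_available_cores()
    return requested


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-j", "--cores", type=int, default=1,
        help="number of cores to use; 0 auto-detects the available cores",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(resolve_cores(args.cores))
```

With this arrangement, a job that never passes -j stays on one core, and auto-detection happens only when it is asked for explicitly.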

@marcelm marcelm added feature enhancement and removed feature labels Aug 30, 2018

@marcelm marcelm closed this in 4a73ff3 Sep 5, 2018