Multithreading behaviour for Cutadapt < 1.15 #273

Closed
jrderuiter opened this issue Nov 24, 2017 · 5 comments

@jrderuiter

jrderuiter commented Nov 24, 2017

I am currently using Cutadapt 1.14 as part of one of my pipelines, and was surprised to see that Cutadapt was suddenly using multiple cores to read/write gzip files (via pigz). I see that this is a new addition to Cutadapt 1.15, which is implemented in the xopen package.

The problem is that, using Cutadapt 1.14, I have no way to specify the number of cores Cutadapt should be using (as the --cores flag was added in Cutadapt 1.15). Would it be possible to limit the number of cores for earlier releases of Cutadapt to 1? (Reflecting the previous behaviour?)

This is using the bioconda packages btw. I see that dependencies there specify xopen >= 0.1.1, which results in the above mentioned issue.

Note that I am not opposed to the multithreading feature (which I think is great), but this does change the behaviour of some existing pipelines/analyses, which I am reluctant to change software versions for.

@marcelm
Owner

marcelm commented Nov 24, 2017

To solve this, you should explicitly request xopen 0.1.1 when you install cutadapt 1.14. That is, use something like conda install cutadapt=1.14 xopen=0.1.1.

In general, it is probably a good idea to pin the versions of all packages if you want a reproducible pipeline. You can export the full package list (with versions) of a conda environment using conda list --export > packages.txt. That text file can then be used to exactly recreate the environment.
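As a rough illustration of checking such an export against the versions a pipeline expects, here is a minimal Python sketch. It assumes the usual name=version=build line format of conda list --export, with comment lines starting with #. The function names, the sample export text, and the build strings and newer xopen version in it are made up for illustration; none of this is part of cutadapt or conda.

```python
def parse_conda_export(text):
    """Map package name -> version from `conda list --export` style text.

    Assumes every non-comment line looks like name=version=build.
    """
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        name, version = line.split("=")[:2]
        versions[name] = version
    return versions


def find_mismatches(versions, expected):
    """Return {name: (expected, found)} for packages that differ from the pins."""
    return {
        name: (want, versions.get(name))
        for name, want in expected.items()
        if versions.get(name) != want
    }


# Illustrative export: build strings and the newer xopen version are invented.
EXAMPLE_EXPORT = """\
# platform: linux-64
cutadapt=1.14=py36_0
xopen=0.3.2=py36_0
"""

if __name__ == "__main__":
    found = parse_conda_export(EXAMPLE_EXPORT)
    print(find_mismatches(found, {"cutadapt": "1.14", "xopen": "0.1.1"}))
```

A check like this could run at the start of a pipeline to flag a dependency that drifted from the version the analysis was developed with.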

Note that even cutadapt 1.14 with xopen 0.1.1 uses pigz, but only when it runs under Python 2. Also, the --cores parameter that was added in cutadapt 1.15 doesn’t control the number of threads that pigz uses. When you switch to cutadapt 1.15 and you think you need such an option, please open an issue.

Thanks for the feedback!

@jrderuiter
Author

I understand what you mean, and I already do fix package versions to make things reproducible.

However, I do think it is undesirable that the behavior of cutadapt differs depending on the version of one of its dependencies (in this case, xopen).

@marcelm
Owner

marcelm commented Nov 24, 2017

I somewhat agree, but even after reading this blog post again and some others, I’m not 100% sure what the correct way to solve this is.

At minimum, the 1.14 conda recipe should probably have specified the xopen dependency xopen =0.1. I can do this starting with the next cutadapt release and, if I have time, also for cutadapt 1.15. If you feel this is important enough, you are welcome to fix the 1.14 recipe as well.

I’ll have to think about pinning the dependency in setup.py.
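To make the difference between the two kinds of constraint concrete, here is a small Python sketch comparing a lower bound such as xopen >= 0.1.1 with an exact pin. The helper names and the newer version number are hypothetical, and the comparison only handles plain dotted numeric versions of equal length; real tools follow the full versioning rules. In a setup.py, an exact pin would correspond to something like install_requires=["xopen==0.1.1"], which is only one possible way to handle this.

```python
def parse_version(text):
    """Turn '0.1.1' into (0, 1, 1). Only handles plain dotted numbers."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, spec):
    """Check an installed version against a '==X' or '>=X' specifier.

    Simplified on purpose: no wildcards, no pre-releases, no upper bounds.
    """
    for op in ("==", ">="):
        if spec.startswith(op):
            wanted = parse_version(spec[len(op):].strip())
            have = parse_version(installed)
            return have == wanted if op == "==" else have >= wanted
    raise ValueError("unsupported specifier: " + spec)


if __name__ == "__main__":
    newer = "0.3.2"  # hypothetical newer xopen release
    print(satisfies(newer, ">=0.1.1"))  # lower bound accepts the newer release
    print(satisfies(newer, "==0.1.1"))  # exact pin rejects it
```

With a lower bound, any newer dependency release is accepted silently, which is how the behaviour of an otherwise fixed cutadapt version can change; an exact pin trades that flexibility for predictability.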

@jrderuiter
Author

I understand your point as well. Maybe the best approach would be to add a comment somewhere in the docs, warning people about this issue? That said, it's unlikely to be a problem in the future if people upgrade to the new version.

@marcelm
Owner

marcelm commented Mar 4, 2019

This issue is not quite solved, but I’m closing it now in favor of #290, which is also about controlling the number of threads used by pigz.

@marcelm marcelm closed this as completed Mar 4, 2019