default to not keeping the STAR index in shared memory #22

Merged 2 commits on Aug 14, 2021

@eernst (Contributor) commented Aug 13, 2021

Thank you for the hard work on Finder! I look forward to applying it to the annotation of non-model species.

The primary motivation for this pull request is to make Finder play nice when running on HPC clusters by disabling STAR's shared memory genome index loading by default and adding a boolean to enable it.

The first issue is that some cluster environments (including my own) don't support STAR's shared memory loading:
alexdobin/STAR#841
BosingerLab/BALDR#11

The default for STAR's genomeLoad parameter is "NoSharedMemory".
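To illustrate the first change, here is a minimal sketch of how a boolean could select the `--genomeLoad` value. The function name `star_genome_load_args` and the `use_shared_memory` argument are hypothetical and not taken from Finder's code, and I am assuming `LoadAndKeep` is the shared-memory mode in use; `NoSharedMemory` and `LoadAndKeep` are both valid STAR values for `--genomeLoad`.

```python
def star_genome_load_args(use_shared_memory=False):
    """Return the --genomeLoad arguments for a STAR alignment call.

    Hypothetical helper: shared memory is off unless explicitly requested,
    matching STAR's own default of NoSharedMemory.
    """
    # Assumption: LoadAndKeep is the shared-memory mode the pipeline wants.
    mode = "LoadAndKeep" if use_shared_memory else "NoSharedMemory"
    return ["--genomeLoad", mode]


def needs_genome_removal(use_shared_memory=False):
    """A 'STAR --genomeLoad Remove' call is only useful if a genome was
    actually loaded into shared memory."""
    return bool(use_shared_memory)
```

With the default of `False`, the "Remove" invocations can be skipped entirely on clusters that do not support shared memory.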

The second issue is that the multiple invocations of

        # Remove a pre-loaded genome
        cmd  = "STAR "
        cmd += f" --runThreadN {options.cpu} "
        cmd += f" --genomeLoad Remove "
        cmd += f" --genomeDir {options.genome_dir_star} "
        os.system(cmd)

in finder and alignReads.py, as well as the genomeGenerate call, do not specify an --outFileNamePrefix. As a result, STAR writes its output files (Log.out, etc.) to the working directory from which finder was launched, which causes crashes when multiple jobs run simultaneously from the same directory.

Instead, all output should be directed to the value of Finder's -out_dir parameter.
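As a rough sketch of the idea (not the exact code in this PR), the "Remove" call could be built with an explicit `--outFileNamePrefix` pointing into the output directory. The helper name `build_star_remove_cmd` and the `star_remove_` prefix are hypothetical; the STAR flags themselves are the ones already used above plus `--outFileNamePrefix`.

```python
import os


def build_star_remove_cmd(cpu, genome_dir, out_dir):
    """Build a 'STAR --genomeLoad Remove' command whose log files land
    in out_dir instead of the current working directory."""
    # Hypothetical prefix; STAR prepends it to Log.out and its other outputs.
    prefix = os.path.join(out_dir, "star_remove_")
    return [
        "STAR",
        "--runThreadN", str(cpu),
        "--genomeLoad", "Remove",
        "--genomeDir", genome_dir,
        "--outFileNamePrefix", prefix,
    ]


# Example: inspect the command without actually running STAR.
cmd = build_star_remove_cmd(4, "/data/genome_star", "/scratch/job1")
print(" ".join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` rather than `os.system` would also surface a non-zero exit status from STAR, though that is a separate design choice from the prefix fix.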

I've tested this PR on the example dataset with the new boolean enabled and disabled, and the resulting predictions are the same.

@sagnikbanerjee15 (Owner)
Hello @eernst,

That's fantastic. Thank you for submitting the PR and actively modifying the code. Just a heads up, we are planning to remove BRAKER and GeneMarkS/T from the pipeline. This will allow us to release it as a conda package and/or docker container.

Thank you.

@sagnikbanerjee15 sagnikbanerjee15 merged commit ed3f1f1 into sagnikbanerjee15:master Aug 14, 2021