Skip to content

[WIP] STAR Workflow v2.0.0#1

Merged
adthrasher merged 56 commits into
masterfrom
rnaseq-workflow
Sep 25, 2019
Merged

[WIP] STAR Workflow v2.0.0#1
adthrasher merged 56 commits into
masterfrom
rnaseq-workflow

Conversation

@claymcleod
Copy link
Copy Markdown
Member

This PR is an implementation of the new STAR alignment workflow for RNA-seq (v2.0). You can follow the discussion on what features the pipeline will have on the RFC pull request.

@claymcleod
Copy link
Copy Markdown
Member Author

@a-frantz Let's use this repo/branch to develop out the workflow together.

Comment thread docker/star-alignment.Dockerfile Outdated
Comment thread tools/qc.wdl Outdated
Comment thread tools/qc.wdl Outdated
Comment thread tools/qc.wdl Outdated
Comment thread tools/rseqc.wdl Outdated
Comment thread docker/bioinformatics-base/Dockerfile Outdated
ENV PATH /opt/conda/bin:$PATH

RUN apt-get update && \
apt-get upgrade -y && \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apt-get upgrade -y && \

Assume the base image is already up-to-date.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a little further why this would be a best practice? Just curious from your perspective.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the responsibility of the base image to maintain and update core packages periodically. Anything else can be updated/installed individually.

Comment thread docker/bioinformatics-base/Dockerfile Outdated
Comment thread tools/fastqc.wdl Outdated
Comment thread tools/qualimap.wdl Outdated
Comment thread tools/samtools.wdl Outdated
Comment thread tools/samtools.wdl Outdated
@zaeleus
Copy link
Copy Markdown
Contributor

zaeleus commented Aug 7, 2019

Integer values for memory requirements are in bytes, e.g., 75000 = ~73 KiB. Use a string with a unit instead: "75 GiB".

@a-frantz
Copy link
Copy Markdown
Member

a-frantz commented Aug 7, 2019

Integer values for memory requirements are in bytes, e.g., 75000 = ~73 KiB. Use a string with a unit instead: "75 GiB".

Just using an integer works everywhere else we've used it. I know on LSF there's a setting in the config file to change default units, would that be a problem if that setting was, say GB instead of bytes or whatever the default value that's currently being used?

@zaeleus
Copy link
Copy Markdown
Contributor

zaeleus commented Aug 7, 2019

Just using an integer works everywhere else we've used it. I know on LSF there's a setting in the config file to change default units, would that be a problem if that setting was, say GB instead of bytes or whatever the default value that's currently being used?

The spec defines integers are parsed as bytes.

In the backend configuration for the job submit command, use the suffixed version of memory.

@adthrasher
Copy link
Copy Markdown
Member

Just using an integer works everywhere else we've used it. I know on LSF there's a setting in the config file to change default units, would that be a problem if that setting was, say GB instead of bytes or whatever the default value that's currently being used?

The spec defines integers are parsed as bytes.

In the backend configuration for the job submit command, use the suffixed version of memory.

@a-frantz - I think what @zaeleus is saying is that there is implicit unit conversion going on and it is not clear to an end-user. For example, star.build_db specifies a memory value of "50000" which according to the WDL spec should be in bytes. The LSF conf file doesn't specify units, so the "50000" is passed directly to the bsub command. Our LSF cluster is set to use MB as the default unit (LSF's built-in default unit is KB). So the "50000" value should be bytes according to the WDL spec, but at runtime, this is being interpreted as MB to get 50GB of reserved memory. For correctness and clarity, we should use the string version of memory specification and set the unit being used in the LSF conf.

Comment thread docker/bioinformatics-base/Dockerfile Outdated
@claymcleod
Copy link
Copy Markdown
Member Author

This seems like it is getting close to being ready for merging, right?

@adthrasher
Copy link
Copy Markdown
Member

This seems like it is getting close to being ready for merging, right?

Yes, I think it is ready to merge. Assuming we want anything arising through the RFC to be in a separate PR.

@claymcleod
Copy link
Copy Markdown
Member Author

This seems like it is getting close to being ready for merging, right?

Yes, I think it is ready to merge. Assuming we want anything arising through the RFC to be in a separate PR.

Yeah I think that's probably the right approach. Go ahead and merge whenever you are ready.

@adthrasher adthrasher merged commit ceb55de into master Sep 25, 2019
@claymcleod claymcleod deleted the rnaseq-workflow branch September 26, 2019 02:46
@adthrasher adthrasher mentioned this pull request Feb 25, 2025
@adthrasher adthrasher mentioned this pull request Jun 1, 2026
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants