Allow non-coordinate-sorted bam input to eXpress #581

Open
wants to merge 5 commits into
base: main

@gnaisha gnaisha commented Dec 22, 2020

eXpress requires randomly sorted bams as input, but having the input type set to "bam" causes the bam to be coordinate sorted before running. This PR adds "qname_sorted.bam" and "unsorted.bam" input types to avoid this sorting.

@bernt-matthias
Contributor

Thanks for the contribution. The checks are currently not running because GitHub Actions changed a bit. They should run again after #582 is merged.

Q: Should the sorted bam types be removed? Also a bump of the tool version is necessary.

@bernt-matthias
Contributor

Can you rebase the PR branch? Then the tests will run.

@gnaisha
Author

gnaisha commented Dec 29, 2020

Thanks for the fixes to allow the checks to pass.

I tried removing the sorted bam input type from the wrapper, but in my local tests a sorted bam is still accepted as input, which causes eXpress to fail. I've kept the change in the wrapper anyway, since this should be an invalid input type, and I will file an issue with Galaxy main about disallowing sorted bam in such cases.

I see that there is a tool version in the wrapper, and an eXpress version in tool_dependencies.xml. These are both set to 1.1.1 currently - is this coincidence? I assume I should only bump the tool wrapper version number.

@bernt-matthias
Contributor

> I see that there is a tool version in the wrapper, and an eXpress version in tool_dependencies.xml. These are both set to 1.1.1 currently - is this coincidence? I assume I should only bump the tool wrapper version number.

You can remove the file tool_dependencies.xml

@gnaisha
Author

gnaisha commented Dec 29, 2020

I've removed that file, and updated eXpress to the most recent version I'm seeing in bioconda.

@bernt-matthias
Contributor

I should have asked this earlier: what exactly is the problem with sorted bam files? The manual suggests that sorting is necessary:

> If you aligned your reads with Bowtie, your alignments will be properly ordered already. If you used another tool, you should ensure that they are properly sorted

@gnaisha
Author

gnaisha commented Dec 31, 2020

Sorry, I should have been explicit about what I meant by "sorted": eXpress does require sorted input, but it needs to be sorted by read name rather than by coordinate.

You can sort your BAM using this command:

samtools sort -n hits.bam hits.sorted

Galaxy seems to define the "bam" type as being coordinate sorted, and when I used a query-name sorted bam as input to the current eXpress wrapper, Galaxy converted it to a coordinate-sorted bam before running eXpress, causing eXpress to fail. Adding qname_sorted.bam type allows this bam to be used as is, sorted by read name.
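To see which ordering a given file declares, one option is to read the SO tag on the @HD line of its SAM header. The following is only a minimal sketch: the function name sam_sort_order is hypothetical, and the SO tag reflects what the producing tool declared, which may not match the actual record order.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SO (sort order) value from the @HD line
# of a SAM header read on stdin. Prints "unknown" when the header has no
# @HD line or the @HD line carries no SO tag.
sam_sort_order() {
  awk -F'\t' '
    $1 == "@HD" {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
      }
    }
    END { if (!found) print "unknown" }
  '
}

# Example invocation (requires samtools; hits.bam is a placeholder name):
#   samtools view -H hits.bam | sam_sort_order
```

A coordinate-sorted file would typically report coordinate here, while a file sorted with samtools sort -n would typically report queryname.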

@bwlang
Contributor

bwlang commented Dec 31, 2020 via email

<description>Quantify the abundances of a set of target sequences from sampled subsequences</description>
<requirements>
-<requirement type="package" version="1.1.1">eXpress</requirement>
+<requirement type="package" version="1.5.1">eXpress</requirement>
</requirements>
<command>
express --no-update-check


Suggested change
-express --no-update-check
+#if $bamOrSamFile.ext == "qname_sorted.bam"
+ln -s '$bamOrSamFile' hits.sorted &&
+#else
+samtools sort -n '$bamOrSamFile' hits.sorted &&
+#end if
+express --no-update-check

Also, replace $bamOrSamFile on the last line of the command block with hits.sorted (by the way, all input file paths need to be single-quoted).

Maybe the else branch could first check whether the sam / bam file is already sorted, e.g. with samtools view -H ... | grep -q "@HD\t.*\tSO:unsorted", before calling sort again.
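A rough sketch of such a header check follows. It is an illustration under assumptions, not the wrapper's actual code: the function name needs_name_sort is hypothetical, and it tests for SO:queryname (skip sorting only when name order is declared) rather than SO:unsorted. It uses awk with a tab field separator because \t inside a grep pattern is not a portable way to match a tab character.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a SAM header on stdin and succeed (exit 0)
# when the file should be name-sorted before running eXpress, i.e. when
# the @HD line does not declare SO:queryname. A missing @HD line or a
# missing SO tag is treated as "needs sorting".
needs_name_sort() {
  ! awk -F'\t' '
    $1 == "@HD" {
      for (i = 2; i <= NF; i++) if ($i == "SO:queryname") ok = 1
    }
    END { exit !ok }
  '
}

# Example use in a plain shell script (assumes samtools 1.3 or later for
# the -o option; "$input" and the output name are placeholders):
#   if samtools view -H "$input" | needs_name_sort; then
#     samtools sort -n -o hits.sorted.bam "$input"
#   else
#     ln -s "$input" hits.sorted.bam
#   fi
```

Note that the header only records what was declared when the file was written, so a file that is name-sorted but lacks the tag would simply be sorted again.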
