bam files without @RG tag #115

Closed
mschubert opened this issue Jun 15, 2023 · 4 comments

Comments

@mschubert

Thanks a lot for the tool, it's really easy to use, and after the first try already found a sample swap in our setup!

I'm now trying to confirm sample matches between WES and RNA-seq data from an external sequencing provider. I ran into an issue with the RNA-seq bam files provided, where I get the following output from somalier:

Error: unhandled exception: [somalier] no read-group in bam file [ValueError]

The @RG field is indeed missing in these bam file headers.

Is there a way to manually supply the sample ID (e.g. via a command-line argument)?
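For anyone wanting to check whether their own BAM headers carry a read group before running somalier, something like the following may help. It is only a minimal sketch: `has_read_group` is a hypothetical helper name (not part of somalier), and it simply looks for a line starting with `@RG` in header text piped in on stdin.

```shell
# Hypothetical helper: succeeds (exit status 0) if the SAM/BAM header text
# on stdin contains at least one @RG line, and fails otherwise.
has_read_group() {
    grep -q '^@RG'
}
```

With samtools installed, this could be used as `samtools view -H sample.bam | has_read_group || echo "no @RG line"`, where `sample.bam` is a placeholder path.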

@brentp brentp closed this as completed in d5e5a03 Jun 15, 2023
@brentp
Owner

brentp commented Jun 15, 2023

Hi Michael, always nice to hear that a tool is useful!

I added a way to do this via env variables. Will you give this binary a try (gunzip, chmod +x) and run as:

SOMALIER_SAMPLE_NAME=my_sample somalier_dev extract ...

where my_sample is the name you want to use?

somalier_dev.gz

I'll get a release out soon.
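For several BAM files without read groups, the invocation above could be wrapped so that each file gets its own name. This is a sketch under assumptions, not a tested recipe: the sample name is taken from the file name (an arbitrary choice), `sample_name_from_path` and `extract_one` are hypothetical helper names, `SOMALIER_BIN`, `SITES`, `FASTA` and `OUT_DIR` are placeholder variables, and the `-d`, `--sites` and `-f` options follow the somalier README, so it is worth checking `somalier extract --help` for your version.

```shell
# Hypothetical helper: derive a sample name from a BAM path,
# e.g. /data/run1/S1.rna.bam -> S1.rna
sample_name_from_path() {
    basename "$1" .bam
}

# Hypothetical helper: run extract on one BAM, passing the sample name
# through the SOMALIER_SAMPLE_NAME environment variable.
# SOMALIER_BIN defaults to the dev binary from this thread; SITES and FASTA
# must point at your sites VCF and reference FASTA.
extract_one() {
    bam=$1
    SOMALIER_SAMPLE_NAME=$(sample_name_from_path "$bam") \
        "${SOMALIER_BIN:-./somalier_dev}" extract \
        -d "${OUT_DIR:-extracted}" --sites "$SITES" -f "$FASTA" "$bam"
}
```

A loop such as `for bam in rnaseq/*.bam; do extract_one "$bam"; done` would then process a directory, with `rnaseq/` standing in for wherever the files live.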

@mschubert
Author

Wow, that was quick, thank you! 🎉

I can confirm that the binary works as expected and solves my issue.

@cjfields

Just a note that I ran into the same issue and this worked wonderfully. I did see there is a --sample-prefix option; was this also meant for adding the sample name?

@brentp
Owner

brentp commented Jun 23, 2023

Glad to hear it works.
--sample-prefix is for when you have multiple samples with the same ID, for example if the same sample had RNA-Seq and DNA-Seq. The user can then specify a sample-prefix so that they (and the hashtable in somalier) can tell the two apart.
Release for this change is on my TODO.

brentp added a commit that referenced this issue Jun 23, 2023