pysam.utils.SamtoolsError cannot parse region "chr22_KI270879v1_alt:267307-281486" #65

Closed
geocarvalho opened this issue Jun 10, 2022 · 2 comments

Comments

@geocarvalho

Hello @sbslee, I hope you are well.
I'm trying to run prepare-depth-of-coverage on a WGS sample that was processed with Dragen 3.9, and I get the error below:

$ docker run -v "$PWD":/data pypgx:v0.15.0 pypgx prepare-depth-of-coverage /data/sample_WGS/sample_WGS-depth-of-coverage.zip /data/sample_WGS/sample_WGS-contigs.bam --assembly GRCh38
ERROR conda.cli.main_run:execute(33): Subprocess for 'conda run ['pypgx', 'prepare-depth-of-coverage', '/data/sample_WGS/sample_WGS-depth-of-coverage.zip', '/data/sample_WGS/sample_WGS-contigs.bam', '--assembly', 'GRCh38']' command failed.  (See above for error)
Traceback (most recent call last):
  File "/opt/conda/envs/myenv/bin/pypgx", line 10, in <module>
    sys.exit(main())
  File "/opt/conda/envs/myenv/lib/python3.8/site-packages/pypgx/__main__.py", line 33, in main
    commands[args.command].main(args)
  File "/opt/conda/envs/myenv/lib/python3.8/site-packages/pypgx/cli/prepare_depth_of_coverage.py", line 90, in main
    archive = utils.prepare_depth_of_coverage(
  File "/opt/conda/envs/myenv/lib/python3.8/site-packages/pypgx/api/utils.py", line 1232, in prepare_depth_of_coverage
    cf = pycov.CovFrame.from_bam(bams, regions=regions, zero=True)
  File "/opt/conda/envs/myenv/lib/python3.8/site-packages/fuc/api/pycov.py", line 261, in from_bam
    results += pysam.depth(*(bams + args + ['-r', region]))
  File "/opt/conda/envs/myenv/lib/python3.8/site-packages/pysam/utils.py", line 69, in __call__
    raise SamtoolsError(
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=samtools depth: cannot parse region "chr22_KI270879v1_alt:267307-281486"\n'

I think it is happening because the contig name is different:

$ samtools idxstats sample_WGS-contigs.bam | grep chr22
chr22
chr22_KI270731v1_random
chr22_KI270732v1_random
chr22_KI270733v1_random
chr22_KI270734v1_random
chr22_KI270735v1_random
chr22_KI270736v1_random
chr22_KI270737v1_random
chr22_KI270738v1_random
chr22_KI270739v1_random

I removed the reads in these random contigs because I want to use a sample that is already aligned rather than aligning it again without the contigs. Do you think there is another way to solve this? Maybe by creating the sample_WGS-depth-of-coverage.zip file outside of pypgx?

Thank you.
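For readers hitting the same error, the check described above can be sketched in Python. This is only an illustration, not part of pypgx: `contig_names` and `region_contig` are hypothetical helper names, and the input is assumed to be the text printed by `samtools idxstats` (contig name in the first tab-separated column).

```python
import re


def contig_names(idxstats_text):
    """Collect contig names from `samtools idxstats` output (first column)."""
    names = set()
    for line in idxstats_text.splitlines():
        line = line.strip()
        if not line:
            continue
        name = line.split("\t")[0]
        if name != "*":  # "*" is the idxstats row for unplaced reads
            names.add(name)
    return names


def region_contig(region):
    """Return the contig part of a 'contig:start-end' region string."""
    match = re.fullmatch(r"(.+):(\d+)-(\d+)", region)
    return match.group(1) if match else region


# Contig names copied from the listing in this issue (names only).
idxstats_text = "\n".join([
    "chr22",
    "chr22_KI270731v1_random",
    "chr22_KI270732v1_random",
])

region = "chr22_KI270879v1_alt:267307-281486"
missing = region_contig(region) not in contig_names(idxstats_text)
print(f"{region_contig(region)} missing from BAM: {missing}")
```

If the contig is reported as missing, `samtools depth` has no reference sequence to resolve the region against, which matches the "cannot parse region" message in the traceback.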

@sbslee
Owner

sbslee commented Jun 10, 2022

@geocarvalho,

This is a GRCh38-specific issue. One of the genes with structural variation (SV) is GSTT1, which is located in the contig chr22_KI270879v1_alt, and that contig is missing from your BAM file. That's why the prepare-depth-of-coverage command is complaining. For more details, please read GRCh37 vs. GRCh38.

In any case, the solution is to exclude the GSTT1 gene when running the command:

$ pypgx prepare-depth-of-coverage \
depth-of-coverage.zip \
in.bam \
--assembly GRCh38 \
--genes GSTT1 \
--exclude

Please try this.
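The same decision could be automated when many BAM files are processed. The sketch below is an assumption-laden illustration, not a pypgx feature: `build_command` is a hypothetical helper that only assembles the argument list, using the `--assembly`, `--genes`, and `--exclude` options shown in the command above, and adds the exclusion when the GSTT1 contig is absent from the set of contig names found in the BAM.

```python
# Contig that carries GSTT1 in GRCh38, as stated in this thread.
GSTT1_CONTIG = "chr22_KI270879v1_alt"


def build_command(output_zip, bam, contigs, assembly="GRCh38"):
    """Assemble a prepare-depth-of-coverage command as a list of arguments.

    `contigs` is the set of contig names present in the BAM file,
    for example gathered from `samtools idxstats`.
    """
    cmd = [
        "pypgx", "prepare-depth-of-coverage",
        output_zip, bam,
        "--assembly", assembly,
    ]
    # Exclude GSTT1 only when its GRCh38 contig is missing from the BAM.
    if assembly == "GRCh38" and GSTT1_CONTIG not in contigs:
        cmd += ["--genes", "GSTT1", "--exclude"]
    return cmd


cmd = build_command("depth-of-coverage.zip", "in.bam", {"chr22"})
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Building a list rather than a shell string avoids quoting problems with file paths.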

@geocarvalho
Author

Thank you @sbslee, it worked. I'll study how to deal with this gene in the future.
