Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hapdup fails with multiple primary alignments #25

Closed
mattloose opened this issue Dec 13, 2022 · 4 comments
Closed

hapdup fails with multiple primary alignments #25

mattloose opened this issue Dec 13, 2022 · 4 comments

Comments

@mattloose
Copy link

I've got one sample in a series which is failing in hapdup with the following error.

Any suggestions?

Thanks

`

Starting merge
Expected three tokens in header line, got 2
This usually means you have multiple primary alignments with the same read ID.
You can identify whether this is the case with this command:

    samtools view -F 0x904 YOUR.bam | cut -f 1 | sort | uniq -c | awk '$1 > 1'

Expected three tokens in header line, got 2
This usually means you have multiple primary alignments with the same read ID.
You can identify whether this is the case with this command:

    samtools view -F 0x904 YOUR.bam | cut -f 1 | sort | uniq -c | awk '$1 > 1'

[2022-12-13 17:15:57] ERROR: Missing output: hapdup/margin/MARGIN_PHASED.haplotagged.bam
Traceback (most recent call last):
File "/data/test_data/GIT/hapdup/hapdup.py", line 24, in
sys.exit(main())
File "/data/test_data/GIT/hapdup/hapdup/main.py", line 206, in main
file_check(haplotagged_bam)
File "/data/test_data/GIT/hapdup/hapdup/main.py", line 114, in file_check
raise Exception("Missing output")
Exception: Missing output
`

@fenderglass
Copy link
Collaborator

Never seen this before, but looks like the error might be inside PEPPER/Margin packages. Do you mind sending hapdup.log and full terminal output if you happen to have it? Is there anything special about this failing sample, compared to others (unusual coverage, read length, heterozygosity etc)?

@mattloose
Copy link
Author

I don't believe there to be anything particularly unusual about the sample, but it does have some ultra long read data in it.

The log file is below.

I don't have the full terminal output but here is where it was at before the error:

Phasing 87% complete (8087/9279, 7m 45s). Estimated time remaining: 1m 9s > Phasing 88% complete (8180/9279, 7m 49s). Estimated time remaining: 1m 3s > Phasing 89% complete (8274/9279, 7m 53s). Estimated time remaining: 58s > Phasing 90% complete (8363/9279, 7m 57s). Estimated time remaining: 53s > Phasing 91% complete (8494/9279, 8m 3s). Estimated time remaining: 47s > Phasing 92% complete (8550/9279, 8m 6s). Estimated time remaining: 42s > Phasing 93% complete (8666/9279, 8m 11s). Estimated time remaining: 36s > Phasing 94% complete (8775/9279, 8m 15s). Estimated time remaining: 31s > Phasing 95% complete (8864/9279, 8m 18s). Estimated time remaining: 26s > Phasing 96% complete (8908/9279, 8m 20s). Estimated time remaining: 20s > Phasing 97% complete (9009/9279, 8m 23s). Estimated time remaining: 15s > Phasing 98% complete (9136/9279, 8m 26s). Estimated time remaining: 10s > Phasing 99% complete (9190/9279, 8m 26s). Estimated time remaining: 5s > Starting merge
Expected three tokens in header line, got 2
This usually means you have multiple primary alignments with the same read ID. You can identify whether this is the case with this command:

hapdup.log

@fenderglass
Copy link
Collaborator

Thanks. According to this, it might have something to do with the duplicated read ids: kishwarshafin/pepper#67

Could you try command line: samtools view -F 0x900 -q 10 your.bam | cut -f1 | uniq -c | awk '$1 > 1' | head? If the output is non-empty, there are duplicates. If so, they need to be removed, I don't think there is another workaround in Margin.

@mattloose
Copy link
Author

Apologies - I've realised this was an error on our side where duplicated reads had got into the assembly pipeline.

Thanks for your help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants