findallcoverageatposition output incorrect numbers #180

Open
yliueagle opened this issue Mar 3, 2021 · 5 comments

@yliueagle

Subject of the issue

findallcoverageatposition output incorrect numbers

Your environment

  • version of jvarkit: b0bbbff

  • version of java:

    openjdk version "1.8.0_275"
    OpenJDK Runtime Environment (build 1.8.0_275-b01)
    OpenJDK 64-Bit Server VM (build 25.275-b01, mixed mode)

  • which OS: Linux

Steps to reproduce

echo "hic_reads_trans_chr14_chr18_chr18_2021-03-0318:15:51.bam" | java -jar ./jvarkit/dist/findallcoverageatposition.jar -p chr18:7833720

Expected behaviour

The number of reads at chr18:7833720 should be 2, with one A and one G. However, the command reports different counts (the data and the output are in the shared Drive folder):

https://drive.google.com/drive/folders/14HwF65VllIIYSHR3y--k92140J4v4UWL
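For readers who want to cross-check a count like this by hand, here is a minimal sketch of how a per-position base tally can be derived from read start, CIGAR, and sequence. It is not jvarkit's implementation; the helper names (`base_at`, `count_bases`) and the two toy reads are made up for illustration and are not taken from the BAM in this issue. It only relies on the SAM convention for which CIGAR operations consume the query and the reference.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
CONSUMES_QUERY = set("MIS=X")  # operations that advance along the read
CONSUMES_REF = set("MDN=X")    # operations that advance along the reference


def base_at(start, cigar, seq, ref_pos):
    """Return the read base aligned to 1-based ref_pos, or None if there is none."""
    ref, query = start, 0
    for count, op in CIGAR_RE.findall(cigar):
        length = int(count)
        inside = ref <= ref_pos < ref + length
        if op in "M=X" and inside:
            return seq[query + (ref_pos - ref)]
        if op in CONSUMES_REF:
            if inside:  # position falls in a deletion or a skipped region
                return None
            ref += length
        if op in CONSUMES_QUERY:
            query += length
    return None


def count_bases(reads, ref_pos):
    """Tally the bases observed at ref_pos over (start, cigar, seq) tuples."""
    bases = (base_at(start, cigar, seq, ref_pos) for start, cigar, seq in reads)
    return Counter(b for b in bases if b is not None)


# Hypothetical toy reads, invented so that one A and one G land on the position.
toy_reads = [
    (7833700, "30M", "C" * 20 + "A" + "C" * 9),
    (7833715, "5S20M", "T" * 10 + "G" + "T" * 14),
]
counts = count_bases(toy_reads, 7833720)
print(sum(counts.values()), dict(counts))
```

With these toy reads the depth is the sum of the tally, which is the kind of number one would compare against the tool's output.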

@lindenb lindenb closed this as completed in 1728a99 Mar 3, 2021
@lindenb lindenb reopened this Mar 3, 2021
@lindenb
Owner

lindenb commented Mar 3, 2021

Thank you for the bug report. I think I found my error. Could you please test my latest commit?

@yliueagle
Author

yliueagle commented Mar 3, 2021

Thanks for the quick update; it works now. However, I found another issue: the counts at positions affected by hard clipping seem incorrect (it seems that reads with hard clipping are excluded from the computation). See the following positions for examples:

   CHROM   POS
1: chr18 10736
2: chr18 10749
3: chr18 10755
4: chr18 11333
5: chr18 11335

@lindenb
Owner

lindenb commented Mar 4, 2021

> hard clipping seems incorrect

@yliueagle Well, I didn't want to include hard- and soft-clipped bases in the base count. I've just added a new option, --clip, to count them. For now I cannot download your BAM from Google to test the positions above. Tell me if it fulfills your needs.

P

@yliueagle
Author

yliueagle commented Mar 4, 2021

@lindenb
Thanks again. The issue here concerns reads that contain hard clipping, not the clipped bases themselves. See the test data below (remove the ".gz" extension). At position chr18:10736, findallcoverageatposition reports a DEPTH of 0.

(btw, these are Hi-C chimeric reads)

test.bam.gz
test.bam.bai.gz
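To illustrate the distinction between clipped bases and reads that merely carry a clip, here is a small sketch based on the SAM convention that H and S operations do not consume reference positions. It is not the tool's code; `reference_span`, `covers`, and the example read (start 10730, CIGAR `40H30M`) are hypothetical and not taken from test.bam. The point is that the aligned part of a hard-clipped read still covers reference positions, so such a read would normally contribute to depth there.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(start, cigar):
    """Return the 1-based inclusive (first, last) reference positions a read spans."""
    # Only M, D, N, = and X advance along the reference; H, S, I and P do not.
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MDN=X")
    return start, start + ref_len - 1


def covers(start, cigar, ref_pos):
    """True if ref_pos lies within the read's reference span."""
    first, last = reference_span(start, cigar)
    return first <= ref_pos <= last


# Hypothetical read: 40 hard-clipped bases followed by 30 aligned bases.
print(reference_span(10730, "40H30M"))
print(covers(10730, "40H30M", 10736))
```

The hard-clipped read spans the same reference interval as an unclipped `30M` read at the same start, which is why a depth of 0 at a position inside that interval would be unexpected.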

test

@lindenb
Owner

lindenb commented Mar 5, 2021

@yliueagle I saw your email. Sorry, I don't have time to look into this for now.
