[BUG] summary and log are confusing. #143

Open
ming1211 opened this issue Nov 21, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@ming1211

The generated summary file is shown below.
I'm confused about why the unmapped count is negative. (The barcode field is empty.)

barcode,total,duplicate,unmapped,lowmapq
,66319792,2753815,-54606110,12185252
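A minimal sketch for spotting this kind of problem automatically: it parses the summary text shown above with Python's standard `csv` module and reports any count that is negative. The function name `negative_fields` and the inline `SUMMARY_TEXT` constant are illustrative assumptions, not part of chromap.

```python
import csv
import io

# Summary contents copied from this report (the barcode column is empty).
SUMMARY_TEXT = """barcode,total,duplicate,unmapped,lowmapq
,66319792,2753815,-54606110,12185252
"""


def negative_fields(summary_text):
    """Return {(barcode, column): value} for every negative count."""
    problems = {}
    for row in csv.DictReader(io.StringIO(summary_text)):
        barcode = row.pop("barcode")
        for column, value in row.items():
            count = int(value)
            if count < 0:
                problems[(barcode, column)] = count
    return problems


print(negative_fields(SUMMARY_TEXT))
```

Counts of reads or pairs should never be negative, so any entry returned here points to a reporting problem rather than a property of the data.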

The following is the log:

Number of reads: 132639584.
Number of mapped reads: 120925902.
Number of uniquely mapped reads: 109487752.
Number of reads have multi-mappings: 11438150.
Number of candidates: 1347489938.
Number of mappings: 120925902.
Number of uni-mappings: 109487752.
Number of multi-mappings: 11438150.
Sorted, deduped and outputed mappings in 482.28s.
# uni-mappings: 107154670, # multi-mappings: 10900787, total: 118055457.
Number of output mappings (passed filters): 105986835
Total time: 1821.25s.

My data is paired-end. The numbers in the log seem to count reads, not pairs, while the numbers in the summary seem to be pairs, or perhaps a mix of pairs and reads?
Another question: why is the number of output mappings odd rather than even? Does that mean some reads are not in pairs?

Number of output mappings (passed filters): 105986835

My command is
chromap --preset chip -t 12 --MAPQ-threshold 10 -x $chromap_index -r $genome -1 A_R1.fastq.gz -2 A_R2.fastq.gz --SAM -o A_chromap.sam --trim-adapters --summary A-summary

Thanks in advance.

Best,
Ming

ming1211 added the bug label on Nov 21, 2023
ming1211 changed the title from "[BUG] XXX" to "[BUG] summary and log are confusing." on Nov 21, 2023
@haowenz
Owner

haowenz commented Nov 22, 2023

@mourisl Can you take a look? Thanks!

@mourisl
Collaborator

mourisl commented Nov 25, 2023

@ming1211 Sorry for the delayed reply. The summary should be with respect to the read pairs. I think the negative number in the summary is a bug if the output is in the SAM format. I will look into this issue. Thank you for reporting this bug.
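A quick arithmetic check on the figures posted in the report is consistent with a unit mix-up between pairs and reads. This is only an observation from the numbers in this thread, not a confirmed description of chromap's internals; the variable names are illustrative.

```python
# Figures copied from the summary file and the log posted in this issue.
reads_in_log = 132639584         # "Number of reads"
mapped_reads_in_log = 120925902  # "Number of mapped reads"
summary_total = 66319792         # "total" column of the summary
summary_unmapped = -54606110     # "unmapped" column of the summary
output_mappings = 105986835      # "Number of output mappings (passed filters)"

# The summary total is exactly half the read count, so it counts pairs.
total_is_pairs = summary_total * 2 == reads_in_log

# The negative value equals (pairs) - (mapped reads), i.e. mixed units.
mixed_units = summary_total - mapped_reads_in_log == summary_unmapped

# The output mapping count is odd, which is what prompted the second question.
output_is_odd = output_mappings % 2 == 1

print(total_is_pairs, mixed_units, output_is_odd)
```

All three checks hold for the posted numbers: the summary total is in pairs, while the negative "unmapped" value matches a pair count minus a read count.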

@mourisl
Collaborator

mourisl commented Dec 26, 2023

Hi @ming1211, thank you for reporting the negative-number bug. I think I've found the issue, and it should be fixed in the li_dev5 branch. Could you please check out that branch and see whether it works on your data? Thank you.

@ming1211
Author

Thanks for your update! I tried it, and it works perfectly now!
I have another question. The ATAC preset includes two options: --remove-pcr-duplicates and --remove-pcr-duplicates-at-cell-level. If I am processing bulk ATAC-seq data, will --remove-pcr-duplicates-at-cell-level have a negative effect?

Thanks again!
Ming

@ming1211
Author

To clarify, my data is bulk, not single-cell.

@mourisl
Collaborator

mourisl commented Jan 25, 2024

Since your data is bulk ATAC-seq data, you should use --remove-pcr-duplicates. The --remove-pcr-duplicates-at-cell-level option is for scATAC-seq data. I guess it will give you the same results as --remove-pcr-duplicates on bulk data, but we have never tested it.
