NOTE: this PR would change MAPQ for single-end and Hi-C mappings.
This refactor helped identify and fix bugs in the MAPQ computation for single-end and Hi-C reads. For single-end reads, the parameters were passed in the wrong order at the following line, which led to wrong MAPQ values.
`chromap/src/mapping_generator.h`, line 858 in 3ce2414
For Hi-C reads, the bug is at the following line.
`chromap/src/mapping_generator.h`, line 1089 in 3ce2414
Here `num_positive_candidates2` should be `num_negative_candidates2`.
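
To illustrate the class of bug described above, here is a minimal sketch of how two same-typed positional arguments can be swapped without any compiler diagnostic, and how grouping them in a struct avoids that. Everything in it (`Strand`, `CandidateCounts`, `CandidatesOnStrandPositional`, `CandidatesOnStrand`, `Demo`) is hypothetical and made up for this illustration; it is not Chromap's actual code or MAPQ logic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only; not taken from Chromap.
enum class Strand { kPositive, kNegative };

struct CandidateCounts {
  uint32_t num_positive_candidates;
  uint32_t num_negative_candidates;
};

// Positional version: both counts are uint32_t, so a caller that swaps
// them still compiles and silently gets the count for the other strand.
inline uint32_t CandidatesOnStrandPositional(Strand strand,
                                             uint32_t num_positive_candidates,
                                             uint32_t num_negative_candidates) {
  return strand == Strand::kPositive ? num_positive_candidates
                                     : num_negative_candidates;
}

// Struct version: the fields are named once where the counts are filled in,
// so call sites cannot reorder them by accident.
inline uint32_t CandidatesOnStrand(Strand strand,
                                   const CandidateCounts &counts) {
  return strand == Strand::kPositive ? counts.num_positive_candidates
                                     : counts.num_negative_candidates;
}

// Usage sketch with 3 positive-strand and 7 negative-strand candidates.
inline void Demo() {
  const CandidateCounts counts{3, 7};
  // Struct-based call returns the negative-strand count.
  assert(CandidatesOnStrand(Strand::kNegative, counts) == 7);
  // Swapped positional arguments compile but return the positive count.
  assert(CandidatesOnStrandPositional(Strand::kNegative, 7, 3) == 3);
}
```

Passing fewer, grouped parameters in this way is one possible reading of what the refactor does; the actual function signatures in the PR may differ.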
After fixing this, the MAPQ of single-end simulated reads appears to improve, as shown in the following figure.
![roc-color_single](https://user-images.githubusercontent.com/7692599/155751503-39369d52-f7f2-44cf-852c-1c67c85fd309.jpg)
Tested on all benchmark datasets. No change was observed in paired-end ChIP-seq or scATAC-seq mappings. Hi-C mappings changed only slightly, and the MAPQ of many single-end ChIP-seq mappings changed.
Reducing the number of parameters of the verification functions seems to change the single-end mappings, and a few good single-end mappings were missed. However, when Chromap was run on a single missed read, its mapping was reported. This might therefore be a problem caused by the cache, so this PR will be examined again after @mourisl fixes the cache.