Will trimming the reads mess up MM/ML tags #381

Closed
weishwu opened this issue Sep 20, 2023 · 2 comments

Comments


weishwu commented Sep 20, 2023

I've run Dorado to do modbase calling in order to detect CpG methylation. Dorado outputs BAM, which I then converted to FASTQ. I usually run NanoFilt to trim off the adapter-containing 5' (~40 bp) and 3' (~20 bp) ends before mapping and methylation calling. I wonder whether trimming after running Dorado will make the shortened read sequences incompatible with the information encoded in the MM/ML tags. Thanks.


ArtRand commented Sep 21, 2023

Hello @weishwu,

Indeed, trimming the sequences will likely invalidate the MM/ML tags output by dorado. If you plan to use a tool like NanoFilt/Chopper, one option is to follow the trimming with modkit repair (docs), which will project the original MM/ML tags onto the trimmed sequences. You can always check whether the MM/ML tags are correct by iterating through the BAM with pysam and reading the modified bases; it will warn you if the tags are incorrect. modkit will also report this in its logs when you use summary or a similar command. Hope this helps.
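
To illustrate why trimming breaks the tags: the MM tag stores, per modification, a list of skip counts over occurrences of the canonical base in the read (for example `C+m,0,1,1;` means "the 1st C is modified, skip 1 C, the next is modified, skip 1 C, the next is modified"). Removing bases from the 5' end shifts which C each count lands on. The following is a minimal pure-Python sketch of that effect and of what "projecting" the calls onto a trimmed read could look like. It is not modkit's implementation; the function names `decode_mm`, `encode_mm`, and `project_after_trim` are hypothetical, the toy read is made up, and it assumes a forward-strand read with a single modification type. ML probabilities are not handled here; a real repair would also drop the ML entries of calls that fall in the trimmed regions.

```python
def decode_mm(seq, base, deltas):
    """Return 0-based positions in seq that the MM skip counts mark as modified."""
    base_positions = [i for i, b in enumerate(seq) if b == base]
    positions = []
    idx = -1
    for d in deltas:
        idx += d + 1  # skip d occurrences of `base`, then land on the next one
        if idx >= len(base_positions):
            raise ValueError("MM tag refers to more bases than the sequence contains")
        positions.append(base_positions[idx])
    return positions


def encode_mm(seq, base, positions):
    """Inverse of decode_mm: turn modified positions back into skip counts."""
    base_positions = [i for i, b in enumerate(seq) if b == base]
    rank = {p: r for r, p in enumerate(base_positions)}
    deltas = []
    prev = -1
    for p in sorted(positions):
        r = rank[p]
        deltas.append(r - prev - 1)
        prev = r
    return deltas


def project_after_trim(seq, base, deltas, head, tail):
    """Re-express the modification calls on the read trimmed by head/tail bases."""
    end = len(seq) - tail
    trimmed = seq[head:end]
    # Keep only calls that survive trimming and shift them into trimmed coordinates.
    kept = [p - head for p in decode_mm(seq, base, deltas) if head <= p < end]
    return trimmed, encode_mm(trimmed, base, kept)


if __name__ == "__main__":
    read = "ACGTCCGACGTTCG"   # toy read, not real data
    mm = [0, 1, 1]            # skip counts as in "C+m,0,1,1;"
    print("original calls:", decode_mm(read, "C", mm))

    trimmed, new_mm = project_after_trim(read, "C", mm, head=3, tail=1)
    print("trimmed read:", trimmed)
    print("projected skip counts:", new_mm)

    try:
        decode_mm(trimmed, "C", mm)  # stale tag applied to the trimmed read
    except ValueError as err:
        print("stale tag is invalid:", err)
```

In this toy case the stale skip counts first point at the wrong C and then run past the end of the trimmed read, which is the kind of inconsistency a tag check would flag.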


weishwu commented Sep 21, 2023

@ArtRand Got it. Thanks!

weishwu closed this as completed Sep 21, 2023