Competitive mapping for strain separation #21
Comments
I guess this goes the same way as #5, with the minimal adjustment of making these available as multiFastA files. I guess that should be feasible in that scope too.
After doing something similar in another situation, I agree that @EisenRa's suggestion of making a multi-FASTA reference file and mapping to that is probably the way forward, rather than the multiple-channel approach mentioned in #5. One might not even have to split the BAM file (unless we add an option to spit out a particular genome from the BAM file), as most tools are human-focused and therefore already report per-chromosome statistics.
This could also be of interest, though outside of the metagenomics stuff, for improving non-human mappings: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-07229-y
Going to close in favour of #878, which is more requested and arguably more powerful. As noted above, extracting stats with e.g. bedtools (already supported) should be sufficient in many cases.
In some metagenomic contexts, a sample can contain closely related species, which makes read mapping to a single reference genome difficult (e.g. cross-mapping of reads between species). In this situation, it can be useful to employ competitive mapping, whereby reference genomes from closely related species are concatenated (in a multifasta) and the reads are mapped to this combined reference. Reads that align equally well to more than one genome then tend to receive a low mapping quality, so the mapping quality filter can remove reads that would otherwise cross-map between species.
EAGER could allow users to select a folder, or a list of reference fastas and concatenate them into a multifasta prior to read mapping. Alternatively, the user could provide their pre-concatenated multifasta file.
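As a rough illustration of the concatenation step, here is a minimal Python sketch. It is not how EAGER does or would implement this. The function name `concatenate_fastas`, the `__` separator, and the idea of prefixing each header with the source file name are all assumptions made for the example, and it only handles uncompressed FASTA files.

```python
from pathlib import Path


def concatenate_fastas(fasta_paths, output_path):
    """Write all records from several FASTA files into one multifasta.

    Each header is prefixed with the stem of its source file (hypothetical
    convention), so that identically named contigs from different genomes
    stay distinguishable after mapping. Returns the number of records written.
    """
    n_records = 0
    with Path(output_path).open("w") as out:
        for path in map(Path, fasta_paths):
            with path.open() as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # drop blank lines between records
                    if line.startswith(">"):
                        # ">chr1 desc" from speciesA.fasta -> ">speciesA__chr1 desc"
                        out.write(f">{path.stem}__{line[1:]}\n")
                        n_records += 1
                    else:
                        out.write(line + "\n")
    return n_records
```

Calling `concatenate_fastas(["speciesA.fasta", "speciesB.fasta"], "combined.fasta")` (file names hypothetical) would produce a single reference to index and map against. A user-supplied, pre-concatenated multifasta would simply skip this step.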
Regarding the output of mapping stats, the concatenated BAM file would have to be split using bamtools prior to generating stats.
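For simple per-genome read counts, splitting may not be strictly necessary. The sketch below is an alternative illustration, not the bamtools approach: it tallies primary alignments per reference sequence directly from SAM-format text lines (for example, the output of `samtools view -h`). The function name `count_reads_per_reference` and the default threshold of 30 are assumptions chosen for the example.

```python
from collections import Counter

# SAM flag bits (from the SAM specification)
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_reads_per_reference(sam_lines, min_mapq=30):
    """Count primary alignments per reference name with MAPQ >= min_mapq.

    `sam_lines` is any iterable of SAM text lines; header lines are skipped.
    The default threshold is illustrative only.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or header line
        fields = line.rstrip("\n").split("\t")
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY) or rname == "*":
            continue  # keep one primary alignment per mapped read
        if mapq >= min_mapq:
            counts[rname] += 1
    return counts
```

Reads that cross-map between the concatenated genomes would typically fall below the threshold and drop out of the counts, which is the effect competitive mapping relies on.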