Fix for abundance deviation in metagenome mode #232
Merged
lcoombe (Contributor) commented on Oct 7, 2024
- If the simulated read is longer than the selected chromosome, preferentially select a replacement sequence from the same species
- The previous behaviour randomly selected a replacement sequence from the full pool of sequences across all species
- In the example of the Zymo mock model, this meant that S. aureus reads ended up under-represented, while Cryptococcus was over-represented
- This was because the S. aureus reference contained 4 sequences: the main circular genome and 3 short plasmids
- When a sequence from S. aureus was randomly selected, the plasmids were frequently chosen, and these were shorter than the requested read length
- Because the Cryptococcus reference contains more sequences than the other references, choosing a replacement at random from the entire pool of species/sequences meant that Cryptococcus was chosen more often
- To retain the requested abundances as much as possible, preferentially choose the 'alternative' sequence from the same species
- This is consistent with the version used in the meta-NanoSim paper
- As a fall-back, if there are no appropriate sequences in the species' reference, choose another species
- If this is required, a warning will be printed, advising the user to check the abundances after simulation finishes
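
The selection order described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `pick_replacement`, the `references` layout (species name mapped to sequence names and lengths), and the sequence lengths are all hypothetical and chosen only to show the same-species-first rule with a warned fall-back.

```python
import random
import warnings


def pick_replacement(species, read_length, references, rng=random):
    """Return (species, sequence name) for a sequence that can hold the read.

    `references` maps species name -> {sequence name: sequence length}.
    All names here are hypothetical, not NanoSim's actual API.
    """
    # First choice: a sufficiently long sequence from the same species,
    # so the requested abundance for that species is preserved.
    same = [name for name, length in references[species].items()
            if length >= read_length]
    if same:
        return species, rng.choice(same)

    # Fall-back: a sufficiently long sequence from any other species.
    others = [(sp, name)
              for sp, seqs in references.items() if sp != species
              for name, length in seqs.items() if length >= read_length]
    if not others:
        raise ValueError("no reference sequence is long enough for this read")
    warnings.warn(
        f"No sequence in {species} can hold a {read_length} bp read; "
        "using another species. Check the abundances after simulation."
    )
    return rng.choice(others)


# Illustrative lengths only, not the real Zymo mock references.
references = {
    "S_aureus": {"chromosome": 2_800_000, "plasmid1": 3_000},
    "Cryptococcus": {"chr1": 2_300_000, "chr2": 1_600_000},
}
print(pick_replacement("S_aureus", 10_000, references))
```

With these illustrative inputs, a 10 kb read that cannot fit on the plasmid is placed on the S. aureus chromosome rather than on a Cryptococcus sequence, so the S. aureus read count is not eroded by its short plasmids.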
…pecies preferentially