I'm in the middle of making an IRMA module for adenoviruses. I came across your repo today and thought it would be useful for that purpose (I'm definitely thinking of using it to generate consensus sequences). The IRMA paper mentions a few filtering steps that I thought would be a natural fit for this tool (see "Methods" > "Datasets" > "Influenza alignment dataset", second paragraph). In particular, they mentioned:
Removing duplicate sequences
This should be the (second-)easiest of the bunch.
Removing sequences with greater than N ambiguous nucleotides
In the paper, the authors specified N=5, which may be a good default setting for Influenza A/B segments.
Removing sequences causing frame-shifts
I think this may be relatively difficult to calculate, compared to the others.
Removing short sequences
This functionality is already implemented (--remove_short), but it may be nice to have the ability to specify a percentage of the alignment as a cutoff.
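To make the duplicate, ambiguity, and percentage-length filters more concrete, here is a rough sketch in plain Python. None of these function names or parameters come from your tool or from IRMA; they are hypothetical and only illustrate the behaviour I have in mind. It assumes the alignment is a dict mapping sequence names to aligned strings with `-` as the gap character, that duplicates are compared after stripping gaps and ignoring case, that "ambiguous" means any IUPAC code other than A, C, G, T or U, and that the alignment length is the length of the longest row.

```python
from typing import Dict

# IUPAC nucleotide ambiguity codes (assumption: gaps are not counted as ambiguous).
AMBIGUOUS = set("RYSWKMBDHVN")


def drop_duplicates(seqs: Dict[str, str]) -> Dict[str, str]:
    """Keep only the first record for each distinct ungapped sequence."""
    seen = set()
    kept = {}
    for name, seq in seqs.items():
        key = seq.replace("-", "").upper()
        if key not in seen:
            seen.add(key)
            kept[name] = seq
    return kept


def drop_ambiguous(seqs: Dict[str, str], max_ambiguous: int = 5) -> Dict[str, str]:
    """Remove sequences with more than max_ambiguous ambiguity codes."""
    return {
        name: seq
        for name, seq in seqs.items()
        if sum(base in AMBIGUOUS for base in seq.upper()) <= max_ambiguous
    }


def drop_short(seqs: Dict[str, str], min_fraction: float = 0.5) -> Dict[str, str]:
    """Remove sequences whose ungapped length is below a fraction of the alignment length."""
    alignment_length = max((len(seq) for seq in seqs.values()), default=0)
    return {
        name: seq
        for name, seq in seqs.items()
        if len(seq.replace("-", "")) >= min_fraction * alignment_length
    }
```

The three filters are independent, so they could be chained in any order, for example `drop_short(drop_ambiguous(drop_duplicates(seqs)), min_fraction=0.5)`. The default of 5 for `max_ambiguous` just mirrors the value from the paper, and the 0.5 default for `min_fraction` is an arbitrary placeholder.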
I'm sorry it's taken such a long time to reply! We will look at incorporating these features. All except the frameshift filter seem reasonably straightforward. I'll look into it and get back to you.