Utility function suggestions #3

Closed
afrubin opened this issue Oct 25, 2021 · 0 comments

afrubin commented Oct 25, 2021

Here are two suggestions for utility functions in MaveTools:

Infer target sequence from variant data

The function would work for both protein and nucleotide data. Its input is a list of mavehgvs Variant objects.

The function would raise an appropriate error or warning if the variant data has a gap (e.g. if there are variants for positions 1-10 and 12-15 but not 11).

The function would also throw an error if there are conflicting target residues for the same position.

The function needs to handle target-identical and indel variants correctly, most likely by ignoring them (a possible exception being protein data with two adjacent residues).
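
A minimal sketch of the core logic, to make the gap and conflict rules concrete. It does not use the mavehgvs API: it assumes the (position, target residue) pairs have already been pulled out of the substitution variants, that residues are single characters, and that a gap is reported as a warning rather than an error. The name infer_target_sequence and the gap_char parameter are hypothetical.

```python
import warnings


def infer_target_sequence(observations, gap_char="N"):
    """Rebuild a target sequence from (position, residue) pairs.

    `observations` is an iterable of 1-based positions paired with the
    single-character target residue reported at that position.
    Returns (first_position, sequence).
    """
    by_position = {}
    for position, residue in observations:
        seen = by_position.setdefault(position, residue)
        if seen != residue:
            # Two variants disagree about the target at this position.
            raise ValueError(
                f"conflicting target residues at position {position}: "
                f"{seen!r} vs {residue!r}"
            )
    if not by_position:
        raise ValueError("no substitution data to infer a target from")

    first, last = min(by_position), max(by_position)
    gaps = [p for p in range(first, last + 1) if p not in by_position]
    if gaps:
        # Assumption: a gap is worth a warning, not a hard failure.
        warnings.warn(f"no variant data for positions {gaps}")

    sequence = "".join(
        by_position.get(p, gap_char) for p in range(first, last + 1)
    )
    return first, sequence


print(infer_target_sequence([(1, "A"), (2, "C"), (2, "C"), (3, "G")]))
# prints (1, 'ACG')
```

Returning the first position alongside the sequence keeps the result usable when the data do not start at position 1.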

Split "matched" delins into substitutions

The code that converts codon changes to mavehgvs variants prefers to define single events, e.g. a deletion-insertion of two bases rather than two substitutions. Many users may prefer to look at these data as multiple substitutions instead.

This function would take delins variants whose deleted and inserted sequences have the same length and output the corresponding multi-variant.

For many variants this will require the user to provide the appropriate target sequence. To improve usability, the user should be able to provide a longer target sequence and an offset.

The function should throw an informative error if the delins is not matched or if the target sequence is inconsistent with the variant (e.g. the variant positions fall outside the target sequence, or the split would produce a target-identical change, which suggests the wrong target sequence was provided).
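
A rough sketch of how this could behave, working on plain strings rather than mavehgvs Variant objects. Everything here is an assumption for illustration: the name split_matched_delins, the restriction to simple DNA variants with positive integer positions, and the meaning of offset (the number of bases in the supplied target that precede variant position 1). It also assumes that only a target-identical first or last base signals a wrong target, since unchanged bases in the middle of a delins are legitimate and are simply skipped.

```python
import re

# Simple DNA delins over a position range, e.g. "c.1_2delinsGT".
_DELINS = re.compile(
    r"^(?P<prefix>[cgn])\.(?P<start>\d+)_(?P<end>\d+)delins(?P<ins>[ACGT]+)$"
)


def split_matched_delins(variant, target, offset=0):
    """Rewrite a matched delins as a multi-variant of substitutions."""
    match = _DELINS.match(variant)
    if match is None:
        raise ValueError(f"not a simple DNA delins variant: {variant!r}")

    start, end = int(match["start"]), int(match["end"])
    inserted = match["ins"]
    if end <= start or end - start + 1 != len(inserted):
        raise ValueError(
            f"delins is not matched: {end - start + 1} bases deleted, "
            f"{len(inserted)} inserted"
        )

    # Map 1-based variant positions onto the (possibly longer) target.
    low, high = offset + start - 1, offset + end
    if low < 0 or high > len(target):
        raise ValueError("variant positions fall outside the target sequence")

    reference = target[low:high]
    if reference[0] == inserted[0] or reference[-1] == inserted[-1]:
        # A well-formed delins changes its first and last base.
        raise ValueError(
            "target-identical change at a delins boundary; "
            "was the wrong target sequence or offset provided?"
        )

    substitutions = [
        f"{start + i}{ref}>{alt}"
        for i, (ref, alt) in enumerate(zip(reference, inserted))
        if ref != alt  # unchanged middle bases are dropped
    ]
    return f"{match['prefix']}.[{';'.join(substitutions)}]"


print(split_matched_delins("c.1_2delinsGT", "ACGT"))
# prints c.[1A>G;2C>T]
print(split_matched_delins("c.1_3delinsGCT", "TTACAGG", offset=2))
# prints c.[1A>G;3A>T]
```

The second call shows the offset in use: the target has two extra leading bases, and the unchanged middle base of the codon is left out of the multi-variant.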

afrubin added the enhancement label Oct 25, 2021
afrubin closed this as completed May 23, 2023