Sankoff characters throw runtime exception when given the L1 norm metric #80

Closed
recursion-ninja opened this issue Oct 6, 2018 · 1 comment

@recursion-ninja
Collaborator

After adding the protein data for Sankoff characters to the integration test suite, the L1 norm metric fails with a runtime exception. This is likely because we have special-case logic (for efficiency) that uses the Haskell function `\(i,j) -> max i j - min i j` when the metric is the L1 norm, rather than using the memoized TCM. We apply the same kind of special-case function for the discrete metric, but that works as expected.
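For readers unfamiliar with the two special cases, here is a minimal sketch of what such metric functions might look like, next to the dense cost matrix they stand in for. The names `l1Norm`, `discreteMetric`, and `denseTCM` are hypothetical and not taken from the project's source; the use of `Word` for symbol indices is also an assumption. It is one plausible reason for writing `max i j - min i j`, since a plain `i - j` on an unsigned type would wrap around whenever `j > i`.

```haskell
-- Hypothetical names for illustration; not the project's actual API.

-- L1 norm (additive) metric on symbol indices.
-- Written as max - min so the subtraction never underflows on an unsigned type.
l1Norm :: Word -> Word -> Word
l1Norm i j = max i j - min i j

-- Discrete metric: 0 for identical symbols, 1 otherwise.
discreteMetric :: Word -> Word -> Word
discreteMetric i j = if i == j then 0 else 1

-- The dense cost matrix that a metric function replaces,
-- for an alphabet of n symbols indexed 0 .. n-1.
denseTCM :: Int -> (Word -> Word -> Word) -> [[Word]]
denseTCM n f = [ [ f i j | j <- idx ] | i <- idx ]
  where
    idx = map fromIntegral [0 .. n - 1]

main :: IO ()
main = do
  mapM_ print (denseTCM 4 l1Norm)
  mapM_ print (denseTCM 4 discreteMetric)
```

Comparing the function's results against the corresponding table entries in this way is one possible check when a special-cased metric misbehaves.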

We need to track down why the L1 norm (additive) metric doesn't work as expected for Sankoff characters. This may be related to #78.

@recursion-ninja
Collaborator Author

After updating the READ command's grammar to be more expressive, I resolved this issue. The amino acid sequences in FASTA input files were each being interpreted as a single large symbol in a custom alphabet, rather than as a sequence of single-character symbols.

Example:

The input of: CATGAT was being interpreted as:

["CATGAT"]

instead of

[["C"], ["A"], ["T"], ["G"], ["A"], ["T"]]
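The difference between the two interpretations can be sketched in a few lines of Haskell. The helper names `asSingleSymbol` and `asResidues` are hypothetical and only illustrate the shape of the data, not the actual READ command parser.

```haskell
-- Hypothetical helpers for illustration; not the project's parser.
type Symbol = String

-- Faulty reading: the whole sequence becomes one symbol of a custom alphabet.
asSingleSymbol :: String -> [Symbol]
asSingleSymbol s = [s]

-- Intended reading: one single-character symbol per position,
-- each position wrapped in its own list to match the shape shown above.
asResidues :: String -> [[Symbol]]
asResidues = map (\c -> [[c]])

main :: IO ()
main = do
  print (asSingleSymbol "CATGAT")  -- prints ["CATGAT"]
  print (asResidues "CATGAT")      -- prints [["C"],["A"],["T"],["G"],["A"],["T"]]
```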
