Changing sum-to-1 error to a warning for calculating information content #80

Merged 1 commit on Jan 22, 2021

Conversation

AvantiShri (Collaborator) commented Jan 22, 2021

Regarding Issue #79

This error typically occurs when the user's sequences are not perfectly one-hot encoded, i.e. some columns are all zeros (usually because the one-hot encoding procedure mapped Ns to all zeros). The resulting PPM can have rows whose probabilities do not sum to 1. The error is thrown when computing the information content for visualization purposes. The information content can still be calculated by renormalizing each row to sum to 1, so that is what this workaround does, after printing a warning. Users should still confirm that this behavior is acceptable for their data.
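A minimal sketch of the renormalization described above, not the code merged in this PR. The helper name `renormalize_ppm`, its `tol` parameter, and the toy sequences are hypothetical. The sketch also assumes no row sums to zero, since dividing by a zero row sum would produce NaNs.

```python
import warnings

import numpy as np


def renormalize_ppm(ppm, tol=1e-5):
    """Hypothetical helper: rescale each row of ppm to sum to 1, warning first."""
    ppm = np.asarray(ppm, dtype=float)
    row_sums = ppm.sum(axis=1)
    # Only touch the matrix when some row is off by more than the tolerance.
    if np.max(np.abs(row_sums - 1.0)) > tol:
        warnings.warn(
            "Probabilities don't sum to 1 along axis 1; renormalizing rows"
        )
        ppm = ppm / row_sums[:, None]
    return ppm


# Two one-hot sequences of length 3; the second has an N (all zeros) at position 1.
seqs = np.array([
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0]],
], dtype=float)

ppm = seqs.mean(axis=0)        # row 1 sums to 0.5 rather than 1
fixed = renormalize_ppm(ppm)   # every row now sums to 1
```

Note that renormalizing treats the N positions as if they had not been observed, which is the behavior users are asked to confirm they are comfortable with.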

assert (np.max(np.abs(np.sum(ppm, axis=1)-1.0)) < 1e-5),(
"Probabilities don't sum to 1 along axis 1 in "
+str(ppm)+"\n"+str(np.sum(ppm, axis=1)))
if (np.max(np.abs(np.sum(ppm, axis=1)-1.0)) < 1e-5):

This should be a greater-than sign, since the branch should run when the row sums deviate from 1. A cleaner alternative:
if not np.allclose(np.sum(ppm, axis=1), 1.0, atol=1.0e-5):
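As an illustrative check (the example matrices are made up), the explicit greater-than comparison and the `np.allclose` form flag the same malformed PPM. One caveat: `np.allclose` also applies its default `rtol=1e-5`, so the effective tolerance around 1.0 is slightly looser than `1e-5` unless `rtol=0` is passed.

```python
import numpy as np

# Row 1 sums to 0.5, as happens when an N was encoded as all zeros.
bad_ppm = np.array([[0.25, 0.25, 0.25, 0.25],
                    [0.0, 0.5, 0.0, 0.0]])
good_ppm = np.array([[0.25, 0.25, 0.25, 0.25],
                     [0.0, 1.0, 0.0, 0.0]])


def needs_fix_explicit(ppm):
    # Corrected form of the condition in the diff: ">" rather than "<".
    return bool(np.max(np.abs(np.sum(ppm, axis=1) - 1.0)) > 1e-5)


def needs_fix_allclose(ppm):
    # Form suggested in the review; rtol=0 keeps the tolerance at exactly atol.
    return not np.allclose(np.sum(ppm, axis=1), 1.0, rtol=0, atol=1.0e-5)


bad_flags = (needs_fix_explicit(bad_ppm), needs_fix_allclose(bad_ppm))
good_flags = (needs_fix_explicit(good_ppm), needs_fix_allclose(good_ppm))
```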

2 participants