Prioritization with MutationTaster is broken #509
Comments
Seems like the current MutationTaster does not return the … But currently the Bayes prob is required in order to obtain a score from MutationTaster: https://github.com/bihealth/varfish-server/blob/7a05efc0ef30f7a6f43752e5b5ae46c4076b154a/variants/models.py#L2747
@xiamaz Thanks for the bug hunt. We need to ask Dominik. I'll drop him an email to ask whether this is the correct way (or did you find it in the docs?). We should also replace …
The tree vote is not a probabilistic measure but a measure of … One possible formula is … Example of MT2021 output: …
I think one can still use the old MutationTaster interface with the Bayes probabilities under https://www.mutationtaster.org/ instead of https://www.genecascade.org/MutationTaster2021 .
@your-highness (Johannes?) Thanks for the input. If this is possible, we should patch it back to use the old interface and then create a ticket for adapting to the new interface, as that seems to be more involved.
@stolpeo I have tested some changes to the current models that would be necessary. IMHO the cleanest implementation should separate MT2021 from MT86, since they follow very different principles.
@your-highness Reading https://www.genecascade.org/MutationTaster2021/info/#rf I'm not sure whether doing a probability calculation is a good idea with the current MT2021 classifier.
Just read "Your-Highness" with a German accent and it almost sounds correct 😄
I second this: MT86 should be incorporated and kept.
We should ask the MT2021 developers how to infer a proper ranking. They do it anyway with MutationDistiller.
We do not rank variants with MutationTaster. The Bayes classifier gave a Boolean output, and the float values were only indicating its internal confidence. In contrast to a 'score' (e.g. in CADD or RegulationSpotter), they do not reflect how deleterious a variant is thought to be. Do not use them at all: they don't provide any benefit over the prediction values and lead to the false idea that MT could rank the disease potential of variants. With MutationTaster2021 it's a bit different: this is a RandomForest model and the 'tree vote' shows how many decision trees suggest deleterious:harmless. But it's still an internal marker and cannot rank deleteriousness, because the model has not been trained to give such a metric. It's either deleterious or harmless. Don't use this score either (unless you want to test the classifier).
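To illustrate the point above, here is a minimal Python sketch that treats the MutationTaster prediction purely as a category and does not order variants within a group. It is not code from varfish-server: the function name `partition_by_prediction`, the `(variant_id, prediction)` pair layout, and the label string `"deleterious"` are assumptions made for this example, and the actual strings returned by MutationTaster may differ.

```python
from typing import Iterable, List, Tuple

# Hypothetical label; the strings actually returned by MutationTaster may differ.
DELETERIOUS = "deleterious"


def partition_by_prediction(
    variants: Iterable[Tuple[str, str]],
) -> Tuple[List[str], List[str]]:
    """Split (variant_id, prediction) pairs into two groups.

    The prediction is used only as a category. Input order is preserved
    inside each group, so no ranking by confidence or tree vote is implied.
    """
    deleterious: List[str] = []
    other: List[str] = []
    for variant_id, prediction in variants:
        if prediction.strip().lower() == DELETERIOUS:
            deleterious.append(variant_id)
        else:
            other.append(variant_id)
    return deleterious, other


if __name__ == "__main__":
    calls = [("v1", "deleterious"), ("v2", "harmless"), ("v3", "Deleterious")]
    print(partition_by_prediction(calls))
```

Whether a two-group split like this is a useful replacement for the current prioritization is a separate question for the follow-up discussion.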
@domibln Thanks for the input! I now inspected our ranking algorithm and currently the ranking with MT is based on the …
Resolution Proposal
Affected Components
Affected Modules/Files
Required Architectural Changes
Required Database Changes
Backport Possible?
Resolution Sketch https://www.genecascade.org/MT2021/MT_API102.cgi to …
Continue discussion for new implementation in #511
…ink-out Related-Issue: #509 Projected-Results-Impact: none
Describe the bug
Currently prioritization using MutationTaster does not work and returns an error.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Table including MutationTaster scores
Screenshots
![image](https://user-images.githubusercontent.com/3340720/169084104-a6fbeab4-6969-4123-b4a0-883e07b9775a.png)
Additional context
Possibly caused by unguarded access. https://github.com/bihealth/varfish-server/blob/7a05efc0ef30f7a6f43752e5b5ae46c4076b154a/variants/models.py#L2694
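A minimal Python sketch of what a guarded access could look like, assuming the MutationTaster result is handled as a dict-like record. The key name `bayes_prob` and the helper `extract_bayes_prob` are hypothetical and chosen only for illustration; the real field names in `variants/models.py` are not shown in this issue.

```python
from typing import Any, Mapping, Optional

# Hypothetical key name; the field actually used by varfish-server may differ.
BAYES_PROB_KEY = "bayes_prob"


def extract_bayes_prob(record: Mapping[str, Any]) -> Optional[float]:
    """Return the Bayes probability as a float, or None if missing or unparsable."""
    raw = record.get(BAYES_PROB_KEY)  # .get() avoids a KeyError when the key is absent
    if raw is None or raw == "":
        return None
    try:
        return float(raw)
    except (TypeError, ValueError):
        # The value is present but not numeric; treat it like a missing value.
        return None


if __name__ == "__main__":
    print(extract_bayes_prob({"bayes_prob": "0.98"}))
    print(extract_bayes_prob({"prediction": "deleterious"}))
```

With a helper like this, a missing field yields `None` instead of raising, and the caller can decide whether to skip the variant or report that no score is available.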