Slow encoding #1
Comments
Hi @Akazhiel, thanks for your interest! One selection step you can do before encoding is to run all pMHCs through netMHCpan and keep only those with a satisfactory rank (e.g. 2%). Then input the remaining pMHCs together with the TCRs to pMTnet. We are also working on computationally speeding up the encoding process. Best,
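For anyone following along, here is a minimal sketch of that pre-selection step in Python. It assumes the binding predictions have already been exported to rows with a percentile rank per pMHC; the field names (`peptide`, `hla`, `rank`, `cdr3`) and the toy values are hypothetical and are not the actual netMHCpan or pMTnet column names, so adjust them to your files.

```python
# Sketch only: field names and data below are made up for illustration.
RANK_THRESHOLD = 2.0  # percentile rank cutoff in percent, as suggested above


def filter_pmhcs(rows, threshold=RANK_THRESHOLD):
    """Keep pMHC rows whose percentile rank is at or below the threshold."""
    return [row for row in rows if float(row["rank"]) <= threshold]


def pair_with_tcrs(tcrs, pmhcs):
    """Pair every TCR with every retained pMHC (full cross product)."""
    return [
        {"cdr3": tcr, "peptide": p["peptide"], "hla": p["hla"]}
        for tcr in tcrs
        for p in pmhcs
    ]


# Toy input: three pMHCs, one of which is above the 2% cutoff.
pmhc_rows = [
    {"peptide": "GILGFVFTL", "hla": "A*02:01", "rank": "0.05"},
    {"peptide": "NLVPMVATV", "hla": "A*02:01", "rank": "1.80"},
    {"peptide": "AAAAAAAAA", "hla": "A*02:01", "rank": "35.0"},
]
tcrs = ["CASSLGQAYEQYF", "CASSIRSSYEQYF"]

kept = filter_pmhcs(pmhc_rows)
pairs = pair_with_tcrs(tcrs, kept)
print(len(pmhc_rows), "pMHCs before filtering,", len(kept), "after")
print(len(pairs), "TCR-pMHC pairs to encode")
```

Filtering before pairing matters because every pMHC that is dropped removes one row per TCR from the final input file.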
Hello @tianshilu, Yes, we do run the pMHCs through an algorithm other than netMHCpan and filter them by affinity percentile. My question was more about how (if possible) to reduce the number of candidate TCRs, since you'd want to screen each TCR against all the pMHCs. Cheers, Jonatan
Hi @Akazhiel, Sorry, we don't have a pre-selection step for TCRs. We are working on speeding up the encoding and prediction. Thanks very much for your feedback! Tianshi
Hi @tianshilu, That's totally understandable; subsetting the TCRs might indeed be a really hard feat to achieve. I've been tinkering with the code and sped up the encoding steps that take place before the encoding with the autoencoder. My knowledge of machine learning is pretty limited, so I wouldn't know how to speed up the autoencoder or the predictions. If it's okay with you, I'll open a pull request so that you can review the code. I've done some testing and the Cheers, Jonatan
Hi @Akazhiel, Thanks for your effort on this. Please feel free to open a pull request! Thanks! Tianshi
The encoding part has been updated for faster encoding speed.
Greetings!
Great tool for predicting TCR-pMHC binding! Is there any way to speed up the encoding step, though? As I understand it, the aim of this tool is to predict how well your TCR repertoire binds to the predicted pMHCs, and the encoding is far slower than I'd expect. Given that you'd pair each TCR with the whole list of pMHCs to test for binding, this generates files with millions of lines. I'm currently running it on a file with 2M lines; after almost 3 days of running time, the encoding is not even close to being done. Maybe the input is not expected to contain all possible combinations, but only some of them? In that case, how would you select them?
Best regards,
Jonatan
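To illustrate why the all-against-all pairing described in this post grows so quickly, and one generic way to limit the cost, here is a small Python sketch. It is not pMTnet's code and not necessarily what the later pull request changed: `encode_sequence` is a hypothetical stand-in for any expensive per-sequence encoding, and the idea shown is simply to encode each unique sequence once and reuse the result across pairs.

```python
from itertools import product


def encode_sequence(seq):
    """Hypothetical stand-in for an expensive per-sequence encoding step."""
    return tuple(ord(c) for c in seq)


def encode_pairs(tcrs, peptides, encoder=encode_sequence):
    """Encode every TCR-peptide pair, calling the encoder once per unique sequence."""
    cache = {}
    calls = 0

    def cached(seq):
        nonlocal calls
        if seq not in cache:
            cache[seq] = encoder(seq)
            calls += 1
        return cache[seq]

    encoded = [(cached(t), cached(p)) for t, p in product(tcrs, peptides)]
    return encoded, calls


# Toy input: 3 TCRs x 2 peptides gives 6 pairs but only 5 unique sequences.
tcrs = ["CASSLGQAYEQYF", "CASSIRSSYEQYF", "CASSPGTGGNEQFF"]
peptides = ["GILGFVFTL", "NLVPMVATV"]

encoded, calls = encode_pairs(tcrs, peptides)
print(len(encoded), "pairs encoded with", calls, "encoder calls")
```

The number of pairs grows as the product of the two list lengths, while the number of unique sequences grows only as their sum, so the gap widens quickly for inputs with millions of rows.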