Thanks for the wonderful work! I am trying to use the learned embeddings for a downstream protein classification problem on my own datasets. Since training the model requires substantial HPC resources, I am wondering:
1. Could you kindly upload your pretrained model?
2. Could you explain how to generate the training and testing datasets (the `.pkl.gz` file) from our own PDB files?
3. Based on the `.pkl.gz` file generated in Q2, how do we apply the trained model to get the final embedding vectors (512 dimensions) for our own PDB files?
1. We plan to release a model checkpoint, but the preparation will still take some time. If it is urgent for you, you can send an email to me (zuobai.zhang@mila.quebec) and I'll personally share a (good but not yet ready) checkpoint with you.
2 & 3. To define a customized dataset, you can follow the tutorials in TorchDrug and TorchProtein. Customize Models & Tasks covers how to define a customized module in TorchDrug, and Tutorial 3 - Structure-based Protein Property Prediction covers how to define a dataset and run the GearNet model on it.
For your case, you can first define your customized dataset following the code of `datasets.EnzymeCommission`, which shows how to generate the `.pkl.gz` file with the TorchProtein API (see the first sketch below). Then, you can load the task and model as in the scripts in this repo and obtain the final embeddings by calling `task.model()` (refer to `tasks.MultipleBinaryClassification`); a second sketch follows.
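For Q2, here is a minimal sketch of such a dataset, assuming a local folder of `.pdb` files. `CustomPDBDataset`, `pdb_dir`, `pkl_file`, and the paths in the usage example are hypothetical names; the `load_pdbs` / `load_pickle` / `save_pickle` helpers are the `data.ProteinDataset` utilities that `datasets.EnzymeCommission` itself builds on:

```python
import glob
import os

from torchdrug import data, transforms


class CustomPDBDataset(data.ProteinDataset):
    """Hypothetical dataset over a folder of local PDB files,
    modeled on torchdrug.datasets.EnzymeCommission."""

    def __init__(self, pdb_dir, pkl_file, transform=None, verbose=1, **kwargs):
        if os.path.exists(pkl_file):
            # Reuse the cached .pkl.gz archive on later runs.
            self.load_pickle(pkl_file, transform=transform, verbose=verbose, **kwargs)
        else:
            pdb_files = sorted(glob.glob(os.path.join(pdb_dir, "*.pdb")))
            # Parse the raw PDB files into torchdrug Protein objects ...
            self.load_pdbs(pdb_files, transform=transform, verbose=verbose, **kwargs)
            # ... and serialize them as a compressed pickle (.pkl.gz).
            self.save_pickle(pkl_file, verbose=verbose)


# Residue-level view so that node features are 21-dim residue one-hots,
# as expected by GearNet.
transform = transforms.ProteinView(view="residue")
dataset = CustomPDBDataset("path/to/pdbs", "proteins.pkl.gz", transform=transform)
```

Labels for your classification task would be stored in `self.targets`, as `datasets.EnzymeCommission` does.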
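For Q3, a sketch of embedding extraction, assuming a plain GearNet encoder with hyperparameters along the lines of the repo's configs (21-dim residue features, 7 edge relations, 512-dim hidden layers; with `concat_hidden=False` the graph feature is 512-dim). The checkpoint path and state-dict layout are assumptions to adapt once the checkpoint is released:

```python
import torch
from torchdrug import data, layers, models
from torchdrug.layers import geometry

# Residue-level graph construction used by GearNet: alpha-carbon nodes
# with spatial, KNN, and sequential edges (1 + 1 + 5 = 7 relations).
graph_construction_model = layers.GraphConstruction(
    node_layers=[geometry.AlphaCarbonNode()],
    edge_layers=[
        geometry.SpatialEdge(radius=10.0, min_distance=5),
        geometry.KNNEdge(k=10, min_distance=5),
        geometry.SequentialEdge(max_distance=2),
    ],
)

model = models.GearNet(
    input_dim=21,
    hidden_dims=[512, 512, 512, 512, 512, 512],
    num_relation=7,
    batch_norm=True,
    short_cut=True,
    readout="sum",
)

# Hypothetical checkpoint file; adapt the key layout to the released file.
# state = torch.load("gearnet_checkpoint.pth", map_location="cpu")
# model.load_state_dict(state)

model.eval()
with torch.no_grad():
    # Pack a few proteins from the dataset defined above into one batch.
    proteins = data.Protein.pack(
        [dataset[i]["graph"] for i in range(min(4, len(dataset)))]
    )
    proteins = graph_construction_model(proteins)
    output = model(proteins, proteins.node_feature.float())
    embeddings = output["graph_feature"]  # shape: (batch_size, 512)
```

If you load the full task (e.g. `tasks.MultipleBinaryClassification`) instead of the bare encoder, the same call is `task.model(proteins, proteins.node_feature.float())`.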