This repository contains the data and code for PolyCL.
- Dependency: You will need only `polycl.py` in this repository and the `torch` and `transformers` packages as the minimum requirement.
- Obtain the polymer embedding: Simply follow the demonstration in `PolyCL_Easy_Usage.ipynb`.
You might need to set up Git LFS first. Download it by following the instructions on https://git-lfs.com/ , then install it using:
$ git lfs install
After Git LFS is properly configured:
$ git clone https://github.com/JiajunZhou96/PolyCL.git
# create a new environment
$ conda create --name polycl python=3.9
$ conda activate polycl
# install requirements
#$ pip install numpy==1.26.4
#$ pip install pandas==1.3.3
#$ pip install scikit-learn==0.24.2
$ pip install torch==1.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
$ pip install transformers==4.20.1
$ pip install -U torchmetrics
$ pip install tensorboard
$ pip install tqdm
$ conda install -c conda-forge rdkit
$ pip install torch-geometric==1.7.2 torch-sparse==0.6.18 torch-scatter==2.1.2 -f https://pytorch-geometric.com/whl/torch-1.12.0+cu113.html
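After installation, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch, not part of the repository: the `check_versions` helper and the `PINNED` table are hypothetical names, and the version numbers are simply copied from the install commands above. It assumes the packages are registered under the same distribution names used with `pip`.

```python
from importlib import metadata

# Pins copied from the install commands above (illustrative table, not shipped
# with the repository). Local build tags such as "+cu113" are stripped before
# comparison, so a CPU-only build of the same release also counts as a match.
PINNED = {
    "torch": "1.12.0",
    "transformers": "4.20.1",
    "torch-geometric": "1.7.2",
    "torch-sparse": "0.6.18",
    "torch-scatter": "2.1.2",
}


def check_versions(pins, get_version=metadata.version):
    """Return {package: status}; status is 'ok', 'missing', or 'found <version>'."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        base = found.split("+", 1)[0]  # drop a local tag like "+cu113"
        report[name] = "ok" if base == wanted else f"found {found}"
    return report


if __name__ == "__main__":
    for name, status in check_versions(PINNED).items():
        print(f"{name:16s} {status}")
```

Run it inside the activated `polycl` environment; any line that does not read `ok` points at a package to reinstall.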
Run `train.py`. Key parameters for the pretraining are summarized in `config.json`.
Run `transfer_learning.py`. Sample configurations are described in `config_tf_notebook.json`.
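Before launching a long run, a quick pre-flight check can catch a missing script or a malformed configuration file. The sketch below is illustrative only: `preflight` is a hypothetical helper, and it assumes the scripts and their JSON configuration files sit together in the repository root, which this README does not state explicitly.

```python
import json
from pathlib import Path


def preflight(script, config, root="."):
    """Return a list of problems found; an empty list means nothing was flagged."""
    root = Path(root)
    problems = []
    if not (root / script).is_file():
        problems.append(f"missing script: {script}")
    cfg_path = root / config
    if not cfg_path.is_file():
        problems.append(f"missing config: {config}")
    else:
        try:
            # Only checks that the file parses as JSON; keys are not validated.
            json.loads(cfg_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as err:
            problems.append(f"{config} is not valid JSON: {err}")
    return problems


if __name__ == "__main__":
    pairs = [
        ("train.py", "config.json"),
        ("transfer_learning.py", "config_tf_notebook.json"),
    ]
    for script, config in pairs:
        issues = preflight(script, config)
        print(script, "ready" if not issues else issues)
```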
Models available for benchmarking are stored in the `./benchmark/` directory.
- Run `tf_polybert.py`; the polyBERT model will be automatically downloaded from https://huggingface.co/kuelumbus/polyBERT .

- Download the TransPolymer model folder "pretrain.pt" from https://github.com/ChangwenXu98/TransPolymer/tree/master/ckpt .
- Put the folder in the directory `"./model/Trasnpolymer/"` so that it can be referred to as `"./model/Trasnpolymer/pretrain.pt"`.
- Run `tf_transpolymer.py`.
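The placement step for the TransPolymer checkpoint can be sketched as a small helper. This is an illustration rather than code from the repository: `install_checkpoint` is a hypothetical name, the destination path is copied verbatim from the instructions above (including its spelling), and the helper accepts either a folder or a single file because the download may arrive in either form.

```python
import shutil
from pathlib import Path

# Destination directory as written in the instructions above.
CKPT_DIR = Path("model") / "Trasnpolymer"


def install_checkpoint(src, repo_root="."):
    """Copy a downloaded `pretrain.pt` (folder or file) into place; return the new path."""
    src = Path(src)
    if not src.exists():
        raise FileNotFoundError(f"downloaded checkpoint not found: {src}")
    dest = Path(repo_root) / CKPT_DIR / "pretrain.pt"
    dest.parent.mkdir(parents=True, exist_ok=True)
    if src.is_dir():
        # dirs_exist_ok requires Python 3.8 or newer.
        shutil.copytree(src, dest, dirs_exist_ok=True)
    else:
        shutil.copy2(src, dest)
    return dest
```

For example, `install_checkpoint("~/Downloads/pretrain.pt")` will not expand `~` on its own; pass an absolute path or wrap the argument in `Path(...).expanduser()`.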

- Assign "gcn" or "gin" to the key "gnn_type" in `config_graph.json` to use different types of GNNs.
- Run `gnn.py`.
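Switching the GNN type can also be scripted instead of editing the file by hand. The sketch below is a hedged example: `set_gnn_type` is a hypothetical helper, and it assumes `config_graph.json` is a JSON object with "gnn_type" as a top-level key. Note that it rewrites the file with four-space indentation, so the original formatting is not preserved.

```python
import json
from pathlib import Path

# The two values named in the instructions above.
ALLOWED_GNN_TYPES = ("gcn", "gin")


def set_gnn_type(config_path, gnn_type):
    """Set "gnn_type" in a JSON config file, leaving all other keys unchanged."""
    if gnn_type not in ALLOWED_GNN_TYPES:
        raise ValueError(
            f"gnn_type must be one of {ALLOWED_GNN_TYPES}, got {gnn_type!r}"
        )
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["gnn_type"] = gnn_type
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config
```

A typical call would be `set_gnn_type("config_graph.json", "gin")`, followed by running `gnn.py`.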

- Run `morgan_nn.py` to use a neural network.
- Run `rf.py` to use random forest.
- Run `xgb.py` to use XGBoost.