In [1]:
import sys
sys.path.append('../')

from gears import PertData, GEARS


Load the data. We use the Norman dataset (`data_name = 'norman'`) as an example.

In [2]:
pert_data = PertData('./data')
pert_data.load(data_name = 'norman')
pert_data.prepare_split(split = 'simulation', seed = 1)
pert_data.get_dataloader(batch_size = 32, test_batch_size = 128)

Found local copy...
Local copy of pyg dataset is detected. Loading...
Done!
Local copy of split is detected. Loading...
Simulation split test composition:
combo_seen0:9
combo_seen1:52
combo_seen2:18
unseen_single:37
Done!
Creating dataloaders....
Done!


Create a model object. If you use [wandb](https://wandb.ai), you can track model training and evaluation by setting `weight_bias_track` to `True` and specifying a `proj_name` and `exp_name` of your choice.

In [3]:
gears_model = GEARS(pert_data, device = 'cuda:7', 
                        weight_bias_track = False, 
                        proj_name = 'pertnet', 
                        exp_name = 'pertnet')
gears_model.model_initialize(hidden_size = 64)
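The device string `'cuda:7'` above is specific to the machine the demo ran on. As a minimal sketch, assuming PyTorch is the backend, the hypothetical helper `pick_device` below (not part of GEARS) falls back to `'cpu'` when the requested GPU is not present:

```python
def pick_device(preferred="cuda:0"):
    """Return `preferred` if that CUDA device exists, otherwise 'cpu'.

    Hypothetical helper for illustration; not part of the GEARS API.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to query.
        return "cpu"
    if not preferred.startswith("cuda") or not torch.cuda.is_available():
        return "cpu"
    # 'cuda' alone means device 0; 'cuda:7' means device 7.
    index = int(preferred.split(":")[1]) if ":" in preferred else 0
    return preferred if index < torch.cuda.device_count() else "cpu"


device = pick_device("cuda:7")
print(device)
# The result can then be passed to the constructor shown above:
# gears_model = GEARS(pert_data, device=device)
```

Training on CPU is likely to be much slower, so the fallback is mainly useful for checking that the pipeline runs end to end.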

You can list the tunable parameters of `model_initialize` via:

In [4]:
gears_model.tunable_parameters()

{'hidden_size': 'hidden dimension, default 64',
 'num_go_gnn_layers': 'number of GNN layers for GO graph, default 1',
 'num_gene_gnn_layers': 'number of GNN layers for co-expression gene graph, default 1',
 'decoder_hidden_size': 'hidden dimension for gene-specific decoder, default 16',
 'num_similar_genes_go_graph': 'number of maximum similar K genes in the GO graph, default 20',
 'num_similar_genes_co_express_graph': 'number of maximum similar K genes in the co expression graph, default 20',
 'coexpress_threshold': 'pearson correlation threshold when constructing coexpression graph, default 0.4',
 'uncertainty': 'whether or not to turn on uncertainty mode, default False',
 'uncertainty_reg': 'regularization term to balance uncertainty loss and prediction loss, default 1',
 'direction_lambda': 'regularization term to balance direction loss and prediction loss, default 1'}
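Since these names are keyword arguments of `model_initialize`, several of them can be overridden at once. The sketch below checks a set of overrides against the names listed above before they are passed on. The helper `check_overrides` and the override values are illustrative assumptions, not recommended settings and not part of GEARS:

```python
# Parameter names copied from the tunable_parameters() output above.
TUNABLE = {
    "hidden_size", "num_go_gnn_layers", "num_gene_gnn_layers",
    "decoder_hidden_size", "num_similar_genes_go_graph",
    "num_similar_genes_co_express_graph", "coexpress_threshold",
    "uncertainty", "uncertainty_reg", "direction_lambda",
}


def check_overrides(overrides):
    """Return a copy of `overrides`, raising KeyError on undocumented names."""
    unknown = sorted(set(overrides) - TUNABLE)
    if unknown:
        raise KeyError(f"unknown tunable parameter(s): {unknown}")
    return dict(overrides)


# Illustrative values only.
overrides = check_overrides({"hidden_size": 128, "uncertainty": True})
print(overrides)
# With a GEARS model in scope, the overrides would be passed as keyword arguments:
# gears_model.model_initialize(**overrides)
```

Catching a misspelled name before initialization avoids a less obvious error later in the run.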

Train your model:

Note: For the sake of the demo, we train for only 1 epoch. To train the full model, set `epochs = 20`.

In [5]:
gears_model.train(epochs = 1, lr = 1e-3)

Start Training...
Epoch 1 Step 1 Train Loss: 0.5698
Epoch 1 Step 51 Train Loss: 0.4839
Epoch 1 Step 101 Train Loss: 0.4901
Epoch 1 Step 151 Train Loss: 0.4075
Epoch 1 Step 201 Train Loss: 0.5746
Epoch 1 Step 251 Train Loss: 0.4715
Epoch 1 Step 301 Train Loss: 0.4592
Epoch 1 Step 351 Train Loss: 0.4402
Epoch 1 Step 401 Train Loss: 0.5056
Epoch 1 Step 451 Train Loss: 0.4829
Epoch 1 Step 501 Train Loss: 0.3779
Epoch 1 Step 551 Train Loss: 0.5310
Epoch 1 Step 601 Train Loss: 0.4236
Epoch 1 Step 651 Train Loss: 0.3958
Epoch 1 Step 701 Train Loss: 0.4064
Epoch 1 Step 751 Train Loss: 0.4564
Epoch 1 Step 801 Train Loss: 0.5437
Epoch 1 Step 851 Train Loss: 0.4514
Epoch 1 Step 901 Train Loss: 0.3983
Epoch 1 Step 951 Train Loss: 0.3882
Epoch 1 Step 1001 Train Loss: 0.4543
Epoch 1 Step 1051 Train Loss: 0.4775
Epoch 1 Step 1101 Train Loss: 0.4316
Epoch 1 Step 1151 Train Loss: 0.4562
Epoch 1 Step 1201 Train Loss: 0.4734
Epoch 1 Step 1251 Train Loss: 0.4545
Epoch 1 Step 1301 Train Loss: 0.5209
Epoch 
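The per-step loss printed above fluctuates from step to step, so a summary over many steps is easier to read than any single line. As a rough sketch, the log lines can be parsed with a regular expression and averaged. The pattern assumes the `Epoch E Step S Train Loss: X` format shown above, and `LOG` holds only the first four lines of that output:

```python
import re

# First four lines of the training output above, used as sample input.
LOG = """\
Epoch 1 Step 1 Train Loss: 0.5698
Epoch 1 Step 51 Train Loss: 0.4839
Epoch 1 Step 101 Train Loss: 0.4901
Epoch 1 Step 151 Train Loss: 0.4075
"""

PATTERN = re.compile(r"Epoch (\d+) Step (\d+) Train Loss: ([0-9.]+)")


def parse_losses(text):
    """Return a list of (epoch, step, loss) tuples found in `text`."""
    return [(int(e), int(s), float(l)) for e, s, l in PATTERN.findall(text)]


records = parse_losses(LOG)
mean_loss = sum(loss for _, _, loss in records) / len(records)
print(f"{len(records)} records, mean loss {mean_loss:.4f}")
```

Incomplete lines, such as a log cut off mid-line, do not match the pattern and are skipped.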

Save and load pretrained models:

In [6]:
gears_model.save_model('test_model')
gears_model.load_pretrained('test_model')

Make predictions for new perturbations. Each inner list is one perturbation: a single gene or a combination of genes.

In [7]:
gears_model.predict([['FEV'], ['FEV', 'AHR']])

{'FEV': array([-1.5115363e-06,  4.4304952e-02,  1.0309354e-01, ...,
         3.3967001e+00,  7.8529231e-03,  1.0920237e-31], dtype=float32),
 'FEV_AHR': array([-2.2916190e-06,  9.7577907e-02,  1.6493453e-01, ...,
         3.2082996e+00,  7.6769367e-03,  1.7619579e-31], dtype=float32)}

The gene list is available as `gears_model.gene_list`:

In [8]:
gears_model.gene_list[:5]

['RP11-34P13.8', 'RP11-54O7.3', 'SAMD11', 'PERM1', 'HES4']
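The gene list can be used to read a single gene's value out of a prediction vector. The sketch below assumes that each prediction vector is ordered the same way as `gene_list`, which this tutorial does not state explicitly. The helper `expression_for` is hypothetical, and the numbers are toy stand-ins rather than real model output:

```python
def expression_for(predictions, gene_list, perturbation, gene):
    """Return the predicted value of `gene` under `perturbation`.

    Assumes predictions[perturbation] is ordered like gene_list.
    """
    return predictions[perturbation][gene_list.index(gene)]


# Toy stand-ins; real values would come from gears_model.predict(...)
# and gears_model.gene_list.
gene_list = ['RP11-34P13.8', 'RP11-54O7.3', 'SAMD11', 'PERM1', 'HES4']
predictions = {'FEV': [0.0, 0.04, 0.10, 3.4, 0.008]}

value = expression_for(predictions, gene_list, 'FEV', 'SAMD11')
print(value)
```

For many lookups, building a `{gene: index}` dictionary once avoids the repeated linear search done by `list.index`.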