# Tutorial

This tutorial walks through using the MatchingRep model: loading data, training, evaluating, making predictions, and allocating an organ.

In [1]:
from matching_rep import MatchingRep
import json
import numpy as np

# Load data

In [2]:
with open('./data/gmix.json', 'r', encoding='utf8') as f:
    dic = json.load(f)

data = np.array(dic['data'])
n_feature_x = dic['n_feature_x']
n_feature_o = dic['n_feature_o']

# shuffle the samples (rows) in place
np.random.shuffle(data)

# split the dataset 80/10/10 into training/validation/test sets
split = data.shape[0] // 10
train = data[:-2*split]
valid = data[-2*split:-split]
test = data[-split:]

# split each set into patient features (X), organ features (O), and outcomes (Y)
train_X = train[:, :n_feature_x]
train_O = train[:, n_feature_x:-1]
train_Y = train[:, -1]

valid_X = valid[:, :n_feature_x]
valid_O = valid[:, n_feature_x:-1]
valid_Y = valid[:, -1]

test_X = test[:, :n_feature_x]
test_O = test[:, n_feature_x:-1]
test_Y = test[:, -1]

print('number of patient features: ', n_feature_x)
print('number of organ features: ', n_feature_o)
print('sample size: ', data.shape[0])
print('training set size: ', train.shape[0])

number of patient features:  128
number of organ features:  64
sample size:  10000
training set size:  8000
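
The split above can be wrapped in a small helper. The following is a minimal sketch on synthetic data; the function name `split_dataset`, the seed, and the toy dimensions are illustrative assumptions and not part of `matching_rep`. It uses a seeded generator so the shuffle is reproducible, and it keeps the same column layout as the tutorial data: patient features first, organ features next, outcome in the last column.

```python
import numpy as np


def split_dataset(data, n_feature_x, seed=0):
    """Shuffle rows, then split 80/10/10 into train/validation/test sets.

    Each set is returned as (X, O, Y): patient features, organ features,
    and the outcome stored in the last column.
    """
    rng = np.random.default_rng(seed)   # seeded for reproducibility
    data = rng.permutation(data)        # shuffled copy; the input is untouched
    split = data.shape[0] // 10         # size of the validation and test sets

    def unpack(part):
        return part[:, :n_feature_x], part[:, n_feature_x:-1], part[:, -1]

    train = unpack(data[:-2 * split])
    valid = unpack(data[-2 * split:-split])
    test = unpack(data[-split:])
    return train, valid, test


# Toy data: 100 samples, 4 patient features, 3 organ features, 1 outcome.
toy = np.arange(100 * 8, dtype=float).reshape(100, 8)
(toy_X, toy_O, toy_Y), toy_valid, toy_test = split_dataset(toy, n_feature_x=4)
print(toy_X.shape, toy_O.shape, toy_Y.shape)
```

Passing a different `seed` gives a different but repeatable split, which helps when comparing training runs.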


# Build and Train Model

In [3]:
# initialize the model
model = MatchingRep(n_feature_x, n_feature_o, n_clusters=3)

# train the model
hist = model.fit(
    [train_X, train_O],
    train_Y,
    validation_data=([valid_X, valid_O], valid_Y),
    batch_size=256,
    epochs=80,
)

pre-training auto-encoder
pre-training clusters
Reached convergence threshold. Stopping training.
Epoch 1/5

Epoch 00001: val_loss improved from inf to 480492.50000, saving model to ./model\MatchingRepCheckpoint
Epoch 2/5

Epoch 00002: val_loss improved from 480492.50000 to 253617.78125, saving model to ./model\MatchingRepCheckpoint
Epoch 3/5

Epoch 00003: val_loss improved from 253617.78125 to 234501.76562, saving model to ./model\MatchingRepCheckpoint
Epoch 4/5

Epoch 00004: val_loss improved from 234501.76562 to 191371.23438, saving model to ./model\MatchingRepCheckpoint
Epoch 5/5

Epoch 00005: val_loss improved from 191371.23438 to 41498.11719, saving model to ./model\MatchingRepCheckpoint
Epoch 1/5

Epoch 00001: val_loss improved from 41498.11719 to 14298.95703, saving model to ./model\MatchingRepCheckpoint
Epoch 2/5

Epoch 00002: val_loss improved from 14298.95703 to 12442.97754, saving model to ./model\MatchingRepCheckpoint
Epoch 3/5

Epoch 00003: val_loss improved from 12442.97

[... output truncated ...]

Epoch 00003: val_loss did not improve from 6836.87598
Epoch 4/5

Epoch 00004: val_loss did not improve from 6836.87598
Epoch 5/5

Epoch 00005: val_loss did not improve from 6836.87598
Epoch 1/5

Epoch 00001: val_loss did not improve from 6836.87598
Epoch 2/5

Epoch 00002: val_loss did not improve from 6836.87598
Epoch 3/5

Epoch 00003: val_loss did not improve from 6836.87598
Epoch 4/5

Epoch 00004: val_loss did not improve from 6836.87598
Epoch 5/5

Epoch 00005: val_loss did not improve from 6836.87598
Epoch 1/5

Epoch 00001: val_loss did not improve from 6836.87598
Epoch 2/5

Epoch 00002: val_loss improved from 6836.87598 to 6734.70850, saving model to ./model\MatchingRepCheckpoint
Epoch 3/5

Epoch 00003: val_loss did not improve from 6734.70850
Epoch 4/5

Epoch 00004: val_loss did not improve from 6734.70850
Epoch 5/5

Epoch 00005: val_loss improved from 6734.70850 to 6684.82568, saving model to ./model\MatchingRepCheckpoint
Epoch 1/5

Epoch 00001: val_loss did not improve from 668

[... output truncated ...]

# Evaluate the model

In [4]:
# load the best model from checkpoint
model.load_weights(path='./model/MatchingRepCheckpoint')

loss = model.evaluate([test_X, test_O], test_Y)
print('mean squared error on test set: ', loss)

mean squared error on test set:  8077.89208984375
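
Assuming `model.evaluate` returns the plain mean squared error, as the printed label suggests, the value is in squared outcome units. Taking the square root of 8077.89 gives roughly 89.9 in the units of the outcome, which is often easier to interpret. The sketch below shows both quantities on toy values (not model output); `mean_squared_error` is an illustrative helper, not a function from `matching_rep`.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))


# Toy values, not model output.
y_true = np.array([100.0, 250.0, 400.0])
y_hat = np.array([110.0, 240.0, 430.0])

mse = mean_squared_error(y_true, y_hat)   # (10^2 + 10^2 + 30^2) / 3
rmse = float(np.sqrt(mse))                # same units as the outcome
print(mse, rmse)
```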


# Get predictions

In [5]:
# load the best model from checkpoint
model.load_weights(path='./model/MatchingRepCheckpoint')


# predict all potential outcomes of each patient
ys_pred = model.predict(test_X)

# predict the potential outcome of each patient-organ pair
y_pred = model.predict_y([test_X, test_O])

# predict soft clustering result of each organ
clus = model.predict_clus(test_O)
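
A soft clustering gives each organ a weight per cluster rather than a single label. The sketch below assumes, hypothetically, that `predict_clus` returns an array of shape `(n_organs, n_clusters)` whose rows are non-negative and sum to 1; this tutorial does not confirm that format, so check `matching_rep.py` before relying on it. It uses a stand-in array (`clus_demo`) rather than model output.

```python
import numpy as np

# Stand-in for `clus`: 4 organs, 3 clusters (n_clusters=3 in the tutorial).
clus_demo = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.4, 0.3],
    [0.5, 0.5, 0.0],
])

# Sanity check on the assumed format: each row is a probability vector.
assert np.allclose(clus_demo.sum(axis=1), 1.0)

hard_labels = clus_demo.argmax(axis=1)   # most likely cluster per organ
confidence = clus_demo.max(axis=1)       # weight of that cluster
cluster_sizes = np.bincount(hard_labels, minlength=clus_demo.shape[1])

print(hard_labels)     # [0 2 1 0]
print(cluster_sizes)   # [2 1 1]
```

Ties, as in the last row, resolve to the lowest cluster index because `argmax` returns the first maximum.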

# Perform Allocation

In [9]:
idx_o = 0  # index of the organ to allocate (the first organ in the test set)
idx = model.allocate_one(test_X, test_O[idx_o])

x_sel = test_X[idx].reshape(1, -1)
o_sel = test_O[idx_o].reshape(1, -1)
y_sel = model.predict_y([x_sel, o_sel])[0, 0]

print('the organ is allocated to patient ', idx)
print('the estimated survival time is: ', y_sel)

the organ is allocated to patient  1
the estimated survival time is:  1078.943
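
One simple allocation rule is to give the organ to the patient with the largest predicted outcome for that organ. The sketch below illustrates that rule only; whether `allocate_one` follows it is not stated in this tutorial. `toy_predict` is a hypothetical stand-in for `model.predict_y`, and `allocate_greedy` is an illustrative name, not part of `matching_rep`.

```python
import numpy as np


def toy_predict(inputs):
    """Hypothetical stand-in for a pairwise outcome predictor.

    Takes [X, O] with one patient-organ pair per row, scores each pair by
    the dot product of its features, and returns shape (n_pairs, 1).
    """
    X, O = inputs
    return np.sum(X * O, axis=1, keepdims=True)


def allocate_greedy(predict, X, organ):
    """Return the index of the patient with the highest predicted outcome."""
    # Repeat the organ so that it is paired with every patient.
    O = np.tile(organ, (X.shape[0], 1))
    scores = predict([X, O])[:, 0]
    return int(np.argmax(scores))


patients = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
organ = np.array([0.5, 1.0])

best = allocate_greedy(toy_predict, patients, organ)
print('the organ is allocated to patient ', best)
```

A greedy rule like this looks at one organ at a time and ignores organs that may arrive later, so treat it as a baseline for comparison.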


For more detailed instructions, see the code in matching_rep.py.