# Contrastive Learning for Predicting Cancer Prognosis Using Gene Expression Values

## Sample Model Prediction

The *PredictThroughClassifierModel.ipynb* notebook gives step-by-step instructions for predicting classifier results with the public model for each cancer.
Place this notebook in the same directory as the classifier models, inside the CL4CaPro_Models folder.

### Pick Cancer

In [2]:
Cancer = 'BLCA'

### Provide Input Patient Info
For example, put your input in *BLCA_predict_input.csv*.

In [None]:
input_pth = 'BLCA_predict_input.csv'

### Read Input and Check

In [None]:
import pandas as pd
input_df = pd.read_csv(input_pth)
input_df

### Generate contrastive learning features based on the public cancer model

#### Get model path

In [None]:
import os

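def find_clcp_folder_name(directory):
    # Return the first entry whose name starts with 'CLCP', or None if there is none
    for folder_name in os.listdir(directory):
        if folder_name.startswith('CLCP'):
            return folder_name
    return None

# Search the folder of the selected cancer, e.g. ./BLCA
directory_to_search = './{}'.format(Cancer)
clcp_folder_name = find_clcp_folder_name(directory_to_search)
if clcp_folder_name is None:
    # Stop here instead of building a model path from an error message
    raise FileNotFoundError('No CLCP folder found in {}'.format(directory_to_search))
model_pth = './{}/{}'.format(Cancer, clcp_folder_name)
clcp_folder_name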

#### Generate feature

In [None]:
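# The CLCP folder name encodes the training hyperparameters, separated by '_'.
# Values stay as strings because they are only passed to the command line below.
para = clcp_folder_name.split('_')
input_dim = para[1]
model_n_hidden_1 = para[2]
model_out_dim = para[3]
feat_dim = para[5]
lr = para[7]
l2_rate = para[9]
round = para[11]  # note: shadows the built-in round() for the rest of the notebook
seed = para[13]
batch_size = para[-3]
device = 0  # GPU index passed as --gpu_device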
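
The positional lookups above are easier to check when they are collected in one place. The following is a minimal sketch, assuming the folder name follows the underscore-separated layout implied by the indices in the cell above. The helper `parse_clcp_name`, the `FIELD_POSITIONS` table, and the sample folder name are hypothetical and not part of the released code.

```python
# Hypothetical helper: maps the token positions used in the notebook cell
# (para[1], para[2], ..., para[-3]) to descriptive names.
FIELD_POSITIONS = {
    "input_dim": 1,
    "model_n_hidden_1": 2,
    "model_out_dim": 3,
    "feat_dim": 5,
    "lr": 7,
    "l2_rate": 9,
    "round": 11,
    "seed": 13,
    "batch_size": -3,
}


def parse_clcp_name(folder_name):
    """Split a CLCP folder name on '_' and return the named tokens as strings."""
    if not folder_name.startswith("CLCP"):
        raise ValueError("not a CLCP folder name: {!r}".format(folder_name))
    tokens = folder_name.split("_")
    if len(tokens) < 14:
        raise ValueError("expected at least 14 tokens, got {}".format(len(tokens)))
    return {name: tokens[pos] for name, pos in FIELD_POSITIONS.items()}


# Made-up name used only to exercise the helper; real folder names may differ.
sample_name = "CLCP_16008_5196_2048_f_1024_lr_0.0001_l2_0.01_r_1_s_42_bs_110_x_y"
params = parse_clcp_name(sample_name)
print(params)
```

Failing early on a name that is too short avoids an `IndexError` later in the cell, and the dictionary makes it easy to print every parsed value before launching feature generation.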

In [None]:
! python GenerateFeatures_Predict.py --layer_name feat --model_in_dim {input_dim} --dim_1_list {model_n_hidden_1} \
                                     --dim_2_list {model_out_dim} --dim_3_list {feat_dim} --batch_size {batch_size} \
                                     --l2_rate {l2_rate} --seed {seed} --round {round} --gpu_device {device} \
                                     --learning_rate_list {lr} --task Risk \
                                     --cancer_group {Cancer}
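
The `!` prefix only works inside IPython or Jupyter. As a rough sketch of how the same call could be made from a plain Python script, the command can be built as an argument list and passed to `subprocess.run`. The flags are the ones used in the cell above; the helper names `build_feature_command` and `run_feature_generation`, and the example values, are hypothetical.

```python
import subprocess
import sys


def build_feature_command(params, cancer, device=0):
    """Return the argument list matching the notebook's `! python ...` line."""
    return [
        sys.executable, "GenerateFeatures_Predict.py",  # current interpreter instead of `python`
        "--layer_name", "feat",
        "--model_in_dim", str(params["input_dim"]),
        "--dim_1_list", str(params["model_n_hidden_1"]),
        "--dim_2_list", str(params["model_out_dim"]),
        "--dim_3_list", str(params["feat_dim"]),
        "--batch_size", str(params["batch_size"]),
        "--l2_rate", str(params["l2_rate"]),
        "--seed", str(params["seed"]),
        "--round", str(params["round"]),
        "--gpu_device", str(device),
        "--learning_rate_list", str(params["lr"]),
        "--task", "Risk",
        "--cancer_group", cancer,
    ]


def run_feature_generation(cmd):
    """Run the command and raise CalledProcessError if the script fails."""
    subprocess.run(cmd, check=True)


# Made-up hyperparameter values for illustration only.
example_params = {
    "input_dim": "16008", "model_n_hidden_1": "5196", "model_out_dim": "2048",
    "feat_dim": "1024", "batch_size": "110", "l2_rate": "0.01",
    "seed": "42", "round": "1", "lr": "0.0001",
}
cmd = build_feature_command(example_params, "BLCA")
print(" ".join(cmd))
# Call run_feature_generation(cmd) from the folder containing the script to execute it.
```

Passing a list rather than a single shell string avoids quoting problems, and `check=True` surfaces a failed run immediately instead of leaving a missing feature file to be discovered in the prediction step.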

#### Predict Results

In [None]:
from xgboost import XGBClassifier

# Initialize a model instance
loaded_classifier_model = XGBClassifier()

# Load the model from the file
loaded_classifier_model.load_model('./{}/classifier_model.json'.format(Cancer))

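# Load the features written by GenerateFeatures_Predict.py
predict_input_df = pd.read_csv('Features/PredictFeature_{}.txt'.format(Cancer))
# Use the columns from index 6 onward as model inputs
X = predict_input_df.iloc[:, 6:]

# Predict a class label for each input row and display the result
predictions = loaded_classifier_model.predict(X)
predictions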
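
Once the class labels are available, a quick tally helps confirm that the output looks plausible. The snippet below is a minimal sketch that counts samples per predicted label; `summarize_predictions` is a hypothetical helper, the example labels are stand-in values, and the meaning of each label is not specified here.

```python
from collections import Counter


def summarize_predictions(predictions):
    """Return {label: (count, share)} for an iterable of predicted class labels."""
    counts = Counter(int(label) for label in predictions)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in sorted(counts.items())}


# Stand-in values for illustration; in the notebook, pass the `predictions` array.
example_predictions = [0, 1, 1, 0, 1]
for label, (n, share) in summarize_predictions(example_predictions).items():
    print("class {}: {} samples ({:.0%})".format(label, n, share))
```

If class probabilities are needed instead of hard labels, `XGBClassifier` also provides `predict_proba`, which can be called on the same `X`.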