# Empowering Healthcare with Symbolic Learning and Knowledge Graph Embeddings

### Overview

Welcome! Today we will experiment with symbolic learning and KGE models such as TransH or RotatE over the Lung Cancer KG. In the Lung Cancer KG, a patient is described by medical characteristics such as smoking habit, cancer stage, mutation type, age, gender, and occurrence of relapse. The task is to predict, for a given patient, the recommended drug or the relapse condition.

#### Install prerequisites and import necessary modules

In [None]:
!git clone https://github.com/SDM-TIB/SDM-Hackathon.git

In [None]:
%%capture
!pip install -r /content/SDM-Hackathon/requirements.txt

#### Symbolic Learning Execution

In [None]:
%cd /content/SDM-Hackathon/KGE/SymbolicLearning
!python symbolic_predictions.py

#### KGE Models Execution

In [None]:
!python /content/SDM-Hackathon/KGE/kge.py --dataset_path "/content/SDM-Hackathon/KGE/KG/OriginalKG/LungCancer.tsv" --output_dir "/content/SDM-Hackathon/KGE/OriginalKG" --results_path "/content/SDM-Hackathon/KGE/OriginalKG/" --models TransH
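To give some intuition for what the TransH model trained above learns, here is a minimal NumPy sketch of the standard TransH scoring function. It is not the code used by `kge.py`; the function name `transh_score`, the toy vectors, and the dimension are assumptions made for illustration. TransH projects the head and tail embeddings onto a relation-specific hyperplane (with normal vector `w_r`) and then measures how well a translation vector `d_r` maps one projection onto the other.

```python
import numpy as np

def transh_score(h, t, w_r, d_r):
    """Illustrative TransH score for a triple (h, r, t).

    Returns the negated distance, so a higher score means a more plausible triple.
    """
    w_r = w_r / np.linalg.norm(w_r)        # normalise the hyperplane normal
    h_perp = h - np.dot(w_r, h) * w_r      # project head onto the hyperplane
    t_perp = t - np.dot(w_r, t) * w_r      # project tail onto the hyperplane
    return -np.linalg.norm(h_perp + d_r - t_perp)

# Toy embeddings (random values, purely hypothetical)
rng = np.random.default_rng(0)
dim = 8
h = rng.normal(size=dim)
w_r = rng.normal(size=dim)
w_unit = w_r / np.linalg.norm(w_r)

# Keep the translation vector inside the hyperplane, as TransH assumes
d_r = rng.normal(size=dim)
d_r = d_r - np.dot(w_unit, d_r) * w_unit

t_good = h + d_r                 # a tail that fits the relation exactly
t_bad = rng.normal(size=dim)     # an unrelated tail

score_good = transh_score(h, t_good, w_r, d_r)
score_bad = transh_score(h, t_bad, w_r, d_r)
print(score_good, score_bad)
```

The tail constructed to fit the relation scores close to zero, while the random tail receives a clearly lower (more negative) score.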

#### Perform Link Prediction (to predict the missing link, i.e., tail or head entity)

In [None]:
!python /content/SDM-Hackathon/KGE/link_prediction.py --results "/content/SDM-Hackathon/KGE/KG/OriginalKG/" --model_name "TransH" --head "3561_Patient" --relation "hasRelapse_Progression"
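Conceptually, tail prediction scores every candidate entity as the tail of the query `(head, relation, ?)` and returns the best-scoring ones. The sketch below illustrates this ranking step only; it is not the implementation in `link_prediction.py`. For brevity it uses a plain translational (TransE-style) distance instead of TransH, and the helper `rank_tails`, the candidate names, and the 2-dimensional vectors are all hypothetical.

```python
import numpy as np

def rank_tails(head_vec, rel_vec, entity_vecs, entity_names, top_k=5):
    """Rank candidate tails for (head, relation, ?) with a TransE-style score."""
    # Negated distance between (head + relation) and every candidate tail
    scores = -np.linalg.norm(head_vec + rel_vec - entity_vecs, axis=1)
    order = np.argsort(-scores)[:top_k]    # best score first
    return [(entity_names[i], float(scores[i])) for i in order]

# Hypothetical candidates and embeddings, for illustration only
entity_names = ["candidate_A", "candidate_B", "candidate_C"]
entity_vecs = np.array([[1.0, 1.0],
                        [0.0, 3.0],
                        [-2.0, 0.5]])
head_vec = np.array([0.5, 0.0])
rel_vec = np.array([0.5, 1.0])

ranking = rank_tails(head_vec, rel_vec, entity_vecs, entity_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Head prediction works the same way with the roles swapped: the tail and relation are fixed and every entity is scored as a candidate head.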

In [None]:
import pandas as pd
import json  # used in the evaluation cell below

# Load the predictions written by link_prediction.py
pred_result_path = "/content/SDM-Hackathon/KGE/KG/OriginalKG/TransH/prediction_result.csv"
pred = pd.read_csv(pred_result_path)

# Display the top-5 predictions
pred.head(5)

#### Impact of Symbolic Learning over KGE Model Evaluation

In [None]:
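def generate_dataframe(kg_name, file_path):
    """Read one results.json file and return a one-row DataFrame of tail-prediction metrics."""
    with open(file_path, 'r') as file:
        data = json.load(file)

    # Tail-prediction metrics; inverse_harmonic_mean_rank is the MRR
    hits_at_10 = data['metrics']['tail']['realistic']['hits_at_10']
    mrr = data['metrics']['tail']['realistic']['inverse_harmonic_mean_rank']

    # Use the function argument instead of relying on a global loop variable
    df = pd.DataFrame({
        'Benchmark': [kg_name],
        'Hits@10': [hits_at_10],
        'MRR': [mrr]
    })
    return df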

model_files = [
    ('OriginalKG', '/content/SDM-Hackathon/KGE/KG/OriginalKG/TransH/results.json'),
    ('EnrichedKG', '/content/SDM-Hackathon/KGE/KG/EnrichedKG/TransH/results.json'),
    ('TransformedKG', '/content/SDM-Hackathon/KGE/KG/TransformedKG/TransH/results.json')
]

dfs = []
for kg_name, file_path in model_files:
    df = generate_dataframe(kg_name, file_path)
    dfs.append(df)

# Concatenate all DataFrames into one
final_df = pd.concat(dfs, ignore_index=True)
print(final_df)
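To help read the table, the following sketch shows how Hits@10 and MRR are derived from the rank that the correct entity receives for each test triple. The ranks in `toy_ranks` are made-up numbers, not results from the Lung Cancer KG, and the helper names are assumptions; ties between scores are ignored here. The `inverse_harmonic_mean_rank` key read above is, by definition, the mean of the reciprocal ranks, which is why it is reported as MRR.

```python
import numpy as np

def hits_at_k(ranks, k=10):
    """Fraction of test triples whose correct entity is ranked within the top k."""
    ranks = np.asarray(ranks)
    return float(np.mean(ranks <= k))

def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all test triples (the inverse harmonic mean rank)."""
    ranks = np.asarray(ranks, dtype=float)
    return float(np.mean(1.0 / ranks))

# Made-up ranks of the correct tail for five test triples
toy_ranks = [1, 3, 12, 2, 50]

toy_hits = hits_at_k(toy_ranks, k=10)
toy_mrr = mean_reciprocal_rank(toy_ranks)
print(f"Hits@10 = {toy_hits:.2f}, MRR = {toy_mrr:.4f}")
```

Both metrics lie between 0 and 1, and higher is better, so a higher value for the enriched or transformed KG than for the original KG would indicate that symbolic learning helped the embedding model.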