## Data Filtering with `data_filtering.py`

This code snippet demonstrates how to call the `filter_and_save` function from the `data_filtering.py` module to process and filter your PharmAlchemy datasets.

In [None]:
import pandas as pd
from data_filtering import filter_and_save

r2d_path           = input("Path to R2D_FINAL.csv: ")
r2g_validated_path = input("Path to DrugBank_R2G_RomanValidated.csv: ")
d2g_path           = input("Path to D2G_FINAL.csv: ")
g2g_path           = input("Path to G2G_FINAL.csv: ")
output_dir         = input("Directory for filtered outputs: ")

filter_and_save(
    r2d_path,
    r2g_validated_path,
    d2g_path,
    g2g_path,
    output_dir
)
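Because the paths above come from `input()`, a typo only surfaces once `filter_and_save` tries to read the file. The following is a minimal sketch of a pre-flight check. The helper `missing_paths` is hypothetical and is not part of `data_filtering.py`.

```python
from pathlib import Path

def missing_paths(paths):
    """Return the entries of `paths` that do not point to an existing file."""
    return [p for p in paths if not Path(p).expanduser().is_file()]

# Hypothetical usage with the variables defined in the cell above:
# missing = missing_paths([r2d_path, r2g_validated_path, d2g_path, g2g_path])
# if missing:
#     raise FileNotFoundError(f"Input files not found: {missing}")
```

Running this check before the call makes a mistyped path fail immediately with a message that names the offending file.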

## Triplets Builder 

This code snippet demonstrates how to use the `triplets_builder.py` module to generate RDF‐style triplets from your filtered datasets.


In [None]:
from triplets_builder import build_triplets

r2g_csv    = input("Path to R2G_validated.csv: ")
r2d_csv    = input("Path to R2D_filtered.csv: ")
d2g_csv    = input("Path to D2G_filtered.csv: ")
g2g_csv    = input("Path to G2G_filtered.csv: ")
output_dir = input("Directory where triplets will be saved: ")

all_triplets, unique_entities, unique_predicates = build_triplets(
    r2g_csv,
    r2d_csv,
    d2g_csv,
    g2g_csv,
    output_dir
)
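The return types of `build_triplets` are not documented in this notebook. As a sketch only, assuming the triplets can be iterated as `(head, relation, tail)` tuples, a quick summary like the one below can confirm that the entity and predicate counts look plausible. The helper `summarize_triplets` and the toy data are illustrative and do not come from `triplets_builder.py`.

```python
from collections import Counter

def summarize_triplets(triplets):
    """Count distinct entities and per-predicate frequencies in (head, relation, tail) triples."""
    entities = set()
    predicate_counts = Counter()
    for head, relation, tail in triplets:
        entities.update((head, tail))
        predicate_counts[relation] += 1
    return len(entities), predicate_counts

# Toy triples for illustration only; these are not taken from the PharmAlchemy data.
toy = [
    ("drug_A", "targets", "gene_X"),
    ("drug_A", "treats", "disease_Y"),
    ("gene_X", "interacts_with", "gene_Z"),
    ("drug_B", "targets", "gene_X"),
]
n_entities, counts = summarize_triplets(toy)
print(n_entities)             # 5
print(counts.most_common(1))  # [('targets', 2)]
```

If `all_triplets` turns out to be a pandas DataFrame with three columns, `all_triplets.itertuples(index=False)` would yield rows in a form this helper accepts.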

## Neo4j Triplets Upload 

This section shows how to use the `neo4j_upload.py` module to clear your Neo4j graph and upload a new set of triplets from a CSV file.



In [None]:
import pandas as pd
import getpass
from neo4j_upload import upload_to_neo4j

csv_path = input("Enter path to triplets CSV file: ")
triplets_df = pd.read_csv(csv_path)

uri      = input("Enter Neo4j URI (e.g., neo4j+s://<host>): ")
username = input("Enter Neo4j username: ")
password = getpass.getpass("Enter Neo4j password: ")

In [None]:
upload_to_neo4j(triplets_df, uri, username, password)
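Since the upload clears the existing graph first, it may be worth checking the triplets for incomplete rows before running the cell above. The sketch below assumes the CSV has columns named `head`, `relation`, and `tail`. These names are an assumption, so adjust `required` to match the actual file. The helper `incomplete_rows` is hypothetical and is not part of `neo4j_upload.py`.

```python
def incomplete_rows(rows, required=("head", "relation", "tail")):
    """Return indices of rows where a required field is missing, empty, or NaN."""
    bad = []
    for i, row in enumerate(rows):
        for col in required:
            value = row.get(col)
            # NaN is the only value not equal to itself, so `value != value` catches it.
            if value is None or value != value or str(value).strip() == "":
                bad.append(i)
                break
    return bad

# Toy rows for illustration only.
rows = [
    {"head": "drug_A", "relation": "targets", "tail": "gene_X"},
    {"head": "drug_B", "relation": "", "tail": "gene_X"},
    {"head": "drug_C", "relation": "treats", "tail": float("nan")},
]
print(incomplete_rows(rows))  # [1, 2]
```

With the DataFrame loaded earlier, the same check could be run as `incomplete_rows(triplets_df.to_dict("records"))`.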

## GUI - Neo4j Graph Explorer

The cell below runs the `GUI_neo4j_graph_explorer.py` script, which prompts you for your filtered CSV paths, Neo4j credentials, and Groq API details before opening a Gradio UI.

In [None]:
# This will prompt you for:
#  • paths to R2G_validated.csv, R2D_filtered.csv, D2G_filtered.csv, G2G_filtered.csv
#  • your Neo4j URI, username, and password
#  • your Groq API key and model name
# Then it will launch the Gradio interface.
%run GUI_neo4j_graph_explorer.py

## Knowledge Graph Embedding Training

This section shows how to call the `train_models.py` module to train and evaluate ComplEx, TransE, and RotatE embeddings on your `all_triplets.csv` dataset.


In [None]:
import pandas as pd  # needed for the metrics comparison table below
from train_models import train_complex, train_transe, train_rotate

In [None]:
# Train ComplEx model
print("Training ComplEx...")
metrics_ce = train_complex()
print("ComplEx metrics:", metrics_ce)

# Train TransE model
print("Training TransE...")
metrics_tr = train_transe()
print("TransE metrics:", metrics_tr)

# Train RotatE model
print("Training RotatE...")
metrics_rt = train_rotate()
print("RotatE metrics:", metrics_rt)

# Compare all metrics
df_metrics = pd.DataFrame(
    [metrics_ce, metrics_tr, metrics_rt],
    index=['ComplEx','TransE','RotatE']
)
df_metrics.rename(columns={
    'mrr':'MRR', 
    'hits1':'Hits@1', 
    'hits10':'Hits@10', 
    'hits50':'Hits@50', 
    'hits100':'Hits@100'
}, inplace=True)

# Display the comparison table (a notebook renders the last expression in a cell)
df_metrics
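For reference, MRR and Hits@k are usually computed from the rank assigned to each correct entity on the test set. The sketch below uses those standard definitions. How `train_models.py` computes them (for example, filtered versus raw ranking) is not shown in this notebook, so treat this as an illustration of the metrics and not of the module's implementation.

```python
def mrr(ranks):
    """Mean reciprocal rank: average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of test triples whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Toy ranks (1 = best) for five hypothetical test triples.
ranks = [1, 2, 4, 10, 100]
print(round(mrr(ranks), 3))   # 0.372
print(hits_at_k(ranks, 1))    # 0.2
print(hits_at_k(ranks, 10))   # 0.8
```

Both metrics lie between 0 and 1, and higher is better. MRR rewards placing the correct entity near the very top, while Hits@k only asks whether it falls within the first k candidates.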


## GUI - KG Link-Prediction

The cell below runs the `GUI_KG_link_prediction.py` script, which prompts you for your triplets CSV and trained model paths, then opens the Gradio interface.


In [None]:
# This will prompt you for:
#  • Path to your triplets CSV (e.g., all_triplets.csv)
#  • Paths to ComplEx.pkl, RotatE.pkl, TransE.pkl
# and then launch the Gradio GUI.
%run GUI_KG_link_prediction.py