## TTD

The Therapeutic Target Database (TTD) is a manually curated database of drug targets and their corresponding diseases. More information is available on the [TTD website](https://db.idrblab.net/ttd/), and all data files can be downloaded from the [full data download page](https://db.idrblab.net/ttd/full-data-download).

### Step 1: Reformat the data files to the BioMedGPS format

Go to each of the subfolders `biomarker-disease`, `drug-disease`, `drug-target`, `target-disease`, and `target-pathway`, and reformat all data files in each one.

Each subfolder contains a `main.ipynb` notebook. Open it and run the code to convert the data files to the BioMedGPS format.
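
Before moving on to Step 2, it can help to confirm that every notebook produced its output. The sketch below is not part of the original workflow: it assumes the output file names that Step 2 reads, and `find_missing` is a hypothetical helper introduced only for illustration.

```python
from pathlib import Path

# Output files the Step 1 notebooks are expected to write
# (names taken from the file list used in Step 2).
EXPECTED_FILES = [
    "biomarker-disease/processed_ttd_biomarker_disease.tsv",
    "drug-disease/processed_ttd_drug_disease.tsv",
    "drug-target/processed_ttd_drug_target.tsv",
    "target-disease/processed_ttd_target_disease.tsv",
    "target-pathway/processed_ttd_target_keggpathway.tsv",
    "target-pathway/processed_ttd_target_wikipathway.tsv",
]


def find_missing(base_dir, expected=EXPECTED_FILES):
    """Return the expected files that do not exist under base_dir (hypothetical helper)."""
    base = Path(base_dir)
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    missing = find_missing(".")
    if missing:
        print("Missing files; rerun the notebooks that produce them:")
        for name in missing:
            print(f"  {name}")
    else:
        print("All Step 1 outputs are present.")
```

Run it from the folder that contains the five subfolders; an empty result means Step 2 should find all of its inputs.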

### Step 2: Merge all formatted data files into one file


In [2]:
import os
import os.path as osp
import subprocess


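def format_ttd(filename):
    """Run graph-builder on relations/ttd/<filename> under the project root."""

    def get_project_root():
        # The project root is taken to be two levels above the current working
        # directory, so this notebook is expected to run from <root>/relations/ttd.
        try:
            return osp.dirname(osp.dirname(os.getcwd()))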
        except Exception as e:
            raise RuntimeError(f"Failed to determine project root: {e}")

    try:
        root_dir = get_project_root()
        print(f"Project root directory: {root_dir}")
    except RuntimeError as e:
        print(e)
        exit(1)

    database = "customdb"
    relations_path = osp.join(
        root_dir,
        "relations",
        "ttd",
        filename,
    )
    output_dir = osp.join(root_dir, "formatted_relations", "ttd")
    entities_path = osp.join(root_dir, "entities.tsv")
    log_file = osp.join(output_dir, "log.txt")

    command = [
        "graph-builder",
        "--database",
        database,
        "-d",
        relations_path,
        "-o",
        output_dir,
        "-f",
        entities_path,
        "-n",
        "20",
        "--download",
        "--skip",
        "-l",
        log_file,
        "--debug",
    ]

    print("Executing command:", " ".join(command))

    try:
        subprocess.run(command, check=True)
    except FileNotFoundError:
        print(
            "Error: 'graph-builder' command not found. Make sure it is installed and available in the PATH."
        )
        exit(1)
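    except subprocess.CalledProcessError as e:
        # The command's output is not captured here, so e.output would be None;
        # point to the log file passed via -l instead.
        print(f"Error: Command execution failed with return code {e.returncode}")
        print(f"See the log file for details: {log_file}")
        exit(1)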
    except Exception as e:
        print(f"Unexpected error: {e}")
        exit(1)
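
The `" ".join(command)` printout in the cell above is for display only. `subprocess.run` receives the argument list directly, so no quoting is needed for execution, but a printed line containing a path with spaces could not be pasted back into a shell as-is. A minimal sketch of a shell-safe printout using the standard library's `shlex.join` (Python 3.8+); the paths here are made up for illustration:

```python
import shlex

# Hypothetical argument list; the input path deliberately contains a space.
command = [
    "graph-builder",
    "--database", "customdb",
    "-d", "/data/my project/relations/ttd/formatted_ttd.tsv",
    "-o", "/data/out",
]

plain = " ".join(command)      # ambiguous once a path contains a space
quoted = shlex.join(command)   # quotes arguments where the shell needs it

print(plain)
print(quoted)

# Splitting the quoted string recovers the original argument list.
assert shlex.split(quoted) == command
```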

In [2]:
import os
import pandas as pd

files = [
    "./biomarker-disease/processed_ttd_biomarker_disease.tsv",
    "./drug-disease/processed_ttd_drug_disease.tsv",
    "./drug-target/processed_ttd_drug_target.tsv",
    "./target-disease/processed_ttd_target_disease.tsv",
    "./target-pathway/processed_ttd_target_keggpathway.tsv",
    "./target-pathway/processed_ttd_target_wikipathway.tsv",
]

frames = []
for file in files:
    if not os.path.exists(file):
        raise FileNotFoundError(f"File {file} does not exist")

    d = pd.read_csv(file, sep="\t")
    # Drop the TTD-specific columns (those prefixed with "ttd_") before merging.
    d = d.drop(columns=[c for c in d.columns if c.startswith("ttd_")])
    frames.append(d)

# Concatenate once instead of growing a DataFrame inside the loop.
merged = pd.concat(frames, ignore_index=True)
merged.to_csv("formatted_ttd.tsv", sep="\t", index=False)
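
To see what the merge does, here is a small self-contained sketch on in-memory tables. The column names (`source_id`, `target_id`, `relation_type`, `ttd_note`, `ttd_extra`) and all values are made up for illustration and are not the real BioMedGPS schema.

```python
import io

import pandas as pd

# Two toy TSV "files"; every column name and value here is hypothetical.
drug_target_tsv = (
    "source_id\ttarget_id\trelation_type\tttd_note\n"
    "D1\tT1\tdrug-target\tx\n"
)
target_disease_tsv = (
    "source_id\ttarget_id\trelation_type\tttd_extra\n"
    "T1\tX1\ttarget-disease\ty\n"
    "T2\tX2\ttarget-disease\tz\n"
)

frames = []
for text in (drug_target_tsv, target_disease_tsv):
    d = pd.read_csv(io.StringIO(text), sep="\t")
    # Same rule as the merge cell: drop every column prefixed with "ttd_".
    d = d.drop(columns=[c for c in d.columns if c.startswith("ttd_")])
    frames.append(d)

merged = pd.concat(frames, ignore_index=True)
print(merged.shape)          # (3, 3)
print(list(merged.columns))  # ['source_id', 'target_id', 'relation_type']
```

If the input files do not share the same remaining columns, `pd.concat` fills the gaps with `NaN`, so checking `merged.isna().sum()` on the real data is a cheap sanity check.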

In [4]:
format_ttd("formatted_ttd.tsv")

Project root directory: /Users/jy006/Documents/Code/BioMedGPS/biomedgps-data/graph_data
Executing command: graph-builder --database customdb -d /Users/jy006/Documents/Code/BioMedGPS/biomedgps-data/graph_data/relations/ttd/formatted_ttd.tsv -o /Users/jy006/Documents/Code/BioMedGPS/biomedgps-data/graph_data/formatted_relations/ttd -f /Users/jy006/Documents/Code/BioMedGPS/biomedgps-data/graph_data/entities.tsv -n 20 --download --skip -l /Users/jy006/Documents/Code/BioMedGPS/biomedgps-data/graph_data/formatted_relations/ttd/log.txt --debug


2024-11-12 16:57:38 - cli:159 - INFO - Run jobs with (output_dir: /Users/jy006/Documents/Code/BioMedGPS/biomedgps-data/graph_data/formatted_relations/ttd, db file/directory: /Users/jy006/Documents/Code/BioMedGPS/biomedgps-data/graph_data/relations/ttd/formatted_ttd.tsv, databases: ('customdb',), download: True, skip: True)
2024-11-12 16:57:40 - customdb_parser:95 - INFO - Get 63796 relations
2024-11-12 16:57:40 - base_parser:475 - INFO - Found 63796 relations.
2024-11-12 16:57:40 - base_parser:783 - INFO - Start to get entity id map.
2024-11-12 16:57:48 - base_parser:817 - INFO - The number of deduped entity type ids: 28528
2024-11-12 17:01:15 - base_parser:827 - INFO - The number of entity ids: 28528
2024-11-12 17:01:15 - base_parser:477 - INFO - Found 28528 entity ids in entity id map.
2024-11-12 17:01:16 - base_parser:491 - INFO - The number of relations before dropna: 63796
2024-11-12 17:01:16 - base_parser:493 - INFO - The number of relations after dropna: 63796
2024-11-12 17:01:1