# Step 1: Download External Datasets (DAVIS and KIBA, via DeepDTA)

This notebook is part of the generalizability study. Here, we download and extract the DeepDTA repository, which bundles the DAVIS and KIBA benchmark datasets, so that our trained model can be evaluated on data it did not see during training.

- **DAVIS** and **KIBA** are widely used benchmarks for drug-target interaction prediction.
- The **DeepDTA** repository contains both datasets along with useful scripts.

The next steps will preprocess these datasets and evaluate the model's generalizability.

In [1]:
# step1_download_davis_kiba.py
import os
import urllib.request
import zipfile

In [2]:

def download_deepdta_datasets(base_dir="data/external"):
    """Download the DeepDTA repository archive and extract it under base_dir."""
    os.makedirs(base_dir, exist_ok=True)
    url = "https://github.com/hkmztrk/DeepDTA/archive/refs/heads/master.zip"
    zip_path = os.path.join(base_dir, "deepdta.zip")
    # GitHub branch archives unpack into a "<repo>-<branch>" top-level folder.
    extract_path = os.path.join(base_dir, "DeepDTA-master")

    print("Downloading DeepDTA datasets...")
    urllib.request.urlretrieve(url, zip_path)

    print("Extracting...")
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall(base_dir)

    print("Done. Files available in:", extract_path)
    return extract_path

if __name__ == "__main__":
    download_deepdta_datasets()


Downloading DeepDTA datasets...
Extracting...
Done. Files available in: data/external\DeepDTA-master
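
Before moving on to preprocessing, it can help to confirm that the archive is intact and that the dataset folders are where we expect them. The cell below is a minimal sketch and not part of the original pipeline: `verify_archive` and `find_dataset_dirs` are hypothetical helper names, and the code only assumes that folders named `davis` and `kiba` exist somewhere under the extracted tree. It does not rely on any particular file layout inside them.

```python
import os
import zipfile


def verify_archive(zip_path):
    """Return True if zip_path is a readable ZIP with no corrupt members."""
    if not zipfile.is_zipfile(zip_path):
        return False
    with zipfile.ZipFile(zip_path, "r") as zf:
        # testzip() returns the name of the first bad member, or None if all pass.
        return zf.testzip() is None


def find_dataset_dirs(root, names=("davis", "kiba")):
    """Map each dataset name to the first directory under root with that name.

    Matching is case-insensitive; names with no matching folder are omitted.
    The folder names are an assumption about the repository layout.
    """
    found = {}
    for dirpath, dirnames, _ in os.walk(root):
        for d in dirnames:
            key = d.lower()
            if key in names and key not in found:
                found[key] = os.path.join(dirpath, d)
    return found


if __name__ == "__main__":
    base_dir = "data/external"
    zip_path = os.path.join(base_dir, "deepdta.zip")
    # Only run the checks if the download step has already produced the archive.
    if os.path.exists(zip_path):
        print("Archive OK:", verify_archive(zip_path))
        extracted = os.path.join(base_dir, "DeepDTA-master")
        for name, path in find_dataset_dirs(extracted).items():
            print(name, "->", path)
```

If `find_dataset_dirs` returns an empty dictionary, the extraction most likely failed or the repository layout differs from what is assumed here, and the paths should be checked by hand before the preprocessing step.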
