# Extract drug targets of significant drugs
This script filters a drug–protein interaction dataset to retrieve the targets of a predefined list of significant drugs. It outputs a two-column DataFrame mapping each significant drug to its known protein targets, with one row per drug–target pair.

!!! Run the notebook "extract_significant_drugs.ipynb" before this one: it generates the significant-drug files that the function below takes as input.
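
Before running the cells below, it can help to confirm that the prerequisite files are actually present. The following is a minimal sketch, not part of the original notebook: `missing_inputs` is a hypothetical helper, and the two paths are taken from the cells further down (the default DPI path and the first significant-drug file).

```python
from pathlib import Path

def missing_inputs(paths):
    """Return the subset of `paths` that do not exist on disk (hypothetical helper)."""
    return [str(p) for p in map(Path, paths) if not p.exists()]

# Paths copied from the cells below; adjust if your layout differs.
required = [
    "../data/networks/combined_DPI.csv",
    "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_whole_1000_step1.csv",
]

missing = missing_inputs(required)
if missing:
    print("Missing inputs (has extract_significant_drugs.ipynb been run?):")
    for p in missing:
        print("  ", p)
```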

In [2]:
import pandas as pd

## Function to extract and save targets of significant drugs (0.01 significance)

In [3]:
def extract_and_save_significant_targets(step, significant_res_path, output_path, combined_dpi_path="../data/networks/combined_DPI.csv"):
    """Filter the combined DPI table to the significant drugs of `step` and save the result as CSV."""
    print(f"\nExtracting significant targets for {step} ...\n")

    # Load the significant-drug list and the drug–protein interaction (DPI) table
    significant_df = pd.read_csv(significant_res_path, index_col=0)
    combined_dpi = pd.read_csv(combined_dpi_path)

    # Keep only DPI rows whose drug is significant; sort by drug, then target, for readability
    significant_drugs = significant_df['drug'].unique()
    significant_dpi = combined_dpi[combined_dpi['Drug_Name'].isin(significant_drugs)]
    significant_dpi = significant_dpi.sort_values(by=['Drug_Name', 'Drug_Target']).reset_index(drop=True)

    # Print summary information (only drugs with at least one DPI entry are counted)
    print(f"Number of unique significant drugs for {step}: {len(significant_dpi['Drug_Name'].unique())}")
    print(f"Number of unique drug targets for {step}: {len(significant_dpi['Drug_Target'].unique())}")

    # Save to CSV
    significant_dpi.to_csv(output_path, index=False)
    print(f"Significant targets extracted for {step} and saved to: {output_path}")
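
To see what the filtering step does, here is a small sketch on made-up data. The column names `drug`, `Drug_Name` and `Drug_Target` are the ones the function uses; every value in the toy tables is invented for illustration. It also shows one consequence of the filter: a significant drug with no entry in the DPI table silently drops out, so the drug count printed by the function may be smaller than the number of significant drugs.

```python
import pandas as pd

# Toy stand-ins for the two input tables (all values are made up).
toy_significant = pd.DataFrame({"drug": ["drugA", "drugB", "drugC"]})
toy_dpi = pd.DataFrame({
    "Drug_Name":   ["drugB", "drugA", "drugA", "drugX"],
    "Drug_Target": ["P2", "P3", "P1", "P9"],
})

# Same filter and sort as in the function above.
significant_drugs = toy_significant["drug"].unique()
toy_result = (
    toy_dpi[toy_dpi["Drug_Name"].isin(significant_drugs)]
    .sort_values(by=["Drug_Name", "Drug_Target"])
    .reset_index(drop=True)
)
print(toy_result)

# Significant drugs without any row in the DPI table are absent from the result.
drugs_without_targets = sorted(set(significant_drugs) - set(toy_result["Drug_Name"]))
print("Significant drugs without a known target:", drugs_without_targets)
```

In this toy case `drugX` is removed because it is not significant, and `drugC` is reported as having no known target.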

## Extract targets of significant drugs for whole gene list (1000 iterations)

In [6]:
print("================== FOR WHOLE GENE LISTS (1000 ITERATIONS) ==================")

# Define input paths (significant drugs) and output paths (significant targets)
steps = {
    "step 1 of differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_whole_1000_step1.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_whole_1000_step1.csv"
    },
    "step 2 of differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_whole_1000_step2.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_whole_1000_step2.csv"
    },
    "step 3 of differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_whole_1000_step3.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_whole_1000_step3.csv"
    },
    "full differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_whole_1000_full_diff.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_whole_1000_full_diff.csv"
    }
}

# Extract and save the targets of significant drugs for each step
for step, paths in steps.items():
    extract_and_save_significant_targets(step=step,
                                         significant_res_path=paths["significant_res_path"],
                                         output_path=paths["output_path"],
                                         combined_dpi_path="../data/networks/combined_DPI.csv"
                                         )


Extracting significant targets for step 1 of differentiation ...

Number of unique significant drugs for step 1 of differentiation: 641
Number of unique drug targets for step 1 of differentiation: 974
Significant targets extracted for step 1 of differentiation and saved to: ../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_whole_1000_step1.csv

Extracting significant targets for step 2 of differentiation ...

Number of unique significant drugs for step 2 of differentiation: 779
Number of unique drug targets for step 2 of differentiation: 1213
Significant targets extracted for step 2 of differentiation and saved to: ../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_whole_1000_step2.csv

Extracting significant targets for step 3 of differentiation ...

Number of unique significant drugs for step 3 of differentiation: 636
Number of unique drug targets for step 3 of differentiation: 1086
Significant targets extrac

## Extract targets of significant drugs for key gene list (100 iterations)

In [24]:
print("================== FOR KEY GENE LISTS (100 ITERATIONS) ==================")

# Define input paths (significant drugs) and output paths (significant targets)
steps = {
    "step 1 of differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_key_100_step1.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_100_step1.csv"
    },
    "step 2 of differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_key_100_step2.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_100_step2.csv"
    },
    "step 3 of differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_key_100_step3.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_100_step3.csv"
    },
    "full differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_key_100_full_diff.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_100_full_diff.csv"
    }
}

# Extract and save the targets of significant drugs for each step
for step, paths in steps.items():
    extract_and_save_significant_targets(step=step,
                                         significant_res_path=paths["significant_res_path"],
                                         output_path=paths["output_path"],
                                         combined_dpi_path="../data/networks/combined_DPI.csv"
                                         )


Extracting significant targets for step 1 of differentiation ...

Number of unique significant drugs for step 1 of differentiation: 322
Number of unique drug targets for step 1 of differentiation: 187
Significant targets extracted for step 1 of differentiation and saved to: ../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_100_step1.csv

Extracting significant targets for step 2 of differentiation ...

Number of unique significant drugs for step 2 of differentiation: 423
Number of unique drug targets for step 2 of differentiation: 301
Significant targets extracted for step 2 of differentiation and saved to: ../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_100_step2.csv

Extracting significant targets for step 3 of differentiation ...

Number of unique significant drugs for step 3 of differentiation: 320
Number of unique drug targets for step 3 of differentiation: 194
Significant targets extracted for 

## Extract targets of significant drugs for key gene list (1000 iterations)

In [4]:
print("================== FOR KEY GENE LISTS (1000 ITERATIONS) ==================")

# Define input paths (significant drugs) and output paths (significant targets)
steps = {
    "step 1 of differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_key_1000_step1.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_1000_step1.csv"
    },
    "step 2 of differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_key_1000_step2.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_1000_step2.csv"
    },
    "step 3 of differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_key_1000_step3.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_1000_step3.csv"
    },
    "full differentiation": {
        "significant_res_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_drugs_key_1000_full_diff.csv",
        "output_path": "../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_1000_full_diff.csv"
    }
}

# Extract and save the targets of significant drugs for each step
for step, paths in steps.items():
    extract_and_save_significant_targets(step=step,
                                         significant_res_path=paths["significant_res_path"],
                                         output_path=paths["output_path"],
                                         combined_dpi_path="../data/networks/combined_DPI.csv"
                                         )


Extracting significant targets for step 1 of differentiation ...

Number of unique significant drugs for step 1 of differentiation: 335
Number of unique drug targets for step 1 of differentiation: 227
Significant targets extracted for step 1 of differentiation and saved to: ../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_1000_step1.csv

Extracting significant targets for step 2 of differentiation ...

Number of unique significant drugs for step 2 of differentiation: 450
Number of unique drug targets for step 2 of differentiation: 499
Significant targets extracted for step 2 of differentiation and saved to: ../results/humanPVATsn/network_analysis/proximity_significant_drugs/significant_targets_key_1000_step2.csv

Extracting significant targets for step 3 of differentiation ...

Number of unique significant drugs for step 3 of differentiation: 314
Number of unique drug targets for step 3 of differentiation: 179
Significant targets extracted fo
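
The three `steps` dictionaries above differ only in the gene-list label (`whole` or `key`) and the iteration count (`100` or `1000`). One possible way to avoid repeating them is to generate the paths from those two values. This is a sketch under that assumption, not the notebook's own code: `BASE_DIR`, `STEP_SUFFIXES` and `build_steps` are hypothetical names, and the file-name pattern is inferred from the paths listed above.

```python
BASE_DIR = "../results/humanPVATsn/network_analysis/proximity_significant_drugs"

# Step label (as printed in the output) -> suffix used in the file names.
STEP_SUFFIXES = {
    "step 1 of differentiation": "step1",
    "step 2 of differentiation": "step2",
    "step 3 of differentiation": "step3",
    "full differentiation": "full_diff",
}

def build_steps(gene_list, n_iter, base_dir=BASE_DIR):
    """Build the per-step input/output path dictionary for one configuration."""
    return {
        label: {
            "significant_res_path": f"{base_dir}/significant_drugs_{gene_list}_{n_iter}_{suffix}.csv",
            "output_path": f"{base_dir}/significant_targets_{gene_list}_{n_iter}_{suffix}.csv",
        }
        for label, suffix in STEP_SUFFIXES.items()
    }

# Example: reproduce the dictionary for the key gene list with 1000 iterations.
steps = build_steps("key", 1000)
print(steps["full differentiation"]["output_path"])
```

The resulting dictionary can be passed to the same `for step, paths in steps.items()` loop used in each section.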