## Parser for Table of Pharmacogenomic Biomarkers in Drug Labeling
* Source table: https://www.fda.gov/drugs/science-and-research-drugs/table-pharmacogenomic-biomarkers-drug-labeling
* Last downloaded: 06/25/2025
* Content current as of: 09/23/2024
* Additional information: 

In [1]:
## To do list:
## Add

In [2]:
## Load necessary packages
import os
import pandas as pd
import glob
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

## Define the version number and deployment date
version_number = "07_14_2025"
deployment_date = "2025-07-14"
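The two values above encode the same date in two formats. As a minimal sketch, assuming you want to keep them in sync, both can be derived from a single `datetime.date`. The name `release_date` is hypothetical and not part of the notebook.

```python
from datetime import date

# Hypothetical single source of truth for the release date (assumption).
release_date = date(2025, 7, 14)

# Reproduce the two hand-written formats used in the cell above.
version_number = release_date.strftime("%m_%d_%Y")  # month_day_year
deployment_date = release_date.isoformat()          # year-month-day

print(version_number, deployment_date)
```

Changing `release_date` then updates both the output file names and the deployment date together.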

In [3]:
## Load the Biolink category and predicate dictionary for mapping subject, object, and predicate types
%run ./Biolink_category_and_predication_dictionary.ipynb

Date of last update:  2025-07-08
Order is to always process Node/category map first, since the Edge/predicate map depends on biolink-compliant node values
-----------------------------------------------------------------------------------------------------------------------------
Dictionary: category_map, Key template: Subject_category or Object_category
------------------------------------------------------------------------------------------
Dictionary: predicate_map, Key template: (Subject_category, Object_category, Predicate)
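The key templates printed above suggest a two-step lookup: node types are mapped to Biolink categories first, and the mapped categories then form part of the predicate key. The sketch below illustrates that order with hypothetical entries. The real keys and values are defined in `Biolink_category_and_predication_dictionary.ipynb` and may differ, and `map_edge` is an illustrative helper, not a function from the notebook.

```python
# Hypothetical stand-ins for the dictionaries loaded by the %run cell (assumption).
category_map = {
    "Drug": "biolink:Drug",
    "Gene": "biolink:Gene",
}
predicate_map = {
    # Key template: (Subject_category, Object_category, Predicate)
    ("biolink:Drug", "biolink:Gene", "associated_with"): "biolink:associated_with",
}


def map_edge(subject_type, object_type, predicate):
    """Map node types first, then look up the predicate with the mapped categories."""
    subject_category = category_map[subject_type]
    object_category = category_map[object_type]
    biolink_predicate = predicate_map[(subject_category, object_category, predicate)]
    return subject_category, biolink_predicate, object_category


print(map_edge("Drug", "Gene", "associated_with"))
```

A missing key raises `KeyError`, which makes unmapped categories or predicates visible early instead of letting them pass silently into the edge file.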


## Load files and convert them into separate node & edge files
* Check the structure of all imported files

In [4]:
## Note: change the file paths below to match your own environment
raw_files_path = '/Users/Weiqi0/ISB_working/Ilya_lab/Translator/Pharmagenomics_KG/files/FDA_Pharmacogenomic_biomarkers_in_Drug_labeling/'

## Define the output path for node & edge files after formatting
download_path_node_file = f'/Users/Weiqi0/ISB_working/Ilya_lab/Translator/Pharmagenomics_KG/files/parsed/FDA_Pharmacogenomic_biomarkers_parsed_node_{version_number}.tsv'
download_path_edge_file = f'/Users/Weiqi0/ISB_working/Ilya_lab/Translator/Pharmagenomics_KG/files/parsed/FDA_Pharmacogenomic_biomarkers_parsed_edge_{version_number}.tsv'
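Because the absolute paths above are specific to one machine, a single configurable base directory can make the notebook easier to move. The following is only a sketch using `pathlib`. The names `base_dir`, `raw_files_dir`, `parsed_dir`, `node_file`, and `edge_file` are hypothetical, and the relative `files` directory is an assumption standing in for your own location.

```python
from pathlib import Path

version_number = "07_14_2025"

# Hypothetical project directory; replace with your own location (assumption).
base_dir = Path("files")

# Input and output folders mirror the folder names used in the cell above.
raw_files_dir = base_dir / "FDA_Pharmacogenomic_biomarkers_in_Drug_labeling"
parsed_dir = base_dir / "parsed"

# Versioned output files for nodes and edges.
node_file = parsed_dir / f"FDA_Pharmacogenomic_biomarkers_parsed_node_{version_number}.tsv"
edge_file = parsed_dir / f"FDA_Pharmacogenomic_biomarkers_parsed_edge_{version_number}.tsv"

print(node_file.name)
print(edge_file.name)
```

With this layout only `base_dir` needs editing when the notebook runs on another machine.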

In [5]:
## List the CSV files found in the raw files directory

for f in os.listdir(raw_files_path):
    if f.endswith('.csv'):
        print(f)

Table_of_Pharmacogenomic_Biomarkers_in_Drug_Labeling_FDA.csv


In [7]:
## Read the source CSV file into a DataFrame
source_df = pd.read_csv(raw_files_path + 'Table_of_Pharmacogenomic_Biomarkers_in_Drug_Labeling_FDA.csv')
source_df.head(10)

Unnamed: 0,Drug,Therapeutic Area*,Biomarker†,Labeling Sections
0,Articaine and Epinephrine (1),Anesthesiology,G6PD,Warnings and Precautions
1,Articaine and Epinephrine (2),Anesthesiology,Nonspecific (Congenital Methemoglobinemia),Warnings and Precautions
2,Bupivacaine (1),Anesthesiology,G6PD,Warnings
3,Bupivacaine (2),Anesthesiology,Nonspecific (Congenital Methemoglobinemia),Warnings
4,Chloroprocaine (1),Anesthesiology,G6PD,Warnings
5,Chloroprocaine (2),Anesthesiology,Nonspecific (Congenital Methemoglobinemia),Warnings
6,Codeine,Anesthesiology,CYP2D6,"Boxed Warning, Warnings and Precautions, Use i..."
7,Desflurane,Anesthesiology,"CACNA1S, RYR1 (Genetic Susceptibility to Mali...","Contraindications, Warnings and Precautions, C..."
8,Isoflurane,Anesthesiology,"CACNA1S, RYR1 (Genetic Susceptibility to Mali...","Contraindications, Warnings, Clinical Pharmaco..."
9,Lidocaine and Prilocaine (1),Anesthesiology,Nonspecific (Congenital Methemoglobinemia),Warnings and Precautions
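In the preview above, several drug names carry a trailing number in parentheses, such as `Bupivacaine (1)` and `Bupivacaine (2)`, where the same drug appears with more than one biomarker. A minimal sketch for recovering the plain drug name follows. It assumes the trailing `(n)` only numbers repeated rows, and `strip_row_index` is a hypothetical helper, not part of the notebook.

```python
import re

# A few labels copied from the preview above.
drug_labels = [
    "Articaine and Epinephrine (1)",
    "Articaine and Epinephrine (2)",
    "Bupivacaine (1)",
    "Codeine",
]

# Assumption: a trailing "(n)" is a row counter, not part of the drug name.
ROW_INDEX_SUFFIX = re.compile(r"\s*\(\d+\)$")


def strip_row_index(label):
    """Remove a trailing '(n)' counter from a drug label, if present."""
    return ROW_INDEX_SUFFIX.sub("", label)


drug_names = [strip_row_index(label) for label in drug_labels]
print(drug_names)
```

On the DataFrame itself, the same pattern could be passed to `source_df['Drug'].str.replace(..., regex=True)`. Parentheses inside biomarker names, such as `Nonspecific (Congenital Methemoglobinemia)`, are not affected because the pattern only matches digits at the end of the string.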


In [11]:
## Check the unique drug labels and biomarkers in the source table
unique_node_type_values = source_df['Drug'].unique()
print("All possible drug types are here: ", unique_node_type_values)
print("--------------------------------------------------------------------------")
print(len(unique_node_type_values))

unique_node_source_values = source_df['Biomarker†'].unique()
print("All possible biomarker are here: ", unique_node_source_values)
print("--------------------------------------------------------------------------")
print(len(unique_node_source_values))

All possible drug types are here:  ['Articaine and Epinephrine (1)' 'Articaine and Epinephrine (2)'
 'Bupivacaine (1)' 'Bupivacaine (2)' 'Chloroprocaine (1)'
 'Chloroprocaine (2)' 'Codeine' 'Desflurane' 'Isoflurane'
 'Lidocaine and Prilocaine (1)' 'Lidocaine and Prilocaine (2)'
 'Lidocaine and Tetracaine (1)' 'Lidocaine and Tetracaine (2)'
 'Lofexidine' 'Meloxicam' 'Mepivacaine (1)' 'Mepivacaine (2)' 'Mivacurium'
 'Oliceridine' 'Oxymetazoline and Tetracaine (1)'
 'Oxymetazoline and Tetracaine (2)' 'Ropivacaine (1)' 'Ropivacaine (2)'
 'Sevoflurane' 'Succinylcholine (1)' 'Succinylcholine (2)' 'Tramadol'
 'Carvedilol' 'Clopidogrel' 'Hydralazine' 'Isosorbide Dinitrate'
 'Isosorbide Mononitrate' 'Mavacamten' 'Metoprolol' 'Nebivolol'
 'Prasugrel (1)' 'Prasugrel (2)' 'Prasugrel (3)' 'Prasugrel (4)'
 'Procainamide' 'Propafenone' 'Propranolol' 'Quinidine' 'Rivaroxaban'
 'Tafamidis' 'Ticagrelor' 'Cevimeline' 'Abrocitinib' 'Dapsone (1)'
 'Dapsone (2)' 'Fluorouracil (1)' 'Ustekinumab' 'Ascorbic Ac