# RxNav data collection

Links:
* API description: https://rxnav.nlm.nih.gov/InteractionAPIs.html#
* Example JSON response from the REST API: https://rxnav.nlm.nih.gov/REST/interaction/interaction.json?rxcui=341248
  * Same request against RxNav-in-a-Box (Docker): http://localhost:4000/REST/interaction/interaction.json?rxcui=341248
* Where to find list of drugs from DrugBank: https://go.drugbank.com/releases/5-1-8/downloads/all-drugbank-vocabulary
* Where to find RXCUI: https://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormfiles.html
  * RXNCONSO description: https://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/RXNORM/sourcerepresentation.html#file1
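
The public and local interaction URLs above differ only in the host, so one small helper can build either. This is a minimal sketch: `interaction_url`, `PUBLIC_BASE` and `LOCAL_BASE` are hypothetical names introduced here, and the code only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

PUBLIC_BASE = "https://rxnav.nlm.nih.gov"
LOCAL_BASE = "http://localhost:4000"  # host and port taken from the Docker link above


def interaction_url(rxcui, base=PUBLIC_BASE):
    """Build the interaction.json URL for one RxCUI (hypothetical helper)."""
    query = urlencode({"rxcui": rxcui})
    return f"{base}/REST/interaction/interaction.json?{query}"


public_url = interaction_url(341248)
local_url = interaction_url(341248, base=LOCAL_BASE)
```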

In [26]:
import pandas as pd
import json

## Opening RxNav-in-a-Box interaction results (please refer to the `rxnav_in_the_box_retr_interactions` notebook)

In [3]:
rxnav_results_df = (
    pd.read_csv('rxnav_in_the_box_05_Sep_2023_drug_interact.csv')
        .drop(columns=['Unnamed: 0'])
)

In [5]:
rxnav_results_df

Unnamed: 0,source_drugbank_id,source_name,target_drugbank_id,target_name,description
0,DB01185,Fluoxymesterone,DB00359,Sulfadiazine,Fluoxymesterone may increase the hypoglycemic ...
1,DB01185,Fluoxymesterone,DB01015,Sulfamethoxazole,Fluoxymesterone may increase the hypoglycemic ...
2,DB01185,Fluoxymesterone,DB00263,Sulfisoxazole,Fluoxymesterone may increase the hypoglycemic ...
3,DB01185,Fluoxymesterone,DB08962,Glibornuride,Fluoxymesterone may increase the hypoglycemic ...
4,DB01185,Fluoxymesterone,DB01382,Glymidine,Fluoxymesterone may increase the hypoglycemic ...
...,...,...,...,...,...
1675961,DB01764,Dalfopristin,DB01039,Fenofibrate,The metabolism of Fenofibrate can be decreased...
1675962,DB01764,Dalfopristin,DB06176,Romidepsin,The metabolism of Romidepsin can be decreased ...
1675963,DB01764,Dalfopristin,DB01238,Aripiprazole,The metabolism of Aripiprazole can be decrease...
1675964,DB01764,Dalfopristin,DB00908,Quinidine,The metabolism of Quinidine can be decreased w...


### UMLS MRCONSO mapping

In [6]:
mrconso_path = (
    '../../UMLS_Metathesaurus/mrconso_and_semtypes_2022AA_df.pkl'
)

In [7]:
mrconso_st_df = pd.read_pickle(mrconso_path)

In [13]:
mrconso_st_drugbank_df = mrconso_st_df[
    mrconso_st_df['SAB'] == 'DRUGBANK'
]

In [17]:
drugbank_id_to_cui_dict = (
    mrconso_st_drugbank_df[['CUI', 'CODE']]
        .groupby('CODE')
        .agg(set)
        ['CUI']
        .to_dict()
)

In [18]:
drugbank_id_to_cui_dict['DB01238']

{'C0299792'}
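
To see what the dictionary looks like when one DrugBank ID maps to several CUIs, here is a small sketch on made-up data. The IDs and CUIs below are invented for illustration, and the column is selected before aggregating, which is a slight variation on the cell above.

```python
import pandas as pd

# Toy stand-in for the DRUGBANK rows of MRCONSO; all values are made up.
toy_df = pd.DataFrame({
    "CUI": ["C0000001", "C0000002", "C0000003", "C0000001"],
    "CODE": ["DB00001", "DB00002", "DB00002", "DB00001"],
})

# Collect every CUI seen for a DrugBank code into a set (duplicates collapse).
toy_dict = (
    toy_df
        .groupby("CODE")["CUI"]
        .agg(set)
        .to_dict()
)

# .get() returns None for an ID that has no mapping.
missing = toy_dict.get("DB99999")
```

Keeping the values as sets means a one-to-many mapping is preserved, and an unmapped ID shows up as `None` so it can be dropped later.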

In [19]:
rxnav_results_df['source_cui'] = rxnav_results_df['source_drugbank_id'].apply(
    lambda x: drugbank_id_to_cui_dict.get(x)
)

rxnav_results_df['target_cui'] = rxnav_results_df['target_drugbank_id'].apply(
    lambda x: drugbank_id_to_cui_dict.get(x)
)

In [21]:
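rxnav_results_cui_df = (
    rxnav_results_df
        # one row per (source CUI, target CUI) combination;
        # a DrugBank ID can map to several CUIs
        .explode('source_cui')
        .explode('target_cui')
        # drops pairs where either drug has no CUI
        # (and any row with another missing value)
        .dropna()
)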

In [22]:
rxnav_results_cui_df

Unnamed: 0,source_drugbank_id,source_name,target_drugbank_id,target_name,description,source_cui,target_cui
0,DB01185,Fluoxymesterone,DB00359,Sulfadiazine,Fluoxymesterone may increase the hypoglycemic ...,C0016366,C0038675
1,DB01185,Fluoxymesterone,DB01015,Sulfamethoxazole,Fluoxymesterone may increase the hypoglycemic ...,C0016366,C0038689
2,DB01185,Fluoxymesterone,DB00263,Sulfisoxazole,Fluoxymesterone may increase the hypoglycemic ...,C0016366,C0038745
3,DB01185,Fluoxymesterone,DB08962,Glibornuride,Fluoxymesterone may increase the hypoglycemic ...,C0016366,C0350998
4,DB01185,Fluoxymesterone,DB01382,Glymidine,Fluoxymesterone may increase the hypoglycemic ...,C0016366,C0351000
...,...,...,...,...,...,...,...
1675963,DB01764,Dalfopristin,DB01238,Aripiprazole,The metabolism of Aripiprazole can be decrease...,C0756085,C0299792
1675964,DB01764,Dalfopristin,DB00908,Quinidine,The metabolism of Quinidine can be decreased w...,C0756085,C0034414
1675964,DB01764,Dalfopristin,DB00908,Quinidine,The metabolism of Quinidine can be decreased w...,C0756085,C4255874
1675964,DB01764,Dalfopristin,DB00908,Quinidine,The metabolism of Quinidine can be decreased w...,C0756085,C4255875
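
The repeated index `1675964` in the output above comes from a target that maps to three CUIs. A minimal sketch of the same explode-and-drop step on invented data shows the mechanism; the CUIs here are made up.

```python
import pandas as pd

# Toy frame: sets of CUIs per row, None where the DrugBank ID had no mapping.
toy_pairs = pd.DataFrame({
    "source_cui": [{"C0000001"}, {"C0000002"}, None],
    "target_cui": [{"C0000010", "C0000011"}, None, {"C0000012"}],
})

exploded = (
    toy_pairs
        .explode("source_cui")   # one row per source CUI
        .explode("target_cui")   # one row per target CUI
        .dropna()                # remove rows with an unmapped side
)
```

Only the first row survives, and it is split in two because its target has two CUIs. Both resulting rows keep the original index `0`. The order of rows produced from a set is not guaranteed.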


## Saving

In [24]:
# Sort each pair so (A, B) and (B, A) collapse into one entry in the set
rxnav_mapped_pairs = list({
    tuple(sorted(p)) for p in zip(
        rxnav_results_cui_df['source_cui'],
        rxnav_results_cui_df['target_cui']
    )
})
len(rxnav_mapped_pairs)

1202410
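
The count is lower than the number of rows because each pair is treated as unordered. A small sketch with made-up CUIs illustrates how sorting each pair before putting it in a set removes mirrored duplicates:

```python
# Made-up CUIs; the first two pairs are mirror images of each other.
source = ["C0000002", "C0000001", "C0000003"]
target = ["C0000001", "C0000002", "C0000001"]

# Sorting gives every unordered pair a single canonical tuple.
unique_pairs = {tuple(sorted(p)) for p in zip(source, target)}
```

Here three rows reduce to two unique pairs.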

In [27]:
with open('../../benchmark_data/01_cui_pairs_json/rxnav_cui_pairs.json', 'w') as f:
    json.dump(rxnav_mapped_pairs, f)
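
One thing to keep in mind when reading the file back: JSON has no tuple type, so each pair is loaded as a two-element list. The sketch below uses made-up pairs and an in-memory round trip instead of the real file to show a possible way to restore hashable tuples.

```python
import json

# Made-up pairs standing in for rxnav_mapped_pairs.
pairs = [("C0000001", "C0000002"), ("C0000001", "C0000003")]

# Round trip through JSON text: tuples come back as lists.
restored = json.loads(json.dumps(pairs))

# Convert back to tuples so the pairs can be used in a set again.
restored_pairs = {tuple(p) for p in restored}
```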