<a href="https://colab.research.google.com/github/yumakemore/Medication2RxNorm/blob/master/Medication2RxNorm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Medication2RxNorm

Byunggu Yu, Ph.D.

April 2, 2020

Medication2RxNorm augments the medication records in your csv file with clean medication concept names, standard Rx codes, and all other properties available from RxNorm, found through approximate name search. Every added field is named with the prefix "M2RxNorm_" in the output csv file; change the name_prefix variable below to use a different prefix.

Input: Your own csv file containing medication records with medication names (e.g., zocor 10 mg) from arbitrary sources. Note: the first line of the csv file must be comma-separated column (field) names.

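Output: <your file name>-m2rxnorm.csv, containing clean standard names, various standard codes (e.g., NDA code), and all other properties from RxNorm. The suffix is appended to the full name you enter, so ./medrecords.csv produces ./medrecords.csv-m2rxnorm.csv. All added fields are named with the prefix "M2RxNorm_" in the output csv file.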
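
As a concrete illustration of the input format, the sketch below builds a tiny in-memory csv and loads it with the same `pd.read_csv` call the notebook uses for your file. The column names `patient_id` and `med_name` and the two rows are hypothetical; only the requirement that the first line holds comma-separated field names comes from the description above.

```python
import io
import pandas as pd

# Hypothetical sample input: the first line holds the comma-separated field names.
sample_csv = io.StringIO(
    "patient_id,med_name\n"
    "1,zocor 10 mg\n"
    "2,lipitor 20mg tab\n"
)

# Same call the notebook uses, applied to the in-memory text instead of a path.
sample_df = pd.read_csv(sample_csv, sep=",")

print(list(sample_df.columns))
print(sample_df)
```

With a file shaped like this, `med_name` would be the field name to enter when the notebook asks for the medication name field.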





In [0]:
import pandas as pd
import sys

waittime=61
name_prefix="M2RxNorm_"




Read the medication records (your csv file) into a pandas DataFrame (record_df), then display the field types and the first and last five rows.

In [0]:
# Path (or URL) to the data file
location = input("Enter your csv file name with path or url (e.g., ./medrecords.csv):\n")

# Load the data; the first line must hold the comma-separated field names
record_df = pd.read_csv(location, sep=',')

# Inspect the data
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 2000)
pd.set_option('display.max_rows', None)
print("---Data Fields---\n")
print(record_df.dtypes,"\n")
print("---First 5 Rows---\n")
print(record_df.head(),"\n")
print("---Last 5 Rows---\n")
print(record_df.tail())

Add standard codes and names from RxNorm through approximate name search.

In [0]:
import requests
from urllib.parse import quote

class ApiError(Exception):
  """Raised when an RxNav request fails or returns no usable candidate."""

def ApproximateName2RxNorm(approximate_name):
  #e.g., approximate_name="zocor 10 mg"
  # Percent-encode the whole term, not just spaces, so characters such as
  # '/', '%' or '&' cannot corrupt the query string
  name_in_url=quote(approximate_name, safe='')
  TermRequest = "https://rxnav.nlm.nih.gov/REST/approximateTerm.json?term="+name_in_url+"&maxEntries=1"

  resp = requests.get(TermRequest)
  if resp.status_code != 200:
      # something went wrong
      raise ApiError('GET {} returned {}'.format(TermRequest, resp.status_code))

  # The candidate list may be missing when nothing matches the term
  candidates = (resp.json().get('approximateGroup') or {}).get('candidate') or []
  if not candidates:
    raise ApiError('No RxNorm candidate found for "{}"'.format(approximate_name))

  rxcui_approximateTerm_score=[]
  rxcui_attributes_json=[]
  for item in candidates:
    # Fetch every property RxNorm holds for this candidate concept
    rxcui_request = "https://rxnav.nlm.nih.gov/REST/rxcui/"+item['rxcui']+"/allProperties.json?prop=all"
    rxcui_resp = requests.get(rxcui_request)
    if rxcui_resp.status_code != 200:
      # something went wrong
      raise ApiError('GET {} returned {}'.format(rxcui_request, rxcui_resp.status_code))

    rxcui_attributes_json.append(rxcui_resp.json())
    rxcui_approximateTerm_score.append(item['score'])

  # Prefer the first candidate that carries an NDA property; otherwise fall back
  # to the first returned candidate (index -1 would silently pick the last one)
  selected_concept=0
  for i, attributes in enumerate(rxcui_attributes_json):
    props = (attributes.get('propConceptGroup') or {}).get('propConcept') or []
    if any(p['propName']=="NDA" for p in props):
      selected_concept=i
      break

  return rxcui_attributes_json[selected_concept], rxcui_approximateTerm_score[selected_concept]
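
The selection step can be tried without network access. The sketch below applies the "first candidate carrying an NDA property" rule to a hand-written payload that mimics the `propConceptGroup` / `propConcept` layout used above. The helper names `has_property` and `pick_candidate`, the fallback to the first candidate when none carries the property, and all sample values are assumptions for illustration, not RxNorm data.

```python
def has_property(properties_json, prop_name):
    """Return True if an allProperties-style payload lists prop_name."""
    group = properties_json.get("propConceptGroup") or {}
    concepts = group.get("propConcept") or []
    return any(c.get("propName") == prop_name for c in concepts)


def pick_candidate(payloads, prop_name="NDA"):
    """Index of the first payload carrying prop_name, else 0 (assumed fallback)."""
    for i, payload in enumerate(payloads):
        if has_property(payload, prop_name):
            return i
    return 0


# Hand-written stand-ins for two allProperties responses (values are made up).
mock_payloads = [
    {"propConceptGroup": {"propConcept": [
        {"propName": "RxNorm Name", "propValue": "example drug A"},
    ]}},
    {"propConceptGroup": {"propConcept": [
        {"propName": "RxNorm Name", "propValue": "example drug B"},
        {"propName": "NDA", "propValue": "NDA000000"},
    ]}},
]

print(pick_candidate(mock_payloads))
```

Separating the rule from the HTTP calls in this way makes it easy to check against saved responses before running a long batch.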
    

In [0]:
import time

# Name of the column that holds the medication names
field_name = input("Enter the name of the medication name field in your csv file:\n")
max_retries = 3  # give up on a record after this many failed attempts

print("Before RxNorm: \n", record_df.head())
for index, row in record_df.iterrows():
  if pd.isna(row[field_name]):
    continue  # no medication name in this record; leave the added fields empty
  Myname = str(row[field_name])
  Myname = Myname.replace("[","(").replace("]",")")
  print("\n... Treating Record No. ",index,"/",len(record_df),": an ApproximateName Search on RxNorm for ",Myname, " ...\n")
  RxNormData = None
  for attempt in range(max_retries):
    try:
      RxNormData, score =ApproximateName2RxNorm(Myname)
      break
    except Exception as err:  # a bare except would also swallow KeyboardInterrupt
      print("ERROR (", err, ")...retrying after sleeping ", waittime, "sec...")
      print("(Note: if this error persists, try to manually clean this specific medication name shown above and re-run Medication2RxNorm.)\n")
      time.sleep(waittime)
  if RxNormData is None:
    print("Skipping Record No. ", index, " after ", max_retries, " failed attempts.\n")
    continue

  record_df.at[index, name_prefix+'Similarity_Score']=score
  # Copy every RxNorm property into its own prefixed column
  for prop in RxNormData['propConceptGroup']['propConcept']:
    record_df.at[index, name_prefix+prop['propName']]=prop['propValue']
print("After RxNorm: \n", record_df.head())
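
To see how the loop widens the DataFrame, the sketch below applies the same `.at[...]` assignments to one row, using a hand-written property list instead of a live RxNorm response. The `demo_df` and `demo_props` names and all sample values are hypothetical.

```python
import pandas as pd

name_prefix = "M2RxNorm_"

# Two records; only the first one gets RxNorm properties in this demo.
demo_df = pd.DataFrame({"med_name": ["zocor 10 mg", "unknown med"]})

# Hand-written stand-in for an allProperties payload (values are made up).
demo_props = {"propConceptGroup": {"propConcept": [
    {"propName": "RxCUI", "propValue": "0000"},
    {"propName": "RxNorm Name", "propValue": "example name"},
]}}

# Assigning to a column that does not exist yet creates it; rows that are
# never assigned keep a missing value.
for prop in demo_props["propConceptGroup"]["propConcept"]:
    demo_df.at[0, name_prefix + prop["propName"]] = prop["propValue"]

print(demo_df.columns.tolist())
print(demo_df)
```

Because columns are created on first use, records that match different kinds of concepts can end up with different sets of filled columns, and the unmatched cells stay empty in the output csv.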


Write the output file

In [0]:
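# index=False keeps the DataFrame's row index out of the file as an extra unnamed column
record_df.to_csv(location+"-m2rxnorm.csv", index=False)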
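
Once the file is written, the added fields can be separated from your original ones by their prefix. The sketch below shows one way to do that, using a small stand-in DataFrame and an in-memory buffer in place of the real output file; the column values are hypothetical. Reading with `dtype=str` is a design choice made here so that codes with leading zeros are not converted to numbers.

```python
import io
import pandas as pd

name_prefix = "M2RxNorm_"

# Stand-in for the augmented records (values are made up).
out_df = pd.DataFrame({
    "med_name": ["zocor 10 mg"],
    name_prefix + "Similarity_Score": ["100"],
    name_prefix + "RxCUI": ["0000"],
})

# Write to and read back from an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
out_df.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer, dtype=str)

# Columns added by the notebook all start with the prefix.
added = [c for c in reloaded.columns if c.startswith(name_prefix)]
print(added)
print(reloaded[added])
```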