# Transform text to the format used in the RxNorm and SNOMED standards with the drug normalizer


![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/drug_normalization/drug_norm.ipynb)


## 1. Install NLU and its dependencies, and authenticate

See the [install docs](https://nlu.johnsnowlabs.com/docs/en/install#super-quickstart-on-google-colab-or-kaggle) and the [authentication docs](https://nlu.johnsnowlabs.com/docs/en/examples_hc#authorize-access-to-licensed-features-and-install-healthcare-dependencies) for more information.


In [None]:
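# Install NLU and its dependencies in the Colab runtime
!wget http://setup.johnsnowlabs.com/nlu/colab.sh -O - | bash
import nlu

# Replace each '????' placeholder with the credentials that come with your license
SPARK_NLP_LICENSE     = '????'
AWS_ACCESS_KEY_ID     = '????'
AWS_SECRET_ACCESS_KEY = '????'
JSL_SECRET            = '????'

# Authorize access to the licensed healthcare features
nlu.auth(SPARK_NLP_LICENSE, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, JSL_SECRET)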

# Drug Normalizer

The drug normalizer cleans raw text from clinical documents, e.g. scraped web pages or XML documents. It removes all unwanted ("dirty") characters from the text according to one or more input regex patterns. Unwanted-character removal follows a configurable policy, and the text can optionally be converted to lowercase.
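
To make the regex-based cleanup concrete, here is a minimal plain-Python sketch of the idea. It is not the annotator's implementation: the function name `clean_text` and the two patterns are assumptions chosen only for illustration.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the annotator's defaults.
UNWANTED_PATTERNS = [
    r"<[^>]+>",           # leftover HTML/XML tags
    r"[^\w\s/().+\-]",    # anything that is not a word char, space, or / ( ) . + -
]

def clean_text(text, patterns=UNWANTED_PATTERNS, lowercase=False):
    """Replace every pattern match with a space, collapse whitespace, optionally lowercase."""
    for pattern in patterns:
        text = re.sub(pattern, " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower() if lowercase else text

cleaned = clean_text("<b>Aspirin</b> 10 meq/ 5 ml  oral sol***", lowercase=True)
print(cleaned)
```

Replacing matches with a space rather than deleting them keeps neighbouring tokens from being glued together; the whitespace collapse afterwards removes the extra gaps.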

## Parameters
- `lowercase`: whether to convert strings to lowercase. Default is `False`.
- `policy`: rule for removing patterns from the text. Valid values are `all`, `abbreviations`, and `dosages`. Default is `all`. The `abbreviations` policy expands common drug abbreviations; the `dosages` policy converts drug dosages and values to a standard form (see the examples below).
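
How the two parameters could interact is sketched below in plain Python. This is an illustrative stand-in, not the library's code: the `normalize` function, the tiny `ABBREVIATIONS` table, and the simplified dosage step (which only separates a number from an attached word) are all assumptions.

```python
import re

# Hypothetical abbreviation table for illustration only.
ABBREVIATIONS = {"sol": "solution", "injec": "injection"}
VALID_POLICIES = ("all", "abbreviations", "dosages")

def expand_abbreviations(text):
    # Replace whole-token abbreviations with their long form.
    return " ".join(ABBREVIATIONS.get(token, token) for token in text.split())

def split_number_and_form(text):
    # Simplified stand-in for dosage handling: "13bag" -> "13 bag".
    return re.sub(r"(\d)([A-Za-z])", r"\1 \2", text)

def normalize(text, policy="all", lowercase=False):
    """Apply the steps selected by `policy`, then optionally lowercase."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"policy must be one of {VALID_POLICIES}")
    if policy in ("all", "abbreviations"):
        text = expand_abbreviations(text)
    if policy in ("all", "dosages"):
        text = split_number_and_form(text)
    return text.lower() if lowercase else text

for policy in VALID_POLICIES:
    print(policy, "->", normalize("Aspirin 13bag oral sol", policy=policy))
```

With `all`, both steps run; with `abbreviations` or `dosages`, only the matching step runs and the rest of the text is left unchanged.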


### Examples

Examples of transformations:
1) "Sodium Chloride/Potassium Chloride 13bag" >>> "Sodium Chloride / Potassium Chloride 13 bag" : inserts spaces around `/` and between the quantity and the form (`13bag` becomes `13 bag`)

2) "interferon alfa-2b 10 million unit ( 1 ml ) injec" >>> "interferon alfa - 2b 10000000 unt ( 1 ml ) injection " : converts `10 million unit` to `10000000 unt` and replaces `injec` with `injection`

3) "aspirin 10 meq/ 5 ml oral sol" >>> "aspirin 2 meq/ml oral solution" : normalizes `10 meq/ 5 ml` to `2 meq/ml` and expands the abbreviation `oral sol` to `oral solution`

4) "adalimumab 54.5 + 43.2 gm" >>> "adalimumab 97700 mg" : adds `54.5 + 43.2` (97.7 gm) and normalizes `gm` to `mg`

5) "Agnogenic one half cup" >>> "Agnogenic 0.5 oral solution" : replaces `one half` with `0.5` and normalizes `cup` to `oral solution`
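
The arithmetic behind examples 3 and 4 can be reproduced with a short plain-Python sketch. It only illustrates the calculations; the function names and regular expressions are assumptions, not the normalizer's actual rules.

```python
import re

NUMBER = r"(\d+(?:\.\d+)?)"

def sum_and_convert_grams(text):
    """Rewrite 'A + B gm' as the total in mg (1 gm = 1000 mg)."""
    def replace(match):
        total_mg = (float(match.group(1)) + float(match.group(2))) * 1000
        return f"{round(total_mg)} mg"
    return re.sub(NUMBER + r"\s*\+\s*" + NUMBER + r"\s*gm\b", replace, text)

def reduce_concentration(text):
    """Rewrite 'X unit/ Y ml' as a per-ml value, e.g. '10 meq/ 5 ml' -> '2 meq/ml'."""
    def replace(match):
        per_ml = float(match.group(1)) / float(match.group(3))
        return f"{per_ml:g} {match.group(2)}/ml"
    return re.sub(NUMBER + r"\s*([a-z]+)\s*/\s*" + NUMBER + r"\s*ml\b", replace, text)

print(sum_and_convert_grams("adalimumab 54.5 + 43.2 gm"))
print(reduce_concentration("aspirin 10 meq/ 5 ml oral sol"))
```

Rounding the milligram total guards against floating-point noise in the sum, and the `g` format drops the trailing `.0` so that `2.0` prints as `2`.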





In [None]:
# The raw drug strings from the examples above
data = [
    "Agnogenic one half cup",
    "adalimumab 54.5 + 43.2 gm",
    "aspirin 10 meq/ 5 ml oral sol",
    "interferon alfa-2b 10 million unit ( 1 ml ) injec",
    "Sodium Chloride/Potassium Chloride 13bag",
]
# Load the drug normalizer and run it on every string
nlu.load('norm_drugs').predict(data)

| | document | text | drug_norm |
|---|---|---|---|
| 0 | Agnogenic one half cup | Agnogenic one half cup | Agnogenic 0.5 oral solution |
| 1 | adalimumab 54.5 + 43.2 gm | adalimumab 54.5 + 43.2 gm | adalimumab 97700 mg |
| 2 | aspirin 10 meq/ 5 ml oral sol | aspirin 10 meq/ 5 ml oral sol | aspirin 2 meq/ml oral solution |
| 3 | interferon alfa-2b 10 million unit ( 1 ml ) injec | interferon alfa-2b 10 million unit ( 1 ml ) injec | interferon alfa - 2b 10000000 unt ( 1 ml ) inj... |
| 4 | Sodium Chloride/Potassium Chloride 13bag | Sodium Chloride/Potassium Chloride 13bag | Sodium Chloride / Potassium Chloride 13 bag |
