<a href="https://colab.research.google.com/github/fastdatascience/drug_named_entity_recognition/blob/main/drug_named_entity_recognition_example_walkthrough.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![Fast Data Science logo](https://raw.githubusercontent.com/fastdatascience/brand/main/primary_logo.svg)

<a href="https://fastdatascience.com"><span align="left">🌐 fastdatascience.com</span></a>
<a href="https://www.linkedin.com/company/fastdatascience/"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/linkedin.svg" alt="Fast Data Science | LinkedIn" width="21px"/></a>
<a href="https://twitter.com/fastdatascienc1"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/x.svg" alt="Fast Data Science | X" width="21px"/></a>
<a href="https://www.instagram.com/fastdatascience/"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/instagram.svg" alt="Fast Data Science | Instagram" width="21px"/></a>
<a href="https://www.facebook.com/fastdatascienceltd"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/fb.svg" alt="Fast Data Science | Facebook" width="21px"/></a>
<a href="https://www.youtube.com/channel/UCLPrDH7SoRT55F6i50xMg5g"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/yt.svg" alt="Fast Data Science | YouTube" width="21px"/></a>
<a href="https://g.page/fast-data-science"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/google.svg" alt="Fast Data Science | Google" width="21px"/></a>
<a href="https://medium.com/fast-data-science"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/medium.svg" alt="Fast Data Science | Medium" width="21px"/></a>
<a href="https://mastodon.social/@fastdatascience"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/mastodon.svg" alt="Fast Data Science | Mastodon" width="21px"/></a>

# Drug named entity recognition Python library by Fast Data Science

Developed by Fast Data Science, https://fastdatascience.com

Source code at https://github.com/fastdatascience/drug_named_entity_recognition

Tutorial at https://fastdatascience.com/drug-named-entity-recognition-python-library/

This is a lightweight Python library for finding drug names in a string (a task known as [named entity recognition (NER)](https://fastdatascience.com/named-entity-recognition/)) and linking them to identifiers in reference databases (named entity linking).

You can run this notebook in Google Colab: <a href="https://colab.research.google.com/github/fastdatascience/drug_named_entity_recognition/blob/main/drug_named_entity_recognition_example_walkthrough.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook shows how you can install and run Drug Named Entity Recognition.

# 😊 Using the Drug Named Entity Recognition library directly from Google Sheets (no-code!)

<img align="left" alt="Google Sheets logo" title="Google Sheets logo" width=150 height=105  src="https://raw.githubusercontent.com/fastdatascience/drug_named_entity_recognition/main/google_sheets_logo_small.png" />

The library has also been wrapped as a Google Sheets plugin, so you can [use it directly from Google Sheets](https://fastdatascience.com/drug-name-recogniser) without writing any code.

[Click here](https://www.youtube.com/watch?v=qab1Bv_YpYU) to watch a video of how the plugin works.

You can install the plugin in Google Sheets [here](https://workspace.google.com/marketplace/app/drug_name_recogniser/463844408236).

![my badge](https://badgen.net/badge/Status/In%20Development/orange)

[![PyPI package](https://img.shields.io/badge/pip%20install-drug_named_entity_recognition-brightgreen)](https://pypi.org/project/drug-named-entity-recognition/)


## Install the Drug named entity recognition Python library from PyPI

In [1]:
!pip install drug-named-entity-recognition


In [3]:
import drug_named_entity_recognition
print(drug_named_entity_recognition.__version__)

2.0.8


In [4]:
from drug_named_entity_recognition import find_drugs

# find_drugs takes a list of tokens, so split the sentence on spaces first.
find_drugs("i bought some Prednisone".split(" "))

[({'medline_plus_id': 'a601102',
   'name': 'Prednisone',
   'mesh_id': 'D018931',
   'mesh_tree': ['D04.210.500.745.432.719.702'],
   'drugbank_id': 'DB00635',
   'smiles': 'C[C@]12CC(=O)[C@H]3[C@H]([C@@H]1CC[C@@]2(C(=O)CO)O)CCC4=CC(=O)C=C[C@]34C',
   'formula': 'C21H26O5',
   'mass_lower': 358.17802393,
   'mass_upper': 358.17802393,
   'synonyms': ['prednisone',
    'sterapred',
    'rayos',
    'prednisone intensol',
    'deltasone',
    'kortancyl',
    'winpred',
    'cutason',
    'orasone',
    'prednidib',
    'acsis, prednison',
    'encorton',
    'cortancyl',
    'rectodelt',
    'prednison acsis',
    'dehydrocortisone',
    'decortin',
    'cortan',
    'prednison hexal',
    'meticorten',
    'predni tablinen',
    'encortone',
    'apo-prednisone',
    'decortisyl',
    'pronisone',
    'enkortolon',
    'delta-cortisone',
    'liquid pred',
    'sone',
    'dacortin',
    'ultracorten',
    'panafcort',
    'predniment',
    'prednison galen',
    'panasol',
    '1,2-d
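
The result of `find_drugs` is a list with one entry per match. As a minimal sketch of how such a result might be post-processed, the snippet below reduces each match to its name, identifiers and token positions. It uses a hand-written stand-in for the result rather than calling the library, and it assumes that each entry is a tuple of the form `(drug_info, start_index, end_index)`. Only the dictionary part is visible in the output above, so check the shape against your installed version. The helper name `summarise_matches` is hypothetical.

```python
# Hand-written stand-in for a find_drugs result; the real dictionaries
# contain many more fields (synonyms, SMILES, formula, ...).
# ASSUMPTION: each match is a (drug_info, start_index, end_index) tuple,
# where the indices refer to positions in the token list.
sample_results = [
    ({"name": "Prednisone", "drugbank_id": "DB00635", "mesh_id": "D018931"}, 3, 3),
]


def summarise_matches(results):
    """Reduce each match to its name, identifiers and token span."""
    summary = []
    for drug_info, start, end in results:
        summary.append(
            {
                "name": drug_info.get("name"),
                # .get() returns None when a match lacks a given identifier.
                "drugbank_id": drug_info.get("drugbank_id"),
                "mesh_id": drug_info.get("mesh_id"),
                "token_span": (start, end),
            }
        )
    return summary


print(summarise_matches(sample_results))
```

Using `.get()` rather than indexing is a defensive choice: not every match carries every identifier, as the Interleukin-6 entry later in this notebook (which has no `drugbank_id`) suggests.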

Example of running the tool on a longer text, tokenised with a regular expression (you could also use spaCy):

In [5]:
long_text = '''
None of the below text is medical advice and is only provided as an example to test the tool.

Dabigatran, also referred to as Pradaxa, is an anticoagulant that helps prevent blood clots. It works by blocking thrombin, an enzyme that helps blood clot. Dabigatran is available in tablet form and is taken by mouth. It is used to treat or prevent blood clots in the legs, lungs, and brain, and to reduce the risk of death for people with atrial fibrillation (a heart rhythm disorder).

Prednisone is a corticosteroid medication that helps reduce inflammation. It is available in tablet form and is taken by mouth. Prednisone is a synthetic version of cortisol, a hormone produced by the adrenal glands. It is used to treat a variety of conditions, including rheumatoid arthritis, asthma, lupus, and ulcerative colitis.

Actemra (tocilizumab) is an interleukin-6 (IL-6) receptor antagonist. It works by blocking the action of IL-6, a protein that helps regulate the immune system. Actemra is available in injection form and is given subcutaneously (under the skin) or intravenously (into the vein). It is used to treat rheumatoid arthritis, giant cell arteritis, and systemic lupus erythematosus.

How are these medications related?

Dabigatran and prednisone are not typically used together. However, Actemra and prednisone can be used together to treat rheumatoid arthritis and other autoimmune diseases. Actemra works to suppress the immune system, while prednisone helps to reduce inflammation.

The combination of Actemra and prednisone can be effective in managing symptoms of autoimmune diseases, but it is important to be aware of the potential risks. These risks include an increased risk of infection, bleeding, and side effects from both medications.

It is important to talk to your doctor about the risks and benefits of taking any medication, including dabigatran, prednisone, and Actemra.

The above text is not medical advice.
'''
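
Before the text can be passed to `find_drugs`, it has to be split into tokens. The sketch below is only an illustration of the difference between splitting on spaces and the `\w+` pattern used in the next cell. The sample sentence is made up for this comparison and the library itself is not called.

```python
import re

# Made-up sentence for illustration only.
sample = "I bought some Prednisone, and IL-6 was mentioned."

# Splitting on spaces leaves punctuation attached to the neighbouring word.
whitespace_tokens = sample.split(" ")

# \w+ matches runs of letters, digits and underscores, so punctuation is
# dropped and hyphenated terms are split into separate tokens.
regex_tokens = re.findall(r"\w+", sample)

print(whitespace_tokens)
print(regex_tokens)
```

With whitespace splitting the drug name arrives as `Prednisone,` with its trailing comma, whereas the regex yields a clean `Prednisone` token. The trade-off is that `IL-6` becomes the two tokens `IL` and `6`.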

In [7]:
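import json
import re

# Tokenise on runs of word characters; punctuation and whitespace are discarded.
PATTERN = r"\w+"
tokens = re.findall(PATTERN, long_text)

# Pretty-print the matches as JSON.
print(json.dumps(find_drugs(tokens, is_include_structure=True), indent=4))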

[
    [
        {
            "mesh_id": "D010954",
            "mesh_tree": [
                "D12.644.276.374.465.224",
                "D23.529.374.465.224",
                "D12.776.467.374.465.202"
            ],
            "name": "Interleukin-6",
            "synonyms": [
                "interleukin-6",
                "il-6",
                "differentiation factor-2, b-cell",
                "b cell differentiation factor 2",
                "ifn-beta 2",
                "differentiation factor, b cell",
                "growth factor, plasmacytoma",
                "differentiation factor, b-cell",
                "b-cell stimulatory factor 2",
                "mgi-2",
                "differentiation factor 2, b cell",
                "b cell stimulatory factor 2",
                "plasmacytoma growth factor",
                "interleukin 6",
                "hybridoma growth factor",
                "differentiation-inducing protein, myeloid",
                "b cell stim