
## Named Entity Recognition (NER)
Named Entity Recognition (NER) is a sub-task of information extraction in Natural Language Processing (NLP). It locates named entities in text and classifies them into predefined categories such as person names, organizations, locations, medical codes, time expressions, quantities, and monetary values.

In [None]:
!pip install gliner
!pip install -q gradio

### Load the Model
GLiNER is a Named Entity Recognition (NER) model that can identify arbitrary entity types, supplied as labels at inference time, using a bidirectional transformer encoder (BERT-like). It offers a practical alternative to traditional NER models, which are limited to a predefined set of entity types, and to Large Language Models (LLMs), which are flexible but too large and costly for resource-constrained scenarios.

In [None]:
from gliner import GLiNER

model = GLiNER.from_pretrained("urchade/gliner_base")

### Business Application

In [3]:
from gliner import GLiNER

Bns_text="""NVDA could still be worth up to 26% more at $1,141 per share. This valuation is based on its massive FCF margins. Existing shareholders can gain extra income by shorting out-of-the-money (OTM) put options in nearby expiry periods.
            That will help wait for NVDA stock to reach this price target.Nvidia stock closed up 1% on Tuesday after CEO Jensen Huang said in an analyst meeting that the company expects to increase its share of the $250 billion data center market.
            Huang’s comments were made a day after Nvidia announced its latest generation of artificial intelligence chips, called Blackwell, and a new AI software platform."""

Bns_labels = ['Company_name', 'Stock_symbol', 'Revenue', 'Profit_margin', 'Market_capitalization', 'CEO_name', 'Merger_acquisition', 'Earnings_report', 'Dividend_yield', 'Share_price']

entities = model.predict_entities(Bns_text, Bns_labels, threshold=0.5)

for entity in entities:
    print(entity["text"], "=>", entity["label"])

NVDA => Stock_symbol
FCF => Dividend_yield
NVDA stock => Company_name
Nvidia => Stock_symbol
CEO Jensen Huang => CEO_name
Nvidia => Stock_symbol
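
The output above repeats some spans (for example, `Nvidia` appears twice). A minimal sketch of one way to collapse duplicates and group the results by label is shown below. It assumes only the `text` and `label` keys that the loop above already reads. The helper name `group_entities` is hypothetical, and the sample list is copied from the printed output rather than taken from a live model call.

```python
from collections import defaultdict


def group_entities(entities):
    """Group unique entity texts by label, keeping first-seen order."""
    grouped = defaultdict(list)
    for entity in entities:
        texts = grouped[entity["label"]]
        if entity["text"] not in texts:  # drop exact repeats
            texts.append(entity["text"])
    return dict(grouped)


# Copied from the printed output above, reduced to the two keys used here.
sample = [
    {"text": "NVDA", "label": "Stock_symbol"},
    {"text": "FCF", "label": "Dividend_yield"},
    {"text": "NVDA stock", "label": "Company_name"},
    {"text": "Nvidia", "label": "Stock_symbol"},
    {"text": "CEO Jensen Huang", "label": "CEO_name"},
    {"text": "Nvidia", "label": "Stock_symbol"},
]

for label, texts in group_entities(sample).items():
    print(label, "=>", texts)
```

Grouping only tidies the output. It does not correct labels the model got wrong, such as `FCF` being tagged as `Dividend_yield`.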


### Court Documentation Application

In [4]:
text = """
A New York judge ordered Donald Trump on Friday to pay $355 million in penalties, finding that the former president lied about his wealth for years in a sweeping civil fraud verdict that pierces his billionaire image but stops short of putting his real estate empire out of business.
Judge Arthur Engoron’s decision after a trial in New York Attorney General Letitia James’ lawsuit punishes Trump, his company and executives, including his two eldest sons, for scheming to dupe banks, insurers and others by inflating his wealth on financial statements. It forces a shakeup at the top of his Trump Organization, putting the company under court supervision and curtailing how it does business.
The decision is a staggering setback for the Republican presidential front-runner, the latest and costliest consequence of his recent legal troubles. The magnitude of the verdict on top of penalties in other cases could dramatically dent Trump’s financial resources and damage his identity as a savvy businessman who parlayed his fame as a real estate developer into reality TV stardom and the presidency. He has vowed to appeal and won’t have to pay immediately."""

labels = ['Case_name','Defendant_name','Plaintiff_name','Judge_name','Court_name','Legal_document','Verdict','Attorney_name','Lawsuit','Jurisdiction']

entities = model.predict_entities(text, labels, threshold=0.5)

for entity in entities:
    print(entity["text"], "=>", entity["label"])

New York => Jurisdiction
civil fraud verdict => Verdict
Judge Arthur Engoron => Attorney_name
New York => Jurisdiction
financial statements => Legal_document
verdict => Verdict
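
Some predictions above are questionable (the judge is tagged `Attorney_name`). One common way to trade recall for precision is to keep only high-confidence spans, either by raising `threshold` in `predict_entities` or by filtering afterwards. The sketch below takes the second route. It assumes each entity dictionary carries a `score` key, which is worth verifying against your installed GLiNER version. The helper name `filter_by_score` is hypothetical, and the scores in the sample are invented for illustration, not model output.

```python
def filter_by_score(entities, min_score):
    """Keep entities whose confidence is at least min_score.

    Entities without a "score" key are treated as score 0.0 and dropped
    for any positive min_score.
    """
    return [e for e in entities if e.get("score", 0.0) >= min_score]


# Texts and labels come from the output above; the scores are INVENTED placeholders.
sample_scored = [
    {"text": "New York", "label": "Jurisdiction", "score": 0.91},
    {"text": "Judge Arthur Engoron", "label": "Attorney_name", "score": 0.55},
    {"text": "financial statements", "label": "Legal_document", "score": 0.62},
]

for entity in filter_by_score(sample_scored, min_score=0.7):
    print(entity["text"], "=>", entity["label"])
```

A stricter cutoff removes uncertain spans but can also discard correct ones, so the value is best tuned on a few labelled examples from the target domain.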


### Biomedical Application

In [5]:

from gliner import GLiNER

Bio_model = GLiNER.from_pretrained("urchade/gliner_large_bio-v0.1")

text_1= """A 28-year-old female with a history of gestational diabetes mellitus diagnosed eight years prior to presentation and subsequent type two diabetes mellitus (T2DM), one prior episode of HTG-induced pancreatitis three years prior to presentation, and associated with an acute hepatitis, presented with a one-week history of polyuria, poor appetite, and vomiting.
She was on metformin, glipizide, and dapagliflozin for T2DM and atorvastatin and gemfibrozil for HTG. She had been on dapagliflozin for six months at the time of presentation.
Physical examination on presentation was significant for dry oral mucosa ; significantly , her abdominal examination was benign with no tenderness, guarding, or rigidity. Pertinent laboratory findings on admission were: serum glucose 111 mg/dl,  creatinine 0.4 mg/dL, triglycerides 508 mg/dL, total cholesterol 122 mg/dL, and venous pH 7.27.
"""

med_labels = ["patient_age","gender", "test", "doctor", "admission_date","date", "symptoms", "drug", "problem", "bodypart", "disease", "result", "location", "procedure"]

entities = Bio_model.predict_entities(text_1, med_labels, threshold=0.5)

for entity in entities:
    print(entity["text"], "=>", entity["label"])

config.json not found in HuggingFace Hub.


28-year-old => patient_age
female => gender
gestational diabetes mellitus => disease
type two diabetes mellitus => disease
T2DM => disease
three years prior to presentation => admission_date
acute hepatitis => disease
polyuria => symptoms
poor appetite => symptoms
vomiting => symptoms
metformin => drug
glipizide => drug
dapagliflozin => drug
T2DM => disease
atorvastatin => drug
gemfibrozil => drug
dapagliflozin => drug
Physical examination => procedure
abdominal examination => procedure


In [None]:
import gradio as gr
from gliner import GLiNER

# Load the GLiNER model once at startup instead of on every request
bio_model = GLiNER.from_pretrained("urchade/gliner_large_bio-v0.1")

# Labels and their corresponding highlight colors
LABEL_COLORS = {
    "patient_age": "blue",
    "gender": "green",
    "test": "orange",
    "doctor": "red",
    "admission_date": "purple",
    "date": "yellow",
    "symptoms": "cyan",
    "drug": "magenta",
    "problem": "grey",
    "bodypart": "black",
    "disease": "brown",
}

def highlight_entities(text):
    # Predict entities for the labels defined above
    entities = bio_model.predict_entities(text, list(LABEL_COLORS.keys()))

    # Process entities from last to first so that inserting markup
    # does not shift the character offsets of entities still to be handled
    entities.sort(key=lambda x: x["start"], reverse=True)

    highlighted_text = text

    # Wrap each entity in a colored <mark> and append its label
    for entity in entities:
        color = LABEL_COLORS[entity["label"]]
        highlighted_text = (
            highlighted_text[:entity["start"]]
            + f"<mark style='background-color:{color}'>{entity['text']}</mark>"
            + f" <span style='color:{color}'>[{entity['label']}]</span> "
            + highlighted_text[entity["end"]:]
        )

    return highlighted_text

iface = gr.Interface(
    fn=highlight_entities,
    inputs="text",
    outputs="html",
    title="Biomedical NER Highlighting App",
    description="Input text and see named entities highlighted with labels.",
)
iface.launch(share=True, debug=True)
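
The highlighting step can be checked without loading a model or launching Gradio. The sketch below is an alternative to the reverse-order approach above: it walks the text left to right, skips overlapping spans, and escapes all text with `html.escape` so characters such as `<` in the input are not interpreted as markup. The function name `render_highlights`, the fallback color, and the sample sentence are hypothetical. The entity dictionaries assume the same `start`, `end`, and `label` keys used in the cell above.

```python
import html


def render_highlights(text, entities, colors):
    """Wrap each entity span in <mark>, escaping all text for safe HTML."""
    pieces = []
    cursor = 0  # index of the first character not yet emitted
    for entity in sorted(entities, key=lambda e: e["start"]):
        if entity["start"] < cursor:
            continue  # skip spans that overlap an earlier one
        color = colors.get(entity["label"], "lightgrey")
        span = text[entity["start"]:entity["end"]]
        pieces.append(html.escape(text[cursor:entity["start"]]))
        pieces.append(
            f"<mark style='background-color:{color}'>{html.escape(span)}</mark>"
        )
        cursor = entity["end"]
    pieces.append(html.escape(text[cursor:]))
    return "".join(pieces)


# Hand-built entities standing in for model predictions.
demo_text = "Glucose < 120 mg/dL on metformin for T2DM."
demo_entities = [
    {"start": demo_text.index(word), "end": demo_text.index(word) + len(word), "label": label}
    for word, label in [("metformin", "drug"), ("T2DM", "disease")]
]
demo_colors = {"drug": "magenta", "disease": "brown"}

print(render_highlights(demo_text, demo_entities, demo_colors))
```

Building the output from slices of the original text means the offsets never need adjusting, which is the same problem the descending sort solves in the Gradio cell.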
