# Example 1) GLiNER and spaCy [gliner-spacy library] - Zero-Shot NER

model: `en_core_web_lg`

In [None]:
text = """
Manic episodes with irritable mood or mixed episodes. Major depressive episodes with prominent irritable mood may be difficult to distinguish from manic episodes with irritable mood or from mixed episodes. This distinction requires a careful clinical evaluation of the presence of manic symptoms.
Mood disorder due to another medical condition. A major depressive episode is the appropriate diagnosis if the mood disturbance is not judged, based on individual history, physical examination, and laboratory findings, to be the direct pathophysiological consequence of a specific medical condition (e.g., multiple sclerosis, stroke, hypothyroidism).
Substance/medication-induced depressive or bipolar disorder. This disorder is distinguished from major depressive disorder by the fact that a substance (e.g., a drug of abuse, a medication, a toxin) appears to be etiologically related to the mood disturbance. For example, depressed mood that occurs only in the context of withdrawal from cocaine would be diagnosed as cocaine-induced depressive disorder.
Attention-deficit/hyperactivity disorder. Distractibility and low frustration tolerance can occur in both attention-deficit/ hyperactivity disorder and a major depressive episode; if the criteria are met for both, attention-deficit/hyperactivity disorder may be diagnosed in addition to the mood disorder. However, the clinician must be cautious not to overdiagnose a major depressive episode in children with attention-deficit/hyperactivity disorder whose disturbance in mood is characterized by irritability rather than by sadness or loss of interest.
Adjustment disorder with depressed mood. A major depressive episode that occurs in response to a psychosocial stressor is distinguished from adjustment disorder with depressed mood by the fact that the full criteria for a major depressive episode are not met in adjustment disorder.
Sadness. Finally, periods of sadness are inherent aspects of the human experience. These periods should not be diagnosed as a major depressive episode unless criteria are met for severity (i.e., five out of nine symptoms), duration (i.e., most of the day, nearly every day for at least 2 weeks), and clinically significant distress or impairment. The diagnosis other specified depressive disorder may be appropriate for presentations of depressed mood with clinically significant impairment that do not meet criteria for duration or severity

"""

In [None]:
!pip install gliner-spacy

In [None]:
import spacy
from gliner_spacy.pipeline import GlinerSpacy

In [None]:
!python -m spacy download en_core_web_lg

In [None]:
nlp = spacy.load("en_core_web_lg")
nlp.add_pipe("gliner_spacy", config = {"labels": ["SYMPTOM", "DISORDER", "PERSON", "TIME", "DATE", "ORDINAL", "CARDINAL", "LOCATION", "PERCENTAGE"]})

In [None]:
doc = nlp(text)
entities_by_label = {}
for ent in doc.ents:
    label = ent.label_.lower()
    # Use a separate name so the input `text` is not overwritten; later cells reuse it.
    ent_text = ent.text.lower()
    if label not in entities_by_label:
        entities_by_label[label] = [ent_text]
    elif ent_text not in entities_by_label[label]:
        entities_by_label[label].append(ent_text)

for label, texts in entities_by_label.items():
    print(f"'{label.upper()}': {texts}")

'DISORDER': ['manic episodes', 'irritable mood', 'mixed episodes', 'major depressive episodes', 'mood disorder', 'medical condition', 'major depressive episode', 'mood disturbance', 'multiple sclerosis', 'stroke', 'hypothyroidism', 'substance/medication-induced depressive', 'bipolar disorder', 'major depressive', 'depressed mood', 'cocaine-induced depressive disorder', 'attention-deficit/hyperactivity disorder', 'attention-deficit/ hyperactivity disorder', 'adjustment disorder', 'depressive episode', 'diagnosis', 'disorder']
'SYMPTOM': ['manic symptoms', 'findings', 'distractibility', 'low frustration tolerance', 'major depressive episode', 'irritability', 'sadness', 'loss of interest', 'mood', 'psychosocial stressor', 'human experience', 'severity', 'symptoms', 'presentations', 'depressed mood', 'clinically significant impairment']
'PERCENTAGE': ['criteria', 'severity']
'PERSON': ['clinician', 'children']
'TIME': ['2 weeks', 'duration']
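
The output above lists `attention-deficit/hyperactivity disorder` and `attention-deficit/ hyperactivity disorder` as separate entries, because the two spans differ only in the whitespace after the slash. One possible way to merge such near-duplicates is to normalise each span before grouping. This is a minimal sketch, not part of gliner-spacy: the helper names `normalise_entity_text` and `group_unique` and the sample pairs are hypothetical, and removing the spaces around `/` is an assumption that suits this text but may not suit others.

```python
import re


def normalise_entity_text(raw):
    """Lowercase, collapse whitespace runs, and remove spaces around '/'."""
    cleaned = re.sub(r"\s+", " ", raw.strip().lower())
    return re.sub(r"\s*/\s*", "/", cleaned)


def group_unique(pairs):
    """Group (label, text) pairs by label, keeping first-seen order without duplicates."""
    grouped = {}
    for label, raw in pairs:
        texts = grouped.setdefault(label, [])
        cleaned = normalise_entity_text(raw)
        if cleaned not in texts:
            texts.append(cleaned)
    return grouped


# Hypothetical sample pairs; with a spaCy doc you could pass
# ((ent.label_, ent.text) for ent in doc.ents) instead.
sample_pairs = [
    ("DISORDER", "attention-deficit/hyperactivity disorder"),
    ("DISORDER", "attention-deficit/ hyperactivity disorder"),
    ("SYMPTOM", "Sadness"),
]
print(group_unique(sample_pairs))
# {'DISORDER': ['attention-deficit/hyperactivity disorder'], 'SYMPTOM': ['sadness']}
```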


# Example 2) Working with GLiNER
model: `urchade/gliner_large-v2.1`

In [None]:
!pip install gliner

In [None]:
from gliner import GLiNER

In [None]:
model = GLiNER.from_pretrained("urchade/gliner_large-v2.1")
model.eval()
print("OK")

In [None]:
labels = ["SYMPTOM", "DISORDER", "PERSON", "TIME", "DATE", "ORDINAL", "CARDINAL", "LOCATION", "PERCENTAGE"]

In [None]:
entities = model.predict_entities(text, labels, threshold=0.4)

entity_labels_dict = {}

# Convert all entities to lowercase for comparison
for entity in entities:
    entity["text"] = entity["text"].lower()

for entity in entities:
    if entity["label"] not in entity_labels_dict:
        entity_labels_dict[entity["label"]] = set()

    entity_labels_dict[entity["label"]].add(entity["text"])

# Print label-entities dictionary
# (distinct loop names keep the `entities` list from being rebound to a set)
print("Entities by Label:")
for label, texts in entity_labels_dict.items():
    print(f"[{label}]:")
    for entity_text in texts:
        print(f"  {entity_text} -> {label}")

Entities by Label:
[SYMPTOM]:
  distractibility -> SYMPTOM
  depressed mood -> SYMPTOM
  irritability -> SYMPTOM
  irritable mood -> SYMPTOM
  low frustration tolerance -> SYMPTOM
  sadness -> SYMPTOM
[DISORDER]:
  adjustment disorder -> DISORDER
  attention-deficit/ hyperactivity disorder -> DISORDER
  multiple sclerosis -> DISORDER
  cocaine-induced depressive disorder -> DISORDER
  attention-deficit/hyperactivity disorder -> DISORDER
  major depressive disorder -> DISORDER
  bipolar disorder -> DISORDER
  major depressive episode -> DISORDER
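
Grouping into sets, as above, drops the confidence scores and prints in arbitrary order. The sketch below keeps the highest score seen for each lowercased span and sorts each label's spans by that score. It assumes each dictionary returned by `predict_entities` carries a `score` key alongside `text` and `label`; the helper name `best_score_by_label` and the sample values are hypothetical, not model output.

```python
def best_score_by_label(entities):
    """Map each label to (text, best_score) pairs, highest score first."""
    grouped = {}
    for entity in entities:
        key = entity["text"].lower()
        scores = grouped.setdefault(entity["label"], {})
        # Keep only the best score when the same span occurs more than once.
        scores[key] = max(scores.get(key, 0.0), entity["score"])
    return {
        label: sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        for label, scores in grouped.items()
    }


# Hypothetical entities in the same shape as the list used above.
sample_entities = [
    {"text": "Sadness", "label": "SYMPTOM", "score": 0.91},
    {"text": "sadness", "label": "SYMPTOM", "score": 0.55},
    {"text": "irritability", "label": "SYMPTOM", "score": 0.62},
    {"text": "bipolar disorder", "label": "DISORDER", "score": 0.80},
]

for label, ranked in best_score_by_label(sample_entities).items():
    print(f"[{label}]:")
    for entity_text, score in ranked:
        print(f"  {entity_text} ({score:.2f})")
```

Sorting by score makes it easier to judge whether the `threshold=0.4` used above is too permissive for a given label.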


# Example 3) Working with GLiNER
**Difference from Example 2:** we define the helper functions for colouring and rendering the output ourselves. <br>
model: `urchade/gliner_large-v2.1`

In [None]:
from gliner import GLiNER
import time
from rich.console import Console

c = Console(soft_wrap=True)

In [None]:
import hashlib

def colour_code_for(label):
    hash_object = hashlib.sha256(label.encode())
    hash_int = int.from_bytes(hash_object.digest(), 'big')
    darker_color_code_int = hash_int % 0x505050
    return f'#{darker_color_code_int:06x}'
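
Note that `hash_int % 0x505050` only caps the combined value, so the green and blue bytes can still reach `0xff` (for example `#4fffff`), which reads poorly behind white text. If uniformly dark backgrounds matter, one alternative is to cap each channel separately. This is a sketch of a different technique, not the function used in this notebook; the name `dark_colour_code_for` and the `max_channel` parameter are hypothetical.

```python
import hashlib


def dark_colour_code_for(label, max_channel=0x50):
    """Derive a stable hex colour whose R, G and B bytes are each <= max_channel."""
    digest = hashlib.sha256(label.encode()).digest()
    # Take the first three digest bytes and cap each one independently.
    r, g, b = (byte % (max_channel + 1) for byte in digest[:3])
    return f"#{r:02x}{g:02x}{b:02x}"


for name in ["SYMPTOM", "DISORDER", "PERSON"]:
    print(name, dark_colour_code_for(name))
```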

In [None]:
def fill_gaps(entities, text):
    # With no entities, the whole text is a single unlabelled chunk.
    if not entities:
        return [{"start": 0, "end": len(text), "text": text}]

    chunks = []
    # Unlabelled text before the first entity.
    if entities[0]["start"] > 0:
        end = entities[0]["start"]
        chunks.append({"start": 0, "end": end, "text": text[:end]})

    # Each entity, followed by the unlabelled text up to the next entity.
    for prev_entity, next_entity in zip(entities[:-1], entities[1:]):
        chunks.append(prev_entity)
        start = prev_entity["end"]
        end = next_entity["start"]
        chunks.append({"start": start, "end": end, "text": text[start:end]})
    chunks.append(entities[-1])

    # Unlabelled text after the last entity (including a single trailing character).
    if entities[-1]["end"] < len(text):
        start = entities[-1]["end"]
        chunks.append({"start": start, "end": len(text), "text": text[start:]})
    return chunks


In [None]:
def render_labels(label_colour_codes):
    c = Console()
    c.print("Labels: ", end='')
    for label, colour in label_colour_codes.items():
        c.print(label, end='', style=f"bold white on {colour}")
        c.print("", end=' ')
    c.print()


def render_text(chunks, label_colour_codes):
    c = Console()
    for chunk in chunks:
        # Gap chunks have no "label" key, so they fall through to the default style.
        label = chunk.get("label")
        if label in label_colour_codes:
            style = f"bold white on {label_colour_codes[label]}"
        else:
            style = "white on black"
        c.print(chunk['text'], end='', style=style)


In [None]:

def annotate_text(text, labels, model):
    label_color_codes = {label: colour_code_for(label) for label in labels}
    render_labels(label_color_codes)

    start = time.time()
    entities = model.predict_entities(text, labels)
    end = time.time()
    render_text(fill_gaps(entities, text), label_color_codes)
    c.print(f"\nTime taken: {end-start:.2f} seconds")


In [None]:
# Load the model
model = GLiNER.from_pretrained("urchade/gliner_large-v2.1")

In [None]:
#text = ""
labels = ["SYMPTOM", "DISORDER", "PERSON", "TIME", "DATE", "ORDINAL", "CARDINAL", "LOCATION", "PERCENTAGE"]
annotate_text(text, labels, model)

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
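
The coloured rendering only shows up where `rich` styles are displayed. For logs or plain-text output, the same spans can be marked inline instead. This is a minimal sketch, assuming each entity dictionary has `start`, `end` and `label` keys as used by `fill_gaps`; the helper name `mark_entities` and the sample sentence are hypothetical.

```python
def mark_entities(text, entities):
    """Return text with each entity span rewritten as [span|LABEL]."""
    parts = []
    cursor = 0
    for entity in sorted(entities, key=lambda e: e["start"]):
        if entity["start"] < cursor:
            continue  # skip spans that overlap one already emitted
        parts.append(text[cursor:entity["start"]])
        span = text[entity["start"]:entity["end"]]
        parts.append(f"[{span}|{entity['label']}]")
        cursor = entity["end"]
    parts.append(text[cursor:])
    return "".join(parts)


# Hypothetical sample; offsets are computed from the string to avoid miscounting.
sample_text = "Sadness differs from a major depressive episode."
phrase = "major depressive episode"
phrase_start = sample_text.index(phrase)
sample_entities = [
    {"start": phrase_start, "end": phrase_start + len(phrase), "label": "DISORDER"},
    {"start": 0, "end": len("Sadness"), "label": "SYMPTOM"},
]
print(mark_entities(sample_text, sample_entities))
# [Sadness|SYMPTOM] differs from a [major depressive episode|DISORDER].
```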
