# BERT is a powerful puppet

We will use three BERT models, each fine-tuned for a different task: sentiment analysis, question answering, and named entity recognition.

## Importing (or downloading) our libraries

- If you are running the code on Colab, or if you don't have those libraries installed, execute the next cell

In [None]:
!pip install transformers torch datasets

Importing our libraries

In [None]:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
from transformers import AutoModelForTokenClassification, AutoModelForQuestionAnswering
import torch.nn.functional as F

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

## Variant 1: BERT for Sentiment Analysis

**Model:** `nlptown/bert-base-multilingual-uncased-sentiment`  
**Purpose:** Fine-tuned for 5-star sentiment analysis of product reviews

In [None]:
# our sentiment analysis model huggingface name
sentiment_model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

# we load the tokenizer and the model
sentiment_tokenizer = AutoTokenizer.from_pretrained(sentiment_model_name)
sentiment_model = AutoModelForSequenceClassification.from_pretrained(sentiment_model_name).to(device)


print(f"Model: {sentiment_model_name}")
print(f"Our classes: 1-star to 5-star ratings")

# these are the reviews we will be using for inference
reviews = [
    "The product exceeded my expectations! Absolutely fantastic quality and fast shipping.",
    "Terrible experience. Arrived broken and customer service was unhelpful.",
    "It's okay for the price. Does what it's supposed to but nothing special.",
    "Best purchase I've made this year! Worth every penny.",
    "Mediocre at best. I've seen better products for half the price."
]

## Using the Hugging Face pipeline (the easy way)
- Tasks:
    - Task 1: call `sentiment_pipeline` on each review to get `result`, a dict that holds the predicted `label` and its `score`
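
For orientation, a text-classification pipeline returns a list with one dict per input, each holding a `label` and a `score`. The sketch below uses a hand-written stand-in for that output: the numbers are made up, and the `"4 stars"` label format is an assumption about how this model names its classes.

```python
# Hypothetical stand-in for `sentiment_pipeline(review)`: a list with one dict per input.
# The values are illustrative, not real model output.
fake_output = [{"label": "4 stars", "score": 0.57}]

fake_result = fake_output[0]  # first (and only) prediction for a single review

# Assumes labels look like "4 stars"; check the model's id2label if unsure
fake_stars = int(fake_result["label"].split()[0])
fake_confidence = fake_result["score"]

print(f"Rating: {fake_stars}/5, confidence: {fake_confidence:.2%}")
```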

In [None]:
sentiment_pipeline = pipeline("sentiment-analysis",
                               model=sentiment_model_name,
                               device=0 if torch.cuda.is_available() else -1)

print("\n" + "="*60)
print("Sentiment Analysis with BERT (Pipeline)")
print("="*60)

for i, review in enumerate(reviews, 1):
    # --- Task 1 begins here ---
    result = sentiment_pipeline(review)[0]
    # --- Task 1 ends here ---
    print(f"\n{i}. Review: {review[:80]}...")
    print(f"   Rating: {result['label']}")
    print(f"   Confidence: {result['score']:.2%}")

## Running the inference manually
- Now let's assume we can't use the pre-built Hugging Face pipeline
- Using what you learned before, try to get the inference results and scores
- Tasks:
  - Task 1: Using the inputs provided, try to get the outputs using the `sentiment_model`
  - Task 2: Now get our `logits`, and from those, our `probs`
    - Hint: How do we get probabilities from 0 to 1 given logits?
  - Task 3: Now we need our `pred`, `label`, and `score`
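
As a reminder of the idea behind Tasks 2 and 3, here is a minimal sketch in plain Python (no model involved). The logits are made-up numbers, and the `example_*` names are hypothetical. In the notebook itself, `F.softmax` and `torch.argmax` play the same roles.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for five classes (1 star ... 5 stars); not real model output
example_logits = [-1.2, -0.3, 0.4, 2.1, 1.0]

example_probs = softmax(example_logits)

# argmax: index of the most probable class
example_pred = max(range(len(example_probs)), key=example_probs.__getitem__)
example_score = example_probs[example_pred]

print([round(p, 3) for p in example_probs])
print(f"predicted class id: {example_pred}, score: {example_score:.2%}")
```

The predicted id is then mapped to a readable label through the model's `config.id2label`.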


In [None]:
print("\n" + "="*60)
print("Sentiment Analysis with BERT (Manual)")
print("="*60)

print("\nAnalyzing Product Reviews:")
for i, review in enumerate(reviews, 1):
    inputs = sentiment_tokenizer(review, return_tensors="pt", truncation=True, padding=True).to(device)
    with torch.no_grad():
        # --- Task 1 begins here ---
        outputs = sentiment_model(**inputs)
        # --- Task 1 ends here ---
        # --- Task 2 begins here ---
        logits = outputs.logits[0]
        probs = F.softmax(logits, dim=-1)
        # --- Task 2 ends here ---
        # --- Task 3 begins here ---
        pred = int(torch.argmax(probs).cpu().item())
        label = sentiment_model.config.id2label[pred]
        score = float(probs[pred].cpu().item())
        # --- Task 3 ends here ---

    stars = label
    print(f"\n{i}. Review: {review[:80]}...")
    print(f"   Rating: {stars}  (mapped id: {pred})")
    print(f"   Confidence: {score:.2%}")

## Variant 2: BERT for Question Answering

**Model:** `deepset/bert-base-cased-squad2`  
**Purpose:** Fine-tuned for extractive question answering (SQuAD 2.0)

In [None]:
# path to our question answering model on huggingface
qa_model_name = "deepset/bert-base-cased-squad2"


# we load the tokenizer and the model
qa_tokenizer = AutoTokenizer.from_pretrained(qa_model_name)
qa_model = AutoModelForQuestionAnswering.from_pretrained(qa_model_name).to(device)

print("Question Answering Model Loaded")
print(f"Model: {qa_model_name}")
print(f"This model was trained on: SQuAD 2.0 dataset (100k+ question-answer pairs)")

In [None]:
qa_pipeline = pipeline("question-answering",
                       model=qa_model_name,
                       device=0 if torch.cuda.is_available() else -1)

print("\n" + "="*60)
print("Question Answering with BERT (Pipeline)")
print("="*60)

# our context paragraph
context = """
Albert Einstein, a theoretical physicist who revolutionized modern science,
was a patent clerk before developing his groundbreaking theories.
He built a monumental legacy with his work on relativity and quantum
mechanics, winning the Nobel Prize for Physics, and became a
leading public intellectual, advocating for civil rights,
pacifism, and academic freedom throughout his life.
"""
# questions
questions = [
    "What was Albert Einstein's profession before he became famous for his theories?",
    "For what achievement did Einstein win the Nobel Prize?",
    "Name one of the social causes he advocated for as a public intellectual.",
]

print("\nContext:")
print(context)
print("\n" + "-"*60)

for i, question in enumerate(questions, 1):
    print(f"\n{i}. Question: {question}")

    # pass question and context as keyword arguments
    result = qa_pipeline(question=question, context=context)

    answer = result['answer']
    confidence = result['score']
    start = result['start']
    end = result['end']

    snippet_start = max(0, start - 30)
    snippet_end = min(len(context), end + 30)
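    # flatten newlines up front: a backslash inside an f-string expression
    # is a SyntaxError before Python 3.12
    context_snippet = context[snippet_start:snippet_end].replace("\n", " ")

    print(f"   Answer: {answer}")
    print(f"   Confidence: {confidence:.2%}")
    print(f"   Context snippet: ...{context_snippet}...")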

## Variant 3: BERT for Named Entity Recognition (NER)

**Model:** `dslim/bert-base-NER`  
**Purpose:** Fine-tuned for identifying named entities (persons, organizations, locations)

In [None]:
# our NER model huggingface name
ner_model_name = "dslim/bert-base-NER"

# we load the tokenizer and the model
ner_tokenizer = AutoTokenizer.from_pretrained(ner_model_name)
ner_model = AutoModelForTokenClassification.from_pretrained(ner_model_name).to(device)

print(f"Model: {ner_model_name}")
print(f"Entity types: PER (Person), ORG (Organization), LOC (Location), MISC (Miscellaneous)")

# the texts we will be using for our NER task
texts = [
    "Apple Inc. was founded by Steve Jobs and Steve Wozniak in Cupertino, California.",
    "Elon Musk announced that Tesla will build a new factory in Berlin next year.",
    "Microsoft's CEO Satya Nadella visited the London office on Monday with executives from GitHub."
]

## Pipeline NER

In [None]:
ner_pipeline = pipeline("ner",
                        model=ner_model_name,
                        device=0 if torch.cuda.is_available() else -1,
                        aggregation_strategy="simple")

print("\n" + "="*60)
print("NER with BERT (Pipeline)")
print("="*60)

for i, text in enumerate(texts, 1):
    print(f"\n{i}. Text: {text}")
    entities = ner_pipeline(text)

    if entities:
        print("   Entities found:")
        for entity in entities:
            entity_type = entity['entity_group']
            entity_text = entity['word']
            confidence = entity['score']

            print(f"      {entity_type}: {entity_text} ({confidence:.2%})")
    else:
        print("   No entities detected.")

## Manual NER implementation
- This code approximates what the Hugging Face pipeline does under the hood
- The model returns token-level predictions, and we use character offset mappings to reconstruct entities
- The comments in the code explain what each step does and why it is implemented that way
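
Before involving the model, the merging idea can be tried on a toy example. The sketch below uses hand-written predictions (labels and character offsets are hypothetical, not model output) and a hypothetical `merge_bio` helper that follows the same B/I/O rules as the full code.

```python
# Hypothetical token-level predictions: (label, (char_start, char_end)) per token
toy_text = "Steve Jobs founded Apple"
toy_preds = [
    ("B-PER", (0, 5)),    # Steve
    ("I-PER", (6, 10)),   # Jobs
    ("O",     (11, 18)),  # founded
    ("B-ORG", (19, 24)),  # Apple
]

def merge_bio(preds):
    """Merge consecutive B-/I- tokens of the same type into character spans."""
    spans, current = [], None
    for label, (start, end) in preds:
        if label == "O":
            # an "outside" token closes any open entity
            if current is not None:
                spans.append(current)
                current = None
            continue
        prefix, ent_type = label.split("-", 1)
        if prefix == "B" or current is None or current["type"] != ent_type:
            # start a new entity (closing the previous one, if any)
            if current is not None:
                spans.append(current)
            current = {"type": ent_type, "start": start, "end": end}
        else:
            # "I-" token of the same type: extend the open entity
            current["end"] = end
    if current is not None:
        spans.append(current)
    return spans

toy_entities = [(s["type"], toy_text[s["start"]:s["end"]]) for s in merge_bio(toy_preds)]
print(toy_entities)
```

Extending the `end` offset rather than joining token strings is what lets the entity text be sliced straight from the original input, so subword pieces and spacing come out intact.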

In [None]:
print("\n" + "="*60)
print("NER with BERT (Manual)")
print("="*60)

# we are getting the mapping from label ids to label names (e.g., 0 -> "O", 1 -> "B-PER",...)
id2label = ner_model.config.id2label

for i, text in enumerate(texts, 1):
    # here is our NER tokenization
    # we are setting "return_offsets_mapping=True" to get the character offsets of each token
    # truncation=True means that text longer than the model's max length is truncated
    enc = ner_tokenizer(text, return_tensors="pt", return_offsets_mapping=True, truncation=True).to(device)

    # here we remove the offset mapping from the model inputs, we don't need it for inference, and the model might not expect it
    model_inputs = {k: v for k, v in enc.items() if k != "offset_mapping"}

    # we are doing inference, so we disable gradient calculations
    with torch.no_grad():
        logits = ner_model(**model_inputs).logits

    preds = logits.argmax(dim=-1)[0].cpu().numpy()
    offsets = enc["offset_mapping"][0].cpu().numpy()

    entities = []
    current_ent = None

    for idx, pred_id in enumerate(preds):
        off = offsets[idx]
        # we skip special tokens and tokens with no character offsets (like [CLS], [SEP], ...)
        if off[0] == off[1]:
            if current_ent is not None:
                entities.append(current_ent)
                current_ent = None
            continue

        label = id2label[int(pred_id)]

        # skip "O" labels (the "[O]utside" label)
        if label == "O":
            if current_ent is not None:
                entities.append(current_ent)
                current_ent = None
            continue

        # we split the label into prefix and entity type (e.g., "B-PER" -> ("B", "PER"))
        # we usually have "B" (begin), "I" (inside), "O" (outside)
        # example: B-PER, I-PER mean the [B]eginning and [I]nside of a [PER]son entity
        if "-" in label:
            prefix, ent_type = label.split("-", 1)
        else:
            prefix, ent_type = "B", label

        # we start a new entity if the prefix is "B" or if there is no current entity or if the entity type has changed
        if prefix == "B" or current_ent is None or current_ent["type"] != ent_type:
            if current_ent is not None:
                entities.append(current_ent)
            current_ent = {"type": ent_type, "start": int(off[0]), "end": int(off[1])}
        else:
            current_ent["end"] = int(off[1])

    # finalize the last entity if it exists
    if current_ent is not None:
        entities.append(current_ent)

    print(f"\n{i}. Text: {text}")
    if entities:
        print("   Entities found:")
        for e in entities:
            ent_text = text[e["start"]:e["end"]]
            print(f"      {e['type']}: {ent_text} (chars {e['start']}:{e['end']})")
    else:
        print("   No entities detected.")