# 📝 Named Entity Recognition + Image Classification - Demo Notebook
---
## 🎯 Goal: Verify whether a text description matches the animal shown in an image
---
### 🚀 Introduction
This notebook demonstrates:
- How to extract animal names from text using **NER (Named Entity Recognition)**.
- How to classify animals in images using a **CNN model**.
- How to compare the two results to decide whether the text matches the image.
- How to handle **edge cases** (e.g., multiple animals, incorrect labels).
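The comparison itself happens inside `pipeline.verify_animal`, whose source is not shown in this notebook. As a rough sketch only, assuming the NER step yields a set of animal names and the classifier yields a single class label, the check could look like the following. The helper name `text_matches_image` is hypothetical and not part of the pipeline.

```python
from typing import Iterable


def text_matches_image(extracted: Iterable[str], predicted: str) -> bool:
    """Return True when the predicted image class is among the extracted names.

    Hypothetical helper: the real pipeline may compare results differently.
    """
    # Normalise case and surrounding whitespace so "Cat" and "cat " compare equal.
    names = {name.strip().lower() for name in extracted}
    return predicted.strip().lower() in names


# Several animals in the text: a match on any one of them counts.
print(text_matches_image({"dog", "cat"}, "cat"))
# No animal extracted from the text: the check can never succeed.
print(text_matches_image(set(), "sheep"))
```

Under this design an empty extraction set always produces `False`, which is the behaviour seen in the outputs further down.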
---

In [1]:
# Install the required packages
%pip install -q -r requirements.txt

Note: you may need to restart the kernel to use updated packages.





In [2]:
# 📌 Step 1: Import Dependencies
import matplotlib.pyplot as plt
from PIL import Image
from pipeline import verify_animal

print('✅ All dependencies imported!')



✅ All dependencies imported!


## 📌 Step 2: Test the Pipeline on an Example
We run the pipeline on a sample sentence that mentions a cat and an image taken from the `cow` validation folder, so the expected result is `False`.

In [5]:
# Define example inputs
text = "There is a cat in the picture."
image_path = r"D:\Projects\Winstars_AI_DS_internship_test\task2_ner_image_pipeline\dataset\val\cow\OIP-_01xh7qI0nvypZLkOxGyIgHaFi.jpeg"

# Run the verification pipeline
result = verify_animal(text, image_path)

# Display result
print(f'✅ Verification Result: {result}')

Tokenized Text and Labels:
[CLS]: 0
There: 0
is: 0
a: 0
cat: 0
in: 0
the: 0
picture: 0
.: 0
[SEP]: 0
Extracted from Text: set()
Predicted from Image: sheep
✅ Verification Result: False


## 📌 Step 3: Test Edge Cases
Let's see how the pipeline handles edge cases, such as:
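1. **Multiple animals** mentioned in the text.
2. **Wrong animal** named in the text (the text says tiger, the image comes from the `cow` folder).
3. **Misspelled animal names** (for example, "cta" instead of "cat").
4. **No animal** mentioned in the text.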
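For the misspelling case, one option is to fuzzy-match each word against the known class names before comparing. The sketch below uses the standard library's `difflib.get_close_matches`; it is not what `verify_animal` does. The class list `KNOWN_CLASSES`, the helper `fuzzy_extract`, and the `0.65` cutoff are all assumptions chosen for illustration.

```python
import difflib
import re

# Hypothetical class list; the real classifier's labels are not shown in this notebook.
KNOWN_CLASSES = ["cat", "cow", "dog", "sheep", "horse"]


def fuzzy_extract(text: str, classes=KNOWN_CLASSES, cutoff: float = 0.65) -> set:
    """Return the class names that closely match some word in the text.

    The cutoff is an assumed value: too low and ordinary words start matching
    class names, too high and real typos are missed.
    """
    found = set()
    for word in re.findall(r"[a-z]+", text.lower()):
        # Keep at most the single best candidate per word.
        match = difflib.get_close_matches(word, classes, n=1, cutoff=cutoff)
        if match:
            found.add(match[0])
    return found


print(fuzzy_extract("There is a cta in the picture."))
print(fuzzy_extract("This is a beautiful landscape."))
```

The cutoff needs tuning against real data: short class names sit close to many ordinary words, so a permissive threshold trades missed typos for false matches.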

In [8]:
# All edge cases reuse the same validation image (taken from the `cow` folder)
cow_image = r"D:\Projects\Winstars_AI_DS_internship_test\task2_ner_image_pipeline\dataset\val\cow\OIP-_01xh7qI0nvypZLkOxGyIgHaFi.jpeg"

# Define edge case inputs as (text, image path) pairs
edge_cases = [
    ("There is a dog and a cat in the picture.", cow_image),  # multiple animals
    ("There is a tiger in the picture.", cow_image),          # wrong animal
    ("There is a cta in the picture.", cow_image),            # misspelled name
    ("This is a beautiful landscape.", cow_image),            # no animal
]

# Run pipeline on edge cases
for i, (text, image) in enumerate(edge_cases, start=1):
    print(f'Test Case {i}: {text}')
    result = verify_animal(text, image)
    print(f'✅ Verification Result: {result}')

Test Case 1: There is a dog and a cat in the picture.
Tokenized Text and Labels:
[CLS]: 0
There: 0
is: 0
a: 0
dog: 0
and: 0
a: 0
cat: 0
in: 0
the: 0
picture: 0
.: 0
[SEP]: 0
Extracted from Text: set()
Predicted from Image: sheep
✅ Verification Result: False
Test Case 2: There is a tiger in the picture.
Tokenized Text and Labels:
[CLS]: 0
There: 0
is: 0
a: 0
tiger: 0
in: 0
the: 0
picture: 0
.: 0
[SEP]: 0
Extracted from Text: set()
Predicted from Image: sheep
✅ Verification Result: False
Test Case 3: There is a cta in the picture.
Tokenized Text and Labels:
[CLS]: 0
There: 0
is: 0
a: 0
c: 0
##ta: 0
in: 0
the: 0
picture: 0
.: 0
[SEP]: 0
Extracted from Text: set()
Predicted from Image: sheep
✅ Verification Result: False
Test Case 4: This is a beautiful landscape.
Tokenized Text and Labels:
[CLS]: 0
This: 0
is: 0
a: 0
beautiful: 0
landscape: 0
.: 0
[SEP]: 0
Extracted from Text: set()
Predicted from Image: sheep
✅ Verification Result: False
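
In every run above, all tokens receive label `0`, so the extracted set is empty even for sentences that name an animal, and the misspelled "cta" is split into the WordPiece tokens `c` and `##ta`. The sketch below shows one way such output could be turned into a set of words: merge `##` continuation pieces back into whole words, then keep the words whose label marks an animal. It assumes label `1` means "animal", which the outputs above cannot confirm; `merge_wordpieces` and `extract_animals` are hypothetical helpers, not functions from `pipeline`.

```python
def merge_wordpieces(tokens, labels):
    """Merge '##' continuation pieces into whole words.

    A merged word takes the highest label among its pieces.
    """
    words, word_labels = [], []
    for token, label in zip(tokens, labels):
        if token in ("[CLS]", "[SEP]"):
            continue  # special tokens carry no text
        if token.startswith("##") and words:
            words[-1] += token[2:]
            word_labels[-1] = max(word_labels[-1], label)
        else:
            words.append(token)
            word_labels.append(label)
    return words, word_labels


def extract_animals(tokens, labels, animal_label=1):
    """Collect lower-cased words tagged with the (assumed) animal label."""
    words, word_labels = merge_wordpieces(tokens, labels)
    return {w.lower() for w, l in zip(words, word_labels) if l == animal_label}


tokens = ["[CLS]", "There", "is", "a", "c", "##ta", "in", "the", "picture", ".", "[SEP]"]
labels = [0] * len(tokens)  # same labels as in Test Case 3 above
print(merge_wordpieces(tokens, labels)[0])
print(extract_animals(tokens, labels))
```

With all labels at `0` the result is an empty set, matching the `Extracted from Text: set()` lines in the output.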
