<a href="https://colab.research.google.com/github/Nice9115/SmartBin_AI/blob/main/SmartBin_spaCy_NLP.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Smart Bin Project – AI Waste Management
## Task 3: NLP with spaCy
**Goal**: Extract location, waste type, and problem from citizen reports

In [5]:
# CELL 1: Install & load BETTER model (en_core_web_md)
!pip install spacy
!python -m spacy download en_core_web_md

import spacy
nlp = spacy.load("en_core_web_md")
print("spaCy MEDIUM model loaded! (Better at finding locations)")

Collecting en-core-web-md==3.8.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_md-3.8.0/en_core_web_md-3.8.0-py3-none-any.whl (33.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33.5/33.5 MB 22.0 MB/s eta 0:00:00
Installing collected packages: en-core-web-md
Successfully installed en-core-web-md-3.8.0
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_md')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.
spaCy MEDIUM model loaded! (Better at finding locations)


In [1]:
# ONLY RUN THIS AFTER RESTART (reloads the medium model installed above)
import spacy
nlp = spacy.load("en_core_web_md")
print("spaCy is now fully loaded!")

spaCy is now fully loaded!


In [2]:
# STEP 2: Sample (made-up) complaints from Nairobi citizens
complaints = [
    "The bin near Kenyatta Market is full of plastic bottles",
    "Glass waste not collected in Westlands for 3 days",
    "Organic waste smells bad at Ngong Road junction",
    "Metal cans overflowing near Uhuru Park",
    "Paper trash scattered around Kibera"
]

print("Complaints loaded!")

Complaints loaded!


In [6]:
# CELL 2: STEP 3 – Extract location, waste, problem (with better model)
for text in complaints:
    doc = nlp(text)  # ← Uses en_core_web_md now!

    location = ""
    waste_type = ""
    problem = ""

    # Find location (named entities tagged as places)
    for ent in doc.ents:
        if ent.label_ in ["GPE", "LOC"]:
            location = ent.text

    # Find waste type by scanning every token, not just named entities
    for token in doc:
        if token.lower_ in ["plastic", "glass", "organic", "metal", "paper", "cardboard", "trash"]:
            waste_type = token.lower_
            break  # keep the first match

    # Find problem
    text_lower = text.lower()
    if "full" in text_lower or "overflow" in text_lower:
        problem = "Overflowing"
    elif "smell" in text_lower:
        problem = "Bad smell"
    elif "not collected" in text_lower:
        problem = "Not collected"
    elif "scattered" in text_lower:
        problem = "Scattered"

    print(f"\nComplaint: {text}")
    print(f"   → Location: {location}")
    print(f"   → Waste: {waste_type}")
    print(f"   → Problem: {problem}")


Complaint: The bin near Kenyatta Market is full of plastic bottles
   → Location: 
   → Waste: plastic
   → Problem: Overflowing

Complaint: Glass waste not collected in Westlands for 3 days
   → Location: Westlands
   → Waste: glass
   → Problem: Not collected

Complaint: Organic waste smells bad at Ngong Road junction
   → Location: 
   → Waste: organic
   → Problem: Bad smell

Complaint: Metal cans overflowing near Uhuru Park
   → Location: 
   → Waste: metal
   → Problem: Overflowing

Complaint: Paper trash scattered around Kibera
   → Location: 
   → Waste: paper
   → Problem: Scattered
   → Problem: Scattered
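
The output above leaves Location empty for four of the five complaints, which suggests the statistical NER does not tag these Nairobi place names as `GPE` or `LOC`. One possible workaround is a rule-based `entity_ruler` with a small gazetteer. The sketch below is an assumption-laden illustration, not part of the original notebook: `nlp_rules`, `NAIROBI_PLACES`, and `find_location` are hypothetical names, and the place list is just the five names from the sample complaints. It uses a blank English pipeline so it runs without downloading a model.

```python
import spacy

# Blank English pipeline: tokenizer only, so no model download is required.
nlp_rules = spacy.blank("en")
ruler = nlp_rules.add_pipe("entity_ruler")

# Hypothetical gazetteer (illustrative only, taken from the sample complaints).
NAIROBI_PLACES = ["Kenyatta Market", "Westlands", "Ngong Road", "Uhuru Park", "Kibera"]

# Token patterns on LOWER make the match case-insensitive.
ruler.add_patterns([
    {"label": "LOC", "pattern": [{"LOWER": word.lower()} for word in place.split()]}
    for place in NAIROBI_PLACES
])

def find_location(text):
    """Return the first rule-matched place name, or an empty string."""
    doc = nlp_rules(text)
    for ent in doc.ents:
        if ent.label_ == "LOC":
            return ent.text
    return ""

print(find_location("Metal cans overflowing near Uhuru Park"))
```

To combine the rules with the statistical model rather than replace it, the same patterns could be added to the loaded pipeline with `nlp.add_pipe("entity_ruler", before="ner")`, so that rule matches take precedence over the model's predictions.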

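The problem detection in the extraction cell is an `if`/`elif` chain over keywords. As the keyword list grows, a table of ordered rules may be easier to maintain. This is a minimal sketch of the same logic, assuming the same four labels; `PROBLEM_RULES` and `classify_problem` are hypothetical names introduced here.

```python
# Ordered (label, keywords) rules; the first rule with a matching keyword wins,
# mirroring the order of the if/elif chain.
PROBLEM_RULES = [
    ("Overflowing", ("full", "overflow")),
    ("Bad smell", ("smell",)),
    ("Not collected", ("not collected",)),
    ("Scattered", ("scattered",)),
]

def classify_problem(text, rules=PROBLEM_RULES):
    """Return the label of the first matching rule, or an empty string."""
    text_lower = text.lower()
    for label, keywords in rules:
        if any(keyword in text_lower for keyword in keywords):
            return label
    return ""

print(classify_problem("Glass waste not collected in Westlands for 3 days"))
```

Note that this is plain substring matching, as in the original cell, so a word such as "carefully" would also trigger the "full" rule; matching on tokens or lemmas would be stricter.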

In [7]:
# CELL 3: STEP 4 – Visualize entities (with better model)
from spacy import displacy

sample = complaints[0]  # First complaint
doc = nlp(sample)       # ← Uses en_core_web_md

displacy.render(doc, style="ent", jupyter=True)
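
If the model finds no entities in the sample, the rendered view has nothing to highlight. For checking what a correct annotation would look like, displaCy can also render hand-written spans with `manual=True`. The sketch below assumes the span for "Kenyatta Market" and the `LOC` label purely for illustration; `manual_doc` and `html` are hypothetical names, and `jupyter=False` is passed so the call returns the HTML markup as a string instead of displaying it.

```python
from spacy import displacy

text = "The bin near Kenyatta Market is full of plastic bottles"
place = "Kenyatta Market"
start = text.index(place)  # character offset of the place name

# Hand-written entity annotation in displaCy's manual format (assumed label).
manual_doc = {
    "text": text,
    "ents": [{"start": start, "end": start + len(place), "label": "LOC"}],
    "title": None,
}

# With jupyter=False the markup is returned as a string.
html = displacy.render(manual_doc, style="ent", manual=True, jupyter=False)
print(len(html), "characters of HTML")
```

In a notebook, passing `jupyter=True` instead would display the highlighted text inline, as in the cell above.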