<a href="https://www.kaggle.com/code/mdshahnewazibrahim/ai-news-summarizer-notebook?scriptVersionId=278692417" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# 🌐 AI News Summarizer (Kaggle Capstone Project)

This notebook demonstrates an AI-powered news summarization pipeline using **Google Gemini 2.5 Flash**.
The goal is to:

- Fetch real news articles from the web (public URLs)
- Clean and extract the main article text
- Send the content to Gemini 2.5 Flash
- Receive a **structured JSON-style summary** that includes:
  - Main summary
  - Bullet points
  - Category
  - Entities
  - Sentiment
  - Impact
- Optionally build a **daily digest** from multiple articles
- Run a simple, length-based **evaluation** against human-written summaries

---

## 1. Problem Statement

Modern news consumption suffers from information overload:
- Long articles are time-consuming to read
- Readers want quick, clear summaries
- Bangla-friendly summaries for international news are rare

This project builds an AI News Summarizer that converts raw news URLs into concise,
human-friendly summaries in multiple styles.

---

## 2. High-Level Architecture

```text
News URL
   │
   ▼
[ Scraper & Cleaner ]   →  Extract plain text from HTML
   │
   ▼
[ Prompt Builder ]      →  Add instructions + article
   │
   ▼
[ Gemini 2.5 Flash ]    →  Generate JSON-style summary
   │
   ▼
[ Output Formatter ]    →  Pretty print results / daily digest
```
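
The four stages in the diagram can be read as a plain function composition. The sketch below is illustrative only: `fake_model` is a hypothetical stand-in for Gemini so the flow can be followed without network access or an API key, and the function names are assumptions, not the helpers defined later in this notebook.

```python
import json
from typing import Any, Callable, Dict


def clean(text: str) -> str:
    """Stage 1 (stand-in): collapse runs of whitespace in extracted text."""
    return " ".join(text.split())


def build_prompt(article: str, style: str) -> str:
    """Stage 2: instructions first, then the article."""
    return f"Summarize as JSON. Style: {style}\n\nArticle:\n{article}"


def fake_model(prompt: str) -> str:
    """Stage 3 (hypothetical stub): returns a fixed JSON string, no API call."""
    return json.dumps({"summary": "stub summary", "bullets": ["point one"]})


def format_output(raw: str) -> Dict[str, Any]:
    """Stage 4: parse the model's JSON reply into a dict."""
    return json.loads(raw)


def run_pipeline(text: str, style: str, model: Callable[[str], str]) -> Dict[str, Any]:
    """Chain the four stages in the order shown in the diagram."""
    return format_output(model(build_prompt(clean(text), style)))


result = run_pipeline("  Some   article text.  ", "English, simple", fake_model)
print(result["summary"])
```

Passing the model in as a callable keeps the other stages testable on their own; the real notebook uses a global `model` object instead.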

---


In [1]:
# ==================================
# 1. Imports & Configuration
# ==================================

import json
import re
import textwrap
from typing import List, Dict, Any, Optional

import requests
from bs4 import BeautifulSoup

import google.generativeai as gen
from kaggle_secrets import UserSecretsClient

# ----------------------------------
# Load Gemini API Key from Kaggle Secrets
# ----------------------------------

secrets = UserSecretsClient()
GOOGLE_API_KEY = secrets.get_secret("GOOGLE_API_KEY")

if not GOOGLE_API_KEY:
    raise ValueError("❌ GOOGLE_API_KEY not found in Kaggle secrets. Please set it before running.")

gen.configure(api_key=GOOGLE_API_KEY)

MODEL_NAME = "gemini-2.5-flash"
model = gen.GenerativeModel(MODEL_NAME)

# ----------------------------------
# Global Configuration
# ----------------------------------

MAX_ARTICLE_CHARS = 4000   # Trim long articles before sending to the model
MIN_PARAGRAPH_LEN = 40     # Skip very short <p> tags (likely noise)
MIN_ARTICLE_LEN = 200      # Minimum length for summarization

LANGUAGE_MODES = {
    "bn_simple": "Bangla, very simple, conversational tone.",
    "bn_formal": "Bangla, formal news style.",
    "en_simple": "English, simple beginner-friendly style.",
    "en_formal": "English, formal journalistic style.",
}

print("✅ Configuration loaded. Gemini model initialized.")


✅ Configuration loaded. Gemini model initialized.
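
The cell above imports `kaggle_secrets`, which is generally only available inside Kaggle notebooks. For running the same code elsewhere, one possible approach is to fall back to an environment variable. This is a minimal sketch, and `load_api_key` is a hypothetical helper that is not part of the original notebook.

```python
import os
from typing import Optional


def load_api_key(name: str = "GOOGLE_API_KEY") -> Optional[str]:
    """Return the key from Kaggle Secrets if possible, else from the environment."""
    try:
        # Assumption: this import only succeeds inside a Kaggle notebook
        from kaggle_secrets import UserSecretsClient

        key = UserSecretsClient().get_secret(name)
        if key:
            return key
    except Exception:
        # Not on Kaggle, or the secret is not attached to this notebook
        pass
    return os.environ.get(name)


key = load_api_key()
print("API key found." if key else "No API key configured.")
```

The caller can then decide whether a missing key should raise, as the original cell does.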


In [2]:
# ==================================
# 2. Scraper & Helper Functions
# ==================================

def fetch_article_from_url(url: str) -> tuple[str, str]:
    """Fetch the title and main article text from a news URL."""
    print(f"🌐 Fetching article from: {url}")
    try:
        resp = requests.get(url, timeout=20)
        resp.raise_for_status()
    except Exception as e:
        print("❌ Request failed:", e)
        return "Untitled", ""

    soup = BeautifulSoup(resp.text, "html.parser")

    # Title
    title = soup.title.get_text(strip=True) if soup.title else "Untitled"

    # Collect paragraph text
    paragraphs = []
    for p in soup.find_all("p"):
        text = p.get_text(" ", strip=True)
        if text and len(text) >= MIN_PARAGRAPH_LEN:
            paragraphs.append(text)

    article_text = "\n".join(paragraphs).strip()
    print(f"✅ Extracted {len(article_text.split())} words from article.")
    return title, article_text


def clean_json_string(s: str) -> str:
    """Remove ```json ... ``` wrappers if the model returns code fences."""
    s = s.strip()
    if s.startswith("```"):
        # remove leading ```json or ```
        s = re.sub(r"^```(?:json)?", "", s, flags=re.IGNORECASE).strip()
        # remove trailing ```
        if s.endswith("```"):
            s = s[:-3].strip()
    return s


def safe_load_json(s: str) -> Dict[str, Any]:
    """Try to parse a JSON object from a model output string.
    If parsing fails, or the result is not a JSON object,
    return a dict with the cleaned text under 'summary'.
    """
    s = clean_json_string(s)
    try:
        data = json.loads(s)
    except json.JSONDecodeError:
        data = None
    # Callers use dict methods, so a bare list, string, or number falls back too
    if isinstance(data, dict):
        return data
    return {"summary": s}

print("✅ Helper functions ready.")


✅ Helper functions ready.
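
`clean_json_string` handles a reply that is wrapped in code fences, but a model may also add a sentence before or after the JSON. A possible complement, sketched here under that assumption, scans for the first parseable `{...}` object. `extract_json_object` is a hypothetical helper, not part of the notebook; it relies only on the standard library's `json.JSONDecoder.raw_decode`.

```python
import json
from typing import Any, Dict, Optional


def extract_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the first parseable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Try again from the next opening brace
        start = text.find("{", start + 1)
    return None


reply = 'Here is the summary:\n```json\n{"summary": "ok", "bullets": []}\n```\nHope this helps!'
print(extract_json_object(reply))
```

If no object is found, the caller can still fall back to treating the whole reply as plain summary text, as `safe_load_json` does.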


In [3]:
# ==================================
# 3. Core NewsSummarizer Class (Gemini 2.5 Flash)
# ==================================

class NewsSummarizer:
    """High-level wrapper around Gemini 2.5 Flash for news summarization."""

    def __init__(self, model: Any):
        self.model = model

    def _build_instruction(self, mode: str) -> str:
        """Build the instruction text for the model."""
        style = LANGUAGE_MODES.get(mode, LANGUAGE_MODES["en_simple"])

        instruction = f"""
You are an expert news summarization assistant.

You will receive the raw text of a news article and the source URL.
Your task is to produce a JSON object with the following exact keys:

- "language": string (e.g., "bn" or "en")
- "summary": string (2–4 short paragraphs)
- "bullets": array of strings (5–7 key bullet points)
- "category": string (e.g., "Politics", "Technology", "Economy", "Health", "Sports", "Environment", "Other")
- "entities": array of strings (important people, places, organizations, countries)
- "sentiment": string ("Positive", "Negative", "Neutral", or "Mixed")
- "impact": string (who is affected and how, in 1–3 sentences)

Write the summary and bullets in this style:
{style}

Return only valid JSON, with no extra commentary, no backticks, and no markdown.
"""
        return textwrap.dedent(instruction).strip()

    def _build_prompt(self, article_text: str, url: Optional[str] = None) -> str:
        """Build the user-facing part of the prompt from article text + URL."""
        article_trimmed = article_text[:MAX_ARTICLE_CHARS]

        prompt = f"{article_trimmed}\n\nSource URL: {url or 'N/A'}"
        return prompt

    def summarize_article(
        self,
        article_text: str,
        url: Optional[str] = None,
        mode: str = "bn_simple",
    ) -> Dict[str, Any]:
        """Summarize raw article text using Gemini and return a structured dict."""
        if not article_text or len(article_text) < MIN_ARTICLE_LEN:
            return {
                "language": "unknown",
                "summary": "Article text is too short to summarize.",
                "bullets": [],
                "category": "Other",
                "entities": [],
                "sentiment": "Neutral",
                "impact": "",
            }

        instruction = self._build_instruction(mode)
        user_prompt = self._build_prompt(article_text, url=url)

        # Single string prompt (no roles / system_instruction to avoid API issues)
        full_prompt = instruction + "\n\nArticle:\n\n" + user_prompt

        try:
            response = self.model.generate_content(full_prompt)
            raw_text = response.text or ""
            data = safe_load_json(raw_text)

            # Ensure all expected keys exist
            data.setdefault("language", "unknown")
            data.setdefault("summary", "")
            data.setdefault("bullets", [])
            data.setdefault("category", "Other")
            data.setdefault("entities", [])
            data.setdefault("sentiment", "Neutral")
            data.setdefault("impact", "")
            return data
        except Exception as e:
            print("❌ Gemini error:", e)
            return {
                "language": "unknown",
                "summary": f"Model error: {e}",
                "bullets": [],
                "category": "Other",
                "entities": [],
                "sentiment": "Neutral",
                "impact": "",
            }

    def summarize_url(self, url: str, mode: str = "bn_simple") -> Dict[str, Any]:
        """End-to-end: URL → fetch article → summarize with Gemini."""
        title, article_text = fetch_article_from_url(url)
        result = self.summarize_article(article_text, url=url, mode=mode)
        result["title"] = title
        result["url"] = url
        return result


# create a global summarizer instance
summarizer = NewsSummarizer(model)
print("✅ NewsSummarizer is ready.")



✅ NewsSummarizer is ready.
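
`summarize_article` fills in missing keys with `setdefault`, but it does not check that the values have the requested types. The model may, for instance, return `bullets` as a single string or `sentiment` in lowercase. The sketch below shows one way such values could be normalized; `normalize_result` and `_as_str_list` are hypothetical helpers, and treating unknown sentiment labels as "Neutral" is an assumption.

```python
from typing import Any, Dict, List

# The four labels requested in the instruction prompt
ALLOWED_SENTIMENTS = {"Positive", "Negative", "Neutral", "Mixed"}


def _as_str_list(value: Any) -> List[str]:
    """Coerce a value into a list of non-empty strings."""
    if isinstance(value, str):
        value = [value]
    if not isinstance(value, list):
        return []
    return [str(v).strip() for v in value if v is not None and str(v).strip()]


def normalize_result(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of data with predictable types for the main fields."""
    out = dict(data)
    out["summary"] = str(out.get("summary") or "").strip()
    out["bullets"] = _as_str_list(out.get("bullets"))
    out["entities"] = _as_str_list(out.get("entities"))
    sentiment = str(out.get("sentiment") or "").strip().capitalize()
    out["sentiment"] = sentiment if sentiment in ALLOWED_SENTIMENTS else "Neutral"
    # The category list in the prompt is given as examples, so only fill blanks
    out["category"] = str(out.get("category") or "").strip() or "Other"
    return out


raw = {"summary": " text ", "bullets": "only one point", "sentiment": "negative"}
print(normalize_result(raw))
```

This keeps `print_summary` from iterating over the characters of a string when `bullets` is not a list.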


In [4]:
# ==================================
# 4. Wrapper & Pretty Printer
# ==================================

def summarize_news_url(url: str, mode: str = "bn_simple") -> Dict[str, Any]:
    """Simple function to summarize a single news URL."""
    return summarizer.summarize_url(url, mode=mode)


def print_summary(result: Dict[str, Any]) -> None:
    """Pretty-print a summary result dict."""
    print("=" * 80)
    print("📰 TITLE:", result.get("title", "(no title)"))
    print("🔗 URL   :", result.get("url", ""))
    print("🏷️  Category:", result.get("category", ""))
    print("😊 Sentiment:", result.get("sentiment", ""))
    print("-" * 80)
    print("📄 SUMMARY:\n")
    print(result.get("summary", ""))
    print("\n• Bullet Points:")
    for i, b in enumerate(result.get("bullets", []), start=1):
        print(f"  {i}. {b}")
    if result.get("entities"):
        print("\n🔑 Entities:")
        print(", ".join(result["entities"]))
    if result.get("impact"):
        print("\n📌 Impact:")
        print(result["impact"])
    print("=" * 80)

print("✅ Wrapper and printer ready.")


✅ Wrapper and printer ready.


In [5]:
# ==================================
# 5. Single Article Demo (you can change the URL)
# ==================================

demo_url = "https://www.bbc.com/news/articles/c891jp9j79do"  # example URL

demo_result = summarize_news_url(demo_url, mode="bn_simple")
print_summary(demo_result)


🌐 Fetching article from: https://www.bbc.com/news/articles/c891jp9j79do
✅ Extracted 903 words from article.
📰 TITLE: Trump says he will sue BBC for at least $1bn over Panorama edit
🔗 URL   : https://www.bbc.com/news/articles/c891jp9j79do
🏷️  Category: Politics
😊 Sentiment: Negative
--------------------------------------------------------------------------------
📄 SUMMARY:

যুক্তরাষ্ট্রের সাবেক প্রেসিডেন্ট ডোনাল্ড ট্রাম্প বিবিসি-র বিরুদ্ধে বিশাল অংকের মামলা করার হুমকি দিয়েছেন। বিবিসি তাদের 'প্যানোরামা' অনুষ্ঠানে ট্রাম্পের একটি গুরুত্বপূর্ণ বক্তৃতার অংশ ভুলভাবে সম্পাদনা করেছিল, যা নিয়ে এই বি

In [6]:
# ==================================
# 6. Multi-URL Daily Digest
# ==================================

def summarize_multiple_urls(urls: List[str], mode: str = "bn_simple") -> List[Dict[str, Any]]:
    results = []
    for u in urls:
        print("\n" + "#" * 80)
        print("Processing:", u)
        results.append(summarize_news_url(u, mode=mode))
    return results


def build_daily_digest(results: List[Dict[str, Any]], mode: str = "bn_simple") -> str:
    """Ask Gemini to build a concise daily briefing from multiple article summaries."""
    items = []
    for r in results:
        items.append({
            "title": r.get("title", ""),
            "category": r.get("category", ""),
            "summary": r.get("summary", ""),
            "sentiment": r.get("sentiment", ""),
        })

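    # Pass the human-readable style description, not the internal mode key
    style = LANGUAGE_MODES.get(mode, LANGUAGE_MODES["en_simple"])
    prompt_obj = {
        "instruction": f"Create a concise daily news briefing in this style: {style}",
        "items": items,
    }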

    prompt = json.dumps(prompt_obj, ensure_ascii=False)

    try:
        response = model.generate_content(prompt)
        return response.text or "(no response)"
    except Exception as e:
        return f"Model error while building daily digest: {e}"


def print_daily_digest(digest_text: str) -> None:
    print("\n" + "=" * 80)
    print("📰 DAILY BRIEFING")
    print("=" * 80)
    print(digest_text)
    print("=" * 80)

print("✅ Daily digest helpers ready.")


✅ Daily digest helpers ready.
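
When the digest covers several URLs, it can help to inspect how the per-article results are distributed before asking the model for a briefing. The sketch below builds a local outline grouped by category, with no model call. `group_by_category` and `outline` are hypothetical helpers, and the sample titles are made up for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_category(results: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each category to the titles of the articles filed under it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for r in results:
        category = r.get("category") or "Other"
        grouped[category].append(r.get("title") or "(no title)")
    return dict(grouped)


def outline(results: List[Dict[str, Any]]) -> str:
    """Render a plain-text outline, categories in alphabetical order."""
    lines: List[str] = []
    for category, titles in sorted(group_by_category(results).items()):
        lines.append(f"{category} ({len(titles)})")
        lines.extend(f"  - {t}" for t in titles)
    return "\n".join(lines)


# Made-up sample results shaped like the dicts returned by summarize_news_url
sample = [
    {"title": "A", "category": "Politics"},
    {"title": "B", "category": "Technology"},
    {"title": "C", "category": "Politics"},
]
print(outline(sample))
```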


In [7]:
# Example usage for Daily Digest (you can add your own URLs)

digest_urls = [
    "https://www.bbc.com/news/articles/c891jp9j79do",
    # Add more news URLs here if you want
]

digest_results = summarize_multiple_urls(digest_urls, mode="bn_simple")
daily_digest_text = build_daily_digest(digest_results, mode="bn_simple")
print_daily_digest(daily_digest_text)



################################################################################
Processing: https://www.bbc.com/news/articles/c891jp9j79do
🌐 Fetching article from: https://www.bbc.com/news/articles/c891jp9j79do
✅ Extracted 903 words from article.

📰 DAILY BRIEFING
আজকের দৈনিক সংবাদ ব্রিফিং-এ আপনাকে স্বাগতম।

**আজকের প্রধান খবর:**

*   **রাজনীতি:** ডোনাল্ড ট্রাম্প বিবিসির বিরুদ্ধে অন্তত ১ বিলিয়ন ডলার ক্ষতিপূরণের মামলা করবেন বলে জানিয়েছেন। ট্রাম্পের অভিযোগ, বিবিসি-র 'প্যানোরামা' অনুষ্ঠানে তার ২০২১ সালের ৬ জানুয়ারির ভাষণ এমনভাবে সম্পাদনা ক

In [8]:
# ==================================
# 7. Prompt Experiments (for report / analysis)
# ==================================

def experiment_prompts(article_text: str, url: str) -> Dict[str, str]:
    """Compare different prompt styles for the same article."""
    article_trimmed = article_text[:MAX_ARTICLE_CHARS]

    base = f"URL: {url}\n\nArticle:\n{article_trimmed}"

    prompt_variants = {
        "short_bullets": "Summarize in Bangla using ONLY 5 bullet points.",
        "detailed_paragraphs": "Summarize in Bangla using 2 paragraphs and then 3 bullet points.",
        "json_structured": "Summarize in Bangla and respond ONLY in JSON with keys: summary, bullets, category.",
    }

    outputs: Dict[str, str] = {}

    for name, instr in prompt_variants.items():
        print("\nRunning variant:", name)
        full_prompt = instr + "\n\n" + base
        try:
            resp = model.generate_content(full_prompt)
            outputs[name] = resp.text or ""
        except Exception as e:
            outputs[name] = f"Model error: {e}"

    return outputs


# Example experiment on a single article (you can comment this out if not needed)
exp_title, exp_article = fetch_article_from_url("https://www.bbc.com/news/articles/c891jp9j79do")
exp_outputs = experiment_prompts(exp_article, "https://www.bbc.com/news/articles/c891jp9j79do")

for name, text_out in exp_outputs.items():
    print("\n" + "=" * 80)
    print("VARIANT:", name)
    print("=" * 80)
    print(text_out)


🌐 Fetching article from: https://www.bbc.com/news/articles/c891jp9j79do
✅ Extracted 903 words from article.

Running variant: short_bullets

Running variant: detailed_paragraphs

Running variant: json_structured

VARIANT: short_bullets
এখানে বিবিসি নিবন্ধটির একটি সংক্ষিপ্ত সার বাংলাতে দেওয়া হলো:

*   ডোনাল্ড ট্রাম্প বিবিসি-এর বিরুদ্ধে তার ৬ জানুয়ারি, ২০২১-এর বক্তৃতা ভুলভাবে সম্পাদনা করার জন্য ১ থেকে ৫ বিলিয়ন ডলারের মামলা করার হুমকি দিয়েছেন।
*   বিবিসি অনিচ্ছাকৃতভাবে "সহিংসতার সরাসরি আহ্বান" এর ভুল ধারণা দেওয়ার জন্য ক্
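
The variants are printed in full, which makes them hard to compare at a glance. A few simple measurements per output, such as word count, number of bullet lines, and whether the text parses as JSON, may be enough for a first comparison. This is a minimal sketch: `describe_output` is a hypothetical helper, and the bullet pattern is an assumption about how the model formats lists.

```python
import json
import re
from typing import Any, Dict

# Lines starting with *, -, • or "1." / "1)" followed by whitespace
BULLET_RE = re.compile(r"^[ \t]*(?:[*\-•]|\d+[.)])[ \t]+", re.MULTILINE)


def describe_output(text: str) -> Dict[str, Any]:
    """Collect rough, format-level statistics for one model output."""
    stripped = text.strip()
    # Drop a leading/trailing code fence before testing for JSON
    unfenced = re.sub(r"^```(?:json)?\s*|\s*```$", "", stripped, flags=re.IGNORECASE)
    try:
        json.loads(unfenced)
        is_json = True
    except json.JSONDecodeError:
        is_json = False
    return {
        "words": len(stripped.split()),
        "bullets": len(BULLET_RE.findall(stripped)),
        "valid_json": is_json,
    }


sample_output = "Intro line:\n\n*   first point\n*   second point"
print(describe_output(sample_output))
```

With the notebook's own data this could be applied as `{name: describe_output(text) for name, text in exp_outputs.items()}`.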

In [9]:
# ==================================
# 8. Simple Evaluation Against Human Summary
# ==================================

# You should replace the 'human_summary' text with your own manual summary
# for proper evaluation.

eval_data = [
    {
        "url": "https://www.bbc.com/news/articles/c891jp9j79do",
        "human_summary": "Write your human-made reference summary here.",
    },
]


def run_eval(data_list: List[Dict[str, str]], mode: str = "bn_simple") -> List[Dict[str, Any]]:
    results: List[Dict[str, Any]] = []
    for item in data_list:
        url = item["url"]
        human = item["human_summary"]

        ai_result = summarize_news_url(url, mode=mode)
        ai_summary = ai_result.get("summary", "")

        human_words = len(human.split())
        ai_words = len(ai_summary.split())
        length_ratio = round(ai_words / max(human_words, 1), 2)

        results.append({
            "url": url,
            "human_words": human_words,
            "ai_words": ai_words,
            "length_ratio": length_ratio,
            "human_summary": human,
            "ai_summary": ai_summary,
        })
    return results


evaluations = run_eval(eval_data, mode="bn_simple")

for r in evaluations:
    print("\n" + "=" * 80)
    print("URL:", r["url"])
    print("Human words:", r["human_words"])
    print("AI words:", r["ai_words"])
    print("Length ratio (AI/Human):", r["length_ratio"])
    print("\nHuman Summary:\n", r["human_summary"])
    print("\nAI Summary:\n", r["ai_summary"])


🌐 Fetching article from: https://www.bbc.com/news/articles/c891jp9j79do
✅ Extracted 903 words from article.

URL: https://www.bbc.com/news/articles/c891jp9j79do
Human words: 6
AI words: 164
Length ratio (AI/Human): 27.33

Human Summary:
 Write your human-made reference summary here.

AI Summary:
 ডোনাল্ড ট্রাম্প বলেছেন যে তিনি বিবিসি-র বিরুদ্ধে আইনি ব্যবস্থা নেবেন। কারণ বিবিসি তার একটি ভাষণ এমনভাবে সম্পাদনা করেছে, যা ভুল বার্তা দিয়েছে। যদিও বিবিসি এই ভুল সম্পাদনার জন্য ক্ষমা চেয়েছে, তবে তারা কোনো ক্ষতিপূরণ দিতে রাজি হয়নি। ট্রাম্প এর জন্য ১ বিলিয়ন থেক

## 9. Limitations & Future Work

### Limitations
- Relies only on article text; no external fact-checking.
- Paywalled or heavily JavaScript-based sites may not scrape correctly.
- Very short or noisy articles may not produce good summaries.
- AI may carry over bias from the original news source.

### Future Work
- Add custom scrapers for Bangla news websites.
- Integrate external fact-checking or cross-source verification.
- Build a browser extension ("Summarize this article").
- Create a mobile app or Telegram bot around this pipeline.
- Add ROUGE/BERTScore metrics for a more robust quantitative evaluation.
- Allow users to control summary length and tone from a simple UI.
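
As a starting point for the ROUGE item above, unigram overlap (the idea behind ROUGE-1) can be computed with the standard library alone. This is a simplified sketch, not the official ROUGE implementation: it assumes whitespace tokenization, which is likely too coarse for Bangla, and applies no stemming. `rouge1` is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict


def rouge1(reference: str, candidate: str) -> Dict[str, float]:
    """Unigram-overlap precision, recall, and F1 between two texts."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared token
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

In `run_eval`, such a score could be stored next to `length_ratio` once real human reference summaries replace the placeholder text.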
