# 🔍 Crime Intelligence Newspaper Agent

This notebook implements a simple autonomous AI agent that monitors news sources for crime-related incidents, including **cybercrime, theft, robbery, assault, fraud, and other serious criminal activity**. The agent fetches newspaper articles, identifies crime mentions across multiple categories, summarizes key information, and generates a daily intelligence brief for law enforcement agencies. This is a **Stage-1 prototype** focused on clarity and simplicity, designed to evolve into a production-ready system.

## What is an AI Agent?

Think of an AI agent as a **smart assistant that can act on its own**. Unlike a simple chatbot that just answers questions, an agent:

1. **Perceives** its environment (reads news, checks data)
2. **Thinks** about what it sees (analyzes, filters, decides)
3. **Acts** on its analysis (summarizes, reports, alerts)
4. **Repeats** this cycle autonomously

In Ed Donner's framework, an agent has:
- **Goals** (find crime-related news across multiple categories)
- **Tools** (web scraping, LLM reasoning)
- **Memory** (what it has already processed)
- **Actions** (generate reports)

Our agent is simple but useful: it aims to replace hours of manual news monitoring with an automated pipeline.
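To make the goals/tools/memory/actions picture concrete, here is a minimal sketch of one perceive-think-act cycle. The `SimpleAgent` class, the tool names (`fetch`, `is_relevant`, `report`), and the sample headlines are all hypothetical and exist only for illustration; they are not part of the notebook's pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class SimpleAgent:
    """Toy agent: a goal, tools it can call, and a memory of items already seen."""
    goal: str
    tools: Dict[str, Callable]
    memory: Set[str] = field(default_factory=set)

    def run_once(self) -> List[str]:
        # Perceive: use the 'fetch' tool to read the environment
        items = self.tools["fetch"]()
        # Think: keep only unseen items that the 'is_relevant' tool accepts
        fresh = [
            item for item in items
            if item not in self.memory and self.tools["is_relevant"](item)
        ]
        # Remember everything we saw so the next cycle skips it
        self.memory.update(items)
        # Act: turn each relevant item into a report line
        return [self.tools["report"](item) for item in fresh]


agent = SimpleAgent(
    goal="find crime-related news",
    tools={
        "fetch": lambda: ["Bank fraud ring busted", "Monsoon arrives early"],
        "is_relevant": lambda headline: "fraud" in headline.lower(),
        "report": lambda headline: f"ALERT: {headline}",
    },
)

first = agent.run_once()
second = agent.run_once()  # Same headlines again: memory filters them out
print(first)
print(second)
```

Running the cycle twice shows the role of memory: the second pass returns nothing because every headline has already been processed.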

In [1]:
# Import necessary libraries
import requests
from bs4 import BeautifulSoup
from datetime import datetime
import json
import re

# For LLM integration - using Ollama with local Llama 3.2 model
try:
    import ollama
    # Test if Ollama is running and model is available
    try:
        ollama.chat(model='llama3.2:latest', messages=[{'role': 'user', 'content': 'test'}])
        LLM_AVAILABLE = True
        print("✅ Ollama connected successfully with llama3.2:latest model")
    except Exception as e:
        print(f"⚠️ Ollama error: {e}")
        print("   Make sure Ollama is running and llama3.2:latest model is downloaded")
        print("   Run: ollama pull llama3.2:latest")
        LLM_AVAILABLE = False
except ImportError:
    print("⚠️ Ollama library not installed. Install with: pip install ollama")
    print("   Will use mock summaries for demo.")
    LLM_AVAILABLE = False

print("\n✅ Libraries imported successfully")
print(f"📅 Today's date: {datetime.now().strftime('%Y-%m-%d')}")

⚠️ Ollama library not installed. Install with: pip install ollama
   Will use mock summaries for demo.

✅ Libraries imported successfully
📅 Today's date: 2026-02-03


## The Agent Loop Explained

Our agent follows a simple four-step cycle:

### 1️⃣ **FETCH** (Perception)
- Retrieve news headlines from the RSS feeds of major Indian news sources
- Parse the content into structured data

### 2️⃣ **THINK** (Reasoning)
- Filter articles using crime-related keywords across multiple categories
- Identify which news items are relevant to law enforcement

### 3️⃣ **ACT** (Decision Making)
- Use an LLM to summarize relevant articles
- Extract key insights and implications

### 4️⃣ **REPORT** (Output)
- Generate a clean, actionable intelligence brief
- Present findings in a structured format

This is the **core pattern** of agentic AI: simple, yet designed to scale.

In [2]:
# STEP 1: FETCH - Newspaper Article Fetcher

def fetch_from_rss(rss_url, source_name):
    """
    Fetch articles from an RSS feed.
    
    Args:
        rss_url: URL of the RSS feed
        source_name: Name of the news source
    
    Returns:
        List of article dictionaries
    """
    try:
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        }
        response = requests.get(rss_url, timeout=15, headers=headers)
        response.raise_for_status()
        
        soup = BeautifulSoup(response.content, 'xml')
        items = soup.find_all('item')
        
        articles = []
        for item in items[:20]:  # Limit to 20 articles per source
            title = item.title.text.strip() if item.title else ''
            description = item.description.text.strip() if item.description else ''
            link = item.link.text.strip() if item.link else ''
            
            # Clean HTML tags from description if present
            if description:
                description = BeautifulSoup(description, 'html.parser').get_text()
            
            if title:  # Only add if we have at least a title
                articles.append({
                    'title': title,
                    'summary': description if description else title,
                    'source': source_name,
                    'url': link
                })
        
        print(f"   ✓ {source_name}: {len(articles)} articles")
        return articles
    
    except Exception as e:
        print(f"   ✗ {source_name}: Error - {str(e)[:50]}")
        return []


def fetch_news_articles():
    """
    Fetch news headlines from multiple Indian news sources via RSS feeds.
    
    Returns: List of article dictionaries with title, summary, source, and url
    """
    
    # Major Indian news sources with RSS feeds
    rss_feeds = [
        # Times of India
        ('https://timesofindia.indiatimes.com/rssfeedstopstories.cms', 'Times of India'),
        
        # The Hindu
        ('https://www.thehindu.com/news/national/feeder/default.rss', 'The Hindu'),
        
        # NDTV
        ('https://feeds.feedburner.com/ndtvnews-top-stories', 'NDTV'),
        
        # India Today
        ('https://www.indiatoday.in/rss/1206514', 'India Today'),
        
        # Hindustan Times
        ('https://www.hindustantimes.com/feeds/rss/india-news/rssfeed.xml', 'Hindustan Times'),
        
        # Indian Express
        ('https://indianexpress.com/feed/', 'Indian Express'),
    ]
    
    all_articles = []
    
    print("üì∞ Fetching articles from news sources...")
    
    for rss_url, source_name in rss_feeds:
        articles = fetch_from_rss(rss_url, source_name)
        all_articles.extend(articles)
    
    print(f"\nüìä Total articles fetched: {len(all_articles)}")
    return all_articles


# Execute the fetch
all_articles = fetch_news_articles()
print(f"\n✅ Stage 1 (FETCH) complete: {len(all_articles)} articles retrieved")

📰 Fetching articles from news sources...
   ✓ Times of India: 20 articles
   ✓ The Hindu: 20 articles
   ✓ NDTV: 20 articles
   ✓ India Today: 20 articles
   ✓ Hindustan Times: 20 articles
   ✓ Indian Express: 20 articles

📊 Total articles fetched: 120

✅ Stage 1 (FETCH) complete: 120 articles retrieved


In [3]:
# STEP 2: THINK - Filter for Crime Content

def filter_crime_articles(articles):
    """
    Filter articles for crime-related keywords across multiple categories.
    
    This is a simple keyword-based approach covering:
    - Cybercrime
    - Theft & Robbery
    - Violent Crimes
    - Fraud & Financial Crimes
    - Other serious offenses
    
    In Stage-2, we could use LLM-based classification for better accuracy.
    
    Args:
        articles: List of article dictionaries
    
    Returns:
        List of filtered articles related to crime
    """
    
    # Comprehensive crime-related keywords (government intelligence focus)
    keywords = [
        # Cybercrime
        'fraud', 'scam', 'cybercrime', 'cyber crime', 'digital arrest',
        'deepfake', 'phishing', 'hacking', 'ransomware', 'data breach',
        'online fraud', 'upi fraud', 'cryptocurrency scam', 'ponzi scheme',
        'identity theft', 'cyber attack', 'malware', 'fake website',
        'cyber security', 'cybersecurity', 'hacker', 'cyber threat',
        'online scam', 'digital fraud', 'bank fraud', 'credit card fraud',
        
        # Theft & Robbery
        'theft', 'robbery', 'burglary', 'stolen', 'loot', 'looted',
        'dacoity', 'pickpocket', 'shoplifting', 'vehicle theft',
        'chain snatching', 'house breaking', 'armed robbery',
        
        # Violent Crimes
        'murder', 'homicide', 'killed', 'assault', 'attacked',
        'stabbed', 'shot', 'shooting', 'lynching', 'mob attack',
        'rape', 'sexual assault', 'molestation', 'kidnapping',
        'abduction', 'domestic violence', 'acid attack',
        
        # Organized Crime & Drugs
        'gang', 'mafia', 'smuggling', 'trafficking', 'drug',
        'narcotics', 'contraband', 'illegal', 'organized crime',
        'extortion', 'racketeering', 'cartel',
        
        # Financial Crimes
        'embezzlement', 'money laundering', 'bribery', 'corruption',
        'forgery', 'counterfeit', 'tax evasion', 'financial fraud',
        'cheating', 'swindled', 'duped',
        
        # Other Serious Crimes
        'arson', 'vandalism', 'rioting', 'terrorism', 'terrorist',
        'bomb', 'explosion', 'weapon', 'arms', 'ammunition',
        'arrest', 'arrested', 'detained', 'police', 'investigation',
        'fir', 'complaint', 'accused', 'suspect', 'criminal'
    ]
    
    filtered = []
    
    for article in articles:
        # Combine title and summary for keyword matching
        text = (article['title'] + ' ' + article['summary']).lower()
        
        # Check if any keyword appears in the text
        for keyword in keywords:
            if keyword in text:
                article['matched_keyword'] = keyword  # Track what triggered the match
                filtered.append(article)
                break  # One match is enough
    
    print(f"üéØ Filtered to {len(filtered)} crime-related articles")
    print(f"   (Removed {len(articles) - len(filtered)} irrelevant articles)")
    
    return filtered


# Execute the filtering
crime_articles = filter_crime_articles(all_articles)

# Show what we found
print("\nüìã Crime articles identified:")
for idx, article in enumerate(crime_articles, 1):
    print(f"   {idx}. {article['title'][:60]}...")
    print(f"      Matched keyword: '{article['matched_keyword']}'")

print(f"\n✅ Stage 2 (THINK) complete: {len(crime_articles)} relevant articles")

🎯 Filtered to 25 crime-related articles
   (Removed 95 irrelevant articles)

📋 Crime articles identified:
   1. Russia fires over 70 missiles, 450 drones at Ukraine; iconic...
      Matched keyword: 'fir'
   2. 'Effort to win trophies will continue': Rohit after being co...
      Matched keyword: 'fir'
   3. Apple’s first foldable iPhone design leaked online: Here’s w...
      Matched keyword: 'fir'
   4. Uttarakhand man who shielded Muslim trader becomes social me...
      Matched keyword: 'assault'
   5. Police to introduce locked house beat system in Belagavi...
      Matched keyword: 'police'
   6. 15-year-old boy dies after fall from apartment in Bengaluru,...
      Matched keyword: 'police'
   7. Sabarimala gold row: LDF in Kerala questions Sonia Gandhi li...
      Matched keyword: 'suspect'
   8. Chennai triple murder of migrant family: Tamil Nadu governme...
      Matched keyword: 'murder'
   9. Trump's Board Of Peace Could See US Firm Gain 300% Profits I...
      Mat
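
The output above shows the cost of plain substring matching: the keyword `'fir'` fires on "fires", "first", and "Firm", none of which concern a First Information Report. One possible refinement, short of the LLM-based classification planned for Stage-2, is to match keywords only at word boundaries. The sketch below is an assumption about how that could look, not the notebook's current behavior; the helper names `build_keyword_pattern` and `first_keyword_match` are hypothetical, and the third headline is an invented example.

```python
import re


def build_keyword_pattern(keywords):
    """Compile one regex that matches any keyword as a whole word or phrase."""
    # Longest first so multi-word phrases win over shorter keywords at the same position
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(keyword) for keyword in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def first_keyword_match(text, pattern):
    """Return the first matched keyword in lowercase, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(0).lower() if match else None


pattern = build_keyword_pattern(["fir", "murder", "cyber crime", "arms"])

headlines = [
    "Russia fires over 70 missiles, 450 drones at Ukraine",  # 'fir' inside 'fires'
    "Chennai triple murder of migrant family",
    "Police register FIR against contractor",  # invented example headline
]
results = [first_keyword_match(headline, pattern) for headline in headlines]
print(results)
```

The trade-off is that strict word boundaries also stop a keyword from matching its own inflections (for example, `murder` would no longer match "murders"), so the keyword list would need plural and verb forms added explicitly.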

In [4]:
# STEP 3: ACT - LLM Summarization and Analysis

def summarize_with_llm(article):
    """
    Use an LLM to create intelligence-focused summaries.
    
    The LLM extracts:
    - Key facts (who, what, where, when)
    - Law enforcement implications
    - Actionable insights
    
    Args:
        article: Dictionary with article data
    
    Returns:
        Dictionary with intelligence summary
    """
    
    if not LLM_AVAILABLE:
        # Fallback: Return a basic summary without LLM
        return {
            'title': article['title'],
            'summary': article['summary'][:200] + '...' if len(article['summary']) > 200 else article['summary'],
            'implications': 'LLM not available - using basic summary',
            'source': article['source'],
            'url': article.get('url', 'N/A')
        }
    
    # Create a focused prompt for intelligence analysis
    prompt = f"""
You are an intelligence analyst for a government law enforcement unit.

Article Title: {article['title']}
Article Content: {article['summary']}
Source: {article['source']}

Provide a brief intelligence summary (2-3 sentences) covering:
1. Key facts (who, what, where, when, amounts/items involved)
2. Type of crime (cybercrime, theft, violent crime, fraud, etc.)
3. Why this matters for law enforcement
4. Any patterns or trends

Keep it concise and actionable.
"""
    
    try:
        # Use Ollama with local Llama 3.2 model
        response = ollama.chat(
            model='llama3.2:latest',
            messages=[
                {
                    'role': 'system',
                    'content': 'You are a crime intelligence analyst. Provide concise, factual analysis.'
                },
                {
                    'role': 'user',
                    'content': prompt
                }
            ],
            options={
                'temperature': 0.3,  # Lower temperature for factual analysis
                'num_predict': 200   # Limit response length
            }
        )
        
        intelligence_summary = response['message']['content'].strip()
        
        return {
            'title': article['title'],
            'summary': intelligence_summary,
            'implications': 'Analyzed by Llama 3.2 (local)',
            'source': article['source'],
            'url': article.get('url', 'N/A')
        }
    
    except Exception as e:
        print(f"⚠️ LLM error for article '{article['title'][:30]}...': {str(e)[:50]}")
        return {
            'title': article['title'],
            'summary': article['summary'][:200] + '...' if len(article['summary']) > 200 else article['summary'],
            'implications': 'LLM analysis failed - using original summary',
            'source': article['source'],
            'url': article.get('url', 'N/A')
        }


# Process all filtered articles
intelligence_summaries = []

print("ü§ñ Analyzing articles with Llama 3.2 (local model)...\n")

for idx, article in enumerate(crime_articles, 1):
    print(f"   Processing {idx}/{len(crime_articles)}: {article['title'][:50]}...")
    summary = summarize_with_llm(article)
    intelligence_summaries.append(summary)

print(f"\n✅ Stage 3 (ACT) complete: {len(intelligence_summaries)} summaries generated")

🤖 Analyzing articles with Llama 3.2 (local model)...

   Processing 1/25: Russia fires over 70 missiles, 450 drones at Ukrai...
   Processing 2/25: 'Effort to win trophies will continue': Rohit afte...
   Processing 3/25: Apple’s first foldable iPhone design leaked online...
   Processing 4/25: Uttarakhand man who shielded Muslim trader becomes...
   Processing 5/25: Police to introduce locked house beat system in Be...
   Processing 6/25: 15-year-old boy dies after fall from apartment in ...
   Processing 7/25: Sabarimala gold row: LDF in Kerala questions Sonia...
   Processing 8/25: Chennai triple murder of migrant family: Tamil Nad...
   Processing 9/25: Trump's Board Of Peace Could See US Firm Gain 300%...
   Processing 10/25: Class 10 Bengaluru Boy Jumps To Death From 7th Flo...
   Processing 11/25: Air India Re-inspects Fuel Switches On Boeing 787s...
   Processing 12/25: Groom Kills Friend Hours Before Wedding Over Rs 1....
   Processing 13/25: 3 Indians Arrested In Canada 

In [5]:
# STEP 4: REPORT - Generate Daily Intelligence Brief

def generate_intelligence_report(summaries):
    """
    Generate a clean, actionable daily intelligence brief.
    
    This is what gets delivered to law enforcement analysts.
    """
    
    report_date = datetime.now().strftime('%Y-%m-%d %H:%M')
    
    # Build the report
    report = f"""
{'='*80}
    DAILY CRIME INTELLIGENCE BRIEF
    Generated: {report_date}
    Source: Automated Newspaper Agent (Real-time RSS Feeds)
    AI Model: Llama 3.2 (Local)
{'='*80}

📊 SUMMARY
   • Total articles scanned: {len(all_articles)}
   • Crime incidents identified: {len(summaries)}
   • Analysis method: {'Llama 3.2 Local LLM' if LLM_AVAILABLE else 'Keyword-based'}
   • News sources: Times of India, The Hindu, NDTV, India Today, Hindustan Times, Indian Express
   • Crime categories: Cybercrime, Theft, Robbery, Violent Crimes, Fraud, Organized Crime

{'='*80}

🚨 KEY INCIDENTS

"""
    
    # Add each incident
    for idx, summary in enumerate(summaries, 1):
        report += f"""
{'─'*80}
INCIDENT #{idx}
{'─'*80}

📰 Headline:
   {summary['title']}

🔍 Intelligence Summary:
   {summary['summary']}

📌 Source: {summary['source']}
🔗 URL: {summary.get('url', 'N/A')}

"""
    
    # Add footer
    report += f"""
{'='*80}

📝 NOTES
   • This is an automated prototype using real-time RSS feeds and local Llama 3.2 model
   • Human verification recommended for high-priority incidents
   • For urgent matters, contact the Crime Coordination Center
   • Report generated from live news sources at {report_date}
   • Covers multiple crime categories: Cybercrime, Theft, Robbery, Violent Crimes, Fraud, etc.

{'='*80}
End of Report
{'='*80}
"""
    
    return report


# Generate and display the final report
final_report = generate_intelligence_report(intelligence_summaries)
print(final_report)

# Optionally save to file
report_filename = f"crime_intelligence_brief_{datetime.now().strftime('%Y%m%d_%H%M%S')}.txt"
with open(report_filename, 'w', encoding='utf-8') as f:
    f.write(final_report)

print(f"\nüíæ Report saved to: {report_filename}")
print("\n‚úÖ Stage 4 (REPORT) complete: Intelligence brief generated")
print("\nüéâ AGENT CYCLE COMPLETE!")


    DAILY CRIME INTELLIGENCE BRIEF
    Generated: 2026-02-03 15:12
    Source: Automated Newspaper Agent (Real-time RSS Feeds)
    AI Model: Llama 3.2 (Local)

📊 SUMMARY
   • Total articles scanned: 120
   • Crime incidents identified: 25
   • Analysis method: Keyword-based
   • News sources: Times of India, The Hindu, NDTV, India Today, Hindustan Times, Indian Express
   • Crime categories: Cybercrime, Theft, Robbery, Violent Crimes, Fraud, Organized Crime


🚨 KEY INCIDENTS


────────────────────────────────────────────────────────────────────────────────
INCIDENT #1
────────────────────────────────────────────────────────────────────────────────

📰 H

## 🚀 Future Enhancements (Stage-2 and Beyond)

This Stage-1 prototype demonstrates the core agent pattern with real-time data across multiple crime categories. Here's how we can evolve it:

### **Immediate Next Steps (Stage-2)**
- **Automation**: Schedule the agent to run every 6 hours using cron jobs or cloud functions
- **More RSS Feeds**: Add regional news sources and specialized crime publications
- **Better Classification**: Replace keyword matching with LLM-based relevance scoring and crime categorization
- **Crime Type Classification**: Automatically categorize incidents (cybercrime, theft, violent crime, etc.)
- **Severity Scoring**: Assign priority levels based on crime severity and public impact
- **Sentiment Analysis**: Track public sentiment around crime incidents
- **Deduplication**: Identify and merge duplicate stories from different sources
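
As one illustration of the deduplication item, the sketch below drops articles whose normalized titles are near-identical, using the standard library's `difflib.SequenceMatcher`. The `deduplicate` helper, the 0.8 similarity threshold, and the sample articles are assumptions chosen for the example; a real system would likely need a tuned threshold or a semantic comparison.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(title.lower().split())


def deduplicate(articles, threshold=0.8):
    """Keep the first article of each group of near-identical titles."""
    unique = []
    for article in articles:
        title = normalize(article["title"])
        is_duplicate = any(
            SequenceMatcher(None, title, normalize(kept["title"])).ratio() >= threshold
            for kept in unique
        )
        if not is_duplicate:
            unique.append(article)
    return unique


# Invented sample data in the same shape as the notebook's article dictionaries
sample = [
    {"title": "Chennai triple murder: three arrested", "source": "Source A"},
    {"title": "Chennai triple murder:  Three arrested", "source": "Source B"},
    {"title": "Air India re-inspects fuel switches", "source": "Source C"},
]
unique = deduplicate(sample)
print([article["source"] for article in unique])
```

This compares every new title against every kept title, which is fine for a few hundred articles per run but would need indexing or hashing at larger volumes.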

### **Medium-Term (Stage-3)**
- **Admin Dashboard**: Build a web interface for viewing reports and managing alerts
- **Alert System**: Send notifications for high-priority incidents (email, SMS, Slack)
- **Trend Detection**: Track recurring patterns and emerging crime trends over time
- **Geographic Mapping**: Visualize crime hotspots on interactive maps
- **Multi-Source Integration**: Add social media monitoring (Twitter, Reddit)
- **Database Storage**: Store articles and summaries for historical analysis
- **Crime Statistics**: Generate weekly/monthly crime trend reports
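
For the database storage item, a minimal sketch using the standard library's `sqlite3` module might look like the following. The table layout, the helper names (`open_store`, `save_articles`, `count_articles`), and the example URLs are hypothetical; the only design choice carried over from the notebook is the article dictionary shape (`title`, `summary`, `source`, `url`).

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the article store; ':memory:' keeps it in RAM for demos."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS articles (
               url TEXT PRIMARY KEY,
               title TEXT NOT NULL,
               source TEXT NOT NULL,
               summary TEXT,
               fetched_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def save_articles(conn, articles):
    """Insert articles, silently skipping any URL that is already stored."""
    with conn:  # Commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO articles (url, title, source, summary) "
            "VALUES (:url, :title, :source, :summary)",
            articles,
        )


def count_articles(conn):
    return conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]


# Invented sample rows; example.com URLs are placeholders
batch = [
    {"url": "https://example.com/a", "title": "Bank fraud ring busted",
     "source": "Source A", "summary": "Summary A"},
    {"url": "https://example.com/b", "title": "Chain snatching in market",
     "source": "Source B", "summary": "Summary B"},
]
conn = open_store()
save_articles(conn, batch)
save_articles(conn, batch)  # Second run adds nothing: URLs are already present
total = count_articles(conn)
print(total)
```

Using the URL as the primary key gives the agent a persistent form of memory: re-running the fetch step does not store the same article twice.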

### **Advanced (Stage-4 - Multi-Agent System)**
- **Specialist Agents**: Deploy separate agents for different crime types
  - Agent 1: Cybercrime (fraud, hacking, digital crimes)
  - Agent 2: Property Crimes (theft, robbery, burglary)
  - Agent 3: Violent Crimes (assault, murder, kidnapping)
  - Agent 4: Financial Crimes (fraud, embezzlement, money laundering)
  - Agent 5: Organized Crime (gangs, trafficking, smuggling)
- **Coordinator Agent**: Synthesizes findings from all specialist agents
- **Vector Database**: Store and search historical intelligence reports
- **Predictive Analytics**: Use ML to forecast crime trends and hotspots
- **Cross-Reference Analysis**: Link related incidents across time and location
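
The specialist-plus-coordinator idea can be sketched without any LLM at all. In the example below, each "specialist" is reduced to a keyword set and the coordinator simply groups headlines by specialist. The `SPECIALISTS` table, the `route` and `coordinate` functions, and the sample headlines are hypothetical stand-ins; real specialist agents would carry their own prompts, tools, and memory.

```python
import re
from collections import defaultdict

# Hypothetical specialists, each represented only by the keywords it handles
SPECIALISTS = {
    "cybercrime": {"phishing", "ransomware", "hacking"},
    "property": {"theft", "robbery", "burglary"},
    "violent": {"assault", "murder", "kidnapping"},
}


def route(article):
    """Return the names of all specialists whose keywords appear in the title."""
    words = set(re.findall(r"[a-z]+", article["title"].lower()))
    return [name for name, keywords in SPECIALISTS.items() if words & keywords]


def coordinate(articles):
    """Group article titles by specialist; unmatched titles go to 'unassigned'."""
    brief = defaultdict(list)
    for article in articles:
        for name in route(article) or ["unassigned"]:
            brief[name].append(article["title"])
    return dict(brief)


# Invented sample headlines
sample = [
    {"title": "Ransomware gang hits hospital after burglary"},
    {"title": "Monsoon arrives early"},
]
brief = coordinate(sample)
print(brief)
```

An article can be routed to more than one specialist, which is where a coordinator earns its place: it is the one component that sees overlapping findings and can merge them.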

### **Production Deployment**
- **API Integration**: Connect with government databases and case management systems
- **Compliance**: Ensure data privacy and security standards (encryption, access logs)
- **Human-in-the-Loop**: Allow analysts to provide feedback and refine agent behavior
- **Scalability**: Deploy on cloud infrastructure (AWS, Azure, GCP) for 24/7 operation
- **Multi-Language Support**: Process news in Hindi and regional languages
- **Real-time Alerts**: Instant notifications for critical incidents

---

### **Key Principle (Ed Donner Style)**

> *"Start simple, prove value, then scale."*

This notebook uses real data and shows the pipeline running end to end on live news sources across multiple crime categories. From here, we can add complexity where it matters most.

---

**Questions? Feedback?**  
This is a living prototype. Test it, break it, improve it. That's how great agents are built. 🛠️