# Dynamic Micro-Blog Writer
## Multi-Agent System for Autonomous Blog Generation

**Optimized for Google Colab with REAL Web Search**

This notebook implements a multi-agent system that autonomously generates blog posts through four cooperating agents:
1. **Persona Architect**: Creates writer persona and search queries
2. **Research Analyst**: Performs REAL web searches (DuckDuckGo or Google Custom Search)
3. **Content Synthesizer**: Writes blog posts based on actual web research
4. **Critic/Editor**: Reviews and provides feedback for iterative improvement
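
The four roles above form a draft-and-review loop: the writer produces a draft from the research, the editor either approves it or returns comments, and the comments feed the next draft. A minimal sketch of that control flow follows. The names `run_pipeline`, `Review`, `toy_writer`, `toy_editor`, and the `max_revisions` cap are hypothetical stand-ins for illustration; the notebook's own agent classes may be organized differently.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Review:
    """Hypothetical review record: approval flag plus editor feedback."""
    is_approved: bool
    comments: str


def run_pipeline(
    research: str,
    write_draft: Callable[[str, str], str],
    review_draft: Callable[[str], Review],
    max_revisions: int = 3,
) -> Tuple[str, List[Review]]:
    """Draft, review, and redraft until approved or the revision cap is hit."""
    feedback = ""
    draft = ""
    reviews: List[Review] = []
    for _ in range(max_revisions):
        draft = write_draft(research, feedback)
        review = review_draft(draft)
        reviews.append(review)
        if review.is_approved:
            break
        feedback = review.comments  # feed editor comments into the next draft
    return draft, reviews


# Toy agents (illustrative only): the editor approves drafts that cite a source.
def toy_writer(research: str, feedback: str) -> str:
    suffix = " [source cited]" if feedback else ""
    return f"Post based on: {research}{suffix}"


def toy_editor(draft: str) -> Review:
    ok = "[source cited]" in draft
    return Review(is_approved=ok, comments="" if ok else "Cite a source.")


final_draft, history = run_pipeline("search snippets", toy_writer, toy_editor)
print(len(history), history[-1].is_approved)
```

Capping the number of revisions keeps the loop from running indefinitely when the editor never approves.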

**Author**: Student Developer  
**Date**: August 12, 2025  
**Version**: 2.0 (Real Web Search Edition)

## Setup Options:

### Option 1: FREE Version (Recommended)
- **Required**: Only `GEMINI_API_KEY` 
- **Search**: DuckDuckGo (no additional API keys)
- **Cost**: Completely FREE
- **Setup**: Just add Gemini API key to Colab secrets

### Option 2: Premium Version  
- **Required**: `GEMINI_API_KEY` + `GOOGLE_CSE_API_KEY` + `GOOGLE_CSE_ID`
- **Search**: Google Custom Search API
- **Cost**: 100 searches/day FREE, then $5/1000 searches
- **Setup**: Additional Google Custom Search setup required

## Quick Setup for FREE Version:
1. Click the key icon in the left sidebar
2. Add secret: `GEMINI_API_KEY` (get from https://ai.google.dev/)
3. Enable notebook access
4. Run all cells!

## What You Get:
- **Real Web Search**: Actual current information from the web
- **Live Data**: Blog content based on real-time search results  
- **Source Tracking**: Shows actual URLs and snippets
- **Rich Previews**: Beautiful markdown display of all content
- **Current Information**: No longer limited to training data

## Setup and Dependencies

In [None]:
import os
import json
import datetime
import re
import requests
import time
from typing import Dict, List, Optional, Tuple
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import quote_plus

try:
    import google.generativeai as genai
except ImportError:
    print("Installing google-generativeai...")
    !pip install google-generativeai
    import google.generativeai as genai

try:
    from bs4 import BeautifulSoup
except ImportError:
    print("Installing beautifulsoup4 for web scraping...")
    !pip install beautifulsoup4
    from bs4 import BeautifulSoup

# Import IPython display for rich markdown rendering
from IPython.display import display, Markdown

print("All dependencies loaded successfully!")
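
`!pip install` is IPython shell syntax, so the cell above only parses inside a notebook. If the same import-or-install pattern is needed in a plain `.py` script, one possible sketch uses `importlib` and `subprocess`. The helper name `ensure_package` is hypothetical and not part of this notebook.

```python
import importlib
import subprocess
import sys
from types import ModuleType


def ensure_package(module_name: str, pip_name: str) -> ModuleType:
    """Import module_name, installing pip_name with pip first if the import fails."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        print(f"Installing {pip_name}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", pip_name])
        importlib.invalidate_caches()  # make the fresh install visible to the importer
        return importlib.import_module(module_name)


# Modules that are already importable are returned without invoking pip.
json_module = ensure_package("json", "json")
print(json_module.__name__)
```

For the packages used here, the calls would look like `ensure_package("bs4", "beautifulsoup4")`, since the import name and the pip name differ.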

## Configuration

In [None]:
# Configure APIs for Google Colab
# Using Google Colab secrets for secure API key management

# Choose your search method
USE_FREE_SEARCH = True  # Set to False to use Google Custom Search API

try:
    # Import Google Colab userdata for secrets
    from google.colab import userdata
    
    # Get Gemini API key (required)
    gemini_api_key = userdata.get('GEMINI_API_KEY')
    if not gemini_api_key:
        raise ValueError("GEMINI_API_KEY not found in Colab secrets.")
    
    # Configure Gemini
    genai.configure(api_key=gemini_api_key)
    model = genai.GenerativeModel('gemini-2.5-pro')
    
    if USE_FREE_SEARCH:
        # Free web search - no additional API keys needed
        GOOGLE_CSE_API_KEY = None
        GOOGLE_CSE_ID = None
        print("Gemini 2.5 Pro model initialized successfully!")
        print("Using FREE web search (no additional API keys required)")
        print("Only Gemini API key loaded from Colab secrets")
    else:
        # Google Custom Search API (requires additional keys)
        google_cse_api_key = userdata.get('GOOGLE_CSE_API_KEY')
        google_cse_id = userdata.get('GOOGLE_CSE_ID')
        
        if not google_cse_api_key or not google_cse_id:
            print("Google Custom Search keys not found, falling back to free search")
            USE_FREE_SEARCH = True
            GOOGLE_CSE_API_KEY = None
            GOOGLE_CSE_ID = None
        else:
            GOOGLE_CSE_API_KEY = google_cse_api_key
            GOOGLE_CSE_ID = google_cse_id
            print("Gemini 2.5 Pro model initialized successfully!")
            print("Google Custom Search API configured successfully!")
            print("API keys loaded from Google Colab secrets")
    
except ImportError:
    # Fallback for non-Colab environments
    print("Not running in Google Colab, using fallback method...")
    
    gemini_api_key = os.environ.get('GEMINI_API_KEY')
    if not gemini_api_key:
        gemini_api_key = input("Enter your Gemini API key: ")
    
    genai.configure(api_key=gemini_api_key)
    model = genai.GenerativeModel('gemini-2.5-pro')
    
    if USE_FREE_SEARCH:
        GOOGLE_CSE_API_KEY = None
        GOOGLE_CSE_ID = None
        print("Using FREE web search method")
    else:
        google_cse_api_key = os.environ.get('GOOGLE_CSE_API_KEY')
        google_cse_id = os.environ.get('GOOGLE_CSE_ID')
        
        if not google_cse_api_key:
            google_cse_api_key = input("Enter your Google Custom Search API key (or press Enter for free search): ")
        if not google_cse_id:
            google_cse_id = input("Enter your Google Custom Search Engine ID (or press Enter for free search): ")
        
        if not google_cse_api_key or not google_cse_id:
            USE_FREE_SEARCH = True
            GOOGLE_CSE_API_KEY = None
            GOOGLE_CSE_ID = None
            print("Using FREE web search method")
        else:
            GOOGLE_CSE_API_KEY = google_cse_api_key
            GOOGLE_CSE_ID = google_cse_id
            print("Using Google Custom Search API")
    
    print("Gemini 2.5 Pro model initialized successfully!")
    
except Exception as e:
    print(f"Error configuring APIs: {e}")
    print("Required setup:")
    print("   1. GEMINI_API_KEY - Your Gemini API key (REQUIRED)")
    if not USE_FREE_SEARCH:
        print("   2. GOOGLE_CSE_API_KEY - Your Google Custom Search API key (OPTIONAL)")
        print("   3. GOOGLE_CSE_ID - Your Google Custom Search Engine ID (OPTIONAL)")
    print("\nSetup instructions:")
    print("   • Gemini API: https://ai.google.dev/")
    if not USE_FREE_SEARCH:
        print("   • Google Custom Search: https://developers.google.com/custom-search/v1/overview")
    raise
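
The configuration cell looks up each key in Colab secrets and, outside Colab, in environment variables. That lookup order can be factored into a single function. The sketch below assumes a hypothetical helper name, `get_secret`, and treats any failure of the Colab lookup (missing secret, notebook access not granted) as "not found" rather than distinguishing the error types.

```python
import os
from typing import Optional


def get_secret(name: str) -> Optional[str]:
    """Return a secret from Colab if available, otherwise from the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        return os.environ.get(name) or None
    try:
        return userdata.get(name) or None
    except Exception:
        # Secret missing or access not enabled: fall back to the environment.
        return os.environ.get(name) or None


# Demo with a throwaway variable name (not a real key).
os.environ["DEMO_KEY"] = "abc123"
print(get_secret("DEMO_KEY"))
print(get_secret("SURELY_MISSING_DEMO_KEY"))
```

Returning `None` for a missing key lets the caller decide whether to prompt, fall back to free search, or raise.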

## Fundamental Principles of Quality Writing

In [None]:
QUALITY_PRINCIPLES = """
FUNDAMENTAL PRINCIPLES OF QUALITY WRITING:

P1: Evidentiary Support
All claims must be directly traceable to the provided research material.

P2: Clarity and Conciseness
Writing must be precise, unambiguous, and free of unnecessary jargon.

P3: Engaging Narrative
The post must feature a strong hook, a logical flow, and a memorable conclusion.

P4: Structural Integrity
The output must be well-organized with a clear title, introduction, body, and conclusion.

P5: Intellectual Honesty
Information must be represented accurately, even when adopting a specific persona.
"""

print("Quality principles defined:")
print(QUALITY_PRINCIPLES)

## Data Classes and Utilities
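
Several of the records defined in this section hold lists. A `None` default on a list field forces every consumer to check for `None` before iterating, while `dataclasses.field(default_factory=list)` gives each instance its own empty list. A minimal sketch of that alternative, using simplified stand-in classes (`Hit` and `Research` are illustrative names, not the notebook's own):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hit:
    """Stand-in for a single search result."""
    title: str
    url: str
    snippet: str


@dataclass
class Research:
    """Stand-in for a research record that always carries a list of hits."""
    content: str
    source_count: int = 0
    hits: List[Hit] = field(default_factory=list)  # fresh list per instance


first = Research(content="notes")
second = Research(content="more notes")
first.hits.append(Hit("Example", "https://example.com", "An example snippet."))

# Each instance owns its list, so appending to one does not affect the other.
print(len(first.hits), len(second.hits))
```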

In [None]:
@dataclass
class SearchResult:
    title: str
    url: str
    snippet: str

@dataclass
class PersonaResult:
    persona_prompt: str
    search_queries: List[str]

@dataclass
class EditorPersonaResult:
    editor_persona: str
    review_criteria: str

@dataclass
class RequirementAnalysis:
    persona_requirements: str  # For writer persona (tone, style, voice)
    content_requirements: str  # For blog writing (length, structure, format)
    review_requirements: str   # For editor review (standards, criteria)

@dataclass
class ResearchResult:
    content: str
    source_count: int = 0
    search_results: Optional[List[SearchResult]] = None

@dataclass
class BlogDraft:
    content: str
    version: int

@dataclass
class EditorReview:
    is_approved: bool
    comments: str

@dataclass
class FactualClaim:
    claim_text: str
    claim_type: str  # "statistic", "date", "name", "fact", "quote"
    importance: str  # "high", "medium", "low"
    context: str  # surrounding text for context
    
@dataclass
class ClaimVerification:
    claim: FactualClaim
    search_queries: List[str]
    evidence: List[SearchResult]
    verification_status: str  # "verified", "contradicted", "insufficient_evidence", "unverifiable"
    confidence: float  # 0.0 to 1.0
    explanation: str

@dataclass
class FactCheckReport:
    total_claims: int
    verified_claims: int
    contradicted_claims: int
    insufficient_evidence_claims: int
    overall_reliability: str  # "high", "medium", "low", "unreliable", or "unknown" when no claims were extracted
    critical_issues: List[str]
    verifications: List[ClaimVerification]

class ClaimExtractor:
    """Extract verifiable factual claims from blog content using LLM."""
    
    def __init__(self, model):
        self.model = model
    
    def extract_claims(self, blog_content: str, topic: str) -> List[FactualClaim]:
        """Extract verifiable factual claims from the blog post."""
        prompt = f"""
You are a Claim Extraction specialist. Your task is to identify and extract verifiable factual claims from a blog post.

BLOG CONTENT:
{blog_content}

TOPIC: {topic}

TASK: Extract factual claims that can be verified through online search. Focus on:

1. STATISTICS AND NUMBERS: Specific percentages, amounts, quantities, growth rates
2. DATES AND TIMEFRAMES: When events occurred, publication dates, deadlines
3. NAMES AND ENTITIES: Companies, people, organizations, products, technologies
4. SPECIFIC FACTS: Technical specifications, study results, market data
5. QUOTES AND CITATIONS: Direct quotes, study findings, expert statements

IGNORE:
- Opinions and subjective statements
- General trends without specific data
- Common knowledge facts
- Vague or unspecific claims

For each claim, determine:
- claim_type: "statistic", "date", "name", "fact", "quote"
- importance: "high" (core to article), "medium" (supporting), "low" (minor detail)

Output as JSON array:
[
  {{
    "claim_text": "Exact text of the claim",
    "claim_type": "statistic|date|name|fact|quote",
    "importance": "high|medium|low",
    "context": "Surrounding sentence(s) for context"
  }}
]

Extract 5-15 most important verifiable claims. Prioritize claims that are:
1. Central to the article's argument
2. Specific and measurable
3. Recent or time-sensitive
4. About well-known entities/topics

Return ONLY the JSON array.
"""
        
        try:
            response = self.model.generate_content(prompt)
            response_text = response.text.strip()
            
            # Clean JSON response: strip Markdown code fences if the model added them
            response_text = re.sub(r'^```(?:json)?\s*', '', response_text)
            response_text = re.sub(r'\s*```$', '', response_text).strip()
            
            claims_data = json.loads(response_text)
            
            claims = []
            for claim_data in claims_data:
                claim = FactualClaim(
                    claim_text=claim_data.get('claim_text', ''),
                    claim_type=claim_data.get('claim_type', 'fact'),
                    importance=claim_data.get('importance', 'medium'),
                    context=claim_data.get('context', '')
                )
                claims.append(claim)
            
            return claims
            
        except Exception as e:
            print(f"Error extracting claims: {e}")
            return []

class EvidenceEvaluator:
    """Evaluate factual claims against search evidence."""
    
    def __init__(self, model):
        self.model = model
    
    def evaluate_claim(self, claim: FactualClaim, evidence: List[SearchResult]) -> ClaimVerification:
        """Evaluate a single claim against search evidence."""
        
        if not evidence:
            return ClaimVerification(
                claim=claim,
                search_queries=[],
                evidence=evidence,
                verification_status="insufficient_evidence",
                confidence=0.0,
                explanation="No search results found to verify this claim."
            )
        
        # Format evidence for analysis
        evidence_text = ""
        for i, result in enumerate(evidence, 1):
            evidence_text += f"SOURCE {i}:\n"
            evidence_text += f"Title: {result.title}\n"
            evidence_text += f"URL: {result.url}\n" 
            evidence_text += f"Content: {result.snippet}\n\n"
        
        prompt = f"""
You are a Fact Verification specialist. Analyze whether a factual claim is supported by search evidence.

CLAIM TO VERIFY:
"{claim.claim_text}"

CLAIM TYPE: {claim.claim_type}
CLAIM IMPORTANCE: {claim.importance}
CONTEXT: {claim.context}

SEARCH EVIDENCE:
{evidence_text}

TASK: Determine if the claim is supported by the evidence.

VERIFICATION CATEGORIES:
1. "verified" - Evidence clearly supports the claim
2. "contradicted" - Evidence contradicts the claim  
3. "insufficient_evidence" - Not enough evidence to verify
4. "unverifiable" - Claim is too vague or subjective to verify

ANALYSIS CRITERIA:
- Are the sources credible and recent?
- Do multiple sources support the claim?
- Are the numbers/facts exactly matching?
- Is the context similar?

Consider source credibility, recency, and consistency across sources.

Output as JSON:
{{
  "verification_status": "verified|contradicted|insufficient_evidence|unverifiable",
  "confidence": 0.8,
  "explanation": "Detailed explanation of why this verification status was assigned, citing specific evidence sources and any discrepancies found."
}}

Be thorough and conservative in verification. When in doubt, use "insufficient_evidence".

Return ONLY the JSON object.
"""
        
        try:
            response = self.model.generate_content(prompt)
            response_text = response.text.strip()
            
            # Clean JSON response: strip Markdown code fences if the model added them
            response_text = re.sub(r'^```(?:json)?\s*', '', response_text)
            response_text = re.sub(r'\s*```$', '', response_text).strip()
            
            result_data = json.loads(response_text)
            
            return ClaimVerification(
                claim=claim,
                search_queries=[],  # Will be filled by FactChecker
                evidence=evidence,
                verification_status=result_data.get('verification_status', 'insufficient_evidence'),
                confidence=float(result_data.get('confidence', 0.0)),
                explanation=result_data.get('explanation', '')
            )
            
        except Exception as e:
            print(f"Error evaluating claim: {e}")
            return ClaimVerification(
                claim=claim,
                search_queries=[],
                evidence=evidence,
                verification_status="insufficient_evidence",
                confidence=0.0,
                explanation=f"Error during evaluation: {e}"
            )

class FactChecker:
    """Orchestrate comprehensive fact-checking of blog content."""
    
    def __init__(self, model, web_searcher):
        self.model = model
        self.web_searcher = web_searcher
        self.claim_extractor = ClaimExtractor(model)
        self.evidence_evaluator = EvidenceEvaluator(model)
    
    def generate_search_queries(self, claim: FactualClaim) -> List[str]:
        """Generate targeted search queries for a factual claim."""
        prompt = f"""
Generate 2-3 specific search queries to verify this factual claim.

CLAIM: "{claim.claim_text}"
CLAIM TYPE: {claim.claim_type}
CONTEXT: {claim.context}

Create search queries that are:
1. Specific and targeted to the exact claim
2. Likely to find authoritative sources
3. Include key terms from the claim
4. Vary in approach (direct search, context search, source search)

For statistics: Include exact numbers and context
For names: Include full names and relevant context  
For dates: Include specific dates and events
For facts: Include key technical terms

Return as JSON array:
["query1", "query2", "query3"]

Return ONLY the JSON array.
"""
        
        try:
            response = self.model.generate_content(prompt)
            response_text = response.text.strip()
            
            # Clean JSON response: strip Markdown code fences if the model added them
            response_text = re.sub(r'^```(?:json)?\s*', '', response_text)
            response_text = re.sub(r'\s*```$', '', response_text).strip()
            
            queries = json.loads(response_text)
            return queries if isinstance(queries, list) else [str(queries)]
            
        except Exception as e:
            print(f"Error generating search queries: {e}")
            # Fallback: create basic query from claim text
            return [claim.claim_text]
    
    def fact_check_article(self, blog_content: str, topic: str) -> FactCheckReport:
        """Perform comprehensive fact-checking of the entire article."""
        print("   Starting comprehensive fact-checking...")
        
        # Step 1: Extract factual claims
        print("   Extracting factual claims...")
        claims = self.claim_extractor.extract_claims(blog_content, topic)
        print(f"   Found {len(claims)} factual claims to verify")
        
        if not claims:
            return FactCheckReport(
                total_claims=0,
                verified_claims=0,
                contradicted_claims=0,
                insufficient_evidence_claims=0,
                overall_reliability="unknown",
                critical_issues=["No verifiable claims found in the article"],
                verifications=[]
            )
        
        # Step 2: Prioritize claims (check high importance first)
        high_priority_claims = [c for c in claims if c.importance == "high"]
        medium_priority_claims = [c for c in claims if c.importance == "medium"]
        low_priority_claims = [c for c in claims if c.importance == "low"]
        
        prioritized_claims = high_priority_claims + medium_priority_claims + low_priority_claims
        
        # Step 3: Verify each claim
        verifications = []
        verified_count = 0
        contradicted_count = 0
        insufficient_evidence_count = 0
        
        for i, claim in enumerate(prioritized_claims, 1):
            print(f"   Fact-checking claim {i}/{len(prioritized_claims)}: {claim.claim_text[:50]}...")
            
            # Generate search queries
            search_queries = self.generate_search_queries(claim)
            
            # Search for evidence
            evidence = []
            for query in search_queries:
                query_results = self.web_searcher.search(query, num_results=5)
                evidence.extend(query_results)
            
            # Remove duplicates
            unique_evidence = []
            seen_urls = set()
            for result in evidence:
                if result.url not in seen_urls:
                    unique_evidence.append(result)
                    seen_urls.add(result.url)
            
            # Evaluate claim against evidence
            verification = self.evidence_evaluator.evaluate_claim(claim, unique_evidence[:8])  # Top 8 unique results
            verification.search_queries = search_queries
            
            verifications.append(verification)
            
            # Update counters
            if verification.verification_status == "verified":
                verified_count += 1
            elif verification.verification_status == "contradicted":
                contradicted_count += 1
            else:
                insufficient_evidence_count += 1
        
        # Step 4: Generate overall assessment
        critical_issues = []
        contradicted_verifications = [v for v in verifications if v.verification_status == "contradicted"]
        
        for verification in contradicted_verifications:
            if verification.claim.importance == "high":
                critical_issues.append(f"HIGH PRIORITY: Contradicted claim - {verification.claim.claim_text}")
            elif verification.claim.importance == "medium":
                critical_issues.append(f"MEDIUM PRIORITY: Contradicted claim - {verification.claim.claim_text}")
        
        # Determine overall reliability
        if contradicted_count > 0:
            if any(v.claim.importance == "high" for v in contradicted_verifications):
                overall_reliability = "unreliable"
            else:
                overall_reliability = "low"
        elif verified_count >= len(claims) * 0.7:  # 70%+ verified
            overall_reliability = "high"
        elif verified_count >= len(claims) * 0.4:  # 40%+ verified  
            overall_reliability = "medium"
        else:
            overall_reliability = "low"
        
        print(f"   Fact-checking complete: {verified_count} verified, {contradicted_count} contradicted, {insufficient_evidence_count} insufficient evidence")
        
        return FactCheckReport(
            total_claims=len(claims),
            verified_claims=verified_count,
            contradicted_claims=contradicted_count,
            insufficient_evidence_claims=insufficient_evidence_count,
            overall_reliability=overall_reliability,
            critical_issues=critical_issues,
            verifications=verifications
        )

print("Advanced fact-checking system with claim extraction and evidence evaluation implemented successfully!")
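
The overall reliability rating produced by `fact_check_article` depends on only four inputs: the total number of claims, how many were verified, how many were contradicted, and whether any contradicted claim was high-importance. Restated as a standalone function for illustration (the name `rate_reliability` and its signature are assumptions, not part of the notebook):

```python
def rate_reliability(total: int, verified: int, contradicted: int,
                     high_importance_contradicted: bool) -> str:
    """Mirror the rating rules used by the fact checker above."""
    if total == 0:
        return "unknown"  # no verifiable claims were extracted
    if contradicted > 0:
        # Any contradiction caps the rating; a high-importance one is worst.
        return "unreliable" if high_importance_contradicted else "low"
    if verified >= total * 0.7:  # 70%+ verified
        return "high"
    if verified >= total * 0.4:  # 40%+ verified
        return "medium"
    return "low"


examples = [
    (10, 8, 0, False),
    (10, 5, 0, False),
    (10, 2, 0, False),
    (10, 9, 1, False),
    (10, 9, 1, True),
    (0, 0, 0, False),
]
for args in examples:
    print(args, rate_reliability(*args))
```

Note that a single contradicted claim outweighs any number of verified ones: nine verified claims out of ten still rate "low" if the tenth is contradicted.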

In [None]:
class RequirementAnalyzer:
    """Intelligently analyze and categorize style_and_background requirements using LLM."""
    
    def __init__(self, model):
        self.model = model
    
    def analyze_requirements(self, style_and_background: str, topic: str) -> RequirementAnalysis:
        """Use LLM to intelligently categorize requirements."""
        prompt = f"""
You are a Requirements Analyzer. Analyze the following style and background requirements and intelligently categorize them for different purposes.

TOPIC: {topic}

STYLE & BACKGROUND REQUIREMENTS:
{style_and_background}

TASK: Categorize these requirements into three specific areas:

1. PERSONA REQUIREMENTS (for writer persona generation):
   - Tone and voice characteristics
   - Writing style preferences
   - Personality traits
   - Perspective and approach
   - Expertise level to portray

2. CONTENT REQUIREMENTS (for blog writing process):
   - Length specifications (word counts, article length)
   - Structural requirements (format, sections, organization)
   - Content depth and complexity
   - Specific elements to include/exclude
   - Technical specifications

3. REVIEW REQUIREMENTS (for editorial review):
   - Quality standards and criteria
   - Accuracy requirements
   - Audience appropriateness checks
   - Compliance standards
   - Validation criteria

IMPORTANT:
- Extract and categorize ALL relevant information from the requirements
- Don't duplicate information across categories
- Be comprehensive but avoid redundancy
- If a requirement fits multiple categories, place it in the most appropriate one
- Convert implicit requirements into explicit instructions
- Maintain the original intent and specificity

Output as JSON:
{{
  "persona_requirements": "Detailed requirements for persona generation...",
  "content_requirements": "Detailed requirements for content creation...",
  "review_requirements": "Detailed requirements for editorial review..."
}}

Return ONLY the JSON object.
"""
        
        try:
            response = self.model.generate_content(prompt)
            response_text = response.text.strip()
            
            # Clean JSON response: strip Markdown code fences if the model added them
            response_text = re.sub(r'^```(?:json)?\s*', '', response_text)
            response_text = re.sub(r'\s*```$', '', response_text).strip()
            
            data = json.loads(response_text)
            
            return RequirementAnalysis(
                persona_requirements=data.get('persona_requirements', ''),
                content_requirements=data.get('content_requirements', ''),
                review_requirements=data.get('review_requirements', '')
            )
            
        except Exception as e:
            print(f"Requirement analysis error: {e}")
            # Fallback to original requirements
            return RequirementAnalysis(
                persona_requirements=style_and_background,
                content_requirements=style_and_background,
                review_requirements=style_and_background
            )

class FolderNameGenerator:
    """Generate smart folder names using LLM."""
    
    def __init__(self, model):
        self.model = model
    
    def generate_keywords(self, topic: str) -> Tuple[str, str]:
        """Generate 2 keywords for folder naming."""
        prompt = f"""
        Generate exactly 2 keywords that best represent this topic for folder naming.
        
        TOPIC: {topic}
        
        REQUIREMENTS:
        1. Return exactly 2 keywords
        2. Keywords should be lowercase
        3. Keywords should be single words (no spaces, hyphens, or special characters)
        4. Keywords should be relevant and descriptive
        5. Use underscores to replace spaces if needed (but prefer single words)
        6. Maximum 15 characters per keyword
        
        Return ONLY the keywords separated by a comma, like: keyword1,keyword2
        
        Examples:
        - "AI in Healthcare" → "ai,healthcare"
        - "Climate Change Solutions" → "climate,solutions"  
        - "Remote Work Productivity" → "remote,productivity"
        """
        
        try:
            response = self.model.generate_content(prompt)
            keywords_text = response.text.strip().lower()
            
            # Clean and extract keywords
            keywords = [k.strip() for k in keywords_text.split(',')]
            
            # Ensure we have exactly 2 keywords
            if len(keywords) >= 2:
                keyword1 = re.sub(r'[^a-z0-9_]', '', keywords[0])[:15]
                keyword2 = re.sub(r'[^a-z0-9_]', '', keywords[1])[:15]
                return keyword1, keyword2
            else:
                # Fallback to topic-based extraction
                words = re.findall(r'\w+', topic.lower())
                return (words[0][:15] if words else "topic"), (words[1][:15] if len(words) > 1 else "blog")
                
        except Exception as e:
            print(f"Keyword generation error: {e}")
            # Fallback to simple extraction
            words = re.findall(r'\w+', topic.lower())
            return (words[0][:15] if words else "topic"), (words[1][:15] if len(words) > 1 else "blog")

class FreeWebSearcher:
    """Free web search using DuckDuckGo search (no API key required)."""
    
    def __init__(self):
        self.session = requests.Session()
        self.session.headers.update({
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        })
    
    def search(self, query: str, num_results: int = 10) -> List[SearchResult]:
        """Perform a free web search using DuckDuckGo."""
        try:
            # Use DuckDuckGo search
            search_url = f"https://html.duckduckgo.com/html/?q={quote_plus(query)}"
            
            response = self.session.get(search_url, timeout=10)
            response.raise_for_status()
            
            soup = BeautifulSoup(response.content, 'html.parser')
            results = []
            
            # Parse DuckDuckGo results
            for result_div in soup.find_all('div', class_='result')[:num_results]:
                try:
                    title_elem = result_div.find('a', class_='result__a')
                    snippet_elem = result_div.find('a', class_='result__snippet')
                    
                    if title_elem and snippet_elem:
                        title = title_elem.get_text(strip=True)
                        url = title_elem.get('href', '')
                        snippet = snippet_elem.get_text(strip=True)
                        
                        if title and url and snippet:
                            results.append(SearchResult(
                                title=title,
                                url=url,
                                snippet=snippet
                            ))
                except Exception:
                    continue
            
            return results
            
        except Exception as e:
            print(f"Free search error: {e}")
            return []
    
    def search_multiple_queries(self, queries: List[str], results_per_query: int = 10) -> List[SearchResult]:
        """Search multiple queries and return combined results."""
        all_results = []
        
        for i, query in enumerate(queries):
            print(f"   Free searching: '{query}'")
            results = self.search(query, results_per_query)
            all_results.extend(results)
            
            # Add delay to be respectful
            if i < len(queries) - 1:
                time.sleep(2)
        
        return all_results

class WebSearcher:
    """Web search via the Google Custom Search JSON API (requires an API key and engine ID)."""

    def __init__(self, api_key: str = None, cse_id: str = None):
        self.api_key = api_key
        self.cse_id = cse_id
        self.base_url = "https://www.googleapis.com/customsearch/v1"
    
    def search(self, query: str, num_results: int = 10) -> List[SearchResult]:
        """Perform a Google Custom Search and return top results."""
        try:
            params = {
                'key': self.api_key,
                'cx': self.cse_id,
                'q': query,
                'num': min(num_results, 10)  # API max is 10 per request
            }
            
            response = requests.get(self.base_url, params=params, timeout=10)
            response.raise_for_status()
            
            data = response.json()
            results = []
            
            if 'items' in data:
                for item in data['items']:
                    result = SearchResult(
                        title=item.get('title', ''),
                        url=item.get('link', ''),
                        snippet=item.get('snippet', '')
                    )
                    results.append(result)
            
            return results
            
        except requests.exceptions.RequestException as e:
            print(f"Search API error: {e}")
            return []
        except Exception as e:
            print(f"Unexpected search error: {e}")
            return []
    
    def search_multiple_queries(self, queries: List[str], results_per_query: int = 10) -> List[SearchResult]:
        """Search multiple queries and return combined results."""
        all_results = []
        
        for i, query in enumerate(queries):
            print(f"   Searching: '{query}'")
            results = self.search(query, results_per_query)
            all_results.extend(results)
            
            # Add delay to respect API rate limits
            if i < len(queries) - 1:
                time.sleep(1)
        
        return all_results

class FileManager:
    """Create a dated output folder and save drafts, reviews, personas, and search results in it."""

    def __init__(self, topic: str, folder_name_generator: FolderNameGenerator):
        self.topic = topic
        self.date_str = datetime.datetime.now().strftime("%Y%m%d")
        
        # Generate smart keywords for folder naming
        keyword1, keyword2 = folder_name_generator.generate_keywords(topic)
        self.folder_name = f"{self.date_str}_{keyword1}_{keyword2}"
        self.output_dir = Path(self.folder_name)
        self.output_dir.mkdir(exist_ok=True)
        print(f"Created output directory: {self.output_dir}")
    
    def save_draft(self, content: str, version: int) -> Path:
        filename = f"draft_{version}.md"
        filepath = self.output_dir / filename
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
        print(f"Saved draft to: {filepath}")
        return filepath
    
    def save_review(self, review: EditorReview, version: int) -> Path:
        filename = f"review_{version}.md"
        filepath = self.output_dir / filename
        content = f"# Editorial Review {version}\n\n"
        content += f"**Approved:** {review.is_approved}\n\n"
        content += f"**Comments:**\n{review.comments}\n"
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
        print(f"Saved review to: {filepath}")
        return filepath
    
    def save_final_blog(self, content: str) -> Path:
        filepath = self.output_dir / "final_blog.md"
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
        print(f"Saved final blog to: {filepath}")
        return filepath
    
    def save_persona_details(self, persona_prompt: str, topic: str) -> Path:
        filename = "writer_persona.md"
        filepath = self.output_dir / filename
        content = f"# Writer Persona for: {topic}\n\n"
        content += f"**Generated on:** {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"
        content += f"## Writer Persona Details\n{persona_prompt}\n"
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
        print(f"Saved writer persona to: {filepath}")
        return filepath
    
    def save_editor_persona_details(self, editor_persona: str, review_criteria: str, topic: str) -> Path:
        filename = "editor_persona.md"
        filepath = self.output_dir / filename
        content = f"# Editor Persona for: {topic}\n\n"
        content += f"**Generated on:** {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"
        content += f"## Editor Persona Details\n{editor_persona}\n\n"
        content += f"## Review Criteria\n{review_criteria}\n"
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
        print(f"Saved editor persona to: {filepath}")
        return filepath
    
    def save_search_results(self, search_results: List[SearchResult]) -> Path:
        filename = "search_results.md"
        filepath = self.output_dir / filename
        content = "# Search Results\n\n"
        content += f"**Generated on:** {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n"
        content += f"**Total Results:** {len(search_results)}\n\n"
        
        for i, result in enumerate(search_results, 1):
            content += f"## Result {i}\n\n"
            content += f"**Title:** {result.title}\n\n"
            content += f"**URL:** {result.url}\n\n"
            content += f"**Snippet:** {result.snippet}\n\n"
            content += "---\n\n"
        
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
        print(f"Saved search results to: {filepath}")
        return filepath
    
    def save_requirement_analysis(self, analysis: RequirementAnalysis, topic: str) -> Path:
        filename = "requirement_analysis.md"
        filepath = self.output_dir / filename
        content = f"# Requirement Analysis for: {topic}\n\n"
        content += f"**Generated on:** {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"
        content += f"## Persona Requirements\n{analysis.persona_requirements}\n\n"
        content += f"## Content Requirements\n{analysis.content_requirements}\n\n"
        content += f"## Review Requirements\n{analysis.review_requirements}\n"
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
        print(f"Saved requirement analysis to: {filepath}")
        return filepath

print("Enhanced utility classes with web searcher implementations defined successfully!")
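`search_multiple_queries` concatenates the results of every query, so the same page can appear more than once when queries overlap. The following is a minimal sketch of one way to de-duplicate by URL before the results are analyzed or saved. `DemoResult` and `dedupe_by_url` are hypothetical names introduced for illustration; `DemoResult` only mirrors the `title`, `url`, and `snippet` fields used above and is not the notebook's `SearchResult` class.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DemoResult:
    # Hypothetical stand-in with the same three fields the notebook reads
    title: str
    url: str
    snippet: str


def dedupe_by_url(results: List[DemoResult]) -> List[DemoResult]:
    """Keep the first result seen for each URL, preserving the original order."""
    seen = set()
    unique = []
    for result in results:
        # Treat "https://site/a" and "https://site/a/" as the same page
        key = result.url.strip().rstrip("/")
        if key and key in seen:
            continue
        seen.add(key)
        unique.append(result)
    return unique


combined = [
    DemoResult("A", "https://example.com/a", "first hit"),
    DemoResult("B", "https://example.com/a/", "same page, trailing slash"),
    DemoResult("C", "https://example.com/c", "different page"),
]
print([r.title for r in dedupe_by_url(combined)])
```

Results with an empty URL are always kept, since there is nothing to compare them on.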

In [None]:
# Add missing save_fact_check_report method to FileManager
def save_fact_check_report(self, fact_check_report, version: int) -> Path:
    """Save comprehensive fact-check report."""
    filename = f"fact_check_{version}.md"
    filepath = self.output_dir / filename
    
    content = f"# Fact-Check Report {version}\n\n"
    content += f"**Generated on:** {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"
    
    # Summary
    content += f"## Summary\n\n"
    content += f"- **Total Claims Checked:** {fact_check_report.total_claims}\n"
    content += f"- **Verified Claims:** {fact_check_report.verified_claims}\n"
    content += f"- **Contradicted Claims:** {fact_check_report.contradicted_claims}\n"
    content += f"- **Insufficient Evidence:** {fact_check_report.insufficient_evidence_claims}\n"
    content += f"- **Overall Reliability:** {fact_check_report.overall_reliability.upper()}\n\n"
    
    # Critical Issues
    if fact_check_report.critical_issues:
        content += f"## Critical Issues\n\n"
        for issue in fact_check_report.critical_issues:
            content += f"- {issue}\n"
        content += "\n"
    else:
        content += f"## Critical Issues\n\nNone detected.\n\n"
    
    # Detailed Results
    content += f"## Detailed Verification Results\n\n"
    for i, verification in enumerate(fact_check_report.verifications, 1):
        content += f"### Claim {i}\n\n"
        content += f"**Claim:** {verification.claim.claim_text}\n\n"
        content += f"**Type:** {verification.claim.claim_type}\n"
        content += f"**Importance:** {verification.claim.importance}\n"
        content += f"**Status:** {verification.verification_status.upper()}\n"
        content += f"**Confidence:** {verification.confidence:.2f}\n\n"
        content += f"**Context:** {verification.claim.context}\n\n"
        content += f"**Explanation:** {verification.explanation}\n\n"
        
        if verification.search_queries:
            content += f"**Search Queries Used:**\n"
            for query in verification.search_queries:
                content += f"- {query}\n"
            content += "\n"
        
        if verification.evidence:
            content += f"**Evidence Sources ({len(verification.evidence)}):**\n"
            for j, evidence in enumerate(verification.evidence[:5], 1):  # Top 5 sources
                content += f"{j}. **{evidence.title}**\n"
                content += f"   URL: {evidence.url}\n"
                content += f"   Summary: {evidence.snippet}\n\n"
            
            if len(verification.evidence) > 5:
                content += f"   ... and {len(verification.evidence) - 5} more sources\n"
        
        content += "---\n\n"
    
    with open(filepath, 'w', encoding='utf-8') as f:
        f.write(content)
    print(f"Saved fact-check report to: {filepath}")
    return filepath

# Patch the FileManager class to include the missing method
FileManager.save_fact_check_report = save_fact_check_report

print("FileManager patched with save_fact_check_report method!")
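The patch above works because Python looks methods up on the class at call time, so a function assigned to a class attribute becomes available to every instance, including ones created before the assignment. A minimal sketch of that behavior, using a hypothetical `Report` class that is not part of this notebook:

```python
class Report:
    """Hypothetical class used only to illustrate patching."""

    def __init__(self, name: str):
        self.name = name


def describe(self) -> str:
    # A plain function whose first parameter plays the role of `self`
    return f"Report: {self.name}"


existing = Report("draft_1")   # created before the patch
Report.describe = describe     # attach the function as a method
print(existing.describe())     # the earlier instance can call it too
```

Defining the method inside the class body is usually easier to follow; patching is mainly useful in a notebook when re-running the original class cell is inconvenient.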

## Agent Implementation: Persona Architect

In [None]:
class PersonaArchitect:
    def __init__(self, model):
        self.model = model
    
    def generate_persona_and_queries(self, topic: str, persona_requirements: str) -> PersonaResult:
        """Generate persona using intelligently analyzed requirements."""
        prompt = f"""
You are a Persona Architect agent. Your task is to create a detailed writer persona and search queries for a blog post.

TOPIC: {topic}

PERSONA-SPECIFIC REQUIREMENTS:
{persona_requirements}

REQUIREMENTS:
1. Generate a detailed, topic-specific writer persona based on the provided persona requirements
2. Generate at least 3 optimized web search queries relevant to the topic
3. Output must be a valid JSON object with:
   - "persona_prompt": detailed writer persona description
   - "search_queries": list of at least 3 search query strings

The persona should include:
- Professional background and expertise (based on requirements)
- Writing style and tone (as specified in requirements)
- Target audience understanding (from requirements)
- Unique perspective or angle (guided by requirements)
- Personality traits and approach (from requirements)

Search queries should be:
- Specific and targeted to the topic
- Diverse in scope (different angles/aspects)
- Optimized for finding current, reliable information
- Aligned with the persona's expertise level

IMPORTANT: Focus on the persona-specific requirements provided. These have been intelligently extracted and categorized for persona generation.

Return ONLY the JSON object, no additional text.
"""
        
        try:
            response = self.model.generate_content(prompt)
            response_text = response.text.strip()
            
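            # Clean the response to extract JSON: strip Markdown code fences if present.
            # The closing fence is removed only when it exists, so an unfenced tail is not truncated.
            if response_text.startswith('```'):
                response_text = re.sub(r'^```(?:json)?\s*', '', response_text)
                response_text = re.sub(r'\s*```$', '', response_text).strip()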
            
            # Debug: Print the response for troubleshooting
            print(f"DEBUG - Persona generation response length: {len(response_text)}")
            print(f"DEBUG - First 100 chars: {response_text[:100]}")
            
            data = json.loads(response_text)
            
            # Ensure we have the correct data types
            persona_prompt = data.get('persona_prompt', '')
            search_queries = data.get('search_queries', [])
            
            # Coerce persona_prompt to a string (the model may return a dict or another type)
            if not isinstance(persona_prompt, str):
                persona_prompt = str(persona_prompt)
                
            # Ensure search_queries is a list
            if not isinstance(search_queries, list):
                search_queries = [str(search_queries)]
            
            return PersonaResult(
                persona_prompt=persona_prompt,
                search_queries=search_queries
            )
            
        except json.JSONDecodeError as e:
            print(f"JSON parsing error in PersonaArchitect: {e}")
            print(f"Raw response: {response_text}")
            raise
        except Exception as e:
            print(f"Error in PersonaArchitect: {e}")
            print(f"Raw response: {response.text if 'response' in locals() else 'No response'}")
            raise

class EditorPersonaArchitect:
    """Generate independent editor persona based on topic."""
    
    def __init__(self, model):
        self.model = model
    
    def generate_editor_persona(self, topic: str, review_requirements: str) -> EditorPersonaResult:
        """Generate editor persona using intelligently analyzed requirements."""
        prompt = f"""
You are an Editor Persona Architect. Create an independent editorial persona and review criteria.

TOPIC: {topic}

REVIEW-SPECIFIC REQUIREMENTS:
{review_requirements}

REQUIREMENTS:
1. Generate an editor persona COMPLETELY INDEPENDENT from any writer persona
2. The editor should be a topic expert with relevant editorial experience
3. Create specific review criteria based on the review requirements provided
4. Output must be a valid JSON object with:
   - "editor_persona": detailed editor persona description
   - "review_criteria": specific editorial standards and criteria

The editor persona should include:
- Professional editorial background relevant to the topic
- Subject matter expertise (independent perspective)
- Editorial experience and credentials
- Quality standards and expertise areas
- Editorial philosophy and approach

Review criteria should be based on:
- The review requirements provided (intelligently analyzed)
- Topic-specific accuracy requirements
- Quality standards from requirements
- Compliance and validation needs
- Editorial best practices for this domain

IMPORTANT: Use the review requirements provided - these have been intelligently extracted and categorized for editorial review purposes.

Return ONLY the JSON object, no additional text.
"""
        
        try:
            response = self.model.generate_content(prompt)
            response_text = response.text.strip()
            
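            # Clean the response to extract JSON: strip Markdown code fences if present.
            # The closing fence is removed only when it exists, so an unfenced tail is not truncated.
            if response_text.startswith('```'):
                response_text = re.sub(r'^```(?:json)?\s*', '', response_text)
                response_text = re.sub(r'\s*```$', '', response_text).strip()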
            
            data = json.loads(response_text)
            
            editor_persona = data.get('editor_persona', '')
            review_criteria = data.get('review_criteria', '')
            
            # Ensure strings
            if not isinstance(editor_persona, str):
                editor_persona = str(editor_persona)
            if not isinstance(review_criteria, str):
                review_criteria = str(review_criteria)
            
            return EditorPersonaResult(
                editor_persona=editor_persona,
                review_criteria=review_criteria
            )
            
        except Exception as e:
            print(f"Error in EditorPersonaArchitect: {e}")
            raise

print("Enhanced PersonaArchitect and EditorPersonaArchitect with intelligent requirements implemented successfully!")
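Both architects ask the model to return only a JSON object, but a model may still wrap the object in a Markdown code fence or add a sentence around it. The sketch below shows one way to make the parsing step more tolerant. `parse_json_response` is a hypothetical helper, not part of the notebook or of the Gemini SDK, and the fallback to the outermost braces is an assumption about how a stray reply might look.

```python
import json
import re
from typing import Any


def parse_json_response(response_text: str) -> Any:
    """Parse JSON from a model reply that may be fenced or padded with prose."""
    text = response_text.strip()
    # Drop an opening fence (three backticks plus an optional language tag)
    text = re.sub(r"^`{3}[A-Za-z]*\s*", "", text)
    # Drop a closing fence only if one is actually present
    text = re.sub(r"\s*`{3}$", "", text)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span when extra prose surrounds the object
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            raise
        return json.loads(text[start:end + 1])


reply = 'Here is the result:\n{"persona_prompt": "An editor", "search_queries": ["a", "b", "c"]}'
data = parse_json_response(reply)
print(data["search_queries"])
```

The helper re-raises the original `json.JSONDecodeError` when no object can be found, so the existing `except` blocks would still see the failure.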

## Agent Implementation: Research Analyst

In [None]:
class ResearchAnalyst:
    def __init__(self, model, web_searcher: WebSearcher):
        self.model = model
        self.web_searcher = web_searcher
    
    def conduct_research(self, search_queries: List[str]) -> ResearchResult:
        """Conduct real web research using the configured web searcher (DuckDuckGo or Google Custom Search)."""
        print("   Performing real web searches...")
        
        # Perform actual web searches
        search_results = self.web_searcher.search_multiple_queries(search_queries, results_per_query=10)
        
        if not search_results:
            print("   No search results found, falling back to knowledge-based research")
            return self._fallback_research(search_queries)
        
        print(f"   Found {len(search_results)} total search results")
        
        # Prepare search content for analysis
        search_content = self._format_search_results(search_results)
        
        # Use LLM to analyze and synthesize the search results
        analysis_prompt = f"""
You are a Research Analyst. Analyze the following real web search results and create a comprehensive research summary.

SEARCH QUERIES USED:
{chr(10).join(f'- {query}' for query in search_queries)}

SEARCH RESULTS TO ANALYZE:
{search_content}

TASK:
Analyze and synthesize these real search results into a comprehensive research summary. 

REQUIREMENTS:
1. Extract key facts, statistics, and insights from the search results
2. Identify recent developments and trends mentioned in the results
3. Note expert opinions and authoritative sources
4. Synthesize information from multiple sources
5. Maintain accuracy to the source material
6. Include relevant examples and case studies found

Format your response as:
--- RESEARCH START ---
[Your comprehensive analysis and synthesis of the search results]
--- RESEARCH END ---

Focus on creating a coherent narrative from the search results while preserving factual accuracy.
"""
        
        try:
            response = self.model.generate_content(analysis_prompt)
            content = response.text.strip()
            
            return ResearchResult(
                content=content, 
                source_count=len(search_results),
                search_results=search_results
            )
            
        except Exception as e:
            print(f"   Error analyzing search results: {e}")
            print("   Falling back to basic search result compilation")
            return self._compile_search_results(search_results)
    
    def _format_search_results(self, search_results: List[SearchResult]) -> str:
        """Format search results for LLM analysis."""
        formatted = ""
        for i, result in enumerate(search_results, 1):
            formatted += f"\n=== RESULT {i} ===\n"
            formatted += f"Title: {result.title}\n"
            formatted += f"URL: {result.url}\n"
            formatted += f"Content: {result.snippet}\n"
        return formatted
    
    def _compile_search_results(self, search_results: List[SearchResult]) -> ResearchResult:
        """Compile search results into a basic research format."""
        content = f"SOURCES CONSULTED: {len(search_results)} web sources\n\n"
        content += "--- RESEARCH START ---\n"
        content += "Based on real web search results:\n\n"
        
        for i, result in enumerate(search_results, 1):
            content += f"{i}. **{result.title}**\n"
            content += f"   Source: {result.url}\n"
            content += f"   Summary: {result.snippet}\n\n"
        
        content += "--- RESEARCH END ---"
        
        return ResearchResult(
            content=content,
            source_count=len(search_results),
            search_results=search_results
        )
    
    def _fallback_research(self, search_queries: List[str]) -> ResearchResult:
        """Fallback to knowledge-based research if web search fails."""
        prompt = f"""
You are a Research Analyst. Web search is unavailable, so provide research based on your knowledge.

SEARCH QUERIES:
{chr(10).join(f'- {query}' for query in search_queries)}

Provide comprehensive information based on your training data. Format as:
SOURCES CONSULTED: Knowledge base (web search unavailable)

--- RESEARCH START ---
[Your research content here]
--- RESEARCH END ---
"""
        
        try:
            response = self.model.generate_content(prompt)
            content = response.text.strip()
            return ResearchResult(content=content, source_count=0, search_results=[])
        except Exception as e:
            print(f"   Fallback research failed: {e}")
            raise

print("ResearchAnalyst agent with real web search implemented successfully!")
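All three research paths ask for, or produce, text wrapped in `--- RESEARCH START ---` and `--- RESEARCH END ---` markers, and `ResearchResult.content` stores the reply with the markers still in place. If a later step wants only the body, it could be extracted along these lines. This is a sketch: `extract_research_body` is a hypothetical helper, and returning the whole text when the markers are missing is an assumed fallback.

```python
import re

START_MARKER = "--- RESEARCH START ---"
END_MARKER = "--- RESEARCH END ---"


def extract_research_body(text: str) -> str:
    """Return the text between the research markers, or the whole text if they are absent."""
    pattern = re.escape(START_MARKER) + r"(.*?)" + re.escape(END_MARKER)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        # The model ignored the format; keep everything rather than lose content
        return text.strip()
    return match.group(1).strip()


reply = (
    "SOURCES CONSULTED: 2 web sources\n\n"
    "--- RESEARCH START ---\n"
    "Finding A.\nFinding B.\n"
    "--- RESEARCH END ---\n"
)
print(extract_research_body(reply))
```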

## Agent Implementation: Content Synthesizer

In [None]:
class ContentSynthesizer:
    def __init__(self, model):
        self.model = model
    
    def write_blog_post(self, persona_prompt: str, researched_content: str, 
                       topic: str, content_requirements: str, editorial_feedback: Optional[str] = None) -> BlogDraft:
        """Write blog post using intelligently analyzed content requirements."""
        
        base_prompt = f"""
You are a Content Synthesizer agent. Your task is to write a high-quality blog post.

WRITER PERSONA:
{persona_prompt}

RESEARCH CONTENT:
{researched_content}

TOPIC: {topic}

CONTENT-SPECIFIC REQUIREMENTS:
{content_requirements}

FUNDAMENTAL PRINCIPLES TO FOLLOW:
{QUALITY_PRINCIPLES}

REQUIREMENTS:
1. Follow ALL content requirements specified above (these have been intelligently analyzed and extracted)
2. Use the writer persona voice and style exactly
3. Adhere strictly to all Fundamental Principles (P1-P5)
4. Include a compelling title, introduction, body, and conclusion
5. Ensure all claims are traceable to the research content
6. Make the post engaging with a strong hook and memorable conclusion

IMPORTANT NOTES:
- The content requirements above include length specifications, structural requirements, formatting needs, and other content-specific instructions
- These requirements have been intelligently parsed and categorized specifically for content creation
- Pay special attention to any length requirements, structural specifications, or formatting instructions
- If length requirements are specified, ensure you meet or exceed them with substantive, valuable content
"""
        
        if editorial_feedback:
            base_prompt += f"""

EDITORIAL FEEDBACK TO ADDRESS:
{editorial_feedback}

IMPORTANT: Specifically address the editor's comments while maintaining the quality principles AND the content requirements.
"""
        
        base_prompt += """

Write the blog post in Markdown format. Include the title as an H1 header.
Focus on creating high-quality, valuable content that meets all specified requirements.
"""
        
        try:
            response = self.model.generate_content(base_prompt)
            content = response.text.strip()
            
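            # Version is 1 for a first draft and 2 for any revision. This method does not
            # track how many revision cycles have run, so callers that need a distinct
            # number per cycle must assign it themselves.
            version = 1 if editorial_feedback is None else 2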
            
            return BlogDraft(content=content, version=version)
            
        except Exception as e:
            print(f"Error in ContentSynthesizer: {e}")
            raise

print("Enhanced ContentSynthesizer with intelligent content requirements implemented successfully!")
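`write_blog_post` labels every revision as version 2, while `FileManager.save_draft` builds the file name from the version it is given. If the workflow runs several review cycles and passes the draft's version straight through, later revisions would overwrite `draft_2.md`. Whether that happens depends on the orchestration loop; the sketch below only illustrates one way a caller could number drafts itself. `next_draft_version` is a hypothetical helper.

```python
from typing import Optional


def next_draft_version(previous_version: Optional[int]) -> int:
    """Return 1 for the first draft, otherwise one more than the previous version."""
    if previous_version is None:
        return 1
    return previous_version + 1


# Simulate an initial draft followed by two revision cycles
version = None
filenames = []
for _ in range(3):
    version = next_draft_version(version)
    filenames.append(f"draft_{version}.md")
print(filenames)
```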

## Agent Implementation: Critic/Editor

In [None]:
class CriticEditor:
    def __init__(self, model, web_searcher):
        self.model = model
        self.web_searcher = web_searcher
        self.fact_checker = FactChecker(model, web_searcher)
    
    def _count_words(self, text: str) -> int:
        """Count words in the blog post content."""
        # Remove markdown formatting for accurate word count
        clean_text = re.sub(r'[#*`\-\[\]()]+', ' ', text)
        clean_text = re.sub(r'https?://[^\s]+', '', clean_text)
        words = re.findall(r'\b\w+\b', clean_text.lower())
        return len(words)
    
    def review_draft(self, draft_blog: str, research_content: str, 
                    editor_persona: str, editor_review_criteria: str, 
                    content_requirements: str, topic: str) -> Tuple[EditorReview, FactCheckReport]:
        """Review draft with comprehensive fact-checking using intelligent requirement analysis."""
        
        # Count words in the draft
        word_count = self._count_words(draft_blog)
        
        print("  - Performing comprehensive fact-checking...")
        fact_check_report = self.fact_checker.fact_check_article(draft_blog, topic)
        
        # Determine if fact-checking passes
        fact_check_passes = fact_check_report.overall_reliability in ["high", "medium"]
        has_critical_issues = len(fact_check_report.critical_issues) > 0
        
        prompt = f"""
You are a Critic/Editor with the following persona and expertise:

EDITOR PERSONA:
{editor_persona}

SPECIALIZED REVIEW CRITERIA:
{editor_review_criteria}

BLOG DRAFT TO REVIEW:
{draft_blog}

RESEARCH CONTENT FOR FACT-CHECKING:
{research_content}

CONTENT REQUIREMENTS TO VALIDATE:
{content_requirements}

FUNDAMENTAL PRINCIPLES TO EVALUATE AGAINST:
{QUALITY_PRINCIPLES}

COMPREHENSIVE FACT-CHECK REPORT:
- Total Claims Checked: {fact_check_report.total_claims}
- Verified Claims: {fact_check_report.verified_claims}
- Contradicted Claims: {fact_check_report.contradicted_claims}
- Insufficient Evidence: {fact_check_report.insufficient_evidence_claims}
- Overall Reliability: {fact_check_report.overall_reliability}
- Critical Issues: {len(fact_check_report.critical_issues)}

CRITICAL FACTUAL ISSUES FOUND:
{chr(10).join(fact_check_report.critical_issues) if fact_check_report.critical_issues else "None"}

DETAILED FACT-CHECK RESULTS:
{self._format_fact_check_details(fact_check_report)}

WORD COUNT ANALYSIS:
- Current word count: {word_count} words

EVALUATION CRITERIA:
1. Check adherence to each fundamental principle (P1-P5)
2. Verify factual consistency with research material
3. CRITICAL: Assess fact-check results - articles with contradicted claims must be rejected
4. Assess compliance with specialized review criteria
5. Validate against ALL content requirements (including length, structure, format)
6. Check topic-specific accuracy and expertise level
7. Assess audience appropriateness
8. Verify meeting of technical specifications

FACT-CHECKING REQUIREMENTS:
- Articles with "unreliable" fact-check rating must be REJECTED
- Articles with contradicted high-priority claims must be REJECTED
- Articles with multiple contradicted claims should be REJECTED
- Only approve articles with "high" or "medium" reliability ratings

REQUIREMENTS:
1. Output must be a valid JSON object with:
   - "is_approved": boolean (true if draft meets ALL criteria INCLUDING fact-checking)
   - "comments": string with clear, constructive, actionable feedback

2. If approved (is_approved: true), provide brief affirmation including fact-check confidence
3. If not approved (is_approved: false), provide specific, actionable feedback:
   - Factual errors that must be corrected (be specific about claims and evidence)
   - Which principles or criteria need attention
   - Content requirement violations (length, structure, format, etc.)
   - Topic-specific improvements needed
   - Suggestions for improvement

IMPORTANT: 
- Use your specialized editorial expertise to provide domain-specific feedback
- CRITICAL: Factual accuracy is non-negotiable - reject articles with significant factual errors
- Check that ALL content requirements have been met (these were intelligently analyzed)
- Be thorough but constructive, focusing on helping improve the draft
- If content requirements specify minimum lengths, structural elements, or specific formats, ensure they are met
- For factual issues, provide specific guidance on what needs to be verified/corrected

Return ONLY the JSON object, no additional text.
"""
        
        try:
            response = self.model.generate_content(prompt)
            response_text = response.text.strip()
            
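            # Clean the response to extract JSON: strip Markdown code fences if present.
            # The closing fence is removed only when it exists, so an unfenced tail is not truncated.
            if response_text.startswith('```'):
                response_text = re.sub(r'^```(?:json)?\s*', '', response_text)
                response_text = re.sub(r'\s*```$', '', response_text).strip()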
            
            data = json.loads(response_text)
            
            # Override approval if fact-checking fails
            is_approved = data['is_approved']
            if fact_check_report.overall_reliability == "unreliable" or has_critical_issues:
                is_approved = False
                if data['is_approved']:  # Was approved but fact-check failed
                    data['comments'] += f"\n\nFACT-CHECK OVERRIDE: Article rejected due to factual reliability issues. Overall reliability: {fact_check_report.overall_reliability}. Critical issues must be addressed before approval."
            
            return EditorReview(
                is_approved=is_approved,
                comments=data['comments']
            ), fact_check_report
            
        except Exception as e:
            print(f"Error in CriticEditor: {e}")
            print(f"Raw response: {response.text if 'response' in locals() else 'No response'}")
            
            # Return a safe fallback with fact-check results
            return EditorReview(
                is_approved=False,
                comments=f"Error during editorial review: {e}. Additionally, fact-check reliability: {fact_check_report.overall_reliability}"
            ), fact_check_report
    
    def _format_fact_check_details(self, fact_check_report: FactCheckReport) -> str:
        """Format fact-check details for the editor prompt."""
        if not fact_check_report.verifications:
            return "No specific claims were fact-checked."
        
        details = ""
        for i, verification in enumerate(fact_check_report.verifications[:10], 1):  # Limit to top 10 for prompt length
            details += f"{i}. CLAIM: {verification.claim.claim_text}\n"
            details += f"   STATUS: {verification.verification_status.upper()} (confidence: {verification.confidence:.2f})\n"
            details += f"   IMPORTANCE: {verification.claim.importance}\n"
            if verification.verification_status == "contradicted":
                details += f"   ISSUE: {verification.explanation}\n"
            details += "\n"
        
        if len(fact_check_report.verifications) > 10:
            details += f"... and {len(fact_check_report.verifications) - 10} more claims checked.\n"
        
        return details

print("Enhanced CriticEditor with comprehensive fact-checking and intelligent requirement validation implemented successfully!")
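The approval override in `review_draft` can be read as a small pure function: the editor model's verdict stands unless the fact-check is rated "unreliable" or reports at least one critical issue. The sketch below restates that rule so it can be tested without calling a model. `apply_fact_check_override` is a hypothetical name, and the override message is shortened from the one used above.

```python
from typing import List, Tuple


def apply_fact_check_override(
    llm_approved: bool,
    comments: str,
    overall_reliability: str,
    critical_issues: List[str],
) -> Tuple[bool, str]:
    """Reject a draft whose fact-check failed, whatever the editor model decided."""
    fact_check_failed = overall_reliability == "unreliable" or len(critical_issues) > 0
    if not fact_check_failed:
        return llm_approved, comments
    if llm_approved:
        # Explain the reversal only when the model had approved the draft
        comments += (
            "\n\nFACT-CHECK OVERRIDE: Article rejected due to factual reliability issues. "
            f"Overall reliability: {overall_reliability}."
        )
    return False, comments


approved, note = apply_fact_check_override(True, "Looks good.", "unreliable", [])
print(approved)
print(note)
```

As in the code above, any rating other than "unreliable" passes this check when there are no critical issues. The rule that only "high" or "medium" ratings should be approved is stated in the prompt but not enforced here.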

## Main Workflow Orchestration

In [None]:
class DynamicBlogWriter:
    def __init__(self, model, web_searcher):
        self.model = model
        self.web_searcher = web_searcher
        self.requirement_analyzer = RequirementAnalyzer(model)
        self.persona_architect = PersonaArchitect(model)
        self.editor_persona_architect = EditorPersonaArchitect(model)
        self.research_analyst = ResearchAnalyst(model, web_searcher)
        self.content_synthesizer = ContentSynthesizer(model)
        self.critic_editor = CriticEditor(model, web_searcher)  # Pass web_searcher for fact-checking
        self.folder_name_generator = FolderNameGenerator(model)
    
    def _display_markdown_section(self, title: str, content: str, max_lines: Optional[int] = None):
        """Display content as formatted markdown with optional truncation."""
        # Fix newline characters for proper markdown display
        content = content.replace('\\n', '\n').replace('\\t', '\t')
        
        lines = content.split('\n')
        if max_lines and len(lines) > max_lines:
            truncated_content = '\n'.join(lines[:max_lines])
            truncated_content += f"\n\n*... (showing first {max_lines} lines of {len(lines)} total)*"
            display(Markdown(f"### {title}\n\n{truncated_content}"))
        else:
            display(Markdown(f"### {title}\n\n{content}"))
    
    def _display_search_results(self, search_results: List[SearchResult]):
        """Display the top search results as a formatted Markdown list."""
        if not search_results:
            return
        
        content = f"**Total Results:** {len(search_results)}\n\n"
        
        for i, result in enumerate(search_results[:10], 1):  # Show top 10
            content += f"**{i}. {result.title}**\n"
            content += f"URL: {result.url}\n"
            content += f"Summary: {result.snippet}\n\n"
            content += "---\n\n"
        
        if len(search_results) > 10:
            content += f"*... and {len(search_results) - 10} more results*"
        
        self._display_markdown_section("Real Web Search Results", content, max_lines=30)
    
    def _display_fact_check_report(self, fact_check_report: FactCheckReport, cycle: int):
        """Display comprehensive fact-check results."""
        # Summary
        summary = f"""**Fact-Check Summary for Review Cycle {cycle}**

**Overall Reliability:** {fact_check_report.overall_reliability.upper()}
**Total Claims Checked:** {fact_check_report.total_claims}
**Verified:** {fact_check_report.verified_claims} | **Contradicted:** {fact_check_report.contradicted_claims} | **Insufficient Evidence:** {fact_check_report.insufficient_evidence_claims}"""
        
        if fact_check_report.critical_issues:
            summary += f"\n\n**CRITICAL ISSUES ({len(fact_check_report.critical_issues)}):**\n"
            for issue in fact_check_report.critical_issues:
                summary += f"- {issue}\n"
        else:
            summary += f"\n\n**No critical factual issues detected.**"
        
        self._display_markdown_section(f"Fact-Check Report {cycle}", summary, max_lines=15)
        
        # Detailed claims (if any contradicted)
        if fact_check_report.contradicted_claims > 0:
            contradicted = [v for v in fact_check_report.verifications if v.verification_status == "contradicted"]
            details = "**CONTRADICTED CLAIMS REQUIRING ATTENTION:**\n\n"
            
            for i, verification in enumerate(contradicted, 1):
                details += f"{i}. **{verification.claim.claim_text}**\n"
                details += f"   Priority: {verification.claim.importance.upper()}\n"
                details += f"   Issue: {verification.explanation}\n\n"
            
            self._display_markdown_section(f"Critical Fact-Check Issues {cycle}", details, max_lines=20)
    
    def generate_blog_post(self, topic: str, style_and_background: str) -> Tuple[str, Path]:
        """
        Enhanced workflow with intelligent requirement analysis and comprehensive fact-checking.
        Returns: (final_blog_content, final_blog_path)
        """
        print(f"Starting Enhanced Dynamic Blog Writer with FACT-CHECKING for topic: '{topic}'")
        print("=" * 70)
        
        # Initialize file manager with smart naming
        file_manager = FileManager(topic, self.folder_name_generator)
        
        try:
            # Phase 0: Intelligent Requirement Analysis
            print("Phase 0: Analyzing and categorizing requirements intelligently...")
            requirement_analysis = self.requirement_analyzer.analyze_requirements(style_and_background, topic)
            print("Requirements intelligently analyzed and categorized")
            
            # Display requirement analysis with proper formatting
            analysis_summary = f"""**Persona Requirements:**
{requirement_analysis.persona_requirements}

**Content Requirements:**
{requirement_analysis.content_requirements}

**Review Requirements:**
{requirement_analysis.review_requirements}"""
            
            self._display_markdown_section("Intelligent Requirement Analysis", analysis_summary, max_lines=20)
            
            # Save requirement analysis
            file_manager.save_requirement_analysis(requirement_analysis, topic)
            
            # Phase 1: Persona Generation (Writer + Editor) using categorized requirements
            print("Phase 1: Generating writer and editor personas with targeted requirements...")
            
            # Generate writer persona using persona-specific requirements
            persona_result = self.persona_architect.generate_persona_and_queries(topic, requirement_analysis.persona_requirements)
            print("Generated writer persona using targeted persona requirements")
            
            # Generate INDEPENDENT editor persona using review-specific requirements  
            editor_result = self.editor_persona_architect.generate_editor_persona(topic, requirement_analysis.review_requirements)
            print("Generated independent editor persona using targeted review requirements")
            
            # Display writer persona
            persona_text = str(persona_result.persona_prompt)
            self._display_markdown_section("Writer Persona (From Persona Requirements)", persona_text, max_lines=15)
            
            # Display editor persona
            editor_text = str(editor_result.editor_persona)
            self._display_markdown_section("Editor Persona (From Review Requirements)", editor_text, max_lines=15)
            
            # Display search queries
            queries_text = "\n".join(f"- {query}" for query in persona_result.search_queries)
            self._display_markdown_section("Generated Search Queries", queries_text)
            
            # Save personas to files
            file_manager.save_persona_details(persona_text, topic)
            file_manager.save_editor_persona_details(editor_text, editor_result.review_criteria, topic)
            
            # Phase 2: Real Web Research
            print("\nPhase 2: Conducting REAL web research...")
            research_result = self.research_analyst.conduct_research(persona_result.search_queries)
            print(f"Research completed - {research_result.source_count} real web sources")
            
            # Display actual search results
            if research_result.search_results:
                self._display_search_results(research_result.search_results)
                file_manager.save_search_results(research_result.search_results)
            
            # Display research analysis
            research_content = research_result.content
            if "--- RESEARCH START ---" in research_content and "--- RESEARCH END ---" in research_content:
                start_idx = research_content.find("--- RESEARCH START ---") + len("--- RESEARCH START ---")
                end_idx = research_content.find("--- RESEARCH END ---")
                clean_research = research_content[start_idx:end_idx].strip()
            else:
                clean_research = research_content
            
            source_info = f"**Real Web Sources:** {research_result.source_count}\n\n{clean_research}"
            self._display_markdown_section("Research Analysis", source_info, max_lines=25)
            
            # Phase 3: Content Generation & Review Loop with FACT-CHECKING
            print("\nPhase 3: Starting content generation with COMPREHENSIVE FACT-CHECKING...")
            
            current_draft = None
            last_review = None  # set after the first rejected review; read only when revising
            review_cycle = 0
            max_cycles = 3
            all_reviews = []
            all_fact_checks = []
            
            while review_cycle < max_cycles:
                review_cycle += 1
                print(f"\nReview Cycle {review_cycle}/{max_cycles}")
                
                # Generate or revise draft using content-specific requirements
                if current_draft is None:
                    print("  - Writing initial draft using targeted content requirements...")
                    draft = self.content_synthesizer.write_blog_post(
                        persona_text,
                        research_result.content,
                        topic,
                        requirement_analysis.content_requirements  # Use targeted content requirements
                    )
                else:
                    print("  - Revising draft based on editorial AND fact-check feedback...")
                    draft = self.content_synthesizer.write_blog_post(
                        persona_text,
                        research_result.content,
                        topic,
                        requirement_analysis.content_requirements,  # Use targeted content requirements
                        editorial_feedback=last_review.comments
                    )
                
                # Save draft
                draft_path = file_manager.save_draft(draft.content, review_cycle)
                current_draft = draft
                
                # Display draft with word count
                word_count = self.critic_editor._count_words(draft.content)
                draft_preview = f"**Word Count:** {word_count} words\n\n{draft.content}"
                self._display_markdown_section(f"Draft {review_cycle}", draft_preview, max_lines=25)
                
                # Review draft WITH COMPREHENSIVE FACT-CHECKING
                print("  - Reviewing draft with specialized editor and COMPREHENSIVE FACT-CHECKING...")
                review, fact_check_report = self.critic_editor.review_draft(
                    draft.content, 
                    research_result.content,
                    editor_result.editor_persona,
                    editor_result.review_criteria,
                    requirement_analysis.content_requirements,  # Use targeted content requirements for validation
                    topic
                )
                
                # Save both review and fact-check report
                review_path = file_manager.save_review(review, review_cycle)
                fact_check_path = file_manager.save_fact_check_report(fact_check_report, review_cycle)
                
                all_reviews.append((review_cycle, review))
                all_fact_checks.append((review_cycle, fact_check_report))
                
                # Display comprehensive review results
                review_content = f"**Approved:** {'YES' if review.is_approved else 'NO'}\n"
                review_content += f"**Fact-Check Reliability:** {fact_check_report.overall_reliability.upper()}\n\n"
                review_content += f"**Editorial Feedback:**\n{review.comments}"
                self._display_markdown_section(f"Editorial Review {review_cycle}", review_content)
                
                # Display detailed fact-check results
                self._display_fact_check_report(fact_check_report, review_cycle)
                
                if review.is_approved:
                    print(f"Draft approved after {review_cycle} cycle(s) with fact-check reliability: {fact_check_report.overall_reliability}")
                    break
                else:
                    if fact_check_report.overall_reliability in ["unreliable", "low"]:
                        print("Draft needs revision due to FACTUAL ISSUES...")
                    else:
                        print("Draft needs revision due to editorial feedback...")
                    last_review = review
                    
                    if review_cycle == max_cycles:
                        print(f"Maximum review cycles ({max_cycles}) reached. Using last draft.")
                        print(f"FINAL FACT-CHECK STATUS: {fact_check_report.overall_reliability}")
            
            # Phase 4: Finalization with Final Assessment
            print("\nPhase 4: Finalizing blog post with final fact-check assessment...")
            final_path = file_manager.save_final_blog(current_draft.content)
            
            # Display final blog post with word count
            final_word_count = self.critic_editor._count_words(current_draft.content)
            final_preview = f"**Final Word Count:** {final_word_count} words\n\n{current_draft.content}"
            self._display_markdown_section("Final Blog Post", final_preview)
            
            # Final fact-check status
            final_fact_check = all_fact_checks[-1][1] if all_fact_checks else None
            
            # Display comprehensive summary with fact-checking
            summary = f"""**Topic:** {topic}
**Smart Folder:** {file_manager.folder_name}
**Review Cycles:** {review_cycle}
**Final Status:** {'Approved' if all_reviews and all_reviews[-1][1].is_approved else 'Max cycles reached'}
**FACT-CHECK RELIABILITY:** {final_fact_check.overall_reliability.upper() if final_fact_check else 'Unknown'}
**Final Word Count:** {final_word_count} words
**Real Web Sources:** {research_result.source_count}
**Search Queries Used:** {len(persona_result.search_queries)}

**FACT-CHECKING SUMMARY:**
  - Claims Verified: {final_fact_check.verified_claims if final_fact_check else 0}
  - Claims Contradicted: {final_fact_check.contradicted_claims if final_fact_check else 0}
  - Critical Issues: {len(final_fact_check.critical_issues) if final_fact_check else 0}
  
**Intelligence Features:**
  - Requirement Analysis: AI-categorized requirements
  - Targeted Persona: Generated from persona-specific requirements
  - Specialized Editor: Created from review-specific requirements  
  - Smart Content: Uses content-specific requirements
  - COMPREHENSIVE FACT-CHECKING: Real-time verification of all factual claims
  
**Output Folder:** `{file_manager.output_dir}`
**Final Blog File:** `{final_path}`

**Research Quality:** Real-time web search results
**Fact-Check Quality:** Comprehensive online verification of all claims
**Intelligence Level:** Advanced requirement categorization and fact verification
**Reliability Assurance:** {final_fact_check.overall_reliability.upper() if final_fact_check else 'Unknown'} factual reliability"""
            
            self._display_markdown_section("Enhanced Generation Summary with Fact-Checking", summary)
            
            print("\nBlog generation completed with COMPREHENSIVE FACT-CHECKING!")
            print(f"CRITICAL: Final fact-check reliability is {final_fact_check.overall_reliability.upper() if final_fact_check else 'Unknown'}")
            if final_fact_check and final_fact_check.critical_issues:
                print(f"WARNING: {len(final_fact_check.critical_issues)} critical factual issues detected")
            
            return current_draft.content, final_path
            
        except Exception as e:
            print(f"\nError during blog generation: {e}")
            print("Please check your API keys and internet connection.")
            raise

print("COMPREHENSIVE DynamicBlogWriter with FACT-CHECKING and AI-powered requirement analysis implemented successfully!")

## Initialize System

In [None]:
# Initialize the Dynamic Blog Writer system
if USE_FREE_SEARCH:
    # Free web search - no additional API keys needed
    web_searcher = FreeWebSearcher()
    search_method = "FREE DuckDuckGo search"
else:
    # Google Custom Search API
    web_searcher = WebSearcher(GOOGLE_CSE_API_KEY, GOOGLE_CSE_ID)
    search_method = "Google Custom Search API"

blog_writer = DynamicBlogWriter(model, web_searcher)

print("Dynamic Blog Writer system initialized and ready!")
print(f"Search method: {search_method}")
if USE_FREE_SEARCH:
    print("Benefits: No additional API keys required, completely free!")
else:
    print("Benefits: Higher quality results, 100 free searches/day")

## Usage Example

In [None]:
# Example usage - uncomment and modify as needed
topic = "The Future of Artificial Intelligence in Healthcare"
style_and_background = """
Target audience: General tech-savvy readers interested in healthcare innovation
Tone: Professional but accessible, optimistic yet balanced
Style: Informative with real-world examples, approximately 500 words
Background: Write from the perspective of someone knowledgeable about both AI and healthcare trends
"""

# Uncomment the following lines to run the system:
# final_content, final_path = blog_writer.generate_blog_post(topic, style_and_background)
# print(f"\nFinal blog post saved to: {final_path}")

## System Testing

In [None]:
## Documentation

### Project Overview
This Dynamic Micro-Blog Writer implements a complete multi-agent system for autonomous blog generation with **COMPREHENSIVE FACT-CHECKING** and the following features:

#### Implemented Features
- **Persona Architect**: Generates writer personas and search queries from user input
- **Research Analyst**: Performs real web searches and consolidates findings  
- **Content Synthesizer**: Writes blog posts following persona and research
- **Critic/Editor**: Reviews drafts with COMPREHENSIVE FACT-CHECKING against quality principles
- **FACT-CHECKING SYSTEM**: Real-time verification of all factual claims via web search
- **Iterative Review Loop**: Up to 3 cycles of draft revision based on editorial AND fact-check feedback
- **File Management**: Automatic folder creation with timestamped naming and fact-check reports
- **Error Handling**: Comprehensive error handling with informative messages
- **Google Colab Integration**: Secure API key management using Colab secrets

#### NEW: Comprehensive Fact-Checking System
- **Claim Extraction**: AI identifies verifiable claims (statistics, dates, names, facts, quotes)
- **Evidence Search**: Real-time web searches for each claim using targeted queries
- **Verification Analysis**: LLM compares claims against search evidence
- **Reliability Scoring**: Overall article reliability (high/medium/low/unreliable)
- **Critical Issue Detection**: Identifies contradicted claims that must be corrected
- **Detailed Reports**: Comprehensive fact-check reports with evidence sources
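
As a rough illustration of the reliability scoring step, the sketch below derives an overall rating from per-claim statuses. The `Verification` dataclass, the `score_reliability` function, the importance values, and the 0.8 threshold are all assumptions made for this example; the notebook's own scoring rules may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Verification:
    """Hypothetical, flattened stand-in for one fact-check result."""
    claim_text: str
    importance: str           # assumed values: "high", "medium", "low"
    verification_status: str  # "verified", "contradicted", "insufficient_evidence"


def score_reliability(verifications: List[Verification]) -> str:
    """Map claim-level results to high/medium/low/unreliable (illustrative thresholds)."""
    if not verifications:
        return "medium"  # assumption: nothing to check is neither good nor bad
    contradicted = [v for v in verifications if v.verification_status == "contradicted"]
    verified = [v for v in verifications if v.verification_status == "verified"]
    # Assumption: any contradicted high-priority claim makes the article unreliable.
    if any(v.importance == "high" for v in contradicted):
        return "unreliable"
    if contradicted:
        return "low"
    verified_ratio = len(verified) / len(verifications)
    return "high" if verified_ratio >= 0.8 else "medium"
```

A stricter or looser threshold would shift articles between "high" and "medium" without changing which ones are flagged for contradicted claims.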

#### Quality Principles (P1-P5)
1. **Evidentiary Support**: All claims traceable to research material
2. **Clarity and Conciseness**: Precise, unambiguous writing
3. **Engaging Narrative**: Strong hook, logical flow, memorable conclusion
4. **Structural Integrity**: Clear title, introduction, body, conclusion
5. **Intellectual Honesty**: Accurate information representation

#### CRITICAL: Fact-Checking Requirements
- **Articles with "unreliable" fact-check rating are REJECTED**
- **Articles with contradicted high-priority claims are REJECTED**
- **Only articles with "high" or "medium" reliability are approved**
- **All factual claims are verified against real-time web search**
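
The rules above amount to a simple gate on the fact-check outcome. The sketch below shows one way to express it; `passes_fact_check_gate` and its parameters are hypothetical names for this example, not functions defined in the notebook.

```python
def passes_fact_check_gate(overall_reliability: str, contradicted_high_priority: int) -> bool:
    """Return True only if a draft clears the fact-checking rules listed above."""
    # Any contradicted high-priority claim rejects the draft outright.
    if contradicted_high_priority > 0:
        return False
    # Otherwise only "high" or "medium" reliability is acceptable.
    return overall_reliability.lower() in {"high", "medium"}


# Example: a "medium" draft with no contradicted high-priority claims passes.
print(passes_fact_check_gate("medium", 0))
```

A gate like this would typically be combined with the editor's own verdict, so a draft is approved only when both the editorial review and the fact-check pass.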

#### Usage Instructions for Google Colab
1. **Setup Secrets**: Add `GEMINI_API_KEY` to Colab secrets (key icon in sidebar)
2. **Run Cells**: Execute all cells in order to initialize the system
3. **Customize**: Modify the example topic and style guide as needed
4. **Execute**: Uncomment and run the main usage example (the `blog_writer.generate_blog_post` call)
5. **Review**: Check fact-check reports for reliability assessment
6. **Download**: Use Colab's file browser to download generated content
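
Step 1 depends on the key being readable from Colab secrets. A minimal sketch of that lookup follows, using `google.colab.userdata.get`. The `load_api_key` helper and the environment-variable fallback for running outside Colab are assumptions added for this example.

```python
import os


def load_api_key(name: str = "GEMINI_API_KEY") -> str:
    """Read a key from Colab secrets, falling back to an environment variable."""
    try:
        from google.colab import userdata  # only importable inside Colab
        # Inside Colab this raises if the secret is missing or notebook access is off.
        value = userdata.get(name)
    except ImportError:
        # Assumption: outside Colab, the key is supplied through the environment.
        value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it via the key icon and enable notebook access.")
    return value
```

Failing early with a clear message makes a missing or inaccessible secret easier to diagnose than an authentication error raised later during generation.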

#### Enhanced Output Structure (in Colab filesystem)
```
/content/YYYYMMDD_topic_abbreviation/
├── requirement_analysis.md (NEW: AI-categorized requirements)
├── writer_persona.md
├── editor_persona.md (NEW: Independent editor persona)
├── search_results.md (Real web search results)
├── draft_1.md
├── review_1.md
├── fact_check_1.md (NEW: Comprehensive fact-check report)
├── draft_2.md (if revision needed)
├── review_2.md (if revision needed)
├── fact_check_2.md (NEW: Updated fact-check report)
├── draft_3.md (if second revision needed)
├── review_3.md (if second revision needed)
├── fact_check_3.md (NEW: Final fact-check report)
└── final_blog.md
```
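
To check a finished run against the layout above, a small helper can list the files it implies and report any that are absent. This is only a sketch: `expected_files` and `missing_files` are hypothetical helpers, and `search_results.md` may legitimately be missing when a run returns no search results.

```python
from pathlib import Path
from typing import List


def expected_files(review_cycles: int) -> List[str]:
    """List the file names the tree above implies for a run with this many cycles."""
    names = ["requirement_analysis.md", "writer_persona.md",
             "editor_persona.md", "search_results.md"]
    for cycle in range(1, review_cycles + 1):
        names += [f"draft_{cycle}.md", f"review_{cycle}.md", f"fact_check_{cycle}.md"]
    names.append("final_blog.md")
    return names


def missing_files(output_dir: Path, review_cycles: int) -> List[str]:
    """Return the expected files that are not present in output_dir."""
    return [name for name in expected_files(review_cycles)
            if not (output_dir / name).exists()]
```

For example, `missing_files(Path(final_path).parent, 2)` would list anything absent after a two-cycle run, assuming `final_path` is the path returned by `generate_blog_post`.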

#### Google Colab Security Features
- **Secure API Storage**: API keys stored in encrypted Colab secrets
- **No Hardcoded Keys**: No sensitive information in notebook code
- **Access Control**: Secrets only accessible when explicitly enabled
- **Session Isolation**: Keys loaded into a runtime's memory do not survive a runtime restart; they are re-read from Colab secrets on the next run

#### Enhanced Acceptance Criteria Met
- AC-1: User can run workflow with topic and style guide only
- AC-2: All four agents execute as defined in requirements
- AC-3: Iterative review loop with draft/review saving
- AC-4: Proper termination after approval or 3 cycles
- AC-5: Correct file generation in uniquely named folders
- AC-6: Coherent, stylistically appropriate, factually consistent output
- **NEW**: Secure integration with Google Colab secrets management
- **NEW**: COMPREHENSIVE fact-checking with real-time verification
- **NEW**: AI-powered requirement analysis and categorization
- **NEW**: Independent editor personas with specialized criteria

#### CRITICAL SAFEGUARDS
- **Factual Accuracy**: Non-negotiable requirement - articles with factual errors are rejected
- **Evidence-Based**: All claims must be verified against real web sources
- **Transparency**: Detailed fact-check reports show verification status of each claim
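- **Reliability Assurance**: Only high/medium reliability articles are approved; if no draft is approved within 3 cycles, the last draft is still saved as `final_blog.md` and its reliability rating is reported in the summary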

**Ready for Google Colab with COMPREHENSIVE FACT-CHECKING! Simply add your API key to secrets and run all cells.**
