# Structured Report Generation with Ollama and Python

This notebook shows how to build a structured report generator using plain Python and the Ollama API. Instead of relying on a framework such as LangGraph, we implement each component from scratch, which makes the system easier to understand and customize.

## What We'll Build

We'll create an AI-powered report generator that:
- Uses local language models via Ollama
- Performs web searches to gather information
- Generates structured reports with consistent formatting
- Implements tool calling and routing without complex frameworks

## Why Local Models?

- **Privacy**: Your data stays on your machine
- **Cost**: No API fees or usage limits
- **Control**: Full control over the model and generation process
- **Learning**: Better understanding of how these systems work

## Setup and Installation

First, let's install the required packages:

In [None]:
# Install required packages
!pip install ollama pydantic duckduckgo-search

Make sure you have Ollama installed and running on your system. If not, download it from https://ollama.com/download

Pull the model we'll use:

In [None]:
# Pull the model (this will take a few minutes the first time)
!ollama pull gemma3n

In [1]:
# Import required libraries
import json
import ollama
from typing import List, Dict, Any, Optional
from pydantic import BaseModel, Field
from duckduckgo_search import DDGS
from datetime import datetime
import time

## Part 1: Structured Output with Pydantic

One of the key features we need is the ability to get structured, consistent outputs from our language model. We'll use Pydantic to define schemas and ensure the model returns data in the format we expect.

In [2]:
# Define the structure for a report section
class ReportSection(BaseModel):
    title: str = Field(description="The title of this section")
    content: str = Field(description="The main content of this section")
    key_points: List[str] = Field(description="Key takeaways from this section")
    sources: List[str] = Field(description="Sources used for this section", default_factory=list)

# Define the overall report structure
class StructuredReport(BaseModel):
    title: str = Field(description="The main title of the report")
    executive_summary: str = Field(description="A brief executive summary")
    sections: List[ReportSection] = Field(description="The main sections of the report")
    conclusion: str = Field(description="The conclusion of the report")
    generated_at: str = Field(default_factory=lambda: datetime.now().isoformat())

# Example of how to use structured output with Ollama
def generate_structured_output(prompt: str, schema: type[BaseModel]) -> BaseModel:
    """
    Generate structured output using Ollama with a Pydantic schema
    """
    response = ollama.chat(
        model='gemma3n',
        messages=[{
            'role': 'user',
            'content': prompt
        }],
        format=schema.model_json_schema()
    )
    
    # Parse the response
    return schema.model_validate_json(response['message']['content'])
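
To see what the `format=` argument receives and what validation does, here is a minimal sketch that runs without calling the model. `MiniSection` is a hypothetical, trimmed-down stand-in for `ReportSection`, and the JSON strings are hand-written examples rather than real model output.

```python
from typing import List
from pydantic import BaseModel, Field, ValidationError

class MiniSection(BaseModel):
    # Hypothetical stand-in for ReportSection, kept small for illustration
    title: str = Field(description="The title of this section")
    key_points: List[str] = Field(description="Key takeaways from this section")
    sources: List[str] = Field(default_factory=list)

# This JSON Schema dict is the kind of value passed as `format=` to ollama.chat
schema = MiniSection.model_json_schema()
print(sorted(schema["properties"]))
print(schema["required"])  # fields without a default

# A well-formed reply parses into a typed object; the optional field gets its default
good_reply = '{"title": "Solar", "key_points": ["cheap", "clean"]}'
section = MiniSection.model_validate_json(good_reply)
print(section.title, section.sources)

# A reply that omits a required field is rejected instead of passing silently
bad_reply = '{"title": "Solar"}'
try:
    MiniSection.model_validate_json(bad_reply)
    rejected = False
except ValidationError as err:
    rejected = True
    print(f"Rejected with {err.error_count()} validation error(s)")
```

Passing the schema constrains what the model emits, while `model_validate_json` is the check that the reply really matches before the rest of the pipeline relies on it.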

In [3]:
# Test structured output generation
test_section = generate_structured_output(
    "Create a report section about the benefits of renewable energy",
    ReportSection
)
print(f"Title: {test_section.title}")
print(f"Content: {test_section.content[:200]}...")
print(f"Key Points: {test_section.key_points}")

Title: Benefits of Renewable Energy: A Comprehensive Overview
Content: ## Benefits of Renewable Energy: A Comprehensive Overview

Transitioning to renewable energy sources offers a multitude of benefits, spanning environmental, economic, and social dimensions. This secti...
Key Points: ['Reduced greenhouse gas emissions and improved air quality.', 'Job creation, energy independence, and price stability.', 'Improved public health, energy access, and community empowerment.']


In [6]:
from IPython.display import display, Markdown

display(Markdown(f"### {test_section.title}\n\n{test_section.content}\n\n**Key Points:**\n" + "\n".join(f"- {kp}" for kp in test_section.key_points)))

### Benefits of Renewable Energy: A Comprehensive Overview

## Benefits of Renewable Energy: A Comprehensive Overview

Transitioning to renewable energy sources offers a multitude of benefits, spanning environmental, economic, and social dimensions. This section details these advantages, highlighting the compelling reasons for prioritizing renewable energy development and deployment.

**1. Environmental Advantages:**

*   **Reduced Greenhouse Gas Emissions:**  Renewable energy sources like solar, wind, hydro, and geothermal produce little to no greenhouse gas emissions during operation. This is crucial in mitigating climate change by reducing the concentration of heat-trapping gases in the atmosphere.  Compared to fossil fuels, renewables significantly lower the carbon footprint of electricity generation.
*   **Improved Air Quality:** Unlike fossil fuel power plants, renewable energy facilities do not release harmful air pollutants such as sulfur dioxide, nitrogen oxides, and particulate matter. This leads to cleaner air, reducing respiratory illnesses and improving public health.
*   **Water Conservation:** Many renewable energy technologies, particularly solar and wind, require significantly less water than traditional power plants that rely on water for cooling. This is especially important in water-stressed regions.
*   **Reduced Environmental Impact:** Renewable energy development can minimize habitat destruction and ecosystem disruption compared to fossil fuel extraction and transportation.  Careful planning and siting are essential to further minimize any potential impacts.

**2. Economic Advantages:**

*   **Job Creation:** The renewable energy sector is a rapidly growing industry, creating numerous jobs in manufacturing, installation, maintenance, and research & development.  This provides economic opportunities and stimulates local economies.
*   **Energy Independence & Security:**  Renewable energy sources are domestically available in most countries, reducing reliance on volatile global fossil fuel markets and enhancing energy security.  This shields economies from price fluctuations and geopolitical instability.
*   **Price Stability:**  Once a renewable energy facility is built, the fuel (sun, wind, water) is free. This leads to more stable and predictable electricity prices compared to fossil fuels, which are subject to market fluctuations.
*   **Economic Development:**  Renewable energy projects can attract investment, stimulate local economic activity, and create new business opportunities in rural areas.
*   **Reduced Healthcare Costs:** Improved air quality resulting from renewable energy adoption translates to lower healthcare costs associated with respiratory and cardiovascular diseases.

**3. Social Advantages:**

*   **Improved Public Health:** Cleaner air and water contribute to improved public health outcomes and a higher quality of life.
*   **Energy Access:** Renewable energy technologies, particularly off-grid solar systems, can provide electricity to remote communities that lack access to the traditional power grid, improving education, healthcare, and economic opportunities.
*   **Community Empowerment:**  Community-owned renewable energy projects can empower local communities, fostering economic development and promoting energy democracy.
*   **Resilience:**  Decentralized renewable energy systems, such as rooftop solar with battery storage, can enhance grid resilience and provide power during outages caused by natural disasters or other disruptions.

**Conclusion:**

The benefits of renewable energy are undeniable and far-reaching.  By embracing renewable energy, we can create a cleaner, more sustainable, and more equitable future for all.  Continued investment in research, development, and deployment of renewable energy technologies is essential to realizing these benefits and addressing the urgent challenges of climate change and energy security.

**Key Points:**
- Reduced greenhouse gas emissions and improved air quality.
- Job creation, energy independence, and price stability.
- Improved public health, energy access, and community empowerment.

## Part 2: Web Search Integration

To create comprehensive reports, we need to gather information from the web. We'll implement a simple web search function that our AI can use.

In [7]:
def web_search(query: str, max_results: int = 5) -> List[Dict[str, str]]:
    """
    Perform a web search using DuckDuckGo
    
    Args:
        query: The search query
        max_results: Maximum number of results to return
    
    Returns:
        List of search results, each with title, snippet, and url
    """
    try:
        results = DDGS().text(query, max_results=max_results)
        return [
            {
                "title": r.get("title", ""),
                "snippet": r.get("body", ""),
                "url": r.get("href", "")
            }
            for r in results
        ]
    except Exception as e:
        print(f"Search error: {e}")
        return []

# Test the web search function
search_results = web_search("artificial intelligence applications 2024")
for i, result in enumerate(search_results[:3]):
    print(f"\n{i+1}. {result['title']}")
    print(f"   {result['snippet'][:100]}...")
    print(f"   URL: {result['url']}")


1. ArtificialAiming
   www.ArtificialAiming.net - The best website for quality cheats for games like GTA, BattleField, Call...
   URL: https://www.artificialaiming.net/forum/index.php

2. HWID spoofer - ArtificialAiming
   Mar 13, 2015 · Hallo wollte mal frageb ob es HWID spoofer schon gibt bevor ich den Cheat Kaufe geht ...
   URL: https://www.artificialaiming.net/forum/german-speaking-forum/98858-hwid-spoofer.html

3. ArtificialAiming
   Anthem Feb 21, 2019 - 2:37 AM - by HelioS A first build of our new Anthem cheat is now available to ...
   URL: https://www.artificialaiming.net/forum/




## Part 3: Building a Simple Tool Calling System

Now let's implement a basic tool calling system that allows our LLM to use functions like web search.

In [8]:
# Define schemas for tool calling
class ToolCall(BaseModel):
    tool_name: str = Field(description="The name of the tool to call")
    arguments: Dict[str, Any] = Field(description="Arguments to pass to the tool")

class ToolResponse(BaseModel):
    tool_name: str
    result: Any
    success: bool
    error: Optional[str] = None

# Available tools
AVAILABLE_TOOLS = {
    "web_search": {
        "function": web_search,
        "description": "Search the web for information",
        "parameters": {
            "query": "The search query string",
            "max_results": "Maximum number of results (default: 5)"
        }
    }
}

def execute_tool(tool_call: ToolCall) -> ToolResponse:
    """
    Execute a tool call and return the result
    """
    try:
        if tool_call.tool_name not in AVAILABLE_TOOLS:
            return ToolResponse(
                tool_name=tool_call.tool_name,
                result=None,
                success=False,
                error=f"Unknown tool: {tool_call.tool_name}"
            )
        
        tool_func = AVAILABLE_TOOLS[tool_call.tool_name]["function"]
        result = tool_func(**tool_call.arguments)
        
        return ToolResponse(
            tool_name=tool_call.tool_name,
            result=result,
            success=True
        )
    except Exception as e:
        return ToolResponse(
            tool_name=tool_call.tool_name,
            result=None,
            success=False,
            error=str(e)
        )
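
`execute_tool` passes the model's arguments straight to the function, so a misspelled or missing argument only surfaces as a generic exception. A possible refinement is to check the arguments against the function signature first and return a precise message. The sketch below uses the standard `inspect` module; `check_arguments` and `fake_search` are hypothetical names, and the check assumes tools with plain named parameters (no `*args` or `**kwargs`).

```python
import inspect
from typing import Any, Callable, Dict, List

def check_arguments(func: Callable, arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the names line up with the signature."""
    params = inspect.signature(func).parameters
    problems = [f"unexpected argument: {name}" for name in arguments if name not in params]
    for name, param in params.items():
        # Parameters without a default must be supplied by the caller
        if param.default is inspect.Parameter.empty and name not in arguments:
            problems.append(f"missing required argument: {name}")
    return problems

def fake_search(query: str, max_results: int = 5):
    # Hypothetical stand-in with the same signature as web_search
    return []

ok = check_arguments(fake_search, {"query": "ollama"})
bad = check_arguments(fake_search, {"q": "ollama"})
print(ok)
print(bad)
```

A message such as "missing required argument: query" can be fed back to the model in a follow-up prompt, which tends to be more useful than a raw `TypeError` string.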

In [12]:
import re

def describe_tools() -> Dict[str, Any]:
    """Return a JSON-serializable view of AVAILABLE_TOOLS (without the function objects)."""
    return {
        name: {"description": spec["description"], "parameters": spec["parameters"]}
        for name, spec in AVAILABLE_TOOLS.items()
    }

def ask_model(prompt: str) -> str:
    """Send a single-turn prompt to the model and return the reply text."""
    response = ollama.chat(
        model='gemma3n',
        messages=[{'role': 'user', 'content': prompt}]
    )
    return response['message']['content']

def llm_with_tools(prompt: str, include_tool_results: bool = True) -> str:
    """
    Process a prompt that may require tool usage
    """
    # First, ask the LLM if it needs to use any tools.
    # Only the descriptions are sent: function objects are not JSON serializable.
    tool_check_prompt = f"""
    You have access to the following tools:
    {json.dumps(describe_tools(), indent=2)}

    User request: {prompt}

    Do you need to use any tools to answer this request?
    If yes, respond with a JSON object of the form
    {{"tool_name": "<name>", "arguments": {{...}}}}
    Otherwise respond with "NO_TOOLS_NEEDED".
    """

    response_text = ask_model(tool_check_prompt)

    # Check if tools are needed
    if "NO_TOOLS_NEEDED" in response_text:
        # Answer directly without tools
        return ask_model(prompt)

    # Try to parse and execute the tool call
    try:
        # Extract JSON from response
        json_match = re.search(r'\{.*\}', response_text, re.DOTALL)
        if json_match is None:
            raise ValueError("no JSON object found in model response")
        tool_call = ToolCall(**json.loads(json_match.group()))
        tool_result = execute_tool(tool_call)
    except Exception as e:
        print(f"Error parsing tool call: {e}")
        # Fall back to direct response
        return ask_model(prompt)

    if not include_tool_results:
        return json.dumps(tool_result.model_dump(), indent=2)

    # Generate final response with tool results
    final_prompt = f"""
    Original request: {prompt}

    Tool used: {tool_result.tool_name}
    Tool result: {json.dumps(tool_result.result, indent=2)}

    Please provide a comprehensive answer to the original request using the tool results.
    """
    return ask_model(final_prompt)

In [13]:
# Test the tool calling system
result = llm_with_tools("What are the latest developments in quantum computing in 2024?")
print(result[:500] + "...")

## Part 4: Report Generation Pipeline

Now let's build the main report generation pipeline that combines all our components.

In [None]:
class ReportGenerator:
    def __init__(self, model_name: str = 'gemma3n'):
        self.model_name = model_name
        
    def research_topic(self, topic: str, num_searches: int = 3) -> List[Dict[str, Any]]:
        """
        Research a topic by performing multiple web searches
        """
        # Generate search queries
        query_prompt = f"""
        Generate {num_searches} different search queries to research the topic: "{topic}"
        Make the queries specific and diverse to cover different aspects.
        Return as a JSON list of strings.
        """
        
        response = ollama.chat(
            model=self.model_name,
            messages=[{'role': 'user', 'content': query_prompt}]
        )
        
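        try:
            queries = json.loads(response['message']['content'])
            if not isinstance(queries, list):
                raise ValueError("expected a JSON list of queries")
        except (ValueError, TypeError):
            # Fallback to simple query generation
            # (json.JSONDecodeError is a subclass of ValueError)
            queries = [
                topic,
                f"{topic} latest developments",
                f"{topic} challenges and opportunities"
            ]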
        
        # Perform searches
        all_results = []
        for query in queries[:num_searches]:
            print(f"Searching for: {query}")
            results = web_search(query, max_results=3)
            all_results.extend(results)
            time.sleep(1)  # Be nice to the search API
        
        return all_results
    
    def generate_section(self, topic: str, section_title: str, research_data: List[Dict]) -> ReportSection:
        """
        Generate a specific section of the report
        """
        prompt = f"""
        Create a report section about: {section_title}
        Topic: {topic}
        
        Use the following research data:
        {json.dumps(research_data[:5], indent=2)}
        
        Make the content informative and well-structured.
        Include specific examples and data points from the research.
        """
        
        return generate_structured_output(prompt, ReportSection)
    
    def generate_report(self, topic: str, sections: Optional[List[str]] = None) -> StructuredReport:
        """
        Generate a complete structured report on a topic
        """
        print(f"Generating report on: {topic}")
        
        # Default sections if not provided
        if sections is None:
            sections = [
                "Overview and Current State",
                "Key Developments and Trends",
                "Challenges and Opportunities",
                "Future Outlook"
            ]
        
        # Research the topic
        print("\nResearching topic...")
        research_data = self.research_topic(topic)
        
        # Generate sections
        report_sections = []
        for section_title in sections:
            print(f"\nGenerating section: {section_title}")
            section = self.generate_section(topic, section_title, research_data)
            report_sections.append(section)
        
        # Generate executive summary and conclusion
        summary_prompt = f"""
        Based on the following report sections about "{topic}", 
        create an executive summary that highlights the key findings.
        
        Sections:
        {json.dumps([s.model_dump() for s in report_sections], indent=2)}
        
        Make it concise but comprehensive.
        """
        
        summary_response = ollama.chat(
            model=self.model_name,
            messages=[{'role': 'user', 'content': summary_prompt}]
        )
        
        conclusion_prompt = f"""
        Based on the following executive summary of a report about "{topic}", write a strong conclusion that:
        1. Summarizes the main findings
        2. Provides actionable insights
        3. Suggests areas for future research or attention

        Executive summary:
        {summary_response['message']['content']}
        """
        
        conclusion_response = ollama.chat(
            model=self.model_name,
            messages=[{'role': 'user', 'content': conclusion_prompt}]
        )
        
        # Create the final report
        report = StructuredReport(
            title=f"Comprehensive Report: {topic}",
            executive_summary=summary_response['message']['content'],
            sections=report_sections,
            conclusion=conclusion_response['message']['content']
        )
        
        return report
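
`research_topic` parses the model's reply with `json.loads`, which fails whenever the model wraps the list in a Markdown code fence or adds a sentence around it, so the fallback queries end up being used more often than intended. A more forgiving parser could try a few candidate substrings before giving up. This is a sketch, not a full JSON recovery tool: `extract_json` is a hypothetical helper, and the sample replies are hand-written.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # three backticks, the Markdown code-fence marker

def extract_json(text: str) -> Optional[Any]:
    """Try to pull a JSON value out of a model reply; return None if nothing parses."""
    candidates = [text.strip()]
    # Contents of a fenced block (optionally tagged "json"), if present
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1).strip())
    # Outermost [...] or {...} span as a last resort
    for opener, closer in (("[", "]"), ("{", "}")):
        start, end = text.find(opener), text.rfind(closer)
        if start != -1 and end > start:
            candidates.append(text[start:end + 1])
    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None

fenced_reply = "Here are the queries:\n" + FENCE + 'json\n["ai diagnostics", "ai regulation"]\n' + FENCE
chatty_reply = 'Sure! {"tool_name": "web_search", "arguments": {"query": "ollama"}} Hope that helps.'
print(extract_json(fenced_reply))
print(extract_json(chatty_reply))
print(extract_json("no json here"))
```

The same helper could replace the regular-expression search in `llm_with_tools`, since it handles both objects and lists.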

In [15]:
# Helper function to display report nicely
def display_report(report: StructuredReport):
    print(f"\n{'='*80}")
    print(f"\n{report.title}")
    print(f"\nGenerated: {report.generated_at}")
    print(f"\n{'='*80}")
    
    print(f"\n\nEXECUTIVE SUMMARY")
    print(f"\n{report.executive_summary}")
    
    for section in report.sections:
        print(f"\n\n{'-'*60}")
        print(f"\n{section.title.upper()}")
        print(f"\n{section.content}")
        
        if section.key_points:
            print(f"\n**Key Points:**")
            for point in section.key_points:
                print(f"  • {point}")
    
    print(f"\n\n{'-'*60}")
    print(f"\nCONCLUSION")
    print(f"\n{report.conclusion}")
    print(f"\n{'='*80}\n")

## Part 5: Example Applications

Let's use our report generator to create some example reports.

In [16]:
# Example 1: Technology Report
generator = ReportGenerator()
tech_report = generator.generate_report(
    "Artificial Intelligence in Healthcare 2024",
    sections=[
        "Current Applications in Healthcare",
        "Recent Breakthroughs and Innovations",
        "Regulatory and Ethical Considerations",
        "Market Analysis and Future Trends"
    ]
)

display_report(tech_report)

Generating report on: Artificial Intelligence in Healthcare 2024

Researching topic...



In [None]:
# Example 2: Market Research Report
market_report = generator.generate_report(
    "Electric Vehicle Market Analysis",
    sections=[
        "Market Size and Growth Projections",
        "Key Players and Competition",
        "Technology Trends and Innovations",
        "Investment Opportunities and Risks"
    ]
)

# Save to file
with open('ev_market_report.json', 'w', encoding='utf-8') as f:
    json.dump(market_report.model_dump(), f, indent=2)
print("Report saved to ev_market_report.json")

## Part 6: Advanced Features

Let's add some advanced features to make our report generator more robust.

In [None]:
# Retry logic for failed generations
def generate_with_retry(func, *args, max_retries=3, **kwargs):
    """
    Retry a function call if it fails
    """
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # Exponential backoff

# Section validation
def validate_section(section: ReportSection) -> bool:
    """
    Validate that a section meets quality standards
    """
    if len(section.content) < 100:
        print("Warning: Section content too short")
        return False
    
    if len(section.key_points) < 2:
        print("Warning: Too few key points")
        return False
    
    return True

# Enhanced report generator with validation
class EnhancedReportGenerator(ReportGenerator):
    def generate_section(self, topic: str, section_title: str, research_data: List[Dict]) -> ReportSection:
        """
        Generate a section with validation and retry logic
        """
        section = generate_with_retry(
            super().generate_section,
            max_retries=3,
            topic=topic,
            section_title=section_title,
            research_data=research_data
        )
        
        # Validate and regenerate if needed
        if not validate_section(section):
            print(f"Regenerating section: {section_title}")
            enhanced_prompt = f"""
            Create a DETAILED report section about: {section_title}
            Topic: {topic}
            
            Requirements:
            - Content must be at least 200 words
            - Include at least 3 key points
            - Use specific examples from the research data
            
            Research data:
            {json.dumps(research_data[:5], indent=2)}
            """
            section = generate_structured_output(enhanced_prompt, ReportSection)
        
        return section
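
Note that the regenerated section above is returned without being validated again. One way to close that gap is a small loop that keeps generating until the validator accepts the result or the attempts run out. The sketch below is independent of Ollama: `generate_until_valid` is a hypothetical helper, and `fake_generate` stands in for a model call.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")

def generate_until_valid(generate: Callable[[int], T],
                         is_valid: Callable[[T], bool],
                         max_attempts: int = 3) -> Optional[T]:
    """Call generate(attempt) until is_valid accepts the result.

    Returns the first valid result, or the last attempt if none pass
    (None only when max_attempts is 0).
    """
    result = None
    for attempt in range(max_attempts):
        result = generate(attempt)
        if is_valid(result):
            return result
    return result

# Stand-in for a model call: the first two drafts are too short
drafts = ["too short", "still short", "x" * 120]
calls: List[int] = []

def fake_generate(attempt: int) -> str:
    calls.append(attempt)
    return drafts[attempt]

final = generate_until_valid(fake_generate, lambda text: len(text) >= 100)
print(len(calls), len(final))
```

Passing the attempt number to `generate` lets the caller switch to a stricter prompt on later attempts, in the spirit of the `enhanced_prompt` used above.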

In [None]:
# Custom report templates
REPORT_TEMPLATES = {
    "technical": {
        "sections": [
            "Technical Overview",
            "Architecture and Implementation",
            "Performance Analysis",
            "Security Considerations",
            "Integration Guidelines"
        ]
    },
    "business": {
        "sections": [
            "Market Overview",
            "Competitive Analysis",
            "Revenue Models",
            "Risk Assessment",
            "Strategic Recommendations"
        ]
    },
    "research": {
        "sections": [
            "Literature Review",
            "Methodology",
            "Key Findings",
            "Discussion",
            "Future Research Directions"
        ]
    }
}

def generate_templated_report(topic: str, template: str = "technical"):
    """
    Generate a report using a predefined template
    """
    if template not in REPORT_TEMPLATES:
        raise ValueError(f"Unknown template: {template}")
    
    generator = EnhancedReportGenerator()
    sections = REPORT_TEMPLATES[template]["sections"]
    
    return generator.generate_report(topic, sections)

In [None]:
# Example: Generate a technical report
technical_report = generate_templated_report(
    "Quantum Computing Applications",
    template="technical"
)

# Export to markdown
def export_to_markdown(report: StructuredReport, filename: str):
    """
    Export report to markdown format
    """
    with open(filename, 'w', encoding='utf-8') as f:
        f.write(f"# {report.title}\n\n")
        f.write(f"*Generated: {report.generated_at}*\n\n")
        f.write(f"## Executive Summary\n\n{report.executive_summary}\n\n")
        
        for section in report.sections:
            f.write(f"## {section.title}\n\n")
            f.write(f"{section.content}\n\n")
            
            if section.key_points:
                f.write("### Key Points\n\n")
                for point in section.key_points:
                    f.write(f"- {point}\n")
                f.write("\n")
        
        f.write(f"## Conclusion\n\n{report.conclusion}\n")

export_to_markdown(technical_report, "quantum_computing_report.md")
print("Report exported to quantum_computing_report.md")
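
The same traversal can target other formats. As a rough sketch, the function below renders a report to basic HTML using only the standard library. It takes a plain dict in the shape that `report.model_dump()` produces, `report_to_html` is a hypothetical name, and section content is escaped as plain text rather than rendered as Markdown.

```python
import html
from typing import Any, Dict

def report_to_html(report: Dict[str, Any]) -> str:
    """Render a report dict as a minimal HTML fragment, escaping all text."""
    parts = [
        f"<h1>{html.escape(report['title'])}</h1>",
        "<h2>Executive Summary</h2>",
        f"<p>{html.escape(report['executive_summary'])}</p>",
    ]
    for section in report["sections"]:
        parts.append(f"<h2>{html.escape(section['title'])}</h2>")
        parts.append(f"<p>{html.escape(section['content'])}</p>")
        if section.get("key_points"):
            items = "".join(f"<li>{html.escape(p)}</li>" for p in section["key_points"])
            parts.append(f"<ul>{items}</ul>")
    parts.append("<h2>Conclusion</h2>")
    parts.append(f"<p>{html.escape(report['conclusion'])}</p>")
    return "\n".join(parts)

# Hand-written sample report for illustration
sample_report = {
    "title": "Qubits & Gates",
    "executive_summary": "A short summary.",
    "sections": [
        {"title": "Overview", "content": "Error rates < 1%.", "key_points": ["First", "Second"]},
    ],
    "conclusion": "Done.",
}
page = report_to_html(sample_report)
print(page)
```

Escaping matters here because model output may contain characters such as `<` or `&` that would otherwise break the markup.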

## Conclusion and Next Steps

We've built a complete structured report generation system using basic Python and Ollama. Here's what we accomplished:

1. **Structured Output**: Used Pydantic schemas to ensure consistent report formatting
2. **Web Search Integration**: Added the ability to research topics using DuckDuckGo
3. **Tool Calling**: Implemented a simple but effective tool calling system
4. **Report Pipeline**: Created a complete pipeline for generating multi-section reports
5. **Advanced Features**: Added validation, retry logic, and templates

### Ideas for Extension:

- Add more tools (Wikipedia search, arxiv papers, news APIs)
- Implement citation tracking and source verification
- Add support for generating charts and visualizations
- Create a web interface for the report generator
- Add support for different output formats (PDF, HTML)
- Implement collaborative report generation with multiple LLMs
- Add fact-checking and consistency validation

### Key Takeaways:

- Local LLMs can be powerful tools for structured content generation
- You don't need complex frameworks to build useful AI applications
- Combining web search with LLMs can make content more current, provided the search results are actually relevant to the topic
- Structured output ensures consistency and makes integration easier

Happy report generating! 🚀