# Laboratory Session: LangChain with OllamaLLM - Document Analyzer

**Session Breakdown:**

1. Setup and Environment Preparation (15 minutes)
2. LangChain and OllamaLLM Fundamentals (30 minutes)
3. Practical Implementation - Document Analyzer (45 minutes)
4. Lab Wrap-Up and Discussion (15 minutes)

**Resources**

- [Ollama Official Documentation](https://ollama.ai/docs)
- [LangChain Documentation](https://python.langchain.com/)

# LangChain Workshop: Environment Setup

### 1. Prerequisites and System Check

#### Python and Environment Preparation

**Create and activate a virtual environment** (the VS Code assistant can do this for you)

```bash
python -m venv langchain_workshop # Create a virtual environment
source langchain_workshop/bin/activate # Activate the virtual environment (On Unix or MacOS)
```

In [1]:
# Verify Python Installation
!python --version

Python 3.8.10


### 2. Library Installation

#### Install Required Libraries

In [2]:
!pip install --upgrade pip



In [None]:
# Install core libraries
!pip install \
    langchain \
    ollama \
    pypdf \
    transformers \
    sentence-transformers \
    chromadb \
    unstructured \
    langchain_community

### 3. Ollama and Model Setup

#### Verify Ollama Installation

In [None]:
# Verify Ollama is running
import ollama

# List available models
print(ollama.list())

#### Download Ollama Model

In [None]:
# Pull a suitable open-source model
!ollama pull llama3.2

### 4. Environment Verification Script

In [2]:
import sys
import ollama
import langchain

def check_environment():
    """Perform comprehensive environment check."""
    print("--- Environment Check ---\n")
    
    # Python version
    print(f"Python Version: {sys.version}")
    
    # Library versions
    print(f"LangChain Version: {langchain.__version__}\n")
    
    # Verify model availability
    try:
        models = ollama.list()
        print("Ollama Models Available:\n")
        for model in models['models']:
            print(f"- {model['model']}")
    except Exception as e:
        print(f"Error checking Ollama models: {e}")

In [3]:
# Run the check
check_environment()

--- Environment Check ---

Python Version: 3.8.10 (default, Nov  7 2024, 13:10:47) 
[GCC 9.4.0]
LangChain Version: 0.2.17

Ollama Models Available:

- llama3.2:latest


### 5. Troubleshooting Guide

In [4]:
def troubleshoot_ollama():
    """Display Ollama troubleshooting guidelines."""
    print("\n--- Ollama Troubleshooting Guide ---")
    print("Common Issues and Solutions:")
    
    issues = {
        "Installation": [
            "Ensure Ollama is installed",
            "Check system PATH",
            "Verify administrative permissions"
        ],
        "Connection": [
            "Check internet connectivity",
            "Verify firewall settings",
            "Restart Ollama service"
        ],
        "Model Download": [
            "Sufficient disk space",
            "Stable internet connection",
            "Use `ollama pull` command"
        ]
    }
    
    for category, solutions in issues.items():
        print(f"\n{category} Troubleshooting:")
        for solution in solutions:
            print(f"- {solution}")

# Display troubleshooting guide
troubleshoot_ollama()


--- Ollama Troubleshooting Guide ---
Common Issues and Solutions:

Installation Troubleshooting:
- Ensure Ollama is installed
- Check system PATH
- Verify administrative permissions

Connection Troubleshooting:
- Check internet connectivity
- Verify firewall settings
- Restart Ollama service

Model Download Troubleshooting:
- Sufficient disk space
- Stable internet connection
- Use `ollama pull` command


# LangChain and Large Language Models - Theoretical Foundation

## Table of Contents
1. Introduction to LangChain
2. Understanding Large Language Models
3. Core Components of LangChain
4. Working with Local LLMs
5. Design Patterns and Best Practices

## Introduction to LangChain

### What is LangChain?

LangChain is a framework for building applications on top of Large Language Models (LLMs). At its core, it is an orchestration layer that connects prompts, models, memory, and external data sources into one application pipeline.

#### Core Concepts

##### 1. Component Architecture
LangChain provides modular building blocks that you can combine:
- **Prompt Templates**: Design and standardize your interactions with LLMs
- **Model Interfaces**: Connect to various LLMs (like GPT, Ollama, etc.)
- **Memory Systems**: Manage conversation history and context
- **Data Connectors**: Interface with external data sources
- **Chains**: Combine components into processing pipelines

##### 2. Key Features

###### Chain Management
- Create sequences of operations
- Pass data between components
- Handle errors and retries
- Manage state and context

###### Data Handling
- Process various data formats
- Transform inputs and outputs
- Cache responses
- Handle streaming

###### Integration Capabilities
- Connect to multiple LLM providers
- Interface with databases
- Work with document formats
- Support vector stores

<img src="imgs/key-features-chart.svg" width="1000" style="display: block; margin: 0 auto" >

#### Real-World Applications

LangChain enables you to build:
1. **Question-Answering Systems**
   - Process documents
   - Generate accurate responses
   - Maintain context

2. **Chatbots**
   - Handle conversations
   - Remember user interactions
   - Process complex queries

3. **Document Analysis**
   - Extract information
   - Generate summaries
   - Answer specific questions

4. **Code Analysis**
   - Process source code
   - Generate documentation
   - Explain functionality

#### Benefits

##### For Developers
- Reduced development time
- Standardized interfaces
- Built-in best practices
- Extensive documentation

##### For Applications
- Better scalability
- Improved reliability
- Consistent behavior
- Enhanced performance

This comprehensive framework simplifies complex LLM implementations while providing the flexibility needed for sophisticated applications.

## Understanding Large Language Models

### What are Large Language Models?

Large Language Models (LLMs) are sophisticated artificial intelligence systems that can understand, generate, and manipulate human language. Think of them as incredibly advanced pattern recognition systems that have been trained on massive amounts of text from books, websites, and other written materials.

### How do LLMs Work?

#### The Training Process
1. Data Collection
   LLMs start by "reading" enormous amounts of text data – billions of words from sources like books, websites, articles, and social media. This gives them exposure to how humans use language in various contexts.

2. Pattern Recognition
   During training, LLMs learn to recognize patterns in how words and phrases are used together. They develop an understanding of grammar, context, and common relationships between concepts.

3. Neural Networks
   The "brain" of an LLM is a neural network – a complex system of interconnected nodes inspired by how human brains work. These networks learn to process language by adjusting the strength of connections between nodes based on the patterns they observe.
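
As a deliberately tiny illustration of the pattern-recognition idea (not how a real LLM is built, which uses a neural network trained on billions of words), the sketch below counts which word follows which in a toy corpus and predicts the most frequent follower. The corpus and the function names are made up for this example.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequently observed follower of `word`, or None."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

corpus = "the invoice is paid the invoice is overdue the invoice is paid"
model = train_bigrams(corpus)
print(predict_next(model, "is"))       # paid (seen twice, versus "overdue" once)
print(predict_next(model, "unknown"))  # None (never seen in the corpus)
```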

<img src="imgs/LLM-training-process.png" width="1000" style="display: block; margin: 0 auto">

### Key Characteristics

#### 1. Understanding Context
LLMs don't just look at individual words; they consider the entire context of a conversation or text. This allows them to:
- Understand nuanced meanings
- Recognize subtle differences in similar phrases
- Maintain coherent conversations across multiple exchanges

#### 2. Generation Capabilities
These models can:
- Write text that sounds natural and human-like
- Adapt their writing style to different formats (emails, stories, technical documents)
- Translate between languages
- Summarize long texts into shorter versions

#### 3. Knowledge Base
Through their training, LLMs develop a broad knowledge base that includes:
- Facts about the world
- Common sense understanding
- Basic reasoning abilities
- Cultural references and contexts

### What LLMs Can't Do

- They don't truly "understand" in the way humans do
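- They don't learn from conversations in real time: model weights stay fixed, and earlier messages are available only while they remain in the context window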
- They don't have genuine emotions or consciousness
- They can sometimes make mistakes or generate incorrect information

### Types of LLMs
1. **Cloud-based Models**
   - OpenAI GPT Series
   - Anthropic Claude
   - Google PaLM

2. **Local Models**
   - Ollama Models
   - Llama 2
   - GPT4All
   - LocalAI

### Comparing Local vs. Cloud LLMs

| Aspect | Local LLMs | Cloud LLMs |
|--------|------------|------------|
| Privacy | High | Depends on Provider |
| Cost | One-time/Free | Pay-per-use |
| Latency | Hardware Dependent | Network Dependent |
| Setup Complexity | Higher | Lower |
| Customization | More Flexible | Limited |

## Core Components of LangChain

### 1. Chains

Chains are sequences of operations that:
- Process inputs systematically
- Combine multiple components
- Manage state and memory
- Handle errors and edge cases

Example Chain Structure:
```text
input → Prompt Template → LLM → Output Parser → final output
```

**Basic Chain Syntax**
```python
chain = prompt | self.llm
response = chain.invoke({
    'context': context,
    'question': question
})
```

#### Step-by-Step Breakdown

##### 1. Chain Creation (`chain = prompt | self.llm`)
- The `|` (pipe) operator creates a sequential chain
- Purpose: "Take the output of the prompt template and feed it into the LLM"
- Similar to Unix pipes: output of one command becomes input to another
- Under the hood process:
  1. Format prompt template with variables
  2. Send formatted prompt to LLM for processing
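
To make the pipe idea concrete, here is a minimal plain-Python sketch of how a `|` operator can compose steps. The `Step` class, `format_prompt`, and `fake_llm` are hypothetical stand-ins written for this illustration; they are not LangChain's actual classes, and no model is called.

```python
class Step:
    """Wrap a function so that steps can be composed with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

def format_prompt(variables):
    """Stand-in for a prompt template: fill the placeholders."""
    return "Question: {question}".format(**variables)

def fake_llm(prompt_text):
    """Stand-in for a model call: echo the prompt instead of calling an LLM."""
    return f"[model reply to] {prompt_text}"

chain = Step(format_prompt) | Step(fake_llm)
print(chain.invoke({"question": "What is the total?"}))
# [model reply to] Question: What is the total?
```

The design point is that each step only needs an `invoke` method, so any number of steps can be chained left to right.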

##### 2. Prompt Template Definition
```python
prompt = PromptTemplate.from_template(
    "Invoice Data:\n{context}\n\n"
    "Question: {question}\n\n"
    "Provide a detailed, data-driven answer."
)
```
- Template contains two variables: `{context}` and `{question}`
- Variables are filled with actual values during invocation
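
The substitution step can be previewed without a model. The sketch below uses Python's built-in `str.format`, which shares its placeholder syntax with LangChain's default f-string template format; the sample values are illustrative.

```python
template = (
    "Invoice Data:\n{context}\n\n"
    "Question: {question}\n\n"
    "Provide a detailed, data-driven answer."
)

# Values that would normally be passed to chain.invoke({...}).
values = {
    "context": "Invoice #1: $500, Invoice #2: $300",
    "question": "What's the total amount?",
}

filled = template.format(**values)
print(filled)

# A missing variable fails loudly instead of sending a broken prompt.
try:
    template.format(context="only context")
except KeyError as missing:
    print(f"Missing variable: {missing}")  # Missing variable: 'question'
```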

##### 3. LLM Configuration
```python
self.llm = Ollama(model='mistral', temperature=0.1)
```

##### 4. Chain Invocation
```python
response = chain.invoke({
    'context': context,
    'question': question
})
```
Process flow:
1. Takes input dictionary
2. Formats prompt template
3. Sends to Ollama
4. Returns response

#### Example Usage

```python
# Context and question
context = "Invoice #1: $500, Invoice #2: $300"
question = "What's the total amount?"

# Generated prompt
formatted_prompt = """
Invoice Data:
Invoice #1: $500, Invoice #2: $300

Question: What's the total amount?

Provide a detailed, data-driven answer.
"""

chain = prompt | self.llm
response = chain.invoke({'context': context, 'question': question})

# Example response (illustrative; actual model output will vary):
# "The total amount across the two invoices is $800, calculated by adding Invoice #1 ($500) and Invoice #2 ($300)."
```

### 2. Prompts

Prompts in LangChain are sophisticated templates that shape how LLMs understand and respond to requests. Think of them as smart instruction sets that combine fixed text with dynamic variables to generate consistent and targeted responses.

Prompts are structured templates that:
- Guide LLM behavior
- Include context and instructions
- Handle variable substitution
- Enforce output formats

#### Core Components of a Prompt

##### 1. Base Template
```python
from langchain.prompts import PromptTemplate

template = """
Given the following information about a customer:
Name: {customer_name}
Purchase History: {purchase_history}

Please provide a personalized product recommendation.
Focus on their buying patterns and potential needs.
"""
```

##### 2. Variable Placeholders
- Enclosed in curly braces: `{variable_name}`
- Dynamically filled at runtime
- Can include multiple variables
- Values are inserted as text, so serialize complex data structures (for example with `json.dumps`) before passing them in

##### 3. Instructions
```python
prompt = PromptTemplate.from_template(
    """
    Please analyze this financial data:
    {financial_data}
    
    Instructions:
    1. Identify key trends
    2. Calculate growth rates
    3. Flag any anomalies
    4. Provide actionable insights
    
    Format your response as bullet points.
    """
)
```

#### Types of Prompts

##### 1. Simple Prompts
```python
basic_prompt = PromptTemplate(
    template="What is the capital of {country}?",
    input_variables=["country"]
)
```

##### 2. Structured Prompts
```python
analysis_prompt = PromptTemplate(
    template="""
    Context: {context}
    Question: {question}
    
    Provide a detailed answer that:
    - References specific data points
    - Explains the reasoning
    - Includes relevant examples
    """,
    input_variables=["context", "question"]
)
```

##### 3. Few-Shot Prompts
```python
few_shot_prompt = PromptTemplate(
    template="""
    Example 1:
    Input: High fever, cough
    Output: Possible flu, seek medical attention
    
    Example 2:
    Input: Headache, fatigue
    Output: Could be stress, rest recommended
    
    New Case:
    Input: {symptoms}
    Output:
    """,
    input_variables=["symptoms"]
)
```

#### Output Control

##### 1. Format Specification
```python
formatted_prompt = PromptTemplate(
    template="""
    Analyze this sales data: {data}
    
    Return the analysis in the following JSON format:
    {{
        "total_sales": number,
        "top_products": [string],
        "growth_rate": percentage
    }}
    """,
    input_variables=["data"]
)
```
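
One detail to watch when a template contains literal JSON: with the default f-string-style format, single braces mark variables, so literal braces must be doubled. The sketch shows the rule with Python's `str.format` and `string.Formatter`, which use the same placeholder syntax; the template text is abbreviated for illustration.

```python
import string

template = (
    "Analyze this sales data: {data}\n"
    'Return JSON like: {{"total_sales": number}}'
)

# Only `data` is detected as a variable; the doubled braces are literal text.
fields = [name for _, name, _, _ in string.Formatter().parse(template) if name]
print(fields)  # ['data']

print(template.format(data="Q1: 100, Q2: 150"))
# Analyze this sales data: Q1: 100, Q2: 150
# Return JSON like: {"total_sales": number}
```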

##### 2. Response Guidelines
```python
guided_prompt = PromptTemplate(
    template="""
    Review this code: {code}
    
    Provide feedback in these categories:
    1. Security Issues
    2. Performance Optimizations
    3. Code Style
    4. Potential Bugs
    
    For each category, list specific findings and recommendations.
    """,
    input_variables=["code"]
)
```

#### Best Practices

##### 1. Clarity
- Use clear, specific instructions
- Break down complex tasks
- Provide examples when needed

##### 2. Structure
- Organize information logically
- Use formatting for readability
- Include section headers

##### 3. Variables
- Use descriptive variable names
- Validate input data
- Handle missing values

##### 4. Output Control
- Specify desired format
- Include validation criteria
- Set clear expectations
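
The advice on variables can be sketched as a small pre-flight check that runs before a prompt is formatted. `validate_inputs` is a hypothetical helper written for this illustration, not part of LangChain.

```python
import string

def validate_inputs(template, values, defaults=None):
    """Check that every placeholder has a non-empty value; apply defaults."""
    defaults = defaults or {}
    required = {
        name for _, name, _, _ in string.Formatter().parse(template) if name
    }
    merged = {**defaults, **values}
    missing = sorted(
        name for name in required
        if merged.get(name) is None or str(merged[name]).strip() == ""
    )
    if missing:
        raise ValueError(f"Missing prompt variables: {', '.join(missing)}")
    # Drop extras so unexpected keys never leak into the prompt.
    return {name: merged[name] for name in required}

template = "Context: {context}\nQuestion: {question}"
clean = validate_inputs(
    template,
    {"question": "What is overdue?"},
    defaults={"context": "No context provided."},
)
print(template.format(**clean))
```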

**Practical Example**
```python
invoice_analysis_prompt = PromptTemplate(
    template="""
    Analyze this invoice data:
    {invoice_data}
    
    Provide:
    1. Total revenue calculation
    2. Payment status summary
    3. Customer payment patterns
    4. Risk assessment
    
    Format as a business report with sections and bullet points.
    Include specific numbers and percentages.
    Flag any concerning patterns.
    """,
    input_variables=["invoice_data"]
)
```
**Remember**: A well-crafted prompt is the foundation of reliable and accurate LLM responses. Take time to design prompts that clearly communicate your requirements and constraints.

### 3. Memory

Memory systems in LangChain:
- Store conversation history
- Maintain context
- Handle token limitations
- Enable stateful interactions
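
A minimal sketch of the idea in plain Python: a buffer that stores conversation turns and drops the oldest ones once a size budget is exceeded. `ConversationBuffer` and its word-count budget are assumptions for illustration; real systems count model tokens, and LangChain ships its own memory classes with different interfaces.

```python
class ConversationBuffer:
    """Keep recent conversation turns within a rough size budget."""

    def __init__(self, max_words=50):
        self.max_words = max_words
        self.turns = []  # list of (role, text) tuples, oldest first

    def add(self, role, text):
        self.turns.append((role, text))
        # Drop the oldest turns until the history fits the budget again,
        # always keeping at least the most recent turn.
        while self._word_count() > self.max_words and len(self.turns) > 1:
            self.turns.pop(0)

    def _word_count(self):
        return sum(len(text.split()) for _, text in self.turns)

    def as_context(self):
        """Render the history as text that can be placed in a prompt."""
        return "\n".join(f"{role}: {text}" for role, text in self.turns)

memory = ConversationBuffer(max_words=14)
memory.add("user", "Which invoices are overdue?")
memory.add("assistant", "Six line items are marked Overdue.")
memory.add("user", "Who are the top customers by revenue?")
print(memory.as_context())  # the first user turn has been dropped
```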

### 4. Output Parsers

Output parsers:
- Structure LLM responses
- Validate outputs
- Transform data formats
- Handle errors
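
As a plain-Python sketch of what an output parser does, the function below pulls a JSON object out of a model reply, validates the expected keys, and raises a clear error otherwise. `parse_sales_reply` and the sample reply are hypothetical; LangChain's own parser classes are not used here.

```python
import json

def parse_sales_reply(reply, required_keys=("total_sales", "top_products")):
    """Extract and validate the first JSON object found in a model reply."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in the reply")
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"Reply contained invalid JSON: {exc}") from exc
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"Reply is missing keys: {missing}")
    return data

# Models often wrap JSON in extra prose, so the parser tolerates it.
reply = 'Here is the analysis: {"total_sales": 800, "top_products": ["License"]}'
print(parse_sales_reply(reply))
# {'total_sales': 800, 'top_products': ['License']}
```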

## Working with Local LLMs: Ollama

#### What is Ollama?
Ollama is an open-source tool that simplifies running Large Language Models (LLMs) locally on your machine. It provides:
- Easy model management
- Local execution without cloud dependencies
- Command-line and API interfaces
- Support for multiple models
- Custom model support
- API compatibility

#### How Ollama Works

##### 1. Model Management
```bash
# List available models
ollama list

# Pull a model
ollama pull mistral

# Remove a model
ollama rm mistral
```

##### 2. Model Storage
- Models are stored locally at:
  - macOS: `~/.ollama/models`
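  - Linux: `/usr/share/ollama/.ollama/models` when installed as a system service (`~/.ollama/models` when run under your own user)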
  - Windows: `C:\Users\[User]\.ollama\models`

##### 3. Available Models
Popular models include:
- Mistral
- Llama 2
- CodeLlama
- Phi-2
- Neural Chat

#### Using Ollama

##### 1. Command Line Interface
```bash
# Basic usage
ollama run mistral "What is the capital of France?"

# Interactive chat
ollama run mistral
```

##### 2. Python Integration
```python
import ollama

# Create a conversation
response = ollama.chat(model='mistral', 
    messages=[
        {
            'role': 'user',
            'content': 'What is the capital of France?'
        }
    ]
)

print(response['message']['content'])
```

##### 3. Model Parameters
```python
# Configure model behavior
# Sampling parameters are passed in the `options` dictionary
response = ollama.generate(
    model='mistral',
    prompt='Write a poem about AI',
    options={
        'temperature': 0.7,
        'top_k': 50,
        'top_p': 0.95,
        'num_predict': 100
    }
)
```

#### Integration with LangChain

```python
from langchain_community.llms import Ollama

# Initialize Ollama LLM
llm = Ollama(model="mistral")

# Simple generation
response = llm.invoke("Write a hello world program in Python")

# With parameters
llm = Ollama(
    model="mistral",
    temperature=0.1,
    num_ctx=4096
)
```

#### Troubleshooting

##### Common Issues
1. **Model Not Found**
```bash
ollama pull mistral  # Re-download model
```

2. **Service Not Running**
```bash
ollama serve  # Start service
```

3. **Memory Issues**
- Reduce context length
- Close unnecessary applications
- Consider smaller models

##### System Requirements
- Minimum 8GB RAM
- 4GB free disk space per model
- x86_64 or ARM64 processor

## Design Patterns and Best Practices

### 1. Prompt Engineering Patterns
- **Zero-shot Learning**: No examples needed
- **Few-shot Learning**: Include examples in prompt
- **Chain-of-Thought**: Break down complex reasoning
- **Self-Consistency**: Multiple passes for verification
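
Self-consistency, for instance, can be sketched as sampling several answers and keeping the most common one. `ask_model` is a hypothetical callable standing in for a chain invocation with a non-zero temperature; here it is replaced by canned replies so the example runs offline.

```python
from collections import Counter

def self_consistent_answer(ask_model, question, samples=5):
    """Ask the same question several times and return the majority answer."""
    answers = [ask_model(question).strip().lower() for _ in range(samples)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / samples

# Canned replies standing in for repeated, slightly varying model outputs.
canned = iter(["$800", "$800", "$850", "$800 ", "$800"])
answer, agreement = self_consistent_answer(lambda q: next(canned), "Total?")
print(answer, agreement)  # $800 0.8
```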

### 2. Error Handling
```python
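from langchain_core.exceptions import OutputParserException

try:
    response = chain.invoke(input_data)
except OutputParserException as exc:
    # Handle output parsing errors
    print(f"Could not parse the model output: {exc}")
except Exception as exc:
    # Handle model and chain execution errors. The exact exception types
    # depend on the model integration in use (for example, connection
    # errors when the Ollama service is not running).
    print(f"Chain execution failed: {exc}")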
```

### 3. Performance Optimization
- Cache frequently used results
- Batch similar requests
- Implement retry mechanisms
- Monitor token usage
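
Two of these ideas, caching and retries, can be sketched with the standard library alone. `call_model` is a hypothetical placeholder for a real chain or model call, and the simulated failure exists only to exercise the retry path.

```python
import time
from functools import lru_cache

def with_retries(func, attempts=3, base_delay=0.01):
    """Call func(); on failure wait with exponential backoff and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))

calls = {"count": 0}

def call_model(prompt):
    """Placeholder model call that fails once, then succeeds."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated transient failure")
    return f"answer to: {prompt}"

@lru_cache(maxsize=128)
def cached_answer(prompt):
    """Identical prompts are answered from the cache after the first call."""
    return with_retries(lambda: call_model(prompt))

print(cached_answer("What is the total revenue?"))
print(cached_answer("What is the total revenue?"))  # served from the cache
print(calls["count"])  # 2: one failed attempt plus one successful call
```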

### 4. Security Considerations
1. Input Validation
2. Output Sanitization
3. Rate Limiting
4. Access Control

## Further Reading

1. LangChain Documentation
   - [Python Documentation](https://python.langchain.com/)
   - [JavaScript Documentation](https://js.langchain.com/)

2. Ollama Resources
   - [Official Documentation](https://ollama.ai/docs)
   - [Model Library](https://ollama.ai/library)

3. Related Topics
   - Prompt Engineering
   - Vector Databases
   - Embeddings
   - RAG (Retrieval Augmented Generation)

# LangChain Invoice Analyzer Exercise

## Objective
Create an invoice analysis tool that can:
- Load JSON invoice data
- Perform financial analytics
- Generate insights
- Answer specific questions about invoices


## Sample Invoice JSON Structure
```json
[
    {
        "invoice_id": "INV-2024-001",
        "customer_name": "TechCorp Solutions",
        "date": "2024-01-15",
        "total_amount": 5750.25,
        "items": [
            {"name": "Software License", "quantity": 10, "unit_price": 450.00},
            {"name": "Cloud Services", "quantity": 5, "unit_price": 250.50}
        ],
        "payment_status": "Paid",
        "tax_rate": 0.18
    },
    {
        ...
    }
]
```

## Task 1: Implement AI-Powered Financial Insights Generator

In the `AdvancedInvoiceAnalyzer` class, create a method that generates AI-powered insights from invoice data using LangChain and Ollama. The method should compile financial data and use an LLM to provide business analysis.

### Method Structure
```python
def ai_powered_insights(self):
    """Your implementation here"""
```

### The method should:
1. Combine data from three existing methods:
   - `financial_summary()`
   - `generate_tax_analysis()`
   - `item_level_analysis()`

2. Create a structured context string containing:
   - Financial Overview
   - Tax Analysis
   - Top Items Analysis

3. Use LangChain's PromptTemplate to create an analysis prompt

4. Return AI-generated insights using the LLM

### Expected Format of Context String
```text
Financial Overview:
- Total Invoices: [number]
- Total Revenue: $[amount]
- Payment Status: [breakdown]

Tax Analysis:
- Total Tax Collected: $[amount]
- Average Tax Rate: [percentage]

Top Items:
[item analysis data]
```

### Steps to Complete
1. Collect data from existing analysis methods
2. Format the context string with f-strings
3. Create a PromptTemplate for analysis
4. Set up LangChain chain
5. Return the insights

## Task 2: Implement Interactive Query Handler

### Your Task
Create a method that allows users to ask questions about invoice data and receive AI-generated answers. The method should take a user question, combine it with invoice data context, and use an LLM to generate a response.

### Method Structure
```python
def interactive_query(self, question):
    """Your implementation here"""
```

### Method Parameters
- `question`: String containing the user's query about the invoices

### The method should:
1. Create a prompt template with context and question
2. Generate a response using the LLM

### Expected Prompt Format
```text
Invoice Data:
[dataframe contents]

Question: [user's question]

Provide a detailed, data-driven answer. 
If the question cannot be directly answered, explain why.
```

### Steps to Complete
1. Create PromptTemplate with two variables:
   - `context`
   - `question`
2. Set up LangChain chain
3. Return the answer

### Sample Usage
```python
# Example usage
analyzer = AdvancedInvoiceAnalyzer(data, model='llama3.2')
response = analyzer.interactive_query("What is the total revenue?")
```

## Your implementation

In [None]:
!pip install pandas

In [6]:
# Import Required Libraries
import json
import pandas as pd
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate

In [None]:
class AdvancedInvoiceAnalyzer:
    def __init__(self, invoices_data, model):
        """
        Initialize the invoice analyzer with comprehensive data processing.
        
        :param invoices_data: List of invoice dictionaries
        :param model: Name of the model to use
        """
        # Convert to DataFrame with expanded item details
        self.df = self._prepare_dataframe(invoices_data)
        
        # Setup Ollama Language Model
        self.llm = Ollama(model=model, temperature=0.1)
    
    def _prepare_dataframe(self, invoices_data):
        """
        Prepare a comprehensive DataFrame with expanded invoice details.
        
        :param invoices_data: List of invoice dictionaries
        :return: Processed pandas DataFrame
        """
        # Flatten the invoices to include item-level details
        flattened_invoices = []
        for invoice in invoices_data:
            base_invoice = invoice.copy()
            for item in invoice['items']:
                invoice_item = base_invoice.copy()
                invoice_item.update(item)
                invoice_item['item_total'] = item['quantity'] * item['unit_price']
                flattened_invoices.append(invoice_item)
        
        return pd.DataFrame(flattened_invoices)
    
    def financial_summary(self):
        """
        Generate comprehensive financial summary.
        
        :return: Dictionary of financial insights
        """
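        # Note: self.df holds one row per line item, so invoice-level columns
        # (total_amount, payment_status) repeat for invoices with several items.
        summary = {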
            'total_invoices': len(self.df['invoice_id'].unique()),
            'total_revenue': self.df['total_amount'].sum(),
            'average_invoice_value': self.df['total_amount'].mean(),
            'payment_status_breakdown': self.df['payment_status'].value_counts().to_dict(),
            'top_customers': (self.df.groupby('customer_name')['total_amount']
                               .sum()
                               .nlargest(3)
                               .to_dict())
        }
        return summary
    
    def generate_tax_analysis(self):
        """
        Analyze tax implications across invoices.
        
        :return: Detailed tax analysis
        """
        tax_summary = {
            'total_tax_collected': (self.df['total_amount'] * self.df['tax_rate']).sum(),
            'average_tax_rate': self.df['tax_rate'].mean(),
            'tax_rate_distribution': self.df['tax_rate'].value_counts().to_dict()
        }
        return tax_summary
    
    def item_level_analysis(self):
        """
        Perform detailed analysis of invoice items.
        
        :return: Comprehensive item-level insights
        """
        item_summary = (self.df.groupby('name').agg({
            'quantity': 'sum',
            'item_total': 'sum',
            'unit_price': 'mean'
        }).sort_values('item_total', ascending=False))
        
        return item_summary
    
    def ai_powered_insights(self):
        """
        Generate AI-powered natural language insights.
        
        :return: Conversational financial analysis
        """

        summary = self.financial_summary()
        tax = self.generate_tax_analysis()
        item_level_analysis = self.item_level_analysis()
        
        # Prepare comprehensive context
        context = f"""
        Financial Overview:
        - Total Invoices: {summary['total_invoices']}
        - Total Revenue: ${summary['total_revenue']}
        - Payment Status: {summary['payment_status_breakdown']}
        
        Tax Analysis:
        - Total Tax Collected: ${tax['total_tax_collected']}
        - Average Tax Rate: {tax['average_tax_rate']}
        
        Top Items:
        {item_level_analysis.to_string()}
        """
        
        # Create a prompt template for generating insights
        prompt = PromptTemplate.from_template(
            "Provide a detailed, data-driven answer. "
            "Focus on the financial aspects, potential risks and "
            "strategic opportunities:\n\n{context}")
        
        # Create a chain with the LLM
        chain = prompt | self.llm
        return chain.invoke({ 'context': context })
    
    def interactive_query(self, question):
        """
        Answer specific questions about the invoices.
        
        :param question: User's specific query
        :return: Analytical response
        """
        # Prepare context with all invoice details
        context = self.df.to_string()
        
        # Create a flexible query prompt
        prompt = PromptTemplate.from_template(
            "Invoice Data:\n{context}\n\n"
            "Question: {question}\n\n"
            "Provide a detailed, data-driven answer. "
            "If the question cannot be directly answered, explain why."
        )
        
        # Create a chain with the LLM
        chain = prompt | self.llm
        return chain.invoke({
            'context': context,
            'question': question
        })

## Test Your implementation

In [16]:
# Load invoices from JSON file
with open('data/invoice-data.json', 'r') as file:
    invoices = json.load(file)

# Initialize the analyzer
analyzer = AdvancedInvoiceAnalyzer(invoices, model='llama3.2')

In [17]:
# Financial Summary
financial_summary = analyzer.financial_summary()

print("Financial Summary:")
print(json.dumps(financial_summary, indent=2))

Financial Summary:
{
  "total_invoices": 13,
  "total_revenue": 200853.0,
  "average_invoice_value": 7725.115384615385,
  "payment_status_breakdown": {
    "Paid": 12,
    "Pending": 8,
    "Overdue": 6
  },
  "top_customers": {
    "Smart Solutions Group": 31561.0,
    "Tech Solutions Pro": 25500.0,
    "Software Experts Inc": 22900.0
  }
}
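
One thing worth noticing in this output: the payment-status counts add up to 26, although there are only 13 invoices. The analyzer's DataFrame holds one row per line item, so invoice-level fields such as `total_amount` and `payment_status` repeat for every item on an invoice. The plain-Python sketch below reproduces the effect on two made-up invoices and shows an invoice-level alternative; in pandas, the analogous step would be de-duplicating on `invoice_id` (for example with `drop_duplicates(subset='invoice_id')`) before aggregating.

```python
# Two hypothetical invoices, flattened to one row per line item
# (the same shape the analyzer's DataFrame has).
rows = [
    {"invoice_id": "A", "total_amount": 100.0, "payment_status": "Paid"},
    {"invoice_id": "A", "total_amount": 100.0, "payment_status": "Paid"},
    {"invoice_id": "B", "total_amount": 50.0, "payment_status": "Pending"},
]

# Summing over item rows counts invoice A twice.
row_level_revenue = sum(row["total_amount"] for row in rows)

# Keep one record per invoice before aggregating invoice-level fields.
invoices = {row["invoice_id"]: row for row in rows}
invoice_level_revenue = sum(inv["total_amount"] for inv in invoices.values())

status_counts = {}
for inv in invoices.values():
    status = inv["payment_status"]
    status_counts[status] = status_counts.get(status, 0) + 1

print(row_level_revenue)      # 250.0
print(invoice_level_revenue)  # 150.0
print(status_counts)          # {'Paid': 1, 'Pending': 1}
```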


In [18]:
# Tax Analysis
tax_analysis = analyzer.generate_tax_analysis()
print("Tax Analysis:")
print(json.dumps(tax_analysis, indent=2))

Tax Analysis:
{
  "total_tax_collected": 37378.84,
  "average_tax_rate": 0.18,
  "tax_rate_distribution": {
    "0.18": 6,
    "0.15": 6,
    "0.2": 4,
    "0.21": 2,
    "0.16": 2,
    "0.19": 2,
    "0.17": 2,
    "0.22": 2
  }
}


In [19]:
# Item-Level Analysis
item_level_analysis = analyzer.item_level_analysis()
print("Item-Level Analysis:")
print(item_level_analysis)

Item-Level Analysis:
                        quantity  item_total  unit_price
name                                                    
Software License              30    14500.00      475.00
AI Implementation              1    12000.00    12000.00
Analytics Dashboard            1     8000.00     8000.00
Custom Development            50     7500.00      150.00
Enterprise Package             3     7500.00     2500.00
Security Audit                 1     6500.00     6500.00
Project Management            25     5250.00      210.00
App Development                1     5000.00     5000.00
Server Maintenance            12     4800.00      400.00
Data Migration                 1     3780.50     3780.50
Cloud Storage Plus             5     3750.00      750.00
Implementation Support        15     3450.00      230.00
Consulting Hours              16     3200.00      200.00
Network Setup                  1     3000.00     3000.00
Website Hosting               10     2500.00      250.00
Firewall S

In [20]:
item_level_analysis

| name | quantity | item_total | unit_price |
|------|----------|------------|------------|
| Software License | 30 | 14500.0 | 475.0 |
| AI Implementation | 1 | 12000.0 | 12000.0 |
| Analytics Dashboard | 1 | 8000.0 | 8000.0 |
| Custom Development | 50 | 7500.0 | 150.0 |
| Enterprise Package | 3 | 7500.0 | 2500.0 |
| Security Audit | 1 | 6500.0 | 6500.0 |
| Project Management | 25 | 5250.0 | 210.0 |
| App Development | 1 | 5000.0 | 5000.0 |
| Server Maintenance | 12 | 4800.0 | 400.0 |
| Data Migration | 1 | 3780.5 | 3780.5 |


In [21]:
# AI-Powered Insights
ai_powered_insights = analyzer.ai_powered_insights()
print("\nAI-Powered Insights:")
print(ai_powered_insights)


AI-Powered Insights:
**Financial Analysis and Strategic Opportunities**

Based on the provided data, here's a detailed analysis of the financial aspects, potential risks, and strategic opportunities:

**Revenue Breakdown:**

1. **Software License**: $14500.00 (30 units) @ $475.00/unit = $14250.00
2. **AI Implementation**: $12000.00 (1 unit)
3. **Analytics Dashboard**: $8000.00 (1 unit)
4. **Custom Development**: $7500.00 (50 units) @ $150.00/unit = $7500.00
5. **Enterprise Package**: $7500.00 (3 units) @ $2500.00/unit = $7500.00
6. **Security Audit**: $6500.00 (1 unit)
7. **Project Management**: $5250.00 (25 units) @ $210.00/unit = $52625.00
8. **App Development**: $5000.00 (1 unit)
9. **Server Maintenance**: $4800.00 (12 units) @ $400.00/unit = $57600.00
10. **Data Migration**: $3780.50 (1 unit)
11. **Cloud Storage Plus**: $3750.00 (5 units) @ $750.00/unit = $18750.00
12. **Implementation Support**: $3450.00 (15 units) @ $230.00/unit = $52625.00
13. **Consulting Hours**: $3200.00 (16

In [23]:
# Interactive query
print("Interactive Query:")
question = input()
print("i'm thinking...")
answer = analyzer.interactive_query(question)
print(answer)

Interactive Query:
i'm thinking...
To determine the number of invoices, we can analyze the provided data.

The data consists of 25 rows, each representing an invoice with its corresponding details such as invoice ID, client name, date, total amount, and status.

Upon reviewing the data, it appears that there are indeed 25 separate invoices listed. Each row represents a unique invoice, and the number of rows directly corresponds to the number of invoices.

Therefore, based on this analysis, I can confidently conclude that you have **25** invoices.
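
The answer above illustrates why model output should be checked: it reports 25 invoices, while `financial_summary()` computed 13. A simple safeguard is to compare numbers in the reply against values computed in code. The sketch below is one hypothetical way to do that; `check_claim` is not part of the exercise code.

```python
import re

def check_claim(reply, expected, label):
    """Report whether the expected number appears among the numbers in a reply."""
    numbers = {
        float(n.replace(",", ""))
        for n in re.findall(r"\d[\d,]*\.?\d*", reply)
    }
    if float(expected) in numbers:
        return f"{label}: reply agrees with computed value {expected}"
    return f"{label}: computed {expected}, but reply mentions {sorted(numbers)}"

reply = "Therefore, I can confidently conclude that you have **25** invoices."
print(check_claim(reply, 13, "invoice count"))
# invoice count: computed 13, but reply mentions [25.0]
```

For counts and totals, computing the value with pandas and passing it to the model as context is generally more reliable than asking the model to count rows itself.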
