<a href="https://colab.research.google.com/github/agapemiteu/ManualAi/blob/main/notebooks/ManualAi_Full_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ManualAi: Full Analysis Report

**A Production RAG System Case Study**

## Executive Summary

This notebook contains the complete analysis of **ManualAi**, an AI-powered car manual assistant that achieved **76% accuracy** (correct page within ±2 pages) in production, outperforming the research prototype (64%) by 12 percentage points.

**Key Achievement**: Simpler approach (PyMuPDF only) beat complex setup (OCR + NLTK)

**Live Demo**: [manual-ai-psi.vercel.app](https://manual-ai-psi.vercel.app)  
**GitHub**: [github.com/agapemiteu/ManualAi](https://github.com/agapemiteu/ManualAi)

## Table of Contents

1. [Setup & Data Loading](#setup)
2. [Project Overview](#overview)
3. [Production Results](#results)
4. [Accuracy Analysis](#accuracy)
5. [Category Performance](#categories)
6. [Visualizations](#visualizations)
7. [Technical Deep Dive](#technical)
8. [Personal Insights](#insights)
9. [Lessons Learned](#lessons)
10. [Future Work](#future)

<a name="setup"></a>
## 1. Setup & Data Loading

In [1]:
# Install required packages
!pip install pandas matplotlib seaborn plotly requests -q

In [2]:
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import json
import requests
from IPython.display import Image, display, HTML

# Set style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

print("Libraries imported successfully!")

Libraries imported successfully!


In [3]:
# Load evaluation results from GitHub
GITHUB_RAW_URL = "https://raw.githubusercontent.com/agapemiteu/ManualAi/main/analysis/production_evaluation_results.json"

try:
    response = requests.get(GITHUB_RAW_URL, timeout=30)
    response.raise_for_status()  # Fail early on HTTP errors such as 404
    results = response.json()
    summary = results['summary']  # Raises KeyError if the expected schema is missing
    # Report success only after the expected keys have been found
    print("Data loaded successfully!")
    print("\nSummary:")
    print(f"   Total Questions: {summary['total']}")
    print(f"   Exact Match: {summary['exact_match']}")
    print(f"   Within ±2 pages: {summary['within_2_pages']}")
except (requests.RequestException, ValueError, KeyError) as e:
    print(f"Error loading data: {e}")
    print("\nFalling back to sample data...")
    # Hard-coded fallback summary, used when the download or schema check fails
    results = {
        "summary": {
            "total": 50,
            "exact_match": 21,
            "within_2_pages": 38,
            "within_5_pages": 38,
            "within_10_pages": 39
        },
        "questions": []
    }

Error loading data: 'summary'

Falling back to sample data...


<a name="overview"></a>
## 2. Project Overview

### The Problem

Car manuals are essential but notoriously user-unfriendly, making it stressful and inefficient to find urgent information like the meaning of a dashboard warning light.

To solve this, I developed ManualAi, an AI-powered chatbot that transforms these dense documents into an interactive resource. Users ask questions in plain English and receive simple, trustworthy answers that cite the relevant page number and include "why-it-matters" context.

The result is a tool that empowers drivers with instant, verifiable information, turning moments of stress into quick, confident resolutions and promoting a safer, more informed driving experience.

### System Architecture

```
User Query → Next.js Frontend → FastAPI Backend → ChromaDB → Groq LLM → Response
```

**Tech Stack:**
- Frontend: Next.js 14, TypeScript, Tailwind CSS
- Backend: FastAPI, Python 3.10
- Document Processing: PyMuPDF
- Embeddings: all-mpnet-base-v2
- Vector Store: ChromaDB
- LLM: Groq (llama-3.1-8b-instant)
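
The retrieval step in the flow above can be illustrated with a toy example. This is a minimal sketch, not the project's code: it swaps the all-mpnet-base-v2 embeddings and the ChromaDB index for a bag-of-words vector and a brute-force cosine search so that it runs with the standard library only. The chunk texts, page numbers, and function names (`embed`, `cosine`, `retrieve`) are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical manual chunks, each tagged with its source page
CHUNKS = [
    {"page": 12, "text": "The tire pressure warning light turns on when a tire is underinflated."},
    {"page": 47, "text": "Change the engine oil at the interval shown in the maintenance schedule."},
    {"page": 88, "text": "To pair a phone, open the Bluetooth menu on the audio system."},
]

def embed(text):
    """Stand-in for a real embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query (brute-force search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]

top = retrieve("What does the tire pressure light mean?", CHUNKS)[0]
print(f"Best match: page {top['page']}")
```

Keeping the page number next to each chunk is what lets an answer cite its source page; in a real system the retrieved text and page would then be passed to the LLM as context.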

<a name="results"></a>
## 3. Production Results Summary

In [4]:
# Display key metrics
summary = results['summary']

print("="*60)
print("           PRODUCTION EVALUATION RESULTS")
print("="*60)
print("\nOverall Performance:")
print(f"   Total Questions Tested: {summary['total']}")
print("\nAccuracy by Tolerance:")
print(f"   Exact Page Match:     {summary['exact_match']}/{summary['total']} = {summary['exact_match']/summary['total']*100:.1f}%")
print(f"   Within ±2 pages:      {summary['within_2_pages']}/{summary['total']} = {summary['within_2_pages']/summary['total']*100:.1f}%")
print(f"   Within ±5 pages:      {summary['within_5_pages']}/{summary['total']} = {summary['within_5_pages']/summary['total']*100:.1f}%")
print(f"   Within ±10 pages:     {summary['within_10_pages']}/{summary['total']} = {summary['within_10_pages']/summary['total']*100:.1f}%")
print(f"\nMain Achievement: {summary['within_2_pages']/summary['total']*100:.1f}% accuracy (±2 pages)")
print("="*60)

============================================================
           PRODUCTION EVALUATION RESULTS
============================================================

Overall Performance:
   Total Questions Tested: 50

Accuracy by Tolerance:
   Exact Page Match:     21/50 = 42.0%
   Within ±2 pages:      38/50 = 76.0%
   Within ±5 pages:      38/50 = 76.0%
   Within ±10 pages:     39/50 = 78.0%

Main Achievement: 76.0% accuracy (±2 pages)
============================================================


### 💡 What These Numbers Mean

<!-- YOUR INTERPRETATION HERE -->

Explain in your own words:
- What does 76% accuracy within ±2 pages mean in practice?
- Why is this better than exact match?
- How does this compare to your expectations?
- What does this mean for real users?
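
As a reference point for the questions above, the tolerance metric can be sketched as follows. The notebook does not show how the evaluation script scores a question, so this assumes a question counts as correct when the absolute difference between the retrieved page and the ground-truth page is at most the tolerance. The function name `accuracy_within` and the page numbers are hypothetical, not the real evaluation set.

```python
def accuracy_within(predicted, expected, tolerance):
    """Fraction of predictions within `tolerance` pages of the expected page."""
    hits = sum(abs(p - e) <= tolerance for p, e in zip(predicted, expected))
    return hits / len(expected)

# Hypothetical page numbers for five questions (not the real evaluation data)
predicted_pages = [12, 45, 101, 230, 78]
expected_pages = [12, 47, 101, 190, 75]

for tol in (0, 2, 5, 10):
    label = "exact" if tol == 0 else f"±{tol} pages"
    score = accuracy_within(predicted_pages, expected_pages, tol)
    print(f"{label:>10}: {score:.0%}")
```

Under this definition each wider tolerance includes every hit from the narrower ones, so the scores can only stay flat or rise as the tolerance grows, which matches the 42% → 76% → 76% → 78% progression reported above.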

<a name="accuracy"></a>
## 4. Accuracy Analysis

In [5]:
# Create accuracy comparison chart
tolerance_levels = ['Exact', '±2 pages', '±5 pages', '±10 pages']
accuracy_values = [
    summary['exact_match']/summary['total']*100,
    summary['within_2_pages']/summary['total']*100,
    summary['within_5_pages']/summary['total']*100,
    summary['within_10_pages']/summary['total']*100
]

fig = go.Figure(data=[
    go.Bar(
        x=tolerance_levels,
        y=accuracy_values,
        text=[f'{v:.1f}%' for v in accuracy_values],
        textposition='auto',
        marker_color=['#ef4444', '#22c55e', '#3b82f6', '#a855f7']
    )
])

fig.update_layout(
    title='Accuracy by Tolerance Level',
    xaxis_title='Tolerance',
    yaxis_title='Accuracy (%)',
    yaxis=dict(range=[0, 100]),
    showlegend=False,
    height=400
)

fig.show()

In [6]:
# Production vs Research Comparison
comparison_data = {
    'Version': ['Research\n(Complex)', 'Production\n(Simple)'],
    'Accuracy': [64, 76],
    'Approach': ['OCR + NLTK', 'PyMuPDF Only']
}

fig = go.Figure(data=[
    go.Bar(
        x=comparison_data['Version'],
        y=comparison_data['Accuracy'],
        text=[f"{v}%<br>{a}" for v, a in zip(comparison_data['Accuracy'], comparison_data['Approach'])],
        textposition='auto',
        marker_color=['#f59e0b', '#22c55e']
    )
])

fig.update_layout(
    title='Production vs Research: Simpler Won! 🎉',
    yaxis_title='Accuracy (% within ±2 pages)',
    yaxis=dict(range=[0, 100]),
    showlegend=False,
    height=400,
    annotations=[{
        'x': 1, 'y': 76,
        'text': '+12 percentage points!',
        'showarrow': True,
        'arrowhead': 2,
        'font': {'size': 14, 'color': 'green'}
    }]
)

fig.show()
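
The gap between the two bars can be stated in two ways, and the difference matters when quoting the result. The short calculation below uses the 64% and 76% figures from the chart; the variable names are illustrative only.

```python
research_acc = 64.0    # % within ±2 pages, research prototype
production_acc = 76.0  # % within ±2 pages, production system

# Absolute gain is measured in percentage points
absolute_gain = production_acc - research_acc
# Relative gain is measured against the research baseline
relative_gain = absolute_gain / research_acc * 100

print(f"Absolute gain: {absolute_gain:.0f} percentage points")
print(f"Relative gain: {relative_gain:.2f}% over the research baseline")
```

The improvement is 12 percentage points in absolute terms, or 18.75% relative to the research prototype's accuracy.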

### 🤔 Why Did Simpler Win?

<!-- YOUR ANALYSIS HERE -->

Discuss:
- What you expected vs what happened
- Possible reasons for better performance
- What this taught you about software engineering
- When to add complexity vs keep it simple

<a name="categories"></a>
## 5. Category Performance Analysis

In [7]:
# Category performance data
categories = {
    'System Knowledge': {'correct': 7, 'total': 8, 'accuracy': 87.5},
    'Advanced Systems': {'correct': 5, 'total': 6, 'accuracy': 83.3},
    'Maintenance': {'correct': 7, 'total': 9, 'accuracy': 77.8},
    'Safety': {'correct': 6, 'total': 8, 'accuracy': 75.0},
    'Troubleshooting': {'correct': 6, 'total': 8, 'accuracy': 75.0},
    'Miscellaneous': {'correct': 7, 'total': 11, 'accuracy': 63.6}
}

# Create DataFrame (orient='index' keeps the count columns as integers)
cat_df = pd.DataFrame.from_dict(categories, orient='index')
cat_df = cat_df.sort_values('accuracy', ascending=False)

# Display table
print("\n📊 Performance by Question Category:\n")
print(cat_df.to_string())

# Create bar chart
fig = px.bar(
    cat_df,
    y=cat_df.index,
    x='accuracy',
    orientation='h',
    text='accuracy',
    title='Accuracy by Question Category',
    labels={'accuracy': 'Accuracy (%)', 'index': 'Category'},
    color='accuracy',
    color_continuous_scale='RdYlGn'
)

fig.update_traces(texttemplate='%{text:.1f}%', textposition='outside')
fig.update_layout(height=400, showlegend=False)
fig.show()


📊 Performance by Question Category:

                  correct  total  accuracy
System Knowledge        7      8      87.5
Advanced Systems        5      6      83.3
Maintenance             7      9      77.8
Safety                  6      8      75.0
Troubleshooting         6      8      75.0
Miscellaneous           7     11      63.6


### 📊 Category Analysis

<!-- YOUR INSIGHTS HERE -->

For each category, explain:
- Why you think it performed well/poorly
- What makes these questions easier/harder
- Examples of questions that worked/failed
- How you might improve specific categories
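
Before interpreting individual categories, it helps to confirm that the category table is consistent with the headline figure. The sketch below copies the per-category counts from the table above and compares two ways of averaging them; the variable names are illustrative.

```python
# (correct, total) per category, copied from the table above
category_counts = {
    "System Knowledge": (7, 8),
    "Advanced Systems": (5, 6),
    "Maintenance": (7, 9),
    "Safety": (6, 8),
    "Troubleshooting": (6, 8),
    "Miscellaneous": (7, 11),
}

total_correct = sum(c for c, _ in category_counts.values())
total_questions = sum(t for _, t in category_counts.values())

# Micro average: every question weighs the same
micro_avg = total_correct / total_questions * 100
# Macro average: every category weighs the same, regardless of size
macro_avg = sum(c / t for c, t in category_counts.values()) / len(category_counts) * 100

print(f"Questions: {total_correct}/{total_questions}")
print(f"Micro average: {micro_avg:.1f}%")
print(f"Macro average: {macro_avg:.1f}%")
```

The counts add up to 38 of 50 questions, so the micro average reproduces the 76.0% result. The macro average comes out slightly higher (about 77.0%) because the weakest category, Miscellaneous, is also the largest. With only 6 to 11 questions per category, a single question shifts a category's score by roughly 9 to 17 percentage points, so small differences between categories should be read with caution.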

<a name="visualizations"></a>
## 6. Visualizations

In [8]:
# Display images from GitHub
base_url = "https://raw.githubusercontent.com/agapemiteu/ManualAi/main/analysis/"

images = [
    "improvement_journey.png",
    "performance_comparison.png",
    "tolerance_analysis.png",
    "error_distribution.png",
    "component_contribution.png",
    "latency_comparison.png"
]

for img_name in images:
    print(f"\n### {img_name.replace('_', ' ').replace('.png', '').title()}")
    display(Image(url=base_url + img_name, width=800))
    print(f"\n<!-- ADD YOUR INTERPRETATION OF THIS CHART HERE -->\n")


### Improvement Journey



<!-- ADD YOUR INTERPRETATION OF THIS CHART HERE -->


### Performance Comparison



<!-- ADD YOUR INTERPRETATION OF THIS CHART HERE -->


### Tolerance Analysis



<!-- ADD YOUR INTERPRETATION OF THIS CHART HERE -->


### Error Distribution



<!-- ADD YOUR INTERPRETATION OF THIS CHART HERE -->


### Component Contribution



<!-- ADD YOUR INTERPRETATION OF THIS CHART HERE -->


### Latency Comparison



<!-- ADD YOUR INTERPRETATION OF THIS CHART HERE -->



<a name="technical"></a>
## 7. Technical Deep Dive

### 🔧 Technology Choices

<!-- YOUR TECHNICAL DECISIONS -->

#### Why Next.js?
- Your reasons
- Trade-offs
- Alternatives considered

#### Why FastAPI?
- Your reasons
- Performance considerations
- Experience with it

#### Why PyMuPDF (and NOT OCR)?
- Initial assumption
- The pivot
- Lessons learned

#### Why Groq API?
- Cost considerations
- Speed requirements
- Alternatives

#### Why ChromaDB?
- Vector database options
- In-memory vs persistent
- Your choice

### 🏗️ RAG Pipeline Details

In [9]:
# Display pipeline stages
pipeline_stages = {
    'Stage': ['Document Upload', 'Text Extraction', 'Chunking', 'Embedding', 'Vector Store', 'Query', 'Retrieval', 'LLM Generation'],
    'Component': ['FastAPI', 'PyMuPDF', 'Custom Splitter', 'all-mpnet-base-v2', 'ChromaDB', 'User Input', 'HNSW Search', 'Groq API'],
    'Time': ['N/A', '~30s', '~5s', '~20s', '~10s', 'Instant', '~0.5s', '~2s']
}

df = pd.DataFrame(pipeline_stages)
display(HTML(df.to_html(index=False)))

| Stage | Component | Time |
|---|---|---|
| Document Upload | FastAPI | N/A |
| Text Extraction | PyMuPDF | ~30s |
| Chunking | Custom Splitter | ~5s |
| Embedding | all-mpnet-base-v2 | ~20s |
| Vector Store | ChromaDB | ~10s |
| Query | User Input | Instant |
| Retrieval | HNSW Search | ~0.5s |
| LLM Generation | Groq API | ~2s |
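
The timings in the table can be added up to give a rough latency budget. The split into one-time ingestion stages and per-query stages is an assumption made for this sketch (the table does not label the stages that way), and the "Instant" query step is treated as zero seconds.

```python
# Approximate timings in seconds, copied from the pipeline table
ingestion_stages = {
    "Text Extraction": 30,
    "Chunking": 5,
    "Embedding": 20,
    "Vector Store": 10,
}
query_stages = {
    "Retrieval": 0.5,
    "LLM Generation": 2,
}

ingestion_total = sum(ingestion_stages.values())
query_total = sum(query_stages.values())
llm_share = query_stages["LLM Generation"] / query_total * 100

print(f"One-time ingestion per manual: ~{ingestion_total}s")
print(f"Per-question latency: ~{query_total}s")
print(f"LLM share of per-question latency: {llm_share:.0f}%")
```

On these estimates, processing a manual takes about 65 seconds once, while each question costs about 2.5 seconds, of which LLM generation accounts for roughly 80%.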


### 🚀 Deployment Architecture

<!-- YOUR DEPLOYMENT STORY -->

#### Three-Platform Deployment
1. **Vercel** (Frontend)
   - Why?
   - Challenges?
   
2. **HuggingFace Spaces** (Backend)
   - Why?
   - Issues faced?
   
3. **GitHub Pages** (Docs)
   - Why?
   - Experience?

#### Would You Do It Again?
- Pros of multi-platform
- Cons of multi-platform
- What you'd change

<a name="insights"></a>
## 8. Personal Insights & Stories

### 💭 My Journey

<!-- YOUR PERSONAL STORY -->

#### Starting Point
Write about:
- Your skill level when you started
- What you knew vs didn't know
- Your initial plan

#### The Struggle
Share specific moments:
- The hardest bug
- When you wanted to give up
- Breakthrough moments

#### The Victory
- When it finally worked
- Seeing 76% for the first time
- Deploying to production

### 🎯 Most Challenging Moments

<!-- YOUR CHALLENGE STORIES -->

#### Challenge #1: [Your Biggest Problem]
- What went wrong
- How long it took
- How you solved it
- What you learned

#### Challenge #2: [Another Major Issue]
- The problem
- What you tried
- The solution
- The lesson

#### Challenge #3: [Unexpected Issue]
- What surprised you
- Your reaction
- The resolution
- Growth moment

<a name="lessons"></a>
## 9. Lessons Learned

### 📚 Technical Lessons

<!-- YOUR TECHNICAL LEARNINGS -->

1. **About RAG Systems**
   - What you learned
   - What surprised you
   
2. **About Deployment**
   - Dev vs production
   - What you'd do differently
   
3. **About Performance Optimization**
   - What worked
   - What didn't
   
4. **About Evaluation**
   - Importance of ground truth
   - Measuring what matters

### 🌱 Personal Growth

<!-- YOUR PERSONAL DEVELOPMENT -->

#### Before This Project
- What you thought about [specific concept]
- Your approach to [specific task]
- Your understanding of [specific technology]

#### After This Project
- How your thinking changed
- New approaches you learned
- Deeper understanding gained

#### Skills Developed
1. Technical:
2. Problem-solving:
3. Project management:
4. Communication:

---

## 🎓 Conclusion

<!-- YOUR FINAL THOUGHTS -->

### Key Takeaways
1.
2.
3.

### How This Changed Me

Write about:
- Your growth as a developer
- What you're most proud of
- How this fits into your career goals
- What's next for you

---

## 📚 Resources & Links

- **Live Demo**: [manual-ai-psi.vercel.app](https://manual-ai-psi.vercel.app)
- **GitHub**: [github.com/agapemiteu/ManualAi](https://github.com/agapemiteu/ManualAi)
- **Documentation**: [agapemiteu.github.io/ManualAi](https://agapemiteu.github.io/ManualAi)

---

**Made with ❤️ by Agape Miteu**  
*If you found this helpful, please ⭐ the repository!*