Skip to content

Sunil-KumarSharma/AutoReportAI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 

Repository files navigation

AutoReportAI: LLM-Powered Monthly Report Generator

Description

AutoReportAI is an AI-powered solution that automates monthly report generation. It collects data from diverse sources (CSV, SQL, APIs), processes and analyzes key performance indicators, and generates insightful summaries using advanced machine learning and LLM integrations. Seamlessly automating Excel report creation with embedded visualizations and secure offline capabilities, AutoReportAI empowers analysts and decision-makers to streamline their workflows and focus on strategic insights.

Key Highlights

  • Accepts data from CSV files, SQL queries, or APIs
  • Computes KPIs and detects anomalies
  • Generates action-oriented summaries using GPT or open-source LLMs
  • Outputs multi-sheet Excel reports with charts and summaries
  • Offers optional PDF exports and embedded image support
  • CLI-ready, configurable, and schedule-friendly

Problem Statement

Manual monthly reporting is often slow, repetitive, and error-prone for data analysts, operations teams, and business units. Teams spend hours collecting data from multiple sources, computing key performance indicators (KPIs), writing performance summaries, and formatting Excel or PDF reports for stakeholders.

This fragmented process lacks automation and consistency. It often involves:

  • Extracting data from CSV files, SQL queries, or API responses
  • Performing calculations like MoM growth, totals, or breakdowns
  • Writing human-friendly insights manually
  • Reformatting visual charts and tables every month
  • Embedding screenshots or notes as part of the report

There is no modern, developer-centric tool that combines all these tasks with the power of Large Language Models (LLMs) for generating summaries, all in a secure, offline-compatible environment.

**AutoReportAI ** addresses this challenge by offering an integrated reporting engine that ingests data, computes KPIs, generates executive summaries using GPT or open-source LLMs, and outputs well-formatted Excel and PDF reports — complete with charts and optional visuals — all through a simple command-line workflow.

Who This Is For

AutoReportAI is built for modern teams who are responsible for recurring reporting, performance analysis, and decision support. Whether you are working with raw data or managing stakeholder communication, this tool provides a streamlined path from data to insight.

Ideal Users:

  • Data Analysts: Automate recurring monthly report creation, generate consistent insights, and minimize manual chart formatting.
  • Operations Teams: Get a centralized tool to monitor departmental KPIs and trends with business-friendly summaries.
  • Business Intelligence Teams: Save time on writing narrative performance breakdowns with AI-powered summaries.
  • Team Leads & Executives: Receive well-formatted Excel and PDF reports with clear trends, visuals, and AI-summarized actions.
  • Developers & Engineers: Integrate reporting automation into workflows using CLI and config-driven customization.

AutoReportAI makes it easier for both technical and non-technical users to collaborate on high-quality reports without getting lost in spreadsheets, SQL queries, or copy-pasting.

Architecture Overview

The architecture of **AutoReportAI ** is designed for modularity, extensibility, and ease of integration into any data workflow. It follows a step-by-step pipeline from raw data ingestion to polished, presentation-ready reports.

Workflow Overview:

  1. Data Ingest Layer

    • Input accepted via CSV files, SQL query results, or API calls
    • Pre-processing includes data validation, schema alignment, and sanitization
  2. KPI Engine

    • Performs calculations like MoM growth, YoY trends, totals, ratios, and outlier detection
    • Supports grouping and aggregating by client, department, or user
  3. LLM Summary Generator

    • Feeds performance data into structured prompt templates
    • Utilizes OpenAI (GPT-4) or open-source models (LLaMA2, Mistral) for summarization
    • Returns clear, business-friendly explanations and insights
  4. Excel & PDF Writer

    • Multi-sheet Excel report using openpyxl with:
      • Raw data
      • KPI tables
      • Charts (bar, line, pie)
      • Executive summary sheet
    • Optional PDF export using pdfkit or WeasyPrint
  5. Delivery Layer

    • Reports saved to local folders or exported via email (optional SMTP integration)
    • Image attachments (e.g. dashboard screenshots) embedded in report assets

This clean, layered approach ensures AutoReportAI is production-ready, testable, and easily extendable with more features like forecasting, PII masking, or real-time dashboards.

Visual Architecture Idea:

+----------------------+     +------------------+     +--------------------+     +---------------------+     +------------------+
|    Data Sources      | --> |   KPI Processor  | --> |  LLM Summary Engine | --> | Excel & PDF Builder | --> | Report Delivery  |
| (CSV, SQL, API)      |     | (MoM, Trends...) |     | (GPT, LLaMA, etc.)  |     | (openpyxl, charts)  |     | (Email, Local)   |
+----------------------+     +------------------+     +--------------------+     +---------------------+     +------------------+

Each module can be independently configured, tested, and enhanced without breaking the overall flow.

Tech Stack & Requirements

AutoReportAI is built using reliable, production-grade Python libraries and modern APIs. It emphasizes modularity, clarity, and extensibility, making it easy to customize and deploy in various environments — including cost-sensitive or offline setups.

Core Technologies:

  • Python 3.10+ – Primary programming language
  • openai – For accessing GPT-3.5/GPT-4 APIs (optional, paid usage)
  • transformers – For running models like Mistral, Falcon, and BLOOM from Hugging Face
  • llama-cpp-python – For running quantized LLaMA models locally on CPU/GPU
  • openpyxl – For creating and formatting Excel .xlsx reports
  • matplotlib / plotly – For generating charts (bar, pie, line)
  • pdfkit / WeasyPrint – For generating PDF snapshots
  • pandas – For data manipulation and aggregation
  • APScheduler / cron – For scheduling recurring report jobs
  • dotenv – For managing environment variables securely

System Requirements:

  • Python 3.10 or later
  • Access to OpenAI API (if GPT-based summaries are enabled)
  • Node.js (required by pdfkit if using wkhtmltopdf)
  • wkhtmltopdf or WeasyPrint installed locally (for PDF export)

Flexible LLM Support via Environment Variable:

AutoReportAI allows dynamic switching between OpenAI and open-source LLMs using an environment variable such as LLM_PROVIDER. Supported values include:

  • openai – Uses GPT models via API
  • local-llama – Uses LLaMA models via llama-cpp
  • huggingface – Uses any Hugging Face-hosted transformer model

This flexibility helps balance cost, privacy, and performance, giving users full control over their reporting engine’s intelligence layer.

Optional Enhancements:

  • SMTP credentials (for automated email delivery)
  • GPU-enabled system or Apple M-series chip (for efficient local LLM execution)
  • Hugging Face API key (if using hosted OSS models)
  • Local .gguf model files (for fully offline summarization)

All dependencies will be listed in requirements.txt, with setup instructions provided in the README.md for quick onboarding.

Input Format

AutoReportAI is designed with flexibility and clarity at its core, enabling users to feed in structured data seamlessly from various sources. This ensures consistent data processing and accurate report generation across diverse teams and workflows.

Supported Input Types

1. CSV Files (Primary Method)

  • Most straightforward and widely supported format
  • Must include headers in the first row for column mapping
  • Supports UTF-8 encoding (recommended)
  • Can contain one or more metric or dimension columns
  • Multiple CSVs can be grouped and processed by department or business unit

Example CSV File:

Date,Department,Revenue,Expenses,Users
2025-03-01,Sales,15000,7000,134
2025-03-02,Sales,17800,6200,151
...etc

CSV File Placement:

/data/
  ├── finance_mar_2025.csv
  ├── marketing_mar_2025.csv
  └── operations_mar_2025.csv

2. SQL Query Results (Exported to CSV)

  • Compatible with any SQL database that supports CSV export
  • You can query from tools like PostgreSQL, MySQL, or Snowflake
  • Simply export the result to .csv and place it inside /data/

Note: Live SQL query execution support is part of the upcoming roadmap.

3. API-Based Data (Planned Feature)

  • JSON format with key-value pairs per record
  • Expected to support pagination, headers, and basic auth
  • Will be transformable into tabular format automatically

Prompt Templates (Advanced Customization)

AutoReportAI allows users to define the tone and content structure of the LLM-generated summaries using prompt templates.

Example prompt file path:

/prompts/summary_prompt.txt

Prompt Template Features:

  • Customize the summary style: executive, operational, technical
  • Define voice/tone: formal, conversational, analytical
  • Predefine what the summary must include: performance trends, risks, next steps, or recommendations
  • These are dynamically inserted into the LLM input prompt at runtime

Environment Variables (for Input)

Users can control which data file and prompt to use through CLI arguments or .env configurations:

DATA_SOURCE=./data/finance_mar_2025.csv
PROMPT_TEMPLATE=./prompts/summary_prompt.txt
MONTH=Mar
YEAR=2025

CLI Reference

The CLI interface is designed to be explicit and developer-friendly:

python report_engine.py \
  --source ./data/finance_mar_2025.csv \
  --month Mar \
  --year 2025 \
  --prompt ./prompts/summary_prompt.txt

Summary

  • Input must be well-structured (CSV with headers is preferred)
  • Each department can have a separate CSV file
  • Prompt templates offer full control over LLM behavior
  • Environment variables or CLI flags allow dynamic control per run

This setup ensures that your reports remain modular, repeatable, and adaptable to business needs — whether offline, automated via cron, or integrated into larger pipelines.

Output Format

AutoReportAI is built to deliver high-impact, presentation-ready reports in multiple formats, serving the needs of analysts, department heads, and C-level executives alike. Each report is structured for clarity, traceability, and business usability.

Primary Output: Excel Reports (.xlsx)

Excel is the default and most feature-rich output format. Each generated .xlsx report includes multiple structured sheets:

  • Sheet 1: Raw Data

    • Contains the unmodified source data from CSV or API
    • Useful for audit trails, internal review, or backup
    • Retains original column headers for traceability
  • Sheet 2: KPI Dashboard

    • Features computed KPIs like:
      • Total revenue, expenses, or usage metrics
      • Month-over-Month (MoM) and Year-over-Year (YoY) trends
      • Category or user-level breakdowns
    • Dynamically grouped based on department or segmentation
  • Sheet 3: Visual Charts

    • Bar charts for performance comparison
    • Line charts for trend analysis
    • Pie charts for category contribution
    • Powered by plotly or matplotlib and inserted via openpyxl
  • Sheet 4: Executive Summary

    • LLM-generated narrative tailored for non-technical readers
    • Explains trends, identifies outliers, and offers action items
    • Can be customized via prompt templates to match executive tone or regional language preferences
  • Embedded Image Support (Optional)

    • Embed diagrams, dashboards, or whiteboard snapshots
    • Uploaded via CLI or GUI and placed into assets/
    • Referenced visually in the Excel report with captions

Secondary Output: PDF Reports

  • PDF snapshot of the Excel report
  • Styled for distribution to clients, VPs, or stakeholders
  • Generated using pdfkit or WeasyPrint
  • Includes:
    • Executive summary
    • Charts and KPI tables
    • Headers, footers, page numbers (customizable)

Lightweight Output: Markdown Summary

  • Single .md file per report
  • Contains only the executive summary (text only)
  • Useful for embedding in GitHub repos, Notion docs, or internal portals
  • Can serve as a changelog or snapshot for monthly performance

Output Directory Structure

Each run is timestamped or organized by reporting cycle (month/year). Example:

/reports/Mar_2025/
  ├── finance_report.xlsx          # Full report
  ├── finance_report.pdf           # Shareable snapshot
  ├── finance_summary.md           # Executive summary only
  └── assets/
       └── dashboard_finance.png   # Embedded visuals

Output Customization & CLI Flags

AutoReportAI allows full control over what is generated:

  • Enable or disable specific formats (--excel, --pdf, --markdown)
  • Choose whether to embed uploaded images (--embed-images)
  • Control visual themes or layout via optional YAML config (chart_config.yaml)
  • Set summary tone and section structure through prompt templates

Recap: What Makes AutoReportAI Output Valuable

  • ✅ Multi-format delivery for maximum accessibility
  • ✅ Combines raw data, insights, and visuals in one package
  • ✅ Customizable for different stakeholders (internal or external)
  • ✅ Organized for historical tracking and compliance

Whether the report is for internal review, board meetings, or client presentations — AutoReportAI ensures it is clear, complete, and professional every time.

Code Flow Summary

AutoReportAI is built with an enterprise-grade, modular architecture that cleanly separates responsibilities across services, utilities, configuration layers, and report generation logic. The design follows industry best practices to ensure scalability, reusability, and maintainability.

Step-by-Step Process

1. Input Data Loading

  • Source can be specified via CLI or environment variable
  • Reads .csv file and loads it into a pandas DataFrame
  • Validates data schema, handles missing/null values
  • Supports batch processing of multiple files per department/client

2. KPI Calculation & Analysis

  • Calculates metrics such as:
    • Aggregates: Totals, Averages, Min/Max
    • Trends: MoM, YoY
    • Segmented breakdowns (by department, region, etc.)
    • Optional: anomaly detection using statistical thresholds or ML
  • Output structured as JSON-ready dictionary or DataFrame

3. Prompt Preparation

  • Combines KPI insights with predefined prompt templates
  • Injects values into placeholders for summary clarity
  • Supports multiple tone/formats for executive, operational, or technical use cases

4. LLM Summary Generation

  • Pluggable model handler based on LLM_PROVIDER
    • openai (API)
    • llama-cpp (local)
    • transformers (HuggingFace)
  • Prompts submitted securely and responses parsed for formatting

5. Excel + Chart Report Generation

  • Multi-sheet .xlsx file assembled using openpyxl
    • Sheet 1: Raw data
    • Sheet 2: KPI metrics
    • Sheet 3: Visuals (charts via plotly or matplotlib)
    • Sheet 4: Executive summary text
  • User-uploaded images (dashboards, whiteboards) embedded as visual aids

6. PDF & Markdown Export (Optional)

  • Clean layout PDF exported using WeasyPrint or pdfkit
  • Markdown version includes text summary only
  • All exports versioned and named clearly for traceability

7. Logging, Auditing & Traceability

  • Structured JSON logs for every report run
  • Logs include: timestamps, model used, report status, and error traces
  • Environment metadata (e.g. file names, API version) captured for auditing

Enterprise-Grade Code Structure

├── core/
│   ├── kpi_engine.py            # Metric computations and analytics
│   ├── llm_handler.py           # LLM API abstraction (OpenAI, LLaMA, etc.)
│   ├── prompt_builder.py        # Template injection and formatting
│   └── chart_generator.py       # Charting and visualization logic
│
├── io/
│   ├── excel_writer.py          # Excel multi-sheet writer
│   ├── pdf_exporter.py          # PDF conversion and styling
│   ├── markdown_exporter.py     # Markdown summary output
│   └── image_embedder.py        # Embeds images in reports
│
├── config/
│   ├── config_loader.py         # Loads config.json and .env
│   └── chart_styles.yaml        # Optional chart customization
│
├── cli/
│   └── report_engine.py         # CLI entry point and argument parser
│
├── utils/
│   ├── validators.py            # Schema checks, null handlers
│   └── logger.py                # Centralized logging module
│
├── prompts/                    # User-defined LLM templates
├── logs/                       # JSON-formatted execution logs
├── reports/                    # Output files stored by cycle/month
├── config.json / .env          # Runtime settings
└── requirements.txt            # Dependency list

This structure allows AutoReportAI to support scaling, plugin injection, and advanced automation workflows. Every component can be unit tested, extended, or swapped independently without disrupting the full pipeline.

Command-Line Usage Instructions

AutoReportAI is designed to be run from the command line with clearly defined flags for flexibility and reproducibility. Users can easily trigger report generation for different departments, time periods, and output formats with a single command.

Basic Syntax

python cli/report_engine.py --source <input_file.csv> --month <Month> --year <Year>

Required Flags

  • --source → Path to input CSV file
  • --month → Month to label the report (e.g., Jan, Feb, Mar)
  • --year → Year of the report (e.g., 2025)

Optional Flags

  • --prompt → Path to custom LLM prompt template
  • --output-dir → Directory to store the reports (default is ./reports/)
  • --excel → Generate Excel output (enabled by default)
  • --pdf → Enable PDF export
  • --markdown → Enable Markdown export
  • --embed-images → Embed local images from assets/ folder
  • --llm-provider → Choose between openai, llama-cpp, or huggingface
  • --model-name → Specify the exact model to use (e.g., gpt-4, mistral-7b, etc.)

Example Usage

python cli/report_engine.py \
  --source ./data/marketing_mar_2025.csv \
  --month Mar \
  --year 2025 \
  --prompt ./prompts/summary_prompt.txt \
  --pdf \
  --markdown \
  --embed-images \
  --llm-provider openai \
  --model-name gpt-4

Environment Variable Support

Instead of passing everything via CLI, you can use a .env file or config.json for repeatable runs:

DATA_SOURCE=./data/marketing_mar_2025.csv
MONTH=Mar
YEAR=2025
PROMPT_TEMPLATE=./prompts/summary_prompt.txt
LLM_PROVIDER=openai
MODEL_NAME=gpt-4
ENABLE_PDF=true
ENABLE_MARKDOWN=true

Cron/Job Scheduler Friendly

Because the tool is fully CLI-based, you can schedule recurring reports easily:

0 9 1 * * /usr/bin/python3 /path/to/cli/report_engine.py --source ./data/ops.csv --month Mar --year 2025 --pdf

This usage flexibility allows integration into CI pipelines, cloud jobs, or enterprise automation platforms with minimal setup.

Customization Tips

AutoReportAI is highly customizable through prompt templates, environment variables, config files, and CLI flags. These options ensure you can tailor the report format, language, and logic to your organization’s exact needs.

1. Prompt Customization

You can modify the style, tone, and structure of the LLM-generated summaries using .txt templates placed in the prompts/ folder.

Example Template Variables:

  • {{month}}, {{year}} → Auto-filled based on input flags
  • {{kpi_summary}} → Injects KPI calculations dynamically

Prompt Modes Supported:

  • Executive-level summary
  • Department-wise breakdown
  • Performance trend analysis
  • Actionable insights and recommendations

2. Output Control

Customize output format and destination through CLI flags or environment variables:

  • Enable/disable Excel, PDF, or Markdown output
  • Set custom output directory
  • Embed uploaded images
  • Customize chart styling via YAML config (planned)

3. LLM Provider Switching

Use LLM_PROVIDER in .env or as CLI flag to toggle between:

  • openai → GPT-3.5 / GPT-4 via API
  • llama-cpp → Local LLaMA or Mistral models
  • huggingface → Hugging Face-hosted or self-hosted models

Also set:

  • MODEL_NAME=gpt-4 or mistral-7b
  • MAX_TOKENS=800 to control output length

4. Retry & Resilience Controls

To ensure reliability in production:

  • Use built-in retry logic (with exponential backoff) for LLM calls
  • Enable structured JSON logging of errors and responses
  • Configure TIMEOUT, RETRY_COUNT, and LOG_LEVEL in .env

5. Config-Based Execution

Use config.json or .env to define reusable reporting profiles:

  • Set up default prompt paths, output folders, and model preferences
  • Ideal for automation or CI/CD setups

Sample .env:

DATA_SOURCE=./data/finance_feb_2025.csv
LLM_PROVIDER=llama-cpp
MODEL_NAME=mistral-7b
ENABLE_PDF=true
ENABLE_MARKDOWN=true
EMBED_IMAGES=true

These options give teams fine-grained control over both the content and behavior of AutoReportAI . Whether you need executive-ready summaries, audit-compliant outputs, or fully offline deployments, AutoReportAI adapts to your workflow.

Enhancement Ideas

AutoReportAI is designed with extensibility in mind. Below are several enhancement opportunities that can elevate it from a powerful reporting tool to a full-scale reporting platform.

We are fully committed to bringing these features to life. A dedicated Streamlit-based GUI is in active planning, and the rest of these enhancements will follow soon in upcoming releases.

1. Streamlit-Based GUI (Planned)

  • Drag-and-drop CSV file upload
  • LLM configuration through a form
  • Real-time report preview and download
  • Interactive toggles for output options (Excel, PDF, Markdown)
  • Summary tone/style selector

2. Report History & Versioning

  • Save and track previous runs per department and time period
  • Include version control and changelog entries
  • Display comparison between historical reports

3. Multi-Language Summary Output

  • Enable multilingual summaries via prompt translation
  • Add language flag (--lang) to support localization
  • Useful for international or regional teams

4. Tone & Persona Control

  • Executive vs operational summary modes
  • Customize summary style: optimistic, critical, neutral
  • Templates could include context such as goals, targets, or risks

5. Forecasting & Predictive KPIs

  • Add simple trend projection using linear regression or ARIMA
  • Forecast revenue, usage, or cost based on historical data
  • Visualize confidence intervals in charts

6. Live Dashboard Integration

  • Auto-generate a dynamic dashboard using Streamlit or Dash
  • Include LLM summaries, charts, and historical trends
  • Allow download/export on demand

7. Multi-Report Batch Mode

  • Accept a folder of CSV files and generate reports in batch
  • Group by department, region, or client automatically
  • CLI flag: --multi-source ./data/

8. PII Masking & Security Features

  • Detect and mask sensitive data fields (emails, names, IDs)
  • Add audit logs and report anonymization options
  • Compliance-ready for industries like healthcare or finance

9. Export to Database or BI Tools

  • Push reports or KPI summaries to PostgreSQL, BigQuery, or Notion
  • Enable Airtable export or Webhook triggers

10. Auto Email Delivery

  • Send completed reports automatically via SMTP
  • Use .env for email credentials and default recipients
  • Support daily, weekly, or monthly delivery options

11. Prompt Feedback Loop (Planned)

  • Collect feedback from users about LLM summary quality
  • Feed improvement signals back to refine prompt templates
  • Enable iterative improvement over time

These enhancements will help AutoReportAI become a complete, intelligent reporting solution for enterprises and data teams. We are actively iterating to ship them soon.

License

AutoReportAI is released under the MIT License, a permissive open-source license that encourages both individual and commercial usage. You are free to use, modify, distribute, and integrate this project into your own solutions, subject to the following conditions:

Key Permissions:

  • ✅ Commercial use
  • ✅ Modification
  • ✅ Distribution
  • ✅ Private use
  • ✅ Incorporation into closed-source or SaaS tools

Limitations:

  • ❌ No warranty provided
  • ❌ Liability limitations apply

Obligations:

  • Attribution must be retained in source files
  • Significant changes must be documented in your forks or releases

You’ll find the full license text in the LICENSE file in the root of the repository. By using AutoReportAI , you agree to comply with these terms.

For enterprise licensing, white-labeled versions, or collaboration inquiries, feel free to contact the maintainers.

Demo Output Preview

Below is a sample snapshot of what AutoReportAI generates after processing a typical departmental dataset. This helps users visualize what to expect from the Excel, PDF, and Markdown output files.

Sample Excel Report (.xlsx)

Sheet Structure:

  • Sheet 1: Raw Data → Full tabular data with filters and formatting
  • Sheet 2: KPI Dashboard → Summary metrics (totals, growth, breakdowns)
  • Sheet 3: Visual Charts → Bar, line, and pie charts dynamically rendered
  • Sheet 4: Executive Summary → LLM-generated summary in clear business language

Example Snippet (Executive Summary):

"In March 2025, the Marketing department saw a 14.8% MoM increase in user engagement and a 12% decrease in ad spend, resulting in a 21% boost in conversion efficiency. The campaign targeting product B outperformed expectations. It is recommended to scale similar content formats in Q2."

Chart Example:

  • Bar chart comparing MoM performance across departments
  • Pie chart showing category contribution to total revenue
  • Line graph of trend over time

Sample PDF Report (.pdf)

  • Contains same content as Excel but formatted for clean printing
  • Page headers, footers, title, and company branding available

Sample Markdown Output (.md)

### March 2025 Marketing Summary
- User growth: +14.8%
- Ad spend: -12%
- Conversion uplift: +21%
- Top campaign: Product B
- Recommendation: Increase budget for content campaigns

Folder Structure Example

/reports/Mar_2025/
  ├── marketing_report.xlsx
  ├── marketing_report.pdf
  ├── marketing_summary.md
  └── assets/
       └── campaign_dashboard.png

This preview should give teams clarity and confidence in integrating AutoReportAI into their monthly reporting workflows. You can replace these sample visuals and text with actual project output after your first run.

About

AutoReportAI is an AI-powered solution automating monthly reports. It integrates data collection (CSV, SQL, APIs), KPI analysis, and LLM insights with Excel automation and visualization. Designed for secure, offline-ready reporting, it streamlines workflows and drives informed decisions.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors