AutoReportAI: LLM-Powered Monthly Report Generator

Description

AutoReportAI is an AI-powered solution that automates monthly report generation. It collects data from diverse sources (CSV, SQL, APIs), processes and analyzes key performance indicators, and generates insightful summaries using advanced machine learning and LLM integrations. Seamlessly automating Excel report creation with embedded visualizations and secure offline capabilities, AutoReportAI empowers analysts and decision-makers to streamline their workflows and focus on strategic insights.

AutoReportAI: LLM-Powered Monthly Report Generator

Key Highlights

Accepts data from CSV files, SQL queries, or APIs
Computes KPIs and detects anomalies
Generates action-oriented summaries using GPT or open-source LLMs
Outputs multi-sheet Excel reports with charts and summaries
Offers optional PDF exports and embedded image support
CLI-ready, configurable, and schedule-friendly

Problem Statement

Manual monthly reporting is often slow, repetitive, and error-prone for data analysts, operations teams, and business units. Teams spend hours collecting data from multiple sources, computing key performance indicators (KPIs), writing performance summaries, and formatting Excel or PDF reports for stakeholders.

This fragmented process lacks automation and consistency. It often involves:

Extracting data from CSV files, SQL queries, or API responses
Performing calculations like MoM growth, totals, or breakdowns
Writing human-friendly insights manually
Reformatting visual charts and tables every month
Embedding screenshots or notes as part of the report

There is no modern, developer-centric tool that combines all these tasks with the power of Large Language Models (LLMs) for generating summaries, all in a secure, offline-compatible environment.

**AutoReportAI ** addresses this challenge by offering an integrated reporting engine that ingests data, computes KPIs, generates executive summaries using GPT or open-source LLMs, and outputs well-formatted Excel and PDF reports — complete with charts and optional visuals — all through a simple command-line workflow.

Who This Is For

AutoReportAI is built for modern teams who are responsible for recurring reporting, performance analysis, and decision support. Whether you are working with raw data or managing stakeholder communication, this tool provides a streamlined path from data to insight.

Ideal Users:

Data Analysts: Automate recurring monthly report creation, generate consistent insights, and minimize manual chart formatting.
Operations Teams: Get a centralized tool to monitor departmental KPIs and trends with business-friendly summaries.
Business Intelligence Teams: Save time on writing narrative performance breakdowns with AI-powered summaries.
Team Leads & Executives: Receive well-formatted Excel and PDF reports with clear trends, visuals, and AI-summarized actions.
Developers & Engineers: Integrate reporting automation into workflows using CLI and config-driven customization.

AutoReportAI makes it easier for both technical and non-technical users to collaborate on high-quality reports without getting lost in spreadsheets, SQL queries, or copy-pasting.

Architecture Overview

The architecture of **AutoReportAI ** is designed for modularity, extensibility, and ease of integration into any data workflow. It follows a step-by-step pipeline from raw data ingestion to polished, presentation-ready reports.

Workflow Overview:

Data Ingest Layer
- Input accepted via CSV files, SQL query results, or API calls
- Pre-processing includes data validation, schema alignment, and sanitization
KPI Engine
- Performs calculations like MoM growth, YoY trends, totals, ratios, and outlier detection
- Supports grouping and aggregating by client, department, or user
LLM Summary Generator
- Feeds performance data into structured prompt templates
- Utilizes OpenAI (GPT-4) or open-source models (LLaMA2, Mistral) for summarization
- Returns clear, business-friendly explanations and insights
Excel & PDF Writer
- Multi-sheet Excel report using openpyxl with:
  - Raw data
  - KPI tables
  - Charts (bar, line, pie)
  - Executive summary sheet
- Optional PDF export using pdfkit or WeasyPrint
Delivery Layer
- Reports saved to local folders or exported via email (optional SMTP integration)
- Image attachments (e.g. dashboard screenshots) embedded in report assets

This clean, layered approach ensures AutoReportAI is production-ready, testable, and easily extendable with more features like forecasting, PII masking, or real-time dashboards.

Visual Architecture Idea:

+----------------------+     +------------------+     +--------------------+     +---------------------+     +------------------+
|    Data Sources      | --> |   KPI Processor  | --> |  LLM Summary Engine | --> | Excel & PDF Builder | --> | Report Delivery  |
| (CSV, SQL, API)      |     | (MoM, Trends...) |     | (GPT, LLaMA, etc.)  |     | (openpyxl, charts)  |     | (Email, Local)   |
+----------------------+     +------------------+     +--------------------+     +---------------------+     +------------------+

Each module can be independently configured, tested, and enhanced without breaking the overall flow.

Tech Stack & Requirements

AutoReportAI is built using reliable, production-grade Python libraries and modern APIs. It emphasizes modularity, clarity, and extensibility, making it easy to customize and deploy in various environments — including cost-sensitive or offline setups.

Core Technologies:

Python 3.10+ – Primary programming language
openai – For accessing GPT-3.5/GPT-4 APIs (optional, paid usage)
transformers – For running models like Mistral, Falcon, and BLOOM from Hugging Face
llama-cpp-python – For running quantized LLaMA models locally on CPU/GPU
openpyxl – For creating and formatting Excel .xlsx reports
matplotlib / plotly – For generating charts (bar, pie, line)
pdfkit / WeasyPrint – For generating PDF snapshots
pandas – For data manipulation and aggregation
APScheduler / cron – For scheduling recurring report jobs
dotenv – For managing environment variables securely

System Requirements:

Python 3.10 or later
Access to OpenAI API (if GPT-based summaries are enabled)
Node.js (required by pdfkit if using wkhtmltopdf)
wkhtmltopdf or WeasyPrint installed locally (for PDF export)

Flexible LLM Support via Environment Variable:

AutoReportAI allows dynamic switching between OpenAI and open-source LLMs using an environment variable such as LLM_PROVIDER. Supported values include:

openai – Uses GPT models via API
local-llama – Uses LLaMA models via llama-cpp
huggingface – Uses any Hugging Face-hosted transformer model

This flexibility helps balance cost, privacy, and performance, giving users full control over their reporting engine’s intelligence layer.

Optional Enhancements:

SMTP credentials (for automated email delivery)
GPU-enabled system or Apple M-series chip (for efficient local LLM execution)
Hugging Face API key (if using hosted OSS models)
Local .gguf model files (for fully offline summarization)

All dependencies will be listed in requirements.txt, with setup instructions provided in the README.md for quick onboarding.

Input Format

AutoReportAI is designed with flexibility and clarity at its core, enabling users to feed in structured data seamlessly from various sources. This ensures consistent data processing and accurate report generation across diverse teams and workflows.

Supported Input Types

1. CSV Files (Primary Method)

Most straightforward and widely supported format
Must include headers in the first row for column mapping
Supports UTF-8 encoding (recommended)
Can contain one or more metric or dimension columns
Multiple CSVs can be grouped and processed by department or business unit

Example CSV File:

Date,Department,Revenue,Expenses,Users
2025-03-01,Sales,15000,7000,134
2025-03-02,Sales,17800,6200,151
...etc

CSV File Placement:

/data/
  ├── finance_mar_2025.csv
  ├── marketing_mar_2025.csv
  └── operations_mar_2025.csv

2. SQL Query Results (Exported to CSV)

Compatible with any SQL database that supports CSV export
You can query from tools like PostgreSQL, MySQL, or Snowflake
Simply export the result to .csv and place it inside /data/

Note: Live SQL query execution support is part of the upcoming roadmap.

3. API-Based Data (Planned Feature)

JSON format with key-value pairs per record
Expected to support pagination, headers, and basic auth
Will be transformable into tabular format automatically

Prompt Templates (Advanced Customization)

AutoReportAI allows users to define the tone and content structure of the LLM-generated summaries using prompt templates.

Example prompt file path:

/prompts/summary_prompt.txt

Prompt Template Features:

Customize the summary style: executive, operational, technical
Define voice/tone: formal, conversational, analytical
Predefine what the summary must include: performance trends, risks, next steps, or recommendations
These are dynamically inserted into the LLM input prompt at runtime

Environment Variables (for Input)

Users can control which data file and prompt to use through CLI arguments or .env configurations:

DATA_SOURCE=./data/finance_mar_2025.csv
PROMPT_TEMPLATE=./prompts/summary_prompt.txt
MONTH=Mar
YEAR=2025

CLI Reference

The CLI interface is designed to be explicit and developer-friendly:

python report_engine.py \
  --source ./data/finance_mar_2025.csv \
  --month Mar \
  --year 2025 \
  --prompt ./prompts/summary_prompt.txt

Summary

Input must be well-structured (CSV with headers is preferred)
Each department can have a separate CSV file
Prompt templates offer full control over LLM behavior
Environment variables or CLI flags allow dynamic control per run

This setup ensures that your reports remain modular, repeatable, and adaptable to business needs — whether offline, automated via cron, or integrated into larger pipelines.

Output Format

AutoReportAI is built to deliver high-impact, presentation-ready reports in multiple formats, serving the needs of analysts, department heads, and C-level executives alike. Each report is structured for clarity, traceability, and business usability.

Primary Output: Excel Reports (`.xlsx`)

Excel is the default and most feature-rich output format. Each generated .xlsx report includes multiple structured sheets:

Sheet 1: Raw Data
- Contains the unmodified source data from CSV or API
- Useful for audit trails, internal review, or backup
- Retains original column headers for traceability
Sheet 2: KPI Dashboard
- Features computed KPIs like:
  - Total revenue, expenses, or usage metrics
  - Month-over-Month (MoM) and Year-over-Year (YoY) trends
  - Category or user-level breakdowns
- Dynamically grouped based on department or segmentation
Sheet 3: Visual Charts
- Bar charts for performance comparison
- Line charts for trend analysis
- Pie charts for category contribution
- Powered by plotly or matplotlib and inserted via openpyxl
Sheet 4: Executive Summary
- LLM-generated narrative tailored for non-technical readers
- Explains trends, identifies outliers, and offers action items
- Can be customized via prompt templates to match executive tone or regional language preferences
Embedded Image Support (Optional)
- Embed diagrams, dashboards, or whiteboard snapshots
- Uploaded via CLI or GUI and placed into assets/
- Referenced visually in the Excel report with captions

Secondary Output: PDF Reports

PDF snapshot of the Excel report
Styled for distribution to clients, VPs, or stakeholders
Generated using pdfkit or WeasyPrint
Includes:
- Executive summary
- Charts and KPI tables
- Headers, footers, page numbers (customizable)

Lightweight Output: Markdown Summary

Single .md file per report
Contains only the executive summary (text only)
Useful for embedding in GitHub repos, Notion docs, or internal portals
Can serve as a changelog or snapshot for monthly performance

Output Directory Structure

Each run is timestamped or organized by reporting cycle (month/year). Example:

/reports/Mar_2025/
  ├── finance_report.xlsx          # Full report
  ├── finance_report.pdf           # Shareable snapshot
  ├── finance_summary.md           # Executive summary only
  └── assets/
       └── dashboard_finance.png   # Embedded visuals

Output Customization & CLI Flags

AutoReportAI allows full control over what is generated:

Enable or disable specific formats (--excel, --pdf, --markdown)
Choose whether to embed uploaded images (--embed-images)
Control visual themes or layout via optional YAML config (chart_config.yaml)
Set summary tone and section structure through prompt templates

Recap: What Makes AutoReportAI Output Valuable

✅ Multi-format delivery for maximum accessibility
✅ Combines raw data, insights, and visuals in one package
✅ Customizable for different stakeholders (internal or external)
✅ Organized for historical tracking and compliance

Whether the report is for internal review, board meetings, or client presentations — AutoReportAI ensures it is clear, complete, and professional every time.

Code Flow Summary

AutoReportAI is built with an enterprise-grade, modular architecture that cleanly separates responsibilities across services, utilities, configuration layers, and report generation logic. The design follows industry best practices to ensure scalability, reusability, and maintainability.

Step-by-Step Process

1. Input Data Loading

Source can be specified via CLI or environment variable
Reads .csv file and loads it into a pandas DataFrame
Validates data schema, handles missing/null values
Supports batch processing of multiple files per department/client

2. KPI Calculation & Analysis

Calculates metrics such as:
- Aggregates: Totals, Averages, Min/Max
- Trends: MoM, YoY
- Segmented breakdowns (by department, region, etc.)
- Optional: anomaly detection using statistical thresholds or ML
Output structured as JSON-ready dictionary or DataFrame

3. Prompt Preparation

Combines KPI insights with predefined prompt templates
Injects values into placeholders for summary clarity
Supports multiple tone/formats for executive, operational, or technical use cases

4. LLM Summary Generation

Pluggable model handler based on LLM_PROVIDER
- openai (API)
- llama-cpp (local)
- transformers (HuggingFace)
Prompts submitted securely and responses parsed for formatting

5. Excel + Chart Report Generation

Multi-sheet .xlsx file assembled using openpyxl
- Sheet 1: Raw data
- Sheet 2: KPI metrics
- Sheet 3: Visuals (charts via plotly or matplotlib)
- Sheet 4: Executive summary text
User-uploaded images (dashboards, whiteboards) embedded as visual aids

6. PDF & Markdown Export (Optional)

Clean layout PDF exported using WeasyPrint or pdfkit
Markdown version includes text summary only
All exports versioned and named clearly for traceability

7. Logging, Auditing & Traceability

Structured JSON logs for every report run
Logs include: timestamps, model used, report status, and error traces
Environment metadata (e.g. file names, API version) captured for auditing

Enterprise-Grade Code Structure

├── core/
│   ├── kpi_engine.py            # Metric computations and analytics
│   ├── llm_handler.py           # LLM API abstraction (OpenAI, LLaMA, etc.)
│   ├── prompt_builder.py        # Template injection and formatting
│   └── chart_generator.py       # Charting and visualization logic
│
├── io/
│   ├── excel_writer.py          # Excel multi-sheet writer
│   ├── pdf_exporter.py          # PDF conversion and styling
│   ├── markdown_exporter.py     # Markdown summary output
│   └── image_embedder.py        # Embeds images in reports
│
├── config/
│   ├── config_loader.py         # Loads config.json and .env
│   └── chart_styles.yaml        # Optional chart customization
│
├── cli/
│   └── report_engine.py         # CLI entry point and argument parser
│
├── utils/
│   ├── validators.py            # Schema checks, null handlers
│   └── logger.py                # Centralized logging module
│
├── prompts/                    # User-defined LLM templates
├── logs/                       # JSON-formatted execution logs
├── reports/                    # Output files stored by cycle/month
├── config.json / .env          # Runtime settings
└── requirements.txt            # Dependency list

This structure allows AutoReportAI to support scaling, plugin injection, and advanced automation workflows. Every component can be unit tested, extended, or swapped independently without disrupting the full pipeline.

Command-Line Usage Instructions

AutoReportAI is designed to be run from the command line with clearly defined flags for flexibility and reproducibility. Users can easily trigger report generation for different departments, time periods, and output formats with a single command.

Basic Syntax

python cli/report_engine.py --source <input_file.csv> --month <Month> --year <Year>

Required Flags

--source → Path to input CSV file
--month → Month to label the report (e.g., Jan, Feb, Mar)
--year → Year of the report (e.g., 2025)

Optional Flags

--prompt → Path to custom LLM prompt template
--output-dir → Directory to store the reports (default is ./reports/)
--excel → Generate Excel output (enabled by default)
--pdf → Enable PDF export
--markdown → Enable Markdown export
--embed-images → Embed local images from assets/ folder
--llm-provider → Choose between openai, llama-cpp, or huggingface
--model-name → Specify the exact model to use (e.g., gpt-4, mistral-7b, etc.)

Example Usage

python cli/report_engine.py \
  --source ./data/marketing_mar_2025.csv \
  --month Mar \
  --year 2025 \
  --prompt ./prompts/summary_prompt.txt \
  --pdf \
  --markdown \
  --embed-images \
  --llm-provider openai \
  --model-name gpt-4

Environment Variable Support

Instead of passing everything via CLI, you can use a .env file or config.json for repeatable runs:

DATA_SOURCE=./data/marketing_mar_2025.csv
MONTH=Mar
YEAR=2025
PROMPT_TEMPLATE=./prompts/summary_prompt.txt
LLM_PROVIDER=openai
MODEL_NAME=gpt-4
ENABLE_PDF=true
ENABLE_MARKDOWN=true

Cron/Job Scheduler Friendly

Because the tool is fully CLI-based, you can schedule recurring reports easily:

0 9 1 * * /usr/bin/python3 /path/to/cli/report_engine.py --source ./data/ops.csv --month Mar --year 2025 --pdf

This usage flexibility allows integration into CI pipelines, cloud jobs, or enterprise automation platforms with minimal setup.

Customization Tips

AutoReportAI is highly customizable through prompt templates, environment variables, config files, and CLI flags. These options ensure you can tailor the report format, language, and logic to your organization’s exact needs.

1. Prompt Customization

You can modify the style, tone, and structure of the LLM-generated summaries using .txt templates placed in the prompts/ folder.

Example Template Variables:

{{month}}, {{year}} → Auto-filled based on input flags
{{kpi_summary}} → Injects KPI calculations dynamically

Prompt Modes Supported:

Executive-level summary
Department-wise breakdown
Performance trend analysis
Actionable insights and recommendations

2. Output Control

Customize output format and destination through CLI flags or environment variables:

Enable/disable Excel, PDF, or Markdown output
Set custom output directory
Embed uploaded images
Customize chart styling via YAML config (planned)

3. LLM Provider Switching

Use LLM_PROVIDER in .env or as CLI flag to toggle between:

openai → GPT-3.5 / GPT-4 via API
llama-cpp → Local LLaMA or Mistral models
huggingface → Hugging Face-hosted or self-hosted models

Also set:

MODEL_NAME=gpt-4 or mistral-7b
MAX_TOKENS=800 to control output length

4. Retry & Resilience Controls

To ensure reliability in production:

Use built-in retry logic (with exponential backoff) for LLM calls
Enable structured JSON logging of errors and responses
Configure TIMEOUT, RETRY_COUNT, and LOG_LEVEL in .env

5. Config-Based Execution

Use config.json or .env to define reusable reporting profiles:

Set up default prompt paths, output folders, and model preferences
Ideal for automation or CI/CD setups

Sample .env:

DATA_SOURCE=./data/finance_feb_2025.csv
LLM_PROVIDER=llama-cpp
MODEL_NAME=mistral-7b
ENABLE_PDF=true
ENABLE_MARKDOWN=true
EMBED_IMAGES=true

These options give teams fine-grained control over both the content and behavior of AutoReportAI . Whether you need executive-ready summaries, audit-compliant outputs, or fully offline deployments, AutoReportAI adapts to your workflow.

Enhancement Ideas

AutoReportAI is designed with extensibility in mind. Below are several enhancement opportunities that can elevate it from a powerful reporting tool to a full-scale reporting platform.

We are fully committed to bringing these features to life. A dedicated Streamlit-based GUI is in active planning, and the rest of these enhancements will follow soon in upcoming releases.

1. Streamlit-Based GUI (Planned)

Drag-and-drop CSV file upload
LLM configuration through a form
Real-time report preview and download
Interactive toggles for output options (Excel, PDF, Markdown)
Summary tone/style selector

2. Report History & Versioning

Save and track previous runs per department and time period
Include version control and changelog entries
Display comparison between historical reports

3. Multi-Language Summary Output

Enable multilingual summaries via prompt translation
Add language flag (--lang) to support localization
Useful for international or regional teams

4. Tone & Persona Control

Executive vs operational summary modes
Customize summary style: optimistic, critical, neutral
Templates could include context such as goals, targets, or risks

5. Forecasting & Predictive KPIs

Add simple trend projection using linear regression or ARIMA
Forecast revenue, usage, or cost based on historical data
Visualize confidence intervals in charts

6. Live Dashboard Integration

Auto-generate a dynamic dashboard using Streamlit or Dash
Include LLM summaries, charts, and historical trends
Allow download/export on demand

7. Multi-Report Batch Mode

Accept a folder of CSV files and generate reports in batch
Group by department, region, or client automatically
CLI flag: --multi-source ./data/

8. PII Masking & Security Features

Detect and mask sensitive data fields (emails, names, IDs)
Add audit logs and report anonymization options
Compliance-ready for industries like healthcare or finance

9. Export to Database or BI Tools

Push reports or KPI summaries to PostgreSQL, BigQuery, or Notion
Enable Airtable export or Webhook triggers

10. Auto Email Delivery

Send completed reports automatically via SMTP
Use .env for email credentials and default recipients
Support daily, weekly, or monthly delivery options

11. Prompt Feedback Loop (Planned)

Collect feedback from users about LLM summary quality
Feed improvement signals back to refine prompt templates
Enable iterative improvement over time

These enhancements will help AutoReportAI become a complete, intelligent reporting solution for enterprises and data teams. We are actively iterating to ship them soon.

License

AutoReportAI is released under the MIT License, a permissive open-source license that encourages both individual and commercial usage. You are free to use, modify, distribute, and integrate this project into your own solutions, subject to the following conditions:

Key Permissions:

✅ Commercial use
✅ Modification
✅ Distribution
✅ Private use
✅ Incorporation into closed-source or SaaS tools

Limitations:

❌ No warranty provided
❌ Liability limitations apply

Obligations:

Attribution must be retained in source files
Significant changes must be documented in your forks or releases

You’ll find the full license text in the LICENSE file in the root of the repository. By using AutoReportAI , you agree to comply with these terms.

For enterprise licensing, white-labeled versions, or collaboration inquiries, feel free to contact the maintainers.

Demo Output Preview

Below is a sample snapshot of what AutoReportAI generates after processing a typical departmental dataset. This helps users visualize what to expect from the Excel, PDF, and Markdown output files.

Sample Excel Report (`.xlsx`)

Sheet Structure:

Sheet 1: Raw Data → Full tabular data with filters and formatting
Sheet 2: KPI Dashboard → Summary metrics (totals, growth, breakdowns)
Sheet 3: Visual Charts → Bar, line, and pie charts dynamically rendered
Sheet 4: Executive Summary → LLM-generated summary in clear business language

Example Snippet (Executive Summary):

"In March 2025, the Marketing department saw a 14.8% MoM increase in user engagement and a 12% decrease in ad spend, resulting in a 21% boost in conversion efficiency. The campaign targeting product B outperformed expectations. It is recommended to scale similar content formats in Q2."

Chart Example:

Bar chart comparing MoM performance across departments
Pie chart showing category contribution to total revenue
Line graph of trend over time

Sample PDF Report (`.pdf`)

Contains same content as Excel but formatted for clean printing
Page headers, footers, title, and company branding available

Sample Markdown Output (`.md`)

### March 2025 Marketing Summary
- User growth: +14.8%
- Ad spend: -12%
- Conversion uplift: +21%
- Top campaign: Product B
- Recommendation: Increase budget for content campaigns

Folder Structure Example

/reports/Mar_2025/
  ├── marketing_report.xlsx
  ├── marketing_report.pdf
  ├── marketing_summary.md
  └── assets/
       └── campaign_dashboard.png

This preview should give teams clarity and confidence in integrating AutoReportAI into their monthly reporting workflows. You can replace these sample visuals and text with actual project output after your first run.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
LICENSE		LICENSE
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation