A powerful LangChain-powered agent that interacts with spreadsheets using natural language. Query, process, and summarize your data with simple conversational commands.
New to this project? → START_HERE.md - Get running in 60 seconds!
- Powered by LangChain: Multi-LLM support (GPT-4, Claude, and more)
- Multiple Agent Types: OpenAI Functions, ReAct, and custom agents
- Natural Language Queries: Filter, sort, and select data using plain English
- Data Processing: Add columns, aggregate data, compute metrics
- Smart Summarization: Get human-readable insights from your data
- Conversational Interface: Chain operations naturally across multiple turns
- Dynamic Tool Selection: Agent chooses the right tools automatically
- Memory-Enabled: Maintains context throughout the conversation
- Streaming Support: Real-time output via LangChain callbacks
- Performance Timing: See execution time for every query
- SQL Query Display: See the SQL equivalent of every operation
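Streaming output, for instance, rides on standard LangChain callbacks. The snippet below is a generic sketch of how streaming is typically enabled for a LangChain chat model; the project's own callback wiring may differ:

```python
# Generic LangChain streaming sketch - illustrative only; this project's
# callback wiring in main.py/agent.py may differ.
from langchain_core.callbacks import StreamingStdOutCallbackHandler
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4",
    streaming=True,                               # emit tokens as they are generated
    callbacks=[StreamingStdOutCallbackHandler()], # print each token to stdout
)

llm.invoke("Summarize what a spreadsheet agent does in one sentence.")
```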
This agent is built on LangChain, giving you:
- OpenAI: GPT-4, GPT-3.5-turbo
- Anthropic: Claude 3 Opus, Claude 3 Sonnet
- Any LangChain LLM: Custom configurations supported
```bash
# Use GPT-4
python main.py --model gpt-4

# Use Claude
python main.py --model claude-3-opus-20240229

# Use GPT-3.5 (faster/cheaper)
python main.py --model gpt-3.5-turbo
```
See LANGCHAIN_INTEGRATION.md for full details!
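Because the agent is built on LangChain, any LangChain chat model can in principle drive it. The sketch below only shows how such models are constructed; the `llm=` hand-off to `create_agent` is an assumption made for illustration, so check LANGCHAIN_INTEGRATION.md for the options the project actually supports.

```python
# Building LangChain chat models directly. The llm= keyword on create_agent is
# hypothetical - see LANGCHAIN_INTEGRATION.md for the supported configuration.
from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI

from agent import create_agent

gpt4 = ChatOpenAI(model="gpt-4", temperature=0)
claude = ChatAnthropic(model="claude-3-opus-20240229", temperature=0)

# Documented interface: select the model by name.
agent = create_agent(model_name="gpt-4", verbose=True)

# Hypothetical hand-off of a pre-built model:
# agent = create_agent(llm=claude)
```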
Just want to start NOW? → See START_HERE.md for the 3-step setup!
```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Set your API key
export OPENAI_API_KEY='sk-your-key-here'

# 3. Run the agent
python reports_agent.py
```
That's it! Then type `1` to load your first report.
A specialized interface for the 30+ business reports in the reports/ folder:
```bash
# Quick start - interactive mode
python reports_agent.py

# Or use the launcher
./analyze_reports.sh

# List all available reports
python reports_agent.py --list
```
Super fast loading - just type a number:
```
You: 1           # Loads first report
You: sales.csv   # Or type filename
```
See REPORTS_GUIDE.md for detailed examples!
```bash
python main.py
```
Example session:
```
You: Load the file examples/sales_data.csv
Agent: Loaded 24 rows and 7 columns from examples/sales_data.csv
...

You: Show rows where Sales > 2000
Agent: Query returned 8 rows:
...

You: Add a Profit column as Revenue - Cost
Agent: Successfully added column 'Profit'
...

You: Summarize the top 5 results
Agent: Data Summary:
- Total rows: 24
...
```
```bash
python main.py --batch \
  "Load examples/sales_data.csv" \
  "Show rows where Sales > 2000" \
  "Group by Region and sum Sales"
```
```python
from agent import create_agent

# Create agent
agent = create_agent(model_name='gpt-4', verbose=True)

# Run queries
result = agent.run("Load the file examples/sales_data.csv")
print(result['output'])

result = agent.run("Show rows where Sales > 2000")
print(result['output'])

result = agent.run("Add a Profit column as Revenue - Cost")
print(result['output'])
```
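Since `agent.run()` returns a dict with an `'output'` key, a whole workflow can be chained in a loop. The timing wrapper below is illustrative (plain `time.perf_counter`), not part of the project's API:

```python
import time

from agent import create_agent

agent = create_agent(model_name='gpt-4', verbose=False)

workflow = [
    "Load the file examples/sales_data.csv",
    "Show rows where Sales > 2000",
    "Group by Region and sum Sales",
    "Summarize which region performed best",
]

for step in workflow:
    start = time.perf_counter()
    result = agent.run(step)        # documented: returns a dict with 'output'
    elapsed = time.perf_counter() - start
    print(f"[{elapsed:.2f}s] {result['output']}")
```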
```
sqldemo/
├── agent.py              # Main agent implementation
├── tools.py              # Tool definitions (Query, Process, Summarize)
├── main.py               # CLI interface
├── requirements.txt      # Dependencies
├── examples/             # Example data and notebooks
│   ├── sales_data.csv    # Sample dataset
│   ├── example_1_query_only.ipynb
│   ├── example_2_query_and_process.ipynb
│   └── example_3_full_workflow.ipynb
└── tests/                # Unit tests
    └── test_tools.py
```
Load tool: load CSV or Excel files into memory.
Example: "Load the file data.csv"
Query tool: filter, select, and sort data using natural language. Examples:
- "Show rows where Sales > 1000"
- "Filter for Region == 'North'"
- "Get top 10 rows sorted by Revenue descending"
- "Select columns: Name, Sales, Profit"
Process tool: transform and aggregate data. Examples:
- "Add a Profit column calculated as Revenue - Cost"
- "Group by Category and sum Sales"
- "Calculate mean Sales by Region"
- "Normalize the Price column to 0-1 range"
Summarize tool: generate natural language summaries. Examples:
- "Summarize the top 5 results"
- "Explain what this data shows"
- "Write a brief report of key insights"
Info: view current spreadsheet metadata. Example: type `info` in the CLI.
Filter and summarize:
1. Load examples/sales_data.csv
2. Filter for sales above 2000
3. Add a Profit column (Revenue - Cost)
4. Summarize the top 5 results

Regional performance:
1. Load examples/sales_data.csv
2. Group by Region and sum Revenue
3. Summarize which region performed best

Product margins:
1. Load examples/sales_data.csv
2. Add a Margin column ((Revenue - Cost) / Revenue * 100)
3. Group by Product and calculate average Margin
4. Summarize which product has the best margins
In interactive mode, you can use these commands:
- `help` or `?` - Show help message
- `info` - Display current spreadsheet info
- `history` - Show operation history
- `reset` - Reset spreadsheet to original state
- `clear` - Clear conversation memory
- `exit` or `quit` - Exit the program
```
python main.py [OPTIONS]

Options:
  --model TEXT         Model to use (default: gpt-4)
  --temperature FLOAT  LLM temperature (default: 0)
  --verbose            Enable verbose output
  --batch TEXT...      Run in batch mode with the given commands
  --api-key TEXT       OpenAI API key
```
See the examples/ directory for Jupyter notebooks demonstrating:
- Query-only operations
- Query + processing workflows
- Full workflows with summarization
- Python 3.9+
- OpenAI API key
- Dependencies listed in requirements.txt
The system consists of:
- LangChain Agent: OpenAI Functions Agent with conversation memory
- Spreadsheet State: In-memory dataframe with operation history
- Tool Suite: Specialized tools for different operations
- CLI Interface: Interactive and batch modes
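The spreadsheet state can be pictured as a thin wrapper around a pandas DataFrame that also backs the `history` and `reset` CLI commands. The class below is a hypothetical sketch, not the project's actual implementation:

```python
# Hypothetical sketch of the in-memory spreadsheet state described above.
from dataclasses import dataclass, field

import pandas as pd

@dataclass
class SpreadsheetState:
    original: pd.DataFrame                             # data as loaded from disk
    current: pd.DataFrame                              # data after processing steps
    history: list[str] = field(default_factory=list)   # operation log ("history" command)

    def apply(self, description: str, df: pd.DataFrame) -> None:
        """Record an operation and update the working dataframe."""
        self.current = df
        self.history.append(description)

    def reset(self) -> None:
        """Restore the original data ("reset" command)."""
        self.current = self.original.copy()
        self.history.clear()
```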
- CSV (`.csv`)
- Excel (`.xlsx`, `.xls`)
- Visualization tool (charts/graphs)
- SQL translation for large datasets
- Multi-spreadsheet joins
- Export tool (Excel/PDF)
- Google Sheets integration
- Role-based access control
Issue: "No spreadsheet loaded" error
Solution: Make sure to load a spreadsheet first using load_spreadsheet
tool
Issue: API key not found
Solution: Set OPENAI_API_KEY
environment variable or use --api-key
flag
Issue: Tool parsing errors Solution: Try rephrasing your request or check the verbose output for details
Contributions welcome! Please ensure:
- Code follows existing style
- Tests pass
- Documentation is updated
MIT License - see LICENSE file for details
For questions or issues, please open a GitHub issue.