Skip to content

rows/rows_vision

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

9 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ“Š rows_vision

rows_vision is an open-source API service that extracts structured data from visual content like charts, receipts, and screenshots using vision-based classifiers and LLMs. It's built for fast local deployment, and works entirely in memory β€” no cloud storage required.

Supported types:

  • 1: Line chart (single line)
  • 2: Line chart (multiple lines)
  • 3: Bar/column chart
  • 4: Scatter plot
  • 5: Pie or doughnut chart
  • 6: Table
  • 7: Receipt/Invoice
  • 8: Other (e.g., infographic with extractable data)


πŸš€ Features

  • Multi-Model AI Support: Choose from Anthropic Claude, OpenAI GPT-4, Google Gemini, or Groq models
  • Chart Analysis: Extract data from line charts, bar charts, scatter plots, pie charts
  • Table & Receipt Processing: Parse structured data from tables and receipts
  • Flexible Input: Process images from URLs or local files
  • In-Memory Processing: No cloud storage required - everything runs locally
  • Docker Ready: Easy deployment with Docker containers
  • Production Ready: Built-in health checks, logging, and error handling
  • Performance Metrics: Optional timing information for monitoring

🧠 Example Use Case

Upload the URL of chart screenshot and receive a structured JSON like:

{
    "result": [
        {
            "X": 130,
            "Y": "2000"
        },
        {
            "X": 90,
            "Y": "2005"
        },
        {
            "X": 30,
            "Y": "2010"
        },
        {
            "X": 25,
            "Y": "2015"
        },
        {
            "X": 60,
            "Y": "2020"
        },
        {
            "X": 110,
            "Y": "2025"
        }
    ]
}

πŸš€ Quick Start

🐳 Docker Deployment (Recommended)

# 1. Clone and setup
git clone https://github.com/rows/rows_vision.git
cd rows_vision

# 2. Run setup script
chmod +x setup.sh
./setup.sh

# 3. Add your API keys to .env
nano .env  # Add at least one API key

# 4. Build and run with Docker
docker build -t rows-vision .
docker run -d --name rows-vision-api -p 8080:8080 --env-file .env rows-vision

# 5. Test the API (wait 30 seconds for startup)
sleep 30
curl http://localhost:8080/health

🐍 Local Python Development

Linux/macOS:

# 1. Clone and setup
git clone https://github.com/rows/rows_vision.git
cd rows_vision
chmod +x setup.sh && ./setup.sh

# 2. Create virtual environment
python3.11 -m venv venv
source venv/bin/activate

# 3. Install dependencies and run
pip install -r requirements.txt
nano .env  # Add API keys
python main.py

Windows (PowerShell):

# 1. Clone repository
git clone https://github.com/rows/rows_vision.git
cd rows_vision

# 2. Copy environment template
Copy-Item ".env.example" ".env"

# 3. Edit .env file with your API keys
notepad .env

# 4. Create and activate virtual environment
python -m venv venv
venv\Scripts\Activate.ps1

# 5. Install dependencies and run
pip install -r requirements.txt
python main.py

Windows (Git Bash - Alternative):

# If you have Git Bash installed, you can use the Linux/macOS commands:
chmod +x setup.sh
./setup.sh
# Then follow the Linux/macOS steps above
Why Docker? Docker Local Python
Setup Time 5 minutes 10-15 minutes
Dependencies Automatic Manual
Consistency Same everywhere "Works on my machine"
Production Ready Yes Needs additional setup

βš™οΈ Configuration

Environment Variables

Create a .env file with your API credentials:

# Required: At least one AI API key
API_KEY_ANTHROPIC=sk-ant-your-key-here
API_KEY_OPENAI=sk-your-key-here
API_KEY_GEMINI=AIzaSy-your-key-here
API_KEY_GROQ=gsk_your-key-here

# Optional: Model Configuration
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
OPENAI_MODEL=gpt-4o
GEMINI_MODEL=gemini-2.0-flash
GROQ_MODEL=meta-llama/llama-4-scout-17b-16e-instruct

# Optional: Server Settings
HOST=0.0.0.0
PORT=8080
DEBUG=false
LOG_LEVEL=INFO
MAX_FILE_SIZE=10485760  # 10MB

Supported AI Models

Model Classification Extraction Notes
anthropic βœ… βœ… Claude Sonnet, high accuracy
openai βœ… βœ… GPT-4o, good performance
google βœ… βœ… Gemini Flash, fast processing
groq βœ… βœ… Llama, cost-effective

πŸ”Œ API Endpoints

POST /api/run

Process an image from a URL.

Request:

curl -X POST 'http://localhost:8080/api/run' \
--header 'Content-Type: application/json' \
--data '{
  "image_url": "https://pbs.twimg.com/media/GoCeF4wbwAE24ln?format=jpg&name=large",
  "model_classification": "anthropic",
  "model_extraction": "anthropic",
  "time_outputs": true
}'

Python Example:

import requests

url = "http://localhost:8080/api/run"
payload = {
    "image_url": "https://pbs.twimg.com/media/GoCeF4wbwAE24ln?format=jpg&name=large",
    "model_classification": "anthropic",
    "model_extraction": "anthropic",
    "time_outputs": True
}

response = requests.post(url, json=payload)
print(response.json())

Response:

{
  "result": [
    {"Month": "January", "Sales": 1000, "Profit": 200}
  ],
  "metrics": {
    "total_time": 2.345
  }
}

POST /api/run-file

Process an image from URL or local file path.

{
  "image_url": "https://example.com/chart.png",
  // OR
  "file_path": "/path/to/local/image.jpg",
  "model_classification": "anthropic",
  "model_extraction": "anthropic"
}

πŸš€ Production Deployment

Docker Compose (Recommended)

Create docker-compose.yml:

version: '3.8'
services:
  rows-vision:
    build: .
    container_name: rows-vision-api
    ports:
      - "8080:8080"
    env_file:
      - .env
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
docker-compose up -d

Cloud Deployment

Google Cloud Run:

gcloud run deploy rows-vision --source . --platform managed --allow-unauthenticated

AWS ECS / Digital Ocean / Others: Use the Docker image built above with your preferred container orchestration platform.

Traditional Deployment

# Using Gunicorn (production WSGI server)
pip install gunicorn
gunicorn --bind 0.0.0.0:8080 --workers 4 --timeout 120 main:app

πŸ” Monitoring & Health

# Health check
curl http://localhost:8080/health

# Docker container status
docker ps
docker logs rows-vision-api --tail 50 -f

# Resource monitoring
docker stats rows-vision-api

πŸ— Technical Details

Supported Formats: PNG, JPG, JPEG, GIF, WEBP, HEIC
Chart Types: Line, Bar, Scatter, Pie, Tables, Receipts
Processing: In-memory, no file storage required
Architecture: Flask API + AI model backends


πŸ“ Project Structure

rows_vision/
β”œβ”€β”€ src/                     # Source code
β”‚   β”œβ”€β”€ main.py             # Flask application
β”‚   β”œβ”€β”€ config.py           # Configuration
β”‚   β”œβ”€β”€ image_analyzer.py   # Data extraction
β”‚   β”œβ”€β”€ image_classifier.py # Image classification
β”‚   └── rows_vision.py      # Main orchestrator
β”œβ”€β”€ prompts/                # AI prompt templates
β”œβ”€β”€ main.py                 # Application entry point
β”œβ”€β”€ requirements.txt        # Dependencies
β”œβ”€β”€ Dockerfile             # Container definition
β”œβ”€β”€ setup.sh               # Automated setup script
└── .env.example          # Environment template

🚧 To-Do

  • Support user prompt for finer operations
  • Improve error handling βœ… Done
  • Docker deployment βœ… Done
  • Production-ready logging βœ… Done
  • Support for batch processing
  • PDF processing improvements

πŸ› Troubleshooting

Missing API Keys:

# Check if keys are loaded
docker run --env-file .env rows-vision python -c "import os; print('Keys loaded:', bool(os.getenv('API_KEY_ANTHROPIC')))"

Container Issues:

# Check logs
docker logs rows-vision-api

# Debug mode
docker run -it --env-file .env -e DEBUG=true rows-vision

Port Conflicts:

# Use different port
docker run -d -p 8081:8080 --env-file .env rows-vision

πŸ“„ License

This project is licensed under the MIT License.


πŸ™Œ Contributions

PRs and issues are welcome. Please fork the repo and submit changes via pull request.


πŸ“£ Maintainer

Created by @asamagaio at Rows.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published