Enterprise AI-Powered DataOps Platform with Multi-Model Routing
An intelligent data operations platform that automatically profiles, cleans, and analyzes data using Claude Sonnet 4.5, GPT-5 mini, Gemini 2.0 Flash, and Azure GPT-4o-mini with smart model routing.
LIVE DEMO | Documentation | Portfolio
- Claude Sonnet 4.5 for complex reasoning and data analysis
- GPT-5 mini for fast structured outputs (latest OpenAI!)
- Gemini 2.0 Flash for vision and multimodal tasks (FREE!)
- Azure GPT-4o-mini for enterprise compliance
- Automatic fallback routing with LiteLLM
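The fallback behavior can be sketched independently of any particular SDK — in the sketch below, `call_model` is a hypothetical stand-in for a provider call (LiteLLM's `completion()` plays that role in the real backend):

```python
# Minimal sketch of fallback routing: try models in priority order and
# fall back to the next one when a provider call raises.
PRIORITY = ["claude-sonnet-4.5", "gpt-5-mini", "gemini-2.0-flash", "azure-gpt-4o-mini"]

def route_with_fallback(prompt, call_model, priority=PRIORITY):
    errors = {}
    for model in priority:
        try:
            # call_model is a placeholder for the real provider SDK call
            return model, call_model(model, prompt)
        except Exception as exc:  # outage, rate limit, auth failure, etc.
            errors[model] = exc
    raise RuntimeError(f"All models failed: {list(errors)}")
```

In practice LiteLLM handles this ordering via its own fallback configuration; the sketch only shows the shape of the logic.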
- Auto Data Profiling: Upload CSV/Excel/JSON → instant quality analysis
- LLM-Powered Insights: AI explains your data and suggests improvements
- Smart SQL Generation: Natural language → production SQL (coming soon)
- Dashboard Vision: Upload screenshots → extract metrics (coming soon)
- Conversational BI: Ask questions about your data (coming soon)
- Frontend: Next.js 14 (TypeScript, Tailwind CSS, React Query)
- Backend: FastAPI (Python 3.11+)
- Database: PostgreSQL with DuckDB for analytics
- Deployment: Docker + Cloud Run ready
- Cost: ~$15-30/month during active use
- Docker & Docker Compose
- Python 3.11+
- Node.js 18+
- API Keys (at least one):
- Anthropic (Claude)
- OpenAI (GPT-5 mini)
- Google AI (Gemini) - Recommended for free tier!
```bash
git clone <your-repo-url>
cd dataops-copilot

# Copy example env file
cp backend/.env.example backend/.env

# Edit with your API keys
nano backend/.env
```

Minimum required in `.env`:

```
ANTHROPIC_API_KEY=sk-ant-xxxxx
OPENAI_API_KEY=sk-xxxxx
GOOGLE_API_KEY=xxxxx
```

```bash
# Start all services (backend, postgres, redis)
docker-compose up -d

# View logs
docker-compose logs -f backend
```

The backend will be available at: http://localhost:8000

```bash
cd frontend
npm install
npm run dev
```

The frontend will be available at: http://localhost:3000
1. Navigate to http://localhost:3000
2. Click "Launch App"
3. Upload a CSV, Excel, or JSON file
4. Toggle "Use AI insights" (recommended)
5. Click "Analyze File"
6. View comprehensive profiling results with:
   - Basic statistics (rows, columns, nulls)
   - Column-level analysis
   - Data quality issues
   - AI-generated insights and recommendations
Use the included `sample_data/sales_data.csv` for testing.
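The non-AI half of the profiling step boils down to counting rows, columns, and nulls. A stdlib-only sketch of that idea (the real service uses Pandas/Polars, and `profile_csv` here is an illustrative helper, not the platform's API):

```python
import csv
import io

def profile_csv(text: str) -> dict:
    """Return basic profile stats: row count, column names, nulls per column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    # Treat empty strings as nulls, as a CSV reader sees them
    nulls = {c: sum(1 for r in rows if not r[c]) for c in columns}
    return {"rows": len(rows), "columns": columns, "nulls": nulls}

sample = "region,amount\nwest,100\neast,\n"
profile_csv(sample)
# → {'rows': 2, 'columns': ['region', 'amount'], 'nulls': {'region': 0, 'amount': 1}}
```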
```
dataops-copilot/
├── backend/                  # FastAPI Backend
│   ├── app/
│   │   ├── main.py           # FastAPI app entry
│   │   ├── core/
│   │   │   └── config.py     # Configuration
│   │   ├── routers/          # API endpoints
│   │   │   ├── data.py       # Data upload & profiling
│   │   │   └── health.py     # Health checks
│   │   ├── services/         # Business logic
│   │   │   ├── llm_router.py     # Multi-model routing (LiteLLM)
│   │   │   └── data_profiler.py  # Data analysis
│   │   └── models/           # Pydantic schemas
│   ├── requirements.txt      # Python dependencies
│   └── Dockerfile
│
├── frontend/                 # Next.js Frontend
│   ├── app/                  # Next.js 14 App Router
│   │   ├── page.tsx          # Landing page
│   │   ├── dashboard/        # Main app
│   │   └── layout.tsx
│   ├── components/
│   │   └── features/         # Feature components
│   ├── lib/
│   │   └── api.ts            # API client
│   └── package.json
│
└── docker-compose.yml        # Local development
```
Run without Docker:
```bash
cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the server
uvicorn app.main:app --reload --port 8000
```

API Documentation:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
```bash
cd frontend

# Install dependencies
npm install

# Run dev server
npm run dev

# Build for production
npm run build
npm start
```

```bash
# Backend tests
cd backend
pytest

# Frontend tests
cd frontend
npm test
```

- FastAPI - Modern async Python web framework
- LiteLLM - Unified API for multiple LLM providers
- Pandas/Polars - Data manipulation
- DuckDB - In-memory SQL analytics
- SQLAlchemy - ORM
- Redis - Caching and task queue
- Pydantic - Data validation
- Next.js 14 - React framework with App Router
- TypeScript - Type safety
- Tailwind CSS - Utility-first CSS
- React Query - Server state management
- Axios - HTTP client
- Lucide React - Icon library
- Claude Sonnet 4.5 - Complex reasoning ($3/$15 per 1M tokens)
- GPT-5 mini - Latest OpenAI model ($0.15/$0.60 per 1M tokens - estimated)
- Gemini 2.0 Flash - FREE during preview! ($0/$0 per 1M tokens)
- Azure GPT-4o-mini - Enterprise option ($0.165/$0.66 per 1M tokens)
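The per-token prices above make cost estimates a one-line calculation. A small helper, using the (partly estimated) prices listed here:

```python
# Cost per 1M tokens (input, output) in USD, as listed above.
# GPT-5 mini pricing is an estimate; Gemini 2.0 Flash is free during preview.
PRICES = {
    "claude-sonnet-4.5": (3.00, 15.00),
    "gpt-5-mini": (0.15, 0.60),
    "gemini-2.0-flash": (0.00, 0.00),
    "azure-gpt-4o-mini": (0.165, 0.66),
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for a month's token usage on one model."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# e.g. 2M input + 0.5M output tokens on Claude:
monthly_cost("claude-sonnet-4.5", 2_000_000, 500_000)  # → 13.5
```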
Backend (Railway):

```bash
# Install Railway CLI
npm install -g @railway/cli

# Login
railway login

# Deploy
railway up
```

Frontend (Vercel):
```bash
# Install Vercel CLI
npm install -g vercel

# Deploy
cd frontend
vercel --prod
```

Backend (Railway):

```
ANTHROPIC_API_KEY=sk-ant-xxxxx
OPENAI_API_KEY=sk-xxxxx
GOOGLE_API_KEY=xxxxx
DATABASE_URL=postgresql://...
REDIS_URL=redis://...
DEBUG=False
```
Frontend (Vercel):

```
NEXT_PUBLIC_API_URL=https://your-railway-backend.up.railway.app
```
- LLM APIs: ~$1-3/month (Gemini 2.0 Flash is FREE!)
- Hosting: $0 (free tiers)
- Total: ~$1-3/month
- LLM APIs: ~$5-10/month (mostly using free/cheap models)
- Railway: $5/month (500 hrs)
- Database: $0 (Supabase free tier)
- Total: ~$10-15/month
Cost Optimization:
- Use Gemini 2.0 Flash for most tasks (FREE!)
- Use GPT-5 mini for fast structured outputs (latest OpenAI!)
- Reserve Claude for complex reasoning only
- Enable prompt caching
- Use DuckDB for in-memory analytics (free)
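The routing rules above can be expressed as a simple task-type → model policy. A sketch — the task categories here are illustrative, not the platform's actual labels:

```python
# Task-type to model policy mirroring the cost-optimization rules above.
POLICY = {
    "vision": "gemini-2.0-flash",       # free multimodal model
    "structured_output": "gpt-5-mini",  # fast, cheap structured outputs
    "complex_reasoning": "claude-sonnet-4.5",
}

def pick_model(task_type: str) -> str:
    # Default to the free model for anything unclassified.
    return POLICY.get(task_type, "gemini-2.0-flash")
```

Keeping the policy in plain data like this makes it easy to tune as provider pricing changes.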
- Multi-model routing setup
- Data profiling with LLM insights
- Next.js frontend with file upload
- Docker development environment
- Natural language to SQL generation
- Query execution with DuckDB
- Interactive chart generation
- Dashboard screenshot OCR (Gemini)
- Metric extraction
- Auto-dashboard generation
- Data cleaning workflows
- Schema mapping
- Export to Power BI/Tableau
- User authentication
- Database integration (Snowflake, BigQuery)
Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
MIT License - feel free to use this for your portfolio or commercial projects!
William Kim - AI Engineer
- LinkedIn
- GitHub
- williamcjk11@gmail.com
If this helped you land interviews, give it a ⭐!
Note: This is a portfolio project demonstrating full-stack AI engineering skills. For production use, add proper authentication, rate limiting, and monitoring.