___ __ __ _
/ _ | __ __ / /_ ___ / / ___ ___ _ (_)___
/ __ |/ // // __// _ \/ /__/ _ \/ _ `// / __/
/_/ |_|\_,_/ \__/ \___/____/\___/\_, //_/\__/
/___/
█████╗ ██╗ ██╗████████╗ ██████╗ ██╗ ██████╗ ██████╗ ██╗ ██████╗
██╔══██╗██║ ██║╚══██╔══╝██╔═══██╗██║ ██╔═══██╗██╔════╝ ██║██╔════╝
███████║██║ ██║ ██║ ██║ ██║██║ ██║ ██║██║ ███╗██║██║
██╔══██║██║ ██║ ██║ ██║ ██║██║ ██║ ██║██║ ██║██║██║
██║ ██║╚██████╔╝ ██║ ╚██████╔╝███████╗╚██████╔╝╚██████╔╝██║╚██████╗
╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═════╝
Designed by Google Gemini | Written by Anthropic Claude Opus
A U T O L O G I C P I P E L I N E
╔══════════════════════════════════════════════════════════════════════════════════╗
║ ║
║ Sketch / Audio / Text ║
║ | ║
║ v ║
║ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ ║
║ │ │ │ │ │ │ ║
║ │ MODULE 1 │ │ MODULE 2 │ │ MODULE 3 │ ║
║ │ Ingest │ ───> │ Gemini Brain │ ───> │ Deploy │ ║
║ │ │ │ │ │ │ ║
║ │ multi_ingest │ │ core_gen │ │ auto_deploy │ ║
║ │ │ │ │ │ │ ║
║ └─────────────────┘ └─────────────────┘ └─────────────────┘ ║
║ OpenCV + Whisper Gemini 1.5 Pro Firebase Hosting ║
║ Image validation Multi-Agent Plan Package + Deploy ║
║ Audio transcription Code Generation Live URL output ║
║ | ║
║ v ║
║ ╔═══════════╗ ║
║ ║ LIVE APP ║ ║
║ ╚═══════════╝ ║
║ ║
╚══════════════════════════════════════════════════════════════════════════════════╝
| | Feature | Description |
|---|---|---|
| 🎨 | Multimodal Input | Feed it a napkin sketch, a voice memo, or raw text. Hand-drawn wireframes via OpenCV, voice notes via Whisper, plain English descriptions -- all at once or any combination. |
| 🧠 | Gemini 1.5 Pro Planning | The Gemini Brain analyzes your multimodal context, understands layout from sketches, extracts intent from transcriptions, and produces a structured JSON execution plan. |
| 👥 | Multi-Agent Code Generation | Four specialized AI agents -- Boss (orchestrator), Jordan (frontend), Alex (backend), Sam (DevOps) -- execute tasks in dependency order, writing production-quality code. |
| 🚀 | One-Click Firebase Deploy | Generated code is auto-packaged into a public/ directory, Firebase configs are written programmatically, and firebase deploy runs non-interactively. You get a live URL. |
| 📺 | Real-Time WebUI Monitoring | The bundled index.html provides a cyberpunk-styled dashboard with an interactive terminal simulator showing the full pipeline execution flow. |
| 🔧 | Pre-Processing Layer | Image validation and enhancement via OpenCV, audio denoising and transcription via OpenAI Whisper, base64 encoding for Gemini's multimodal API -- all handled before the brain even fires. |
| 🧩 | Agent Marketplace | Drop-in _team/ templates let you define custom agent personas, swap specialists, or add new roles. The architecture is built for extensibility. |
git clone https://github.com/q15432123/AutoLogic.git
cd AutoLogic
pip install -r requirements.txt
cp .env.example .env
Edit .env with your keys:
GEMINI_API_KEY=your-gemini-api-key-here
FIREBASE_PROJECT_ID=your-firebase-project-id
Get a Gemini API key at aistudio.google.com/apikey
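A run that starts with placeholder keys will only fail later, at the Gemini call or the deploy step. The following is a minimal sketch of an early check, not code from the repository: `missing_keys` is a hypothetical helper, and treating values that start with `your-` as placeholders is an assumption based on the template values shown above. It reads from `os.environ`, so it assumes the .env file has already been loaded (for example with python-dotenv's `load_dotenv()`).

```python
import os

# Key names come from the .env template above; the placeholder rule is an assumption.
REQUIRED_KEYS = ("GEMINI_API_KEY", "FIREBASE_PROJECT_ID")


def missing_keys(environ=None):
    """Return the required keys that are unset, empty, or still placeholders."""
    env = os.environ if environ is None else environ
    missing = []
    for key in REQUIRED_KEYS:
        value = env.get(key, "").strip()
        # Template values look like "your-gemini-api-key-here".
        if not value or value.startswith("your-"):
            missing.append(key)
    return missing


if __name__ == "__main__":
    problems = missing_keys()
    if problems:
        print("Missing or placeholder values:", ", ".join(problems))
```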
# Interactive mode (prompts you for input)
python run_main.py
# Direct text input
python run_main.py --text "Build a todo app with dark mode"
# Full multimodal
python run_main.py --image sketch.png --audio notes.mp3 --text "portfolio site"
# Generate code only, skip deployment
python run_main.py --text "landing page" --skip-deploy
AutoLogic's multi-agent architecture is driven by Agent DNA templates -- structured role definitions that tell each agent who they are, what they own, and how they write code.
_team/
boss.yaml # Orchestrator: decomposes goals, assigns tasks, reviews output
jordan.yaml # Frontend: HTML5, CSS3, JS -- pixel-perfect UI from sketches
alex.yaml # Backend: Flask, Express, APIs -- server logic & data models
sam.yaml # DevOps: Firebase, configs, README -- deployment pipeline
custom/ # YOUR agents go here
designer.yaml # Example: a Figma-to-code specialist
qa.yaml # Example: automated testing agent
Each template defines:
| Field | Purpose |
|---|---|
| role | Agent's identity and expertise area |
| owns | File paths and directories the agent is responsible for |
| style | Coding conventions, frameworks, and patterns to follow |
| depends_on | Which agents must complete before this one runs |
| tools | Available actions: write_file, run_command, etc. |
Want a new specialist? Drop a YAML file in _team/custom/ and the planning engine picks it up automatically. Swap Jordan for a React specialist. Add a QA agent that writes tests. The roster is yours to define.
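The depends_on field implies a dependency-ordered run. One way to derive such an order is a topological sort, sketched here with the standard library's `graphlib`. This illustrates the idea and is not necessarily how core_gen.py does it. The `DEPENDS_ON` values are hypothetical (the real ones live in the YAML templates), and `execution_order` is an illustrative name.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical depends_on values; the real ones live in the _team/ YAML files.
DEPENDS_ON = {
    "boss": [],
    "jordan": ["boss"],
    "alex": ["boss"],
    "sam": ["jordan", "alex"],
}


def execution_order(depends_on):
    """Return agent names so that each agent follows everything it depends on."""
    # TopologicalSorter takes {node: predecessors} and raises CycleError on cycles.
    return list(TopologicalSorter(depends_on).static_order())


order = execution_order(DEPENDS_ON)
print(order)
```

Adding a custom agent then amounts to one more entry, for example `execution_order({**DEPENDS_ON, "qa": ["jordan"]})`. A template that depends on itself, directly or through other agents, surfaces as a `graphlib.CycleError` rather than a hung run.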
AutoLogic is a three-module sequential pipeline. Each module is a standalone Python file that can be tested independently.
Input Sources Processing Output
─────────────────────────────────────────────────────────────────────────
[Image] .png/.jpg ──> OpenCV validate + base64 encode ──┐
[Audio] .mp3/.wav ──> Whisper transcribe to text ──┤──> Context Dict
[Text] string ──> Pass through ──┘
- Validates image dimensions and readability via OpenCV
- Transcribes audio using OpenAI Whisper (base model, ~140MB)
- Encodes images to base64 for Gemini's multimodal input
- Returns a unified context dictionary with all processed inputs
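The last two bullets can be sketched with the standard library alone. This is a hedged illustration, not the contents of multi_ingest.py: `build_context` and the key names `image_b64`, `transcript`, and `text` are hypothetical, and the OpenCV validation and Whisper transcription steps are assumed to have run already.

```python
import base64


def build_context(image_bytes=None, transcript=None, text=None):
    """Combine already-processed inputs into one dictionary."""
    context = {}
    if image_bytes is not None:
        # Base64 turns raw image bytes into ASCII text that fits in a JSON payload.
        context["image_b64"] = base64.b64encode(image_bytes).decode("ascii")
    if transcript:
        context["transcript"] = transcript
    if text:
        context["text"] = text
    if not context:
        raise ValueError("provide at least one of image, audio, or text")
    return context
```

Only the keys for supplied inputs are present, so downstream code can test `"image_b64" in context` instead of comparing against empty values.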
Context Dict ──> Gemini 1.5 Pro ──> JSON Task Plan ──> Agent Loop ──> Files on Disk
| |
| "Analyze sketch + text, | Boss -> Jordan -> Alex -> Sam
| decompose into agent tasks" | (dependency-ordered execution)
| |
v v
Structured Plan _workspaces/
with dependencies frontend/
backend/
firebase.json
- Sends multimodal context (text + optional image) to Gemini 1.5 Pro
- Gemini returns a JSON task plan with agent assignments and dependencies
- Each agent receives its task as a prompt and generates complete file contents
- All files are written to _workspaces/ with proper directory structure
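Writing model-generated paths to disk deserves one guard: a path such as `../x` should not land outside the workspace. The sketch below shows that final step under stated assumptions. `write_files` is a hypothetical helper, and the `{relative_path: content}` shape is an assumption about what each agent returns, not the format used by core_gen.py.

```python
from pathlib import Path


def write_files(workspace, files):
    """Write {relative_path: content} under workspace and return the paths written."""
    root = Path(workspace).resolve()
    written = []
    for rel_path, content in files.items():
        target = (root / rel_path).resolve()
        # Refuse anything that does not resolve to a location inside the workspace.
        if root not in target.parents:
            raise ValueError(f"path escapes workspace: {rel_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written
```

A call such as `write_files("_workspaces", {"frontend/index.html": "<h1>Hello</h1>"})` creates the frontend/ directory on demand.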
_workspaces/ ──> Package to public/ ──> firebase.json + .firebaserc ──> firebase deploy ──> LIVE URL
- Scans workspace for deployable web assets (HTML, CSS, JS, images)
- Copies everything into a public/ directory for Firebase Hosting
- Programmatically generates firebase.json and .firebaserc (no interactive firebase init)
- Runs firebase deploy --non-interactive and extracts the hosting URL
- Generates a deployment report in DEPLOY_REPORT.md
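The config-generation and URL-extraction steps above might look roughly like the following. This is a sketch under assumptions, not the contents of auto_deploy.py: the function names are hypothetical, the `--only hosting` and `--project` flags are additions beyond the command named above, and the URL pattern assumes the CLI prints a line beginning with `Hosting URL:`.

```python
import json
import re
from pathlib import Path


def write_firebase_config(project_dir, project_id, public_dir="public"):
    """Write firebase.json and .firebaserc, the files firebase init would create."""
    root = Path(project_dir)
    hosting = {
        "hosting": {
            "public": public_dir,
            "ignore": ["firebase.json", "**/.*", "**/node_modules/**"],
        }
    }
    (root / "firebase.json").write_text(json.dumps(hosting, indent=2), encoding="utf-8")
    rc = {"projects": {"default": project_id}}
    (root / ".firebaserc").write_text(json.dumps(rc, indent=2), encoding="utf-8")


def deploy_command(project_id):
    """Build the CLI call as an argument list, suitable for subprocess.run."""
    return ["firebase", "deploy", "--only", "hosting",
            "--project", project_id, "--non-interactive"]


def extract_hosting_url(cli_output):
    """Return the URL from a 'Hosting URL:' line, or None if there is none."""
    match = re.search(r"Hosting URL:\s*(https://\S+)", cli_output)
    return match.group(1) if match else None
```

Passing the command as a list to `subprocess.run(..., capture_output=True, text=True)` avoids shell quoting issues, and the captured stdout can be fed to `extract_hosting_url`.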
| | Traditional Workflow | CodeSnap / Copilot | AutoLogic |
|---|---|---|---|
| Input | Text prompts only | Screenshot of code | Sketch + Voice + Text (true multimodal) |
| Planning | You plan everything | Suggests next line | Gemini decomposes the entire project |
| Execution | You write every file | Autocomplete fragments | 4 agents write all files in dependency order |
| Output | Code on your machine | Code on your machine | Live deployed app with URL |
| Deployment | Manual Firebase setup | Not included | Fully automated, zero-config |
| Time | Hours to days | Faster typing | Sketch to live site in under 60 seconds |
╔══════════════════════════════════════════════════════════════════════════╗
║ ║
║ CodeSnap helps you show code. ║
║ ║
║ AutoLogic helps you SHUT UP AND BUILD. ║
║ ║
╚══════════════════════════════════════════════════════════════════════════╝
AutoLogic/
├── run_main.py # Entry point -- chains all 3 modules
├── multi_ingest.py # Module 1: Multimodal ingestion engine
├── core_gen.py # Module 2: Gemini planning + agent code gen
├── auto_deploy.py # Module 3: Firebase packaging & deployment
├── index.html # Cyberpunk WebUI dashboard
├── requirements.txt # Python dependencies
├── .env.example # Environment variable template
├── .gitignore
└── _workspaces/ # Generated output (gitignored)
opencv-python >= 4.8.0
openai-whisper >= 20231117
google-generativeai >= 0.8.0
python-dotenv >= 1.0.0
Pillow >= 10.0.0
Plus: Firebase CLI (npm install -g firebase-tools) for deployment.
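Because the Firebase CLI is installed through npm rather than pip, it is easy to miss until the deploy step. Below is a hedged preflight sketch using only the standard library. `missing_modules` and `has_firebase_cli` are hypothetical helpers, and the import names are the usual ones for the packages listed above, which differ from their pip names.

```python
import importlib.util
import shutil

# Import names for the listed packages (opencv-python -> cv2, Pillow -> PIL, ...).
IMPORT_NAMES = ["cv2", "whisper", "google.generativeai", "dotenv", "PIL"]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # ImportError covers a missing parent package such as "google".
            found = False
        if not found:
            missing.append(name)
    return missing


def has_firebase_cli():
    """Return True if a firebase executable is on PATH."""
    return shutil.which("firebase") is not None
```

This only confirms that the modules can be found; it does not check the minimum versions listed above.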
MIT License
Copyright (c) 2025 AutoLogic
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
┌─────────────────────────────────────────────────────────────┐
│ │
│ Architecture & Conceptual Design ... Google Gemini │
│ Implementation & Documentation ..... Anthropic Claude │
│ │
│ Two AIs. One pipeline. Zero excuses. │
│ │
└─────────────────────────────────────────────────────────────┘
Built with the combined intelligence of Google Gemini and Anthropic Claude
From whiteboard to production. No humans were mass-deployed in the making of this pipeline.