Proactive Governance for the Agentic Cloud
Traditional FinOps is reactive—you find out about a $5,000 bill 48 hours too late. In the era of autonomous agents, costs can spiral in seconds.
Budget-Aware AI Squad is a decentralized framework that integrates financial self-awareness into AI agent meshes. It acts as a "Fiscal Guardrail," ensuring that autonomous systems stay within budget while maintaining high task performance.
| Feature | Description |
|---|---|
| 🛡️ Agentic Circuit Breakers | Automatically halts recursive agent "chatter" before budgets are exceeded |
| ⚖️ Dynamic Model Routing | Intelligently switches between local SLMs (via Ollama) and frontier LLMs based on task complexity and remaining funds |
| 📈 Real-time Telemetry | Dashboard for unit-cost-per-task tracking (UCST), shifting from infrastructure monitoring to agentic monitoring |
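The Dynamic Model Routing idea above can be sketched as a simple threshold rule: stay on the local SLM by default and escalate to a paid frontier model only when the task is complex and the budget allows it. The complexity score, thresholds, and model names below are illustrative assumptions, not the framework's actual API:

```python
# Sketch of Dynamic Model Routing: prefer a local SLM served by Ollama,
# escalating to a frontier LLM only when the task is complex AND enough
# budget remains. Thresholds and model names are assumptions.
LOCAL_MODEL = "llama3.1"          # served locally by Ollama
FRONTIER_MODEL = "frontier-llm"   # placeholder for a paid API model

def route_model(task_complexity: float, remaining_budget: float,
                complexity_threshold: float = 0.7,
                min_budget_for_frontier: float = 1.00) -> str:
    """Pick a model: local by default, frontier only for hard, funded tasks."""
    if (task_complexity >= complexity_threshold
            and remaining_budget >= min_budget_for_frontier):
        return FRONTIER_MODEL
    return LOCAL_MODEL

print(route_model(0.9, 5.00))   # frontier-llm (hard task, funds available)
print(route_model(0.9, 0.10))   # llama3.1    (budget too low)
print(route_model(0.2, 5.00))   # llama3.1    (task is simple)
```

Both conditions must hold before spending: a hard task with an exhausted budget still runs locally, which is what keeps routing "budget-aware" rather than purely quality-driven.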
┌─────────────────────────────────────────────────────────────┐
│ SUPERVISOR AGENT │
│ (Orchestrator & Handover Manager) │
└──────────────────┬──────────────────────┬───────────────────┘
│ │
▼ ▼
┌──────────────────────────┐ ┌──────────────────────────────┐
│ ACCOUNTANT AGENT │◄───│ RESEARCHER AGENT │
│ (Financial Gatekeeper) │ │ (Cloud Worker / Boto3) │
│ │ │ │
│ • Circuit Breaker │ │ • Interacts with LocalStack│
│ • Budget Validation │ │ • BLOCKED until approved │
│ • Spend Forecasting │ │ │
└──────────────────────────┘ └──────────────────────────────┘
│ │
│ ▼
│ ┌──────────────────────────────┐
│ │ WRITER AGENT │
│ │ (Document Polisher) │
│ │ │
│ │ • Executive summaries │
│ │ • Professional formatting │
│ └──────────────────────────────┘
│
▼
┌───────────────┐
│ LLM BRAIN │ ◄─── brain.py
│ (Ollama) │
└───────────────┘
- Supervisor Agent: Orchestrates the workflow and handles handovers between agents
- Accountant Agent: The "Financial Gatekeeper" — implements Agentic Circuit Breakers. No task can execute unless the Accountant validates the forecasted spend against remaining budget
- Researcher Agent: The "Cloud Worker" — analyzes topics and generates technical summaries using Boto3 and LLM Brain
- Writer Agent: The "Polisher" — transforms raw research into executive-ready documents
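The Accountant's gatekeeping role can be sketched as a pre-execution check: reserve the forecasted spend against the remaining budget, and refuse the task if it would overrun. Class and method names here are illustrative, not the project's actual interface:

```python
# Sketch of an Agentic Circuit Breaker: no task executes unless its
# forecasted spend fits the remaining budget. Names and numbers are
# illustrative assumptions.
class BudgetExceededError(Exception):
    pass

class Accountant:
    def __init__(self, budget_usd: float):
        self.remaining = budget_usd

    def approve(self, forecast_usd: float) -> None:
        """Trip the breaker BEFORE execution if the forecast overruns the budget."""
        if forecast_usd > self.remaining:
            raise BudgetExceededError(
                f"forecast ${forecast_usd:.3f} exceeds remaining ${self.remaining:.3f}"
            )
        self.remaining -= forecast_usd  # reserve the spend up front

accountant = Accountant(budget_usd=0.05)
accountant.approve(0.029)         # first pipeline run fits: approved
try:
    accountant.approve(0.029)     # a second run would overrun: breaker trips
except BudgetExceededError as e:
    print("BLOCKED:", e)
```

Reserving the forecast up front (rather than reconciling after the fact) is what makes the governance proactive: a runaway agent loop is stopped at the approval step, not on the next billing cycle.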
| Component | Technology | Endpoint / Platform |
|---|---|---|
| Local LLM | Ollama (Llama 3.1) | http://localhost:11434 |
| Cloud Simulation | LocalStack | http://localhost:4566 |
| Language | Python 3.14 | Windows/Linux/macOS |
Autonomous-Cloud-Governance/
├── brain.py # Central LLM interface ("Voice Box" for agents)
├── bridge.py # Phase 1: Digital Office milestone
├── researcher.py # Researcher Agent - Cloud analysis & summaries
├── writer.py # Writer Agent - Executive document generation
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore rules
└── README.md # This file
Central LLM interface for all agents. Features:
- LLMBrain class for structured LLM interactions
- ask_llama() convenience function for quick queries
- Cost simulation: tracks token usage at $0.015/1k tokens
- Fiscal ledger for budget-aware operations
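The cost simulation described above can be sketched from the two numbers the README gives: roughly one token per four characters, billed into a fiscal ledger at a simulated $0.015 per 1,000 tokens. The class and method names below are illustrative assumptions about brain.py's shape, not its actual code:

```python
# Sketch of brain.py's cost simulation: ~1 token per 4 characters,
# billed at a simulated $0.015 per 1,000 tokens into a fiscal ledger.
RATE_PER_1K_TOKENS = 0.015

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)   # rough heuristic from the README

class FiscalLedger:
    def __init__(self):
        self.tokens = 0
        self.spend = 0.0

    def record(self, prompt: str, response: str) -> float:
        """Log one LLM call's simulated cost and return it."""
        t = estimate_tokens(prompt) + estimate_tokens(response)
        cost = t / 1000 * RATE_PER_1K_TOKENS
        self.tokens += t
        self.spend += cost
        return cost

ledger = FiscalLedger()
ledger.record("p" * 2000, "r" * 2000)   # 500 + 500 estimated tokens
print(round(ledger.spend, 3))           # 0.015
```

Because Ollama runs locally, these dollar figures are simulated rather than billed; the point of the ledger is to exercise the budget-governance path before any real spend exists.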
Demonstrates the foundational AI-to-cloud connection:
- Generates content via local LLM (Ollama)
- Stores artifacts in simulated S3 (LocalStack)
- Zero cloud cost proof of concept
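The AI-to-cloud connection works by pointing boto3 at LocalStack's endpoint instead of real AWS. A minimal sketch, assuming boto3 is installed and LocalStack is running on its default port; the bucket name and function names are illustrative:

```python
# Sketch of the bridge.py idea: store locally generated content in
# LocalStack-simulated S3 at zero cloud cost. Requires boto3 and a
# running LocalStack; credentials are dummies LocalStack accepts.
LOCALSTACK_ENDPOINT = "http://localhost:4566"

def s3_client_kwargs(endpoint: str = LOCALSTACK_ENDPOINT) -> dict:
    """Connection settings that point boto3 at LocalStack instead of AWS."""
    return {
        "endpoint_url": endpoint,
        "aws_access_key_id": "test",
        "aws_secret_access_key": "test",
        "region_name": "us-east-1",
    }

def store_artifact(text: str, bucket: str, key: str) -> None:
    import boto3  # deferred so the sketch can be read without boto3 installed
    s3 = boto3.client("s3", **s3_client_kwargs())
    s3.create_bucket(Bucket=bucket)
    s3.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))
```

Swapping `endpoint_url` (and real credentials) is the only change needed to retarget the same code at actual AWS, which is what makes the $0.00 proof of concept transferable.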
Cloud analysis specialist:
- Reads research topics from S3 (research_topic.txt)
- Generates 3-point technical summaries via LLM Brain
- Saves research notes to S3 (research_notes.txt)
- Full cost tracking per session
Document transformation specialist:
- Reads raw research notes from S3
- Transforms into polished executive summaries
- Saves reports to S3 (reports/executive_summary.txt)
- Professional formatting with C-level readability
- Python 3.14+ installed
- Ollama installed and running
- LocalStack installed and running
# Clone the repository
git clone https://github.com/SiD-array/Autonomous-Cloud-Governance.git
cd Autonomous-Cloud-Governance
# Create virtual environment
python -m venv venv
# Activate virtual environment
# Windows:
.\venv\Scripts\Activate.ps1
# Linux/macOS:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt

# Terminal 1: Start Ollama
ollama serve
# Terminal 2: Pull the model (first time only)
ollama pull llama3.1
# Terminal 3: Start LocalStack
localstack start

# Step 1: Test the LLM Brain connection
python brain.py
# Step 2: Run the Digital Office milestone
python bridge.py
# Step 3: Run the Researcher Agent (creates research_notes.txt)
python researcher.py
# Step 4: Run the Writer Agent (creates executive_summary.txt)
python writer.py

The system simulates costs to enable budget governance:
| Metric | Value |
|---|---|
| Token estimation | ~1 token per 4 characters |
| Cost rate | $0.015 per 1,000 tokens |
| Actual cloud cost | $0.00 (LocalStack simulation) |
| Actual LLM cost | $0.00 (Ollama local execution) |
Researcher Agent: ~$0.008 (550 tokens)
Writer Agent: ~$0.021 (1,400 tokens)
─────────────────────────────────────────
Total Pipeline: ~$0.029 (1,950 tokens)
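The sample breakdown follows directly from the simulated $0.015/1k-token rate, which is easy to verify:

```python
# Sanity-check the sample breakdown against the simulated rate.
rate = 0.015 / 1000                      # dollars per token
print(round(550 * rate, 3))              # 0.008  (Researcher Agent)
print(round(1400 * rate, 3))             # 0.021  (Writer Agent)
print(round((550 + 1400) * rate, 3))     # 0.029  (Total Pipeline)
```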
- Phase 1: Digital Office — LLM + Cloud connection ($0.00)
- Phase 2: Researcher Agent — Cloud analysis with cost tracking
- Phase 3: Writer Agent — Document transformation pipeline
- Phase 4: Accountant Agent — Budget circuit breakers
- Phase 5: Supervisor Agent — Multi-agent orchestration
- Phase 6: Real-time Telemetry Dashboard
- Phase 7: Production deployment with real AWS
- Local First: Always favor local execution (Ollama) to save costs
- Cost Awareness: Log every token usage as simulated cost in the fiscal ledger
- Circuit Breakers: No cloud action without Accountant approval
- Proactive Governance: Forecast and approve costs BEFORE execution
This is a course project for CSCI-750: Cloud Computing (Spring 2026).
MIT License - See LICENSE file for details.
Building fiscally responsible autonomous systems for the Agentic Era