🤖 AI Development Insight Stack

Comprehensive observability platform for monitoring multiple AI development tools with cost tracking, user attribution, and privacy-first design.

Quick Start | Documentation | Dashboards | Architecture

📑 Table of Contents

Overview
Key Features
Quick Start
Supported AI Tools
Architecture
Dashboards
Configuration
Cost Tracking
Deployment Options
Documentation
Project Status
Contributing
License
Acknowledgments

🎯 Overview

As AI-powered development tools become essential to modern software engineering, understanding their usage patterns, costs, and performance has become critical. AI Development Insight Stack provides comprehensive observability for your AI tooling ecosystem, giving you the visibility you need to optimize costs, improve performance, and ensure compliance.

The Problem

Modern development teams use multiple AI tools (Claude Code, GitHub Copilot, OpenAI Codex, etc.), but lack visibility into:

Cost attribution - Which teams, projects, and users are driving AI costs?
Usage patterns - Who's using what, when, and how effectively?
Performance metrics - Are API calls fast enough? Are we hitting rate limits?
Error tracking - What's failing and why?

The Solution

This platform provides a production-ready, privacy-first observability stack that:

Collects telemetry from multiple AI tools using OpenTelemetry standards
Tracks costs in real-time with accurate pricing across 15+ AI models
Provides 8 specialized Grafana dashboards for comprehensive insights (4 available now, 4 in progress)
Supports three deployment options: Docker Compose (local, available now), AWS ECS (cloud, planned), and Kubernetes (enterprise, planned)
Respects privacy with GDPR-compliant user ID hashing and PII removal

Who Should Use This?

Engineering Leaders - Understand AI tool ROI and optimize team spending
DevOps Teams - Monitor AI tool performance and infrastructure health
Finance Teams - Track and forecast AI development costs
Security Teams - Ensure compliance and audit AI tool usage
Individual Developers - Understand personal usage patterns and optimize workflows

✨ Key Features

🤖 Multi-Tool Support

Monitor Claude Code, the OpenAI API, and any tool with OpenTelemetry integration in one unified platform. GitHub Copilot support is planned.

💰 Real-Time Cost Tracking

Accurate cost calculation for 15+ AI models with automatic updates from pricing tables. Track costs by user, project, team, and organization.

👥 User & Project Attribution

Understand who's using what, when, and for which projects. Perfect for chargeback, capacity planning, and usage optimization.
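
As a rough illustration of the chargeback idea, the sketch below sums per-request costs by an attribution key. The record fields (`user`, `project`, `cost_usd`) and the sample values are hypothetical and are not taken from this project's schema; in the running stack these rollups would come from the dashboards' queries instead.

```python
from collections import defaultdict
from decimal import Decimal


def rollup_costs(records, key):
    """Sum cost_usd per distinct value of `key` (e.g. "user" or "project")."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for record in records:
        totals[record[key]] += Decimal(str(record["cost_usd"]))
    return dict(totals)


# Hypothetical sample records; field names are assumptions for illustration.
SAMPLE = [
    {"user": "a1f3", "project": "checkout", "cost_usd": "0.0060"},
    {"user": "a1f3", "project": "search", "cost_usd": "0.0125"},
    {"user": "9c2e", "project": "checkout", "cost_usd": "0.0300"},
]
```

Calling `rollup_costs(SAMPLE, "project")` groups the same records by project, and `rollup_costs(SAMPLE, "user")` groups them by (already hashed) user ID.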

📊 8 Grafana Dashboards

Dashboards (four available now, four in progress) covering:

Overview & health monitoring
Performance analysis & optimization
Error tracking & debugging
Cost analysis & forecasting
User activity & attribution
Project analytics
Multi-tool comparison
Resource usage & rate limits

🔒 Privacy-First Design

GDPR Compliant - User IDs automatically hashed using SHA-256
PII Removal - Usernames, email addresses, and user agents stripped
Configurable Retention - Control how long data is stored
Secure by Default - Best practices baked into all deployment options
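
To make the hashing and PII-removal steps concrete, here is a minimal sketch in Python. It is not the collector's actual implementation: the attribute keys (`user.id`, `user.name`, `user.email`, `http.user_agent`) are assumptions chosen for illustration, and the real keys live in the collector configuration.

```python
import hashlib

# Hypothetical attribute names; the project's actual keys may differ.
PII_KEYS = {"user.name", "user.email", "http.user_agent"}


def hash_user_id(user_id: str) -> str:
    """Return the SHA-256 hex digest of a user ID."""
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()


def scrub(attributes: dict) -> dict:
    """Drop PII keys and replace user.id with its SHA-256 hash."""
    cleaned = {k: v for k, v in attributes.items() if k not in PII_KEYS}
    if "user.id" in cleaned:
        cleaned["user.id"] = hash_user_id(str(cleaned["user.id"]))
    return cleaned
```

The same input always produces the same 64-character digest, which is what lets dashboards group activity by user without storing the original identifier.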

🚀 Three Deployment Options

Docker Compose - 5-minute local setup for development (available now)
AWS ECS - Production-ready cloud deployment with Terraform (planned)
Kubernetes - Enterprise-grade orchestration with high availability (planned)

🎯 Production-Ready Configurations

Auto-scaling based on load
High availability across all components
Persistent storage with automatic backups
Health checks and self-healing
Comprehensive logging and monitoring

📈 Observability Stack

Built on industry-standard open-source tools:

OpenTelemetry - Vendor-neutral telemetry collection
Prometheus - Time-series metrics storage
Loki - Log aggregation and querying
Tempo - Distributed tracing
Grafana - Visualization and alerting

🚀 Quick Start

Get up and running in 5 minutes with Docker Compose:

Prerequisites

Docker and Docker Compose installed (Get Docker)
4GB RAM available for Docker
Ports available: 3001, 3100, 3200, 4317, 4318, 9090
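
If you want to confirm the ports are free before starting the stack, a small check like the one below can help. This is an optional sketch using only the Python standard library; the port-to-service mapping is taken from the Component Details table, and the function names are made up for this example.

```python
import socket

# Ports the stack expects, mapped to the service that uses each one.
REQUIRED_PORTS = {
    3001: "Grafana",
    3100: "Loki",
    3200: "Tempo",
    4317: "OTLP gRPC",
    4318: "OTLP HTTP",
    9090: "Prometheus",
}


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports=REQUIRED_PORTS) -> list:
    """Return the required ports that are already taken."""
    return [port for port in ports if port_in_use(port)]
```

An empty list from `busy_ports()` suggests the stack can bind all of its ports; any port it returns needs to be freed or remapped in the Compose file first.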

Installation

# 1. Clone the repository
git clone git@github.com:PimpMyNines/AI-Development-Insight-Stack.git
cd AI-Development-Insight-Stack

# 2. Configure environment variables
cp .env.example .env

# Edit .env to set your Grafana password (IMPORTANT!)
# Default is admin/admin - CHANGE THIS for production!

# 3. Start the stack
docker-compose -f docker/docker-compose.yml up -d

# 4. Verify all services are running
docker-compose -f docker/docker-compose.yml ps

Expected output: All 5 services (otel-collector, prometheus, loki, tempo, grafana) should show "Up".

Configure Your AI Tool

Option A: Claude Code

# Set environment variable
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

# Or add to Claude Code settings.json
{
  "telemetry": {
    "enabled": true,
    "endpoint": "http://localhost:4317",
    "protocol": "grpc"
  }
}

Option B: OpenAI API

# Install the wrapper
pip install -r bridges/openai-wrapper/python/requirements.txt

# Use as drop-in replacement
import openai_wrapper as openai

client = openai.OpenAI(api_key="your-key")
# All API calls automatically tracked!

Option C: Generic OTLP Integration

# Use our Python SDK
from otlp_client import OTLPClient

client = OTLPClient(
    endpoint="http://localhost:4317",
    tool_name="my-ai-tool",
    user_id="developer@company.com",
    project_name="my-project"
)

client.track_completion(
    model="gpt-4",
    input_tokens=100,
    output_tokens=50,
    latency_ms=1234
)

See bridges/generic-otlp/ for Node.js examples.

Access Grafana

Open your browser to http://localhost:3001
Login with credentials from .env (default: admin / admin)
- ⚠️ IMPORTANT: Change the default password immediately!
Navigate to Dashboards → AI Dev Tools folder
Open any dashboard to see your telemetry

Verify Data Flow

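# Check that the OpenTelemetry Collector's OTLP/HTTP port is reachable.
# The endpoint only accepts POST, so a plain GET typically returns
# "405 Method Not Allowed", which still confirms the collector is listening.
curl -i http://localhost:4318/v1/traces

# Check Prometheus has metrics (quote the URL so the shell leaves "?" alone)
curl 'http://localhost:9090/api/v1/query?query=ai_dev_insight_request_total'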

# View container logs
docker-compose -f docker/docker-compose.yml logs -f otel-collector
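
The Prometheus check can also be scripted. The sketch below uses only the Python standard library and the Prometheus instant-query endpoint (`/api/v1/query`); the helper names are invented for this example, and the metric name is the one used in the curl command above.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def series_count(payload: dict) -> int:
    """Return how many series a decoded Prometheus query response contains."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return len(payload["data"]["result"])


def has_metric(base_url: str, promql: str, timeout: float = 5.0) -> bool:
    """Query a running Prometheus and report whether any series matched."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return series_count(json.load(resp)) > 0
```

With the stack running, `has_metric("http://localhost:9090", "ai_dev_insight_request_total")` should return `True` once at least one request has been tracked.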

That's it! You now have a complete AI development observability platform running locally.

🛠️ Supported AI Tools

AI Tool	Status	Integration Method	Documentation
Claude Code	✅ Production	Native OTLP support	Setup Guide
OpenAI API	✅ Production	Python wrapper library	Integration Guide
Generic OTLP	✅ Production	Python/Node.js SDK	SDK Examples
GitHub Copilot	🚧 Planned	VS Code extension	Coming in Phase 2
AWS CodeWhisperer	🚧 Planned	IDE plugin	Coming in Phase 3
Cursor AI	🚧 Planned	Native integration	Coming in Phase 3
Codeium	🚧 Planned	SDK wrapper	Coming in Phase 3

Adding Your Own Tool? Use our Generic OTLP SDK to integrate any AI tool in minutes.

🏗️ Architecture

System Overview

graph TB
    subgraph AI Tools
        CC[Claude Code]
        OAI[OpenAI API]
        GHC[GitHub Copilot]
        CUST[Custom Tools]
    end

    subgraph Telemetry Collection
        OTEL[OpenTelemetry Collector<br/>Port 4317/4318]
    end

    subgraph Storage Backends
        PROM[Prometheus<br/>Metrics Storage<br/>15 days retention]
        LOKI[Loki<br/>Log Storage<br/>31 days retention]
        TEMPO[Tempo<br/>Trace Storage<br/>30 days retention]
    end

    subgraph Visualization
        GRAF[Grafana Dashboards<br/>8 Pre-built Views]
    end

    CC -->|OTLP gRPC| OTEL
    OAI -->|OTLP HTTP| OTEL
    GHC -->|OTLP gRPC| OTEL
    CUST -->|OTLP| OTEL

    OTEL -->|Metrics Export| PROM
    OTEL -->|Logs Export| LOKI
    OTEL -->|Traces Export| TEMPO

    PROM --> GRAF
    LOKI --> GRAF
    TEMPO --> GRAF

Data Flow

Collection - AI tools send telemetry to OpenTelemetry Collector via OTLP (gRPC port 4317 or HTTP port 4318)
Processing - Collector processes telemetry:
- Hashes user IDs for privacy (SHA-256)
- Removes PII (names, emails, user agents)
- Calculates costs using pricing table
- Adds standard attributes for multi-tool support
- Batches data for efficiency
Storage - Processed data exported to specialized backends:
- Prometheus - Metrics (request rates, latency, costs)
- Loki - Logs (errors, debug info, events)
- Tempo - Traces (end-to-end request flows)
Visualization - Grafana queries all backends to provide unified dashboards

Component Details

Component	Role	Endpoint	Retention
OpenTelemetry Collector	Receive, process, and route telemetry	`localhost:4317` (gRPC), `localhost:4318` (HTTP)	N/A
Prometheus	Time-series metrics storage	`localhost:9090`	15 days
Loki	Log aggregation and storage	`localhost:3100`	31 days
Tempo	Distributed tracing backend	`localhost:3200`	30 days
Grafana	Dashboards and visualization	`localhost:3001`	N/A

Resource Usage (typical single user):

CPU: ~0.5 cores
Memory: ~2GB total
Disk: ~10GB for 30 days of data

📊 Dashboards

1. AI Development Overview

Purpose: High-level health monitoring and quick status check across all AI tools.

Key Metrics:

Request rate (requests/second)
Success rate (% of 2xx responses)
P50/P95/P99 latency
Error count by tool
Active sessions
Recent logs

Best For: Daily check-ins, incident detection, executive summaries

2. Tool Activity & Usage

Purpose: Deep dive into how each AI tool is being used.

Key Metrics:

Requests by tool over time
Model usage distribution (GPT-4 vs Claude vs others)
Request throughput by endpoint
Average response time by method
Concurrent requests per tool

Best For: Usage pattern analysis, tool comparison, capacity planning

3. Sessions & Users

Purpose: Understand who's using AI tools and how.

Key Metrics:

Active users by tool
Session duration analysis
User activity heatmap
Top users by request count
User retention metrics

Best For: User adoption tracking, engagement analysis, chargeback

4. Token & Cost Analysis

Purpose: Track and optimize AI development costs.

Key Metrics:

Cost per hour/day/month
Token usage (input vs output)
Cost by user, project, and team
Cost by model (which models are expensive?)
Budget alerts and forecasting

Best For: Cost optimization, budget planning, financial reporting

💡 Pro Tip: Set budget alerts to notify when daily costs exceed thresholds!

5. Performance Deep Dive (Coming Soon)

Purpose: Optimize latency and throughput.

Key Metrics:

Latency percentiles by endpoint
Token processing speed
Queue depth and wait times
Rate limit headroom

6. Error Tracking & Debugging (Coming Soon)

Purpose: Identify and resolve issues quickly.

Key Metrics:

Error rate trends
HTTP status code distribution (4xx vs 5xx)
Top error sources
Error logs with context
Failed request traces

7. Multi-Tool Comparison (Coming Soon)

Purpose: Compare performance across different AI tools.

Key Metrics:

Cost per request by tool
Latency comparison
Success rate by tool
Feature usage patterns

8. Resource Usage & Limits (Coming Soon)

Purpose: Monitor infrastructure and avoid rate limits.

Key Metrics:

Rate limit consumption
Infrastructure CPU/memory
Network bandwidth
Storage usage

Dashboard Screenshots

Screenshots coming soon - dashboards auto-populate once you start sending telemetry!

⚙️ Configuration

Environment Variables

Key configuration options in .env:

# Grafana Credentials (CHANGE THESE!)
GRAFANA_ADMIN_USER=admin
GRAFANA_ADMIN_PASSWORD=change-this-password

# Environment Label
ENVIRONMENT=development

# Data Retention
PROMETHEUS_RETENTION=15d         # Metrics retention
LOKI_RETENTION_HOURS=744         # 31 days
TEMPO_RETENTION_HOURS=720        # 30 days

# Resource Limits
OTEL_MEMORY_LIMIT=512M
PROMETHEUS_MEMORY_LIMIT=2G

See .env.example for all available options.

Customizing Retention Periods

Increase retention for compliance:

# Edit .env
PROMETHEUS_RETENTION=90d
LOKI_RETENTION_HOURS=2160    # 90 days
TEMPO_RETENTION_HOURS=2160   # 90 days

# Restart services
docker-compose -f docker/docker-compose.yml restart

Note: Longer retention requires more disk space (approximately 3GB per 30 days per user).
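
The hour values above are simply days multiplied by 24, and the disk note can be turned into a rough estimate the same way. The sketch below assumes storage grows linearly with retention and user count, which is a simplification; the 3 GB figure is the one quoted in the note.

```python
HOURS_PER_DAY = 24
GB_PER_USER_PER_30_DAYS = 3  # approximate figure quoted in the note above


def retention_hours(days: int) -> int:
    """Convert a retention period in days to the hours used in .env."""
    return days * HOURS_PER_DAY


def estimate_disk_gb(days: int, users: int = 1) -> float:
    """Rough disk estimate, assuming linear growth in days and users."""
    return GB_PER_USER_PER_30_DAYS * (days / 30) * users
```

For example, `retention_hours(90)` gives the `2160` used for Loki and Tempo, and `estimate_disk_gb(90)` gives about 9 GB for a single user under this linear assumption.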

Updating Pricing Tables

AI model pricing changes frequently. Update costs in configs/pricing/pricing-table.yaml:

claude_models:
  claude-opus-4:
    input_price_per_mtok: 15.00    # Update this
    output_price_per_mtok: 75.00   # And this

The OpenTelemetry Collector automatically reloads changes within 60 seconds.

Advanced Configuration

OpenTelemetry Collector: configs/otel/collector-config.yaml
Prometheus: configs/prometheus/prometheus.yml
Loki: configs/loki/local-config.yaml
Tempo: configs/tempo/tempo.yaml
Grafana: configs/grafana/

💰 Cost Tracking

How It Works

The platform calculates costs in real-time using a hybrid approach:

API-Provided Costs (preferred) - If the AI tool's API includes cost data, use it directly
Calculated Costs (fallback) - Calculate from token counts using the pricing table

Cost Calculation Formula:

Total Cost = (Input Tokens / 1,000,000 × Input Price) +
             (Output Tokens / 1,000,000 × Output Price)

Precision: $0.0001 (4 decimal places)
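
The hybrid approach and the formula can be sketched in a few lines of Python. This is an illustration, not the collector's code: the function name and the half-up rounding mode are assumptions, and the two prices are copied from the pricing table in this README.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

# (input, output) prices in USD per 1M tokens, copied from the pricing table.
PRICING = {
    "claude-opus-4": (Decimal("15.00"), Decimal("75.00")),
    "gpt-4": (Decimal("30.00"), Decimal("60.00")),
}
MILLION = Decimal(1_000_000)
PRECISION = Decimal("0.0001")  # 4 decimal places


def calculate_cost(model: str, input_tokens: int, output_tokens: int,
                   api_cost: Optional[str] = None) -> Decimal:
    """Prefer an API-provided cost; otherwise compute it from token counts."""
    if api_cost is not None:
        cost = Decimal(str(api_cost))
    else:
        input_price, output_price = PRICING[model]
        cost = (Decimal(input_tokens) / MILLION * input_price
                + Decimal(output_tokens) / MILLION * output_price)
    # Rounding mode is an assumption; the README only states the precision.
    return cost.quantize(PRECISION, rounding=ROUND_HALF_UP)
```

For the `gpt-4` call from the Quick Start (100 input tokens, 50 output tokens), this gives 100 / 1,000,000 × 30.00 + 50 / 1,000,000 × 60.00 = $0.0060. `Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.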

Supported Models & Pricing

Current pricing as of January 2025:

Model	Input (per 1M tokens)	Output (per 1M tokens)	Provider
Claude Opus 4	$15.00	$75.00	Anthropic
Claude Sonnet 4	$3.00	$15.00	Anthropic
Claude Haiku 3.5	$0.80	$4.00	Anthropic
GPT-4 Turbo	$10.00	$30.00	OpenAI
GPT-4	$30.00	$60.00	OpenAI
GPT-3.5 Turbo	$0.50	$1.50	OpenAI
Gemini Pro	$0.50	$1.50	Google

Plus 9 more models - See configs/pricing/pricing-table.yaml for complete pricing.

Budget Alerts

Configure automatic alerts when costs exceed thresholds:

Level	User (per day)	Project (per day)	Organization (per day)
Warning	$50	$200	$1,000
Critical	$100	$500	$2,500

Alerts appear in Grafana dashboards and can trigger notifications (email, Slack, PagerDuty).
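
The threshold table maps to a simple comparison. The sketch below only illustrates that logic; it is not how the Grafana alert rules are defined, and the function and dictionary names are made up for this example. "Exceed" is read as strictly greater than the threshold.

```python
from typing import Optional

# Daily thresholds in USD, copied from the table above.
THRESHOLDS = {
    "user": {"warning": 50, "critical": 100},
    "project": {"warning": 200, "critical": 500},
    "organization": {"warning": 1000, "critical": 2500},
}


def alert_level(scope: str, daily_cost: float) -> Optional[str]:
    """Return "critical", "warning", or None for a scope's daily cost."""
    limits = THRESHOLDS[scope]
    if daily_cost > limits["critical"]:
        return "critical"
    if daily_cost > limits["warning"]:
        return "warning"
    return None
```

A user who spends $75 in a day would land in `"warning"`, while a project at $500.01 would be `"critical"`.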

Cost Optimization Recommendations

The platform automatically suggests optimizations:

Model downgrade - "Switch from GPT-4 to GPT-3.5 for simple tasks (roughly 40-60x lower per-token price at the rates listed above)"
Wasteful patterns - "User retries failed requests 50+ times/day"
Inefficient usage - "Project generates 90% output tokens in error responses"

🚀 Deployment Options

Choose the deployment that fits your needs:

Comparison Table

Feature	Docker Compose	AWS ECS	Kubernetes
Setup Time	5 minutes	30 minutes	45 minutes
Best For	Local dev, single user	Production, small-medium teams	Enterprise, large teams
High Availability	❌ No	✅ Yes	✅ Yes
Auto-Scaling	❌ No	✅ Yes	✅ Yes
Cost	Free (local)	~$100-300/month	~$150-500/month
Maintenance	Low	Medium	Medium-High
Multi-Region	❌ No	⚠️ Manual	✅ Yes
Load Balancing	❌ No	✅ Built-in	✅ Built-in
Persistent Storage	Local volumes	EBS + S3	PV + S3
Backup/Recovery	Manual	Automated	Automated

Docker Compose (Local Development)

Pros:

Fastest setup (5 minutes)
No cloud costs
Perfect for testing and development
Easy to modify and experiment

Cons:

No high availability
Single machine only
Manual backups
Not suitable for production

Quick Start: See Quick Start section above.

AWS ECS (Production Cloud)

Pros:

Production-ready with high availability
Auto-scaling based on load
Managed infrastructure (less maintenance)
AWS-native security and monitoring
Cost-effective for small-medium teams

Cons:

AWS-specific (vendor lock-in)
Requires AWS account and permissions
Monthly costs (~$100-300)

Quick Start:

# Coming in Phase 2!
cd infrastructure/ecs
terraform init
terraform apply

# Outputs: Load balancer URL and endpoints

Documentation: Coming soon in docs/deployment/aws-ecs.md

Kubernetes (Enterprise)

Pros:

Cloud-agnostic (runs on AWS, GCP, Azure, on-prem)
Best scalability and flexibility
Advanced features (service mesh, CRDs, operators)
Industry standard for large deployments

Cons:

Most complex setup
Requires Kubernetes expertise
Higher infrastructure costs
More maintenance overhead

Quick Start:

# Coming in Phase 3!
cd infrastructure/kubernetes
terraform init
terraform apply

# Or use Helm charts
helm install ai-insight ./helm-chart

Documentation: Coming soon in docs/deployment/kubernetes.md

📚 Documentation

Comprehensive documentation organized by use case:

Getting Started

Quick Start Guide - Get running in 5 minutes
Installation & Setup - Detailed installation instructions
Architecture Overview - Understand the system design

Configuration

Claude Code Configuration - Configure Claude Code telemetry
OpenAI Wrapper Setup - Integrate OpenAI API
Generic OTLP SDK - Add any AI tool
Environment Variables - All configuration options

Deployment Guides

Docker Compose - Local development setup
AWS ECS Guide - Coming in Phase 2
Kubernetes Guide - Coming in Phase 3

Operations

Dashboard Guide - Understand each dashboard
Cost Tracking - Optimize AI spending
Troubleshooting - Common issues and solutions
Maintenance & Backups - Keep your system healthy

Development

Contributing Guide - Add features and fix bugs
Architecture Deep Dive - Detailed system design
API Reference - Integration APIs
SDK Development - Build custom integrations

📈 Project Status

Current Version: 1.0.0 (Production Ready)
Last Updated: January 7, 2025
Overall Progress: 38% Complete

Completed (Phase 1)

✅ Core Infrastructure

Multi-tool OpenTelemetry collector configuration
Prometheus, Loki, Tempo backends
Grafana with auto-provisioned data sources
Docker Compose deployment

✅ Cost Tracking

Comprehensive pricing table (15+ AI models)
Real-time cost calculation
Budget alerts and thresholds

✅ Privacy & Security

GDPR-compliant user ID hashing
PII removal from telemetry
Secure default configurations

✅ Integration SDKs

Python OTLP SDK with examples
Node.js OTLP SDK with TypeScript
OpenAI API wrapper (drop-in replacement)

✅ Dashboards

AI Development Overview
Tool Activity & Usage
Sessions & Users
Token & Cost Analysis

In Progress

🚧 Dashboard Enhancements

Performance Deep Dive dashboard
Error Tracking dashboard
Multi-Tool Comparison dashboard
Resource Usage dashboard

🚧 Documentation

Deployment guides (AWS ECS, Kubernetes)
Advanced configuration tutorials
Video walkthroughs

Roadmap

Phase 2 (Q1 2025) - Advanced Dashboards & Cloud Deployments

Complete remaining 4 dashboards
AWS ECS Terraform deployment
Kubernetes Helm charts
Alert rule templates

Phase 3 (Q2 2025) - Extended Tool Support

GitHub Copilot VS Code extension
AWS CodeWhisperer integration
Cursor AI native support
Multi-region deployments

Phase 4 (Q3 2025) - Enterprise Features

SSO/SAML authentication
Multi-tenancy support
Advanced RBAC
Audit logging

Phase 5 (Q4 2025) - SaaS Platform

Hosted offering (optional)
Advanced analytics and ML insights
Benchmark reports
API rate limit management

See PROGRESS.md for detailed task tracking.

🤝 Contributing

We welcome contributions! This is an open-source project under the MIT license.

How to Contribute

Fork the repository

# Fork on GitHub (the "Fork" button), or use the GitHub CLI:
gh repo fork PimpMyNines/AI-Development-Insight-Stack --clone

Create a feature branch

git checkout -b feature/amazing-new-feature

Make your changes
- Add new AI tool integrations
- Improve dashboards
- Fix bugs
- Enhance documentation

Test thoroughly

# Start the stack
docker-compose -f docker/docker-compose.yml up -d

# Run your tests
# Verify dashboards still work

Submit a pull request
- Describe what you changed and why
- Include screenshots for UI changes
- Reference any related issues

Areas We Need Help

Tool Integrations - Add support for more AI tools
Dashboard Improvements - Enhance existing dashboards or create new ones
Cloud Deployments - Help with AWS ECS and Kubernetes configurations
Documentation - Tutorials, guides, and video content
Testing - Add automated tests and CI/CD pipelines

Development Setup

# Clone the repo
git clone git@github.com:PimpMyNines/AI-Development-Insight-Stack.git
cd AI-Development-Insight-Stack

# Start development environment
docker-compose -f docker/docker-compose.yml up -d

# Make changes to configs
vim configs/otel/collector-config.yaml

# Reload configuration
docker-compose -f docker/docker-compose.yml restart otel-collector

# Test your changes
# Send test telemetry, check dashboards

Code of Conduct

Be respectful and inclusive
Provide constructive feedback
Focus on what's best for the project
Help newcomers get started

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

What this means:

✅ Commercial use allowed
✅ Modification allowed
✅ Distribution allowed
✅ Private use allowed
⚠️ Liability and warranty limitations

🙏 Acknowledgments

This project is built on the shoulders of giants. Special thanks to:

OpenTelemetry - For creating the vendor-neutral observability standard
Prometheus - For reliable time-series metrics storage
Grafana Labs - For Grafana, Loki, and Tempo
Anthropic - For Claude and native OTLP support
The Open Source Community - For endless inspiration and support

📞 Support & Community

Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: docs/

Quick Links

Quick Start in 5 Minutes
Troubleshooting Common Issues
Dashboard Guide
Contributing Guidelines
Project Roadmap

Made with ❤️ by developers, for developers

Start monitoring your AI development tools in 5 minutes

Get Started | View Demos | Read Docs
