🤖 AI Development Insight Stack

License: MIT Version Status

Comprehensive observability platform for monitoring multiple AI development tools with cost tracking, user attribution, and privacy-first design.

Quick Start | Documentation | Dashboards | Architecture



🎯 Overview

As AI-powered development tools become essential to modern software engineering, understanding their usage patterns, costs, and performance has become critical. AI Development Insight Stack provides comprehensive observability for your AI tooling ecosystem, giving you the visibility you need to optimize costs, improve performance, and ensure compliance.

The Problem

Modern development teams use multiple AI tools (Claude Code, GitHub Copilot, OpenAI Codex, etc.), but lack visibility into:

  • Cost attribution - Which teams, projects, and users are driving AI costs?
  • Usage patterns - Who's using what, when, and how effectively?
  • Performance metrics - Are API calls fast enough? Are we hitting rate limits?
  • Error tracking - What's failing and why?

The Solution

This platform provides a production-ready, privacy-first observability stack that:

  • Collects telemetry from multiple AI tools using OpenTelemetry standards
  • Tracks costs in real-time with accurate pricing across 15+ AI models
  • Provides 8 specialized Grafana dashboards for comprehensive insights (4 available today, 4 planned)
  • Supports three deployment options: Docker Compose (local, available now), AWS ECS (cloud, planned for Phase 2), and Kubernetes (enterprise, planned for Phase 3)
  • Respects privacy with GDPR-compliant user ID hashing and PII removal

Who Should Use This?

  • Engineering Leaders - Understand AI tool ROI and optimize team spending
  • DevOps Teams - Monitor AI tool performance and infrastructure health
  • Finance Teams - Track and forecast AI development costs
  • Security Teams - Ensure compliance and audit AI tool usage
  • Individual Developers - Understand personal usage patterns and optimize workflows

✨ Key Features

🤖 Multi-Tool Support

Monitor Claude Code, the OpenAI API, and any tool with OpenTelemetry integration in one unified platform. GitHub Copilot support is planned.

💰 Real-Time Cost Tracking

Accurate cost calculation for 15+ AI models with automatic updates from pricing tables. Track costs by user, project, team, and organization.

👥 User & Project Attribution

Understand who's using what, when, and for which projects. Perfect for chargeback, capacity planning, and usage optimization.

📊 8 Grafana Dashboards

Pre-built dashboards (four available now, four planned) covering:

  • Overview & health monitoring
  • Performance analysis & optimization
  • Error tracking & debugging
  • Cost analysis & forecasting
  • User activity & attribution
  • Project analytics
  • Multi-tool comparison
  • Resource usage & rate limits

🔒 Privacy-First Design

  • GDPR Compliant - User IDs automatically hashed using SHA-256
  • PII Removal - Usernames, email addresses, and user agents stripped
  • Configurable Retention - Control how long data is stored
  • Secure by Default - Best practices baked into all deployment options

🚀 Three Deployment Options

  • Docker Compose - 5-minute local setup for development
  • AWS ECS - Production-ready cloud deployment with Terraform (planned for Phase 2)
  • Kubernetes - Enterprise-grade orchestration with high availability (planned for Phase 3)

🎯 Production-Ready Configurations

  • Auto-scaling based on load
  • High availability across all components
  • Persistent storage with automatic backups
  • Health checks and self-healing
  • Comprehensive logging and monitoring

📈 Observability Stack

Built on industry-standard open-source tools:

  • OpenTelemetry - Vendor-neutral telemetry collection
  • Prometheus - Time-series metrics storage
  • Loki - Log aggregation and querying
  • Tempo - Distributed tracing
  • Grafana - Visualization and alerting

🚀 Quick Start

Get up and running in 5 minutes with Docker Compose:

Prerequisites

  • Docker and Docker Compose installed (Get Docker)
  • 4GB RAM available for Docker
  • Ports available: 3001, 3100, 3200, 4317, 4318, 9090
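Before starting the stack, it can help to confirm that nothing is already listening on the required ports. The following is a minimal sketch using only the Python standard library; the helper names `port_in_use` and `busy_ports` are illustrative and not part of this repository.

```python
import socket

# Ports listed in the prerequisites above.
REQUIRED_PORTS = [3001, 3100, 3200, 4317, 4318, 9090]


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports: list[int] = REQUIRED_PORTS) -> list[int]:
    """Return the subset of ports that already have a listener."""
    return [p for p in ports if port_in_use(p)]
```

Calling `busy_ports()` before `docker-compose up` returns an empty list when all six ports are free; any port it returns needs to be released or remapped first.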

Installation

# 1. Clone the repository
git clone git@github.com:PimpMyNines/AI-Development-Insight-Stack.git
cd AI-Development-Insight-Stack

# 2. Configure environment variables
cp .env.example .env

# Edit .env to set your Grafana password (IMPORTANT!)
# Default is admin/admin - CHANGE THIS for production!

# 3. Start the stack
docker-compose -f docker/docker-compose.yml up -d

# 4. Verify all services are running
docker-compose -f docker/docker-compose.yml ps

Expected output: All 5 services (otel-collector, prometheus, loki, tempo, grafana) should show "Up".

Configure Your AI Tool

Option A: Claude Code

# Set environment variable
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

# Or add to Claude Code settings.json
{
  "telemetry": {
    "enabled": true,
    "endpoint": "http://localhost:4317",
    "protocol": "grpc"
  }
}

Option B: OpenAI API

# Install the wrapper
pip install -r bridges/openai-wrapper/python/requirements.txt

# Use as drop-in replacement
import openai_wrapper as openai

client = openai.OpenAI(api_key="your-key")
# All API calls automatically tracked!

Option C: Generic OTLP Integration

# Use our Python SDK
from otlp_client import OTLPClient

client = OTLPClient(
    endpoint="http://localhost:4317",
    tool_name="my-ai-tool",
    user_id="developer@company.com",
    project_name="my-project"
)

client.track_completion(
    model="gpt-4",
    input_tokens=100,
    output_tokens=50,
    latency_ms=1234
)

See bridges/generic-otlp/ for Node.js examples.

Access Grafana

  1. Open your browser to http://localhost:3001
  2. Login with credentials from .env (default: admin / admin)
    • ⚠️ IMPORTANT: Change the default password immediately!
  3. Navigate to Dashboards → AI Dev Tools folder
  4. Open any dashboard to see your telemetry

Verify Data Flow

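# Check that the OpenTelemetry Collector's OTLP/HTTP receiver is listening
# (the endpoint only accepts POST, so a 405 response to this GET means it is up)
curl -i http://localhost:4318/v1/traces

# Check Prometheus has metrics (quote the URL so the shell does not expand "?")
curl "http://localhost:9090/api/v1/query?query=ai_dev_insight_request_total"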

# View container logs
docker-compose -f docker/docker-compose.yml logs -f otel-collector
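The Prometheus check above can also be scripted. This is a hedged sketch that uses the standard Prometheus HTTP API instant-query endpoint (`/api/v1/query`); the function names are illustrative, and `fetch_samples` only works while the stack is running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROMETHEUS_URL = "http://localhost:9090"  # Prometheus port from the Quick Start


def build_query_url(metric: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build an instant-query URL with the expression safely URL-encoded."""
    return f"{base_url}/api/v1/query?{urlencode({'query': metric})}"


def extract_samples(payload: dict) -> list[tuple[dict, float]]:
    """Return (labels, value) pairs from an instant-vector query response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    results = payload["data"]["result"]
    # Each sample's "value" is [unix_timestamp, "value-as-string"].
    return [(r["metric"], float(r["value"][1])) for r in results]


def fetch_samples(metric: str) -> list[tuple[dict, float]]:
    """Query the running Prometheus instance (requires the stack to be up)."""
    with urlopen(build_query_url(metric), timeout=5) as resp:
        return extract_samples(json.load(resp))
```

An empty list from `fetch_samples("ai_dev_insight_request_total")` means Prometheus is reachable but has not yet received any telemetry for that metric.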

That's it! You now have a complete AI development observability platform running locally.


🛠️ Supported AI Tools

| AI Tool | Status | Integration Method | Documentation |
| --- | --- | --- | --- |
| Claude Code | ✅ Production | Native OTLP support | Setup Guide |
| OpenAI API | ✅ Production | Python wrapper library | Integration Guide |
| Generic OTLP | ✅ Production | Python/Node.js SDK | SDK Examples |
| GitHub Copilot | 🚧 Planned | VS Code extension | Coming in Phase 2 |
| AWS CodeWhisperer | 🚧 Planned | IDE plugin | Coming in Phase 3 |
| Cursor AI | 🚧 Planned | Native integration | Coming in Phase 3 |
| Codeium | 🚧 Planned | SDK wrapper | Coming in Phase 3 |

Adding Your Own Tool? Use our Generic OTLP SDK to integrate any AI tool in minutes.


🏗️ Architecture

System Overview

graph TB
    subgraph AI Tools
        CC[Claude Code]
        OAI[OpenAI API]
        GHC[GitHub Copilot]
        CUST[Custom Tools]
    end

    subgraph Telemetry Collection
        OTEL[OpenTelemetry Collector<br/>Port 4317/4318]
    end

    subgraph Storage Backends
        PROM[Prometheus<br/>Metrics Storage<br/>15 days retention]
        LOKI[Loki<br/>Log Storage<br/>31 days retention]
        TEMPO[Tempo<br/>Trace Storage<br/>30 days retention]
    end

    subgraph Visualization
        GRAF[Grafana Dashboards<br/>8 Pre-built Views]
    end

    CC -->|OTLP gRPC| OTEL
    OAI -->|OTLP HTTP| OTEL
    GHC -->|OTLP gRPC| OTEL
    CUST -->|OTLP| OTEL

    OTEL -->|Metrics Export| PROM
    OTEL -->|Logs Export| LOKI
    OTEL -->|Traces Export| TEMPO

    PROM --> GRAF
    LOKI --> GRAF
    TEMPO --> GRAF

Data Flow

  1. Collection - AI tools send telemetry to OpenTelemetry Collector via OTLP (gRPC port 4317 or HTTP port 4318)
  2. Processing - Collector processes telemetry:
    • Hashes user IDs for privacy (SHA-256)
    • Removes PII (names, emails, user agents)
    • Calculates costs using pricing table
    • Adds standard attributes for multi-tool support
    • Batches data for efficiency
  3. Storage - Processed data exported to specialized backends:
    • Prometheus - Metrics (request rates, latency, costs)
    • Loki - Logs (errors, debug info, events)
    • Tempo - Traces (end-to-end request flows)
  4. Visualization - Grafana queries all backends to provide unified dashboards
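The privacy part of the processing step can be illustrated in a few lines. In the stack this work happens inside the OpenTelemetry Collector; the Python below is only a sketch of the equivalent logic, and the attribute keys (`user.id`, `user.name`, `user.email`, `http.user_agent`) are assumptions rather than the collector's confirmed names.

```python
import hashlib

# Hypothetical attribute names; the collector's real keys may differ.
PII_KEYS = {"user.name", "user.email", "http.user_agent"}


def hash_user_id(user_id: str) -> str:
    """Return the SHA-256 hex digest of a user ID."""
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()


def scrub(attributes: dict) -> dict:
    """Return a copy with PII keys dropped and user.id replaced by its hash."""
    cleaned = {k: v for k, v in attributes.items() if k not in PII_KEYS}
    if "user.id" in cleaned:
        cleaned["user.id"] = hash_user_id(str(cleaned["user.id"]))
    return cleaned
```

Because the hash is deterministic, the same user always maps to the same 64-character identifier, which is what lets dashboards attribute usage without storing the original ID.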

Component Details

| Component | Role | Endpoint | Retention |
| --- | --- | --- | --- |
| OpenTelemetry Collector | Receive, process, and route telemetry | localhost:4317 (gRPC), localhost:4318 (HTTP) | N/A |
| Prometheus | Time-series metrics storage | localhost:9090 | 15 days |
| Loki | Log aggregation and storage | localhost:3100 | 31 days |
| Tempo | Distributed tracing backend | localhost:3200 | 30 days |
| Grafana | Dashboards and visualization | localhost:3001 | N/A |

Resource Usage (typical single user):

  • CPU: ~0.5 cores
  • Memory: ~2GB total
  • Disk: ~10GB for 30 days of data

📊 Dashboards

1. AI Development Overview

Purpose: High-level health monitoring and quick status check across all AI tools.

Key Metrics:

  • Request rate (requests/second)
  • Success rate (% of 2xx responses)
  • P50/P95/P99 latency
  • Error count by tool
  • Active sessions
  • Recent logs

Best For: Daily check-ins, incident detection, executive summaries
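To make the latency and success-rate metrics concrete, here is a small sketch of how they are defined. The dashboards presumably compute these from Prometheus data; this standalone Python version uses the nearest-rank percentile method, which is an assumption for illustration and may differ from how the dashboard queries interpolate.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def success_rate(status_codes: list[int]) -> float:
    """Percentage of responses with a 2xx status code."""
    if not status_codes:
        return 0.0
    ok = sum(1 for code in status_codes if 200 <= code < 300)
    return 100.0 * ok / len(status_codes)
```

P99 is the figure to watch for user-facing slowness: a healthy P50 can hide a long tail of slow requests that only the higher percentiles expose.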


2. Tool Activity & Usage

Purpose: Deep dive into how each AI tool is being used.

Key Metrics:

  • Requests by tool over time
  • Model usage distribution (GPT-4 vs Claude vs others)
  • Request throughput by endpoint
  • Average response time by method
  • Concurrent requests per tool

Best For: Usage pattern analysis, tool comparison, capacity planning


3. Sessions & Users

Purpose: Understand who's using AI tools and how.

Key Metrics:

  • Active users by tool
  • Session duration analysis
  • User activity heatmap
  • Top users by request count
  • User retention metrics

Best For: User adoption tracking, engagement analysis, chargeback


4. Token & Cost Analysis

Purpose: Track and optimize AI development costs.

Key Metrics:

  • Cost per hour/day/month
  • Token usage (input vs output)
  • Cost by user, project, and team
  • Cost by model (which models are expensive?)
  • Budget alerts and forecasting

Best For: Cost optimization, budget planning, financial reporting

💡 Pro Tip: Set budget alerts to notify when daily costs exceed thresholds!


5. Performance Deep Dive (Coming Soon)

Purpose: Optimize latency and throughput.

Key Metrics:

  • Latency percentiles by endpoint
  • Token processing speed
  • Queue depth and wait times
  • Rate limit headroom

6. Error Tracking & Debugging (Coming Soon)

Purpose: Identify and resolve issues quickly.

Key Metrics:

  • Error rate trends
  • HTTP status code distribution (4xx vs 5xx)
  • Top error sources
  • Error logs with context
  • Failed request traces

7. Multi-Tool Comparison (Coming Soon)

Purpose: Compare performance across different AI tools.

Key Metrics:

  • Cost per request by tool
  • Latency comparison
  • Success rate by tool
  • Feature usage patterns

8. Resource Usage & Limits (Coming Soon)

Purpose: Monitor infrastructure and avoid rate limits.

Key Metrics:

  • Rate limit consumption
  • Infrastructure CPU/memory
  • Network bandwidth
  • Storage usage

Dashboard Screenshots

Screenshots coming soon - dashboards auto-populate once you start sending telemetry!


⚙️ Configuration

Environment Variables

Key configuration options in .env:

# Grafana Credentials (CHANGE THESE!)
GRAFANA_ADMIN_USER=admin
GRAFANA_ADMIN_PASSWORD=change-this-password

# Environment Label
ENVIRONMENT=development

# Data Retention
PROMETHEUS_RETENTION=15d         # Metrics retention
LOKI_RETENTION_HOURS=744         # 31 days
TEMPO_RETENTION_HOURS=720        # 30 days

# Resource Limits
OTEL_MEMORY_LIMIT=512M
PROMETHEUS_MEMORY_LIMIT=2G

See .env.example for all available options.

Customizing Retention Periods

Increase retention for compliance:

# Edit .env
PROMETHEUS_RETENTION=90d
LOKI_RETENTION_HOURS=2160    # 90 days
TEMPO_RETENTION_HOURS=2160   # 90 days

# Recreate the containers so the new values take effect
# (a plain restart does not re-read .env)
docker-compose -f docker/docker-compose.yml up -d

Note: Longer retention requires more disk space (approximately 3GB per 30 days per user).
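The hour values above are simply days multiplied by 24. The helper below is a minimal sketch for generating consistent `.env` values and a rough disk estimate; the function names are illustrative, and scaling the 3GB figure linearly with users and days is an assumption.

```python
HOURS_PER_DAY = 24
GB_PER_USER_PER_30_DAYS = 3  # rough figure quoted in the note above


def retention_env(days: int) -> dict[str, str]:
    """Return .env values that apply the same retention period to every backend."""
    hours = days * HOURS_PER_DAY
    return {
        "PROMETHEUS_RETENTION": f"{days}d",
        "LOKI_RETENTION_HOURS": str(hours),
        "TEMPO_RETENTION_HOURS": str(hours),
    }


def estimated_disk_gb(days: int, users: int) -> float:
    """Scale the ~3GB per 30 days per user estimate linearly (an assumption)."""
    return GB_PER_USER_PER_30_DAYS * (days / 30) * users
```

For example, `retention_env(90)` yields the same three values shown in the snippet above, and `estimated_disk_gb(90, 10)` suggests budgeting on the order of 90GB for ten users.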

Updating Pricing Tables

AI model pricing changes frequently. Update costs in configs/pricing/pricing-table.yaml:

claude_models:
  claude-opus-4:
    input_price_per_mtok: 15.00    # Update this
    output_price_per_mtok: 75.00   # And this

The OpenTelemetry Collector automatically reloads changes within 60 seconds.


💰 Cost Tracking

How It Works

The platform calculates costs in real-time using a hybrid approach:

  1. API-Provided Costs (preferred) - If the AI tool's API includes cost data, use it directly
  2. Calculated Costs (fallback) - Calculate from token counts using the pricing table

Cost Calculation Formula:

Total Cost = (Input Tokens / 1,000,000 × Input Price) +
             (Output Tokens / 1,000,000 × Output Price)

Precision: $0.0001 (4 decimal places)
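The hybrid approach and the formula can be sketched as follows. This is an illustration under stated assumptions, not the collector's actual implementation: the function names and the `PRICING` dictionary are hypothetical, the two prices are taken from this README, and half-up rounding is assumed because the rounding mode is not specified.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

TOKENS_PER_MTOK = Decimal(1_000_000)
PRECISION = Decimal("0.0001")  # 4 decimal places, as stated above

# (input, output) USD per 1M tokens; a small excerpt for illustration.
PRICING = {
    "gpt-4": (Decimal("30.00"), Decimal("60.00")),
    "claude-opus-4": (Decimal("15.00"), Decimal("75.00")),
}


def calculated_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Apply the formula above using the pricing table."""
    input_price, output_price = PRICING[model]
    total = (Decimal(input_tokens) / TOKENS_PER_MTOK * input_price
             + Decimal(output_tokens) / TOKENS_PER_MTOK * output_price)
    return total.quantize(PRECISION, rounding=ROUND_HALF_UP)


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 api_cost: Optional[Decimal] = None) -> Decimal:
    """Prefer an API-provided cost; otherwise fall back to the calculation."""
    if api_cost is not None:
        return api_cost.quantize(PRECISION, rounding=ROUND_HALF_UP)
    return calculated_cost(model, input_tokens, output_tokens)
```

As a worked example, a GPT-4 request with 100 input and 50 output tokens costs (100 / 1,000,000 × 30) + (50 / 1,000,000 × 60) = 0.003 + 0.003 = $0.0060. `Decimal` is used instead of `float` so that many small per-request amounts sum without binary rounding drift.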

Supported Models & Pricing

Current pricing as of January 2025:

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Provider |
| --- | --- | --- | --- |
| Claude Opus 4 | $15.00 | $75.00 | Anthropic |
| Claude Sonnet 4 | $3.00 | $15.00 | Anthropic |
| Claude Haiku 3.5 | $0.80 | $4.00 | Anthropic |
| GPT-4 Turbo | $10.00 | $30.00 | OpenAI |
| GPT-4 | $30.00 | $60.00 | OpenAI |
| GPT-3.5 Turbo | $0.50 | $1.50 | OpenAI |
| Gemini Pro | $0.50 | $1.50 | Google |

Plus 9 more models - See configs/pricing/pricing-table.yaml for complete pricing.

Budget Alerts

Configure automatic alerts when costs exceed thresholds:

| Level | User (per day) | Project (per day) | Organization (per day) |
| --- | --- | --- | --- |
| Warning | $50 | $200 | $1,000 |
| Critical | $100 | $500 | $2,500 |

Alerts appear in Grafana dashboards and can trigger notifications (email, Slack, PagerDuty).
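The threshold logic is simple enough to sketch directly. The actual alerts are presumably evaluated by Grafana alert rules; the `alert_level` function and `THRESHOLDS` mapping below are hypothetical, and treating "exceed" as strictly greater than the threshold is an assumption.

```python
from decimal import Decimal

# Daily thresholds in USD from the table above: (warning, critical).
THRESHOLDS = {
    "user": (Decimal(50), Decimal(100)),
    "project": (Decimal(200), Decimal(500)),
    "organization": (Decimal(1000), Decimal(2500)),
}


def alert_level(scope: str, daily_cost: Decimal) -> str:
    """Return "ok", "warning", or "critical" for one day's spend in a scope."""
    warning, critical = THRESHOLDS[scope]
    if daily_cost > critical:
        return "critical"
    if daily_cost > warning:
        return "warning"
    return "ok"
```

Checking the critical threshold first keeps the function correct even if someone later configures a warning threshold above the critical one by mistake.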

Cost Optimization Recommendations

The platform automatically suggests optimizations:

  • Model downgrade - "Switch from GPT-4 to GPT-3.5 Turbo for simple tasks (roughly 40-60x cheaper per token at the prices listed above)"
  • Wasteful patterns - "User retries failed requests 50+ times/day"
  • Inefficient usage - "Project generates 90% output tokens in error responses"

🚀 Deployment Options

Choose the deployment that fits your needs:

Comparison Table

| Feature | Docker Compose | AWS ECS | Kubernetes |
| --- | --- | --- | --- |
| Setup Time | 5 minutes | 30 minutes | 45 minutes |
| Best For | Local dev, single user | Production, small-medium teams | Enterprise, large teams |
| High Availability | ❌ No | ✅ Yes | ✅ Yes |
| Auto-Scaling | ❌ No | ✅ Yes | ✅ Yes |
| Cost | Free (local) | ~$100-300/month | ~$150-500/month |
| Maintenance | Low | Medium | Medium-High |
| Multi-Region | ❌ No | ⚠️ Manual | ✅ Yes |
| Load Balancing | ❌ No | ✅ Built-in | ✅ Built-in |
| Persistent Storage | Local volumes | EBS + S3 | PV + S3 |
| Backup/Recovery | Manual | Automated | Automated |

Docker Compose (Local Development)

Pros:

  • Fastest setup (5 minutes)
  • No cloud costs
  • Perfect for testing and development
  • Easy to modify and experiment

Cons:

  • No high availability
  • Single machine only
  • Manual backups
  • Not suitable for production

Quick Start: See Quick Start section above.


AWS ECS (Production Cloud)

Pros:

  • Production-ready with high availability
  • Auto-scaling based on load
  • Managed infrastructure (less maintenance)
  • AWS-native security and monitoring
  • Cost-effective for small-medium teams

Cons:

  • AWS-specific (vendor lock-in)
  • Requires AWS account and permissions
  • Monthly costs (~$100-300)

Quick Start:

# Coming in Phase 2!
cd infrastructure/ecs
terraform init
terraform apply

# Outputs: Load balancer URL and endpoints

Documentation: Coming soon in docs/deployment/aws-ecs.md


Kubernetes (Enterprise)

Pros:

  • Cloud-agnostic (runs on AWS, GCP, Azure, on-prem)
  • Best scalability and flexibility
  • Advanced features (service mesh, CRDs, operators)
  • Industry standard for large deployments

Cons:

  • Most complex setup
  • Requires Kubernetes expertise
  • Higher infrastructure costs
  • More maintenance overhead

Quick Start:

# Coming in Phase 3!
cd infrastructure/kubernetes
terraform init
terraform apply

# Or use Helm charts
helm install ai-insight ./helm-chart

Documentation: Coming soon in docs/deployment/kubernetes.md


📚 Documentation

Comprehensive documentation organized by use case:

Deployment Guides

  • Docker Compose - Local development setup
  • AWS ECS Guide - Coming in Phase 2
  • Kubernetes Guide - Coming in Phase 3


📈 Project Status

  • Current Version: 1.0.0 (Production Ready)
  • Last Updated: January 7, 2025
  • Overall Progress: 38% Complete

Completed (Phase 1)

✅ Core Infrastructure

  • Multi-tool OpenTelemetry collector configuration
  • Prometheus, Loki, Tempo backends
  • Grafana with auto-provisioned data sources
  • Docker Compose deployment

✅ Cost Tracking

  • Comprehensive pricing table (15+ AI models)
  • Real-time cost calculation
  • Budget alerts and thresholds

✅ Privacy & Security

  • GDPR-compliant user ID hashing
  • PII removal from telemetry
  • Secure default configurations

✅ Integration SDKs

  • Python OTLP SDK with examples
  • Node.js OTLP SDK with TypeScript
  • OpenAI API wrapper (drop-in replacement)

✅ Dashboards

  • AI Development Overview
  • Tool Activity & Usage
  • Sessions & Users
  • Token & Cost Analysis

In Progress

🚧 Dashboard Enhancements

  • Performance Deep Dive dashboard
  • Error Tracking dashboard
  • Multi-Tool Comparison dashboard
  • Resource Usage dashboard

🚧 Documentation

  • Deployment guides (AWS ECS, Kubernetes)
  • Advanced configuration tutorials
  • Video walkthroughs

Roadmap

Phase 2 (Q1 2025) - Advanced Dashboards & Cloud Deployments

  • Complete remaining 4 dashboards
  • AWS ECS Terraform deployment
  • Kubernetes Helm charts
  • Alert rule templates

Phase 3 (Q2 2025) - Extended Tool Support

  • GitHub Copilot VS Code extension
  • AWS CodeWhisperer integration
  • Cursor AI native support
  • Multi-region deployments

Phase 4 (Q3 2025) - Enterprise Features

  • SSO/SAML authentication
  • Multi-tenancy support
  • Advanced RBAC
  • Audit logging

Phase 5 (Q4 2025) - SaaS Platform

  • Hosted offering (optional)
  • Advanced analytics and ML insights
  • Benchmark reports
  • API rate limit management

See PROGRESS.md for detailed task tracking.


🤝 Contributing

We welcome contributions! This is an open-source project under the MIT license.

How to Contribute

  1. Fork the repository

    # "git fork" is not a Git command; fork on GitHub, or use the GitHub CLI:
    gh repo fork PimpMyNines/AI-Development-Insight-Stack --clone
  2. Create a feature branch

    git checkout -b feature/amazing-new-feature
  3. Make your changes

    • Add new AI tool integrations
    • Improve dashboards
    • Fix bugs
    • Enhance documentation
  4. Test thoroughly

    # Start the stack
    docker-compose -f docker/docker-compose.yml up -d
    
    # Run your tests
    # Verify dashboards still work
  5. Submit a pull request

    • Describe what you changed and why
    • Include screenshots for UI changes
    • Reference any related issues

Areas We Need Help

  • Tool Integrations - Add support for more AI tools
  • Dashboard Improvements - Enhance existing dashboards or create new ones
  • Cloud Deployments - Help with AWS ECS and Kubernetes configurations
  • Documentation - Tutorials, guides, and video content
  • Testing - Add automated tests and CI/CD pipelines

Development Setup

# Clone the repo
git clone git@github.com:PimpMyNines/AI-Development-Insight-Stack.git
cd AI-Development-Insight-Stack

# Start development environment
docker-compose -f docker/docker-compose.yml up -d

# Make changes to configs
vim configs/otel/collector-config.yaml

# Reload configuration
docker-compose -f docker/docker-compose.yml restart otel-collector

# Test your changes
# Send test telemetry, check dashboards

Code of Conduct

  • Be respectful and inclusive
  • Provide constructive feedback
  • Focus on what's best for the project
  • Help newcomers get started

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

What this means:

  • ✅ Commercial use allowed
  • ✅ Modification allowed
  • ✅ Distribution allowed
  • ✅ Private use allowed
  • ⚠️ Liability and warranty limitations

🙏 Acknowledgments

This project is built on the shoulders of giants. Special thanks to:

  • OpenTelemetry - For creating the vendor-neutral observability standard
  • Prometheus - For reliable time-series metrics storage
  • Grafana Labs - For Grafana, Loki, and Tempo
  • Anthropic - For Claude and native OTLP support
  • The Open Source Community - For endless inspiration and support

Made with ❤️ by developers, for developers

Start monitoring your AI development tools in 5 minutes

Get Started | View Demos | Read Docs
