@raycoderhk (Contributor) commented on Sep 6, 2025

Complete Analytics MCP Example with Streamlined Documentation

🎯 Core Features:

  • Model Context Protocol (MCP) server for analytics data with 8 focused tools
  • GitHub API data collection with dynamic 30-day historical progression
  • Cloudflare Analytics Engine integration for serverless time-series storage
  • Grafana dashboard integration with optimized time-series visualization
  • Wrangler secrets management for secure production deployment
  • Copy-paste friendly examples for all MCP tools

🔧 Key Components:

  • generate-batch-data.js: Real GitHub API data collection (anthropics/claude-code)
  • src/: Complete MCP server implementation (7 TypeScript files)
  • test/: Integration tests with full Analytics Engine compatibility
  • Comprehensive tool examples: Individual field breakdowns for easy testing
  • Security-first configuration: No hardcoded credentials in repository

📊 Analytics Capabilities:

  • 8 MCP tools: track_metric, track_batch_metrics, query_analytics, get_metrics_summary, get_time_series, analyze_trends, list_datasets, get_recent_data
  • Real-time data ingestion: 25 data points per batch (Analytics Engine limits)
  • Production-ready deployment: Wrangler secrets for Account ID and API Token
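
The 25-points-per-batch figure above implies that larger payloads have to be split before ingestion. A minimal sketch of that splitting, assuming the cap is simply hard-coded (the helper name `chunkDataPoints` and the constant are hypothetical and not taken from this PR's source):

```typescript
// Assumption: 25 is the per-batch cap quoted in this PR, hard-coded here for illustration.
const MAX_POINTS_PER_BATCH = 25;

// Split `items` into consecutive batches of at most `size` entries each.
function chunkDataPoints<T>(items: readonly T[], size: number = MAX_POINTS_PER_BATCH): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Usage: 60 hypothetical points become three batches of 25, 25, and 10 entries.
const samplePoints = Array.from({ length: 60 }, (_, i) => ({ double1: i }));
const sampleBatchSizes = chunkDataPoints(samplePoints).map((batch) => batch.length);
```

Each resulting batch could then be passed to track_batch_metrics in its own call.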

Production Ready:

  • All tests passing with Analytics Engine SQL compatibility
  • Security-first: No personal credentials exposed in repository
  • User-friendly: Copy-paste examples for immediate testing
  • Cloudflare optimized: Follows official Workers Analytics Engine patterns

Files included:

  • Essential configuration: README.md, package.json, wrangler.jsonc, tsconfig.json
  • Data generation: generate-batch-data.js (GitHub API integration)
  • MCP implementation: src/*.ts (7 TypeScript files)
  • Test suite: test/*.ts + test/tsconfig.json
  • Dependencies: pnpm-lock.yaml

Complete Analytics MCP server implementation with real GitHub data collection:

🎯 Core Features:
- Model Context Protocol (MCP) server for analytics data
- Real GitHub API data collection with dynamic date calculation
- Cloudflare Analytics Engine integration for time-series storage
- Grafana dashboard integration with HTTP endpoints and CORS
- Batch processing support (10-record Analytics Engine limits)

🔧 Key Components:
- generate-batch-data.js: Real GitHub data collection script
- src/: Complete MCP server implementation (7 TypeScript files)
- test/: Integration tests with Analytics Engine compatibility
- Grafana endpoints: /grafana/query and /grafana/health

📊 Real Data Support:
- GitHub API integration for actual repository statistics
- Dynamic 30-day historical data generation
- Multi-repository dashboard support
- Production-ready error handling and fallbacks

📚 Documentation:
- Comprehensive setup guide with MCP Inspector testing
- Step-by-step Grafana dashboard configuration
- Analytics Engine SQL compatibility guide
- Troubleshooting section with real examples

✅ All tests passing (14/14) with Analytics Engine compatibility

Files included (14 essential files):
- README.md, package.json, wrangler.jsonc, tsconfig.json
- generate-batch-data.js (real GitHub data script)
- src/*.ts (7 files: complete MCP server implementation)
- test/*.ts + test/tsconfig.json (tests + TypeScript config)
- pnpm-lock.yaml (dependencies)
Added vitest.config.ts to enable a proper Cloudflare Workers test environment:
- Enables cloudflare:test imports for test utilities
- Configures Workers runtime for integration tests
- All 14 tests now passing ✅

Complete Analytics MCP system now includes 15 essential files:
- 7 src/*.ts files (MCP server implementation)
- 2 test/* files (tests + configs)
- 6 config/doc files (deployment, compilation, documentation)
…encies

Fixed pnpm lockfile sync issue:
- Updated pnpm-lock.yaml to match analytics-mcp package.json
- Resolved dependency specification mismatches
- All workspace dependencies now properly resolved

Dependencies now correctly locked:
- @modelcontextprotocol/sdk (catalog)
- @nullshot/mcp (workspace)
- hono ^4.7.6 (for CORS support)
- All test dependencies properly resolved
…oken consistency

- Add optional 'column' parameter to analyze_trends tool (defaults to double1)
- Users can now specify which column (double1, double2, double3, etc.) to analyze
- Fix logic bug where specifying a column caused 'Insufficient data' error
- Standardize on CLOUDFLARE_API_TOKEN throughout codebase (repository.ts, schema.ts)
- Update README.md to use CLOUDFLARE_API_TOKEN consistently
- Improve UX with better error handling and user control

Tested: Column parameter works correctly for both auto-detection and user-specified columns
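
As an illustration of the optional column parameter described above, here is a minimal sketch of how the default and the validation might be resolved. The helper name `resolveTrendColumn` is hypothetical, and the real analyze_trends implementation may differ; the only facts relied on are the double1 default stated in this commit and the Analytics Engine limit of twenty doubles per data point.

```typescript
// Analytics Engine data points carry up to 20 numeric columns: double1..double20.
const DOUBLE_COLUMN = /^double([1-9]|1[0-9]|20)$/;

// Hypothetical helper: pick the column to analyze, falling back to double1.
function resolveTrendColumn(column?: string): string {
  if (column === undefined || column === "") {
    return "double1"; // default named in the commit message
  }
  if (!DOUBLE_COLUMN.test(column)) {
    throw new Error(`Unsupported column "${column}": expected double1..double20`);
  }
  return column;
}
```

Rejecting unknown names up front also keeps arbitrary text out of any SQL string the column is later interpolated into.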
…m README

- Update analyze_trends to reflect single metric analysis (not array)
- Add column parameter documentation
- Remove algorithm parameter (unused)
- Remove detect_anomalies and track_agent_metrics from tool lists
- Clarify that column parameter auto-detects best column if not specified
- Remove unused algorithm parameter from analyze_trends tool
- Update README documentation to reflect single metric analysis
- Remove detect_anomalies and track_agent_metrics from README
- Keep tools in code for backward compatibility but remove from documentation

Working features:
- analyze_trends with column parameter ✅
- Grafana dashboard with updated API token ✅
- MCP Inspector testing ✅
- Remove track_agent_metrics and detect_anomalies tools completely
- Remove corresponding test methods
- Remove unused schemas from schema.ts
- Fix syntax errors in tools.ts from incomplete removals
- All 12 tests now pass successfully
- Remove incorrect schema validation that expected systemId and metrics
- Fix SQL syntax for Analytics Engine compatibility
- Simplify to basic health check using github_stats dataset
- Handle errors gracefully and return appropriate status
- Remove monitor_system_health tool from tools.ts
- Remove monitorSystemHealth method from repository.ts
- Remove MonitorSystemHealthSchema from schema.ts
- Remove corresponding test from test suite
- Tests now pass with 11 tests instead of 12

The tool was not providing meaningful value since it was just a basic query check.
- Remove hardcoded 'daily_pr_stats' filter from SQL query
- Add dynamic WHERE clause that uses dimensions parameter when provided
- The dimensions parameter now actually filters the data as expected
- If no dimensions provided, returns all data in time range
- If dimensions provided, filters by blob2 IN (dimensions)

This fixes the issue where different dimensions returned the same results.
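
The dynamic WHERE clause described above could be sketched roughly as follows. This is an assumption-laden illustration, not the repository's code: the helper name `buildDimensionFilter` is hypothetical, the fragment is assumed to be appended after an existing time-range predicate, and dimension values are restricted to a conservative character set rather than escaped.

```typescript
// Assumption: dimension values are simple identifiers such as "daily_pr_stats".
const SAFE_DIMENSION = /^[A-Za-z0-9_-]+$/;

// Hypothetical helper: returns "" when no dimensions are given (all data in range),
// otherwise an "AND blob2 IN (...)" fragment to append to an existing WHERE clause.
function buildDimensionFilter(dimensions?: readonly string[]): string {
  if (!dimensions || dimensions.length === 0) {
    return "";
  }
  for (const dimension of dimensions) {
    if (!SAFE_DIMENSION.test(dimension)) {
      throw new Error(`Invalid dimension value: ${JSON.stringify(dimension)}`);
    }
  }
  const quoted = dimensions.map((dimension) => `'${dimension}'`).join(", ");
  return `AND blob2 IN (${quoted})`;
}
```

Validating against an allowlist pattern avoids relying on any particular string-escaping rule in the SQL dialect.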
- Change get_time_series from 'filters' to 'dimensions' parameter for consistency with get_metrics_summary
- Update tool definition, schema, repository method, and tests
- Now both tools use the same 'dimensions: string[]' pattern
- Simplifies API: dimensions: ['claude_rich_data'] vs filters: {event_type: 'claude_rich_data'}
- Maintains same functionality with cleaner, consistent interface

BREAKING CHANGE: get_time_series now uses 'dimensions' instead of 'filters' parameter
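
For callers affected by this breaking change, the migration might look like the sketch below. The two interfaces are inferred only from the before/after example in the commit message (`filters: {event_type: ...}` versus `dimensions: [...]`); the actual schemas probably carry more fields, and `migrateTimeSeriesArgs` is a hypothetical helper.

```typescript
// Shapes inferred from the commit message; the real schemas may carry more fields.
interface LegacyTimeSeriesArgs {
  filters?: Record<string, string>;
}

interface TimeSeriesArgs {
  dimensions?: string[];
}

// Hypothetical helper: convert the old `filters.event_type` form to the new `dimensions` list.
function migrateTimeSeriesArgs(legacy: LegacyTimeSeriesArgs): TimeSeriesArgs {
  const eventType = legacy.filters?.event_type;
  return eventType === undefined ? {} : { dimensions: [eventType] };
}

// Usage: { filters: { event_type: "claude_rich_data" } } becomes { dimensions: ["claude_rich_data"] }.
const migratedArgs = migrateTimeSeriesArgs({ filters: { event_type: "claude_rich_data" } });
```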
- Update get_time_series documentation with new dimensions parameter
- Add practical examples with claude_rich_data and github_stats
- Clarify that get_metrics_summary and get_time_series no longer need code changes
- Update section title to reflect current flexibility
- Note that analyze_trends may still need adaptation

The tools are now much more user-friendly with a consistent dimensions parameter.
- Remove invalid 'ORDER BY timestamp' - Analytics Engine doesn't expose timestamp column
- Change to 'ORDER BY blob3 DESC' which uses the date field
- Remove hardcoded WHERE filter to make examples more generic
- Add 'as Date' alias for clarity
- Examples now work without errors

Fixes Analytics Engine API error: unable to find type of column: timestamp
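
A query using the blob3 ordering from this commit could be assembled as in the sketch below. The endpoint path is Cloudflare's Analytics Engine SQL API; everything else is an assumption made for illustration: the helper name `buildRecentRowsRequest`, the selected columns (blob2 as event type, blob3 as date, double1 as the metric), and the default limit.

```typescript
interface SqlRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describe a SQL API request that orders rows by the date stored in blob3.
function buildRecentRowsRequest(accountId: string, apiToken: string, limit: number = 10): SqlRequest {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new RangeError("limit must be a positive integer");
  }
  // Column choice is an assumption; github_stats is the dataset named in this PR.
  const query = `SELECT blob2, blob3, double1 FROM github_stats ORDER BY blob3 DESC LIMIT ${limit}`;
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/analytics_engine/sql`,
    method: "POST",
    headers: { Authorization: `Bearer ${apiToken}` },
    body: query,
  };
}
```

The returned object can be handed to `fetch(request.url, request)`; keeping request construction separate from the network call makes the query text easy to inspect in tests.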
## Major Enhancements

### 📚 README Documentation
- Standardize all examples to use 'github_stats' dataset (only bound dataset)
- Add comprehensive dataset binding configuration guide
- Enhance get_recent_data documentation with parameters and use cases
- Fix analyze_trends documentation (remove incorrect 'hardcoded filters' claim)
- Remove outdated monitor_system_health references
- Fix SQL examples for Analytics Engine compatibility

### 🔧 Data Generation Script
- Remove mock data fallbacks for transparency (fail clearly if GitHub API unavailable)
- Restore realistic historical progression to current real GitHub values
- Add accurate data source labeling: 'github_api_with_simulated_progression'
- Fix organization name back to 'anthropics' (correct GitHub org)
- Enhance error handling with clear failure messages
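
To make the idea of a simulated progression toward a real current value concrete, here is a minimal sketch. It assumes a plain linear ramp, which is only one possible curve and may not match what generate-batch-data.js actually does; the function and field names are hypothetical, while the source label is the one quoted above.

```typescript
interface DailyPoint {
  date: string; // YYYY-MM-DD (UTC)
  value: number;
  source: string;
}

// Hypothetical helper: build `days` daily points that ramp linearly up to `currentValue`,
// so the final point (dated `today`) equals the value fetched from the GitHub API.
function simulateProgression(currentValue: number, today: Date, days: number = 30): DailyPoint[] {
  if (!Number.isInteger(days) || days <= 0) {
    throw new RangeError("days must be a positive integer");
  }
  const points: DailyPoint[] = [];
  for (let i = 0; i < days; i++) {
    const date = new Date(today.getTime());
    date.setUTCDate(date.getUTCDate() - (days - 1 - i)); // oldest day first
    const fraction = (i + 1) / days; // runs from 1/days up to 1
    points.push({
      date: date.toISOString().slice(0, 10),
      value: Math.round(currentValue * fraction),
      source: "github_api_with_simulated_progression",
    });
  }
  return points;
}
```

Only the last point reflects a real measurement; the label makes that explicit to anyone reading the dashboard data.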

### ⚙️ Repository & Tools
- Enhance list_datasets to show logical datasets grouped by event types
- Fix Analytics Engine SQL compatibility (remove unsupported MIN/MAX on strings)
- Improve dataset discovery (shows github_stats:claude_rich_data with record counts)

## Result
- All 8 tools are now fully flexible with no hardcoded filters
- All 11 tests passing
- Documentation is accurate and user-friendly
- Data generation is transparent with real GitHub API foundation
…P Inspector setup, standardize dataset names, correct Analytics Engine limits
- Remove Power of Built-in Time Series Tools
- Remove What You'll Build
- Remove Troubleshooting
- Remove Verification Complete
- Remove Next Steps
- Remove Working Example Dashboard
- Remove Demonstrated Features
- Remove Technical Architecture

Keep only essential setup and usage sections for cleaner, focused docs
- Remove Step 7: Configure Analytics Engine Access
- Remove Step 8: Verify Setup
- Renumber Step 10 → Step 7: Create Dashboard Panels
- Renumber Step 11 → Step 8: View Your Analytics Dashboard

Steps now flow cleanly: 1→2→3→4→5→6→7→8
- Add complete working examples for all 8 MCP tools:
  * track_metric - single data point tracking
  * track_batch_metrics - bulk data ingestion
  * query_analytics - custom SQL queries
  * get_metrics_summary - aggregated statistics
  * get_time_series - time series data for visualization
  * analyze_trends - trend detection and pattern analysis
  * list_datasets - dataset discovery and metadata
  * get_recent_data - recent records inspection

- Include request/response examples for each tool
- Add Quick Reference table for tool overview
- Organize tools by category (Data Writing, Query & Analysis, Utility)
- Use realistic GitHub PR analytics examples throughout
- Provide clear descriptions and use cases for each tool
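
As a rough picture of what a track_metric request might contain, the sketch below uses the three-field shape (dataset, dimensions, metrics) that this PR documents for the tool. The field types, the sample values, and the mapping of dimensions to blobs and metrics to doubles in key order are all assumptions made for illustration; `toDataPoint` is a hypothetical helper.

```typescript
// Assumed request shape; only the three field names come from the PR.
interface TrackMetricRequest {
  dataset: string;
  dimensions: Record<string, string>;
  metrics: Record<string, number>;
}

// Subset of the Analytics Engine data point shape (blobs and doubles only).
interface DataPoint {
  blobs: string[];
  doubles: number[];
}

// Hypothetical mapping: dimensions become blobs and metrics become doubles, in key order.
function toDataPoint(request: TrackMetricRequest): DataPoint {
  const blobs = Object.keys(request.dimensions).map((key) => request.dimensions[key] as string);
  const doubles = Object.keys(request.metrics).map((key) => request.metrics[key] as number);
  if (blobs.length > 20 || doubles.length > 20) {
    throw new RangeError("Analytics Engine accepts at most 20 blobs and 20 doubles per data point");
  }
  return { blobs, doubles };
}

// Usage with made-up GitHub PR numbers.
const exampleRequest: TrackMetricRequest = {
  dataset: "github_stats",
  dimensions: { repo: "anthropics/claude-code", event_type: "daily_pr_stats" },
  metrics: { prs_created: 12, prs_merged: 9 },
};
const examplePoint = toDataPoint(exampleRequest);
```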
…opy-paste

- Split track_metric JSON into 3 separate fields: dataset, dimensions, metrics
- Add individual JSON snippets for each field so users can copy just what they need
- Keep complete request example for reference
- Makes it easier for users to test individual components
- Add optional column parameter back to analyze_trends tool definition
- Update AnalyzeTrendsSchema to include column field validation
- Pass column parameter to repository.analyzeTrends method
- Break down analyze_trends README example into individual fields
- Column parameter allows users to specify which Analytics Engine column to analyze (double1, double2, etc.)
- Defaults to double1 if not specified (auto-detection)
- Fixes inconsistency where repository supported column but tool definition didn't expose it
Security improvements:
- Remove hardcoded CLOUDFLARE_ACCOUNT_ID from wrangler.jsonc
- Protect personal account information in public repository

Production deployment:
- Add wrangler secret put instructions for both Account ID and API Token
- Provide clear step-by-step setup before deployment
- Use consistent secrets pattern for all credentials
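
On the Worker side, secrets set with `wrangler secret put` arrive as string properties on the environment object. A minimal sketch of a startup check follows; the two variable names are the ones used in this PR, while the interface and the helper `requireCredentials` are hypothetical.

```typescript
// Both values are expected to be provided as Wrangler secrets in production.
interface AnalyticsEnv {
  CLOUDFLARE_ACCOUNT_ID?: string;
  CLOUDFLARE_API_TOKEN?: string;
}

// Hypothetical helper: fail fast with a clear message when a credential is missing.
function requireCredentials(env: AnalyticsEnv): { accountId: string; apiToken: string } {
  const accountId = env.CLOUDFLARE_ACCOUNT_ID;
  const apiToken = env.CLOUDFLARE_API_TOKEN;
  const missing: string[] = [];
  if (!accountId) missing.push("CLOUDFLARE_ACCOUNT_ID");
  if (!apiToken) missing.push("CLOUDFLARE_API_TOKEN");
  if (!accountId || !apiToken) {
    throw new Error(`Missing secrets: ${missing.join(", ")}. Set each with "wrangler secret put <NAME>".`);
  }
  return { accountId, apiToken };
}
```

Checking both values in one place means a misconfigured deployment reports every missing secret at once instead of failing on the first API call.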

Documentation improvements:
- Update tool count from 11 to 8 tools (accurate count)
- Add real list_datasets response with 5 key datasets
- Clarify localhost (.env) vs production (secrets) authentication
- Break down tool examples into copy-paste friendly individual fields
- Fix query_analytics SQL example (remove problematic blob3 alias)
- Add User Details:Read permission for API token requirements

Tool functionality:
- Break down track_metric, track_batch_metrics, get_metrics_summary, get_time_series into individual fields
- Update architecture note to reflect wrangler secrets usage
@raycoderhk changed the title from "feat: Add complete Analytics MCP example with Grafana integration" to "feat: Add Analytics Engine MCP example" on Sep 9, 2025
@raycoderhk (Contributor, Author) commented:

Development, testing, and documentation are complete, and the CI check has passed.
Ready for review.

@Disturbing linked an issue on Sep 15, 2025 that may be closed by this pull request
@allenwyma merged commit 1bad02a into main on Sep 28, 2025
1 check passed
@allenwyma deleted the feat/analytics-mcp-minimal branch on September 28, 2025 at 10:54
Development

Successfully merging this pull request may close these issues.

Analytics Engine Example

3 participants