Enterprise Autonomous SRE Platform powered by Gemini 3 Flash Preview
ChronosOps autonomously investigates production incidents by reasoning over deployments and telemetry using Google's Gemini 3 Flash Preview, producing evidence-backed root-cause hypotheses, explainable reasoning, actionable recommendations, and exportable postmortems.
ChronosOps transforms incident response from reactive to autonomous:
- Autonomous Investigation: AI-driven evidence collection and reasoning loop that iteratively improves analysis
- Gemini 3-Powered Reasoning: Advanced AI reasoning with explainable hypotheses and confidence scoring
- Complete Traceability: Visual explainability graphs showing evidence → reasoning → conclusion paths
- Enterprise Safety: Policy-gated operations, RBAC, data redaction, and tamper-evident audit chains
- Production-Ready: One-command setup, comprehensive observability, and enterprise-grade UI
Result: Reduce MTTR (Mean Time To Resolution) by 70% through autonomous root-cause analysis and actionable recommendations.
┌─────────────────────────────────────────────────────────────────┐
│                      Web Console (Next.js)                      │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐        │
│  │    Analyze    │  │   Incidents   │  │    Exports    │        │
│  │     Page      │  │     List      │  │    Center     │        │
│  └───────────────┘  └───────────────┘  └───────────────┘        │
└────────────────────────────┬────────────────────────────────────┘
                             │ HTTP/REST
┌────────────────────────────▼────────────────────────────────────┐
│                       API Server (NestJS)                       │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │              Authentication & Authorization              │   │
│  │          (JWT/OIDC, RBAC: Viewer/Analyst/Admin)          │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐        │
│  │   Incidents   │  │ Investigation │  │   Evidence    │        │
│  │    Module     │  │     Loop      │  │  Collectors   │        │
│  └───────┬───────┘  └───────┬───────┘  └───────┬───────┘        │
│          │                  │                  │                │
│   ┌──────▼──────────────────▼──────────────────▼──────┐         │
│   │         Gemini 3 Flash Preview Reasoning          │         │
│   │ (Hypothesis Ranking, Confidence, Explainability)  │         │
│   └───────────────────────────────────────────────────┘         │
│                                                                 │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐        │
│  │    Policy     │  │     Audit     │  │  Postmortem   │        │
│  │    Gating     │  │     Chain     │  │   Generator   │        │
│  └───────────────┘  └───────────────┘  └───────────────┘        │
└────────────────────────────┬────────────────────────────────────┘
                             │
┌────────────────────────────▼────────────────────────────────────┐
│                     PostgreSQL (Prisma ORM)                     │
│  • Incidents, Analyses, Evidence Bundles                        │
│  • Investigation Sessions & Iterations                          │
│  • Prompt Traces, Postmortems, Audit Events                     │
└─────────────────────────────────────────────────────────────────┘
User/System → Create Incident → Evidence Collection → Evidence Bundle
Sources:
- Scenarios: Pre-defined test scenarios (latency-spike, error-spike-config, etc.)
- Google Cloud: Real incidents from status.cloud.google.com
- Generic API: PagerDuty, Datadog, New Relic, Custom sources
Evidence Bundle → Gemini 3 Reasoning → Hypothesis Ranking → Analysis Result
Gemini 3 Flash Preview:
- Analyzes evidence artifacts (metrics, logs, traces, deploys, config)
- Ranks root-cause hypotheses with confidence scores
- Provides explainable reasoning with evidence references
- Suggests recommended actions and missing evidence needs
Analysis → Confidence Check → Evidence Request → Collection → Re-analysis → Loop
Features:
- Model-Directed: Gemini 3 can request specific evidence types
- Bounded Iterations: Max iterations, confidence targets, no-progress detection
- Policy-Gated: All requests validated against safety policies
- Full Audit: Every iteration recorded with decision JSON
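The bounded loop described above can be sketched as a small function with explicit stop conditions. This is a minimal illustration under stated assumptions, not the ChronosOps implementation: `runInvestigation`, `LoopOptions`, `StopReason`, and the `analyze` callback (which stands in for one collect-and-reason round) are hypothetical names.

```typescript
// Hypothetical names throughout: a sketch of a bounded analyze/collect loop.
type StopReason = "CONFIDENCE_TARGET" | "MAX_ITERATIONS" | "NO_PROGRESS";

interface LoopOptions {
  maxIterations: number;    // hard upper bound on iterations
  confidenceTarget: number; // stop once overall confidence (0-1) reaches this
  minImprovement: number;   // smaller gains between rounds count as no progress
}

interface LoopResult {
  iterations: number;
  finalConfidence: number;
  stopReason: StopReason;
}

// `analyze` stands in for one collect-and-reason round and returns the
// overall confidence (0-1) after that round.
function runInvestigation(
  analyze: (iteration: number) => number,
  opts: LoopOptions,
): LoopResult {
  let previous = 0;
  let iterations = 0;
  while (iterations < opts.maxIterations) {
    iterations++;
    const confidence = analyze(iterations);
    if (confidence >= opts.confidenceTarget) {
      return { iterations, finalConfidence: confidence, stopReason: "CONFIDENCE_TARGET" };
    }
    // From the second round on, require a minimum gain over the previous round.
    if (iterations > 1 && confidence - previous < opts.minImprovement) {
      return { iterations, finalConfidence: confidence, stopReason: "NO_PROGRESS" };
    }
    previous = confidence;
  }
  return { iterations, finalConfidence: previous, stopReason: "MAX_ITERATIONS" };
}
```

Returning the stop reason alongside the iteration count keeps each decision easy to record in an audit trail.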
Analysis → Postmortem Generation → Markdown/JSON Export → Audit Chain Verification
Core AI Capabilities:
- Autonomous Reasoning: Analyzes evidence bundles and ranks hypotheses
- Explainable AI: Provides rationale with evidence references for each hypothesis
- Confidence Scoring: Overall confidence (0-1) and per-hypothesis confidence
- Evidence Requests: Can autonomously request additional evidence types
- Structured Output: Validated JSON responses conforming to strict schemas
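As an illustration of validated structured output, the sketch below parses a model response and rejects anything outside the expected shape. The repository keeps shared Zod schemas in packages/contracts; to stay dependency-free, this example uses hand-written type guards instead. The field names (`overallConfidence`, `hypotheses`, `title`, `evidenceRefs`) are assumptions, not the project's actual schema.

```typescript
// Assumed response shape; the real schema lives in the project's contracts.
interface Hypothesis {
  title: string;
  confidence: number;     // per-hypothesis confidence, 0-1
  evidenceRefs: string[]; // ids of the evidence artifacts backing the hypothesis
}

interface ReasoningResult {
  overallConfidence: number; // 0-1
  hypotheses: Hypothesis[];
}

const isUnitInterval = (v: unknown): v is number =>
  typeof v === "number" && Number.isFinite(v) && v >= 0 && v <= 1;

function isHypothesis(v: unknown): v is Hypothesis {
  if (typeof v !== "object" || v === null) return false;
  const h = v as Record<string, unknown>;
  const refs = h["evidenceRefs"];
  return (
    typeof h["title"] === "string" &&
    isUnitInterval(h["confidence"]) &&
    Array.isArray(refs) &&
    refs.every((r) => typeof r === "string")
  );
}

// Parses raw model output, validates it, and ranks hypotheses by confidence.
function parseReasoningResult(raw: string): ReasoningResult {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("expected a JSON object");
  }
  const d = data as Record<string, unknown>;
  const overall = d["overallConfidence"];
  const hyps = d["hypotheses"];
  if (!isUnitInterval(overall)) {
    throw new Error("overallConfidence must be a number in [0, 1]");
  }
  if (!Array.isArray(hyps) || !hyps.every(isHypothesis)) {
    throw new Error("hypotheses must be an array of valid hypotheses");
  }
  const ranked: Hypothesis[] = [...hyps].sort((a, b) => b.confidence - a.confidence);
  return { overallConfidence: overall, hypotheses: ranked };
}
```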
Configuration:
- Model: gemini-3-flash-preview (configurable via GEMINI_MODEL)
- API Key: Set GEMINI_API_KEY in environment variables
- Prompt Versioning: Tracked for reproducibility
- State Machine: Bounded iterations with stop conditions
- Model-Directed: Gemini 3 requests specific evidence (METRICS, LOGS, TRACES, etc.)
- Deterministic Fallback: Completeness-based plan if model requests unavailable
- Stop Conditions: Confidence target reached, max iterations, no progress
- Full Audit Trail: Every iteration recorded with decision JSON
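One way to read the deterministic fallback is: when the model does not request anything, collect the evidence types whose completeness is lowest. The sketch below assumes that reading. `planNextEvidence`, the threshold, and the request cap are hypothetical, and the `DEPLOYS` and `CONFIG` type names are assumed by analogy with the METRICS, LOGS, and TRACES types named above.

```typescript
// DEPLOYS and CONFIG are assumed names; the source lists METRICS, LOGS, TRACES, etc.
type EvidenceType = "METRICS" | "LOGS" | "TRACES" | "DEPLOYS" | "CONFIG";

// Completeness score per evidence type, from 0 (missing) to 1 (complete).
type Completeness = Record<EvidenceType, number>;

function planNextEvidence(
  modelRequests: EvidenceType[] | null,
  completeness: Completeness,
  threshold = 0.5,
  maxRequests = 2,
): EvidenceType[] {
  // Prefer what the model asked for, capped at maxRequests.
  if (modelRequests !== null && modelRequests.length > 0) {
    return modelRequests.slice(0, maxRequests);
  }
  // Fallback: least complete types first, with ties broken by name so the
  // plan is the same for the same input.
  return (Object.keys(completeness) as EvidenceType[])
    .filter((t) => completeness[t] < threshold)
    .sort((a, b) => completeness[a] - completeness[b] || a.localeCompare(b))
    .slice(0, maxRequests);
}
```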
- Safe Mode (Default ON): Collectors run in STUB mode unless explicitly allowlisted
- Policy Gating: Evidence requests validated (time windows, max items, allowlists)
- RBAC Enforcement: Role-based access (Viewer, Analyst, Admin)
- Data Redaction: Sensitive data (sourcePayload, prompt traces) redacted for non-admins
- Request Limits: Body size limits (2MB), rate limiting configured
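Role-based redaction could look roughly like the following. This is a sketch under assumptions: the `IncidentView` shape, the `redactForRole` helper, and the placeholder value are hypothetical. Only the role names and the redacted fields (sourcePayload, prompt traces) come from the list above.

```typescript
type Role = "Viewer" | "Analyst" | "Admin";

// Hypothetical response shape for an incident returned by the API.
interface IncidentView {
  id: string;
  title: string;
  sourcePayload?: unknown;
  promptTraces?: unknown[];
}

const REDACTED = "[REDACTED]";

// Returns a copy that is safe for the given role; the input is never mutated.
function redactForRole(incident: IncidentView, role: Role): IncidentView {
  if (role === "Admin") return incident;
  const copy: IncidentView = { ...incident };
  if ("sourcePayload" in copy) copy.sourcePayload = REDACTED;
  if ("promptTraces" in copy) copy.promptTraces = [];
  return copy;
}
```

Redacting at the point where the response is built, rather than in the UI, keeps sensitive fields out of the payload entirely.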
- Explainability Graph: Visual trace from evidence → reasoning → conclusion
- Analysis Comparison: Compare two analyses to detect drift
- Audit Chain: Hash-linked audit log for tamper detection
- Integrity Verification: Verify audit chain continuity and detect tampering
- Evidence References: Every hypothesis/action linked to specific evidence artifacts
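A hash-linked audit log can be illustrated with a few lines of Node.js code: each event stores the hash of its predecessor, so changing any earlier event breaks every later link. The event fields, the genesis value, and the function names below are assumptions made for the sketch; the project's actual AuditEvent model may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical event shape for illustration.
interface AuditEvent {
  action: string;
  payload: string;  // serialized event details
  prevHash: string; // hash of the previous event, or GENESIS for the first one
  hash: string;     // SHA-256 over (prevHash, action, payload)
}

const GENESIS = "0".repeat(64);

function hashEvent(prevHash: string, action: string, payload: string): string {
  return createHash("sha256")
    .update(JSON.stringify([prevHash, action, payload]))
    .digest("hex");
}

// Insert-only: returns a new chain with the event appended.
function appendEvent(chain: readonly AuditEvent[], action: string, payload: string): AuditEvent[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS;
  return [...chain, { action, payload, prevHash, hash: hashEvent(prevHash, action, payload) }];
}

// Returns the index of the first broken event, or -1 when the chain is intact.
function verifyChain(chain: readonly AuditEvent[]): number {
  let expectedPrev = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const e = chain[i];
    if (e.prevHash !== expectedPrev || e.hash !== hashEvent(e.prevHash, e.action, e.payload)) {
      return i;
    }
    expectedPrev = e.hash;
  }
  return -1;
}
```

Verification recomputes every hash from the stored fields, so it detects both edited events and events removed from the middle of the chain.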
- Scenarios: Pre-defined test scenarios with realistic telemetry
- Google Cloud: Real incidents from status.cloud.google.com
- Generic API: Unified ingestion endpoint for:
- PagerDuty
- Datadog
- New Relic
- Custom sources
- Normalization: Source-specific data normalized to common format
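Normalization to a common format might be organized as one small mapper per source. The sketch below is illustrative only: the raw field names (`incident_id`, `summary`, `created`, `event_id`, `epoch_ms`) are invented for the example and do not reflect the real PagerDuty or Datadog payload formats, and `NormalizedIncident` is an assumed target shape.

```typescript
type Source = "pagerduty" | "datadog" | "custom";

// Assumed common format produced by ingestion.
interface NormalizedIncident {
  source: Source;
  externalId: string;
  title: string;
  startedAt: string; // ISO 8601, UTC
}

// Illustrative raw payload; NOT the vendors' real formats.
type RawPayload = Record<string, unknown>;

function str(raw: RawPayload, field: string): string {
  const v = raw[field];
  if (typeof v !== "string" || v.length === 0) {
    throw new Error(`missing string field: ${field}`);
  }
  return v;
}

function isoFrom(value: string | number): string {
  const d = new Date(value);
  if (Number.isNaN(d.getTime())) throw new Error(`invalid timestamp: ${value}`);
  return d.toISOString();
}

// One mapper per source keeps source-specific quirks out of the rest of the code.
const normalizers: Record<Source, (raw: RawPayload) => NormalizedIncident> = {
  pagerduty: (raw) => ({
    source: "pagerduty",
    externalId: str(raw, "incident_id"),
    title: str(raw, "summary"),
    startedAt: isoFrom(str(raw, "created")),
  }),
  datadog: (raw) => {
    const ms = raw["epoch_ms"];
    if (typeof ms !== "number") throw new Error("missing numeric field: epoch_ms");
    return {
      source: "datadog",
      externalId: str(raw, "event_id"),
      title: str(raw, "title"),
      startedAt: isoFrom(ms),
    };
  },
  custom: (raw) => ({
    source: "custom",
    externalId: str(raw, "id"),
    title: str(raw, "title"),
    startedAt: isoFrom(str(raw, "startedAt")),
  }),
};

function normalize(source: Source, raw: RawPayload): NormalizedIncident {
  return normalizers[source](raw);
}
```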
Incident Creation (/analyze):
- Source tabs (Scenarios / Google Cloud / API Integration)
- Timeline preview with deployment markers
- Source badges and traceability indicators
Incident Workspace (/incidents/[id]):
- Evidence Bundle: Completeness scores, type grid, hash display
- Analysis Results: Gemini 3 reasoning, hypothesis ranking, confidence scores
- Investigation Loop: Iteration timeline, model requests, stop conditions
- Explainability Graph: Visual trace (ready for interactive visualization)
- Analysis Comparison: Drift detection between analyses
- Postmortem: Markdown/JSON export
- Audit Chain: Integrity verification
Incident List (/incidents):
- Source badges, status filters, quick stats
Export Center (/exports):
- Postmortem and JSON bundle exports with detailed breakdowns
- JWT/OIDC: JWKS validation with Keycloak integration
- RBAC: Three roles (Viewer, Analyst, Admin)
- Service-Layer Enforcement: RBAC checks in controllers and services
- Public Endpoints: Health, readiness, version checks
Collectors:
- GcpMetricsCollector: Metrics data (p95 latency, error rate, RPS)
- DeploysCollector: Deployment history
- ConfigDiffCollector: Configuration changes
- LogsCollector: Log entries
- TracesCollector: Distributed traces
Policy:
- Safe Mode enforcement (STUB vs REAL)
- Time window bounds
- Max items per request
- Allowlist-based access control
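The policy checks above can be pictured as a single gate that every evidence request passes through. The sketch below is an interpretation, not the project's code: the `EvidenceRequest` and `Policy` shapes and the `gate` function are hypothetical, and it assumes that in safe mode a collector runs in STUB mode unless it is allowlisted, while requests outside the time-window or item limits are rejected outright.

```typescript
type CollectorMode = "STUB" | "REAL";

// Hypothetical request and policy shapes.
interface EvidenceRequest {
  collector: string;     // e.g. "GcpMetricsCollector"
  windowMinutes: number; // requested time window
  maxItems: number;      // requested item cap
}

interface Policy {
  safeMode: boolean;
  allowlist: string[];   // collectors allowed to run in REAL mode
  maxWindowMinutes: number;
  maxItems: number;
}

type Decision =
  | { allowed: true; mode: CollectorMode }
  | { allowed: false; reason: string };

function gate(req: EvidenceRequest, policy: Policy): Decision {
  if (req.windowMinutes <= 0 || req.windowMinutes > policy.maxWindowMinutes) {
    return { allowed: false, reason: "time window out of bounds" };
  }
  if (req.maxItems <= 0 || req.maxItems > policy.maxItems) {
    return { allowed: false, reason: "max items out of bounds" };
  }
  // Assumption: safe mode downgrades non-allowlisted collectors to STUB.
  const allowlisted = policy.allowlist.includes(req.collector);
  const mode: CollectorMode = policy.safeMode && !allowlisted ? "STUB" : "REAL";
  return { allowed: true, mode };
}
```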
- V2 Format: Structured JSON with hypotheses, actions, evidence references
- Markdown Export: Human-readable postmortem
- JSON Export: Machine-readable for integrations
- Version Tracking: Generator version for reproducibility
- Node.js (LTS recommended)
- pnpm (workspace manager)
- PostgreSQL (for persistence)
- Gemini API Key - Get one at Google AI Studio
# 1. Clone and setup
git clone <repo-url>
cd chronosops
cp .env.example .env
# 2. Edit .env and set GEMINI_API_KEY
# GEMINI_API_KEY=your-api-key-here
# GEMINI_MODEL=gemini-3-flash-preview # (default)
# 3. Start all services
docker compose up -d --build
# Services available at:
# - Web: http://localhost:3000
# - API: http://localhost:4000
# - PostgreSQL: localhost:5432

# 1. Install dependencies
pnpm install
# 2. Configure environment
cp .env.example .env
# Edit .env:
# - DATABASE_URL=postgresql://user:pass@localhost:5432/chronosops
# - GEMINI_API_KEY=your-api-key-here
# - GEMINI_MODEL=gemini-3-flash-preview
# 3. Setup database
cd apps/api
pnpm prisma migrate dev
pnpm prisma generate
# 4. Seed scenarios (optional)
pnpm seed:scenarios
# 5. Start services
cd ../..
pnpm dev # Starts API + Web in parallel
# Or separately:
pnpm dev:api # API only (port 4000)
pnpm dev:web # Web only (port 3000)

# Check API health
curl http://localhost:4000/v1/health
# Should return: {"ok":true,"database":"connected"}
# Check API version
curl http://localhost:4000/v1/version
# Should return version info with git SHA
# Access web UI
open http://localhost:3000

- GET /v1/health - Health check (database connectivity)
- GET /v1/ready - Readiness check (database + migrations)
- GET /v1/version - Version info (git SHA, build time, prompt version)

- POST /v1/incidents/analyze - Analyze new incident
- POST /v1/incidents/ingest - Generic ingestion API
- GET /v1/incidents - List incidents
- GET /v1/incidents/:id - Get incident details
- POST /v1/incidents/:id/reanalyze - Re-run analysis
- GET /v1/incidents/:id/analyses/:a/compare/:b - Compare analyses
- GET /v1/incidents/:incidentId/analyses/:analysisId/explainability-graph - Get explainability graph
- GET /v1/incidents/:id/verify - Verify audit chain integrity

- POST /v1/incidents/:id/investigate - Start autonomous investigation
- GET /v1/investigations/incident/:incidentId - Get investigation sessions
- GET /v1/investigations/:sessionId - Get session status

- GET /v1/incidents/:id/prompt-traces - List prompt traces
- GET /v1/incidents/prompt-traces/:id - Get specific trace
- GET /v1/incidents/evidence-bundles/:bundleId - Get evidence bundle

- GET /v1/incidents/:id/postmortems - List postmortems
- GET /v1/incidents/postmortems/:id - Get postmortem details
- GET /v1/incidents/postmortems/:id/markdown - Get markdown export
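As a small client-side illustration, the health endpoint can be wrapped in a typed helper. This is a sketch: `checkHealth` and `FetchLike` are hypothetical names, and the response shape is taken from the example output shown earlier, {"ok":true,"database":"connected"}. The fetch function is passed in so the helper can be exercised without a running server.

```typescript
// Minimal fetch-like signature so the sketch needs no DOM or Node typings.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

interface Health {
  ok: boolean;
  database: string;
}

async function checkHealth(baseUrl: string, fetchImpl: FetchLike): Promise<Health> {
  // Strip trailing slashes so "http://host/" and "http://host" behave the same.
  const res = await fetchImpl(`${baseUrl.replace(/\/+$/, "")}/v1/health`);
  if (!res.ok) {
    throw new Error(`health check failed with HTTP ${res.status}`);
  }
  const body: unknown = await res.json();
  if (typeof body !== "object" || body === null) {
    throw new Error("unexpected health response shape");
  }
  const b = body as Record<string, unknown>;
  const ok = b["ok"];
  const database = b["database"];
  if (typeof ok !== "boolean" || typeof database !== "string") {
    throw new Error("unexpected health response shape");
  }
  return { ok, database };
}
```

On a Node.js version that provides a global `fetch`, the call would be `checkHealth("http://localhost:4000", fetch)`.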
Backend:
- NestJS: TypeScript framework with dependency injection
- Prisma: Type-safe database ORM
- PostgreSQL: Persistent storage
- Google Generative AI: Gemini 3 Flash Preview integration
Frontend:
- Next.js: React framework with SSR
- Tailwind CSS: Utility-first styling
- React Query: Data fetching and caching
Infrastructure:
- Docker Compose: One-command local setup
- JWT/OIDC: Authentication with Keycloak
- Structured Logging: Request correlation and observability
- Ingestion: Incident created from scenario, Google Cloud, or API
- Collection: Evidence collectors gather metrics, logs, traces, deploys, config
- Bundle: Evidence artifacts assembled into content-addressed bundle
- Reasoning: Gemini 3 analyzes evidence and ranks hypotheses
- Investigation (Optional): Autonomous loop collects additional evidence
- Postmortem: Structured postmortem generated with all findings
- Audit: All operations recorded in tamper-evident audit chain
Core Entities:
- Incident: Incident metadata (source, timeline, status)
- IncidentAnalysis: Analysis results with reasoning JSON
- EvidenceBundle: Content-addressed evidence artifacts
- InvestigationSession: Autonomous investigation session
- InvestigationIteration: Per-iteration records with decisions
- PromptTrace: Full prompt/request/response traces
- Postmortem: Postmortem snapshots (Markdown + JSON)
- AuditEvent: Hash-linked audit chain events
Design Principles:
- Insert-only (never overwrite) for full audit trail
- Content-addressed bundles (immutable, hash-based)
- Hash-chained audit log (tamper-evident)
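Content addressing means a bundle's identifier is derived from its contents, so identical evidence always yields the same id and any change yields a different one. The sketch below shows the idea with SHA-256. The `Artifact` shape, the canonicalization rule (sort by type, then content), and `bundleId` are assumptions for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical artifact shape.
interface Artifact {
  type: string;    // e.g. "METRICS", "LOGS"
  content: string; // serialized artifact body
}

const cmp = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);

// Derives a stable id from the artifacts. Sorting first makes the id
// independent of the order in which the artifacts were collected.
function bundleId(artifacts: readonly Artifact[]): string {
  const canonical = artifacts
    .map((a) => [a.type, a.content] as const)
    .sort((x, y) => cmp(x[0], y[0]) || cmp(x[1], y[1]));
  return createHash("sha256").update(JSON.stringify(canonical)).digest("hex");
}
```

Because the id changes whenever the content does, a stored bundle can be treated as immutable: a modified bundle is simply a new bundle with a new id.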
Required:
- DATABASE_URL - PostgreSQL connection string
- GEMINI_API_KEY - Google AI API key (for reasoning)
Optional:
- GEMINI_MODEL - Gemini model name (default: gemini-3-flash-preview)
- CHRONOSOPS_SAFE_MODE - Safe mode toggle (default: true)
- CHRONOSOPS_ALLOW_REAL_GCP_METRICS - Allow real metrics collection
- CHRONOSOPS_AUTH_REQUIRED - Enable authentication (default: true)
- CHRONOSOPS_AUTH_ISSUER_URL - OIDC issuer URL
- CHRONOSOPS_AUTH_AUDIENCE - Expected audience claim
- CHRONOSOPS_AUTH_JWKS_URI - JWKS endpoint
See .env.example for complete list.
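Reading these variables with their documented defaults might look like the sketch below. `loadConfig`, `AppConfig`, and the flag-parsing rule (only the literal string "false" disables a flag) are assumptions; the variable names and defaults come from the lists above. The environment is passed in as a plain object so the function can be tested without touching the real process environment.

```typescript
// Hypothetical config shape covering a subset of the documented variables.
interface AppConfig {
  databaseUrl: string;
  geminiApiKey: string;
  geminiModel: string;
  safeMode: boolean;
  authRequired: boolean;
}

type Env = Record<string, string | undefined>;

function required(env: Env, key: string): string {
  const v = env[key];
  if (v === undefined || v === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return v;
}

// Assumption: a flag is off only when set to "false" (case-insensitive).
function flag(env: Env, key: string, fallback: boolean): boolean {
  const v = env[key];
  if (v === undefined || v === "") return fallback;
  return v.toLowerCase() !== "false";
}

function loadConfig(env: Env): AppConfig {
  return {
    databaseUrl: required(env, "DATABASE_URL"),
    geminiApiKey: required(env, "GEMINI_API_KEY"),
    geminiModel: env["GEMINI_MODEL"] || "gemini-3-flash-preview",
    safeMode: flag(env, "CHRONOSOPS_SAFE_MODE", true),
    authRequired: flag(env, "CHRONOSOPS_AUTH_REQUIRED", true),
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)` at startup, so a missing required variable fails fast.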
- Complete Flow Documentation - Step-by-step workflow guide
- Production Workflow Showcase - UI feature highlights
- Debugging Guide - Debug mode and troubleshooting
- Ship Checklist - Production readiness checklist
- Ingestion Integration Guide - API integration guide
# Start API in debug mode (from root)
pnpm debug:api
# Or with breakpoint on start
pnpm debug:api:brk
# Then attach VS Code debugger (F5 → "Attach to API")

- Health Check: curl http://localhost:4000/v1/health
- Create Incident: Use /analyze page to create scenario-based incident
- View Analysis: Check /incidents/[id] for Gemini 3 reasoning results
- Start Investigation: Trigger autonomous investigation loop
- Verify Audit: Check audit chain integrity
chronosops/
├── apps/
│   ├── api/                   # NestJS API server
│   │   ├── src/
│   │   │   ├── modules/       # Feature modules
│   │   │   ├── reasoning/     # Gemini 3 integration
│   │   │   ├── collectors/    # Evidence collectors
│   │   │   └── investigation/ # Autonomous loop
│   │   └── prisma/            # Database schema
│   └── web/                   # Next.js web console
│       └── app/               # Pages and routes
├── packages/
│   └── contracts/             # Shared Zod schemas
└── docs/                      # Documentation
# Development
pnpm dev # Start API + Web
pnpm dev:api # API only
pnpm dev:web # Web only
pnpm debug:api # API with debugger
# Database
cd apps/api
pnpm prisma migrate dev # Run migrations
pnpm prisma generate # Generate Prisma client
pnpm seed:scenarios # Seed test scenarios
# Build
pnpm build # Build all packages

- Post-Deployment Incident: Analyze latency/error spikes after deployment
- Configuration Change: Investigate incidents after config updates
- Multi-Source Correlation: Combine incidents from PagerDuty, Datadog, etc.
- Autonomous Investigation: Let AI collect evidence and iterate to high confidence
- Postmortem Generation: Generate structured postmortems automatically
MIT License
Copyright (c) 2026 ChronosOps
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Built with:
- Google Gemini 3 Flash Preview for advanced AI reasoning
- NestJS for robust backend architecture
- Next.js for modern web UI
- Prisma for type-safe database access
ChronosOps - Transforming incident response through autonomous AI reasoning.