jsurrea/AgentX

AgentX — SRE Incident Intake & Triage Agent

Solution Introduction

AgentX is an autonomous SRE agent that ingests incident reports for an e-commerce platform, performs AI-powered triage by analyzing the actual source code, and routes issues through a complete ticketing workflow with real-time notifications. Built for the AgentX Hackathon, it demonstrates how LLM agents with tool-use capabilities can speed up incident response by giving engineers structured, code-grounded triage assessments within seconds of a report being filed.

The system accepts multimodal input (text descriptions, screenshots, log files). It uses Claude via OpenRouter in a tool-use loop to search and read the Saleor e-commerce codebase, then classifies the incident's severity and category. It creates a labeled ticket in a self-hosted Gitea instance and notifies the engineering team by email (Resend) and real-time WebSocket push notifications. When a ticket is resolved, whether through Gitea or the AgentX UI, the original reporter is notified automatically.
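Accepting screenshots and log files means every upload has to be checked before it reaches the agent. The following is a minimal sketch of the three checks named in the feature list (MIME type checking, size limits, path traversal prevention). The size limit, the allow-list, and the function names are assumptions made for illustration, not the values or identifiers used in the AgentX backend.

```python
from pathlib import PurePosixPath

# Hypothetical limit and allow-list; the real values are not documented here.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024
ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "text/plain"}

# Magic-number prefixes for the binary types in the allow-list.
MAGIC_PREFIXES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
}


def sanitize_filename(filename: str) -> str:
    """Keep only the final path component so '../' cannot escape the upload dir."""
    name = PurePosixPath(filename.replace("\\", "/")).name
    if name in ("", ".", ".."):
        raise ValueError("invalid filename")
    return name


def validate_upload(filename: str, declared_mime: str, data: bytes) -> str:
    """Return a safe filename, or raise ValueError if the upload is rejected."""
    if declared_mime not in ALLOWED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {declared_mime}")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")
    # Compare the declared type against the leading bytes of the content.
    prefix = MAGIC_PREFIXES.get(declared_mime)
    if prefix is not None and not data.startswith(prefix):
        raise ValueError("content does not match declared MIME type")
    return sanitize_filename(filename)
```

Checking the leading bytes in addition to the declared type guards against a client that simply labels an arbitrary file as `image/png`.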

Our approach prioritizes production-readiness: structured JSON logging with trace correlation, Langfuse LLM observability, prompt injection guardrails, file upload validation, and a clean Docker Compose deployment that brings up the entire system with a single command.
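As an illustration of structured JSON logging with trace correlation, here is a minimal sketch built only on the Python standard library. The field names, the `trace_id_var` context variable, and the helper functions are hypothetical; the actual code in `observability/` may be organized differently.

```python
import contextvars
import json
import logging
import uuid

# Trace ID for the current request or task; "-" when none has been set.
trace_id_var = contextvars.ContextVar("trace_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object carrying the current trace ID."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "trace_id": trace_id_var.get(),
        }
        return json.dumps(payload)


def new_trace_id() -> str:
    """Generate a trace ID and bind it to the current context."""
    trace_id = uuid.uuid4().hex
    trace_id_var.set(trace_id)
    return trace_id


def build_logger(name: str = "agentx") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Calling `new_trace_id()` at the start of a request and then logging through `build_logger()` gives every line emitted during that request the same `trace_id`, which is what makes the lines searchable as a group.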

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                       Docker Compose Network                         │
│                                                                      │
│  ┌──────────────┐      ┌────────────────┐      ┌──────────────────┐  │
│  │   Frontend   │─────▶│    Backend     │─────▶│    PostgreSQL    │  │
│  │  React/Nginx │      │    FastAPI     │      │    (shared DB)   │  │
│  │  :3000       │      │    :8000       │      │    :5432 (int)   │  │
│  └──────────────┘      └───────┬────────┘      └──────────────────┘  │
│         │                      │                                     │
│         │ WebSocket            │                                     │
│         ◀──────────────────────┤                                     │
│                     ┌──────────┼──────────────┐                      │
│                     ▼          ▼              ▼                      │
│              ┌───────────┐ ┌───────┐  ┌────────────┐                 │
│              │ OpenRouter│ │ Redis │  │  Langfuse  │                 │
│              │  (Claude) │ │ :6379 │  │   :3002    │                 │
│              │ (external)│ │ (int) │  │            │                 │
│              └─────┬─────┘ └───────┘  └────────────┘                 │
│                    │                                                 │
│              ┌─────┴───────────────────┐                             │
│              ▼                         ▼                             │
│        ┌───────────┐          ┌──────────────┐                       │
│        │   Gitea   │          │    Resend    │                       │
│        │   :3001   │          │  (external)  │                       │
│        │  Ticketing│          │    Email     │                       │
│        └───────────┘          └──────────────┘                       │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────┐         │
│  │  /ecommerce-codebase (Saleor — cloned at build time)    │         │
│  └─────────────────────────────────────────────────────────┘         │
└──────────────────────────────────────────────────────────────────────┘

Tech Stack

Component      Technology                 Purpose
Backend        Python / FastAPI           API + Agent orchestration
Frontend       React / TypeScript / Vite  Incident submission & tracking
LLM            Claude via OpenRouter      Multimodal triage with tools
Ticketing      Gitea (self-hosted)        Issue tracking with webhooks
Email          Resend                     Team & reporter notifications
Communicator   WebSocket (in-app)         Real-time push notifications
Observability  Langfuse (self-hosted)     LLM trace visualization
Database       PostgreSQL                 Shared persistence layer
E-commerce     Saleor                     Target codebase for analysis

E2E Flow

  1. Reporter submits incident via UI (text + optional images/logs)
  2. Agent triages: searches Saleor codebase, reads relevant code, classifies severity/category
  3. Agent creates a labeled ticket in Gitea with full triage report
  4. Agent notifies engineering team via email (Resend) and WebSocket
  5. Engineer resolves ticket in Gitea (or via AgentX UI)
  6. Agent notifies the original reporter that the incident is resolved
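The six steps above can be read as a linear status progression for each incident. A minimal sketch of that idea follows; the status names and the `advance` helper are hypothetical and chosen only to mirror the numbered steps, not taken from the AgentX data model.

```python
from enum import Enum


class Status(str, Enum):
    """Hypothetical incident statuses, one per step of the flow."""

    SUBMITTED = "submitted"                  # step 1
    TRIAGED = "triaged"                      # step 2
    TICKET_CREATED = "ticket_created"        # step 3
    TEAM_NOTIFIED = "team_notified"          # step 4
    RESOLVED = "resolved"                    # step 5
    REPORTER_NOTIFIED = "reporter_notified"  # step 6


# Enum members iterate in definition order, which is the flow order.
ORDER = list(Status)


def advance(current: Status) -> Status:
    """Return the next status in the flow, or raise if already at the end."""
    index = ORDER.index(current)
    if index == len(ORDER) - 1:
        raise ValueError(f"{current.value} is the final status")
    return ORDER[index + 1]
```

Modeling the flow as an ordered enum keeps the allowed transitions in one place, so a status pushed to the UI can only ever be the successor of the previous one.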

Features

  • Multimodal input: Text descriptions + image screenshots + log file uploads
  • Code-grounded triage: Agent searches and reads the actual Saleor codebase
  • Tool-use loop: Up to 8 iterations of search/read before producing assessment
  • Severity classification: Critical / High / Medium / Low with confidence scoring
  • Real-time updates: WebSocket push for live status progression
  • Self-hosted ticketing: Gitea with auto-created labels and webhooks
  • Email notifications: Team alerts on new tickets, reporter alerts on resolution
  • Prompt injection guardrails: Pattern detection + safety wrapping
  • File validation: MIME type checking, size limits, path traversal prevention
  • Structured logging: JSON logs with trace ID correlation
  • LLM observability: Langfuse traces for every triage operation
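To make the bounded tool-use loop concrete, here is a minimal sketch with the model replaced by a plain callable, so no OpenRouter call is involved. Only the limit of 8 iterations comes from the list above. `ModelTurn`, `run_triage_loop`, and the fallback returned when the budget runs out are assumptions for illustration, not the project's actual orchestrator.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

MAX_ITERATIONS = 8  # from the feature list: up to 8 search/read iterations
FALLBACK_ASSESSMENT = "inconclusive: tool budget exhausted"  # hypothetical


@dataclass
class ModelTurn:
    """One model response: either a tool request or a final assessment."""

    tool_name: Optional[str] = None
    tool_args: Dict[str, str] = field(default_factory=dict)
    assessment: Optional[str] = None


def run_triage_loop(
    ask_model: Callable[[List[str]], ModelTurn],
    tools: Dict[str, Callable[..., str]],
    report: str,
) -> str:
    """Let the model call tools until it answers or the budget is spent."""
    transcript = [f"report: {report}"]
    for _ in range(MAX_ITERATIONS):
        turn = ask_model(transcript)
        if turn.tool_name is None:
            # No tool requested: the model has produced its assessment.
            return turn.assessment or ""
        tool = tools.get(turn.tool_name)
        if tool is None:
            result = f"unknown tool: {turn.tool_name}"
        else:
            result = tool(**turn.tool_args)
        # Feed the tool result back so the next turn can build on it.
        transcript.append(f"{turn.tool_name} -> {result}")
    return FALLBACK_ASSESSMENT
```

The hard cap bounds both latency and cost per incident: a model that keeps requesting searches cannot run indefinitely, and the caller always gets a string back.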

Quick Start

See QUICKGUIDE.md

Project Structure

AgentX/
├── docker-compose.yml          # All services
├── .env.example                # Environment template
├── backend/
│   ├── Dockerfile
│   └── app/
│       ├── main.py             # FastAPI + lifespan
│       ├── config.py           # Pydantic Settings
│       ├── agent/              # Triage orchestrator + tools + guardrails
│       ├── api/                # REST routes + WebSocket
│       ├── database/           # Models + repositories
│       ├── integrations/       # OpenRouter, Gitea, Resend clients
│       ├── services/           # Pipeline orchestration
│       └── observability/      # Structured logging
├── frontend/
│   ├── Dockerfile
│   └── src/
│       ├── pages/              # Dashboard, Submit, List, Detail
│       ├── components/         # StatusBadge, FileUpload, TriageSummary
│       └── services/           # API client
└── scripts/
    └── init-db.sh              # PostgreSQL multi-DB init

Team

  • Valeria Mora
  • Sebastian Urrea

License

MIT

About

Finalist @ Softserve Agent X Hackathon - Autonomous SRE agent that ingests incident reports for an e-commerce platform, performs AI-powered triage by analyzing the actual source code, and routes issues through a complete ticketing workflow with real-time notifications
