rangasreenivas/IncidentAnalyzer

AI Incident Analyzer

An ASP.NET Core Web API that analyzes application logs to detect anomalies, identify root causes, and suggest resolutions using Anthropic's Claude models.

Overview

The AI Incident Analyzer processes API logs and application event data to:

  • Detect Anomalies: Identify unusual patterns and error spikes in log data
  • Identify Root Causes: Pinpoint the underlying cause of incidents using AI analysis
  • Suggest Resolutions: Provide actionable, prioritized resolution steps with implementation guidance

Features

AI-Powered Analysis

  • Leverages Claude 3.5 Sonnet for intelligent log analysis
  • Contextual understanding of error patterns and system behavior
  • Falls back to heuristic-based analysis when the Claude API is unavailable

🔍 Comprehensive Incident Analysis

  • Anomaly detection with confidence scoring
  • Root cause identification with affected component mapping
  • Priority-based resolution suggestions with time estimates

📊 Production Ready

  • Full dependency injection and configuration management
  • Structured logging and error handling
  • RESTful API with intuitive endpoints
  • HTTPS support

Architecture

┌──────────────────────────────────────────────────────────────┐
│                     REST API Endpoint                         │
│               POST /api/incidents/analyze                     │
└────────────────────────────┬─────────────────────────────────┘
                             │
                             ▼
        ┌────────────────────────────────────────┐
        │   IncidentsController                  │
        │   - Validates incoming requests        │
        │   - Orchestrates analysis pipeline     │
        └────────────────┬───────────────────────┘
                         │
                         ▼
        ┌────────────────────────────────────────┐
        │  IncidentAnalysisService               │
        │  - Coordinates all analysis steps      │
        │  - Generates incident summary          │
        │  - Calculates severity scores          │
        └─────┬──────────────┬─────────┬─────────┘
              │              │         │
    ┌─────────▼───────┐ ┌───▼──────┐ ┌─▼────────────┐
    │  Anomaly        │ │ Root     │ │ Resolution   │
    │  Detection      │ │ Cause    │ │ Suggestion   │
    │  Service        │ │ Service  │ │ Service      │
    └────────┬────────┘ └───┬──────┘ └──────┬───────┘
             │               │               │
             │               ▼               │
             │    ┌─────────────────────┐   │
             │    │   ClaudeAI Service  │   │
             └───►│                     │◄──┘
                  │ - HTTP Client       │
                  │ - JSON Serialization│
                  │ - Response Parsing  │
                  └────────┬────────────┘
                           │
                           ▼
           ┌──────────────────────────────┐
           │  Anthropic Claude API        │
           │  (claude-3-5-sonnet)         │
           └──────────────────────────────┘
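
The ClaudeAI Service box above (HTTP client, JSON serialization, response parsing) could look roughly like the following. This is a minimal sketch, not the repository's actual ClaudeAIService: the class and method names are hypothetical, and it assumes Anthropic's public Messages API (POST to https://api.anthropic.com/v1/messages with the x-api-key and anthropic-version headers).

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical sketch; not the repository's actual ClaudeAIService.
public static class ClaudeMessagesSketch
{
    // Builds the JSON body for a single-turn Messages API request.
    public static string BuildRequestJson(string model, int maxTokens, double temperature, string prompt)
    {
        var payload = new
        {
            model,
            max_tokens = maxTokens,
            temperature,
            messages = new[] { new { role = "user", content = prompt } }
        };
        return JsonSerializer.Serialize(payload);
    }

    // Returns the first text block of a Messages API response, or an empty string if there is none.
    public static string ExtractText(string responseJson)
    {
        using var doc = JsonDocument.Parse(responseJson);
        foreach (var block in doc.RootElement.GetProperty("content").EnumerateArray())
        {
            if (block.GetProperty("type").GetString() == "text")
            {
                return block.GetProperty("text").GetString() ?? string.Empty;
            }
        }
        return string.Empty;
    }

    // Sends the prompt and returns the model's text reply; throws on non-success status codes.
    public static async Task<string> SendAsync(
        HttpClient http, string apiKey, string model, int maxTokens, double temperature, string prompt)
    {
        using var request = new HttpRequestMessage(HttpMethod.Post, "https://api.anthropic.com/v1/messages");
        request.Headers.Add("x-api-key", apiKey);
        request.Headers.Add("anthropic-version", "2023-06-01");
        request.Content = new StringContent(
            BuildRequestJson(model, maxTokens, temperature, prompt), Encoding.UTF8, "application/json");

        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return ExtractText(await response.Content.ReadAsStringAsync());
    }
}
```

Keeping request building and response parsing as pure functions lets them be unit tested without network access; only SendAsync touches the HTTP client.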

Getting Started

Prerequisites

  • .NET 10.0 SDK
  • An Anthropic API key

Installation

  1. Clone or navigate to the project directory:
cd C:\Srinivas\IncidentAnalyzer\AIIncidentAnalyzer
  2. Configure your Anthropic API key in appsettings.json:
{
  "ClaudeAI": {
    "ApiKey": "sk-ant-your-api-key-here",
    "Model": "claude-3-5-sonnet-20241022",
    "MaxTokens": 2048,
    "Temperature": 0.7
  }
}

Alternatively, set the environment variable:

$env:ANTHROPIC_API_KEY = "sk-ant-your-api-key-here"
  3. Restore dependencies and build:
dotnet restore
dotnet build
  4. Run the application:
dotnet run

The API will be available at https://localhost:7101 (or the configured port).

API Endpoints

POST /api/incidents/analyze

Analyzes logs to detect anomalies, identify root causes, and suggest resolutions.

Request Body:

{
  "logs": [
    {
      "timestamp": "2026-04-01T10:30:45Z",
      "level": "ERROR",
      "source": "OrderService",
      "message": "Timeout connecting to database",
      "stackTrace": null,
      "metadata": {
        "requestId": "req-123",
        "durationMs": 30000
      }
    }
  ],
  "incidentId": "incident-2026-0401-001",
  "serviceName": "OrderProcessingAPI",
  "startTime": "2026-04-01T10:00:00Z",
  "endTime": "2026-04-01T11:00:00Z"
}

Response:

{
  "incidentId": "incident-2026-0401-001",
  "incidentSummary": "Incident detected in OrderProcessingAPI. Analyzed 45 logs. Anomaly detected: True. Primary cause: Database Connection Timeout. Confidence level: 92.5%",
  "analyzedAt": "2026-04-01T10:35:00Z",
  "anomalyDetection": {
    "isAnomaly": true,
    "anomalyScore": 0.78,
    "description": "Detected 12 errors and 8 warnings out of 45 logs",
    "anomalousLogs": [
      "Timeout connecting to database",
      "Connection pool exhausted",
      "Query execution exceeded timeout"
    ]
  },
  "rootCause": {
    "primaryCause": "Database Connection Timeout",
    "description": "The application cannot establish connections to the database within the timeout period",
    "confidence": 0.925,
    "affectedComponent": "Data Access Layer",
    "contributingFactors": [
      "High database load",
      "Slow query execution",
      "Connection pool exhaustion"
    ]
  },
  "recommendedResolutions": [
    {
      "action": "Optimize Database Queries",
      "description": "Review and optimize slow-running database queries",
      "priority": 1,
      "implementationSteps": "1. Run query analysis\n2. Add missing indexes\n3. Refactor complex queries",
      "estimatedResolutionTime": "04:00:00"
    },
    {
      "action": "Increase Connection Pool Size",
      "description": "Increase the number of available database connections",
      "priority": 1,
      "implementationSteps": "1. Update connection string\n2. Reconfigure pool settings\n3. Monitor connection usage",
      "estimatedResolutionTime": "00:30:00"
    }
  ],
  "overallSeverity": 0.82
}

GET /api/incidents/health

Health check endpoint to verify the service is running.

Response:

{
  "status": "healthy",
  "timestamp": "2026-04-01T10:35:00Z"
}

Configuration

appsettings.json

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "AllowedHosts": "*",
  "ClaudeAI": {
    "ApiKey": "your-anthropic-api-key-here",
    "Model": "claude-3-5-sonnet-20241022",
    "MaxTokens": 2048,
    "Temperature": 0.7
  }
}

Configuration Options:

| Setting | Description | Default |
|---|---|---|
| ClaudeAI:ApiKey | Your Anthropic API key | Required |
| ClaudeAI:Model | Claude model to use | claude-3-5-sonnet-20241022 |
| ClaudeAI:MaxTokens | Maximum tokens in response | 2048 |
| ClaudeAI:Temperature | Response creativity (0-1) | 0.7 |
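
The options above, together with the ANTHROPIC_API_KEY alternative, might be modelled as shown here. This is a sketch under stated assumptions: the class names are hypothetical (the real type lives in Configuration/ClaudeAIOptions.cs and is not shown in this README), and the precedence of the configured key over the environment variable is an assumption, not confirmed behavior.

```csharp
using System;

// Hypothetical shape of the options class; property names mirror the ClaudeAI settings.
public class ClaudeAIOptionsSketch
{
    public string? ApiKey { get; set; }
    public string Model { get; set; } = "claude-3-5-sonnet-20241022";
    public int MaxTokens { get; set; } = 2048;
    public double Temperature { get; set; } = 0.7;
}

public static class ApiKeyResolver
{
    // Assumed precedence: configured key first, then ANTHROPIC_API_KEY.
    // The readEnv parameter exists only so the lookup can be replaced in tests.
    public static string Resolve(ClaudeAIOptionsSketch options, Func<string, string?>? readEnv = null)
    {
        if (readEnv == null)
        {
            readEnv = name => Environment.GetEnvironmentVariable(name);
        }

        if (!string.IsNullOrWhiteSpace(options.ApiKey))
        {
            return options.ApiKey;
        }

        var fromEnv = readEnv("ANTHROPIC_API_KEY");
        if (!string.IsNullOrWhiteSpace(fromEnv))
        {
            return fromEnv;
        }

        throw new InvalidOperationException("API key is not configured");
    }
}
```

Failing fast at startup when neither source provides a key surfaces the "API key is not configured" problem before the first request reaches the Claude API.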

Project Structure

AIIncidentAnalyzer/
├── Models/
│   ├── LogEntry.cs                 # Log entry model
│   ├── AnomalyResult.cs            # Anomaly detection result
│   ├── RootCauseAnalysis.cs        # Root cause analysis result
│   ├── ResolutionSuggestion.cs     # Resolution suggestion model
│   ├── IncidentAnalysisRequest.cs  # API request model
│   └── IncidentAnalysisResponse.cs # API response model
├── Services/
│   ├── IClaudeAIService.cs         # Claude AI interface
│   ├── ClaudeAIService.cs          # Claude AI implementation
│   ├── IAnomalyDetectionService.cs
│   ├── AnomalyDetectionService.cs
│   ├── IRootCauseAnalysisService.cs
│   ├── RootCauseAnalysisService.cs
│   ├── IResolutionSuggestionService.cs
│   ├── ResolutionSuggestionService.cs
│   ├── IIncidentAnalysisService.cs
│   └── IncidentAnalysisService.cs
├── Controllers/
│   └── IncidentsController.cs      # REST API controller
├── Configuration/
│   └── ClaudeAIOptions.cs          # Configuration options
├── Program.cs                       # Application startup
├── appsettings.json                 # Configuration file
└── AIIncidentAnalyzer.csproj        # Project file

Data Models

LogEntry

public class LogEntry
{
    public DateTime Timestamp { get; set; }
    public string Level { get; set; }              // DEBUG, INFO, WARN, ERROR
    public string Source { get; set; }             // Service/Component name
    public string Message { get; set; }            // Log message
    public string? StackTrace { get; set; }        // Exception stack trace
    public Dictionary<string, object>? Metadata { get; set; }
}

AnomalyResult

public class AnomalyResult
{
    public bool IsAnomaly { get; set; }
    public double AnomalyScore { get; set; }       // 0 to 1
    public string Description { get; set; }
    public List<string> AnomalousLogs { get; set; }
}

RootCauseAnalysis

public class RootCauseAnalysis
{
    public string PrimaryCause { get; set; }
    public string Description { get; set; }
    public double Confidence { get; set; }         // 0 to 1
    public List<string> ContributingFactors { get; set; }
    public string AffectedComponent { get; set; }
}

ResolutionSuggestion

public class ResolutionSuggestion
{
    public string Action { get; set; }
    public string Description { get; set; }
    public int Priority { get; set; }              // 1=High, 2=Medium, 3=Low
    public string? ImplementationSteps { get; set; }
    public TimeSpan? EstimatedResolutionTime { get; set; }
}
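
To illustrate how the camelCase request JSON maps onto these models, here is a small sketch using System.Text.Json. It repeats LogEntry so the snippet stands alone; the LogEntryJson helper and its method names are hypothetical. One detail worth knowing: values in a Dictionary<string, object> arrive as JsonElement, so numeric metadata such as durationMs has to be read through that type.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public class LogEntry
{
    public DateTime Timestamp { get; set; }
    public string Level { get; set; } = string.Empty;
    public string Source { get; set; } = string.Empty;
    public string Message { get; set; } = string.Empty;
    public string? StackTrace { get; set; }
    public Dictionary<string, object>? Metadata { get; set; }
}

// Hypothetical helper; not part of the repository.
public static class LogEntryJson
{
    // Web defaults: camelCase property names, case-insensitive matching.
    private static readonly JsonSerializerOptions Options = new(JsonSerializerDefaults.Web);

    public static LogEntry Parse(string json) =>
        JsonSerializer.Deserialize<LogEntry>(json, Options)
            ?? throw new JsonException("Log entry was null");

    // Reads metadata.durationMs if present and numeric; returns null otherwise.
    public static long? GetDurationMs(LogEntry entry)
    {
        if (entry.Metadata != null
            && entry.Metadata.TryGetValue("durationMs", out var value)
            && value is JsonElement element
            && element.ValueKind == JsonValueKind.Number)
        {
            return element.GetInt64();
        }
        return null;
    }
}
```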

Usage Examples

Example 1: Analyzing Database Timeout Incident

curl -X POST https://localhost:7101/api/incidents/analyze \
  -H "Content-Type: application/json" \
  -d @sample-logs.json

Example 2: Using PowerShell

$logs = @(
    @{
        timestamp = "2026-04-01T10:30:45Z"
        level = "ERROR"
        source = "OrderService"
        message = "Timeout connecting to database"
    }
)

# Serialize once, at the end, so that logs is sent as a JSON array rather than a string
$body = @{
    logs = $logs
    serviceName = "OrderProcessingAPI"
} | ConvertTo-Json -Depth 5

Invoke-WebRequest -Uri "https://localhost:7101/api/incidents/analyze" `
    -Method Post `
    -Body $body `
    -ContentType "application/json"

How It Works

Step 1: Anomaly Detection

The service analyzes the log distribution to identify unusual patterns:

  • Error rate analysis
  • Pattern recognition
  • Severity assessment
  • Uses Claude AI for intelligent pattern matching
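
As an illustration of the error rate part of this step, a heuristic score could be computed from log levels alone. This is a hypothetical sketch: the README does not document the service's actual formula, and the warning weight and anomaly threshold below are illustrative values only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical heuristic; the weights and threshold are illustrative assumptions.
public static class AnomalyHeuristicSketch
{
    // Weighted share of ERROR and WARN entries among all entries, clamped to the 0..1 range.
    public static double Score(IReadOnlyCollection<string> levels, double warnWeight = 0.5)
    {
        if (levels.Count == 0)
        {
            return 0.0;
        }

        int errors = levels.Count(l => string.Equals(l, "ERROR", StringComparison.OrdinalIgnoreCase));
        int warnings = levels.Count(l => string.Equals(l, "WARN", StringComparison.OrdinalIgnoreCase));
        double score = (errors + warnWeight * warnings) / levels.Count;
        return Math.Clamp(score, 0.0, 1.0);
    }

    // Flags the window as anomalous once the score reaches the threshold.
    public static bool IsAnomaly(double score, double threshold = 0.3) => score >= threshold;
}
```

A score like this needs no API call, which also makes it a candidate for the heuristic path used when the Claude API cannot be reached.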

Step 2: Root Cause Analysis

Claude AI examines error messages and logs to determine the likely cause:

  • Error message analysis
  • Stack trace examination
  • Contextual understanding
  • Generates confidence scores

Step 3: Resolution Generation

Based on the identified root cause, the system suggests actionable fixes:

  • Priority-based recommendations
  • Step-by-step implementation guidance
  • Time estimates for resolution
  • Both immediate and long-term solutions
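
Priority-based ordering of suggestions can be sketched as below. The record and class names are hypothetical, and the tie-break by estimated time is an assumption made for the example; the README only states that suggestions carry a priority and a time estimate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for ResolutionSuggestion.
public record ResolutionSketch(string Action, int Priority, TimeSpan? EstimatedResolutionTime);

public static class ResolutionOrdering
{
    // Priority 1 (high) first; within a priority, the quickest known fix first and unknown durations last.
    public static List<ResolutionSketch> Order(IEnumerable<ResolutionSketch> items) =>
        items.OrderBy(r => r.Priority)
             .ThenBy(r => r.EstimatedResolutionTime ?? TimeSpan.MaxValue)
             .ToList();
}
```

With the two priority 1 suggestions from the sample response, this ordering places the 30 minute connection pool change ahead of the four hour query optimization.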

Error Handling

The service includes robust error handling with fallback mechanisms:

  • API Unavailable: Falls back to heuristic-based analysis
  • Invalid JSON: Returns 400 Bad Request with clear error message
  • Missing Logs: Returns 400 Bad Request
  • Server Error: Returns 500 with error details
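
The "API Unavailable" fallback can be expressed as a small wrapper around the AI call. This is a sketch of the general pattern, not the repository's code: the names are hypothetical, and treating HttpRequestException and TaskCanceledException (which HttpClient raises on timeout) as the fallback triggers is an assumption.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper illustrating the fallback pattern.
public static class FallbackSketch
{
    // Runs the AI-backed analysis; on transport failure or timeout, uses the heuristic result instead.
    public static async Task<T> WithFallbackAsync<T>(Func<Task<T>> aiAnalysis, Func<T> heuristic)
    {
        try
        {
            return await aiAnalysis();
        }
        catch (Exception ex) when (ex is HttpRequestException || ex is TaskCanceledException)
        {
            return heuristic();
        }
    }
}
```

Other exception types are deliberately left uncaught here, so programming errors still surface as failures instead of being masked by heuristic output.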

Performance Considerations

  • Timeout: Set reasonable timeout values for Claude API calls
  • Log Size: Limits analysis to first 10 error logs to manage token usage
  • Batch Processing: Consider batching large log sets for analysis
  • Caching: Future versions can cache common patterns for faster analysis
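
The log size limit can be pictured as a filter applied while building the prompt. The sketch below is hypothetical: the record, class, and prompt wording are invented for illustration, and only the limit of 10 error logs comes from the list above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical, trimmed-down stand-in for LogEntry.
public record LogLine(DateTime Timestamp, string Level, string Source, string Message);

public static class PromptSketch
{
    public const int MaxErrorLogs = 10; // the limit described in the list above

    // Keeps only the first MaxErrorLogs ERROR entries and formats one line per entry.
    public static string BuildPrompt(IEnumerable<LogLine> logs)
    {
        var sb = new StringBuilder("Analyze these error logs and identify the likely root cause:\n");
        var selected = logs
            .Where(l => string.Equals(l.Level, "ERROR", StringComparison.OrdinalIgnoreCase))
            .Take(MaxErrorLogs);

        foreach (var log in selected)
        {
            sb.Append('[').Append(log.Timestamp.ToString("o")).Append("] ")
              .Append(log.Source).Append(": ").Append(log.Message).Append('\n');
        }
        return sb.ToString();
    }
}
```

For the timeout point, HttpClient.Timeout can be set on the client used for Claude calls, so a slow upstream response fails predictably.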

Security

  • API Key Management: Never commit API keys to version control
  • Environment Variables: Use environment variables for sensitive configuration
  • HTTPS: Always use HTTPS in production
  • Input Validation: All requests are validated before processing

Troubleshooting

Issue: "API key is not configured"

Solution: Ensure that either the ANTHROPIC_API_KEY environment variable or the ClaudeAI:ApiKey setting in appsettings.json is set.

Issue: "No logs provided for analysis"

Solution: Ensure your request includes a non-empty logs array.

Issue: "Error calling Claude API"

Solution:

  • Verify your API key is valid
  • Check internet connectivity
  • Ensure you have available API quota

Issue: "Failed to deserialize Claude response"

Solution: This indicates Claude returned invalid JSON. Check logs for the raw response.
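
One possible cause, stated here as an assumption and not something this README confirms, is that the model wraps its JSON in explanatory prose or a Markdown code fence. A defensive parse along the following lines tolerates that; the class and method names are hypothetical.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper; not the repository's actual parsing code.
public static class ClaudeJsonSketch
{
    // Tries to parse the outermost {...} span of a reply; returns false instead of throwing.
    public static bool TryExtractJson(string reply, out JsonElement result)
    {
        result = default;
        int start = reply.IndexOf('{');
        int end = reply.LastIndexOf('}');
        if (start < 0 || end <= start)
        {
            return false;
        }

        try
        {
            using var doc = JsonDocument.Parse(reply.Substring(start, end - start + 1));
            result = doc.RootElement.Clone(); // Clone so the element outlives the disposed document
            return true;
        }
        catch (JsonException)
        {
            return false;
        }
    }
}
```

When this returns false, logging the raw reply and switching to the heuristic analysis keeps the request from failing outright.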

Future Enhancements

  • 📊 Dashboard for incident tracking
  • 📈 Analytics and trend analysis
  • 🔔 Real-time alerting
  • 💾 Database integration for incident history
  • 🔌 Integration with popular monitoring tools (Datadog, New Relic, etc.)
  • 📱 Mobile app support
  • 🧠 Custom ML models for specific domains

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

This project is provided as-is for incident analysis and monitoring purposes.

Support

For issues or questions:

  1. Check the Troubleshooting section
  2. Review Claude API documentation: https://docs.anthropic.com
  3. Check application logs for detailed error information

Version

Current Version: 1.0.0
Last Updated: April 1, 2026
Framework: .NET 10.0


Built with ❤️ using Claude AI for intelligent incident analysis
