Asmi2911/novaops-ai

NovaOps AI

AI-powered incident analysis and remediation system for operational logs

NovaOps is an end-to-end system that ingests raw system logs and produces structured incident analysis, including severity classification, root-cause hypotheses, remediation plans, and validation checks.

It combines LLM reasoning, vector retrieval, and structured outputs to simulate the incident-debugging workflows used in DevOps and SRE environments.


✨ Features

  • 🔍 Log Analysis — Paste raw logs and extract meaningful signals
  • 🧠 AI Reasoning Engine — Structured triage and diagnosis using LLMs
  • 📊 Incident Console — Predefined incident simulations (autoscaling, rollout, CPU spike)
  • 🧩 Evidence Retrieval — FAISS-based vector search for contextual grounding
  • 🛠️ Remediation Planning — Step-by-step actionable recovery plan
  • Validation Checks — Ensures correctness of applied fixes
  • ⚠️ Risk Awareness — Safety considerations before applying changes
  • 📁 History Tracking — Persisted analysis results
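The evidence-retrieval idea can be illustrated without any external dependencies. The sketch below is a stand-in, not the project's code: it swaps FAISS and a real embedding model for a toy bag-of-words vector and brute-force cosine similarity. The names `embed`, `cosine`, `retrieve`, and `RUNBOOK_SNIPPETS` are hypothetical, and the snippets are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k corpus entries most similar to the query (brute force, not FAISS)."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hypothetical knowledge-base snippets used only for this illustration.
RUNBOOK_SNIPPETS = [
    "Query timeout on db-service: check slow queries and connection pool saturation",
    "Autoscaling misconfiguration: verify min and max replica limits",
    "CPU spike: inspect recent deployments and hot code paths",
]

top = retrieve("ERROR db-service Query timeout query_id=q-9123", RUNBOOK_SNIPPETS, k=1)
print(top[0][1])
```

In a setup like the one this README describes, the brute-force loop would presumably be replaced by a FAISS index over dense embeddings. The calling pattern (embed the query, fetch the top-k neighbours, pass them to the LLM as context) stays the same.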

🏗️ Architecture

User Input (Logs / Incident)
        ↓
Embedding + Retrieval (FAISS)
        ↓
LLM Reasoning (Triage)
        ↓
Hypothesis Generation
        ↓
Remediation Planning
        ↓
Validation + Reporting
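The flow above can be read as a chain of stages that each enrich a shared incident record. The following is a minimal sketch under that assumption, not the repository's implementation: every stage is a deterministic placeholder for the real retrieval or LLM call, and `IncidentContext`, `STAGES`, and `run_pipeline` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class IncidentContext:
    """Shared record that each stage reads from and adds to."""
    raw_logs: str
    evidence: list[str] = field(default_factory=list)
    severity: str = "unknown"
    hypotheses: list[str] = field(default_factory=list)
    plan: list[str] = field(default_factory=list)
    checks: list[str] = field(default_factory=list)


def retrieve_evidence(ctx: IncidentContext) -> IncidentContext:
    # Placeholder for embedding + FAISS retrieval: keep only ERROR lines.
    ctx.evidence = [ln for ln in ctx.raw_logs.splitlines() if " ERROR " in ln]
    return ctx


def triage(ctx: IncidentContext) -> IncidentContext:
    # Placeholder for the LLM triage call: a fixed rule on evidence count.
    ctx.severity = "high" if len(ctx.evidence) >= 2 else "low"
    return ctx


def hypothesize(ctx: IncidentContext) -> IncidentContext:
    ctx.hypotheses = [f"Possible cause behind: {ln}" for ln in ctx.evidence]
    return ctx


def plan_remediation(ctx: IncidentContext) -> IncidentContext:
    ctx.plan = [f"Investigate hypothesis {i}" for i, _ in enumerate(ctx.hypotheses, 1)]
    return ctx


def validate(ctx: IncidentContext) -> IncidentContext:
    ctx.checks = [f"Confirm step {i} resolved the error" for i, _ in enumerate(ctx.plan, 1)]
    return ctx


# Stage order mirrors the diagram: retrieval, triage, hypotheses, plan, validation.
STAGES: list[Callable[[IncidentContext], IncidentContext]] = [
    retrieve_evidence, triage, hypothesize, plan_remediation, validate,
]


def run_pipeline(raw_logs: str) -> IncidentContext:
    ctx = IncidentContext(raw_logs=raw_logs)
    for stage in STAGES:
        ctx = stage(ctx)
    return ctx


result = run_pipeline(
    "2026-04-11T09:11:30Z ERROR db-service    Query timeout query_id=q-9123 duration=5000ms\n"
    "2026-04-11T09:11:30Z ERROR api-gateway   Request failed route=/api/v1/orders status=504\n"
)
print(result.severity, len(result.plan))
```

Keeping the stages as plain functions over one record makes it easy to swap a placeholder for a real model call, or to log the intermediate state between stages.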

🛠️ Tech Stack

Frontend

  • Next.js (App Router)
  • TypeScript
  • Tailwind CSS

Backend

  • FastAPI
  • FAISS (vector search)
  • Pydantic
  • AWS Bedrock (Amazon Nova)

📸 Screens

Analyze Page

Paste logs → get structured output:

  • Summary
  • Severity
  • Root cause
  • Plan
  • Validation
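One way to picture the structured output is as a small schema that the model's JSON response must satisfy before it is shown. The sketch below uses only the standard library as a stand-in for the Pydantic models the backend presumably defines. The field names follow the list above, while `IncidentReport`, `parse_report`, and the allowed severity levels are assumptions made for illustration.

```python
import json
from dataclasses import dataclass

# Assumed severity scale; the project's actual levels are not documented here.
ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}
REQUIRED_FIELDS = {"summary", "severity", "root_cause", "plan", "validation"}


@dataclass(frozen=True)
class IncidentReport:
    summary: str
    severity: str
    root_cause: str
    plan: list[str]
    validation: list[str]


def parse_report(raw: str) -> IncidentReport:
    """Parse a JSON string into an IncidentReport, rejecting malformed output."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("report must be a JSON object")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"unknown severity: {data['severity']!r}")
    return IncidentReport(
        summary=str(data["summary"]),
        severity=data["severity"],
        root_cause=str(data["root_cause"]),
        plan=[str(step) for step in data["plan"]],
        validation=[str(check) for check in data["validation"]],
    )


report = parse_report(json.dumps({
    "summary": "Database query timeouts caused 504 responses at the gateway",
    "severity": "high",
    "root_cause": "Slow query on db-service",
    "plan": ["Identify the slow query", "Add an index or tune the query"],
    "validation": ["Confirm 504 rate returns to baseline"],
}))
print(report.severity, len(report.plan))
```

Rejecting a response that fails validation, rather than rendering it, is what separates a structured output from raw LLM text.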

Incident Console

Predefined real-world scenarios:

  • Autoscaling misconfiguration
  • Bad deployment rollout
  • CPU spike

🚀 Getting Started

1. Clone repository

git clone https://github.com/YOUR_USERNAME/novaops-ai.git
cd novaops-ai

2. Backend setup

cd backend
python -m venv .venv
.venv\Scripts\activate      # Windows
source .venv/bin/activate   # macOS / Linux (use instead of the line above)
pip install -r requirements.txt
uvicorn app.main:app --reload

3. Frontend setup (in a second terminal, from the repository root, while the backend keeps running)

cd frontend
npm install
npm run dev

4. Open the app in a browser:

👉 http://localhost:3000


🧪 Sample Logs (for testing)

Paste this into the Analyze page:

2026-04-11T09:10:02Z INFO  auth-service  Login attempt user=admin ip=192.168.1.45
2026-04-11T09:10:05Z WARN  auth-service  Failed login attempt user=admin ip=192.168.1.45
2026-04-11T09:10:07Z WARN  auth-service  Failed login attempt user=admin ip=192.168.1.45
2026-04-11T09:10:10Z ERROR auth-service  Account locked due to repeated failures user=admin

2026-04-11T09:11:30Z ERROR db-service    Query timeout query_id=q-9123 duration=5000ms
2026-04-11T09:11:30Z ERROR api-gateway   Request failed route=/api/v1/orders status=504

2026-04-11T09:12:20Z ERROR order-service Exception in OrderProcessor
java.lang.NullPointerException: Cannot read property 'price' of undefined
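Before logs like these reach a model, it can help to split them into fields. The sketch below parses the `timestamp LEVEL service message` layout used in the sample and attaches any non-matching line (such as the exception text) to the preceding entry. It illustrates one possible preprocessing step and is not taken from the repository; `LINE_RE` and `parse_logs` are hypothetical names.

```python
import re
from collections import Counter

# Matches lines shaped like: <timestamp> <LEVEL> <service> <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>INFO|WARN|ERROR)\s+(?P<service>\S+)\s+(?P<message>.*)$"
)


def parse_logs(text: str) -> list[dict[str, str]]:
    """Split raw log text into entries; unmatched lines extend the previous entry."""
    entries: list[dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank separator lines
        match = LINE_RE.match(line)
        if match:
            entries.append(match.groupdict())
        elif entries:
            # Continuation line, e.g. an exception message or stack trace.
            entries[-1]["message"] += "\n" + line
    return entries


SAMPLE = """\
2026-04-11T09:10:10Z ERROR auth-service  Account locked due to repeated failures user=admin

2026-04-11T09:11:30Z ERROR db-service    Query timeout query_id=q-9123 duration=5000ms

2026-04-11T09:12:20Z ERROR order-service Exception in OrderProcessor
java.lang.NullPointerException: Cannot read property 'price' of undefined
"""

entries = parse_logs(SAMPLE)
counts = Counter(entry["level"] for entry in entries)
print(len(entries), counts["ERROR"], entries[1]["service"])
```

Grouping continuation lines with their parent entry keeps the exception text next to the service that raised it, which matters when entries are later embedded or summarized one by one.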

🎯 Use Cases

  • DevOps incident debugging
  • SRE workflows
  • Log anomaly detection
  • Root cause analysis
  • AI-assisted operations

🧠 Key Highlights

  • Combines retrieval + reasoning + planning
  • Produces structured outputs (not raw LLM text)
  • Designed as a multi-stage AI system
  • Simulates real-world production incident workflows

📌 Future Improvements

  • Multi-agent orchestration (planner, validator agents)
  • Real-time log streaming
  • Incident clustering
  • Slack / PagerDuty integration
  • Cloud deployment (Render + Vercel)

👩‍💻 Author

Asmita Sonavane

