AI-powered incident analysis and remediation system for operational logs
NovaOps is an end-to-end system that ingests raw system logs and produces structured incident analysis, including severity classification, root-cause hypotheses, remediation plans, and validation checks.
It combines LLM reasoning, vector retrieval, and structured outputs to simulate the incident-debugging workflows used in DevOps and SRE environments.
- 🔍 Log Analysis — Paste raw logs and extract meaningful signals
- 🧠 AI Reasoning Engine — Structured triage and diagnosis using LLMs
- 📊 Incident Console — Predefined incident simulations (autoscaling, rollout, CPU spike)
- 🧩 Evidence Retrieval — FAISS-based vector search for contextual grounding
- 🛠️ Remediation Planning — Step-by-step actionable recovery plan
- ✅ Validation Checks — Ensures correctness of applied fixes
- ⚠️ Risk Awareness — Safety considerations before applying changes
- 📁 History Tracking — Persisted analysis results
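
The Evidence Retrieval feature can be pictured with a small, dependency-free sketch. It is not NovaOps code and it does not call FAISS: it swaps in a toy bag-of-words embedding and a brute-force cosine search so that it runs on the standard library alone. The names `embed`, `cosine`, `top_k`, and the `RUNBOOK` snippets are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, docs: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Brute-force nearest-neighbour search over all documents.

    A vector index such as FAISS would replace this loop at scale.
    """
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in docs]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]


# Hypothetical knowledge-base snippets, invented for illustration only.
RUNBOOK = [
    "Query timeout on db-service: check slow queries and connection pool limits",
    "Account locked after repeated failed login attempts: review auth-service logs",
    "CPU spike on worker nodes: inspect autoscaling thresholds",
]

if __name__ == "__main__":
    query = "ERROR db-service Query timeout query_id=q-9123"
    for score, doc in top_k(query, RUNBOOK):
        print(f"{score:.2f}  {doc}")
```

The retrieved snippets would then be passed to the reasoning step as grounding context; how NovaOps actually builds and queries its index is not shown in this README.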
User Input (Logs / Incident)
↓
Embedding + Retrieval (FAISS)
↓
LLM Reasoning (Triage)
↓
Hypothesis Generation
↓
Remediation Planning
↓
Validation + Reporting
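
One way to read the flow above is as a chain of stages that each add a field to a shared incident record. The sketch below is a hypothetical skeleton under that assumption: every stage is a stub with a hard-coded placeholder rule where the real system would call retrieval or an LLM, and none of the function names come from the repository.

```python
from typing import Callable

State = dict  # shared incident record passed from stage to stage


def retrieve(state: State) -> State:
    # Stub: a real stage would embed the logs and query a vector index.
    evidence = [line for line in state["logs"] if "ERROR" in line]
    return {**state, "evidence": evidence}


def triage(state: State) -> State:
    # Stub: placeholder rule standing in for LLM-based triage.
    return {**state, "severity": "high" if state["evidence"] else "low"}


def hypothesize(state: State) -> State:
    hypotheses = [f"Possible cause related to: {e}" for e in state["evidence"]]
    return {**state, "hypotheses": hypotheses}


def plan(state: State) -> State:
    return {**state, "plan": [f"Investigate: {h}" for h in state["hypotheses"]]}


def validate(state: State) -> State:
    # Stub: emit one follow-up check per piece of evidence.
    checks = [f"Confirm no new occurrences of: {e}" for e in state["evidence"]]
    return {**state, "validation": checks}


# Stage order mirrors the diagram: retrieval, triage, hypotheses, plan, validation.
PIPELINE: list[Callable[[State], State]] = [retrieve, triage, hypothesize, plan, validate]


def run_pipeline(logs: list[str]) -> State:
    state: State = {"logs": logs}
    for stage in PIPELINE:
        state = stage(state)
    return state


if __name__ == "__main__":
    result = run_pipeline(["INFO ok", "ERROR db-service Query timeout"])
    print(result["severity"])
    print(result["plan"])
```

Keeping each stage a plain function over one record makes it straightforward to test a stage in isolation or to swap a stub for a model call later.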
- Next.js (App Router)
- TypeScript
- Tailwind CSS
- FastAPI
- FAISS (vector search)
- Pydantic
- AWS Bedrock (Amazon Nova)
Paste logs → get structured output:
- Summary
- Severity
- Root cause
- Plan
- Validation
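
The five fields listed above map naturally onto a typed schema. The stack lists Pydantic, but the project's actual models are not shown in this README, so the sketch below uses standard-library dataclasses to stay dependency-free. The class name `IncidentReport`, the `Severity` levels, and the field names are assumptions, and the example values are invented.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class Severity(Enum):
    # Assumed severity scale; the real levels may differ.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass
class IncidentReport:
    summary: str
    severity: Severity
    root_cause: str
    plan: list[str] = field(default_factory=list)
    validation: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        data = asdict(self)
        # Emit the plain string value rather than the Enum member.
        data["severity"] = self.severity.value
        return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Illustrative values only, not real NovaOps output.
    report = IncidentReport(
        summary="Order requests are failing with gateway timeouts.",
        severity=Severity.HIGH,
        root_cause="Hypothesis: slow database queries upstream of the API gateway.",
        plan=["Inspect slow-query logs", "Check connection pool saturation"],
        validation=["Order endpoint stops returning 504"],
    )
    print(report.to_json())
```

A fixed schema like this is what lets a frontend render each field in its own panel instead of parsing free-form model text.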
Predefined real-world scenarios:
- Autoscaling misconfiguration
- Bad deployment rollout
- CPU spike
git clone https://github.com/YOUR_USERNAME/novaops-ai.git
cd novaops-ai

cd backend
python -m venv .venv
.venv\Scripts\activate # Windows
source .venv/bin/activate # macOS/Linux
pip install -r requirements.txt
uvicorn app.main:app --reload

cd frontend
npm install
npm run dev

Paste this into the Analyze page:
2026-04-11T09:10:02Z INFO auth-service Login attempt user=admin ip=192.168.1.45
2026-04-11T09:10:05Z WARN auth-service Failed login attempt user=admin ip=192.168.1.45
2026-04-11T09:10:07Z WARN auth-service Failed login attempt user=admin ip=192.168.1.45
2026-04-11T09:10:10Z ERROR auth-service Account locked due to repeated failures user=admin
2026-04-11T09:11:30Z ERROR db-service Query timeout query_id=q-9123 duration=5000ms
2026-04-11T09:11:30Z ERROR api-gateway Request failed route=/api/v1/orders status=504
2026-04-11T09:12:20Z ERROR order-service Exception in OrderProcessor
java.lang.NullPointerException: Cannot read property 'price' of undefined
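
Before any model sees the sample above, its lines can be split into fields. The following is a minimal sketch that assumes the `timestamp LEVEL service message` layout visible in the sample; the `parse_logs` helper and its regex are illustrative and are not taken from NovaOps. Lines that do not match the layout, such as the exception line, are attached to the preceding entry.

```python
import re
from collections import Counter

# Assumed layout: ISO-8601 UTC timestamp, level, service name, free-text message.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s+"
    r"(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+(?P<message>.*)$"
)


def parse_logs(raw: str) -> list[dict]:
    """Parse raw log text into a list of entries with ts, level, service, message."""
    entries: list[dict] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match:
            entries.append({**match.groupdict(), "extra": []})
        elif entries:
            # Continuation line (e.g. a stack trace): keep it with the previous entry.
            entries[-1]["extra"].append(line)
    return entries


SAMPLE = """\
2026-04-11T09:11:30Z ERROR db-service Query timeout query_id=q-9123 duration=5000ms
2026-04-11T09:12:20Z ERROR order-service Exception in OrderProcessor
java.lang.NullPointerException: Cannot read property 'price' of undefined
"""

if __name__ == "__main__":
    parsed = parse_logs(SAMPLE)
    print(Counter(entry["level"] for entry in parsed))
    print(parsed[-1]["service"], parsed[-1]["extra"])
```

Counting entries per level or per service is a cheap first signal that can be handed to the triage step alongside the raw text.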
- DevOps incident debugging
- SRE workflows
- Log anomaly detection
- Root cause analysis
- AI-assisted operations
- Combines retrieval + reasoning + planning
- Produces structured outputs (not raw LLM text)
- Designed as a multi-stage AI system
- Simulates real-world production incident workflows
- Multi-agent orchestration (planner, validator agents)
- Real-time log streaming
- Incident clustering
- Slack / PagerDuty integration
- Cloud deployment (Render + Vercel)
Asmita Sonavane