Devopsiphai

A Claude Code skill that audits the operational health of a project in production.


Why

Most vibe-coded projects ship fast, and every time I onboard onto a new project I find the same gaps. devopsiphai turns those gaps into five questions:

  1. Can I onboard easily? Can a new developer clone the repo, start the project, and contribute without asking anyone for anything?

  2. Can I deploy safely? Can I ship a new version to staging and production without being afraid of breaking either environment?

  3. Do I know what is running where? After a deploy, can I tell exactly what version is live, when it was deployed, and what migration state the database is in?

  4. Can I see what is happening? Do I have metrics, error tracking, and analytics to know whether the deploy went well and whether users are behaving as expected?

  5. Can I recover if something goes wrong? If errors are reported, can I roll back the application, the database, and the frontend independently — and do I have a runbook for each?

These are the questions that matter when you are running a live product with a small team. The ARC score answers all five, expressed as a grade, and gamifies the process of getting to an operationally sound project.


ARC Framework

ARC — Automation, Reporting, Control — is a framework for operational excellence that I built with previous CTOs, designed to be more digestible for small teams than existing frameworks. Every audit is scored against three pillars:

| Pillar | Answers | Covers |
|---|---|---|
| Automation | Can I deploy safely? · Do I know what is running where? | CI/CD, artifact promotion, version tracking, migration automation, security scanning |
| Reporting | Can I see what is happening? | Observability, metrics, error tracking, alerting, dashboards, product analytics |
| Control | Can I onboard easily? · Can I recover? | Onboarding, secrets management, backup/rollback, IaC reproducibility, git auditability |

Grades

A  — Present and enforced (automated, no manual bypass possible)
B  — Present, partially enforced
C  — Present, not enforced (exists but can be bypassed)
D  — Partially present, not enforced
F  — Absent

One grade per pillar. One overall ARC grade.


What it produces

A full audit run produces two files in /tmp/devopsiphai/<timestamp>/:

<project>-audit-<timestamp>.md

The full audit report covering:

  • Phase 1 — Factual map of the project: repo structure, stack, architecture, auth, database, secrets, observability, hosting, onboarding, git workflow, testing, AI usage, code quality, questionable decisions
  • Phase 2 — Deep domain audits: CI/CD, containers, IaC, security, observability, onboarding
  • Phase 3 — ARC score with the five questions answered, per-pillar grades, findings by severity, and priority actions
  • Phase 4 — Workflow checks: runtime lockdown, developer environment, testing layers, task runner, local-first stack
  • Phase 5 — Delivery checks: artifact identity, pipeline correctness, migration automation, rollback strategy, infrastructure codification

TODO.md

A flat, ordered, actionable checklist derived entirely from the audit findings:

# TODO — my-project
Generated by devopsiphai | 2026-03-17 | ARC: D/D/F overall: D

## 🔴 Immediate

- [ ] [XS] Rotate AWS IAM key AKIA... — go to AWS Console → IAM → Security credentials
- [ ] [XS] Run `git rm --cached backend/.aws/.env.dev` and push

## 🟠 Workflow Foundation

- [ ] [XS] Add `.nvmrc` to frontend/ with content `20` (matches node:20-slim in Dockerfile)
- [ ] [XS] Add `.python-version` to backend/ with content `3.11`
- [ ] [S] Add `scripts/db-dump.sh` — dumps staging Postgres using pg_dump to `./dumps/<timestamp>.sql`
- [ ] [S] Add `scripts/db-load.sh` — loads a given .sql file into local Supabase via pg_restore
- [ ] [S] Add `scripts/env-generate.sh` — reads ports and keys from running containers,
         writes backend/.env and frontend/.env
- [ ] [XS] Add `.env.example` to backend/ — list all 56 vars, mark Clerk/Stripe/OpenAI
         as "# obtain from tech lead"

## 🟡 Pipeline & Delivery

- [ ] [S] Tag Docker images with `${{ github.sha }}` in deploy-staging.yml
- [ ] [S] Add `versions.json` tracking artifact version + migration checkpoint per environment
- [ ] [M] Update deploy-prod.yml to promote staging image instead of rebuilding
  - [ ] [XS] Read image tag from versions.json staging entry
  - [ ] [XS] Pull that image from ECR
  - [ ] [XS] Deploy to production App Runner
  - [ ] [XS] Update versions.json production entry on success

...

Every item names an actual file, script, or config. M and L tasks are broken into XS subtasks. Nothing says "improve" or "consider."
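The "promote staging image instead of rebuilding" item above can be sketched in a few lines. This assumes a hypothetical `versions.json` shaped like `{"staging": {"image": ..., "migration": ...}, "production": {...}}` — the actual schema is whatever your audit proposes:

```python
import json
from datetime import datetime, timezone

def promote_staging_to_production(versions_path="versions.json"):
    """Promote the staging image record to production instead of rebuilding.

    Assumes a hypothetical versions.json of the form:
    {"staging": {"image": "...", "migration": "..."}, "production": {...}}
    """
    with open(versions_path) as f:
        versions = json.load(f)

    staging = versions["staging"]
    # In a real pipeline you would pull staging["image"] from ECR and
    # deploy it to production App Runner here; this sketch only updates
    # the version record on success.
    versions["production"] = {
        "image": staging["image"],
        "migration": staging["migration"],
        "deployed_at": datetime.now(timezone.utc).isoformat(),
    }
    with open(versions_path, "w") as f:
        json.dump(versions, f, indent=2)
    return versions["production"]
```

The point of the sketch is the invariant: production never gets an image that staging has not already run.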


Install

# Clone the repo
git clone https://github.com/sanhajio/devopsiphai

# Copy the skill into Claude Code's skills directory
cp -r devopsiphai/skills/devopsiphai ~/.claude/skills/

Restart Claude Code. The skill is available automatically.

Requirements

  • Claude Code with subagent support
  • Projects accessible from your local filesystem or a mounted path

Usage

Point Claude Code at your project and trigger the skill naturally:

audit the devops side of this project
devopsiphai my project at ~/code/myapp
give me an ARC score for this repo
run just the preliminary audit on this project
generate a TODO for this project

Entrypoints

| Intent | What runs |
|---|---|
| Full audit | Phases 1 → 2 → 3 → 4 → 5 → 6 |
| Exploration only | Phase 1 (preliminary audit) |
| Specific section | "just check secrets", "check my CI", etc. |
| ARC score only | Phase 3 (requires prior audit output) |
| TODO only | Phase 6 (requires Phases 3–5 output) |
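The routing above amounts to a small intent-to-phases table. A sketch of it (the intent names here are illustrative; the real routing lives in SKILL.md):

```python
# Illustrative sketch of the intent -> phase routing. The real router
# lives in SKILL.md; the intent names below are assumptions.
ROUTES = {
    "full_audit": [1, 2, 3, 4, 5, 6],
    "exploration_only": [1],
    "arc_score_only": [3],  # requires prior audit output
    "todo_only": [6],       # requires Phases 3-5 output
}

def phases_for(intent: str) -> list[int]:
    """Return the phase sequence for a recognized intent."""
    try:
        return ROUTES[intent]
    except KeyError:
        raise ValueError(f"unknown intent: {intent!r}")
```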

Configuration

skills/devopsiphai/config.yaml:

# Output directory for timestamped reports
output:
  directory: /tmp/devopsiphai

# Sections to skip in the preliminary audit (1.1–1.17)
skip_sections: []
  # - "1.14"  # AI Usage
  # - "1.16"  # Questionable Architecture Decisions

# Domains to skip in the domain audit
skip_domains: []
  # - iac
  # - containers

# ARC pillar weights (must sum to 1.0 per pillar)
arc_weights:
  automation:
    cicd: 0.40
    testing: 0.30
    containers: 0.15
    code_quality: 0.15
  reporting:
    observability: 0.50
    logging_tracing: 0.30
    user_audit_trail: 0.20
  control:
    security: 0.30
    iac: 0.25
    backup_rollback: 0.20
    git_auditability: 0.25

Design principles

Phase 1 is facts only. No suggestions, no judgement, no icons. The audit maps what exists before it evaluates anything. This prevents the common failure mode of AI audits: confidently critiquing things that don't exist or misreading what does.

Every check derives from what was found. Phase 4 and 5 checks are generated from Phase 1 findings. If the project uses Python, it checks for .python-version. If it uses MongoDB, it checks for mongodump. Nothing is assumed.

The TODO is derived, not invented. Phase 6 reads FAIL/PARTIAL checks from Phases 4–5 and CRITICAL/HIGH/MEDIUM findings from Phase 3. Every item in the TODO names an actual file path, tool, or config key from the audit. No generic recommendations.

Parallel subagents per section. Each of the 17 Phase 1 sections runs as an independent subagent. On a real codebase this makes Phase 1 fast enough to be useful in practice.

Local-first. The skill flags any service that could run locally but is hitting external SaaS, and suggests self-hostable alternatives. Anything that can run in Docker should run locally for development.


Skill structure

devopsiphai/
├── SKILL.md                  ← router, ARC definition, phase sequencing
├── preliminary-audit.md      ← Phase 1: 17 factual sections
├── domain-audit.md           ← Phase 2: routes to reference files
├── arc-scoring.md            ← Phase 3: ARC grades + findings
├── workflow.md               ← Phase 4: developer workflow checks
├── delivery.md               ← Phase 5: pipeline + rollback checks
├── todogen.md                ← Phase 6: TODO generation
├── post-run.md               ← report dump + feedback prompt
├── config.yaml               ← weights, skip lists, output directory
├── analysis/
│   ├── onboarding.md         ← onboarding domain audit
│   ├── observability.md      ← observability domain audit
│   └── security.md           ← security domain audit
└── references/
    ├── cicd.md               ← CI/CD domain audit
    ├── containers.md         ← containers domain audit
    └── iac.md                ← IaC domain audit

Contributing

If you use this, run it on a project, have opinions about the ARC framework, or just want to share what you found — please reach out. Feedback, disagreements, and improvements are all welcome and would genuinely make my day.

The skill improves through use. After each audit run, Claude Code will ask if you want to log feedback. That feedback is saved to /tmp/devopsiphai/<timestamp>/feedback.md.

To contribute improvements:

  1. Run the skill on a real project
  2. Note what was wrong, missing, or unhelpful
  3. Open a PR with the change to the relevant phase file

The most valuable contributions are: checks that should exist but don't, output formats that are hard to read, and TODO items that weren't granular enough to act on.


License

MIT
