Releases · vivianjeet/langgraph-pr-audit-agent

08 Jun 11:49

a5e3bb6

v1.0 - multi-agent PR audit on a Gemini model-tier router (Latest)

A LangGraph PR-audit agent: plan-execute with reflexion, four-type memory
with a rule lifecycle, an MCP compliance server, and a single model-tier
router over Gemini with context caching, extended-thinking routing and
fail-closed retries. Every audit reports its cost and serving tier; the
CI gate blocks a merge on an unaddressed critical finding until a human
approves.
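The merge gate described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Finding` record, its field names and the `gate_exit_code` helper are all hypothetical, and the only behaviour taken from the note is that an unaddressed critical finding blocks the merge until a human approves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical audit finding; field names are illustrative only."""
    severity: str            # e.g. "critical", "high", "low"
    addressed: bool = False  # True once the PR author has resolved it


def gate_exit_code(findings: list[Finding], human_approved: bool) -> int:
    """Return a CI exit code: 0 lets the merge proceed, 1 blocks it."""
    blocking = [
        f for f in findings
        if f.severity == "critical" and not f.addressed
    ]
    # Block only while a critical finding is open AND no human has approved.
    if blocking and not human_approved:
        return 1
    return 0
```

A CI step would pass the returned value to `sys.exit`, so the gate state is visible as the job's pass/fail status.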

08 Jun 09:29

vivianjeet

v0.9

9126387

v0.9 - extended-thinking router and full router consolidation

Adds a Gemini extended-thinking path for the hardest regulated reviews (gated by a
deterministic complexity heuristic) and routes every node's model call through one
router, so each gets consistent tier selection, fail-closed handling and a complete
cost trace. Includes a fix so per-call output ceilings apply.
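As an illustration of what a deterministic complexity gate might look like, here is a small sketch. The signals (`regulated`, `changed_lines`, `frameworks_hit`) and both thresholds are assumptions made for the example; the release note does not say which inputs the real heuristic uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewSignals:
    """Hypothetical inputs to the gate; not taken from the project."""
    regulated: bool       # did compliance triage flag the diff as regulated?
    changed_lines: int    # size of the diff
    frameworks_hit: int   # number of regulatory frameworks matched


def needs_extended_thinking(
    signals: ReviewSignals,
    line_threshold: int = 400,      # assumed value
    framework_threshold: int = 2,   # assumed value
) -> bool:
    """Pure function of its inputs: the same signals always route the same way."""
    if not signals.regulated:
        return False
    return (
        signals.changed_lines >= line_threshold
        or signals.frameworks_hit >= framework_threshold
    )
```

Keeping the gate free of model calls and randomness means a routing decision can be reproduced from the trace alone.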

08 Jun 09:27

vivianjeet

v0.8

798cdcc

v0.8 - Langfuse cost tracking, tool-choice benchmark, docs alignment

Rollup over v0.7: Langfuse cost tracking at the router callback (per-tier spend,
fallback events, scores on trace, report CLIs); a fail-closed tier fix (security
stays on Pro on cache miss); a tool-choice benchmark that captures thinking tokens,
with the docs corrected to match; a consolidated command reference; four newly
documented scripts; and removal of batch-mode framing from the prefix-cache docs.
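Per-tier spend and fallback counting at a router callback could be accumulated along these lines. This sketch deliberately does not use the Langfuse SDK; `SpendTracker`, `on_call` and the per-million-token prices passed in are hypothetical names and placeholder values.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SpendTracker:
    """Hypothetical in-process accumulator fed by a router callback."""
    spend_by_tier: dict[str, float] = field(
        default_factory=lambda: defaultdict(float)
    )
    fallback_events: int = 0

    def on_call(
        self,
        tier: str,
        input_tokens: int,
        output_tokens: int,
        price_in: float,    # assumed unit: USD per 1M input tokens
        price_out: float,   # assumed unit: USD per 1M output tokens
        fell_back: bool = False,
    ) -> float:
        """Record one model call and return its cost."""
        cost = (
            input_tokens / 1_000_000 * price_in
            + output_tokens / 1_000_000 * price_out
        )
        self.spend_by_tier[tier] += cost
        if fell_back:
            self.fallback_events += 1
        return cost
```

A report CLI could then read `spend_by_tier` and `fallback_events` at the end of a run, or forward each `cost` to a tracing backend as it is produced.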

07 Jun 18:36

vivianjeet

v0.7

b47880d

v0.7 - context caching + tool-choice

- Model-tier router (UnifiedLLMClient): every node selects by tier on the
  shared retry/rotation spine; fail-closed QuotaExhaustedError preserved.
- Context caching on two axes: per-PR diff cache reused across the Flash
  nodes (~74% input-cost cut, verified live), cross-PR prefix cache on the
  security node; both respect the 2048-token floor.
- Tool-choice benchmark across the four Gemini function-calling modes;
  Instructor retained for forced structured extraction.
- Central src/config.py for all tunables (one-way config -> state).
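Respecting the 2048-token floor amounts to a size check before a cache is created. A minimal sketch, assuming a crude characters-per-token estimate in place of a real tokenizer; `estimate_tokens` and `should_cache` are hypothetical helper names.

```python
MIN_CACHE_TOKENS = 2048  # the floor quoted in the note above


def estimate_tokens(text: str) -> int:
    """Rough estimate at about 4 characters per token.

    This is an assumption for illustration; a real check would ask the
    model's tokenizer or a token-counting endpoint.
    """
    return len(text) // 4


def should_cache(text: str, floor: int = MIN_CACHE_TOKENS) -> bool:
    """Only create a cache entry when the content clears the minimum size."""
    return estimate_tokens(text) >= floor
```

Content below the floor is simply sent inline, so short diffs skip caching without raising an error.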

Tests: 210 passed, 2 deselected.

07 Jun 11:55

vivianjeet

v0.6

fbc55d7

v0.6 - model-tier router, central config, verdict-driven human review

Adds a model-tier router (UnifiedLLMClient) on top of the Gemini retry/
rotation spine, with tiered fallback and per-call cost accounting, plus a
single src/config.py for every tunable value. Human-review verdicts now drive
the report status, rule learning and the CI exit code through one shared path.
Scoring is multiplicative and citation verification is whitespace-safe.
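Tiered fallback with a fail-closed outcome might look roughly like this. The sketch assumes fallback is triggered by a per-tier quota error and that tiers are tried in a fixed order; `call_with_fallback` is a hypothetical helper, and `QuotaExhaustedError` is redefined locally as a stand-in for the project's own class.

```python
from typing import Callable


class QuotaExhaustedError(RuntimeError):
    """Local stand-in for the error type named in these release notes."""


def call_with_fallback(
    tiers: list[str],
    call: Callable[[str], str],
) -> tuple[str, str]:
    """Try each tier in order and return (serving_tier, response).

    Fail-closed: if every tier is exhausted the error propagates, so the
    caller never receives an empty or degraded result by accident.
    """
    last_error: Exception | None = None
    for tier in tiers:
        try:
            return tier, call(tier)
        except QuotaExhaustedError as exc:
            last_error = exc  # remember the failure and try the next tier
    raise QuotaExhaustedError(f"all tiers exhausted: {tiers}") from last_error
```

Returning the serving tier alongside the response is what allows a cost entry to be attributed to the tier that actually answered.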

Built on the v0.2-v0.5 line.

06 Jun 17:05

vivianjeet

v0.5

7bab53f

v0.5 - grounded compliance citations

Verified verbatim regulatory spans per compliance claim (quote -> substring-
verify -> drop hallucinated), on the Gemini spine. Plus the MCP week:
agent-as-client/server over stdio, multi-framework rule packs, a raw-stdio
test client, and the compliance audit trail.
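The quote -> substring-verify -> drop step can be illustrated with a short sketch. It assumes each claim is a dict carrying a `quote` field and that whitespace is collapsed before matching; both the record shape and the `verify_citations` name are hypothetical.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace runs so line wraps do not defeat the match."""
    return re.sub(r"\s+", " ", text).strip()


def verify_citations(claims: list[dict], source_text: str) -> list[dict]:
    """Keep only claims whose quote appears verbatim in the source passage."""
    haystack = _normalize(source_text)
    kept = []
    for claim in claims:
        quote = _normalize(claim.get("quote", ""))
        # An empty quote or one absent from the source is treated as
        # hallucinated and dropped.
        if quote and quote in haystack:
            kept.append(claim)
    return kept
```

Because the check is a plain substring test, a paraphrased or partially invented quote fails it even when the claim itself sounds plausible.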

06 Jun 12:47

vivianjeet

v0.4

f776c69

v0.4 - Compliance grounding over MCP

This release makes the agent ground its findings in real regulations, and
ships the memory, context, and reliability work that backs a production audit.

Highlights

Compliance grounding over MCP

The agent now speaks the Model Context Protocol on both ends. A new compliance
stage triages whether a diff is regulated and pulls the matching regulatory
passages, so a security finding cites the exact clause it breaks - for example,
a SQL-injection diff is grounded in the RBI Cyber Security Framework and
OWASP A03.

- MCP client - the compliance node calls search_compliance_docs over
  stdio (retrieve -> compliance -> plan).
- MCP server - compliance-rag (FastMCP) exposes the same retrieval as a
  reusable tool any MCP client can call, from Claude Desktop to a raw-SDK
  client, with no glue.
- Pluggable framework packs - RBI, HIPAA, PCI-DSS, OWASP, and GDPR ship by
  default. Adding a framework means dropping in a packs/*.yaml file and
  re-running the seeder, with no code change.
- Fails soft - a missing server or an unregulated diff yields empty context
  and a visible trace line, never a crash and never a silent "clean".
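The fail-soft behaviour can be sketched without any MCP dependency. Here `search` stands in for whatever callable performs the search_compliance_docs lookup; the function name, the trace format and the set of caught exceptions are assumptions for the example.

```python
from typing import Callable


def fetch_compliance_context(
    diff_is_regulated: bool,
    search: Callable[[str], list[dict]],
    query: str,
    trace: list[str],
) -> list[dict]:
    """Return regulatory passages, or an empty list plus a trace line."""
    if not diff_is_regulated:
        # Short-circuit: no lookup at all for an unregulated diff.
        trace.append("compliance: diff not regulated, lookup skipped")
        return []
    try:
        return search(query)
    except (ConnectionError, FileNotFoundError, TimeoutError) as exc:
        # A missing or unreachable server degrades to empty context, and
        # the trace records why, so the result is never a silent "clean".
        trace.append(
            f"compliance: server unavailable ({type(exc).__name__}), "
            "context empty"
        )
        return []
```

The trace line is the important part: downstream reporting can distinguish "nothing applies" from "lookup failed" even though both return an empty list.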

Memory, context, and orchestration

- Four-type agent memory (semantic, episodic, procedural, in-context) with a
  typed procedural-rule lifecycle and a governance CLI.
- Priority-ordered context budgeting and an in-graph history-compression node
  for long sessions.
- The three audits run concurrently, with thread-safe API-key rotation under
  the fan-out.
- Pluggable checkpointer: in-memory by default, opt-in durable SQLite.
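Priority-ordered context budgeting can be pictured as a greedy fill. A minimal sketch, assuming each context section carries a priority and a precomputed token cost; the tuple layout and the `fit_to_budget` name are hypothetical.

```python
def fit_to_budget(
    sections: list[tuple[int, str, int]],
    budget: int,
) -> list[str]:
    """Pick section names in priority order until the token budget is spent.

    Each section is (priority, name, token_cost); a lower priority number
    means more important. A section that does not fit is skipped, and
    smaller lower-priority sections may still be included after it.
    """
    chosen: list[str] = []
    remaining = budget
    for _priority, name, cost in sorted(sections, key=lambda s: s[0]):
        if cost <= remaining:
            chosen.append(name)
            remaining -= cost
    return chosen
```

With a 1000-token budget, a 600-token diff at priority 0 and 300 tokens of rules at priority 1 are kept, while 500 tokens of history at priority 2 no longer fit and would be left to a compression step.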

Reliability

- A depleted-credits 429 is classified as a terminal billing error rather
  than a transient rate limit, so the client rotates keys instead of burning
  retries.
- Corpus seeding is batched to stay under the embedding per-minute quota.
- MCP tool output is normalized back to structured records on the client.
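The 429 classification above could be expressed as a small decision function. The marker substrings used to detect a billing failure are assumptions for illustration; the actual error text returned by the API is not quoted in these notes, and `classify` / `next_action` are hypothetical names.

```python
from enum import Enum


class ErrorKind(Enum):
    TRANSIENT_RATE_LIMIT = "transient_rate_limit"
    TERMINAL_BILLING = "terminal_billing"
    OTHER = "other"


# Hypothetical substrings; real provider messages may differ.
BILLING_MARKERS = ("credits", "billing", "prepay")


def classify(status: int, message: str) -> ErrorKind:
    """Split HTTP 429 responses into billing failures and plain rate limits."""
    if status != 429:
        return ErrorKind.OTHER
    lowered = message.lower()
    if any(marker in lowered for marker in BILLING_MARKERS):
        return ErrorKind.TERMINAL_BILLING
    return ErrorKind.TRANSIENT_RATE_LIMIT


def next_action(kind: ErrorKind) -> str:
    """Map the classification to what the retry layer should do next."""
    return {
        ErrorKind.TERMINAL_BILLING: "rotate_key",
        ErrorKind.TRANSIENT_RATE_LIMIT: "retry_with_backoff",
        ErrorKind.OTHER: "raise",
    }[kind]
```

Retrying a key whose credits are depleted cannot succeed, so routing that case straight to key rotation saves the whole backoff schedule.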

Verification

- 163 tests passing (pytest -m "not integration").
- Live end-to-end: a regulated diff produces cross-framework citations; an
  unregulated diff short-circuits with no lookup.

Full changelog: v0.3...v0.4

03 Jun 20:54

vivianjeet

v0.3

6f1ebe0

v0.3 - concurrent audits, rule governance, and design-rationale docs

- Concurrent audit fan-out (async nodes, overlapping Gemini calls) with thread-safe key rotation in the retry layer.
- Procedural-rule governance: full lifecycle (seeded / learned_pending / learned_approved / rejected / retired) plus an offline review CLI with a near-duplicate hint and per-rule PR-verdict provenance.
- Latency benchmark harness and a design-rationale README section.
- CI hardening: hermetic test job, async-correct gate, dependency and PR-approval-lookup fixes.
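The rule lifecycle named above lends itself to an explicit transition table. The five state names come from the release note; which transitions are allowed is an assumption made for this sketch, as is the `transition` helper.

```python
# State names are from the release note; the allowed edges are assumed.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "seeded": {"retired"},
    "learned_pending": {"learned_approved", "rejected"},
    "learned_approved": {"retired"},
    "rejected": set(),
    "retired": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not permitted."""
    if current not in ALLOWED_TRANSITIONS:
        raise ValueError(f"unknown state: {current!r}")
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

A review CLI built on a table like this can refuse, for example, to approve a rule that was already rejected, rather than relying on reviewer discipline.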

