Run an agent-driven code review in GitLab CI, parse inline comments, post deduplicated merge request discussions, and report per-run token usage and cost.
- Node.js >=24
- `git` available in the runtime
- A pipeline running in a merge request context (`CI_PROJECT_ID`, `CI_MERGE_REQUEST_IID`)
Run without installing:

```bash
npx @ikko-dev/gitlab-review
```

Or install in your project:

```bash
npm i -D @ikko-dev/gitlab-review
npx gitlab-review --help
```

This package exposes the `gitlab-review` binary through:

- `bin/gitlab-review.js` (runtime shim)
- `dist/cli.js` (compiled CLI)

```bash
gitlab-review [options]
```

Common local dry-run:
```bash
gitlab-review \
  --project 123 \
  --mr 42 \
  --gitlab-url https://gitlab.example.com \
  --gitlab-token "$GITLAB_TOKEN" \
  --api-key "$GITLAB_REVIEW_API_KEY" \
  --dry-run
```

Example CI job:

```yaml
review:
  image: node:24
  stage: post
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  variables:
    GIT_DEPTH: '0'
  script:
    - npx @ikko-dev/gitlab-review
  artifacts:
    when: always
    paths:
      - gitlab-review.md
      - review-comments.json
      - review-usage.json
```

The CLI auto-resolves values from CI variables and common token/key names.
| Variable | Purpose |
|---|---|
| `CI_PROJECT_ID` | Default for `--project` |
| `CI_MERGE_REQUEST_IID` | Default for `--mr` |
| `CI_SERVER_URL` | Default for `--gitlab-url` |
| `CI_SERVER_HOST` | Fallback for `--gitlab-url` as `https://$CI_SERVER_HOST` |
| `GITLAB_TOKEN` | Preferred GitLab API token (`PRIVATE-TOKEN`) |
| `GLAB_CLI_TOKEN` | Fallback GitLab API token (`PRIVATE-TOKEN`) |
| `CI_JOB_TOKEN` | Fallback token (`JOB-TOKEN`) |
| `GITLAB_PRIVATE_TOKEN` | Fallback token (`PRIVATE-TOKEN`) |
| `GITLAB_REVIEW_API_KEY` | Preferred AI API key |
| `ANTHROPIC_API_KEY` | Fallback AI API key |
| `CLAUDE_API_KEY` | Fallback AI API key |
| `GITLAB_REVIEW_MODEL` | Default for `--model` |
| `GITLAB_REVIEW_MIN_SEVERITY` | Default for `--min-severity` |
| `GITLAB_REVIEW_THINKING_LEVEL` | Default for `--thinking` |
| `GITLAB_REVIEW_POSTING_MODE` | Default for `--posting-mode` |
| `GITLAB_REVIEW_POST_SUMMARY` | Set to `false`/`0` to skip the MR-level summary note |
| `GITLAB_REVIEW_FORCE_REVIEW` | Set to `true`/`1` to review even if the commit was already reviewed |
| `GITLAB_REVIEW_SKILLS` | Comma-separated list of built-in skill names to enable (e.g. `code-review`) |
| `GITLAB_REVIEW_OTEL` | Set to `1` to enable the OpenTelemetry bridge (generic OTLP spans + metrics) |
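The fallback chain in the table can be pictured as a simple first-match lookup. The sketch below is illustrative only: `resolveGitlabAuth` and `resolveGitlabUrl` are hypothetical helpers, not functions exported by the package, and the lookup order is assumed to follow the order in which the table lists the variables.

```javascript
// Sketch only: these helpers are hypothetical and not part of gitlab-review.
// Assumption: variables are tried in the order the table lists them.
const TOKEN_SOURCES = [
  { name: 'GITLAB_TOKEN', header: 'PRIVATE-TOKEN' },
  { name: 'GLAB_CLI_TOKEN', header: 'PRIVATE-TOKEN' },
  { name: 'CI_JOB_TOKEN', header: 'JOB-TOKEN' },
  { name: 'GITLAB_PRIVATE_TOKEN', header: 'PRIVATE-TOKEN' },
];

// Return the first non-empty token together with the HTTP header it is sent in.
function resolveGitlabAuth(env) {
  for (const { name, header } of TOKEN_SOURCES) {
    const value = env[name];
    if (value) return { source: name, header, token: value };
  }
  return null;
}

// CI_SERVER_URL wins; CI_SERVER_HOST is turned into an https:// URL as a fallback.
function resolveGitlabUrl(env) {
  if (env.CI_SERVER_URL) return env.CI_SERVER_URL;
  if (env.CI_SERVER_HOST) return `https://${env.CI_SERVER_HOST}`;
  return null;
}
```

Calling `resolveGitlabAuth(process.env)` inside a CI job with only `CI_JOB_TOKEN` set would, under these assumptions, select the `JOB-TOKEN` header.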
| Flag | Description | Default |
|---|---|---|
| `--project <id>` | GitLab project ID/path | `CI_PROJECT_ID` |
| `--mr <iid>` | Merge request IID | `CI_MERGE_REQUEST_IID` |
| `--gitlab-url <url>` | GitLab URL | `CI_SERVER_URL` or `https://${CI_SERVER_HOST}` |
| `--gitlab-token <token>` | GitLab token | `GITLAB_TOKEN`, `GLAB_CLI_TOKEN`, `CI_JOB_TOKEN`, `GITLAB_PRIVATE_TOKEN` |
| `--api-key <key>` | API key passed to the review agent | `GITLAB_REVIEW_API_KEY`, `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY` |
| `--model <provider/id>` | Model passed to the review agent | `GITLAB_REVIEW_MODEL` or `anthropic/claude-sonnet-4-5` |
| `--min-severity <level>` | `info`, `warn`, `critical` | `GITLAB_REVIEW_MIN_SEVERITY` or `info` |
| `--thinking <level>` | `off`, `minimal`, `low`, `medium`, `high`, `xhigh` | `GITLAB_REVIEW_THINKING_LEVEL` or `off` |
| `--posting-mode <mode>` | `direct` or `draft` (atomic bulk publish) | `GITLAB_REVIEW_POSTING_MODE` or `direct` |
| `--no-summary` | Skip posting/updating the MR-level summary note | summary posting is on by default |
| `--force-review` | Review even if the current commit was already reviewed | `GITLAB_REVIEW_FORCE_REVIEW` or `false` |
| `--review-file <path>` | Raw `gitlab-review` output file | `gitlab-review.md` |
| `--output <path>` | Generated payload artifact file | `review-comments.json` |
| `--cwd <path>` | Working directory | `process.cwd()` |
| `--skill <name>` | Enable a built-in skill by name (repeatable) | `GITLAB_REVIEW_SKILLS` or none |
| `--dry-run` | Generate artifacts and skip posting | `false` |
| `--no-post` | Same behavior as `--dry-run` | `false` |
| `--help`, `-h` | Show help | - |
| `--version`, `-v` | Show version | - |
--thinking controls extended thinking on the underlying agent. Thinking tokens are billed at the model's output token rate, so higher levels cost more — the Review usage: line and review-usage.json reflect that cost.
--posting-mode draft creates GitLab draft notes for every fresh comment and publishes them atomically via POST /draft_notes/bulk_publish. The reviewer either appears fully on the MR or not at all, instead of leaking partial state if the job is interrupted. If a draft creation fails mid-flight, the run sweeps the partial drafts before reporting the failure; if the job is killed before that, the next run's orphan cleanup picks them up. Requires a GitLab version that exposes the draft_notes and bulk_publish endpoints (≥ 15.10) and a token whose user can own draft notes — keep direct for older self-hosted instances or restricted tokens. bulk_publish publishes all of the current user's drafts on the MR, so use a dedicated bot account if multiple processes may share the token.
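The create, sweep, and publish sequence described above might look roughly like the following. This is a sketch under stated assumptions, not the package's implementation: `postAsDrafts` and the `api` client are hypothetical, and each `api` method is assumed to wrap one GitLab draft notes REST call on the merge request.

```javascript
// Sketch only: `api` is a hypothetical client object, not part of gitlab-review.
// Each method is assumed to wrap one GitLab REST call on the merge request:
//   createDraft(comment) -> POST   .../draft_notes              (resolves to { id })
//   deleteDraft(id)      -> DELETE .../draft_notes/:id
//   bulkPublish()        -> POST   .../draft_notes/bulk_publish
async function postAsDrafts(api, comments) {
  const created = [];
  try {
    for (const comment of comments) {
      const draft = await api.createDraft(comment);
      created.push(draft.id);
    }
  } catch (error) {
    // Sweep the partial drafts so nothing is left half-posted on the MR.
    for (const id of created) {
      try {
        await api.deleteDraft(id);
      } catch {
        // Best effort: a later run's orphan cleanup would pick these up.
      }
    }
    throw error;
  }
  // Publish only after every draft exists, so reviewers see all comments or none.
  if (created.length > 0) await api.bulkPublish();
  return { draftsCreated: created.length, draftsPublished: created.length };
}
```

Because `bulkPublish()` acts on every draft the token's user owns on the MR, the sketch has the same constraint as the real flow: it is only safe when the account is not shared with other processes.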
Skills are domain-specific review modules that sharpen the agent's focus on a particular class of bug or pattern. Each skill injects a focused instruction block and optional reference files into the system prompt.
| Name | What it does |
|---|---|
| `code-review` | Adversarial correctness review: finds real, demonstrable bugs only. Reports nothing without a concrete proof path (specific input → failure → observable symptom). Includes per-language reference files for JavaScript/TypeScript and PHP/Laravel. |
Enable a built-in skill with --skill:
```bash
gitlab-review --skill code-review
```

Or set it permanently via the environment variable:

```yaml
variables:
  GITLAB_REVIEW_SKILLS: code-review
```

Multiple skills can be specified by repeating `--skill` or comma-separating values in `GITLAB_REVIEW_SKILLS`:

```bash
gitlab-review --skill code-review --skill my-custom-skill
```

Drop a skill directory anywhere between the git root and `cwd`. The reviewer walks up the tree and loads every skill it finds:

```text
.agents/skills/<name>/SKILL.md   # preferred location
.claude/skills/<name>/SKILL.md   # alternative location
```
SKILL.md follows the agentskills.io format — a YAML frontmatter block followed by the skill body:
```markdown
---
name: my-skill
description: One-line description shown in the summary footer.
---
Your skill instructions here. The reviewer reads these as part of its system prompt.
```

A `references/` subdirectory alongside `SKILL.md` is optional. Any files placed there are made available to the reviewer by path, and the agent can read them on demand using its file-reading tool.
Project skills take precedence over built-in skills with the same name. A skill closer to cwd overrides one closer to the git root.
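One way to picture these precedence rules is a name-keyed map that later layers overwrite. The sketch below is hypothetical: `resolveSkills`, the `origin` field, and the layer objects are illustrative names, and the real discovery code may differ.

```javascript
// Sketch only: resolveSkills and its input shapes are hypothetical.
// `builtIn` is a list of built-in skills; `projectLayers` is ordered from the
// git root down to cwd, each layer holding the skills found in one directory.
function resolveSkills(builtIn, projectLayers) {
  const byName = new Map();
  for (const skill of builtIn) {
    byName.set(skill.name, { ...skill, origin: 'built-in' });
  }
  // Later layers are closer to cwd, so writing them last makes them win.
  for (const layer of projectLayers) {
    for (const skill of layer.skills) {
      byName.set(skill.name, { ...skill, origin: layer.dir });
    }
  }
  return [...byName.values()];
}
```

With a built-in `code-review`, a root-level project `code-review`, and a `my-skill` defined both at the root and in a nested package, the result under these assumptions keeps the root-level `code-review` and the nested `my-skill`.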
When skills are active, their names appear in the MR summary note footer:
```text
Skills: `code-review`
```

Each run writes the following artifacts:

- `gitlab-review.md`: raw review text returned by the agent
- `review-comments.json`: generated comment objects including:
  - parsed comment payload
  - computed fingerprints
  - duplicate status
  - final GitLab discussion payload
- `review-usage.json`: token and cost breakdown for the run (`tokens.{input,output,cacheRead,cacheWrite,total}`, `cost.{input,output,cacheRead,cacheWrite,total}`, `model`)
The CLI also prints a one-line summary at the end of the run:
Review usage: 12,345 in / 678 out tokens — $0.0421 (anthropic/claude-sonnet-4-5)
Use these files for CI debugging and auditing.
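For auditing scripts, the one-line summary can be rebuilt from the fields in `review-usage.json`. The following is a minimal sketch, assuming the line uses `tokens.input`, `tokens.output`, and `cost.total`; `formatUsageLine` is a hypothetical helper, and how cache reads and writes factor into the printed numbers is not specified here.

```javascript
// Sketch only: formatUsageLine is hypothetical, not an export of gitlab-review.
// Assumption: the printed line uses tokens.input, tokens.output and cost.total.
function withThousands(n) {
  // Insert a comma between each group of three digits: 12345 -> "12,345".
  return String(n).replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

function formatUsageLine(usage) {
  const input = withThousands(usage.tokens.input);
  const output = withThousands(usage.tokens.output);
  const cost = usage.cost.total.toFixed(4);
  return `Review usage: ${input} in / ${output} out tokens — $${cost} (${usage.model})`;
}
```

In a CI script, the `usage` argument would typically come from `JSON.parse` applied to the contents of `review-usage.json`.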
gitlab-review publishes opt-in Node.js diagnostics_channel tracing events with no external telemetry dependency. Subscribers can listen before calling run() or from a Node preload/import hook before running the CLI.
Base tracing channel names:
- `@ikko-dev/gitlab-review:run`
- `@ikko-dev/gitlab-review:gitlab.get_merge_request`
- `@ikko-dev/gitlab-review:gitlab.get_latest_version`
- `@ikko-dev/gitlab-review:git.prepare_history`
- `@ikko-dev/gitlab-review:git.get_merge_diff`
- `@ikko-dev/gitlab-review:reviewer.run`
- `@ikko-dev/gitlab-review:review.parse`
- `@ikko-dev/gitlab-review:gitlab.get_discussions`
- `@ikko-dev/gitlab-review:comments.build`
- `@ikko-dev/gitlab-review:artifact.write_output`
- `@ikko-dev/gitlab-review:gitlab.post_comments`
- `@ikko-dev/gitlab-review:gitlab.upsert_summary`
Node emits tracing subchannels as tracing:<base>:start, :end, :asyncStart, :asyncEnd, and :error. Payloads include safe run metadata (runId, phase, project, MR, GitLab URL, model, severity, timings, comment counts, and sanitized errorInfo) and intentionally exclude tokens/API keys.
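When subscribing by raw channel name rather than through the package's exported channel objects, the full subchannel names follow Node's `tracing:<base>:<event>` pattern. The helper below is a small sketch of that naming; `tracingChannelNames` is a hypothetical function, not part of the package.

```javascript
// Sketch only: tracingChannelNames is hypothetical, not an export of gitlab-review.
const PACKAGE_PREFIX = '@ikko-dev/gitlab-review';
const TRACING_EVENTS = ['start', 'end', 'asyncStart', 'asyncEnd', 'error'];

// Build the five subchannel names Node derives for one tracing channel,
// e.g. phase "reviewer.run" -> "tracing:@ikko-dev/gitlab-review:reviewer.run:asyncEnd".
function tracingChannelNames(phase) {
  const base = `${PACKAGE_PREFIX}:${phase}`;
  return Object.fromEntries(
    TRACING_EVENTS.map((event) => [event, `tracing:${base}:${event}`]),
  );
}
```

Each resulting name can be passed to `subscribe(name, onMessage)` from `node:diagnostics_channel`, for example from a preload script that runs before the CLI.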
When --posting-mode draft is used, the gitlab.post_comments payload also exposes draftsAbandoned, draftsCreated, draftsDeletedPrePublish, and draftsPublished counters describing the draft lifecycle within the run.
The reviewer.run payload exposes a usage field ({ model, tokens, cost }) once the agent has returned. The same usage is forwarded onto the top-level run payload so a subscriber on run:asyncEnd sees the final token and cost totals for the review.
```javascript
import { diagnosticChannels, run } from '@ikko-dev/gitlab-review';

// Handlers receive the run context published on each tracing subchannel.
const onStart = (ctx) => console.log('review started', ctx.runId);
const onEnd = (ctx) => console.log('review completed', ctx.durationMs, ctx.generated);
const onError = (ctx) => console.error('review failed', ctx.errorInfo);

// Subscribe before calling run() so no events are missed.
diagnosticChannels.run.start.subscribe(onStart);
diagnosticChannels.run.asyncEnd.subscribe(onEnd);
diagnosticChannels.run.error.subscribe(onError);

// `config` is your run configuration, built elsewhere.
await run(config);
```

`GITLAB_REVIEW_OTEL=1` enables a bridge that subscribes to the diagnostics channels and emits OTLP spans, GenAI client metrics, and structured log records. The OTel runtime is bundled, so no extra installs are required.
Exporter selection follows the standard OTEL_* env vars (OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, OTEL_EXPORTER_OTLP_PROTOCOL, …). Anything that ingests OTLP works: Tempo, Mimir, Loki, Jaeger, Datadog, Honeycomb, SigNoz, and so on.
The full trace hierarchy in Tempo is:
```text
invoke_workflow gitlab-review
└── invoke_agent gitlab-review
    ├── gen_ai.agent.turn (turn 1)
    │   ├── execute_tool Read
    │   └── execute_tool Grep
    ├── gen_ai.agent.turn (turn 2)
    │   └── execute_tool Read
    └── gen_ai.agent.turn (turn N)
```

- `invoke_workflow gitlab-review`: root span per run, carrying `gitlab.project_id`, `gitlab.mr_iid`, comment counters, and `gen_ai.*` totals.
- `invoke_agent gitlab-review`: wraps the full agent call. Tagged with `gen_ai.provider.name`, `gen_ai.request.model`, `gen_ai.response.model`, `gen_ai.operation.name=invoke_agent`, and aggregate token and cost attributes.
- `gen_ai.agent.turn`: one child span per agent turn with per-turn token counts, cost, model, and stop reason.
- `execute_tool <name>`: one grandchild span per tool call (`gen_ai.tool.name`, `gen_ai.tool.call.id`). Error status is set on failed calls.
- `gitlab-review.<phase>`: one span per remaining phase (`gitlab.get_merge_request`, `git.get_merge_diff`, `gitlab.post_comments`, …) for latency and error rates.
The bridge emits two sets of metrics.
GenAI client metrics follow the OpenTelemetry GenAI semantic conventions (gen_ai.*) and are emitted per LLM call:
| Metric | Unit | Purpose |
|---|---|---|
| `gen_ai.client.operation.duration` | `s` | Overall agent call duration |
| `gen_ai.client.token.usage` | `{token}` | Token counts per turn by type |
| `gen_ai.client.cost` | `usd` | Cost per turn |
| `gen_ai.client.time_to_first_token` | `s` | TTFT per turn (recorded on first streaming event) |
Review-level metrics are emitted once per complete run (success or failure):
| Metric | Type | Labels |
|---|---|---|
| `gitlab_review_run_duration_seconds` | Histogram | `gitlab.project_path`, `gitlab.pipeline_source`, `gitlab_review.dry_run`, `gitlab_review.status` |
| `gitlab_review_total_cost_usd` | Histogram | `gitlab.project_path`, `gitlab_review.dry_run`, `gitlab_review.status` |
| `gitlab_review_comments_total` | Counter | `gitlab.project_path`, `gitlab_review.dry_run` |
| `gitlab_review_drafts_published_total` | Counter | `gitlab.project_path`, `gitlab_review.dry_run` |
| `gitlab_review_phase_duration_seconds` | Histogram | `gitlab.project_path`, `gitlab_review.phase`, `gitlab_review.status` |
gitlab_review.status is success, error, or timeout (AbortError / ETIMEDOUT). gitlab.project_path is populated from CI_PROJECT_PATH when running inside a GitLab CI pipeline.
Grafana Application Observability auto-discovers the service from its gen_ai.* metrics without any dashboard import. The gitlab_review_* metrics enable project-level Mimir queries such as sum by (gitlab_project_path) (increase(gitlab_review_total_cost_usd_sum[7d])) to track spend per repository.
After each review the bridge emits one OTel log record per generated comment (event.name: gitlab_review.comment) and a review completion record (event.name: gitlab_review.completed). Each comment record carries path, line, severity, duplicate flag, and the comment body. The completion record carries total cost, token counts, model, project/MR IDs, and comment/duplicate counts.
Log records land in Loki (or whichever OTLP log backend you target) and can be correlated back to traces via run.id.
For all three signals to reach their respective backends, the service account token used in OTEL_EXPORTER_OTLP_HEADERS must carry:
- `Traces Publisher`: writes to Tempo
- `Metrics Publisher`: writes to Mimir
- `Logs Publisher`: writes to Loki
A token missing any of these scopes will get a silent 401 Unauthorized: invalid scope requested from the OTLP gateway. Set OTEL_LOG_LEVEL=error to surface export failures.
When `GITLAB_REVIEW_OTEL` is not set, the bridge is a no-op and `@opentelemetry/*` is never imported (it is loaded dynamically behind the env check, so leaving the flag unset adds no startup cost).
Library callers with pre-existing TracerProvider/MeterProvider/LoggerProvider can share them by injecting a runtime instead of letting the bridge boot its own NodeSDK:
```javascript
import { metrics, trace } from '@opentelemetry/api';
import { logs } from '@opentelemetry/api-logs';
import { startOtelBridge } from '@ikko-dev/gitlab-review';

await startOtelBridge({
  runtime: {
    tracerProvider: trace.getTracerProvider(),
    meterProvider: metrics.getMeterProvider(),
    loggerProvider: logs.getLoggerProvider(),
    shutdown: async () => {},
  },
});
```

In addition to inline discussions, the reviewer returns an overall summary (Markdown). The CLI posts it as a non-positional MR note, the same shape a human reviewer creates when typing in the MR comment box. The note carries a hidden marker:
```html
<!-- gitlab-review:summary -->
```

On subsequent runs the CLI finds the existing note by that marker and updates it in place via `PUT /merge_requests/:iid/notes/:id`, so the summary always reflects the latest review without piling up duplicates. The latest summary stays at the top of the note. When a note is updated, the previous latest summary is moved into a collapsed `<details>` section labeled `Previous review runs` instead of being erased; existing history is retained with a bounded limit of 10 previous runs.
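The bounded history can be modelled as a rotation: the current summary moves to the front of the previous-runs list, and the list is cut to 10 entries. The sketch below is an assumption-laden illustration. `rotateSummaries` and `renderSummaryNote` are hypothetical names, and the exact Markdown layout of the real note is not documented here.

```javascript
// Sketch only: both functions are hypothetical and the note layout is assumed.
const SUMMARY_MARKER = '<!-- gitlab-review:summary -->';
const MAX_PREVIOUS_RUNS = 10;

// `previousRuns` holds earlier summaries, newest first.
function rotateSummaries(latest, previousRuns, newSummary) {
  const history = latest == null ? previousRuns : [latest, ...previousRuns];
  return { latest: newSummary, previousRuns: history.slice(0, MAX_PREVIOUS_RUNS) };
}

// Render the note body: marker, latest summary, then the collapsed history.
function renderSummaryNote({ latest, previousRuns }) {
  const parts = [SUMMARY_MARKER, latest];
  if (previousRuns.length > 0) {
    parts.push(
      '<details>',
      '<summary>Previous review runs</summary>',
      '',
      ...previousRuns,
      '</details>',
    );
  }
  return parts.join('\n');
}
```

Keeping the rotation separate from the rendering means the 10-run bound can be checked without parsing Markdown.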
The summary is upserted before inline comments are posted so it appears at the top of the MR activity feed. It appends footer metadata after a horizontal rule so reviewers can see the run cost and reviewed commit at a glance:
```markdown
---
Review usage: 12,345 in / 678 out tokens — $0.0421 (anthropic/claude-sonnet-4-5)
Skills: `code-review`
Reviewed by [@ikko-dev/gitlab-review](https://github.com/ikko-dev/gitlab-review) for commit <sha>.
```

The `Skills:` line is only present when one or more skills were active for the run.
If a later CI job sees that the current MR head commit already appears in that footer, it skips the agent run to avoid producing a different review for the same diff. Use --force-review or GITLAB_REVIEW_FORCE_REVIEW=true to bypass the guard. The summary upsert runs in both direct and draft posting modes (it always uses the regular notes endpoints — the atomic bulk-publish flow is reserved for inline comments).
Disable with --no-summary or GITLAB_REVIEW_POST_SUMMARY=false. With --dry-run/--no-post, the summary is parsed but not posted, and the reviewed-commit skip guard is not applied.
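The reviewed-commit guard amounts to reading the commit from the summary footer and comparing it with the MR head. Below is a minimal sketch, assuming the footer carries the full commit SHA and that the first footer in the note belongs to the latest summary; `alreadyReviewed` and `shouldRunReview` are hypothetical helpers.

```javascript
// Sketch only: both helpers are hypothetical; the footer is assumed to carry
// the full commit SHA and the latest summary is assumed to come first.
function alreadyReviewed(summaryBody, headSha) {
  if (!summaryBody || !headSha) return false;
  const footer = summaryBody.match(/Reviewed by .* for commit ([0-9a-f]{7,40})\./);
  return footer !== null && footer[1] === headSha;
}

// --force-review and --dry-run both bypass the guard, as described above.
function shouldRunReview({ summaryBody, headSha, forceReview, dryRun }) {
  if (forceReview || dryRun) return true;
  return !alreadyReviewed(summaryBody, headSha);
}
```

A job that finds no existing summary note would pass an empty `summaryBody`, in which case the sketch always runs the review.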
Each generated comment body includes hidden markers:
```html
<!-- gitlab-review:fingerprint-primary:<hash> -->
<!-- gitlab-review:fingerprint-secondary:<hash> -->
```

Before posting, the CLI fetches existing MR discussions and skips comments where either fingerprint is already present. This prevents reposting across reruns and also prevents duplicates generated in the same run.
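The skip logic can be sketched as a set lookup over fingerprints scraped from existing discussion bodies. The code below is illustrative only: `extractFingerprints` and `markDuplicates` are hypothetical, the `primary` and `secondary` field names are assumptions, and how the hashes themselves are computed is left out.

```javascript
// Sketch only: function and field names are hypothetical; hash computation is omitted.
const FINGERPRINT_RE = /<!-- gitlab-review:fingerprint-(?:primary|secondary):([^\s>]+) -->/g;

// Collect every fingerprint hash embedded in one note body.
function extractFingerprints(body) {
  return [...body.matchAll(FINGERPRINT_RE)].map((match) => match[1]);
}

// Flag a comment as duplicate when either fingerprint was already seen,
// whether in existing MR discussions or earlier in the same run.
function markDuplicates(existingBodies, comments) {
  const seen = new Set(existingBodies.flatMap(extractFingerprints));
  return comments.map((comment) => {
    const prints = [comment.primary, comment.secondary];
    const duplicate = prints.some((print) => seen.has(print));
    if (!duplicate) prints.forEach((print) => seen.add(print));
    return { ...comment, duplicate };
  });
}
```

Adding fresh fingerprints to the same set while iterating is what makes the sketch catch duplicates produced within a single run.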
- `Node.js >=24 is required`
  - Use `node:24` (or newer) in CI.
- `Missing required configuration`
  - Provide required flags or ensure CI vars are available (`CI_PROJECT_ID`, `CI_MERGE_REQUEST_IID`, token, API key).
- `--min-severity must be one of: info, warn, critical`
  - Fix `--min-severity` or `GITLAB_REVIEW_MIN_SEVERITY`.
- Git history errors / merge-base failures
  - Set `GIT_DEPTH: 0`.
  - Ensure source and target branches are fetchable from `origin`.
- GitLab API 401/403 when posting
  - Ensure token has rights to read MR metadata/discussions and create MR discussions.
  - If using `CI_JOB_TOKEN`, ensure your GitLab project settings allow required API access.
- No comments posted
  - Check `review-comments.json` for `duplicate: true` or empty parsed comments.
  - Run with `--dry-run` and inspect `gitlab-review.md` formatting (`== Inline Comments ==`).
```bash
npm run typecheck
npm test
npm run build
npm pack --dry-run
```

Eval tests call the real LLM and require `ANTHROPIC_API_KEY` (or `GITLAB_REVIEW_API_KEY`) in a local `.env` file:

```bash
npm run test:evals
```

Override the model for cheaper/faster eval runs:

```bash
GITLAB_REVIEW_EVAL_MODEL=anthropic/claude-haiku-4-5-20251001 npm run test:evals
```

The review agent runs against pinned `@earendil-works/pi-agent-core`, `@earendil-works/pi-ai`, and `@earendil-works/pi-coding-agent` versions, so published builds keep a deterministic reviewer runtime.
gitlab-review builds on ideas and prior work from several projects:
- pi-reviewer: the original agent-driven code reviewer that `gitlab-review` grew out of. The agent runtime (`@earendil-works/pi-agent-core`), model abstraction (`@earendil-works/pi-ai`), and read-only coding tools (`@earendil-works/pi-coding-agent`) are all pi-reviewer infrastructure.
- Warden by Sentry: the skills architecture (per-skill instruction blocks, reference files loaded on demand by the agent, project-level discovery) takes direct inspiration from Warden's approach to composable, domain-specific review modules.
- agentskills.io: the `SKILL.md` frontmatter format and multi-file skill layout (`references/`, `scripts/`, `assets/`) follow the agentskills.io open standard for portable agent skills.