@ikko-dev/gitlab-review

Run an agent-driven code review in GitLab CI, parse inline comments, post deduplicated merge request discussions, and report per-run token usage and cost.

Requirements

  • Node.js >=24
  • git available in the runtime
  • A pipeline running in a merge request context (CI_PROJECT_ID, CI_MERGE_REQUEST_IID)

Install / Run

Run without installing:

npx @ikko-dev/gitlab-review

Or install in your project:

npm i -D @ikko-dev/gitlab-review
npx gitlab-review --help

Binary entrypoint

This package exposes the gitlab-review binary through:

  • bin/gitlab-review.js (runtime shim)
  • dist/cli.js (compiled CLI)

Usage

gitlab-review [options]

Common local dry-run:

gitlab-review \
  --project 123 \
  --mr 42 \
  --gitlab-url https://gitlab.example.com \
  --gitlab-token "$GITLAB_TOKEN" \
  --api-key "$GITLAB_REVIEW_API_KEY" \
  --dry-run

GitLab CI example

review:
  image: node:24
  stage: post
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  variables:
    GIT_DEPTH: '0'
  script:
    - npx @ikko-dev/gitlab-review
  artifacts:
    when: always
    paths:
      - gitlab-review.md
      - review-comments.json
      - review-usage.json

Environment variables

The CLI auto-resolves values from CI variables and common token/key names.

| Variable | Purpose |
| --- | --- |
| CI_PROJECT_ID | Default for --project |
| CI_MERGE_REQUEST_IID | Default for --mr |
| CI_SERVER_URL | Default for --gitlab-url |
| CI_SERVER_HOST | Fallback for --gitlab-url as https://$CI_SERVER_HOST |
| GITLAB_TOKEN | Preferred GitLab API token (PRIVATE-TOKEN) |
| GLAB_CLI_TOKEN | Fallback GitLab API token (PRIVATE-TOKEN) |
| CI_JOB_TOKEN | Fallback token (JOB-TOKEN) |
| GITLAB_PRIVATE_TOKEN | Fallback token (PRIVATE-TOKEN) |
| GITLAB_REVIEW_API_KEY | Preferred AI API key |
| ANTHROPIC_API_KEY | Fallback AI API key |
| CLAUDE_API_KEY | Fallback AI API key |
| GITLAB_REVIEW_MODEL | Default for --model |
| GITLAB_REVIEW_MIN_SEVERITY | Default for --min-severity |
| GITLAB_REVIEW_THINKING_LEVEL | Default for --thinking |
| GITLAB_REVIEW_POSTING_MODE | Default for --posting-mode |
| GITLAB_REVIEW_POST_SUMMARY | Set to false/0 to skip the MR-level summary note |
| GITLAB_REVIEW_FORCE_REVIEW | Set to true/1 to review even if the commit was already reviewed |
| GITLAB_REVIEW_SKILLS | Comma-separated list of built-in skill names to enable (e.g. code-review) |
| GITLAB_REVIEW_OTEL | Set to 1 to enable the OpenTelemetry bridge (generic OTLP spans + metrics) |
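
The fallback chain in the table can be sketched as a small lookup. This is a minimal illustration, not the package's actual code: the function names resolveGitlabToken and resolveGitlabUrl are hypothetical, and it assumes the lookup order is the order in which the variables are listed above.

```javascript
// Hypothetical sketch of the documented fallback order (not the package's real code).
// Each entry pairs a variable with the HTTP header the table associates with it.
const TOKEN_SOURCES = [
  { name: 'GITLAB_TOKEN', header: 'PRIVATE-TOKEN' },
  { name: 'GLAB_CLI_TOKEN', header: 'PRIVATE-TOKEN' },
  { name: 'CI_JOB_TOKEN', header: 'JOB-TOKEN' },
  { name: 'GITLAB_PRIVATE_TOKEN', header: 'PRIVATE-TOKEN' },
];

// Returns the first non-empty token and the header it should be sent with, or null.
function resolveGitlabToken(env) {
  for (const { name, header } of TOKEN_SOURCES) {
    const value = env[name];
    if (value) return { source: name, header, token: value };
  }
  return null;
}

// CI_SERVER_URL wins; otherwise build an https URL from CI_SERVER_HOST.
function resolveGitlabUrl(env) {
  if (env.CI_SERVER_URL) return env.CI_SERVER_URL;
  if (env.CI_SERVER_HOST) return `https://${env.CI_SERVER_HOST}`;
  return null;
}
```

Calling resolveGitlabToken(process.env) in a job that only has CI_JOB_TOKEN set would select the JOB-TOKEN header, which is why project settings for job token access matter in that case.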

Flags

| Flag | Description | Default |
| --- | --- | --- |
| --project <id> | GitLab project ID/path | CI_PROJECT_ID |
| --mr <iid> | Merge request IID | CI_MERGE_REQUEST_IID |
| --gitlab-url <url> | GitLab URL | CI_SERVER_URL or https://${CI_SERVER_HOST} |
| --gitlab-token <token> | GitLab token | GITLAB_TOKEN, GLAB_CLI_TOKEN, CI_JOB_TOKEN, GITLAB_PRIVATE_TOKEN |
| --api-key <key> | API key passed to the review agent | GITLAB_REVIEW_API_KEY, ANTHROPIC_API_KEY, CLAUDE_API_KEY |
| --model <provider/id> | Model passed to the review agent | GITLAB_REVIEW_MODEL or anthropic/claude-sonnet-4-5 |
| --min-severity <level> | info, warn, critical | GITLAB_REVIEW_MIN_SEVERITY or info |
| --thinking <level> | off, minimal, low, medium, high, xhigh | GITLAB_REVIEW_THINKING_LEVEL or off |
| --posting-mode <mode> | direct or draft (atomic bulk publish) | GITLAB_REVIEW_POSTING_MODE or direct |
| --no-summary | Skip posting/updating the MR-level summary note | summary posting is on by default |
| --force-review | Review even if the current commit was already reviewed | GITLAB_REVIEW_FORCE_REVIEW or false |
| --review-file <path> | Raw gitlab-review output file | gitlab-review.md |
| --output <path> | Generated payload artifact file | review-comments.json |
| --cwd <path> | Working directory | process.cwd() |
| --skill <name> | Enable a built-in skill by name (repeatable) | GITLAB_REVIEW_SKILLS or none |
| --dry-run | Generate artifacts and skip posting | false |
| --no-post | Same behavior as --dry-run | false |
| --help, -h | Show help | - |
| --version, -v | Show version | - |
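
As an illustration of how a --min-severity threshold might be applied, the sketch below ranks the three documented levels and keeps comments at or above the minimum. The function name meetsMinSeverity is hypothetical, and the ordering info < warn < critical with an inclusive threshold is an assumption drawn from the order in the table.

```javascript
// Assumed ordering, lowest to highest, taken from the order listed for --min-severity.
const SEVERITY_ORDER = ['info', 'warn', 'critical'];

// Hypothetical helper: true when `severity` is at or above `minSeverity`.
function meetsMinSeverity(severity, minSeverity = 'info') {
  const minRank = SEVERITY_ORDER.indexOf(minSeverity);
  if (minRank === -1) {
    throw new Error('--min-severity must be one of: info, warn, critical');
  }
  // Unknown comment severities rank as -1 and are therefore filtered out.
  return SEVERITY_ORDER.indexOf(severity) >= minRank;
}
```

With --min-severity warn, a comment tagged info would be dropped while warn and critical comments would pass.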

--thinking controls extended thinking on the underlying agent. Thinking tokens are billed at the model's output token rate, so higher levels cost more — the Review usage: line and review-usage.json reflect that cost.

--posting-mode draft creates a GitLab draft note for every fresh comment and publishes them all atomically via POST /draft_notes/bulk_publish. The review appears on the MR either in full or not at all, so an interrupted job cannot leave partial state behind. If a draft creation fails mid-flight, the run sweeps the partial drafts before reporting the failure; if the job is killed before that, the next run's orphan cleanup picks them up.

This mode requires a GitLab version that exposes the draft_notes and bulk_publish endpoints (≥ 15.10) and a token whose user can own draft notes. Keep direct for older self-hosted instances or restricted tokens. Because bulk_publish publishes all of the current user's drafts on the MR, use a dedicated bot account if multiple processes may share the token.
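
The draft lifecycle described above (create every draft, sweep on failure, publish once) can be sketched as follows. This is a simplified illustration under stated assumptions: api is a hypothetical client object whose createDraft, deleteDraft, and bulkPublish methods wrap the GitLab draft notes endpoints, and postAsDrafts is not a function exported by this package.

```javascript
// Hypothetical sketch of the create -> sweep-on-failure -> bulk publish flow.
// `api` is an assumed client with async createDraft(comment), deleteDraft(id), bulkPublish().
async function postAsDrafts(api, comments) {
  const created = [];
  try {
    for (const comment of comments) {
      created.push(await api.createDraft(comment));
    }
  } catch (error) {
    // A creation failed mid-flight: remove the drafts made so far, then report the failure.
    for (const draft of created) {
      try {
        await api.deleteDraft(draft.id);
      } catch {
        // Best effort; a later orphan cleanup would handle leftovers.
      }
    }
    throw error;
  }
  // Publish everything in one call so the review shows up all at once.
  if (created.length > 0) await api.bulkPublish();
  return created.length;
}
```

Publishing only after every draft exists is what gives the all-or-nothing behavior; nothing is visible on the MR until the single publish call succeeds.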

Skills

Skills are domain-specific review modules that sharpen the agent's focus on a particular class of bug or pattern. Each skill injects a focused instruction block and optional reference files into the system prompt.

Built-in skills

| Name | What it does |
| --- | --- |
| code-review | Adversarial correctness review: finds real, demonstrable bugs only. Reports nothing without a concrete proof path (specific input → failure → observable symptom). Includes per-language reference files for JavaScript/TypeScript and PHP/Laravel. |

Enable a built-in skill with --skill:

gitlab-review --skill code-review

Or set it permanently via the environment variable:

variables:
  GITLAB_REVIEW_SKILLS: code-review

Multiple skills can be specified by repeating --skill or comma-separating values in GITLAB_REVIEW_SKILLS:

gitlab-review --skill code-review --skill my-custom-skill

Project skills (auto-discovery)

Drop a skill directory anywhere between the git root and cwd. The reviewer walks up the tree and loads every skill it finds:

.agents/skills/<name>/SKILL.md      # preferred location
.claude/skills/<name>/SKILL.md      # alternative location

SKILL.md follows the agentskills.io format — a YAML frontmatter block followed by the skill body:

---
name: my-skill
description: One-line description shown in the summary footer.
---

Your skill instructions here. The reviewer reads these as part of its system prompt.

A references/ subdirectory alongside SKILL.md is optional. Any files placed there are made available to the reviewer by path — the agent can read them on demand using its file-reading tool.

Project skills take precedence over built-in skills with the same name. A skill closer to cwd overrides one closer to the git root.
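
The discovery and precedence rules above can be illustrated with two small helpers. Both names (skillSearchDirs, resolveSkills) are hypothetical and this is only a sketch of the described behavior: it computes which directories lie between the git root and cwd, and shows how later (closer to cwd) definitions could override earlier ones. Reading SKILL.md files from disk is left out.

```javascript
const path = require('node:path');

// Directories from the git root down to cwd, inclusive (empty if cwd is outside the root).
function skillSearchDirs(gitRoot, cwd) {
  const rel = path.relative(gitRoot, cwd);
  if (rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel)) return [];
  const dirs = [gitRoot];
  let current = gitRoot;
  for (const part of rel.split(path.sep).filter(Boolean)) {
    current = path.join(current, part);
    dirs.push(current);
  }
  return dirs;
}

// Merge skills by name: built-ins first, then project layers ordered root -> cwd,
// so a project skill replaces a built-in and a skill nearer cwd replaces one nearer the root.
function resolveSkills(builtIn, projectLayers) {
  const byName = new Map();
  for (const skill of builtIn) byName.set(skill.name, skill);
  for (const layer of projectLayers) {
    for (const skill of layer) byName.set(skill.name, skill);
  }
  return [...byName.values()];
}
```

In each directory returned by skillSearchDirs, the candidate locations would be .agents/skills/<name>/SKILL.md and .claude/skills/<name>/SKILL.md as listed above.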

Skills footer

When skills are active, their names appear in the MR summary note footer:

Skills: `code-review`

Artifacts

  • gitlab-review.md: raw review text returned by the agent
  • review-comments.json: generated comment objects including:
    • parsed comment payload
    • computed fingerprints
    • duplicate status
    • final GitLab discussion payload
  • review-usage.json: token and cost breakdown for the run (tokens.{input,output,cacheRead,cacheWrite,total}, cost.{input,output,cacheRead,cacheWrite,total}, model)

The CLI also prints a one-line summary at the end of the run:

Review usage: 12,345 in / 678 out tokens — $0.0421 (anthropic/claude-sonnet-4-5)

Use these files for CI debugging and auditing.
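
One way to use review-usage.json for auditing is a budget check in a follow-up CI step. The sketch below is an assumption-laden example: checkBudget is a hypothetical helper, and it assumes cost.total is a number expressed in US dollars, consistent with the dollar amount shown in the summary line.

```javascript
// Hypothetical helper: returns an error message when the run cost exceeds `maxUsd`, else null.
// Assumes `usage` has the documented shape ({ tokens, cost, model }) with cost.total in USD.
function checkBudget(usage, maxUsd) {
  const total = usage?.cost?.total;
  if (typeof total !== 'number') {
    return 'review-usage.json has no numeric cost.total';
  }
  if (total > maxUsd) {
    return `review cost $${total.toFixed(4)} exceeds budget $${maxUsd.toFixed(4)} (${usage.model})`;
  }
  return null;
}

// Example wiring in a CI script (file name taken from the artifact list above):
//   const usage = JSON.parse(require('node:fs').readFileSync('review-usage.json', 'utf8'));
//   const problem = checkBudget(usage, 0.5);
//   if (problem) { console.error(problem); process.exitCode = 1; }
```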

Diagnostics channels

gitlab-review publishes opt-in Node.js diagnostics_channel tracing events with no external telemetry dependency. Subscribers can listen before calling run() or from a Node preload/import hook before running the CLI.

Base tracing channel names:

  • @ikko-dev/gitlab-review:run
  • @ikko-dev/gitlab-review:gitlab.get_merge_request
  • @ikko-dev/gitlab-review:gitlab.get_latest_version
  • @ikko-dev/gitlab-review:git.prepare_history
  • @ikko-dev/gitlab-review:git.get_merge_diff
  • @ikko-dev/gitlab-review:reviewer.run
  • @ikko-dev/gitlab-review:review.parse
  • @ikko-dev/gitlab-review:gitlab.get_discussions
  • @ikko-dev/gitlab-review:comments.build
  • @ikko-dev/gitlab-review:artifact.write_output
  • @ikko-dev/gitlab-review:gitlab.post_comments
  • @ikko-dev/gitlab-review:gitlab.upsert_summary

Node emits tracing subchannels as tracing:<base>:start, :end, :asyncStart, :asyncEnd, and :error. Payloads include safe run metadata (runId, phase, project, MR, GitLab URL, model, severity, timings, comment counts, and sanitized errorInfo) and intentionally exclude tokens/API keys.

When --posting-mode draft is used, the gitlab.post_comments payload also exposes draftsAbandoned, draftsCreated, draftsDeletedPrePublish, and draftsPublished counters describing the draft lifecycle within the run.

The reviewer.run payload exposes a usage field ({ model, tokens, cost }) once the agent has returned. The same usage is forwarded onto the top-level run payload so a subscriber on run:asyncEnd sees the final token and cost totals for the review.

import { diagnosticChannels, run } from '@ikko-dev/gitlab-review';

const onStart = (ctx) => console.log('review started', ctx.runId);
const onEnd = (ctx) => console.log('review completed', ctx.durationMs, ctx.generated);
const onError = (ctx) => console.error('review failed', ctx.errorInfo);

diagnosticChannels.run.start.subscribe(onStart);
diagnosticChannels.run.asyncEnd.subscribe(onEnd);
diagnosticChannels.run.error.subscribe(onError);

await run(config);
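
For the preload route mentioned earlier, a subscriber can also attach by raw channel name using Node's built-in diagnostics_channel module, without importing the package. The sketch below assumes the tracing:<base>:<event> naming described above; the file name and the seen array are illustrative only.

```javascript
// trace-preload.js (hypothetical file), loaded with: node --require ./trace-preload.js ...
const dc = require('node:diagnostics_channel');

const BASE = '@ikko-dev/gitlab-review:run';
const seen = [];

// Build a handler that records which tracing event fired and for which run.
function record(event) {
  return (message) => {
    seen.push({ event, runId: message?.runId });
    console.log(`[gitlab-review] ${event}`, message?.runId);
  };
}

// Subscribe by full channel name: tracing:<base>:<event>.
dc.subscribe(`tracing:${BASE}:start`, record('start'));
dc.subscribe(`tracing:${BASE}:asyncEnd`, record('asyncEnd'));
dc.subscribe(`tracing:${BASE}:error`, record('error'));
```

Because the subscription is by name, the preload file does not need to resolve @ikko-dev/gitlab-review at load time.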

OpenTelemetry bridge

GITLAB_REVIEW_OTEL=1 enables a bridge that subscribes to the diagnostics channels and emits OTLP spans, GenAI client metrics, and structured log records. The OTel runtime is bundled — no extra installs required.

Exporter selection follows the standard OTEL_* env vars (OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, OTEL_EXPORTER_OTLP_PROTOCOL, …). Anything that ingests OTLP works: Tempo, Mimir, Loki, Jaeger, Datadog, Honeycomb, SigNoz, and so on.

Spans

The full trace hierarchy in Tempo is:

invoke_workflow gitlab-review
└── invoke_agent gitlab-review
    ├── gen_ai.agent.turn (turn 1)
    │   ├── execute_tool Read
    │   └── execute_tool Grep
    ├── gen_ai.agent.turn (turn 2)
    │   └── execute_tool Read
    └── gen_ai.agent.turn (turn N)

  • invoke_workflow gitlab-review: root span per run, carrying gitlab.project_id, gitlab.mr_iid, comment counters, and gen_ai.* totals.
  • invoke_agent gitlab-review: wraps the full agent call. Tagged with gen_ai.provider.name, gen_ai.request.model, gen_ai.response.model, gen_ai.operation.name=invoke_agent, and aggregate token and cost attributes.
  • gen_ai.agent.turn: one child span per agent turn with per-turn token counts, cost, model, and stop reason.
  • execute_tool <name>: one grandchild span per tool call (gen_ai.tool.name, gen_ai.tool.call.id). Error status is set on failed calls.
  • gitlab-review.<phase>: one span per remaining phase (gitlab.get_merge_request, git.get_merge_diff, gitlab.post_comments, …) for latency and error rates.

Metrics

The bridge emits two sets of metrics.

GenAI client metrics follow the OpenTelemetry GenAI semantic conventions (gen_ai.*) and are emitted per LLM call:

| Metric | Unit | Purpose |
| --- | --- | --- |
| gen_ai.client.operation.duration | s | Overall agent call duration |
| gen_ai.client.token.usage | {token} | Token counts per turn by type |
| gen_ai.client.cost | usd | Cost per turn |
| gen_ai.client.time_to_first_token | s | TTFT per turn (recorded on first streaming event) |

Review-level metrics are emitted once per complete run (success or failure):

| Metric | Type | Labels |
| --- | --- | --- |
| gitlab_review_run_duration_seconds | Histogram | gitlab.project_path, gitlab.pipeline_source, gitlab_review.dry_run, gitlab_review.status |
| gitlab_review_total_cost_usd | Histogram | gitlab.project_path, gitlab_review.dry_run, gitlab_review.status |
| gitlab_review_comments_total | Counter | gitlab.project_path, gitlab_review.dry_run |
| gitlab_review_drafts_published_total | Counter | gitlab.project_path, gitlab_review.dry_run |
| gitlab_review_phase_duration_seconds | Histogram | gitlab.project_path, gitlab_review.phase, gitlab_review.status |

gitlab_review.status is success, error, or timeout (AbortError / ETIMEDOUT). gitlab.project_path is populated from CI_PROJECT_PATH when running inside a GitLab CI pipeline.
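
The three status values can be illustrated with a small classifier. This is a sketch of the mapping described above, not the bridge's actual code; classifyStatus is a hypothetical name, and checking error.name and error.code is an assumption about how AbortError and ETIMEDOUT are detected.

```javascript
// Hypothetical mapping from a run outcome to the gitlab_review.status label.
function classifyStatus(error) {
  if (!error) return 'success';
  // Aborted requests and socket timeouts are grouped under "timeout".
  if (error.name === 'AbortError' || error.code === 'ETIMEDOUT') return 'timeout';
  return 'error';
}
```

Separating timeout from error keeps slow-provider incidents distinguishable from genuine failures when filtering on gitlab_review.status.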

Grafana Application Observability auto-discovers the service from its gen_ai.* metrics without any dashboard import. The gitlab_review_* metrics enable project-level Mimir queries such as sum by (gitlab_project_path) (increase(gitlab_review_total_cost_usd_sum[7d])) to track spend per repository.

Structured log records

After each review the bridge emits one OTel log record per generated comment (event.name: gitlab_review.comment) and a review completion record (event.name: gitlab_review.completed). Each comment record carries path, line, severity, duplicate flag, and the comment body. The completion record carries total cost, token counts, model, project/MR IDs, and comment/duplicate counts.

Log records land in Loki (or whichever OTLP log backend you target) and can be correlated back to traces via run.id.

Grafana Cloud token scopes

For all three signals to reach their respective backends, the service account token used in OTEL_EXPORTER_OTLP_HEADERS must carry:

  • Traces Publisher — writes to Tempo
  • Metrics Publisher — writes to Mimir
  • Logs Publisher — writes to Loki

Exports made with a token that is missing any of these scopes are rejected by the OTLP gateway with 401 Unauthorized: invalid scope requested, and the failure is silent by default. Set OTEL_LOG_LEVEL=error to surface export failures.

Disabling the bridge

When GITLAB_REVIEW_OTEL is not set, the bridge is a no-op and @opentelemetry/* is never imported. The packages are loaded dynamically behind the env check, so leaving the flag unset incurs no startup cost.

Library injection

Library callers with pre-existing TracerProvider/MeterProvider/LoggerProvider can share them by injecting a runtime instead of letting the bridge boot its own NodeSDK:

import { metrics, trace } from '@opentelemetry/api';
import { logs } from '@opentelemetry/api-logs';
import { startOtelBridge } from '@ikko-dev/gitlab-review';

await startOtelBridge({
  runtime: {
    tracerProvider: trace.getTracerProvider(),
    meterProvider: metrics.getMeterProvider(),
    loggerProvider: logs.getLoggerProvider(),
    shutdown: async () => {},
  },
});

MR-level summary note

In addition to inline discussions, the reviewer returns an overall summary (Markdown). The CLI posts it as a non-positional MR note — the same shape a human reviewer creates when typing in the MR comment box. The note carries a hidden marker:

<!-- gitlab-review:summary -->

On subsequent runs the CLI finds the existing note by that marker and updates it in place via PUT /merge_requests/:iid/notes/:id, so the summary always reflects the latest review without piling up duplicates. The latest summary stays at the top of the note. When a note is updated, the previous latest summary is moved into a collapsed <details> section labeled Previous review runs instead of being erased; existing history is retained with a bounded limit of 10 previous runs.
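
The find-by-marker decision can be sketched as a pure function that picks between creating and updating the note. The name planSummaryUpsert is hypothetical, placing the marker on the first line of the body is an assumption, and the history handling (the collapsed Previous review runs section) is deliberately left out.

```javascript
const SUMMARY_MARKER = '<!-- gitlab-review:summary -->';

// Hypothetical planner: decide whether to create or update the summary note.
// `notes` is the list of existing MR notes, each with at least { id, body }.
function planSummaryUpsert(notes, projectId, mrIid, body) {
  const existing = notes.find(
    (note) => typeof note.body === 'string' && note.body.includes(SUMMARY_MARKER)
  );
  // Project paths such as group/repo must be URL-encoded in GitLab API paths.
  const base = `/projects/${encodeURIComponent(projectId)}/merge_requests/${mrIid}/notes`;
  const payload = {
    body: body.includes(SUMMARY_MARKER) ? body : `${SUMMARY_MARKER}\n${body}`,
  };
  return existing
    ? { method: 'PUT', path: `${base}/${existing.id}`, payload }
    : { method: 'POST', path: base, payload };
}
```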

The summary is upserted before inline comments are posted so it appears at the top of the MR activity feed. The CLI appends footer metadata after a horizontal rule so reviewers can see the run cost and reviewed commit at a glance:

---

Review usage: 12,345 in / 678 out tokens — $0.0421 (anthropic/claude-sonnet-4-5)

Skills: `code-review`

Reviewed by [@ikko-dev/gitlab-review](https://github.com/ikko-dev/gitlab-review) for commit <sha>.

The Skills: line is only present when one or more skills were active for the run.

If a later CI job sees that the current MR head commit already appears in that footer, it skips the agent run to avoid producing a different review for the same diff. Use --force-review or GITLAB_REVIEW_FORCE_REVIEW=true to bypass the guard. The summary upsert runs in both direct and draft posting modes (it always uses the regular notes endpoints — the atomic bulk-publish flow is reserved for inline comments).

Disable with --no-summary or GITLAB_REVIEW_POST_SUMMARY=false. With --dry-run/--no-post, the summary is parsed but not posted, and the reviewed-commit skip guard is not applied.
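
The reviewed-commit guard can be sketched as below. This is an illustration only: reviewedCommits and shouldSkipReview are hypothetical names, and matching the literal text "for commit <sha>" in the footer is an assumption based on the footer format shown above.

```javascript
// Extract every commit SHA that appears in a "for commit <sha>" footer line.
function reviewedCommits(summaryBody) {
  return [...summaryBody.matchAll(/for commit ([0-9a-f]{7,40})\b/gi)].map((match) =>
    match[1].toLowerCase()
  );
}

// Hypothetical guard: skip the agent run only when the head commit was already reviewed
// and neither --force-review nor --dry-run/--no-post is in effect.
function shouldSkipReview({ summaryBody, headSha, forceReview = false, dryRun = false }) {
  if (forceReview || dryRun) return false;
  if (!summaryBody || !headSha) return false;
  return reviewedCommits(summaryBody).includes(headSha.toLowerCase());
}
```

Comparing whole SHAs rather than using a substring search avoids a short prefix accidentally matching a different, longer commit hash.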

Duplicate prevention

Each generated comment body includes hidden markers:

<!-- gitlab-review:fingerprint-primary:<hash> -->
<!-- gitlab-review:fingerprint-secondary:<hash> -->

Before posting, the CLI fetches existing MR discussions and skips comments where either fingerprint is already present. This prevents reposting across reruns and also prevents duplicates generated in the same run.
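
The skip rule can be sketched as a filter over candidate comments. This is a minimal illustration with assumed names: extractFingerprints and filterFresh are hypothetical, each candidate is assumed to carry primary and secondary fingerprint strings, and how the hashes are computed is not covered here.

```javascript
const FINGERPRINT_RE =
  /<!-- gitlab-review:fingerprint-(?:primary|secondary):([^\s>]+) -->/g;

// Collect every fingerprint hash embedded in a note body.
function extractFingerprints(body) {
  return [...body.matchAll(FINGERPRINT_RE)].map((match) => match[1]);
}

// Keep only comments whose fingerprints are new, both against existing discussions
// and against comments already accepted earlier in the same run.
function filterFresh(comments, existingBodies) {
  const seen = new Set(existingBodies.flatMap(extractFingerprints));
  const fresh = [];
  for (const comment of comments) {
    const prints = [comment.primary, comment.secondary];
    if (prints.some((print) => seen.has(print))) continue;
    prints.forEach((print) => seen.add(print));
    fresh.push(comment);
  }
  return fresh;
}
```

Adding accepted fingerprints to the same set is what covers the same-run case: a second comment sharing either fingerprint with an earlier one is dropped before posting.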

Troubleshooting

  • Node.js >=24 is required
    • Use node:24 (or newer) in CI.
  • Missing required configuration
    • Provide required flags or ensure CI vars are available (CI_PROJECT_ID, CI_MERGE_REQUEST_IID, token, API key).
  • --min-severity must be one of: info, warn, critical
    • Fix --min-severity or GITLAB_REVIEW_MIN_SEVERITY.
  • Git history errors / merge-base failures
    • Set GIT_DEPTH: 0.
    • Ensure source and target branches are fetchable from origin.
  • GitLab API 401/403 when posting
    • Ensure token has rights to read MR metadata/discussions and create MR discussions.
    • If using CI_JOB_TOKEN, ensure your GitLab project settings allow required API access.
  • No comments posted
    • Check review-comments.json for duplicate: true or empty parsed comments.
    • Run with --dry-run and inspect gitlab-review.md formatting (== Inline Comments ==).

Development / release

npm run typecheck
npm test
npm run build
npm pack --dry-run

Eval tests call the real LLM and require ANTHROPIC_API_KEY (or GITLAB_REVIEW_API_KEY) in a local .env file:

npm run test:evals

Override the model for cheaper/faster eval runs:

GITLAB_REVIEW_EVAL_MODEL=anthropic/claude-haiku-4-5-20251001 npm run test:evals

The review agent runs against pinned @earendil-works/pi-agent-core, @earendil-works/pi-ai, and @earendil-works/pi-coding-agent versions, so published builds keep a deterministic reviewer runtime.

Acknowledgements

gitlab-review builds on ideas and prior work from several projects:

  • pi-reviewer — the original agent-driven code reviewer that gitlab-review grew out of. The agent runtime (@earendil-works/pi-agent-core), model abstraction (@earendil-works/pi-ai), and read-only coding tools (@earendil-works/pi-coding-agent) are all pi-reviewer infrastructure.
  • Warden by Sentry — the skills architecture (per-skill instruction blocks, reference files loaded on demand by the agent, project-level discovery) takes direct inspiration from Warden's approach to composable, domain-specific review modules.
  • agentskills.io — the SKILL.md frontmatter format and multi-file skill layout (references/, scripts/, assets/) follow the agentskills.io open standard for portable agent skills.
