Skip to content

feat(node): iCaptcha-aware repo propagation gate with quarantine#125

Merged
kevincodex1 merged 1 commit into
mainfrom
feat/icaptcha-federation
Jun 30, 2026
Merged

feat(node): iCaptcha-aware repo propagation gate with quarantine#125
kevincodex1 merged 1 commit into
mainfrom
feat/icaptcha-federation

Conversation

@kevincodex1

@kevincodex1 kevincodex1 commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Phase 2 of the iCaptcha work: extend the proof-of-intelligence gate from direct writes to federated propagation, so a node that enforces iCaptcha can defend against spam repos that originate on a node which does not.

The proof now travels with the repo and is re-verified at mirror time:

  • Capture & persist: create_repo/fork_repo store the verified proof (raw token + claims) against the new repo so it can propagate. A new repo_icaptcha_proofs side table holds one proof per repo.

  • Serve: GET /api/v1/repos/{owner}/{repo}/icaptcha-proof returns the stored token, gated on whole-repo "/" read like the other replication endpoints so a private repo's proof is never disclosed.

  • Admit at mirror time: on a first-time mirror the sync worker fetches the origin's proof and icaptcha::admit_mirror re-verifies it offline — signature + level + owner-DID binding, but NOT expiry (proofs are short-lived; the origin enforced freshness at creation) — and consumes the jti in the local ledger, so one solved challenge admits at most one mirror per node (a malicious origin can't reuse one proof for many spam repos).

  • Quarantine: in enforce mode a mirror that fails the gate is kept on disk but hidden from clone/serve (authorize_repo_read + both git serve handlers) and from listings (dedup CTE + federated list), via a new repos.quarantined column. Shadow logs only and writes nothing; off admits normally. Fail-safe: off/shadow/no-key always admit.

Internals: split verify_token (raw token, optional exp check) out of the header path; ProofGuard::consume now returns the VerifiedProof for persistence. DB gains record_repo_proof/get_repo_proof_token, is_repo_quarantined, and (for the deferred operator release surface) set_repo_quarantine/list_quarantined_repo_ids.

Tested: verify_token expiry-skip + signature/level/DID enforcement on the propagation path; proof round-trip; quarantine set/release, preservation across re-sync, and exclusion from listings. Full workspace green against Postgres.

Known tradeoff: a legit repo from a non-iCaptcha (or shadow) origin has no proof and is quarantined by an enforce node — recoverable via operator release. The operator release HTTP endpoint is deferred (no admin-auth surface exists yet); DB methods are in place.

Summary by CodeRabbit

  • New Features

    • Added a new endpoint for retrieving iCaptcha proof information for a repository, supporting mirror verification workflows.
  • Bug Fixes

    • Quarantined mirrors are now hidden from clone/fetch/read access and repository listings.
    • Repository creation and fork flows now preserve verification data for later use during mirroring.
    • Mirror synchronization now applies a stricter first-sync check and keeps prior quarantine decisions on re-sync.

@github-actions github-actions Bot added the needs-issue PR has no linked issue label Jun 30, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Thanks for the contribution. A couple of things will help us review this faster:

  • Link the issue this addresses (Closes #123). For protocol changes, open an issue first.

See CONTRIBUTING.md. Update the PR and these notes will clear automatically.

@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds an iCaptcha propagation quarantine system for mirrored repos. Proofs are now persisted at repo creation and fork, exposed via a new GET /api/v1/repos/{owner}/{repo}/icaptcha-proof endpoint, fetched by the sync worker on first mirror, and used to set a quarantined DB flag. Quarantined repos are hidden from listings, authorization, and git clone/fetch endpoints.

Changes

iCaptcha Propagation Quarantine

Layer / File(s) Summary
VerifiedProof struct and verify_token refactor
crates/gitlawb-node/src/icaptcha.rs
Decision::Consume payload changes from a {jti, exp} pair to a new VerifiedProof struct with a record_for_repo method. ProofGuard::consume now returns Ok(Some(VerifiedProof)). verify_token is extracted as a reusable core verifier accepting a check_exp flag so the propagation path can skip expiry while still enforcing signature, level, and DID. admit_mirror uses check_exp=false. Tests cover the new return type and both expiry behaviors.
DB migration, quarantine schema, and proof/quarantine APIs
crates/gitlawb-node/src/db/mod.rs
v9 migration adds repos.quarantined (default FALSE) and repo_icaptcha_proofs table. upsert_mirror_repo gains a quarantined: bool parameter, stored on insert but not overwritten on resync conflict. list_all_repos_with_stars and DEDUP_CTE filter out quarantined rows. New DB methods: record_repo_proof, get_repo_proof_token, is_repo_quarantined, set_repo_quarantine, list_quarantined_repo_ids. Existing dedup tests updated; new tests cover proof round-trips, quarantine set/release/preservation, and listing exclusion.
Sync worker quarantine gate
crates/gitlawb-node/src/sync.rs
On first-time mirror sync, calls new fetch_icaptcha_proof helper to retrieve the proof from the origin's /icaptcha-proof endpoint, runs icaptcha::admit_mirror, derives quarantined from the result, and passes it to upsert_mirror_repo. Re-syncs of existing mirrors skip the gate entirely.
Proof persistence at creation/fork and new proof endpoint
crates/gitlawb-node/src/api/repos.rs, crates/gitlawb-node/src/server.rs
create_repo and fork_repo capture the returned VerifiedProof and persist it via record_for_repo (warning-only on failure). New get_icaptcha_proof handler authorizes repo read and returns { repo, proof } JSON. Route wired in build_router.
Quarantine enforcement at read surfaces
crates/gitlawb-node/src/api/mod.rs, crates/gitlawb-node/src/api/repos.rs
authorize_repo_read returns RepoNotFound for quarantined repos before evaluating visibility rules. git_info_refs and git_upload_pack each add an early quarantine check returning RepoNotFound.
Test support fixups
crates/gitlawb-node/src/test_support.rs
Four integration test helpers updated to pass the new false (non-quarantined) argument to upsert_mirror_repo.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Gitlawb/node#52: This PR extends the authorize_repo_read gate introduced in #52 to also reject quarantined repos with RepoNotFound.
  • Gitlawb/node#108: Both PRs modify the iCaptcha proof verification/consumption flow; this PR refactors it further to persist VerifiedProof per repo for quarantine propagation.
  • Gitlawb/node#73: This PR extends the shared DEDUP_CTE introduced in #73 to additionally exclude quarantined mirror rows from deduped listings.

Poem

🐇 A mirror appears from afar, quite bold,
But without proof its token can't unfold.
Quarantined repos stay hidden from sight,
No clone, no fetch — they vanish from light.
The rabbit stamps proofs and hops away glad! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning It covers summary, motivation, changes, and testing, but omits required template sections like issue link, kind of change, reviewer verification, and protocol impact. Add the missing template sections: Closes #, kind of change, reviewer verification steps, protocol/signing impact, and reviewer notes/checklists.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title is concise and accurately describes the main change: iCaptcha-aware repo propagation with quarantine.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/icaptcha-federation

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/gitlawb-node/src/api/repos.rs`:
- Around line 1543-1558: The get_icaptcha_proof endpoint is returning a raw repo
creation proof that can be reused by any whole-repo reader, so mirror admission
should not rely on this unbound token. Update get_icaptcha_proof and the
downstream admission flow in crate::icaptcha::admit_mirror so the propagated
proof is bound to repo identity/content or made peer/target-specific before
acceptance, and ensure the returned value cannot be replayed across origins.
- Around line 500-504: The `git_receive_pack` POST route is missing the
quarantine gate that already exists in the clone/fetch path, so quarantined
mirrors can still accept pushes. Add the same
`state.db.is_repo_quarantined(&record.id).await?` check in `git_receive_pack`
before any receive-pack handling, and return `AppError::RepoNotFound` for
quarantined repos just like the existing repository lookup flow in `repos.rs`.

In `@crates/gitlawb-node/src/db/mod.rs`:
- Around line 864-871: The repo admission logic in the insert/upsert flow
currently keys `quarantined` off disk existence, which can wrongly admit a repo
when the Git directory exists but the DB row is still missing. Update the
sync/admission path around the `INSERT INTO repos ... ON CONFLICT` flow so
mirror admission is driven by absence of the exact mirror DB row, not by
`local_path.exists()`, and ensure the `quarantined` decision is recomputed from
DB presence before calling the repo insert/update logic.
- Around line 803-806: The mirror replay JTI is being stored with the original
propagation proof expiry, so already-expired proofs can be swept immediately and
replayed after cleanup. Update the mirror admission flow around `admit_mirror`
and the proof ledger in `db::mod` to persist replay JTIs with a durable
retention expiry (or a separate propagation replay ledger) instead of
`claims.exp`, and make `sweep_expired_proofs` only remove entries once that
retention window truly expires.

In `@crates/gitlawb-node/src/icaptcha.rs`:
- Line 421: The replay-guard entry for consumed proofs is being removed too
early, allowing the same proof to be reused after expiration. Update the proof
lifecycle in `admit_mirror`, `consume_proof_jti`, and `sweep_expired_proofs()`
so consumed JTI records remain in the replay ledger longer than `claims.exp` and
cannot be admitted again on the same node after cleanup. Make the cleanup
retention window explicit and ensure the sweep logic only removes entries once
the replay-protection period has safely passed.

In `@crates/gitlawb-node/src/sync.rs`:
- Around line 284-299: The admission check in sync logic is using the local
clone existence as a proxy for mirror approval, which can bypass quarantine on
retries. Update the admission flow in the sync path around the quarantined
calculation to base the decision on DB state or a persisted mirror record rather
than the on-disk clone, so `crate::icaptcha::admit_mirror` is still consulted
when a repo exists locally but was never successfully upserted. Preserve the
existing `warn!` quarantine path and ensure the `upsert_mirror_repo`/mirror
insert path records the correct quarantined state.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3ebef415-8506-41f2-9e5b-4a24c9dd991f

📥 Commits

Reviewing files that changed from the base of the PR and between adc20f9 and e59aa10.

📒 Files selected for processing (7)
  • crates/gitlawb-node/src/api/mod.rs
  • crates/gitlawb-node/src/api/repos.rs
  • crates/gitlawb-node/src/db/mod.rs
  • crates/gitlawb-node/src/icaptcha.rs
  • crates/gitlawb-node/src/server.rs
  • crates/gitlawb-node/src/sync.rs
  • crates/gitlawb-node/src/test_support.rs

Comment on lines +500 to +504
// A quarantined mirror is served to no one (clone or push advertisement) —
// hidden as repo-not-found until an operator releases it.
if state.db.is_repo_quarantined(&record.id).await? {
return Err(AppError::RepoNotFound(format!("{owner}/{name}")));
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP 'pub async fn git_receive_pack|is_repo_quarantined|git-receive-pack' crates/gitlawb-node/src/api/repos.rs crates/gitlawb-node/src/server.rs

Repository: Gitlawb/node

Length of output: 1055


🏁 Script executed:

sed -n '540,840p' crates/gitlawb-node/src/api/repos.rs

Repository: Gitlawb/node

Length of output: 11740


🏁 Script executed:

sed -n '150,220p' crates/gitlawb-node/src/server.rs

Repository: Gitlawb/node

Length of output: 2953


🏁 Script executed:

sed -n '556,576p' crates/gitlawb-node/src/api/repos.rs

Repository: Gitlawb/node

Length of output: 1017


Add the quarantine gate to git_receive_pack. The direct /{owner}/{repo}/git-receive-pack POST route still lacks an is_repo_quarantined check, so quarantined mirrors can accept pushes even though they’re hidden from clone/fetch.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/api/repos.rs` around lines 500 - 504, The
`git_receive_pack` POST route is missing the quarantine gate that already exists
in the clone/fetch path, so quarantined mirrors can still accept pushes. Add the
same `state.db.is_repo_quarantined(&record.id).await?` check in
`git_receive_pack` before any receive-pack handling, and return
`AppError::RepoNotFound` for quarantined repos just like the existing repository
lookup flow in `repos.rs`.

Comment on lines +1543 to +1558
/// GET /api/v1/repos/{owner}/{repo}/icaptcha-proof
///
/// Returns the iCaptcha proof token this repo was created with (`null` if none).
/// A peer mirroring this repo fetches it and re-verifies it offline before
/// admitting the mirror (see [`crate::icaptcha::admit_mirror`]). Not owner-gated,
/// but gated on whole-repo `"/"` read like the other replication endpoints, so a
/// private repo's proof is never disclosed.
pub async fn get_icaptcha_proof(
State(state): State<AppState>,
auth: Option<Extension<AuthenticatedDid>>,
Path((owner, repo)): Path<(String, String)>,
) -> Result<Json<serde_json::Value>> {
let caller = auth.as_ref().map(|e| e.0 .0.as_str());
let (record, _rules) =
crate::api::authorize_repo_read(&state, &owner, &repo, caller, "/").await?;
let proof = state.db.get_repo_proof_token(&record.id).await?;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift

Don’t expose an unbound bearer proof for mirror admission.

This returns the raw creation proof to any whole-repo reader, while mirror admission validates the token against the owner binding only. For public repos, any reader can reuse that token from another origin or spend it first on a target node. Bind the propagated proof to repo identity/content or make it peer/target-specific before accepting it for admission.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/api/repos.rs` around lines 1543 - 1558, The
get_icaptcha_proof endpoint is returning a raw repo creation proof that can be
reused by any whole-repo reader, so mirror admission should not rely on this
unbound token. Update get_icaptcha_proof and the downstream admission flow in
crate::icaptcha::admit_mirror so the propagated proof is bound to repo
identity/content or made peer/target-specific before acceptance, and ensure the
returned value cannot be replayed across origins.

Comment on lines +803 to +806
// The iCaptcha proof presented at repo creation, kept so it can travel
// with the repo when it propagates to peers. A mirroring node that
// enforces iCaptcha re-verifies this token offline before admitting the
// mirror (see `icaptcha::admit_mirror`). One row per repo (its creation

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift

Persist mirror replay JTIs beyond the proof exp.

admit_mirror accepts expired proofs, but it records claims.exp in the same ledger swept by sweep_expired_proofs. For already-expired propagation proofs, the consumed JTI is immediately sweepable, so the same proof can admit another mirror on the same node after cleanup. Use a separate propagation replay ledger or store a durable mirror-retention expiry instead of the original proof expiry.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/db/mod.rs` around lines 803 - 806, The mirror replay
JTI is being stored with the original propagation proof expiry, so
already-expired proofs can be swept immediately and replayed after cleanup.
Update the mirror admission flow around `admit_mirror` and the proof ledger in
`db::mod` to persist replay JTIs with a durable retention expiry (or a separate
propagation replay ledger) instead of `claims.exp`, and make
`sweep_expired_proofs` only remove entries once that retention window truly
expires.

Comment on lines +864 to 871
// `quarantined` is set only on first insert (the admission decision).
// A re-sync (ON CONFLICT) preserves the existing flag — admission runs
// once, and an operator's later release must not be silently reverted.
sqlx::query(
"INSERT INTO repos (id, name, owner_did, description, is_public, default_branch,
created_at, updated_at, disk_path, machine_id)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10)
created_at, updated_at, disk_path, machine_id, quarantined)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)
ON CONFLICT (id) DO UPDATE SET updated_at = $8, disk_path = $9, machine_id = $10",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Base first-insert admission on DB presence, not disk presence.

The sync caller derives quarantined=false whenever local_path.exists(). If a prior clone left the repo on disk but failed before inserting the DB row, this INSERT path admits it unquarantined because there is no conflict to preserve. Run mirror admission whenever the exact mirror DB row is absent, separately from whether the Git directory already exists.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/db/mod.rs` around lines 864 - 871, The repo admission
logic in the insert/upsert flow currently keys `quarantined` off disk existence,
which can wrongly admit a repo when the Git directory exists but the DB row is
still missing. Update the sync/admission path around the `INSERT INTO repos ...
ON CONFLICT` flow so mirror admission is driven by absence of the exact mirror
DB row, not by `local_path.exists()`, and ensure the `quarantined` decision is
recomputed from DB presence before calling the repo insert/update logic.

Comment thread crates/gitlawb-node/src/icaptcha.rs Outdated
// later switch to enforce still sees the jti as fresh.
return MirrorAdmission::Admit;
}
match db.consume_proof_jti(&claims.jti, claims.exp).await {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect whether consume_proof_jti reserves already-expired JTIs or treats exp as an immediate TTL.
rg -n -C 10 'consume_proof_jti\s*\(' crates/gitlawb-node/src

# Look for tests covering expired propagation proof replay behavior.
rg -n -C 6 'admit_mirror|expired proof|proof_jti|icaptcha.*jti|already used' crates/gitlawb-node/src

Repository: Gitlawb/node

Length of output: 28343


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the sweep implementation and any scheduled callers/cleanup references.
rg -n -C 4 'sweep_expired_proofs|icaptcha_consumed_proofs|consume_proof_jti\(|cleanup.*proof|sweep.*proof' crates/gitlawb-node/src

# Read the sweep implementation and nearby docs/comments.
sed -n '1128,1180p' crates/gitlawb-node/src/db/mod.rs
sed -n '3736,3765p' crates/gitlawb-node/src/db/mod.rs

Repository: Gitlawb/node

Length of output: 13893


Keep replay ledger entries longer than exp
admit_mirror intentionally accepts expired proofs, but sweep_expired_proofs() deletes the consumed jti as soon as exp passes. After cleanup, the same proof token can be replayed again on this node and admit another mirror.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/icaptcha.rs` at line 421, The replay-guard entry for
consumed proofs is being removed too early, allowing the same proof to be reused
after expiration. Update the proof lifecycle in `admit_mirror`,
`consume_proof_jti`, and `sweep_expired_proofs()` so consumed JTI records remain
in the replay ledger longer than `claims.exp` and cannot be admitted again on
the same node after cleanup. Make the cleanup retention window explicit and
ensure the sweep logic only removes entries once the replay-protection period
has safely passed.

Comment thread crates/gitlawb-node/src/sync.rs Outdated
Comment on lines +284 to +299
let quarantined = if !exists {
let proof =
fetch_icaptcha_proof(client, &origin_url, owner_short, repo_name).await;
match crate::icaptcha::admit_mirror(db, proof.as_deref(), owner_short).await {
crate::icaptcha::MirrorAdmission::Admit => false,
crate::icaptcha::MirrorAdmission::Quarantine(reason) => {
warn!(
repo = %item.repo, origin = %origin_url, reason,
"quarantining mirrored repo: failed iCaptcha propagation gate"
);
true
}
}
} else {
false
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Base admission on DB state, not the on-disk clone.

Line 284 treats any existing local path as an already-admitted mirror. If a prior sync cloned the repo but crashed or failed before upsert_mirror_repo, the retry will skip admit_mirror and insert the repo with quarantined = false, bypassing enforce mode.

Suggested direction
+        let known_repo = match db.get_repo(owner_short, repo_name).await {
+            Ok(Some(_)) => true,
+            Ok(None) => false,
+            Err(e) => {
+                warn!(repo = %item.repo, err = %e, "repo lookup failed before admission gate");
+                let _ = db.mark_sync_failed(&item.id).await;
+                continue;
+            }
+        };
+
-                let quarantined = if !exists {
+                let quarantined = if !known_repo {
                     let proof =
                         fetch_icaptcha_proof(client, &origin_url, owner_short, repo_name).await;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let quarantined = if !exists {
let proof =
fetch_icaptcha_proof(client, &origin_url, owner_short, repo_name).await;
match crate::icaptcha::admit_mirror(db, proof.as_deref(), owner_short).await {
crate::icaptcha::MirrorAdmission::Admit => false,
crate::icaptcha::MirrorAdmission::Quarantine(reason) => {
warn!(
repo = %item.repo, origin = %origin_url, reason,
"quarantining mirrored repo: failed iCaptcha propagation gate"
);
true
}
}
} else {
false
};
let known_repo = match db.get_repo(owner_short, repo_name).await {
Ok(Some(_)) => true,
Ok(None) => false,
Err(e) => {
warn!(repo = %item.repo, err = %e, "repo lookup failed before admission gate");
let _ = db.mark_sync_failed(&item.id).await;
continue;
}
};
let quarantined = if !known_repo {
let proof =
fetch_icaptcha_proof(client, &origin_url, owner_short, repo_name).await;
match crate::icaptcha::admit_mirror(db, proof.as_deref(), owner_short).await {
crate::icaptcha::MirrorAdmission::Admit => false,
crate::icaptcha::MirrorAdmission::Quarantine(reason) => {
warn!(
repo = %item.repo, origin = %origin_url, reason,
"quarantining mirrored repo: failed iCaptcha propagation gate"
);
true
}
}
} else {
false
};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/sync.rs` around lines 284 - 299, The admission check
in sync logic is using the local clone existence as a proxy for mirror approval,
which can bypass quarantine on retries. Update the admission flow in the sync
path around the quarantined calculation to base the decision on DB state or a
persisted mirror record rather than the on-disk clone, so
`crate::icaptcha::admit_mirror` is still consulted when a repo exists locally
but was never successfully upserted. Preserve the existing `warn!` quarantine
path and ensure the `upsert_mirror_repo`/mirror insert path records the correct
quarantined state.

Phase 2 of the iCaptcha work: extend the proof-of-intelligence gate from
direct writes to federated propagation, so a node that enforces iCaptcha
can defend against spam repos that originate on a node which does not.

The proof now travels with the repo and is re-verified at mirror time:

- Capture & persist: create_repo/fork_repo store the verified proof
  (raw token + claims) against the new repo. New repo_icaptcha_proofs table.

- Serve: GET /api/v1/repos/{owner}/{repo}/icaptcha-proof returns the stored
  token, gated on whole-repo "/" read like the other replication endpoints.

- Admit at mirror time: the sync worker fetches the origin's proof and
  icaptcha::admit_mirror re-verifies it offline — signature + level +
  owner-DID binding, but NOT expiry — and consumes the jti in the local
  ledger so one solved challenge admits at most one mirror per node.

- Quarantine: in enforce mode a mirror that fails the gate is kept on disk
  but hidden from clone/serve/push (authorize_repo_read + the git serve and
  receive-pack handlers) and from listings (dedup CTE + federated list), via
  a new repos.quarantined column. Shadow logs only; off admits normally.
  Fail-safe: off/shadow/no-key always admit.

Hardening from review:
- git-receive-pack also rejects quarantined mirrors (no writes while hidden).
- mirror admission keys "first-time" on DB-row absence, not on-disk presence,
  so a clone that crashed before the DB insert is still gated on retry.
- mirror jti is spent against a fixed forward retention window, not the
  proof's already-past exp, so the replay record isn't swept early.

Known limitation: an iCaptcha proof binds to the owner DID, not the repo, so
a malicious origin can get one spam repo per harvested public proof admitted
(per-node jti single-use bounds the amplification). Fully binding a proof to
a repo needs a repo-scoped iCaptcha challenge (protocol change, future work).
The operator release HTTP endpoint is deferred (no admin-auth surface yet);
DB methods are in place.

Tested: verify_token expiry-skip + signature/level/DID enforcement on the
propagation path; proof round-trip; quarantine set/release, preservation
across re-sync, exclusion from listings; mirror jti retention survives a
sweep at the proof's exp. Full workspace green against Postgres.
@kevincodex1 kevincodex1 force-pushed the feat/icaptcha-federation branch from e59aa10 to f3feafe Compare June 30, 2026 12:29
@kevincodex1 kevincodex1 merged commit 8b9ceec into main Jun 30, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-issue PR has no linked issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant