fix(node): gate /ipfs/pins and /arweave/anchors behind authentication (#121)#134
fix(node): gate /ipfs/pins and /arweave/anchors behind authentication (#121)#134Gravirei wants to merge 42 commits into
Conversation
Gitlawb#126) The IPFS visibility gate used withheld_blob_oids (a deny-set enumerating only reachable blobs), so a dangling/unreachable blob was absent from the set and served in cleartext to anonymous callers. Flip to an allowed-set (allowed_blob_set_for_caller) that enumerates reachable blobs the caller may read: a dangling blob has no path, is never in the set, and 404s.
Move store::read_object before the allowed_blob_set_for_caller spawn_blocking call so random-CID spray against repos with path-scoped rules cannot trigger full-history git walks on repos that don't carry the object.
|
Thanks for the contribution. A couple of things will help us review this faster:
See CONTRIBUTING.md. Update the PR and these notes will clear automatically. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds optional-signature middleware for ChangesAuth Gating for Metadata Index Endpoints
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/gitlawb-node/src/api/ipfs.rs (1)
186-203: 🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy liftAvoid exposing the node-wide pin index to every signed DID.
This blocks anonymous callers, but any authenticated DID still reaches
list_pinned_cids()and receives the full node-wide CID index. If authenticated users are not all node-wide admins, private-repo CIDs remain enumerable. Gate this with a real node-wide permission or make pins repo-scoped and authorize repo read before returning entries.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/api/ipfs.rs` around lines 186 - 203, The current auth check in list_pins only blocks anonymous callers, but still lets any authenticated DID reach list_pinned_cids() and see the full node-wide pin index. Update list_pins to enforce a real node-wide permission using the authenticated caller from AuthenticatedDid before returning pins, or change the data model so pins are repo-scoped and authorize repo read access per caller. Keep the existing unauthorized path for unauthenticated requests, and ensure the final result only includes entries the caller is allowed to see.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gitlawb-node/src/api/arweave.rs`:
- Around line 45-57: The global anchor listing path in the arweave handler still
exposes private repo metadata because it only checks that a caller exists before
calling list_arweave_anchors(None, limit). Update the handler in the arweave API
to enforce per-repo read visibility for the no-repo case, or require q.repo to
be present, or gate global listing behind a node-admin capability. Use the
existing caller check and the list_arweave_anchors call site to locate the fix.
- Around line 52-55: The Arweave anchor query is only capping the upper bound in
the handler, so negative `q.limit` values can still reach the database and fail
in `list_arweave_anchors`. Update the limit handling in `arweave.rs` to clamp
the request value to a minimum of zero and a maximum of 200 before passing it
into `state.db.list_arweave_anchors`, keeping the existing `q.repo.as_deref()`
flow unchanged.
---
Outside diff comments:
In `@crates/gitlawb-node/src/api/ipfs.rs`:
- Around line 186-203: The current auth check in list_pins only blocks anonymous
callers, but still lets any authenticated DID reach list_pinned_cids() and see
the full node-wide pin index. Update list_pins to enforce a real node-wide
permission using the authenticated caller from AuthenticatedDid before returning
pins, or change the data model so pins are repo-scoped and authorize repo read
access per caller. Keep the existing unauthorized path for unauthenticated
requests, and ensure the final result only includes entries the caller is
allowed to see.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f9fd5b6a-451e-4f4f-a04b-f97da029b822
📒 Files selected for processing (3)
crates/gitlawb-node/src/api/arweave.rscrates/gitlawb-node/src/api/ipfs.rscrates/gitlawb-node/src/server.rs
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/gitlawb-node/src/test_support.rs (1)
1968-2017: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd a repo-scoped public success case and assert the filter actually narrows the result set.
These tests only exercise
?repo=against a single private repo with a single anchor. That leaves two intended contracts unpinned: anonymous access to a public repo-scoped listing, and returning only anchors for the requested repo. A regression that blanket-denies public?repo=reads or ignores therepopredicate would still pass here.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/test_support.rs` around lines 1968 - 2017, The `anchors_repo_denies_anonymous_on_private` and `anchors_repo_allows_owner` tests only cover a single private repo, so they do not prove that `?repo=` works for public repositories or that the repo filter is actually applied. Add a repo-scoped success case for a public repo in the `anchors_router`/`seed_anchor` test area, and make the assertion check that only anchors for the requested repo are returned (not just the count). Use the existing `anchors_repo_*` test patterns and the `anchors_router` request helpers to locate the right spot.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/gitlawb-node/src/test_support.rs`:
- Around line 1968-2017: The `anchors_repo_denies_anonymous_on_private` and
`anchors_repo_allows_owner` tests only cover a single private repo, so they do
not prove that `?repo=` works for public repositories or that the repo filter is
actually applied. Add a repo-scoped success case for a public repo in the
`anchors_router`/`seed_anchor` test area, and make the assertion check that only
anchors for the requested repo are returned (not just the count). Use the
existing `anchors_repo_*` test patterns and the `anchors_router` request helpers
to locate the right spot.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 13b6010e-7968-4a2e-ae89-754bc8ff29d1
📒 Files selected for processing (1)
crates/gitlawb-node/src/test_support.rs
beardthelion
left a comment
There was a problem hiding this comment.
This closes the anonymous hole in #121 and the ?repo= anchor path is correctly visibility-gated, so the direction is right. But the global listings need to filter on current visibility, not just require authentication, and there is a real leak path that auth alone does not cover. I traced the write and read sides end to end against the merged state.
The index tables are correctly gated on the write side: pinning (repos.rs:1055,:1167, behind withheld.is_some()) and anchoring (repos.rs:1249, behind announce) only run for anonymously-root-readable repos, and object_list excludes withheld blobs. So a repo that is private at push time is never indexed. CodeRabbit's finding is still valid though, by a different mechanism than a missing write gate: the gate is evaluated once at push, visibility is mutable afterward, and nothing reconciles the index.
Findings
-
[P2] Filter the global
/arweave/anchorsand/ipfs/pinslistings on current visibility, not just authentication
crates/gitlawb-node/src/api/arweave.rs:53,crates/gitlawb-node/src/api/ipfs.rs(list_pins)
This confirms and extends CodeRabbit's open finding on the global anchor listing. Requiring authentication does not authorize: identities here are permissionless (optional_signatureverifies a self-made signature,registeris open), so any throwaway DID still reads the global lists. And because index rows are never reconciled when a repo is made private after a public push (noDELETEof either table exists,set_visibilitytouches neither, and both list queries are unfilteredSELECTs), the global lists can serve a now-private repo's slug, owner DID, branch names, commit SHAs, and object CIDs. Content stays gated byGET /ipfs/{cid}(#110/#133), so this is metadata disclosure. The same gap applies to/ipfs/pins, which is not flagged elsewhere. Resolve each row's repo and apply the current-visibility check before returning it (or restrict the global listing to a node-admin capability). Full mechanism and repro in #136. -
[P2] Fix the
?repo=gate-vs-filter representation drift (and the test that hides it)
crates/gitlawb-node/src/api/arweave.rs:51
The gate resolves the repo viaauthorize_repo_read->get_repo, which matches the owner byLIKE(fulldid:key:or bare short form). The result query then filters exactWHERE repo = $1on the raw?repo=string, but anchor rows are written short-form ({owner_short}/{name},repos.rs:1142). So?repo=did:key:zX/nameauthorizes (200) yet returns an empty list; only?repo=zX/namereturns the owner's anchors. It fails safe (returns fewer results, not more), so it is a correctness/usability bug, not a leak. The newanchors_repo_*tests pass only becauseseed_anchorseeds the full-DID form, which production never writes. Normalize the slug from the authorizedRepoRecordbefore querying, and seed the production short form in tests. -
[P3] Run
cargo fmt—test_support.rs:1903failsrustfmt --check
crates/gitlawb-node/src/test_support.rs:1903
A longassert_eq!line needs wrapping; the format gate fails.cargo fmtfixes it.
Notes (non-blocking): the parse-fail branch returns NotFound("repo not found") while the gate path returns RepoNotFound; both 404 and deny is indistinguishable from miss, so no oracle, but returning RepoNotFound on both is more consistent. ?limit=-1 reaches Postgres as LIMIT -1 (pre-existing). The new tests use local routers rather than build_router, so the layering fix that is the core of this PR is not asserted end to end; a test through build_router would lock it. The router refactor itself is correct: the old .merge-after-.layer left /ipfs/pins unsigned, and this fixes it.
The ?repo= gating pattern is the right one. Extending the same current-visibility filtering to the two global listings (per #136) is what completes the #121 fix.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gitlawb-node/src/api/arweave.rs`:
- Around line 36-37: The anonymous global anchor listing path in the arweave API
still allows `list_arweave_anchors(None, limit)` to run when `repo` is absent,
which should be rejected instead. Update the handler in `arweave.rs` around
`caller`/`list_arweave_anchors` so that when no repository is specified and
`auth` yields `None`, the request returns a 401 before any query is executed.
Keep the authenticated and repo-scoped paths unchanged, and make sure the
`list_arweave_anchors` call only happens after a valid caller has been
established.
In `@crates/gitlawb-node/src/api/ipfs.rs`:
- Around line 195-203: The pin listing handler is still continuing when auth is
missing, so anonymous requests get a filtered success response instead of being
rejected. In the IPFS API handler around the caller/auth extraction and
`list_pinned_cids` flow, add an explicit unauthorized check when `caller` is
`None` before loading pins, and return the proper 401 early from this function
instead of proceeding into the database query and filter logic.
- Around line 220-229: The repo selection logic in list_pins is bypassing the
quarantine gate because it calls visibility_check directly instead of the shared
authorization path used by authorize_repo_read. Update the readable-repo
derivation in the loop over repos to preserve the quarantine exclusion before
applying repo-level visibility, preferably by reusing authorize_repo_read or
extracting the quarantine filter into a shared helper referenced by both
list_pins and authorize_repo_read. Ensure the fix is applied around the repo
iteration and visibility_check flow so quarantined repos never contribute CIDs
to the pins index.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e012ef44-6106-4e47-a97e-b4a94070946d
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
crates/gitlawb-node/src/api/arweave.rscrates/gitlawb-node/src/api/ipfs.rscrates/gitlawb-node/src/test_support.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/gitlawb-node/src/test_support.rs
…tion ✓ P3 blocker fixed: cargo fmt applied — the format gate will pass. ✓ P3 cleanup resolved: withheld_blob_oids is still used by replication code in repos.rs, so it stays. • P2 follow-up: Tree/commit disclosure tracked in Gitlawb#135 — out of scope here.
jatmn
left a comment
There was a problem hiding this comment.
I found issues that need to be addressed before this is ready.
Findings
-
[P1] Complete CodeRabbit's request to reject anonymous global listings
crates/gitlawb-node/src/api/arweave.rs:36,crates/gitlawb-node/src/api/ipfs.rs:195
The latest patch still only makes the signature optional and then proceeds whencallerisNone. As a result, anonymousGET /api/v1/arweave/anchorsandGET /api/v1/ipfs/pinsreturn200instead of the401behavior this PR claims and tests for. I verified this with the PR's own focused tests:cargo test -p gitlawb-node pins_list -- --nocapturefails withpins_list_denies_anonymousreturning200instead of401, andcargo test -p gitlawb-node anchors_ -- --nocapturefails withanchors_global_denies_anonymousreturning200instead of401. Please complete the current CodeRabbit requests by returningUnauthorizedbefore querying/filtering whenever the global listing has no authenticated caller. -
[P1] Fix the new endpoint tests so they pass and cover the production data shape
crates/gitlawb-node/src/test_support.rs:1889,crates/gitlawb-node/src/test_support.rs:1960,crates/gitlawb-node/src/test_support.rs:2011
The new tests currently seed rows that the new implementation filters out, so the PR's own coverage is red and does not prove the intended production behavior.pins_list_allows_authenticatedinserts a barepinned_cidsrow but no readable repo/object containing that SHA, whilelist_pinsnow only returns pins whose SHA is found by scanning readable repo object databases, so the count is0instead of1. The anchor tests seedsome/repoor the full-DID slug (did:key:.../repo), but production anchor writes use the short owner slug fromrepos.rs, andlist_anchorsnow normalizes authorized?repo=lookups to that short form, so the owner/global success tests also return count0. Please seed the same repo/object and short-slug shapes that the push/anchor paths actually write, then keep the assertions on the filtered results. -
[P2] Preserve the quarantine gate when deriving readable repos for pins
crates/gitlawb-node/src/api/ipfs.rs:220
authorize_repo_readhides quarantined repos before it applies visibility rules, but the newlist_pinsimplementation bypasses that helper and iterateslist_all_repos()with a directvisibility_check. A quarantined public mirror can therefore still contribute object SHAs toallowed_sha256s, and any matching pinned CID is returned even though the rest of the read surface treats that repo as nonexistent until quarantine is cleared. Please reuseauthorize_repo_readsemantics here or add the sameis_repo_quarantinedexclusion before scanning repo objects.
beardthelion
left a comment
There was a problem hiding this comment.
The core gates verify as resolved on the current head: anonymous global listings return 401 before any query, the pin scan skips quarantined repos, and the tests now seed the production slug and object shapes. The earlier P1/P1#2/P2 all check out on this head. What's left is regression coverage on the new exclusion branches, which currently have no test exercising them.
Findings
-
[P3] Cover the quarantine exclusion in
list_pins
crates/gitlawb-node/src/api/ipfs.rs:235
Theis_repo_quarantinedskip is correct, but nopins_test seeds a quarantined repo, so nothing guards it against regression. Seed a quarantined mirror that shares an object SHA with a pinned CID (and no readable non-quarantined repo carrying that SHA), then assert the pin is withheld. -
[P3] Cover the path-scoped withheld-blob exclusion in
list_pins
crates/gitlawb-node/src/api/ipfs.rs:260
list_pinsbuilds its allowed set withwithheld_blob_oids, a separate construction fromget_by_cid'sallowed_blob_set_for_caller, and nopins_test sets a path-scoped rule, so this subtraction is uncovered. Add a public repo with a/secret/**rule where a withheld blob OID is pinned, and assert an authenticated non-reader's listing excludes that CID while the visible-object pins remain. This is the branch that would leak a private blob SHA through the pin index if the subtraction regresses. -
[P3] Add a negative case for the global authenticated anchor filter
crates/gitlawb-node/src/api/arweave.rs:76
anchors_global_allows_authenticatedonly proves a public anchor is visible; the per-rowauthorize_repo_readfilter has no test that an authenticated non-reader is denied another repo's private anchor. Add that exclusion assertion, mirroringanchors_repo_denies_non_readerfor the?repo=path.
There was a problem hiding this comment.
🧹 Nitpick comments (3)
crates/gitlawb-node/src/test_support.rs (3)
2079-2240: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winDuplicated owner/short-slug boilerplate across the five anchors tests.
anchors_global_allows_authenticated,anchors_global_denies_non_reader,anchors_repo_denies_anonymous_on_private,anchors_repo_allows_owner, andanchors_repo_denies_non_readerall repeat the identicalowner_short/short_slugderivation and repo/anchor seeding sequence. A small shared helper (e.g.,seed_owned_repo_with_anchor(is_public, repo_name) -> (Keypair, String /*owner_did*/, String /*short_slug*/)) would cut the duplication and centralize the short-slug convention these tests depend on.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/test_support.rs` around lines 2079 - 2240, The five anchors tests repeat the same owner DID to short-slug derivation and repo/anchor seeding logic, so factor that setup into a shared helper in test_support.rs and reuse it from anchors_global_allows_authenticated, anchors_global_denies_non_reader, anchors_repo_denies_anonymous_on_private, anchors_repo_allows_owner, and anchors_repo_denies_non_reader. Have the helper create the Keypair, derive owner_short/short_slug, seed either a public or private repo, and insert the anchor so the tests only vary by visibility and request/assertion details.
2146-2177: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winMissing positive-path coverage: anonymous
?repo=access to a public repo's anchors.The
?repo=tests only cover a private repo (deny-anonymous here, allow-owner at 2180-2212, deny-non-reader at 2214-2240). There's no test asserting an anonymous caller can still read anchors for a public repo via?repo=, even though the production comments emphasize that gating must not break legitimate anonymous access to public content. Adding that case would directly validate the PR's stated goal of not over-restricting public repos.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/test_support.rs` around lines 2146 - 2177, Add missing positive-path coverage for anonymous `?repo=` access on a public repo in the `anchors_repo_denies_anonymous_on_private` area. Create a new test alongside the existing `anchors_router` / `list_anchors` cases that seeds a public repo and anchor, calls the `GET /api/v1/arweave/anchors?repo=...` path with `anon_get`, and asserts `StatusCode::OK` (or the expected success response) to verify `?repo=` still allows anonymous reads for public content.
1881-2030: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winConsider extracting shared setup for the three pins tests.
pins_list_allows_authenticated,pins_list_excludes_quarantined_repos, andpins_list_withholds_path_scoped_blobseach repeat the same Keypair/fs_slug/short/seed_cid_reposboilerplate. Extracting a small helper (e.g., returning(Keypair, owner_did, CidFixture)) would reduce duplication and the risk of the three tests silently drifting apart on the owner-DID/slug convention.♻️ Sketch of a shared fixture helper
struct PinFixture { owner: gitlawb_core::identity::Keypair, owner_did: String, fx: CidFixture, } fn seed_pin_owner(bare_names: &[&str]) -> PinFixture { use gitlawb_core::identity::Keypair; let owner = Keypair::generate(); let owner_did = owner.did().to_string(); let fs_slug = owner_did.replace([':', '/'], "_"); let short = owner_did.split(':').next_back().unwrap().to_string(); let fx = seed_cid_repos(&fs_slug, &short, bare_names); PinFixture { owner, owner_did, fx } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/test_support.rs` around lines 1881 - 2030, The three pins tests repeat the same owner setup, slug derivation, and seed_cid_repos call, so extract that boilerplate into a small shared helper to keep them aligned. Add a fixture/helper near pins_list_allows_authenticated, pins_list_excludes_quarantined_repos, and pins_list_withholds_path_scoped_blobs that returns the generated Keypair, owner_did, and CidFixture (or equivalent), and update each test to use it for fs_slug/short and repository seeding. Keep the existing test-specific assertions and repo/quarantine/visibility setup unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/gitlawb-node/src/test_support.rs`:
- Around line 2079-2240: The five anchors tests repeat the same owner DID to
short-slug derivation and repo/anchor seeding logic, so factor that setup into a
shared helper in test_support.rs and reuse it from
anchors_global_allows_authenticated, anchors_global_denies_non_reader,
anchors_repo_denies_anonymous_on_private, anchors_repo_allows_owner, and
anchors_repo_denies_non_reader. Have the helper create the Keypair, derive
owner_short/short_slug, seed either a public or private repo, and insert the
anchor so the tests only vary by visibility and request/assertion details.
- Around line 2146-2177: Add missing positive-path coverage for anonymous
`?repo=` access on a public repo in the
`anchors_repo_denies_anonymous_on_private` area. Create a new test alongside the
existing `anchors_router` / `list_anchors` cases that seeds a public repo and
anchor, calls the `GET /api/v1/arweave/anchors?repo=...` path with `anon_get`,
and asserts `StatusCode::OK` (or the expected success response) to verify
`?repo=` still allows anonymous reads for public content.
- Around line 1881-2030: The three pins tests repeat the same owner setup, slug
derivation, and seed_cid_repos call, so extract that boilerplate into a small
shared helper to keep them aligned. Add a fixture/helper near
pins_list_allows_authenticated, pins_list_excludes_quarantined_repos, and
pins_list_withholds_path_scoped_blobs that returns the generated Keypair,
owner_did, and CidFixture (or equivalent), and update each test to use it for
fs_slug/short and repository seeding. Keep the existing test-specific assertions
and repo/quarantine/visibility setup unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 77f2dac8-2f19-4e9e-bb77-0aa66c200e5f
📒 Files selected for processing (1)
crates/gitlawb-node/src/test_support.rs
bf9d690 to
069804a
Compare
beardthelion
left a comment
There was a problem hiding this comment.
Re-reviewed on 069804a. The delta since my last pass is test-only (test_support.rs, +200/-84); the gates in list_pins and list_anchors are byte-identical to beafa8a where I already verified them, so this pass covers the new coverage.
All three gaps I flagged are closed, and each new test is load-bearing: with its production guard disabled in isolation, only that test fails, with the expected count.
- quarantine skip:
pins_list_excludes_quarantined_repos(0 vs 1) - path-scoped withheld-blob subtraction:
pins_list_withholds_path_scoped_blobs(secret OID leaks, 1 vs 2) - global per-row filter:
anchors_global_denies_non_reader(0 vs 1)
CodeRabbit's three nitpicks on 81a5772 (extract the pins/anchors fixtures, add the anonymous-public anchor case) are all addressed here. The full CI suite did not trigger on this head, so I ran it locally: fmt clean, clippy clean, full gitlawb-node suite green (325 passed).
One optional follow-up, not a blocker: list_pins has its own copy of the fail-closed "withheld walk failed, skip repo" arm with no test, while get_by_cid's copy is covered by ipfs_cid_walk_error_fails_closed.
Approving. @kevincodex1 this clears my earlier changes-requested; jatmn's CHANGES_REQUESTED also predates this head.
|
@kevincodex1 LGTM |
jatmn
left a comment
There was a problem hiding this comment.
I found issues that need to be addressed before this is ready.
Findings
-
[P2] Cover the production router wiring for the signed metadata endpoints
crates/gitlawb-node/src/test_support.rs:1848,crates/gitlawb-node/src/test_support.rs:2032
The new tests prove the handlers work when the test manually builds a tiny router and appliesoptional_signature, but they do not exerciseserver::build_router, which is the production wiring this PR had to fix. That matters here because/api/v1/ipfs/pinswas previously outside the optional-signature layer; ifserver.rsregressed or the route were left in the old.merge(...).layer(...)shape, the signedpins_list_allows_authenticatedtest would still pass while the real route would never attachAuthenticatedDidand would return401for every signed pins request. Please add at least one integration test throughbuild_routerfor the signed success path, especially/api/v1/ipfs/pins, so the route-layer fix is actually pinned. -
[P3] Complete CodeRabbit's request to clamp negative anchor limits
crates/gitlawb-node/src/api/arweave.rs:69
CodeRabbit's earlier negative-limit request is still valid on the current head:q.limit.min(200)caps only the upper bound, so/api/v1/arweave/anchors?limit=-1still passes-1intolist_arweave_anchors, where it is bound directly into PostgreSQLLIMIT. PostgreSQL rejects a negative limit, so a malformed query can turn this listing endpoint into an internal error instead of a bounded empty/small result. Please clamp the lower bound as well, for example withq.limit.clamp(0, 200), before callingstate.db.list_arweave_anchors.
…rsor - Removed fail-open else branch after MAX_WALKS: unverified path-scoped pins are dropped (fail closed) instead of silently included. - Added MAX_BATCHES = 10 cap on keyset scan iterations (200 rows each, 2000 rows max per request) so a single request can't scan the full pinned_cids table when hidden newer pins push visible results far out. - Return next_cursor (pipe-delimited pinned_at|repo|sha256_hex) when the scan bound is reached before filling the page. - Accept incoming cursor via query param for client-side pagination.
jatmn
left a comment
There was a problem hiding this comment.
I found an issue that needs to be addressed before this is ready.
Findings
-
[P2] Do not advance past path-scoped pins that were skipped by the walk budget
crates/gitlawb-node/src/api/ipfs.rs:392
The latest walk cap now fails closed instead of returning unverified path-scoped rows, but rows from path-scoped repos after the first 50 distinct walks are still skipped while the batch cursor advances past them. For example, if the 51st path-scoped repo in the scan contains a caller-visible pin,allowed_blobs_by_repo.len() < MAX_WALKSis false,allowed_blobs_by_repo.get(&repo.id)isNone, and that pin is omitted; because the response cursor is already moved to the end of the scanned batch, the caller cannot recover that row on the next page. Please stop before advancing past unverified path-scoped rows, return an explicit truncation/error state, or use a cursor/work-budget shape that preserves visible rows instead of silently hiding them. -
[P2] Follow the pins cursor in the CLI listing
crates/gl/src/ipfs_cmd.rs:46
/api/v1/ipfs/pinsis now paginated by default (default_limit()is 50 and the handler emitsnext_cursor), butgl ipfs liststill performs a single signed GET with no cursor loop. Any node with more than the first page of visible pins will have older pins omitted from the command even though the subcommand is documented as listing all pinned CIDs, and the output gives no indication that another page exists. Please either page throughnext_cursoruntil exhaustion, or change the CLI/API contract to make the bounded first page explicit to users.
beardthelion
left a comment
There was a problem hiding this comment.
Re-reviewed the current head dd24f3d by execution. The gate still holds: I ran anonymous-401, signed-non-reader deny, owner-allow, quarantine (owner path), negative and oversized ?limit, cross-owner slug-collision, and the v10 migration, all passing, and a fail-open mutation of the repo-read gate goes red as expected. The new MAX_BATCHES cap and the limit == 0 early return close the earlier unbounded-scan and zero-limit items.
The new next_cursor reopens the withheld-metadata channel this PR exists to close. Both P2s below come from one place: cursor is taken from batch.last() at ipfs.rs:377, before the per-pin visibility filter, and then emitted.
Findings
-
[P2] Do not put a withheld blob's id in
next_cursor
crates/gitlawb-node/src/api/ipfs.rs:447
The cursor is set from the last fetched row (ipfs.rs:377) before the loop filters each pin throughallowed_blob_set_for_caller, sonext_cursorcan carry thesha256_hexof a path-scoped-hidden blob. Executed: a signed caller who can read a root-public repo but not its/secret/**subtree getscount: 0, whilenext_cursorcomes back as<pinned_at>|<repo>|<secret_sha>wheresecret_shais the withheld blob's OID. Same CID/SHA-of-hidden-pins class you flagged at 17:58, on a different channel, reachable by any registered DID. -
[P2]
next_cursorskips visible pins across pages
crates/gitlawb-node/src/api/ipfs.rs:377
Same cause, availability side: the cursor advances to the last fetched row, so the next page resumes past rows the caller was never shown. Executed: five visible pins, no path rules, paged atlimit=2followingnext_cursorreturns only two of the five. Broader than the walk-budget rows in your 18:17 note; it drops plainly-visible rows too.
Both close together by deriving next_cursor from the last pin actually pushed to the response (the last visible row), emitted only when the page fills. I verified that change by execution: it turns both cases above green and leaves the deny/allow/owner/quarantine/limit/collision/migration suite green. The one case it does not cover is resuming when a whole batch window is entirely hidden, which needs an explicit truncation flag or an opaque cursor, i.e. the item below.
- [P3] Encode
next_cursoropaquely
crates/gitlawb-node/src/api/ipfs.rs:447
The cursor embeds the RFC3339pinned_at, which contains+00:00. A client echoingnext_cursorback as?cursor=...+00:00has the+decoded to a space, corrupting the keyset boundary. An opaque (base64) token also gives you a clean place to carry batch-cap resume state without exposing row ids.
Your 18:17 "follow the pins cursor in the CLI" item is the client half of this and still stands. Not ready: the leak is a still-open confidentiality item and the skip is a correctness regression, both from the same cursor-anchoring fix.
Superseded: reviewed an earlier head. Current review is on the current head (dd24f3d).
…I pagination - next_cursor is now derived from the last *accepted* (visible) pin, not from batch.last(). This avoids leaking withheld-blob SHA/CID metadata via the cursor and prevents skipping rows the caller was never shown across pages (both P2). - next_cursor is emitted only when the page fills (pins.len() >= max_visible), not unconditionally from the DB cursor. - When MAX_BATCHES is hit before the page fills, emit truncated: true plus a resume cursor (from the last visible row if any, or the last scanned row as a last resort). - Cursor is base64-encoded (URL-safe, no pad) for opacity and to avoid URL-decoding corruption of RFC3339 +00:00 timestamps (P3). - gl ipfs list now loops on next_cursor until exhaustion, collecting and printing all visible pins across pages.
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the changed paths and found issues that still need to be addressed.
Findings
-
[P2] Do not expose hidden-row IDs through the truncated-page cursor
crates/gitlawb-node/src/api/ipfs.rs:489
The new cursor is base64url, but it is still just a reversible encoding ofpinned_at|repo|sha256_hex. When the scan hitsMAX_BATCHESbefore finding any visible pin, the fallback emitsnext_cursorfromdb_cursor, which is the last scanned row rather than an accepted visible row. For a caller whose first 2000 scanned rows are all path-scoped-hidden, the response can therefore returncount: 0,truncated: true, and a decodable cursor containing a withheld blob SHA. Please make the resume token genuinely opaque or server-side, or return a truncation/error state that does not encode hidden row metadata. -
[P3] Report the full paginated pin count in node status
crates/gl/src/node.rs:191
/api/v1/ipfs/pinsis now paginated by default, and the handler'scountis the number of pins in the current page.gl node statusstill makes one signed request and prints that first-page count asPinned CIDs, so a node with more than the first page of visible pins will be underreported as 50 or fewer. Please either follownext_cursoruntil exhaustion, asgl ipfs listnow does, or change the status text/API contract so it does not present a page count as the total pinned CID count. -
[P2] Preserve pre-existing non-tip pins during the v10 ownership migration
crates/gitlawb-node/src/db/mod.rs:863
The v10 backfill only assignsrepo/owner_didwhen a pinned row'scidmatchesbranch_cids, then falls every other existing pin back to empty strings. Those rows are then invisible to the new/api/v1/ipfs/pinsquery, which only asks for real readable(repo, owner_did)pairs. Since the pin table was populated from the full pushed object list rather than only current branch-tip CIDs, upgraded nodes can silently lose most historical object pins from the listing even when the repo is still readable. Please backfill from a source that preserves the original repo for pinned objects, or keep an explicit compatibility path for legacy rows instead of migrating them into an ownerless bucket that the endpoint never returns. -
[P3] Expose the identity directory for signed IPFS pin reads
crates/gl/src/ipfs_cmd.rs:20
gl ipfs listnow requires a keypair because the endpoint returns 401 for anonymous callers, andgl node statussigns the pins panel when a keypair is available. Both code paths only load the default identity directory, even though the helper functions already accept adirargument and most signedglsubcommands expose--dir. Users who keep their active identity outside the default location cannot list pins, andnode statusreports that identity is required even though there is no flag to supply it. Please thread a--diroption through these signed read paths so the new auth gate remains usable for the same identity layouts as the rest of the CLI.
beardthelion
left a comment
There was a problem hiding this comment.
Re-reviewed on e2addc0. Splitting response_cursor from db_cursor is the right direction and closes the leak on the page-filled and truncated-with-visible paths; I paged that path end to end and it is clean (the cursor derives from the last returned pin, no skip or dup across pages, terminates correctly). One branch still leaks, so this is not ready yet.
Findings
-
[P1] Do not seed
next_cursorfromdb_cursoron a truncated all-hidden page
crates/gitlawb-node/src/api/ipfs.rs:488
When a scan reachesMAX_BATCHESwith no visible pins,response_cursorisNoneand the code falls back todb_cursor, the last fetched row, which is a withheld pin, base64-encoded. I drove this end to end: a caller who passes the repo-level gate for a repo with a/secret/**rule, scanning a hidden-only stretch, getstruncated:trueand anext_cursorthat base64-decodes straight back to a withheld pin'spinned_at|repo|sha256_hex.URL_SAFE_NO_PADis an encoding, not opacity, so the withheld content hash is recoverable, which is the exact disclosure #121/#136 defend against. It does not need a large hidden set:batch_countincrements before the fetch, so a scan that finishes right on the 1800-row batch boundary is still flaggedtruncatedand still emits the staledb_cursor, leaking even when there is no further page. Reachable by any authenticated caller who can read such a repo at the repo level. -
[P1 fix] Make the resume point an opaque server-side token, not the row's own columns
Just omitting the cursor here stops the leak (I confirmed: the page then returnstruncated:truewith nonext_cursor), but on its own it starves a legitimately visible row that sits past a hidden stretch longer than the scan bound: the caller getstruncatedwith no cursor and can never advance to it. If the endpoint should stay live through hidden spans, the scan-position resume needs to be a genuinely opaque token (encrypted, or server-side state), never a base64 of the row tuple and never HMAC-signed plaintext. Keep deriving the visiblenext_cursorfrom the last emitted row as you do now. -
[P2] Bound the CLI pagination loop
crates/gl/src/ipfs_cmd.rs:49
gl ipfs listfollowsnext_cursorwith no page cap and no check that the cursor advanced. Against an honest node it terminates (the keyset scan is strictly monotone), but the node URL is caller-supplied, and a hostile or misconfigured node returning a constantnext_cursorloops the client forever whileall_pinsgrows in memory (it is only printed after the loop). An honest but large result set also grows client memory without bound. Add a page cap, break if the returned cursor equals the previous one, and consider streaming the output. -
[P3] Add a cursor guard test; the fix currently has none
crates/gitlawb-node/src/api/ipfs.rs
There is no test fornext_cursor,truncated, or the base64 round-trip anywhere, which is how the P1 branch regressed with nothing to catch it. Two cases belong here: seed a path-scoped repo with a hidden-only stretch past the scan bound and assert the emittednext_cursordoes not base64-decode to any withheldsha256_hex; and a visible-pagination case asserting the cursor equals the last returned row with no skip or dup across pages.
The anonymous reject, the quarantine and dedup gates, the limit==0/negative clamp, and the per-page status check are all intact on this head.
sqlx prepared statements do not support semicolon-delimited multi-statement queries. Split into three separate execute calls.
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the changed paths and found issues that still need to be addressed.
Findings
-
[P2] Do not expose unmapped legacy pins as public
crates/gitlawb-node/src/api/ipfs.rs:288
v10 falls back torepo = ''/owner_did = ''for pre-existing pinned rows that cannot be mapped throughbranch_cids, andlist_pinsthen unconditionally adds that empty pair to every signed caller's readable set. Those rows no longer have a repo context, so they bypass the current-visibility filtering this PR is adding; a pin created while a repo was public, or an ambiguous/non-tip historical object from a repo later made private, is returned to any authenticated DID as an "assumed public" legacy pin. Please do not treat unmapped legacy rows as globally readable: either backfill enough repo context to apply the same visibility gate, hide/expire the unmapped rows, or expose them only through an explicitly privileged compatibility path. -
[P2] Keep paginating past hidden-only scan windows without silently ending the list
crates/gitlawb-node/src/api/ipfs.rs:501
When the scan reachesMAX_BATCHESbefore accepting any visible pin, the handler now returnstruncated: truewith nonext_cursor; the new cursor guard test even seeds an older visible pin behind the 2005 hidden rows and asserts this no-cursor response. That avoids leaking the hiddendb_cursor, but it also makes the older visible pin unreachable, and bothgl ipfs listandgl node statusstop as soon asnext_cursoris absent, so users get an incomplete listing/count with no recovery path. Please either provide a non-leaking resume token for the scan position, or make truncation an explicit client-visible error/state that is not treated as end-of-list.
P2: Remove the legacy-pin empty-repo compatibility path that exposed unmapped legacy pinned_cids rows as unconditionally readable to every authenticated caller. Legacy rows with empty repo/owner_did are now invisible (they don't match the UNNEST pair filter). P2: Add a non-leaking truncated_cursor resume mechanism. When the keyset scan hits MAX_BATCHES before accepting any visible pin (all hidden rows), the response now includes a truncated_cursor that encodes only the pinned_at timestamp (no repo or SHA). The caller can echo this back as ?truncated_cursor=<token> to resume scanning past the hidden window. The SQL query for truncated_cursor uses a (pinned_at < ) condition that avoids passing any hidden-blob identifier into the query.
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the changed paths and found issues that still need to be addressed.
Findings
-
[P2] Follow
truncated_cursorin the GL pin readers
crates/gl/src/ipfs_cmd.rs:81
The server now returnstruncated: trueplustruncated_cursorwhen it scans a hidden-only window without accepting a visible pin, and the new cursor guard test proves callers must echo that token to reach an older visible pin behind the hidden span.gl ipfs listandgl node statusstill only readnext_cursorand stop as soon as it is absent, so they treat this recoverable truncated page as end-of-list. That means the CLI can still omit visible pins and undercountPinned CIDsin exactly the hidden-window case this update is trying to make recoverable. Please handletruncated_cursorin both pagination loops, or surface truncation as an explicit incomplete/error state instead of ending the listing. -
[P2] Preserve the full keyset when resuming from truncated pages
crates/gitlawb-node/src/db/mod.rs:2334
The truncated resume query drops the(repo, sha256_hex)tie-breakers and resumes with onlypinned_at < $3, while the real scan order is(pinned_at DESC, repo DESC, sha256_hex DESC). If the last hidden row scanned at the batch cap shares itspinned_atwith later rows in that same ordered timestamp group, the resume skips every remaining row with that timestamp, including visible pins. The PR's own tests seed many rows with identicalpinned_at, so this is a supported shape rather than an impossible state. Please make the non-leaking resume token preserve a total-order position, or otherwise avoid skipping same-timestamp rows when resuming after a truncated hidden window. -
[P2] Do not expose hidden pin timestamps through
truncated_cursor
crates/gitlawb-node/src/api/ipfs.rs:512
The all-hidden truncated path now avoids putting the hidden row's repo and SHA in the cursor, but it still returnstruncated: trueand a base64url token that decodes directly to the last scanned hidden row's exactpinned_at.pinned_atis part of the/ipfs/pinsmetadata this PR is gating, and the response also tells the caller they just crossed a hidden-only window. A caller who can read the repo root but not a withheld subtree can therefore still learn timing/existence information about hidden pins. Please make this resume token genuinely opaque, or return a truncation/error state that does not encode metadata from hidden rows.
…n CLI P2: Replace pinned_at-in-base64 truncated_cursor with a genuinely opaque token. The server stores (pinned_at, repo, sha256_hex) under a random UUID in an in-memory Mutex<HashMap> and returns only the UUID. Callers echo it back via ?truncated_cursor= and the server looks up the full cursor — no hidden-row metadata is ever exposed (no pinned_at, repo, or sha256_hex leaks through the wire). Entries TTL at 5 minutes. P2: Both gl ipfs list (ipfs_cmd.rs) and gl node status (node.rs) now follow truncated_cursor when next_cursor is absent and truncated is true, enabling recovery past hidden-only scan windows. P2: With the full DB cursor preserved in the cache, the resume query uses the complete (pinned_at, repo, sha256_hex) 3-tuple, so no rows are skipped due to same-timestamp grouping.
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the changed paths again, including the latest hidden-window pagination changes, and found issues that still need to be addressed.
Findings
-
[P2] Do not advance past path-scoped pins after the walk budget is exhausted
crates/gitlawb-node/src/api/ipfs.rs:424
The current loop still treats every path-scoped repo after the first 50 distinct allowed-blob walks as invisible: onceallowed_blobs_by_repo.len() < MAX_WALKSis false, no entry is inserted for the new repo, and the laterallowed_blobs_by_repo.get(&repo.id)check fails. Becausedb_cursoris already advanced tobatch.last()before filtering, a caller-visible pin in the 51st path-scoped repo is skipped and cannot be recovered by followingnext_cursorortruncated_cursor. Please stop before advancing past unverified path-scoped rows, return an explicit truncation/error state for this budget, or carry a resume shape that preserves those rows instead of silently dropping them. -
[P2] Do not consume truncated cursors in a way that makes GET pagination non-retryable
crates/gitlawb-node/src/state.rs:113
take_truncated_cursorremoves the opaque hidden-window token on the first lookup, andlist_pinssilently falls back toNonewhen that token is unknown or expired. If the request usingtruncated_cursoris retried after a dropped response, timeout, client reconnect, or intermediate retry, the same GET token no longer resumes from the stored DB position; the handler restarts at the beginning and the client has no signal that the cursor was consumed. Pagination tokens need to be safe to retry, or invalid/expired tokens need to produce an explicit error/incomplete state rather than silently returning the first page again. -
[P2] Treat stalled IPFS pagination as incomplete instead of returning partial success
crates/gl/src/ipfs_cmd.rs:57
crates/gl/src/node.rs:201
The new GL readers follownext_cursorandtruncated_cursor, but they no longer check that the returned cursor actually advances. A misconfigured or hostile node can return the same cursor forever;gl ipfs listwill make up to 1000 requests, print the accumulated pins as a successful listing after only a warning, andgl node statussilently breaks at the cap and reports the partialPinned CIDstotal as if it were complete. Since the node URL is user-controlled and the existing Arweave recovery path already guards non-advancing cursors, please restore an advancement guard for both cursor types and make the page cap an error/incomplete result rather than a successful count/listing. -
[P2] Preserve legacy non-tip pins instead of migrating them out of the listing
crates/gitlawb-node/src/db/mod.rs:863
The v10 backfill still only assignsrepo/owner_didwhen an oldpinned_cids.cidmatchesbranch_cids, then falls every other pre-existing pin back to empty strings. The currentlist_pinsno longer exposes that empty pair, which fixes the public-leak variant, but it also means upgraded nodes silently lose historical/non-tip object pins from/api/v1/ipfs/pins,gl ipfs list, and thePinned CIDsstatus count even when the repo is still readable. Since pins were recorded from pushed object lists rather than only current branch tips, please either backfill enough repo context to keep applying the visibility gate to these rows, or make the legacy-unmapped state explicit instead of moving those pins into an ownerless bucket that the endpoint never queries.
… guard, legacy pins P2: Walk budget exhaustion now signals truncated instead of silently dropping rows. db_cursor is advanced per-pin (not per-batch). When a new path-scoped repo exceeds MAX_WALKS, the loop breaks without advancing past that pin so the caller can retry with a fresh walk cache. P2: truncated_cursor is now retry-safe — the server stores the full (pinned_at, repo, sha256_hex) under a random UUID in an in-memory cache and never removes entries on lookup. Expired/unknown tokens simply don't match (fall back to start). P2: CLI advancement guard — gl ipfs list and gl node status now detect non-advancing cursors and error instead of silently looping to MAX_PAGES or returning partial counts. P2: Re-added legacy-pin compat path with explicit log warnings. Pre-Gitlawb#134 pins with empty repo/owner_did are returned to any authenticated caller with a tracing::warn! so operators can backfill repo context.
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the changed paths and found an issue that still needs to be addressed.
Findings
-
[P2] Do not return unmapped legacy pins to every signed caller
crates/gitlawb-node/src/api/ipfs.rs:294
The v10 migration still maps old pinned rows that cannot be matched throughbranch_cidstorepo = ''/owner_did = '', andlist_pinsnow deliberately adds that empty pair to every signed caller's readable query set. Those rows then take thepin.repo.is_empty()branch and are returned without any repo visibility context, so a pin created while a repo was public, or an ambiguous/non-tip historical object from a repo later made private, is exposed to any authenticated DID. Please complete the earlier legacy-pin review item by not treating unmapped rows as globally readable: backfill enough repo context to apply the same visibility gate, hide/expire the unmapped rows, or expose them only through an explicitly privileged compatibility path. -
[P2] Make truncated pin cursors durable or reject invalid tokens explicitly
crates/gitlawb-node/src/api/ipfs.rs:347
The newtruncated_cursoris stored only inAppState.truncated_cursors, andget_truncated_cursorreturnsNonewhen the token is missing, expired, generated before a process restart, or routed to another machine.list_pinsthen falls back toNoneviatruncated_resume.or(initial_cursor), so a client trying to resume past a hidden-only scan window silently restarts at the first page instead of getting an error or incomplete state. That makes the new GET pagination token non-durable and non-retryable in normal deployment/restart paths, andgl ipfs list/gl node statuscan duplicate or miscount pages rather than knowing the listing is incomplete. Please either make the truncated resume token self-contained/durable enough to work across retries and node instances, or return an explicit invalid/expired cursor error instead of treating it as an absent cursor.
* fix(gl,ipfs): sign gl ipfs list so it works under the #134 pins auth gate /api/v1/ipfs/pins now 401s anonymous callers (#134). Load the caller's identity (--dir, default keystore) and use NodeClient::get_signed; when no identity exists, propagate load_keypair_from_dir's existing 'gl identity new' error instead of a raw 401. Tests: signed-headers present, empty, and no-identity-issues-no-request (mockito). * fix(gl,node): sign the gl node status pins panel, degrade gracefully The pins panel signs its /api/v1/ipfs/pins read (#134 gates it behind auth) via an injectable fetch_pins helper with four states: pins, empty, unavailable (signed read rejected/errored), and needs-identity (no keypair, no request issued). cmd_status loads the identity gracefully so a missing keystore never aborts status; peers/repos/p2p/events panels stay anonymous. Tests drive fetch_pins directly against a mock node. * fix(gl): address review — surface pins HTTP errors, correct empty-panel label - gl ipfs list: check the /ipfs/pins response status before parsing, so a rejected signed read (e.g. 401) surfaces as an error instead of silently printing 'No IPFS pins recorded'. Matches the sibling gl ipfs get. - gl node status: PinsPanel::Empty now renders 'Pinned CIDs: 0' (a reachable node with zero pins), reserving 'unavailable' for the failure state; the prior label reused 'IPFS not configured' and misreported the zero-pins case. - Carry the resolved count in PinsPanel::Pins, removing the duplicated count-fallback closure between fetch_pins and the render arm. * test(gl,node): cover fetch_pins error branches and pins-panel rendering Extract the pins-panel render into a testable pins_status_line helper and assert all four states' output (closes the untested cmd_status render path, including the empty->'Pinned CIDs: 0' label). Add fetch_pins tests for the two error branches that were code-traced but unexecuted: a malformed 200 body and a transport error both degrade to Unavailable without panicking. * style(gl): rustfmt the ipfs list pins-response parse * fix(gl,node): honor --dir identity selector in `gl node status` (#146) `gl node status` always loaded the default keystore via `load_keypair_from_dir(None)`, so a user who created or selected an identity with `--dir` saw the status pins panel render "sign in to view" even though `gl ipfs list --dir` authenticated fine. Add a `--dir` option to `NodeCmd::Status` and thread it into the signed pins fetch, so both CLI pins callers select the same identity directory. An explicit `--dir` that can't be loaded (missing, corrupt, or unreadable) no longer masquerades as "no identity, run `gl identity new`": the pins identity is resolved up front into Keyed / Anonymous / DirUnusable, and the unusable explicit-dir case renders a distinct "no usable identity in <dir>" panel line that names the path, while a missing default keystore (no `--dir`) still degrades quietly. A pins failure never aborts the dashboard. Tests cover both flag branches (Some/None) and the resolver both ways: a good `--dir` loads Keyed, an unusable one reports DirUnusable/IdentityError without issuing a request. The new symbols were absent before, so they compile-RED first. * test(gl,node): pin the pins-identity resolver to the --dir key Follow-up test hardening on the #146 --dir wiring after review: - resolve_pins_auth_loads_keyed_identity_from_explicit_dir now destructures Keyed and asserts loaded.did() equals the --dir key's DID. A bare Keyed(_) match passed even when the load fell through to the default keystore; the DID assertion catches that regression (confirmed red under a simulated fallthrough, green on the real code). - add resolve_pins_auth_none_is_never_dir_unusable: an absent --dir must degrade to Anonymous or Keyed, never DirUnusable. It asserts only the None to DirUnusable mapping, so it is stable whether or not the test runner has a default ~/.gitlawb identity. Test module only; no production behavior change. --------- Co-authored-by: t <t@t>
Superseded by a re-review of the current head (9087e8c); the findings there are re-verified and updated below.
beardthelion
left a comment
There was a problem hiding this comment.
Re-reviewed the current head (9087e8c) by execution. Concurring with jatmn's two open findings, and raising the legacy-pin one to P1 after reproducing the leak. Separately, the two INV-9 items the bots left unresolved (the next_back anchor slug and the normalize_owner_did duplicate) are already fixed on this head, so no action is needed there. Two blockers remain.
Findings
-
[P1] Do not return unmapped legacy pins to any signed caller
crates/gitlawb-node/src/api/ipfs.rs:294,crates/gitlawb-node/src/api/ipfs.rs:416
This is jatmn's open finding, and it reproduces as a live leak.list_pinspushes the empty("", "")pair into every signed caller's readable set unconditionally, and theif pin.repo.is_empty()branch returns those rows with only atracing::warn, no visibility decision. I seeded a private repo plus one v10-backfilledrepo=''/owner_did=''pin and drovelist_pinsas a freshly generated stranger keypair: the response was200withcount:1carrying the pin'ssha256_hex, the object id of a withheld blob. Identities are permissionless, so any signed DID is effectively public, which puts this at P1 rather than P2. Drop the empty pair from the query set (serve nothing until the row is backfilled) or restrict the legacy branch to an operator identity; a warn log is not an access control. The/arweave/anchorstwin is not affected:list_anchorsbuilds its query set only from readable repos and there is norepo=''backfill forarweave_anchors. -
[P2] Bound and make durable the truncated pin cursor, or drop the server-side map
crates/gitlawb-node/src/state.rs:101,crates/gitlawb-node/src/api/ipfs.rs:536
truncated_cursorsis an in-processArc<Mutex<HashMap>>with a 300s TTL and no size or per-caller cap, andadd_truncated_cursorruns an O(n)retainover the whole map on every insert under a single mutex. On the all-hidden pagination path any signed DID mints an entry per request, so the map grows one entry per request until the TTL catches up, the retain turns into O(n squared) contention that blocks every cursor user rather than just the abuser, and expired entries are evicted only on the next insert, not on read. The same map is also the durability gap jatmn raised: an unknown, expired, post-restart, or cross-node token falls back toNoneand silently restarts pagination at page 1. Both problems go away if the truncated cursor is a self-contained opaque token (encode the scan position with an expiry) instead of server-side state, which is how the visiblenext_cursoris already emitted. -
[P3] Rebase onto current main before merge
This branch's base predates #143, and both touchdb/mod.rs, so the rebase will conflict there. Re-review the resolved diff before merging.
Closes #121
Summary
Two node-wide metadata endpoints were served without authentication or visibility filtering, leaking private-repo metadata.
Changes
/api/v1/ipfs/pins(list_pins):optional_signaturemiddleware to the route401 Unauthorizedfor anonymous callers, stopping anonymous node-wide CID enumeration/api/v1/arweave/anchors(list_anchors):optional_signaturemiddleware to the route?repo=<owner>/<name>is provided: gates onauthorize_repo_read(deny → 404), preventing anchor metadata leaks for private repos?repo=: requires authentication for the global anchor listingTesting
Summary by CodeRabbit
?repo=<owner>/<name>is strictly validated and unauthorized repos are hidden; results are capped (limit ≤ 200) and returned with correctcount.401(anonymous) and200/404visibility behavior for both global and?repo=scopes.