fix(node): gate fork_repo on per-caller path-scoped visibility (#98)#109
Conversation
fork_repo authorized the caller only at the repo root "/" and then git clone --mirror'd the full source, discarding the visibility rules. A non-owner with root read but denied a path-scoped subtree could fork the full mirror and obtain blobs the filtered git_upload_pack path withholds. Refuse the fork when the caller has any withheld glob (per-caller withheld_globs non-empty), the same decision the read path makes; owners and authorized subtree readers are unaffected. Fail closed with a not-found response, before repo_store.acquire, so no withheld object is materialized. Add unit coverage for the policy and a direct-mount integration test pinning the gate wiring.
…globs Address code-review findings on the #98 fork gate: - Document that fork_withheld_blocks's prefix-probe equivalence with the serve path depends on validate_path_glob keeping "/" the only whole-repo scope, mirroring has_path_scoped_rule's caveat (latent-coupling guard). - Import withheld_globs and call it unqualified, consistent with the sibling visibility_check import in the same module.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthrough
Fork withheld-subtree gate
Sequence Diagram(s)sequenceDiagram
participant Caller
participant fork_repo
participant authorize_repo_read
participant fork_withheld_blocks
participant withheld_globs
Caller->>fork_repo: POST /api/v1/repos/{owner}/fork-repo/fork
fork_repo->>authorize_repo_read: caller DID, repo path "/"
authorize_repo_read-->>fork_repo: (source_repo, visibility_rules)
fork_repo->>fork_withheld_blocks: (caller_did, visibility_rules)
fork_withheld_blocks->>withheld_globs: evaluate path-scoped rules
withheld_globs-->>fork_withheld_blocks: non-empty withheld set
fork_withheld_blocks-->>fork_repo: true
fork_repo-->>Caller: 404 RepoNotFound (no clone performed)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the contribution. I do not see any actionable issues from my review.
@kevincodex1 LGTM
Closes #98.
fork_repoauthorized the caller only at the repo root/and thengit clone --mirrord the whole source, discarding the visibility rules it had just fetched. A non-owner with root read but a denied path-scoped subtree could fork the full mirror and obtain blobs the filteredgit_upload_packpath withholds.Fix
Refuse the fork when the caller has any withheld path in the source, the same per-caller decision the read path makes:
fork_withheld_blocks(rules, is_public, owner_did, caller)returns true (refuse) iffwithheld_globs(...)is non-empty for that caller. It delegates to the existingvisibility_check/withheld_globsseam, so the owner bypass (full and short DID) andreader_didsgrants are inherited from the read path rather than reimplemented, and the two cannot drift.repo_store.acquire, so no withheld object is materialized on disk, and fails closed withRepoNotFound(mirroringauthorize_repo_reads Deny) so a withheld subtrees existence is not leaked.Owners, fully-authorized readers, and forks of repos with no path-scoped withholding for the caller are unaffected.
The predicate is a conservative over-approximation of the read paths object-level withholding: never weaker (so fork cannot leak), and marginally stricter only in the duplicate/co-located-blob case. A filtered fork projection (serve readers a fork minus the withheld subtree) is the larger follow-up and is intentionally out of scope.
Tests
reader_didof that subtree (must still be allowed to fork), root-only rule, no rules, mixed root+subtree, and the partial-reader case.#[sqlx::test]integration test that drives the handler: a public repo with a/secret/**rule excluding the caller returns 404 and creates no fork row. Because the gate precedesacquire, it needs no on-disk source.Scope note
While mapping every content-egress path during review, I found a separate, pre-existing leak in
GET /ipfs/{cid}(api/ipfs.rsget_by_cid): it returns any git object by raw SHA-256 across all repos with no visibility check, which is the same bypass class on a different surface. It is out of scope here and filed separately.Summary by CodeRabbit