Skip to content

feat(enrichment): typosquat & dependency-confusion analyzer#1704

Merged
JSONbored merged 1 commit into
JSONbored:mainfrom
nickmopen:feat/enrichment-typosquat
Jun 29, 2026
Merged

feat(enrichment): typosquat & dependency-confusion analyzer#1704
JSONbored merged 1 commit into
JSONbored:mainfrom
nickmopen:feat/enrichment-typosquat

Conversation

@nickmopen

@nickmopen nickmopen commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds a REES analyzer (#1501) that flags two supply-chain risks for the dependencies a PR newly adds — the heavy/external name analysis the no-checkout claude --print reviewer cannot do. Additive + fail-safe: it fills its own typosquat findings key and degrades independently like every other analyzer.

Detects (unscoped names only)

  • Typosquat — a newly-added unscoped dep whose name is a near-miss of a bundled popular package:
    • homoglyph / separator variant (canonical-form match, e.g. l0dash, lo-dashlodash),
    • distance-1 Damerau-Levenshtein edit (e.g. expresexpress), with a short-name guard and a known-legitimate near-neighbour allowlist (e.g. preact is not flagged as a near-miss of react). Pure + offline.
  • Dependency-confusion — an unscoped name that returns a definitive 404 on the public registry is publicly claimable (namespace-takeover). Transient/unknown registry states fail safe (no finding).

Scope note — scoped packages are intentionally not analyzed. An npm scope is namespace-protected (it cannot be published to without owning it), so a popular tail under any scope — @types/react, @chakra-ui/react, @acme/express — is not claimable impersonation. There is no offline source of truth that distinguishes a legitimate scoped package from a hypothetical malicious one, so classifying scoped names would only manufacture false positives. The analyzer therefore restricts to the unscoped flat namespace, which is the real typosquat/confusion surface. (This deliberately narrows the issue's "scope-swap" idea to keep the analyzer precise.)

Implementation (established review-enrichment/ pattern)

  1. TyposquatFinding type + typosquat? key in src/types.ts
  2. src/analyzers/typosquat.ts — pure helpers (damerauLevenshtein, canonicalize, classifyTyposquat, isPublished) + scanTyposquat(req, fetch, opts); reuses extractDependencyChanges to read added (not bumped) deps
  3. Registered in the src/brief.ts ANALYZERS registry
  4. Public-safe block in src/render.ts (renders package@version + reason only — never manifest contents)
  5. node:test units in a separate test/typosquat.test.ts (avoids colliding with concurrent analyzer PRs on enrichment.test.ts)

Validation

From review-enrichment/ (Node 24):

npm test   # build + node --test: 132 pass / 0 fail

Covers edit-distance/transposition/bounded early-exit, homoglyph + separator canonicalization, short-name guard, known-legit exemption, scoped-name non-classification (incl. @chakra-ui/react), registry 404/200/5xx/throw/abort, version-bump-ignored, scoped-name skipped, fail-safe, and the rendered public-safe block.

Closes #1501

@nickmopen nickmopen requested a review from JSONbored as a code owner June 29, 2026 02:40
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 29, 2026
@superagent-security superagent-security Bot removed the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 29, 2026
@gittensory-orb

gittensory-orb Bot commented Jun 29, 2026

Copy link
Copy Markdown

Tip

🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩

✅ Gittensory review — safe to merge

5 files · 1 AI reviewer · no blockers · readiness 55/100 · CI green · clean

✅ Approved — safe to merge

Review summary
The change adds a new enrichment analyzer for newly added dependency names, wires it into the analyzer registry, extends the structured finding type, and renders the resulting typosquat/confusion findings. The implementation is mostly coherent and fail-safe: scoped packages are explicitly excluded before near-miss and registry checks, registry uncertainty returns no finding, and the new tests cover the main pure and scan paths. The notable rough edge is documentation/test coverage around limits and stale wording, not a must-fix correctness break in the shown code.

Signal Result Evidence
Code review ✅ No blockers 1 reviewer
Linked issue ✅ Linked #1501
Related work ⚠️ 3 scoped overlaps Top overlaps are listed below; lower-confidence bulk is hidden.
Review load ❌ 8/20 Readiness component derived from cached public PR metadata and labels; size label size:L.
Validation evidence ❌ 5/25 Cached preflight status is hold.
Open PR queue ❌ 3/10 15 open PR(s), 9 likely reviewable, 6 unlinked.
Contributor context ✅ Confirmed Gittensor contributor nickmopen; Gittensor profile; 50 PR(s), 1 issue(s).
Gate result ✅ Passing No configured blocker found.
Nits — 6 non-blocking
  • nit: review-enrichment/src/analyzers/typosquat.ts:2 still says the analyzer detects `scope-swap`, but review-enrichment/src/analyzers/typosquat.ts:113 intentionally returns `null` for every scoped name, so the file header should be changed to match the implemented contract.
  • nit: review-enrichment/test/typosquat.test.ts does not exercise the `maxDeps` or `maxConfusionQueries` limit branches in review-enrichment/src/analyzers/typosquat.ts:170-171 and review-enrichment/src/analyzers/typosquat.ts:193, which are part of the new analyzer behavior.
  • nit: review-enrichment/src/render.ts:226 appends `item.reason` directly into the prompt text; today those reasons are internally generated, but keeping rendered analyzer strings consistently escaped would make this path harder to misuse later.
  • Update the analyzer header in review-enrichment/src/analyzers/typosquat.ts:1-4 to remove `scope-swap` and state the same unscoped-only scope used by `classifyTyposquat`.
  • Add focused tests in review-enrichment/test/typosquat.test.ts for `limits.maxDeps` and `limits.maxConfusionQueries`, including the skipped-query path after the cap is reached.
  • Readiness score is below the configured threshold — Use the readiness panel as advisory maintainer context; the score does not block this PR.
Review context
Contributor next steps
  • Review top overlaps.
  • Add scope summary.
  • Fix blocker.
  • Expect slower review.
  • Refresh registry data or choose a registered active repo.
  • Check active issues and PRs before submitting.
Signal definitions
  • Related work = same linked issue, overlapping active PRs, or title/path similarity.
  • Review load = cached public PR metadata such as size labels, changed paths, and preflight status.
  • Open PR queue = repo-wide review pressure; it is not a PR quality failure.
  • Contributor context = public GitHub/Gittensor identity context; non-Gittensor status is not a blocker.
Review details

Generated from public PR metadata and the diff. Advisory only; deterministic signals remain authoritative.

The change adds a new enrichment analyzer for newly added dependency names, wires it into the analyzer registry, extends the structured finding type, and renders the resulting typosquat/confusion findings. The implementation is mostly coherent and fail-safe: scoped packages are explicitly excluded before near-miss and registry checks, registry uncertainty returns no finding, and the new tests cover the main pure and scan paths. The notable rough edge is documentation/test coverage around limits and stale wording, not a must-fix correctness break in the shown code.

Nits (5)

  • nit: review-enrichment/src/analyzers/typosquat.ts:2 still says the analyzer detects `scope-swap`, but review-enrichment/src/analyzers/typosquat.ts:113 intentionally returns `null` for every scoped name, so the file header should be changed to match the implemented contract.
  • nit: review-enrichment/test/typosquat.test.ts does not exercise the `maxDeps` or `maxConfusionQueries` limit branches in review-enrichment/src/analyzers/typosquat.ts:170-171 and review-enrichment/src/analyzers/typosquat.ts:193, which are part of the new analyzer behavior.
  • nit: review-enrichment/src/render.ts:226 appends `item.reason` directly into the prompt text; today those reasons are internally generated, but keeping rendered analyzer strings consistently escaped would make this path harder to misuse later.
  • Update the analyzer header in review-enrichment/src/analyzers/typosquat.ts:1-4 to remove `scope-swap` and state the same unscoped-only scope used by `classifyTyposquat`.
  • Add focused tests in review-enrichment/test/typosquat.test.ts for `limits.maxDeps` and `limits.maxConfusionQueries`, including the skipped-query path after the cap is reached.

🟩 Safe / merged · 🟦 Advisory · 🟨 Held for review · 🟥 Blocked / closed


💰 Earn for open-source contributions like this. Gittensor lets GitHub contributors earn for the work they already do — register to start earning →.

Checked by Gittensory, a quiet PR intelligence layer for OSS maintainers.

  • Re-run Gittensory review

@gittensory-orb gittensory-orb Bot added gittensor Gittensor contributor context gittensor:feature Gittensor-scored feature linked to a feature issue - worth 1.25x multiplier. labels Jun 29, 2026
@nickmopen nickmopen force-pushed the feat/enrichment-typosquat branch from fa10480 to 8f9ef67 Compare June 29, 2026 05:03
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 29, 2026
@nickmopen nickmopen force-pushed the feat/enrichment-typosquat branch 2 times, most recently from dcce39a to c5ea5d7 Compare June 29, 2026 05:20
Add a REES analyzer that flags two supply-chain risks for the dependencies a PR newly adds, which the
no-checkout in-prompt reviewer cannot detect:

- Typosquat: an UNSCOPED name that is a near-miss of a bundled popular package — a homoglyph/separator
  variant (canonical-form match) or a distance-1 Damerau-Levenshtein edit (with a short-name guard).
  Scoped names (@scope/x) are namespace-protected and never classified, so a popular tail under any
  scope (@types/react, @acme/express) is not a false positive. Pure, offline.
- Dependency-confusion: an unscoped name that returns a definitive 404 on the public registry is
  publicly claimable. Uses an injected fetch; transient/unknown states fail safe (no finding).

A known-legitimate near-neighbour allowlist (e.g. preact vs react) is exempt from name flagging.

Reuses extractDependencyChanges to read added (not bumped) deps from the diff. Registered in the brief
orchestrator and rendered as a public-safe block (package@version + reason only). Tests live in their
own file to avoid colliding with concurrent analyzer work on enrichment.test.ts.

Closes JSONbored#1501
@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Jun 29, 2026
@JSONbored JSONbored merged commit 66bdce1 into JSONbored:main Jun 29, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gittensor:feature Gittensor-scored feature linked to a feature issue - worth 1.25x multiplier. gittensor Gittensor contributor context lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

feat(enrichment): Typosquat & dependency-confusion detector

2 participants