Skip to content

feat: fastest json parse with sonic-rs#2

Merged
fengmk2 merged 24 commits intomasterfrom
use-sonic-rs
Nov 21, 2025
Merged

feat: fastest json parse with sonic-rs#2
fengmk2 merged 24 commits intomasterfrom
use-sonic-rs

Conversation

@fengmk2
Copy link
Member

@fengmk2 fengmk2 commented Nov 20, 2025

cnpm/cnpmcore#735

Summary by CodeRabbit

  • New Features

    • Introduced a Package class for parsing and analyzing npm package metadata, including version diffing, metadata extraction, and version history analysis.
  • Documentation

    • Added comprehensive usage guide with examples and updated build/test instructions to use npm and the new testing framework.
  • Tests

    • Migrated test framework from AVA to Vitest.
  • Chores

    • Updated dependencies and development tooling configuration.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 20, 2025

Walkthrough

This PR replaces a simple function with a comprehensive npm package metadata parser. It introduces a Package class (Rust + N-API bindings) that parses packument buffers, extracts metadata, computes version diffs, and reads structured fields. Tests migrate to Vitest, dependencies are added, and documentation reflects the new npm-based build with prebuilt binaries.

Changes

Cohort / File(s) Summary
Configuration & Build Setup
.gitignore, build.rs, rustfmt.toml
Removed Cargo.lock from ignored set; adjusted indentation in build.rs from 2 to 4 spaces; updated rustfmt tab spacing from 2 to 4 spaces.
Cargo Dependencies
Cargo.toml
Added three new dependencies: serde (with derive feature), sonic-rs, and tracing; applied minor formatting adjustments to TOML keys.
Rust Implementation
src/lib.rs, src/package.rs
Removed plus_100 function export; introduced new src/package.rs module providing Package class and supporting structs (Version, Dist, DiffResult, Signature, Attestation, Provenance, Human, etc.) with N-API bindings for package metadata parsing and version diffing.
Type Definitions & Bindings
index.d.ts, index.js
Replaced plus100 export with Package class; added comprehensive TypeScript interfaces for Attestation, Bugs, DiffResult, Dist, Human, NpmOperationalInternal, PeerDependenciesMeta, Provenance, PublishConfig, Repository, Signature, and Version.
Test Framework Migration
__test__/index.spec.ts, __test__/fixtures/obug.json
Switched from AVA to Vitest; added comprehensive test cases for Package class (buffer parsing, field validation, version diffing, dist data extraction); introduced npm-style fixture data for "obug" package.
Benchmarking
benchmark/bench.ts
Replaced simple numeric benchmarks with JSON-focused suite using Package wrapper; introduced multiple data fixtures and readme-specific benchmarks.
Project Configuration
package.json
Replaced test runner AVA with Vitest; updated dev dependencies; added prebench and pretest hooks; reworked lint-staged to include Rust formatting rule; modified bench script invocation.
Git Hooks
.husky/pre-commit
Removed entire pre-commit hook body (deleted shebang, Husky sourcing, and yarn lint-staged command).
Documentation
README.md
Added Usage section with examples (Basic Usage, Diff Versions, Extract Version Metadata); updated Build instructions to npm-based approach with prebuilt addon naming; replaced AVA test guidance with Vitest; revised CI/Release sections for GitHub Actions and platform-specific binary delivery.

Sequence Diagram(s)

sequenceDiagram
    participant Node as Node.js Code
    participant Package as Package Class
    participant Parser as Buffer Parser
    participant VersionStore as Version Store
    
    Node->>Package: new Package(buffer)
    Package->>Parser: parse JSON buffer
    Parser->>VersionStore: extract versions & metadata
    VersionStore-->>Package: return parsed data
    
    rect rgb(240, 248, 255)
    Note over Node,Package: Accessor Methods
    Node->>Package: name(), description(), readme()
    Package-->>Node: return field values
    end
    
    rect rgb(255, 240, 245)
    Note over Node,Package: Version Diffing
    Node->>Package: diff(existsVersions)
    Package->>VersionStore: compare existing vs remote
    VersionStore-->>Package: {addedVersions, removedVersions}
    Package-->>Node: return DiffResult
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • src/package.rs: New core module with substantial parsing logic, struct definitions with serde mappings, and version diffing algorithm requiring careful validation
  • index.d.ts: Extensive new type surface with many interfaces; verify consistency with Rust implementations
  • test/fixtures/obug.json: Large fixture file with complex nested structure; confirm completeness and correctness of test data
  • Test migration: test/index.spec.ts test cases switching frameworks; verify new Vitest assertions and snapshot handling
  • Interdependencies: Changes span Rust bindings, TypeScript types, JavaScript exports, and test infrastructure; require coordinated understanding

Poem

🐰 A package parser rises from the byte-bound deep,
Version diffs computed while codebases sleep,
Sonic-rs parsing, serde's gentle hand,
Metadata extracted, oh how grand!
Vitest hops in, as old AVA departs,
Tests now vibrant—a rabbit's work of art! 🎉

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: fastest json parse with sonic-rs' accurately describes the main architectural change: replacing a simple plus100 function with a comprehensive Package class that parses npm package metadata using sonic-rs for fast JSON parsing.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch use-sonic-rs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@socket-security
Copy link

socket-security bot commented Nov 20, 2025

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addednpm/​vitest@​4.0.10981007999100
Addedcargo/​serde@​1.0.2288110093100100
Addednpm/​@​types/​node@​24.10.11001008195100
Addedcargo/​napi@​3.5.28210093100100
Addedcargo/​tracing@​0.1.4110010093100100
Addedcargo/​napi-build@​2.3.19810093100100
Addedcargo/​napi-derive@​3.3.39910093100100
Addedcargo/​sonic-rs@​0.5.610010095100100

View full report


let b: Bench

function JSONParseName(data: Buffer) {
return JSON.parse(data).name
}

function JSONParseVersions(data: Buffer) {

function add(a: number) {
return a + 100
function JSONParseLatestVersion(data: Buffer) {
}

const b = new Bench()
function SonicJSONParseVersions(data: Buffer) {
return Object.keys(new Package(data).versions).length
}

function SonicJSONParseLatestVersion(data: Buffer) {
@fengmk2 fengmk2 marked this pull request as ready for review November 21, 2025 16:02
Copilot AI review requested due to automatic review settings November 21, 2025 16:02
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements fast JSON parsing for npm package metadata using the sonic-rs library, replacing the basic placeholder functionality. The changes include:

  • Integration of sonic-rs for high-performance JSON parsing
  • New Package struct with comprehensive API for accessing package metadata
  • Migration from ava to vitest for testing
  • Complete type definitions for TypeScript
  • Extensive benchmarking suite demonstrating performance improvements

Reviewed changes

Copilot reviewed 12 out of 21 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/package.rs New module implementing Package struct with methods for parsing npm package metadata using sonic-rs, includes diff functionality for version comparison
src/lib.rs Simplified to only include the package module
Cargo.toml Added dependencies: serde, sonic-rs, tracing; reformatted with 4-space indentation
build.rs Reformatted with 4-space indentation
rustfmt.toml Changed tab_spaces from 2 to 4
index.d.ts Complete TypeScript type definitions for Package class and related interfaces
index.js Updated to export Package class instead of plus100 function
package.json Replaced ava with vitest, added @types/node, updated scripts for new test framework
test/index.spec.ts Migrated tests from ava to vitest with comprehensive test coverage
benchmark/bench.ts Extensive benchmarking code comparing JSON.parse with sonic-rs performance
README.md Updated documentation with usage examples, benchmarks, and API reference
yarn.lock Updated dependencies reflecting vitest migration and removal of ava
.gitignore Cargo.lock now tracked in repository
.husky/pre-commit Simplified hook script
test/fixtures/obug.json New test fixture file
test/snapshots/index.spec.ts.snap Vitest snapshot file

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

/// ["1.2.0", [119797, 138592]],
/// ],
/// removed_versions: [
/// ["1.0.1", [100992, 119796]],
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation example on line 54 is inconsistent with the actual output. Line 54 shows removed_versions: [["1.0.1", [100992, 119796]]] with position data, but the actual implementation on line 82 pushes only the version string without position data (removed_versions.push(version.to_string())). The documentation should be corrected to show removed_versions: ["1.0.1"].

Suggested change
/// ["1.0.1", [100992, 119796]],
/// "1.0.1",

Copilot uses AI. Check for mistakes.
Comment on lines +17 to +20
function JSONParseName(data: Buffer) {
// @ts-expect-error ignore the type error
return JSON.parse(data).name
}
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused function JSONParseName.

Suggested change
function JSONParseName(data: Buffer) {
// @ts-expect-error ignore the type error
return JSON.parse(data).name
}

Copilot uses AI. Check for mistakes.
Comment on lines +22 to 26
function JSONParseVersions(data: Buffer) {
// @ts-expect-error ignore the type error
return Object.keys(JSON.parse(data).versions).length
}

Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused function JSONParseVersions.

Suggested change
function JSONParseVersions(data: Buffer) {
// @ts-expect-error ignore the type error
return Object.keys(JSON.parse(data).versions).length
}

Copilot uses AI. Check for mistakes.
Comment on lines +27 to 32
function JSONParseLatestVersion(data: Buffer) {
// @ts-expect-error ignore the type error
const pkg = JSON.parse(data)
return pkg.versions[pkg['dist-tags'].latest]
}

Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused function JSONParseLatestVersion.

Suggested change
function JSONParseLatestVersion(data: Buffer) {
// @ts-expect-error ignore the type error
const pkg = JSON.parse(data)
return pkg.versions[pkg['dist-tags'].latest]
}

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +36
function SonicJSONParseVersions(data: Buffer) {
return Object.keys(new Package(data).versions).length
}

Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused function SonicJSONParseVersions.

Suggested change
function SonicJSONParseVersions(data: Buffer) {
return Object.keys(new Package(data).versions).length
}

Copilot uses AI. Check for mistakes.
Comment on lines +37 to +40
function SonicJSONParseLatestVersion(data: Buffer) {
return new Package(data).getLatestVersion()
}

Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused function SonicJSONParseLatestVersion.

Suggested change
function SonicJSONParseLatestVersion(data: Buffer) {
return new Package(data).getLatestVersion()
}

Copilot uses AI. Check for mistakes.
@fengmk2 fengmk2 merged commit da93d03 into master Nov 21, 2025
43 of 44 checks passed
@fengmk2 fengmk2 deleted the use-sonic-rs branch November 21, 2025 16:07
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
benchmark/bench.ts (1)

17-25: Unused benchmark helpers still present

JSONParseName, JSONParseVersions, JSONParseLatestVersion, SonicJSONParseVersions, and SonicJSONParseLatestVersion are currently unused except in commented‑out benchmarking code. Static analysis has already flagged these as dead code.

If you don’t plan to re‑enable the corresponding benchmarks soon, it’s cleaner to remove these helpers (or move them into the commented region) to avoid ongoing “unused” warnings.

Also applies to: 27-31, 33-39

🧹 Nitpick comments (6)
package.json (1)

47-60: Build hooks before bench/test are safe; consider inner-loop cost

prebench/pretest running yarn build guarantees the native addon exists before benchmarks and tests, which is robust for CI and first-time use. For local development, this can add overhead on every yarn test/yarn bench; if that becomes painful, consider making build explicit in dev (keep pre* hooks for CI only or gate them via an env flag).

Also applies to: 71-81, 83-94

README.md (1)

17-149: Usage docs are solid, but dev requirements are now inconsistent

The new Usage/diff/position examples match the Package API exposed from index.js and clearly explain how to leverage sonic-rs for fast, positional reads. However, the “Develop requirements” section still mentions Node.js@10+ and yarn@1.x, while package.json declares "engines": { "node": ">= 22.0.0" } and "packageManager": "yarn@4.11.0". Consider updating those bullets (and any other references) to reflect the actual minimum Node and Yarn versions to avoid confusion for contributors.

Also applies to: 155-160, 177-187

__test__/index.spec.ts (1)

20-27: Hard‑coded readme offsets and timestamps are brittle

The assertions on readmePosition and time.modified are great for verifying exact behavior, but they’re tightly coupled to the exact JSON layout and metadata in a.json. Any whitespace or fixture tweak will break these tests even if semantics are unchanged. Consider either:

  • Deriving expectations from the fixture content (e.g., asserting consistency between slice and pkg.readme but not absolute offsets), or
  • Moving the exact offsets/timestamps into snapshots.

This would keep tests robust while still guarding behavior.

benchmark/bench.ts (1)

17-25: Avoid repeated @ts-expect-error around JSON.parse(Buffer)

All the JSONParse* helpers rely on JSON.parse(data) where data is a Buffer, with @ts-expect-error to silence the type mismatch. At runtime this works via implicit Buffer -> string coercion, but the pattern hides real type information.

Consider a small helper that makes the intent explicit and removes scattered ts-expect-errors, e.g.:

function parseJsonFromBuffer<T = any>(data: Buffer): T {
  return JSON.parse(data.toString('utf8')) as T
}

Then:

-  // @ts-expect-error ignore the type error
-  return JSON.parse(data).readme
+  return parseJsonFromBuffer<any>(data).readme

This keeps benchmarks readable without relying on implicit coercion everywhere.

Also applies to: 27-31, 147-150, 157-160

index.d.ts (1)

14-42: diff JSDoc output example is inconsistent with the actual DiffResult shape

The documentation block for Package.diff still reflects the Rust/serde struct:

  • It uses Vec<String> instead of TS types.
  • The example output shows added_versions/removed_versions and includes positions for removed versions.
  • The actual TS interface is:
export interface DiffResult {
  addedVersions: Array<[string, [number, number]]>
  removedVersions: Array<string>
}

To avoid confusion, I’d update the comment to:

  • Use TS notation (string[], DiffResult) instead of Vec<String>.
  • Show camelCase keys and the correct removedVersions type, e.g.:
{
  addedVersions: [
    ['1.1.0', [100992, 119796]],
    ['1.2.0', [119797, 138592]],
  ],
  removedVersions: ['1.0.1'],
}

This keeps the docs aligned with what callers actually see.

Also applies to: 65-68

src/package.rs (1)

167-189: get_latest_version silently swallows parse errors

get_latest_version returns Option<Version> and ignores parse failures:

for (key, value) in to_object_iter(versions.as_raw_str()).flatten() {
    if key == latest_version {
        if let Ok(version) = from_str(value.as_raw_str()) {
            return Some(version);
        }
    }
}

If the latest entry is malformed, this just returns None, which is indistinguishable from “no latest tag” from the JS side. Elsewhere (new and versions) you map sonic-rs errors into Status::InvalidArg, so the behavior here is inconsistent.

Consider either:

  • Propagating errors similarly (changing signature to Result<Option<Version>> and surfacing an N-API error), or
  • At least logging/tracing failures so consumers can diagnose bad metadata.

This would make failures to parse latest more visible instead of silently degrading to null.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 69026f2 and b4ea23e.

⛔ Files ignored due to path filters (3)
  • Cargo.lock is excluded by !**/*.lock
  • __test__/__snapshots__/index.spec.ts.snap is excluded by !**/*.snap
  • yarn.lock is excluded by !**/yarn.lock, !**/*.lock
📒 Files selected for processing (14)
  • .gitignore (0 hunks)
  • .husky/pre-commit (0 hunks)
  • Cargo.toml (1 hunks)
  • README.md (1 hunks)
  • __test__/fixtures/obug.json (1 hunks)
  • __test__/index.spec.ts (1 hunks)
  • benchmark/bench.ts (1 hunks)
  • build.rs (1 hunks)
  • index.d.ts (1 hunks)
  • index.js (1 hunks)
  • package.json (4 hunks)
  • rustfmt.toml (1 hunks)
  • src/lib.rs (1 hunks)
  • src/package.rs (1 hunks)
💤 Files with no reviewable changes (2)
  • .husky/pre-commit
  • .gitignore
🧰 Additional context used
🧬 Code graph analysis (2)
__test__/index.spec.ts (1)
src/package.rs (3)
  • readme (113-120)
  • diff (59-96)
  • position (192-196)
src/package.rs (1)
index.d.ts (12)
  • DiffResult (65-68)
  • Version (141-156)
  • Human (96-103)
  • Dist (74-90)
  • NpmOperationalInternal (105-108)
  • PublishConfig (118-121)
  • PeerDependenciesMeta (110-112)
  • Signature (132-135)
  • Attestation (55-58)
  • Provenance (114-116)
  • Repository (127-130)
  • Bugs (61-63)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: stable - aarch64-pc-windows-msvc - node@22
🔇 Additional comments (6)
build.rs (1)

1-3: Formatting-only change looks good

Indentation now matches the rustfmt config (4 spaces); no behavioral impact.

rustfmt.toml (1)

1-1: Rust formatting configuration is consistent

Using 4-space indentation is standard and matches the rest of the changes.

Cargo.toml (1)

5-16: Dependency and release-profile updates look reasonable

The added runtime deps (serde with derive, sonic-rs, tracing) and lto = true in the release profile are appropriate for a performance-focused native parser and N-API binding. No issues from this diff alone.

Also applies to: 21-23

index.js (1)

559-560: Named Package export wiring looks correct

Adding module.exports.Package = nativeBinding.Package on top of module.exports = nativeBinding cleanly exposes Package for both CommonJS (require('@cnpmjs/packument').Package) and ESM named imports via CJS interop. No behavioral issues with the loader logic.

src/lib.rs (1)

1-3: lib entry now defers to package module; wiring is fine

The crate root simply includes mod package;, which is appropriate for a N-API-focused cdylib where all exported surface lives in src/package.rs. As long as package.rs defines the #[napi] Package type and related exports, this setup is correct.

Please double-check that src/package.rs actually exposes the Package class and any other intended #[napi] exports so that the generated binding in index.js matches the Rust side.

__test__/fixtures/obug.json (1)

1-972: The 1.0.1 mismatch is intentional test data—fixture is correct

Verification confirms the fixture is designed correctly. The test "should get diff versions" explicitly queries pkg.diff(['1.0.0', '1.0.1', '10000000.222.111']), where 1.0.1 exists in the time map but not in versions. This is deliberate: the diff method is being tested to verify it can detect version inconsistencies between different package metadata sections. The 10000000.222.111 entry further tests non-existent versions. The fixture structure is sound and the mismatch serves a clear testing purpose.

Likely an incorrect or invalid review comment.

Comment on lines +31 to +32
expect(pkg.getLatestVersion()).matchSnapshot()
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Snapshot matcher name likely wrong for Vitest

You’re calling expect(...).matchSnapshot(), but Vitest’s built-in matcher is toMatchSnapshot(). Unless you’ve registered a custom matcher named matchSnapshot, this will be undefined at runtime and cause the tests to throw rather than record/compare snapshots. I’d switch all usages to toMatchSnapshot().

Example:

-  expect(pkg.getLatestVersion()).matchSnapshot()
+  expect(pkg.getLatestVersion()).toMatchSnapshot()

Apply similarly on Lines 43, 49, 59, and 63.

Also applies to: 43-44, 49-50, 59-63


I'll verify this review comment by checking the actual test file and confirming the correct Vitest API.

[/run_scripts]
[/web_search]

---

🏁 Script executed:

```shell
#!/bin/bash
# Find and examine the test file
fd -e spec.ts -e test.ts | head -20

Length of output: 22


🏁 Script executed:

#!/bin/bash
# Look for the specific test file and show snapshot matcher usage
find . -path "*/node_modules" -prune -o -name "index.spec.ts" -type f -print 2>/dev/null

Length of output: 24


🏁 Script executed:

#!/bin/bash
# If found, display the content around the mentioned lines
if [ -f "__test__/index.spec.ts" ]; then
  echo "=== Lines 25-35 ==="
  sed -n '25,35p' "__test__/index.spec.ts"
  echo ""
  echo "=== Lines 40-50 ==="
  sed -n '40,50p' "__test__/index.spec.ts"
  echo ""
  echo "=== Lines 55-65 ==="
  sed -n '55,65p' "__test__/index.spec.ts"
fi

Length of output: 1298


🏁 Script executed:

#!/bin/bash
# Search for all matchSnapshot and toMatchSnapshot occurrences
rg -n "match[Ss]napshot" --type ts 2>/dev/null | head -30

Length of output: 356


🌐 Web query:

Vitest snapshot matcher API toMatchSnapshot

💡 Result:

toMatchSnapshot — summary

  • Signature: expect(value).toMatchSnapshot(shape?: Partial | string, hint?: string). Use it to compare a value against the stored snapshot. [1]
  • Behavior: first run creates a .snap file; later runs compare and fail on mismatch. You can provide a shape (e.g., { foo: expect.any(Set) }) to match only parts of an object. [2]
  • Variants:
    • toMatchInlineSnapshot(shape?, snapshot?, hint?) — stores snapshot inline in the test file. [1][2]
    • toMatchFileSnapshot(filepath, hint?) — compares to a specific file (async, use await). [1]
  • Updating: press "u" during interactive run or run Vitest with -u / --update to update snapshots. [1][2]

Sources: Vitest expect API and Snapshot guide. [1][2][3]


Fix incorrect snapshot matcher name across all test cases

Your test file uses .matchSnapshot() on lines 31, 43, 49, 59, and 63. Vitest's correct matcher is toMatchSnapshot(). The current code will fail at runtime since matchSnapshot is undefined.

Update all five occurrences:

-  expect(pkg.getLatestVersion()).matchSnapshot()
+  expect(pkg.getLatestVersion()).toMatchSnapshot()

Apply the same change to lines 43, 49, 59, and 63.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
expect(pkg.getLatestVersion()).matchSnapshot()
})
expect(pkg.getLatestVersion()).toMatchSnapshot()
})
🤖 Prompt for AI Agents
In __test__/index.spec.ts around lines 31, 43, 49, 59 and 63, the tests call the
non-existent matcher matchSnapshot() which will cause runtime failures; replace
each occurrence of .matchSnapshot() with Vitest's correct matcher
.toMatchSnapshot() so all five assertions use expect(...).toMatchSnapshot()
instead.

Comment on lines +12 to +23
pub struct Package<'a> {
root: LazyValue<'a>,
}

#[napi]
impl<'a> Package<'a> {
#[napi(constructor)]
pub fn new(data: &'a [u8]) -> Result<Self> {
let root: LazyValue =
from_slice(data).map_err(|e| Error::new(Status::InvalidArg, e.to_string()))?;
Ok(Package { root })
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Potential lifetime / memory-safety issue storing LazyValue<'a> in an exported N-API class

Package<'a> stores a LazyValue<'a> that is created via from_slice(data) where data: &'a [u8] comes directly from the JS Uint8Array argument in the constructor. Because Package is exposed as a JS class, instances can outlive the original JS buffer reference:

#[napi]
pub struct Package<'a> {
    root: LazyValue<'a>,
}

#[napi(constructor)]
pub fn new(data: &'a [u8]) -> Result<Self> {
    let root: LazyValue = from_slice(data)?;
    Ok(Package { root })
}

Unless napi-rs is copying the bytes under the hood for &[u8] parameters (and sonic-rs’ LazyValue is not actually borrowing the passed slice), this pattern risks root pointing to memory that can be freed by JS GC once the original Uint8Array/Buffer is no longer strongly referenced on the JS side.

I’d strongly recommend double‑checking:

  • Whether from_slice/LazyValue<'a> requires the input slice to remain valid for the lifetime of the LazyValue.
  • How napi-rs implements &[u8] arguments (borrow vs copy).

If it does borrow, you likely need a design where Rust owns the backing bytes (e.g., storing a Buffer/Vec<u8> inside Package and parsing from that each call, or using a non‑borrowing representation) to avoid use‑after‑free.


I'll verify the lifetime/memory-safety concern by gathering information about the libraries and codebase involved.





---

🏁 Script executed:

```shell
#!/bin/bash
# Find the complete Package implementation and related uses
rg -n "impl.*Package" --type rust -A 20

Length of output: 1732


🌐 Web query:

napi-rs how does it handle &[u8] parameters borrow or copy

💡 Result:

napi‑rs treats a parameter of type &[u8] as a zero‑copy, borrowed view into the JS Buffer/TypedArray whose lifetime is bound to the native call. It is not copied automatically — you get a borrow you can read from in that function only. If you need an owned/async‑safe copy, use an owned Buffer/ArrayBuffer or explicitly copy (e.g. Buffer::copy_from / to_vec()). Also borrowed slices cannot cross async/await or thread boundaries. [1][2]


🌐 Web query:

sonic_rs LazyValue lifetime memory ownership

💡 Result:

Short answer — LazyValue<'a> borrows the raw JSON slice from the input; its lifetime is tied to that input. If you need owned data, convert to OwnedLazyValue (or use OwnedLazyValue directly).

Details:

  • LazyValue<'a> is a wrapper around an unparsed raw JSON text borrowed from the origin JSON — the &'a lifetime ties the LazyValue to the input buffer. [1]
  • APIs that take a byte slice (e.g., get_from_slice) return LazyValue<'_> which borrows that slice — you must keep the original input buffer alive while using the LazyValue. [2]
  • If you need an owned copy (for longer storage, Send/Sync, or serde-owning usage), use OwnedLazyValue (can be created from LazyValue and holds its own storage). [3]
  • Multi-get helpers return Vec<Option<LazyValue<'de>>> (i.e., borrowed LazyValue items with the same input lifetime). Unsafe unchecked variants exist but still require the input to be valid and alive. [4]

If you want, I can show a minimal Rust example illustrating the borrow/ownership conversion.


Now let me verify whether Package methods actually access the root field and if there are any other ownership patterns in the codebase:





🏁 Script executed:

#!/bin/bash
# Find all methods on Package and see if they access root
rg -n "pub fn|fn " src/package.rs | head -30

Length of output: 646


🏁 Script executed:

#!/bin/bash
# Check if root field is accessed in methods
rg -n "self\.root|root\." src/package.rs -A 2 -B 1

Length of output: 2187


Critical lifetime/memory-safety vulnerability confirmed: LazyValue<'a> borrows buffer that can be freed by JS GC

Package<'a> stores LazyValue<'a> created from &'a [u8] passed by napi-rs. However, napi-rs provides &[u8] as a zero-copy borrow valid only during the constructor call—once it returns, the underlying JS buffer can be garbage collected. Yet LazyValue<'a> borrows from that slice, and multiple methods actively access self.root after construction (name, description, readme, time, versions, get_latest_version, and others), including direct pointer arithmetic at line 194. This is a use-after-free vulnerability.

Required fix: Use OwnedLazyValue or store owned bytes (Vec<u8>) inside Package to ensure the JSON buffer remains valid for the struct's lifetime.

Comment on lines +140 to +151
#[napi(getter)]
pub fn is_unpublished(&self) -> bool {
let Some(time) = self.root.get("time") else {
return false;
};
for (key, value) in to_object_iter(time.as_raw_str()).flatten() {
if key == "unpublished" && value.is_str() {
return true;
}
}
false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

is_unpublished likely never returns true for real npm packuments

Current logic only treats "unpublished" as true when the value is a string:

for (key, value) in to_object_iter(time.as_raw_str()).flatten() {
    if key == "unpublished" && value.is_str() {
        return true;
    }
}

In npm metadata, time.unpublished is typically an object (with fields like name and time), not a plain string. In that common case, value.is_str() is false, so is_unpublished() will incorrectly return false even when the package is actually unpublished.

If you only need a boolean flag, it’s safer to check for the presence of the key regardless of value type, e.g.:

-    for (key, value) in to_object_iter(time.as_raw_str()).flatten() {
-        if key == "unpublished" && value.is_str() {
-            return true;
-        }
-    }
+    for (key, _value) in to_object_iter(time.as_raw_str()).flatten() {
+        if key == "unpublished" {
+            return true;
+        }
+    }

You may also want a test fixture covering the unpublished case to lock this in.


I'll verify this by checking npm's actual metadata structure and examining the codebase context.





🏁 Script executed:

#!/bin/bash
# Search for to_object_iter function definition and usage context
rg -n "to_object_iter" --type=rust -B5 -A5

Length of output: 4395


🌐 Web query:

npm package metadata time.unpublished field structure

💡 Result:

Short answer: in npm registry package metadata the top-level time field is a map of timestamps (ISO strings) keyed by version names plus "created" and "modified". When a package/version is unpublished the time object includes an unpublished entry whose value is an object (not an ISO string). That unpublished object typically contains at least:

  • time: ISO timestamp of the unpublish action,
  • by: an object with the actor's name/email,
  • versions: an array (or map) of version strings that were removed.

Evidence: the common "time" shape (version keys → ISO strings, plus created/modified) is shown in npm metadata outputs. [1] The registry unpublish semantics / API (which produces an unpublished record in metadata) are documented by npm and explained in registry internals articles. [2][3]

Sources:
[1] example npm metadata "time" output (version → ISO timestamps).
[2] npm docs — Unpublishing packages from the registry.
[3] blog describing registry unpublish behavior / package-unpublish API.



---

🏁 Script executed:

```shell
#!/bin/bash
# Search for tests of is_unpublished function
rg -n "is_unpublished" --type=rust -B3 -A3

Length of output: 293


Fix is_unpublished() to check for key presence, not value type

The time.unpublished field in npm metadata is an object (containing time, by, and versions fields), not a string. The current check value.is_str() at line 146 will always be false for real unpublished packages, causing the function to incorrectly return false even when the package is actually unpublished.

Remove the string type check to properly detect the presence of the unpublished entry:

-    for (key, value) in to_object_iter(time.as_raw_str()).flatten() {
-        if key == "unpublished" && value.is_str() {
-            return true;
-        }
-    }
+    for (key, _value) in to_object_iter(time.as_raw_str()).flatten() {
+        if key == "unpublished" {
+            return true;
+        }
+    }

Add a test fixture covering an unpublished package to prevent regression.

🤖 Prompt for AI Agents
In src/package.rs around lines 140 to 151, the is_unpublished() function
incorrectly checks value.is_str() when detecting the "unpublished" key in the
time object; remove the type check and simply detect the presence of the
"unpublished" key (i.e., if key == "unpublished" return true), so any object
value is accepted; add a unit test fixture representing an unpublished package
(time.unpublished as an object with time/by/versions) to cover this case and
prevent regressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants