Skip to content

perf(pm): use sync bin linking checks#2969

Draft
elrrrrrrr wants to merge 1 commit into
perf/pm-resolver-demand-bfsfrom
exp/pm-sync-bin-linking-b33d922
Draft

perf(pm): use sync bin linking checks#2969
elrrrrrrr wants to merge 1 commit into
perf/pm-resolver-demand-bfsfrom
exp/pm-sync-bin-linking-b33d922

Conversation

@elrrrrrrr
Copy link
Copy Markdown
Contributor

Summary

AB experiment for p3/p4 install performance.

This keeps package cloning and scheduler behavior unchanged. The only hot path moved is the bin-link rebuild path used by utoo install --ignore-scripts:

  • replace per-bin async try_exists().await with Path::try_exists()
  • add ScriptService::ensure_executable_sync() for bin linking
  • keep existing behavior: skip missing bin targets, add shebang for text bins without one, preserve binary files, and ensure Unix executable permissions

Hypothesis

p4 warm link has no network and uses a warm package cache, so remaining ctx should mostly come from small filesystem syscalls and per-bin async dispatch. Running these tiny bin checks synchronously should reduce scheduler/context-switch overhead without changing clone/download behavior.

Validation

Passed:

  • cargo fmt
  • cargo test -p utoo-pm test_ensure_executable_sync
  • cargo test -p utoo-pm test_execute_queues_skips_missing_bin_file
  • cargo clippy -p utoo-pm --all-targets -- -D warnings --no-deps

Attempted:

  • cargo clippy --all-targets -- -D warnings --no-deps

The full workspace clippy fails before this change is evaluated in pack-core because the local next.js symlinked checkout does not match the pack-core API usage (CrossOrigin, ExternalType::Promise, and Issue trait receiver/signature errors). PM-targeted clippy is clean.

Benchmark Plan

Label-trigger GHA bench, compare p3/p4 against same-run utoo-next, utoo-npm, and bun. Keep only if p4 wall/ctx moves positively without p3 regression.

@elrrrrrrr elrrrrrrr added A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR labels May 18, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces synchronous file execution and shebang management by adding ensure_executable_sync and check_and_add_shebang_sync to ScriptService. PackageService was updated to use these methods, and a unit test was included. Feedback highlighted a potential UTF-8 boundary bug in the shebang detection logic, as well as efficiency issues related to redundant I/O and unnecessary heap allocations, providing a more optimized implementation suggestion.

Comment on lines +381 to +401
let header = {
let mut file = std::fs::File::open(target_path)?;
let mut buffer = vec![0u8; 512];
let n = file.read(&mut buffer)?;
buffer.truncate(n);

std::str::from_utf8(&buffer)
.map_err(|_| anyhow::anyhow!("File is not valid UTF-8, likely a binary file"))?
.to_string()
};

if header.starts_with("#!") {
return Ok(false);
}

let content = std::fs::read_to_string(target_path)?;
let new_content = format!("#!/usr/bin/env node\n{}", content);

let mut file = std::fs::File::create(target_path)?;
file.write_all(new_content.as_bytes())?;
file.flush()?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The check_and_add_shebang_sync implementation has several efficiency issues and a potential bug:

  1. UTF-8 Boundary Bug: std::str::from_utf8 on a fixed 512-byte buffer will fail if a multi-byte character is split at the boundary, causing valid text files to be incorrectly identified as binary and skipped.
  2. Unnecessary Allocations: vec![0u8; 512] and .to_string() create heap allocations that are unnecessary for a simple prefix check.
  3. Redundant I/O: The file is opened and read twice (once for the header and once for the full content).

Since std::fs::read_to_string already performs UTF-8 validation and is highly optimized, you can simplify this significantly. For performance, you can still do a quick check for the shebang prefix before reading the whole file.

        {
            let mut file = std::fs::File::open(target_path)?;
            let mut buffer = [0u8; 2];
            if file.read(&mut buffer)? == 2 && &buffer == b"#!" {
                return Ok(false);
            }
        }

        let content = std::fs::read_to_string(target_path).map_err(|e| {
            if e.kind() == std::io::ErrorKind::InvalidData {
                anyhow::anyhow!("File is not valid UTF-8, likely a binary file")
            } else {
                anyhow::Error::from(e)
            }
        })?;

        if content.starts_with("#!") {
            return Ok(false);
        }

        let new_content = format!("#!/usr/bin/env node\n{}", content);
        std::fs::write(target_path, new_content)?;

@github-actions
Copy link
Copy Markdown

📊 pm-bench-phases · d380262 · linux (ubuntu-latest)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

PM wall ±σ user sys RSS pgMinor
bun 9.19s 0.32s 10.52s 10.28s 769M 340.4K
utoo-next 8.35s 0.43s 10.56s 12.23s 987M 135.1K
utoo-npm 9.62s 0.22s 10.92s 12.86s 1.00G 126.1K
utoo 7.94s 0.07s 11.31s 12.54s 933M 147.2K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 14.7K 19.1K 1.20G 6M 1.88G 1.76G 1M
utoo-next 121.5K 81.6K 1.17G 5M 1.73G 1.72G 2M
utoo-npm 156.0K 107.9K 1.17G 5M 1.73G 1.72G 2M
utoo 118.1K 92.5K 1.17G 6M 1.73G 1.72G 2M

p1_resolve

PM wall ±σ user sys RSS pgMinor
bun 1.97s 0.06s 4.14s 1.07s 521M 169.9K
utoo-next 2.99s 0.04s 5.51s 2.06s 617M 79.4K
utoo-npm 3.05s 0.02s 5.60s 2.06s 617M 87.8K
utoo 2.38s 0.04s 6.08s 1.68s 650M 124.4K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 7.8K 4.9K 206M 3M 110M - 1M
utoo-next 71.6K 88.9K 201M 2M 7M 3M 2M
utoo-npm 71.3K 93.2K 201M 2M 7M 3M 2M
utoo 14.0K 19.5K 204M 3M 7M 3M 2M

p3_cold_install

PM wall ±σ user sys RSS pgMinor
bun 6.78s 0.29s 6.39s 9.91s 622M 212.2K
utoo-next 7.74s 1.66s 5.15s 11.32s 475M 63.4K
utoo-npm 7.22s 2.25s 5.17s 11.09s 470M 61.2K
utoo 6.69s 1.70s 5.12s 11.22s 506M 62.4K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 4.1K 7.6K 1.00G 3M 1.77G 1.77G 1M
utoo-next 128.4K 52.9K 1001M 4M 1.72G 1.72G 2M
utoo-npm 113.0K 47.5K 1000M 3M 1.72G 1.72G 2M
utoo 115.6K 79.6K 1000M 3M 1.72G 1.72G 2M

p4_warm_link

PM wall ±σ user sys RSS pgMinor
bun 3.34s 0.02s 0.21s 2.34s 134M 32.7K
utoo-next 2.42s 0.18s 0.49s 3.76s 78M 18.6K
utoo-npm 2.38s 0.12s 0.48s 3.77s 80M 18.2K
utoo 2.32s 0.16s 0.53s 3.88s 62M 14.0K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 405 25 5M 36K 1.93G 1.76G 1M
utoo-next 38.5K 17.2K 13K 24K 1.72G 1.72G 2M
utoo-npm 42.2K 18.8K 14K 10K 1.72G 1.72G 2M
utoo 51.1K 25.1K 13K 9K 1.73G 1.72G 2M

npmmirror.com: no output captured.

@elrrrrrrr
Copy link
Copy Markdown
Contributor Author

GHA run 1 read

Run: https://github.com/utooland/utoo/actions/runs/26017108305

phase utoo wall utoo ctx same-run utoo-next same-run utoo-npm same-run bun read
p3_cold_install 6.69s ±1.70 115.6K / 79.6K 7.74s, 128.4K / 52.9K 7.22s, 113.0K / 47.5K 6.78s, 4.1K / 7.6K wall is fine, but iCtx regresses materially
p4_warm_link 2.32s ±0.16 51.1K / 25.1K 2.42s, 38.5K / 17.2K 2.38s, 42.2K / 18.8K 3.34s, 405 / 25 not acceptable; ctx is worse than same-run next/npm

Conclusion: reject this single-factor experiment. Moving bin-link existence/chmod checks to sync does not reduce the p4 ctx problem; the small wall movement is not worth the ctx regression. Do not fold.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant