chore: migrate to pnpm and enable minimum release age #3568

Open
B4nan wants to merge 23 commits into master from chore/migrate-to-pnpm
B4nan (Member) commented Apr 14, 2026

Summary

Part of the org-wide supply-chain hardening + pnpm migration. Migrates crawlee from Yarn 4 to pnpm across the 20-package monorepo, plus docs/ and website/ workspaces. Keeps a 1-day minimum release age guard (already in place via Yarn's npmMinimalAgeGate) at the pnpm layer and augments Renovate with the internal @apify/* + @crawlee/* whitelist.

Changes

  • package.json: drop workspaces, packageManager → pnpm@10.24.0, yarn X → pnpm X
  • lerna.json: npmClient: "pnpm"
  • .npmrc: node-linker=hoisted, link-workspace-packages=true, prefer-workspace-packages=true, public-hoist-pattern[]=*, plus !@types/minimatch to keep the stale v3 types off the root
  • pnpm-workspace.yaml: packages: [packages/*, website, docs], minimumReleaseAge: 1440, @apify/* + @crawlee/* excluded; overrides pin playwright-core, @browserbasehq/stagehand, and minimatch: ^9.0.0 (so the single hoisted minimatch matches what @crawlee/core actually imports)
  • tsconfig.build.json: incremental: false — per-package clean only removes dist/, not the tsconfig.build.tsbuildinfo sidecar, which was leaving tsc in a stale "already built" state after partial failures
  • All 20 packages/*/package.json: yarn → pnpm in scripts; internal @crawlee/* cross-deps switched to "workspace:*" so pnpm links the local copy (avoids peer-dep identity drift; same fix as actor-scraper / fingerprint-suite)
  • docs/package.json, website/package.json: yarn → pnpm
  • .husky/pre-commit: yarn lint-staged → pnpm lint-staged
  • New .github/actions/pnpm-install composite action (cached pnpm store, keyed by year-month + lockfile hash; pattern from apify-cli#1068, "chore: move to pnpm from yarn")
  • CI workflows (docs, publish-to-npm, release, test-ci, test-e2e): delegate install to composite; drop corepack prepare yarn and cache: 'yarn'; yarn → pnpm; pnpm publish --no-git-checks
  • renovate.json: add @apify/* + @crawlee/* whitelist packageRule; remove "yarn" from ignoreDeps; drop old "constraints": {"npm": ...}
  • Deleted: yarn.lock, .yarnrc.yml, docs/yarn.lock, website/yarn.lock
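
The pnpm-workspace.yaml entry in the list above could look roughly like the following. This is an illustrative sketch assembled from the description, not the file from this PR. It assumes a pnpm 10 release that supports `minimumReleaseAge` and `minimumReleaseAgeExclude` in the workspace file, and it leaves out the `playwright-core` and `@browserbasehq/stagehand` pins because their versions are not stated here.

```yaml
# pnpm-workspace.yaml (illustrative sketch, not the file from this PR)
packages:
  - packages/*
  - website
  - docs

# Refuse to resolve versions published less than 1440 minutes (1 day) ago.
minimumReleaseAge: 1440

# Internal scopes are exempt from the age gate.
minimumReleaseAgeExclude:
  - '@apify/*'
  - '@crawlee/*'

overrides:
  # Keep the single hoisted minimatch on the major that @crawlee/core imports.
  minimatch: ^9.0.0
```

The age gate and the exclusion list sit side by side, so a freshly published internal package can still be installed immediately while third-party releases wait a day.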

Verification

pnpm install + pnpm build pass for all 20 packages locally. pnpm test is 1125 passed / 17 failed — all 17 failures are puppeteer/browser-pool tests that need Chromium to launch and error out on page.goto, i.e. environmental, not migration regressions. The stale .claude/worktrees/jazzy-riding-reddy on disk (from a separate branch) also leaks into vitest discovery; it's not part of this PR.

🤖 Generated with Claude Code

@github-actions github-actions Bot added this to the 138th sprint - Tooling team milestone Apr 14, 2026
@github-actions github-actions Bot added the t-tooling Issues with this label are in the ownership of the tooling team. label Apr 14, 2026
@B4nan B4nan added the adhoc Ad-hoc unplanned task added during the sprint. label Apr 14, 2026
@B4nan B4nan changed the title from "chore: migrate to pnpm and enable minimum release age (WIP)" to "chore: migrate to pnpm and enable minimum release age" Apr 15, 2026
@B4nan B4nan marked this pull request as ready for review April 15, 2026 17:08
B4nan and others added 18 commits April 16, 2026 10:13
Migrates crawlee from Yarn 4 to pnpm workspaces across the 20-package
monorepo, plus docs/ and website/ workspaces. Adds a 1-day minimum
release age supply-chain guard at the package manager layer
(pnpm-workspace.yaml) and at the Renovate layer. Internal
`@apify/*` and `@crawlee/*` packages are whitelisted at both layers.

Notable changes:
- package.json: drop "workspaces", set packageManager to pnpm@10.24.0;
  "yarn X" -> "pnpm X" in scripts
- lerna.json: npmClient "yarn" -> "pnpm"
- .npmrc: node-linker=hoisted + link-workspace-packages=true +
  prefer-workspace-packages=true + public-hoist-pattern[]=*
- pnpm-workspace.yaml: packages [packages/*, website, docs], min
  release age 1440, @apify/* + @crawlee/* excluded, plus pinned
  overrides for playwright-core and @browserbasehq/stagehand
- All 20 packages/*/package.json: yarn -> pnpm in scripts; internal
  @crawlee/* deps converted to "workspace:*"
- Deleted yarn.lock, .yarnrc.yml, docs/yarn.lock, website/yarn.lock;
  generated pnpm-lock.yaml
- .husky/pre-commit: yarn -> pnpm
- New .github/actions/pnpm-install composite action (cached pnpm
  store, pattern from apify/apify-cli#1068)
- CI workflows (docs, publish-to-npm, release, test-ci, test-e2e):
  delegate install to composite; yarn/corepack removed;
  yarn -> pnpm; pnpm publish --no-git-checks
- renovate.json: add @apify/* + @crawlee/* whitelist; drop "yarn"
  from ignoreDeps and old npm constraint. minimumReleaseAge
  and internalChecksFilter were already in place.
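
A composite action with a store cache keyed by year-month and lockfile hash, as described in the list above, might be sketched like this. The step layout, the cache key prefix, and the `STORE_PATH` / `YEAR_MONTH` variable names are assumptions for illustration; the actual action in apify/apify-cli#1068 may differ. The sketch also assumes the calling workflow has already set up Node.js.

```yaml
# .github/actions/pnpm-install/action.yml (illustrative sketch)
name: pnpm install
description: Install dependencies with a cached pnpm store
runs:
  using: composite
  steps:
    # With no explicit version, this reads the packageManager field in package.json.
    - uses: pnpm/action-setup@v4

    - name: Resolve store path and cache month
      shell: bash
      run: |
        echo "STORE_PATH=$(pnpm store path --silent)" >> "$GITHUB_ENV"
        echo "YEAR_MONTH=$(date -u '+%Y%m')" >> "$GITHUB_ENV"

    - uses: actions/cache@v4
      with:
        path: ${{ env.STORE_PATH }}
        # The month segment rotates the cache; the lockfile hash invalidates it on dependency changes.
        key: ${{ runner.os }}-pnpm-store-${{ env.YEAR_MONTH }}-${{ hashFiles('**/pnpm-lock.yaml') }}
        restore-keys: |
          ${{ runner.os }}-pnpm-store-${{ env.YEAR_MONTH }}-

    - name: Install dependencies
      shell: bash
      run: pnpm install --frozen-lockfile
```

Including the month in the key keeps the store from growing without bound, since a restored cache is otherwise only ever added to.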

WIP / known local issue: @crawlee/types build currently fails in
gen-esm-wrapper (dist resolution), needs small follow-up fix. CI
will likely surface additional peer-dep / type-resolution issues
typical of a large yarn->pnpm swap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… tsc

Fixes the local build failures from the original migration commit:

- Add minimatch "^9.0.0" override in pnpm-workspace.yaml — the default
  hoisted layout was pulling minimatch@3 as the single hoisted version
  (from a transitive lerna dep), which tripped type resolution in
  @crawlee/core (src uses minimatch@9's named `minimatch` export).
- Exclude @types/minimatch from public hoisting (.npmrc) — the stale
  @types/minimatch@3 at the root was shadowing minimatch@9's built-in
  types when tsc resolved the module.
- Set incremental: false in the root tsconfig.build.json — per-package
  clean scripts only remove dist/, not the tsconfig.build.tsbuildinfo
  sidecar, so partial failures left stale incremental state that made
  tsc skip emitting on the next run.

Together these make pnpm build pass for all 20 packages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CI needs these to run postinstall to download browser binaries and
build native modules. Matches the 'Ignored build scripts' warning
from pnpm install.

Patches target older docusaurus versions than pnpm now resolves.
The patches themselves don't apply, which is fine (patched behavior
may no longer be needed), but patch-package exits non-zero which
kills pnpm install.

The original yarn setup had workspaces: [packages/*] only — website
and docs had their own yarn installs. Keep that separation under pnpm
so the main build doesn't require Node 20+ in the test matrix.

Replaces the local .github/actions/pnpm-install composite copy
with the shared one from apify/workflows@main. Identical behavior,
less duplication.

@apify/eslint-config@2.0.6 switched from eslint-plugin-import to
eslint-plugin-import-x, which adds strict rules (import-x/extensions,
import-x/no-extraneous-dependencies) that surface pre-existing issues.
Pin to 2.0.0 to match master's yarn.lock resolution.

Also clean up the inquirer import into a default-import + destructure
pattern so it complies with import/first and import/newline-after-import.

The @crawlee/core exports are re-exported by many packages (basic,
cheerio, http, browser, etc.) as the core design of the monorepo.
import/export flags every one of these as 'Multiple exports'.
Master hides this because yarn's layout didn't load the plugin;
pnpm's hoisted layout loads it properly. Silence the rule.

- Add docs and website back to pnpm-workspace.yaml (Node 20+ matrix
  is fine for docusaurus).
- Add @crawlee/* and crawlee workspace deps to docs/package.json so
  tsc resolves them; pnpm refused to walk up the symlink tree.
- Add postinstall script to delete README/CHANGELOG files in
  packages/*/node_modules — pnpm leaves nested copies for peer-dep
  conflicts (inquirer 8/9/12, wrap-ansi, etc.) and the typedoc
  plugin walks up to those package.json files and pulls the README
  into the MDX loader. Yarn classic flattened these out.
- Exclude **/node_modules/** from docusaurus docs scan so workspace
  symlinks under docs/ don't get treated as routes.
- Apply biome format

pnpm 10 doesn't auto-run puppeteer's postinstall by default; tests
crash with 'Could not find Chrome'. Add an explicit browser install
step (mirrors what apify-pr-toolkit and others do).
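
An explicit browser install step of the kind described above could be as small as the following workflow fragment. It is a sketch, not the step from this PR: it assumes the `puppeteer` CLI is resolvable from the workspace root and that Chrome is the browser the tests need.

```yaml
# Workflow step (illustrative); runs after dependencies are installed.
- name: Install Chrome for Puppeteer
  run: pnpm exec puppeteer browsers install chrome
```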

Vitest 4 enforces destructuring on fixture/hook arguments.

Silences npm warnings about unknown options like node-linker; pnpm
reads the same keys from pnpm-workspace.yaml in camelCase form.
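
As a rough illustration of that move, the .npmrc settings listed in the PR description would translate to camelCase keys along these lines. The values are taken from the description; the exact final file is an assumption.

```yaml
# pnpm-workspace.yaml (illustrative): former .npmrc keys in camelCase form
nodeLinker: hoisted
linkWorkspacePackages: true
preferWorkspacePackages: true
publicHoistPattern:
  - '*'
  # Keep the stale v3 types off the root so minimatch@9's bundled types win.
  - '!@types/minimatch'
```

Because npm never reads pnpm-workspace.yaml, it no longer sees keys it does not recognize.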

Block accidental npm/yarn install — npm 10.5+ and pnpm 10.x both
honor devEngines.packageManager and refuse to run when it doesn't
match.
@B4nan B4nan force-pushed the chore/migrate-to-pnpm branch from cb7138d to 2468161 Compare April 16, 2026 08:13
@B4nan B4nan force-pushed the chore/migrate-to-pnpm branch from 9eadfba to dcc3307 Compare April 16, 2026 12:12
B4nan and others added 3 commits April 16, 2026 15:52
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
devEngines.packageManager breaks pnpm commands that delegate to npm
(pnpm version, pnpm pkg set, etc). Use the battle-tested only-allow
approach instead (same as Vite, Vue, Astro).
B4nan added a commit that referenced this pull request Apr 20, 2026
## Summary

Port the pnpm migration from master (#3568) to the v4 branch. Same
goals:
- Yarn 4 → pnpm 10.24.0
- `minimumReleaseAge: 1440` supply-chain guard at both pnpm and Renovate
layers
- Internal `@apify/*`, `@crawlee/*`, `apify-client`, `apify`, `crawlee`,
`got-scraping` whitelisted
- `only-allow pnpm` preinstall hook blocks npm/yarn usage
- Shared `apify/workflows/pnpm-install@main` action in all CI workflows

### v4-specific adaptations
- Node matrix `[22, 24]` (v4 already dropped 18, 20)
- oxlint/oxfmt (not eslint/biome)
- publish dist-tag `v4` instead of `next`
- No eslint-config pin needed
- Internal deps: `workspace:*` for regular deps, `workspace:^` for
peerDeps

### Incidentally fixed
Four packages were missing explicit `@crawlee/*` deps that yarn's
hoisting masked:
- `@crawlee/http` → `@crawlee/core`
- `@crawlee/linkedom` → `@crawlee/utils`
- `@crawlee/playwright` → `@crawlee/basic`
- `@crawlee/puppeteer` → `@crawlee/core`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
janbuchar (Contributor) commented

Are there any significant differences from #3581?

B4nan (Member, Author) commented Apr 20, 2026

Maybe some small details. I guess we can close this one and keep yarn on master until we merge v4 into it.
