Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update primordials #9076

Closed
wants to merge 1 commit into from
Closed

Update primordials #9076

wants to merge 1 commit into from

Conversation

jdalton
Copy link
Contributor

@jdalton jdalton commented Feb 23, 2024

What does this PR do?

In a step to improving our interoperability with Node on fronts like node:http we needed to update our primordials (a lower level module used by Node lib modules.

side by side comparison

The image above is comparing Node v20.11.1, Bun main branch, and Bun jdalton/primordials. I created a JS file that mimics Node's lib/internal/per_context/primordials.js module.

benchmark init time ("a") and 1,000x iterations of each entry
console.time("a");
//
// The code of lib/internal/per_context/primordials.js
//
console.timeEnd("a");

console.time("Set");
for (let i = 0; i < 1000; i++) {
  new Set([1, 2, 3, 4]);
}
console.timeEnd("Set");

console.time("Map");
for (let i = 0; i < 1000; i++) {
  new Map([
    [1, 2],
    [3, 4],
  ]);
}
console.timeEnd("Map");

console.time("SafeMap");
for (let i = 0; i < 1000; i++) {
  new primordials.SafeMap([
    [1, 2],
    [3, 4],
  ]);
}
console.timeEnd("SafeMap");

console.time("SafeSet");
for (let i = 0; i < 1000; i++) {
  new primordials.SafeSet([1, 2, 3, 4]);
}
console.timeEnd("SafeSet");

console.time("makeSafe");
for (let i = 0; i < 1000; i++) {
  primordials.makeSafe(Map, class extends Map {});
}
console.timeEnd("makeSafe");

console.time("SafeWeakMap");
for (let i = 0; i < 1000; i++) {
  new primordials.SafeWeakMap([
    [{}, {}],
    [{}, {}],
  ]);
}
console.timeEnd("SafeWeakMap");

console.time("SafeWeakSet");
for (let i = 0; i < 1000; i++) {
  new primordials.SafeWeakSet([{}, {}, {}, {}]);
}
console.timeEnd("SafeWeakSet");

console.time("SafeWeakRef");
for (let i = 0; i < 1000; i++) {
  new primordials.SafeWeakRef({});
}
console.timeEnd("SafeWeakRef");

console.time("hardenRegExp");
for (let i = 0; i < 1000; i++) {
  primordials.hardenRegExp(/a/);
}
console.timeEnd("hardenRegExp");

console.log(Reflect.ownKeys(primordials).length);
With the primordials PR we are now creating more primordials (better Node interop), are doing so faster than Node, and on par with Bun's existing code path. I did this by optimizing the creation of the primordials in the following areas:
  • unrolling generic iteration
  • special pathing uncurrying
  • leveraging WebKit JSC @ helpers
  • avoiding known slow paths in JS-land
    (like Object.defineProperty, Object.defineProperties, Object.getOwnPropertyDescriptor, and friends).

Methods like the internal primordials.makeSafe (that is used nowhere else in Node's code) were removed in favor of less generic, more specific approaches.

I also curated the list of primordials (documenting which ones have been skipped) because Node added a bunch of junk-entries for things like SomeMethodName, SomeMethodLength, and so on.

Copy link

github-actions bot commented Feb 23, 2024

@jdalton 1 files with test failures on bun-darwin-aarch64:

View test output

#0d2c82977188dc721840303bdf449b31d6ed19a3

Copy link

github-actions bot commented Feb 23, 2024

@jdalton 1 files with test failures on linux-x64:

View test output

#0d2c82977188dc721840303bdf449b31d6ed19a3

Copy link

github-actions bot commented Feb 23, 2024

Copy link

github-actions bot commented Feb 23, 2024

✅ test failures on bun-darwin-x64 have been resolved.

#0d2c82977188dc721840303bdf449b31d6ed19a3

Copy link

github-actions bot commented Feb 23, 2024

❌🪟 @jdalton, there are 35 test regressions on Windows x86_64

  • test\bundler\bundler_edgecase.test.ts
  • test\cli\hot\hot.test.ts
  • test\cli\run\require-cache.test.ts
  • test\cli\run\transpiler-cache.test.ts
  • test\js\bun\dns\resolve-dns.test.ts
  • test\js\bun\http\fetch-file-upload.test.ts
  • test\js\bun\shell\bunshell-file.test.ts
  • test\js\bun\shell\shelloutput.test.ts
  • test\js\bun\shell\throw.test.ts
  • test\js\bun\http\bun-server.test.ts
  • test\js\node\crypto\crypto.key-objects.test.ts
  • test\js\node\dns\node-dns.test.js
  • test\js\node\env-windows.test.ts
  • test\js\node\process\process-args.test.js
  • test\js\node\process\process.test.js
  • test\js\node\util\node-inspect-tests\parallel\util-inspect.test.js
  • test\js\node\worker_threads\worker_threads.test.ts
  • test\js\third_party\body-parser\express-body-parser-test.test.ts
  • test\js\third_party\es-module-lexer\es-module-lexer.test.ts
  • test\js\third_party\socket.io\socket.io-close.test.ts
  • test\js\third_party\socket.io\socket.io-connection-state-recovery.test.ts
  • test\js\third_party\socket.io\socket.io-handshake.test.ts
  • test\js\third_party\socket.io\socket.io-messaging-many.test.ts
  • test\js\third_party\socket.io\socket.io-middleware.test.ts
  • test\js\third_party\socket.io\socket.io-namespaces.test.ts
  • test\js\third_party\socket.io\socket.io-server-attachment.test.ts
  • test\js\third_party\socket.io\socket.io-socket-middleware.test.ts
  • test\js\third_party\socket.io\socket.io-socket-timeout.test.ts
  • test\js\third_party\socket.io\socket.io-utility-methods.test.ts
  • test\js\third_party\socket.io\socket.io.test.ts
  • test\js\web\fetch\body-stream.test.ts
  • test\js\web\timers\setTimeout.test.js
  • test\js\web\websocket\websocket.test.js
  • test\regression\issue\08095.test.ts
  • test\js\web\workers\worker.test.ts

Full Test Output

@jdalton

This comment was marked as resolved.

@paperdave
Copy link
Collaborator

I think the approach of primordials.js is a bad idea in general, which is why jhmasters prior pr was merged with this comment added:

// TODO: Use native code and JSC intrinsics for everything in this file.
// Do not use this file for new code, many things here will be slow especailly when intrinsics for these operations is available.
// It is primarily used for `internal/util`

Even if it is faster than before and faster than node (which is a nice thing to see), I'm of the opinion that none of these methods should be used if possible. If this is merged, I'd like a warning comment of this nature to be

They also aren't loaded before user code (i imagine this file is slow to load), so it doesn't really achieve the safety guarantee that node primoridals do (which is already not that strong considering so much of the node js library is pollutable despite the primordials). And for the things that are safe, these should just be done on a per-case basis using the @ intrinsics, which the builtin bundler will already do for many cases.

@jdalton
Copy link
Contributor Author

jdalton commented Feb 24, 2024

@paperdave

They also aren't loaded before user code (i imagine this file is slow to load), so it doesn't really achieve the safety guarantee that node primoridals do (which is already not that strong considering so much of the node js library is pollutable despite the primordials).

We should definitely load these first to have such guarantees. Primordials via intrinsics or other mechanisms are important. The reason to have things similar to Node is to ease translation/porting for interoperability because meeting developers and the ecosystem where they are is critical for Bun's success and future. When possible I'm having Node's primordials use JSC private methods and our self hosted code (native modules written in JS) that aren't using primordials should definitely be using the JSC private methods when possible (my preference is explicit use over magic transpilation).

Even if it is faster than before and faster than node (which is a nice thing to see), I'm of the opinion that none of these methods should be used if possible. If this is merged, I'd like a warning comment of this nature to be

The benefit of these in easing porting of code outweighs the ugly factor or current perf concerns since Node today is saddled with a slower implementation and we can continue to augment with private JSC methods (I've even found private JSC method @isRegExpObject is 2x faster than our isRegExp from node:util/types).

@Jarred-Sumner
Copy link
Collaborator

Jarred-Sumner commented Feb 25, 2024

Bun’s performance advantage today often comes from technical choices that leverage a deeper understanding & control of system internals. Instead of “what’s the easiest way to make this work given how it works today?”, we lean towards “what’s the most performant way we could make this work reliably?”. Often at overwhelming implementation difficulty (for developers outside of Bun)

The constraints of Node.js/V8 and Bun/JSC are different. Making the interfaces match is very much a goal, but making the implementation match is both a non-goal and in many cases an anti-goal for us to continue to be faster.

If we really need to have primordials like this, then it’d make more sense to add them to the String, Array, etc prototype in JSC directly as private properties than to make these primordials in a separate function copied. This avoids prototype pollution issues correctly without incurring an extra initialization overhead other than the extra properties added. I’m still not convinced we need all the extra properties in most cases.

@jdalton
Copy link
Contributor Author

jdalton commented Feb 26, 2024

@Jarred-Sumner

Bun’s performance advantage today often comes from technical choices that leverage a deeper understanding & control of system internals.

Performance that leads to incorrect results or crashes is not real performance. It's more like a magician’s card trick ✨ . Bun cannot achieve its goal of replacing Node as the ecosystem go-to on perf alone. Devs don’t care what whizbang tech Bun is built on when their stuff doesn’t work.

Instead of “what’s the easiest way to make this work given how it works today?”, we lean towards “what’s the most performant way we could make this work reliably?”. Often at overwhelming implementation difficulty (for developers outside of Bun)

Said another way: Bun is boxing itself in without a lifeline (the community). Implementations (and duplicate implementations) are so complex it's difficult for issues to be root caused, crashes squashed, and issues reliably reproduced – leading to performance that doesn’t work. Bun, with an ever growing surface area for crashes and security exploits, is burdened with opaque tech debt that gets heavier with each release.

The constraints of Node.js/V8 and Bun/JSC are different. Making the interfaces match is very much a goal, but making the implementation match is both a non-goal and in many cases an anti-goal for us to continue to be faster.

Bun is nearly 3 years old and struggling with fundamentals precisely because it didn’t match implementations. Behind many methods, especially those self-hosted in JavaScript, there are non-obvious side-effects and unintentionally established cow-paths. The vast majority of Node APIs are not super interesting or performance critical. They just need to work. Bun can be faster in very big ways while still aligning with implementations of many APIs. Bun is the newbie trying to gain ground in Node’s ecosystem. It doesn’t have the luxury of choice just yet.

If we really need to have primordials like this, then it’d make more sense to add them to the String, Array, etc prototype in JSC directly as private properties than to make these primordials in a separate function copied. This avoids prototype pollution issues correctly without incurring an extra initialization overhead other than the extra properties added.

Yes! This is ideal. However, I have noticed Node leverage some of these in more generic ways (many JS Array and String methods are generic allowing values to be array-like or coerced to strings) which creates obstacles to stumble over (as I ran into even in this PR).

I’m still not convinced we need all the extra properties in most cases.

There is room to prune for sure. We are not at the pruning stage yet.

@Jarred-Sumner
Copy link
Collaborator

Jarred-Sumner commented Feb 26, 2024

Performance that leads to incorrect results or crashes is not real performance. It's more like a magician’s card trick ✨ . Bun cannot achieve its goal of replacing Node as the ecosystem go-to on perf alone. Devs don’t care what whizbang tech Bun is built on when their stuff doesn’t work.

100%

Bun is nearly 3 years old and struggling with fundamentals precisely because it didn’t match implementations

IMO, that's not why - it's because I based the implementations of node:http, node:fs, etc on the docs/on the documented behavior instead of the implemented behavior and didn't copy Node's test suite to verify it works correctly - instead manually writing tests which were (and are) not nearly comprehensive enough. Copy-pasting implementations is one way to make it work, but it is not the only way

Yes! This is ideal. However, I have noticed Node leverage some of these in more generic ways (many JS Array and String methods are generic allowing values to be array-like or coerced to strings) which creates obstacles to stumble over (as I ran into even in this PR).

Makes sense, we can skip extra wrappers and skip worrying about userland overwriting builtins when they're private properties

@jdalton
Copy link
Contributor Author

jdalton commented Feb 26, 2024

@Jarred-Sumner

Copy-pasting implementations is one way to make it work, but it is not the only way

Yes, though not matching implementation due to basing things off of docs (something devs are notoriously lax on, me included) or otherwise is still not matching implementations 🙃 🏃‍♂️ 🙈 .

I've found every time I've tried to get clever or tweak implementations a bit it's been far too easy for something to fall through the cracks (side effects of self-hosted JS implementations and its forgiving APIs are a bummer). Copy-pasta / less-creative-direct-porting is not exciting but does give confidence that the code will just work, makes for updating as Node changes easier (I end up thinking less about the why it does something and more about the steps), and more easily gets Node's own tests to pass.

Makes sense, we can skip extra wrappers and skip worrying about userland overwriting builtins when they're private properties

I think this will be the next PR 😸 (the private methods are so nice, there is some caution around the ones WebKit provides as they seem to add and remove them at will)

@jdalton
Copy link
Contributor Author

jdalton commented Mar 22, 2024

Moved to #9306

@jdalton jdalton closed this Mar 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants