
Tracking issue: Performance improvement #73

Closed
naruaway opened this issue Aug 7, 2023 · 24 comments

@naruaway
Contributor

naruaway commented Aug 7, 2023

This is a tracking issue to discuss & improve the performance of Valibot.
I created a benchmark suite for Valibot so that we can track its runtime performance in many situations, since existing benchmarks failed to capture real-world schema and data diversity.

The latest results can be checked at https://naruaway.github.io/valibot-benchmarks/

Currently, we see the following interesting observations:

  • In most cases, Valibot is slower than Zod
  • Valibot performs better in Bun (JavaScriptCore)
  • When the data is invalid, Valibot is much slower than Zod, probably because we rely on exception handling for the failure case

I'll continue improving my benchmark suite and will post any interesting information here.

We already have several performance improvement PRs: https://github.com/fabian-hiller/valibot/pulls?q=is%3Apr+is%3Aopen+label%3Aperformance

I'll post benchmark results for each PR later.

@fabian-hiller
Owner

Thanks for your research and the benchmark suite. This week I will focus exclusively on performance for a few days and probably release a new version with performance improvements.

@BastiDood recently checked the performance of .forEach vs. for...of, and we have merged the first changes with PR #68. Let's use this issue to share more insights and make Valibot maximally fast. 🔥

@fabian-hiller fabian-hiller self-assigned this Aug 7, 2023
@fabian-hiller fabian-hiller added the enhancement, priority, and performance labels Aug 7, 2023
@fabian-hiller
Owner

I expect to start my performance research on Wednesday or Thursday and then share my findings with you here. I plan to test validation for the success and failure case under normal and extreme conditions across different libraries. For Valibot, the most important thing is to find the weaknesses in the code that are currently affecting performance. Thank you @naruaway for the weaknesses you have already found.

@naruaway
Contributor Author

naruaway commented Aug 8, 2023

Hi @fabian-hiller, I updated https://github.com/naruaway/valibot-benchmarks (actually, a lot :) ), and now it can easily compare different versions of Valibot as well, not only Zod. For example, here is an excerpt showing the performance of #46 🎉:

[benchmark chart: reduce-spread]

Also, I confirmed that on my MacBook, after my changes, Valibot is faster than Zod except for "failure" data. I plan to set up GitHub Actions to also run it on Linux and Windows.

If you have a chance, could you check out the README of https://github.com/naruaway/valibot-benchmarks? Also, if you have additional test cases, it might be nice to manage them in the same place. All the schemas and data put in https://github.com/naruaway/valibot-benchmarks/tree/main/test_data are automatically picked up by the benchmarker.

My next plan is to finish up my PRs with evidence gathered using this tool. Then I'll also run the benchmark against recently merged PRs to detect potential regressions. Note that this tool can also report the bundle size for each schema, since each test case is first bundled separately by webpack.

@fabian-hiller
Owner

Thank you very much. I have clicked through a bit. Great work! I think it's great that your suite automatically takes different runtimes into account.

The idea of running benchmarks on PRs to notice the impact of changes is cool as well. Maybe in the long run we'll also integrate some benchmarks into the website.

@fabian-hiller
Owner

I just want to let you know that I will probably not be able to start my performance research until the weekend, because I am currently looking at Zod, ArkType and Typia as part of my bachelor thesis.

@zkulbeda
Contributor

zkulbeda commented Aug 12, 2023

Vitest has a built-in benchmarking method (link), but it's still experimental (by the way, type testing is experimental too, and it's already used in Valibot). Can we use this method to make performance testing easier to get started with? To start, we could copy tests from typescript-runtime-type-benchmarks to compare performance with Zod in pull requests. This would also allow us to run benchmarks in CI alongside the tests and standardize the CPU and architecture of the platform they run on. I'm afraid this leads to adding Zod and other libs to devDependencies, though.

@naruaway
Contributor Author

Oh, Vitest's benchmarking is interesting! FYI, I am prototyping my benchmark suite to make it potentially more precise (e.g., it bundles each lib using webpack's production mode with a minifier, and each run launches a specific engine separately (Node.js/Bun/Deno, and I want to add quickjs-wasm)) and to test extreme cases (e.g., this case is useful to reveal the slowdown effect of throwing errors, since try-catch happens many times for a "wide" schema).

Anyway, I think we can integrate a simpler benchmark suite (the Vitest benchmark seems to be a good choice!) into the main Valibot repo to prevent obvious regressions.

@zkulbeda
Contributor

zkulbeda commented Aug 12, 2023

Yeah, I was exploring your suite, and my first thought was that it's quite difficult to use for pull requests without publishing to npm; otherwise you need to copy the lib before and after the changes manually. I think your work is best suited to the Valibot website for visual demonstration, because GitHub Markdown doesn't have bar charts. Also, I don't know if Vitest supports engines such as Deno and Bun.
I see the usage of the Vitest benchmark in CI on pull requests with GitHub Actions as follows:

  • checkout the repo
  • run vitest bench --reporter json --outputFile /tmp/after.json
  • checkout the main branch
  • run vitest bench --reporter json --outputFile /tmp/before.json
  • run a script via actions/github-script to read and compare the results and format a results table
  • leave a sticky message in the pull request via marocchino/sticky-pull-request-comment

@zkulbeda
Contributor

zkulbeda commented Aug 13, 2023

@fabian-hiller, if you don't mind the Vitest benchmark and including it in a GitHub workflow, I can make a PR.

@fabian-hiller
Owner

Yes, feel free to make a draft PR.

@kurtextrem
Contributor

Two small thoughts, although I haven't benchmarked anything yet:

  • for (..; ++i) is usually faster than for..of
  • The array that is needed to fix the prototype pollution issue has a length of 3, as far as I can tell. What happens if we use 3 conditions instead (i.e., [a,b,c].includes(d) -> d === a || d === b || d === c)?
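The proposed transformation can be sketched as follows. The constant name `BLOCKED_KEYS` is taken from the discussion below; the key list itself is an assumption for illustration, not necessarily Valibot's exact guard:

```typescript
// Hypothetical key list mirroring the prototype-pollution guard.
const BLOCKED_KEYS = ['__proto__', 'prototype', 'constructor'];

// Array-based check, as currently described.
function isBlockedIncludes(key: string): boolean {
  return BLOCKED_KEYS.includes(key);
}

// Inlined comparison chain, the proposed alternative.
function isBlockedInlined(key: string): boolean {
  return key === '__proto__' || key === 'prototype' || key === 'constructor';
}

// Both variants must agree for every key.
for (const key of ['__proto__', 'prototype', 'constructor', 'name']) {
  console.log(key, isBlockedIncludes(key) === isBlockedInlined(key)); // always true
}
```

Whether the inlined chain is measurably faster would have to be confirmed in the benchmark suite.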

@BastiDood
Contributor

for (..; ++i) is usually faster than for..of

When I implemented the refactor from forEach to for..of in #68, I refrained from applying this transformation because I feared that the (presumed) performance boost came at the expense of too much readability. Unless the benchmarks show that the improvements are truly significant across the board, the poor readability of C-style for-loops makes me reluctant to proceed.

The array that is needed to fix the prototype pollution issue has a length of 3, as far as I can tell? What happens if we make it 3 conditions instead?

I think this is a worthwhile exploration. I'd be quite surprised if the JIT compiler doesn't already apply this optimization over time. Nevertheless, the BLOCKED_KEYS are currently stored in a global variable. I can imagine that its globality may make it more difficult for the V8 engine to optimize its const-ness, which leads me to believe that there may be some hidden performance wins with your proposal.

However, this is less relevant when Valibot is already bundled into a single file. Modern bundlers are quite aggressive with inlining and hoisting imports, which makes the aforementioned issues with globality a non-issue. I'd be interested to know if the three-condition version would be better than the bundled version anyway.

@fabian-hiller
Owner

We have now reworked the entire library to optimize performance. The changes are in the main branch. The only bottleneck that still exists is new ValiError(...) in the safeParse method. I have deprecated it so that we can remove it in the future.

Feel free to test your ideas with the code in the main branch and create an issue or PR if you find performance improvements. for vs. for..of made almost no difference in my test environment.

@ssalbdivad

@fabian-hiller After ArkType's alpha release, I wanted to spend some time learning about the performance characteristics of other validators. ArkType was similar to Zod at that point, and both, I think, took a reasonably performant approach, yet somehow, according to the benchmarks, validators like Typia and TypeBox were hundreds of times faster.

Any discussion of validator performance should centrally address the one aspect of the implementation that accounts for the vast majority of that difference: whether the validator precompiles its constraints.

During the build process, or at runtime via the Function constructor, a list of conditions parsed from some input can be transformed into JS source like typeof data.name === "string" && typeof data.age === "number", removing index accesses and loops from the validator logic and allowing V8 to heavily optimize the checks in a way that would otherwise be impossible.
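A minimal sketch of that technique, assuming a simplified schema shape; the helper name `compileValidator` is hypothetical and this is not ArkType's actual API:

```typescript
// Simplified schema: maps property names to typeof-checkable primitives.
type Shape = Record<string, 'string' | 'number' | 'boolean'>;

// Compile the shape into a single flat expression via the Function constructor,
// so the generated validator contains no loops or schema lookups.
function compileValidator(shape: Shape): (data: any) => boolean {
  const checks = Object.entries(shape)
    .map(([key, type]) => `typeof data[${JSON.stringify(key)}] === ${JSON.stringify(type)}`)
    .join(' && ');
  // Generates e.g.: return typeof data["name"] === "string" && typeof data["age"] === "number"
  return new Function('data', `return ${checks};`) as (data: any) => boolean;
}

const isUser = compileValidator({ name: 'string', age: 'number' });
console.log(isUser({ name: 'Jane', age: 30 }));   // true
console.log(isUser({ name: 'Jane', age: '30' })); // false
```

Note that using the Function constructor is incompatible with strict Content Security Policies, which is one of the trade-offs of this approach.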

Because of those advantages, ArkType's upcoming release is precompiled by default via the Function constructor and will offer a precompilation process that can be integrated with the build (also offers a path toward optimal bundle size).

However, though being 400x faster than Zod sounds impressive, for the vast majority of use cases, whether validation takes a couple of microseconds or a couple of nanoseconds really won't matter. I think incremental steps like this to improve performance are valuable, but hopefully it's helpful to know that if highly-optimized performance is a top priority, finding a way to leverage V8's JIT optimizations makes everything else look like a rounding error.

Great work again on Valibot. If you have any questions about ArkType or the ecosystem in general, I'm happy to talk anytime 😊

@fabian-hiller
Owner

Hi @ssalbdivad, thank you for your message. I would be happy to exchange ideas. Feel free to add me, fabianhiller, on Discord. ArkType is part of my bachelor's thesis.

I also looked at other solutions like Typia, and it is true that a compiler can generate code that a "normal" library can't match. However, the compilation step also brings disadvantages.

My goal for Valibot is to find a good compromise between API design, bundle size, and performance. As soon as that is done, we will see further.

@ssalbdivad

My goal for Valibot is to find a good compromise between API design, bundle size, and performance. As soon as that is done, we will see further.

Yes! Sorry if it wasn't clear, my comment wasn't meant to say this is what Valibot should do. If anything, I'd say that outside of the marketing value, optimizing validation performance beyond the basics is rarely a good investment. I just wanted to make sure you were aware of this and not comparing the current performance to solutions that do something totally different 👍

@jussisaurio
Contributor

jussisaurio commented Aug 14, 2023

Edit: looks like I forked this repo before you changed your approach to error handling. Hah!

Hey, not sure what the approach for the upcoming versions is going to be, but I have an experimental branch that seems to improve Valibot's performance quite a bit compared to 0.12:

% npm run start run valibot

> typescript-runtime-type-benchmarks@1.0.0 start
> ts-node index.ts run valibot

Removing previous results
Executing "valibot"
Loading "valibot"
Running "parseSafe" suite...
Progress: 100%

  valibot:
    268 446 ops/s, ±2.77%   | fastest

Finished 1 case!
Running "parseStrict" suite...
Progress: 100%

  valibot:
    299 202 ops/s, ±2.37%   | fastest

Finished 1 case!

# linking valibot version from local disk
% npm link valibot 

changed 1 package, and audited 920 packages in 905ms

% npm run start run valibot

> typescript-runtime-type-benchmarks@1.0.0 start
> ts-node index.ts run valibot

Removing previous results
Executing "valibot"
Loading "valibot"
Running "parseSafe" suite...
Progress: 100%

  valibot:
    1 864 412 ops/s, ±0.14%   | fastest

Finished 1 case!
Running "parseStrict" suite...
Progress: 100%

  valibot:
    1 618 466 ops/s, ±0.20%   | fastest

Finished 1 case!

The basic idea is not to throw errors but to return a discriminated union of ok + value or notOk + issues; the branch is here:
https://github.com/jussisaurio/valibot/tree/no-throw

Unfortunately, I'm just about to go on a holiday trip, so I won't be able to continue this for a while, but this might be an interesting approach to pursue. I got the idea because similar performance work was done by @milankinen on Zod a good while ago.

I also haven't had time yet to verify that the validators are actually doing what they're supposed to do, since all the tests are broken.

@FlorianDevPhynix
Contributor

DX of ok() and notOk seems nice

@fabian-hiller
Owner

Thanks @jussisaurio for your research. Yes, I implemented almost the same thing yesterday. For v0.13.0, the goal is maximum performance, and for v0.14.0, to reduce the bundle size and make the code prettier. So I am happy to get more feedback.

@fabian-hiller
Owner

@naruaway can you check in your benchmark suite whether it makes a difference to write issues.push(...result.issues) everywhere instead of a for loop? After we have achieved optimal performance, I would like to look into reducing the bundle size again.

Example with current for loop implementation:

for (const issue of result.issues) {
  issues.push(issue);
}
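For reference, the spread variant under discussion replaces the loop with a single call; the sample data here is illustrative, not Valibot's actual issue objects:

```typescript
// Collected issues so far (sample data for illustration).
const issues: string[] = ['existing issue'];
const result = { issues: ['issue A', 'issue B'] };

// Spread variant: one call instead of a loop; shorter source,
// but spreads the array into individual arguments at the call site.
issues.push(...result.issues);

console.log(issues); // ['existing issue', 'issue A', 'issue B']
```

Note that spreading very large arrays into push can hit engine argument-count limits, which is one reason the loop form is sometimes preferred despite being longer.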

@naruaway
Contributor Author

naruaway commented Aug 20, 2023

@fabian-hiller Sure, I ran the benchmark suite for the following commits on my MacBook Air (M2 chip):

Benchmark results
nodejs, deep, valid
  before-stop-throwing : 365050 ops/s (fastest)
  use-spread-op        : 355486 ops/s
  stop-throwing        : 355328 ops/s
  latest-main          : 354613 ops/s

nodejs, many_features, invalid
  use-spread-op        : 301071 ops/s (fastest)
  stop-throwing        : 299901 ops/s
  latest-main          : 299818 ops/s
  before-stop-throwing : 18259 ops/s

nodejs, many_features, valid
  use-spread-op        : 663490 ops/s (fastest)
  latest-main          : 662570 ops/s
  stop-throwing        : 662049 ops/s
  before-stop-throwing : 108781 ops/s

nodejs, optional_nullable, valid
  use-spread-op        : 408952 ops/s (fastest)
  stop-throwing        : 405393 ops/s
  latest-main          : 403448 ops/s
  before-stop-throwing : 388210 ops/s

nodejs, wide, invalid
  stop-throwing        : 183672 ops/s (fastest)
  latest-main          : 183637 ops/s
  use-spread-op        : 179024 ops/s
  before-stop-throwing : 4198 ops/s

nodejs, wide, valid
  before-stop-throwing : 243478 ops/s (fastest)
  latest-main          : 237870 ops/s
  use-spread-op        : 237556 ops/s
  stop-throwing        : 236885 ops/s

bun, deep, valid
  use-spread-op        : 431137 ops/s (fastest)
  latest-main          : 429077 ops/s
  stop-throwing        : 428984 ops/s
  before-stop-throwing : 404460 ops/s

bun, many_features, invalid
  latest-main          : 1221688 ops/s (fastest)
  stop-throwing        : 1220351 ops/s
  use-spread-op        : 1129409 ops/s
  before-stop-throwing : 89970 ops/s

bun, many_features, valid
  latest-main          : 800226 ops/s (fastest)
  use-spread-op        : 799809 ops/s
  stop-throwing        : 798960 ops/s
  before-stop-throwing : 425766 ops/s

bun, optional_nullable, valid
  latest-main          : 459893 ops/s (fastest)
  stop-throwing        : 457241 ops/s
  use-spread-op        : 454673 ops/s
  before-stop-throwing : 448791 ops/s

bun, wide, invalid
  stop-throwing        : 360316 ops/s (fastest)
  latest-main          : 359866 ops/s
  use-spread-op        : 300189 ops/s
  before-stop-throwing : 23619 ops/s

bun, wide, valid
  stop-throwing        : 316240 ops/s (fastest)
  latest-main          : 315906 ops/s
  use-spread-op        : 314496 ops/s
  before-stop-throwing : 300856 ops/s

In the above result, we see the following:

  • The wide, invalid case is probably the most interesting one in this context, since it uses a really "wide" schema, and for "invalid" data it would throw and catch many times. stop-throwing made Valibot 4275% faster on Node.js and 1425% faster on Bun. use-spread-op does not slow Valibot down much, especially on Node.js. Compared to the huge impact of "not throwing", I would say its cost is negligible, so it is probably better to optimize for bundle size and code readability.
  • (Note that the many_features, valid case seems to be broken (I mean, the data seems actually invalid, since before-stop-throwing is extremely slow); I'll fix it later, but it should not change any conclusion.) UPDATE: I cannot find a bug in my test case. I just realized that e6c53c8 can actually change the perf of the "valid" case as well, since there are some changes like lazy init of issues.

@fabian-hiller
Owner

Thank you for the data! I will think about it.

@fabian-hiller
Owner

fabian-hiller commented Aug 22, 2023

I have made further performance improvements based on PR #104. @naruaway, feel free to test the latest commits in your benchmark suite.

@fabian-hiller
Owner

Valibot v0.13.0 with extreme performance improvements is now available! Thank you for your contributions!
