
Include failing data in the benchmarks #1105

Open
akutruff opened this issue Aug 1, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@akutruff

akutruff commented Aug 1, 2023

Sorry if I'm misinterpreting the code, but it appears that the benchmarks only measure successful validations. If that's correct, it should be noted that libraries like myzod will completely re-parse all of the data in order to collect errors.

It would be prudent to include a test that collects and returns errors as well.

@moltar
Owner

moltar commented Aug 2, 2023

Sorry if I'm misinterpreting the code, but it appears that the benchmarks only measure successful validations.

That is correct!

If that's correct, it should be noted that libraries like myzod will completely re-parse all of the data in order to collect errors.

Valid point!

How many libraries do you think behave this way?


Just as a general theme, we do try to stick to a "common denominator" across all validator libraries. It is understood that libraries will sometimes have different trade-offs in their designs to improve speed. But in the end, we do need to have some similarities across them to be able to bench.

Something that fails validation is certainly a very common scenario, I think, so probably worth adding it.

Ping @hoeck thoughts?

@akutruff
Author

akutruff commented Aug 2, 2023

How many libraries do you think behave this way?

No idea, but adding this benchmark will certainly put a lens on libs that do.

The sketchy side of this is that error-reporting shortcuts also make validation of correct data faster, so the shortcut manifests twice: it inflates the benchmark numbers for valid data as well.

I was just curious as to why myzod claimed such performance gains.

@akutruff
Author

akutruff commented Aug 2, 2023

Also, there's a subtlety in the semantics of these libraries: some validate the entire object rather than aborting at the first error, so the user can see every error in the object at once. I think there should be three different failure tests:

  • Fail on any error
  • Fail on all properties
  • Fail on all properties AND call any method that gathers ALL the errors and reports them to the user. (For cases with 'or' clauses/unions only 1 error needs to be reported.)
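As a rough illustration of the difference between the first two modes, here is a hand-rolled validator sketch (no specific library's API, names are made up) with an abort-early switch; the third mode would additionally format the collected issues for the user:

```typescript
// Illustrative only: a tiny validator with fail-fast vs. collect-all modes.
type Issue = { path: string; message: string };

function validate(
  data: Record<string, unknown>,
  abortEarly: boolean
): Issue[] {
  const issues: Issue[] = [];
  // Each check: property path, predicate, and the error message on failure.
  const checks: Array<[string, (v: unknown) => boolean, string]> = [
    ["name", (v) => typeof v === "string", "expected string"],
    ["age", (v) => typeof v === "number", "expected number"],
  ];
  for (const [path, ok, message] of checks) {
    if (!ok(data[path])) {
      issues.push({ path, message });
      if (abortEarly) break; // mode 1: stop at the first error
    }
  }
  return issues; // modes 2/3: every per-property issue, ready to report
}
```

The two modes do different amounts of work on invalid data, which is why benchmarking them as if they were equivalent would be misleading.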

In general, benchmarks can be dangerous, because library authors do not want to see their library at the bottom of them. Without covering the full user experience, it's tempting to optimize only the parts that are measured.

@akutruff
Author

akutruff commented Aug 2, 2023

I added a quick invalid-data test, and the results below are as expected: myzod goes from being 2.7x faster than zod for safe parsing of valid data to 2x faster in the failure case. Again, I don't know whether myzod is failing fast or not based on the test I wrote, and I did not actually call any function that does error reporting; I'm not familiar enough with myzod's behavior.

Running "parseSafe" suite...

  myzod:
    4 063 636 ops/s, ±0.39% 

Running "parseSafeInvalidData" suite...

  myzod:
    88 576 ops/s, ±0.65%

Running "parseSafe" suite...

  zod:
    1 511 938 ops/s, ±0.69%

Running "parseSafeInvalidData" suite...

  zod:
    45 169 ops/s, ±2.45%  
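For anyone reproducing this without the repo's benchmark harness, ops/s can be approximated with a plain timing loop. This is only a sketch of the measurement idea, not the harness that produced the numbers above; `fn` stands in for any library's safe-parse call on fixed valid or invalid data:

```typescript
// Rough ops/s estimate: run `fn` in a tight loop for `durationMs`
// and divide the iteration count by the elapsed wall-clock seconds.
// (Real harnesses add warmup and statistical error bounds.)
function opsPerSecond(fn: () => void, durationMs = 200): number {
  let ops = 0;
  const start = Date.now();
  while (Date.now() - start < durationMs) {
    fn();
    ops++;
  }
  return ops / ((Date.now() - start) / 1000);
}
```

Comparing `opsPerSecond(() => parseSafe(validData))` against `opsPerSecond(() => parseSafe(invalidData))` for each library would give a rough version of the two suites above.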

@moltar
Owner

moltar commented Aug 2, 2023

@akutruff Great findings, thanks for sharing. Certainly looks like something we shall add.

Would you be interested in contributing a PR?

@moltar moltar added the enhancement New feature or request label Aug 2, 2023