adapt to real world scenarios #300
BTW, you can add a vanilla JS benchmark as a control group:

```ts
import { Case } from './abstract';

export class VanillaJsCase extends Case implements Case {
  name = 'vanilla-js';

  // Returns the data if every check passes, otherwise undefined.
  validate() {
    const value = this.data;
    if (value == null) return;
    if (!Number.isSafeInteger(value.number)) return;
    if (value.negNumber >= 0) return;
    if (value.maxNumber <= 0) return;
    if (typeof value.string !== 'string') return;
    if (typeof value.longString !== 'string') return;
    if (typeof value.boolean !== 'boolean') return;
    if (value.deeplyNested == null) return;
    if (typeof value.deeplyNested.foo !== 'string') return;
    if (typeof value.deeplyNested.num !== 'number') return;
    if (typeof value.deeplyNested.bool !== 'boolean') return;
    return value;
  }
}
```

You will see it runs 8x to 10x faster than …
Hey @shlomiassaf, This is a great analysis. Thank you for that insight. I knew that some of these libraries use …
Do you have any suggestions on how to fix this? Randomize the data, maybe?
This unfortunately does not provide the type guarding. But I think it is possible to create a type-guarded vanilla JS validation function anyway. There is the new TS assertion-guard functionality that can be useful here. I'll open another issue for this.
Hey @shlomiassaf, Any ideas on the above?
It would also be helpful to separate quick validations (that return true/false) from error-reporting validations. In the case of …
@gigobyte thank you for your input, I do think this is something we can address also.
@gigobyte is right. This is the biggest performance difference. Having such simple checks without error reporting is basically useless. What should I do when quartet returns false? Throw a generic error? Not very practical in real-world code. Also, an interesting fact regarding quartet: as soon as you activate error reporting (by using …

@shlomiassaf, a couple of things need to be clarified, since they are just not true.
It didn't get completely inlined into the …
Other libraries' functions get inlined as well. Not completely, but parts surely are. But inlined or not is not the important reason why quartet & co are so much faster. It's simply that much, much less code runs per validation. Less code means faster execution. It doesn't matter whether it was inlined or not, at least in this case. There are many factors in why code runs fast, and inlining is just one of them. Other important ones are monomorphic function calls, fast object properties, fast type unboxing, fast built-in functions, etc. When the heuristic determines code can be optimized in certain ways, it will be optimized.
Again, it has nothing to do with order or being inlined. Once a function has been inlined, that won't change, and it doesn't matter at what depth the function was called. Once V8 decides a function can be inlined, it will be inlined and stay inlined.
The heuristic that determines whether a function is hot, and thus could possibly be optimized, doesn't work that way. The call stack doesn't matter either. The V8 engine tries to predict how useful it would be to optimize a function by estimating the execution cost of the unoptimized version. Every function might be a candidate, even functions called in a request/response framework (and thus with a bit of delay between each call) or functions called deep in the call stack.
That won't change anything. Quartet stays the fastest, no matter how many requests and call stacks you generate in between. As soon as you execute this function a couple of times, V8 tries to optimize it. It doesn't matter whether there is 1 ms between calls or several seconds, so stretching it out artificially won't change anything.
It makes total sense, because what quartet & co do is generate code for the V8 engine that can be perfectly further optimized by the JIT engine. A JIT engine in the JIT engine. This is incredibly fast and stays faster, no matter how artificially you want to limit its function calls. The drawback, of course, is that the code behind it is much more complicated, and you need a lot more knowledge to build code that can be perfectly optimized by the V8 JIT engine and won't be deoptimized.
@gigobyte Cannot get this to work. Does it require the extra fp-ts package with Either type?
I just removed …

@marcj is there anything actionable I can do to improve this project?
I have an io-ts benchmark here: https://github.com/super-hornet/super-hornet.ts/blob/master/packages/marshal-benchmark/tests/validation2.spec.ts, which is based on their official benchmark. They have already built-in …
Made this change: 5501aa1. Is this good enough?
To everyone involved in this issue: @hoeck put a huge amount of effort into this. Please take a look at the published results and give feedback. If we can consider this done, I'll close the issue. Thanks!
@marcj any feedback on the recent changes? |
Hi,
Nice project, thanks, I've used it for my evaluations.
I've noticed a huge gap between 2 of the libraries and all the others.
This huge gap is probably because of the way the project is running the tests.
The 2 libraries above use a different strategy than all others to create the validators.
While the others mostly use predefined, hard-coded validator functions and create a schema through composition of them, the fastest 2 libraries compile JS code at runtime (`eval()` or `new Function(...)`) to create discrete validation functions that do not call other functions internally (no composition) but instead have all the required validation code within the same function, created specifically for the schema.
For example, Quartet: for the following schema:
It will generate the following validator function:
This has a deep impact on performance depending on how you run your code.
The benchmark code in this project uses 1 schema and iterates over it for a certain period of time. This is perfect for `quartet` because of how V8 works: the function becomes super hot and quickly gets inlined, and additionally, if any internal function call exists within the validator, it gets inlined as well!
In other libraries this cannot happen, because so many functions are called (due to the composition) that most of them stay cold and nothing gets inlined.
In real-world scenarios, such a perfect order does not exist. For example, when handling an incoming request, so many functions are called that by the time we reach the validator, it is no longer hot!
And of course, we also need to factor in handling of multiple incoming requests.
The major advantage of the 2 libraries in question does not carry over to real-world scenarios, so their results are distorted in the benchmark.
In general, such a huge gap does not make sense; otherwise everyone would have used these libraries exclusively.
Thanks again!