-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve encode performance #52
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! I appreciate your thoughtful approach and agree that a third-party library would be undesirable.
Last version had some penalties, on the non-fastpath. Also this implementation tends to be the stable in performance across different javascript engines.
I have done a new version based on the algorithm of html-entities. I would really like to present you a easy oneliner, but it seems to that the loop based version deliver more stable performance across various engines. Complate-Cpp BenchmarksI ran the Benchmarks again with our new algorithms. Loop-Based is the one i have commited.
I have also updated the online benchmark. |
@FND what is your opinion on that? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty solid!
I don't currently have sufficient brains left to fully grok all the details, but hoping to return to that soon. Relying on - and validating - an existing implementation's experience certainly seems like a good choice - especially in the face of an ever-changing engine landscape. Thanks again for your efforts and for sticking with it.
I'm happy to run further performance tests (→ Array#join
) myself and also tweak some minor stylistic preferences (→ bouncer pattern, avoiding const
because I like to buck the trend, that sorta thing) as soon as I get to it.
src/html.js
Outdated
res += str.substring(lastIndex); | ||
} | ||
} else { | ||
res = str; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I prefer the bouncer pattern to avoid excessive nesting, reduce action at a distance etc.
if(!match) {
return str;
}
let rx = attribute ? /[&<>'"]/g : /[&<>]/g;
…
return res;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.. yeah it would better match the nature of the function. If it would be able to speak it would say:
I have seen so many strings in my life, i expect there are no evill in it and then it's no more todo. Just take it.
But when i found something, then i have to do a bit more to protect you.
😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty solid!
Btw. I have to thank. You have created an amazing project (i really love JSX) and spend your time to let me part of it ❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad you find it useful and are willing to contribute.
I should mention that I'm not currently using complate myself and thus might not always be able to provide a timely response. So just to be safe, you might wanna send a quick note (e.g. by creating an issue) before starting any larger efforts. Having said that, there are other people around here who are perfectly capable of providing feedback.
src/html.js
Outdated
let res; | ||
let match = rx.exec(str); | ||
if(match) { | ||
res = ""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It used to be the case that string concatenation was expensive, so people collected strings in an array which was eventually #join
d. We might eke out a little more performance that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have seen so many unexpected results, that it would be always worth a try.
According to The Performance of JavaScript String Concat form 2017, i expect that we are not gaining performance as long as we are not throwing a whole html document in one single #encodeHtml
call.
But it would be explainable that #join
will outperform at some input, when we make three assumptions about an engine:
- Array is fast.
- Strings are stored in an immutable piece of memory.
- Substrings are stored as a non owning reference to the original string.
Then just put the strings together will result in creating a new string, copy old content and deal with the old string which have to be deleted (or marked for it) on each concatination. Where on the #join
strategy the result string will be created only once at the end, where the length of the underlying piece of memory is known.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... when we take the medium post, then might the #generateAttributes
will profit from using string concatination instead of #join
, becuase it's a small amount of short strings. I make a try on that idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... when we take the medium post, then might the #generateAttributes will profit from using string concatination instead of #join, becuase it's a small amount of short strings. I make a try on that idea.
Bad idea from me to enlarge the context. We should stick on our work at #encodeHtml
first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like string concatenation is indeed faster these days: https://jsben.ch/VBdEz - I'm not entirely sure how representative that sample really is, but it seems to be a negligible micro-optimization either way.
naming, bouncer pattern, `let` consistency
As noted earlier, I've tweaked a few details in a5830ea (see commit message); if you're fine with that, I think this is ready to go. |
Your change looks good to me, the names are much more better. I ran all my tests and all have passed. The bechmark shows a little drop on QuickJs (0.4ms @ 100 todos), not noticeable on other engines. But with what we achieved, this seems to me irrelevant. I agree, let's take this nice profit. 👍 |
✅ Squashed as fb7fd6b and released as v0.16.10. I'm surprised there's a difference to the previous version, but that's just me idly wondering whether that's just chance - but yeah, negligible either way. Thank you. |
Hi. I have looked at issue #1 and find out that the current htmlEncode can be improved in performance. First i have tried to include html-entities, which will work but i was not brave enogh to include it, because it bundles a lot of stuff into the views bundle. Also htmlEncode is a public API of the library and i don't wanted to change it's produced encoding, because it might break user tests. So i ended up by creating a "own" implementation again, to speed things up.
Micro-Benchmarks
The speed improvement highly depends on the input and the underlying javascript engine. I have prepared some micro-benchmarks for you, that shows the new version should be faster. Just click to run it, these benchmarks will only run the htmlEncode function with some input.
Complate-cpp Benchmarks
I have also run complate-cpp rendering benchmarks with the current and the optimized implementation. I'm using a Intel i7-12700KF which runs an Ubuntu 20.04 on WSL2. All results are mean values of all iterations, i have taken best of five runs each, rounded to one decimal.
To run the complate-cpp benchmarks by yourself on Linux, assuming you have the dependencies installed on your system.
mkdir build cd build cmake -DBUILD_TESTS=on .. make test/lib/complate-tests QuickJsRendererBenchmark test/lib/complate-tests V8RendererBenchmark