
Web: custom string interning #142

Closed
jkelleyrtp opened this issue Jan 21, 2022 · 8 comments
Comments

@jkelleyrtp
Member

On the web, we use wasm-bindgen's intern feature to cache strings at the JS/Rust boundary. However, this approach forces us to come up with a list of strings at compile time. Instead, we should implement our own interning feature that uses a bloom filter and a configurable number of cached elements, or some better heuristic (such as string length). The time to cache/uncache is significant in benchmarks, and it's unnatural to tell users to pre-populate the string cache if they want good performance.
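A runtime heuristic of the kind described above could be sketched in plain Rust. The `should_intern` function, its thresholds, and the hit count are all illustrative assumptions, not Dioxus or wasm-bindgen APIs:

```rust
// Hypothetical sketch: decide at runtime whether a string is worth interning,
// using a length + repetition heuristic instead of a compile-time list.
// The 32-byte cutoff and 2-hit threshold are made-up tuning values.
fn should_intern(s: &str, hits: u32) -> bool {
    // Short strings (tag names, attribute names, class lists) repeat often
    // and are cheap to keep; long user text rarely repeats and would bloat
    // the cache, so it is passed through unconditionally.
    s.len() <= 32 && hits >= 2
}

fn main() {
    assert!(should_intern("div", 3));          // short and repeated: cache it
    assert!(!should_intern("div", 1));         // not yet repeated enough
    assert!(!should_intern(&"x".repeat(100), 10)); // long user text: skip
}
```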

@dranikpg

This is an interesting issue to research. It's important to define 'good' performance and the goals ahead of time: are we optimizing the average case or the worst case?

I've traced the example projects and some of my own pet projects. Almost all good candidates for interning (by frequency of appearance) were already part of the built-in predefined cache. This is generally because applications have little repetitive content (text and class names). I assume that a smart automatic interning algorithm would reduce overall performance.

However, the rules above don't apply to applications with large lists, tables, and highly repetitive content. Given that we'll be tracing DomEdits, we can tell tags and attributes apart from user text and values. The former are probably already interned; the latter require more analysis. This is an opportunity to improve worst-case performance (no predictions about the average case), so the main goal would be high adaptiveness and flexibility: for example, detecting that we're in the middle (or, better, at the start) of rendering a large table with repetitive text and styles.

web-sys already looks up every string in a hashmap if string interning is enabled (if only we could supply it with pre-calculated hashes 😢).

A highly adaptive approach would be a sliding window that counts the most frequent strings among the last X requests. As soon as a string appears more than 2 or 3 times, it should be interned, and then easily replaced once it is no longer in the window. A bloom filter is great for tracking large amounts of data in a compact manner, but it doesn't let you detect rapid changes or delete elements. I'll experiment before giving any detailed results.
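The sliding-window idea can be sketched as a small Rust structure. Everything here (the `SlidingWindow` name, the `observe` method, the capacity and threshold values) is an illustrative assumption, not existing Dioxus code:

```rust
use std::collections::{HashMap, VecDeque};

/// Sketch: count occurrences of each string among the last `capacity`
/// lookups and flag a string as intern-worthy once it has appeared
/// `threshold` times inside the window.
struct SlidingWindow {
    capacity: usize,
    threshold: u32,
    window: VecDeque<String>,
    counts: HashMap<String, u32>,
}

impl SlidingWindow {
    fn new(capacity: usize, threshold: u32) -> Self {
        Self { capacity, threshold, window: VecDeque::new(), counts: HashMap::new() }
    }

    /// Record a lookup; returns true when the string should now be interned.
    fn observe(&mut self, s: &str) -> bool {
        if self.window.len() == self.capacity {
            // Evict the oldest entry so counts reflect only recent traffic --
            // this is the "easily replaced" property a bloom filter lacks.
            if let Some(old) = self.window.pop_front() {
                if let Some(c) = self.counts.get_mut(&old) {
                    *c -= 1;
                    if *c == 0 {
                        self.counts.remove(&old);
                    }
                }
            }
        }
        self.window.push_back(s.to_owned());
        let c = self.counts.entry(s.to_owned()).or_insert(0);
        *c += 1;
        *c >= self.threshold
    }
}

fn main() {
    let mut w = SlidingWindow::new(4, 3);
    assert!(!w.observe("row"));
    assert!(!w.observe("row"));
    assert!(w.observe("row")); // third hit inside the window: intern it
    w.observe("a");
    w.observe("b");
    w.observe("c");
    w.observe("d");
    assert!(!w.observe("row")); // "row" has since slid out of the window
}
```

A production version would avoid allocating a `String` per lookup (e.g. by keying on hashes), but the eviction behavior is the point of the sketch.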

@jkelleyrtp
Member Author

> This is an interesting issue to research. It's important to define 'good' performance and the goals ahead of time: are we optimizing the average case or the worst case?
>
> I've traced the example projects and some of my own pet projects. Almost all good candidates for interning (by frequency of appearance) were already part of the built-in predefined cache. This is generally because applications have little repetitive content (text and class names). I assume that a smart automatic interning algorithm would reduce overall performance.
>
> However, the rules above don't apply to applications with large lists, tables, and highly repetitive content. Given that we'll be tracing DomEdits, we can tell tags and attributes apart from user text and values. The former are probably already interned; the latter require more analysis. This is an opportunity to improve worst-case performance (no predictions about the average case), so the main goal would be high adaptiveness and flexibility: for example, detecting that we're in the middle (or, better, at the start) of rendering a large table with repetitive text and styles.
>
> web-sys already looks up every string in a hashmap if string interning is enabled (if only we could supply it with pre-calculated hashes 😢).
>
> A highly adaptive approach would be a sliding window that counts the most frequent strings among the last X requests. As soon as a string appears more than 2 or 3 times, it should be interned, and then easily replaced once it is no longer in the window. A bloom filter is great for tracking large amounts of data in a compact manner, but it doesn't let you detect rapid changes or delete elements. I'll experiment before giving any detailed results.

It's a very exciting problem! We can do a lot of work to bridge the gap between Rust and JS frameworks until WASI types are supported and UTF-8 JS strings are accepted naturally.

Our DomEdits are a tad naive in recognizing lists, which is why template support would be interesting.

I would love to see some prototypes of different heuristics. It might be worth wiring up the few benchmarks we have into CI/CD to get a sense of performance on every push and plot that over time.

Pre-hashing, bloom filters, tries, and other techniques would be very interesting to explore and I'd love to see any prototypes you might come up with. Hopefully the string cache code is simple enough to root around in.

@jkelleyrtp
Member Author

This is a somewhat solved problem with templates now. We have a fixed list of strings and then the templates are always made once.

jkelleyrtp added a commit that referenced this issue Jun 29, 2023
[documention update] add mandatory hint to the config documentation
@gdennie

gdennie commented Aug 28, 2023

Any reason why you didn't simply create or use a UTF-16 crate such as utf16string?

Granted, the non-web platforms use UTF-8, but could that difference be resolved via a trait or type alias?

@ealmloff
Member

ealmloff commented Aug 28, 2023

> Any reason why you didn't simply create or use a UTF-16 crate such as utf16string?
>
> Granted, the non-web platforms use UTF-8, but could that difference be resolved via a trait or type alias?

Converting from a UTF-16 string to a JavaScript string doesn't appear to be faster: https://jsbench.me/ptllv2ek5d/1. Sledgehammer-bindgen helps with string conversion.

@gdennie

gdennie commented Aug 29, 2023

Somewhat terse response... I presume you are saying that converting between UTF-8 and UTF-16 within JavaScript is sufficiently performant. That may be the case, and my suggestion might amount to premature optimization; nonetheless, avoiding the need for conversion in the first place seems inherently more performant. 😀

It would also be a great demonstration of Rust's fantastic trait-based abstraction mechanism.

In any event, you know best. I am really a novice. Thanks for your attention 🙂.

@ealmloff
Member

ealmloff commented Aug 29, 2023

> Somewhat terse response... I presume you are saying that converting between UTF-8 and UTF-16 within JavaScript is sufficiently performant. That may be the case, and my suggestion might amount to premature optimization; nonetheless, avoiding the need for conversion in the first place seems inherently more performant. 😀

Sorry, I was trying to get something out between meetings before I forgot; I should have read it again before sending. I hadn't tried that before, and I meant that it doesn't appear to make things faster. It seemed promising, so I created a benchmark. I would have expected converting a UTF-16 array to a JavaScript string to be faster because you don't have to change encodings, but if you run the benchmark in my last message, it seems to be slower instead. I'm not entirely sure why; maybe the UTF-16 version is just copying more memory?

Part of this is due to the kinds of string operations we do. I suspect that concatenating two strings already in JavaScript would be faster than concatenating them in Rust and then converting the result to a JavaScript string, but most of Dioxus' code requires the strings to start in Rust, so it is difficult to avoid the conversion.

We do a few things to make string conversion faster:

  1. We dynamically cache the most recently used strings for elements, attributes, etc. by hashing the pointer (as opposed to a prebaked list like this PR adds)
  2. We combine all strings on the Rust side into one long string and then call decode only once on the JavaScript side. This is the main thing that sledgehammer-bindgen does faster
  3. We use heuristics to choose a string decoding algorithm
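Item 1 (caching by pointer rather than by contents) can be illustrated with a minimal sketch. The `InternCache` name, the `u32` ids, and the API shape are all assumptions for illustration, not the actual sledgehammer-bindgen code:

```rust
use std::collections::HashMap;

/// Sketch: map `&'static str` values to interned ids keyed by the string's
/// address, so repeated lookups of the same static string never rehash its
/// contents. This only works because a `&'static str` has a fixed address.
struct InternCache {
    by_ptr: HashMap<usize, u32>,
    next_id: u32,
}

impl InternCache {
    fn new() -> Self {
        Self { by_ptr: HashMap::new(), next_id: 0 }
    }

    /// Return a stable id for this static string, allocating a new id on
    /// first sight. Hashing a `usize` pointer is much cheaper than hashing
    /// the string's bytes.
    fn intern(&mut self, s: &'static str) -> u32 {
        let key = s.as_ptr() as usize;
        if let Some(&id) = self.by_ptr.get(&key) {
            return id;
        }
        let id = self.next_id;
        self.next_id += 1;
        self.by_ptr.insert(key, id);
        id
    }
}

fn main() {
    let mut cache = InternCache::new();
    let div: &'static str = "div";
    let id1 = cache.intern(div);
    let id2 = cache.intern(div); // same pointer: cache hit, same id
    assert_eq!(id1, id2);
    let class: &'static str = "class";
    assert_ne!(cache.intern(class), id1); // different string, different id
}
```

Note the caveat: two distinct string values with equal contents may get different ids under this scheme, which is acceptable for a cache (it only costs a redundant entry) but would be wrong for interning that promises content-based deduplication.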

@gdennie

gdennie commented Aug 29, 2023

Perhaps I was naive in thinking that the JavaScript runtime could accept the byte array as a proper reference to a UTF-16 string without decoding or other gymnastics. If that were possible and it didn't perform any subsequent reallocation, then I would expect better performance. Of course, in the general case we would be creating JavaScript types in the interface buffer and perhaps bumping the WASM specification, something that's out of scope.

Thanks again for your attention.
