Lookups into HashMap<Atom, ...> have to create needless Atoms #25699
Comments
|
If there's not already a way to do it, I'd suggest something like |
|
There is no way to determine if there's a string in the atom cache, as far as I can tell. The question is also not quite as clear to answer because of the presence of inline atoms, as shown in https://github.com/servo/string-cache/blob/af2c7707e797768660b3db90066b80218dbca6f7/src/atom.rs#L171-L205. |
|
I see. Very short strings get packed verbatim instead of converted to hashes, so there's no actual cache data structure including them. There's no static allocation there, so always creating those short atoms isn't a problem the way creating long ones could be. I could see optimizing with a method that returns Some(atom) if it's either inline or in-cache and None if it's neither, but I can't think of what to name that method or how to document its semantics in a way that isn't exposing too much internal state. |
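To make the idea concrete, here is a minimal sketch of such a method on a toy model, not on string-cache itself. Everything in it is hypothetical and chosen for illustration: `ToyAtom`, `ToyCache`, `get_existing`, `lookup`, and the 7-byte inline threshold. It only assumes that short strings can always be turned into an atom for free, while long strings need a cache entry.

```rust
use std::collections::{HashMap, HashSet};

/// Longest string stored inline (hypothetical threshold for this toy model).
const MAX_INLINE_LEN: usize = 7;

/// Toy atom: short strings are packed inline, longer ones must be interned.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum ToyAtom {
    Inline(String),
    Dynamic(String),
}

/// Toy stand-in for the atom cache; only long strings are recorded here.
#[derive(Default)]
pub struct ToyCache {
    interned: HashSet<String>,
}

impl ToyCache {
    /// Interns `s`, adding a cache entry when it is too long to inline.
    pub fn intern(&mut self, s: &str) -> ToyAtom {
        if s.len() <= MAX_INLINE_LEN {
            ToyAtom::Inline(s.to_owned())
        } else {
            self.interned.insert(s.to_owned());
            ToyAtom::Dynamic(s.to_owned())
        }
    }

    /// Returns an atom only if producing it needs no new cache entry:
    /// `Some` for inline or already-interned strings, `None` otherwise.
    pub fn get_existing(&self, s: &str) -> Option<ToyAtom> {
        if s.len() <= MAX_INLINE_LEN {
            Some(ToyAtom::Inline(s.to_owned()))
        } else if self.interned.contains(s) {
            Some(ToyAtom::Dynamic(s.to_owned()))
        } else {
            None
        }
    }

    /// Number of entries currently held by the cache.
    pub fn len(&self) -> usize {
        self.interned.len()
    }
}

/// Map lookup that never grows the cache: an unknown long string
/// cannot be a key, so the lookup is skipped entirely.
pub fn lookup<'a, V>(
    cache: &ToyCache,
    map: &'a HashMap<ToyAtom, V>,
    key: &str,
) -> Option<&'a V> {
    let atom = cache.get_existing(key)?;
    map.get(&atom)
}
```

The caller only ever sees `Some` or `None`, so whether a given string was inline or cached stays an internal detail of the toy cache.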
|
A way to encapsulate this idea without exposing which strings are inlined could be to make string-cache itself define a struct like |
|
We could add (The implementation would be the same as These semantics are weak, but sufficient to skip a lookup in However this feels very much like a micro-optimization. Do the creation and destruction of dynamic atoms show up while profiling? |
|
I doubt it; what I'm thinking about is that someone might run a script in a loop that looks up a large number of strings, and then we're carrying those interned atoms around in RAM forever. I don't disagree with the papercut label. |
|
Dynamic atoms are reference counted, so they won’t stay in RAM forever unless something keeps using them. |
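As a rough illustration of why reference counting keeps the cache from growing forever, here is a toy interner built on `Rc` and `Weak` from the standard library. It is not how string-cache is implemented; `WeakInterner` and its methods are hypothetical names, and this sketch prunes dead entries lazily when asked rather than at the moment the last handle is dropped.

```rust
use std::collections::HashMap;
use std::rc::{Rc, Weak};

/// Toy interner holding only weak references to its strings.
#[derive(Default)]
pub struct WeakInterner {
    entries: HashMap<String, Weak<str>>,
}

impl WeakInterner {
    /// Returns a shared handle for `s`, reusing a live entry when one exists.
    pub fn intern(&mut self, s: &str) -> Rc<str> {
        if let Some(live) = self.entries.get(s).and_then(|weak| weak.upgrade()) {
            return live;
        }
        let handle: Rc<str> = Rc::from(s);
        self.entries.insert(s.to_owned(), Rc::downgrade(&handle));
        handle
    }

    /// Drops entries whose last strong handle is gone, then counts the rest.
    pub fn live_entries(&mut self) -> usize {
        self.entries.retain(|_, weak| weak.strong_count() > 0);
        self.entries.len()
    }
}
```

In this model, a string interned only for a failed lookup stops counting as live as soon as the temporary handle is dropped, so the cost is the churn of creating and destroying it rather than a permanent leak.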
There's a comment pointing out this matter in servo/components/script/dom/stylepropertymapreadonly.rs (lines 64 to 77 in 5f55cd5), but it's true in most or all places we have a HashMap with Atom keys.
What we should do instead is first ask "Is this string in the atom cache?", since if it's not then there's obviously no way for it to be a key of the HashMap.
Does string_cache already support asking that question?