Lazy transcoding of DOM strings #1880
Comments
Perhaps we’ll want different strategies for different types of strings: content of a text node, element name, attribute name, attribute value, …

This may impact #1879 in the choice of the encoding for the input to the HTML tokenizer.

Yep, that's why I'm thinking about it right now :) Will send an email regarding this today.

Dropping a link to Ms2ger's UCS-2 string experiments: Ms2ger/servo@mozilla:master...Ms2ger:strings

cc @bzbarsky

This is part of my current plan for strings in html5ever.

Is this issue talking about something that's still relevant, or is it all stuff that was later settled by the decision to just use UTF-8 all the time?

Given that SpiderMonkey is adding more and more APIs that accept UTF-8, I think we can close this.

DOM APIs and SpiderMonkey require us to use UCS-2 strings, but the vast majority of content comes off the wire in UTF-8 or another ASCII-compatible encoding. We can save memory and potentially time by storing a DOM string as UTF-8 and converting it to UCS-2 when first touched by script.
Not sure how this would interact with interning.
See #282, #1701.
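
For illustration, the idea described above (store UTF-8, transcode to 16-bit code units on first access from script) could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: `LazyDomString` and its methods are hypothetical names made up for this example, not Servo's actual `DOMString` or any SpiderMonkey API. It also assumes the input is valid UTF-8, so lone surrogates (which a UCS-2 string may contain) cannot be represented, and interning is not addressed.

```rust
use std::cell::OnceCell;

/// Hypothetical string type for illustration only (not Servo's DOMString).
/// The UTF-8 bytes are always stored; the 16-bit form is built on demand.
pub struct LazyDomString {
    /// Content as it came off the wire (assumed already valid UTF-8).
    utf8: String,
    /// Cache of 16-bit code units, filled the first time script asks for it.
    utf16: OnceCell<Vec<u16>>,
}

impl LazyDomString {
    /// Wrap UTF-8 content without doing any transcoding work.
    pub fn from_utf8(s: &str) -> Self {
        LazyDomString {
            utf8: s.to_owned(),
            utf16: OnceCell::new(),
        }
    }

    /// Cheap access for consumers that are happy with UTF-8.
    pub fn as_str(&self) -> &str {
        &self.utf8
    }

    /// Whether the 16-bit copy has been materialized yet.
    pub fn is_transcoded(&self) -> bool {
        self.utf16.get().is_some()
    }

    /// Access for script: transcodes once, then reuses the cached buffer.
    /// Characters outside the BMP become surrogate pairs.
    pub fn as_utf16(&self) -> &[u16] {
        self.utf16
            .get_or_init(|| self.utf8.encode_utf16().collect())
            .as_slice()
    }
}
```

In this sketch a string that script never touches costs only its UTF-8 bytes, while a string that script does touch ends up holding both copies. Whether that trade-off saves memory overall depends on how many strings script actually reads.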