String representation and internment #282
Comments
|
Netsurf uses libwapcaplet for string internment. Having a common representation is certainly desirable. I think the best option is probably to put something in the core/std Rust library. |
|
We talked a little about this today on the conference call. Rust has no general-purpose string interner, though rustc itself does have a string interner that is not multithreaded. I agree that string internment is important enough that if we can develop a general solution it should live in Rust's libextra. |
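For readers unfamiliar with the technique, here is a minimal sketch of a single-threaded string interner. It is not rustc's interner, the code from the gists linked in this thread, or libwapcaplet's API; the names `Symbol`, `Interner`, `intern`, and `resolve` are hypothetical and chosen only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical handle for an interned string. Comparing two handles is an
/// integer comparison rather than a byte-by-byte string comparison.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Symbol(u32);

/// Illustrative single-threaded interner (an assumption, not an existing API).
#[derive(Default)]
pub struct Interner {
    // Maps string contents to the handle already assigned to them.
    map: HashMap<String, Symbol>,
    // Maps a handle's index back to the string it stands for.
    strings: Vec<String>,
}

impl Interner {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns the existing handle for `s`, or stores `s` and assigns a new one.
    pub fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.map.get(s) {
            return sym;
        }
        let sym = Symbol(self.strings.len() as u32);
        self.strings.push(s.to_owned());
        self.map.insert(s.to_owned(), sym);
        sym
    }

    /// Looks up the string behind a handle produced by this interner.
    pub fn resolve(&self, sym: Symbol) -> &str {
        &self.strings[sym.0 as usize]
    }
}
```

Interning `"div"` twice yields equal handles, while `"span"` yields a different one. This sketch stores each string twice for simplicity and takes `&mut self`, so sharing it across threads would need a lock or a different design.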
|
Hello Servo people, As part of my cleanup of rustc's libsyntax I'm rewriting the interner. I'd like the Interner trait to be generically usable; would you mind looking over my current work? https://gist.github.com/cmr/5846269 In particular I think |
|
During the meeting we concluded that we're not quite ready to give specific feedback. |
|
graydon posted this over on the rust interner bug: http://people.mozilla.org/%7Egraydon/interner.rs |
|
Adam Barth just posted a breakdown on strings in Blink. |
|
I like Blink’s design, but does SpiderMonkey have a hard requirement on UTF-16 buffers? |
|
All SpiderMonkey strings are UTF-16, full stop. |
|
Long term, SpiderMonkey could evolve, but you cannot remove 16-bit unsigned integers from the platform. Because of JavaScript's ever-growing presence, lone surrogates are everywhere. This means that Servo will need 16-bit unsigned integers as its string representation, and if Rust and Servo want to share libraries such as URL, Encoding, HTML, CSS, and JavaScript, they should use the same string type. If the Rust community is uncomfortable with strings that can contain lone surrogates, a wrapper could be provided that guarantees the value space is the same as that of UTF-8 (i.e. only Unicode scalar values). |
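The wrapper idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `Utf16String` and `ScalarString` are hypothetical names, not types from Rust, Servo, or SpiderMonkey. The inner type holds arbitrary 16-bit code units (so lone surrogates are allowed), and the wrapper can only be constructed when every unit sequence decodes to Unicode scalar values.

```rust
/// Hypothetical string of raw 16-bit code units, which may contain lone
/// surrogates, as a JavaScript engine might hand over.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Utf16String(Vec<u16>);

/// Hypothetical wrapper guaranteeing well-formed UTF-16, i.e. the same value
/// space as UTF-8 (only Unicode scalar values).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ScalarString(Utf16String);

impl Utf16String {
    /// Wraps code units without any validation.
    pub fn from_units(units: Vec<u16>) -> Self {
        Utf16String(units)
    }

    /// Encodes a UTF-8 string slice as UTF-16 code units.
    pub fn from_utf8_str(s: &str) -> Self {
        Utf16String(s.encode_utf16().collect())
    }

    pub fn units(&self) -> &[u16] {
        &self.0
    }
}

impl ScalarString {
    /// Returns `None` if the input contains a lone (unpaired) surrogate.
    pub fn new(s: Utf16String) -> Option<Self> {
        let well_formed = std::char::decode_utf16(s.units().iter().copied())
            .all(|unit| unit.is_ok());
        if well_formed {
            Some(ScalarString(s))
        } else {
            None
        }
    }

    /// Lossless conversion to UTF-8; cannot fail because of the check in `new`.
    pub fn to_utf8(&self) -> String {
        String::from_utf16(self.0.units()).expect("validated at construction")
    }
}
```

With this shape, code that talks to the JavaScript engine would pass `Utf16String` around unchanged, and libraries that require scalar values would accept only `ScalarString`, pushing the validation to the boundary.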
|
Closing as a dupe of #1153 and the others linked here. |
Servo currently has several string representations, some with their own internment strategies: SpiderMonkey, Rust and netsurfcss. We want just one.