RFC: std::string::String could provide options about UTF BOM #2428
Comments
Hey there. Thank you for your interest in designing Rust! I think you will find more interest in this issue if you post it over at https://internals.rust-lang.org/ :)
Copy of my comment on rust-lang/rust#50386 (comment), which was about "stripping" the BOM. It’s not clear what is being proposed. Which APIs exactly should strip? And more importantly, why? I tend to think of these standard library APIs as low-level primitives, and feel that BOM removal belongs in a higher-level library that might, for example, also support multiple encodings and detect the presence of a BOM to help pick one. And even then, maybe not always. https://docs.rs/encoding_rs/0.7.2/encoding_rs/struct.Decoder.html#impl has different methods for different use cases, and only some of them remove a BOM.
@H2CO3 @SimonSapin
String shouldn't have magic treatment for any characters. It makes sense for encoding/decoding methods to have options for handling the BOM, such as those @SimonSapin mentioned, but none of that should happen internally to String.
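To illustrate the point that BOM handling can live outside String, here is a minimal sketch of a caller-side helper. The name strip_utf8_bom is hypothetical and not part of std; it relies only on str::strip_prefix, and assumes the caller wants at most one leading U+FEFF removed.

```rust
/// Hypothetical helper (not part of std): returns `s` without a single
/// leading U+FEFF byte order mark, or `s` unchanged if there is none.
pub fn strip_utf8_bom(s: &str) -> &str {
    // `strip_prefix` yields `None` when the prefix is absent,
    // so fall back to the original slice in that case.
    s.strip_prefix('\u{feff}').unwrap_or(s)
}
```

Because the helper borrows and returns a &str, String itself stays free of any special-casing, and callers who want to keep the BOM simply do not call it.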
We may consider adding a method like pub fn from_utf8_with_bom(vec: Vec<u8>) -> Result<String, FromUtf8Error> or even pub fn from_utf8_with_optional_bom(vec: Vec<u8>) -> Result<String, FromUtf8Error>. But, as the comments below say, this shouldn't affect std::string::String itself. Whether a &str or [u8] carries a UTF BOM should be decided by the caller.
P.S. I am not a native English speaker, so what I wrote may differ from my intended meaning.
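For concreteness, the optional-BOM variant could be sketched as a free function outside String. This is only an illustration under assumptions: the function is not in std, the UTF8_BOM constant is introduced here, and "optional" is taken to mean that one leading UTF-8 BOM (bytes EF BB BF) is dropped if present before normal validation.

```rust
use std::string::FromUtf8Error;

/// The UTF-8 encoding of U+FEFF (assumed name, introduced for this sketch).
const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// Hypothetical free function mirroring the signature proposed above.
/// Drops one leading UTF-8 BOM if present, then validates the rest.
pub fn from_utf8_with_optional_bom(mut vec: Vec<u8>) -> Result<String, FromUtf8Error> {
    if vec.starts_with(&UTF8_BOM) {
        // Remove the three BOM bytes in place; the remaining bytes shift down.
        vec.drain(..UTF8_BOM.len());
    }
    // On failure, the error's `into_bytes()` returns the bytes *without* the BOM.
    String::from_utf8(vec)
}
```

The stricter from_utf8_with_bom variant is left out of the sketch: FromUtf8Error has no public constructor, so code outside std could not report a missing BOM with the proposed return type and would need its own error type.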