std: Stabilize the std::str module #19741

Merged 2 commits on Dec 23, 2014

alexcrichton
Member

This commit starts out by consolidating all str extension traits into one
StrExt trait to be included in the prelude. This means that
UnicodeStrPrelude, StrPrelude, and StrAllocating have all been merged into
one StrExt exported by the standard library. Some functionality is currently
duplicated with the StrExt present in libcore.

This commit also currently avoids any methods which require any form of pattern
to operate. These functions will be stabilized via a separate RFC.

Next, the stability of methods and structures is as follows:

Stable

  • from_utf8_unchecked
  • CowString - after moving to std::string
  • StrExt::as_bytes
  • StrExt::as_ptr
  • StrExt::bytes/Bytes - also made a struct instead of a typedef
  • StrExt::char_indices/CharIndices - CharOffsets was renamed
  • StrExt::chars/Chars
  • StrExt::is_empty
  • StrExt::len
  • StrExt::lines/Lines
  • StrExt::lines_any/LinesAny
  • StrExt::slice_unchecked
  • StrExt::trim
  • StrExt::trim_left
  • StrExt::trim_right
  • StrExt::words/Words - also made a struct instead of a typedef

Unstable

  • from_utf8 - the return type was changed to a Result, but the error type has
    yet to prove itself
  • from_c_str - this function will be handled by the c_str RFC
  • FromStr - this trait will have an associated error type eventually
  • StrExt::escape_default - needs iterators at least, unsure if it should make
    the cut
  • StrExt::escape_unicode - needs iterators at least, unsure if it should make
    the cut
  • StrExt::slice_chars - this function has yet to prove itself
  • StrExt::slice_shift_char - awaiting conventions about slicing and shifting
  • StrExt::graphemes/Graphemes - this functionality may only be in libunicode
  • StrExt::grapheme_indices/GraphemeIndices - this functionality may only be in
    libunicode
  • StrExt::width - this functionality may only be in libunicode
  • StrExt::utf16_units - this functionality may only be in libunicode
  • StrExt::nfd_chars - this functionality may only be in libunicode
  • StrExt::nfkd_chars - this functionality may only be in libunicode
  • StrExt::nfc_chars - this functionality may only be in libunicode
  • StrExt::nfkc_chars - this functionality may only be in libunicode
  • StrExt::is_char_boundary - naming is uncertain with container conventions
  • StrExt::char_range_at - naming is uncertain with container conventions
  • StrExt::char_range_at_reverse - naming is uncertain with container conventions
  • StrExt::char_at - naming is uncertain with container conventions
  • StrExt::char_at_reverse - naming is uncertain with container conventions
  • StrVector::concat - this functionality may be replaced with iterators, but
    it's not certain at this time
  • StrVector::connect - as with concat, may be deprecated in favor of iterators

Deprecated

  • StrAllocating and UnicodeStrPrelude have been merged into StrExt
  • eq_slice - compiler implementation detail
  • from_str - use the inherent parse() method
  • is_utf8 - call from_utf8 instead
  • replace - call the method instead
  • truncate_utf16_at_nul - this is an implementation detail of windows and does
    not need to be exposed.
  • utf8_char_width - moved to libunicode
  • utf16_items - moved to libunicode
  • is_utf16 - moved to libunicode
  • Utf16Items - moved to libunicode
  • Utf16Item - moved to libunicode
  • Utf16Encoder - moved to libunicode
  • AnyLines - renamed to LinesAny and made a struct
  • SendStr - use CowString<'static> instead
  • str::raw - all functionality is deprecated
  • StrExt::into_string - call to_string() instead
  • StrExt::repeat - use iterators instead
  • StrExt::char_len - use .chars().count() instead
  • StrExt::is_alphanumeric - use .chars().all(..)
  • StrExt::is_whitespace - use .chars().all(..)
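
Several of the replacements named in the list above can be sketched in present-day Rust, which differs from the 2014 library this PR targets. The helper function names below are illustrative only and are not part of any library; only the `std` calls inside them are real.

```rust
// Illustrative helpers (hypothetical names) wrapping the replacements
// suggested in the "Deprecated" list, written against present-day Rust.

/// `from_str` -> the inherent `parse()` method.
fn parse_number(s: &str) -> Option<i32> {
    s.parse().ok()
}

/// `is_utf8` -> call `from_utf8` and inspect the `Result`.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// `char_len` -> `.chars().count()`.
fn char_len(s: &str) -> usize {
    s.chars().count()
}

/// `is_whitespace` -> `.chars().all(..)`.
fn all_whitespace(s: &str) -> bool {
    s.chars().all(|c| c.is_whitespace())
}

fn main() {
    println!("{:?}", parse_number("42"));
    println!("{}", is_valid_utf8(&[0xff, 0xfe]));
    println!("{}", char_len("héllo"));
    println!("{}", all_whitespace(" \t\n"));
}
```

Note that `char_len` counts `char`s, not bytes, so it can differ from `len()` for non-ASCII text.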

Pending deprecation -- while slicing syntax is being worked out, these methods
are all #[unstable]

  • Str - while currently used for generic programming, this trait will be
    replaced with one of [], deref coercions, or a generic conversion trait.
  • StrExt::slice - use slicing syntax instead
  • StrExt::slice_to - use slicing syntax instead
  • StrExt::slice_from - use slicing syntax instead
  • StrExt::lev_distance - deprecated with no replacement
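
As a rough illustration of "use slicing syntax instead", here is how the three slice methods map onto range syntax in present-day Rust (the syntax was still being worked out when this PR was written). The function names are hypothetical; `str::get` is a later addition to `std` and is shown only as the non-panicking variant.

```rust
/// `slice(from, to)` -> `&s[from..to]`
fn middle(s: &str, from: usize, to: usize) -> &str {
    &s[from..to]
}

/// `slice_to(to)` -> `&s[..to]`
fn prefix(s: &str, to: usize) -> &str {
    &s[..to]
}

/// `slice_from(from)` -> `&s[from..]`
fn suffix(s: &str, from: usize) -> &str {
    &s[from..]
}

fn main() {
    let s = "hello world";
    println!("{}", middle(s, 6, 11));
    println!("{}", prefix(s, 5));
    println!("{}", suffix(s, 6));
    // Indices are byte offsets; `[]` panics if a bound is not on a char
    // boundary, while `get` returns `None` instead.
    println!("{:?}", "héllo".get(0..2));
}
```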

Awaiting stabilization due to patterns and/or matching

  • StrExt::contains
  • StrExt::contains_char
  • StrExt::split
  • StrExt::splitn
  • StrExt::split_terminator
  • StrExt::rsplitn
  • StrExt::match_indices
  • StrExt::split_str
  • StrExt::starts_with
  • StrExt::ends_with
  • StrExt::trim_chars
  • StrExt::trim_left_chars
  • StrExt::trim_right_chars
  • StrExt::find
  • StrExt::rfind
  • StrExt::find_str
  • StrExt::subslice_offset
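
For context on what "require any form of pattern" means, the sketch below uses present-day Rust, where methods such as `contains`, `split`, `starts_with` and `find` accept a `char`, a `&str`, or a closure as the pattern. This reflects where the API ended up, not the state of the library at the time of this PR; the helper names are hypothetical.

```rust
/// Splits "k=v; k2=v2" into pairs, using `char` patterns for `split` and `find`.
fn parse_pairs(s: &str) -> Vec<(&str, &str)> {
    s.split(';')
        .filter_map(|pair| {
            let pair = pair.trim();
            let eq = pair.find('=')?;
            Some((&pair[..eq], &pair[eq + 1..]))
        })
        .collect()
}

/// Byte index of the first whitespace character, using a closure as the pattern.
fn first_whitespace(s: &str) -> Option<usize> {
    s.find(|c: char| c.is_whitespace())
}

fn main() {
    let s = "key=value; other=thing";
    println!("{}", s.contains("value")); // `&str` pattern
    println!("{}", s.starts_with('k')); // `char` pattern
    println!("{:?}", first_whitespace(s));
    println!("{:?}", parse_pairs(s));
}
```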

@alexcrichton
Member Author

cc @aturon, I believe this is what we discussed at the work week
cc @Kimundi, I left out all of the methods of your pre-RFC, but you may be interested in this as well

@Gankra
Contributor

Gankra commented Dec 11, 2014

How does this interact with #19612? CC @japaric

@erickt
Contributor

erickt commented Dec 11, 2014

@alexcrichton / @aturon: Have you considered replacing Str::as_bytes() with an implementation of slice::AsSlice for &str? That would allow us to write a function like:

fn foo<T: AsSlice>(x: T) {
    let x = x.as_slice();
    ...
}

foo("foo");
foo([1, 2, 3]);

Otherwise we'd need to make another trait to support this pattern.

@alexcrichton
Member Author

@erickt I think that it leads to ambiguities when you just call .as_slice() (without a trait bound).

impl ::slice::AsSlice<u8> for str {
    fn as_slice(&self) -> &[u8] { self.as_bytes() }
}
/home/alex/code/rust3/src/libcollections/string.rs:1022:18: 1022:28 error: multiple applicable methods in scope [E0034]
/home/alex/code/rust3/src/libcollections/string.rs:1022         (**self).as_slice()
                                                                         ^~~~~~~~~~
/home/alex/code/rust3/src/libcollections/string.rs:1022:18: 1022:28 note: candidate #1 is defined in an impl of the trait `core::slice::AsSlice` for the type `&'a _`
/home/alex/code/rust3/src/libcollections/string.rs:1022         (**self).as_slice()
                                                                         ^~~~~~~~~~
/home/alex/code/rust3/src/libcollections/string.rs:1022:18: 1022:28 note: candidate #2 is defined in an impl of the trait `core::slice::AsSlice` for the type `&'a mut _`
/home/alex/code/rust3/src/libcollections/string.rs:1022         (**self).as_slice()
                                                                         ^~~~~~~~~~
/home/alex/code/rust3/src/libcollections/string.rs:1022:18: 1022:28 note: candidate #3 is defined in an impl of the trait `core::str::Str` for the type `str`
/home/alex/code/rust3/src/libcollections/string.rs:1022         (**self).as_slice()
                                                                         ^~~~~~~~~~
/home/alex/code/rust3/src/libcollections/string.rs:1022:18: 1022:28 note: candidate #4 is defined in an impl of the trait `core::slice::AsSlice` for the type `str`
/home/alex/code/rust3/src/libcollections/string.rs:1022         (**self).as_slice()
                                                                         ^~~~~~~~~~
/home/alex/code/rust3/src/libcollections/string.rs:1022:18: 1022:28 note: candidate #5 is defined in an impl of the trait `core::str::Str` for the type `&'a _`
/home/alex/code/rust3/src/libcollections/string.rs:1022         (**self).as_slice()
                                                                         ^~~~~~~~~~
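
The same kind of ambiguity can be reproduced in miniature in present-day Rust. The two traits below are hypothetical stand-ins (they are not the real `AsSlice` and `Str`); each declares a method with the same name and is implemented for `str`. With both in scope, a plain method call is rejected with E0034, and the call has to name the trait explicitly.

```rust
// Hypothetical traits that both provide a method called `view` for `str`.
trait AsByteSlice {
    fn view(&self) -> &[u8];
}

trait AsStrSlice {
    fn view(&self) -> &str;
}

impl AsByteSlice for str {
    fn view(&self) -> &[u8] {
        self.as_bytes()
    }
}

impl AsStrSlice for str {
    fn view(&self) -> &str {
        self
    }
}

fn byte_view(s: &str) -> &[u8] {
    // `s.view()` would not compile here: both traits are in scope and
    // both apply, so the call is ambiguous (E0034). Naming the trait
    // resolves it.
    <str as AsByteSlice>::view(s)
}

fn str_view(s: &str) -> &str {
    <str as AsStrSlice>::view(s)
}

fn main() {
    println!("{:?}", byte_view("hi"));
    println!("{}", str_view("hi"));
}
```

Defining the impls is fine on its own; the error only appears at a call site that does not say which trait it means.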

@alexcrichton
Member Author

@aturon chatted with me on IRC, and he'll post something about #19612 @gankro

@aturon
Member

aturon commented Dec 11, 2014

See #19612 (comment)

@aturon
Member

aturon commented Dec 11, 2014

re: AsSlice, that trait is probably going away in any case, for several reasons -- amongst others, the [] notation.

@erickt
Contributor

erickt commented Dec 11, 2014

@alexcrichton / @aturon: Yeah, we could do .as_slice(), but I'm thinking about the ergonomics of using APIs, so I think there's room for something here. It's a little sad in, say, rust-mdbm, where I could either have users do:

db.set("foo".as_bytes(), "abc".as_bytes()).unwrap();
db.set("bar".as_bytes(), "def".as_bytes()).unwrap();
db.set("baz".as_bytes(), "ghi".as_bytes()).unwrap();

Or add a wrapper for setting string keys with:

db.set_str("foo", "abc".as_bytes()).unwrap();
db.set_str("bar", "def".as_bytes()).unwrap();
db.set_str("baz", "ghi".as_bytes()).unwrap();

There's a bit of line noise in both approaches. It would be much nicer to have something like AsSlice that lets me write:

db.set("foo", "abc").unwrap();
db.set("bar", "def").unwrap();
db.set("baz", "ghi").unwrap();

I could write a trait for my library to do this, but then people wanting to support AsSlice-like behavior for their types would have to implement essentially the same trait for each library that uses this pattern. It'd be much nicer to have something like this in the standard library.
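
For readers coming to this thread later, here is a minimal sketch of the requested ergonomics using the `AsRef<[u8]>` trait from present-day Rust, which did not exist when this was written. The `Db` type is a hypothetical in-memory stand-in, not the rust-mdbm API, and the empty-key check is an arbitrary rule added only so that the `Result` has an error case.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a byte-oriented key/value store.
#[derive(Default)]
struct Db {
    entries: HashMap<Vec<u8>, Vec<u8>>,
}

impl Db {
    /// Accepts anything that can be viewed as bytes: `&str`, `String`,
    /// `&[u8]`, `Vec<u8>`, byte arrays, and so on.
    fn set<K: AsRef<[u8]>, V: AsRef<[u8]>>(&mut self, key: K, value: V) -> Result<(), String> {
        let key = key.as_ref();
        if key.is_empty() {
            // Arbitrary rule for the sketch, not taken from any real store.
            return Err("empty key".to_string());
        }
        self.entries.insert(key.to_vec(), value.as_ref().to_vec());
        Ok(())
    }

    fn get<K: AsRef<[u8]>>(&self, key: K) -> Option<&[u8]> {
        self.entries.get(key.as_ref()).map(|v| v.as_slice())
    }
}

fn main() {
    let mut db = Db::default();
    db.set("foo", "abc").unwrap();
    db.set(vec![1u8, 2], [3u8, 4]).unwrap();
    println!("{:?}", db.get("foo"));
}
```

Because the bound is a standard-library trait, third-party types only need one impl to work with every library that follows this pattern.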

@@ -649,10 +655,11 @@ impl BorrowFrom<String> for str {

 #[unstable = "trait is unstable"]
 impl ToOwned<String> for str {
-    fn to_owned(&self) -> String { self.into_string() }
+    fn to_owned(&self) -> String { self.to_string() }
Member

@alexcrichton Please implement this as String(self.as_bytes().to_vec()) (you may need to move it to collections/string.rs). Let's avoid degrading the performance of this method.

Member

(or you could use String::from_str(), I think it does the same thing, and doesn't need moving this impl)

Member Author

Good catch! I'll switch it over.

@reem
Contributor

reem commented Dec 12, 2014

I dislike that this leaves to_string, the obvious way to convert a string literal (or really any str) to a String, as a much less efficient method than to_owned, which is much less obvious.

I know that it's more consistent and micro-benchmarks etc., but it feels just silly for the most obvious thing to make a full roundtrip through the formatting infrastructure and a redundant check for valid utf-8, in addition to over-allocating.

This was a wart when the answer was into_string, but since we are now talking about stabilizing this module, we should really consider this situation more critically.
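
To make the spellings under discussion concrete, the sketch below (present-day Rust, hypothetical function name) shows that they all produce the same `String`. They differ in where they come from: `to_string` is provided through the formatting trait `Display`, `to_owned` through `ToOwned`, and `String::from` through `From`. The sketch demonstrates only the equivalence of results; it says nothing about the relative cost described in this comment, which concerns the 2014 implementation.

```rust
/// Converts `s` to an owned `String` three different ways.
fn conversions(s: &str) -> [String; 3] {
    [
        s.to_string(),   // via `ToString`, blanket-implemented for `Display` types
        s.to_owned(),    // via `ToOwned`
        String::from(s), // via `From<&str>`
    ]
}

fn main() {
    let [a, b, c] = conversions("hello");
    println!("{} {} {}", a, b, c);
}
```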

@alexcrichton
Member Author

The question of the efficiency of the formatting subsystem is somewhat orthogonal in my mind because into_string was clearly the wrong name for the API (no consumption was happening) and it has been replaced with to_owned via the std::borrow module. This means that the status quo is basically the same as it is today.

@tbu-
Contributor

tbu- commented Dec 12, 2014

@alexcrichton Except that this PR is stabilizing this status quo.

@alexcrichton
Member Author

Remember that this is deprecating into_string for the exact same functionality provided by to_owned. Stabilizing what we have today does not mean we're not allowed to add more methods in the future.

@thestinger
Contributor

It's not orthogonal. You're causing a severe performance regression. Attention to performance is part of API design, and even if it was an implementation issue it is still a stupid regression.

@erickt
Contributor

erickt commented Dec 13, 2014

@alexcrichton: The conflict you're having with impl AsSlice for &str looks like it's happening because you didn't rename Str::as_slice to, say, Str::as_str. I'm not sure why your implementation is conflicting with the other impls for AsSlice, as this works fine for me:

#![feature(lang_items, macro_rules)]
#![no_std]
#![crate_type = "staticlib"]

extern crate core;

use core::kinds::Sized;

#[lang = "stack_exhausted"] extern fn stack_exhausted() {}
#[lang = "eh_personality"] extern fn eh_personality() {}
#[lang = "panic_fmt"] fn panic_fmt() -> ! { loop {} }

#[unstable = "may merge with other traits"]
pub trait AsSlice<T> for Sized? {
    fn as_slice<'a>(&'a self) -> &'a [T];
}

#[unstable = "trait is unstable"]
impl<T> AsSlice<T> for [T] {
    #[inline(always)]
    fn as_slice<'a>(&'a self) -> &'a [T] { self }
}

impl<'a, T, Sized? U: AsSlice<T>> AsSlice<T> for &'a U {
    #[inline(always)]
    fn as_slice<'a>(&'a self) -> &'a [T] { AsSlice::as_slice(*self) }
}

impl<'a, T, Sized? U: AsSlice<T>> AsSlice<T> for &'a mut U {
    #[inline(always)]
    fn as_slice<'a>(&'a self) -> &'a [T] { AsSlice::as_slice(*self) }
}

impl<'a> AsSlice<u8> for str {
    #[inline(always)]
    fn as_slice<'a>(&'a self) -> &'a [u8] {
        unsafe { core::mem::transmute(self) }
    }
}

@aturon: Will the [] notation be a part of a trait that str could potentially implement? I care more about the functionality than the trait/method :)

@alexcrichton: I'm very sad to see StrAllocating::into_string going away. I missed that when I read through your proposal. I found it to be quite handy for my serde::json::ObjectBuilder, where I can let users do:

// Trigger a copy for me.
let o = ObjectBuilder::new().insert("foo", ...).unwrap();

// Move the string into the `json::Value` enum with no allocation.
let key = String::from_str("foo");
let o = ObjectBuilder::new().insert(key, ...).unwrap();

Since I'm betting most users are going to use ObjectBuilder with static strings, that means users will have to do .insert("foo".to_string(), ...), which feels really noisy after a while. On the other hand, if we try to be less noisy with .insert("foo", ...) and we actually have a key we want to move, we trigger a needless allocation.

I could have two APIs again, .insert(&str)/.insert_string(String), but that leads to duplication and ugliness. Or I guess I could have my own variation of StrAllocating, but then we risk having variations of that trait everywhere, and lots of boilerplate impls that do the same thing for each library.

@erickt
Contributor

erickt commented Dec 13, 2014

@aturon: This might just be your cast trait with a different name, but this variation on BorrowFrom seems like it could satisfy my generic casting needs:

trait BorrowFrom<'a, To> {
    fn borrow_from(&'a self) -> To;
}

impl<'a> BorrowFrom<'a, &'a [u8]> for Vec<u8> {
    fn borrow_from(&'a self) -> &'a [u8] {
        self.as_slice()
    }
}

impl<'a> BorrowFrom<'a, &'a str> for String {
    fn borrow_from(&'a self) -> &'a str {
        self.as_slice()
    }
}

impl<'a> BorrowFrom<'a, &'a [u8]> for String {
    fn borrow_from(&'a self) -> &'a [u8] {
        self.as_bytes()
    }
}

impl<'a> BorrowFrom<'a, &'a [u8]> for &'a str {
    fn borrow_from(&'a self) -> &'a [u8] {
        self.as_bytes()
    }
}

impl<'a, T: BorrowFrom<'a, U>, U> BorrowFrom<'a, U> for &'a T {
    fn borrow_from(&'a self) -> U {
        (**self).borrow_from()
    }
}

#[deriving(Show)]
struct Datum<'a> { data: &'a [u8] }

impl<'a> BorrowFrom<'a, Datum<'a>> for String {
    fn borrow_from(&'a self) -> Datum<'a> {
        self.as_slice().borrow_from()
    }
}

impl<'a> BorrowFrom<'a, Datum<'a>> for &'a str {
    fn borrow_from(&'a self) -> Datum<'a> {
        Datum { data: self.as_bytes() }
    }
}

fn foo_slice<'a, T>(t: &'a T) where T: BorrowFrom<'a, &'a [u8]> {
    let datum: &'a [u8] = t.borrow_from();
    println!("datum: {}", datum);
}

fn foo_str<'a, T>(t: &'a T) where T: BorrowFrom<'a, &'a str> {
    let datum: &'a str = t.borrow_from();
    println!("datum: {}", datum);
}

fn foo_custom<'a, T>(t: &'a T) where T: BorrowFrom<'a, Datum<'a>> {
    let datum: Datum<'a> = t.borrow_from();
    println!("datum: {}", datum);
}

fn main() {
    let s = "hello world".to_string();
    foo_slice(&s);
    foo_str(&s);
    foo_custom(&s);
}

@alexcrichton alexcrichton force-pushed the stabilize-str branch 4 times, most recently from e71d542 to 66925c4 Compare December 14, 2014 02:47
@alexcrichton
Member Author

@erickt Yes I didn't rename to as_str, and the conflict doesn't happen on the impl but rather when you call the method and have many of the traits in scope. The [] notation already works on strings today as well via the ops::Slice trait.

For your use case I know @aturon has also been thinking about a generic set of conversion traits recently to serve a more broad purpose. Having lots of little one-off traits would be unfortunate for all types in the standard library (e.g. why should we not have IntoVec, IntoPath, etc), so the conversion story is something we'd like to revisit in a more general pattern than exists today.

@reem
Contributor

reem commented Dec 14, 2014

@alexcrichton The discomfort (at least for me) is not so much that this basically changes the idiom from into_string to to_owned but that this leaves using to_string to convert to a string as unidiomatic, which is a nasty wart.

@aturon
Member

aturon commented Dec 14, 2014

@alexcrichton @erickt

> For your use case I know @aturon has also been thinking about a generic set of conversion traits recently to serve a more broad purpose. Having lots of little one-off traits would be unfortunate for all types in the standard library (e.g. why should we not have IntoVec, IntoPath, etc), so the conversion story is something we'd like to revisit in a more general pattern than exists today.

Yes, that's right -- for traits whose sole purpose is generic programming over conversions (i.e. providing implicit conversions via overloading), we should be able to replace them with a single set of traits that everyone knows/uses/implements. This should cut down on the problem of people having to know and implement your custom trait to be compatible with your library.

@aturon
Member

aturon commented Dec 14, 2014

@erickt

> @aturon: Will the [] notation be a part of a trait that str could potentially implement? I care more about the functionality than the trait/method :)

Yep. The trait will be Index.
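
As a sketch of what that looks like in present-day Rust (not the 2014 library), a type opts into `[a..b]` notation by implementing `Index` for a range type. `Label` is a hypothetical wrapper used only for illustration.

```rust
use std::ops::{Index, Range};

/// Hypothetical wrapper type, used only to show the trait.
struct Label(String);

impl Index<Range<usize>> for Label {
    type Output = str;

    fn index(&self, range: Range<usize>) -> &str {
        // Delegates to the `Index<Range<usize>>` impl that `String` already has.
        &self.0[range]
    }
}

fn main() {
    let label = Label("hello world".to_string());
    println!("{}", &label[0..5]);
}
```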

> @alexcrichton: I'm very sad to see StrAllocating::into_string going away.

I think that generic conversion traits will serve this role much better, as I mentioned in my previous comment. "Overloading over ownership" is a pattern that's emerging in several APIs (you can see it in the Command API, for example) where a function wants ownership, and you can either give it an owned value directly or a reference to something that can be cloned into something owned. The exact form of this stuff will be the subject of an RFC I plan to get out next week, but the main point is that these changes to str are in favor of a more general replacement.
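
A minimal sketch of "overloading over ownership", assuming the `Into<String>` trait from present-day Rust rather than whatever the RFC mentioned above ends up proposing. The `ObjectBuilder` here is a hypothetical simplification and not the serde API.

```rust
/// Hypothetical builder that wants to own its keys.
#[derive(Default)]
struct ObjectBuilder {
    fields: Vec<(String, i64)>,
}

impl ObjectBuilder {
    fn new() -> Self {
        Self::default()
    }

    /// Takes either an owned `String` (moved in, no new allocation) or a
    /// `&str` (copied into a fresh `String`).
    fn insert<K: Into<String>>(mut self, key: K, value: i64) -> Self {
        self.fields.push((key.into(), value));
        self
    }

    fn unwrap(self) -> Vec<(String, i64)> {
        self.fields
    }
}

fn main() {
    let key = String::from("bar");
    let fields = ObjectBuilder::new()
        .insert("foo", 1) // borrowed: allocates a copy
        .insert(key, 2)   // owned: moved in as-is
        .unwrap();
    println!("{:?}", fields);
}
```

One generic method covers both call styles, so there is no need for a separate `insert_string` or a library-specific conversion trait.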

@aturon
Member

aturon commented Dec 14, 2014

@alexcrichton Ok, I've looked this over and it looks good to me -- just a couple of tiny typos.

r=me once we've resolved the to_string question.

(We'll need to discuss methods like char_range_at more at some point soon.)

bors added a commit that referenced this pull request Dec 22, 2014
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Dec 22, 2014
@bors bors merged commit 082bfde into rust-lang:master Dec 23, 2014
@alexcrichton alexcrichton deleted the stabilize-str branch December 23, 2014 05:18
@brson
Contributor

brson commented Dec 29, 2014

These were breaking.
