RFC: introduce `split_at(mid: usize)` on `str` #1123
cc @Manishearth you requested something like this in rust-lang/rust#18063.
[Edit: I'm wrong here]
@Manishearth It doesn't loop at all. It uses byte indices, so each slice operation is constant time (just check that the index is in bounds and that the byte is the start of a UTF-8 code point, then adjust the start/length of the slice).
[Edit: I'm wrong here] String indexing indexes by chars IIRC, and for that you need to loop through because chars have variable length.
String indexing is by bytes and checks that the index lies on a code point boundary (which is cheap: just check that the byte is not a UTF-8 continuation byte, i.e. that it is below 0x80 or at least 0xC0). This is consistent with almost all other methods, including
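The boundary check mentioned in this exchange can be sketched roughly as follows. This is an illustration only, not the standard library's actual code, and `on_char_boundary` is a hypothetical name. A byte begins a code point exactly when it is not a continuation byte (bit pattern `10xxxxxx`), which covers ASCII bytes below 0x80 as well as the lead bytes of multi-byte sequences.

```rust
/// Hypothetical helper (not the standard library's code): reports whether
/// byte offset `index` lies on a UTF-8 code point boundary of `s`.
fn on_char_boundary(s: &str, index: usize) -> bool {
    let bytes = s.as_bytes();
    if index == bytes.len() {
        // The end of the string always counts as a boundary.
        return true;
    }
    match bytes.get(index) {
        // Continuation bytes look like 10xxxxxx (0x80..=0xBF);
        // any other byte starts a code point.
        Some(&b) => (b & 0xC0) != 0x80,
        // Offsets past the end are never boundaries.
        None => false,
    }
}
```

The check inspects a single byte, so it costs the same regardless of how long the string is. The standard method `str::is_char_boundary` answers the same question.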
Ah, okay.
The method has only a single index that needs to be bounds checked and checked for character boundary (
This sounds like a great idea to me, thanks for writing this up @bluss!
alexcrichton added the T-libs label May 18, 2015
Since this doesn't return
I am of the belief that now that the trains are running, this kind of change no longer
I would expect adding any new public API to require an RFC. Twiddling the internals is just implementation that's hidden, but something that will eventually be marked
@retep998 Done! @seanmonstar This was what I thought was the rule, and I agree with you.
Is there any chance
The panicking logic is the same as for slicing. You should only ever use it with a byte offset you got from somewhere, so you know it's always correct (for example from I absolutely don't want to introduce an inconsistency. Part of the motivation here is that you expect .split_at to be available, as it is on slices. If we go the option route, it needs a new name.
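A minimal sketch of the "byte offset you got from somewhere" pattern, assuming the offset comes from `str::find` (one standard way to obtain a valid offset; the comment above does not say which source it had in mind). `split_before` is a hypothetical helper name used only for illustration.

```rust
/// Hypothetical helper: splits `s` just before the first occurrence of
/// `needle`, or returns `None` when `needle` does not occur.
fn split_before<'a>(s: &'a str, needle: &str) -> Option<(&'a str, &'a str)> {
    // `find` returns the byte offset where the match starts. That offset is
    // always on a code point boundary, so `split_at` cannot panic here.
    let mid = s.find(needle)?;
    Some(s.split_at(mid))
}
```

For example, `split_before("naïve=yes", "=")` yields `Some(("naïve", "=yes"))` even though `ï` occupies two bytes, because the offset was produced by the string API rather than guessed.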
I don't think it's necessarily a contract violation (because strings and vectors can grow and shrink) but yeah, maybe I just want an additional checked version (but I thought our new policy on this was that we shouldn't have both a panicking and non-panicking version of an API?).
Since we don't have any optional-value slicing, I think we should add Additionally or alternatively, we should stabilize
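For readers wondering what a checked variant might look like, here is one possible shape built from `str::is_char_boundary` and `split_at`. The name `try_split_at` is hypothetical and is not a name proposed in this thread.

```rust
/// Hypothetical non-panicking counterpart to `split_at`: returns `None`
/// when `mid` is out of bounds or falls inside a code point.
fn try_split_at(s: &str, mid: usize) -> Option<(&str, &str)> {
    // `is_char_boundary` is false for offsets past the end of the string,
    // so one test covers both the bounds check and the boundary check.
    if s.is_char_boundary(mid) {
        Some(s.split_at(mid))
    } else {
        None
    }
}
```

The caller decides how to handle a bad offset instead of the method panicking, which is the trade-off being debated above.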
bluss pushed a commit to bluss/rust that referenced this pull request May 27, 2015
bluss pushed a commit to bluss/rust that referenced this pull request May 27, 2015
alexcrichton added the final-comment-period label Jun 2, 2015
This RFC is now entering the week-long final comment period. The library subteam may not require an RFC for a change such as this in the future, but we have yet to concretely decide on the guidelines for what needs an RFC (for discussion see this thread). While we develop this policy, however, we're going to go ahead with this RFC.
This seems fine to add. No strong feelings, though.
My only comment is that I question the unicode hygiene of this function (i.e. naive calls to this probably won't be unicode-correct?). However, this is consistent with
What do you mean by unicode-correct? Like all other slicing and indexing, it respects code point boundaries, i.e., it will panic rather than cut a code point in half. It won't try to make you deal with grapheme clusters or anything, but none of the existing methods do that (except for
I mean that it is very easy to split a string to give nonsense output, e.g. at the most basic level, splitting the two-codepoint version of
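To make the concern concrete, here is a small sketch. The character is an assumption chosen for illustration (the comment's own example did not survive): "é" written as two code points, U+0065 followed by the combining acute accent U+0301.

```rust
/// Splits a decomposed "é" (U+0065 followed by U+0301) after its first
/// code point. The choice of character is an assumption for illustration.
fn split_decomposed() -> (&'static str, &'static str) {
    let s = "e\u{301}";
    // Byte offset 1 is a valid code point boundary, so this does not panic,
    // yet it separates the base letter from its combining accent.
    s.split_at(1)
}
```

The call succeeds and returns a plain `e` plus a lone combining mark. Both halves are valid UTF-8, but neither represents the character a reader would see.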
But all other string manipulation[1] already has that problem. It's a long-standing policy to not try and prevent that sort of error. In addition, it's not like this API can't be used "properly": you just have to select the index at which you split properly. Which, again, is exactly the same situation as with the rest of the string API.
[1] With the aforementioned exception, which BTW moved out of libstd now.
Yes, I covered those aspects in my comment. It's why I'm not against this landing. :)
Gankro self-assigned this Jun 5, 2015
bluss pushed a commit to bluss/rust that referenced this pull request Jun 9, 2015
The consensus of the libs subteam is that we should merge this RFC, so I'm going to merge it. Thanks again for the RFC @bluss!
bluss commented May 17, 2015
Introduce the method `split_at(&self, mid: usize) -> (&str, &str)` on `str`, to divide a slice into two, just like we can with `[T]`.
Rendered version
Adding `split_at` is a measure to provide a method from `[T]` in a version that makes sense for `str`.
Once used to `[T]`, users might even expect that `split_at` is present on str.
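A short sketch of the parallel the RFC draws, with `split_at` called on a slice and on a `str`. The function name `split_both` and the sample values are illustrative assumptions.

```rust
/// Calls `split_at` on a slice and on a string slice to show the
/// matching shape of the two methods. Values are illustrative only.
fn split_both() -> ((&'static [i32], &'static [i32]), (&'static str, &'static str)) {
    let numbers: &'static [i32] = &[1, 2, 3, 4, 5];
    let text = "hello world";
    // On `[T]`, `mid` counts elements; on `str`, `mid` is a byte offset
    // that must fall on a code point boundary.
    (numbers.split_at(2), text.split_at(5))
}
```

Here the slice splits into `[1, 2]` and `[3, 4, 5]`, and the string splits into `"hello"` and `" world"`.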