Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split_at family should return an Result #90410

Open
Stargateur opened this issue Oct 29, 2021 · 6 comments
Open

split_at family should return an Result #90410

Stargateur opened this issue Oct 29, 2021 · 6 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@Stargateur
Copy link
Contributor

Stargateur commented Oct 29, 2021

These function:

Are inconvenient to use, one need to check len (or be sure) then call split_at() that will also check len, first this can lead to bad asm #74938 & #30112. Then it's inelegant. This would be ok if these function would be unsafe, so not checking the len twice.

These function are better cause there return an Option:

I propose we add the equivalent of split_at returning a Result

impl slice {
  fn split_at_mid(&self, mid: usize) -> Result<(&[T], &[T]), usize>;
  fn split_at_mid_mut(&mut self, mid: usize) -> Result<(&mut [T], &mut [T]), usize>;
}

impl str {
  fn split_at_mid(&self, mid: usize) -> Result<(&str, &str), usize>;
  fn split_at_mid_mut(&mut self, mid: usize) -> Result<(&mut str, &mut str), usize>;
}

The Err be mid or len (but we could imagine more precise error for example str could indicate char error boundary) or we keep it simple with Option

BTW: We miss split_first() and split_last() on str.

@sanmai-NL
Copy link

Can you comment on the consistency of this pattern with the rest of the standard library API related to slices?

@Stargateur
Copy link
Contributor Author

Stargateur commented Nov 2, 2021

Unsure I would said the std would prefer return an Option like get():

impl slice {
  fn split_at_mid(&self, mid: usize) -> Option<(&[T], &[T])>;
  fn split_at_mid_mut(&mut self, mid: usize) -> Option<(&mut [T], &mut [T])>;
}

impl str {
  fn split_at_mid(&self, mid: usize) -> Option<(&str, &str)>;
  fn split_at_mid_mut(&mut self, mid: usize) -> Option<(&mut str, &mut str)>;
}

But return an Result with the mid usize allow to have nice code like split_at_mid(42).map_err(MyErrorEnum::OutOfBound) where MyErrorEnum::OutOfBound(usize). If not the user would need to add the context by hand and this allow mistake and add code at each call of split_at_mid() for the user. And if user want to ignore the value it's very easy split_at_mid(42).ok().

Return a Result doesn't really increase the cost cause most of the time it will return Ok() (so same the Some()) and this return Err instead of panicking... that would require a panic unwind... also I expect this is cost free to move the usize to Err(mid)

In short, this is consistent by returning a Option but return an Result make it easier to return Error in user code.

We could also consider having a structure in std for encapsulate the usize, this would allow to implement Error and the user could just do split_at_mid(42).unwrap() that would be the same as split_at(42) that panic. There is already a number of FooError like in std.

@Dylan-DPC Dylan-DPC added C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 11, 2023
@faern
Copy link
Contributor

faern commented May 26, 2023

I would really like to have slice splitting methods that return an Option. It would make byte mucking/parsing a lot more ergonomic in a bunch of cases for me. We are finally getting exactly this, but for splitting at a const index thanks to #111774, but it's still sorely missed where the index is a value only known at runtime.

See my comment #111774 (comment) here about how the const version of the same thing only solved half my problem, and split_at_mid proposed here would solve the rest.

But return an Result with the mid usize allow to have nice code like split_at_mid(42).map_err(MyErrorEnum::OutOfBound)

I really don't think this API should return Results. It would be inconsistent with all other parts of std that are similar. And you can easily get exactly the above from the Option with just split_at_mid(42).ok_or(MyErrorEnum::OutOfBounds(42)).

@rayhem
Copy link

rayhem commented Jun 13, 2023

And you can easily get exactly the above from the Option with just split_at_mid(42).ok_or(MyErrorEnum::OutOfBounds(42)).

Is it worth differentiating an out-of-bounds error from something like an invalid UTF code point in the case of strings?

@Stargateur
Copy link
Contributor Author

Is it worth differentiating an out-of-bounds error from something like an invalid UTF code point in the case of strings?

Why not that why I mention return a Result as a possibility, anyway that doesn't change much it would still be a programing error.

@zackw
Copy link
Contributor

zackw commented Jun 28, 2023

I'd like to place another vote for fallible/non-panicking split_at(_mut) for str in particular. I currently have a FromStr impl with code like this

fn from_str(s: &str) -> Result<Self, Self::Err> {
    if let Some(prefix) = s.get(0..5) {
        if prefix.eq_ignore_ascii_case("blake") {
            if let Some(suffix) = s.get(5..) {
                if suffix == "3" || suffix == "-3" {
                    return Ok(Self::Blake3);
                }
            }
        }
    }
    // ...
}

If I could write this instead, it would eliminate three nested if levels and an impossible branch, while remaining verifiably panic-free. (Looks like the compiler can eliminate the panic if I write a regular split_at guarded by an is_char_boundary condition, but I don't like relying on the optimizer for that sort of thing.)

fn from_str(s: &str) -> Result<Self, Self::Err> {
    if let Some((prefix, suffix)) = s.try_split_at(5) {
        if prefix.eq_ignore_ascii_case("blake") && (suffix == "3" || suffix == "-3") {
              return Ok(Self::Blake3);
        }
    }
    // ...
}

I wrote the code above assuming try_split_at returns an Option, because that's what str::get does, but I'm indifferent to Option vs Result here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants