Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggest _[x..][..n] instead of _[x..x+n] #2272

Open
llogiq opened this issue Dec 14, 2017 · 8 comments
Open

Suggest _[x..][..n] instead of _[x..x+n] #2272

llogiq opened this issue Dec 14, 2017 · 8 comments

Comments

@llogiq
Copy link
Contributor

llogiq commented Dec 14, 2017

This may well be an allowed lint, but I think the latter way is cool and want more folks to learn about it 🙂

@clarfonthey
Copy link
Contributor

Does this have any benefits besides being a cool trick?

@Manishearth
Copy link
Member

Manishearth commented Dec 14, 2017 via email

@clarfonthey
Copy link
Contributor

clarfonthey commented Dec 14, 2017

Now that you mention it, yeah, that does make sense. Although the real power of this lint IMO is to notice larger expressions and factor those out too, i.e. [x + 1..x + 2] could become [x + 1..][..1] or [x..][1..2].

Although this is basically a fundamental style difference IMO, and any lint should probably support both directions, unless the API guidelines swing one way or the other.

Also, another thing to think about is how much you might want to factor these out. For example, should [a + 1..b] become [a..b][1..] ?

@killercup
Copy link
Member

killercup commented Dec 15, 2017

So, there is a 'trivial pattern' change from [$start..$start+$n] to [$start..][..$n] here, but I find @clarcharr's suggestions even more interesting: [x..][1..2] tells me something different than [x + 1..][..1]. I read the first like "Go to offset x (which might be start of a row in a flat array) and take the 2nd and 3rd element (of that row)", while the second doesn't really convey that meaning.


Also, this is basically the "index syntax" version of x.iter().skip($start).take($n)—do we have a lint that simplifiesx.iter().skip($start).take($n)1 to x[$start..$n]2 iff x is a slice3?

Footnotes

  1. maybe with .cloned(), maybe as consuming iterator

  2. or, with this lint, some other form

  3. or something that implements Index with Range

@teor2345
Copy link
Contributor

Going from [x..x+n] to [x..][..n] behaves differently if x+n would overflow usize.

Which is possible when using:

  • 32-bit code on 64-bit platforms, or
  • Zero-Sized Types

@clarfonthey
Copy link
Contributor

Going from [x..x+n] to [x..][..n] behaves differently if x+n would overflow usize.

Which is possible when using:

* 32-bit code on 64-bit platforms, or

* Zero-Sized Types

The behaviour actually remains the same since internally if they'd overflow a usize you'd be either accessing the array out of bounds (e.g. in the case of a large ZST array), overflowing a usize length, or getting a lopsided range (e.g. x + n < x), all of which would cause a panic in debug mode. In release mode, the refactoring afaik has the chance to panic more often, since overflow usually doesn't panic but invalid slice access always does.

@teor2345
Copy link
Contributor

In release mode, the refactoring afaik has the chance to panic more often, since overflow usually doesn't panic but invalid slice access always does.

This seems like a minor reason to prefer [x..][..n], but it doesn't seem compelling.

@clarfonthey
Copy link
Contributor

clarfonthey commented Apr 23, 2021

Like mentioned earlier in the thread, this idea is mostly driven by aesthetics rather than correctness or performance.

For the sake of fully describing (what I think) would make the most sense, I'll distinguish between the two desired formats as the "start-len" format (buf[a..][..b - a]) and the "start-end" format (buf[a..b]).

Any lint that we define should bail-out if any of the terms used to index are not simple variables or constants, potentially allowing known-pure expressions like buf.len(). We also want the indices to be purely additive, i.e. just a sum (or difference) of these terms, and explicitly acting on primitive integer types; parentheses are okay as long as the result is still additive. Once we've done that, we can define a "complexity" metric as the number of terms on each bound of the slice, added together. We could potentially weight terms like buf.len() higher than others.

We'll define "simplifying" an index as something we need later. This involves a few passes:

  1. Convert the expression into a purely additive one, e.g. replace subtraction with added negated terms. Distribute subtraction into parentheses if needed. Cancel out double-negation.
  2. "Cancel-out" repeats of variables, i.e. delete occurrences of the pair x and -x, where x can be any term.
  3. If the expression contains multiple constants, sum together all constant values to a single value, which we call the sum. If the sum is negative and the first term is a constant, replace the second constant with the sum; if it's not, just replace the first constant term with the sum. In all cases, remove all constants except the sum.
  4. Replace addition of negated terms back with subtraction. The first term may remain negated.

To convert something to the start-len format, we just convert buf[a..b] to buf[a..][..b-a] or buf[..b][..b-a]. To convert something into the start-end format, we just convert buf[a..][..b] into buf[a..a+b] and buf[..b][a..] into buf[b-a..b].

We then simplify the indices to determine what to pick, based upon what has the least number of terms. In the event of a tie, or if the user would like to prefer a specific format regardless of complexity.

Note: this is not necessarily how the final implementation would look like, it's just my idea for a basic algorithm that would accomplish what I mentioned in my comment. YMMV as always, and there are other cases we might wanna consider.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants