Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add clamp RFC #1961

Merged
merged 17 commits into from Sep 6, 2017

Conversation

@Xaeroxe
Copy link
Contributor

Xaeroxe commented Mar 27, 2017

RFC for feature discussed at https://internals.rust-lang.org/t/clamp-function-for-primitive-types/4999

Tracking issue: rust-lang/rust#44095
Implementation PR: rust-lang/rust#44097

@killercup

This comment has been minimized.

Copy link
Member

killercup commented Mar 27, 2017

Rendered

@alexcrichton alexcrichton added the T-libs label Mar 27, 2017

/// Returns the upper bound of the range if input is greater than the range, and the lower bound of
/// the range if input is less than the range. Otherwise this will return input.
#[inline]
pub fn clamp<T: Ord>(input: T, range: RangeInclusive<T>) -> T {

This comment has been minimized.

@kennytm

kennytm Mar 27, 2017

Member

I'm not sure if we should tie the stability of this RFC with #1192 (rust-lang/rust#28237) here by using range: RangeInclusive<T> instead of min: T, max: T.

This comment has been minimized.

@Xaeroxe

Xaeroxe Mar 27, 2017

Author Contributor

After messing around with the playground a bit and doing some thinking I actually agree with you. I much prefer the RangeInclusive syntax but the instability of it is too much of a drawback.

@kornelski

This comment has been minimized.

Copy link
Contributor

kornelski commented Mar 28, 2017

My two concerns:

  • for floats semantics of NaN handling should allow implementation with the pair of maxsd & minsd.
  • I use clamp very often in conjunction with converting to an integer type, e.g. clamp(0,255) as u8. Should there be saturating_into() instead or in addition to clamp? In Rust the cast is a very expensive operation. SSE3 has FISTTP instruction that could perhaps be used to accelerate this?
@bluetech

This comment has been minimized.

Copy link

bluetech commented Mar 28, 2017

I think the RFC and docstring should explicitly describe what happens if max < min.

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Mar 28, 2017

@pornel my first thought is that we ought to optimize the as operator rather than introducing new syntax to work around the performance problems it has. I would consider that suggestion to be outside the scope of this RFC. To be honest I don't really know what implications "for floats semantics of NaN handling should allow implementation with the pair of maxsd & minsd" would have on the implementation. If you could provide a code sample in Rust I might be able to better understand what you mean by that.

@bluetech agreed, I'll add something for that as soon as I can.

@kornelski

This comment has been minimized.

Copy link
Contributor

kornelski commented Mar 28, 2017

@Xaeroxe please note that as doesn't clamp, so it's not a sufficient alternative.

I think use of SSE dictates that if any of the operands is NaN, then NaN is returned. http://www.felixcloutier.com/x86/MAXSD.html

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Mar 28, 2017

@pornel No,

If only one value is a NaN (SNaN or QNaN) for this instruction, the second source operand, either a NaN or a valid floating-point value, is written to the result.

i.e. MAXSD(NaN, 123.0) == 123.0.

The definition of MAXSD(a, b) is consistent with if a > b { a } else { b }.

input
}
}
```

This comment has been minimized.

@ranma42

ranma42 Mar 28, 2017

Contributor

An alternative implementation which results in better code is:

pub fn clamp(input:f32, min: f32, max: f32) -> f32 {
    let mut x = input;
    if !(x < min) { x = min; }
    if !(x > max) { x = max; }
    x
}

It conveniently preserves the source when it is NaN, as required in the edge cases listed below.

This comment has been minimized.

@Xaeroxe

Xaeroxe Mar 28, 2017

Author Contributor

Thanks! Looks great!

This comment has been minimized.

@ranma42

ranma42 Mar 28, 2017

Contributor

... but it incorrectly returns NaN when max == NaN || min == NaN (EDIT: this is pseudocode... as crazy as it looks, the check for NaN would be max != max || min != min).
I am unsure if this is good or bad: while the infinity corresponding to no enforcement is straightforward, it is not obvious to me what is the desirable behaviour for NaN.

This comment has been minimized.

@Xaeroxe

Xaeroxe Mar 28, 2017

Author Contributor

You know honestly that might be better behavior than what I proposed. It assumes a NaN is unintentional which they often are. If someone explicitly wants no bounds enforced they should provide infinity.

This comment has been minimized.

@kennytm

kennytm Mar 28, 2017

Member

@ranma42 I think you mean if !(x >= min) etc.

@ranma42

This comment has been minimized.

Copy link
Contributor

ranma42 commented Mar 28, 2017

Should we mention the expected behaviour for !(min < max) explicitly?

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Mar 28, 2017

@ranma42 I think so yes. max and min may not be constant values so it's not that far out to say a user could accidentally provide a max < min

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Mar 28, 2017

So I think the behavior for max < min should be explicitly defined but I don't really know what that behavior should be. My first thought is to just add a line at the top of the function:

assert!(max >= min);

I worry about
A: Is the performance impact of this undesirable
B: Is panicking really the best way to handle this state?

@bluetech

This comment has been minimized.

Copy link

bluetech commented Mar 28, 2017

If min and max are consts, then the assert will be elided by the optimizer.

If they are not statically known, and the function is marked inline, then if it is called inside of a loop and min and max are invariant then the optimizer will move the check out of the loop which makes it again practically free.

So IMO panic is the safe way to go.

@ranma42

This comment has been minimized.

Copy link
Contributor

ranma42 commented Mar 28, 2017

@Xaeroxe that assert would also forbid passing NaN as max or min, which might be a good thing (and would simplify the edge cases table).

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Mar 28, 2017

Thanks @bluetech ! I learned some cool stuff about the optimizer today.
@ranma42 I guess the question is do we want this function to propagate NaN or panic? Propagating NaN is what floating point operations typically do but I also am not sure if that's the best approach here. Propagating NaN doesn't cause the program to halt execution but it also can result in "error at a distance" type problems as well.

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Mar 28, 2017

Personally I prefer panic, as more harsh consequences for invalid input are more likely to make the programmer make sure to account for the possibility of invalid input. Of course if anyone has counter points to that I'd love to hear them.

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Mar 28, 2017

@Xaeroxe I strongly suggest you revert the implementation back to what you had. @ranma42's implement is wrong (check with 4.clamp(3, 5)), and even when the typo is fixed it will interact badly when the input itself is NaN (check with NaN.clamp(3, 5)).

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Mar 28, 2017

@kennytm Good catch. How about this?

pub fn clamp(input:f32, min: f32, max: f32) -> f32 {
    assert!(max >= min);
    let mut x = input;
    if x < min { x = min; }
    if x > max { x = max; }
    x
}
@ranma42

This comment has been minimized.

Copy link
Contributor

ranma42 commented Mar 28, 2017

@kennytm sorry, you're right, I was misled by the documentation mentioning the first/second operand in Intel syntax vs the playground showing the GNU syntax. My bad.

@ranma42

This comment has been minimized.

Copy link
Contributor

ranma42 commented Mar 28, 2017

@Xaeroxe I think assert!(min <= max) reads better, but yes, that's the original code by @kennytm, which is correct.

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Mar 28, 2017

Changes made, requesting feedback.

Good work, I think we're getting closer to a final.

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Mar 28, 2017

For other languages/libraries having clamp or clip:

  • C++17's std::clamp (PR0025R0) implements exactly the same as this RFC before the last change (if x < lo { lo } else if hi < x { hi } else { x }). It asserts !(hi < lo) (NaNs can pass this assertion). Boost's boost::algorithm::clamp is equivalent but without the assert. (When not considering the assertion and optimizer, this algorithm is in fact produces the same output as the current RFC.)

  • Swift's clamped(to:) is range based, but when the self is a single point, is exactly the same as this RFC (if lo > x { lo } else if hi < x { hi } else { x }) (implementation.)

  • NumPy's clip() specializes for NaN, likely for performance of vectorized operation (implementation.), but otherwise the same as C++17.

    if hi.is_nan() && lo.is_nan() {
        x
    } else if hi.is_nan() {
        if x < lo { lo } else { x }
    } else if lo.is_nan() {
        if x > hi { hi } else { x }
    } else {
        if x < lo { lo } else if x > hi { hi } else { x }
    }
  • GLib's CLAMP() is if x > hi { hi } else if x < lo { lo } else { x }.

  • Qt's qBound() is implemented as max(lo, min(hi, x)), which is expanded to (implementation.):

    let y = if hi < x { hi } else { x }; 
    if lo < y { y } else { lo }
  • GLSL's clamp is defined as min(max(x, lo), hi). Not sure how GLSL handles NaN. Also, Apple's Accelerate framework vector_clamp/simd::clamp is implemented using the same formula, with min and max corresponding to fmin and fmax. It differs from the current RFC in the handling of NaN.

  • Ruby's clamp is defined as

    match x.partial_cmp(lo).unwrap() {
        Equal => x,
        Less => lo,
        _ => match x.partial_cmp(hi).unwrap() {
            Greater => hi,
            _ => x,
        }
    }

    Any NaN including the input will cause ArgumentError in Ruby, and there is an assertion for !(lo > hi).


Behaviors of different algorithms

lo x hi C++17 Qt GLSL This RFC (ignoring assert)
3 6 5 hi (5) hi (5) hi (5) hi (5)
3 5 5 x (5) x (5) x/hi (5) x (5)
3 4 5 x (4) x (4) x (4) x (4)
3 3 5 x (3) lo (3) x/lo (3) x (3)
3 2 5 lo (3) lo (3) lo (3) lo (3)
NaN 6 5 hi (5) lo (NaN) hi (5) hi (5)
NaN 5 5 x (5) lo (NaN) x/hi (5) x (5)
NaN 4 5 x (4) lo (NaN) x (4) x (4)
3 NaN 5 x (NaN) lo (3) lo (3) x (NaN)
3 4 NaN x (4) x (4) x (4) x (4)
3 3 NaN x (3) lo (3) x/lo (3) x (3)
3 2 NaN lo (3) lo (3) lo (3) lo (3)
NaN NaN 5 x (NaN) lo (NaN) hi (5) x (NaN)
NaN 4 NaN x (4) lo (NaN) x (4) x (4)
3 NaN NaN x (NaN) lo (3) lo (3) x (NaN)
-0 +0 1 x (+0) lo (-0) x (+0) x (+0)
+0 -0 1 x (-0) lo (+0) lo (+0) x (-0)
-1 -0 +0 x (-0) x (-0) x (-0) x (-0)
-1 +0 -0 x (+0) x (+0) hi (-0) x (+0)
+0 -0 +0 x (-0) lo (+0) lo/hi (+0) x (-0)
-0 +0 -0 x (+0) lo (-0) hi (-0) x (+0)
@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Jul 26, 2017

@Xaeroxe Are you still in favour of the "on all the ranges" option? Are you planning to update the RFC to propose that instead? Is there anything with which you'd like help?


Some potentially-interesting context: rust-lang/rust#25663 added default min and max methods to Ord.

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Jul 26, 2017

I apologize for my extended absence on this RFC. Now that I've come back to this later and re-evaluated it I think there's a trade-off to option 1 that we have to decide if it is worth it or not. Option 1 is more complicated to implement and use, but in exchange for that we get the ability to operate on more range types. So the question is, is that necessary? Will anyone ever have a need for a range that isn't fully closed? I think I'm going to investigate other languages and the usage of their clamp functions in order to get an answer to that question.

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Jul 26, 2017

I did some research on how other languages implement and use clamp, and for the most part they operate exclusively on fully inclusive ranges. The one exception is a library from npm with less than 10 downloads. So I don't feel the effort required to produce option 1 is really necessary and I'm leaving this RFC as is. I consider this RFC ready to merge.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Aug 9, 2017

@scottmcm @sfackler do you concur that this is ready for FCP?

@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Aug 9, 2017

Yep - I agree that the flexibility of an impl on ranges isn't needed in practice.

@scottmcm
Copy link
Member

scottmcm left a comment

Yes, I agree this is ready to advance.


Likely locations would be on the Ord trait, and a special version implemented for f32 and f64.
The f32 and f64 versions could live either in std::cmp or in the primitive types themselves. There are good arguments for either
location.

This comment has been minimized.

@scottmcm

scottmcm Aug 9, 2017

Member

Should probably update this now that the RFC has made a decision on where.


Alternatives were explored at https://internals.rust-lang.org/t/clamp-function-for-primitive-types/4999

Additionally there is the option of placing clamp in std::cmp in order to avoid backwards compatibility problems. This is however semantically undesirable, as `1.clamp(2, 3);` is more readable than `clamp(1, 2, 3);`

This comment has been minimized.

@scottmcm

scottmcm Aug 9, 2017

Member

Adding min and max as methods on Ord caused a number of regressions (rust-lang/rust#42496 (comment)), and I'm unsure what their fate will be. Do we expect clamp to hit similar things?

This comment has been minimized.

@Xaeroxe

Xaeroxe Aug 9, 2017

Author Contributor

@scottmcm Possibly, although to a much lesser magnitude I think. I guess the only way to know is to try it and see.

# Unresolved questions
[unresolved]: #unresolved-questions

Is the proposed handling for NAN inputs ideal?

This comment has been minimized.

@scottmcm

scottmcm Aug 9, 2017

Member

I think this was actually resolved (some of the inputs: https://internals.rust-lang.org/t/clamp-function-for-primitive-types/4999/19 #1961 (comment) #1961 (comment)); it might be worth a note in the RFC explaining why the specified NaN behaviour was chosen.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Aug 9, 2017

The shepherds for this RFC have signaled readiness for full team review, so here we go!

@rfcbot fcp merge

@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Aug 9, 2017

Team member @aturon has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Aug 26, 2017

Sorry for the delay -- hadn't realized this was waiting on @brson. Should go through now. Thanks, @Xaeroxe, for your long patience with this RFC!

@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Aug 26, 2017

🔔 This is now entering its final comment period, as per the review above. 🔔

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Aug 26, 2017

Tracking issue: rust-lang/rust#44095
Implementation PR: rust-lang/rust#44097

@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Sep 5, 2017

The final comment period is now complete.

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Sep 5, 2017

Implementation PR is ready, it can be merged as soon as this is merged I believe.

@aturon aturon merged commit 9fad211 into rust-lang:master Sep 6, 2017

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Sep 6, 2017

Merged! Thanks, @Xaeroxe, for sticking this one out.

Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Sep 6, 2017

Rollup merge of rust-lang#44097 - Xaeroxe:clamp, r=burntsushi
Add clamp functions

Implementation of clamp feature:

Tracking issue: rust-lang#44095
RFC: rust-lang/rfcs#1961

Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Sep 7, 2017

Rollup merge of rust-lang#44097 - Xaeroxe:clamp, r=burntsushi
Add clamp functions

Implementation of clamp feature:

Tracking issue: rust-lang#44095
RFC: rust-lang/rfcs#1961

@Xaeroxe Xaeroxe deleted the Xaeroxe:Xaeroxe-clamp-rfc branch Sep 8, 2017

@Xaeroxe

This comment has been minimized.

Copy link
Contributor Author

Xaeroxe commented Sep 9, 2017

Clamp was implemented then shortly reverted due to concerns about breaking downstream code. It's unlikely this RFC will be implemented. See this: rust-lang/rust#44095

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.