New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversions: `FromLossy` and `TryFromLossy` traits #2484

Open
wants to merge 9 commits into
base: master
from

Conversation

Projects
None yet
@dhardy
Copy link
Contributor

dhardy commented Jun 22, 2018

Add FromLossy, TryFromLossy traits.

Discuss the bigger picture of conversions and the as keyword.

Specify that From implementations must not only be safe, but also exact.

Rendered RFC

@dhardy dhardy changed the title New RFC: from-lossy Conversions: `FromLossy` and `TryFromLossy` traits Jun 22, 2018

@Centril Centril added the T-libs label Jun 22, 2018

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 22, 2018

I think the word exact is problematic and that it defined imprecisely in this RFC.

It suggests to me that for X: From<Y> there must exist an inversion Y: From<X> such that X and Y are isomorphic. But u8u16 does not have an inversion u16u8 precisely because it is a narrowing operation.

Since From is such a central trait in the ecosystem, I think the requirements and guarantees placed on implementations of it should be specified with mathematical precision. As currently specified, I am not sure how the changed definition extends to types outside the standard library.

One possible definition that seems to fit the criteria described in the RFC is:

  • [X: From<Y>] → [∀ y0, y1 : Y. X::from(y0) ≡ X::from(y1) → y0 ≡ y1].
@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 22, 2018

Other questions that this RFC provoked:

  • Should From::from always be a "pure" computation?
  • Is such a change in recommendation re. From considered to be a breaking change?
@SoniEx2

This comment has been minimized.

Copy link

SoniEx2 commented Jun 22, 2018

should From be allowed to allocate remote resources? (granted you should probably use TryFrom for things like that)

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Jun 22, 2018

It suggests to me that for X: From there must exist an inversion Y: From such that X and Y are isomorphic

I disagree with the need for an isomorphism; e.g. I see no problem with f32's two representations for 0 both mapping to the same integer value.

To be fair, that's not what you said. But surely an exact conversion f: A → B implies only that for any a1, a2 in A, a1==a2 exactly when f(a1) == f(a2) (and derived from this: if a==b can be defined for a in A and b in B, then a==f(a) for all a in A). Okay, that's roughly what your type rule says, if I parse it right (the . in the middle is confusing; I've usually seen | used (maths contexts); also I believe we need a two-way implication).

Should From::from always be a "pure" computation?

Meaning an intrinsic? I don't really see why this is important.

Is such a change in recommendation re. From considered to be a breaking change?

Since this wasn't specified before, I don't see it as a big issue, but I suppose it could be (if libs remove their implementations to comply).

should From be allowed to allocate remote resources?

It musn't fail (existing requirement, not from this RFC). Exactly what this means with regards to unlikely (and probably unrecoverable) panics is up for interpretation, I guess (and specifying something won't necessarily influence how people use it for anyway).

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 22, 2018

@dhardy To be clear, I'm not saying that X: From<Y> must also imply Y: From<X>, so I don't think there should be an isomorphism requirement either. I'm just saying the way you framed it suggests that to me.

Okay, that's roughly what your type rule says

That's exactly what the proposition says :)

In other words:

 X: From<Y>   X::from(y0) ≡ X::from(y1)
--------------------------------------- proper_From_implementation
                 y0 ≡ y1

or if we are inclined to be somewhat more pedantic:

Δ ⊢ Implemented(X, From<Y>)
Γ ⊢ y0 : Y
Γ ⊢ y1 : Y
Γ, Δ ⊢ X::from(y0) ≡ X::from(y1)
--------------------------------- proper_From_implementation
Γ, Δ ⊢ y0 ≡ y1

Meaning an intrinsic? I don't really see why this is important.

No, "pure" means deterministic here. That is (formulated somewhat imprecisely), for
any two y0, y1 : Y, y0 ≡ y1 → X::from(y0) ≡ X::from(y1). So X::from could have no side-effects.

I don't such a requirement is necessary tho?

@newpavlov

This comment has been minimized.

Copy link

newpavlov commented Jun 22, 2018

It's not just about this RFC, but can't we make FromFoo traits just aliases for TryFromFoo<Error=!>?

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Jun 22, 2018

@Centril someone likes their type theory 😄

WP has an article on pure functions. But it's not the same as what you wrote which is about from respecting the operator (presumably in this case PartialEq, which is not required to be implemented for either type). Or if by you mean "the same binary representation" or something, I don't think that's a distinction worth making; it's not something we have any operator to test (aside from indirectly with memory comparisons), and it would be possible (though wrong) for PartialEq to depend on other things such as the memory alignment.

Back on topic, I guess all from implementations should be pure (ignoring allocations / caches), but I don't think specifying this in the documentation is useful in any way.

@newpavlov Good point. I'm not sure, is it possible to do this without it being a big breaking change?

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 22, 2018

@dhardy

someone likes their type theory 😄

I do! ;) I think we should specify formally (with logic -- probably not typing rules) in the RFC and the documentation of From what laws should hold and what is meant by "exact", as we've done with PartialEq.

Yeah being imprecise about what exactly is is what I meant about "formulated somewhat imprecisely" ;)

I guess all from implementations should be pure (ignoring allocations / caches), but I don't think specifying this in the documentation is useful in any way.

If that is a reasonable expectation the user has, we might as well specify it?

@newpavlov

This comment has been minimized.

Copy link

newpavlov commented Jun 22, 2018

@dhardy
I am not sure, I guess it will depend on how trait aliases and never type will shape out. For example if impl's on trait aliases will be allowed, then current implementations of From will be still legal and without blank implementations of TryFrom for From we should be fine. And of course we will need automatic coercion (or straight equivalence maybe?) of Result<T, !> to just T. (I am not sure how folks who work on it think it should work) But either way backward compatibility concerns only From and Into traits, not the traits described in this RFC.

UPD: Hm, it looks automatic coercion is not planned, and it seems we'll get back to cyclic trait design from rand discussions:

pub trait TryFrom<T>: Sized {
    type Error;
    fn from(value: T) -> Self {
        Self::try_from(value).unwrap_or_else(|_| panic!("error"))
    }
    fn try_from(value: T) -> Result<Self, Self::Error> {
        Ok(Self::from(value))
    }
}
trait From = TryFrom<Error=!>;

It will allow existing From implementations work as is, though it definitely looks quite hacky.

@rkruppe
Copy link
Member

rkruppe left a comment

I haven't formed an opinion on the actual proposal, so here's what I do best: nitpicking about floating point minutiae

This has several problems:

- `as` can perform several different types of conversion (as noted) and is therefore error-prone
- `as` can have [undefined behaviour](https://github.com/rust-lang/rust/issues/10184)

This comment has been minimized.

@rkruppe

rkruppe Jun 22, 2018

Member

Nit: this is "just" an implementation bug, thus "temporary" and not a problem with as-the-language-feature.

This comment has been minimized.

@dhardy

dhardy Jun 23, 2018

Author Contributor

True, the undefined behaviour can be (and is being) fixed. I should say instead that there are certain casts which have no correct result and therefore should fail, which as has no mechanism for (other than panics).


Where conversions are not exact, they should be equivalent to a conversion
which first forces some number of low bits of the integer representation to
zero, and then does an exact conversion to floating-point.

This comment has been minimized.

@rkruppe

rkruppe Jun 22, 2018

Member

Is this trying to avoid specifying a rounding mode? Or is it trying to specify a rounding mode in some convoluted manner? Either way, all these conversions should be specified to round to nearest, ties to even. That's the default rounding mode and we (as well as other languages) use it everywhere except for float -> int conversions.

This comment has been minimized.

@dhardy

dhardy Jun 23, 2018

Author Contributor

True, this is convoluted. And you're correct, since the output type here is floating-point, it would be expected to round to nearest with ties to even.


The implementations should fail on NaN, Inf, negative values (for unsigned
integer result types), and values whose integer approximation is not
representable. The integer approximation should be the value rounded towards

This comment has been minimized.

@rkruppe

rkruppe Jun 22, 2018

Member

Failing if the integer approximation doesn't fit the target type is... debatable. The proposed semantics for as are to saturate to int_type::MAX or int_type::MIN respectively, and that is in some ways an "approximately equivalent value".

BTW the same point applies to infinities (i.e., it's reasonable to define f32::INFINITY -> u8 to result in 255), but I can understand being more uneasy with that.

This comment has been minimized.

@dhardy

dhardy Jun 23, 2018

Author Contributor

No, infinities and out-of-range values fall into the same category here: not representable. IMO 255 is a very poor approximation of 1000f32 and an even worse approximation to 1e300f64.

Certainly I'm being opinionated here but I think if we have a fallible conversion available then we should fail on out-of-range values. (But if not, then TryFromLossy also has no reason to exist.)

This comment has been minimized.

@rkruppe

rkruppe Jun 23, 2018

Member

I have sympathy with that argument, but at the same time I'd like to reduce divergence from the semantics of as (to avoid people sticking to as because they prefer its behavior), and currently that seems more likely to be saturation than panicking.

This comment has been minimized.

@dhardy

dhardy Jun 23, 2018

Author Contributor

Also note that as must return some defined value simply in order to avoid undefined behaviour. We have no need to model this conversion trait on the limitations of as.

This comment has been minimized.

@rkruppe

rkruppe Jun 23, 2018

Member

as doesn't have to return a value, it can panic (there are some reasons to not want that, but it avoids UB and is reasonable in isolation). That would be the moral equivalent of this trait returning Err.

- 100_000f32 → u16: error

(Alternatively we could allow floats in the range (-1, 0] to convert to 0, also
for unsigned integers.)

This comment has been minimized.

@rkruppe

rkruppe Jun 22, 2018

Member

That seems more consistent with the stated rounding mode (I view all these conversions as "round, then see if they fit into the target type") and it also matches the current behavior of as.

negative, infinite or an NaN. So even though the output type has large enough
range, this conversion trait is still applicable.)

The implementations should fail on NaN, Inf, negative values (for unsigned

This comment has been minimized.

@rkruppe

rkruppe Jun 22, 2018

Member

Does "negative values" include negative zero?

(This question disappears if values in (-1.0, 0.0] result in Ok(0) as discussed below.)

This comment has been minimized.

@dhardy

dhardy Jun 23, 2018

Author Contributor

No, not last time I checked 😄

But this question goes away if we use the alternative as you suggested anyway.

representable. The integer approximation should be the value rounded towards
zero. E.g.:

- 1.6f32 → u32: 2

This comment has been minimized.

@rkruppe

rkruppe Jun 22, 2018

Member

This isn't rounded towards zero.

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Jun 23, 2018

Updated regarding several things mentioned.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jun 23, 2018

  • I agree that as is problematic and like the idea of adding alternatives. (Maybe deprecating it some day, but this RFC can be incremental improvement without going that far.)

  • Regarding naming: I find the world lossy to be a bit vague. What kind of loss are we talking about? It turns out that this RFC is all about approximating numeric types, so maybe Approximate and TryApproximate would work as trait names? Verbs are nice for single-method traits because they also work well for the method name.

  • Regarding the prelude: we have precedent with TryFrom and TryInto of adding unstable traits and saying “we’ll consider adding them to the prelude when stabilizing”. When we did, a number of crates broke because they had copy/pasted the traits into their own code in order to use them on stable Rust, and now had two traits in scope with methods of the same name, causing a compiler error. We had to revert: rust-lang/rust#49518

    I think the lesson here is that when discussing new traits here and there’s a chance that we’ll want them in the prelude, we should add them to the prelude as soon as they are implemented even if we end up removing them before stabilization.

    However if we make an API that works without traits being in scope, being in the prelude doesn’t matter. There is precedent for this with the str::parse inherent method and the FromStr trait. Since this RFC is about numeric approximation, it is concerned with a comparatively small number of types (14, v.s. unbounded for the more general From and TryFrom traits). We could have inherent methods on the relevant numeric types, similar to str::parse.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 23, 2018

@repax The documentation of From should explain what injective means for those not familiar with that mathematical jargon :)

But yes, saying that T::from must be an injective function should be sufficient since to be a function in the first place, T::from must satisfy ∀ s0, s1 : S. s0 ≡ s1 → T::from(s0) ≡ T::from(s1) since a function satisfies ∀ x, y : S. x ≡ y → f(x) ≡ f(y) by definition.

The link to PartialEq is not something testable for all From implementations since there is no requirement that X: From<Y> → [X: PartialEq ∧ Y: PartialEq] so idk about that part...

But regardless, it is nice to see a movement towards more exact definitions, kudos to @dhardy on that.

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Jun 23, 2018

@Centril of course all functions are homomorphisms w.r.t. equality (whoops). I guess we can drop the PartialEq stuff then.

@SimonSapin thanks for the feedback.

  • Yes, the approximations could be inherent methods, though it is much easier to write generic code using traits (one of the motivations here).
  • I guess these could be named Approximate etc., though isn't the parallel with the name From useful?

I suppose there should be Into parallels too.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jun 23, 2018

it is much easier to write generic code using traits (one of the motivations here).

I’m having a hard time imagining a situation where generic code with these traits in a where clause is useful. Do you have some examples?

If I remember correctly the existence of both From and Into has to do with impl coherence. In some combinations of types one can be implemented but not the other. I don’t think this is as much of a concern here since I don’t expect many impls outside of the standard library.


rust-lang/rust#42456 was another attempt that could be mentioned in Prior Art, even though it didn’t land.

@burdges

This comment has been minimized.

Copy link

burdges commented Jun 23, 2018

Adding traits sounds unnecessarily confusing and hierarchical. And maybe breaking. Could const members address this? And avoid breakage via default values?

pub trait From<T> {
    fn from(T) -> Self;
    const INJECTIVE: Option<bool> = None;
    const SURJECTIVE: Option<bool> = None;
}

If you needed this, then you'd write where U: From<T, const INJECTIVE = Some(true)> or whatever.

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Jun 24, 2018

@SimonSapin this is what motivated me. Ideally we wouldn't need to implement this trait ourselves. But it may not have wide usage.

@burdges interesting idea. But why use Option and why mention SURJECTIVE? And if we can have const INJECTIVE: bool = true; then does U: From<T> imply U only matches injective ("non lossy") implementations? If so then this might be workable. (But would it also be possible to match any implementation, e.g. with U: From<T, INJECTIVE = _>?)

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jun 24, 2018

@dhardy This example is sort of making my point. As far as I can tell it is not generic, and $ty is (after macro expansion) a concrete type at every $ty::cast_from_int(…) call site.

Is there a use case for code like either of these?

fn foo<N>(…) where N: CastFromInt<u32> {…}
fn bar<N>(…) where u32: CastFromInt<N> {…}
@burdges

This comment has been minimized.

Copy link

burdges commented Jun 24, 2018

I think const INJECTIVE: Option<bool> = None; keeps existing impl From<..> for .. code correct by not specifying either true or false incorrectly. I donno if SURJECTIVE helps, maybe not.

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Jun 24, 2018

Perhaps not. So I guess the alternative is simple but involves quite a few methods: add these to every integer type:

fn to_f32(self) -> f32;
fn to_f64(self) -> f64;

These types already have to_le and to_be functions and proposed to_bytes, so I guess these fit.

Or more akin to str::parse, add this to f32 and f64 (along with a trait):

fn from_int<T: ToFloat<Self>>(x: T) -> Self;

But @burdges U: From<T, SURJECTIVE = None> (i.e. the default) will not match From<T, SURJECTIVE = Some(false)> implementations, so it becomes a breaking change to specify these constants in existing implementations (and a pain to use generically).

@burdges

This comment has been minimized.

Copy link

burdges commented Jun 24, 2018

We do nee a default regardless. I suppose you're saying const INJECTIVE: bool = false; can be the default because code should never use U: From<T, SURJECTIVE = false>? Fine.

If you want to avoid breaking changes then you can specify it like this:

/// Opaque type used to specify if a `From` is injective.
pub struct Injectivity(bool);
const INJECTION: Injectivity = Injectivity(true)

pub trait From<T> {
    fn from(T) -> Self;
    const INJECTIVE: Injectivity = Injectivity(false);
}

pub trait Into<T>: Sized {
    fn into(self) -> T;
    const INJECTIVE: Injectivity = Injectivity(false);
}
impl<T, U> Into<U> for T where U: From<T>
{
    fn into(self) -> U {
        U::from(self)
    }
    const INJECTIVE: From::Injectivity = <U as From>::INJECTIVE;
}
@Diggsey

This comment has been minimized.

Copy link
Contributor

Diggsey commented Jun 24, 2018

@burdges adding constants like that seems way more confusing (and also breaking) than what is proposed in the RFC.

IMO it was a mistake to add From/To traits in the first place without specifying a meaning beyond what their signature implies. Luckily, in practice usage of these traits has followed the rules set down in this RFC (that conversions with these traits should be lossless).

Having more traits is fine, but I think it would be useful to reduce the number of traits which must be implemented, so I think going forward all new infallible traits should have a generic implementation for types implementing the infallible trait with an error type of !:

impl<T, U> FromLossy<T> for U where U: TryFromLossy<T, Error=!> { ... }

Also, @dhardy, the trait definitions in the RFC are missing their generic parameter T - I believe this is a mistake?

@burdges

This comment has been minimized.

Copy link

burdges commented Jun 24, 2018

I imagined the coherence rules isolated any breakage when adding a default constant, but.. I suppose you're talking about how the trickier coherence rules around the existing type parameter, yes?

I suggested constants here largely because they permit positive or indeterminate defaults, while traits mostly only support negative defaults. It's possible negative defaults are desirable here, but the RFC design looked kinda bent around negative defaults. I think folks have voiced roughly that opinion, hence my initial suggestion being a default None interpreted as unspecified.

As an aside, constants are less confusing because they're documented with the main trait, but rust doc could be modified to have subordinate traits documented on the same page as the main trait, so whatever.

Oh. If one really wants unspecified defaults, one might also define trait FromProperties<T> : From<T> { const INJECTIVE : bool }, but that's messier.

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Jun 25, 2018

@Diggsey what's your take on @SimonSapin's point that it may be better to find options which don't involve adding to the prelude? FromLossy is more generic than what he proposed but is not the only option.

Auto-implementing TryFromLossy for all FromLossy impls makes sense. Doing the same for TryFrom / From is breaking (for any explicit implementations of both) but, once specialisation is ready, we should be able to introduce a default implementation. There's one caveat here: introducing usage of default on an existing implementation is (technically) a breaking change, so either we should not auto-implement TryFromLossy before specialisation, or we should put up with slightly different semantics there.

@Diggsey

This comment has been minimized.

Copy link
Contributor

Diggsey commented Jun 25, 2018

It's a good point - I would at least like a solution which is generic over the numeric types (including user-defined numeric types) but I don't think it need be more general than that.

Regarding the prelude - would it be possible to have the traits exist in both the prelude and a module, and only stabilise the version outside the prelude? (Effectively preventing the kinds of conflicts @SimonSapin described without committing to stabilisation of the trait's existence in the prelude).

TBH, I'm not sure these need to be in the prelude when the as operator still exist, as that will cover the most common uses.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jul 29, 2018

From the RFC:

Tweak the documentation to clarify that From implementations should not lose precision (i.e. should be injective)

This may not be possible, because as #2506 (comment) points out some existing impls in std do not satisfy this property and removing them would be a breaking change. For example, From<Vec<T>> for BinaryHeap<T> does not preserve order.

@clarcharr

This comment has been minimized.

Copy link
Contributor

clarcharr commented Jul 29, 2018

IMHO, things like Vec -> BinaryHeap and Vec -> HashSet should implement FromLossy and have their From implementations deprecated.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Jul 29, 2018

As far as I know impl deprecation isn’t really a thing.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Jul 29, 2018

I really like RoundFrom. It's the best word I've seen yet in this thread for what it does, and it conveniently would allow it to exist in num, and thus not get blanket impls from normal From, since I definitely wouldn't say that String: RoundFrom<&str> makes sense.

I'm not a fan of trying to turn From/TryFrom into a single trait. I definitely don't want anything on TryFrom that reports failure by panicking; someone can .try_into().unwrap() if they want that.

(Given a time machine, I'd rather like trait From<T> where T: TryFrom<Self> {} as a way to mark the expectations for the conversion, but obviously that cannot happen now.)

I definitely agree with omitting zero_extend and sign_extend. People can change signedness (via WrappingFrom) if they need that, the same as they need to do for right shifts.

Obviously I would be happy to see SaturatingFrom 🙂

In fact, the phrasing as WrappingFrom + RoundingFrom + SaturatingFrom (all in core::num) gives a nice, focused title for the RFC as something like "numeric conversion traits", helping to scope it down to the critical parts, and avoid questions like whether Vec -> HashSet fits in any generic trait.

The question of rounding mode is an interesting one. I think the word "rounding" suggests to me that i8::rounding_from(f32::from(1i8) ± ϵ) should still be 1i8, which implies that it should be round-to-nearest (also the default LLVM floating point environment). But at the same time, I was picturing round-downward for u8: SaturatingFrom<f32> so that i..i+1 mapped to i. I do think it shouldn't be round-toward-zero, since that makes 0 twice as wide as the other integers.

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Jul 30, 2018

the word "rounding" suggests to me that i8::round_from(f32::from(1i8) ± ϵ) should still be 1i8

This is an important point. While I don't think that's something that can be taken for granted (since there are several ways to round things), it could be very misleading if in fact it rounds down (or to zero, or anything other than to nearest).

So much as I like the name RoundFrom, it seems like we should use ApproxFrom instead and/or make RoundFrom actually round to nearest (perhaps with ties to even). (From my POV, something like ApproxFrom with rounding down to neg-inf or 0, probably the former, seems important.)

The other point about using the num crate is an interesting one. To me, ApproxFrom, WrappingFrom, and potentially also SaturatingFrom feel like std lib functionality (the first two are pretty-much required for various common numeric conversions, if not using the as keyword). On the other hand, RoundFrom might fit well in num — except that it's so similar to ApproxFrom that having both feels redundant.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Jul 30, 2018

The other point about using the num crate is an interesting one

I meant the core::num module (which I apparently only specified later in the post, oops). I definitely agree that these should be in std, as a critical part of being able to deprecate them in as later.

On round-to-nearest, what does f64 -> f32 do? I would have hoped round-to-nearest, but I assume round-towards-zero by truncating would be the cheap implementation. Maybe the differing expectations here is evidence that it should just take an enum of the four rounding modes?

@gnzlbg

This comment has been minimized.

Copy link
Contributor

gnzlbg commented Jul 30, 2018

Maybe the differing expectations here is evidence that it should just take an enum of the four rounding modes?

How would we implement that? IIRC there were some undefined behavior related issues with changing the rounding mode in LLVM to do these types of computations.

@peterjoel

This comment has been minimized.

Copy link

peterjoel commented Aug 5, 2018

Many existing Into and From impls are already lossy.

Any impl Into<Y> for X is analogous to a function X -> Y, and an impl TryInto<Y> from X is a partial function. The existence of corresponding From and TryFrom impls also says something about the nature of the Into impl.

An alternative to this RFC might be to acknowledge that Into and From impls can be lossy, but introduce new traits for when they're not:

trait Surj<T>: TryInto<T> + From<T> {} // sort of
trait Inj<T>: Into<T> + TryFrom<T> {}
trait Bij<T>: Surj<T> + Inj<T> {}

impl<A, B> Surj<A> for B where B: TryInto<A> + From<A> {}
impl<A, B> Inj<A> for B where B: Into<A> + TryFrom<A> {}
impl<A, B> Bij<A> for B where B: Surj<A> + Inj<A> {}

The definitions – Surj in particular – are a little loose, so perhaps other names could be chosen.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Aug 5, 2018

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Aug 10, 2018

Many existing Into and From impls are already lossy.

They can be lossy in the sense that the reverse mapping is not always unique, but they are usually to some extent "obvious". For example Vec<T>BinaryHeap<T> does not preserve order, but that’s just how binary heaps work.

For numeric conversions there are multiple possible (and useful) behaviors for lossy cases. This is the reason why From is not appropriate there, since there can only be one From impl between a given pair of types.

I think this RFC should not attempt full generality similar to From and TryFrom, but instead focus on the various kinds of conversions between numeric types and pick API names based on the respective behavior: wrap, truncate, approximate, etc.

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Aug 13, 2018

I will push some minor tweaks to the RFC, but still not sure what to do with regards to rounding behaviour etc. For f64 → f32 as well as float → int conversions, we also have range issues, and the following options:

  • undefined behaviour: common in C but definitely not preferred; I believe this is what using unguarded instructions like CVTTSD2SI would result in
  • saturating cast (excepting NaN): convert to min/max of output type when out of range, or error value if input is NaN; this is sometimes desired but other times errors are preferable
  • saturating cast (including NaN): as above but also output MAX (or some such) when input is NaN and output type has no NaN value; this is probably never useful except that it avoids need for error handling (though saturating_cast(input).unwrap_or(MAX) probably suffices)
  • explicit error: probably the best default

Of the above, it seems that it may be worth supporting both saturating casts and explicit errors, e.g. via SaturatingFrom and TryRoundFrom.

Then there is rounding behaviour (see @fstirlitz's post):

  • truncate (towards zero): seems to be the default / fast option; good enough in many cases
  • round to nearest: common human-friendly and best-approximation option
  • round up / round down: less common
  • round with explicit error: requires multiple output parameters; probably no fast general-purpose implementation so may be better left to third-party libraries?

Of the above, it would be nice to support at least the first two options; maybe also round up/down. However, how should this be done? One possibility would be to have multiple methods within a TryRoundFrom trait, though this doesn't properly allow partial implementation:

trait TryRoundFrom<T> {
    type Error;
    fn try_trunc_from(x: T) -> Result<Self, Self::Error>;
    fn try_round_from(x: T) -> Result<Self, Self::Error>;
}
@gnzlbg

This comment has been minimized.

Copy link
Contributor

gnzlbg commented Aug 13, 2018

I believe this is what using unguarded instructions like CVTTSD2SI would result in

Doesn't CVTTSD2SI perform a truncation?

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Aug 13, 2018

What does truncation mean for values out of range of the target type? Also, what's a good way to test these instructions? Writing ASM I suppose...

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Aug 14, 2018

Julia has a nice approach to conversions, basically parameterised rounding functions. Unfortunately by the time you add error handling and account for Rust not allowing default values for type arguments, the nearest equivalent is much less ergonomic):

trait Round<T> {
    type Error;
    fn round(&self) -> Result<T, Error>;
}
let y = Round::<u32>::round(2.6f32)?;
// In Julia, it always rounds to the nearest integer:
assert_eq!(Round::<f64>::round(10.1).unwrap(), 10.0);
// though we'd probably not want to do this

The above is not exactly ergonomic Rust. But regarding @SimonSapin's previous point about focusing on numeric conversions, it suggests the following direction:

  • RoundFrom, TruncFrom, CeilFrom, FloorFrom
  • all traits have an Error type and always return a Result
  • could be in prelude or could just be traits in std::num or std::convert

Which brings up another point: why are the error types CharTryFromError, TryFromIntError, TryFromSliceError not in the std::convert module and not even consistently named?

Or, there is another possible approach, though more convoluted and probably a bad idea:

/// pure marker wrapper
pub struct Round<T>(T);

impl TryFrom<Round<f32>> for u32 { ... }
@rkruppe

This comment has been minimized.

Copy link
Member

rkruppe commented Aug 17, 2018

Catching up with the last three(!) weeks of discussion, here's assorted notes about implementability and performance (sorry still don't have an opinion on the bikesheds):

  • While @gnzlbg is right that changing the global rounding mode and having it affect subsequent instructions is very problematic, there is no problem at all with providing APIs that perform one operation with a specified rounding mode (e.g., "f64->f32, round towards +inf", "add with round to zero", etc.) that doesn't affect nor is affected by the global rounding mode.
  • I don't think we should worry too much about how costly these operations are and whether they map to a single CPU instruction. If someone needs round-to-foo, then they need round-to-foo, end of story. Contrast this with potential fixes for as (rust-lang/rust#10184) where performance is very important as it's very widely and "blindly" used, including in cases where overflow/NaN are impossible.
  • But if someone insists on worrying, note that for float->int, a check for whether the result fits into the target type might very well be as expensive or more expensive than rounding mode fiddling (though how expensive that is depends on the architecture and microarchitecture).
@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Aug 17, 2018

Thanks @rkruppe!


For saturating conversions (clamping to target range), is it reasonable to only support a single rounding mode (perhaps truncation) or is there reason to support multiple rounding modes, as with other conversions?

Also, for saturating conversions from floating point, what should be the behaviour on NaN and Inf input? (Do we need error handling anyway, via Result or panic?) @SoniEx2?

Because saturating conversions could be as complex as non-saturating (erroring) variants, or could be simplified to a single trait & function.


@SimonSapin would you know if there are any proposals similar to ? in try_f()? but for .unwrap()? The point is that T::round_from(x).unwrap() is not very ergonomic, but something like T::round_from(x)?! might be good enough.

Or will we even get free unwrapping from Result<T, !> types eventually?

The point is that in this case we can use uniform error handling (type Error; fn round_from(x: T) -> Result<Self, Self::Error>;). But if the above is not the case, then there is more reason to have multiple traits (as with FromLossy and TryFromLossy in the current RFC, though those names will be changed).

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Aug 17, 2018

@dhardy Per http://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/ it sounds like the plan is to allow this:

fn foo(x: Result<T, !>) {
    let Ok(value) = x;  // This pattern is irrefutable, since the Err(!) case is impossible
}

… but I don’t know of any proposal for doing it within an expression in any way close in terseness to the ? operator.

@rkruppe

This comment has been minimized.

Copy link
Member

rkruppe commented Aug 17, 2018

For saturating conversions (clamping to target range), is it reasonable to only support a single rounding mode (perhaps truncation) or is there reason to support multiple rounding modes, as with other conversions?

I don't see why the handling out values outside the target range should be related to how you're rounding fractional quantities within the range.

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Aug 17, 2018

I don't see why the handling out values outside the target range should be related to how you're rounding fractional quantities within the range.

It's not related; it's just an attempt to simplify. Because so far it looks like we might want all these:

fn trunc_from(x: T) -> Result<Self, Error>;
fn round_from(x: T) -> Result<Self, Error>;
fn ceil_from(x: T) -> Result<Self, Error>;
fn floor_from(x: T) -> Result<Self, Error>;
fn saturating_trunc_from(x: T) -> Result<Self, Error>;
fn saturating_round_from(x: T) -> Result<Self, Error>;
fn saturating_ceil_from(x: T) -> Result<Self, Error>;
fn saturating_floor_from(x: T) -> Result<Self, Error>;

(and this is assuming we don't have separate try_trunc_from etc.)

Of course, we could reduce this list by means of a rounding_mode parameter, but that's less convenient to use much of the time.

Or we could use T::try_from(Round(x)) etc. as I suggested above, though it has drawbacks — Round etc. need to be imported and unintuitively Round(x) doesn't actually do anything.

@vks

This comment has been minimized.

Copy link

vks commented Aug 30, 2018

Of course, we could reduce this list by means of a rounding_mode parameter, but that's less convenient to use much of the time.

Alternatively, there could be an ApproxScheme type parameter, see conv::ApproxFrom. However, this is prone to result in unportable code. I think it is best to have explicit methods as you suggested.

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Aug 30, 2018

Currently "as" casts are a language feature, and they are a basic feature in a language, so I think the problem of casts should be discussed and worked on from a more general point of view, instead of just from a stdlib point of view like (I think) it's being discussed here, to face the bigger problem.

I think the first we should take a look at Rust code to see where most casts are, to classify them and see how much dangerous they could be.

In past I've written various posts about this, like:

https://internals.rust-lang.org/t/to-reduce-the-number-of-true-casts/3076

https://users.rust-lang.org/t/correctness-integral-overflows/6449/

In many cases the casts are like:

let i: u16 = ...;
my_slice[i as usize] = 0;

So a well rounded proposal for casts should face the problem of array/slice/vec indexes.

Another common situation is this one, Rust should offer a safe cast (that doesn't contain an unwrap) to do this:


let x: u32 = ...;
let y = (x % 10) as u8;

To do this I think a stdlib-level solution isn't good enough, you need to bake inside the compiler some value-range analysis.

Also, number casts aren't the only thing we should care about.

There are also enums:
https://users.rust-lang.org/t/c-style-enum-conversions/12861

And slices, currently in Nightly you can do this:

#![feature(try_from)]

use std::convert::{TryFrom, TryInto};

fn foo(a: &[u8; 4]) {
    println!("{:?}", a);
}

fn main() {
    let data = [10, 20, 30, 40, 50];

    let conv1 = <&[u8; 4]>::try_from(&data[1..]);
    foo(conv1.unwrap());

    foo(<_>::try_from(&data[1..]).unwrap());

    let conv2: Result<&[u8; 4], _> = data[1..].try_into();
    foo(conv2.unwrap());

    foo(data[1..].try_into().unwrap());
}

But the slice <-> array conversions is currently too much underpowered, tricky and unergonomic, it should become more flexible and simpler to use, keeping code correctness.

Beside the value-range analysis of numeric expressions the compiler could perform analysis on the length of the slices too, to allow slice <-> array conversions with a light syntax, keeping safety and avoiding unwraps() where the slice->array conversion is provably always correct (like in the (x % 10) as u8 case above).

@mgeisler

This comment has been minimized.

Copy link

mgeisler commented Sep 1, 2018

I noticed that the RFC and the discussion above has been almost exclusively focussed on numeric conversions.

However, it is my impression that From is used for much more than that. The documentation mentions String::from as an example of a From<&str> for String implementation and uses error handling as its main example.

Similarly, while rounding errors is one good example of a lossy conversion, I would also expect FromLossy to be implemented for conversation between UTF-8 and Latin1 text encodings.

Going further, one could imagine an image manipulation library using FromLossy to express a conversion from PNG to JPEG. A parsing library could implement TryFrom<&str> for AST. However, the latter might be better expressed as parse(&str) -> Result<AST, ParseError> so maybe the sentiment is that these traits should only be used for "small" and "quick" conversions?

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Sep 1, 2018

From this example:

let i: u16 = ...;
my_slice[i as usize] = 0;

one would think it could be solved by by implementing SliceIndex<[T]> for I where usize: From<I>. At least, this would solve the case mentioned — but since there are a lot of methods which take usize index parameters (including most of the Vec methods), generalising this to all applicable cases would result in a fair amount of code churn (and probably a noticeable increase to compile time and some cases of failed monomorphisation).

In C++ one could take a different approach: implicit conversions. Rust doesn't have them and C++ over-uses them, but perhaps this would be a legitimate use (along with some other conversions which are simple and loss-less).

@vks

This comment has been minimized.

Copy link

vks commented Sep 1, 2018

one would think it could be solved by by implementing SliceIndex<[T]> for I where usize: From.

I don't think this this is feasible, because it breaks type inference, likely for a lot of code.

@dhardy

This comment has been minimized.

Copy link
Contributor Author

dhardy commented Sep 1, 2018

True, that wouldn't work because of type inference.

What might work is (a) trait Implicit<S>: From<S> {} selectively enabling implicit conversions plus (b) changes to type inference such that if the type is not directly specified (x: T) then there must be a unique type among input types (x = f() where f -> I) or, failing that, output types (g(x) where g(y: O)), such that all bounds are met (T: Implicit<I> and Implicit<O>: T).

But annoyingly literals need special attention to avoid breakage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment