Semantic newtypes #2242

Closed
wants to merge 12 commits into from

Contributor

oli-obk commented Dec 11, 2017

Introduce a newtype construction allowing newtypes to use the capabilities of the underlying type while keeping type safety.

TLDR: type Foo = Bar;, but no implicit conversion between Foo and Bar

Rendered

Centril added the T-lang label (relevant to the language team, which will review and decide on the RFC) on Dec 11, 2017
Contributor

@Centril Centril left a comment

I like the moral behind this RFC but I'd like to discuss some details.

id type:

```rust
type UserIndex is new usize;

Contributor

This syntax feels alien to me - why not simply:

newtype UserIndex(usize);

or:

new type UserIndex(usize);

Assuming we are allowed to introduce this to the grammar (with an epoch or whatever), to me this feels more direct - you know that it is a newtype from the first syllable and it's more consistent with the current syntax rules. Perhaps you considered this syntax and rejected it - if so, why?

Contributor

I assume there's a good reason we can't just retrofit any/all of the following (which are all semantically newtypes given #[repr(transparent)]):

struct New(Base);

struct New {
    field_name: Base
}

enum New {
    Variant(Base)
}

enum New {
    Variant {
        field_name: Base
    }
}

Does it have to do with coherence and/or backwards-compatibility? Still, I think it should be spelled out whatever the reason.

Contributor Author

newtype

was rejected in #186 (comment)

new type

I like it.

#[repr(transparent)]

that representation does not do any newtyping to the best of my knowledge. It just makes sure the memory representation in the backend is 100% the same as just the inner type

Contributor

was rejected in #186 (comment)

2014 was eons ago and before 1.0 ^,- I think there's a good argument from familiarity (Haskell) to be made for newtype as a syntax. Tho new type could be equally clear. There's the "is new a modifier on type, or is it a different concept altogether?"-discussion to be had.

that representation does not do any newtyping to the best of my knowledge. It just makes sure the memory representation in the backend is 100% the same as just the inner type

Would not new type New(Base); assume #[repr(transparent)] if transmute is to be a possibility? What I mean is that:

#[repr(transparent)]
struct New(Base);

is semantically a new-type today - you just lack the auto-deriving capabilities. Those capabilities could perhaps be added for some traits (decidable by analysis on the syntax of the trait and not a specific list of traits?).

Contributor Author

Deriving creates more code for something that is just a semantic name change. Additionally it only works when you already know what traits to derive. Any traits implemented downstream for the base type won't get implemented for your type.

Contributor

Of course, you have to explicitly say what types you want to derive - often, there might be many traits you do want to derive and perhaps 1-2 you want to leave out in order to give a different impl for the newtype. A common example of this scenario is the Monoid trait and numeric types.

}

// Initialize the same way as the underlying types
let (start_inch, end_inch): (Inch, Inch) = (10, 18);

Contributor

What if Inch has a smart constructor explicitly rejecting subsets of usize? Seems like that use case isn't supported, which is what I expect from a feature called "newtype" (as used in Haskell).

Contributor Author

I'd leave that to the ranged types #671 or custom literals ideas

good(a); // Ok, Foo implements Sub
```

## Derived traits

Contributor

The #![derive_unfinished(..)] RFC has some notes here on how we can be more flexible wrt. deriving by extending the syntax of derive(_unfinished) itself. Feel free to steal any of those notes =) Tho I understand if you want to be conservative initially.

I think generalized-newtype-deriving would be the major ergonomics-boost coming from this proposal.

contextual keywords `is` and `new`. The reason for using `is new` instead of
another sigil is that `type X = Y;` would be very hard to distinguish from any
alternative like `type X <- Y;` or just `type X is Y;`.

Contributor

Not a fan of this grammar, see newtype as an alternative syntax above.

Newtypes can explicitly be converted to their base types, and vice versa.
Implicit conversions are not allowed.
This is achieved via the `Into` trait, since newtypes automatically implement
`From<BaseType>` and `From<NewType> for BaseType`. In order to not expose new

Contributor

This is achieved via the Into trait, since newtypes automatically implement From<BaseType> and From<NewType> for BaseType.

That sounds problematic. These impls would be available to anyone as soon as NewType is exported, because impls are not scoped in Rust. This means that you have no ability to write smart constructors that further constrain the allowed set of BaseType values. While refinement types may partially solve this in the future (partially, because the refinement language may/should not be Turing-complete, unlike unconstrained Rust code), we are far from having those currently. I think it is important that newtypes be allowed to provide only user-defined smart constructors such as NewType::new(..) -> Self.

To that end, as an alternative I'd like to propose that the compiler generate a module:

mod NewType {
    pub fn from_base(BaseType) -> NewType {..}
    pub fn into_base(NewType) -> BaseType {..}
}

Another alternative may be inherent and associated methods on NewType.
The From impls should be generated if and only if the user writes derive(From, Into) or perhaps derive(Newtype) (with a possible actual trait Newtype) to avoid conflicts with custom-derive.

Some notes discussing T<NewType> -> T<BaseType> where T may be &, Box, Arc, Vec, .. , would be nice even if you are proposing we not allow those conversions. In that case, notes on why the conversions should not be allowed.
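
To make the smart-constructor concern in this comment concrete, here is a minimal sketch in today's Rust using an ordinary tuple struct. The module `inch`, the type `NonZeroInch`, and its "never zero" invariant are hypothetical and chosen only for illustration; they do not come from the RFC. The point is that the invariant holds only because no unconditional `From<usize>` impl exists.

```rust
pub mod inch {
    /// Hypothetical newtype with an invariant: the wrapped value is never zero.
    /// The field is private, so code outside this module cannot bypass `new`.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct NonZeroInch(usize);

    impl NonZeroInch {
        /// Smart constructor: rejects the part of `usize` that is not allowed.
        pub fn new(raw: usize) -> Option<NonZeroInch> {
            if raw == 0 {
                None
            } else {
                Some(NonZeroInch(raw))
            }
        }

        /// Explicit conversion back to the base type, always safe.
        pub fn into_base(self) -> usize {
            self.0
        }
    }

    // Deliberately no `impl From<usize> for NonZeroInch`: an automatically
    // generated impl would let any crate write `NonZeroInch::from(0)`.
}
```

If the language generated `From<BaseType>` for every newtype, the `new` check above could be skipped by any downstream user, which is the problem described in the comment.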

Contributor Author

This is an open question as stated at the end of the RFC ;)

I'm totally on board with scrapping these impls.

I personally don't need them. I'd be happy writing them out whenever needed.

Some notes discussing T<NewType> -> T<BaseType>

I'll add those

## Implementation

The compiler would treat newtypes as a thin wrapper around the original type.
This means that just declaring a newtype does *not* generate any code, because

Contributor

This is inconsistent with:

This is achieved via the Into trait, since newtypes automatically implement From<BaseType> and From<NewType> for BaseType.

Contributor Author

Ha, good catch 😆


We could save a keyword with this approach and we might consider a generalization
over all tuple structs.

Contributor

Seems like an excellent idea! Please do consider that generalization =)

complaints about it seem to suggest that something needs to be done. Even
the compiler itself uses generated newtypes extensively for special `Vec`s that
have a newtype index type instead of `usize`.

Contributor

These are excellent points.
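
For readers unfamiliar with the newtype-index pattern mentioned in the quoted RFC text, the following is a rough sketch of the idea in plain Rust. The names `Idx`, `IndexVec`, and the methods on them are assumptions made for this example and are not claimed to match the compiler's actual internal types; `UserIndex` is borrowed from the RFC's own example.

```rust
use std::marker::PhantomData;
use std::ops::Index;

/// Index newtype: a `usize` that cannot be confused with other indices.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct UserIndex(usize);

/// Hypothetical trait describing index-like newtypes.
pub trait Idx: Copy {
    fn new(raw: usize) -> Self;
    fn raw(self) -> usize;
}

impl Idx for UserIndex {
    fn new(raw: usize) -> Self {
        UserIndex(raw)
    }
    fn raw(self) -> usize {
        self.0
    }
}

/// A vector that can only be indexed by `I`, never by a bare `usize`.
pub struct IndexVec<I: Idx, T> {
    items: Vec<T>,
    _marker: PhantomData<I>,
}

impl<I: Idx, T> IndexVec<I, T> {
    pub fn new() -> Self {
        IndexVec { items: Vec::new(), _marker: PhantomData }
    }

    /// Appends a value and returns the typed index at which it was stored.
    pub fn push(&mut self, value: T) -> I {
        let idx = I::new(self.items.len());
        self.items.push(value);
        idx
    }
}

impl<I: Idx, T> Index<I> for IndexVec<I, T> {
    type Output = T;
    fn index(&self, idx: I) -> &T {
        &self.items[idx.raw()]
    }
}
```

With this in place, `users[some_usize]` does not compile while `users[user_index]` does, at the cost of the boilerplate that the RFC wanted to remove.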

Member

@kennytm kennytm left a comment

Once upon a time, the D programming language had a typedef feature very similar to what this RFC proposes. Eventually they deprecated it in favor of a library solution due to lack of flexibility. I highly doubt this feature will find success as a built-in feature in Rust. 👎.

I counter-propose we concentrate on unblocking RFC #1406 (delegation of implementation) and support user-defined literals instead.

println!("{}", frobnicate(x.into()));

let a: usize = 2;
let i: Inch = a; // Compile error, implicit conversion not allowed

Member

What is the difference between a and 2 that makes

let i: Inch = a; // illegal
let j: Inch = 12; // legal

? Being a "literal"? What about a constant

const YARD: usize = 36;
let k: Inch = YARD;

And an expression:

let day: Seconds = 24 * 60 * 60;

What if the base type cannot be represented using any kind of literal?

type RcInch is new Rc<usize>;
let ai: RcInch = Rc::new(0); // ?????

Contributor Author

The difference is that a has a type.

let a = 2;
let i: Inch = a; // works, because `a` is inferred to be `Inch`

The same with expressions. It already works for choosing any of the base integer types. If inference is adjusted to newtypes, this will work out of the box just like the inferred variable type example above.

type RcInch is new Rc<usize>;
let ai: RcInch = Rc::new(0); // ERROR, expected RcInch, got Rc
let aj: RcInch = Rc::new(0).into(); // might work, but a lot of generics
let ak: RcInch = RcInch::new(0); // ok
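
The inference behaviour this reply relies on already exists for the built-in integer types, so it can be shown in current Rust without the proposed syntax. The sketch below uses `u8` as a stand-in for `Inch`; the function names are made up for the example.

```rust
/// The literal `2` has no fixed type yet, so `a` is inferred from its later use.
pub fn inferred() -> u8 {
    let a = 2;
    let i: u8 = a; // this use fixes the type of `a` to u8
    i
}

/// Once `a` carries an explicit type, no implicit conversion happens.
pub fn annotated() -> u8 {
    let a: usize = 2;
    // let i: u8 = a; // error[E0308]: mismatched types
    a as u8 // an explicit conversion is required instead
}

/// Whole expressions built from literals are inferred the same way.
pub fn seconds_per_day() -> u32 {
    let day: u32 = 24 * 60 * 60;
    day
}
```

Under the proposal as described in this reply, `Inch` would take part in the same inference, which is why `let a = 2; let i: Inch = a;` would be accepted while the annotated `usize` variant would not.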

Member

@oli-obk Thanks for the clarifications.

In Rust only integer and floating point literals have unspecified types. I suppose we want to support these cases?

type Identifier<'a> is new &'a str;
type Verbose is new bool;

let default_verbose: Verbose = false;
let tag_name: Identifier<'static> = "center";

Do we need to give these literals generic types, i.e. {{boolean}} and {{string}}?

In the derived trait implementations the basetype will be replaced by the newtype.

So for example as `usize` implements `Add<usize>`, `type Inch is new usize`
would implement `Add<Inch>`.

Member

This section requires a serious expansion. Without auto trait implementation, I don't see how this is better than simply introducing user-defined literals (1234_inch).

Examples to clarify:

  1. usize has an inherent method fn trailing_zeros(self) -> u32, should it be available to Inch?
  2. If (1) is yes, usize has an inherent method fn swap_bytes(self) -> usize, should it be available to Inch and returns an Inch?
  3. If (2) is yes, usize has an inherent method fn checked_neg(self) -> Option<usize>, should it be available to Inch and returns an Option<Inch>?
  4. If (2) is yes, usize has an associated (static) method const fn min_value() -> usize, should it be available to Inch?
  5. usize implements Add<&usize>. Would Inch implement Add<&Inch>?
  6. If (5) is yes, &usize implements Add<&usize>, would &Inch implement Add<&Inch>?
  7. Should usize implement Add<Inch>? Add<&Inch>?
  8. usize implements Shr<T> for all integer types T (u8, u32, usize etc). Should Inch implement Shr<usize>? Should Inch implement Shr<Inch>?

Contributor Author

  1. yes, that's the entire point of semantic newtypes, you get everything the base type has
  2. yes
  3. yes
  4. yes
  5. yes
  6. yes
  7. no
  8. The strategy proposed in this RFC would result in Inch: Shr<Inch>

I'll add these points in the RFC text
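
As a point of comparison, this is roughly what a hand-written tuple-struct newtype has to spell out today to get the behaviour answered "yes" in points 1, 2, 3 and 5. It is a sketch of the manual equivalent, not of how the RFC would be implemented; the `Inch` struct here is an ordinary wrapper, not the proposed `type Inch is new usize`.

```rust
use std::ops::Add;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Inch(pub usize);

impl Inch {
    /// (1) Inherent method forwarded unchanged.
    pub fn trailing_zeros(self) -> u32 {
        self.0.trailing_zeros()
    }

    /// (2) `usize` in the return type is replaced by `Inch`.
    pub fn swap_bytes(self) -> Inch {
        Inch(self.0.swap_bytes())
    }

    /// (3) The replacement also applies inside `Option`.
    pub fn checked_neg(self) -> Option<Inch> {
        self.0.checked_neg().map(Inch)
    }
}

/// `usize: Add<usize>` becomes `Inch: Add<Inch>`.
impl Add<Inch> for Inch {
    type Output = Inch;
    fn add(self, rhs: Inch) -> Inch {
        Inch(self.0 + rhs.0)
    }
}

/// (5) `usize: Add<&usize>` becomes `Inch: Add<&Inch>`.
impl<'a> Add<&'a Inch> for Inch {
    type Output = Inch;
    fn add(self, rhs: &'a Inch) -> Inch {
        Inch(self.0 + rhs.0)
    }
}
```

Note that no `impl Add<Inch> for usize` is written, matching the "no" given for point 7.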

Contributor Author

oli-obk commented Dec 12, 2017

I've become convinced that a library based solution will be better.

Reasons:

  • Zero changes required to the compiler or language
    • Especially the compiler changes would have been significant
  • Except in very common cases like newtype indices, some trait impls will be required to be overwritten or removed
    • newtype indices could simply have a one-line solution in a library based version
  • Most of the changes, such as conversion from literals, would be needed anyway.
  • Conversions between base type and newtype will often want to be specified explicitly anyway
  • Compiler diagnostics can be tuned to report nicely on library newtypes, no need for language based newtypes
  • Incremental move from the current newtypes possible
  • Downstream crates implementing traits for the base type but not the newtype could be solved by a NewType<BASE> trait, where crates stop implementing traits for builtin types, but instead implement for T where T: NewType<usize>. Of course U: NewType<U>, because every type is its own newtype 😆
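
The `NewType<BASE>` idea in the last bullet could look something like the sketch below. Only the trait name comes from the comment; the method names `into_base` and `from_base`, the `Describe` trait, and the `Inch` type are assumptions invented for this illustration.

```rust
/// Hypothetical library trait: `Self` is a newtype over `Base`.
pub trait NewType<Base> {
    fn into_base(self) -> Base;
    fn from_base(base: Base) -> Self;
}

/// "Every type is its own newtype."
impl<T> NewType<T> for T {
    fn into_base(self) -> T {
        self
    }
    fn from_base(base: T) -> T {
        base
    }
}

/// A library-defined newtype over `usize`.
pub struct Inch(pub usize);

impl NewType<usize> for Inch {
    fn into_base(self) -> usize {
        self.0
    }
    fn from_base(base: usize) -> Inch {
        Inch(base)
    }
}

/// A downstream trait implemented for everything that wraps `usize`,
/// instead of for `usize` alone.
pub trait Describe {
    fn describe(self) -> String;
}

impl<T: NewType<usize>> Describe for T {
    fn describe(self) -> String {
        // Fully qualified call, so the `usize` base is named explicitly.
        let raw = <T as NewType<usize>>::into_base(self);
        format!("{} units", raw)
    }
}
```

With this arrangement both `7usize.describe()` and `Inch(3).describe()` compile, because `usize` satisfies `NewType<usize>` through the reflexive impl.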

@oli-obk oli-obk closed this Dec 12, 2017
Member

eddyb commented Dec 12, 2017

Something I don't think anyone pointed out in this thread is that impl From<T> + Into<T> can act a bit like a newtype (though only for certain usecases of it) and "happens" to work with T itself.
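
A small sketch of what this observation means in practice: a function bounded by `From<usize> + Into<usize>` accepts a wrapper type and, thanks to the standard library's reflexive `impl<T> From<T> for T`, also plain `usize`. The function `double` and the `Inch` wrapper are hypothetical names used only here.

```rust
/// Accepts any type that converts to and from `usize`.
pub fn double<N>(n: N) -> N
where
    N: From<usize> + Into<usize>,
{
    let raw: usize = n.into();
    N::from(raw * 2)
}

#[derive(Debug, PartialEq)]
pub struct Inch(pub usize);

impl From<usize> for Inch {
    fn from(raw: usize) -> Inch {
        Inch(raw)
    }
}

impl From<Inch> for usize {
    fn from(inch: Inch) -> usize {
        inch.0
    }
}
```

Calling `double(21usize)` and `double(Inch(21))` both type-check, and the result keeps the caller's type, which covers some newtype use cases without any language support.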


5 participants