Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for std::num::NonZero* types #49137

Closed
Centril opened this issue Mar 18, 2018 · 48 comments

Comments

Projects
None yet
@Centril
Copy link
Member

commented Mar 18, 2018

This is a tracking issue for the RFC "Add std::num::NonZeroU32 and friends, deprecate core::nonzero" (rust-lang/rfcs#2307).

Steps:

Unresolved questions:

  • Should the signed types be included?
  • Decide on naming -- NonZeroU32 or PositiveU32? Should the U be dropped?
  • Should the memory layout of e.g. Option be a language guarantee?
  • Settle issues wrt. const generics before stabilization.
    -- We should either end up with a generic type or with a clear explanation of why a generic type can't work or doesn't work well.
@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2018

Implementation PR: #48265

Documentation: @Centril, do doc-comments in this PR look sufficient to you?


My opinion on unresolved questions:

Signed types (e.g. NonZeroI32): as discussed in the RFC thread, they are included in the RFC and implementation PR because it was easy and because core::nonzero::NonZero<T> supports them too. I see a downside to having them, but I also haven’t had a use case for them. And they are easy to build outside of std in a crate that uses the unsigned types internally and does the conversion in new, new_unchecked, and get. (Unlike e.g. non-zero integers of different sizes if only NonZeroUsize were available.)

Positive* names: The fact that zero is not positive in English is subtle, and it is not shared by some other languages. (For example zero is both positive and negative in French, and the correct translation for “nombre positif” is “non-negative number”.) I don’t think relying on such a subtlety to communicate the core property of these types is a good idea.

Dropping the U in names: this is only possible if we also remove the signed types. But even so, I think that including the full name of the type accepted by new and returned by get is desirable.

Separate types v.s. one generic type: this was discussed at length in the RFC. The former has precedent with std::sync::atomic::Atomic*. The latter brings other unresolved questions that I feel don’t have satisfactory answers: trait bounds on that type parameter, and implementations of relevant traits for composite for types outside of std.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2018

Making the memory layout of Option<num::NonZero*> a language guarantee: I explicitly made the RFC not make this guarantee in order to keep the RFC within the realm of the standard library (and hopefully easier to get accepted). But if @rust-lang/lang feels comfortable making that guarantee, that sounds good to me.

@Centril

This comment has been minimized.

Copy link
Member Author

commented Mar 18, 2018

@SimonSapin

Documentation: @Centril, do doc-comments in this PR look sufficient to you?

Pretty much, yeah =)


On the unresolved questions:

  1. Maybe we should skip the signed types but add them when someone asks for them with an issue or PR or on IRC (don't need to go to another RFC then...)?

  2. I agree that NonZeroU32 is best.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2018

For what it’s worth the implementation of the 6 signed types costs exactly an additional 111 characters in a macro invocation. They’re not a maintenance burden.

@Centril

This comment has been minimized.

Copy link
Member Author

commented Mar 18, 2018

For what it’s worth the implementation of the 6 signed types costs exactly an additional 111 characters in a macro invocation. They’re not a maintenance burden.

This is true; but they cost in compile times -- but probably not a noteworthy amount of time.

@clarfon

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2018

Honestly, compile time of end-user crates matters a lot more than compile time of libstd and co. because for most users libstd will already be compiled. So, even adding a minute of compile time to libstd isn't super bad.

@scottmcm

This comment has been minimized.

Copy link
Member

commented Mar 20, 2018

Sounds like I'm the only one who likes Positive, so let's mark that resolved and stick with NonZeroU32.

Maybe just put the signed ones under a different feature gate? That way we'll find out which ones people use, and there's still nobody "attached" to them, stabilize/remove them separately. (Personally I think 0 is a weird hole in a signed type, and would much rather have nice named types for symmetric integers [not INT_MIN] than ones without zero. [Until we get const generics and can make our own holes.])

@clarfon

This comment has been minimized.

Copy link
Contributor

commented Mar 20, 2018

I honestly think that Positive8 is a bit ambiguous considering how it could mean only 1 to 127 instead of 255.

NonZeroed8 may be a bit (heh) cleaner as it indicates that the bits aren't all zeroed, rather than the fact that 0 is an excluded value.

bors added a commit that referenced this issue Mar 21, 2018

Auto merge of #48265 - SimonSapin:nonzero, r=KodrAus
Add 12 num::NonZero* types for primitive integers, deprecate core::nonzero

RFC: rust-lang/rfcs#2307
Tracking issue: ~~#27730 #49137
Fixes #27730

frewsxcv added a commit to frewsxcv/rust that referenced this issue Mar 23, 2018

Rollup merge of rust-lang#48265 - SimonSapin:nonzero, r=KodrAus
Add 12 num::NonZero* types for primitive integers, deprecate core::nonzero

RFC: rust-lang/rfcs#2307
Tracking issue: ~~rust-lang#27730 rust-lang#49137
Fixes rust-lang#27730

alexcrichton added a commit to alexcrichton/rust that referenced this issue Mar 23, 2018

Rollup merge of rust-lang#48265 - SimonSapin:nonzero, r=KodrAus
Add 12 num::NonZero* types for primitive integers, deprecate core::nonzero

RFC: rust-lang/rfcs#2307
Tracking issue: ~~rust-lang#27730 rust-lang#49137
Fixes rust-lang#27730

@SimonSapin SimonSapin changed the title Tracking issue for RFC 2307, "Add std::num::NonZeroU32 and friends, deprecate core::nonzero" Tracking issue for std::num::NonZero* types Mar 23, 2018

@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Mar 23, 2018

The implementation #48265 has landed, it should reach Nightly in a few hours.

@Centril

This comment has been minimized.

Copy link
Member Author

commented Mar 23, 2018

@SimonSapin how do you feel about @scottmcm's note that we should:

Maybe just put the signed ones under a different feature gate?

@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Mar 23, 2018

How useful is this to do now? (v.s. changing the status later of only a subset)

My thoughts lately on the signed types is that there isn’t really a strong case against them, but there isn’t much of a use case either. Adding that they can be easily be built out of tree on top of the unsigned types, I’d be ok with removing or deprecating them and see if anyone objects. They’re easy to add (back) later if we decide to do so.

@Centril

This comment has been minimized.

Copy link
Member Author

commented Mar 23, 2018

@SimonSapin

I’d be ok with removing or deprecating them and see if anyone objects. They’re easy to add (back) later if we decide to do so.

👍

SimonSapin added a commit to SimonSapin/serde that referenced this issue Mar 24, 2018

impl Serialize and Deserialize for std::num::NonZero*
… gated on the `unstable` Cargo feature.

These are new standard library types. Tracking issue:
rust-lang/rust#49137

SimonSapin added a commit to SimonSapin/serde that referenced this issue Mar 24, 2018

impl Serialize and Deserialize for std::num::NonZero*
… gated on the `unstable` Cargo feature.

These are new standard library types. Tracking issue:
rust-lang/rust#49137
@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Mar 24, 2018

Deprecation of NonZeroI* #49324

Support for NonZeroU* in serde: serde-rs/serde#1191

@mickvangelderen

This comment has been minimized.

Copy link

commented Jul 26, 2018

I arrived here through the deprecation notice for NonZeroI32. I am working with OpenGL and want to wrap uniform locations which are represented by an i32 in the OpenGL API. Uniform location values are non-negative except for the sentinel value -1 representing a location that wasn't found. I would liked to have defined the wrapper as struct UniformLocation(NonZeroI32), storing the raw uniform location value + 1. Currently I use NonZeroU32 and a few casts which is fine but I wanted to report my use case regardless. Alternatively I would be able to specify that -1 can represent None as mentioned in #49137 (comment).

@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Jul 26, 2018

Alternatively I would be able to specify that -1 can represent None

There seems to be consensus that a more general feature, where you can specify what (ranges of) values are forbidden, would be desirable. But it would likely be based on const generics which is not implemented in the language yet.

In the meantime, NonZeroI32 with an offset would allow values -2, -3, etc which you’re saying should not be allowed in this context. So it sounds like it wouldn’t do what you want. If I missed something, could you give a more detailed example?

@Kixunil

This comment has been minimized.

Copy link

commented Aug 25, 2018

Would using an unsafe trait to specify invalid ranges be viable? I think traits are a great way to specify interfaces, so that idea appeals to me at least from this point of view.

Example:

/// Allows for a type to specify invalid bit pattern in order to allow enum layout optimization.
#[some_magic_lang_item_annotation]
unsafe trait InvalidValues: Sized {
    /// How many invalid values can be written into the same memory Self fits in.
    const INVALID_VALUE_COUNT: usize;

    /// Must map discriminant to some invalid bit pattern and write it to `dest`.
    /// The caller must guarantee that `discriminant < Self::INVALID_VALUE_COUNT`.
    /// The caller must also provide a valid pointer. The pointer must point to uninitialized memory.
    const unsafe fn write_invalid_value(dest: NonNull<Self>, discriminant: usize);

    /// This function must inspect the memory pointed to by `value` and return the same discriminant as previously written by `write_invalid_value()`. In case the memory contains valid value, this function must return `Self::INVALID_VALUE_COUNT`.
    const unsafe fn get_discriminant(value: NonNull<Self>) -> usize;
}

It would allow users to specify arbitrary invalid bit patterns. That way we could even have things like:

#[repr(transparent)] // not necessary, but why not?
struct LessThan253(u8);

impl InvalidValues for LessThan254 {
    const INVALID_VALUE_COUNT: usize = 2;

    unsafe fn write_invalid_value(dest: NonNull<Self>, discriminant: usize) {
        mem::write(dest.as_ptr() as *mut u8, discriminant as u8 + 254);
    }

    unsfe fn get_discriminant(value: NonNull<Self>) -> usize {
        let value = mem::read(value.as_ptr() as *mut u8) as usize;
        if value < 254 {
            Self::INVALID_VALUE_COUNT
        } else {
            value - 254
        }
    }
}

fn test() {
    enum Foo {
        Value(LessThan254),
        Bar,
        Baz,
    }

    assert_eq!(mem::size_of<Foo>() == mem::size_of<LessThan254>());
    assert_eq!(mem::size_of<u8>() == mem::size_of<LessThan254>());
}

What do you think?

@eddyb

This comment has been minimized.

Copy link
Member

commented Aug 26, 2018

@Kixunil Using just associated consts could maybe work, but I really really do not want to involve the trait system during layout computation and risk weird inconsistencies because of how people wrote the impls, or have to do complex checking ahead of time, when the impls are first written.

Using methods is a no-no, it means you're getting no LLVM optimization (), no nested reuse (for the same reason Option<Option<(&T, &U)>> is not the same size as Option<(&T, &U)>), miri wouldn't work, and probably a myriad of other things that escape me right now.

IMO the only reliable way to get this is const generics and a wrapper type that removes a values of a certain width (or just full width), at a certain location (or just offset 0), from the type.
The name is (obviously) up for bikeshed, but WithInvalid<u8, 254..=255> is one option.

I think traits are a great way to specify interfaces

Sure. But this is not an interface, it's a fundamental representational property of the data type.
You can totally emulate Option<T> using some chosen value of T for None, in a library, but this is not that - the invalid values the compiler knows about are UB to exist.

In your case, who's stopping me from creating LessThan254(255)? You have to write checks in unsafe code, and manually maintain the invariants, whereas a wrapper type would do all that for you.

@Kixunil

This comment has been minimized.

Copy link

commented Sep 9, 2018

@eddyb Could you give an example of weird inconsistency that might happen? People can already shoot themselves in the foot in unsafe code. I don't see a problem with careful use of unsafe when it's really needed.

Using methods is a no-no, it means you're getting no LLVM optimization

Why? If LLVM can optimize some functions, what's preventing it from optimizing other functions?

miri wouldn't work

Why one const fn works in miri but not other const fn?

the invalid values the compiler knows about are UB to exist.

would using &mut MaybeUninitialized<Self> in the signature help?

In your case, who's stopping me from creating LessThan254(255)

What's preventing me to implement e.g. Mutex incorrectly? Nothing. I have to use the unsafe keyword, though. If someone impls that trait for their type incorrectly, it's the same thing as screwing up other unsafe code. I don't understand, why this unsafe code should be special.

WithInvalid could be implemented in terms of InvalidValues. Additionally, InvalidValues can be used for things like OddNumber, which can't be achieved with simple range.

@eddyb

This comment has been minimized.

Copy link
Member

commented Sep 9, 2018

Could you give an example of weird inconsistency that might happen?

The problem is a trait impl is not "inherent" to a type. We'd have to pile on a bunch of checks, to ensure impls of this trait are "boring" wrt to when they hold. Most importantly, the implementation must not depend on type parameters, nor introduce relationships between lifetime parameters (or just not have generic parameters at all). In a way, this is similar to the restrictions of the Drop trait.

With all of that work, and const generics on the horizon, I prefer an "obviously correct" solution using const generics, instead.

Why? If LLVM can optimize some functions, what's preventing it from optimizing other functions?

I mean we can't tell LLVM anything about value validity, if you have functions doing encoding and decoding. Our existing contiguous "(in)valid ranges" actually lower to LLVM !range metadata.

Why one const fn works in miri but not other const fn?

We don't have const fn in traits yet.

would using &mut MaybeUninitialized<Self> in the signature help?

Sure, but you actually want a custom union between Self and a type with the same primitive field but no validity semantics, because otherwise you can't access the primitive.

Additionally, InvalidValues can be used for things like OddNumber, which can't be achieved with simple range.

I personally have no intention of supporting those kinds of optimizations. they can significantly complicate discriminant decoding (which happens at runtime, unlike encoding), whereas, currently, all of our enums, even when using invalid values, take at most one subtraction and one comparison, to decode the discriminant.

@Kixunil

This comment has been minimized.

Copy link

commented Sep 16, 2018

That makes a lot more sense to me now, thanks for explaining!

@frankmcsherry

This comment has been minimized.

Copy link
Contributor

commented Nov 25, 2018

Somewhere else (the PR that deprecated them) one is requested to drop in here and indicate if you use (or would use, I guess) signed NonZero variants. I do / would, lots. The most common use case is in differential dataflow (but it is probably more common) where you record changes to things. You don't ever need to record a zero change, and indeed often you want to use types to communicate the non-zero'd ness.

For example, I'd love to be using a trait

trait Addition {
    /// returns the result of the summation if non-zero.
    fn add(self, other: Self) -> Option<Self>;
}

The important point for me is that I'd like the opportunity for the zero value to not inhabit the types, so that e.g. () could implement GF(2), which is important for zero-overhead implementation of incremental updates to collections with set semantics (vs tracking a 0/1 multiplicity as an associated integer).

Without non-zero signed integers, the above Option<isize> ends up being a bit bigger and probably a bit more complicated logically (vs simply addition with the zero value interpreted as None).

Anyhow, I can work around forever, obvs, but I thought you all might want to know! :D

@eddyb

This comment has been minimized.

Copy link
Member

commented Nov 25, 2018

@frankmcsherry FWIW this should be sufficient:

#[derive(Copy, Clone)]
pub struct NonZeroI32(NonZeroU32);

impl NonZeroI32 {
    pub fn new(x: i32) -> Option<Self> {
        NonZeroU32::new(x as u32).map(NonZeroI32)
    }
    pub fn get(self) -> i32 {
        self.0.get() as i32
    }
}
@andrewhickman

This comment has been minimized.

Copy link

commented Nov 25, 2018

There is also a crate implementing signed NonZero type

@frankmcsherry

This comment has been minimized.

Copy link
Contributor

commented Nov 25, 2018

Yup, thank you! I'm def happy working around, but equally happy to let you folks know about use cases (and if other people have them, put something like this next to NonZeroU32 in the stdlib).

@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Nov 25, 2018

Given the existence of a work-around that preserves the enum layout optimization, this doesn’t seem very pressing. On the other hand, adding them is very easy, and not really any additional maintenance burden. So I’m really neutral on this one. Consider nagging some other member of the libs team :)

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

commented Nov 25, 2018

This is probably more amusing than significant, but another real-world use case of signed-numbers-without-zero is years.

@yoshuawuyts

This comment has been minimized.

Copy link
Member

commented Jan 6, 2019

Another use case for signed non-zero numbers: I'm currently working with a C library where the number 0 is significant. Representing this as an enum would probably work well I reckon.

libnghttp2_sys priority_spec.stream_id

API Example

An example of how to represent the libnghttp API above:

pub type StreamId = NonZeroI32;

pub enum StreamDependency {
  Independent,
  Dependent(StreamId),
}
@clarfon

This comment has been minimized.

Copy link
Contributor

commented Jan 6, 2019

For now, can't signed nonzero types just be implemented as a separate crate? Then people can judge inclusion into libstd based upon how many people use it.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Jan 6, 2019

Yes, https://crates.io/crates/nonzero_signed was mentioned upthread.

@clarfon

This comment has been minimized.

Copy link
Contributor

commented Jan 6, 2019

...oh. In that case, perhaps this issue could be locked from future comments on that and a larger note could be made to mention that crate?

SimonSapin added a commit to SimonSapin/rust that referenced this issue Jan 9, 2019

Add signed num::NonZeroI* types
Multiple people have asked for them, in
rust-lang#49137.
Given that the unsigned ones already exist,
they are very easy to add and not an additional maintenance burden.
@SimonSapin

This comment has been minimized.

Copy link
Contributor

commented Jan 9, 2019

#57475 Add signed num::NonZeroI* types

SimonSapin added a commit to SimonSapin/rust that referenced this issue Jan 10, 2019

Add signed num::NonZeroI* types
Multiple people have asked for them, in
rust-lang#49137.
Given that the unsigned ones already exist,
they are very easy to add and not an additional maintenance burden.

SimonSapin added a commit to SimonSapin/rust that referenced this issue Jan 10, 2019

Add signed num::NonZeroI* types
Multiple people have asked for them, in
rust-lang#49137.
Given that the unsigned ones already exist,
they are very easy to add and not an additional maintenance burden.

SimonSapin added a commit to SimonSapin/rust that referenced this issue Jan 10, 2019

Add signed num::NonZeroI* types
Multiple people have asked for them, in
rust-lang#49137.
Given that the unsigned ones already exist,
they are very easy to add and not an additional maintenance burden.

SimonSapin added a commit to SimonSapin/rust that referenced this issue Jan 17, 2019

Add signed num::NonZeroI* types
Multiple people have asked for them, in
rust-lang#49137.
Given that the unsigned ones already exist,
they are very easy to add and not an additional maintenance burden.

SimonSapin added a commit to SimonSapin/rust that referenced this issue Jan 17, 2019

Add signed num::NonZeroI* types
Multiple people have asked for them, in
rust-lang#49137.
Given that the unsigned ones already exist,
they are very easy to add and not an additional maintenance burden.

SimonSapin added a commit to SimonSapin/rust that referenced this issue Jan 17, 2019

Add signed num::NonZeroI* types
Multiple people have asked for them, in
rust-lang#49137.
Given that the unsigned ones already exist,
they are very easy to add and not an additional maintenance burden.

SimonSapin added a commit to SimonSapin/rust that referenced this issue Jan 17, 2019

Add signed num::NonZeroI* types
Multiple people have asked for them, in
rust-lang#49137.
Given that the unsigned ones already exist,
they are very easy to add and not an additional maintenance burden.

Centril added a commit to Centril/rust that referenced this issue Jan 18, 2019

Rollup merge of rust-lang#57475 - SimonSapin:signed, r=alexcrichton
Add signed num::NonZeroI* types

Multiple people have asked for them in rust-lang#49137. Given that the unsigned ones already exist, they are very easy to add and not an additional maintenance burden.

bors added a commit that referenced this issue Jan 22, 2019

Auto merge of #57475 - SimonSapin:signed, r=estebank
Add signed num::NonZeroI* types

Multiple people have asked for them in #49137. Given that the unsigned ones already exist, they are very easy to add and not an additional maintenance burden.

VardhanThigle pushed a commit to jethrogb/rust that referenced this issue Jan 31, 2019

Add signed num::NonZeroI* types
Multiple people have asked for them, in
rust-lang#49137.
Given that the unsigned ones already exist,
they are very easy to add and not an additional maintenance burden.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.