
Champion "Native-Sized Number Types" #435

Open
gafter opened this issue Apr 13, 2017 · 52 comments

Comments


@gafter gafter commented Apr 13, 2017

  • Proposal added
  • Discussed in LDM
  • Decision in LDM
  • Finalized (done, rejected, inactive)
  • Spec'ed

/cc @KrzysztofCwalina @jaredpar

[jcouv update:] as part of this feature, we should consider what Interlocked overloads the BCL/runtime should provide for those new types.

@jkotas jkotas mentioned this issue Apr 17, 2017

@CyrusNajmabadi CyrusNajmabadi commented Apr 17, 2017

nuint... :)


@jnm2 jnm2 commented Apr 17, 2017

Actually, could C# add nint, nuint, and nfloat?


@CyrusNajmabadi CyrusNajmabadi commented Apr 17, 2017

As i mentioned in the other proposal, i don't see a lot of value in C# adding any keywords here. IntN, FloatN and the like are totally reasonable names to just use as-is. And if you really want an all-lowercase name, then just add using nint = System.IntN.


@jnm2 jnm2 commented Apr 17, 2017

Do you foresee people opting to use IntN over int as iteration variables?


@CyrusNajmabadi CyrusNajmabadi commented Apr 18, 2017

@jnm2 i'm not sure what you mean? Can you give an example? I don't generally think about iteration as being related to native sized ints... so i'm not sure what the connection is. Thanks!


@jnm2 jnm2 commented Apr 18, 2017

@CyrusNajmabadi Just a plain for loop for example. Everyone defaults to int for the iteration variable type, but would IntN potentially eke out more performance than int on x64?


@CyrusNajmabadi CyrusNajmabadi commented Apr 18, 2017

I would really hope not :) But i'll let the CLR guys weigh in on that.


@jkotas jkotas commented Apr 18, 2017

would IntN potentially eke out more performance than int on x64?

native int is more performant than int32 for loops that iterate over memory on 64-bit platforms. int32 tends to require int32->native int casts that are unnecessary overhead. E.g. check how some methods on Span<T> have been optimized to avoid the extra overhead: https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/SpanHelpers.byte.cs#L76


@jveselka jveselka commented Apr 18, 2017

I wonder, should nint/nuint be allowed as underlying type of enums as well? And nint/nuint consts?


@bbarry bbarry commented Apr 18, 2017

@zippec that would be a breaking change in struct layouts wouldn't it?


@tannergooding tannergooding commented Apr 18, 2017

@bbarry, how would that be a breaking change? Only new enums would support being subclassed from nint/nuint and the runtime already supports said functionality.

The runtime also supports native sized constants (although they are actually 32-bit constants that are implicitly converted)


@bbarry bbarry commented Apr 18, 2017

I misread that comment as changing the default underlying type.


@gafter gafter commented Apr 18, 2017

If we add this feature, how is the compiler supposed to know if the following code is valid or not (because it would overflow nint during compile-time constant folding)?

nint t = (nint)int.MaxValue + (nint)1;

@tannergooding tannergooding commented Apr 18, 2017

@gafter , I would think the compiler could determine the constant is larger than 32-bits and spit a warning that this will result in different behavior on 32-bit vs 64-bit platform.

Essentially the compiler just does all folding as the largest size (long or ulong) then explicitly downcasts to 32-bits for the store. If the downcast causes an overflow/underflow the compiler can warn/error.


@sharwell sharwell commented Apr 19, 2017

@gafter

how is the compiler supposed to know if the following code is valid or not (because it would overflow nint during compile-time constant folding)?

For compile-time constant folding, my instinct is to define the feature in the following way:

When evaluating a constant value of native size at compile time in checked mode, the computation is performed as though the native size is the preferred size for intermediate values. In this mode, it is an error for an intermediate value to overflow the range of the preferred size. In unchecked mode, the computation is performed as though the native size is the maximum allowable value for native-size integers for the compilation at runtime.

The preferred and maximum size for native integers is implementation-dependent. Examples of the preferred and maximum size for native integers for csc.exe are shown in the following table:

Argument                        Preferred Size   Maximum Size
Default/unspecified             64-bit           64-bit
/platform:anycpu                64-bit           64-bit
/platform:anycpu32bitpreferred  32-bit           64-bit
/platform:x64                   64-bit           64-bit
/platform:x86                   32-bit           32-bit

@sharwell sharwell commented Apr 19, 2017

I added some additional thoughts in the GitHub issue for the original proposal. For the case of native int (IntPtr) and native unsigned int (UIntPtr), I am in favor of defining and implementing this feature as a change to the language specification and the C# compiler, with no changes made to the standard library.


@tannergooding tannergooding commented Apr 19, 2017

@sharwell, that would be my existing proposal #48

However, it suffers from the back-compat concerns listed on the other thread, due to:

  • The constructor and cast operators having explicit checked behavior on 32-bit platforms
  • The existing add/sub operators defined in the standard library

@gafter gafter commented Apr 19, 2017

@gafter , I would think the compiler could determine the constant is larger than 32-bits and spit a warning that this will result in different behavior on 32-bit vs 64-bit platform.

Essentially the compiler just does all folding as the largest size (long or ulong) then explicitly downcasts to 32-bits for the store. If the downcast causes an overflow/underflow the compiler can warn/error.

I don't see how that works:

const nint a = unchecked((nint)uint.MaxValue + (nint)1);
const bool c = a == (nint)0;

The bool c must be true if run on a 64-bit platform, and false if run on a 32-bit platform. But since it is a compile-time constant, it must be fixed to one of those values before it is ever run on any platform.


@sharwell sharwell commented Apr 19, 2017

@gafter Interesting example. Outside of situations where the bit size is known at compile time (e.g. /platform:x86 or /platform:x64), I don't see how the use of const would be allowed on native-sized integers. I would not be opposed to disallowing it altogether. Since you could inline the entire expression in the initializer for bool c, I think this mandates a restriction that the outcome of constant folding as a compiler optimization (allowed but not required) is not allowed to be dependent on the bit width of the target machine.


@gafter gafter commented Apr 19, 2017

@sharwell Constant folding is not currently an "optimization" - it is required by the specification to occur under certain circumstances, and under no other circumstances. It is observable without const declarations:

void M(nint value)
{
    switch (value)
    {
        case (nint)0:
        case unchecked((nint)uint.MaxValue + (nint)1): // allowed, or duplicate case?
            break;
    }
}

@tannergooding tannergooding commented Apr 19, 2017

@gafter, The CLI spec explicitly states that native int constants are only allowed to be 32-bit values. So the problem is already solved: you can't have a native int constant with a value of 4294967296.


@tannergooding tannergooding commented Apr 19, 2017

The specific section is (I.12.1.4 CIL instructions and numeric types):

The load constant (ldc.*) instructions are used to load constants of type int32,
int64, float32, or float64. Native size constants (type native int) shall be created by
conversion from int32 (conversion from int64 would not be portable) using conv.i or conv.u.


@tannergooding tannergooding commented Apr 19, 2017

I would assume that means any native int constant would have to be created as two instructions:

ldc.i4 <num>
conv.i

The language spec would then need to determine whether or not (nint)uint.MaxValue + 1 is folded to be 0 (not portable) or if folding was not done and it was emitted as a constant plus an add:

ldc.i4 4294967295
conv.i
ldc.i4.1
add

It might be worth noting that (nint)uint.MaxValue + 1 and (nint)uint.MaxValue + (nint)1 should produce the same result, but the former requires one less instruction (since you do not need to run conv.i on the second value).


@sharwell sharwell commented Apr 19, 2017

@gafter If nint is not allowed to be used where a constant expression is required, then it would be an optimization to fold constants anyway.


@SWB2 SWB2 commented Jan 11, 2018

Why not adopt an approach similar to C's stdint.h?

For those who are not familiar, stdint.h defines exact-width types like int32_t, uint32_t, etc., which are equivalent to .NET's Int32, UInt32, etc. But it also defines types like, for example, int_fast32_t, which is a platform-specific type mapping to the fastest signed integer available with at least 32 bits. On a 32-bit machine, this would typically be an Int32. On a 64-bit machine, this would typically be an Int64. A C# equivalent might be named IntFast32.

One possible issue with something like nint or NativeInt is that there is no (obvious) guarantee of minimum size. Therefore, the possibility exists that a developer could write code that works just fine on their 64-bit machine but crashes on their customer's 32-bit machine. Granted, that possibility always exists anyway, but at least with a minimum size explicit in the type name, it's clearer what range of values are expected and whether that range is exceeded.

I'm not sure there's much use in .NET for the other variants in stdint.h and stddef.h (e.g. int_least32_t, intmax_t, size_t, ptrdiff_t, etc.), but if so, it might make sense to add them at the same time. (Personally, I wouldn't mind seeing UInt8 and Int8 as aliases for Byte and SByte for consistency with the rest of the integer names, but that's a minor quibble.)


@tannergooding tannergooding commented Jan 11, 2018

One possible issue with something like nint or NativeInt is that there is no (obvious) guarantee of minimum size.

It can be reasonably assumed that IntPtr will always be at least 4 bytes. I don't expect that CoreCLR itself has any plans to support 16-bit platforms, and I doubt that any ports of the runtime (outside of maybe an experimental/hobby project) would support such architectures either.

If the size does actually matter, IntPtr.Size will give you the size of a native integer at runtime, as will sizeof(void*) (and several other functions).

Therefore, the possibility exists that a developer could write code that works just fine on their 64-bit machine but crashes on their customer's 32-bit machine.

This is the case for anyone doing unsafe code (and manually doing pointer arithmetic) or otherwise dealing with native sized numbers. The same issue exists with any variable sized type (including the int_fast32_t you suggested).


@nietras nietras commented Jul 17, 2018

As an exercise in futility I have now completed the nint and nuint implementations written entirely in MSIL, since this issue and the availability of these in C# seem far off. This is now available as a NuGet package (https://www.nuget.org/packages/DotNetCross.NativeInts); see https://github.com/DotNetCross/NativeInts for more. It is tagged as alpha and comes with no warranties. It should be rather feature-complete though.

Feedback welcomed there :-)

No matter what, a language solution is still preferred.


@CyrusNajmabadi CyrusNajmabadi commented Jul 17, 2018

No matter what, a language solution is still preferred.

Question for my own edification:

Would there be any difference having this at the language level, vs using your package but adding at the top of a file:

using nint = DotNetCross.NativeInts.nint;
using nuint = DotNetCross.NativeInts.nuint;

?

Thanks!


@nietras nietras commented Jul 17, 2018

language level, vs using your package

@CyrusNajmabadi checked/unchecked compiler keywords are of course not supported. So overflow detection by compiler cannot be supported via library. The only overflow detection the library does is for the constructor taking a 64-bit integer e.g. new nint(long.MaxValue) will throw on 32-bit, as will new nuint(ulong.MaxValue). Other than that no overflow checking is done. I.e. no ovf IL instructions are used.

I hope and think everything else would be identical. So you can easily switch if compiler support comes.


@CyrusNajmabadi CyrusNajmabadi commented Jul 17, 2018

So overflow detection by compiler cannot be supported via library

This would only be for overflow detection of constants at compile-time right? Overflow trapping would still behave as expected at runtime depending on your compilation/checked/unchecked usage, right?

Thanks!


@tannergooding tannergooding commented Jul 17, 2018

@CyrusNajmabadi, not exactly.

Only primitive types support checked/unchecked contexts (as they depend on IL opcodes). native int is a primitive type, but without language support.

Even if a 3rd party (or even CoreFX) publishes a nint type, it requires compiler support for checked(nint + nint) to have any impact (otherwise it just uses whatever the exposed operator + method was compiled to do, which is generally unchecked semantics)


@tannergooding tannergooding commented Jul 17, 2018

A broader language feature would allow the user to declare both checked and unchecked forms of an operator and for those to be respected by the checked/unchecked language keywords. This would allow more advanced user types (like custom Int128 support) -- This would also allow System.IntPtr to just be expanded with these operators, which wouldn't be as ideal as native support (which would be a single IL opcode, rather than a call), but would be better than a new type (IMO).


@svick svick commented Jul 18, 2018

@tannergooding

A broader language feature would allow the user to declare both checked and unchecked forms of an operator and for those to be respected by the checked/unchecked language keywords.

For reference, the issue about that is #686.


@tannergooding tannergooding commented Jul 18, 2018

I thought someone logged that issue, I forgot that person was me :)


@markusschaber markusschaber commented Nov 18, 2018

After reading through several proposals like this, I still don't really get the value for interop scenarios, as in C, an int is always 32 bit on all platforms supported by .NET (Core, Mono, etc.).

What we actually need in practice are types which map to long and unsigned long C types, as they are 32 bit on x64 Windows, but 64 bit on 64 bit Linux (and, AFAIK, also MacOS) - thus, none of the existing types (including IntPtr) can be used to marshal it.

We actually had to duplicate 80% of all p/invoke signatures and structs, and guard all the calls with wrappers and if() in a production project... :-(

The library itself exposes a C interface, but is written in C++ and also exposes a C++ interface, and thus cannot change their type definitions without losing binary compatibility for existing C++ clients.


@yaakov-h yaakov-h commented Nov 19, 2018

for interop scenarios, as in C, an int is always 32 bit on all platforms supported by .NET (Core, Mono, etc.).

Xamarin for Apple's platforms have nint to interop with NSInteger, which is a 32-bit int in 32-bit processes and a 64-bit int in 64-bit processes.


@markusschaber markusschaber commented Nov 19, 2018

@yaakov-h This is not exactly the same. I see the value of nint for interop with Apple's native interfaces, especially for the higher layers interfacing with the interop layer. But IntPtr should actually work in this case, at least to get the data through interop.


@yaakov-h yaakov-h commented Nov 19, 2018

@markusschaber Have you read through the design proposal yet? It explicitly calls that out:

but we decided against it because of compatibility reasons (it would require changing behavior of UIntPtr and IntPtr)


@markusschaber markusschaber commented Nov 20, 2018

@yaakov-h Yes, I read that.
That's why I wrote "I see the value of nint...". A real integer type with working arithmetic etc. clearly has advantages compared to the limited possibilities of IntPtr - but IntPtr at least allows getting the values across the interop boundaries. For long and unsigned long, no such datatype exists. I clarified my comment above a bit.
Just to state it clearly: I'm not against the proposed nint / nreal datatypes, but I see more need for long datatypes, which you cannot even get across the interop boundaries without defining two different signatures and structs and if ()-guarding all accesses.


@GlassGrass GlassGrass commented Dec 13, 2018

I propose that reads and writes of native-sized numbers (and IntPtr/UIntPtr) should be atomic, and that this be declared in the spec document.

I found that the C# spec documents do not guarantee atomicity of reads/writes for IntPtr/UIntPtr.

ECMA-334 5th edition section 10.6, or spec/variables.md says:

Atomicity of variable references

Reads and writes of the following data types are atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types. In addition, reads and writes of enum types with an underlying type in the previous list are also atomic. Reads and writes of other types, including long, ulong, double, and decimal, as well as user-defined types, are not guaranteed to be atomic. Aside from the library functions designed for that purpose, there is no guarantee of atomic read-modify-write, such as in the case of increment or decrement.

But IntPtr/UIntPtr are supported on volatile modifier.

ECMA-334 5th edition section 15.5.4, or spec/classes.md says:

Volatile fields

... The type of a volatile field must be one of the following:

  • A reference_type.
  • The type byte, sbyte, short, ushort, int, uint, char, float, bool, System.IntPtr, or System.UIntPtr.
  • An enum_type having an enum base type of byte, sbyte, short, ushort, int, or uint.

...

And CLI guarantees the types, whose size are not larger than sizeof(native int), will be read/written atomically.

ECMA-335 6th Edition, section I.12.6.6 says:

Atomic reads and writes

A conforming CLI shall guarantee that read and write access to properly aligned memory locations no larger than the native word size (the size of type native int) is atomic when all the write accesses to a location are the same size.

...

In current C#, atomicity of access to native-sized value types is not assured, at least by the spec. I want to use native-sized values in high-performance multithreading with confidence.

@gafter gafter added this to 8.0 Candidate (not started) in Language Version Planning Mar 6, 2019
@MadsTorgersen MadsTorgersen moved this from 8.0 Candidate (not started) to 9.0 Candidate in Language Version Planning Apr 29, 2019
@gafter gafter modified the milestones: 8.0 candidate, 9.0 candidate Apr 29, 2019
@jcouv jcouv mentioned this issue Sep 24, 2019

@Zenexer Zenexer commented Sep 24, 2019

native int is more performant than int32 for loops that iterate over memory on 64-bit platforms. int32 tends to require int32->native int casts that are unnecessary overhead. E.g. check how some methods on Span<T> have been optimized to avoid the extra overhead: https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/SpanHelpers.byte.cs#L76

Permalink to that method around the time the comment was made (April 2017): https://github.com/dotnet/corefx/blob/d6173e069a9bcedfdfd7f4f41e67d23f67157b61/src/System.Memory/src/System/SpanHelpers.byte.cs#L71

Permalink to the same method as of the release of .NET Core 3.0.0 stable (September 2019): https://github.com/dotnet/corefx/blob/932957425cdbd752f8954c0ce2ebf9a06de530c9/src/Common/src/CoreLib/System/SpanHelpers.Byte.cs#L196


@GSPP GSPP commented Oct 1, 2019

How do native compilers deal with this loop bitness issue? I believe they transform the loop variable to a 64-bit integer and performance is optimal. The JIT could be taught that trick so that all code benefits.


@gafter gafter commented Oct 3, 2019

A draft spec is at #2833
