RFC: rename `int` and `uint` to `intptr`/`uintptr` #9940
Comments
I think fixed integers of pseudo-arbitrary width are rarely useful.
Seems to me … I rather like the … I think the machine-word-sized …
Why introduce a completely new & (so far) unused naming convention to the language?
@Thiez: They aren't machine word size, they're pointer-size. On the x32 ABI they will be 32-bit, despite having 16 64-bit integer registers. If you want to use fixed-size integers correctly, you need upper bounds on the size. Fixed-size types named …
@thestinger fair point. Perhaps that should change as well? Since we're not really supposed to be messing around with pointers outside of unsafe blocks, perhaps a pointer-size type is deserving of an ugly name. That opens up the option of having int and uint be machine word sized...
@cmr: I agree they're awful names. We should discourage using fixed-size types only when bounds are unknown. I think you only want these types in low-level code or for in-memory container sizes. @Thiez: I don't really think word-sized is a useful property. If the upper bound is 32-bit, 32-bit integers will likely be fastest for the use case due to wasting less cache space.
I realize my suggestion is silly anyway as one would still need a pointer-size variable for array and vector lengths, which is a nice case for int/uint (but not when they're word-sized). Ignore it :)
1fish2 commented Oct 19, 2013
I completely agree with @thestinger. A machine-word-sized integer means bugs and security holes, e.g. because you ran the tests on one platform and then deployed on others. If one of the platforms has a 16-bit int like PalmOS, that's too short to use without thinking carefully about it, so prudent coding style forbids un-sized int and uint. (Actually the PalmOS 68000 ABI is emulated on a 32-bit ARM, so it's not clear what counts as a machine word.) Hence the strategy of using a pointer-size integer type only in low-level code that requires it, with an ugly name.
UtherII commented Oct 19, 2013
I agree that using int and uint should be discouraged and renaming them to a less straightforward name is better.
I think that's a good idea. You can't really rely on very much when using … I'm not so fond of the names …
IMHO, the interesting questions are: what type should be used to index into arrays, and what should it be named? Indexing into arrays is pretty common. A pointer-sized type is needed to be able to represent any index. It should presumably be unsigned. I'm not sure if there's much reason to also have a signed version.

Expanding to a BigInt on overflow doesn't make much sense here. But wrapping around on over/underflow also doesn't make very much sense, I think. If you want to catch over/underflow and …

I think the strongest argument might be for expanding: negative or larger-than-the-address-space values don't make sense for array indexes, but the array bounds check will already catch that. Meanwhile it's versatile and generally useful for most other situations as well, not just array indexing. The downside is a performance cost relative to a type that wraps on over/underflow. (In the event of a fixed pointer-sized type, the relevant association when naming it should be that it holds any array index, not that it's pointer-sized.)

Whatever this type ends up being and however it's named, it's the one that should be in the … If someone explicitly needs pointer-sized machine integers for …
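A minimal sketch of the bounds-check point above, using today's `usize` and `get` spellings purely as stand-ins (not what existed at the time): an unsigned index that wraps around on underflow is still rejected at the indexing site.

```rust
fn main() {
    let xs = [1u32, 2, 3];
    let i: usize = 0;
    // Underflow wraps to a huge value (usize::MAX here)...
    let j = i.wrapping_sub(1);
    // ...which the ordinary bounds check still rejects.
    assert!(xs.get(j).is_none());
}
```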
Dumb question here, but what's the use of having a signed pointer-sized int at all? Could we get away with having only the unsigned one? As for the general idea of this bug, I'm warming to it after seeing how well the removal of …
@bstrie: a signed one is needed for offsets/differences (POSIX has …)
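A minimal sketch of why differences want a signed, pointer-sized type, using today's `usize`/`isize` names as stand-ins for the types being debated:

```rust
fn main() {
    let xs = [10, 20, 30, 40, 50];
    let (i, j) = (1usize, 4usize);

    // Subtracting indices directly as unsigned values would underflow
    // when i < j, so the offset wants a signed, pointer-sized type.
    let offset = i as isize - j as isize;
    assert_eq!(offset, -3);

    // A signed offset can then be applied back to an unsigned index.
    let k = (j as isize + offset) as usize;
    assert_eq!(xs[k], xs[i]);
}
```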
1fish2 commented Oct 19, 2013
@thestinger good point. Subtracting array indexes should yield a signed value. So to reverse the question, what's the need for an unsigned array index type? Is it feasible to allocate a byte array that takes more than half the address space?
AFAIK the rationale for unsigned types here is to avoid the need for a dynamic check for a negative integer in every function. A bounds check only has to compare against the length, and a reserve/with_capacity function only has to check for overflow, not underflow. It just bubbles up the responsibility for handling underflow as far as possible into the caller (if it needs to check at all - it may never subtract from an index).
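A sketch of that rationale (not the actual libstd code; modern `usize`/`isize` are used for illustration): with an unsigned index, the bounds check is a single comparison against the length, while a signed index forces every callee to also reject negative values.

```rust
// Unsigned index: one comparison against the length suffices.
fn get_unsigned(xs: &[u32], i: usize) -> Option<u32> {
    if i < xs.len() { Some(xs[i]) } else { None }
}

// Signed index: every function must additionally check for negatives.
fn get_signed(xs: &[u32], i: isize) -> Option<u32> {
    if i >= 0 && (i as usize) < xs.len() {
        Some(xs[i as usize])
    } else {
        None
    }
}

fn main() {
    let xs = [1, 2, 3];
    assert_eq!(get_unsigned(&xs, 2), Some(3));
    assert_eq!(get_signed(&xs, -1), None);
}
```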
cc me. I have been contemplating whether …
@pnkfelix I think the two are very closely related. (basically: if we want to use the existing int/uint as the preferred type for indexing arrays, then they should not be renamed to intptr/uintptr, but if we want to prefer a different type for that (e.g. one which checks for over/underflow), then they should be renamed.)
To those commenting that … Having … +1 for this change from me.
If there's consensus that it's bad practice to use `int` and `uint`, …
@brson We already make folks choose between …
I find this thread confusing.
The other possibility was to use a type that doesn't wrap on over/underflow, but either traps or extends into a bigint. Which is likely to be slow, but I don't know whether it's been tested.
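For illustration, a small sketch of the behaviours being contrasted, using today's standard-library methods purely as stand-ins (neither is the proposal itself):

```rust
fn main() {
    let x: u8 = 255;
    // Wrapping: the result silently comes out wrong.
    assert_eq!(x.wrapping_add(1), 0);
    // Trapping: the overflow is detected and can abort/fail.
    assert_eq!(x.checked_add(1), None);
    // The bigint alternative would instead grow the value to 256.
}
```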
1fish2 commented Dec 6, 2013
(What are bors?) An integer type with platform-specific overflow makes programs … Intertwined issues: whether to have non-portable integer types, what to …
ghost commented Dec 6, 2013
There's also the x32 ABI where pointers are smaller than ints. I'd remove variable-width int altogether (except ffi of course). Those who expect their code to run on 32-bit should already be thinking about overflows and use int64/bigint where appropriate, and those who know they'll only ever run on 64-bit should have no problem either way. Are there credible use cases of pointer-sized Rust ints outside ffi?
Pointer-sized ints are required for representing pointers and indices into arrays.
@nikomatsakis The idea is that int and uint aren't very useful numeric types, and that a fast bigint would be more appropriate most of the time one is actually dealing with numbers. And when one does not want a real numeric type, they probably want one of the fixed-size types anyway.
@cmr I do not agree with the "probably want one of the fixed-size types" part of that sentence. That is not clear to me -- I think it is very common to have integers that are ultimately indices into some sort of array or tied to the size of a data structure, and for that use case it is natural to want an integer that represents "the address space of the machine". Of course I think having a nice, performant bigint library would be great, particularly for implementing "business logic" or other use cases where a "true integer" is required. But I am not sure how common that really is.
nmsmith commented Jan 12, 2014
The moment someone tries to use …
CloudiDust commented Jan 12, 2014
@ecl3ctic Yes, the compiler can and will help here, but I think the principle of least surprise applies. On the other hand, …
On Sat, Jan 11, 2014 at 11:37:02AM -0800, Daniel Micay wrote:
As I wrote earlier, I don't find this argument especially persuasive, …
@nikomatsakis: As far as I know, GMP is leagues ahead of any other big integer implementation in performance. There's no doubt that it's the best open-source implementation. It has many different algorithms implemented for each operation because with very large integers it has progressively better asymptotic performance than other libraries. It also has highly optimized hand-written assembly for different revisions of many platforms too, because it's many times faster than the same code in C without specialized instructions. Intel adds relevant instructions with almost every iteration of their CPU architecture too... Haswell has MULX, Broadwell brings ADOX and ADCX, and there are many relevant SSE/AVX instructions. It's licensed under the LGPL, which gives you 3 choices: …

There are various clones of the library with inferior performance and a less exhaustive API but more permissive licenses. I think Rust should default to using one of these libraries and allow GMP as a drop-in alternative.

This is well-explored territory with …
@CloudiDust: The names come from the C/C++ standard integer types: http://en.cppreference.com/w/cpp/types/integer The …
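For reference, what the C-style names denote is an integer wide enough to round-trip a pointer; a small sketch of that property, using the modern `usize` spelling as a stand-in:

```rust
fn main() {
    let x = 42u32;
    let p: *const u32 = &x;
    // A pointer-sized integer can hold the address...
    let addr = p as usize;
    // ...and converting back recovers the original pointer.
    let q = addr as *const u32;
    unsafe { assert_eq!(*q, 42) };
}
```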
@huonw pointed out https://github.com/wbhart/bsdnt on IRC, which seems like … My thoughts for auto-overflow is to make the type have an align of at least 2, …
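A minimal sketch of that low-bit tagging idea (an illustration of the general technique, not the actual proposal): with big-integer storage aligned to at least 2, the low bit of a word can mark an inline small integer, so the common case never allocates; the heap/bigint side is only hinted at in a comment.

```rust
const TAG: usize = 1;

// Encode a small value inline, shifted left with the low bit set as a tag.
fn encode_small(v: usize) -> Option<usize> {
    if v <= usize::MAX >> 1 { Some((v << 1) | TAG) } else { None }
}

fn is_small(word: usize) -> bool {
    word & TAG == 1
}

fn decode_small(word: usize) -> usize {
    debug_assert!(is_small(word));
    word >> 1
}

fn main() {
    let w = encode_small(21).unwrap();
    assert!(is_small(w));
    assert_eq!(decode_small(w), 21);
    // A word with the low bit clear would instead be read as a 2-aligned
    // pointer to heap-allocated big-integer storage.
}
```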
@cmr: It will incur two branches, since you need to check if you have a big integer and then check for overflow. Checking the overflow flag serializes the CPU pipeline quite a bit too.
If you're limited to 31-bit then it seems that you'll need to use a comparison instruction rather than using the carry/overflow flag. This could be really bad for multiplication.
Simple example:

```rust
extern mod extra;

use std::unstable::intrinsics::{abort, u32_mul_with_overflow};
use extra::test::BenchHarness;

// Baseline: plain multiplication with no overflow check.
#[inline(never)]
fn control(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        *x *= 5;
    }
}

// Overflow check via the intrinsic, aborting on overflow.
#[inline(never)]
fn check(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        unsafe {
            let (y, o) = u32_mul_with_overflow(*x, 5);
            if o {
                abort()
            }
            *x = y;
        }
    }
}

// Overflow check via libstd's checked_mul, failing on overflow.
#[inline(never)]
fn check_libstd(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        *x = x.checked_mul(&5).unwrap();
    }
}

#[bench]
fn bench_control(b: &mut BenchHarness) {
    b.iter(|| {
        let mut xs = [0, ..1000];
        control(xs)
    });
}

#[bench]
fn bench_check(b: &mut BenchHarness) {
    b.iter(|| {
        let mut xs = [0, ..1000];
        check(xs)
    });
}

#[bench]
fn bench_check_libstd(b: &mut BenchHarness) {
    b.iter(|| {
        let mut xs = [0, ..1000];
        check_libstd(xs)
    });
}
```

`--opt-level=2`: …

`--opt-level=3`: …
Ouch. It becomes a larger slowdown multiplier when you add more operations to the loop too. Since it's increasing the code size a lot, it will bloat the instruction cache too.
CloudiDust commented Jan 13, 2014
@thestinger Thanks for the link. I am aware that the names come from the C/C++ standards, but still find them confusing (to Rust newcomers from outside the C/C++ world). Now, come to think of it, this is a convention that can be learnt quickly, and C# actually uses `IntPtr`/`UIntPtr` … But there may be another problem: the names … This is to say, we may have dedicated names for container-indexing integer types, while the fact that they are pointer sized is an implementation detail on certain architectures, just like in C/C++. Here are three pairs of possible candidates: …

Common pros: …

Common cons: …

Pros and cons specific to each candidate: …

On the other hand, … So I lean towards … Regarding an arbitrarily sized … I am not sure about my stance on the "default integer type" issue, but people must make informed choices consciously. Some "rusty guidelines to integer type selection" in the docs would be great.
On Sun, Jan 12, 2014 at 12:55:36PM -0800, Daniel Micay wrote:
I do not understand how checking for overflow and failing can possibly …
I'm not saying performing a branch on the contained value and then a check for overflow is faster than the check for overflow. I'm just suggesting that it's worth making benchmarks to measure the cost of both. |
huonw referenced this issue on Jan 27, 2014: Decide whether to keep pointer sized integers as the default #11831 (closed)
There are a lot of interrelated concerns here: …

I personally think that … The easiest way to make immediate progress (not the best, mind you) might be the following: …

This lets us punt on the topics of bigints, bounds checking, and signed vs. unsigned until a later date.
I don't think there should be a default fallback. It means you can't trust the compiler to infer the type or give an error, and you have to watch out for bugs from this.
Agree with @thestinger and @bstrie. Having defaulting for literals when there are no type constraints is a mixed bag: on one hand it's great sometimes (but mostly when using a REPL), other times it's really unclear/confusing what it can mean. What about a model where literals are treated as being "polymorphic" if there are no constraints? (This may not make sense in Rust, granted.) But in Haskell/GHC, literals have a generic type until used.
Using a fixed-size integer requires carefully considering whether the application enforces bounds on it. Otherwise, you need a big integer instead. A default fallback type removes this thought process in favour of lazy, incorrect code everywhere. Haskell makes the fallback configurable, but the default is a big integer type.
A wider problem in actual Haskell code is users choosing to use Int, and then assuming Int is always 32 or 64 bits :). But yes, defaulting to Integer would be wrong for Rust.
nmsmith commented Feb 14, 2014
I agree that the compiler should not automatically choose an arbitrary, potentially dangerous integer type if it can't infer the type from the context.
thestinger commented Oct 19, 2013
An arbitrarily sized integer type would be provided in `std` under the name `Int`. I think encouraging use of an arbitrarily sized integer when bounds are unknown is a much better solution than adding failure-throwing overflow checks to fixed-size integers.
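A sketch of the kind of usage the proposal has in mind, using the present-day num-bigint crate as a stand-in for the proposed `Int` (an assumption for illustration, not part of the RFC): arithmetic that cannot overflow regardless of platform word size.

```rust
// Requires the external `num-bigint` crate; `BigUint` stands in for `Int`.
use num_bigint::BigUint;

fn factorial(n: u32) -> BigUint {
    (1..=n).fold(BigUint::from(1u32), |acc, i| acc * i)
}

fn main() {
    // 50! overflows every fixed-size integer type, but not a bigint.
    println!("{}", factorial(50));
}
```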