RFC: int/uint portability to 16-bit CPUs #161
cmr
reviewed
Jul 12, 2014
# Background

Rust defines types `int` and `uint` as integers that are wide enough to hold a
cmr
Jul 12, 2014
Member
Is this even true on 16-bit devices, or do modern ones still use a segmentation system? Are there any relevant 16-bit chips anymore?
1fish2
Jul 12, 2014
Author
Some Atmel AVR controllers (http://en.wikipedia.org/wiki/Atmel_AVR) and some PIC controllers (http://en.wikipedia.org/wiki/PIC_microcontroller) have 16-bit address spaces. These tend to have Harvard architectures, that is, separate memories and address spaces for instructions and data.
huonw
reviewed
Jul 12, 2014
See the discussions from many contributors to [Issue #14758](https://github.com/rust-lang/rust/issues/14758) and [Issue #9940](https://github.com/rust-lang/rust/issues/9940).

Also see [Issue #11831](https://github.com/rust-lang/rust/issues/11831) about
huonw
Jul 12, 2014
Member
Note that that issue is closed, and that int and uint are no longer the compiler-driven default, as the fallback has been removed (rust-lang/rust#6023). That is, it's already been decided that people aren't happy with this choice.
1fish2
Jul 12, 2014
Author
I'm unclear on Issue #11831 "Decide whether to keep pointer sized integers as the default", "make a firm decision one way or the other on whether we want to commit to int and uint being pointer sized".
Q. "Default" means for the compiler not for programmers, right?
Q. When the issue was closed, was that a commitment to keeping int and uint as pointer-sized?
If so, it's not my intention to reopen a design decision, just to propose in that case making int/uint be at least 32 bits.
errordeveloper
commented
Jul 12, 2014
Sounds reasonable.
errordeveloper
reviewed
Jul 12, 2014
# Drawbacks

- Renaming `int`/`uint` requires figuring out which of the current uses to replace with `index`/`uindex` vs. `i32`/`u32`/`BigInt`.
errordeveloper
Jul 12, 2014
And some people will just end up redefining int and uint to be 32-bit in their projects...
errordeveloper
commented
Jul 12, 2014
Overall, it's quite a reasonable thing to do, considering Rust's goals. Although maybe the motivation and title could be generalised a bit more...
errordeveloper
reviewed
Jul 12, 2014
# Motivation

So Rust libraries won't have new overflow bugs when run on embedded devices with
errordeveloper
Jul 12, 2014
I'd just replace the entire paragraph with: "Avoid bugs where the programmer presumed a default integer size for indexing of arrays and elsewhere."
errordeveloper
Jul 12, 2014
You can expand a little to just say that this concerns non-32-bit targets: mostly 8-bit and 16-bit MCUs and, to some extent, 64-bit CPUs too.
errordeveloper
reviewed
Jul 12, 2014
So Rust libraries won't have new overflow bugs when run on embedded devices with
16-bit addressing, ditto for code written for 64-bit addressing then run in
32-bit environments. Rust is a very compelling replacement for C/C++ in embedded
devices, "Internet of Things" devices, and safety-critical robotics actuators.
1fish2
Jul 12, 2014
Author
A better phrasing: IoT is a class of embedded devices that's forecast to surpass the PC + tablet + phone units combined within 3 years and sell 9G units 4 years from now, and the biggest downside is threats to data & physical security.
http://www.businessinsider.com/the-internet-of-everything-2014-slide-deck-sai-2014-2?op=1
I don't want them programmed in C/C++.
errordeveloper
Jul 12, 2014
Sure, but for this purpose it's just a spammy statement. We all know about
those "things" and the term IoT doesn't need to be thrown in every piece of
text on the internet. I definitely wouldn't use Rust if it claimed itself
"The language of the IoT".
errordeveloper
commented
Jul 12, 2014
Am I correct in understanding that this is to keep the integral type used for array indexing default to "native", i.e. fastest, integer? (On the AVR the By the way, what about suffixes? Would this imply dropping of
The integral type used for indexing is the smallest one that covers the address space. "fastest"/"native" is irrelevant.
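To make the point above concrete, here is a minimal sketch in present-day Rust, where the pointer-sized unsigned type discussed in this thread as `uint` is spelled `usize`. The helper name `index_bits` is made up for illustration, and the size equality is checked only against a thin data pointer.

```rust
use std::mem::size_of;

/// Hypothetical helper: number of bits in the pointer-sized unsigned
/// integer on the target this program is compiled for.
fn index_bits() -> u32 {
    usize::BITS
}

fn main() {
    // The index type is as wide as a thin data pointer, whatever that is
    // on the current target (16, 32 or 64 bits).
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());

    // Slice lengths and indices use that same type.
    let data = [10u8, 20, 30];
    let last: usize = data.len() - 1;
    println!("{} index bits, last element = {}", index_bits(), data[last]);
}
```

The width is a property of the target's address space rather than of which integer the CPU handles fastest, which is the distinction drawn in the comment.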
errordeveloper
commented
Jul 12, 2014
@huonw thanks for the better formulation, perhaps that's the way the RFC/docs should state it.
Agreed. I'll rephrase that.
Yes, the motivation and title can be generalized. I was trying to start
dobkeratops
commented
Jul 12, 2014
As well as embedded, some machines have had coprocessors with smaller address spaces. Not so common now, but who knows what the future will bring.
My suggestions would have been ...
[1] Officially define
[2] Then add other types which are more specific.. These are complementary to the specific types, i32 etc.. Code might cluster data dynamically to suit its platform.
[3] Vec could be defined more versatile as
Seems like the C name 'long' being distinct from int is actually useful. Maybe even swapping int out as suggested by the OP would be good, but adding another complementary type would be less disruptive, I think.
glaebhoerl
reviewed
Jul 13, 2014
> In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this. ...
>
> Some people, including some textbook authors, recommend using unsigned types to represent numbers that are never negative. This is intended as a form of self-documentation. However, in C, the advantages of such documentation are outweighed by the real bugs it can introduce.
glaebhoerl
Jul 13, 2014
Contributor
This suggestion makes a lot of sense in a context where overflow/underflow silently wraps around. However, if something like RFC PR #146 were to be implemented, then it would once again make sense to use types which more accurately express the range of legal values (i.e., which are self-documenting), because compiler-added checks can be enabled to catch errors where the value would go out of range. Accurate types with compiler-added assertions beats inaccurate types with programmer-added assertions.
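As a rough illustration of the trade-off described above, the sketch below uses the explicit `checked_sub` and `wrapping_sub` methods of present-day Rust as a stand-in for the compiler-added checks proposed in RFC PR #146. The function name `remaining` is hypothetical.

```rust
/// Hypothetical helper: capacity left after `used` units are consumed.
/// Returns `None` instead of wrapping when `used` exceeds `capacity`.
fn remaining(capacity: u32, used: u32) -> Option<u32> {
    capacity.checked_sub(used)
}

fn main() {
    // In range: the unsigned type documents that the result is never negative.
    assert_eq!(remaining(10, 3), Some(7));

    // Out of range: the check catches the underflow.
    assert_eq!(remaining(3, 10), None);

    // Without a check, the same subtraction silently wraps around.
    assert_eq!(3u32.wrapping_sub(10), 4_294_967_289);
}
```

With a check in place, the unsigned type both documents the legal range and reports violations; without one, the wrapped value looks like a plausible large number.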
Thiez
Jul 13, 2014
@glaebhoerl So would you recommend we wait for PR #146 to be accepted or rejected before evaluating this RFC further?
glaebhoerl
Jul 13, 2014
Contributor
Nah. This was just an ancillary remark on an ancillary part of the proposal. The main part of the proposal (which is about changes to the language to better accommodate [portability to] 16-bit architectures) is unaffected.
(And anyway, the suggestion makes sense in the context of the current language, and the style guide could just be updated again if the language changes.)
1fish2
Jul 13, 2014
Author
Aha! Nice insight, @glaebhoerl.
I'll make the style guide recommendation conditional on overflow-checking.
Q. Does/will overflow checking happen during conversion between integer types?
glaebhoerl
Jul 13, 2014
Contributor
A. It doesn't currently, but in the context of #146, if `#[overflow_checks(on)]`, I think it should.
Rationale: As far as I can tell, `as` is meant to preserve meaning rather than representation, e.g. `5000i32 as f32` is equivalent to `5000f32` and not to `transmute::<i32, f32>(5000i32)`. Therefore, if attempting to transport the meaning of the original value to the target type causes it to overflow, it should be caught.
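The distinction between preserving meaning and preserving representation can be sketched in present-day Rust as follows. The function names are invented for illustration, and `f32::from_bits` is used as a safe equivalent of the bit-level `transmute` mentioned above.

```rust
/// Value-preserving conversion: the number 5000 stays the number 5000.
fn convert_value(n: i32) -> f32 {
    n as f32
}

/// Representation-preserving conversion: the 32 bits are reinterpreted
/// as a float, which yields an unrelated value.
fn reinterpret_bits(n: i32) -> f32 {
    f32::from_bits(n as u32)
}

fn main() {
    assert_eq!(convert_value(5000), 5000.0_f32);

    // The bit pattern of the integer 5000 is a tiny subnormal float.
    assert!(reinterpret_bits(5000) < 1.0);

    // When the meaning cannot be carried over, `as` between integer
    // types silently truncates: 300 does not fit in a u8.
    assert_eq!(300i32 as u8, 44);
}
```

The last assertion is the case the comment argues should be caught when overflow checks are on.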
1fish2
Jul 13, 2014
Author
Yes. Otherwise computing a value in one integer type then converting to another would accidentally bypass the overflow checks.
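A small sketch of that bypass, assuming present-day Rust: the arithmetic is done in `u32`, where it fits, and the result is then narrowed to `u16`. The two function names are hypothetical; `u16::try_from` is the standard checked conversion.

```rust
use std::convert::TryFrom;

/// Narrowing with `as`: the sum is fine in u32, but the cast to u16
/// silently drops the high bits.
fn total_len_unchecked(a: u32, b: u32) -> u16 {
    a.wrapping_add(b) as u16
}

/// Checked variant: both the addition and the narrowing are verified.
fn total_len_checked(a: u32, b: u32) -> Option<u16> {
    let sum = a.checked_add(b)?;
    u16::try_from(sum).ok()
}

fn main() {
    // 80_000 does not fit in 16 bits; the unchecked cast yields 80_000 - 65_536.
    assert_eq!(total_len_unchecked(40_000, 40_000), 14_464);
    assert_eq!(total_len_checked(40_000, 40_000), None);

    // Values that fit pass through unchanged.
    assert_eq!(total_len_checked(100, 200), Some(300));
}
```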
errordeveloper
commented
Jul 13, 2014
Also, another point this RFC should consider is how would a typical
1fish2
referenced this pull request
Jul 13, 2014
Closed
RFC: Scoped attributes for checked arithmetic #146
Thiez
commented
Jul 14, 2014
@errordeveloper I doubt that would be a problem because in most cases one would iterate over an indexable collection directly rather than indexing (and paying for bounds checking). Not that I support this RFC...
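The two styles being contrasted might look like this in present-day Rust (a sketch; the function names are made up). Only the first version names an index type at all.

```rust
/// Index-based loop: `i` is a pointer-sized integer and every access
/// `values[i]` is bounds-checked.
fn sum_by_index(values: &[u32]) -> u64 {
    let mut total = 0u64;
    for i in 0..values.len() {
        total += u64::from(values[i]);
    }
    total
}

/// Direct iteration: no index variable, so no index type to choose.
fn sum_by_iter(values: &[u32]) -> u64 {
    values.iter().map(|&v| u64::from(v)).sum()
}

fn main() {
    let values = [1u32, 2, 3];
    assert_eq!(sum_by_index(&values), 6);
    assert_eq!(sum_by_iter(&values), 6);
}
```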
There should be some integer type that corresponds to pointer size. That is why I like
There could be some fancy macro that you give constraints (fastest / smallest, max abs val, signed/unsigned, etc.) and it spits out a type or aborts compilation. This seems more versatile and less namespace-cluttering than C99's solution.
BTW, last I checked Rust let you transmute
I would love some infrastructure everybody could share to do continuous integration with different int sizes. This probably necessitates virtualizing different CPU architectures (because of int--ptr transmutations), but it would be cool if it didn't.
I initially didn't think compiler-added overflow checks were too important. But if that is what it takes to make people use unsigned integers for natural numbers, I am all for it.
Trying to protect against everything that can change per platform/configuration is impossible, e.g.
#[cfg(windows)]
struct Foo { x: u8 }
#[cfg(not(windows))]
struct Foo { x: u16 }
transmute::<Foo, u8>(...)
Impossible? I think not. I'd like to somehow match on a list of archs one attempts to support, lest one forget a case, rather than just config-chaining and hoping for the best. This shouldn't be too hard.
More radically, for the purposes of type checking it would be nice to take an intersection type or something analogous, e.g.:
// can't be transmuted / unique size,
// implements all traits that both u8 and u16 do.
type Magic = u8 ∩ u16
struct Foo { x: Magic }
This is kind of a "mangling of phases", and a rather big step from the way things work currently. The alternative is to brute-force the various configuration options as part of compilation, or just cross-compile and virtualize as I said before.
alexchandel
commented
Jul 28, 2014
Given the purpose of the
l0kod
referenced this pull request
Aug 15, 2014
Closed
RFC: rename `int` and `uint` to `intptr`/`uintptr` #9940
l0kod
commented
Aug 15, 2014
In rust-lang/rust#9940, @thestinger said:
I think renaming
Claiming that purpose 1 is the only purpose for these types is wrong, and yet that's the motivation for renaming to The only real issue with Importantly, renaming these to Basically, renaming these types does not really do anything at all for overflow, it just encourages people to add more unchecked integral casts to their code. Because of this, the only approach I can support is keeping
l0kod
commented
Aug 15, 2014
For a static typing language, the For this reason, I don't think it's a good idea to promote the
Obviously, the architecture-related integer is needed for memory-related access (i.e. indexing and sizing of containers). Is there a good reason for hiding the initial goal and bug-prone (e.g. cast) property of a type?
That's a possibility, but if they are aware of the architecture-related property they have more reason to make the right choice: to choose the right type everywhere. If that makes sense, the "at least 32-bit" exception is not needed. Moreover, it would introduce another weird rule to this already weird type.
This isn't really the case, it's just using any other types is annoying and historically unfavoured (since we had default-to-
Alternatively, declare each array's index type rather than using an architecture-dependent type that spans the address space.
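One way to read this suggestion, sketched in present-day Rust with the standard `Index` trait: a wrapper type declares its own index type instead of accepting a pointer-sized integer. The type `Table256` is hypothetical and is not something proposed in the thread.

```rust
use std::ops::Index;

/// Hypothetical 256-entry table whose declared index type is `u8`,
/// so every possible index value is in bounds on any target.
struct Table256([i16; 256]);

impl Index<u8> for Table256 {
    type Output = i16;

    fn index(&self, i: u8) -> &i16 {
        // Widening u8 to the pointer-sized type is lossless on every target.
        &self.0[usize::from(i)]
    }
}

fn main() {
    let mut raw = [0i16; 256];
    raw[7] = 42;
    raw[255] = -1;
    let table = Table256(raw);

    assert_eq!(table[7u8], 42);
    assert_eq!(table[255u8], -1);
}
```

The index width is then part of the data structure's interface and does not change with the address space of the target.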
Good plan. Would you like me to withdraw this PR and submit a new PR to rename And to be sure I have it precisely right, "fallback" means both the type inference default for integer literals and the recommended programmers' go-to type?
thestinger
commented
Nov 7, 2014
@1fish2: Yeah, I think a new RFC with that scope would have a high chance of success.
Yeah, the type inference default (which was accepted again with https://github.com/rust-lang/rfcs/blob/master/text/0212-restore-int-fallback.md), which is essentially the type that the language is recommending as a good default.
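For readers unfamiliar with the term, the sketch below shows what a type inference fallback for integer literals looks like in present-day Rust, where an unconstrained literal falls back to `i32`. That may differ from the fallback under discussion at the time of this thread. The function names are invented for illustration.

```rust
use std::mem::{size_of, size_of_val};

/// An unsuffixed literal with no other constraint gets the fallback type.
fn fallback_size() -> usize {
    let unconstrained = 1_000_000;
    size_of_val(&unconstrained)
}

/// A literal used as a slice or array index is inferred from that use
/// instead, so the fallback never applies to it.
fn index_var_size() -> usize {
    let v = [10, 20, 30];
    let i = 1;
    let _picked = v[i];
    size_of_val(&i)
}

fn main() {
    // Present-day fallback: i32, which is 4 bytes on every target.
    assert_eq!(fallback_size(), 4);

    // Index variables are pointer-sized.
    assert_eq!(index_var_size(), size_of::<usize>());
}
```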
l0kod
commented
Nov 7, 2014
For bikeshed discussion about new
thestinger
commented
Nov 7, 2014
Calling them
Thiez
commented
Nov 7, 2014
Perhaps a bit offtopic, but suppose we decide to stop using
netvl
commented
Nov 7, 2014
@Thiez, isn't it there to represent a difference between pointers? You can't have it without a sign.
Thiez
commented
Nov 7, 2014
Sure you can. Suppose we have a machine with 256 bytes of memory, so
I'm currently working on a draft on changing the default fallback type to @Thiez This is indeed offtopic, I don't think it's helping the RFC.
OK. I'll do that in a couple days and let you review it before sending the PR.
errordeveloper
commented
Nov 7, 2014
Have we thought of just adding a lint warning when the type in question is
@errordeveloper If it's used for indexing it's already automatically inferred to be a
@thestinger if indexing is done with uint, is there any problem with 32-bit processes on 64-bit machines? I do agree we should call them something along the lines of
thestinger
commented
Nov 7, 2014
@Ericson2314: There's no problem in terms of
thestinger
commented
Nov 7, 2014
@Thiez: Pointer arithmetic is inherently signed because it can go in both directions, not unsigned. It is not well-defined to overflow normal (fast) pointer arithmetic.
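The signedness of pointer differences can be seen in present-day Rust, where `offset_from` on raw pointers returns the signed pointer-sized type `isize`. This is only a sketch; the helper `element_distance` is a made-up name.

```rust
/// Hypothetical helper: signed distance, in elements, from position
/// `from` to position `to` within one slice.
fn element_distance(slice: &[i32], from: usize, to: usize) -> isize {
    assert!(from <= slice.len() && to <= slice.len());
    let base = slice.as_ptr();
    // SAFETY: both offsets were checked to lie within the slice (or one
    // past its end), so both pointers belong to the same allocation.
    unsafe { base.add(to).offset_from(base.add(from)) }
}

fn main() {
    let data = [10, 20, 30, 40];

    // Forward and backward distances need a signed result type.
    assert_eq!(element_distance(&data, 1, 3), 2);
    assert_eq!(element_distance(&data, 3, 1), -2);
}
```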
@thestinger so negative ptr offsets are an essential thing to support?
thestinger
commented
Nov 7, 2014
@Ericson2314: Yes, being able to calculate pointer differences and do negative offsets is an essential feature. Ensuring correctness requires limiting the maximum object size to
@thestinger OK, I'm sold. Especially given the performance aspect.
The new, simpler draft RFC to replace the present one is at 0000-int-name.md. Comments?
thestinger
commented
Nov 13, 2014
@1fish2: It looks great to me.
l0kod
commented
Nov 13, 2014
@1fish2: great! I would also add the argument that the renaming process would be the good and probably only time to spot future bugs before they appear. There is also the question about integer suffixes A good example of using
Excellent points, Mickaël. I just sent the PR. Do you want to add these points there? We'll continue the discussion there.
I propose to withdraw this RFC in favor of the single-purpose RFC: Renaming int/uint (PR #464).
errordeveloper
commented
Nov 13, 2014
Makes sense.
@1fish2 you have the power to close it :)
1fish2 commented Jul 12, 2014
Both Issue #14758 and Issue #9940 call for RFCs.
This RFC summarizes those discussions, explains the core issue of
code portability to 16-bit CPUs (also of 64-bit code to 32-bit CPUs),
explains what's meant by "default" integer types, makes 2 specific
proposals, and proposes coding style for integer sizing.