From f5e3e6c7935387467a38a51b25fcde8a3f7ef351 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Sun, 28 Dec 2014 22:22:48 +0800 Subject: [PATCH 01/20] RFC: Rename `int/uint` to `intx/uintx` The initial commit for yet another `int/uint` renaming RFC. --- text/0000-int-to-intx.md | 78 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 78 insertions(+) create mode 100644 text/0000-int-to-intx.md diff --git a/text/0000-int-to-intx.md b/text/0000-int-to-intx.md new file mode 100644 index 00000000000..010c3d363ce --- /dev/null +++ b/text/0000-int-to-intx.md @@ -0,0 +1,78 @@ +- Start Date: 2014-12-28 +- RFC PR #: (leave this empty) +- Rust Issue #: (leave this empty) + +# Summary + +Rename the pointer-sized integer types `int/uint` to `intx/uintx`, and use new literal suffixes `ix/ux`, so as to avoid misconceptions and misuses. + +# Motivation + +Currently, Rust defines two [machine-dependent integer types](http://doc.rust-lang.org/reference.html#machine-dependent-integer-types) `int/uint` that have the same number of bits as the target platform's pointer type. These two types are used for many purposes: indices, counts, sizes, offsets, etc. + +The problem is, `int/uint` *look* like default integer types, but pointer-sized integers are not good defaults, and it is desirable to discourage people from overusing them. + +And it is a quite popular opinion that, the best way to discourage their use is to rename them. + +Previously, the latest renaming attempt [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464) was rejected. (Some parts of this RFC is based on that RFC.) [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062/17) states the following reasons: + +- Changing the names would affect literally every Rust program ever written. +- Adjusting the guidelines and tutorial can be equally effective in helping people to select the correct type. +- All the suggested alternative names have serious drawbacks. 
+ +However: + +Rust was and is currently undergoing quite a lot breaking changes. Even though the `int/uint` renaming will "break the world", it is not unheard of, and it is mainly a "search & replace". Also, a transition period can be provided, during which `int/uint` can be deprecated, while the new names can take time to replace them. So "to avoid breaking the world" shouldn't stop the renaming. + +`int/uint` have a long tradition of being the default integer type names, so programmers *will* be tempted to use them in Rust, even the experienced ones, no matter what the documentation says. The semantics of `int/uint` in Rust is quite different from those in many other mainstream languages. Worse, the Swift programming language, which is heavily influenced by Rust, has the types `Int/UInt` with *almost* the *same semantics* as Rust's `int/uint`, but it *actively encourages* programmers to use `Int` as much as possible. From [the Swift Programming Language](https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/TheBasics.html#//apple_ref/doc/uid/TP40014097-CH5-ID309): + +> Swift provides an additional integer type, Int, which has the same size as the current platform’s native word size: ... + +> Swift also provides an unsigned integer type, UInt, which has the same size as the current platform’s native word size: ... + +> Unless you need to work with a specific size of integer, always use Int for integer values in your code. This aids code consistency and interoperability. + +> Use UInt only when you specifically need an unsigned integer type with the same size as the platform’s native word size. If this is not the case, Int is preferred, even when the values to be stored are known to be non-negative. + +Thus, it is very likely that newcomers will come to Rust, expecting `int/uint` to be the preferred integer types, *even if they know that they are pointer-sized*. 
+ +Not renaming `int/uint` violates the principle of least surprise, and is not newcomer friendly. + +As stated in previous discussions, all suggested alternative names have some drawbacks that may be unbearable. (Please refer to [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062/17) and related discussions for details.) + +Therefore this RFC proposes a new pair of alternatives: `intx`/`uintx`, where the `x` suffix means "unknown size"/"variable size", or "platform-dependent size". + +The pros: + +- The names are foreign to programmers from other languages, so they are less likely to make incorrect assumptions, or use them out of habit. +- But not too foreign, they still look like integer type names. (Some believe that `imem/umem` fail here.) +- They do not favour one of the types' use cases over the others in the names. (Alternatives `iptr/uptr`, `idiff/usize` and others fail here.) +- They follow the same *signed-ness + size* naming pattern used by other integer types like `i32/u32`. +- They somewhat look like `index/uindex`. This may or may not be an advantage. + +# Detailed Design + +Rename these two pointer-sized integer types, `int` to `intx`, and `uint` to `uintx`. + +Use `ix` and `ux` as the literal suffix for `intx` and `uintx`, respectively. + +Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. + +# Drawbacks + +- Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs. +- The new names are longer (but not much longer). + +# Alternatives + +- Keep the status quo. + +Which may hurt in the long run, especially when there is at least one (would-be?) high-profile language (which is Rust-inspired) taking the opposite stance of Rust. + +- Use `ix/ux` as the new type names, not just literal suffixes. 
+ +While `ix/ux` more closely follow the `i32/u32` pattern, they may be too short (and tempting) and may not look like integer types for some. + +# Unresolved questions + +None. From 7537b67d59cffa3f0b2ea0c1970d4709688e78a3 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Sun, 28 Dec 2014 22:27:41 +0800 Subject: [PATCH 02/20] Grammar correction. --- text/0000-int-to-intx.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-int-to-intx.md b/text/0000-int-to-intx.md index 010c3d363ce..6268f6abfae 100644 --- a/text/0000-int-to-intx.md +++ b/text/0000-int-to-intx.md @@ -22,7 +22,7 @@ Previously, the latest renaming attempt [RFC PR 464](https://github.com/rust-lan However: -Rust was and is currently undergoing quite a lot breaking changes. Even though the `int/uint` renaming will "break the world", it is not unheard of, and it is mainly a "search & replace". Also, a transition period can be provided, during which `int/uint` can be deprecated, while the new names can take time to replace them. So "to avoid breaking the world" shouldn't stop the renaming. +Rust was and is undergoing quite a lot of breaking changes. Even though the `int/uint` renaming will "break the world", it is not unheard of, and it is mainly a "search & replace". Also, a transition period can be provided, during which `int/uint` can be deprecated, while the new names can take time to replace them. So "to avoid breaking the world" shouldn't stop the renaming. `int/uint` have a long tradition of being the default integer type names, so programmers *will* be tempted to use them in Rust, even the experienced ones, no matter what the documentation says. The semantics of `int/uint` in Rust is quite different from those in many other mainstream languages. 
Worse, the Swift programming language, which is heavily influenced by Rust, has the types `Int/UInt` with *almost* the *same semantics* as Rust's `int/uint`, but it *actively encourages* programmers to use `Int` as much as possible. From [the Swift Programming Language](https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/TheBasics.html#//apple_ref/doc/uid/TP40014097-CH5-ID309): From c65000470d9f66cd3b5315f06d44d5c9243045de Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Sun, 28 Dec 2014 22:39:04 +0800 Subject: [PATCH 03/20] Formatting change. --- text/0000-int-to-intx.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-int-to-intx.md b/text/0000-int-to-intx.md index 6268f6abfae..7f462279e1b 100644 --- a/text/0000-int-to-intx.md +++ b/text/0000-int-to-intx.md @@ -65,11 +65,11 @@ Update code and documentation to use pointer-sized integers more narrowly for th # Alternatives -- Keep the status quo. +**A. Keep the status quo.** Which may hurt in the long run, especially when there is at least one (would-be?) high-profile language (which is Rust-inspired) taking the opposite stance of Rust. -- Use `ix/ux` as the new type names, not just literal suffixes. +**B. Use `ix/ux` as the new type names, not just literal suffixes.** While `ix/ux` more closely follow the `i32/u32` pattern, they may be too short (and tempting) and may not look like integer types for some. From d14c66b8c56b1c3883709958e56060c106f3df47 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Sun, 28 Dec 2014 22:53:02 +0800 Subject: [PATCH 04/20] Improved the link to the Swift doc. --- text/0000-int-to-intx.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-int-to-intx.md b/text/0000-int-to-intx.md index 7f462279e1b..16dc6415b3f 100644 --- a/text/0000-int-to-intx.md +++ b/text/0000-int-to-intx.md @@ -24,7 +24,7 @@ However: Rust was and is undergoing quite a lot of breaking changes. 
Even though the `int/uint` renaming will "break the world", it is not unheard of, and it is mainly a "search & replace". Also, a transition period can be provided, during which `int/uint` can be deprecated, while the new names can take time to replace them. So "to avoid breaking the world" shouldn't stop the renaming. -`int/uint` have a long tradition of being the default integer type names, so programmers *will* be tempted to use them in Rust, even the experienced ones, no matter what the documentation says. The semantics of `int/uint` in Rust is quite different from those in many other mainstream languages. Worse, the Swift programming language, which is heavily influenced by Rust, has the types `Int/UInt` with *almost* the *same semantics* as Rust's `int/uint`, but it *actively encourages* programmers to use `Int` as much as possible. From [the Swift Programming Language](https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/TheBasics.html#//apple_ref/doc/uid/TP40014097-CH5-ID309): +`int/uint` have a long tradition of being the default integer type names, so programmers *will* be tempted to use them in Rust, even the experienced ones, no matter what the documentation says. The semantics of `int/uint` in Rust is quite different from those in many other mainstream languages. Worse, the Swift programming language, which is heavily influenced by Rust, has the types `Int/UInt` with *almost* the *same semantics* as Rust's `int/uint`, but it *actively encourages* programmers to use `Int` as much as possible. From [the Swift Programming Language](https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/TheBasics.html#//apple_ref/doc/uid/TP40014097-CH5-ID319): > Swift provides an additional integer type, Int, which has the same size as the current platform’s native word size: ... 
From c07f2167c70bc1c057a491aa40eb14a28086309d Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Sun, 28 Dec 2014 23:02:27 +0800 Subject: [PATCH 05/20] Fixed the link to the discuss post. --- text/0000-int-to-intx.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-int-to-intx.md b/text/0000-int-to-intx.md index 16dc6415b3f..be86d13e37a 100644 --- a/text/0000-int-to-intx.md +++ b/text/0000-int-to-intx.md @@ -14,7 +14,7 @@ The problem is, `int/uint` *look* like default integer types, but pointer-sized And it is a quite popular opinion that, the best way to discourage their use is to rename them. -Previously, the latest renaming attempt [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464) was rejected. (Some parts of this RFC is based on that RFC.) [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062/17) states the following reasons: +Previously, the latest renaming attempt [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464) was rejected. (Some parts of this RFC is based on that RFC.) [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062) states the following reasons: - Changing the names would affect literally every Rust program ever written. - Adjusting the guidelines and tutorial can be equally effective in helping people to select the correct type. @@ -38,7 +38,7 @@ Thus, it is very likely that newcomers will come to Rust, expecting `int/uint` t Not renaming `int/uint` violates the principle of least surprise, and is not newcomer friendly. -As stated in previous discussions, all suggested alternative names have some drawbacks that may be unbearable. (Please refer to [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062/17) and related discussions for details.) +As stated in previous discussions, all suggested alternative names have some drawbacks that may be unbearable. 
(Please refer to [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062) and related discussions for details.) Therefore this RFC proposes a new pair of alternatives: `intx`/`uintx`, where the `x` suffix means "unknown size"/"variable size", or "platform-dependent size". From e7b1e26a65108e492ac5b87caa6cdc93c701f6f9 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Mon, 29 Dec 2014 23:22:11 +0800 Subject: [PATCH 06/20] Updated drawbacks and alternatives. --- text/0000-int-to-intx.md | 29 +++++++++++++++++++++++++++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/text/0000-int-to-intx.md b/text/0000-int-to-intx.md index be86d13e37a..3878d0aa78b 100644 --- a/text/0000-int-to-intx.md +++ b/text/0000-int-to-intx.md @@ -24,7 +24,7 @@ However: Rust was and is undergoing quite a lot of breaking changes. Even though the `int/uint` renaming will "break the world", it is not unheard of, and it is mainly a "search & replace". Also, a transition period can be provided, during which `int/uint` can be deprecated, while the new names can take time to replace them. So "to avoid breaking the world" shouldn't stop the renaming. -`int/uint` have a long tradition of being the default integer type names, so programmers *will* be tempted to use them in Rust, even the experienced ones, no matter what the documentation says. The semantics of `int/uint` in Rust is quite different from those in many other mainstream languages. Worse, the Swift programming language, which is heavily influenced by Rust, has the types `Int/UInt` with *almost* the *same semantics* as Rust's `int/uint`, but it *actively encourages* programmers to use `Int` as much as possible. 
From [the Swift Programming Language](https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/TheBasics.html#//apple_ref/doc/uid/TP40014097-CH5-ID319): +`int/uint` have a long tradition of being the default integer type names, so programmers *will* be tempted to use them in Rust, even the experienced ones, no matter what the documentation says. The semantics of `int/uint` in Rust is quite different from that in many other mainstream languages. Worse, the Swift programming language, which is heavily influenced by Rust, has the types `Int/UInt` with *almost* the *same semantics* as Rust's `int/uint`, but it *actively encourages* programmers to use `Int` as much as possible. From [the Swift Programming Language](https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/TheBasics.html#//apple_ref/doc/uid/TP40014097-CH5-ID319): > Swift provides an additional integer type, Int, which has the same size as the current platform’s native word size: ... @@ -40,7 +40,7 @@ Not renaming `int/uint` violates the principle of least surprise, and is not new As stated in previous discussions, all suggested alternative names have some drawbacks that may be unbearable. (Please refer to [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062) and related discussions for details.) -Therefore this RFC proposes a new pair of alternatives: `intx`/`uintx`, where the `x` suffix means "unknown size"/"variable size", or "platform-dependent size". +Therefore this RFC proposes a new pair of alternatives: `intx/uintx`, where the `x` suffix means "unknown size"/"variable size", or "platform-dependent size". The pros: @@ -62,6 +62,7 @@ Update code and documentation to use pointer-sized integers more narrowly for th - Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs. 
- The new names are longer (but not much longer). +- The `x` suffix may be too generic and doesn't carry enough meaning. In particular, it signifies the fact that the size is "unknown"/"variable" "in some way", but what is this "some way" after all? # Alternatives @@ -73,6 +74,30 @@ Which may hurt in the long run, especially when there is at least one (would-be? While `ix/ux` more closely follow the `i32/u32` pattern, they may be too short (and tempting) and may not look like integer types for some. +**C. Use `intx/uintx` as the new literal suffixes, not just type names.** + +For some, `42intx/42uintx` are too long and don't look pretty, but then again others may find this desirable. + +**D. Use `intp/uintp` and/or `ip/up` instead.** + +Here `p` means "pointer (sized)" or "platform (dependent)", thus making the semantics of `intp/uintp` clearer than that of `intx/uintx`. + +The drawback here is that some people may incorrectly assume that `intp/uintp` *only* have the same use case as C/C++'s `intptr_t/uintptr_t`, which are *only* for storing casted pointer values. + +Also, as literal suffixes or type names, `ip/up` may be more confusing than `ix/ux`, as `ip/up` have meanings that aren't related to integers. + +**E. Use `imem/umem` and/or `im/um` instead.** + +While `imem/umem` was rejected previously, it is still controversial whether they are truly "ugly" or "not integer-like". Also, they may have some advantages over `intx/uintx`: + +- They actually more closely follow the `i32/u32` pattern: `i/u` + **mem**ory pointer-sized. +- So they also better describe what size they have, instead of just stating "unknown"/"variable", but the unfortunate implications of `intp/uintp` are avoided. +- If one prefers `imem/umem` as type names, then they also make better suffixes than `intx`/`uintx` because `umem` is shorter than `uintx` and `imem/umem` are of the same length. + +`im/um` may also be more (or less) confusing than `ix/ux`. 
+ +A related pair of variants `intm/uintm` may also be worth considering. + # Unresolved questions None. From 8e3e94342f6111ad0eee9dc64c1d75b815a92e8d Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Tue, 30 Dec 2014 22:44:04 +0800 Subject: [PATCH 07/20] Major revision. --- text/0000-int-to-intx.md | 73 ++++++++++++++++++++++------------------ 1 file changed, 41 insertions(+), 32 deletions(-) diff --git a/text/0000-int-to-intx.md b/text/0000-int-to-intx.md index 3878d0aa78b..46c6e71721e 100644 --- a/text/0000-int-to-intx.md +++ b/text/0000-int-to-intx.md @@ -4,7 +4,7 @@ # Summary -Rename the pointer-sized integer types `int/uint` to `intx/uintx`, and use new literal suffixes `ix/ux`, so as to avoid misconceptions and misuses. +This RFC proposes that we rename the pointer-sized integer types `int/uint` to `intp/uintp`, `intm/uintm` or `imem/umem`, so as to avoid misconceptions and misuses. # Motivation @@ -40,63 +40,72 @@ Not renaming `int/uint` violates the principle of least surprise, and is not new As stated in previous discussions, all suggested alternative names have some drawbacks that may be unbearable. (Please refer to [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062) and related discussions for details.) -Therefore this RFC proposes a new pair of alternatives: `intx/uintx`, where the `x` suffix means "unknown size"/"variable size", or "platform-dependent size". +Before the rejection, the community largely settled on two pairs of candidates: `imem/umem` and `iptr/uptr`. -The pros: +`iptr/uptr` were rejected because they may remind the programmers of C/C++ `intptr_t`/`uintptr_t`, which were typically *only* used for storing casted pointer values. However, favouring any one of the types' use cases in the names is undesirable. -- The names are foreign to programmers from other languages, so they are less likely to make incorrect assumptions, or use them out of habit. 
-- But not too foreign, they still look like integer type names. (Some believe that `imem/umem` fail here.) -- They do not favour one of the types' use cases over the others in the names. (Alternatives `iptr/uptr`, `idiff/usize` and others fail here.) -- They follow the same *signed-ness + size* naming pattern used by other integer types like `i32/u32`. -- They somewhat look like `index/uindex`. This may or may not be an advantage. +`imem/umem` were rejected because they may be "not integer-like" and reminded people of "int memory"/"unsigned int memory" which made no sense. + +But the problems can be dealt with, in one of the following ways: + +1. instead of stressing the `ptr`/`mem` part, stress the `i`/`u` part, +2. or find and canonicalize a better interpretation for `imem/umem`. # Detailed Design -Rename these two pointer-sized integer types, `int` to `intx`, and `uint` to `uintx`. +## Approach 1. + +Rename `int/uint` to `intp/uintp` or `intm/uintm`. -Use `ix` and `ux` as the literal suffix for `intx` and `uintx`, respectively. +Introduce the respective literal suffixes `ip/up` or `im/um`. Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. -# Drawbacks +## Approach 2. -- Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs. -- The new names are longer (but not much longer). -- The `x` suffix may be too generic and doesn't carry enough meaning. In particular, it signifies the fact that the size is "unknown"/"variable" "in some way", but what is this "some way" after all? +Rename `int/uint` to `imem/umem`, and use `imem/umem` as the new literal suffixes. -# Alternatives +Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. -**A. 
Keep the status quo.** +How to deal with the problems of `imem/umem`? -Which may hurt in the long run, especially when there is at least one (would-be?) high-profile language (which is Rust-inspired) taking the opposite stance of Rust. +Interpret them in the documentation as: **signed/unsigned _mem_ory-pointer-sized integers.** -**B. Use `ix/ux` as the new type names, not just literal suffixes.** +With this interpretation in mind, the RFC author (@CloudiDust) considers `imem`/`umem` to once again be his favourite among the three pairs of candidates. -While `ix/ux` more closely follow the `i32/u32` pattern, they may be too short (and tempting) and may not look like integer types for some. +## Advantages -**C. Use `intx/uintx` as the new literal suffixes, not just type names.** +The advantages of both approaches over `int`/`uint`: -For some, `42intx/42uintx` are too long and don't look pretty, but then again others may find this desirable. +- The names are foreign to programmers from other languages, so they are less likely to make incorrect assumptions, or use them out of habit. +- But not too foreign, they still look like integers. (Depending on personal taste, Approach 1 is either better or worse than Approach 2 on this one.) +- They follow the same *signed-ness + size* naming pattern used by other integer types like `i32/u32`. (Approach 2 is arguably better here.) + +# Drawbacks + +Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs. -**D. Use `intp/uintp` and/or `ip/up` instead.** +`intp/uintp` are supported by several community members, but they still may remind people of `intptr_t`/`uintptr_t`, although arguably to a lesser extent, as `p` can be interpreted as `platform-dependent` here. -Here `p` means "pointer (sized)" or "platform (dependent)", thus making the semantics of `intp/uintp` clearer than that of `intx/uintx`. 
+With different trade-offs considered, `intm/uintm` may or may not be a better name than `imem/umem` depending on personal tastes. -The drawback here is that some people may incorrectly assume that `intp/uintp` *only* have the same use case as C/C++'s `intptr_t/uintptr_t`, which are *only* for storing casted pointer values. +# Alternatives + +## A. Keep the status quo. + +Which may hurt in the long run, especially when there is at least one (would-be?) high-profile language (which is Rust-inspired) taking the opposite stance of Rust. -Also, as literal suffixes or type names, `ip/up` may be more confusing than `ix/ux`, as `ip/up` have meanings that aren't related to integers. +## B. Approach 1a: Use `intx/uintx` as the new type names, and `ix/ux` as the new suffixes. -**E. Use `imem/umem` and/or `im/um` instead.** +`intx/uintx` were actually the names that got promoted in the previous revisions of this RFC, where `x` means "unknown", "variable" or "platform-dependent". However the `x` suffix was too vague as there were other integer types that have platform-dependent sizes, like register-sized ones, so `intx/uintx` lost their "promoted" status in this revision. -While `imem/umem` was rejected previously, it is still controversial whether they are truly "ugly" or "not integer-like". Also, they may have some advantages over `intx/uintx`: +## C. Approach 1b: Use the proposed literal suffixes as the new type names, not just literal suffixes. -- They actually more closely follow the `i32/u32` pattern: `i/u` + **mem**ory pointer-sized. -- So they also better describe what size they have, instead of just stating "unknown"/"variable", but the unfortunate implications of `intp/uintp` are avoided. -- If one prefers `imem/umem` as type names, then they also make better suffixes than `intx`/`uintx` because `umem` is shorter than `uintx` and `imem/umem` are of the same length. 
+While that will make type names more closely follow the `i32/u32` pattern, they may be too short (and tempting) and may have unrelated meanings on their own (`ip`/`up`/`um` etc.) -`im/um` may also be more (or less) confusing than `ix/ux`. +## D. Approach 1c: Use the proposed type names as literal suffixes, not just type names. -A related pair of variants `intm/uintm` may also be worth considering. +`uintp`/`uintm`/`uintx` may or may not be too long as literal suffixes. # Unresolved questions From 6746c05459836a463d64c3df3208f1612cb7b47d Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Tue, 30 Dec 2014 22:49:58 +0800 Subject: [PATCH 08/20] Renamed the RFC file. --- text/{0000-int-to-intx.md => 0000-rename-int-uint.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-int-to-intx.md => 0000-rename-int-uint.md} (100%) diff --git a/text/0000-int-to-intx.md b/text/0000-rename-int-uint.md similarity index 100% rename from text/0000-int-to-intx.md rename to text/0000-rename-int-uint.md From 70ffaa47d0eadfe62db79a90d6d6a19228c364f8 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Thu, 1 Jan 2015 14:30:54 +0800 Subject: [PATCH 09/20] Another major revision. Major revision. Added `iptr/uptr` back to the candidates based on recent community discussions, rewrote much text, and suggested that we should wait until the general policy about integers was nailed down. --- text/0000-rename-int-uint.md | 73 +++++++++++++++++------------------- 1 file changed, 35 insertions(+), 38 deletions(-) diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md index 46c6e71721e..727a2ae1dae 100644 --- a/text/0000-rename-int-uint.md +++ b/text/0000-rename-int-uint.md @@ -4,7 +4,7 @@ # Summary -This RFC proposes that we rename the pointer-sized integer types `int/uint` to `intp/uintp`, `intm/uintm` or `imem/umem`, so as to avoid misconceptions and misuses. 
+This RFC proposes that we rename the pointer-sized integer types `int/uint` to stress the fact that they are pointer-sized, so as to avoid misconceptions and misuses. # Motivation @@ -38,56 +38,61 @@ Thus, it is very likely that newcomers will come to Rust, expecting `int/uint` t Not renaming `int/uint` violates the principle of least surprise, and is not newcomer friendly. -As stated in previous discussions, all suggested alternative names have some drawbacks that may be unbearable. (Please refer to [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062) and related discussions for details.) +Before the rejection of [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464), the community largely settled on two pairs of candidates: `imem/umem` and `iptr/uptr`. As stated in previous discussions, the names have some drawbacks that may be unbearable. (Please refer to [A tale of two's complement](http://discuss.rust-lang.org/t/a-tale-of-twos-complement/1062) and related discussions for details.) -Before the rejection, the community largely settled on two pairs of candidates: `imem/umem` and `iptr/uptr`. +This RFC originally proposed a new pair of alternatives `intx/uintx`. -`iptr/uptr` were rejected because they may remind the programmers of C/C++ `intptr_t`/`uintptr_t`, which were typically *only* used for storing casted pointer values. However, favouring any one of the types' use cases in the names is undesirable. +However, given the discussions about the previous revisions of this RFC, and the discussions in [Restarting the `int/uint` Discussion]( http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), this RFC author (@CloudiDust) now believes that `iptr/uptr` and `imem/umem` can still be viable candidates. -`imem/umem` were rejected because they may be "not integer-like" and reminded people of "int memory"/"unsigned int memory" which made no sense. 
+This RFC also proposes two more pairs of candidates: `intp/uintp` and `intm/uintm`, which are actually variants of the above, but stress the `int` part instead of the `ptr`/`mem` part, which may make them subjectively better (or worse), as some would think `iptr/uptr` and `imem/umem` stress the wrong part of the name, and don't look like integer types. -But the problems can be dealt with, in one of the following ways: +# Detailed Design -1. instead of stressing the `ptr`/`mem` part, stress the `i`/`u` part, -2. or find and canonicalize a better interpretation for `imem/umem`. +Rename `int/uint` to one of the above pairs of candidates. -# Detailed Design +Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. -## Approach 1. +Also, a decision should be made about whether to introduce the respective literal suffixes `ip/up` or `im/um`, or use the type names as literal suffixes. -Rename `int/uint` to `intp/uintp` or `intm/uintm`. +This author believes that, if `iptr/uptr` or `imem/umem` are chosen, they should be used directly as suffixes, but if `intp/uintp` or `intm/uintm` are chosen, then `ip/up` or `im/um` should be used, as `uintp`/`uintm` may be too long as suffixes. `ip/up` and `im/um` don't make good *type names* though, as they are too short and have meanings unrelated to integers. -Introduce the respective literal suffixes `ip/up` or `im/um`. +## Advantages -Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. +Each pair of candidates makes different trade-offs, and choosing one would be a quite subjective matter. But they are all better than `int/uint`. -## Approach 2. +### The advantages of all candidates over `int/uint`: -Rename `int/uint` to `imem/umem`, and use `imem/umem` as the new literal suffixes. 
+- The names are foreign to programmers from other languages, so they are less likely to make incorrect assumptions, or use them out of habit. +- But not too foreign, they still look like integers. (`intp/uintp` and `intm/uintm` may be a bit better here.) +- They follow the same *signed-ness + size* naming pattern used by other integer types like `i32/u32`. (`iptr/uptr` and `imem/umem` are better here as they follow the pattern more faithfully. Please see the following discussion for why `imem/umem` also have the *size* part.) -Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. +### The advantages and disadvantages of suffixes `ptr/p` and `mem/m`: -How to deal with the problems of `imem/umem`? +#### `iptr/uptr` and `intp/uintp`: -Interpret them in the documentation as: **signed/unsigned _mem_ory-pointer-sized integers.** +- Pros: "Pointer-sized integer", exactly what they are. +- Cons: C/C++ have `intptr_t/uintptr_t`, which are typically *only* used for storing casted pointer values. We don't want people to confuse the Rust types with the C/C++ ones, as the Rust ones have more typical use cases. -With this interpretation in mind, the RFC author (@CloudiDust) considers `imem`/`umem` to once again be his favourite among the three pairs of candidates. +However, note that Rust (barring FFI) don't have `size_t`/`ptrdiff_t` etc like C/C++ do, so the disadvantage may not actually matter much. -## Advantages +Also, there are talks about parametrizing data structures over their indexing/size types, if this lands, then in a sense the pointer-sized integers would no longer be the "privileged types for sizes/indexes", and `iptr/uptr` or `intp/uintp` would be quite fine names. 
-The advantages of both approaches over `int`/`uint`: +#### `imem/umem` and `intm/uintm`: -- The names are foreign to programmers from other languages, so they are less likely to make incorrect assumptions, or use them out of habit. -- But not too foreign, they still look like integers. (Depending on personal taste, Approach 1 is either better or worse than Approach 2 on this one.) -- They follow the same *signed-ness + size* naming pattern used by other integer types like `i32/u32`. (Approach 2 is arguably better here.) +When originally proposed, `mem`/`m` are interpreted as "for memory related things like offsets, indices, sizes, pointer values". However this interpretation seems vague and not quite convincing. But actually, they can be interpreted as **_mem_ory-pointer-sized**, and be a *precise* size specifier just like `ptr`. -# Drawbacks +- Pros: Types with similar names do not exist in mainstream languages, so people will not make incorrect assumptions. +- Cons: `mem` -> `memory-integer-sized` is not obvious. -Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs. +However, people will be tempted to read the documentation anyway when they encounter `imem/umem` or `intm/uintm`. And this RFC author expects the interpretation to be quite easy to internalize. + +### Note: -`intp/uintp` are supported by several community members, but they still may remind people of `intptr_t`/`uintptr_t`, although arguably to a lesser extent, as `p` can be interpreted as `platform-dependent` here. +This RFC author personally prefers `imem/umem` now, but `intp/uintp` and `iptr/uptr` also have plenty of community support. No one else seem to care about `intm/uintm`, and they are only in for symmetry and completeness. -With different trade-offs considered, `intm/uintm` may or may not be a better name than `imem/umem` depending on personal tastes. 
+# Drawbacks + +Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs. # Alternatives @@ -95,18 +100,10 @@ With different trade-offs considered, `intm/uintm` may or may not be a better na Which may hurt in the long run, especially when there is at least one (would-be?) high-profile language (which is Rust-inspired) taking the opposite stance of Rust. -## B. Approach 1a: Use `intx/uintx` as the new type names, and `ix/ux` as the new suffixes. +## B. Use `intx/uintx` as the new type names. `intx/uintx` were actually the names that got promoted in the previous revisions of this RFC, where `x` means "unknown", "variable" or "platform-dependent". However the `x` suffix was too vague as there were other integer types that have platform-dependent sizes, like register-sized ones, so `intx/uintx` lost their "promoted" status in this revision. -## C. Approach 1b: Use the proposed literal suffixes as the new type names, not just literal suffixes. - -While that will make type names more closely follow the `i32/u32` pattern, they may be too short (and tempting) and may have unrelated meanings on their own (`ip`/`up`/`um` etc.) - -## D. Approach 1c: Use the proposed type names as literal suffixes, not just type names. - -`uintp`/`uintm`/`uintx` may or may not too long as literal suffixes. - # Unresolved questions -None. +This RFC author believes that we should nail down our general integer types/coercions/data structure indexing policies, as discussed in [Restarting the `int/uint` Discussion](http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), before deciding which names to rename `int/uint` to. From f7c94cc373d91e635fe7724b14e616fa1ec91dad Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Thu, 1 Jan 2015 14:39:27 +0800 Subject: [PATCH 10/20] Typo correction. memory-integer-sized -> memory-pointer-sized. 
---
 text/0000-rename-int-uint.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index 727a2ae1dae..25612210bac 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -82,7 +82,7 @@ Also, there are talks about parametrizing data structures over their indexing/si
 When originally proposed, `mem`/`m` are interpreted as "for memory related things like offsets, indices, sizes, pointer values". However this interpretation seems vague and not quite convincing. But actually, they can be interpreted as **_mem_ory-pointer-sized**, and be a *precise* size specifier just like `ptr`.
 
 - Pros: Types with similar names do not exist in mainstream languages, so people will not make incorrect assumptions.
-- Cons: `mem` -> `memory-integer-sized` is not obvious.
+- Cons: `mem` -> `memory-pointer-sized` is not obvious.
 
 However, people will be tempted to read the documentation anyway when they encounter `imem/umem` or `intm/uintm`. And this RFC author expects the interpretation to be quite easy to internalize.
 

From 26692e32b86e1bd58abdbc99ca6486c781d33374 Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Thu, 1 Jan 2015 15:49:49 +0800
Subject: [PATCH 11/20] Adjusted discussions about `imem/umem`.

---
 text/0000-rename-int-uint.md | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index 25612210bac..6592aebe812 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -66,7 +66,7 @@ Each pair of candidates make different trade-offs, and choosing one would be a q
 - But not too foreign, they still look like integers. (`intp/uintp` and `intm/uintm` may be a bit better here.)
 - They follow the same *signed-ness + size* naming pattern used by other integer types like `i32/u32`. (`iptr/uptr` and `imem/umem` are better here as they follow the pattern more faithfully. Please see the following discussion for why `imem/umem` also have the *size* part.)
 
-### The advantages and disadvantages of suffixes `ptr/p` and `mem/m`:
+### The advantages and disadvantages of suffixes `ptr`/`p` and `mem`/`m`:
 
 #### `iptr/uptr` and `intp/uintp`:
 
@@ -79,12 +79,16 @@ Also, there are talks about parametrizing data structures over their indexing/si
 
 #### `imem/umem` and `intm/uintm`:
 
-When originally proposed, `mem`/`m` are interpreted as "for memory related things like offsets, indices, sizes, pointer values". However this interpretation seems vague and not quite convincing. But actually, they can be interpreted as **_mem_ory-pointer-sized**, and be a *precise* size specifier just like `ptr`.
+When originally proposed, `mem`/`m` are interpreted as "memory numbers" (See @1fish2's comment in[RFC PR 464](https://github.com/rust-lang/rfcs/pull/464)):
+
+> `imem`/`umem` are "memory numbers." They're good for indexes, counts, offsets, sizes, etc. As memory numbers, it makes sense that they're sized by the address space.
+
+However this interpretation seems vague and not quite convincing, especially when all other integer types in Rust are named precisely in the "`i`/`u` + `size`" pattern, with no "indirection" involved. What is "memory-sized" anyway? But actually, they can be interpreted as **_mem_ory-pointer-sized**, and be a *precise* size specifier just like `ptr`.
 
 - Pros: Types with similar names do not exist in mainstream languages, so people will not make incorrect assumptions.
-- Cons: `mem` -> `memory-pointer-sized` is not obvious.
+- Cons: `mem` -> *memory-pointer-sized* is not as obvious as `ptr` -> *pointer-sized*.
 
-However, people will be tempted to read the documentation anyway when they encounter `imem/umem` or `intm/uintm`. And this RFC author expects the interpretation to be quite easy to internalize.
+However, people will be tempted to read the documentation anyway when they encounter `imem/umem` or `intm/uintm`. And this RFC author expects the "memory-pointer-sized" interpretation to be quite easy to internalize once the documentation gets consulted.
 
 ### Note:
 

From bde58ccf1f931861d90209427e5e5e45de29ea4f Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Sat, 3 Jan 2015 12:17:59 +0800
Subject: [PATCH 12/20] Added discussions about `idiff/usize`. And general refinements.

---
 text/0000-rename-int-uint.md | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index 6592aebe812..1cbd3c8c5d3 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -6,6 +6,8 @@
 
 This RFC proposes that we rename the pointer-sized integer types `int/uint` to stress the fact that they are pointer-sized, so as to avoid misconceptions and misuses.
 
+Also, please see **Alternative C and D** for a possible alternative/enhancement that may or may not be a better solution.
+
 # Motivation
 
 Currently, Rust defines two [machine-dependent integer types](http://doc.rust-lang.org/reference.html#machine-dependent-integer-types) `int/uint` that have the same number of bits as the target platform's pointer type. These two types are used for many purposes: indices, counts, sizes, offsets, etc.
@@ -71,7 +73,7 @@ Each pair of candidates make different trade-offs, and choosing one would be a q
 #### `iptr/uptr` and `intp/uintp`:
 
 - Pros: "Pointer-sized integer", exactly what they are.
-- Cons: C/C++ have `intptr_t/uintptr_t`, which are typically *only* used for storing casted pointer values. We don't want people to confuse the Rust types with the C/C++ ones, as the Rust ones have more typical use cases.
+- Cons: C/C++ have `intptr_t/uintptr_t`, which are typically *only* used for storing casted pointer values. We don't want people to confuse the Rust types with the C/C++ ones, as the Rust ones have more typical use cases. Also, people may wonder why all data structures have "pointers" in their method signatures. Besides the "funny-looking" aspect, the names may have an incorrect "pointer fiddling and unsafe staff" connotation there.
 
 However, note that Rust (barring FFI) don't have `size_t`/`ptrdiff_t` etc like C/C++ do, so the disadvantage may not actually matter much.
 
@@ -86,9 +88,9 @@ When originally proposed, `mem`/`m` are interpreted as "memory numbers" (See @1f
 However this interpretation seems vague and not quite convincing, especially when all other integer types in Rust are named precisely in the "`i`/`u` + `size`" pattern, with no "indirection" involved. What is "memory-sized" anyway? But actually, they can be interpreted as **_mem_ory-pointer-sized**, and be a *precise* size specifier just like `ptr`.
 
 - Pros: Types with similar names do not exist in mainstream languages, so people will not make incorrect assumptions.
-- Cons: `mem` -> *memory-pointer-sized* is not as obvious as `ptr` -> *pointer-sized*.
+- Cons: `mem` -> *memory-pointer-sized* is definitely not as obvious as `ptr` -> *pointer-sized*.
 
-However, people will be tempted to read the documentation anyway when they encounter `imem/umem` or `intm/uintm`. And this RFC author expects the "memory-pointer-sized" interpretation to be quite easy to internalize once the documentation gets consulted.
+However, people will be tempted to read the documentation anyway when they encounter `imem/umem` or `intm/uintm`. And this RFC author expects the "memory-pointer-sized" interpretation to be easy (or easier) to internalize once the documentation gets consulted.
 
 ### Note:
 
@@ -108,6 +110,20 @@ Which may hurt in the long run, especially when there is at least one (would-be?
 
 `intx/uintx` were actually the names that got promoted in the previous revisions of this RFC, where `x` means "unknown", "variable" or "platform-dependent". However the `x` suffix was too vague as there were other integer types that have platform-dependent sizes, like register-sized ones, so `intx/uintx` lost their "promoted" status in this revision.
 
+## C. Use `idiff/usize` as the new type names.
+
+Previously, the names involving suffixes like `diff`/`addr`/`size`/`offset` are rejected mainly because they favour specific use cases of `int/uint` while overlooking others. However, it is true that in the majority of cases in safe code, Rust's `int/uint` are used just like standard C/C++ `ptrdiff_t/size_t`. When used in this context, names `idiff/usize` have clarity and familiarity advantages **over all other alternatives**.
+
+(Note: this author advices against `isize`, as it most likely corresponds to C/C++ `ssize_t`. `ssize_t` is in the POSIX standard, not the C/C++ ones, and is *not for offsets/diffs* according to that standard.)
+
+But how about the other use cases of `int/uint` especially the "storing casted pointers" one? Using `libc`'s `intptr_t`/`uintptr_t` is not an option here, as "Rust on bare metal" would be ruled out. Forcing a pointer value into something called `idiff/usize` doesn't seem right either. Thus, this leads us to:
+
+## D. Rename `int/uint` to `iptr/uptr`, with `idiff/usize` being aliases and preferred in container method signatures.
+
+Best of both worlds, maybe? This author believes that if `imem/umem` are deemed too foreign (and quite a few people do think so), then this can be a good solution.
+
+We may even treat `iptr/uptr` and `idiff/usize` as different types to prevent people from accidentally mixing their usage.
+
 # Unresolved questions
 
-This RFC author believes that we should nail down our general integer types/coercions/data structure indexing policies, as discussed in [Restarting the `int/uint` Discussion](http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), before deciding which names to rename `int/uint` to.
+This RFC author believes that we should nail down our general integer type/coercion/data structure indexing policies, as discussed in [Restarting the `int/uint` Discussion](http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), before deciding which names to rename `int/uint` to.

From 742e511b40b3b4e0fad4e833b31dc569723db8e2 Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Sun, 4 Jan 2015 22:27:56 +0800
Subject: [PATCH 13/20] Added more discussions about Alternative D. And the author now considers Alternative D to be quite unsatisfying. The preference of `imem/umem` is now stressed in the summary.

---
 text/0000-rename-int-uint.md | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index 1cbd3c8c5d3..2c44db2325c 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -4,9 +4,7 @@
 
 # Summary
 
-This RFC proposes that we rename the pointer-sized integer types `int/uint` to stress the fact that they are pointer-sized, so as to avoid misconceptions and misuses.
-
-Also, please see **Alternative C and D** for a possible alternative/enhancement that may or may not be a better solution.
+This RFC proposes that we rename the pointer-sized integer types `int/uint` to stress the fact that they are pointer-sized, so as to avoid misconceptions and misuses. Among all the candidates, this RFC author (@CloudiDust) considers `imem/umem` to be his favourite names.
 
 # Motivation
 
@@ -44,7 +42,7 @@ Before the rejection of [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464)
 
 This RFC originally proposed a new pair of alternatives `intx/uintx`.
 
-However, given the discussions about the previous revisions of this RFC, and the discussions in [Restarting the `int/uint` Discussion]( http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), this RFC author (@CloudiDust) now believes that `iptr/uptr` and `imem/umem` can still be viable candidates.
+However, given the discussions about the previous revisions of this RFC, and the discussions in [Restarting the `int/uint` Discussion]( http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), this RFC author now believes that `iptr/uptr` and `imem/umem` can still be viable candidates.
 
 This RFC also proposes two more pairs of candidates: `intp/uintp` and `intm/uintm`, which are actually variants of the above, but stress the `int` part instead of the `ptr`/`mem` part, which may make them subjectively better (or worse), as some would think `iptr/uptr` and `imem/umem` stress the wrong part of the name, and don't look like integer types.
 
@@ -120,10 +118,18 @@ But how about the other use cases of `int/uint` especially the "storing casted p
 
 ## D. Rename `int/uint` to `iptr/uptr`, with `idiff/usize` being aliases and preferred in container method signatures.
 
-Best of both worlds, maybe? This author believes that if `imem/umem` are deemed too foreign (and quite a few people do think so), then this can be a good solution.
+Best of both worlds, maybe?
+
+`iptr/uptr` will be used for storing casted pointer values, while `idiff/usize` will be used for offsets and sizes/indices, respectively.
 
 We may even treat `iptr/uptr` and `idiff/usize` as different types to prevent people from accidentally mixing their usage.
 
+This will bring the Rust type names quite in line with the standard C99 type names, which may be a plus from the familiarity point of view.
+
+However, this setup brings two sets of types that share the same underlying representations, which also brings confusion. Furthermore, C distinguishes between `size_t`/`uintptr_t`/`intptr_t`/`ptrdiff_t` not only because they are used under different circumstances, but also because the four may have representations that are potentially different from *each other* on some architectures. Rust assumes a flat memory address space and its `int/uint` types don't exactly share semantics with any of the C types if the C standard is strictly followed. Thus, this RFC author believes that, it is better to completely forego type names that will remind people of the C types.
+
+`imem/umem` are still the better choices.
+
 # Unresolved questions
 
 This RFC author believes that we should nail down our general integer type/coercion/data structure indexing policies, as discussed in [Restarting the `int/uint` Discussion](http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), before deciding which names to rename `int/uint` to.

From 602fd1070676de87fecba9240620e1319726efc1 Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Sun, 4 Jan 2015 23:37:37 +0800
Subject: [PATCH 14/20] Adjusted the discussions about `ptr` and `mem`.

---
 text/0000-rename-int-uint.md | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index 2c44db2325c..ef7ec3ee49d 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -71,11 +71,7 @@ Each pair of candidates make different trade-offs, and choosing one would be a q
 #### `iptr/uptr` and `intp/uintp`:
 
 - Pros: "Pointer-sized integer", exactly what they are.
-- Cons: C/C++ have `intptr_t/uintptr_t`, which are typically *only* used for storing casted pointer values. We don't want people to confuse the Rust types with the C/C++ ones, as the Rust ones have more typical use cases. Also, people may wonder why all data structures have "pointers" in their method signatures. Besides the "funny-looking" aspect, the names may have an incorrect "pointer fiddling and unsafe staff" connotation there.
-
-However, note that Rust (barring FFI) don't have `size_t`/`ptrdiff_t` etc like C/C++ do, so the disadvantage may not actually matter much.
-
-Also, there are talks about parametrizing data structures over their indexing/size types, if this lands, then in a sense the pointer-sized integers would no longer be the "privileged types for sizes/indexes", and `iptr/uptr` or `intp/uintp` would be quite fine names.
+- Cons: C/C++ have `intptr_t/uintptr_t`, which are typically *only* used for storing casted pointer values. We don't want people to confuse the Rust types with the C/C++ ones, as the Rust ones have more typical use cases. Also, people may wonder why all data structures have "pointers" in their method signatures. Besides the "funny-looking" aspect, the names may have an incorrect "pointer fiddling and unsafe staff" connotation there, as `ptr` isn't usually seen in safe Rust code.
 
 #### `imem/umem` and `intm/uintm`:
 
@@ -83,16 +79,16 @@ When originally proposed, `mem`/`m` are interpreted as "memory numbers" (See @1f
 
 > `imem`/`umem` are "memory numbers." They're good for indexes, counts, offsets, sizes, etc. As memory numbers, it makes sense that they're sized by the address space.
 
-However this interpretation seems vague and not quite convincing, especially when all other integer types in Rust are named precisely in the "`i`/`u` + `size`" pattern, with no "indirection" involved. What is "memory-sized" anyway? But actually, they can be interpreted as **_mem_ory-pointer-sized**, and be a *precise* size specifier just like `ptr`.
+However this interpretation seems vague and not quite convincing, especially when all other integer types in Rust are named precisely in the "`i`/`u` + `{size}`" pattern, with no "indirection" involved. What is "memory-sized" anyway? But actually, they can be interpreted as **_mem_ory-pointer-sized**, and be a *precise* size specifier just like `ptr`.
 
 - Pros: Types with similar names do not exist in mainstream languages, so people will not make incorrect assumptions.
-- Cons: `mem` -> *memory-pointer-sized* is definitely not as obvious as `ptr` -> *pointer-sized*.
+- Cons: `mem` -> *memory-pointer-sized* is definitely not as obvious as `ptr` -> *pointer-sized*. The unfamiliarity may turn newcomers away from Rust.
 
-However, people will be tempted to read the documentation anyway when they encounter `imem/umem` or `intm/uintm`. And this RFC author expects the "memory-pointer-sized" interpretation to be easy (or easier) to internalize once the documentation gets consulted.
+However, this RFC author expects newcomers to read the documentation when they encounter `imem/umem` or `intm/uintm` because they wonder "what on earth are these two types?" If they don't bother reading the documentation, then it is unlikely that they will be using Rust anyway (`imem/umem` or `intm/uintm` are minor problems compared to something like explicit lifetimes or the borrow checker). And the "memory-pointer-sized" interpretation is easy (or easier) to internalize once the documentation gets consulted.
 
 ### Note:
 
-This RFC author personally prefers `imem/umem` now, but `intp/uintp` and `iptr/uptr` also have plenty of community support. No one else seem to care about `intm/uintm`, and they are only in for symmetry and completeness.
+This RFC author personally prefers `imem/umem` now, but `intp/uintp` and `iptr/uptr` also have plenty of community support. Few people seem to care about `intm/uintm`, and these two are only in for symmetry and completeness.
 
 # Drawbacks
 

From de09272abbe92c15667f374fc98f734ea78fb583 Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Mon, 5 Jan 2015 22:12:24 +0800
Subject: [PATCH 15/20] Major revision to promote `ipsz/upsz`.
--- text/0000-rename-int-uint.md | 109 ++++++++++++++++++++++++----------- 1 file changed, 74 insertions(+), 35 deletions(-) diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md index ef7ec3ee49d..bd77378e8de 100644 --- a/text/0000-rename-int-uint.md +++ b/text/0000-rename-int-uint.md @@ -4,7 +4,7 @@ # Summary -This RFC proposes that we rename the pointer-sized integer types `int/uint` to stress the fact that they are pointer-sized, so as to avoid misconceptions and misuses. Among all the candidates, this RFC author (@CloudiDust) considers `imem/umem` to be his favourite names. +This RFC proposes that we rename the pointer-sized integer types `int/uint` to `ipsz/upsz`, so as to avoid misconceptions and misuses. # Motivation @@ -42,40 +42,47 @@ Before the rejection of [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464) This RFC originally proposed a new pair of alternatives `intx/uintx`. -However, given the discussions about the previous revisions of this RFC, and the discussions in [Restarting the `int/uint` Discussion]( http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), this RFC author now believes that `iptr/uptr` and `imem/umem` can still be viable candidates. - -This RFC also proposes two more pairs of candidates: `intp/uintp` and `intm/uintm`, which are actually variants of the above, but stress the `int` part instead of the `ptr`/`mem` part, which may make them subjectively better (or worse), as some would think `iptr/uptr` and `imem/umem` stress the wrong part of the name, and don't look like integer types. +However, given the discussions about the previous revisions of this RFC, and the discussions in [Restarting the `int/uint` Discussion]( http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), this RFC author (@CloudiDust) now believes that `intx/uintx` are not ideal. Instead, `ipsz/upsz` are this author's favourites now. 
# Detailed Design -Rename `int/uint` to one of the above pairs of candidates. +Rename `int/uint` to `ipsz/upsz`, and use `ipsz/upsz` directly as literal suffixes for pointer-sized integers. `ipsz/upsz` are short for **signed/unsigned integer, pointer-sized**. Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. -Also, a decision should be made about whether to introduce the respective literal suffixes `ip/up` or `im/um`, or use the type names as literal suffixes. +## Advantages -This author believes that, if `iptr/uptr` or `imem/umem` are chosen, they should be used directly as suffixes, but if `intp/uintp` or `intm/uintm` are chosen, then `ip/up` or `im/um` should be used, as `uintp`/`uintm` may be too long as suffixes. `ip/up` and `im/um` don't make good *type names* though, as they are too short and have meanings unrelated to integers. +The advantages of `ipsz/upsz` are: -## Advantages +- The names are foreign to programmers from other languages, so they are less likely to make incorrect assumptions, or use them out of habit. +- They follow the same *signed-ness + size* naming pattern used by other integer types like `i32/u32`. +- The actual semantics of the types are right there in the names, but the names don't stress the `p` (pointer) parts. +- If anything, the names actually stress the `sz` (size) parts. +- They are (comparatively) easy on the eyes. -Each pair of candidates make different trade-offs, and choosing one would be a quite subjective matter. But they are all better than `int/uint`. +In order to see why some of the above points are advantages, please refer to the **Alternatives** section for the discussions of the other candidates. -### The advantages of all candidates over `int/uint`: +# Drawbacks -- The names are foreign to programmers from other languages, so they are less likely to make incorrect assumptions, or use them out of habit. 
-- But not too foreign, they still look like integers. (`intp/uintp` and `intm/uintm` may be a bit better here.) -- They follow the same *signed-ness + size* naming pattern used by other integer types like `i32/u32`. (`iptr/uptr` and `imem/umem` are better here as they follow the pattern more faithfully. Please see the following discussion for why `imem/umem` also have the *size* part.) +- Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs. +- Some consider `ipsz/upsz` to be letter soup, and they are right. But this author expects `ipsz/upsz` to be easily understandable *and pleasant to use* once the documentation gets consulted. -### The advantages and disadvantages of suffixes `ptr`/`p` and `mem`/`m`: +# Alternatives -#### `iptr/uptr` and `intp/uintp`: +## A. Keep the status quo. + +Which may hurt in the long run, especially when there is at least one (would-be?) high-profile language (which is Rust-inspired) taking the opposite stance of Rust. + +The following alternatives make different trade-offs, and choosing one would be quite a subjective matter. But they are all better than the status quo. + +## B. `iptr/uptr`: - Pros: "Pointer-sized integer", exactly what they are. - Cons: C/C++ have `intptr_t/uintptr_t`, which are typically *only* used for storing casted pointer values. We don't want people to confuse the Rust types with the C/C++ ones, as the Rust ones have more typical use cases. Also, people may wonder why all data structures have "pointers" in their method signatures. Besides the "funny-looking" aspect, the names may have an incorrect "pointer fiddling and unsafe staff" connotation there, as `ptr` isn't usually seen in safe Rust code. -#### `imem/umem` and `intm/uintm`: +## C. 
`imem/umem`: -When originally proposed, `mem`/`m` are interpreted as "memory numbers" (See @1fish2's comment in[RFC PR 464](https://github.com/rust-lang/rfcs/pull/464)): +When originally proposed, `mem`/`m` are interpreted as "memory numbers" (See @1fish2's comment in [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464)): > `imem`/`umem` are "memory numbers." They're good for indexes, counts, offsets, sizes, etc. As memory numbers, it makes sense that they're sized by the address space. @@ -84,48 +91,80 @@ However this interpretation seems vague and not quite convincing, especially whe - Pros: Types with similar names do not exist in mainstream languages, so people will not make incorrect assumptions. - Cons: `mem` -> *memory-pointer-sized* is definitely not as obvious as `ptr` -> *pointer-sized*. The unfamiliarity may turn newcomers away from Rust. -However, this RFC author expects newcomers to read the documentation when they encounter `imem/umem` or `intm/uintm` because they wonder "what on earth are these two types?" If they don't bother reading the documentation, then it is unlikely that they will be using Rust anyway (`imem/umem` or `intm/uintm` are minor problems compared to something like explicit lifetimes or the borrow checker). And the "memory-pointer-sized" interpretation is easy (or easier) to internalize once the documentation gets consulted. +Also, for some, `imem/umem` just don't feel like integers no matter how they are interpreted, especially under certain circumstances. In the following snippet: -### Note: +```rust +fn slice_or_fail<'b>(&'b self, from: &umem, to: &umem) -> &'b [T] +``` -This RFC author personally prefers `imem/umem` now, but `intp/uintp` and `iptr/uptr` also have plenty of community support. Few people seem to care about `intm/uintm`, and these two are only in for symmetry and completeness. 
+`umem` still feels like a pointer-like construct here (from "some memory" to "some other memory"), even though it doesn't have `ptr` in its name.

-# Drawbacks
+## D. `intp/uintp` and `intm/uintm`:

-Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs.
+Variants of Alternatives B and C. Instead of stressing the `ptr` or `mem` part, they stress the `int` or `uint` part.

-# Alternatives
+They are more integer-like than `iptr/uptr` or `imem/umem` if one knows where to split the words.

-## A. Keep the status quo.
+The problem here is that they don't strictly follow the `i/u + {size}` pattern, are of different lengths, and the more frequently used type `uintp`(`uintm`) has a longer name. Granted, this problem already exists with `int/uint`, but those two are names that everyone is familiar with.

-Which may hurt in the long run, especially when there is at least one (would-be?) high-profile language (which is Rust-inspired) taking the opposite stance of Rust.
+So they are not as pretty as `iptr/uptr` or `imem/umem`.
+
+## E. `intx/uintx`:

-## B. Use `intx/uintx` as the new type names.
+The original proposed names of this RFC, where `x` means "unknown/variable/platform-dependent".

-`intx/uintx` were actually the names that got promoted in the previous revisions of this RFC, where `x` means "unknown", "variable" or "platform-dependent". However the `x` suffix was too vague as there were other integer types that have platform-dependent sizes, like register-sized ones, so `intx/uintx` lost their "promoted" status in this revision.
+They share the same problems with `intp/uintp` and `intm/uintm`, while *in addition* failing to be specific enough. There are other kinds of platform-dependent integer types after all (like register-sized ones), so which ones are `intx/uintx`?

-## C. Use `idiff/usize` as the new type names.
+## F. `idiff(isize)/usize`:

-Previously, the names involving suffixes like `diff`/`addr`/`size`/`offset` are rejected mainly because they favour specific use cases of `int/uint` while overlooking others. However, it is true that in the majority of cases in safe code, Rust's `int/uint` are used just like standard C/C++ `ptrdiff_t/size_t`. When used in this context, names `idiff/usize` have clarity and familiarity advantages **over all other alternatives**.
+Previously, the names involving suffixes like `diff`/`addr`/`size`/`offset` are rejected mainly because they favour specific use cases of `int/uint` while overlooking others. However, it is true that in the majority of cases in safe code, Rust's `int/uint` are used just like standard C/C++ `ptrdiff_t/size_t`. When used in this context, names `idiff(isize)/usize` have clarity and familiarity advantages **over all other alternatives**.

-(Note: this author advices against `isize`, as it most likely corresponds to C/C++ `ssize_t`. `ssize_t` is in the POSIX standard, not the C/C++ ones, and is *not for offsets/diffs* according to that standard.)
+(Note: this author advices against `isize`, as it most likely corresponds to C/C++ `ssize_t`. `ssize_t` is in the POSIX standard, not the C/C++ ones, and is *not for offsets* according to that standard. However some may argue that, `isize/usize` are different enough from `ssize_t/size_t` so this author's worries are unnecessary.)

 But how about the other use cases of `int/uint` especially the "storing casted pointers" one? Using `libc`'s `intptr_t`/`uintptr_t` is not an option here, as "Rust on bare metal" would be ruled out. Forcing a pointer value into something called `idiff/usize` doesn't seem right either. Thus, this leads us to:

-## D. Rename `int/uint` to `iptr/uptr`, with `idiff/usize` being aliases and preferred in container method signatures.
+## G. `iptr/uptr` *and* `idiff/usize`:

-Best of both worlds, maybe?
+Rename `int/uint` to `iptr/uptr`, with `idiff/usize` being aliases and used in container method signatures.
+
+Best of both worlds on the first glance.

 `iptr/uptr` will be used for storing casted pointer values, while `idiff/usize` will be used for offsets and sizes/indices, respectively.

-We may even treat `iptr/uptr` and `idiff/usize` as different types to prevent people from accidentally mixing their usage.
+`iptr/uptr` and `idiff/usize` may even be treated as different types to prevent people from accidentally mixing their usage.

 This will bring the Rust type names quite in line with the standard C99 type names, which may be a plus from the familiarity point of view.

 However, this setup brings two sets of types that share the same underlying representations, which also brings confusion. Furthermore, C distinguishes between `size_t`/`uintptr_t`/`intptr_t`/`ptrdiff_t` not only because they are used under different circumstances, but also because the four may have representations that are potentially different from *each other* on some architectures. Rust assumes a flat memory address space and its `int/uint` types don't exactly share semantics with any of the C types if the C standard is strictly followed. Thus, this RFC author believes that, it is better to completely forego type names that will remind people of the C types.

-`imem/umem` are still the better choices.
+## H. `isiz/usiz`:
+
+A pair of variants of `isize/usize`. This author believes that the missing `e` may be enough to warn people that these are not `ssize_t/size_t` with "Rustfied" names. But at the same time, `isiz/usiz` mostly retain the familiarity of `isize/usize`. Actually, this author considers them more pleasant to use than the "full version".
+
+However, `isiz/usiz` still hide the actual semantics of the types, and omitting but a single letter from a word does feel a bit too hack-ish.
+
+## I. `iptr_size/uptr_size`:
+
+The names are very clear about the semantics, but are also irregular, too long and feel out of place.
+
+## J. `iptrsz/uptrsz`:
+
+Clear semantics, but still a bit too long (though better than `iptr_size/uptr_size`), and the `ptr` parts are still a bit concerning (though to a much less extent than `iptr/uptr`).
+
+## H. `ipsz/upsz`:
+
+Now it is clear where this final pair of alternatives comes from.
+
+By shortening `ptr` to `p`, `ipsz/upsz` no longer stress the "pointer" parts in anyway. Instead, the `sz` or "size" parts are (comparatively) stressed. Interestingly, `ipsz/upsz` look similar to `isiz/usiz`.
+
+So this pair of names reflects both the precise semantics of "pointer-sized integers" and the fact that they are commonly used for "sizes". See:
+
+```rust
+fn slice_or_fail<'b>(&'b self, from: &upsz, to: &upsz) -> &'b [T]
+```
+
+Some may still find `upsz` a bit strange here, but no one would be very likely to think that he/she is dealing with pointers. With the help of the documentation, this author believes `ipsz/upsz` to be the overall winner among all the alternatives. Still, not everyone likes letter soup.

 # Unresolved questions

-This RFC author believes that we should nail down our general integer type/coercion/data structure indexing policies, as discussed in [Restarting the `int/uint` Discussion](http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), before deciding which names to rename `int/uint` to.
+None. Necessary decisions about Rust's general integer type policies have been made in [Restarting the `int/uint` Discussion](http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131).

From 8ca0e6c871e101b0b4777ec95ed40dc263047bd7 Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Mon, 5 Jan 2015 22:47:05 +0800
Subject: [PATCH 16/20] Added examples to most alternatives. And fixed a typo.
---
 text/0000-rename-int-uint.md | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index bd77378e8de..9c26b95d1cf 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -78,7 +78,15 @@ The following alternatives make different trade-offs, and choosing one would be
 ## B. `iptr/uptr`:

 - Pros: "Pointer-sized integer", exactly what they are.
-- Cons: C/C++ have `intptr_t/uintptr_t`, which are typically *only* used for storing casted pointer values. We don't want people to confuse the Rust types with the C/C++ ones, as the Rust ones have more typical use cases. Also, people may wonder why all data structures have "pointers" in their method signatures. Besides the "funny-looking" aspect, the names may have an incorrect "pointer fiddling and unsafe staff" connotation there, as `ptr` isn't usually seen in safe Rust code.
+- Cons: C/C++ have `intptr_t/uintptr_t`, which are typically *only* used for storing casted pointer values. We don't want people to confuse the Rust types with the C/C++ ones, as the Rust ones have more typical use cases. Also, people may wonder why all data structures have "pointers" in their method signatures. Besides the "funny-looking" aspect, the names may have an incorrect "pointer fiddling and unsafe staff" connotation there, as `ptr` isn't usually seen in safe Rust code.
+
+In the following snippet:
+
+```rust
+fn slice_or_fail<'b>(&'b self, from: &uptr, to: &uptr) -> &'b [T]
+```
+
+It feels like working with pointers, not integers.

 ## C. `imem/umem`:

@@ -107,7 +115,12 @@ They are more integer-like than `iptr/uptr` or `imem/umem` if one knows where to

 The problem here is that they don't strictly follow the `i/u + {size}` pattern, are of different lengths, and the more frequently used type `uintp`(`uintm`) has a longer name. Granted, this problem already exists with `int/uint`, but those two are names that everyone is familiar with.

-So they are not as pretty as `iptr/uptr` or `imem/umem`.
+So they may not be as pretty as `iptr/uptr` or `imem/umem`.
+
+```rust
+fn slice_or_fail<'b>(&'b self, from: &uintm, to: &uintm) -> &'b [T]
+fn slice_or_fail<'b>(&'b self, from: &uintp, to: &uintp) -> &'b [T]
+```

 ## E. `intx/uintx`:

@@ -123,6 +136,10 @@ Previously, the names involving suffixes like `diff`/`addr`/`size`/`offset` are

 But how about the other use cases of `int/uint` especially the "storing casted pointers" one? Using `libc`'s `intptr_t`/`uintptr_t` is not an option here, as "Rust on bare metal" would be ruled out. Forcing a pointer value into something called `idiff/usize` doesn't seem right either. Thus, this leads us to:

+```rust
+fn slice_or_fail<'b>(&'b self, from: &usize, to: &usize) -> &'b [T]
+```
+
 ## G. `iptr/uptr` *and* `idiff/usize`:

 Rename `int/uint` to `iptr/uptr`, with `idiff/usize` being aliases and used in container method signatures.
@@ -143,15 +160,27 @@ A pair of variants of `isize/usize`. This author believes that the missing `e` m

 However, `isiz/usiz` still hide the actual semantics of the types, and omitting but a single letter from a word does feel a bit too hack-ish.

+```rust
+fn slice_or_fail<'b>(&'b self, from: &usiz, to: &usiz) -> &'b [T]
+```
+
 ## I. `iptr_size/uptr_size`:

 The names are very clear about the semantics, but are also irregular, too long and feel out of place.

+```rust
+fn slice_or_fail<'b>(&'b self, from: &uptr_size, to: &uptr_size) -> &'b [T]
+```
+
 ## J. `iptrsz/uptrsz`:

 Clear semantics, but still a bit too long (though better than `iptr_size/uptr_size`), and the `ptr` parts are still a bit concerning (though to a much less extent than `iptr/uptr`).

-## H. `ipsz/upsz`:
+```rust
+fn slice_or_fail<'b>(&'b self, from: &uptrsz, to: &uptrsz) -> &'b [T]
+```
+
+## K. `ipsz/upsz`:

 Now it is clear where this final pair of alternatives comes from.
From acaa5604687589b9adb4cb7254600d3da70919af Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Mon, 5 Jan 2015 23:07:31 +0800
Subject: [PATCH 17/20] Stop promoting `ipsz/upsz` too much.
---
 text/0000-rename-int-uint.md | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index 9c26b95d1cf..7e334150781 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -4,7 +4,7 @@

 # Summary

-This RFC proposes that we rename the pointer-sized integer types `int/uint` to `ipsz/upsz`, so as to avoid misconceptions and misuses.
+This RFC proposes that we rename the pointer-sized integer types `int/uint`, so as to avoid misconceptions and misuses.

 # Motivation

@@ -46,26 +46,16 @@ However, given the discussions about the previous revisions of this RFC, and the

 # Detailed Design

-Rename `int/uint` to `ipsz/upsz`, and use `ipsz/upsz` directly as literal suffixes for pointer-sized integers. `ipsz/upsz` are short for **signed/unsigned integer, pointer-sized**.
+Rename `int/uint` to one of the following pairs of alternatives, and decide how to name the literal suffixes for pointer-sized integers based on the selected alternative.

 Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates.

-## Advantages
-
-The advantages of `ipsz/upsz` are:
-
-- The names are foreign to programmers from other languages, so they are less likely to make incorrect assumptions, or use them out of habit.
-- They follow the same *signed-ness + size* naming pattern used by other integer types like `i32/u32`.
-- The actual semantics of the types are right there in the names, but the names don't stress the `p` (pointer) parts.
-- If anything, the names actually stress the `sz` (size) parts.
-- They are (comparatively) easy on the eyes.
-
-In order to see why some of the above points are advantages, please refer to the **Alternatives** section for the discussions of the other candidates.
+See **Alternatives B to K** for the alternatives.

 # Drawbacks

 - Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs.
-- Some consider `ipsz/upsz` to be letter soup, and they are right. But this author expects `ipsz/upsz` to be easily understandable *and pleasant to use* once the documentation gets consulted.
+

 # Alternatives

@@ -192,7 +182,7 @@ So this pair of names reflects both the precise semantics of "pointer-sized inte
 fn slice_or_fail<'b>(&'b self, from: &upsz, to: &upsz) -> &'b [T]
 ```

-Some may still find `upsz` a bit strange here, but no one would be very likely to think that he/she is dealing with pointers. With the help of the documentation, this author believes `ipsz/upsz` to be the overall winner among all the alternatives. Still, not everyone likes letter soup.
+Some may still find `upsz` a bit strange here, but no one would be very likely to think that he/she is dealing with pointers. Still, `ipsz/upsz` may be too foreign, and many do not like letter soup. `iptrsz/uptrsz` may actually be better in this regard.

 # Unresolved questions

From f6b2d62db924ccf814e361389aadf87927b719a7 Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Tue, 6 Jan 2015 00:02:55 +0800
Subject: [PATCH 18/20] Adjusted the discussions about `*ptrsz` and `*psz`. Also `*siz`, actually.
---
 text/0000-rename-int-uint.md | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index 7e334150781..5e0c8924473 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -42,7 +42,7 @@ Before the rejection of [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464)

 This RFC originally proposed a new pair of alternatives `intx/uintx`.
-However, given the discussions about the previous revisions of this RFC, and the discussions in [Restarting the `int/uint` Discussion]( http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), this RFC author (@CloudiDust) now believes that `intx/uintx` are not ideal. Instead, `ipsz/upsz` are this author's favourites now.
+However, given the discussions about the previous revisions of this RFC, and the discussions in [Restarting the `int/uint` Discussion]( http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), this RFC author (@CloudiDust) now believes that `intx/uintx` are not ideal. Instead, one of the other pairs of alternatives should be chosen.

 # Detailed Design

@@ -56,7 +56,6 @@ See **Alternatives B to K** for the alternatives.

 - Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs.

-
 # Alternatives

 ## A. Keep the status quo.
@@ -124,12 +123,12 @@ Previously, the names involving suffixes like `diff`/`addr`/`size`/`offset` are

 (Note: this author advices against `isize`, as it most likely corresponds to C/C++ `ssize_t`. `ssize_t` is in the POSIX standard, not the C/C++ ones, and is *not for offsets* according to that standard. However some may argue that, `isize/usize` are different enough from `ssize_t/size_t` so this author's worries are unnecessary.)

-But how about the other use cases of `int/uint` especially the "storing casted pointers" one? Using `libc`'s `intptr_t`/`uintptr_t` is not an option here, as "Rust on bare metal" would be ruled out. Forcing a pointer value into something called `idiff/usize` doesn't seem right either. Thus, this leads us to:
-
 ```rust
 fn slice_or_fail<'b>(&'b self, from: &usize, to: &usize) -> &'b [T]
 ```

+But how about the other use cases of `int/uint` especially the "storing casted pointers" one? Using `libc`'s `intptr_t`/`uintptr_t` is not an option here, as "Rust on bare metal" would be ruled out. Forcing a pointer value into something called `idiff/usize` doesn't seem right either. Thus, this leads us to:
+
 ## G. `iptr/uptr` *and* `idiff/usize`:

 Rename `int/uint` to `iptr/uptr`, with `idiff/usize` being aliases and used in container method signatures.
@@ -148,7 +147,7 @@ However, this setup brings two sets of types that share the same underlying repr

 A pair of variants of `isize/usize`. This author believes that the missing `e` may be enough to warn people that these are not `ssize_t/size_t` with "Rustfied" names. But at the same time, `isiz/usiz` mostly retain the familiarity of `isize/usize`. Actually, this author considers them more pleasant to use than the "full version".

-However, `isiz/usiz` still hide the actual semantics of the types, and omitting but a single letter from a word does feel a bit too hack-ish.
+However, `isiz/usiz` still hide the actual semantics of the types, and omitting but a single letter from a word does feel too hack-ish.

 ```rust
 fn slice_or_fail<'b>(&'b self, from: &usiz, to: &usiz) -> &'b [T]
@@ -164,7 +163,7 @@ fn slice_or_fail<'b>(&'b self, from: &uptr_size, to: &uptr_size) -> &'b [T]

 ## J. `iptrsz/uptrsz`:

-Clear semantics, but still a bit too long (though better than `iptr_size/uptr_size`), and the `ptr` parts are still a bit concerning (though to a much less extent than `iptr/uptr`).
+Clear semantics, but still a bit too long (though better than `iptr_size/uptr_size`), and the `ptr` parts are still a bit concerning (though to a much less extent than `iptr/uptr`). On the other hand, being "a bit too long" may not be a disadvantage here.

 ```rust
 fn slice_or_fail<'b>(&'b self, from: &uptrsz, to: &uptrsz) -> &'b [T]
@@ -172,17 +171,17 @@ fn slice_or_fail<'b>(&'b self, from: &uptrsz, to: &uptrsz) -> &'b [T]

 ## K. `ipsz/upsz`:

-Now it is clear where this final pair of alternatives comes from.
+Now (and only now, which is the problem) it is clear where this final pair of alternatives comes from.
 By shortening `ptr` to `p`, `ipsz/upsz` no longer stress the "pointer" parts in anyway. Instead, the `sz` or "size" parts are (comparatively) stressed. Interestingly, `ipsz/upsz` look similar to `isiz/usiz`.

-So this pair of names reflects both the precise semantics of "pointer-sized integers" and the fact that they are commonly used for "sizes". See:
+So this pair of names actually reflects both the precise semantics of "pointer-sized integers" and the fact that they are commonly used for "sizes". However,

 ```rust
 fn slice_or_fail<'b>(&'b self, from: &upsz, to: &upsz) -> &'b [T]
 ```

-Some may still find `upsz` a bit strange here, but no one would be very likely to think that he/she is dealing with pointers. Still, `ipsz/upsz` may be too foreign, and many do not like letter soup. `iptrsz/uptrsz` may actually be better in this regard.
+`ipsz/upsz` have gone too far. They are completely incomprehensible without the documentation. Many rightfully do not like letter soup. The only advantage here is that, no one would be very likely to think he/she is dealing with pointers. `iptrsz/uptrsz` are better in this regard.

 # Unresolved questions

From 3cc625e4500013f2cf451f28abeb2e6ae341d952 Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Tue, 6 Jan 2015 19:18:13 +0800
Subject: [PATCH 19/20] Winners: `isize/usize`.
---
 text/0000-rename-int-uint.md | 60 ++++++++++++++++++++++++++++------------
 1 file changed, 42 insertions(+), 18 deletions(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index 5e0c8924473..322eea93b29 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -4,7 +4,7 @@

 # Summary

-This RFC proposes that we rename the pointer-sized integer types `int/uint`, so as to avoid misconceptions and misuses.
+This RFC proposes that we rename the pointer-sized integer types `int/uint`, so as to avoid misconceptions and misuses. After extensive community discussions and several revisions of this RFC, the finally chosen names are `isize/usize`.

 # Motivation

@@ -42,20 +42,44 @@ Before the rejection of [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464)

 This RFC originally proposed a new pair of alternatives `intx/uintx`.

-However, given the discussions about the previous revisions of this RFC, and the discussions in [Restarting the `int/uint` Discussion]( http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), this RFC author (@CloudiDust) now believes that `intx/uintx` are not ideal. Instead, one of the other pairs of alternatives should be chosen.
+However, given the discussions about the previous revisions of this RFC, and the discussions in [Restarting the `int/uint` Discussion]( http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131), this RFC author (@CloudiDust) now believes that `intx/uintx` are not ideal. Instead, one of the other pairs of alternatives should be chosen. The finally chosen names are `isize/usize`.

 # Detailed Design

-Rename `int/uint` to one of the following pairs of alternatives, and decide how to name the literal suffixes for pointer-sized integers based on the selected alternative.
+- Rename `int/uint` to `isize/usize`, with `is/us` being their literal suffixes, respectively.
+- Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates.

-Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates.
+Some would prefer using `isize/usize` directly as literal suffixes here, as `is/us` are actual words and maybe a bit *too* pleasant to use. But on the other hand, `42isize` can be too long for others.

-See **Alternatives B to K** for the alternatives.
+`usize` in action:
+
+```rust
+fn slice_or_fail<'b>(&'b self, from: &usize, to: &usize) -> &'b [T]
+```
+
+See **Alternatives B to L** for the other alternatives that are rejected.
+
+## Advantages of `isize/usize`:
+
+- The names indicate their common use cases (container sizes/indices/offsets), so people will know where to use them, instead of overusing them everywhere.
+- The names follow the `i/u + {suffix}` pattern that is used by all the other primitive integer types like `i32/u32`.
+- The names are newcomer friendly and have familiarity advantage over almost all other alternatives.
+- The names are easy on the eyes.

 # Drawbacks

+## Drawbacks of the renaming in general:
+
 - Renaming `int`/`uint` requires changing much existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs.

+## Drawbacks of `isize/usize`:
+
+- the names fail to indicate the precise semantics of the types - *pointer-sized integers*. (And they don't follow the `i32/u32` pattern as faithfully as possible, as `32` indicates the exact size of the types, but `size` in `isize/usize` is vague in this aspect.)
+- the names favour some of the types' use cases over the others.
+- the names remind people of C's `ssize_t/size_t`, but `isize/usize` don't share the exact same semantics with the C types.
+
+Familiarity is a double edged sword here. `isize/usize` are chosen not because they are perfect, but because they represent a good compromise between semantic accuracy, familiarity and code readability. Given good documentation, the drawbacks listed here may not matter much in practice, and the combined familiarity and readability advantage outweighs them all.
+
 # Alternatives

 ## A. Keep the status quo.
@@ -117,23 +141,17 @@ The original proposed names of this RFC, where `x` means "unknown/variable/platf

 They share the same problems with `intp/uintp` and `intm/uintm`, while *in addition* failing to be specific enough. There are other kinds of platform-dependent integer types after all (like register-sized ones), so which ones are `intx/uintx`?

-## F. `idiff(isize)/usize`:
-
-Previously, the names involving suffixes like `diff`/`addr`/`size`/`offset` are rejected mainly because they favour specific use cases of `int/uint` while overlooking others. However, it is true that in the majority of cases in safe code, Rust's `int/uint` are used just like standard C/C++ `ptrdiff_t/size_t`. When used in this context, names `idiff(isize)/usize` have clarity and familiarity advantages **over all other alternatives**.
+## F. `idiff/usize`:

-(Note: this author advices against `isize`, as it most likely corresponds to C/C++ `ssize_t`. `ssize_t` is in the POSIX standard, not the C/C++ ones, and is *not for offsets* according to that standard. However some may argue that, `isize/usize` are different enough from `ssize_t/size_t` so this author's worries are unnecessary.)
+There is a problem with `isize`: it most likely will remind people of C/C++ `ssize_t`. But `ssize_t` is in the POSIX standard, not the C/C++ ones, and is *not for index offsets* according to POSIX. The correct type for index offsets in C99 is `ptrdiff_t`, so for a type representing offsets, `idiff` may be a better name.

-```rust
-fn slice_or_fail<'b>(&'b self, from: &usize, to: &usize) -> &'b [T]
-```
-
-But how about the other use cases of `int/uint` especially the "storing casted pointers" one? Using `libc`'s `intptr_t`/`uintptr_t` is not an option here, as "Rust on bare metal" would be ruled out. Forcing a pointer value into something called `idiff/usize` doesn't seem right either. Thus, this leads us to:
+However, `isize/usize` have the advantage of being symmetrical, and ultimately, even with a name like `idiff`, some semantic mismatch between `idiff` and `ptrdiff_t` would still exist. Also, for fitting a casted pointer value, a type named `isize` is better than one named `idiff`. (Though both would lose to `iptr`.)

 ## G. `iptr/uptr` *and* `idiff/usize`:

 Rename `int/uint` to `iptr/uptr`, with `idiff/usize` being aliases and used in container method signatures.

-Best of both worlds on the first glance.
+This is for addressing the "not enough use cases covered" problem. Best of both worlds at the first glance.

 `iptr/uptr` will be used for storing casted pointer values, while `idiff/usize` will be used for offsets and sizes/indices, respectively.

@@ -141,11 +159,13 @@ Best of both worlds on the first glance.

 This will bring the Rust type names quite in line with the standard C99 type names, which may be a plus from the familiarity point of view.

-However, this setup brings two sets of types that share the same underlying representations, which also brings confusion. Furthermore, C distinguishes between `size_t`/`uintptr_t`/`intptr_t`/`ptrdiff_t` not only because they are used under different circumstances, but also because the four may have representations that are potentially different from *each other* on some architectures. Rust assumes a flat memory address space and its `int/uint` types don't exactly share semantics with any of the C types if the C standard is strictly followed. Thus, this RFC author believes that, it is better to completely forego type names that will remind people of the C types.
+However, this setup brings two sets of types that share the same underlying representations. C distinguishes between `size_t`/`uintptr_t`/`intptr_t`/`ptrdiff_t` not only because they are used under different circumstances, but also because the four may have representations that are potentially different from *each other* on some architectures. Rust assumes a flat memory address space and its `int/uint` types don't exactly share semantics with any of the C types if the C standard is strictly followed.
+
+Thus, even introducing four names would not fix the "failing to express the precise semantics of the types" problem. Rust just doesn't need to, and *shouldn't* distinguish between `iptr/idiff` and `uptr/usize`, doing so would bring much confusion for very questionable gain.

 ## H. `isiz/usiz`:

-A pair of variants of `isize/usize`. This author believes that the missing `e` may be enough to warn people that these are not `ssize_t/size_t` with "Rustfied" names. But at the same time, `isiz/usiz` mostly retain the familiarity of `isize/usize`. Actually, this author considers them more pleasant to use than the "full version".
+A pair of variants of `isize/usize`. This author believes that the missing `e` may be enough to warn people that these are not `ssize_t/size_t` with "Rustfied" names. But at the same time, `isiz/usiz` mostly retain the familiarity of `isize/usize`.

 However, `isiz/usiz` still hide the actual semantics of the types, and omitting but a single letter from a word does feel too hack-ish.

@@ -181,7 +201,11 @@ So this pair of names actually reflects both the precise semantics of "pointer-s
 fn slice_or_fail<'b>(&'b self, from: &upsz, to: &upsz) -> &'b [T]
 ```

-`ipsz/upsz` have gone too far. They are completely incomprehensible without the documentation. Many rightfully do not like letter soup. The only advantage here is that, no one would be very likely to think he/she is dealing with pointers. `iptrsz/uptrsz` are better in this regard.
+`ipsz/upsz` have gone too far. They are completely incomprehensible without the documentation. Many rightfully do not like letter soup. The only advantage here is that, no one would be very likely to think he/she is dealing with pointers. `iptrsz/uptrsz` are better in the comprehensibility aspect.
+
+## L. Others:
+
+There are other alternatives not covered in this RFC. Please refer to this RFC's comments and [RFC PR 464](https://github.com/rust-lang/rfcs/pull/464) for more.
 # Unresolved questions

From 777aef180253f9f66a3cd51a95b78096dbc73023 Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Tue, 6 Jan 2015 19:29:06 +0800
Subject: [PATCH 20/20] Fixed capitalization and one incorrect adjective.
---
 text/0000-rename-int-uint.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/text/0000-rename-int-uint.md b/text/0000-rename-int-uint.md
index 322eea93b29..43c5317fc0c 100644
--- a/text/0000-rename-int-uint.md
+++ b/text/0000-rename-int-uint.md
@@ -74,9 +74,9 @@ See **Alternatives B to L** for the other alternatives that are rejected.

 ## Drawbacks of `isize/usize`:

-- the names fail to indicate the precise semantics of the types - *pointer-sized integers*. (And they don't follow the `i32/u32` pattern as faithfully as possible, as `32` indicates the exact size of the types, but `size` in `isize/usize` is vague in this aspect.)
-- the names favour some of the types' use cases over the others.
-- the names remind people of C's `ssize_t/size_t`, but `isize/usize` don't share the exact same semantics with the C types.
+- The names fail to indicate the precise semantics of the types - *pointer-sized integers*. (And they don't follow the `i32/u32` pattern as faithfully as possible, as `32` indicates the exact size of the types, but `size` in `isize/usize` is vague in this aspect.)
+- The names favour some of the types' use cases over the others.
+- The names remind people of C's `ssize_t/size_t`, but `isize/usize` don't share the exact same semantics with the C types.

 Familiarity is a double edged sword here. `isize/usize` are chosen not because they are perfect, but because they represent a good compromise between semantic accuracy, familiarity and code readability. Given good documentation, the drawbacks listed here may not matter much in practice, and the combined familiarity and readability advantage outweighs them all.

@@ -191,7 +191,7 @@ fn slice_or_fail<'b>(&'b self, from: &uptrsz, to: &uptrsz) -> &'b [T]

 ## K. `ipsz/upsz`:

-Now (and only now, which is the problem) it is clear where this final pair of alternatives comes from.
+Now (and only now, which is the problem) it is clear where this pair of alternatives comes from.

 By shortening `ptr` to `p`, `ipsz/upsz` no longer stress the "pointer" parts in anyway. Instead, the `sz` or "size" parts are (comparatively) stressed. Interestingly, `ipsz/upsz` look similar to `isiz/usiz`.