Span length should be unsigned #1242
Comments
@JosephBialekMsft the design decision to use a signed type for span's indexing and size is discussed in the original proposal and follows the general thinking described in the Core Guidelines. It might also be worth looking at: https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es102-use-signed-types-for-arithmetic and https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es107-dont-use-unsigned-for-subscripts-prefer-gslindex for context. The fact that unsigned values are used for indexing inside loops by the Windows kernel could be considered a reliability issue. Signed overflow/underflow is undefined behavior, so the compiler and its tooling are allowed to treat it as an error. Unsigned arithmetic wraps silently and simply masks the bug. |
Thanks Neil, I gave that a read. In the context of span, many of the examples listed (integer underflow) wouldn't matter in a particularly meaningful way. A negative value is invalid, just as a value of 4294967294 is almost certain to be invalid in a span. As for integer overflows, it's certainly possible to overflow your index while iterating through a span, although such situations are likely to be pretty rare, based on my experience reviewing code that uses unsigned indexes. The downside is that you double the number of checks required (>= 0 and < size), and checks are made for situations that are logically nonsensical (a negative index being passed). Binary protocols intentionally use unsigned sizes since that allows a greater range of meaningful values per bit of storage used. Many pointer/size interfaces use unsigned sizes since it makes more logical sense and cuts the number of checks in half. This may just devolve into a religious debate of signed vs unsigned. In the world I live in, unsigned is everywhere. It'd be nice to have an unsigned span to interop cleanly, since porting all this unsigned code to signed is impossible in many cases (i.e. we are talking about publicly documented interfaces/structures) and the port itself would certainly lead to bugs even if it were possible. |
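For readers following the "double the checks" point, here is a minimal sketch of the bounds checks being compared. The function names are hypothetical and this is not GSL's actual implementation. The third variant shows a common way to fold the signed check into a single comparison by converting to the corresponding unsigned type, assuming the size itself is never negative.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Signed index: two comparisons as written (>= 0 and < size).
constexpr bool in_bounds_signed(std::ptrdiff_t index, std::ptrdiff_t size) {
    return index >= 0 && index < size;
}

// Unsigned index: a single comparison, since a negative index cannot be expressed.
constexpr bool in_bounds_unsigned(std::size_t index, std::size_t size) {
    return index < size;
}

// Signed index checked with one comparison: a negative index converts to a
// very large unsigned value and therefore fails `< size`.
// Assumption: size >= 0 (checked by the assert in debug builds).
inline bool in_bounds_signed_single(std::ptrdiff_t index, std::ptrdiff_t size) {
    using unsigned_diff = typename std::make_unsigned<std::ptrdiff_t>::type;
    assert(size >= 0);
    return static_cast<unsigned_diff>(index) < static_cast<unsigned_diff>(size);
}
```

Whether a given compiler performs the same folding on the two-comparison form automatically is not something this sketch demonstrates.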
@JosephBialekMsft , @neilmacintosh I decided to give this a summary, let me know if I missed anything. span's index and size are currently signed (ptrdiff_t) Pros:
Cons:
Conclusion:
|
I don't agree that it removes undefined behavior cases. An unsigned value wrapping is defined behavior, even if it isn't expected by the developer (but that falls into the bullet point of "eliminated some potential bugs"). In that case we still have memory safety, but the logic could be incorrect. Maybe I'm not thinking of an undefined behavior case you're thinking of though. Pro #2 is true for most practical scenarios, but not always. Windows on x86 supports partitioning the address space into 3GB of user-mode virtual address space and 1GB of kernel-mode virtual address space. I believe Linux has similar support. Thus on x86 it would be technically possible, although unlikely, for someone to make an allocation that doesn't fit into a span. If 64-bit OSes ever allow full use of the 64-bit virtual address space (57-bit is on the way from Intel) we'd have similar problems. Just bringing this up for completeness. I think it'd be useful to just allow span to specify whether it's going to use a signed or unsigned size. I haven't looked at all the code but I imagine the optimizer should just eliminate the < 0 checks for unsigned since that condition is impossible. You may not need to change anything else about the class. |
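The address-space point can be made concrete with a small sketch. The helper name `fits_in_ptrdiff` is hypothetical; it only tests whether a byte count is representable in `std::ptrdiff_t`, which is the signed type span uses for its size. It assumes the usual case where `std::size_t` and `std::ptrdiff_t` have the same width, so roughly the upper half of the `std::size_t` range is not representable.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Returns true if `byte_count` can be stored in a std::ptrdiff_t without loss.
// Both sides are widened to std::uintmax_t so the comparison itself cannot wrap.
constexpr bool fits_in_ptrdiff(std::size_t byte_count) {
    return static_cast<std::uintmax_t>(byte_count) <=
           static_cast<std::uintmax_t>(std::numeric_limits<std::ptrdiff_t>::max());
}
```

On a 32-bit target, a hypothetical 3GB allocation would fail this check, while anything up to 2GB minus one byte would pass.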
For the Core Guidelines this was commented by Herb Sutter: #1115 |
Thanks @robert-andrzejuk for pointing to Herb's nice exposition of the Core Guidelines thinking on signed vs. unsigned integer types for indexing. Given that span has already been standardized with a signed index type, making the sort of change @JosephBialekMsft suggests would represent a significant fork for this GSL implementation. I see no reason for such a fork, given it does not follow the thinking contained in the Core Guidelines. |
Editors call: If it isn't already said, we should say that unsigned overflow is well defined but defined to something that is wrong for what we use indices for. @neilmacintosh and @GabrielDosReis could you please add links to the comment threads in the GSL repo that also cover this with additional detail? |
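A minimal sketch of what "well defined but wrong for indices" means in practice. The helper names are hypothetical. Stepping back from index 0 with an unsigned type wraps to the largest representable value, which is a valid number but not a meaningful index; the signed version yields -1, which a simple `>= 0` check can reject.

```cpp
#include <cstddef>
#include <limits>

// Index of the previous element, computed with unsigned arithmetic.
// For i == 0 the result wraps around to the maximum std::size_t value.
constexpr std::size_t previous_unsigned(std::size_t i) {
    return i - 1;
}

// Index of the previous element, computed with signed arithmetic.
// For i == 0 the result is -1, which is visibly out of range.
constexpr std::ptrdiff_t previous_signed(std::ptrdiff_t i) {
    return i - 1;
}

static_assert(previous_unsigned(0) == std::numeric_limits<std::size_t>::max(),
              "unsigned subtraction wraps modulo 2^N");
static_assert(previous_signed(0) == -1,
              "signed subtraction yields a negative index");
```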
I think it was this conversation: microsoft/GSL#322 |
Thanks @neithernut, there is another, much earlier discussion here too: microsoft/GSL#171. Let me also briefly summarize our thinking again here, just for clarity.
|
Also remember that |
Ok. I think this discussion is done at this point, and "the way of the Guidelines" on this topic is pretty clear so I'm closing the issue. |
It doesn't make a whole lot of sense to me for span's length to be a signed value. The length can never be negative in the first place, nor are negative values used for any special purpose.
Signed lengths for span have the following annoyances: