debug safety feature: runtime undefined value detection #211
Comments
I wonder if you could make use of "undefined" not UB. In what I'm imagining, the bit pattern would be unspecified, i.e. whatever is in the stack or register or whatever, but LLVM would not optimize out things that use the undefined data, and would not have permission to reformat your hard drive. |
This does not work in general unless you opt out of performance (see https://www.ralfj.de/blog/2021/11/18/ub-good-idea.html and https://www.ralfj.de/blog/2021/11/24/ub-necessary.html). Zig's compilation modes for safety checks are exactly this, with varying degrees of performance (Debug vs ReleaseSafe). |
I think this or something like it would've saved me 5 hrs yesterday |
Just opened a duplicate with some concrete representations, so I'll post those here. Our goal is to identify any type with an unused value which we can reserve to mean undefined.
A type … An exhaustive … A …
An undefined array is equivalent to an array full of undefined values.
We can't do anything about "standard" (ABI-allowed) nullable pointers, but slices could be made larger if necessary.
For non-nullable pointers and slices, we could use the null pointer value. Other optionals can use a padding bit.
Error sets can use a special tag (maybe …).
Vectors, like arrays, can set their elements to undefined.
I don't know how async frames are represented, but we can surely just add an extra bit if necessary.
That leaves the following non-zero-bit runtime types which we can't represent undefined in: …
That's actually not bad! Most "interesting" types can represent undefined. |
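To make two of the representations above concrete, here is a minimal sketch in C rather than Zig. It is not how the compiler does or would do it. The names `NonNullIntPtr`, `OptionalInt`, and the `OPT_*` tags are hypothetical, and the three-state tag is only one assumed way to model "use a padding bit" for optionals.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical safe-build form of a non-nullable pointer: address 0 is
 * never a valid value, so it is free to mean "undefined". */
typedef struct {
    const int *raw; /* NULL encodes undefined; anything else is a real pointer */
} NonNullIntPtr;

NonNullIntPtr ptr_undefined(void) { NonNullIntPtr p = { NULL }; return p; }
NonNullIntPtr ptr_from(const int *target) { NonNullIntPtr p = { target }; return p; }
bool ptr_is_undefined(NonNullIntPtr p) { return p.raw == NULL; }

/* "An undefined array is equivalent to an array full of undefined values." */
bool array_is_undefined(const NonNullIntPtr *items, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (!ptr_is_undefined(items[i])) return false;
    }
    return true;
}

/* Assumed model of an optional whose flag has spare states: the tag
 * distinguishes undefined, null, and a present payload. */
typedef enum { OPT_UNDEFINED = 0, OPT_NULL, OPT_SOME } OptTag;
typedef struct {
    OptTag tag;
    int payload; /* meaningful only when tag == OPT_SOME */
} OptionalInt;
```

The pointer wrapper is the same size as a bare pointer, which is the attraction of reusing an invalid bit pattern: the check costs a comparison but no extra storage.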
Some processes do not need their input to be defined. For example, let's say we're memcopying a bunch of stuff. Some of that stuff may be undefined, but we know it won't be read at the new location, so it does not matter. Throwing an error here would only serve to obstruct the programmer by preventing them from using …

More generally, let's say we have a large array of stuff that needs processing. Let's assume that this array may contain undefined data, but that we're able to confirm that there's no danger in processing these undefined entries, and that it is more efficient to process these entries regardless than it is to skip or filter them out. If we care significantly about the performance of such a loop, then we don't want an error whenever the data we process happens to be undefined.

The core of the problem is that the very purpose of undefined is to say "I don't care", and it is difficult for the compiler to know when the programmer starts caring, if ever. There are very few cases where the programmer must care: those where undefined data could directly cause a crash (such as pointer dereferencing and while-loop conditions). Otherwise, the programmer may well have figured out exactly how undefined inputs would affect a process and manually deemed it safe.

One also has to consider cases where the data is only partially undefined, and where the non-undefined part makes the data safe. (For example, it should be perfectly safe to read from a 256-entry lookup table when the index is an undefined …)

In addition to all of this, we've also got cases where we may want to read the data regardless of it being undefined, such as print-debugging.

I am not totally against some manner of anti-undefined run-time checks as long as they do not hinder such use-cases, but I do believe that such additions would require more changes to the language itself than simply requiring these checks, perhaps in a manner similar to the likes of … |
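One way to picture a check that does not get in the way of copying is to track definedness per byte and let copies propagate that state instead of trapping, so only a read that depends on the value is checked. The sketch below is in C and is an assumption for illustration, not something proposed in the thread. `ShadowBuf` and the `shadow_*` functions are hypothetical names, and the per-byte flag array is the simplest possible shadow representation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BUF_LEN = 8 };

/* Hypothetical debug-build buffer: one shadow flag per data byte. */
typedef struct {
    uint8_t data[BUF_LEN];
    bool defined[BUF_LEN];
} ShadowBuf;

/* Every byte starts out undefined; 0xAA is an arbitrary filler pattern. */
void shadow_init(ShadowBuf *b) {
    memset(b->data, 0xAA, sizeof b->data);
    memset(b->defined, 0, sizeof b->defined);
}

/* A store makes exactly one byte defined. */
void shadow_store(ShadowBuf *b, size_t i, uint8_t v) {
    b->data[i] = v;
    b->defined[i] = true;
}

/* Copying never traps: undefined bytes are moved along with their flags. */
void shadow_copy(ShadowBuf *dst, const ShadowBuf *src) {
    memcpy(dst->data, src->data, sizeof dst->data);
    memcpy(dst->defined, src->defined, sizeof dst->defined);
}

/* Only a load that actually uses the value is checked. Returns false
 * (where a safe build might panic) for out-of-range or undefined bytes. */
bool shadow_load(const ShadowBuf *b, size_t i, uint8_t *out) {
    if (i >= BUF_LEN || !b->defined[i]) return false;
    *out = b->data[i];
    return true;
}
```

This does not settle the harder cases raised above, such as deliberately processing undefined entries or print-debugging them; those would still need some way to opt out of the load check.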
This is closely related to #63, but that issue doesn't seem to have been linked here yet so I thought I should drop a comment doing so. #63 (comment) brings up an alternative and more general approach than the one proposed here, though that approach also has its own drawbacks. |
If the programmer initializes a variable to undefined or otherwise sets the value to undefined anywhere, we can secretly make the type of the variable a maybe type and have a bit to keep track of whether it is undefined at any given point in time. Then if the programmer tries to use a value which is undefined, we detect it with a runtime check, and crash with a stack trace.
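As a rough illustration of that lowering, here is a minimal sketch in C rather than Zig. `TrackedInt` and the `tracked_*` functions are hypothetical names, and the sketch aborts with a message instead of printing a stack trace.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical safe-build form of an integer variable that may be set to
 * undefined: the value travels with a hidden "defined" bit. */
typedef struct {
    bool defined; /* hidden tag added only in checked builds */
    int value;    /* contents are unspecified while defined == false */
} TrackedInt;

/* Stands in for assigning undefined to the variable. */
TrackedInt tracked_undefined(void) {
    TrackedInt t;
    t.defined = false;
    t.value = 0; /* any bit pattern would do; 0 keeps the sketch deterministic */
    return t;
}

/* Stands in for assigning a real value. */
TrackedInt tracked_from(int v) {
    TrackedInt t = { true, v };
    return t;
}

/* Non-crashing form of the check: reports failure instead of aborting. */
bool tracked_try_read(const TrackedInt *t, int *out) {
    if (!t->defined) return false;
    *out = t->value;
    return true;
}

/* Stands in for a use of the variable: crash if it is still undefined. */
int tracked_read(const TrackedInt *t) {
    int v;
    if (!tracked_try_read(t, &v)) {
        fprintf(stderr, "panic: use of undefined value\n");
        abort();
    }
    return v;
}
```

The extra flag makes the variable larger than a plain integer, which is why this kind of tracking would be limited to builds with safety checks enabled.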