better codegen of undefined #460

Closed
PavelVozenilek opened this issue Sep 10, 2017 · 11 comments
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase.
Comments

@PavelVozenilek

A variable in Zig can be left uninitialized by assigning it undefined.

It would be handy to have a compiler command-line option or project config setting that overrides this and initializes all undefined variables with their default values.

If the application behaved unpredictably before and looks OK after forced initialization, one knows the cause.

This is a feature in the Jai language.

@thejoshwolfe
Contributor

In safety mode, initializing to undefined is actually a memset to 0xaa. This is to make it easier to notice if you're using undefined values in a debugger.

This is different from what you're proposing, which is memset to 0, but I think it has the same effect or better.

The value 0xaa was chosen because it is recognizable by a human, not likely to appear by chance in actual undefined values, and not likely to cause apparently correct behavior if you accidentally use it. Using 0 does not have these benefits.

Zig tries very hard to prevent undefined behavior, and helping you find where your code is reading uninitialized memory is part of that goal. If your code actually uses undefined values, it is unquestionably a bug. We don't want the compiler to enable that bug to go unnoticed by setting uninitialized memory to 0.

Of course this isn't a perfect solution. Valgrind can give much more sophisticated diagnostics than 0xaa, but the question is: what value should safety mode use for undefined? 0xaa is better for catching bugs than 0, and catching bugs is more important than making code appear to work with undefined behavior.
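A minimal C sketch of the argument above (not from the thread, and not Zig's actual codegen): the `batch` struct, `fill_pattern`, and `sum_items` are hypothetical names. It assumes a caller that forgets to assign `count`, and compares what a consumer sees when the memory was filled with 0 versus 0xaa.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record; `count` is the field a buggy caller forgets to set. */
struct batch {
    int32_t count;
    int32_t items[4];
};

/* Fill an object with one byte pattern, standing in for what a compiler
   might emit for `undefined` (0xaa) or for forced zero-init (0x00). */
static void fill_pattern(void *p, size_t n, unsigned char pattern) {
    assert(p != NULL);
    memset(p, pattern, n);
}

/* Consumer that sanity-checks `count` before using it.
   Returns -1 when the count is clearly out of range. */
static int sum_items(const struct batch *b) {
    if (b->count < 0 || b->count > 4) return -1;
    int sum = 0;
    for (int32_t i = 0; i < b->count; i++) sum += b->items[i];
    return sum;
}
```

With a zero fill, `count` reads as 0 and `sum_items` quietly returns 0, so the missing assignment goes unnoticed. With a 0xaa fill, `count` reads as -1431655766 and the range check trips immediately.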

@andrewrk
Member

andrewrk commented Sep 10, 2017

I'm going to re-open this issue for a couple reasons:

@andrewrk andrewrk reopened this Sep 10, 2017
@andrewrk andrewrk added this to the 0.2.0 milestone Sep 10, 2017
@andrewrk andrewrk added the enhancement label Sep 10, 2017
@andrewrk andrewrk changed the title from "Disable undefined to help debugging" to "better codegen of undefined" Sep 10, 2017
@PavelVozenilek
Author

Always assigning the same fixed value to undefined will hide potential errors. Something random (and nonzero) would be better for spotting problems before they get into a release.

@andrewrk
Member

I agree with this statement. However, this introduces non-reproducible builds, which is something to discuss before making it happen.

@thejoshwolfe
Contributor

Randomness can be reproducible if it's externally seeded. The command line option to enable undefined value randomization can take a seed; the builds are then always a deterministic function of their inputs.
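A rough C sketch of seeded, reproducible fill bytes (not from the thread): `fill_seeded` is a hypothetical helper, and SplitMix64 is used here only as a convenient small generator, not one anyone in the discussion proposed. The seed is assumed to arrive from some command line option that does not exist in Zig.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SplitMix64 step: advances the state and returns a mixed 64-bit value. */
static uint64_t splitmix64(uint64_t *state) {
    uint64_t z = (*state += 0x9e3779b97f4a7c15ULL);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

/* Fill `n` bytes at `dst` with values that depend only on `seed`. */
static void fill_seeded(unsigned char *dst, size_t n, uint64_t seed) {
    assert(dst != NULL);
    uint64_t state = seed;
    uint64_t word = 0;
    for (size_t i = 0; i < n; i++) {
        if (i % 8 == 0) word = splitmix64(&state);  /* new word every 8 bytes */
        dst[i] = (unsigned char)(word >> (8 * (i % 8)));
    }
}
```

Calling `fill_seeded` twice with the same seed yields identical bytes, so a compiler that baked such bytes into its output would still produce the same binary for the same inputs plus seed.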

@PavelVozenilek
Author

> introduces non reproducible builds

The compiler initializes a random generator (e.g. xoroshiro128 - http://vigna.di.unimi.it/xorshift/xoroshiro128plus.c ) from the current time at the start and then inserts an initialization memcpy after each variable declaration. There are no hardcoded random numbers in the executable.

The generator could always set the highest bit to 1, making the number large or negative, to increase the probability of undesirable effects.

Forcing "everything defined" would simply use the value 0 instead of the generator. Specifying the seed on the command line would be too much hassle and wouldn't speed up bug discovery.
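One way to read this scheme, sketched in C under stated assumptions (not from the thread): the generator runs inside the program and is seeded from the clock at startup. A plain xorshift64 step stands in for xoroshiro128, and `undef_mode`, `undef_seed_from_time`, and `undef_next` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical build setting: random fill, or the "everything defined" override. */
enum undef_mode { UNDEF_RANDOM, UNDEF_ZERO };

static uint64_t undef_state;  /* generator state, set once at program start */

/* Seed from the current time; `| 1` keeps the xorshift state nonzero. */
static void undef_seed_from_time(void) {
    undef_state = (uint64_t)time(NULL) | 1u;
}

/* Next 64-bit fill value for an undefined variable. */
static uint64_t undef_next(enum undef_mode mode) {
    if (mode == UNDEF_ZERO) return 0;
    assert(undef_state != 0);  /* undef_seed_from_time() must run first */
    uint64_t x = undef_state;
    x ^= x << 13;  /* xorshift64 with the (13, 7, 17) shift triple */
    x ^= x >> 7;
    x ^= x << 17;
    undef_state = x;
    /* Force the top bit so the value is large if unsigned, negative if signed. */
    return x | (UINT64_C(1) << 63);
}
```

Because the seed is taken at run time, the binary on disk contains no random constants; only the values observed during execution change from run to run.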

@andrewrk
Member

I think there's value in 0xaa. The pattern is very noticeable in both hex and binary, and it's also a large or negative number. It's likely nothing will be mapped to address 0xaaaaaaaaaaaaaaaa.
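To make the pattern concrete, here is a small C sketch (not from the thread; `fill_aa` and the `aa_*` helpers are hypothetical) that reads back objects whose bytes are all 0xaa at several widths. Each 0xaa byte is 10101010 in binary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Overwrite an object with 0xaa bytes. */
static void fill_aa(void *p, size_t n) {
    assert(p != NULL);
    memset(p, 0xaa, n);
}

/* Read back 0xaa-filled objects of different integer types. */
static int8_t   aa_i8(void)  { int8_t v;   fill_aa(&v, sizeof v); return v; }
static int16_t  aa_i16(void) { int16_t v;  fill_aa(&v, sizeof v); return v; }
static int32_t  aa_i32(void) { int32_t v;  fill_aa(&v, sizeof v); return v; }
static uint64_t aa_u64(void) { uint64_t v; fill_aa(&v, sizeof v); return v; }
```

The signed readings are -86, -21846, and -1431655766, and the unsigned 64-bit reading is 0xaaaaaaaaaaaaaaaa. On x86-64 with 48-bit virtual addresses that last value is not a canonical address, which is one reason dereferencing it tends to fault rather than succeed.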

If we do want to use a random seed and achieve reproducible builds, we can't seed based on time. But we could seed based on seemingly random things such as number of characters in all the source code combined.
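As an illustration only, the character-count idea could look like this in C (not from the thread; `seed_from_sources` is a hypothetical helper, and passing the source files as in-memory strings is an assumption made to keep the sketch short):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Derive a seed from the total number of characters across all source texts. */
static uint64_t seed_from_sources(const char *const *sources, size_t count) {
    assert(sources != NULL || count == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += strlen(sources[i]);
    }
    return total;
}
```

The same set of sources always gives the same seed, so builds stay reproducible. An edit that leaves the total length unchanged also leaves the seed unchanged, which a hash of the contents would avoid.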

@PavelVozenilek
Author

PavelVozenilek commented Sep 11, 2017

If 0xaa is/will be the default then there's no point in my proposal - it is already here.

By "reproducible build" I understand that the generated binary is always the same on disk, byte for byte, not that actual execution always follows the same path.

If my understanding is correct, the source of the seed doesn't matter.

@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Oct 19, 2017
@andrewrk
Member

We can use valgrind's client request mechanism to not set undefined bytes to 0xaa when RUNNING_ON_VALGRIND is true.
http://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.clientreq

it still doesn't solve assigning undefined later, but I sent a request to the valgrind mailing list asking about it.

@andrewrk
Member

quick response.
http://valgrind.org/docs/manual/mc-manual.html#mc-manual.clientreqs
VALGRIND_MAKE_MEM_UNDEFINED and VALGRIND_MAKE_MEM_DEFINED.

we can have our cake and eat it too.
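A hedged C sketch of how the two mechanisms could combine (not from the thread, and not Zig's actual implementation): `mark_undefined` is a hypothetical helper called wherever undefined is assigned. `RUNNING_ON_VALGRIND` and `VALGRIND_MAKE_MEM_UNDEFINED` come from Valgrind's `memcheck.h`; the fallback definitions are assumptions added so the sketch builds on machines without that header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif

#ifndef HAVE_MEMCHECK_H
/* Stand-ins so the sketch builds without Valgrind's headers installed. */
#  define RUNNING_ON_VALGRIND 0
#  define VALGRIND_MAKE_MEM_UNDEFINED(addr, len) ((void)(addr), (void)(len), 0)
#endif

/* Hypothetical helper: mark `n` bytes at `p` as holding no meaningful value. */
static void mark_undefined(void *p, size_t n) {
    assert(p != NULL);
    if (RUNNING_ON_VALGRIND) {
        /* Let memcheck track the bytes as uninitialized instead of
           overwriting them with a defined pattern. */
        (void)VALGRIND_MAKE_MEM_UNDEFINED(p, n);
    } else {
        /* Outside Valgrind, fall back to the recognizable 0xaa fill. */
        memset(p, 0xaa, n);
    }
}
```

Since the helper can be called at any point, it also covers a variable that is assigned undefined after its declaration.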

@andrewrk andrewrk added the accepted label Dec 3, 2017
@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Feb 28, 2018
@andrewrk andrewrk removed the accepted label Feb 19, 2019
@andrewrk
Member

I moved all the issues mentioned in #460 (comment) to their own issues.


3 participants