VIP: Specify behavior of uninitialized variables #1493

charles-cooper · 2019-06-21T16:27:13Z

Simple Summary

Require newly declared variables to be initialized.

Abstract

Motivation

Uninitialized variables can currently refer to garbage memory (#1476 (comment)). Probably not what people expect.

Specification

option 1) Require variables to be set by the programmer
option 2) if the programmer declares a new variable without setting it, auto set it to a default value of 0

EDIT: the eventual consensus was to choose option 1. See #1493 (comment) for the rationale.

Backwards Compatibility

Programs which don't initialize variables will no longer compile.

Dependencies

Modifies existing behavior of creating new memory variables

Copyright

Copyright and related rights waived via CC0

jacqueswww · 2019-06-25T18:02:15Z

This shouldn't be happening. I assume this is happening after calls to private functions? (ie. a variable get defined after a private call) :/

fubuloubu · 2019-06-25T18:29:20Z

Even if it shouldn't be happening, IMO (1) is the best way to mitigate a mistake in calling code or whatever.

jacqueswww · 2019-06-25T18:48:48Z

Yes, the compiler should have a flag that adds mstore <var_pos> 0 to memory after a variable is created, the flag is only set after a self call to private occurs.

fubuloubu · 2019-06-25T19:08:39Z

That... Seems way more complicated then just enforcing you set a local on definition if it is not set before it's read.

jakerockland · 2019-06-25T22:58:18Z

My 2¢: Also seems to me that (1) makes it the most explicit as to what is going on / less error prone.

jacqueswww · 2019-06-25T23:51:12Z

I feel like there are two separate problems here, one is changing how vyper currently works to be more explicit about initial values.
The other is that currently vyper has default assignment of initial values, but after changing to CALL the default values of the EVM are not zero by default in memory, as it was before.

charles-cooper · 2019-06-26T00:42:13Z

I am leaning towards having the programmer explicitly set it. Then they don't have to depend on what the language decides as the default value. Also, there isn't much reason to not initialize a variable - most of the time, the programmer wants to set it at some point in the function. The main use case I can think of is when a programmer defines a variable, and then the value assigned to it is branch dependent. That's a kind of 'edge case' that we could discuss separately (maybe worth implementing something like the balanced return checker but for setting variables) but in any case it is still clearer to set it to a sentinel value at declaration! As a secondary concern, since declared variables are usually assigned to, it's better performance to skip the compiler generated mstore 0 and go straight to assigning whatever the programmer was planning to assign to it.

jakerockland · 2019-06-26T01:05:31Z

Yeah definitely are two problems at play @jacqueswww, so we may want to update the issues to more accurately reflect that.

On the note of the first problem though, I think I'd be in favor of a VIP that makes Vyper more explicit about initial values.

That may warrant more discussion though that just fixing the currently broken behavior of how the initial values are supposed to behave.

jakerockland · 2019-06-26T06:25:12Z

Thinking about this more, fixing the current broken behavior seems higher priority change than removing default initial values, as the later would be non backwards compatible.

Especially given this is explicitly marked to be the case in the docs: https://vyper.readthedocs.io/en/v0.1.0-beta.10/types.html#initial-values

Still, I'd be in favor of a VIP that removes default initial values and requires them to be set by the programmer. 😅

fubuloubu · 2019-06-26T07:39:19Z

One thing to note about the suggestion to require the developer to set a variable before reading it: it doesn't have to be at creation of the variable, but it should be set before it's read (aka write once before read). That would allow the use case @CharlesCooper was describing with dependent settings.

davesque · 2019-07-01T17:39:51Z

@fubuloubu Re: Setting var values before read. That would waste gas in the case that the variable is set to zero intially since all memory is well-defined initially as zero. Also, as @jacqueswww described it to me, if the internal call context storage is working correctly, even uninitialized zero values will be explicitly copied to memory and then restored later when the internal call returns. So maybe it's actually the case that there's a bug in the context storage mechanism for internal calls?

It's a bit hard to tell if anyone is suggesting this, but I wouldn't want to insert extra ops meant to zero-out memory when, as far as we know, the internal call context storage mechanism should be preserving variable values across calls, even if those values were uninitialized zero values. We should instead try to figure out why our mental model of what should be happening isn't matching up with what actually happens.

However, I'm all in favor of requiring that variables syntactically have an initial value. But if that initial value is zero, we shouldn't do any extra work that doesn't need to happen. Having new memory set to zero is a detail of the EVM spec (it's not a Vyper language feature) and is not likely to change.

jakerockland · 2019-07-01T17:54:22Z

@davesque Yeah that's a good point that if we make a VIP around values needing initial values to be explicitly set by the programmer when implementing we are sure to have a compile-time optimization that doesn't add extra ops if the default value is just 0.

charles-cooper · 2019-07-01T17:59:50Z

@davesque it has to do with when a private function has a larger frame than the caller, and a variable is declared after the private function call.

@private
def priv(): # frame is > 30 words
    xs: uint256[30]
    for i in range(0, 30):
        xs[i] = 7 # write values to memory

@public
def caller() -> uint256: # frame is < 10 words
    self.priv()
    x: uint256 # looks like should be zero, but it's occupying space that was written to by priv()
    return x # not 0

davesque · 2019-07-01T21:02:53Z

@charles-cooper @jacqueswww

Okay, yeah that seems correct.

I've been looking into this for an hour or two and here's my current understanding of this:

Variable records are only created as assignment statements are encountered in the code. However, the private call builder just uses whatever variables it finds in the context to copy local vars to the stack and pop them off. For variable assignments with explicit values that occur after private calls, this isn't a problem since the value will always be set at the appropriate location in memory. However, for uninitialized variable assignments, it will just use whatever memory was not written over by the frame recovery which will be junk values from the callee's frame if that frame was larger than the caller's frame.

So it seems like, ideally, the compiler would be aware of all assignment statements in the caller's scope and include them all (including implicit, uninitialized zero values) when copying the caller's frame to the stack. Or, it would have to zero-out memory for uninitialized variables after any private call, because how could it otherwise know whether or not the private call had written beyond the caller's frame in memory? I guess it's a question of whether copying all local vars to the stack (including ones that haven't even been used yet) would be more expensive than zeroing out memory for uninitialized vars that occur after private calls. My guess is that zeroing-out memory would be less expensive.

Sorry if any of this is redundant. Partially, I'm writing this all down here so you guys can tell me if I'm on the right track with my thinking and also just in case any of this is novel to the discussion.

davesque · 2019-07-01T21:13:59Z

@jacqueswww Is there any point in time at which we would start asking whether or not doing private calls with our own stack discipline is worth it over just using an EVM opcode? I'm not keenly aware of the gas costs involved but just though I'd pose the question.

charles-cooper · 2019-07-01T21:18:27Z

@davesque there is an issue on this, basically vyper used to use CALL to self but that was too expensive. There might be a couple more options if EIPs 615 or 1380 are merged in the next fork so we want to keep from making too large of an overhaul right now before we have all options at our disposal.

charles-cooper · 2019-07-01T21:20:33Z

FWIW I really think the best course of action at this stage is to get rid of declaring variables without initialization. Technically it's not fixing the bug but it's simpler for the compiler and cleaner for the user for the reasons detailed above.

davesque · 2019-07-01T21:45:14Z

@charles-cooper Yeah, I was aware that we used to do it that way. I was more asking if we should ever revert back to using that approach. But I forgot about those EIPs. Thanks!

And yeah I think you're right about requiring explicit assignment. I'm not sure there's a viable alternative. The one exception I can think of is if the assignment is the default value for a type and it occurs in a public call, before any private calls. Then we should be able to safely optimize out the mstore <pos> 0. Otherwise, I'm not sure we can make that assumption.

jakerockland · 2019-07-01T22:16:11Z

@davesque @charles-cooper thanks for the elaboration/conversation here, it's helped me understand the underlying issue a lot more

Given the above, I agree with your approach @charles-cooper that the faster path to implementation is likely just removing uninitialized variables + making them explicit.

dHonerkamp · 2019-09-02T22:03:24Z

Given how unexpected & potentially critical this behaviour is, I'd like to just add a reminder that it is still not reflected in the documentation (https://vyper.readthedocs.io/en/latest/types.html#initial-values). I think a short note would already help a lot

charles-cooper mentioned this issue Jun 25, 2019

Cannot post events after create an uninitialized local variable #1476

Closed

davesque mentioned this issue Jul 1, 2019

Meeting 1st July #1509

Closed

charles-cooper mentioned this issue Sep 3, 2019

Force users to initialize variables. #1597

Merged

This was referenced Sep 24, 2019

add ZERO_WEI and note about memory variable initialization to docs #1623

Merged

Beta 13 Release #1622

Merged

charles-cooper changed the title ~~Specify behavior of uninitialized variables~~ VIP: Specify behavior of uninitialized variables Sep 25, 2019

fubuloubu closed this as completed in #1597 Sep 27, 2019

charles-cooper mentioned this issue Nov 1, 2019

initializing empty arrays #1675

Closed

charles-cooper mentioned this issue May 14, 2023

Immutable variables can be read before assignment #3101

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

VIP: Specify behavior of uninitialized variables #1493

VIP: Specify behavior of uninitialized variables #1493

charles-cooper commented Jun 21, 2019 •

edited by fubuloubu

jacqueswww commented Jun 25, 2019 •

edited

fubuloubu commented Jun 25, 2019

jacqueswww commented Jun 25, 2019 •

edited

fubuloubu commented Jun 25, 2019 •

edited

jakerockland commented Jun 25, 2019

jacqueswww commented Jun 25, 2019 •

edited

charles-cooper commented Jun 26, 2019

jakerockland commented Jun 26, 2019

jakerockland commented Jun 26, 2019

fubuloubu commented Jun 26, 2019

davesque commented Jul 1, 2019 •

edited

jakerockland commented Jul 1, 2019

charles-cooper commented Jul 1, 2019 •

edited

davesque commented Jul 1, 2019 •

edited

davesque commented Jul 1, 2019 •

edited

charles-cooper commented Jul 1, 2019

charles-cooper commented Jul 1, 2019

davesque commented Jul 1, 2019

jakerockland commented Jul 1, 2019

dHonerkamp commented Sep 2, 2019

VIP: Specify behavior of uninitialized variables #1493

VIP: Specify behavior of uninitialized variables #1493

Comments

charles-cooper commented Jun 21, 2019 • edited by fubuloubu

Simple Summary

Abstract

Motivation

Specification

Backwards Compatibility

Dependencies

Copyright

jacqueswww commented Jun 25, 2019 • edited

fubuloubu commented Jun 25, 2019

jacqueswww commented Jun 25, 2019 • edited

fubuloubu commented Jun 25, 2019 • edited

jakerockland commented Jun 25, 2019

jacqueswww commented Jun 25, 2019 • edited

charles-cooper commented Jun 26, 2019

jakerockland commented Jun 26, 2019

jakerockland commented Jun 26, 2019

fubuloubu commented Jun 26, 2019

davesque commented Jul 1, 2019 • edited

jakerockland commented Jul 1, 2019

charles-cooper commented Jul 1, 2019 • edited

davesque commented Jul 1, 2019 • edited

davesque commented Jul 1, 2019 • edited

charles-cooper commented Jul 1, 2019

charles-cooper commented Jul 1, 2019

davesque commented Jul 1, 2019

jakerockland commented Jul 1, 2019

dHonerkamp commented Sep 2, 2019

charles-cooper commented Jun 21, 2019 •

edited by fubuloubu

jacqueswww commented Jun 25, 2019 •

edited

jacqueswww commented Jun 25, 2019 •

edited

fubuloubu commented Jun 25, 2019 •

edited

jacqueswww commented Jun 25, 2019 •

edited

davesque commented Jul 1, 2019 •

edited

charles-cooper commented Jul 1, 2019 •

edited

davesque commented Jul 1, 2019 •

edited

davesque commented Jul 1, 2019 •

edited