feat(StateVariables): SharedMutable storage #4761

Open · 6 of 11 tasks · Tracked by #2783

@LHerskind (Contributor) opened this issue Feb 26, 2024 · 7 comments

Following the naming in #4735 for slow updates

Also see #5078 for the macro changes required for this type of storage.

The Slow updates tree (https://github.com/LHerskind/RandomRamblings/blob/main/SST.pdf) is to be replaced by a new type of shared mutable state that should be easier to use.

@nventuro figured that by altering the constraints that we check in the slow updates tree, we can "discard" most of the tree that we previously stored in public state and instead just store the leaves directly.

The idea

As for the slow updates tree, we have leaves defined as leaf: {pre, post, time_of_change}.
But for a storage declaration, we also have a delay D. Whenever an update is initiated, we update post to be the new value and require the provided time_of_change to be at least now + D.

This ensures that an update will always be delayed by at least D.

With this constraint, we can compute a time_horizon which tells us when the next change "could" happen.

If there is a pending update, then the time_horizon is simply the "shortest" time we could have until the next update, i.e., min(leaf.time_of_change, now + D).

If there are no pending updates, then the time_horizon is now + D, i.e., the earliest time an update could take effect if someone enqueues one later in the same block as your tx.

As long as the transaction is included BEFORE this potential change, we know the exact value being read and that it matches.

We therefore need a way to ensure that this is the case. We could do so by making public calls, which we would very much prefer not to, or we can introduce a new value max_block_number to the tx_context.

This max_block_number defines the last block where this transaction can be included, and the base_rollup circuit must therefore check that global_variables.block_number <= max_block_number
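A minimal sketch of that rollup-side check, assuming max_block_number is carried in the tx_context as a plain block number (names are illustrative, not the actual circuit interface):

# Sketch of the base-rollup validity check described above (illustrative names,
# not the actual base_rollup circuit API).
def base_rollup_check(tx_max_block_number, current_block_number):
    # The tx is only valid if it is included at or before its max_block_number.
    assert current_block_number <= tx_max_block_number, "transaction expired"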


To consider:

  • The introduction and enforcement of max_block_number could make it possible for sequencers to censor forced transactions that rely on it.
  • Populating the max_block_number will likely require a new oracle that, throughout a simulation, outputs the constraints (time horizons) of the values read.
  • The storage implementation would need to store len(T)*2+1 values, so macro serialization needs to account for this.

Python pseudo implementation

class SlowState:

    def __init__(self, D, val):
        # D is the minimum delay (in blocks) between scheduling a change and it taking effect.
        self.D = D
        self.pre = None
        self.post = val
        self.time_of_change = 0

    def read(self, now, max_block_number):
        is_after_change = now >= self.time_of_change
        time_horizon = now + self.D

        if is_after_change:
            # No pending change: the earliest a newly scheduled change could take effect is now + D.
            assert max_block_number < time_horizon
            return self.post
        else:
            # Pending change: the tx must also be included before the scheduled time_of_change.
            time_horizon = min(self.time_of_change, time_horizon)
            assert max_block_number < time_horizon
            return self.pre

    def write(self, now, val, time_of_change):
        # A new change must be scheduled at least D blocks into the future.
        new_time_horizon = now + self.D
        assert time_of_change >= new_time_horizon

        is_after_change = now >= self.time_of_change
        if is_after_change:
            # The previously scheduled value is now live; it becomes the new "pre".
            self.pre = self.post
        self.post = val
        self.time_of_change = time_of_change
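For illustration, a possible sequence of calls against the pseudo implementation above, with arbitrary block numbers and a delay of D = 10:

state = SlowState(D=10, val="old")

# No pending update: a read is valid as long as the tx lands before now + D = 110.
assert state.read(now=100, max_block_number=105) == "old"

# Schedule a change to "new"; time_of_change must be at least now + D = 110.
state.write(now=100, val="new", time_of_change=115)

# Pending update: the horizon shrinks to min(115, 108 + 10) = 115, so the tx must land before block 115.
assert state.read(now=108, max_block_number=114) == "old"

# After the time_of_change the new value is live and the horizon is again now + D.
assert state.read(now=120, max_block_number=125) == "new"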

@spalladino (Collaborator) commented:

IIUC a user would not be able to send a private tx right before a time_of_change, since the tx's max_block_number would be very close in time, and instead would have to wait until after the time of change. Is there a way around this? I'm thinking if we could let the user build a tx with an upcoming value, but it feels dangerous to have a tx in which some state is anchored to a state root and other bits of state are not.

Another concern on this approach is that we don't have a global epoch, but rather each slot has its own clock for effecting changes. I'm worried that if a tx touches multiple SharedMutable slots, the updates may not be "in sync" and this would always lead to a very short max_block_number. For instance, having a 1-hour delay D seems reasonable, but if a tx hits 5 SharedMutable slots with that delay that are all staggered at 10-minute intervals (i.e., slot 1 changes at :00, slot 2 changes at :10, and so forth), the tx will always have a TTL of under 10 minutes. Is this a realistic concern? Does it make sense to enforce, or at least strongly push for, synchronised updates?

@LHerskind (Contributor, Author) commented:

I'm thinking if we could let the user build a tx with an upcoming value

I don't think we can do this. As it is only pending, it could change before becoming "active". In that case I'm not sure how we would catch the issue.

The updates may not be "in sync" and this would always lead to a very short max_block_number

On this, yes it can be an issue that they are staggered and short. What we were discussing was likely having a minimum D and trying to push for "social consensus" on when updates should occur under normal operation - e.g., Wednesday morning etc.

I don't recall if @nventuro had more comments on it.

@iAmMichaelConnor (Contributor) commented Mar 4, 2024

How does this approach compare against the other various suggestions of "Slow" state that were devised at the offsite? Are any of the other approaches viable and worth consideration? This approach looks the same as the one we discussed on the whiteboards downstairs?


The introduction and enforcement of max_block_number could make it possible for sequencers to censor forced transactions that rely on it.

What's a "forced transaction"?


Populating the max_block_number will likely require a new oracle that, throughout a simulation, outputs the constraints (time horizons) of the values read.

I didn't quite follow this. Is it saying that for a particular function, the max_block_num will need to be chosen (and injected into the public inputs via an oracle call) at the very end of that function, so that the max_block_num is <= the minimum of all time_horizons of all slow states that have been read? It's a tricky problem, which might need the function to be simulated repeatedly until a satisfying max_block_num (which satisfies all reads of all slow states) is found.

It might need an oracle call, such as:

    def read(self, now):
        is_after_change = now >= self.time_of_change
        time_horizon = now + self.D

        if is_after_change:
            max_block_number = oracle.get_a_max_block_num_that_is_lt_time_horizon_and_gt_now_and_eq_to_any_previously_got_max_block_num_or_else_re_simulate_this_function_with_a_better_choice_of_max_block_num(time_horizon, now)
            assert max_block_number < time_horizon
            return self.post
        else:
            time_horizon = min(self.time_of_change, time_horizon)
            max_block_number = oracle.get_a_max_block_num_that_is_lt_time_horizon_and_gt_now_and_eq_to_any_previously_got_max_block_num_or_else_re_simulate_this_function_with_a_better_choice_of_max_block_num(time_horizon, now)
            assert max_block_number < time_horizon
            return self.pre

The code snippets (in the original post) don't mention the archive tree. To read in private, the tuple (pre, post, time_of_change) of the SlowState will need to be read from a snapshot of the public data tree at some historical block number of the archive tree, right? Otherwise, how is now validated to be correct? IIUC, now would actually be the block number of the archive tree that has been read from?


Further to @spalladino's points, we will have to be mindful of the impact this could have on the size of privacy sets of the network. There's a lot of leakage happening in places, without proper standardisation and developer education: # notes, # nullifiers, max_block_num, # logs, log lengths, # enqueued calls, the public function calls that have been made, etc. For this particular topic, it might help standardisation if aztec.nr ensured the time_of_change values for all states were equal modulo 100 (for example).
We spoke at the offsite about this topic, and explored approaches which enshrined a minimum "epoch length", for example. It'd be worth us documenting those alternatives, if possible, even if they get discarded later.
Privacy is so important to Aztec, that it might be worth enshrining time_of_change boundaries.
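As a rough sketch of what such standardisation could look like (a purely hypothetical helper, not an existing aztec.nr API), the library could round every requested time_of_change up to the next shared boundary:

# Hypothetical helper: round a scheduled time_of_change up to the next shared
# boundary (e.g. a multiple of 100 blocks) that still respects the delay D,
# so that many states change at the same block and txs leak less via max_block_number.
def align_time_of_change(now, D, boundary=100):
    earliest = now + D
    return ((earliest + boundary - 1) // boundary) * boundary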

Edit: and yes, I'm aware I was against enshrining the slow updates tree because I didn't want to enshrine an epoch length, but as I think more about the potential for network-wide leakiness if users are left to standardise things for themselves, I'm becoming more comfortable with such enshrinement. :insert_face_of_someone_awkwardly_changing_their_mind:


Further to @spalladino's points (again), Is there a risk of spikes/troughs in network activity around time_of_change boundaries, that might have an impact on our ongoing modelling of the network? Users might hold off on submitting txs (or rush to submit txs) when it's getting too close to a boundary. Then shortly after a boundary no txs would happen as users generate new proofs. Then there might be a spike in txs once all those proofs are generated and submitted. The volatility would be less if each state had its own D and time_of_change, but then privacy sets would be adversely impacted.

@nventuro (Contributor) commented Mar 5, 2024

How does this approach compare against the other various suggestions of "Slow" state that were devised at the offsite? Are any of the other approaches viable and worth consideration? This approach looks the same as the one we discussed on the whiteboards downstairs?

Yes, they're all quite similar. I think Lasse was simply pointing out at the beginning that this can also be seen as a kind of slow updates tree where one of the constraints is slightly different.

Is it saying that for a particular function, the max_block_num will need to be chosen (and injected into the public inputs via an oracle call) at the very end of that function, so that the max_block_num is <= the minimum of all time_horizons of all slow states that have been read?

Maybe I'm missing something, but this doesn't seem like a difficult problem to solve. We'd add a new max block number field to the context, which gets populated and possibly updated with lower values whenever application code reads this kind of state. At the end we'll have the highest block number that fulfills all assertions.

The code snippets (in the original post) don't mention the archive tree. To read in private, the tuple (pre, post, time_of_change) of the SlowState will need to be read from a snapshot of the public data tree from some historical block number of the archive tree, right? Otherwise, how is now validated to be correct? Iiuc, now would actually be the block number of the archive tree that has been read from?

Yes, now is the block number of whatever block the local node has synced up to. The proof of pre, post, and time_of_change would be made for that block.

@iAmMichaelConnor (Contributor) commented:

Maybe I'm missing something, but this doesn't seem like a difficult problem to solve. We'd add a new max block number field to the context, which gets populated and possibly updated with lower values whenever application code reads this kind of state. At the end we'll have the highest block number that fulfills all assertions.

My worry is that if a function wants to read N different slow states, there might be N different time horizons. Those time horizons will only be discovered as the function is simulated. But the max_block_num needs to be decided at the beginning of the function, because it gets used in constraints, and those constraints might affect the code path of the function. So you might need to repeatedly simulate the function, until you've learned all time horizons, and then you can pick your max_block_num for that function, and then you can finally simulate the whole function. And of course, the max_block_num that you eventually decide upon might be used in an if statement that triggers a different code path, and causes different slow states to then be read, which might result in new time horizons being found, meaning you might have to iteratively simulate some more to find a new max_block_num.

@iAmMichaelConnor (Contributor) commented:

I remember at the offsite Joe was proposing a Slow state approach which didn't need the max_block_num. Has that been proven not to work, for some reason?


The task list might need to include lines in the kernel circuit to compute the minimum max_block_num from the previous kernel and the latest-popped function.


Are we sure we want to implement this approach, or do we feel we need to explore and compare more approaches before deciding? I'm conscious the design has changed several times recently, so might still have room for change? I'm also conscious the slow updates tree implementation had some drawbacks and might have benefited from more rounds of criticism before being implemented.
It's becoming quite an important concept for the protocol, so if the design needs to change in future, it might have significant knock-on impacts on other parts of the protocol and stack.
Thoughts?

@nventuro (Contributor) commented Mar 6, 2024

My worry is that if a function wants to read N different slow states, there might be N different time horizons. Those time horizons will only be discovered as the function is simulated. But the max_block_num needs to be decided at the beginning of the function, because it gets used in constraints, and those constraints might affect the code path of the function.

I'm thinking of a simpler model in which we find the correct value as the simulation goes, but perhaps I'm missing some critical context. Similarly to how we have context.push_read_request() instead of directly asserting that a note exists, we could do something like this:

impl SharedMutableState {
  fn read_shared_mutable_state(&self, context: &mut PrivateContext) -> T {
    context.request_max_block_number(self.time_of_change);

    if context.now() < self.time_of_change {
      self.pre
    } else {
      self.post
    }
  }
}

impl PrivateContext {
  fn new() -> Self {
     ...
     max_block_number = MAX_UINT64;
  }

  fn request_max_block_number(&mut self, max_block_number: u64) {
    // max_block_number may only decrease, so prior assertions about it being smaller than
    // some value still hold.
    self.max_block_number = min(self.max_block_number, max_block_number);
  }
}

And then the kernel would also keep the minimum of its current max_block_number and whatever is requested by each execution.
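In Python pseudocode mirroring the snippet above (hypothetical names, not the actual kernel circuit code), that aggregation is just a running minimum:

MAX_UINT64 = 2**64 - 1

def aggregate_max_block_number(previous_kernel_max, requested_values):
    # The kernel keeps the smallest max_block_number seen across all private calls,
    # so every "must be included before X" assertion continues to hold.
    result = previous_kernel_max
    for requested in requested_values:
        result = min(result, requested)
    return result

# Example: two reads requested horizons of 115 and 230; the tx ends up with 115.
assert aggregate_max_block_number(MAX_UINT64, [115, 230]) == 115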

@LHerskind changed the title from "feat: SharedMutable storage" to "feat(StateVariables): SharedMutable storage" on Mar 8, 2024
nventuro added a commit that referenced this issue Mar 22, 2024
Part of #4761.

This adds a new validity condition to transactions called
`max_block_number`, causing them to fail if the current block is larger
than a requested max block. This can be used to construct proofs that
are only valid if included before a certain block (which is exactly how
SharedMutableStorage/SlowJoe/SlowUpdatesTree2.0 works).

---

I made `max_block_number` an `Option<u32>` both to avoid having to include an
initial value equal to the largest block, and to avoid issues that
arise from abuse of `std::unsafe::zeroed`. Many parts of the stack
assume a (mostly) zeroed transaction is a valid one, but a
`max_block_number` value of 0 is not useful. With `Option`, a zeroed
value means no max block number was requested (`is_none()` returns
true), and this entire issue is avoided.

This property starts out as `none`, meaning there's no max
block number constraint. The `PrivateContext` now has a
`request_max_block_number` function that can be used to add constraints.
Each time a lower max block number is seen it replaces the current one.
The private kernel aggregates these across private calls and ends up
with the smallest one.

This value is stored in a new struct called `RollupValidationRequests`,
an extension from @LeilaWang's work in
#5236. These are
validation requests accumulated during private and public execution that
are forwarded to the rollup for it to check. Currently we only have
`max_block_number`, but there may be more. Note that we currently have a
slight duplication in the public kernel tail public inputs, but this is
expected to be sorted out very soon as this struct is refactored.

---

Note that in the end to end tests we're only testing that the sequencer
drops the transaction, but not that the base rollup rejects this
transaction (this is only tested in the rollup circuit unit tests).
Testing this would require bypassing the sequencer tx validation logic
and manually building a block, but this is a fairly involved endeavor
and one that our infrastructure does not currently easily support. I'm
still looking into a way to add this test.
nventuro added a commit that referenced this issue Apr 10, 2024
(Large) part of
#4761.

This is an initial implementation of `SharedMutableStorage`, with some
limitations. I think those are best worked on in follow-up PRs, once we
have the bones working.

The bulk of the SharedMutable pattern is in `ScheduledValueChange`, a
pure Noir struct that has all of the block number related logic.
`SharedMutable` then makes a state variable out of that struct, adding
public storage access both in public and private (via historical reads -
see #5379), and using the new `request_max_block_number` function (from
#5251).

I made an effort to test as much as I could of these in Noir, with
partial success in the case of `SharedMutable` due to lack of certain
features, notably noir-lang/noir#4652. There
is also an end-to-end test that goes through two scheduled value changes,
showing that scheduled values do not affect the current one.

I added some inline docs but didn't include proper docsite pages yet so
that we can discuss the implementation, API, etc., and make e.g.
renamings less troublesome.

### Notable implementation details

I chose to make the delay a type parameter instead of a value, mostly
for two reasons:
- it lets us nicely serialize and deserialize `ScheduledValueChange`
without including this field (which we are not currently interested in
storing)
- it lets us declare a state variable of type `SharedMutable<T, DELAY>`
without having to change the signature of the `new` function, which is
automatically injected by the macro.

Overall I think this is fine, especially since we may later make the
delay mutable (see below), but still worth noting.

Additionally, I created a simple `public_storage` module to get slightly
nicer API and encapsulation. This highlighted a Noir issue
(noir-lang/noir#4633), which currently only
affects public historical reads but will also affect current reads once
we migrate to using the AVM opcodes.

### Future work

- #5491
- #5492 (this
takes care of padding during storage slot allocation)
- #5501
- #5493

---------

Co-authored-by: Jan Beneš <janbenes1234@gmail.com>
AztecBot pushed a commit to AztecProtocol/aztec-nr that referenced this issue Apr 11, 2024