
Add a way to artificially increase the size of ChunkStateWitness #11687

Closed
wants to merge 3 commits

Conversation

jancionear
Contributor

This PR adds a new feature to neard: artificial_witness_size.
When enabled, the node will read artificial_witness_size_to_add from config.json and add that many bytes of data to all produced witnesses. The data is added by injecting a dummy transaction whose random contract code has the desired size. The injected transaction is automatically removed before witness validation.

artificial_witness_size_to_add is a MutableConfigValue so it can be adjusted on the fly without restarting the node.
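A MutableConfigValue is, roughly, a shared lock-guarded cell that the dynamic-config reloader swaps when config.json is re-read. A minimal sketch of the idea (the MutableValue type below is illustrative, not nearcore's actual implementation):

```rust
use std::sync::{Arc, Mutex};

/// Illustrative stand-in for nearcore's MutableConfigValue: a value that can
/// be swapped at runtime without restarting the node.
#[derive(Clone)]
struct MutableValue<T: Copy>(Arc<Mutex<T>>);

impl<T: Copy> MutableValue<T> {
    fn new(value: T) -> Self {
        Self(Arc::new(Mutex::new(value)))
    }
    /// Callers re-read the value on every use, so updates take effect
    /// on the next produced witness.
    fn get(&self) -> T {
        *self.0.lock().unwrap()
    }
    /// Called by the dynamic-config reloader after config.json is re-read.
    fn set(&self, value: T) {
        *self.0.lock().unwrap() = value;
    }
}

fn main() {
    let size_to_add = MutableValue::new(0u64);
    size_to_add.set(50 * 1024 * 1024); // bump to 50 MB on the fly
    assert_eq!(size_to_add.get(), 50 * 1024 * 1024);
}
```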

The feature has to be enabled on all nodes in the network: chunk validators running without it will reject witnesses that contain the injected transaction.

Doing it by injecting transactions doesn't change the serialization format, so it's compatible with existing networks.

The extra transaction is only added when artificial_witness_size_to_add is larger than zero, so it's safe to roll back to binaries without this feature: first set artificial_witness_size_to_add to zero, then switch to the binary with the feature disabled.
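Roughly, the inject-and-strip flow looks like this (a simplified sketch: DummyTx, the helper names, and the use of the rand crate are illustrative, not the actual nearcore types; the added_artificial_witness_size signer is mentioned in the review below):

```rust
use rand::RngCore;

/// Simplified stand-in for a signed transaction carrying contract code.
struct DummyTx {
    signer_id: String,
    code: Vec<u8>,
}

const ARTIFICIAL_SIGNER: &str = "added_artificial_witness_size";

/// Producer side: append a dummy transaction with `size_to_add` bytes of
/// random contract code. A zero size leaves the witness untouched, which is
/// what makes rolling back to feature-less binaries safe.
fn maybe_inject(txs: &mut Vec<DummyTx>, size_to_add: u64) {
    if size_to_add == 0 {
        return;
    }
    let mut code = vec![0u8; size_to_add as usize];
    rand::thread_rng().fill_bytes(&mut code); // incompressible padding
    txs.push(DummyTx { signer_id: ARTIFICIAL_SIGNER.to_string(), code });
}

/// Validator side: remove the marker transaction before validating the chunk.
fn strip_injected(txs: &mut Vec<DummyTx>) {
    txs.retain(|tx| tx.signer_id != ARTIFICIAL_SIGNER);
}
```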

We can use it to test how witness distribution behaves at larger sizes and to simulate doomsday scenarios.

I manually tested that the feature works by setting up a 4-node/6-shard localnet and forcing it to send 100 MB witnesses. The network struggled, but it kept moving forward at 0.1 bps.

I increased the number of metric buckets for witness size because the current ones only go up to 28 MB. After the change they go up to 120 MB.
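For reference, widened histogram buckets like these can be produced with the prometheus crate's exponential_buckets; a sketch with illustrative parameters (the actual start/factor/count in this PR may differ):

```rust
fn main() {
    // exponential_buckets(start, factor, count): with these illustrative
    // parameters the top bucket reaches ~148 MB, comfortably above the
    // 120 MB target.
    let buckets = prometheus::exponential_buckets(100_000.0, 1.5, 19).unwrap();
    println!("top bucket: {:.0} bytes", buckets.last().unwrap());
}
```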

Refs: #11184

We don't want big witnesses to be rejected because of the default limits.
@jancionear jancionear added the A-stateless-validation Area: stateless validation label Jun 28, 2024
@jancionear jancionear requested a review from a team as a code owner June 28, 2024 11:36

codecov bot commented Jun 28, 2024

Codecov Report

Attention: Patch coverage is 80.76923% with 5 lines in your changes missing coverage. Please review.

Project coverage is 71.73%. Comparing base (89fad46) to head (d5cb7d9).

Files                        Patch %   Lines
chain/client/src/client.rs   0.00%     4 Missing ⚠️
nearcore/src/dyn_config.rs   0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11687      +/-   ##
==========================================
- Coverage   71.74%   71.73%   -0.01%     
==========================================
  Files         790      790              
  Lines      161838   161862      +24     
  Branches   161838   161862      +24     
==========================================
+ Hits       116106   116111       +5     
- Misses      40689    40711      +22     
+ Partials     5043     5040       -3     
Flag Coverage Δ
backward-compatibility 0.23% <0.00%> (-0.01%) ⬇️
db-migration 0.23% <0.00%> (-0.01%) ⬇️
genesis-check 1.35% <3.84%> (+<0.01%) ⬆️
integration-tests 37.80% <80.76%> (-0.01%) ⬇️
linux 69.13% <34.61%> (-0.03%) ⬇️
linux-nightly 71.23% <80.76%> (-0.01%) ⬇️
macos 51.39% <34.61%> (-1.22%) ⬇️
pytests 1.58% <3.84%> (+<0.01%) ⬆️
sanity-checks 1.38% <3.84%> (+<0.01%) ⬆️
unittests 66.33% <80.76%> (+<0.01%) ⬆️
upgradability 0.28% <0.00%> (-0.01%) ⬇️


Collaborator

@nagisa nagisa left a comment


This seems alright, though to me it seems like appending a field, or perhaps otherwise making the witness serialization invalid from first principles while this feature is enabled, would be a safer bet. But I'm not quite sure why it is necessary to have this be compatible with existing networks...

(Making it invalid to deserialize normally might be just `length * 1024 || random_data || actual_witness_data` or something)
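A rough sketch of that framing (the prefix width and endianness are arbitrary illustrative choices, not part of the suggestion):

```rust
/// Prefix the real witness bytes with a KiB count and that much padding.
/// A node without the feature would try to deserialize the whole blob as a
/// witness and fail immediately, instead of failing later in validation.
fn frame(witness_bytes: &[u8], pad_kib: u32) -> Vec<u8> {
    let pad = pad_kib as usize * 1024;
    let mut out = Vec::with_capacity(4 + pad + witness_bytes.len());
    out.extend_from_slice(&pad_kib.to_le_bytes());
    out.resize(out.len() + pad, 0); // padding; random bytes in practice
    out.extend_from_slice(witness_bytes);
    out
}

/// Feature-enabled nodes skip the prefix and padding to recover the witness.
fn unframe(framed: &[u8]) -> Option<&[u8]> {
    let pad_kib = u32::from_le_bytes(framed.get(..4)?.try_into().ok()?);
    framed.get(4 + pad_kib as usize * 1024..)
}
```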

@@ -269,6 +269,9 @@ impl Client {
// wait for validation to finish.
self.send_state_witness_ack(&witness, &signer);

#[cfg(feature = "artificial_witness_size")]
Collaborator

@nagisa nagisa Jul 1, 2024


What would happen if a node with this compile feature sent a witness to a node without? Would the node reject the witness entirely, or would it try to operate on the now-present artificially generated entries and somehow fail at a later point (due to e.g. the fact that added_artificial_witness_size is not a real account)?

From what I can tell, these artificial witnesses are structurally valid; they just use well-known values in certain fields to differentiate them from the regular data.

Contributor Author


> What would happen if a node with this compile feature sent a witness to a node without? Would the node reject the witness entirely, or would it try to operate on the now-present artificially generated entries and somehow fail at a later point (due to e.g. the fact that added_artificial_witness_size is not a real account)?

I believe witness validation would fail: apply_chunk expects all transactions to be valid and fails if one of them isn't.

use std::str::FromStr;

// Convert the configured size (u64) into usize so it can be used as a buffer length.
let size_as_usize: usize = size_to_add.as_u64().try_into().expect("Can't convert u64 to usize");
// Allocate the buffer that will serve as the dummy contract code.
let mut random_data = vec![0; size_as_usize];
Collaborator


This can potentially add more than exactly the requested number of bytes, due to changing encoding overhead. Is that okay?

Contributor Author


> This can potentially add more than exactly the requested number of bytes, due to changing encoding overhead. Is that okay?

It's fine: we're interested in the difference between 10 MB and 15 MB, so a kilobyte of difference doesn't matter that much.
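For illustration, the overhead in question is on the order of a few length-prefix bytes per field; a small example assuming borsh 1.x with the derive feature (nearcore uses borsh for chain data structures; the FakeTx type is hypothetical):

```rust
use borsh::BorshSerialize;

#[derive(BorshSerialize)]
struct FakeTx {
    signer_id: String,
    code: Vec<u8>,
}

fn main() {
    let tx = FakeTx {
        signer_id: "added_artificial_witness_size".to_string(),
        code: vec![0u8; 1_000_000],
    };
    let bytes = borsh::to_vec(&tx).unwrap();
    // The 4-byte length prefixes for the string and the vec (plus the string
    // contents) push the encoded size slightly past the 1 MB payload.
    assert!(bytes.len() > 1_000_000);
    println!("overhead: {} bytes", bytes.len() - 1_000_000);
}
```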

Contributor

@pugachAG pugachAG left a comment


After discussing this with @jancionear we agreed to proceed with a slightly different approach that covers the underlying use case for this PR and also lets us test blockchain recovery with a large witness after increasing the block production time.

@jancionear
Contributor Author

> This seems alright, though to me it seems like appending a field, or perhaps otherwise making the witness serialization invalid from first principles while this feature is enabled, would be a safer bet. But I'm not quite sure why it is necessary to have this be compatible with existing networks...
>
> (Making it invalid to deserialize normally might be just `length * 1024 || random_data || actual_witness_data` or something)

Setting up a new network is a lot of work, so I wanted to make it compatible with existing networks (forknet-20, forknet-100, statelessnet). Changing the serialization format would require us to restart all of the nodes in a network at once, which would be disruptive.

I'm thinking about running a witness size test on statelessnet, as it's the most "real" and diverse test environment we have, and it wouldn't be possible to do a coordinated restart there.

@jancionear
Contributor Author

Closing in favor of #11703

@jancionear jancionear closed this Jul 3, 2024
github-merge-queue bot pushed a commit that referenced this pull request Jul 5, 2024
This PR introduces a way to generate an arbitrarily large chunk state
witness as a result of processing a single receipt. This is needed to test
what happens to the blockchain if we missed something in the witness size
checks.
Part of #11656.

Large witness generation is triggered by a function call action with the
method name `internal_record_storage_garbage_<GARBAGE_SIZE>`, where
`<GARBAGE_SIZE>` is the number of megabytes of random data to be added to
the storage proof. For example, `internal_record_storage_garbage_20` results
in 20 MB of garbage data in the witness. Having the size as part of the
method name (and not as an argument) makes it easier to send a transaction
using near cli (`args` needs to be base64 encoded). Note that we don't need
to deploy any contracts; a failed receipt will still result in garbage added
to the witness. This makes the feature very easy to use: sending a
transaction with a single function call action is enough. The functionality
is enabled only when neard is compiled with `test_features` enabled.
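A sketch of how the size could be parsed out of the method name (illustrative, not necessarily the merged implementation):

```rust
/// Parse `internal_record_storage_garbage_<GARBAGE_SIZE>` and return the
/// number of megabytes of garbage to record, or None for regular methods.
fn parse_garbage_size_mb(method_name: &str) -> Option<u64> {
    method_name
        .strip_prefix("internal_record_storage_garbage_")?
        .parse::<u64>()
        .ok()
}

fn main() {
    assert_eq!(parse_garbage_size_mb("internal_record_storage_garbage_20"), Some(20));
    assert_eq!(parse_garbage_size_mb("transfer"), None);
}
```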

Alternative approaches considered:
* Adding a new feature that disables witness size checks. We could then
deploy a contract that reads/writes a lot of storage data to make the
storage proof large. We decided not to proceed with this approach since it
is considerably harder to use, and at some point we would still hit
compute/gas limits. Another benefit of the chosen approach is that it
doesn't increase apply-chunk latency, so it lets us test large witnesses in
isolation from slow chunk application.
* #11687. The resulting behaviour diverges significantly from what we would
expect to happen in the real world when a large witness is the result of
applying some chunk. For example, when testing the doomsday scenario we want
the next chunk producer to try distributing the same witness after the chunk
is missed. This PR also covers the underlying use case, #11184.

A large part of this PR is refactoring around runtime tests in chain to
make it easier to reuse the existing test setup code.
pugachAG added a commit that referenced this pull request Jul 9, 2024