fix(state-viewer): print hash and memory usage of tree nodes #8919

Merged
merged 74 commits into master from nikurt-print-tree on Apr 25, 2023

Conversation

@nikurt (Contributor) commented Apr 18, 2023

  • makes `neard view-chain --format full` print the hash and memory usage of each node.
  • adds a `print` sub-sub-sub-command to print the sub-tree contained in a state part: `./neard view-state state-parts --root-dir /tmp --shard-id 0 apply --part-id 42 --action print current`
  • fixes `neard view-state-part dump` to print state records in a human-readable way.

@nikurt nikurt requested a review from a team as a code owner April 18, 2023 10:29
@nikurt nikurt requested a review from akhi3030 April 18, 2023 10:29
@nikurt (Contributor, Author) commented Apr 18, 2023

cc: @Longarithm

@akhi3030 (Collaborator) commented:

assigning @Longarithm as the reviewer.

@nikurt nikurt changed the title from "fix(state-viewer): print tree shows hash and memory usage" to "fix(state-viewer): print hash and memory usage of tree nodes" on Apr 24, 2023
khorolets and others added 20 commits April 25, 2023 14:49
My previous PR #8762 was reverted by #8780 because release automation was broken.

It looks like the cause of the broken automation was the `near-test-contract` crate. We had a discussion about this package and the reasons to publish it in the first place in the very first PR. Now we have proof that a dependency that is mentioned in the `dev-dependencies` section of `Cargo.toml` is not required to be published to crates.io.

Once I figured that out I created PR #8779, but it went unnoticed and instead all the changes were reverted in #8780.

As agreed with @jakmeier, I am re-creating the same changes, but with publishing disabled for `near-test-contract`, so that the necessary crates can be published in the future.
Replace the Children::encode_into and decode methods with implementations of
the BorshSerialize and BorshDeserialize traits and, with that, change
RawTrieNode into an enum for which the borsh implementations can be
derived.  While the first step is quite straightforward, the second
requires the RawTrieNode::Branch variant to be split into two:
BranchNoValue and BranchWithValue.

The main benefit of this is less code (the commit has a delta of -68
lines) and an API consistent with other serialised objects.  However,
this doesn’t come without downsides.

Firstly, having separate BranchNoValue and BranchWithValue variants
means that it’s now a bit more awkward to handle a branch as a single
case, with common code paths having to be refactored.

Secondly, one might imagine a future representation for nodes which
embeds the value.  For example, {4u8}{key_len:u32}{key}{value}
might be added for a leaf.  This would essentially mean that the value
comprises the rest of the encoded bytes.  The issue for borsh is that
it needs to know the length of {value} before it reads it.  Either
a redundant {value_len:u32} would need to be included or some hacks
employed (namely not using borsh for RawTrieNodeWithSize and then
consuming the rest of the reader when decoding RawTrieNode).

Lastly, I’m not certain about the performance implications.  Borsh might
actually be slower than a hand-written decoder which operates on
a slice; I haven’t measured whether that is the case.
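
To make the idea concrete, here is a minimal sketch of an enum with derived borsh traits and the branch split described above. The type and field layouts are simplified stand-ins, not the actual nearcore definitions.

```rust
use borsh::{BorshDeserialize, BorshSerialize};

// Simplified stand-in for the real Children type: 16 optional child hashes.
type ChildHashes = [Option<[u8; 32]>; 16];

// Every variant has a fixed, self-describing layout and its own tag, so borsh
// can derive both traits.  Splitting the branch into a value-less and a
// value-carrying variant keeps two distinct tags instead of an embedded
// Option flag (whether that exactly matches the on-disk format is an
// assumption of this sketch).
#[derive(BorshSerialize, BorshDeserialize, Debug, PartialEq)]
enum RawTrieNodeSketch {
    Leaf(Vec<u8>, Vec<u8>),
    BranchNoValue(ChildHashes),
    BranchWithValue(Vec<u8>, ChildHashes),
    Extension(Vec<u8>, [u8; 32]),
}

fn main() -> std::io::Result<()> {
    let node = RawTrieNodeSketch::BranchNoValue([None; 16]);
    // Round-trip through borsh: serialize into a Vec<u8>, then decode it back.
    let mut buf = Vec::new();
    node.serialize(&mut buf)?;
    assert_eq!(RawTrieNodeSketch::try_from_slice(&buf)?, node);
    Ok(())
}
```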
As it fails with:
```
thread 'tests::test_contract_precompilation' panicked at 'Compilation result should be non-empty', runtime/runtime/src/lib.rs:2550:14
```
presumably because the compilation cache is not populated on aarch64-apple.

Prior art: #8412
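
For reference, the usual way to gate a test off a platform in Rust looks like the sketch below; whether the commit used exactly this attribute is an assumption.

```rust
// Skip on aarch64 macOS, where the compilation cache is not populated and the
// test panics with "Compilation result should be non-empty".
#[cfg(not(all(target_arch = "aarch64", target_os = "macos")))]
#[test]
fn test_contract_precompilation() {
    // ... test body elided ...
}
```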
This PR re-introduces changes from #8805 together with #8892 after fixing #8908:

The PR addresses #8265 by introducing aggregation of compute usage across all operations performed during chunk application and limiting this compute usage to 1s.

This should not change the behavior of nodes in the short run because compute costs currently match gas costs, which is validated by an assert, so any discrepancy should be caught on canary nodes.
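
A minimal sketch of the aggregation-and-limit idea; the type and numbers below are illustrative only, and the real implementation expresses the limit in compute units corresponding to 1s rather than these made-up values.

```rust
/// Illustrative only: accumulate per-operation compute usage and report when
/// the per-chunk limit (standing in for ~1s of compute) has been reached.
struct ComputeUsageTracker {
    used: u64,
    limit: u64,
}

impl ComputeUsageTracker {
    fn new(limit: u64) -> Self {
        Self { used: 0, limit }
    }

    /// Record compute spent by one operation; returns true while under the limit.
    fn add(&mut self, compute: u64) -> bool {
        self.used = self.used.saturating_add(compute);
        self.used <= self.limit
    }
}

fn main() {
    // Hypothetical limit: 10^15 compute units standing in for 1s of compute.
    let mut tracker = ComputeUsageTracker::new(1_000_000_000_000_000);
    for op_cost in [300_000_000_000_000u64, 400_000_000_000_000, 500_000_000_000_000] {
        if !tracker.add(op_cost) {
            println!("compute limit reached, stop applying further operations");
            break;
        }
    }
}
```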
This closes up the near-vm work. I believe #8886 (comment) is the only remaining point to be checked in former PRs, in addition to all the code added in this PR.

Note that there are still things in #8323 that are not in this PR, because either I noticed they actually should not be merged (e.g. the wasmparser bump would be a protocol upgrade, so it will wait until we get limited replayability), or because they are orthogonal to this change.
Part of #8827.

Adds `flat_storage_creation_enabled` config flag to enable/disable flat storage creation. The default value is `true`.

This is needed for SREs to be able to manually enable/disable flat storage creation when taking a state backup.
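
For illustration, a default-true boolean flag of this kind is typically wired up with serde roughly as below; the struct and field placement are assumptions, not the actual neard config layout.

```rust
use serde::Deserialize;

fn default_true() -> bool {
    true
}

// Illustrative config fragment; the real flag lives inside neard's config
// structures, which are not reproduced here.
#[derive(Deserialize)]
struct StoreConfigSketch {
    /// When false, the node skips background flat storage creation.
    #[serde(default = "default_true")]
    flat_storage_creation_enabled: bool,
}

fn main() {
    // An empty config section falls back to the default of `true`.
    let cfg: StoreConfigSketch = serde_json::from_str("{}").unwrap();
    assert!(cfg.flat_storage_creation_enabled);
}
```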
…mmaries for PRs (#8926)

### Purpose

To reduce the effort involved in writing a pull request description. [Here](https://githubnext.com/projects/copilot-for-pull-requests) you can find more details about this feature.

Below is a copy of the GitHub Copilot for PRs tutorial:

### Get started

When you are ready to create a new pull request, you can use Copilot to suggest parts of the pull request description. To do this, simply include the relevant markers in your text and they will be expanded for you automatically when you save the description. You can edit your description as much as you like, and whatever markers are present when you save it will be expanded automatically for you. Watch for the 👀 👀 👀 reaction from Copilot to show that it's working!

### Markers

* `copilot:all`: all the content, in one go.
* `copilot:summary`: a one-paragraph summary of the changes in the pull request.
* `copilot:walkthrough`: a detailed list of changes, including links to the relevant pieces of code.
* `copilot:poem`: a poem about the changes in the pull request.

### Example

Here's an example for a pull request titled "Add edges for type assignability". Original description provided by author:

```
copilot:summary
​
We need this information to provide as input to the analysis. The compiler already knows it so it makes sense to extract it here.
```

Expanded description with Copilot's suggestions:

> 🤖 Generated by Copilot
>
> This pull request adds semantic analysis to the feature extraction plugin, which enhances the feature graph with information about types and assignability of expressions in Java code. It modifies the pom.xml file, the FeaturePlugin.java file, the DotOutput.java file, the FeatureGraph.java file, the FeatureGraphChecks.java file, and the graph.proto file to support the new feature. It also adds three new classes: AssignabilityAnalysis.java, TypeAnalysis.java, and TypeScanner.java, which implement the semantic analysis logic. Finally, it adds two new test classes: AssignabilityAnalysisTest.java and TypeScannerTest.java, which verify the correctness of the semantic analysis and the feature graph construction. ​ We need this information to provide as input to the analysis. The compiler already knows it so it makes sense to extract it here.

### Errors and omissions from the AI

The outputs might contain errors and omissions. Once they've been generated you can modify or remove them by editing the description as normal. We include an attribution line to highlight Copilot's suggestions. This can be removed if you wish.

### Rolling it out

We include Copilot markers in our pull request templates so that every pull request includes some standard content. Here's an example:

```
### WHAT
copilot:summary
​
copilot:poem

### WHY
<!-- author to complete -->

### HOW
copilot:walkthrough
```

Check out the GitHub guidance [on creating your own pull request template](https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository).
Avoided clippy warnings that want me to rename enum values or change function signatures.
The clippy effort is tracked in #8145

Can't say that any of the findings were critical, but fixing the following warnings, in my personal opinion, makes the code more readable:
* Prefer `is_empty()`
* Derive `Default` for enums
* Use shorthand field initializers: `MyStruct { x }` instead of `MyStruct { x: x }`
* Prefer `sort_by_key()`
* Prefer `!x` to `x == false`
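
For illustration, a hypothetical snippet showing the preferred forms, with the replaced patterns noted in comments (this is not code from the PR):

```rust
struct Config {
    verbose: bool,
}

// Hypothetical examples of the lint fixes listed above.
fn build(verbose: bool, items: &[String]) -> Config {
    // `!items.is_empty()` instead of `items.len() > 0`,
    // and `!verbose` instead of `verbose == false`.
    if !items.is_empty() && !verbose {
        println!("{} items", items.len());
    }
    // Shorthand field init: `Config { verbose }` instead of `Config { verbose: verbose }`.
    Config { verbose }
}

fn main() {
    let cfg = build(false, &["a".to_string()]);
    assert!(!cfg.verbose);
}
```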
Borsh can now read from a `Read`er and write to a `Write`r, so use it
for handling database state dump files.  Negative delta FTW.
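
A sketch of the reader/writer pattern, assuming a borsh version that provides `BorshDeserialize::deserialize_reader` (older releases only expose slice-based helpers such as `try_from_slice`); the dump record type and file path here are invented for illustration.

```rust
use std::fs::File;
use std::io::{BufReader, BufWriter, Write};

use borsh::{BorshDeserialize, BorshSerialize};

// Hypothetical record type; the real dump format is not reproduced here.
#[derive(BorshSerialize, BorshDeserialize)]
struct DumpRecord {
    key: Vec<u8>,
    value: Vec<u8>,
}

fn main() -> std::io::Result<()> {
    // Stream records straight to the file instead of building a Vec<u8> first.
    let record = DumpRecord { key: b"k".to_vec(), value: b"v".to_vec() };
    let mut writer = BufWriter::new(File::create("/tmp/state_dump.bin")?);
    record.serialize(&mut writer)?;
    writer.flush()?;

    // And read it back without loading the whole file into memory.
    let mut reader = BufReader::new(File::open("/tmp/state_dump.bin")?);
    let restored = DumpRecord::deserialize_reader(&mut reader)?;
    assert_eq!(&restored.key[..], b"k");
    Ok(())
}
```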
We have two patterns in the codebase that result in the `clippy::derivable_impls` warning:
1. Enums implement `default()` explicitly when it can be derived. This seems to be an unambiguous readability improvement. Removing an explicit function definition is good. `#[default]` is clear enough.
2. Structs implement `default()` to initialize all fields to the default values of their corresponding types. This seems like an improvement too. @jakmeier please check whether it's a readability improvement for `QemuCommandBuilder`.

The PR was generated by running the following and adjusting the result manually:
```sh
cargo clippy --fix -- -A clippy::all -W clippy::derivable_impls
```

This is part of #8145
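
As an illustration of the enum case, with a hypothetical enum rather than one from the PR:

```rust
// Before: an explicit impl that clippy::derivable_impls flags as derivable.
//
// impl Default for SyncMode {
//     fn default() -> Self {
//         SyncMode::Headers
//     }
// }

// After: derive Default and mark the default variant (stable since Rust 1.62).
#[derive(Default, Debug, PartialEq)]
enum SyncMode {
    #[default]
    Headers,
    Blocks,
}

fn main() {
    assert_eq!(SyncMode::default(), SyncMode::Headers);
}
```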
Adds a message to the changelog of the next release about the compute costs protocol change.
This patch includes the final code parts of the meta-tx gas cost
estimations. It was delayed because the results were too flaky for
continuous estimations.

Specifically, this adds the estimations
1. `ActionDelegate` to estimate send + send + exec all together
2. `ActionDelegateSendNotSir`, `ActionDelegateSendSir`,
`ActionDelegateExec` to estimate each step separately 

Making these estimations stable is tricky due to the limit that only
one delegate action can be in a receipt. The check is only done in the
`send` step, so I was able to cheat my way around it for
`ActionDelegateExec`.
But I didn't find a way to do that for the send cost estimations.
Hence, in that case, we measure the cost of verifying a single action.

This results in high variation within the sample measurements, which
makes the average a poor statistic to look at. So instead, I report the
75th percentile of at least 12 measurements. The 75th percentile
was chosen over the median (50th percentile) to ensure we pick at least
a somewhat pessimistic value of the distribution.

Repeated estimations will still result in flaky results, so it's far from a 
perfect solution. But at least it removes outliers on both extremes.

In the current state, the `icount` estimations seem to be very stable,
which is what we need for continuous estimation.

For `time` based estimations, the `ActionDelegate` sometimes gets a
`HIGH-VARIANCE` or `SUBTRACTION-UNDERFLOW` uncertainty
marker. This is a common problem shared with other actions and I
don't intend to solve it in this PR. But the other three estimations
should be stable also in `time` measurements.
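
A sketch of the statistic described above, using the nearest-rank method on made-up measurements; this is illustrative only, not the estimator's actual code.

```rust
/// Return the value at the given percentile (0..=100) of the samples,
/// using the nearest-rank method on a sorted copy.
fn percentile(samples: &[u64], pct: f64) -> u64 {
    assert!(!samples.is_empty());
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let rank = ((pct / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.saturating_sub(1).min(sorted.len() - 1)]
}

fn main() {
    // At least 12 measurements, as described above; the numbers are made up.
    let measurements = [90u64, 95, 100, 101, 102, 103, 105, 110, 140, 150, 300, 900];
    let p75 = percentile(&measurements, 75.0);
    // The 75th percentile damps the outliers (300, 900) that would skew the average.
    println!("75th percentile estimate: {p75}");
}
```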
…thEpochManagerAdapter. (#8917)

As a reminder of the overall goal, we want to split RuntimeWithEpochManagerAdapter, so that any code that needs the runtime will use an Arc<RuntimeAdapter>, and any code that uses the epoch manager will use the EpochManagerHandle.

This PR splits usages in state-viewer, so that it no longer uses RuntimeWithEpochManagerAdapter. It will use EpochManagerHandle wherever possible, but when needed, we fall back to EpochManagerAdapter for code that still needs to convert from RuntimeWithEpochManagerAdapter (since that cannot in general provide an EpochManagerHandle). When EpochManagerAdapter is no longer created from RuntimeWithEpochManagerAdapter, we can remove EpochManagerAdapter and always use EpochManagerHandle, a concrete type.
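
A simplified sketch of the target dependency shape; the traits and types below are stand-ins for illustration, not the real nearcore interfaces.

```rust
use std::sync::{Arc, RwLock};

// Stand-ins for the real traits and handles mentioned above.
trait RuntimeAdapter {
    fn apply_chunk(&self);
}

struct EpochManager;
impl EpochManager {
    fn epoch_height(&self) -> u64 {
        0
    }
}

/// Concrete, cloneable handle wrapping the epoch manager.
#[derive(Clone)]
struct EpochManagerHandle(Arc<RwLock<EpochManager>>);

// After the split, callers receive exactly the dependencies they use,
// instead of one combined RuntimeWithEpochManagerAdapter object.
fn view_state(runtime: Arc<dyn RuntimeAdapter>, epoch_manager: EpochManagerHandle) {
    runtime.apply_chunk();
    let _height = epoch_manager.0.read().unwrap().epoch_height();
}

struct DummyRuntime;
impl RuntimeAdapter for DummyRuntime {
    fn apply_chunk(&self) {}
}

fn main() {
    let runtime: Arc<dyn RuntimeAdapter> = Arc::new(DummyRuntime);
    let epoch_manager = EpochManagerHandle(Arc::new(RwLock::new(EpochManager)));
    view_state(runtime, epoch_manager);
}
```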
Before this change, the mock node keeps handles to the ClientActor
and ViewClientActor, and mocks the role of the PeerManagerActor. Here
we change things so that it instead just listens on a localhost socket
and sends messages to a node that has been started normally with
nearcore::start_with_config(). This has the advantage of allowing us,
in theory, to test/benchmark the whole node's code via only its public
API.
These are not currently supported and any decision to enable them should
be an intentional decision, not an accidental one.

Overall these changes are a no-op, since deserialization will fail
earlier when trying to deserialize with the pwasm deserializer, but if and
when we eventually get rid of it, it would be nice if these extensions
didn't accidentally get enabled.

This puts to rest the question raised in #8886 (comment).
Adding README documentation for setting up mirror tests from #8162, https://github.com/near/near-ops/pull/972.
Co-authored-by: nikurt <86772482+nikurt@users.noreply.github.com>
Count the number of chunks produced and expected to be produced so far for each individual shard.
When one chunk producer is assigned to multiple shards, we can see the produced / expected counts for each of the shards.

Tested on an RPC node in mainnet.
A paste from the metrics tab of a node in mainnet: [link](https://gist.github.com/VanBarbascu/26dc011291df7f128d609c8457d57469)
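
A sketch of a per-shard counter pair using the prometheus crate; the metric names below are invented for illustration and are not the metrics added by this commit.

```rust
use prometheus::{IntCounterVec, Opts};

fn main() {
    // One counter per shard via a "shard_id" label.
    let produced = IntCounterVec::new(
        Opts::new("example_chunks_produced_total", "Chunks produced, per shard"),
        &["shard_id"],
    )
    .unwrap();
    let expected = IntCounterVec::new(
        Opts::new("example_chunks_expected_total", "Chunks expected, per shard"),
        &["shard_id"],
    )
    .unwrap();

    // A chunk producer assigned to shards 0 and 2 increments each shard separately.
    expected.with_label_values(&["0"]).inc();
    produced.with_label_values(&["0"]).inc();
    expected.with_label_values(&["2"]).inc();

    println!(
        "shard 0: {}/{} produced/expected",
        produced.with_label_values(&["0"]).get(),
        expected.with_label_values(&["0"]).get()
    );
}
```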
nikurt and others added 18 commits April 25, 2023 15:43
Also adds an action 'print' to `neard view-state state-part apply` to print the sub-tree contained in a state part.

Also fixes `neard view-state-part dump` to print state record in a human readable way.
Also adds a unit test for state dump and an integration test for state dump and state sync from that dump
@nikurt nikurt requested a review from Longarithm April 25, 2023 15:01
@near-bulldozer near-bulldozer bot merged commit 50470fd into master Apr 25, 2023
@near-bulldozer near-bulldozer bot deleted the nikurt-print-tree branch April 25, 2023 15:35
nikurt added a commit that referenced this pull request Apr 28, 2023
* makes `neard view-chain --format full` print hash and node's memory usage.
* adds a `print` sub-sub-sub-command to print the sub-tree contained in a state part: `./neard view-state state-parts --root-dir /tmp --shard-id 0 apply --part-id 42 --action print current`
* fixes `neard view-state-part dump` to print state record in a human readable way.