Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat(stateless-validation): Rewards for stateless validators #11121

Merged
merged 11 commits into from
Apr 23, 2024

Conversation

birchmd
Copy link
Contributor

@birchmd birchmd commented Apr 19, 2024

The goal of this PR is to give rewards to stateless validators (validators which do not produce blocks or chunks but do validate chunks by producing endorsements).

This means updating EpochInfoAggregator to track the expected number of endorsements for each validator as well as how many were actually produced. For the MVP we are not actually tracking endorsements directly, instead we assume that if a chunk was included in a block then all stateless validators for that shard should be rewarded (because there must have been a sufficient number of endorsements for the block producer to include the chunk). However, this logic will need to be iterated on in future versions of the stateless validation protocol. Additionally, the reward calculator required updating to use the new endorsement stats as part of the uptime calculation which forms the basis of the rewards.

Both of those changes are relatively minor. The technical snag in all of this is that the EpochInfoAggregator and other structures related to validator stats are persisted in the node state via borsh serialization. To avoid a large database migration, I have made this update backwards compatible with the old format without the endorsement stats. The logic for this is in core/primitives/src/types/chunk_validator_stats.rs. The legacy variant is distinguished from the new one in the serialization by serializing the one's compliment of the u64 values for the new variant. During deserialization we check if the first u64 is larger than 2^32 - 1 or not. If it is larger then we assume the value is actually a one's compliment and proceed with the new variant deserialization, and otherwise we do the legacy deserialization.

This means that the serialized values can be interpreted improperly if 2^32 (4.29 billion) or more chunks are produced by a single validator during one epoch. I do not expect this to be an issue, but it is a limitation worth documenting.

@Longarithm told me that the DB migration is actually fine here because it's not that much data. So the DB migration of the EpochValidatorInfo column is also implemented.

@birchmd birchmd requested a review from Longarithm April 19, 2024 13:57
@birchmd birchmd requested a review from a team as a code owner April 19, 2024 13:57
Copy link

codecov bot commented Apr 19, 2024

Codecov Report

Attention: Patch coverage is 75.35934% with 120 lines in your changes are missing coverage. Please review.

Project coverage is 71.10%. Comparing base (16e2321) to head (153d010).
Report is 1 commits behind head on master.

Files Patch % Lines
core/store/src/migrations.rs 0.00% 98 Missing ⚠️
chain/epoch-manager/src/lib.rs 79.01% 17 Missing ⚠️
chain/chain/src/runtime/tests.rs 96.07% 0 Missing and 2 partials ⚠️
core/primitives/src/views.rs 0.00% 2 Missing ⚠️
nearcore/src/migrations.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11121      +/-   ##
==========================================
- Coverage   71.10%   71.10%   -0.01%     
==========================================
  Files         772      773       +1     
  Lines      153724   154176     +452     
  Branches   153724   154176     +452     
==========================================
+ Hits       109310   109629     +319     
- Misses      39963    40096     +133     
  Partials     4451     4451              
Flag Coverage Δ
backward-compatibility 0.24% <0.00%> (-0.01%) ⬇️
db-migration 0.24% <0.00%> (-0.01%) ⬇️
genesis-check 1.43% <0.00%> (-0.01%) ⬇️
integration-tests 36.88% <28.54%> (-0.03%) ⬇️
linux 69.51% <71.25%> (-0.01%) ⬇️
linux-nightly 70.57% <75.15%> (+<0.01%) ⬆️
macos 54.29% <71.25%> (+0.11%) ⬆️
pytests 1.65% <0.00%> (-0.01%) ⬇️
sanity-checks 1.44% <0.00%> (-0.01%) ⬇️
unittests 66.76% <75.35%> (+0.02%) ⬆️
upgradability 0.29% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@Longarithm Longarithm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we migrate DB, I don't understand why we don't use ChunkValidatorStatsV2 as ChunkValidatorStats everywhere, I don't think V1 is needed for ChunkValidatorStats.

We can pretend that endorsement stats existed from the beginning and were equal to zero, the logic for computing rewards should be gated by protocol version, so that zero endorsement stats won't cause issues with replayability.

It should simplify code in core/primitives/src/types/chunk_validator_stats.rs. ChunkValidatorStats::V1 { ValidatorStats ... } can be renamed to ChunkValidatorStats::old(ValidatorStats ...).

@birchmd
Copy link
Contributor Author

birchmd commented Apr 20, 2024

why we don't use ChunkValidatorStatsV2 as ChunkValidatorStats everywhere

I think it is important to wrap structs that use Borsh serialization in an enum because it makes it easy to change those data structures in the future without another DB migration. And if we make it an enum any way then I think we might as well include the original variant (it is 16 bytes smaller than the new variant).

the logic for computing rewards should be gated by protocol version

I don't think this is needed because the new logic is identical to the old logic in the case that expected_endorsements == 0. This is why the tests only needed syntactic changes, not logic changes.

ChunkValidatorStats::old(ValidatorStats ...)

I have added a convenience function for this case in 66a8124

@Longarithm
Copy link
Member

it is important to wrap structs that use Borsh serialization in an enum

That makes sense, but for structures which are part of network messaging or protocol, like, if changing structure changes a hash. But for ValidatorStats it is not the case.

I don't see what is the usecase of rewriting V1 with V2 using endorsement_stats_mut, or comparing them with v1_v2_eq. And if we ever introduce V3, then we'll have to add comparison and conversion between all pairs of versions. And migration shouldn't be more complex than that; furthermore, migration code can be removed, and even more important, versioned code requires maintenance and understanding. So I'm not sure whether migrations should be avoided at all costs.

From other perspective: if we replace all occurrences of ChunkValidatorStats::V1(stats) with ChunkValidatorStats::V2(production: stats, endorsements: ValidatorStats::default()), it just strictly reduces number of lines of code. And saving 16 bytes for column which takes like 8 MBs whereas other columns take GBs doesn't look important.

Copy link
Member

@Longarithm Longarithm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments.

UPD: Ideally we should have some integration test checking that if chunk validators skip the work, they are kicked out. As this case may be hard to separate in tests, can be done outside of the PR.

Comment on lines 155 to 163
let numer = U256::from(
produced_blocks * expected_chunks_produced * expected_endorsements
+ produced_chunks
* expected_blocks_produced
* expected_endorsements
+ produced_endorsements
* expected_blocks_produced
* expected_chunks_produced,
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we are increasing number of factors, can we add some comments why overflows don't happen at any point? We recently hit a crash because of that in statelessnet so we need to be careful about that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is test_reward_no_overflow which checks the calculation still works with the anticipated maximum values.

Comment on lines 793 to 796
|env: &mut TestEnv,
expected_blocks: &mut [u64],
expected_chunks: &mut [u64],
expected_endorsements: &mut [u64]| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we touch this code, can we replace &[u64] with &[u64; 2]? It would make it much easier to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in fcc4e9a

Comment on lines 94 to 95
let expected_blocks_produced = stats.block_stats.expected;
let expected_chunks_produced = stats.chunk_stats.expected();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For readability, can we rename these to expected_blocks and expected_chunks? Because in formulas it makes me always wonder whether it is "expected" or "produced".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in da37c4a

Comment on lines +422 to +424
// Test rewards when some validators are only responsible for endorsements
#[test]
fn test_reward_stateless_validation() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think validators responsible for all 3 roles are also worth unit testing, because that's what we will see on MVP - top validators will produce blocks and chunks and validate all the shards.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test1 has all three roles in test_reward_stateless_validation

@birchmd
Copy link
Contributor Author

birchmd commented Apr 23, 2024

use ChunkValidatorStatsV2 as ChunkValidatorStats everywhere

Done in b1053a8

@Longarithm Longarithm added this pull request to the merge queue Apr 23, 2024
@Longarithm
Copy link
Member

Thank you very much!

Merged via the queue into master with commit f95087b Apr 23, 2024
28 of 29 checks passed
@Longarithm Longarithm deleted the stateless-validator-rewards branch April 23, 2024 10:29
jancionear pushed a commit to jancionear/nearcore that referenced this pull request May 13, 2024
<p>This PR was automatically created by Snyk using the credentials of a
real user.</p><br /><h3>Snyk has created this PR to upgrade
react-router-dom from 6.20.1 to 6.21.1.</h3>

:information_source: Keep your dependencies up-to-date. This makes it
easier to fix existing vulnerabilities and to more quickly identify and
fix newly disclosed vulnerabilities when they affect your project.
<hr/>

- The recommended version is **7 versions** ahead of your current
version.
- The recommended version was released **22 days ago**, on 2023-12-21.


<details>
<summary><b>Release notes</b></summary>
<br/>
  <details>
    <summary>Package name: <b>react-router-dom</b></summary>
    <ul>
      <li>
        <b>6.21.1</b> - 2023-12-21
      </li>
      <li>
        <b>6.21.1-pre.0</b> - 2023-12-21
      </li>
      <li>
        <b>6.21.0</b> - 2023-12-13
      </li>
      <li>
        <b>6.21.0-pre.3</b> - 2023-12-06
      </li>
      <li>
        <b>6.21.0-pre.2</b> - 2023-12-05
      </li>
      <li>
        <b>6.21.0-pre.1</b> - 2023-12-05
      </li>
      <li>
        <b>6.21.0-pre.0</b> - 2023-12-05
      </li>
      <li>
        <b>6.20.1</b> - 2023-12-01
      </li>
    </ul>
from <a
href="https://snyk.io/redirect/github/remix-run/react-router/releases">react-router-dom
GitHub release notes</a>
  </details>
</details>


<details>
  <summary><b>Commit messages</b></summary>
  </br>
  <details>
    <summary>Package name: <b>react-router-dom</b></summary>
    <ul>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/08cda17f450ebe19481be3fc080d243ec5ef509f">08cda17</a>
chore: Update version for release (near#11132)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/14df5db772c1ec92fd680b7646c8865d5caf1a84">14df5db</a>
Exit prerelease mode</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/c55f8417e17491e8e4f5626680bc4c9b8d0f9ed9">c55f841</a>
Update release notes</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/e5d73cf2251bf99d3113695a57c34c84e0ac92e5">e5d73cf</a>
chore: Update version for release (pre) (near#11131)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/a7ec1c50756172b92aeff2225cbedca7254c63b7">a7ec1c5</a>
Enter prerelease mode</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/ba99ec567e1899f811973d99ca804dc620a9cfae">ba99ec5</a>
Merge branch &#x27;main&#x27; into release-next</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/8bb3ffdf3225dafd2c8df11a27141b0745fcc647">8bb3ffd</a>
Fix typo in changeset</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/0f04d11d152a2d7d7ebace786bc7c2fd30d55fbc">0f04d11</a>
Fix issues with partial hydration combined with route.lazy (near#11121)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/cc4436c544300a09d9d95c5f733d3b00fd7d426f">cc4436c</a>
Update changelogs for useResolvedSplat example</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/d9b36f6e16a7b437be30faf7fdd8154cff40e9cc">d9b36f6</a>
Update useResolvedPath docs</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/f8c54a89f604185e8462a7768d0172a670beae6d">f8c54a8</a>
chore: format</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/abb23de1345349fc81a5beb6ff6697ea04816078">abb23de</a>
doc edited for unstable_useViewTransitionState (near#11117)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/a4e91e089a730443f638742759e7dd335aea1399">a4e91e0</a>
chore: sort contributors list</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/2ea19faf9b47222a797037a63e61375622ad0fb6">2ea19fa</a>
docs: Add minor grammatical changes to &#x60;errorElement&#x60; docs for
better readability (near#11119)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/87d5d6114420b3f00e156c04564c2c24496b6235">87d5d61</a>
Fix unit test (near#11115)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/19e6398f20779dd34d8b4eca0f3605bf5831a7f0">19e6398</a>
Add v7_relativeSplatPath to createBrowserRouter docs</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/c292bdb9bd1171546b5937c6ff226a6f064c8652">c292bdb</a>
Merge branch &#x27;release-next&#x27; into dev</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/5567158e1f7ab96683bcb4c757e27b1c22ceb68f">5567158</a>
Merge branch &#x27;release-next&#x27;</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/69ba50e06633fd4add234fb47f2d49b0a5ee41f9">69ba50e</a>
chore: Update version for release (near#11114)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/ffd4373737a5bda3ad0fcebfa4895fbd8038a504">ffd4373</a>
Exit prerelease mode</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/54ec39ee35cfde99ee7218e6aa2bd1b7d8afc787">54ec39e</a>
Allow persisted fetchers unmounted during submit to revalidate
(near#11102)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/a76ee41e08f0b2a88557b48656ff2fc318f5c4d0">a76ee41</a>
Dedup relative path logic in resolveTo (near#11097)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/ea0ffeef8a0c353f8ef402b5df2ac7c60a4d0b49">ea0ffee</a>
chore: Update version for release (pre) (near#11095)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/fe3c071037b18f447dabb44fb24e4b1bce2a2a15">fe3c071</a>
Slight refactor to partial hydration to leverage state.initialized
properly (near#11094)</li>
    </ul>

<a
href="https://snyk.io/redirect/github/remix-run/react-router/compare/8b1ee67ebc00fc3e778d5e36fe9bca140b576b5c...08cda17f450ebe19481be3fc080d243ec5ef509f">Compare</a>
  </details>
</details>
<hr/>

**Note:** *You are seeing this because you or someone else with access
to this repository has authorized Snyk to open upgrade PRs.*

For more information: <img
src="https://api.segment.io/v1/pixel/track?data=eyJ3cml0ZUtleSI6InJyWmxZcEdHY2RyTHZsb0lYd0dUcVg4WkFRTnNCOUEwIiwiYW5vbnltb3VzSWQiOiIyODZlOTdmZi04MTUxLTRiMWYtODM2Zi0zNzJiMTg4NzZjZWIiLCJldmVudCI6IlBSIHZpZXdlZCIsInByb3BlcnRpZXMiOnsicHJJZCI6IjI4NmU5N2ZmLTgxNTEtNGIxZi04MzZmLTM3MmIxODg3NmNlYiJ9fQ=="
width="0" height="0"/>

🧐 [View latest project
report](https://app.snyk.io/org/ecp88/project/98480bdc-d80b-4fd1-89d7-c4c56a706763?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr)

🛠 [Adjust upgrade PR
settings](https://app.snyk.io/org/ecp88/project/98480bdc-d80b-4fd1-89d7-c4c56a706763/settings/integration?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr)

🔕 [Ignore this dependency or unsubscribe from future upgrade
PRs](https://app.snyk.io/org/ecp88/project/98480bdc-d80b-4fd1-89d7-c4c56a706763/settings/integration?pkg&#x3D;react-router-dom&amp;utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr#auto-dep-upgrades)

<!---
(snyk:metadata:{"prId":"286e97ff-8151-4b1f-836f-372b18876ceb","prPublicId":"286e97ff-8151-4b1f-836f-372b18876ceb","dependencies":[{"name":"react-router-dom","from":"6.20.1","to":"6.21.1"}],"packageManager":"npm","type":"auto","projectUrl":"https://app.snyk.io/org/ecp88/project/98480bdc-d80b-4fd1-89d7-c4c56a706763?utm_source=github&utm_medium=referral&page=upgrade-pr","projectPublicId":"98480bdc-d80b-4fd1-89d7-c4c56a706763","env":"prod","prType":"upgrade","vulns":[],"issuesToFix":[],"upgrade":[],"upgradeInfo":{"versionsDiff":7,"publishedDate":"2023-12-21T17:00:03.440Z"},"templateVariants":[],"hasFixes":false,"isMajorUpgrade":false,"isBreakingChange":false,"priorityScoreList":[]})
--->

Co-authored-by: snyk-bot <snyk-bot@snyk.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants