Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce RAM and persistent storage by deduplicating inlined dict type info #369

Merged

Conversation

fxamacker
Copy link
Member

@fxamacker fxamacker commented Feb 28, 2024

Closes #358 Updates epic #292

This PR deduplicates Cadence dictionary type resulting in reduced memory and persistent storage.

More specifically, this encodes inlined atree slab extra data section as two-element array:

  • array of deduplicated type info
  • array of deduplicated extra data with type info index

TODO (est. 1.5 - 2 days effort total):
[x] - initial deduplication of dict type info
[x] - build, run, and pass migration with validation enabled for all accounts and all payloads from mainnet snapshot
[x] - improve deduplication (1/2 day effort to wrap up + rerun test using mainnet snapshot)

UPDATE (Feb 29, 2024): 🛑 Work on this PR is paused to work on urgent request to create tooling for Cadence 1.0 migration.

NOTE: This PR is expected to produce small improvements to atree inlining & deduplication results 📊.


  • Targeted PR against main branch
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work
  • Code follows the standards mentioned here
  • Updated relevant documentation
  • Re-reviewed Files changed in the Github PR explorer
  • Added appropriate labels

This change deduplicates Cadence dictionary type and composite
type info, resulting in reduced memory and also persistent storage.

More specifically, this encodes inlined atree slab extra data
section as two-element array:
- array of deduplicated type info
- array of deduplicated extra data with type info index
@fxamacker fxamacker added performance improvement E&V Team Execution / Verification / Edge Team labels Feb 28, 2024
@fxamacker fxamacker self-assigned this Feb 28, 2024
@codecov-commenter
Copy link

codecov-commenter commented Feb 28, 2024

Codecov Report

Attention: Patch coverage is 62.92135% with 66 lines in your changes are missing coverage. Please review.

Project coverage is 70.38%. Comparing base (2fbf860) to head (35fdb7e).
Report is 8 commits behind head on feature/array-map-inlining.

Files Patch % Lines
typeinfo.go 61.48% 43 Missing and 14 partials ⚠️
array_debug.go 30.76% 6 Missing and 3 partials ⚠️
Additional details and impacted files
@@                      Coverage Diff                       @@
##           feature/array-map-inlining     #369      +/-   ##
==============================================================
+ Coverage                       62.45%   70.38%   +7.92%     
==============================================================
  Files                              15       15              
  Lines                           10919    12653    +1734     
==============================================================
+ Hits                             6820     8906    +2086     
+ Misses                           3119     2749     -370     
- Partials                          980      998      +18     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@fxamacker fxamacker changed the title Reduce RAM and persistent storage by deduplicating inlined dict and composite type info Reduce RAM and persistent storage by deduplicating inlined dict type info Feb 29, 2024
@fxamacker fxamacker marked this pull request as draft February 29, 2024 15:48
@fxamacker
Copy link
Member Author

I found some improvements to deduplication while migration + validation was running last night.

@fxamacker
Copy link
Member Author

Work on this PR is paused this morning to meet about and begin work on urgent request to help with Cadence 1.0 migration.

This PR has about 1/2 day of effort left to improve its deduplication for further reduction of memory and persistent storage.

@fxamacker fxamacker marked this pull request as ready for review February 29, 2024 20:12
@fxamacker
Copy link
Member Author

Yesterday, this passed atree inlining & deduplication migration with validation enabled (for all payloads for all accounts) using mainnet Jan 19, 2024 checkpoint as input.

Copy link
Contributor

@ramtinms ramtinms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@fxamacker fxamacker merged commit 92714ca into feature/array-map-inlining Mar 14, 2024
7 checks passed
fxamacker added a commit to onflow/cadence that referenced this pull request Mar 14, 2024
This commit bumps atree version to benefit from deduplication
of atree inlined Cadence dictionary type info.

For more info, see
onflow/atree#369
fxamacker added a commit to onflow/cadence that referenced this pull request Mar 15, 2024
This commit bumps atree version to benefit from deduplication
of atree inlined Cadence dictionary type info.

For more info, see
onflow/atree#369
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E&V Team Execution / Verification / Edge Team improvement performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants