New Feature: Support "diff-based" commit strategy (instead of uploading full bytes) #100

@snawaz

Summary

Currently, the commit instruction uploads (copies) the entire account data to the chain, even when only a few bytes have changed. For very large accounts, this is highly inefficient and consumes unnecessary compute units (see #96), bandwidth, and on-chain time.

This issue proposes introducing a diff-based commit mechanism, where only the modified parts of the account are uploaded, as an alternative strategy.

Motivation

  • Commit is likely the most frequently invoked instruction in the system (think of HFTs): potentially hundreds or thousands of commits for every single delegate or undelegate.
  • Optimizing it (even at the cost of other less frequently used instructions) could yield major performance improvements.

Proposed Approach (High Level)

  • Add a new commit instruction (e.g., CommitDiff) that accepts a "diff" payload instead of full bytes.
    • Or add new args to the existing one; a new instruction would probably be simpler, cleaner, and more efficient though. 🤔
  • The caller should decide which strategy is optimal (full-data commit vs. diff-based) based on the size and frequency of changes.
  • Allow clients to explicitly choose a commit strategy based on their needs.
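
To make the "caller decides" idea concrete, here is a minimal client-side sketch. Everything in it is hypothetical and only for illustration: the `DiffSegment` and `CommitPayload` types, the `choose_payload` helper, and the assumed per-segment wire overhead (8-byte offset plus 4-byte length prefix) are not part of any existing API. It only compares payload sizes; change frequency, which is also mentioned above, is not modeled.

```rust
/// Hypothetical diff segment: overwrite `bytes` starting at `offset`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DiffSegment {
    pub offset: u64,
    pub bytes: Vec<u8>,
}

/// Hypothetical payload for the two commit strategies discussed in this issue.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CommitPayload {
    /// Upload the whole account data (current behaviour).
    Full(Vec<u8>),
    /// Upload only the modified segments (proposed CommitDiff).
    Diff(Vec<DiffSegment>),
}

/// Assumed wire overhead per segment: 8-byte offset + 4-byte length prefix.
/// The real encoding is undecided; this number is an assumption.
const SEGMENT_OVERHEAD: usize = 8 + 4;

/// Estimated encoded size of a diff under the assumed overhead.
pub fn encoded_diff_size(segments: &[DiffSegment]) -> usize {
    segments
        .iter()
        .map(|s| SEGMENT_OVERHEAD + s.bytes.len())
        .sum()
}

/// Pick the diff only when it is strictly smaller than the full data.
pub fn choose_payload(new_data: &[u8], segments: Vec<DiffSegment>) -> CommitPayload {
    if encoded_diff_size(&segments) < new_data.len() {
        CommitPayload::Diff(segments)
    } else {
        CommitPayload::Full(new_data.to_vec())
    }
}
```

With this heuristic, a 3-byte change in a 1 KiB account would go out as a diff, while a change touching every byte would fall back to the full commit. Clients that want explicit control could bypass the helper and construct the variant they prefer.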

Questions

  • What diff format should we support?
    • offset-based (or overwrite-at-offset-based), i.e. replace the bytes at this position with this new data.
      • The diff is just a tuple of (offset, length, new_bytes) or simply (u64, Vec<u8>), since the length is implied by the byte vector.
      • This format seems to be the simplest and most straightforward.
      • The client, however, needs to compute the diff; ephemeral-rollups-sdk could provide a compute_diff() function for this.
    • patch-based (or semantic/structural diff), i.e. a list of commands to transform the account data.
      • This format seems quite complicated to me at the moment, especially because we do not know the exact semantics of the target account that CommitDiff would update.
  • Are there any security implications or verification considerations when applying diffs on-chain?
    • It seems diff-based updates are no different from regular full-bytes updates from a security point of view.
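
As a rough illustration of the offset-based format, the sketch below computes and applies a diff as a list of `(u64, Vec<u8>)` segments. It is a sketch under stated assumptions, not a proposed implementation: the signature of `compute_diff` is a guess (the issue only names the function), `apply_diff` is a hypothetical counterpart, and both buffers are assumed to have the same length, so account resizing is out of scope.

```rust
/// Compute an offset-based diff: one (offset, new_bytes) segment per
/// contiguous run of bytes that differ between `old` and `new`.
/// Assumption: both slices have the same length.
pub fn compute_diff(old: &[u8], new: &[u8]) -> Vec<(u64, Vec<u8>)> {
    assert_eq!(old.len(), new.len(), "sketch assumes equal lengths");
    let mut segments = Vec::new();
    let mut i = 0;
    while i < new.len() {
        if old[i] == new[i] {
            i += 1;
            continue;
        }
        // Start of a changed run; extend it while bytes keep differing.
        let start = i;
        while i < new.len() && old[i] != new[i] {
            i += 1;
        }
        segments.push((start as u64, new[start..i].to_vec()));
    }
    segments
}

/// Apply segments in order, rejecting any write that would fall outside
/// the buffer. Hypothetical counterpart of what CommitDiff might do.
pub fn apply_diff(data: &mut [u8], segments: &[(u64, Vec<u8>)]) -> Result<(), String> {
    for (offset, bytes) in segments {
        if *offset > data.len() as u64 {
            return Err(format!("offset {} is out of bounds", offset));
        }
        // Safe cast: offset <= data.len(), which fits in usize.
        let start = *offset as usize;
        let end = start
            .checked_add(bytes.len())
            .ok_or_else(|| "segment range overflows".to_string())?;
        if end > data.len() {
            return Err(format!("segment {}..{} exceeds data length {}", start, end, data.len()));
        }
        data[start..end].copy_from_slice(bytes);
    }
    Ok(())
}
```

Applying `compute_diff(old, new)` to a copy of `old` should reproduce `new`. The bounds check in `apply_diff` is probably the main verification that a diff-based commit needs beyond what a full-bytes commit already does, since the offsets come from the caller. Merging nearby runs into one segment could reduce per-segment overhead, but that trade-off depends on the final encoding.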
