Skip to content

Conversation

@buddhisthead
Copy link
Collaborator

@buddhisthead buddhisthead commented Dec 2, 2025

Description

This PR adds decoding of the Mark/Set/Go snapshots within the NewEpocData CBOR bootstrap file (dump from a Haskell node). The Mark/Set/Go data is defined in the Shelley Era ledger file, specifically in regards to the rewards calculations. The three snapshots (mark, set, go) are each in the same format, but represent rewards for epochs numbered -2, -1, and 0 where 0 is the current Epoch. This enables us to avoid computing the rewards from all past blocks and instead simply bootup the system state from the pre-computed rewards values.

Related Issue(s)

#433

How was this tested?

I tested this manually by running the test_streaming_parser.rs example. You can repeat this easily with:
make snap-test-streaming
which will download the CBOR file, build, and run the example with INFO level logging. You should see text like this amongst the output.

Raw Snapshots Data:
  Mark snapshot: 1334122 delegators, 22586623335.121437 total stake (ADA)
  Set snapshot: 1334566 delegators, 22567839323.08353 total stake (ADA)
  Go snapshot: 1334209 delegators, 22600232613.96385 total stake (ADA)
  Fee: 59492.203551 ADA

Checklist

  • My code builds and passes local tests
  • I added/updated tests for my changes, where applicable
  • I updated documentation (if applicable)
  • CI is green for this PR

Impact / Side effects

There should be no major impacts. It did require the update of the callback API, which I added to the consumer as an example showing some logging.

Reviewer notes / Areas to focus

Nothing specific.

Copilot AI review requested due to automatic review settings December 2, 2025 01:25
Copilot finished reviewing on behalf of buddhisthead December 2, 2025 01:29
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements decoding support for Mark/Set/Go snapshots from Cardano Haskell node CBOR bootstrap files. These snapshots contain pre-computed reward distribution data for epochs -2, -1, and 0 relative to the current epoch, enabling system state initialization without recomputing rewards from historical blocks. The implementation uses a hybrid parsing approach: streaming for UTXOs (for memory efficiency) and memory-based parsing for the remainder of the snapshot file (~500MB) to access protocol parameters and stake distribution snapshots.

Key changes:

  • Added mark_set_go.rs module with VMap structure and snapshot parsing logic
  • Extended streaming snapshot parser with hybrid memory/streaming approach
  • Implemented snapshot callback handlers in bootstrapper and publisher modules
  • Added fees field to PotBalances and snapshots metadata to SnapshotMetadata

Reviewed changes

Copilot reviewed 10 out of 12 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
common/src/snapshot/mark_set_go.rs New module implementing VMap CBOR decoder and Mark/Set/Go snapshot parsing with flexible format handling
common/src/snapshot/streaming_snapshot.rs Extended parser with hybrid approach, added fees tracking, snapshot parsing integration, and SnapshotsCallback trait requirement
common/src/snapshot/mod.rs Added re-exports for new snapshot types (RawSnapshotsContainer, SnapshotsCallback, VMap)
common/src/snapshot/error.rs Updated error formatting to use inline format strings
modules/snapshot_bootstrapper/src/snapshot_bootstrapper.rs New module implementing SnapshotHandler with callbacks and message publishing for bootstrap state distribution
modules/snapshot_bootstrapper/src/publisher.rs Added SnapshotsCallback implementation with stake calculation and logging
common/examples/test_streaming_parser.rs Added SnapshotsCallback implementation and env_logger initialization for testing
common/Cargo.toml Added log and env_logger dependencies
Makefile Added LOG_LEVEL variable support for snap-test-streaming target
docker-compose.yaml New file defining three Cardano node services (purpose unclear)
.gitignore Added development artifacts (restore.sh, startup.sh, configuration/)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 1226 to 1235
let fees = match remainder_decoder.decode::<u64>() {
Ok(fees_value) => {
info!(" • Successfully parsed fees: {} lovelace", fees_value);
fees_value as i64
}
Err(e) => {
info!(" • Failed to parse fees: {}, using 0", e);
0
}
};
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential type conversion issue: fees is decoded as u64 but then cast to i64 (line 1229), which could overflow for values greater than i64::MAX. Later at line 1349, it's cast back to u64. This double conversion is unnecessary and could cause issues.

Consider using u64 consistently for fees, or if negative values are semantically meaningful, handle the conversion more carefully with proper overflow checking.

Copilot uses AI. Check for mistakes.
fn on_snapshots(&mut self, snapshots: RawSnapshotsContainer) -> Result<()>;
}

/// From ttps://github.com/rrruko/nes-cddl-hs/blob/main/nes.cddl
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in URL comment: "ttps" should be "https".

Suggested change
/// From ttps://github.com/rrruko/nes-cddl-hs/blob/main/nes.cddl
/// From https://github.com/rrruko/nes-cddl-hs/blob/main/nes.cddl

Copilot uses AI. Check for mistakes.
Comment on lines 232 to 234
let mark_total: i64 = snapshots.mark.0.iter().map(|(_, stake)| stake).sum();
let set_total: i64 = snapshots.set.0.iter().map(|(_, stake)| stake).sum();
let go_total: i64 = snapshots.go.0.iter().map(|(_, stake)| stake).sum();
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential overflow when summing stakes: The code sums i64 values which could overflow if there are many large stake amounts. While Cardano's total supply is limited, summing many individual stakes without overflow checking could still be problematic.

Consider using checked arithmetic or a larger type (e.g., i128) for the accumulation to prevent potential overflow issues.

Suggested change
let mark_total: i64 = snapshots.mark.0.iter().map(|(_, stake)| stake).sum();
let set_total: i64 = snapshots.set.0.iter().map(|(_, stake)| stake).sum();
let go_total: i64 = snapshots.go.0.iter().map(|(_, stake)| stake).sum();
let mark_total: i128 = snapshots.mark.0.iter().map(|(_, stake)| i128::from(*stake)).sum();
let set_total: i128 = snapshots.set.0.iter().map(|(_, stake)| i128::from(*stake)).sum();
let go_total: i128 = snapshots.go.0.iter().map(|(_, stake)| i128::from(*stake)).sum();

Copilot uses AI. Check for mistakes.
Comment on lines +120 to +122
let mark_total: i64 = snapshots.mark.0.iter().map(|(_, amount)| amount).sum();
let set_total: i64 = snapshots.set.0.iter().map(|(_, amount)| amount).sum();
let go_total: i64 = snapshots.go.0.iter().map(|(_, amount)| amount).sum();
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential overflow when summing stakes: The code sums i64 values which could overflow if there are many large stake amounts. While Cardano's total supply is limited, summing many individual stakes without overflow checking could still be problematic.

Consider using checked arithmetic or a larger type (e.g., i128) for the accumulation to prevent potential overflow issues.

Suggested change
let mark_total: i64 = snapshots.mark.0.iter().map(|(_, amount)| amount).sum();
let set_total: i64 = snapshots.set.0.iter().map(|(_, amount)| amount).sum();
let go_total: i64 = snapshots.go.0.iter().map(|(_, amount)| amount).sum();
let mark_total: i128 = snapshots.mark.0.iter().map(|(_, amount)| *amount as i128).sum();
let set_total: i128 = snapshots.set.0.iter().map(|(_, amount)| *amount as i128).sum();
let go_total: i128 = snapshots.go.0.iter().map(|(_, amount)| *amount as i128).sum();

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overkill.

Comment on lines +280 to +282
let mark_total: i64 = snapshots.mark.0.iter().map(|(_, amount)| amount).sum();
let set_total: i64 = snapshots.set.0.iter().map(|(_, amount)| amount).sum();
let go_total: i64 = snapshots.go.0.iter().map(|(_, amount)| amount).sum();
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential overflow when summing stakes: The code sums i64 values which could overflow if there are many large stake amounts. While Cardano's total supply is limited, summing many individual stakes without overflow checking could still be problematic.

Consider using checked arithmetic or a larger type (e.g., i128) for the accumulation to prevent potential overflow issues.

Suggested change
let mark_total: i64 = snapshots.mark.0.iter().map(|(_, amount)| amount).sum();
let set_total: i64 = snapshots.set.0.iter().map(|(_, amount)| amount).sum();
let go_total: i64 = snapshots.go.0.iter().map(|(_, amount)| amount).sum();
let mark_total: i128 = snapshots.mark.0.iter().map(|(_, amount)| *amount as i128).sum();
let set_total: i128 = snapshots.set.0.iter().map(|(_, amount)| *amount as i128).sum();
let go_total: i128 = snapshots.go.0.iter().map(|(_, amount)| *amount as i128).sum();

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems a bit overkill to me.

Comment on lines 88 to 242
pub fn parse_single_snapshot(decoder: &mut Decoder, snapshot_name: &str) -> Result<Snapshot> {
eprintln!(" {snapshot_name} snapshot - checking data type...");

// Check what type we have - could be array, map, or simple value
match decoder.datatype().context("Failed to read snapshot datatype")? {
minicbor::data::Type::Map | minicbor::data::Type::MapIndef => {
eprintln!(
" {snapshot_name} snapshot is a map - treating as stake distribution directly"
);
// Try VMap first, then fallback to simple map
match decoder.decode::<VMap<StakeCredential, i64>>() {
Ok(snapshot_stake) => {
eprintln!(
" {} snapshot - successfully decoded {} stake entries with VMap",
snapshot_name,
snapshot_stake.0.len()
);
Ok(Snapshot { snapshot_stake })
}
Err(vmap_error) => {
eprintln!(
" {snapshot_name} snapshot - VMap decode failed: {vmap_error}"
);
eprintln!(
" {snapshot_name} snapshot - trying simple BTreeMap<bytes, i64>"
);

// Reset decoder and try simple map format
// Note: We can't reset the decoder, so we need to handle this differently
// For now, return an empty snapshot to continue processing
eprintln!(
" {snapshot_name} snapshot - using empty fallback due to format mismatch"
);
Ok(Snapshot {
snapshot_stake: VMap(Vec::new()),
})
}
}
}
minicbor::data::Type::Array => {
eprintln!(" {snapshot_name} snapshot is an array");
decoder.array().context("Failed to parse snapshot array")?;

// Check what the first element type is
eprintln!(
" {snapshot_name} snapshot - checking first element type..."
);
match decoder.datatype().context("Failed to read first element datatype")? {
minicbor::data::Type::Map | minicbor::data::Type::MapIndef => {
eprintln!(
" {snapshot_name} snapshot - first element is a map, parsing as stake"
);
// First element is snapshot_stake
let snapshot_stake: VMap<StakeCredential, i64> = decoder.decode()?;

// Skip delegations (second element)
decoder.skip().context("Failed to skip snapshot_delegations")?;

// Skip pool_params (third element)
decoder.skip().context("Failed to skip snapshot_pool_params")?;

Ok(Snapshot { snapshot_stake })
}
other_type => {
eprintln!(
" {snapshot_name} snapshot - first element is {other_type:?}, skipping entire array"
);
// We don't know how many elements are in this array, so just skip the first element
// and let the array parsing naturally complete
decoder.skip().context("Failed to skip first element")?;

// Try to skip remaining elements, but don't fail if there aren't exactly 3
loop {
match decoder.datatype() {
Ok(minicbor::data::Type::Break) => {
// End of indefinite array
break;
}
Ok(_) => {
// More elements to skip
decoder.skip().ok(); // Don't fail on individual skips
}
Err(_) => {
// End of definite array or other error - break
break;
}
}
}

Ok(Snapshot {
snapshot_stake: VMap(Vec::new()),
})
}
}
}
minicbor::data::Type::U32
| minicbor::data::Type::U64
| minicbor::data::Type::U8
| minicbor::data::Type::U16 => {
let value = decoder.u64().context("Failed to parse snapshot value")?;
eprintln!(
" {snapshot_name} snapshot is a simple value: {value}"
);

// Return empty snapshot for simple values
Ok(Snapshot {
snapshot_stake: VMap(Vec::new()),
})
}
minicbor::data::Type::Break => {
eprintln!(
" {snapshot_name} snapshot is a Break token - indicates end of indefinite structure"
);
// Don't consume the break token, let the parent structure handle it
// Return empty snapshot
Ok(Snapshot {
snapshot_stake: VMap(Vec::new()),
})
}
minicbor::data::Type::Tag => {
eprintln!(
" {snapshot_name} snapshot starts with a CBOR tag, trying to skip tag and parse content"
);
let _tag = decoder.tag().context("Failed to read CBOR tag")?;
eprintln!(
" {snapshot_name} snapshot - found tag {_tag}, checking tagged content..."
);

// After consuming tag, try to parse the tagged content
match decoder.datatype().context("Failed to read tagged content datatype")? {
minicbor::data::Type::Map | minicbor::data::Type::MapIndef => {
let snapshot_stake: VMap<StakeCredential, i64> = decoder.decode()?;
Ok(Snapshot { snapshot_stake })
}
other_tagged_type => {
eprintln!(
" {snapshot_name} snapshot - tagged content is {other_tagged_type:?}, skipping"
);
decoder.skip().ok(); // Don't fail on skip
Ok(Snapshot {
snapshot_stake: VMap(Vec::new()),
})
}
}
}
other_type => {
eprintln!(
" {snapshot_name} snapshot has unexpected type: {other_type:?}, skipping..."
);
decoder.skip().ok(); // Don't fail on skip

// Return empty snapshot
Ok(Snapshot {
snapshot_stake: VMap(Vec::new()),
})
}
}
}
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function uses eprintln! for debug/diagnostic output throughout (lines 89-236). This is inconsistent with the rest of the codebase which uses the tracing crate (as seen in streaming_snapshot.rs and other modules).

Consider replacing all eprintln! calls with appropriate tracing macros like info! or debug! for consistency and to allow proper log level control.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point. Will switch to using logging instead.

Comment on lines 27 to 28
log = "0.4"
env_logger = "0.10" # Or a newer version
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The env_logger dependency is only used in an example binary (test_streaming_parser.rs), not in the library code. It should be a dev-dependency rather than a regular dependency.

In Cargo.toml, move env_logger = "0.10" from the [dependencies] section to a [dev-dependencies] section to avoid including it unnecessarily in production builds.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same.

Comment on lines 1185 to 1197
// Calculate remaining file size from current position
let current_file_size = utxo_file.metadata()?.len();
let remaining_bytes = current_file_size.saturating_sub(position_after_utxos);

if ledger_state_len < 2 {
return Err(anyhow!(
"LedgerState array too short: expected at least 2 elements, got {}",
ledger_state_len
));
}

// Parse CertState [3][1][0]
let cert_state_len = decoder
.array()
.context("Failed to parse CertState array")?
.ok_or_else(|| anyhow!("CertState must be a definite-length array"))?;
info!(
" • Reading remainder of file into memory: {:.1} MB from position {}",
remaining_bytes as f64 / 1024.0 / 1024.0,
position_after_utxos
);

if cert_state_len < 3 {
return Err(anyhow!(
"CertState array too short: expected at least 3 elements, got {}",
cert_state_len
));
}
// Read the entire remainder of the file into memory
let mut remainder_buffer = Vec::with_capacity(remaining_bytes as usize);
utxo_file.read_to_end(&mut remainder_buffer)?;
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code allocates the entire remainder of the file into memory (line 1196-1197), which the comment estimates at ~500MB. However, there's no validation that this is actually a reasonable size. If the snapshot file is larger than expected or corrupted, this could allocate excessive memory or fail.

Consider adding a sanity check on remaining_bytes before allocation (e.g., remaining_bytes > 2GB) and returning an error if it exceeds a reasonable threshold. This would prevent unexpected memory allocation failures.

Copilot uses AI. Check for mistakes.
Comment on lines +2126 to +2131
let raw_snapshots = RawSnapshotsContainer {
mark: parsed_snapshots[0].snapshot_stake.clone(),
set: parsed_snapshots[1].snapshot_stake.clone(),
go: parsed_snapshots[2].snapshot_stake.clone(),
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code uses indexing into parsed_snapshots vector (lines 2127-2129) without bounds checking. While we know the vector should have exactly 3 elements (from the push operations), using direct indexing with [0], [1], [2] could panic if there's a logic error.

Consider using pattern matching or get() with proper error handling for defensive programming:

let [mark, set, go] = parsed_snapshots.as_slice() else {
    return Err(anyhow!("Expected exactly 3 snapshots"));
};
let raw_snapshots = RawSnapshotsContainer {
    mark: mark.snapshot_stake.clone(),
    set: set.snapshot_stake.clone(),
    go: go.snapshot_stake.clone(),
    fee: fee_value,
};
Suggested change
let raw_snapshots = RawSnapshotsContainer {
mark: parsed_snapshots[0].snapshot_stake.clone(),
set: parsed_snapshots[1].snapshot_stake.clone(),
go: parsed_snapshots[2].snapshot_stake.clone(),
let [mark, set, go] = parsed_snapshots.as_slice() else {
return Err(anyhow!("Expected exactly 3 snapshots (Mark, Set, Go), got {}", parsed_snapshots.len()));
};
let raw_snapshots = RawSnapshotsContainer {
mark: mark.snapshot_stake.clone(),
set: set.snapshot_stake.clone(),
go: go.snapshot_stake.clone(),

Copilot uses AI. Check for mistakes.
@echo "This will parse the entire snapshot and collect all data with callbacks..."
@echo "Expected time: ~1-3 minutes for 2.4GB snapshot with 11M UTXOs"
@echo ""
@$(CARGO) run --release --example test_streaming_parser -- "$(SNAPSHOT)"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add rust logging (default is "info")

);
if let Some(count) = metadata.utxo_count {
eprintln!(" • UTXO count: {}", count);
eprintln!(" • UTXO count: {count}");
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clippy cleanup

@@ -0,0 +1,246 @@
// ================================================================================================
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole file is just for parsing snapshots, which are complicated. There is a lot of error handling here because it took a lot of work to decipher the structure. I don't want to lose this code for when we migrate to some other structure (like canonical format).

@buddhisthead buddhisthead requested a review from lowhung December 2, 2025 16:37

// Show snapshots info if available
if let Some(snapshots_info) = &metadata.snapshots {
eprintln!(" 📸 Snapshots Info:");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: don't want to be pedantic, but I'm not a huge fan of emojis in prints / logs

Copy link
Collaborator Author

@buddhisthead buddhisthead Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that's a reasonable nit. It will generate a lot more changes because if I remove this one, then I should remove them all. It's actually kind of cool looking in the display, but does cause problems for using other tools like grep, etc.


/// Raw snapshots container with VMap data
#[derive(Debug, Clone)]
pub struct RawSnapshotsContainer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

/// snapshot_stake: stake distribution map (credential -> lovelace amount)
pub snapshot_stake: VMap<StakeCredential, i64>,
// snapshot_delegations: delegation map (credential -> stake pool key hash)
// pub snapshot_delegations: VMap<Credential, StakePool>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very confident we need to decode both snapshot_delegations and snapshot_pool_params in order to calculate rewards. I'd love for @sandtreader to chime in here. At the very least we need to know specifically who has delegated to each pool so that each delegator's rewards can be calculated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, we can add that in the next PR. I would like to get clarity first on exactly what is needed. Do we need to calculate rewards at all? I thought we want to just set them from the state in the snapshot. So, will wait to hear from @sandtreader and then will write up a new ticket or enhance the one we have.

Copy link
Collaborator

@lowhung lowhung Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good with me 👍

* A hybrid approach seeks the file position to the point just after the UTXOs
* At that point, we read the file into memory and use standard mini-cbor decoding
* The main difficulty in using a streaming approach is the complex layout of the
* governance structures.
* This approach is more than 10x faster than using Amaru's lazy decoder, which
* itself doesn't actually save any more memory than the approach take here. The
* reason is that the Lazy decoder must hold all of a structure in memory in order
* for the base decoder to decode it.
* Update the bootstrapper callback for the new mark/set/go snapshot api
* Remove all emojis
@buddhisthead buddhisthead force-pushed the cet/stream-mark-set-go branch from d013f67 to abf7593 Compare December 2, 2025 21:04
@buddhisthead buddhisthead changed the title Cet/stream mark set go feat: update snapshot parser for mark/set/go Dec 2, 2025
@buddhisthead buddhisthead merged commit 6b4a449 into main Dec 2, 2025
2 checks passed
@buddhisthead buddhisthead deleted the cet/stream-mark-set-go branch December 2, 2025 22:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants