Skip to content

log :: 2025‐08

Matthias Benkort edited this page Sep 9, 2025 · 34 revisions

Tip

KEYWORDS

Chain Sync, Chain Selection, Deterministic Simulation, Stage Graph, P2P Networking

SUMMARY

Refactors to the chain sync client (split clients to avoid lock/starvation, “caught-up” semantics) with a lingering pure-stage .await panic under tracing and a plan to land changes in small PRs.

Chain selection work delivers major perf gains and fixes (random walks, rollback cases, profiling-driven data-structure swaps), with a review-ready PR and end-to-end investigations.

The simulation framework advances toward deterministic runs: Rust simulator + scheduler/event-loop APIs, external-effect hooks, seeds/CBOR traces, and handover notes for time simulation & coverage.

Networking/ops updates include a dedicated network crate, N2N clients, pipelining discussions for Pallas, “adversarial node” scaffolding, and easier node bootstrapping; ongoing metrics/observability cleanup supports debugging.

2025-08-29

Chain sync client refactoring

  • Got into troubles with merge conflicts following other changes so spent a lot of time resolving those

  • I tried to add back the timeout() around await_next() to prevent lags caused by deadlocks but this lead to nowhere

  • I stole RK's PeerClient splitting code to avoid having a single structure holding both chain_sync and fetch_block clients which lead to starvation:

    1. we need a mutable reference on a client to be able to use it
    2. if we share a mutable reference it needs to be behind a Mutex which entails lock() to access it
    3. By decomposing PeerClient and resuing its constituent parts we don't have a contention anymore and can simply await_next when we are caught up on chain sync, which will block the stage without blocking subsequent fetch blocks
  • Unfortunately, there is some issue somewhere which I don't understand:

    thread 'tokio-runtime-worker' (1221021) panicked at /home/curry/amaru/crates/pure-stage/src/tokio.rs:257:13:
     stage `select_chain-4` used .await on something that was not a stage effect
     stack backtrace:
     0: __rustc::rust_begin_unwind
     1: core::panicking::panic_fmt
     2: <pure_stage::tokio::TokioBuilder as pure_stage::stagegraph::StageGraph>::wire_up::{{closure}}::{{closure}}
     3: tokio::runtime::task::core::Core<T,S>::poll
     4: tokio::runtime::task::harness::Harness<T,S>::poll
     5: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
     6: tokio::runtime::scheduler::multi_thread::worker::Context::run
     7: tokio::runtime::context::scoped::Scoped<T>::set
     8: tokio::runtime::context::runtime::enter_runtime
     9: tokio::runtime::scheduler::multi_thread::worker::run
    10: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
    11: tokio::runtime::task::core::Core<T,S>::poll
    12: tokio::runtime::task::harness::Harness<T,S>::poll
    note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
    
  • This issue does not occur systematically when I add --with-json-traces so there's perhaps some connection with tracing system

  • I have tried to apply the "split client" change to main and it seems to be working fine so I probably messed up with the wiring when rebasing. I will do a smaller PR to just handle the split, then another one to handle CaughtUp logic independently

  • This whole journey is another good reminder small increments are key to avoid wasting time 🤦 when one is not 100% focused

Consensus simulation

Getting acquainted with the simulation framework:

  • Refactored the retrieval of arguments: https://github.com/pragma-org/amaru/pull/414.
  • Added comments and documentation to the code.
  • Extracted some structs and functions to their own file.
  • Shared the random_bytes function that was duplicated.

Chain selection

The PR is now ready for review. The most notable changes are:

  • Some Cargo.toml changes to make sure that the pprof crate used to profile the bench does not get compiled under Windows (since it can't be).
  • A faster access to the best chain when updating the Me tracker: we select the chain that has self.best_chain as its last header.
  • The bench now has an execution time of 481ns per action and running the node with Jaeger showed a 10x improvement thanks to the modification above.
  • The random_walk algorithm has been fixed so that a given peer truly walks several branches of a tree of headers.

2025-08-28

Chain selection

Despite getting good execution times per action on the HeadersTree bench (roughly 1 to 3µs), the end-to-end job still times-out when the size of the HeadersTree is set to k. In order to investigate this I'm installing a Cardano node on my machine and trying to run the demo.

Additionally I realized that the random walk done for simulating actions by different peers on a tree of headers is not very complete:

  • When we rollback the actions on a given child, we don't resume with the next.
  • Fixing this is not entirely trivial because of the max_length of a tree which move the HeadersTree off the original root after a number of roll forwards. Then some actions will become invalid and the simulation will fail. => I need to fix this later.

In order to investigate the time-out issue, I tried to reproduce it locally and run the amaru demo (which I should have started by doing a long time ago):

  • I built a cardano-node (using nix, fortunately no issue there).
  • I started by initializing the node from preprod, but each epoch is taking a long time to recompute.
  • So I downloaded the mithril-client and a preprod snapshot.
  • I started my local cardano-node with this snapshot.
  • I bootstrapped an amaru node connected to the cardano-node
    • The message AMARU: invalid AMARU_LOG value (environment variable not found), falling back to 'error,amaru=debug' is not great IMO. Instead of saying "invalid" we could just say what the default will be. "invalid" makes me think I made a mistake.
  • With Arnaud we observed an increasing memory consumption for the node. I also had this impression but things seem to stabilize around 700Mb.
  • I'm also able to see spans in Jaeger but that doesn't tell me much right now.

For now I have the impression that I'm looking for a needle in a haystack. I will try to gather more clues tomorrow and possibly run some comparisons with main to try to understand how the size of the HeadersTree make a difference.

Compiling the amaru node in release mode

This takes a long time. I wanted to check if linking was an issue. I tried a faster linker on Mac (ld64.lld) but that didn't make a difference.

Simulation testing framework handover

As SA is going to leave the project for new adventures, we discussed what's the current state of the simulation framework, and how to handle handover to the rest of the team:

  • Checkout EDR: https://github.com/pragma-org/amaru/blob/main/engineering-decision-records/011-deterministic-simulation-testing.md

  • Started design work on simulating time in Haskell, with blog post

  • Main logic:

    • nextTime : Maybe Time = what's your next interesting time point, for each node
    • advanceToTime : Time -> [Outgoing] = advance to time and collect all messages that should be sent
  • main event loop is:

    • collect all messages with time arrival
    • enqueue them in heap
    • ask nodes for their next interesting time
    • enqueue those in heap
    • dequeue highest priority action
    • either send message to node or trigger internal action
  • Interfaces with the environment are abstracted:

    • NodeHandle is an interface defining how nodes communicate together
    • Runtime is the interface than handles send/receive from nodes in the event loop
    • each node is a state machine Input -> State -> (State, [Effect])
  • nodes leaving/joining is not simulated -> could be done with a specific kind of message from the environment

  • from the syntax (with bindings), it defines semantics (first-order, sequential basic actions ~ bytecode)

  • We then looked at existing rust simulator code:

    • what about fetch_block stage? how to handle RPC?
    • ATM we can deal with it as an ExternalEffect
    • once we have network-v2 from Pallas, we shall be able to integrate the actual network handling state machine as one or more stages and run them as part of the simulation, including time handling
    • racing several calls is not possible currently because RPC in pure-stage are "blocking", eg. only one call can happen at a time

TODO

  • ET takes ownership of the simulation framework working with SA
  • Goals (in priority order):
    • Ensure Simulation covers all consensus stage
      • with simulated externals effects
      • Be able to run simulation with synthetic headers and blocks (eg. faking the ledger)
    • implement time handling in simulation event loop
  • Rewire stages pipeline to put select_chain at the end
    • check RK's current work

2025-08-27

Chain selection

Today was a lot of polishing to get the PR ready for review, and... the end-to-end snapshot job failed for performance reasons. After a few shots in the dark, I resorted to using a profiler, first by using https://github.com/mstange/samply, but this was impractical: during the headers tree bench there is a long preparation phase that dwarfs a lot of the execution. I eventually generated a flamegraph using https://crates.io/crates/pprof and this revealed an issue that I completely overlooked. The Tree<H> that I added to track parent->child relationship was very expensive to update with new nodes. I have now replaced it by a map BTreeMap<HeaderHash, Vec<HeaderHash>> simply storing those relationships and the results of the bench are much better. For example here is an execution of the bench with 10 peers:

start executing the actions
tree size: 86367
tree leaves: 19470
number of peers: 10
number of actions: 56476
number of rollbacks: 2717
time after executing actions: 171.249375ms
time per action: 3.032µs
headers best chain size after executing actions: 2160
number of results: 56476

A previous execution was more looking like:

start executing the actions
tree size: 86367
tree leaves: 19470
number of peers: 10
number of actions: 56476
number of rollbacks: 2717
time after executing actions: 7.11901925s
time per action: 126.053µs
headers best chain size after executing actions: 2160
number of results: 56476

The difference is so significative that I suppose that this will fix the end-to-end job performance. Next steps:

  • Rebase on main with Roland's latest changes that require a merge.
  • Hopefully final cleanup.

2025-08-26

Chain selection

The new PR for the chain selection is almost ready for review, I'm only adjusting a few things after coderabbitai's suggestions. The biggest win is the processing time, at least on the new bench. I have tested the processing of a large number of peers doing a random walk across a large tree and the result is < 1ms per action:

tree size: 86367
tree leaves: 19470
number of actions: 1088156
number of rollbacks: 51679
number of peers: 200
time after executing actions: 837.041801125s
time per action: 769.229µs
headers best chain size after executing actions: 2160
number of results: 1088156

The README has also been modified to better represent the current data structure but I still need to update the diagrams.

I'll make the PR reviewable as soon as possible now.

NOTE: We considered using criterion for the benchmarks but it does not appear well suited to how we do things. What could be useful is to write criterion benches for different size of trees and run a single action in a tight loop with multiple headers in order to get statistics on the execution time for a single action, but that would require significant work and this PR has already been delayed for too long.

Network chain sync

In my work on "pipelining" chain sync (which is not really pipelining but just cleaning up code and removing some unneeded locking) I removed the trace that says we are "caught-up" and which is probably used somewhere down the line. It's useful to know that we are caught-up with the chain in general, so I would like to restore that property. Turns out I was looking for a definition of what caught-up means in the consensus node and it's defined in the Genesis document:

The GSM transitions from Syncing to CaughtUp when every peer simultaneously indicates that is has no subsequent headers to send and the syncing node has selected a chain no worse than any received header. This trigger is notably not time based, because an adversary is free to create an alternative chain with a very young tip in order to cause the node to lower its defenses by selecting it.

It seems relatively straightforward in the chain_sync_client to propagate a message to our downstream peers that would indicate that situation, with downstream peers doing whatever they want with it, instead of having a shared memory variable being updated, for one peer.

However it becomes more tricky for multiple peers as we need them to be simultaneously idle. I am spelunking in the consensus code, essentially related to Genesis and ChainSyncJump to try to find the place where this is set exactly and it's of course not quite trivial. The entry point is in the GSM module. The blockUntilCaughtUp function translates the above condition into code.

In the afternoon we put the finishing touch to the work on scaffolding the "Adversarial node" for running inside antithesis, in a mob programming session with the HAL team. It turned out to be a relatively straightforward journey and we ultimately managed to build a program that synchronizes a fixed number of headers through N2N protocol, that can be called within a docker container as required by AT's Test composer. What's enlightening is to contrast the amount of code one has to write to connect a client to a node in Amaru in Haskell and Rust.

2025-08-22

Chain selection

  • Finished the refactor that now tracks all chain headers.
  • I haven't solved the special case at the end of the README but I'm not quite sure if it needs solving now.
  • I think I eventually understand better the ChainSync protocol in the sense that it is trying to maximize throughput, minimize latency and as such rollbacks might happen not because of a switch but just because a block has been found to be invalid later.
  • I re-arranged the presentation of diagnostic data at the end of a failed property test so that:
    • We see exactly the last action executed + before state / after state for both the HeadersTree and the expected chain(s)
    • We get a better list of actions that can be copied directly to a unit test.
  • I finally understood how to use the debugger in RustRover: you need to pass --profile dev-debug in the run configuration. This can be quite useful sometimes.
  • I'm definitely going to stop trying to use the multimap library in Rust. It is immature to the point of being a correctness risk. For example you conceptually insert elements at the end of a list for a key but you get at the beginning, so that map.get(k, map.insert(k, v)) != v which is not very intuitive.
  • Next up should be: => Performance tests as the first order of priority => A discussion of special rollback cases to make sure we cover everything => Maybe add a property test showing that feeding the output of a HeadersTree to another HeadersTree gives the same result. => Preparation of PR with more clean-ups (all tests + clippy are passing now on the branch called etorreborre/test/test-headers-tree-with-random-tree-actions-refactor-track-peer-chains)

2025-08-21

Chain selection

Today was a lot of rework and I'm still in an unfinished state:

  • The attempt to track only the tips of the best chains doesn't work.
  • We need to track:
    • The headers, by hash, to retrieve child->parent relationships
    • A Tree<H>, to retrieve parent->child relationship (a Tree zipper could do both...)
    • Full chains for each peer (with only the header hashes).
    • A distinguished "best chain". => I need to make all the tests pass again with that
  • The case of an HeaderTree that is instantiated without any peer yet is a bit troublesome. I might go back to having at least a Me peer that we introduced with Arnaud.
  • The fact that a header has a parent hash that is sometimes None when the parent header is Genesis which actually has a pseudo-hash is also a bit troublesome. I wish there was a more disciplined way of dealing with this.
  • Eventually I'd like to rewrite many of our tests in a standardized way using the Action data type that we introduced and some automatic printlns when something fails.

2025-08-20

Network layer

  • Spent the day polishing this PR, mostly to use the ChainSyncClient in the various places where it's used, eg. Amaru pull stage and the import_headers command

    • The latter's code was much simplified as I replaced use of pull stage and manual handling of responses with ChainSyncClient's API.
  • The part I am not sure about is the removal of the piece of state that tracks whether or not we are caught up with the chain. It's an Arc referenced that's shared between stages, namely the pull and ledger stages:

    • It's updated in the pull stage: It's true until the peer sends us an Await message at which point it becomes false
    • It's used in the ledger stage: When the node is caught up, we info! every chain extension, whereas we only trace! when it's not
  • While tactically meaningful, it does not make much sense in the long run because there are multiple pull stages, one per upstream peer, and it could be the case we flip flop between those tests depending on peer's state

    • We want a clear strategy to deal with the question of whether or not we are at the tip of the chain, and this also might have an impact elsewhere: Currently the cardano-node does not open its downstream port until it's fully caught up, do we want to mimic that behaviour? I am unsure what the impacts are on the security of the protocol, or it could be that downstream would actually disconnect if they find our peer is lagging behind, or even flag it as unreliable in the p2p network?
    • This concern should be handled in the chain selection stage, possibly using a message to propagate the information to other stages
  • I also went back quickly through Duncan's video which can be found from the typed-protocols page. In the last part he explains how pipelining can be superimposed over an existing protocol client:

    • It boils down to saying that when the continuation of handling a server response to a request that is guaranteed to set us in the same state (eg. where we can send requests again), then you can enqueue this continuation and immediately proceed to sending more requests
    • On the server's side, the requests will accumulate in their incoming buffer and simply be dequeued and handled one-by-one, ie. there's no need to implement specific logic for pipelining
    • On the client side, a separate thread or some interleaved action takes care of handling the responses, applying queued continuation to each of them
  • In pallas, this could be implemented for chain sync roughly as:

    • when pipelining, the client uses send_msg_chunks to send RequestNext, which does not change its state and enqueues the message directly into the buffer
    • it maintains a queue of "handlers", could simply be marker messages, which is dequeued either in a separate task or when enough messages have been sent (capped by distance from tip perhaps?)
    • the responses "handlers" are dequeued, the recv_full_msg is called and dequeues the responses one by one, handling them without changing the state again
    • The state (Agency) is changed at the end of the pipelining process
  • it seems this would require some significant change in Pallas, like a dedicated chainsync client?

Chain selection

I started a refactoring along the following ideas:

  • Use a BTreeMap<Hash, H> as an arena for headers
  • Use a BTreeMap<Peer, Hash> to track the peer position
  • Use a Vec<Hash> to track the hashes at the tip of the best chains -> by default use the .first() to denote the best chain

This is enough to implement most of the behaviors (and the code is simpler) but we need additional variables to keep performances in check:

  • The current best_length that is incremented when a header is added on top of a best_chain.
  • An additional Tree<H> to be able to follow parent -> child relationship when pruning the tree if the best chain length > max_length

Some tests are still failing with this approach but I have good hope to converge on something that's working. => We will still have to confirm that the performances are ok on large trees.

2025-08-19

Network layer

  • Spent some time in between other stuff reworking the chain_sync_client, ensuring all expected behaviour is covered by tests. The good news is that I got a n2n program that can sync a remote node using N2N protocol and the chain_sync_client. The less good news, or the puzzling news, is that it seems not straightforward to implement pipelining using pallas-network. Or rather that pallas-network does not provide pipelining out of the box.
  • Not sure how much this is important for us right now, the networking is not the bottleneck anyway, unless we end up in high latency connections
    • Perhaps something to experiment with using qc to increase latency artifically and see how amaru behaves?
    • when latency is high, pipeline is a way to "hide" or rahter "amortize" it
  • Had a quick discussion with SC who says that's definitely not implemented and not taken into account in the state machine for the client, would be interesting to see if I can design a pipelined chain sync client directly in Pallas? Is this interesting with the new framework he is working on anyway?
  • Current goal is to have feature parity with what we have now, while removing explicit locking and adding test coverage, then I will move other parts of the network stuff (eg. chain_forward)
  • Ironically I spent 2-3 hours in the morning with HAL team working on a N2N chain sync client for the adversarial node, and it's been far from trivial to even have something compiling that would connect to the server and start syncing.

Chain selection [ET]

  • Fixed the failing case of a rollback with 3 peers that was found via the property tests => now all the tests pass with nb peers = 4 and 100000 cases (including cases where the chain grows larger than max_length).
  • Commented, documented the test code and removed some unused functions
  • I thought a bit more about the case mentioned at the end of the README:
    • I added a failing test for it.
    • I think that we don't have to track all the peers for now but rather the best chains tips (maybe via an additional max-heap?).
      • Keeping track of the peers only seems useful to detect badly behaved peers and deal with them which is slightly outside of the current scope.
    • Once we make the failing test pass, we should investigate the performances and fix whatever is currently too naive. The cool thing is that we have a good test harness now (labeling the data generation would be cool though).

2025-08-18

Chain selection [AB + ET]

  • The property test with random walks on a pre-generated random header tree has found 2 bugs => They are now fixed and have corresponding unit tests
  • The data generation has been modified to only generate rollback that don't go too far in the past, because this makes expectations hard to handle.
  • We implemented better Display instances to be able to pick the inputs of a failed proptest and turn them easily into a unit test (this needs a bit of refactoring).
  • Running the property tests with 2 peers, long chains and a max_length < generated tree depth, passes ok with a large number of cases (100000).
  • However we found two issues with 3 peers:
    • One missing switch to fork.
    • One early switch to fork corresponding to the last warning that Arnaud highlighted in the README. => This will probably require to keep some additional state.
  • To be continued

2025-08-15

Separate Network layer

  • We expect soon the newly improved version of pallas-network to start being integrated in amaru, with the p2p logic and the cleaner separation of low-level I/Os and state management that @santicarmu presented us a couple of months ago
  • It therefore seemed to make sense to extract the network logic in its own crate, alongside consensus and ledger, as it's now mixed up with stages construction and management in pull.rs and forward_chain
  • I extracted a amaru-network crate, first moving the network code in pull.rs there
  • Doing so I realized the way we pulled chains from our peers was too complex and inefficient:
    • We rely on a lock to decide whether or not to wait or to request a new header, but this should not be needed as the underlying protocol already implements a state machine and is blocking when waiting
    • dolos is a good example of chain sync implementation which does not require a lock
  • I have started refactoring the chain_sync_client in that direction, first working on testing infrastructure which was somewhat annoying to set up as it requires to mock the underlying network primitives and therefore async traits which is not supported out-of-the-box by Rust
    • I managed to get a first simple test working 💪 with some simple precanned messages delivery

Chain selection [AB + ET]

More work on testing the chain selection with random trees of headers:

  1. We tried to implement some data generation where we generate a tree of headers, and as we unfold it we generate the corresponding SelectRollForward and SelectRollback actions -> this is too complicated!
    • We had a hard time fixing compilation errors while writing a recursive Strategy producing function, with a seemingly never ending stream of type and borrow checker errors, fixed with several layers of clone() into_iter() and what not...
    • Although it seemed sensible to try to leverage proptest's infrastructure to generate our chains, in order to benefit from shrinking and make it possible to choose different possible paths within the generation process
    • But We realized that the strategy of mixing up the generation of headers' tree and the generation of each peer's simulated behaviour was a dead end
  2. We reverted to:
    1. Generate a random tree but, this time, not involving the HeadersTree at all (in order to avoid mixing production code in test code), but a simple Tree<H>.
    2. Generate a "random walk" on that tree for a fixed set of peers:
      1. The walk starts at the root of the tree and for each child:
        1. Produces a random walk starting from that child.
        2. Optionally produces a rollback to run at that child (then we resume the walk from the next child in the list of children).
        3. This part has been done with a StdRng seeded with a proptest value for ease of implementation and reproducibility.
  3. [ET] ran it on small cases, depth=3, and this already found a bug!
    1. I added a corresponding unit test (2 peers at the same best chain tip, one rolls back on that chain) and fixed the issue
    2. Now tests pass with depth=3 cases=1000 but fail with depth=4 cases=100 => to be continued...

2025-08-14

Chain selection [ET]

More work on testing the chain selection with random trees of headers:

  1. I did a refactor of the tests to isolate:
    1. The data generation code. I parametrized more functions with a rng parameter in order to really control the generation via proptest.
    2. The simplified TestHeader.
    3. The support functions for the headers arena.
    4. A fluent configuration pattern for the ProptestConfig (strangely the proptest library is missing in that regards).
  2. I investigated the manipulation of seeds with proptest:
    1. It seems that the full seed printed as "cc c49d0698a56362f4808cf5d01c1adaea1a013b53752acfa26137713d3865e448" when it appears on CI needs to be copied verbatim in the PROPTEST_RNG_SEED (which I thought was just PROPTEST_SEED at first). I guess that the initial cc means ChaCha.
  3. We had a discussion about Arnaud about the best way to use a random headers tree to test the chain selection.
    1. What I did is too complicated and runs the risk of hiding failures because it generates data with the thing we want to test in the first place.
    2. I'm going to change what's currently there so that:
      1. We generate only SelectRollForward and SelectRollback values, matching exactly the interface ot the HeadersTree.
      2. We run a simpler selection algorithm to find the best chain based on the ForwardChainSelection events emitted by the HeadersTree.
      3. We assert that the resulting chain is one of the best chain found in the original random tree of headers.
        1. I should probably implement a separate tree structure just for that (I started with this idea at first) instead of reusing the arena and some prod. functions to create the initial tree.
    3. We implement some interesting ways to produce SelectRollback actions based off the initial random tree of headers. That's what I wanted to avoid doing by using a first pass of using a HeadersTree fed only with SelectRollForward events, and producing SelectRollback events.

=> That means that there remains more work than I expected but we should eventually have a clean testing pipeline, where we can: 1. Vary the input events including adversarial events. 2. Possibly a generalization to an arbitrary IsHeader in order to test different scenarios at the unit or integration level(s). 3. Just check the correctness of the HeadersTree outputs.

2025-08-13

Chain selection [ET]

There is now a test in place that:

  1. Generates a tree of headers.
  2. Sets a peer on top of each tip of the tree
  3. Transforms that tree into a set of "actions" that are sent to "our" node, as if they were coming from different peers nodes
    1. The actions are shuffled to simulate concurrency but topologically sorted so that we respect the parent->child relationship for a given peer.
  4. Those actions are used as roll forward action to initialize our node state, using the chain selection algorithm.
  5. Executing those roll forward commands generate events as an output of the chain selection algorithm: new tip, no change, rollback.
  6. Those events are sent to a downstream node (just a brand new HeadersTree in the scope of this test) and executed as select_roll_forward or select_rollback (no change events are obviously ignored).
  7. When everything has been executed we check that:
    1. Our node has found a best chain that is the length of the upstream tree best chain (it might be a different branch of the tree though).
    2. Our best chain and the downstream best chain are the same.

In order to implement this test more easily I had to create a TestHeader that is a bit different from the existing FakeHeader in the sense that it has a hash field that is returned directly by the IsHeader::hash() method. That makes it easier to build parent->child header relationships, whereas the IsHeader::hash() on a FakeHeader is a hash of the serialized data and is hard to control.

Now that the test is passing I'm going to do some refactoring because the headers_tree.rs file has grown out of proportions :-).

2025-08-12

Chain selection [ET]

  • The PR is now ready for review.
  • We need to generate random header trees for testing the chain selection:
    • I tried to move the generators I implemented yesterday to proptest.
    • The "difficult" case is the question of how to deal with recursivity, especially with mutable state since we need to create an arena of header nodes inside the HeadersTree. So I tried instead to generate a simple tree structure using the prop_recursive combinator of proptest.
    • But it doesn't really do what it says on the tin! I even took the provided JSON example in the doc and test it. The doc says that it is possible to set the number of nodes in a tree to an approximate number, and the depth to a precise number but this wasn't the case
    • Eventually I reverted to something much simpler: generating a random HeadersTree using a StdRng with a seed and get that seed from proptest. So if a proptest fails, with the proptest seed we should generate the same seed we give the StdRng and get back to same HeadersTree as a counter-example. The only thing we lose is shrinking. => The next step is now to generate random actions on that tree: roll forward, rollback for various peers. => I'm thinking of:
      • Generating a tree of headers
      • Creating lists of roll forward events for that tree, for each path from the root to each tip of the tree
      • Feed a random list of those events to the selection algorithm -> get back another list of events, which should contain some rollbacks.
      • Feed that new list of events to the selection algorithm and check if we eventually get the correct best chain.

2025-08-11

Chain selection [AB + ET]

  • The HeadersTree is now integrated to the rest of the code. There is still an issue though with its use during the simulation around some assumptions of what is the proper initial state (with an initial header or not, with some existing peers or not). => This needs to be fixed before the PR is submitted for review. => We tried to start the node after bootstrap and getting chain sync events from just one peer and it works ok. Notes on using the README to start a node: => It would be nice to explain why we do each step and what it does succinctly. => It would also be good to give some examples of publicly accessible nodes to sync from or how to find them. => It would be nice to have environment variables set-up by default with clap. For example: PEER_ADDRESS

  • There is now a function to generate random trees of blocks. It has to be retro-fitted though to work with proptest using the appropriate combinators. => The next step will be to use it to simulate upstream chain events.

2025-08-08

Chain selection [AB + ET]

  • We finished writing the test for the rollback.
  • We implemented a nicer way to pretty print the HeadersTree, which is a must-have for debugging.
  • All the tests are now passing and the code has been refactored / simplified a lot to make the overall process for chain selection clear:
    1. A new header is inserted in the tree structure or / and a peer pointer to that structure is updated.
    2. Based on the changes to the tree we compute what is the new best chain / best peer, and store that information.
    3. Based on the action, we determine the kind of event to emit downstream: no change, fork, new tip, rollback.
    4. We trim unused nodes in the arena to reclaim space when possible

Of course the devil is in the details, manipulating node indices, navigating the tree and being able to accurately determine the before / after situation :-).

2025-08-07

Weekly simulation testing update (by SA)

I've been working on prototyping how to implement simulation of time, as outlined in the simulation testing EDR. This includes partitions, timers going off causing retires, all simulated so that we don't have to wait for the timeouts. I think I've got something that works, need to clean up the code and document it.

Chain selection [AB + ET]

We implemented the rollback today. The code is still being refactored to make sure that:

  • We track just enough data to support roll forward / rollback.
  • We don't duplicate behaviours across those 2 events.

The algorithm is roughly:

  1. Add a header to the arena if necessary.
  2. Get the node id of where a peer is now at (possibly the node id created by the previous addition).
  3. Calculate the type of action that happens: new tip, no change, fork, rollback,...
  4. Update the tracking state based on that action (who is the best peer now, where is the best tip).
  5. Emit a downstream event depending on the upstream one (roll forward or rollback)

2025-08-06

Trimming headers' tree on chain growth [AB + ET]

Data structure

We discussed the structure of the headers tree in more details, esp. to trim/prune candidates upon growth beyond k

  • We realized that under the current Amaru requirements, we should always start data structure (volatile) prefilled with "trusted" k headers
    • This implies that we won't be able to join a testnet right away at genesis
    • This might simplify a lot the design of the chain selection because we can then assume that any header we receive that does not link to a header in this headers tree must be rejected as a potentially "adversarial" fork (at least, the peer is misbehaving and not anymore on the right chain)
    • This also simplifies Amaru because it means we don't need to care about Genesis 😬 ?
    • Practically in the current design, this means that we don't need to track different anchors per fragments
    • However, we can start with an empty block tree, either anchored at genesis (Testnet case), or at some other header (production case) that is far enough in the past to be guaranteed to be beyond 2k blocks
      • In this case, we are trusting this anchor/starting point from our bootstrap process
  • In the end, we'll still allow to start with an empty data structure and keep the rule that any new header (or rollback) must point to a header in the headers tree
    • When the best chain is shorter than k, we assume the root of the headers tree is "genesis-like"
    • Interestingly, running code coverage highlights the fact we don't have a testcase for the case where there's no best chain (it's an Option<> in the structure)
    • So just like what's done in chain_Selection module, we used a Tip to point to that best chain, which is always defined, and set it to something else than Origin when initializing the structure with definite peers and some headers.

Memory control

  • The HeadersTree can be initialized with a capacity to reserve enough memory from the start.
  • We implemented and tested the fact that the HeadersTree structure will not grow indefinitely when new headers are being added:
    • Indeed when the best chain is trimmed, a new anchor is defined for that chain (that becomes the new tree root), and the memory slot taken by the original root node will be reused by the arena backing the HeadersTree for subsequent insertions.

Testing panics [ET]

Tests that are executed with #[should_panic] print a stack message in the console which can be confusing when a test is passing correctly.

  • I added a must_panic macro to silence the panic hook just for the test.
  • That macro also accepts an expected message.

Disappearance of a mysterious traveler (@KtorZ)

While working on #369 - Add more preview snapshots, I run into an interesting case of an account delegated to a DRep which mysteriously ends up being undelegated after a while -- without any action from the DRep or the account. Chasing that down, it turns it's a known bug from the Haskell ledger, which we also knew of, but that had more consequences than we anticipated.

Context

  • We have the following account: 3dd92899f772d4616a1a7fcbf1c4b161deb8f7eecc77d6bbecee62ac (verification key);
  • Which delegates to a DRep f0ed00410031f3288d7889aa896cfdad79a7441885d3bae8982ac151 in epoch 669;
  • In epoch 680, however, the Haskell ledger reports this account as not-delegated; thus impacting the voting stake distribution for that epoch and causing an error in our snapshot tests.
  • The DRep in epoch 680 is still active, and there's no re-delegation certificate pertaining to the account in between epoch 669 and 680, nor there's any DRep de-registration pertaining to that DRep.

So what's going on?

An apparently unrelated cause

If we carefully monitor the changes in the Haskell ledger, and compare the ledger state after each block application. We discover that the delegation is being reset after a block at slot 58810216 in epoch 680. That block contains two transactions:

  1. A first one that is a simple payment from a wallet to another;
  2. A second, which contains a DRep de-registration certificate for the DRep: 4589ce916b6aba99aa5531364aa2623d7e0a9e3c35fffe6700bb0e86

Interestingly, the account under scrutiny has been delegated to this DRep by the past; briefly, in epoch 656, but was re-delegated 22 times since. So what's going on?

A known bug with more consequences

The version 9 of the protocol contained a synchronization discrepancy between two data sources related to DRep delegations:

  • On the one hand, the ledger uses a structure akin to a map (a.k.a UMap) from stake credential to an optional DRep (amongst other things).
  • In another part of the ledger is stored a map of drep credential to drep state; where states contain a list of delegators credentials.

The same information can be derived from both structures; so it is paramount to keep them in sync... which was not the case during re-delegation in version 9. Prior to the fix in version 10, any re-delegation would only modify the UMap, but will leave DRep's delegators list untouched. This is visible when querying the ledger state via the GetDRepState query, which will contain duplicates in the delegators of some DRep (thus indicating an account being delegated to multiple DReps, in principle...).

So, since our account has in the past been delegated to 4589ce916b6aba99aa5531364aa2623d7e0a9e3c35fffe6700bb0e86; when it was re-delegated, it remained in the list of delegators for that DRep. So when that DRep unregistered, it cleared the state of all accounts that were one day delegated to it.

2025-08-05

Chain selection improvements

The roll forward behaviour for new headers is now implemented with an improved algorithm:

  • In case of a new tip on the best chain by another peer (than the one which contributed last to the best chain), no rollback event is created.
  • In case of a fork, we compute the closest intersection point between the new best chain and the previous one to return
    • The intersection header
    • The new headers after this intersection header

2025-08-04

Eric's onboarding

  • Read, read, read on Cardano / Ourobouros:
  • Read the Engineering Decsion Records.
  • Watched the Pragma demos (they're getting longer and longer, is the next one going to take 2 hours :-)?).
  • Made a first pass on the code, contributed a first syntactic PR.
  • Had a meeting with RK to discuss pure-stage and the next steps for its integration:
    • How to integrate storage as an effect?
    • How to progressively integrate pure-stage to the rest of the codebase (and replace gasket)?

Chain selection improvements

The chain selection process logically works on a tree of headers representing chains contributed by the node's peers.

The current data structure representing those chains duplicates headers that are common in the tree, which is both wasteful and not the best for determining common ancestors.

We (AB and ET) started using an arena-back tree implementation to store those headers using the indextree library.

Clone this wiki locally