We're running smoothly again on Medalla. #2601 and #2616 still need to be completed as follow-ups, but I will close this tracking ticket to keep things tidy. Those individual tickets will remain open and will get done when we can.
Description
Parent issue to track the work required to handle the non-finalisation on Medalla.
Quick notes may be made inline but should be moved to their own issue once we start to flesh them out so we have a clear record of what was wrong and what we did to fix that specific issue.
CPU and Memory Consumption
Add state regeneration queue to limit concurrent regenerations #2589 An excessive number of state regenerations were in progress concurrently resulting in excessive CPU and memory usage
Process gossip off of the netty thread #2583 Netty threads were blocked processing gossip messages. Moved to thread pool.
P2P Executor Queue Filling Up #2598 P2P executor queue is filling up
Use cache to get block and state #2610 Same state is being regenerated multiple times
Use cache to get block and state #2610 Significant number of state regenerations that replay 1 block. Why are these not hitting cache?
Persist hot states periodically #2608 Persist hot states periodically to reduce the number of blocks that need to be replayed when regenerating states that have dropped out of the in-memory cache
Don't reprocess blocks at startup #2615 Skip processing blocks at startup if possible
Optimize checkpoint processing #2609 Optimize checkpoint state generation
State Cache Size Exceeding Limit #2612 State cache exceeding maximum size
Fork choice
ProtoNode: Delta to be subtracted is greater than node weight. #2562 Error in fork choice. Was happening prior to Medalla hitting issues but seems much more common now.
Use cache to get block and state #2610 Many fork choice updates are being skipped with message "Skipping update of chain head" because of interim updates.
Teku following incorrect chain #2614 Reports that the Teku instances that are in sync are following a different chain to what should be canonical head
Sync Issues
Be more lenient before deciding a peer is excessively throttling requests #2590 Excessive throttling checks were kicking in too often because of all the empty slots on Medalla
Check latest peer status while syncing #2586 Check latest peer status while syncing
find a common ancestor before initiating a sync. #2600 Each time we select a new peer to sync from we start from the finalised checkpoint. We should find a more recent common ancestor if the finalised checkpoint is significantly in the past
Other
Avoid accessing store each time metrics are retrieved #2584 Metrics were creating high contention on the Store locks
deadlock when updating finalized checkpoint #2625 deadlock when updating finalized checkpoint
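For illustration only, the idea behind #2600 (start syncing from a recent common ancestor rather than the finalised checkpoint) could be sketched roughly as below. This is not Teku's implementation: it is written in Python for brevity, and `find_common_ancestor_slot`, `peer_root_at_slot` and `max_probes` are hypothetical names. It assumes we know our own block root per slot and can ask the peer for its block root at a given slot.

```python
from typing import Callable, Dict, Optional


def find_common_ancestor_slot(
    local_roots: Dict[int, str],
    peer_root_at_slot: Callable[[int], Optional[str]],
    finalized_slot: int,
    max_probes: int = 32,
) -> int:
    """Return a recent slot whose block root matches on both chains.

    local_roots maps slot -> block root for blocks we hold (empty slots are
    simply absent). peer_root_at_slot is a hypothetical lookup that returns
    the peer's block root at a slot, or None if it has no block there.
    Falls back to finalized_slot when no match is found within max_probes.
    """
    # Probe our most recent blocks first, ignoring anything at or before
    # the finalised checkpoint.
    candidate_slots = sorted(
        (slot for slot in local_roots if slot > finalized_slot), reverse=True
    )
    for slot in candidate_slots[:max_probes]:
        # A matching root means both chains share this block and its ancestors.
        if peer_root_at_slot(slot) == local_roots[slot]:
            return slot
    # No recent match: sync from the finalised checkpoint as before.
    return finalized_slot


if __name__ == "__main__":
    ours = {100: "a", 101: "b", 103: "c", 104: "d"}
    theirs = {100: "a", 101: "b", 103: "x", 104: "y"}
    print(find_common_ancestor_slot(ours, theirs.get, finalized_slot=90))
```

Capping the number of probes keeps the search cheap when the peer is on an unrelated fork; in that case the sketch simply falls back to the finalised checkpoint, which is the behaviour described before the change.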