feat: add SkipMetricsTracking property to DbSettings by cbermudez97 · Pull Request #11515 · NethermindEth/nethermind

cbermudez97 · 2026-05-07T14:22:54Z

Changes

Introduced SkipMetricsTracking property in DbSettings to control metrics tracking for specific databases.
Updated FullPruningInnerDbFactory to set SkipMetricsTracking to true for inner databases to prevent stale references after pruning.
Added unit tests to verify the behavior of metrics tracking based on the new property.
Enhanced DbMonitoringModule to respect the SkipMetricsTracking setting when adding databases to the tracker.

Types of changes

What types of changes does your code introduce?

Bugfix (a non-breaking change that fixes an issue)
New feature (a non-breaking change that adds functionality)
Breaking change (a change that causes existing functionality not to work as expected)
Optimization
Refactoring
Documentation update
Build-related changes
Other: Description

Testing

Requires testing

Yes
No

If yes, did you write tests?

Yes
No

claude · 2026-05-07T14:23:27Z

Claude finished @cbermudez97's task in 3m 42s

PR Review: `feat: add SkipMetricsTracking property to DbSettings`

Summary

The fix is correct and well-designed. The root cause is clear: FullPruningInnerDbFactory creates indexed sub-DBs that get registered into the DbTracker via the DbFactoryInterceptor. After a pruning cycle, the old inner DB is discarded but still held in the tracker — a stale reference. The solution is sound:

FullPruningInnerDbFactory sets SkipMetricsTracking = true on all inner DB settings so the interceptor never registers them.
WorldStateModule manually registers the outer FullPruningDb wrapper once with a stable name. Since FullPruningDb.GatherMetric() delegates to _currentDb.GatherMetric(), the tracker always reads from whatever inner DB is currently active — automatically correct across pruning cycles.
The DbFactoryInterceptor correctly gates on !dbSettings.SkipMetricsTracking for both CreateDb and CreateColumnsDb.

The propagation through DbSettings.Clone() is safe: MemberwiseClone copies the field, then GetRocksDbSettings explicitly overwrites it to true after cloning.
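The mechanism summarized above can be sketched in a few lines. This is a minimal model, not the Nethermind implementation: every type here (`ISketchDb`, `SketchSettings`, `SketchInnerDb`, `SketchPruningWrapper`, `SketchTracker`, `SketchFactory`) is a hypothetical stand-in, and only the names `SkipMetricsTracking`, `GatherMetric` and `_currentDb` are taken from the review. It assumes the tracker keys entries by name and that the wrapper swaps its inner DB when a pruning cycle completes.

```csharp
using System;
using System.Collections.Generic;

// Usage: inner DBs opt out of tracking, the stable wrapper is registered once.
var tracker = new SketchTracker();
var factory = new SketchFactory(tracker);

var wrapper = new SketchPruningWrapper(
    factory.Create(new SketchSettings("State0") { SkipMetricsTracking = true }, reads: 10));
tracker.TryAdd("State", wrapper);

Console.WriteLine(tracker.Read("State")); // 10, read from the first inner DB
wrapper.SwapTo(factory.Create(new SketchSettings("State1") { SkipMetricsTracking = true }, reads: 3));
Console.WriteLine(tracker.Read("State")); // 3, read from the new inner DB
Console.WriteLine(tracker.Count);         // 1, only the wrapper was ever registered

// Hypothetical stand-ins below; none of these are Nethermind types.
interface ISketchDb { long GatherMetric(); }

sealed record SketchSettings(string Name)
{
    public bool SkipMetricsTracking { get; init; }
}

sealed class SketchInnerDb : ISketchDb
{
    private readonly long _reads;
    public SketchInnerDb(long reads) => _reads = reads;
    public long GatherMetric() => _reads;
}

// Stable wrapper: metrics always come from whichever inner DB is current.
sealed class SketchPruningWrapper : ISketchDb
{
    private ISketchDb _currentDb;
    public SketchPruningWrapper(ISketchDb initial) => _currentDb = initial;
    public void SwapTo(ISketchDb next) => _currentDb = next;
    public long GatherMetric() => _currentDb.GatherMetric();
}

sealed class SketchTracker
{
    private readonly Dictionary<string, ISketchDb> _dbs = new();
    public int Count => _dbs.Count;
    public bool TryAdd(string name, ISketchDb db) => _dbs.TryAdd(name, db);
    public long Read(string name) => _dbs[name].GatherMetric();
}

// Plays the role of the interceptor: registers a DB unless the settings opt out.
sealed class SketchFactory
{
    private readonly SketchTracker _tracker;
    public SketchFactory(SketchTracker tracker) => _tracker = tracker;

    public ISketchDb Create(SketchSettings settings, long reads)
    {
        ISketchDb db = new SketchInnerDb(reads);
        if (!settings.SkipMetricsTracking) _tracker.TryAdd(settings.Name, db);
        return db;
    }
}
```

Because the tracker only ever holds the wrapper, discarding an inner DB after pruning leaves nothing stale behind in this model.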

Findings

Severity	Count
Critical	0
High	0
Medium	0
Low	3

Low — SkipMetricsTracking lacks XML docs (DbSettings.cs:15)
The coding style guide requires <summary> on all public APIs. The semantics here are non-obvious (callers don't know what the flag affects), so a one-line doc comment would help. Other properties in the same class are also undocumented, so the omission matches the existing pattern.

Low — TryAdd silently drops duplicate registrations (WorldStateModule.cs:45)
Not a production issue (singletons aren't recreated), but worth being aware of in test teardown/rebuild scenarios.

Low — No end-to-end regression test for the stale-reference bug (DbTrackerTests.cs:81)
TestSkipMetricsTracking validates the mechanism, but there's no test that exercises the full scenario: create FullPruningDb → trigger pruning cycle via TryStartPruning/Commit → assert tracker still reads from the new inner DB. The unit test is sufficient to merge, but an integration test would give stronger confidence.
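The duplicate-registration finding above is easy to reproduce in isolation. The sketch below uses `ConcurrentDictionary<TKey, TValue>.TryAdd` from the .NET base library; whether the tracker is actually backed by a `ConcurrentDictionary` is an assumption here, and the key and value strings are made up for illustration.

```csharp
using System;
using System.Collections.Concurrent;

// Assumed shape: the tracker keys registrations by a stable DB name.
var tracked = new ConcurrentDictionary<string, string>();

bool first = tracked.TryAdd("State", "wrapper-from-first-container");
bool second = tracked.TryAdd("State", "wrapper-from-rebuilt-container");

Console.WriteLine(first);            // True
Console.WriteLine(second);           // False: the duplicate is dropped without an exception
Console.WriteLine(tracked["State"]); // wrapper-from-first-container
```

In a test that rebuilds the container, the second registration would be ignored in the same way, so the tracker would keep pointing at the instance from the first build unless the entry is cleared between runs.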

Verdict: ready to merge. All findings are low-severity and none block correctness.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a SkipMetricsTracking flag to DbSettings and updates DB monitoring to avoid tracking ephemeral inner DB instances created during full pruning, preventing stale references and improving resilience when metric gathering fails.

Changes:

Add DbSettings.SkipMetricsTracking and honor it in DbMonitoringModule’s DB factory interceptor.
Mark full-pruning inner DB instances as skipped, and register the stable FullPruningDb wrapper with the tracker.
Add unit tests for skip-tracking behavior and for per-DB metric gather exception isolation.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
src/Nethermind/Nethermind.Init/Modules/WorldStateModule.cs	Registers the `FullPruningDb` wrapper in the tracker to keep metrics tied to the active inner DB across prune cycles.
src/Nethermind/Nethermind.Init/Modules/DbMonitoringModule.cs	Skips tracking based on `SkipMetricsTracking` and makes metric updates resilient to per-DB exceptions.
src/Nethermind/Nethermind.Db/FullPruning/FullPruningInnerDbFactory.cs	Sets `SkipMetricsTracking = true` for indexed inner DB instances.
src/Nethermind/Nethermind.Db/DbSettings.cs	Introduces the `SkipMetricsTracking` property.
src/Nethermind/Nethermind.Db.Test/Nethermind.Db.Test.csproj	Adds test dependency on `Nethermind.Init` to access `DbMonitoringModule`/tracker types.
src/Nethermind/Nethermind.Db.Test/DbTrackerTests.cs	Adds tests for skip-tracking and exception isolation during metrics update.

fix: address PR feedback for db metrics tracking

- Add XML doc to DbSettings.SkipMetricsTracking property
- Clarify WorldStateModule comment for both FullPruningInnerDbFactory and MemDbFactory branches
- DbMonitoringModule: clear stale dictionary entries on GatherMetric failure and log only once per failure streak (with recovery info log)
- DbTrackerTests: add [TearDown] to reset shared static metrics keys, collapse double enumeration in TestSkipMetricsTracking, and add FullPruningDbTrackedWrapper_SurvivesPruningCycle integration test

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

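The "log only once per failure streak" behavior from the feedback commit can be sketched as follows. This is a hedged illustration, not the code in DbMonitoringModule: `SketchMetricsUpdater`, its constructor, and the log strings (including the "ERROR" level for failures) are assumptions. Only the `_failingDbs` name and the `HashSet` type come from the commit titles.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();
var metrics = new Dictionary<string, long>();
var updater = new SketchMetricsUpdater(metrics, log.Add);

bool broken = true;
Func<long> gather = () => broken ? throw new InvalidOperationException("db closed") : 42L;

metrics["State"] = 7;            // value left over from an earlier successful tick
updater.Update("State", gather); // first failure: logs once and clears the stale value
updater.Update("State", gather); // same streak: stays silent
broken = false;
updater.Update("State", gather); // recovery: logs once and publishes the fresh value

Console.WriteLine(string.Join(" | ", log));
Console.WriteLine(metrics["State"]);

// Hypothetical stand-in for the per-DB part of a metrics update loop.
sealed class SketchMetricsUpdater
{
    private readonly Dictionary<string, long> _metrics;
    private readonly Action<string> _log;
    private readonly HashSet<string> _failingDbs = new();

    public SketchMetricsUpdater(Dictionary<string, long> metrics, Action<string> log)
    {
        _metrics = metrics;
        _log = log;
    }

    public void Update(string dbName, Func<long> gatherMetric)
    {
        try
        {
            _metrics[dbName] = gatherMetric();
            // HashSet.Remove returns true only if the DB was in a failure streak.
            if (_failingDbs.Remove(dbName)) _log($"INFO {dbName} metrics recovered");
        }
        catch (Exception e)
        {
            _metrics.Remove(dbName); // drop the stale value instead of republishing it
            // HashSet.Add returns true only on the first failure of a streak.
            if (_failingDbs.Add(dbName)) _log($"ERROR {dbName} metrics failed: {e.Message}");
        }
    }
}
```

The return values of `HashSet.Add` and `HashSet.Remove` carry the streak state, so no separate counter or timestamp is needed in this sketch.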

* feat: add SkipMetricsTracking property to DbSettings
  - Introduced SkipMetricsTracking property in DbSettings to control metrics tracking for specific databases.
  - Updated FullPruningInnerDbFactory to set SkipMetricsTracking to true for inner databases to prevent stale references after pruning.
  - Added unit tests to verify the behavior of metrics tracking based on the new property.
  - Enhanced DbMonitoringModule to respect the SkipMetricsTracking setting when adding databases to the tracker.
* fix: address PR feedback for db metrics tracking
  - Add XML doc to DbSettings.SkipMetricsTracking property
  - Clarify WorldStateModule comment for both FullPruningInnerDbFactory and MemDbFactory branches
  - DbMonitoringModule: clear stale dictionary entries on GatherMetric failure and log only once per failure streak (with recovery info log)
  - DbTrackerTests: add [TearDown] to reset shared static metrics keys, collapse double enumeration in TestSkipMetricsTracking, and add FullPruningDbTrackedWrapper_SurvivesPruningCycle integration test
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: dedupe DbTrackerTests container setup and metric-map iteration
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: Initialize _failingDbs with an empty HashSet

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: lukasz.rozmej <lukasz.rozmej@gmail.com>

@LukaszRozmej

* RLP tx decoding fixes (#11496)
* Handle null txs
* PR feedback
* Revert `BlobTxStorage.TryDecodeFullTx` changes
  Silent `false` may be worse than NRE on DB corruption
  ---------
  Co-authored-by: Alexey Osipov <me@flcl.me>
* eth/71 (#10844)
* Improve eth/70 checks (#11456)
* Improve eth/70 verification
* Comments
* Gas remake
* Fix full sync
* fix: prevent negative RequestSize crash when beacon pivot destination advances mid-sync (#11478)
* fix: prevent negative RequestSize crash when beacon pivot destination advances mid-sync
  `HeadersSyncFeed.ShouldBuildANewBatch` checked `_lowestRequestedHeaderNumber == HeadersDestinationNumber`. For beacon headers, `HeadersDestinationNumber` is `BeaconPivot.PivotDestinationNumber`, which tracks `Head.Number - Reorganization.MaxDepth + 1` and so advances upward as the chain head progresses. When it stepped above `_lowestRequestedHeaderNumber` mid-sync, the `==` check missed it, `BuildNewBatch` produced a negative `RequestSize`, and `HeaderStore.FindReversedHeaders` crashed with `ArgumentOutOfRangeException` on `new Dictionary<>(negativeCount)`.
  Widen the guard to `<=` and add a regression test that reproduces the scenario via mocked `IBeaconPivot`.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: shorten inline comments per review
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  ---------
  Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add new default for gnosis and gnosis archive config (#11269)
  feat: add Db.SkipCheckingSstFileSizesOnDbOpen=true default for gnosis and gnosis archive
* Alchemy - Code Fix (#11714)
* feat: add SkipMetricsTracking property to DbSettings (#11515)
* feat: add SkipMetricsTracking property to DbSettings
  - Introduced SkipMetricsTracking property in DbSettings to control metrics tracking for specific databases.
  - Updated FullPruningInnerDbFactory to set SkipMetricsTracking to true for inner databases to prevent stale references after pruning.
  - Added unit tests to verify the behavior of metrics tracking based on the new property.
  - Enhanced DbMonitoringModule to respect the SkipMetricsTracking setting when adding databases to the tracker.
* fix: address PR feedback for db metrics tracking
  - Add XML doc to DbSettings.SkipMetricsTracking property
  - Clarify WorldStateModule comment for both FullPruningInnerDbFactory and MemDbFactory branches
  - DbMonitoringModule: clear stale dictionary entries on GatherMetric failure and log only once per failure streak (with recovery info log)
  - DbTrackerTests: add [TearDown] to reset shared static metrics keys, collapse double enumeration in TestSkipMetricsTracking, and add FullPruningDbTrackedWrapper_SurvivesPruningCycle integration test
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: dedupe DbTrackerTests container setup and metric-map iteration
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: Initialize _failingDbs with an empty HashSet
  ---------
  Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  Co-authored-by: lukasz.rozmej <lukasz.rozmej@gmail.com>
* Fix DbTracker repeatedly logging ObjectDisposedException after disposal (#11720)
* Fix DbTracker repeatedly logging ObjectDisposedException after disposal
  When the Autofac LifetimeScope (or the shared cache SafeHandle) is disposed while MonitoringService's timer is still scheduled, `_sharedBlockCache.Value` in `UpdateDbMetrics` throws `ObjectDisposedException` via Autofac's LazyRegistrationSource. The generic catch logs it at Error and the callback stays registered, so the same exception re-fires on every metric interval, producing dozens of identical errors per minute on affected nodes.
  Catch `ObjectDisposedException` explicitly and short-circuit subsequent ticks via a `_stopped` flag.
  Adds a regression test that disposes the container and asserts the callback neither throws nor logs on repeated invocations.
  Fixes #11719
* Address review: debug-log first stop, drop redundant CreateDb in test
  - Log at Debug level in the new `ObjectDisposedException` branch so there is a (no-cost on production) signal that DbTracker has stopped updating metrics, rather than only inferring it from the absence of further Error logs.
  - Remove the duplicate `CreateDb` call in the regression test; the helper `ConfigureMetricUpdater` already registers the test DB.
  - Disable `TestLogger.IsDebug` in the regression test so the new Debug message does not trip the `LogList.Should().BeEmpty()` assertion; the test still asserts no Error-level spam, which was the bug.
* Address review: make DbTracker IDisposable, drop redundant comment
  - Implement IDisposable on DbTracker so Autofac proactively sets _stopped during scope teardown, short-circuiting subsequent monitoring ticks before they touch disposed resources. The catch (ObjectDisposedException) remains as a backstop for the race where a tick is already executing when Dispose runs.
  - Mark _stopped as volatile since it is now written from the disposing thread and read from the monitoring timer thread.
  - Drop the inline comment in the catch block; the Debug log message already conveys the same information.
* Fix Eth69/Eth70 receipt tests for null-means-unknown contract
  After dropping the FindHeader pre-check, the response loop relies solely on GetReceipts returning null to detect an unknown block. The two "unknown block hash" tests still mocked the old contract (FindHeader returns null + GetReceipts returns []), so the loop saw [] as a legitimate zero-tx block and kept going instead of breaking.
  Update the mocks to return null for unknown hashes, matching the ISyncServer.GetReceipts contract (null = unknown, [] = exists w/ 0 txs).
* Make EraE tests visible and green (#11727)
* fix(eth/70): reject null receipt payloads (#11615)
* fix(eth/70): reject null receipt payloads
* fix(eth/70): validate receipt payloads while decoding
* refactor(eth): move null receipt validation into base serializer
  Apply the validation in V63 ReceiptsMessageSerializer so eth/63, eth/66, eth/69 and eth/70 all reject null receipt payloads at decode time.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  ---------
  Co-authored-by: lukasz.rozmej <lukasz.rozmej@gmail.com>
  Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  (cherry picked from commit fd2fd25)
* fix(eth/70): stop response early when block has txs but no receipts
  FulfillReceiptsRequest used to emit `txReceipts.Add([])` for any block where SyncServer.GetReceipts(hash) returned empty, regardless of whether the block actually had zero transactions. The eth/70 receiver validates segment-complete responses against its own transaction count and throws SubprotocolException ("Receipt count mismatch with block transactions count") + disconnects the peer when an [] arrives for a block that locally has transactions.
  That made a node that is briefly without receipts (e.g. still syncing receipts, or its receipt store is pruned for the requested block) appear malicious to the requester. We observed this disconnect-storm pattern materially starving receipt-sync on small networks.
  Distinguish the two cases at the sender by looking up the block:
  - block is null or body is missing → we can't safely claim anything; break the response and let the requester ask another peer
  - block.Transactions.Length > 0 → same: we don't actually have the receipts even though we have the body; break
  - block.Transactions.Length == 0 → block is legitimately empty; emit [] as before
  Update Should_return_empty_receipts_block_when_local_block_has_no_receipts → ..._has_no_transactions to reflect the new precondition, and stub SyncServer.Find on two pre-existing empty-receipts-in-the-middle tests so they still represent the legitimate empty case. Add new regression test Should_stop_response_when_local_block_has_transactions_but_no_receipts covering the bug.
  Closes #11752.
  (cherry picked from commit f0f6ea2)
* refactor(eth/70): disambiguate "unknown" vs "legit empty" in ISyncServer.GetReceipts
  Following @LukaszRozmej's review suggestion on #11752: rather than have the protocol handler do a second SyncServer.Find lookup to figure out whether an empty receipts array means "block has zero transactions" or "I don't have the receipts yet", push the disambiguation down to where the data lives.
  ISyncServer.GetReceipts now returns TxReceipt[]?:
  null → receipts are not known locally (block missing, body missing, or receipts not stored). Callers MUST NOT emit [] on the wire.
  empty [] → block is known and legitimately has zero transactions.
  non-empty → receipts for an executed block.
  SyncServer.GetReceipts implements the three cases directly:
  - blockHash is null OR block not found OR block body missing → null
  - block.Transactions.Length == 0 → []
  - block has txs, receipts not stored → null
  - block has txs and receipts → receipts
  Eth70ProtocolHandler.FulfillReceiptsRequest is now a single null-check instead of the previous Find-then-classify dance. SyncPeerProtocolHandlerBase.Fulfill (eth/63-69 path) gets the same fix for free: same bug, same one-line guard.
  Tests: replace the Find-based stubs with GetReceipts-returning-null stubs and update OldStyleFullSynchronizerTests.Can_retrieve_empty_receipts to assert the new contract (genesis → BeEmpty; unknown blocks → BeNull).
  Closes #11752.
  (cherry picked from commit 1d880be)
* Apply suggestions from code review
  Co-authored-by: Lukasz Rozmej <lukasz.rozmej@gmail.com>
  (cherry picked from commit c01295b)
* Apply suggestion from @LukaszRozmej
  (cherry picked from commit 27258b9)
* Drop redundant FindHeader pre-check in receipt response loop
  GetReceipts now returns null for unknown blocks (block missing, body missing, or receipts not stored), so the up-front FindHeader call before GetReceipts is redundant; the `if (receipts is null) break;` below it already handles the unknown case.
  Per @flcl42 review on #11754.
  (cherry picked from commit 742cb1a)
* Fix shutdown race in SnapProvider PLINQ (closes #11806) (#11807)
* Unwrap AggregateException(ObjectDisposedException) from snap PLINQ on shutdown
  When the node is stopped during snap sync, SnapProvider.AddAccountRange's parallel code-existence check (codeHashes.AsParallel().Where(_codeDb.KeyExists)) races RocksDB disposal in the DI container teardown. The resulting ObjectDisposedException is wrapped by PLINQ in an AggregateException, which falls past the snap dispatcher's existing `catch (ObjectDisposedException) → Info("Ignoring sync response as the DB has already closed.")` guard and lands on `catch (Exception) → Error("Error when handling response", e)`.
  The node recovers correctly on restart (this is purely a noisy shutdown log line), but the post-merge fuzz tests' StabilityVerification watchdog scans for non-allowlisted exception lines and fails the test on it, blocking the 1.38 release smoke run.
  Unwrap the AggregateException at the point of throw so the dispatcher's existing benign guard handles it uniformly. No new log path; reuses the already-tested "Ignoring sync response..." Info message.
  Race introduced 2024-03-28 by PR #6873 "Perf/dont redownload downloaded code" (commit 7059b45), latent until the fuzz watchdog started catching it.
* Tidy unwrap: single Flatten, preserve stack via ExceptionDispatchInfo
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Guard against empty InnerExceptions in unwrap filter
  Enumerable.All() returns true vacuously on an empty sequence, which would let the filter pass and then InnerExceptions[0] throw ArgumentOutOfRangeException instead of re-throwing the original.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  ---------
  Co-authored-by: lukasz.rozmej <lukasz.rozmej@gmail.com>
  Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Default Discovery to V4 (#11614)
* Default Discovery to V4
* Update tests
* Activate BAL only when needed (#11795)
* Activate BAL only when needed
* Guard ChangeState against same-state transitions
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Align IsFinished with ShouldFinish; short-circuit cheap checks first
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  ---------
  Co-authored-by: lukasz.rozmej <lukasz.rozmej@gmail.com>
  Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: update Directory.Build.props for 1.38.0

---------

Co-authored-by: Alex <alexb5dh@gmail.com>
Co-authored-by: Alexey Osipov <me@flcl.me>
Co-authored-by: Amirul Ashraf <asdacap@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Co-authored-by: Carlos Bermudez Porto <43155355+cbermudez97@users.noreply.github.com>
Co-authored-by: lukasz.rozmej <lukasz.rozmej@gmail.com>
Co-authored-by: DeFi Junkie <deffie.jnkiee@gmail.com>
Co-authored-by: Ben {chmark} Adams <thundercat@illyriad.co.uk>
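The AggregateException unwrap described in the SnapProvider entries of the commit message above can be illustrated with standard .NET APIs (`AggregateException.Flatten`, `ExceptionDispatchInfo.Capture(...).Throw()`). This is a sketch under assumptions: `SketchUnwrap.Run` is a hypothetical helper, and it uses a plain check inside the catch block rather than the exception filter the commit mentions. It does keep the two properties the commits call out: a single `Flatten` call and a guard against an empty `InnerExceptions` collection.

```csharp
using System;
using System.Linq;
using System.Runtime.ExceptionServices;

try
{
    // Simulates PLINQ wrapping a disposal race in an AggregateException.
    SketchUnwrap.Run(() => throw new AggregateException(new ObjectDisposedException("codeDb")));
}
catch (ObjectDisposedException)
{
    Console.WriteLine("Ignoring sync response as the DB has already closed.");
}

// Hypothetical helper; not the SnapProvider code.
static class SketchUnwrap
{
    public static void Run(Action parallelWork)
    {
        try
        {
            parallelWork();
        }
        catch (AggregateException e)
        {
            AggregateException flat = e.Flatten(); // flatten once, reuse the result

            // All() is vacuously true on an empty sequence, so check Count first.
            bool onlyDisposed = flat.InnerExceptions.Count > 0
                && flat.InnerExceptions.All(inner => inner is ObjectDisposedException);

            if (!onlyDisposed) throw; // anything else propagates unchanged

            // Rethrow the inner exception while preserving its original stack trace.
            ExceptionDispatchInfo.Capture(flat.InnerExceptions[0]).Throw();
        }
    }
}
```

With the unwrap in place, a caller that already has a `catch (ObjectDisposedException)` branch handles the shutdown race without a separate code path for the wrapped form.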

cbermudez97 requested review from LukaszRozmej and asdacap as code owners May 7, 2026 14:22

github-actions Bot added bug fix + reliability database new feature labels May 7, 2026

cbermudez97 requested review from Copilot and removed request for LukaszRozmej and asdacap May 7, 2026 14:23

Copilot AI reviewed May 7, 2026

claude Bot reviewed May 7, 2026

Comment thread src/Nethermind/Nethermind.Db/DbSettings.cs

claude Bot reviewed May 7, 2026

Comment thread src/Nethermind/Nethermind.Init/Modules/WorldStateModule.cs Outdated

claude Bot reviewed May 7, 2026

Comment thread src/Nethermind/Nethermind.Db.Test/DbTrackerTests.cs

cbermudez97 and others added 2 commits May 11, 2026 10:00

Merge branch 'master' into fix/db-metrics

6966311

asdacap reviewed May 12, 2026

Comment thread src/Nethermind/Nethermind.Init/Modules/WorldStateModule.cs Outdated

chore: merge branch 'master' into fix/db-metrics

3ab79ee

cbermudez97 requested a review from asdacap May 21, 2026 16:41

refactor: dedupe DbTrackerTests container setup and metric-map iteration

5219b39

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

LukaszRozmej approved these changes May 22, 2026

cbermudez97 removed the request for review from asdacap May 22, 2026 13:07

cbermudez97 and others added 2 commits May 22, 2026 09:12

Merge branch 'master' into fix/db-metrics

d667e88

fix: Initialize _failingDbs with an empty HashSet

fc2e7d1

cbermudez97 requested review from LukaszRozmej and asdacap May 22, 2026 13:37

benaadams approved these changes May 22, 2026

LukaszRozmej merged commit 27b3119 into master May 22, 2026
542 checks passed

LukaszRozmej deleted the fix/db-metrics branch May 22, 2026 22:34

LukaszRozmej merged 7 commits into master from fix/db-metrics


5 participants
