
fix: bump prometheus, fix missing nearcore metrics #3397

Merged
gilcu3 merged 1 commit into main from 1909-upgrade-prometheus-crate-to-latest-version-01402 on May 29, 2026

Conversation

@gilcu3 gilcu3 commented May 29, 2026

Contributor

Closes #1909 and #3395

I moved some near-sdk features around to make sure the old version of prometheus is not used by the node, and that reduced the contract size a bit. But the big jump must have happened elsewhere.

@gilcu3 gilcu3 linked an issue May 29, 2026 that may be closed by this pull request
@gilcu3 gilcu3 marked this pull request as draft May 29, 2026 13:39

claude Bot commented May 29, 2026

Pull request overview

Bumps the prometheus crate from 0.13.4 to 0.14.0 so the MPC node and the embedded nearcore dependencies share the same global default registry. Because prometheus's default_registry() is a per-crate-version static, having two prometheus versions in the dependency graph meant metrics registered by near-o11y (which already depends on 0.14.0) lived in a registry the /metrics handler never gathered from. After this bump, default_registry().gather() in web.rs:50 returns nearcore's metric families as well as MPC's, restoring the previously missing visibility. A small E2E assertion is added to lock this in.
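
The registry split described above can be modelled with the standard library alone. This is a minimal sketch, not the prometheus crate's code: the two modules are hypothetical stand-ins for the 0.13 and 0.14 copies of the crate, and `mpc_example_total` is an invented metric name. It only illustrates that each compiled copy owns its own global static, so gathering from one copy never sees what was registered in the other.

```rust
// Hypothetical stand-in for one compiled copy of a metrics crate.
// Each expansion gets its own DEFAULT_REGISTRY static, in the same way
// Cargo builds each semver-incompatible crate version with its own statics.
macro_rules! fake_metrics_crate {
    ($name:ident) => {
        mod $name {
            use std::collections::BTreeSet;
            use std::sync::Mutex;

            static DEFAULT_REGISTRY: Mutex<BTreeSet<String>> = Mutex::new(BTreeSet::new());

            /// Record a metric name in this copy's global registry.
            pub fn register(metric: &str) {
                DEFAULT_REGISTRY.lock().unwrap().insert(metric.to_string());
            }

            /// Return every metric name known to this copy, sorted.
            pub fn gather() -> Vec<String> {
                DEFAULT_REGISTRY.lock().unwrap().iter().cloned().collect()
            }
        }
    };
}

fake_metrics_crate!(prometheus_v0_13);
fake_metrics_crate!(prometheus_v0_14);

/// Model of the pre-bump layout: the node registers and gathers through the
/// 0.13 copy while nearcore registers through the 0.14 copy.
pub fn metrics_seen_before_bump() -> Vec<String> {
    prometheus_v0_13::register("mpc_example_total"); // hypothetical node metric
    prometheus_v0_14::register("near_block_processed_total");
    prometheus_v0_13::gather()
}

/// Model of the post-bump layout: both sides share the 0.14 copy.
pub fn metrics_seen_after_bump() -> Vec<String> {
    prometheus_v0_14::register("mpc_example_total");
    prometheus_v0_14::register("near_block_processed_total");
    prometheus_v0_14::gather()
}
```

In this model, `metrics_seen_before_bump()` returns only the node-side name, while `metrics_seen_after_bump()` returns both names, which mirrors the behaviour change the overview describes.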

Changes:

  • Bump workspace prometheus to 0.14.0 and drop the outdated "cannot upgrade" comment.
  • Regenerate Cargo.lock (mpc-node now uses prometheus 0.14.0; the lingering 0.13.4 belongs to a pre-existing transitive near-o11y 0.35.0).
  • Extend test_web_endpoints to assert near_block_processed_total is present on /metrics.
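
A substring assertion of the kind described in the last bullet could look roughly like the following. This is a sketch under assumptions: `missing_metrics` and `SAMPLE_BODY` are hypothetical names, the sample body is hand-written in the Prometheus text exposition format, and the real test in `web_endpoints.rs` may be structured differently.

```rust
/// Hand-written sample of a /metrics response body (illustrative only).
pub const SAMPLE_BODY: &str = "\
# HELP near_block_processed_total Illustrative help text
# TYPE near_block_processed_total counter
near_block_processed_total 0
";

/// Return the expected metric names that do not occur anywhere in `body`.
/// An empty result means every expected substring was found.
pub fn missing_metrics<'a>(body: &str, expected: &[&'a str]) -> Vec<&'a str> {
    expected
        .iter()
        .copied()
        .filter(|name| !body.contains(*name))
        .collect()
}
```

A test would then assert that `missing_metrics(body, &["near_block_processed_total"])` is empty. Returning the missing names, rather than a bare boolean, makes a failure message say which metric disappeared.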

Reviewed changes

Per-file summary
| File | Description |
| --- | --- |
| Cargo.toml | Bump prometheus = 0.14.0; remove comment claiming the bump conflicted with nearcore. |
| Cargo.lock | Updated mpc-node prometheus dep to 0.14.0. |
| crates/e2e-tests/tests/web_endpoints.rs | Add near_block_processed_total to the expected substrings checked against the /metrics body. |

Findings

No blocking issues. A couple of minor observations:

Non-blocking (nits, follow-ups, suggestions):

  • Cargo.lock:7202 — Two prometheus versions still resolve in the workspace because the older near-o11y 0.35.0 (used by the utilities crate, judging by the workspace comment about near-account-id-v1) still pulls in 0.13.4. Not a regression introduced here, but worth noting that any metrics registered through that path will still be invisible until that transitive is dropped.
  • crates/e2e-tests/tests/web_endpoints.rs:101 — Worth confirming near_block_processed_total is emitted by the embedded nearcore on a freshly spun-up local testnet even when block production is minimal. Counters are normally exposed at value 0 once registered, so this should be fine; just flagging in case the metric registration is lazy and could flake under timing.
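
The first observation, that two prometheus versions still resolve, can be checked mechanically. The sketch below is a deliberately naive, std-only scan of Cargo.lock-style text, assuming each `[[package]]` entry lists `name` before `version`; `duplicate_versions` is a hypothetical helper, not part of this repository. In practice `cargo tree --duplicates` reports the same information.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Map each package name in Cargo.lock-style text to its set of versions,
/// keeping only packages that resolve to more than one version.
pub fn duplicate_versions(lock: &str) -> BTreeMap<String, BTreeSet<String>> {
    let mut versions: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    let mut current: Option<String> = None;

    for line in lock.lines().map(str::trim) {
        if line == "[[package]]" {
            // A new entry starts; forget the previous package name.
            current = None;
        } else if let Some(value) = line.strip_prefix("name = ") {
            current = Some(value.trim_matches('"').to_string());
        } else if let Some(value) = line.strip_prefix("version = ") {
            // The top-level lockfile `version = N` line has no preceding
            // name, so it is skipped here.
            if let Some(name) = current.take() {
                versions
                    .entry(name)
                    .or_default()
                    .insert(value.trim_matches('"').to_string());
            }
        }
    }

    versions.retain(|_, set| set.len() > 1);
    versions
}
```

Run against a lockfile that contains both prometheus 0.13.4 and 0.14.0, the result has a single `prometheus` entry holding both versions, which is the residual the review points out.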

✅ Approved

@gilcu3 gilcu3 force-pushed the 1909-upgrade-prometheus-crate-to-latest-version-01402 branch 2 times, most recently from f3d0a68 to 93f1cb9 Compare May 29, 2026 14:26
  # above the post-feature size without leaving the contract free to creep up
  # to the protocol boundary.
- HARD_LIMIT=1520000
+ HARD_LIMIT=1450000

Contributor Author

not really justified by the changes of this PR, but always nice to see :)

Collaborator

Oh nice! Wild guess is that the new rust compiler performs some better optimizations, and then this PR pushed it over the edge.

But still weird, this PR barely touches the contract 🤔

@gilcu3 gilcu3 force-pushed the 1909-upgrade-prometheus-crate-to-latest-version-01402 branch from 93f1cb9 to eb9f85f Compare May 29, 2026 14:40
@gilcu3 gilcu3 marked this pull request as ready for review May 29, 2026 14:49

claude Bot commented May 29, 2026

PR title type suggestion: This PR changes only dependency files and tests, so the type prefix should probably be build: instead of fix:.
Suggested title: build: bump prometheus, add nearcore metrics

claude Bot commented May 29, 2026

Pull request overview

Bumps prometheus from 0.13.4 to 0.14.0 so the node and the embedded nearcore code share the same global default_registry(), which means /metrics now also exposes nearcore-side counters such as near_block_processed_total. The PR also restructures near-sdk feature usage — dropping unit-testing from the workspace defaults and re-enabling it only where it is actually needed — which shrinks the contract wasm enough to lower the size guardrail by ~70 KB.

Changes:

  • Cargo.toml: bump workspace prometheus to 0.14.0; drop the stale "cannot upgrade" comment; remove unit-testing from default near-sdk features.
  • crates/node/Cargo.toml: enable near-sdk/non-contract-usage on the node; drop the redundant dev-dep line.
  • crates/contract/Cargo.toml: the test-utils feature now also pulls near-sdk/unit-testing, so contract unit tests still compile.
  • crates/test-utils/Cargo.toml: enable near-sdk/unit-testing (transitive consumers rely on it).
  • Cargo.lock: mpc-node resolves to prometheus 0.14.0; the near-* 0.35.x family was bumped to 0.35.1 as a side effect.
  • crates/e2e-tests/tests/web_endpoints.rs: assert that /metrics now exposes near_block_processed_total in addition to the existing mpc metric.
  • scripts/check-contract-wasm-size.sh: drop HARD_LIMIT from 1,520,000 to 1,450,000, reflecting the size win from no longer pulling unit-testing into the contract build.
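
The size guardrail in the last bullet amounts to a threshold comparison. The following is a rough Rust model of that check, not the shell script itself: only the two limit values come from this PR, while `check_wasm_size`, the strict `>` comparison, and the example sizes in the usage note are assumptions.

```rust
/// Limits quoted in this PR, in bytes.
pub const OLD_HARD_LIMIT: u64 = 1_520_000;
pub const NEW_HARD_LIMIT: u64 = 1_450_000;

/// Return the remaining headroom in bytes, or an error message when the
/// wasm artifact exceeds the limit. Whether the real script treats a size
/// equal to the limit as a failure is not stated in the source; this
/// sketch allows it.
pub fn check_wasm_size(size_bytes: u64, hard_limit: u64) -> Result<u64, String> {
    if size_bytes > hard_limit {
        Err(format!(
            "wasm is {size_bytes} bytes, {} over the {hard_limit}-byte limit",
            size_bytes - hard_limit
        ))
    } else {
        Ok(hard_limit - size_bytes)
    }
}
```

The difference between the two constants is 70,000 bytes, matching the "~70 KB" figure in the overview. A hypothetical 1,500,000-byte build would pass under the old limit and fail under the new one.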

Reviewed changes

Per-file summary
| File | Description |
| --- | --- |
| Cargo.toml | Bump prometheus to 0.14.0; remove unit-testing from workspace near-sdk default features. |
| Cargo.lock | mpc-node now uses prometheus 0.14.0; near-* 0.35.0 → 0.35.1 transitive bump. |
| crates/contract/Cargo.toml | Add near-sdk/unit-testing to test-utils feature. |
| crates/node/Cargo.toml | Enable near-sdk/non-contract-usage; drop duplicate dev-dep. |
| crates/test-utils/Cargo.toml | Enable near-sdk/unit-testing. |
| crates/e2e-tests/tests/web_endpoints.rs | Add near_block_processed_total to expected metrics substrings. |
| scripts/check-contract-wasm-size.sh | Lower wasm HARD_LIMIT to 1,450,000. |

Findings

No blocking issues — the prometheus version unification is the right fix for the missing-metrics bug, the feature shuffle correctly preserves test compilability (contract #[cfg(test)] modules still get near-sdk/unit-testing via the self dev-dep mpc-contract = { features = ["test-utils"] } at crates/contract/Cargo.toml:97), and the node has no usage of near_sdk::test_utils/testing_env so dropping the dev-dep is safe.

Non-blocking (nits, follow-ups, suggestions):

  • crates/test-utils/Cargo.toml:13: The test-utils library crate itself only uses near_sdk::PublicKey (see src/attestation.rs:88,:93); it does not reference anything gated by unit-testing. The feature is being enabled here purely so downstream consumers transitively get it. That works because of cargo's feature unification, but it's load-bearing in a non-obvious way, so it is worth a one-line comment, or moving the activation to the consumers that actually need it.
  • Cargo.lock:7202 (prometheus 0.13.4): As the prior review noted, near-o11y 0.35.1 (pulled via the published near-sdk chain, not the nearcore git pin) still depends on 0.13.4, so any metric registered through that path will continue to be invisible. Not a regression here, just confirming the residual.
  • crates/e2e-tests/tests/web_endpoints.rs:101: near_block_processed_total is a registered counter, so it should appear at value 0 even before the first block, but worth keeping an eye on for flakes on slow CI runners since the cluster setup is the only thing exercising nearcore before the assertion.
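
If the timing concern in the last bullet ever shows up as a flake, one common mitigation is to poll for the metric with a deadline instead of asserting once. The helper below is a generic std-only sketch; `poll_until` is a hypothetical name, and nothing in this PR indicates the test uses such a helper today.

```rust
use std::time::{Duration, Instant};

/// Call `probe` repeatedly until it returns true or `timeout` elapses.
/// Returns true if the probe succeeded before the deadline.
pub fn poll_until<F>(mut probe: F, timeout: Duration, interval: Duration) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if probe() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        std::thread::sleep(interval);
    }
}
```

In a test, the probe would fetch /metrics and check whether the body contains near_block_processed_total; the probe always runs at least once, so an already-registered counter passes immediately without sleeping.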

✅ Approved

@gilcu3 gilcu3 force-pushed the 1909-upgrade-prometheus-crate-to-latest-version-01402 branch from eb9f85f to 9423f12 Compare May 29, 2026 14:58

claude Bot commented May 29, 2026

PR title type suggestion: This PR is primarily a dependency update and routine maintenance, so the type prefix should probably be chore: instead of fix:.

Suggested title: chore: bump prometheus and fix missing nearcore metrics

claude Bot commented May 29, 2026

PR title type suggestion: This PR primarily updates dependencies and build files (Cargo.toml, Cargo.lock, scripts) with no changes to source code in src/ directories. The type should probably be chore: or build: instead of fix:.

Suggested title: "chore: bump prometheus and update nearcore metrics" or "build: bump prometheus, fix missing nearcore metrics"

@gilcu3 gilcu3 linked an issue May 29, 2026 that may be closed by this pull request
@gilcu3 gilcu3 added this pull request to the merge queue May 29, 2026
Merged via the queue into main with commit f004f01 May 29, 2026
16 checks passed
@gilcu3 gilcu3 deleted the 1909-upgrade-prometheus-crate-to-latest-version-01402 branch May 29, 2026 20:52

Development

Successfully merging this pull request may close these issues.

Nearcore metrics missing in 3.11.0
Upgrade prometheus crate to latest version, 0.14.0

3 participants