Skip to content

CAMEL-23865: Use EWMA smoothing for throughput and fix TUI sparkline scaling#24356

Merged
davsclaus merged 3 commits into
mainfrom
fix/CAMEL-23865
Jul 1, 2026
Merged

CAMEL-23865: Use EWMA smoothing for throughput and fix TUI sparkline scaling#24356
davsclaus merged 3 commits into
mainfrom
fix/CAMEL-23865

Conversation

@davsclaus

@davsclaus davsclaus commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Switch the throughput metric in LoadThroughput from raw instantaneous rate to EWMA (exponentially weighted moving average) smoothing with a 1-minute decay window. This produces stable, converging throughput readings instead of oscillating 0/spike values — consistent with how LoadTriplet already computes CPU load averages.

Also fixes the TUI sparkline charts to properly display sub-1.0 msg/s rates by scaling throughput values by 100x internally.

Changes

camel-management (core):

  • LoadThroughput: EWMA smoothing with Math.exp(-1/60) decay, negative delta clamping, reset() stops the StopWatch
  • LoadThroughputTest: Awaitility-based tests verifying EWMA convergence, smoothing bounds, and reset behavior

camel-jbang-plugin-tui:

  • MetricsCollector: reads EWMA throughput from status JSON, stores scaled by THROUGHPUT_SCALE=100; shared niceMax() and formatThroughput() utility methods
  • OverviewTab / EndpointsTab: sparkline charts use scaled values with 1-2-5 Y-axis snapping; delegate to MetricsCollector for formatting

docs:

  • Upgrade guide entry for throughput MBean semantics change (instantaneous to EWMA)

Claude Code on behalf of Claus Ibsen

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run
  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
  • You can label PRs using skip-tests and test-dependents to fine-tune the checks executed by this PR.
  • Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

@gnodet gnodet left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this improvement, Claus — the EWMA smoothing approach is sound and consistent with the existing LoadTriplet pattern, and the TUI sparkline scaling is a nice enhancement. The formula is correct and the overall direction is good.

A few items to address before merging (details in inline comments):

Blocking:

  1. The new test introduces Thread.sleep() calls — project conventions require Awaitility instead.
  2. Negative deltas are no longer guarded, which can produce persistent negative throughput via EWMA memory.

Suggestions:
3. reset() doesn't stop the watch — the first post-reset update can produce a spike that persists ~60s via EWMA.
4. niceMax and formatThroughput are duplicated between OverviewTab and EndpointsTab with divergent implementations.
5. The throughput JMX MBean attribute changed semantics (instantaneous → EWMA) — this should get an upgrade guide entry per project conventions.

This review covers project conventions and code correctness. It does not replace specialized review tools such as CodeRabbit, Sourcery, or SonarCloud.

This review was generated by an AI agent and may contain inaccuracies. Please verify all suggestions before applying.

} else {
thp = 0;
// instantaneous rate in exchanges/second for this interval
double instantRate = (1000d / time) * delta;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old code guarded if (delta > 0) and set thp = 0 for non-positive deltas. With EWMA, a negative instantRate (from currentReading < last, e.g. after a route restart that resets the exchange counter) bleeds into thp and takes ~60 seconds to decay back toward zero.

Consider clamping:

Suggested change
double instantRate = (1000d / time) * delta;
double instantRate = Math.max(0, (1000d / time) * delta);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — added Math.max(0, ...) clamping on the instantaneous rate. Also added watch.stop() in reset() so the next update() re-enters the initialization branch cleanly, avoiding post-reset spikes.

t.update(total);
Thread.sleep(10);
for (int i = 0; i < 120; i++) {
total++;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test introduces multiple Thread.sleep() calls. Per project conventions:

"New test code MUST NOT introduce Thread.sleep() calls. Use the Awaitility library instead."

Since LoadThroughput relies on StopWatch internally, it requires real elapsed time. A better long-term approach would be an injectable time source for fully deterministic tests. At minimum, the sleep-based timing should be replaced with Awaitility polling assertions.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — rewrote all tests using Awaitility's pollInterval + untilAsserted pattern. Since LoadThroughput relies on real elapsed time via StopWatch, the Awaitility polling drives the updates at 10ms intervals while the assertions verify convergence.

@Test
public void testSmoothing() throws Exception {
LoadThroughput t = new LoadThroughput();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion (> 0) would also pass with the old raw throughput calculation. Consider tightening it to verify that EWMA smoothing is actually converging toward the expected steady-state rate (e.g., within a reasonable tolerance of the true rate).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — tightened the convergence assertion to check throughput > 5.0 (verifying actual EWMA convergence toward the steady-state rate rather than just > 0). The smoothing test now also checks the value is bounded within a reasonable range.

import static org.junit.jupiter.api.Assertions.assertTrue;

@DisabledOnOs(OS.AIX)
public class LoadThroughputTest {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is pure Java math + Thread.sleep — no JMX interaction. The @DisabledOnOs(OS.AIX) annotation may have been cargo-culted from other management tests that interact with platform MBeans. Is there a specific AIX reason to skip this test? If not, consider removing it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — removed. This is a pure Java math test with no JMX interaction, so the AIX exclusion was cargo-culted from other management tests.

* 0.20 → 1, 1.5 → 2, 3 → 5, 7 → 10, 15 → 20, 35 → 50, 80 → 100, etc.
*/
private static long niceMax(long rawMax) {
long s = MetricsCollector.THROUGHPUT_SCALE;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

niceMax and formatThroughput are duplicated between OverviewTab and EndpointsTab with different implementations:

  • Here (OverviewTab): hardcoded array of 10 steps up to 1000 * THROUGHPUT_SCALE; falls back to rawMax (no rounding) beyond that.
  • EndpointsTab: uses a while(true) loop with 1-2-5 step sequence that scales indefinitely.

The same data could get different Y-axis ceilings depending on which chart displays it. Consider extracting both methods to a shared utility (e.g., on MetricsCollector where THROUGHPUT_SCALE already lives) with a single correct implementation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — extracted both niceMax and formatThroughput to MetricsCollector as package-private static utility methods. Used the more robust EndpointsTab version (1-2-5 loop that scales indefinitely) as the single implementation. Both tabs now delegate to MetricsCollector.niceMax() and MetricsCollector.formatThroughput().

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

🧪 CI tested the following changed modules:

  • core/camel-management
  • docs
  • dsl/camel-jbang/camel-jbang-plugin-tui

ℹ️ Dependent modules were not tested because the total number of affected modules exceeded the threshold (50). Use the test-dependents label to force testing all dependents.

Build reactor — dependencies compiled but only changed modules were tested (3 modules)
  • Camel :: Docs
  • Camel :: JBang :: Plugin :: TUI
  • Camel :: Management

⚙️ View full build and test results

davsclaus and others added 3 commits July 1, 2026 13:11
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Claus Ibsen <claus.ibsen@gmail.com>
Use THROUGHPUT_SCALE (x100) in MetricsCollector so fractional msg/s
rates are preserved as longs. Apply niceMax (1-2-5 step sequence)
for Y-axis scaling. Use Locale.US for consistent dot decimal formatting.
Endpoint charts use showYAxis(false) until tamboui supports yAxisFormatter.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Claus Ibsen <claus.ibsen@gmail.com>
- Clamp negative instantaneous rate to zero to prevent EWMA memory bleed
- Stop StopWatch on reset() to avoid post-reset throughput spike
- Replace Thread.sleep() with Awaitility in tests per project conventions
- Remove unnecessary @DisabledOnOs(OS.AIX) from pure-math test
- Tighten test assertions to verify EWMA convergence behavior
- Extract niceMax/formatThroughput to MetricsCollector shared utility
- Add upgrade guide entry for throughput MBean semantics change

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Claus Ibsen <claus.ibsen@gmail.com>
@github-actions github-actions Bot added the docs label Jul 1, 2026
@davsclaus davsclaus requested a review from gnodet July 1, 2026 12:35
@davsclaus davsclaus merged commit 5c46803 into main Jul 1, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants