Improve numerical stability of variance calculation in envelope by networmix · Pull Request #107 · networmix/NetGraph

networmix · 2026-02-07T10:22:10Z

Summary

Refactored the variance calculation in Envelope.from_values() to use a numerically stable two-pass algorithm instead of the computational formula, improving accuracy for edge cases with extreme values.

Changes

Replaced computational variance formula with the numerically stable sum-of-squared-deviations approach: sum((x - mean)²) / n
Removed single-pass calculation of sum_squares which can suffer from catastrophic cancellation when values are large
Implemented two-pass algorithm: First pass computes mean and frequency map, second pass calculates variance using deviations from mean
Optimized for duplicate values: Iterates over the frequency map rather than raw values, providing efficiency gains when Monte Carlo results contain many duplicates

Implementation Details

The new approach trades a second iteration over unique values for significantly improved numerical stability. This is particularly beneficial when:

Values have large magnitudes (where E[X²] can dominate and lose precision)
There are many duplicate values (frequency map iteration is more efficient than iterating all raw values)

The change keeps the overall cost linear in the number of input values (the second pass visits each unique value only once) while providing better accuracy for edge cases.
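The cancellation issue can be reproduced outside the library. The sketch below is illustrative only: `naive_variance` and `two_pass_variance` are hypothetical stand-ins, not the functions in ngraph/results/artifacts.py, and the input values are made up so that the exact population variance (22.5) is easy to check by hand.

```python
from collections import Counter


def naive_variance(values):
    """Population variance via E[X^2] - (E[X])^2 (the formula being replaced)."""
    n = len(values)
    total = 0.0
    sum_squares = 0.0
    for x in values:
        total += x
        sum_squares += x * x  # squares near 1e18 lose their low-order digits
    mean = total / n
    return sum_squares / n - mean * mean


def two_pass_variance(values):
    """Population variance via sum((x - mean)^2) / n over a frequency map."""
    n = len(values)
    frequencies = Counter(values)  # first pass: value -> count
    mean = sum(v * c for v, c in frequencies.items()) / n
    variance_sum = 0.0
    for value, count in frequencies.items():  # second pass: unique values only
        diff = value - mean
        variance_sum += count * diff * diff
    return variance_sum / n


# Deviations from the mean (1e9 + 10) are -6, -3, 3, 6, so the exact
# population variance is (36 + 9 + 9 + 36) / 4 = 22.5.
values = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print("naive:   ", naive_variance(values))
print("two-pass:", two_pass_variance(values))
```

With IEEE 754 doubles, the squares in this input are around 1e18, where adjacent representable values are 128 apart, so the naive result cannot equal 22.5. The deviations from the mean are small integers, so the two-pass result is exact for this input.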

https://claude.ai/code/session_01BH7FXdY35eRtf98jo8kQiG

Note

Low Risk
Small, localized change to a statistical calculation; main risk is minor numeric/behavioral drift in reported stdev_capacity for some datasets.

Overview
Updates CapacityEnvelope.from_values() variance/stddev computation to a numerically stable two-pass approach (sum((x-mean)^2)/n) instead of E[X^2]-(E[X])^2, removing the sum_squares accumulator.

The second pass iterates over the computed frequencies map (unique values) to keep performance reasonable when Monte Carlo outputs contain many duplicates, while leaving the envelope output fields unchanged.
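In symbols, writing $c_v$ for the count stored in the frequencies map for each distinct value $v$ (notation introduced here for illustration only), the two-pass computation can be written as:

```latex
n = \sum_{v} c_v, \qquad
\mu = \frac{1}{n} \sum_{v} c_v \, v, \qquad
\sigma^2 = \frac{1}{n} \sum_{v} c_v \, (v - \mu)^2
```

In exact arithmetic this equals E[X^2]-(E[X])^2, so the two forms differ only in floating-point rounding. Every term in the sum is non-negative, so the mathematical value of the sum cannot go negative.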

Written by Cursor Bugbot for commit c6ade7e. This will update automatically on new commits.

…m_values()

The computational variance formula E[X²] - E[X]² suffers from catastrophic floating-point cancellation when capacity values are large or nearly identical. This produced silently wrong stdev values (e.g., 41 million instead of 0 for identical values) or complex numbers when the computed variance went negative.

Replace with the numerically stable two-pass formula sum((x - mean)²) / n, iterating over the frequency map for efficiency with duplicate values.

https://claude.ai/code/session_01BH7FXdY35eRtf98jo8kQiG
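The complex-number failure mode follows from Python semantics: raising a negative float to the power 0.5 returns a complex value instead of raising an error. A minimal sketch, using a hypothetical helper `stdev_from_variance` that mirrors only the `** 0.5` step (the negative input is chosen by hand, not taken from a real run):

```python
def stdev_from_variance(variance):
    """Mirror the `** 0.5` step; hypothetical helper for illustration."""
    return variance ** 0.5


# A slightly negative variance, as cancellation in E[X^2] - (E[X])^2 can produce.
bad = stdev_from_variance(-1e-6)
print(type(bad).__name__)  # complex

# A sum of squared deviations is never negative, so the result stays a float.
good = stdev_from_variance(sum(d * d for d in (-6.0, -3.0, 3.0, 6.0)) / 4)
print(type(good).__name__)  # float
```

By contrast, `math.sqrt` raises `ValueError` on a negative input, which would at least make the problem visible.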

Copilot

Pull request overview

Refactors the capacity variance/standard deviation computation in CapacityEnvelope.from_values() to reduce catastrophic cancellation risk for large-magnitude inputs, while leveraging the existing frequency map representation used for Monte Carlo outputs.

Changes:

Replaced the computational variance formula (E[X²] - (E[X])²) with a deviation-based computation over the frequency map.
Removed the sum_squares single-pass accumulation and added a second pass over unique values to compute variance.


Copilot · 2026-02-07T10:24:36Z

ngraph/results/artifacts.py

+        # First pass: build frequency map and compute mean
        frequencies = {}


PR description mentions refactoring Envelope.from_values(), but the code change is in CapacityEnvelope.from_values(). Please update the description (or code) to match the actual API being modified to avoid confusion for reviewers and future readers.

@claude need an update?

Copilot · 2026-02-07T10:24:36Z

ngraph/results/artifacts.py

+        # Second pass over unique values: compute variance using the
+        # numerically stable formula sum((x - mean)^2) / n.
+        # Iterating over the frequency map is efficient when there are
+        # many duplicate values (common in Monte Carlo results).
+        variance_sum = 0.0
+        for value, count in frequencies.items():
+            diff = value - mean_capacity
+            variance_sum += count * diff * diff
+        stdev_capacity = (variance_sum / n) ** 0.5


The variance/stdev computation was changed to a new two-pass algorithm, but there are no unit tests asserting stdev_capacity correctness or demonstrating improved numerical stability (e.g., large-magnitude values that cause catastrophic cancellation in the old formula, and duplicate-heavy inputs). Please add tests that validate mean_capacity/stdev_capacity for representative cases and edge cases.
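A possible shape for such tests is sketched below. The signature of CapacityEnvelope.from_values() is not shown in this diff, so the sketch exercises a hypothetical stand-in, `stdev_two_pass`, that follows the second-pass code above. In the real suite the assertions would presumably target mean_capacity and stdev_capacity on the constructed envelope.

```python
import math
from collections import Counter


def stdev_two_pass(values):
    """Stand-in for the stdev step under test (hypothetical, not the library API)."""
    n = len(values)
    frequencies = Counter(values)
    mean = sum(v * c for v, c in frequencies.items()) / n
    variance_sum = sum(c * (v - mean) ** 2 for v, c in frequencies.items())
    return (variance_sum / n) ** 0.5


def test_large_magnitude_values():
    # Exact population variance is 22.5 (deviations -6, -3, 3, 6).
    values = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
    assert math.isclose(stdev_two_pass(values), math.sqrt(22.5), rel_tol=1e-12)


def test_identical_large_values_have_zero_stdev():
    # Identical inputs must report no spread at all.
    assert stdev_two_pass([1e12] * 1000) == 0.0


def test_duplicate_heavy_input():
    # Two distinct values, 500 copies each: mean 15, variance 25, stdev 5.
    values = [10.0] * 500 + [20.0] * 500
    assert math.isclose(stdev_two_pass(values), 5.0, rel_tol=1e-12)
```

The functions use plain `assert` statements, so they run under pytest or can be called directly.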

Copilot AI review requested due to automatic review settings February 7, 2026 10:22

Copilot started reviewing on behalf of networmix February 7, 2026 10:22

Copilot AI reviewed Feb 7, 2026


networmix merged commit 04e88bb into main Feb 7, 2026
13 checks passed

networmix deleted the claude/fix-major-bug-4sCRZ branch February 7, 2026 13:34
