stddev stability issue #657

Closed
sk1p opened this issue Feb 29, 2020 · 5 comments · Fixed by #672
Labels: bug (Something isn't working)


sk1p commented Feb 29, 2020

See the following CI failure: https://travis-ci.org/LiberTEM/LiberTEM/jobs/656632518

Not sure if this is fixed in #640.

sk1p added the bug label on Feb 29, 2020
sk1p commented Mar 1, 2020

[comment body not captured in this export]


uellue commented Mar 9, 2020

Lovely! </irony> Those are not small errors anymore. We should definitely watch whether this happens again with the new code. How can we make the error reproducible, i.e. capture the same state of the random number generator etc.?
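(Illustrative sketch, not from the thread: one standard way to make such failures replayable is to seed the generator in a pytest fixture, so a CI failure can be reproduced locally with the same data. The fixture and test names below are hypothetical.)

```python
import numpy as np
import pytest


@pytest.fixture
def rng():
    # Fixed seed: a failing CI run can then be replayed bit-for-bit
    return np.random.default_rng(seed=42)


def test_stddev_reproducible(rng):
    # hypothetical test body; the point is the seeded data generation
    data = rng.standard_normal((16, 16, 16)).astype(np.float32)
    assert np.isfinite(np.std(data))
```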

Or perhaps we could add code similar to

```python
import numpy as np
import pytest

# Imports added for context; MemoryDataSet and the _naive_mask_apply helper
# are assumed to come from LiberTEM and its test utilities.
from libertem.io.dataset.memory import MemoryDataSet
from utils import _naive_mask_apply


@pytest.mark.parametrize(
    'TYPE',
    ['JOB', 'UDF']
)
def test_numerics_fail(lt_ctx, TYPE):
    dtype = 'float32'
    # Highest expected detector resolution
    RESOLUTION = 4096
    # Highest expected detector dynamic range
    RANGE = 1e6
    # default value for all cells
    VAL = 1.1

    data = np.full((2, 1, RESOLUTION, RESOLUTION), VAL, dtype=np.float32)
    data[0, 0, 0, 0] += VAL * RANGE
    dataset = MemoryDataSet(
        data=data,
        tileshape=(2, RESOLUTION, RESOLUTION),
        num_partitions=1,
        sig_dims=2,
    )
    mask0 = np.ones((RESOLUTION, RESOLUTION), dtype=np.float64)
    analysis = lt_ctx.create_mask_analysis(
        dataset=dataset, factories=[lambda: mask0], mask_count=1, mask_dtype=dtype,
    )
    analysis.TYPE = TYPE
    results = lt_ctx.run(analysis)
    expected = np.array([[
        [VAL*RESOLUTION**2 + VAL*RANGE],
        [VAL*RESOLUTION**2]
    ]])
    naive = _naive_mask_apply([mask0], data)
    naive_32 = _naive_mask_apply([mask0.astype(dtype)], data)
    # The masks are float64, which means the calculation is performed with
    # high precision and the naive result should be correct
    assert np.allclose(expected, naive)
    # We make sure LiberTEM calculated this with the lower-precision dtype we set
    assert np.allclose(results.mask_0.raw_data, expected[0]) == np.allclose(naive_32, expected)
    # Confirm that the numerical precision is actually insufficient.
    # If this succeeds, we have to rethink the premise of this test.
    assert not np.allclose(results.mask_0.raw_data, expected[0])


@pytest.mark.parametrize(
    'TYPE',
    ['JOB', 'UDF']
)
def test_numerics_succeed(lt_ctx, TYPE):
    dtype = 'float64'
    # Highest expected detector resolution
    RESOLUTION = 4096
    # Highest expected detector dynamic range
    RANGE = 1e6
    # default value for all cells
    VAL = 1.1

    data = np.full((2, 1, RESOLUTION, RESOLUTION), VAL, dtype=np.float32)
    data[0, 0, 0, 0] += VAL * RANGE
    dataset = MemoryDataSet(
        data=data,
        tileshape=(2, RESOLUTION, RESOLUTION),
        num_partitions=1,
        sig_dims=2,
    )
    mask0 = np.ones((RESOLUTION, RESOLUTION), dtype=np.float32)
    analysis = lt_ctx.create_mask_analysis(
        dataset=dataset, factories=[lambda: mask0], mask_count=1, mask_dtype=dtype,
    )
    analysis.TYPE = TYPE
    results = lt_ctx.run(analysis)
    expected = np.array([[
        [VAL*RESOLUTION**2 + VAL*RANGE],
        [VAL*RESOLUTION**2]
    ]])
    naive = _naive_mask_apply([mask0.astype(dtype)], data.astype(dtype))
    assert np.allclose(expected, naive)
    assert np.allclose(expected[0], results.mask_0.raw_data)
```
to test standard deviation numerical stability with a test case that is sensitive to numerical errors and can be verified with a simple, stable calculation?
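(To make "sensitive to numerical errors, verifiable with a simple, stable calculation" concrete, here is a pure-numpy sketch added for illustration, not part of the original comment: a naive one-pass float32 variance collapses on data with a large common offset, while np.var in float64 serves as the stable reference.)

```python
import numpy as np

# Large common offset, small per-pixel variation: the true variance is
# ~1.0, but the raw values are ~1e6.
rng = np.random.default_rng(42)
x = (1e6 + rng.standard_normal(1024 * 1024)).astype(np.float32)

# Naive one-pass formula E[x^2] - E[x]^2 in float32: both terms are ~1e12
# and agree in nearly all of float32's ~7 significant digits, so the
# subtraction cancels catastrophically.
naive_var = np.mean(x * x) - np.mean(x) ** 2

# Simple, stable reference: two-pass algorithm in float64.
ref_var = np.var(x.astype(np.float64))

print(naive_var)  # garbage, possibly even negative
print(ref_var)    # ~1.0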


sk1p commented Mar 9, 2020

> Those are not small errors anymore

I didn't have time to check the error log yet - how large were the absolute errors? Maybe we should have better assertion output here that gives some statistics on the difference?

To reproduce, maybe try running it in a loop again?

> sensitive to numerical errors

I mean, one easy way is to just increase the dynamic range by a large margin, but then failure would be both guaranteed and expected. The hard part would be finding out how stable the algorithm should actually be.

I remember from the paper that they implemented Kahan summation in the minibatch part to increase stability, but I'm not sure how feasible that is here.
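(For reference, a minimal sketch of Kahan, i.e. compensated, summation, added for illustration and not LiberTEM code: a correction term carries the low-order bits that each float32 addition would otherwise drop.)

```python
import numpy as np

def kahan_sum_f32(values):
    total = np.float32(0)
    c = np.float32(0)  # compensation: low-order bits lost so far
    for v in values:
        y = np.float32(v) - c
        t = total + y
        c = (t - total) - y  # what got rounded away in t = total + y
        total = t
    return total

x = np.full(100_000, 1.1, dtype=np.float32)

naive = np.float32(0)
for v in x:
    naive += v  # plain float32 accumulation drifts noticeably

print(naive)                        # off by several counts
print(kahan_sum_f32(x))             # much closer to the reference
print(x.astype(np.float64).sum())   # float64 reference, ~110000.0024
```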


uellue commented Mar 9, 2020

> how large were the absolute errors?

For example 2.4994637e-01 vs 2.4994595e-01: an absolute difference of 4.2e-7, i.e. a relative deviation of about 1.7e-6, more than an order of magnitude beyond float32 machine epsilon (about 1.2e-7).

> The hard part would be finding out how stable the algorithm should actually be

What about "at least as good as numpy.var()", if we pose the task in such a way that the true result is known?

In fact, LiberTEM already performs a three-level hierarchical sum if tile and partition sizes are well chosen: first within the tile, then within the partition, and then a final aggregation.

I was already thinking about how to do a complete pairwise hierarchical sum without modifying the source buffer, using only s * log(N) memory (s: tile size, N: number of frames), in case this becomes an issue 😁.

Or just switch to float64 😄.
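(An illustrative sketch of that pairwise idea, my rendering rather than the original design: a binary-counter scheme keeps at most one partial sum per level alive, i.e. at most log2(N) buffers of tile size s, and never modifies the source tiles.)

```python
import numpy as np

def pairwise_tile_sum(tiles):
    """Sum a stream of equally-shaped tiles pairwise, binary-counter style.

    levels[k] holds the sum of 2**k tiles, so at most log2(N) partial
    sums of tile size s are alive at any time: s * log(N) memory.
    """
    levels = []  # levels[k] is either None or the sum of 2**k tiles
    for tile in tiles:
        carry = tile.astype(np.float32, copy=True)
        k = 0
        while k < len(levels) and levels[k] is not None:
            carry = carry + levels[k]  # combine two equal-sized partials
            levels[k] = None
            k += 1
        if k == len(levels):
            levels.append(None)
        levels[k] = carry
    # Fold the leftover partials (at most one per level)
    total = None
    for partial in levels:
        if partial is not None:
            total = partial if total is None else total + partial
    return total

# Example: 1000 frames of constant 1.1; pairwise combination in float32
# stays close to the exact result even without float64 accumulation.
tiles = (np.full((64, 64), 1.1, dtype=np.float32) for _ in range(1000))
print(pairwise_tile_sum(tiles)[0, 0])  # ~1100.0
```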


sk1p commented Mar 9, 2020

> What about "at least as good as numpy.var()", if we pose the task in such a way that the true result is known?

Well, that would mean actually decreasing the dynamic range of our test data, since the two-pass algorithm is AFAIK the most stable of the ones compared in the paper, right?
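(For context: per-partition variance aggregates are typically merged with the pairwise combination step from Chan et al., which is what makes a hierarchical scheme possible. The sketch below is the textbook formula, added for illustration, not necessarily LiberTEM's exact code.)

```python
import numpy as np

def merge_var_aggregates(n_a, mean_a, m2_a, n_b, mean_b, m2_b):
    """Combine two partial aggregates (count, mean, sum of squared
    deviations M2) per Chan et al.'s pairwise update.
    Population variance of the combined set is m2 / n."""
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta**2 * n_a * n_b / n
    return n, mean, m2

# Example: merge aggregates from two partitions of a dataset
x = np.random.default_rng(0).standard_normal(1000)
a, b = x[:400], x[400:]
n, mean, m2 = merge_var_aggregates(
    len(a), a.mean(), ((a - a.mean())**2).sum(),
    len(b), b.mean(), ((b - b.mean())**2).sum(),
)
print(m2 / n, np.var(x))  # should agree closely
```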

uellue added a commit to uellue/LiberTEM that referenced this issue on Mar 9, 2020:
"Counter stability issues with this algorithm" (Closes LiberTEM#657)

sk1p closed this as completed in #672 on Mar 9, 2020.

sk1p pushed a commit that referenced this issue on Mar 9, 2020:
"Counter stability issues with this algorithm" (Closes #657)

twentyse7en pushed a commit to twentyse7en/LiberTEM that referenced this issue on Mar 12, 2020:
"Counter stability issues with this algorithm" (Closes LiberTEM#657)