
Error Propagation #46

Merged
merged 24 commits into from Jul 5, 2020

Conversation

@ffreyer (Collaborator) commented May 13, 2019

As an alternative to jackknife, one can estimate the error of some function by expanding the expectation value of g() around the mean of x. This is discussed, for example, in the book Quantum Monte Carlo Methods, pages 55-56. Unlike jackknife (and bootstrap), this algorithm can be implemented online.
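
For reference, the first-order expansion behind this is standard linear error propagation (stated here for context, not quoted from the book):

\sigma_g^2 \approx \sum_{i,j} \left.\frac{\partial g}{\partial x_i}\right|_{\langle x \rangle} \left.\frac{\partial g}{\partial x_j}\right|_{\langle x \rangle} \mathrm{cov}(\bar{x}_i, \bar{x}_j)

where cov(x̄_i, x̄_j) is the covariance of the sample means.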

For the initial commit I added two versions, one using Welford's algorithm (1) and one using plain summations (2) of, for example, x^2 (like in LogBinner). I have done some benchmarking comparing the two implementations against jackknife (a minimal sketch of the online approach follows the list below):

  • All algorithms scale as T = O(N); jackknife may scale worse (tested N = 1e3, 1e4, 1e5, 1e6)
  • (2) is 4-10x faster than jackknife, (1) about 1-2x faster than jackknife
  • Complicated functions affect jackknife much more (tested g(x, y) = (exp(sin(x)) + exp(-cos(y))) ^ (y - x^2) / log(10.0 + x^2 + y))
  • Memory usage of (1) and (2) is O(1)
  • The relative difference between jackknife and (1), (2) scales with 1/N
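
For illustration, here is a minimal sketch of the online approach for two arguments, using Welford-style running moments. All names here are made up for the example and do not reflect the actual implementation in this PR:

mutable struct OnlineEP   # hypothetical type, not the PR's
    n::Int
    mx::Float64; my::Float64                   # running means
    Cxx::Float64; Cyy::Float64; Cxy::Float64   # running co-moment sums
end
OnlineEP() = OnlineEP(0, 0.0, 0.0, 0.0, 0.0, 0.0)

function Base.push!(ep::OnlineEP, x, y)
    ep.n += 1
    dx, dy = x - ep.mx, y - ep.my
    ep.mx += dx / ep.n
    ep.my += dy / ep.n
    # Welford update: the second factor uses the already-updated means
    ep.Cxx += dx * (x - ep.mx)
    ep.Cyy += dy * (y - ep.my)
    ep.Cxy += dx * (y - ep.my)
    return ep
end

# first-order error of g(x, y): sqrt(∇g' Σ ∇g), with Σ the covariance of the means
function error_of(ep::OnlineEP, dgdx, dgdy)
    gx, gy = dgdx(ep.mx, ep.my), dgdy(ep.mx, ep.my)
    s = ep.n * (ep.n - 1)
    sqrt(gx^2 * ep.Cxx / s + 2 * gx * gy * ep.Cxy / s + gy^2 * ep.Cyy / s)
end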

TODO

  • generalize code (input types, including matrices, number of arguments)
  • docs, clean up names

@carstenbauer (Owner) commented May 13, 2019

Note that Measurements.jl implements (some form of - see below) linear error propagation as well (see here). It uses Calculus.jl to calculate the necessary derivatives.

Note the caveats and warnings on Wikipedia regarding linear error propagation.

@carstenbauer (Owner) commented May 13, 2019

Also, there is MonteCarloMeasurements.jl (not related to Markov chain Monte Carlo) which uses a random sampling method (hence "MonteCarlo") to go beyond Measurements.jl and handle nonlinear uncertainty propagation well.

It falls into the first of the five categories listed on Wikipedia for handling strongly non-linear functions.
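
As a bare-bones illustration of this simulation-based category (plain Julia with made-up inputs, not MonteCarloMeasurements' actual API):

using Statistics

xs = 1.0 .+ 0.1 .* randn(10_000)   # samples of x ~ N(1.0, 0.1)
ys = 2.0 .+ 0.2 .* randn(10_000)   # samples of y ~ N(2.0, 0.2)
fs = sin.(xs) ./ ys                # push every sample through a non-linear f
mean(fs), std(fs)                  # estimate and spread of f(x, y)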

@carstenbauer (Owner) commented May 13, 2019

Maybe http://mathfaculty.fullerton.edu/sbehseta/Pages%20from%20KassBehseta.pdf is a good read to understand the differences between

  • linear error propagation theory (what you are doing here, and what Measurements.jl is doing)
  • simulation-based uncertainty propagation (MonteCarloMeasurements.jl)
  • jackknife and bootstrap

Regarding the difference between (2) and (3), see also this answer on Stack Exchange.

@carstenbauer (Owner) commented May 13, 2019

Stupid test on this branch:

using Measurements, MonteCarloMeasurements, BinningAnalysis, Distributions, Statistics, BenchmarkTools

function error_BinningAnalysis(f, x, y)
    ep = Error_Propagator1()
    for i in eachindex(x)
        push!(ep, x[i], y[i])
    end
    sqrt(BinningAnalysis.var_O1(f, ep))
end

function error_Measurements(f, x, y)
    mx = measurement(mean(x), std(x))
    my = measurement(mean(y), std(y))
    Measurements.uncertainty(f(mx, my)) / sqrt(length(x))
end

function error_MonteCarloMeasurements(f, x, y)
    px = Particles(x)
    py = Particles(y)
    std(f(px, py)) / sqrt(length(x))
end

x = rand(100)
y = x .+ rand(Normal(0.0, 0.2), 100)
f = (x, y) -> sin(abs(x)) / log(y + 1) # crazy non-linear function

@btime error_BinningAnalysis($f, $x, $y)
@btime error_Measurements($f, $x, $y)
@btime error_MonteCarloMeasurements($f, $x, $y)
@btime jackknife($f, $x, $y)[2]

This is what I get:

julia> @btime error_BinningAnalysis($f, $x, $y)
  287.129 ns (3 allocations: 96 bytes)
0.03476324406512978

julia> @btime error_Measurements($f, $x, $y)
  352.799 ns (16 allocations: 752 bytes)
0.09049688511741694

julia> @btime error_MonteCarloMeasurements($f, $x, $y)
  3.212 μs (18 allocations: 4.59 KiB)
1.248968883896152

julia> @btime jackknife($f, $x, $y)[2]
  4.957 μs (28 allocations: 3.33 KiB)
0.03473403285416829

Assuming that MonteCarloMeasurements.jl's method is the most reliable, the other methods are completely off for this (crazy) non-linear function. Why do (1) and (4) agree almost perfectly here?

For a simple linear function we find:

julia> f = (x,y) -> x + y
#32 (generic function with 1 method)

julia> @btime error_BinningAnalysis($f, $x, $y)
  240.098 ns (3 allocations: 96 bytes)
0.06111615859525542

julia> @btime error_Measurements($f, $x, $y)
  236.547 ns (8 allocations: 368 bytes)
0.04508900420179231

julia> @btime error_MonteCarloMeasurements($f, $x, $y)
  707.307 ns (7 allocations: 1008 bytes)
0.06111615859525539

julia> @btime jackknife($f, $x, $y)[2]
  3.400 μs (28 allocations: 3.33 KiB)
0.06111615859525534

One would expect almost perfect agreement between all methods here. Why is Measurements off? It looks like Measurements.jl is doing a simplified linear error propagation which amounts to dropping the covariance terms. See the docs:

> In general, calculating the covariances is not an easy task. The trick adopted in Measurements.jl in order to deal with simple functional correlation is to propagate the uncertainty always using really independent variables.

Even simpler test:

julia> f = (x,y) -> x
#42 (generic function with 1 method)

julia> @btime error_BinningAnalysis($f, $x, $y)
  242.818 ns (3 allocations: 96 bytes)
0.02852033713783177

julia> @btime error_Measurements($f, $x, $y)
  192.845 ns (4 allocations: 192 bytes)
0.028520337137831746

julia> @btime error_MonteCarloMeasurements($f, $x, $y)
  467.347 ns (2 allocations: 32 bytes)
0.028520337137831746

julia> @btime jackknife($f, $x, $y)[2]
  3.462 μs (28 allocations: 3.33 KiB)
0.028520337137831794

Also,

julia> z = zeros(eltype(x), size(x));

julia> f = (x,y) -> x + y
#9 (generic function with 1 method)

julia> error_BinningAnalysis(f, x, z)
0.02713603571750006

julia> error_Measurements(f, x, z)
0.027136035717500083

somewhat confirming that Measurements neglects covariances.
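
The size of the gap is consistent with that: for f = (x, y) -> x + y, full linear propagation carries a 2cov(x, y) cross term which the independent-variables shortcut drops. A quick check along those lines, using only Statistics:

using Statistics

n = length(x)
sqrt((var(x) + var(y) + 2 * cov(x, y)) / n)  # full LPT error for x + y
sqrt((var(x) + var(y)) / n)                  # without the covariance term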

@carstenbauer (Owner) commented

It'd also be interesting to think about which of those methods are usable with correlated input data (where not all elements of x and y are independent), that is, how compatible they are with something like prebinning.

@ffreyer (Collaborator, Author) commented May 14, 2019

Just a quick note on my performance tests: I included filling an array (by index, not with push!) when timing jackknife, since that step isn't necessary for error propagation.

@ffreyer (Collaborator, Author) commented May 14, 2019

MonteCarloMeasurements doesn't seem to do error propagation like this. It's simply calculating std(f.(x, y)) / sqrt(length(x)), at least the way you used it.

In particles.jl you can find

@forward @eval($PT).particles Statistics.mean, Statistics.cov, Statistics.var, Statistics.std, Statistics.median, Statistics.quantile, Statistics.middle

which just forwards the values stored in a Particles object to the normal definitions of mean etc.

@carstenbauer (Owner) commented May 14, 2019

> MonteCarloMeasurements doesn't seem to do error propagation like this.

I thought that's the idea: sample the input distribution, propagate the samples through the function, and obtain samples of the output distribution (basically the first of the two estimators in the QMC book). If the sample density is high enough this should work, I guess. However, I fail to see that this works generally, i.e. for lower densities and arbitrary functions. Maybe I'm also just using the package wrongly.

@carstenbauer (Owner) commented May 14, 2019

Covar.jl should produce identical results to what you are doing here - it's linear propagation theory (LPT) in matrix form - and it proves to be so in my tests (which I sent you by email).

@carstenbauer (Owner) commented

It also makes sense that jackknife agrees with LPT in most cases, as it is basically "LPT + weak non-linearities". It should be closer to the true uncertainty than LPT in non-linear scenarios, but probably not by much.

Bootstrap should handle strong non-linearities well. We should implement it as well, if only for comparison (a minimal sketch follows).
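
For concreteness, a bare-bones bootstrap of f evaluated at resampled means could look like this (a sketch, not a proposed API):

using Statistics

function bootstrap_error(f, x, y; K = 1000)
    n = length(x)
    estimates = map(1:K) do _
        idx = rand(1:n, n)           # resample indices with replacement
        f(mean(view(x, idx)), mean(view(y, idx)))
    end
    std(estimates)                   # spread of f over the resamples
end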

@ffreyer (Collaborator, Author) commented May 14, 2019

> > MonteCarloMeasurements doesn't seem to do error propagation like this.
>
> I thought that's the idea: sample the input distribution, propagate the samples through the function, and obtain samples of the output distribution (basically the first of the two estimators in the QMC book). If the sample density is high enough this should work, I guess. However, I fail to see that this works generally, i.e. for lower densities and arbitrary functions. Maybe I'm also just using the package wrongly.

What I mean is that std(fs) = sqrt(mean(fs.^2) - mean(fs)^2), i.e. it doesn't include any derivative of f or the variances of the input variables. I think LPT would have those either way. In a sense it's the other extreme of jackknife: instead of leaving out one value, you drop all but one.
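
As a side note, that identity holds for the uncorrected (1/N) estimator; Julia's std defaults to the 1/(N-1) correction:

using Statistics

fs = randn(1000)
std(fs, corrected = false) ≈ sqrt(mean(fs .^ 2) - mean(fs)^2)  # true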

@ffreyer (Collaborator, Author) commented May 22, 2019

I added a third version which includes LogBinner's binning. It keeps information about each binning level around, so autocorrelation can be handled later. The memory usage for this should be O(log(N) * M), with M arguments and N push! operations. In my timing test (1000 Float64 values, g(x, y) = y - x^2) I'm now at 80-100μs, up from 8μs for the generalized version and 1μs for the original version.

I tested the error propagator with the converging-data test we have for LogBinner. The E.P. converges as expected and gives (approximately) the same result as jackknife after applying the autocorrelation correction sqrt(1 + 2tau(BA, 8)).
https://gist.github.com/ffreyer/1021bc3a210d7ff8192fb81795f80b92
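
Spelled out, the correction amounts to something like the following (accessor names assumed for illustration, with tau taken from a LogBinner BA):

naive = std_error(ep, grad)                   # plain error propagator result
corrected = naive * sqrt(1 + 2 * tau(BA, 8))  # inflated by the autocorrelation factor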

Frederic Freyer added 2 commits May 23, 2019 10:52
…or rather a check that the dimensionality between arguments and N_arguments matches. Should be safer and faster.
@ffreyer (Collaborator, Author) commented May 23, 2019

It might be a good idea to let the user supply the partial derivatives for var_O1. That way we can drop the dependency on ForwardDiff and let the user use exact derivatives when available, or an approximation of their choice when not. It also gives them more freedom when defining g, since g itself is no longer explicitly needed.
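
Sketched out, the idea would let users choose between exact and approximate derivatives (names are illustrative only):

g(x, y)  = y - x^2
dg(x, y) = [-2x, 1.0]                 # exact gradient, supplied by the user

# ...or a finite-difference approximation of their choosing:
fd_dg(x, y; h = 1e-6) = [(g(x + h, y) - g(x - h, y)) / 2h,
                         (g(x, y + h) - g(x, y - h)) / 2h]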

@codecov codecov bot commented Jul 25, 2019

Codecov Report

Merging #46 into master will increase coverage by 0.51%.
The diff coverage is 97.77%.

@@            Coverage Diff             @@
##           master      #46      +/-   ##
==========================================
+ Coverage   95.87%   96.38%   +0.51%     
==========================================
  Files           7        9       +2     
  Lines         218      360     +142     
==========================================
+ Hits          209      347     +138     
- Misses          9       13       +4
Impacted Files Coverage Δ
src/BinningAnalysis.jl 100% <ø> (ø) ⬆️
src/ErrorPropagation/statistics.jl 96.77% <96.77%> (ø)
src/ErrorPropagation/binning.jl 98.63% <98.63%> (ø)
src/generic.jl 94.11% <0%> (-5.89%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9531b3c...b0cb8de.

@ffreyer (Collaborator, Author) commented Jul 25, 2019

I've rewritten ErrorPropagator in the style of LogBinner, with all its functionality. I dropped the convergence functions, though. I also copied the related tests.

I dropped ForwardDiff in favor of passing the gradient either as a function or as a vector of values evaluated at means(ep).

I copied the jackknife-related tests and adjusted the values for the `f(x) = identity` cases. They now reflect the "correct" results obtained from mean(ts) and std(ts)/sqrt(length(ts)). I haven't adjusted the cases using f(x1, x2) = x1 - x2^2 yet. The complex case also still needs work.
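
For illustration, the two styles would be used roughly like this (call signatures assumed for the sake of the example):

g(x1, x2)  = x1 - x2^2
dg(x1, x2) = [1.0, -2x2]          # gradient of g

std_error(ep, dg)                 # gradient as a function, evaluated at means(ep)
std_error(ep, dg(means(ep)...))   # gradient as a precomputed vector of values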

@ffreyer (Collaborator, Author) commented Aug 21, 2019

I was calculating the covariance matrix incorrectly. The correct formula is cov(x, y) = <(x - <x>) * conj(y - <y>)> (matching Julia's results; see also https://stats.stackexchange.com/questions/215982/covariance-matrix-of-complex-random-variables). This transforms to <x * conj(y)> - <x><conj(y)>, which is now used by covmat and var(ep, function).
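
A quick numerical sanity check of that identity against Julia's own cov (uncorrected, to match the plain means):

using Statistics

x = rand(ComplexF64, 1000); y = rand(ComplexF64, 1000)
c1 = mean((x .- mean(x)) .* conj(y .- mean(y)))
c2 = mean(x .* conj(y)) - mean(x) * conj(mean(y))
c1 ≈ c2 ≈ cov(x, y, corrected = false)  # true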

I also adjusted the tests now. ErrorPropagator and jackknife only differ for g(x, y) = x^2 - y, where they are within ~1% of each other for the standard deviation and within ~25% for the expectation value of g(x, y). This gets better for longer time series.

@ffreyer changed the title from "Error Propagation (WIP)" to "Error Propagation" on Aug 21, 2019
@ffreyer (Collaborator, Author) commented Aug 21, 2019

This is ready to be reviewed/merged.

@ffreyer (Collaborator, Author) commented Jun 23, 2020

> I also adjusted the tests now. ErrorPropagator and jackknife only differ for g(x, y) = x^2 - y, where they are within ~1% of each other for the standard deviation and within ~25% for the expectation value of g(x, y). This gets better for longer time series.

I adjusted the tests again: jackknife now tests against 1000 random values (with a set seed) rather than 10 fixed values. The error propagator tests now use the results from jackknife and generally match them with an accuracy of 1e-12. The "worst case" from before, i.e. the 25% difference, is now below 0.1%.

I'll look through this stuff again and if I don't find anything wrong with it within a week or so I'll just merge it.
