Multihypo docs page, better intro? #674
Separately, in the case of a …

What does that mean? When a factor is used as a conditional likelihood for calculating proposal distributions and nullhypo is active, we add a large amount of entropy to the given sample (it must stay on-manifold). This gain in entropy is then 'rejected' during the product-of-proposals step with the other factors on the variable. The multiscale Gibbs operation selects the most relevant kernels from the incoming proposal densities, and it is very likely that the high-entropy kernels will be rejected. Said another way, if a variable were not restricted (say one connected factor at …) …

It is possible that the current IIF code is not always adding the noise (entropy) right, or the …
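To make the "entropy gets rejected in the product" idea concrete, here is a minimal self-contained sketch in plain Julia with Distributions.jl (not the actual IIF implementation; the toy densities, the 20.0 spread value, and all variable names are made up for illustration). A nullhypo fraction of one factor's proposal samples is spread widely; reweighting by a second, tight proposal on the same variable gives those spread samples negligible weight, so they are effectively discarded.

```julia
using Distributions, Random, Statistics

Random.seed!(42)

N        = 1000
nullfrac = 0.1                     # hypothetical nullhypo fraction
truth    = 10.0

# Proposal from factor A: tight around the truth, except for a nullhypo
# fraction of samples that get a large uniform "entropy" term added.
samplesA = truth .+ 0.1 .* randn(N)
nullidx  = randperm(N)[1:round(Int, nullfrac*N)]
spread   = 20.0
samplesA[nullidx] .+= spread .* (rand(length(nullidx)) .- 0.5)

# Proposal from a second factor B on the same variable: tight around the truth.
proposalB = Normal(truth, 0.2)

# Product step, viewed as importance reweighting: weight A's samples by B's density.
w = pdf.(proposalB, samplesA)
w ./= sum(w)

# Resample according to the weights; the widely spread (high-entropy) samples
# carry negligible weight, so they are effectively "rejected" by the product.
kept = samplesA[rand(Categorical(w), N)]
println("std before product: ", std(samplesA))
println("std after product:  ", std(kept))
```

The real solver does this on-manifold and inside the multiscale Gibbs product, but the qualitative effect is the same: the added entropy survives only if no other proposal contradicts it.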
Thanks @dehann for the writeup, that's pretty much what I already figured out by performing convolutions using factors with the nullhypo set. However, …

Best,
Hi @dehann ,

I came across having to describe the mechanics of the "adding entropy" for my thesis again (see previous post). Could you elaborate on that term a little or point me to the implementation? Grasping the mathematical model behind this would be really important for me :)

Best,
Hi Leo,

I can point you to the code where it is added: https://github.com/JuliaRobotics/IncrementalInference.jl/blob/4ac8ccda4f27d8834950ad6a1913851031932add/src/ApproxConv.jl#L228-L239
Thanks for the hint. If maniAddOps does not do anything strange, then by using …

```julia
using Gadfly

spreadDist = 1;
n = 1000;
x = Vector{Float64}(undef,n)
for k in 1:length(x)
    x[k] = spreadDist*(rand()-0.5)
end
pl = plot(x=x, Geom.histogram(bincount=10))
```

Next question would be if …

```julia
julia> fg = initfg()

julia> getSolverParams(fg).spreadNH
3.0
```

```julia
using Caesar, RoMEPlotting, LinearAlgebra, Statistics, Gadfly

N = 100
fg = initfg()
addVariable!(fg, :l0, Point2, N=N)
trans = MvNormal([1.0,0.0], Diagonal([0.01;0.01]))
addVariable!(fg, :l1, Point2, N=N)
l1Prior = PriorPoint2(MvNormal([1,0], Diagonal([0.01;0.01])))
addFactor!(fg, [:l1], l1Prior)
f1 = Point2Point2(trans)
addFactor!(fg, [:l0, :l1], f1, nullhypo=0.5)
pts = approxConv(fg, :l0l1f1, :l0)
p = kde!(pts)
pll0 = plotKDE(p)
ensureAllInitialized!(fg)
pl = plotSLAM2DLandmarks(fg)
```

The particles of l1 get spread by the convolution in a circle with diameter 3.0 (plus the prior noise on l1 and the factor's noise) around l0's position implied by the factor, in a uniform way.

Best,
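As a quick sanity check of that observation, one could measure the radius of the convolved particles around their mean. This assumes `pts` returned by `approxConv` above is a 2×N array of l0 points (how older IIF versions returned Point2 samples); `Statistics` is already loaded by the script:

```julia
# Hypothetical follow-up to the script above.
cx, cy = mean(pts[1,:]), mean(pts[2,:])
r = sqrt.((pts[1,:] .- cx).^2 .+ (pts[2,:] .- cy).^2)
println("radius extrema: ", extrema(r))  # roughly up to spreadNH/2 plus the factor and prior noise
```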
Hi @lemauee ,

So the most important aspect of nullhypo is that for some fraction of the particles calculated on a variable proposal (via samples as proxy) on say p(X | Y=Y) (i.e. evaluating the conditional p(X|Y) for given values Y), it is as if p(X|Y) never even existed. So if you say 10% of samples are "nullhypo" (and remember, given the Bayes tree approach we don't want to change the factor graph by deleting factors), we need to do the next best thing ... how to make it as if the factor was never there? Here in IIF we inject noise onto 10% of the existing samples in X, and then solve the remaining 90% according to the structure in p(X|Y). It is important that the 10% follows from the existing values in X, since the factor should have NO effect on those 10% of samples. However, we cannot just use the values in X unchanged, because that would break the Markov assumption in sampling that each step in the sampling chain should only be correlated with one previous step, and not correlate back over multiple stages of the sampling.

The practical effect of this: say the true value x* is around 10, but the conditional p(X|Y=Y) is trying to force X toward say -100; then there needs to be some way (during this iteration of the Gibbs sampling) to relieve that "strain". By saying 10% nullhypo, there is some chance that a few samples will be much closer to x*. As the Gibbs process continues, samples nearer x* will be selected and all the other 99% of samples predicted by p(X|Y=Y) and nullhypo will be rejected at that stage. Therefore the posterior on X might be a little biased due to our workaround approach, but that is the difference between say 10 to 11, vs 10 to -100. The "win" in this compromise is that the structure of the Bayes tree does not have to be updated.

What spreadNH does is magnify the existing spread of pre-X values by adding random values that are much bigger than the current spread (it makes us add noise relative to the scale being used). If spreadNH is too big, the bias on the posterior gets a bit worse. If spreadNH is too small, large discrepancies between the true value and the proposals from p(X|Y) will not be allowed to alleviate the "strain" as a nullhypo should. spreadNH=3 works well as a tradeoff, from trial and error in SLAM.

The idea is to add this "noise" to pre-X samples on-manifold, so we have to be careful how the exponential map is defined. Before AMP 41, RoME 244, we have to use …

On rand vs randn, it does not really matter, since this is a rejection sampling technique that is still localized up to spreadNH around the current values. rand is a little better in that there is no concentration within the spreadNH region -- but either should work fine. rand should have just a little less bias in the final posterior result.

Hope that helps, we still need to write all this up in our publications :-)

Best,
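A minimal sketch of the mechanism described above, assuming a plain Euclidean "manifold" and hypothetical names (`nullhypo_proposal`, `conditional`, `nullfrac`); the actual on-manifold implementation lives in the ApproxConv.jl code linked earlier. The nullhypo fraction of the proposal follows from the pre-existing X values plus uniform noise scaled by spreadNH times their current spread, while the remaining fraction is computed from the conditional p(X|Y):

```julia
using Random

# Hypothetical stand-ins: preX holds the existing sample values on variable X,
# and conditional(n) would draw n proposal samples from p(X | Y=Y).
function nullhypo_proposal(preX::Vector{Float64}, conditional;
                           nullfrac::Float64=0.1, spreadNH::Float64=3.0)
    N = length(preX)
    out = conditional(N)                          # solve all samples via p(X|Y) first
    nullidx = randperm(N)[1:round(Int, nullfrac*N)]
    # Noise is scaled relative to the current spread of the pre-X values,
    # so the injected "entropy" matches the scale the variable is actually using.
    spread = spreadNH * max(maximum(preX) - minimum(preX), eps())
    # The nullhypo fraction follows from the existing values (NOT from p(X|Y)),
    # plus uniform on-manifold noise -- here the manifold is just the real line.
    out[nullidx] = preX[nullidx] .+ spread .* (rand(length(nullidx)) .- 0.5)
    return out
end

# Toy usage: pre-X sits near the true value 10, while p(X|Y) pulls toward -100;
# the nullhypo samples give the later Gibbs product a chance to stay near 10.
preX = 10.0 .+ 0.5 .* randn(100)
prop = nullhypo_proposal(preX, n -> -100.0 .+ 0.5 .* randn(n))
```

The choice of rand (uniform) rather than randn here matches the remark above that a flat spread avoids concentrating samples within the spreadNH region.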
Hi @dehann ,

Thanks for the insight; getting the idea of working on the existing values of X was really important for understanding the behavior. In my little example that could not be distinguished from my (wrong) assumption. So to get a belief for a certain conditional before the Gibbs sampling step, in case there is a nullhypo on that conditional, it is the prior kernels from the variable that is being sampled to ("pre-X") that get spread by the nullhypo amount, and not the kernels originating from the factor "X|Y"?

What still needs clarification for me from your post is the statement "makes us add noise relative to the scale being used". How does the relative spreadNH translate to an absolute spread? Or did I misunderstand something there?

I experienced first hand what setting spreadNH too big does: the all-famous 150-pose example with 10% outliers added and nullhypo set to 0.1 looks like this when spreadNH is set to the standard 3.0:

But with spreadNH=3.0 the example looks wrong after the whole round:

If it's of any help I can port the currently "read-from-matfile" example to a "1-script-julia-version". I don't want to see Caesar lose against parametric solving in such a simple scenario ;)

Best,
"""
I have read the data association page and I think it makes sense. One of the questions I had was on scope/approach - depending on where you want to go with it, it may make sense to first introduce the problem in general terms before describing how it is tackled within Caesar.
"""