Abel transforms of non-noisy & non-experimental data of smooth functions #229
@oliverhaas thanks for sharing your nice work! It would be great if there were a little more information about the FMM+EC method, your particular implementation of the Hansen-Law method, and code documentation ;-) From my quick tests of your method (based on your example), it looks good. Here is transform pair function 3 (1D image size 201):
Is there a pure Python version of your code available? Do you have any plans to make the FMM+EC method available for PyAbel?
I think this is because they all employ a similar matrix product as the final step.
PyAbel's most accurate method appears to be `three_point`. A second run reduces the `three_point` execution time to 1 ms. Code for the calculation above:

```python
# hacked from example001_gaussian.py (@oliverhaas)
import openAbel
import abel
import numpy as np
import matplotlib.pyplot as mpl
import time

# Parameters
nData = 201

# PyAbel transform pair
f = abel.tools.analytical.TransformPair(nData, 3)

# Create the Abel transform object independently of the data.
# This way it's much faster if repeated transforms are done.
t0 = time.time()
abelObj = openAbel.Abel(nData, 1, 0, f.dr)
dataOut = abelObj.execute(f.abel)
t1 = time.time()
mseOA = np.square(dataOut - f.func).mean()
print('openAbel in {:.1f} ms, mse={:.2e}'.format((t1-t0)*1000, mseOA))

t0 = time.time()
pya = abel.dasch.three_point_transform(f.abel, dr=f.dr)
msePA = np.square(pya - f.func).mean()
t1 = time.time()
print('PyAbel three_point in {:.1f} ms, mse={:.2e}'.format((t1-t0)*1000, msePA))

# Plotting
fig, axarr = mpl.subplots()
axarr.plot(f.r, f.func, 'r-', label='analy.')
axarr.plot(f.r, pya, 'g-.', label='PyAbel 3pt mse={:.2g}'.format(msePA))
axarr.plot(f.r, dataOut, 'b:', label='openAbel mse={:.2g}'.format(mseOA))
axarr.set_ylabel('value')
axarr.legend()
fig.suptitle('Inverse Abel Transform {}'.format(f.label), fontsize=16)
mpl.tight_layout()
mpl.subplots_adjust(top=0.87)
mpl.savefig('fmm.png', dpi=75)
mpl.show()
```
Thank you for the quick response. Many of your questions/remarks could be answered with "Hopefully I get around to it soon to improve it." And I hope that's somewhat normal :). I'll comment anyway.
I'm writing up something more detailed for my PhD, but in principle the references are mentioned in my readme.
I followed the original Hansen-Law paper fairly closely, but made two smaller adjustments:
If you have anything in particular in mind, let me know. I'm improving the code documentation slowly, but if I'm essentially the sole user of it, I want to keep the documentation minimal (but of course still helpful).
Unfortunately not. That's one thing I'm not really planning on doing either. Personally, Python gets really difficult for me in some calculations when you need that one manual for loop and it wrecks your whole run time :).
Yes I have/had. It depends a little bit on what you're expecting. I planned to write a simple wrapper for sure. But for example if a pure Python version is the minimum requirement for the PyAbel collaborators, then I'd probably stay away from directly contributing to the master branch, and only have my simple wrapper in my own branch and won't bother too much.
I'm not sure if I understand correctly what you mean. But I don't think the matrix product has anything to do with it. One could write my FMM+EC in matrix form and it would still be higher order.
Yes, the three_point method seems to have the smallest error for fixed small data sets. Convergence order with increasing N (in PyAbel's case the width of the image in pixels) is still only first order. I actually expected it to have third-order convergence since it uses three points and thus a quadratic approximation (EDIT: forgot it takes the derivative, so it should still be second order), but something is apparently not clean in the method or implementation, since it only converges at first order. If I wanted to rewrite your example to be more flattering for my code, I would:

```python
# hacked from example001_gaussian.py (@oliverhaas)
# @stggh version modified again by @oliverhaas
import openAbel
import abel
import numpy as np
import matplotlib.pyplot as mpl
import time

# Parameters
nData = 201

# PyAbel transform pair
f = abel.tools.analytical.TransformPair(nData, 3)

# Create the Abel transform object independently of the data.
# This way it's much faster if repeated transforms are done.
abelObj = openAbel.Abel(nData, 1, 0, f.dr, order=3)  # default is order=2
t0 = time.time()
dataOut = abelObj.execute(f.abel)
t1 = time.time()
mseOA = np.square(dataOut - f.func).mean()
print('openAbel in {:.1f} ms, mse={:.2e}'.format((t1-t0)*1000, mseOA))

t0 = time.time()
pya = abel.dasch.three_point_transform(f.abel, dr=f.dr)
t1 = time.time()
msePA = np.square(pya - f.func).mean()
print('PyAbel three_point in {:.1f} ms, mse={:.2e}'.format((t1-t0)*1000, msePA))

# Plotting
fig, axarr = mpl.subplots()
axarr.plot(f.r, f.func, 'r-', label='analy.')
axarr.plot(f.r, pya, 'g-.', label='PyAbel 3pt mse={:.2g}'.format(msePA))
axarr.plot(f.r, dataOut, 'b:', label='openAbel mse={:.2g}'.format(mseOA))
axarr.set_ylabel('value')
axarr.legend()
fig.suptitle('Inverse Abel Transform {}'.format(f.label), fontsize=16)
mpl.tight_layout()
mpl.subplots_adjust(top=0.87)
mpl.savefig('fmm.png', dpi=75)
mpl.show()
```

2018-09-26 - DH made some small formatting edits.
Hi Oliver! Thank you so much for showing us your great work on this new method!
PyAbel already has some Cython code (for the Direct method). It would be nicer to keep as much as possible in pure Python, but if Cython is required for speed, then that shouldn't keep us from bringing a useful method into PyAbel. It seems that this method offers access to a very different space of speed and accuracy than our other methods. The issue of convergence with increasing order is interesting. This is not something that we have investigated much in the past, but I would have thought that the higher-order methods would converge faster as well, and it is strange that they don't seem to.
A pure Python version would probably be nice in general, but I will put my priorities elsewhere in the near future. Of course I would support anyone who wants to give it a shot :).
I reread the Dasch paper, and as far as I understand it's actually just analytical integration of a piecewise polynomial, so it should converge with the respective order of the polynomial. I'm guessing there is a small bug somewhere in the implementations. I actually wrote the small wrapper mentioned before, so people can use openAbel from PyAbel (see my PyAbel fork https://github.com/oliverhaas/PyAbel; you also need my code installed, https://github.com/oliverhaas/openAbel; run/look at example_openAbel.py in the PyAbel examples folder. I'm still working a little bit on both, so don't be surprised). Not a lot of documentation, since it's more or less just for testing/developing. Cheers, Oliver Haas
To add to my previous post, I looked at the comparison of PyAbel and openAbel methods again and tried to figure out the convergence issue. I switched from looking at pointwise convergence to looking at the convergence of the mean squared error (MSE) like @stggh did in his example, which makes more sense in general :). And I changed the test function from a Gaussian to a function which decays to exactly zero at the upper end of the interval. So here is the result (forgive me for only doing a few points for the slower methods). In combination with my previous results I have some remarks:
A small note: when I say something converges at nth order, it is more correctly (n+1/2)th order due to the square root in the Abel transform. Cheers, Oliver Haas. EDIT: I replaced the plot with a slightly better one.
Hello again,
When I talk about convergence in the context of (non-adaptive) numerical methods, I usually mean the order of convergence of the relative error of the method with increasing number of grid points for the same problem. For example, if you numerically integrate a Gaussian (mu=0, sig=1) from 0 to 1 with the trapezoidal rule and use N = 1000 points, you will get a 100 times smaller error than if you used only N = 100 points. This is because the trapezoidal rule is 2nd-order accurate (100 = 10^2 times smaller error if you multiply N by 10).
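[Editor's note: this convergence estimate is easy to verify directly. A minimal NumPy sketch, using a hand-rolled trapezoid sum (so it works across NumPy versions) and `math.erf` for the analytic value:]

```python
import numpy as np
from math import erf, sqrt, pi

def trapz_error(n):
    """Trapezoidal-rule error for the standard normal pdf integrated on [0, 1]."""
    x = np.linspace(0.0, 1.0, n)
    y = np.exp(-x**2 / 2) / sqrt(2 * pi)
    h = x[1] - x[0]
    approx = h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])
    exact = 0.5 * erf(1 / sqrt(2))  # analytic integral of the pdf on [0, 1]
    return abs(approx - exact)

# Ten times more points -> roughly 10^2 = 100 times smaller error (2nd order).
ratio = trapz_error(100) / trapz_error(1000)
print(ratio)
```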
The method two_point implicitly assumes that the function can be approximated by piecewise linearly connecting the data points, which leads to a constant derivative in between points.
Instead one could calculate the derivative of the function first (and make some smaller adjustments) and then just multiply by J(Iu, Ju). The method three_point does in principle the same, only connecting the data points piecewise quadratically. But again, I think there is an error in the implementation. Otherwise it would look different in the convergence plot.
For me it is not clear how Dasch handles the right end of the integration interval. Possibly it isn't treated cleanly in his paper to begin with. Cheers, Oliver Haas
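[Editor's note: the derivative-then-integrate idea can be illustrated outside of Dasch's matrix formulation. A minimal sketch (not the Dasch method itself; `inverse_abel_point` is a hypothetical helper): substituting t = sqrt(y^2 - r^2) into the inverse Abel integral f(r) = -(1/pi) * int_r^inf g'(y)/sqrt(y^2 - r^2) dy removes the square-root singularity, leaving the smooth integrand g'(y)/y:]

```python
import numpy as np

def inverse_abel_point(r, dg, tmax=8.0, n=4001):
    # f(r) = -(1/pi) * int_0^inf g'(y)/y dt  with  y = sqrt(r^2 + t^2),
    # after the substitution t = sqrt(y^2 - r^2)
    t = np.linspace(0.0, tmax, n)
    y = np.sqrt(r**2 + t**2)
    integrand = dg(y) / y
    h = t[1] - t[0]
    trap = h * (0.5 * integrand[0] + integrand[1:-1].sum() + 0.5 * integrand[-1])
    return -trap / np.pi

# g(y) = sqrt(pi) * exp(-y^2) is the forward Abel transform of f(r) = exp(-r^2),
# so the inversion with the analytic derivative g' should recover exp(-r^2).
dg = lambda y: -2.0 * y * np.sqrt(np.pi) * np.exp(-y**2)
r = 0.5
err = abs(inverse_abel_point(r, dg) - np.exp(-r**2))
```

With an analytic derivative the quadrature converges at the order of the integration rule, which is the point being made about two_point/three_point above.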
Excellent! Thanks for the detailed explanation. It had not occurred to me that ΔJ formed the derivative.
Hello again, so here are probably my final remarks: I looked at three_point again, and I think the reason it's only first order is that Dasch uses a first-order second-derivative stencil at zero. The stencil [1, -2, 1] is only second-order accurate if central finite differences are used, which is not possible here because data is not available at r = -dr. Instead, a forward difference stencil has to be used ([2, -5, 4, -1], see https://en.wikipedia.org/wiki/Finite_difference_coefficient) if second-order accuracy is desired everywhere. It's probably not relevant for most people, since if only one transform is done this only affects the result at r = 0. Anyway, I rewrote three_point just to check (see https://github.com/oliverhaas/PyAbel) and I get second-order convergence (see bottom left plot).

As a note: please don't use example_dasch_methods.py as a test case, since the combination of how the forward transform is done with hansenlaw and the choice of the data set (non-zero at the right end of the interval) makes it so that by coincidence my version of three_point has a larger error than even onion_peeling.

This will probably be the last of my contributions for a while (testing- and implementation-wise), but I will be around posting and maybe answering questions anyone might have. I know that some of my posts have been a little bit messy :). Cheers and have a nice weekend, Oliver Haas
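[Editor's note: the accuracy difference between the two stencils is easy to check numerically. A small sketch using exp(x) as the test function, for which f''(0) = 1: halving h should halve the error of the one-sided [1, -2, 1] stencil (1st order) but quarter the error of the [2, -5, 4, -1] stencil (2nd order):]

```python
import numpy as np

f = np.exp  # test function with f''(0) = 1

def d2_one_sided_121(h):
    # the [1, -2, 1] stencil applied one-sidedly at x = 0: only 1st-order accurate
    return (f(0) - 2 * f(h) + f(2 * h)) / h**2

def d2_forward_2541(h):
    # the forward stencil [2, -5, 4, -1]: 2nd-order accurate at x = 0
    return (2 * f(0) - 5 * f(h) + 4 * f(2 * h) - f(3 * h)) / h**2

# error ratios when halving the step size
r1 = abs(d2_one_sided_121(1e-2) - 1) / abs(d2_one_sided_121(5e-3) - 1)
r2 = abs(d2_forward_2541(1e-2) - 1) / abs(d2_forward_2541(5e-3) - 1)
print(r1, r2)  # r1 near 2 (1st order), r2 near 4 (2nd order)
```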
Thanks @oliverhaas, I will take a look at your changes to three_point.
Hello PyAbel community!
If this is the wrong place to write something like this, or if there is no interest at all, let me know. I'm still fairly new to the whole collaborating-and-sharing-on-GitHub thing.
I stumbled on PyAbel quite a while back and was impressed with the collaboration and documentation on it. Although I couldn't use it directly for my work, it helped me quite a bit to get things sorted.
I needed a fast way to calculate accurate forward and backward Abel transforms of large non-experimental & non-noisy data sets of discretized smooth functions, and stumbled upon plenty of publications on the topic of Abel transforms. However, either the accuracy or the run times of the methods weren't sufficient for my work, since the main focus of the aforementioned publications is usually the transformation of noisy data and small data sets.
In the end I managed to adapt the Fast Multipole Method (FMM) in combination with high-order end corrections (EC) to my needs, and wanted to share some thoughts here. Similar problems have already been solved by these methods in the published literature, but since the Abel transform has not been one of them so far, and since I think there are some pitfalls I managed to avoid, I wanted to share.
And if I'm missing something in my explanations or misunderstood something I'm happy to learn.
I implemented the above-mentioned FMM+EC and a couple of other methods in Python/Cython (the repository is here, https://github.com/oliverhaas/openAbel, but I just started uploading a couple of days ago). Here is an overview of some of the methods, their error, convergence, and run times for a test case of a simple Gaussian forward Abel transform:
![figure_1-1](https://user-images.githubusercontent.com/4050664/46040985-01e29a00-c112-11e8-92df-91735680e802.png)
One can see that in the case of sufficiently smooth functions the FMM+EC has linear O(N) run time, high-order convergence, and small errors. I should note that I intentionally used the forward Abel transform, since it behaves quite a bit nicer than the backward Abel transform. Actually, in my work I can circumvent a lot of the problems of the backward Abel transform, because I can differentiate analytically and I don't have experimental noise anyway. But for experimentalists this is of course not the case.
For comparison I tried to do a similar range of plots including both my methods and PyAbel's methods. There are some non-ideal parts of the plots, especially since I haven't dissected PyAbel enough to call each subroutine perfectly, but I think it's a good overview. Legend as above, with PyAbel methods labeled PA plus their PyAbel name. Again a simple Gaussian.
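[Editor's note: the Gaussian is a convenient test case because its forward Abel transform has a closed form, A[exp(-r^2)](y) = sqrt(pi) * exp(-y^2). A minimal quadrature check (`forward_abel_point` is a hypothetical helper, not part of openAbel or PyAbel): substituting t = sqrt(r^2 - y^2) turns the singular integrand into the smooth 2*f(sqrt(y^2 + t^2)):]

```python
import numpy as np

def forward_abel_point(y, f, tmax=8.0, n=4001):
    # F(y) = 2 * int_y^inf f(r) * r / sqrt(r^2 - y^2) dr; after the
    # substitution t = sqrt(r^2 - y^2) this is 2 * int_0^inf f(sqrt(y^2 + t^2)) dt
    t = np.linspace(0.0, tmax, n)
    g = 2.0 * f(np.sqrt(y**2 + t**2))
    h = t[1] - t[0]
    return h * (0.5 * g[0] + g[1:-1].sum() + 0.5 * g[-1])  # trapezoid sum

y = 0.5
numeric = forward_abel_point(y, lambda r: np.exp(-r**2))
exact = np.sqrt(np.pi) * np.exp(-y**2)
err = abs(numeric - exact)
```

The smoothness after substitution is one reason the forward transform behaves so much more nicely than the backward one, which additionally needs a derivative of the data.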
![figure_1-2](https://user-images.githubusercontent.com/4050664/46042113-e1680f00-c114-11e8-93c2-2de092e5f5ea.png)
My observation is that accuracy drops quite a bit due to the numerical derivative, even for smooth functions. The convergence order is still maintained if one takes a correct finite difference of suitable order, obviously. Overall I strongly recommend trying to take the derivative analytically (for the non-experimentalists, of course), which in my case was luckily possible due to the convolution derivative theorem and Leibniz's integral rule.
I'm not sure why all PyAbel methods seem to be at most first order. Especially from the Dasch two-point and three-point methods I expected different convergence orders. I'm guessing that most implementations in PyAbel either explicitly or implicitly use a first-order accurate finite-difference derivative.
And I assume the large errors of the Basex method will be mitigated due to the Tikhonov regularization as discussed in #225.
I had some success with FMM+EC and numerical derivatives with maximally flat filters, and if there is interest I can share them at some point. However, typically accuracy is pretty much determined by the amount of noise, and there is often no benefit in going to high order methods. Still, the linear run time of FMM+EC might be useful in some cases as well.
Some special remarks on the Hansen-Law method: I think the approach is quite clever and works well if only low accuracy (~1.e-3) is required. By that I mean the general approach of fitting an exponential model to the Abel transform. I tried to produce fits with more exponentials, with only limited success. It's a lot of manual work to get higher accuracy, and the end result is slower and less robust than FMM+EC in my experience. Furthermore, increasing the order of the integration doesn't really do much either (Hansen and Law assume piecewise linear functions); usually Hansen-Law is limited by the accuracy of the exponential fit.
Have a nice week.
Cheers
Oliver Haas