Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enabling use of up/down asymmetric images #54

Merged
merged 33 commits into from Jan 4, 2016

Conversation

DhrubajyotiDas
Copy link
Member

This PR address #27. It contains an implementation of the BASEX algorithm for Abel-inverting images which do not possess up/down symmetry. Up/down symmetry is defined as f(r,-z) = f(r,z). The underlying BASEX algorithm only requires the images being inverted to have axial symmetry, i.e. f(-r) = f(r). This PR enables using PyAbel for this more general case.

The major changes can be summarized as

  • From the end user's perspective, asymmetric transforms are toggled with a new parameter vertical_symmetry = <True, False, or None> (see changed function definition here). The default setting is vertical_symmetry=False so PyAbel should work just as before if the end user does not change anything.
  • n is decomposed into n_horz and n_vert (see lines here through here)
  • nbf is treated as nbf_horz and nbf_vert (here). Using nbf = 'auto' is strongly recommended for now.
  • Independent basis sets are generated for the vertical and horizontal directions (see lines here and here)
  • The inverted image is generated by matrix multiplying the horizontal and vertical components to the raw data (here)
  • A few new helper functions are introduced, including a new centering function. My centering conventions might be different from what was the provision conclusions in Fixed centering function #37 and recon image has shape (width+1,height+1) #34. If necessary, I can amend the PR to reflect the community's consensus on how even-shaped images should be handled, etc.

@DhrubajyotiDas
Copy link
Member Author

Both Travis and AppVeyor seem to have similar errors

ERROR: Failure: ImportError (cannot import name 'get_basis_sets_cached')

As per #36, I have renamed certain functions, including get_basis_sets_cached as get_bs_basex_cached. Possibly that's the reason for this error? I am not very familiar with how the CI tools work, so I will ask the veterans @DanHickstein, @rth, and @stggh to help me out here.

@stggh
Copy link
Collaborator

stggh commented Dec 15, 2015

You are correct. See abel/tests/test_basex.py

@DhrubajyotiDas
Copy link
Member Author

Thanks for the tip @stggh!

Travis seems to fail when nbf is set to n//2+1 in test_basex_basis_set() of abel/tests/test_basex.py.

From what I understand about the number of basis sets necessary to resolve the image, if n=101, that needs 51 basis functions (n//2+1) rather than 50 (n//2) as I mentioned here before.

Maybe @DanHickstein can look into updating dan_basis100{}_1.bst.gz to have this work? Thoughts?

@DanHickstein
Copy link
Member

Nice work @DhrubajyotiDas!

Currently, abel.BASEX (soon to be renamed abel.iabel_basex) is using vertical symmetry, so if we wanted to keep the current behavior, shouldn't we use vertical_symmetry=True? Not arguing for one default rather than the other (either seems fine to me), but if we want to maintain current behavior, I think it should be False.

TravisCI and Appveyor just automatically run the tests that are located in abel/tests. You can run them on your system with

python -c "import abel.tests; abel.tests.run_cli(coverage=True)"

which is much faster than waiting for Travis. (I always forget this command, but luckily @rth wrote it down here: https://github.com/PyAbel/PyAbel/blob/master/CONTRIBUTING.md)

You are right that we now believe it to be foolish to run Basex with n=101 and nbf=50. A long time ago we were generating the basis sets with a matlab script and wanted to make sure that the python code was producing the same thing. So, we included this test. Now, it seems a little silly. But it is a good test that nothing changes with the results of the basis set generation algorithms. Maybe open a separate issue and we can decide what to do.

On the topic of tests, I think that it would also be a good idea to include a test that ensures that both the symmetric and asymmetric BASEX transforms return (nearly) identical results.


Should we consider merging everything into one centering function? I can see that the current centering function does not satisfy the need for returning asymmetric images, but can't the current function be modified fairly easily? Currently, it takes the parameter n - which should probably be renamed size or output_size - but couldn't the function be modified to accept either integer (size=200) or tuple (size=(350,200)) arguments?

I suppose that we may also want to give the user the option to only specify the center column, and also it might be nice to allow the image to be cropped to the maximum size possible given the specified center (in which case the output size need not be specified). The function could default to size=(-1,-1), in which case the maximum image size is returned. The user could specify size=(-1,200) to crop/pad the image to 200 columns, but use all of the rows. The only issue is, in the case of something like size=(200,-1) (or size=(-1,-1)), it's not clear if an odd or even sized image should be returned. I think an additional col_parity='odd' keyword argument would be necessary.

BTW, the consensus was actually to move the centering functions out of the transform functions (i.e., abel.BASEX should just assume that it's given a proper image and we leave the centering to the user), we just had not gotten around to this yet. So, if you don't feel like modifying your centering function, we can probably still merge and just open a separate issue about creating a single centering function.

' This behaviour is currently not tested and should not be used\
unless you know exactly what you are doing. Setting nbf="auto" is best for now.')
nbf = [nbf]*2 # Setting identical number of vert and horz functions
elif type(nbf) == list:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also in the case nbf is a tuple, maybe rather do something like elif isinstance(nbf, (list, tuple)):

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made this change. Thanks!

@DhrubajyotiDas
Copy link
Member Author

@DanHickstein Thanks for the feedback!

I agree that the default symmetry setting should be vertical_symmetry=True. That was an oversight on my part. Strangely, the options become incomprehensible after staring at them continuously for a few hours!


Currently, the user can specify the center as

  • integer, in which case it is assumed to be the center column of the image, or
  • tuple, where the center is given in (x,y) format. This is for backwards compatibility.

The size of the image output from my centering function center_image_asym is always (n_vert, n_horz), either by cropping or padding zeroes as necessary.
IMO, since the user has already specified n (which, as you mentioned, we should rename to something more intuitive such as size), it seems appropriate to use that as the spec for how big the inverted images ought to be.
I think the original unified centering function might be suited to do all this OOTB. My main reason for writing another centering function was simply for code readability, and because I wanted rows of zeroes to be padded when a vector was passed as rawdata. I believe the original centering function duplicated the vector.


@rth Thanks for the suggestions!

I'll add the unit tests and send up a few more commits. Currently both symmetric and asymmetric BASEX show excellent agreement for a Gaussian function.

example_1

example_2

EDIT: I'll get back to you on the question of Mc_vert.

@DanHickstein
Copy link
Member

Awesome work guys!

@rth, I think that our new convention is actually that abel.basex_transform should be renamed abel.iabel_basex_transform and abel.BASEX should re-constructed (without centering) as abel.iabel_basex. We can probably implement this change to BASEX after this merge, though we may want to hold off on messing with Hansen-Law until @stggh is done with his PR.

I didn't realize before that your abel.iabel_direct only works for one quadrant (more precisely, doesn't it work for half of the image?). I think that the convention that we decided upon is that the function abel.iabel_xxxx_transform contains the core routine, but can have non-standard inputs and outputs depending on the transform. abel.iabel_xxxx will contain a wrapped function that will accept a centered (but whole) image and return an object containing results.transform (and potentially other results). So, I think that you probably just need to change your abel.iabel_direct to abel.iabel_direct_transform and add abel.iabel_direct as a wrapper function that will handle a full image. Similar change for the forward transform is probably needed.

No need to worry about this for this PR, but in the future we can make the change. Same thing for the centering function. Let's combine the two functions later.

I'm glad that @rth asked about the basis functions. I've read the paper many times, and I think that I understand the simple matrix multiplications that take place during the actual transform. But I really have no idea what's going on with the generation of the basis functions. If you can shed light on this, it would be much appreciated.

@DhrubajyotiDas
Copy link
Member Author

@rth and @DanHickstein

Regarding the vertical basis functions, here are the relevant sections from the original Dribinksi et al. paper

image


image


image
image

image


Let's look at the vertical_symmetry=True case (classical BASEX):

To clarify the notation, the basis sets that were being calculated earlier

  • M, and Mc

are equivalent to the RHS of Eq 14 and 15 respectively. Unfortunately, the expressions as provided are hard to evaluate numerically (lots of things go haywire when for example, k = 0). However, the authors do provide a nicer algorithm in the Appendix of the paper (Eq A1 - A3) which can be implemented numerically in order to do the actual evaluation. The original matlab code, and all python ports use this formulation. This might explain the mismatch between the matlab code and the paper that @DanHickstein noticed in the past.

Once we have values for rho_k and X_ki (saved inside M and Mc), we need to evaluate Z_mj. While there are no real restrictions on what this should be, in BASEX, Z_mj is taken to have the same form as the Gaussian-type functions of rho_k. This is reflected in the last equation pasted above.

What happens in the case of vertical_symmetry = True is that we can adopt a very useful shortcut, which is that we can simply copy the values of M into Z_mj directly. M stores rho_k evaluated from -r_max to r_max, and for square images with up/down symmetry, that's also -z_max to z_max. Reusing M so that Z_mj contains the same value of -z and z is where the up/down symmetry is enforced.

Once we have X_ki and Z_mj, we can calculate A and B (featured in Eq 13), which are the M_left and M_right matrices in the code.
M_left = inv(Mc.T.dot(Mc)).dot(Mc.T)
M_right = M.dot(inv((M.T.dot(M) + E)))
In the current iteration of BASEX there isn't any regularization (q_1 and q_2 = 0, so E = empty).


No up/down symmetry case:

In this situation, we cannot reuse M so that Z_mj contains the same value of -z and z because the image is undefined for z < 0. So while Z_mj will contain values evaluated using the same function rho_k, the domain is z >= 0, which means the calculation for Z_mj needs to be done explicitly. Fortunately, because we don't need to calculate a forward Abel transform on rho_k for Z_mj (see difference between the expressions for X_ki and Z_mj in Eq 12), this additional calculation is quick.

The new A and B in this case are given by
vert_left = inv(Mc_vert.T.dot(Mc_vert) + E_vert).dot(Mc_vert.T)
horz_right = M_horz.dot(inv(M_horz.T.dot(M_horz) + E_horz))

Once we have A and B, the image reconstruction is identical.


Regarding @rth's concern

I mean, the fact that Mc_vert has a shape (n_vert, nbf_vert) with basis functions in the vertical direction, does this means that different rows are somehow coupled during the transform?

This is directly addressed in the next paragraph of the paper (Pg. 2636), where the authors state that the two transformations (matrix A and B) are completely independent, and work on the raw data matrix P independently. Because of the way the three matrices A, P, and B are multiplied to calculate the matrix of coefficients C (Eq 13 in the clipped image above), every row of P sees a single, uncorrelated element from A and from B. The rows are therefore calculated independently; in fact, I am pretty sure that if the rows were not independent, you couldn't write P = XCZ

Hope this helped clear things up!

P.S. The linear algebra is also pretty confusing to me, but I think I managed to work it out explicitly for a small 5 by 5 image by hand and it seemed to work.

@DhrubajyotiDas
Copy link
Member Author

@rth I'm still working on squashing the n and nbf are tuples bug and vectorizing the inner loops, so let's not merge the PR just yet.

@rth
Copy link
Collaborator

rth commented Dec 17, 2015

@DhrubajyotiDas Thank you for this very extensive explanations! I understand it better now.

I think when we do the documentation, we should include your comment above in the technical subsection of the BASEX algorithm, as this is the first real connexion we have between the Python (or Matlab) code and the Dribinksi et al. paper. So this would be very useful for someone who would want to understand how the BASEX implementation works (in particular with respect to the basis set generation). Thanks again for your comment.

@DanHickstein
Copy link
Member

@DhrubajyotiDas, Thanks for this detailed analysis of the BASEX algorithm! I feel like I almost understand what's going on now!

It would be great if you could include some comments in the code to indicate which sections of the code are performing which equations in the paper. Also, if the variables can be renamed in order to offer better agreement with the paper, I think that we should do this. There is nothing sacred about the current variable names – in general, we just adopted whatever the authors of the original matlab code used – and I suspect that they were working quickly and not giving much thought to code readability.

Maybe a crazy thought, but, if we have the asymmetric transform, is there any reason to keep the symmetric transform? If the user wants a symmetric transform, it seems like we could just average the top and bottom half of the image, and pass the averaged half-image to the asymmetric BASEX algorithm. I would expect that it would run even faster since it would, for example, be performing the transform on a 500x1000 image instead of the full 1000x1000 pixel image. More importantly, there is a lot of duplicated code between the symmetric and asymmetric transforms, and it would be nice to keep just one well-documented basis set generation function.

A few more features we can think about implementing:

  1. Is it possible to enable BASEX to run on just one horizontal part of an image? I expect that some users will just have one side of an image, and all of the other algorithms can handle this, except for BASEX, which would require that we

  2. Can we implement the forward Abel transform using BASEX? Now that @DhrubajyotiDas has a good handle on the linear algebra, this should be reasonably straightforward, as the same basis sets should be used for both the forward and the inverse transform, correct?

  3. Are we any closer to figuring out how to use basis functions that are larger than a pixel? Intuitively, this seems possible, and highly desirable when the data contains only broad features, since it would tend to eliminate high-frequency noise.

Thanks again for your great work on this!

Merry Christmas everyone!

@stggh
Copy link
Collaborator

stggh commented Dec 25, 2015

In relation to item 2), IMO the naming convention:
fabel_<implemenation> : forward transform (when defined)

iabel_<implemenation> : inverse implementation (when defined)

is now somewhat redundant in light of the inverse parameter inverse=False/True) used in <direct> and <hansenlaw> code. For HL it results in 6 functions (4 wrapper functions, and the 2 actual), which IMO, unnecessarily obfuscates the code, reducing user comprehension of the algorithm.

I would like to suggest that we now drop the f/i-prefix and just pass the boolean parameter inverse?

@DhrubajyotiDas
Copy link
Member Author

@DanHickstein Regarding your comment:

Maybe a crazy thought, but, if we have the asymmetric transform, is there any reason to keep the symmetric transform?

I agree, the asymmetric transform is a superset of the symmetric transform, and averaging the top and bottom halves of the given image and performing an asymmetric transform on that ought to be equivalent to performing a symmetric transform on the entire image. I have yet to verify this explicitly, but if it turns out to be true (and I believe it will), we should be able to condense a lot of the code into a single approach. I think a cleaner way to handle this would be to open a new PR for these after the changes in this PR have been merged, just to keep the git history clean.

Regarding the speed-up in transforming an averaged half-image versus a full-image, I don't think there will be that much of an impact during the actual inversions, since those are just matrix multiplications and are very fast as it is. The basis-set generation will be (slightly) faster, but most of the computational time is spent on calculating the horizontal functions (related to the horizontal width of the image), which will be the same between the half and full images.


  1. Is it possible to enable BASEX to run on just one horizontal part of an image?

This is definitely possible. I can think of two ways of doing this:

a) Generate basis sets from r = 0 to r = r_max and proceed as usual. This option would entail writing more code in the xxxx_basis.py file.

b) Mirror the input raw data to make a full horizontal slice from r = -r_max to r = r_max and proceed as usual. This option would entail a small modification to any preprocessing functions we use before sending stuff over to xxxx_basis.py functions.

My personal preference would be for Option b), mainly in the interests of maintaining code legibility. The basis set generation code is pretty esoteric as it is, and a new mechanism for dealing with half-images could be hard to digest.


  1. Can we implement the forward Abel transform using BASEX?

I think the original Matlab script had the command to implement the forward transform using the calculated matrices. It's here in the current code
P = dot(dot(Mc,Ci),M.T) # This calculates the projection, which should recreate the original image
I haven't tested this to see if this actually recreates an image similar to the given raw_data. If it does, I think it would be very useful in testing the goodness of the transform.

Regarding calculating the forward transform directly from inverted raw_data, I don't think BASEX provides a direct method of calculating the forward Abel transform. However, one technique that I can think of is to do something analogous to what is currently happening (when the given raw_data is already forward transformed):

  • Currently in BASEX, we try to express the provided (forward transformed) raw_data as sums of basis functions.
  • These basis functions were calculated by forward transforming Gaussian-ish functions. BASEX provides a way to do these calculations.
  • Now, when we are provided with raw_data that has not been forward transformed (or in other words, the provided raw_data is similar to what is currently being output by BASEX), we could try to express this raw_data as sums of the Gaussian-ish functions.
  • We can then forward transform these Gaussian-ish functions according to the technique in BASEX to arrive at the desired forward transform of raw_data

I don't know yet how to do this, but I suspect it could be as simple as switching M and Mc in the code...


  1. Are we any closer to figuring out how to use basis functions that are larger than a pixel?

I haven't had a chance to investigate this (traveling for the holidays), but I plan to look into this once I get back (early January). Hopefully then I'll finally be able to incorporate some of Roman's recommended changes to this PR and close this to work on the new stuff we have discussed so far, including the naming conventions that @stggh mentioned in his last comment.

@rth
Copy link
Collaborator

rth commented Jan 4, 2016

Happy New Year everybody!

Also, just curious, do you see a way to vectorize the other double loop in that basis set calculation function (starting here)?

Well the Matlab basis set generation consisted of 3 loops. The inner one got vectorized in the current Python implementation. We might factorize one more, which would be much faster, although I'm not sure how easy that would be. I didn't see anything immediate when writing this a while back, but it might be worth getting back to it at some point later.

@DhrubajyotiDas Can we merge this PR then?

@DhrubajyotiDas
Copy link
Member Author

Yes, I think this PR can be merged now, and some of @DanHickstein's suggestions tackled with a separate PR. Who pushes the big green button usually?

@DanHickstein
Copy link
Member

Yes, a separate PR for the other changes is fine.

@DhrubajyotiDas, I leave it to you do push the "big green button"! :)

And great work on this! Thank you!

DhrubajyotiDas added a commit that referenced this pull request Jan 4, 2016
Enabling use of up/down asymmetric images
@DhrubajyotiDas DhrubajyotiDas merged commit a39501a into PyAbel:master Jan 4, 2016
@DanHickstein
Copy link
Member

Huzzah!

@stggh
Copy link
Collaborator

stggh commented Jan 4, 2016

"big green button"! :)

Dumb Q: Is there actually big green button?
I have been driving github via the command line. I like the command line, but often git complains about my lack of git-knowledge. Is there another way?

@DhrubajyotiDas
Copy link
Member Author

There actually is/was a big green button that says Merge pull request.

@DanHickstein
Copy link
Member

I avoid the command line for the most part. I use the GitHub app on the Mac. Also, the web interface offers nice functionality for opening and merging pull requests.

@stggh may not see the big green button, since I don't think that he has write access to the repo.

@DhrubajyotiDas
Copy link
Member Author

@DanHickstein same here as far as command-line trepidation goes. I use the Sourcetree app on Windows, and it seems to get the job done, except for that weird merge issue I had a few weeks ago that @rth bailed me out from. I do like the functionality of the Github web-interface quite a bit.

@rth
Copy link
Collaborator

rth commented Jan 4, 2016

For my part, I also do everything git related in the command line. However, for merging pull requests on github I don't think there is another solution than pushing the "big green button" in the github web-interface (it just disappeared after this PR was merged)...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants