Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrating polarized field propagation? #91

Closed
Jashcraf opened this issue Mar 3, 2023 · 25 comments
Closed

Integrating polarized field propagation? #91

Jashcraf opened this issue Mar 3, 2023 · 25 comments

Comments

@Jashcraf
Copy link
Contributor

Jashcraf commented Mar 3, 2023

I work a lot with Jones Pupils (see image below). Doing diffraction simulation with these involves propagating the 4 complex "wavefronts" that make up the Jones pupil function, and I was curious about what it would take to integrate the same utility to prysm?

image

Of course one could define some propagator() function that describes the common optical path and loop over it, but for performance I was curious about how the current prysm.propagation module could support wavefronts of shape [2 x 2 x Npix x Npix]

np.fft.fft2 defaults to performing the computation on the last two axes of an array. A quick simulation with a 2x2x256x256 array shows that np.fft.fftshift appears to work similarly, so propagation routines in prysm that call these functions are not immediately affected.

I'm more concerned with mft and czt because I don't really know how they work. Since mft uses matrix multiplication the arrays should be broadcasted just fine, but I don't really understand how a 4-D array would affect the configuration of the bases.

Would supporting "broadcasted" propagation be meaningful to do?

@brandondube
Copy link
Owner

brandondube commented Mar 4, 2023

As far as performance is concerned, I think it "doesn't matter." When fft2 is broadcast/"vectorized" over a subset of the total dimensions, it's just a loop over what is in this case a total of 4 things. Looping in python vs C over 4 things seems meh.

We could certainly look at adding broadcast/vectorization support to MFT and CZT. In the former case, it is just two lines to make the matmuls broadcast nicely. For CZT there is mild complexity since right now a product of the form c=a*b is broken into tmp = a*brow; c=tmp*bcol, which won't broadcast super easily when the 2x2 is on the front.

Something to think about is "how does this impact gradient back propagation?" As-is, the routines all presume in one way or another 2D input/output, and (even if only just) you never need more than the immediate past to have all your inputs in hand for the next backprop step. I may be wrong, but I think there will need to be a step where a 2x2xMxN array has to be initialized and filled in to do "Jones backprop" properly. That will need an oracle input, where something in the code, or the user, knows we're doing a Jones prop. Putting a spot in generic propagation routines specific to polarization seems like an imperfect separation of concerns.

It may instead be simpler/"better" to define polarization.py:jones_adapter or something, which looks something like

def jones_adapter(J, prop_func, *prop_func_args, **prop_func_kwargs):
    J00 = J.data[0,0,...] # can actually leave off the ellipsis if you want...
    J01 = J.data[0,1,...]
    J10 = J.data[1,0,...]
    J11 = J.data[1,1,...]
    for E in zip(J00, J01, J10, J11):
        wf = Wavefront(E, J.dx, J.wavelength, J.space)
        prop_func(wf, *prop_func_args, **prop_func_kwargs)

e.g.,

# wf = "Jones wavefront," WF = from prysm.propagation import Wavefront as WF
wf = Wavefront(np.random.rand(2,2,1024,1024), dx=1, wavelength=.6328, space='pupil')
focused_wfs = jones_adapter(J, WF.focus, Q=2, efl=100)

This may be super difficult to read properly (wf/WF) and is an unusual way of using python. And it would only work with the WF.xxx methods, not the free functions. That's probably OK.


e: it may be possible to detect the difference between wf and WF, so that if the user does wf.focus instead of WF.focus, you do something like cls = prop_func.__self__.__class__; if getattr(cls, prop_func.__name__) != prop_func: # user gave method on an instance, instead of the class; prop_func = getattr(cls, prop_func.__name__)


Another design might consider taking a page out of machine learning's book and make a context manager that does deep, insidious, evil things to generate a "tape" without actually doing the propagations, but storing the variables, etc, so that you could do

with EvilRecorder() as rec:
    wf_fpm = wf.focus(...) # none of these lines actually run,
    wf_fpm_prime = wf_fpm * fpm # the decorator just stores what would have happened
    wf_at_lyot = wf_fpm_prime.unfocus(...)

runner = rec.history()
jones_wf_at_lyot = runner(J)

The code for this idea would be a fair amount harder to write, but is actually possible, I think. Better/worse, not sure. You would only "do one thing" to make many scalar into Jones props, instead of doing it to each one. But it's evil and magical, which is fairly antithetical to how prysm works.

@brandondube
Copy link
Owner

Thinking about this more, there is a small issue with the "jones adapter" approach: it turns a 2x2 input into a length 4 list return. It may be better to introduce a new type, which stores things as a length 4 tuple or list internally, but define __getitem__ on it or whichever dunder controls slicing operations. That way something ~= J = JonesWavefront(J00, J01, J02, J03); J[0,1] does what you want.

This in some way breaks all of the operators you have in pol.py, so maybe this is not a very good solution either.

Just copying the list elements into a new 2x2xMxN matrix fixes the interface problem, but is Not Super Great for performance. It might be possible to make a non-contiguous numpy array and avoid the copy. That would probably be best.

@Jashcraf
Copy link
Contributor Author

consider myself @'d

first thoughts are that in order to use polarization.py in propagation it needs to be able to do matrix multiplications with the incoming (generally) 2x2 wavefront. Keeping them as numpy arrays is nais because we can just matmul our way to victory

M,N = 256,256
jones_wavefront = np.random.random(2,2,M,N)
wf_after_polarizer = prysm.polarization.linear_polarizer(shape=[M,N]) @ jones_wavefront

But 2x2 matrix multiplications aren't so special that we absolutely need matmul. You could just as easily perform the same operation on two lists with a matmul_lists() function or something, but I don't know how meaningful the performance slows doing 8 elementwise multiplications v.s. 1 matmul.

I don't think I really get the non-contiguous numpy array, how does that solve the interface problem?

@brandondube
Copy link
Owner

brandondube commented Apr 12, 2023

Taking the sketch slightly further,

def jones_adapter(J, prop_func, *prop_func_args, **prop_func_kwargs):
    J00 = J.data[0,0]
    J01 = J.data[0,1]
    J10 = J.data[1,0]
    J11 = J.data[1,1]
    tmp = []
    for E in zip(J00, J01, J10, J11):
        wf = Wavefront(E, J.dx, J.wavelength, J.space)
        ret = prop_func(wf, *prop_func_args, **prop_func_kwargs)
        tmp.append(ret)
    
    # one path, return list (no extra copies/allocs)
    # return tmp

    # different path, pack it back in (waste copies)
    out = np.empty((2,2,*tmp[0].shape), tmp[0].dtype)
    out[0,0] = tmp[0]
    out[0,1] = tmp[1]
    out[1,0] = tmp[2]
    out[1,1] = tmp[3]
    return out

Another (ugly) approach is to make our own JonesMatrix class, which defines __matmul__ and does the dot products, as you describe. That is just moving the performance issue down the road a little bit.

At 8192x8192, a few timings...

N=8192
a = np.random.rand(N,N)
b = np.random.rand(N,N)
c = np.random.rand(N,N)
d = np.random.rand(N,N)
abcd = np.asarray([
    [a,b],
    [c,d]
])

%timeit abcd = np.asarray([[a,b],[c,d]])

204 ms ± 5.47 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit a*b
60.6 ms ± 2.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit np.exp(1j*a)
1.02 s ± 5.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

round numbers the copy is four times as expensive as an elementwise multiply, four times cheaper than expi.

I do (strongly) prefer that we define some function, I don't know, apply_polarization_component(Jwf) even if the implementation is just matmul inside. This preserves a plain english (or so) API.

@Jashcraf
Copy link
Contributor Author

The function would be easy to implement, I can throw something together and make a demo.

I haven't used Wavefront too much yet, do all of the propagation functions operate on Wavefront? I see that they are all methods of the class and would be implemented like

wf = Wavefront(Jxx,*wf_args)
wf_prop = wf.focus(*focus_args)

As opposed to just keeping them as numpy arrays and calling

wf_prop = focus(Jxx,*focus_args)

@brandondube
Copy link
Owner

I prefer that the function is smart enough to understand the difference and act appropriately. It is mildly arcane, but you can sniff out whether something is a method or a free-function. The three cases would be

from prysm.propagation import focus, Wavefront as WF
wf = Wavefront(...)

# case 1
jones_adapter(focus)

# case 2
jones_adapter(WF.focus)

# case 3
jones_adapter(wf.focus)

These can be sussed out as follows:

if hasattr(func, '__self__'): # case 3
    func = getattr(func.__class__, func.__name__) # wf.focus -> WF.focus, pass first arg of wf_new aka "self"

if func.__class__.__module__ == 'builtins': # case 1, no need to make the "dummy" wf obj in the loop

# defacto case 2 at this point

The free functions in propagation.py (focus, focus_fixed_sampling, ...) operate on arrays. To make chaining nicer, there is also the Wavefront object which has methods that wrap the free functions

@Jashcraf
Copy link
Contributor Author

Jashcraf commented Apr 19, 2023

Working a bit on this, starting with the propagation functions and will move onto the methods of the Wavefront class

Here's the adapter in my jones_adapter branch of my prysm fork

And these are the tests for focus, unfocus, and angular_spectrum, all of which pass

When writing the tests I did find it a bit weird to separate the wavefunction from the remaining *args and **kwargs. For example in the test_jones_adapter_angular_spectrum()

jones_prop = jones_adapter(jones,angular_spectrum,wvl,dx,z,Q=2)

could instead be

jones_prop = jones_adapter(angular_spectrum,jones,wvl,dx,z,Q=2)

What do you think? The only part I can think of where this is kind of a holdup might be for the propagation methods of Wavefront.

@brandondube
Copy link
Owner

I think it is probably nicer if the jones adapter is written as a decorator, that way the usage will be fully transparent, with the contents of propagation.py (eventually) written as,

@jones_adapter
def focus(...):
    pass

# usage
focus(J, ...)

In the interim, it would look like

Jfocus = jones_adapter(focus)
Jfocus(J, ...)

or as a less clear one-liner,

jones_adapter(focus)(J, ...)

This way the calling syntax is kept fully ordinary, not extraordinary, and the fact that it is a higher order function is "hidden" from the user

@Jashcraf
Copy link
Contributor Author

Ah I see, I wasn't sure how committed we were to actually making it a decorator. I'm a little unfamiliar with how to structure the decorator so I'll play around with it a bit.

@Jashcraf
Copy link
Contributor Author

This has been simmering on the backburner for long enough, best return to it before it's well-done

@brandondube do you have any tests you'd recommend to check that I correctly wrote to_fpm_and_back and babinet correctly (as well as the backprops)? Last I left the polarized field propagation I was separating these functions from Wavefront so that we could apply the @jones_wrapper to them.

@Jashcraf
Copy link
Contributor Author

@brandondube
Copy link
Owner

Could you help me understand the purpose of pasting most/all of propagation.py in polarization.py, thereby adding a bunch of immortal garbage to the git history?

Could you not achieve the same thing, by

# file: polarization.py

from prysm.propagation import focus # etc

focus = jones_decorator(focus)

The edit-run-test cycle is terrible this way since autoreload won't work, but you can do the equivalent action in jupyter (etc) for live changes with autoreload

For testing, I think it would be reasonable to pick one function (say, focus) and do ~=

Jfoc = jones_decorator(focus)

fld = np.random.rand(32,32,2,2)
scalar = propagation.focus(fld,Q=1)
jones = Jfoc(fld, Q=1)
J00 = jones[...,0,0]
J01 = jones[...,0,1]
J19 = jones[...,1,0]
J11 = jones[...,1,1]
assert np.allclose(scalar, J00, atol=1e-12)
assert np.allclose(scalar, J01, atol=1e-12)
assert np.allclose(scalar, J10, atol=1e-12)
assert np.allclose(scalar, J11, atol=1e-12)

It would probably be useful to add a utility function to polarization.py that takes an MxNx2x2 array and returns J00..J11, i.e., those four lines above, so they don't have to be pasted everywhere.

In the broken out to fpm and back and backpropagation routines, I think there is a backwards co-mingling. All of the other free functions don't use any classes, but here something is stuffed into a class. I would prefer that not be done, and these refactoring changes be done directly in propagation.py. I also think the transfer function of free space (angular_spectrum_transfer_function) does not need to be decorated. And, does the new polarized wavefront actually have any changes, compared to ordinary wavefront?

@brandondube
Copy link
Owner

@Jashcraf nudge

@Jashcraf
Copy link
Contributor Author

Jashcraf commented Nov 6, 2023

Adding a sample code snippet here so that I don't have to dig through slack to find. I'll get on it this week after I'm done with JATIS proofs.

funcs_to_change = ['focus_fixed_sampling'] # etc
for func in dir(propagation):
    if func.__name__ in funcs_to_change:
        setattr(propagation, func.__name__, jones_adapter(func))

@brandondube
Copy link
Owner

👋 can we close this out by the new year?

@Jashcraf
Copy link
Contributor Author

made the push, ty for the bump

Made the addition to my fork of prysm

Let me know what you think and I can make a PR

@brandondube
Copy link
Owner

A few comments

  • In the docstring of jones_adapter, I think the two paragraphs and J = should go under parameters, in a Notes section. I think almost no-one will consume this function directly, but it is generally preferable that someone can unfold the documentation (shift-tab in Jupyter, say) and see the parameters without having to open full screen.
  • Is the name make_propagation_polarized a slight misnomer since the functions remain compatible with scalar fields? Perhaps add_jones_support_to_propagation ?
  • In the demo, please make the filename capitalized like the other ipynbs. (I actually see I need to make them consistent either spaces or hyphens, let's just do Polarized_Propagation and I'll change the rest to match)
  • The text below what J is, "...you would use the following code" is a little awkward. Perhaps this could be clearer with "in prysm, polarization-modifying components are simply arrays of dimension (M,N,2,2), which allows their effect to vary spatially." It is a little awkward to talk about spatially varying, then immediately show a uniform component. Perhaps the linear polarizer and plotting macro could be moved to be before talking about spatially non-uniform things? I think this would flow better.
  • One time it is "shape" and the other it is just shape - please make consistent
  • please just say array, not numpy array (cupy arrays are arrays, too and I won't stand for their erasure!)
  • The block "this is neat, but doesn't show off how the optic can vary as a function of space. Let's apply a radial error in the retardance of the and elements. We added a method to generate the Jones matrix of Vector Vortex Retarders (VVRs), which are used in coronagraphy and vortex fiber nullers." is not worded in a way similar to the rest of prysm's documentation; it is in first-person and more playful than the rest of the documentation. Perhaps after restructuring the linear retarder to be earlier (which makes the first sentence redundant), this could read something like "Any shape (M,N,2,2) complex array can be used as a polarization component. x/polarization contains a function that generates a vector vortex retarder (VVR), a component that is like a spiral half wave plate..."
  • There is usually some sort of "closure" in the notebooks in prysm's documentation (a wrap-up, in summary, ...). The closing here, "this is just one simple illustration" is stylistically very different.
  • Code-wise, it seems wrong that the user needs to understand more advanced numpy syntax to do
# now multiply it by the polarizing element
A = A[...,np.newaxis,np.newaxis]
j_out = propagate(vvr*A)
  • could there instead be a function, apply_[jones or polarization]_component_to_field which does the right thin in case the field is scalar/vector?
  • It is not obvious whether the make_propagation_polarized bit in the docs is print debugging, or is meant to show the user which functions become supported. Could the clarity here be improved?

Thanks!

@brandondube
Copy link
Owner

ping

@Jashcraf
Copy link
Contributor Author

Jashcraf commented Jan 21, 2024

  • In the docstring of jones_adapter, I think the two paragraphs and J = should go under parameters, in a Notes section. I think almost no-one will consume this function directly, but it is generally preferable that someone can unfold the documentation (shift-tab in Jupyter, say) and see the parameters without having to open full screen.
    • Change made
  • Is the name make_propagation_polarized a slight misnomer since the functions remain compatible with scalar fields? Perhaps add_jones_support_to_propagation ?
    • I see what you mean but the latter is a little verbose, what do we think of add_jones_propagation?
  • In the demo, please make the filename capitalized like the other ipynbs. (I actually see I need to make them consistent either spaces or hyphens, let's just do Polarized_Propagation and I'll change the rest to match)
    • Change made
  • The text below what J is, "...you would use the following code" is a little awkward. Perhaps this could be clearer with "in prysm, polarization-modifying components are simply arrays of dimension (M,N,2,2), which allows their effect to vary spatially." It is a little awkward to talk about spatially varying, then immediately show a uniform component. Perhaps the linear polarizer and plotting macro could be moved to be before talking about spatially non-uniform things? I think this would flow better.
    • Made suggested change to phrasing. I'm considering just leaving off the part with the uniform retarder, but will return to it.
  • One time it is "shape" and the other it is just shape - please make consistent
    • Changed to shape
  • please just say array, not numpy array (cupy arrays are arrays, too and I won't stand for their erasure!)
    • Change made

I'll resume this and work on the remaining changes between meetings tomorrow

@brandondube
Copy link
Owner

what do we think of add_jones_propagation

👍

@brandondube
Copy link
Owner

i am once again pinging

ping

@Jashcraf
Copy link
Contributor Author

Jashcraf commented Apr 2, 2024

Now that thesis is done I’m finally returning

Is it best to refer to experimental modules like prysm.x.polarization as x.polarization or x/polarization?

@brandondube
Copy link
Owner

x/polarization

@Jashcraf
Copy link
Contributor Author

Jashcraf commented Apr 2, 2024

Sweet, I’ll make the PR

@brandondube
Copy link
Owner

merged in PR #111

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants