
Ndim fft #113

Merged
merged 61 commits into from
Jul 12, 2019

Conversation

herjy
Collaborator

@herjy herjy commented Jul 3, 2019

All right, I am mildly fed up with this, so I will submit it for review. The critical part, the n-dimensional FFT for Observation, works marvelously well. The same trick does not work at all for LowResObservation; if anything it makes everything slower. Here is a recap of what you'll find:

  • On the simple example from the notebook, the execution of blend.fit() drops from 88 ms to 18 ms. I will do more tests to see how the various dimensions of the arrays affect this, but it looks nice!
  • Problem: It does not pass the tests and I am not entirely sure why.
  • make_operator is now a method of LowResObservation, which makes more sense.
  • I considered switching resampling.match_patches to a method of that class too, but its use in match and match_psfs made me think twice.
  • I measured the setup time for LowRes and the run times for blend.fit in the notebooks. For comparison, the same measurements are done on the current implementation in a branch I called 'speed_test', which I will use to further investigate the gain that the new implementation gives us.
    On the bad news side (yes this was all good news):
  • The more I try to use Ndim_fft in LowRes, the slower it gets. I tried a full-on n-dim fft implementation (it was beautiful, not a single for loop), but after it ran for MANY minutes just building the resampling operator I lost patience and gave up. You will find a hybrid implementation that still uses scipy.
  • The prototypical code for the n-dim fft in `make_operator` is left commented out. I could never make it work, for reasons that are still unclear.

Since I have already spent days trying to make LowRes faster without success, I don't want to delay this any further; I suggest that you guys review the part in Observation. If I can find out why LowRes is slow, I'll update everything. Writing this makes me think that it could be a problem of fftpack_shapes, but I am not convinced.
Good luck.
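For reference, the "n-dimensional fft" trick described above amounts to convolving all channels of a (C, Ny, Nx) cube in one batched real FFT over the spatial axes, instead of looping over channels. This is a minimal sketch under assumed names (`convolve_cube`, `images`, `psfs` are illustrative; scarlet's actual API differs):

```python
import numpy as np

def convolve_cube(images, psfs):
    """Convolve each channel of a (C, Ny, Nx) image cube with its own PSF
    using a single n-dimensional real FFT over the spatial axes (1, 2)."""
    C, Ny, Nx = images.shape
    # Zero-pad to the full linear-convolution size to avoid wrap-around
    shape = (Ny + psfs.shape[1] - 1, Nx + psfs.shape[2] - 1)
    img_f = np.fft.rfftn(images, shape, axes=(1, 2))
    psf_f = np.fft.rfftn(psfs, shape, axes=(1, 2))
    conv = np.fft.irfftn(img_f * psf_f, shape, axes=(1, 2))
    # Crop back to the input size ("same" mode, centered on the PSF peak)
    sy, sx = (psfs.shape[1] - 1) // 2, (psfs.shape[2] - 1) // 2
    return conv[:, sy:sy + Ny, sx:sx + Nx]
```

The gain over a per-channel loop comes from numpy dispatching one batched transform instead of C separate 2-d FFTs.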

fred3m and others added 30 commits June 13, 2019 12:26
While writing tests I found that the `BoundingBox` API was clunky and
supported features that were inconsistent. The new implementation is
cleaner and uses a shortened `Box` class name. The `resize` method
was also removed, because it didn't add anything to the functionality
of trim.
Generating a PSF image for given function in a given shape is
common enough that a convenience method has been added to generate
the x, y pixel grid for the shape and create a PSF image using the
function parameters.
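The convenience method described in this commit could look like the following sketch (names `psf_image` and `gaussian` are hypothetical; the real scarlet method may differ in name and normalization):

```python
import numpy as np

def psf_image(func, shape, **params):
    """Evaluate a PSF function on a centered pixel grid of the given shape
    and normalize the result to unit flux."""
    Ny, Nx = shape
    y = np.arange(Ny) - (Ny - 1) / 2
    x = np.arange(Nx) - (Nx - 1) / 2
    X, Y = np.meshgrid(x, y)
    img = func(Y, X, **params)
    return img / img.sum()

def gaussian(y, x, sigma=1.0):
    # Circular Gaussian profile centered on the grid origin
    return np.exp(-(x**2 + y**2) / (2 * sigma**2))
```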
While creating the tests a few inconsistencies/sloppy coding were
fixed:
- `Scene.psfs` is now always 3D, even the `target_psf` for the
  model scene. This makes the API more uniform for `Scene`
  and its inherited classes.
- PSF matching and convolutions now use real FFTs. This has the
  advantage that it runs faster, fixes a potential bug in the
  PSF matching code, and is more similar to the `scipy.signal.convolve`
  method.
In addition to creating the units tests for the classes in component.py,
the commit also fixes the `Prior` API to use the form `sed_grad` and
`morph_grad` (instead of `grad_sed`, ...) to match the attributes of
`Component`.
docs/quickstart.ipynb (inline review thread, resolved)
Collaborator

@fred3m fred3m left a comment


Looks good to me, I didn't see anything that Peter didn't already comment on.

docs/tutorials/multiresolution.ipynb (inline review thread, resolved)
@pmelchior
Owner

pmelchior commented Jul 6, 2019

Besides the three open comments, it would be nice to have a simple comparison of the time for obs.match(frame) (with PSF matching) and of obs.render(model) in three configurations of (C,Ny,Nx)

  • 1, 1000,1000
  • 5, 100, 100
  • 100, 100, 100

The model can be simple zeros.
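A minimal harness for the requested comparison might look like the sketch below. It times only the core batched `rfftn` call over the spatial axes for the three configurations; the actual `obs.match(frame)` and `obs.render(model)` calls from scarlet are not reproduced here, so this shows the scaling of the transform itself rather than the requested numbers:

```python
import timeit

import numpy as np

# The three (C, Ny, Nx) configurations requested above
for C, Ny, Nx in [(1, 1000, 1000), (5, 100, 100), (100, 100, 100)]:
    model = np.zeros((C, Ny, Nx))  # the model can be simple zeros
    t = timeit.timeit(lambda: np.fft.rfftn(model, axes=(1, 2)), number=20)
    print(f"({C}, {Ny}, {Nx}): {t / 20 * 1e3:.3f} ms per rfftn")
```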

@herjy
Collaborator Author

herjy commented Jul 9, 2019

Sorry, I only saw your latest comment just now. I had similar tests done already, so I'll do that quickly tomorrow. In the meantime, here is a new commit with Ndim_fft in place wherever it can be used without losing anything.
I also added 5 bands to the data, so that we have nicer colour images ;)

@herjy
Collaborator Author

herjy commented Jul 11, 2019

On the performance side, here is what this PR changes. In this plot, the y-axis is the time gained (in percent) by this implementation, compared to the master branch at the time of the PR, for three tasks that make use of rfftn(axes=(1, 2)): PSF convolution (observation.render()), observation setup (observation().match(frame)), and fitting (blend.fit()).
[figure: bands_gain — timing gain in percent for the three tasks]

@pmelchior
Owner

OK, this looks good in the regime where we need it. The drop off at the right is probably because of memory thrashing / cache misses.

@herjy
Collaborator Author

herjy commented Jul 11, 2019

Finally, I figured it out! So: for some values of the PSF size, the fast shape for the fft comes out even, which autograd's fft cannot deal with. To keep autograd running smoothly, we tested the fast shape and made it uneven, but only along one axis, which apparently was OK for the 2-d fft but not when going n-dim (this is still a little weird to me). So now I make both axes uneven whenever the fast shapes come out even, and the Fourier operations run nominally.
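The fix described here could be sketched as follows. The helper name `odd_fast_shape` is hypothetical (scarlet's actual code lives in `observation.py`); the idea is to search upward from the FFT-efficient length whenever it comes out even, so every padded axis ends up odd:

```python
from scipy import fftpack

def odd_fast_shape(shape):
    """Return an FFT-efficient padded shape that is odd along every axis,
    since autograd's rfftn gradient cannot handle even lengths."""
    out = []
    for s in shape:
        fast = fftpack.next_fast_len(s)
        while fast % 2 == 0:
            # step past the even candidate and ask for the next fast length
            fast = fftpack.next_fast_len(fast + 1)
        out.append(fast)
    return tuple(out)
```

The loop terminates because odd 5-smooth lengths (powers of 3 and 5) exist above any starting point.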

@fred3m
Collaborator

fred3m commented Jul 11, 2019

Weird. It might be worth creating an issue with autograd or even fixing this ourselves upstream because I doubt that we're the only ones to get dinged by this.

@pmelchior
Owner

Is this a problem from autograd or from numpy FFTs?

docs/tutorials/multiresolution.ipynb (inline review thread)
scarlet/interpolation.py (inline review thread)
scarlet/observation.py (inline review thread)
@fred3m
Collaborator

fred3m commented Jul 12, 2019

It's a problem with autograd. numpy's rfftn has no problem with even shapes; in fact I think it prefers them, and this code (without the check for an even last dimension) is the same basic algorithm that scipy.signal.fftconvolve uses.

But autograd raised an exception when using rfftn with an even-numbered last dimension, hence the fix that is in master (https://github.com/fred3m/scarlet/blob/master/scarlet/observation.py#L190-L194). But I had never run across a problem with PSF sizes, so I didn't notice that I needed to apply the same fix when we calculated the optimal shape for PSF matching.
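The claim that plain numpy handles even shapes fine is easy to check: rfftn/irfftn round-trip exactly on an even last axis, provided the original shape is passed back to irfftn (the length is otherwise ambiguous, since the real transform halves the last axis). The exception was specific to autograd's gradient wrappers:

```python
import numpy as np

a = np.random.rand(4, 8)            # even last dimension
A = np.fft.rfftn(a)                 # shape (4, 5): 8 // 2 + 1 bins
back = np.fft.irfftn(A, a.shape)    # pass the shape to disambiguate
assert np.allclose(a, back)         # exact round-trip, no complaints
```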

@pmelchior
Owner

So, it looks like we are in a good state now. @herjy please confirm that we can merge this PR.

@herjy
Collaborator Author

herjy commented Jul 12, 2019

To answer Fred's last comment: it is somewhat strange that the problem did not arise earlier but did now in the n-dim case. It could be related to the n-dim implementation and the use of axes, but looking at the code for fft, I don't see it.
Anyway, I now have a much better grasp of this part of the code, and I think we are ready to merge.

@pmelchior
Owner

I just spotted this: cb8bd40 deletes the file docs/changes.rst. That's our change log and should not be deleted. I'll restore it, but for future reference: git rm needs to be used with caution!

@pmelchior pmelchior merged commit fe56458 into master Jul 12, 2019
@pmelchior pmelchior deleted the Ndim_fft branch October 22, 2019 20:02