Speedup warp for 3D images #4410
base: main
Conversation
Speedup is more noticeable in smaller images.

- Switch from the NumPy Python API to the C API
- Avoid the `np.asarray` call
- Minor optimizing changes
```cython
import numpy as np
cimport numpy as cnp
cimport numpy as np
np.import_array()
```
what does this do?
I don't think this is necessary, as `cimport` should take care of it. I think..... otherwise, we have a real issue in the library ^_^
Well then it seems we do have a library issue, check this.
Ah, we seldom use the C API directly; you seem to be using it.
Is using the C API meaningful? We typically use memoryviews and the Python API. Could you share a benchmark with us, with the code and the benchmark in the appropriate folder?
```cython
cdef np_floats[:, ::1] img = np.ascontiguousarray(image)
cdef np_floats[:, ::1] M = np.ascontiguousarray(H)
cdef np_floats[:, ::1] img = np.PyArray_GETCONTIGUOUS(image)
```
Why are you using the C API?
Why not simply leave this as is? I don't think you are avoiding copies here:
https://github.com/numpy/numpy/blob/v1.17.0/numpy/core/_asarray.py#L179
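For reference, the no-copy behaviour the linked NumPy source describes can be checked directly: `np.ascontiguousarray` returns the input object itself when it is already C-contiguous, and only copies otherwise.

```python
import numpy as np

a = np.zeros((4, 4), dtype=np.float64)   # freshly allocated -> C-contiguous
b = np.ascontiguousarray(a)
print(b is a)                            # True: no copy for contiguous input

t = a.T                                  # transposed view is not C-contiguous
c = np.ascontiguousarray(t)
print(c is t)                            # False: a copy was made here
print(c.flags['C_CONTIGUOUS'])           # True
```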
> Ah, we seldom use the C API directly; you seem to be using it.
Sure, here is how I'm benchmarking the code. You just need to update
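The actual benchmark script is not reproduced in this thread; a minimal harness along these lines (a hypothetical sketch — `bench` and the `np.ascontiguousarray` workload are placeholders, not the author's code) times a function over a grid of image sizes:

```python
import timeit
import numpy as np

def bench(fn, image, repeat=5, number=10):
    # Best-of-N per-call time, which is less noisy than a single run.
    times = timeit.repeat(lambda: fn(image), repeat=repeat, number=number)
    return min(times) / number

rng = np.random.default_rng(0)
for n in (64, 256, 1024):
    img = rng.random((n, n))
    # Replace np.ascontiguousarray with the warp call under test.
    print(f"{n}x{n}: {bench(np.ascontiguousarray, img):.3e} s/call")
```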
@hmaarrfk I just checked multiple Cython files across the library and noticed that NumPy's Python API is the one being used, while the C API is used only for simple stuff like data type checks. I see that as a problem: since Cython is basically C, it would seem more natural to use the C API rather than the Python one; it's like going from C to Python back to C, so why not just use C code all over. I've actually benchmarked the use of the C vs. the Python API in

Now I think opening an issue and having a discussion about the use of the C API would be better than in this PR, so what do you think, should I revert to the Python API for now?
I can see how we are using C data types from NumPy, but most of the code is using Cython memoryviews, isn't it?
Doing the computations on the memoryviews is definitely the right call; the problem here is with the setup code (like making sure the NumPy array is contiguous, or creating an array), where using NumPy's C API avoids some unnecessary overhead.
Is that seriously a large cost? It seems to me that that is the cost paid for using NumPy and Python. I would be happy to work with you to isolate your two changes: the computation-order changes, and the usage of the NumPy C API. I'm still not totally convinced that using the C API is the right call, but I'm interested in being shown otherwise.
The cost is really subjective: for a cheap computation (small image size, nearest-neighbour interpolation) reducing the overhead helps, but when the computation is expensive (large image size, bicubic interpolation) it doesn't really matter, as most of the time is spent on the computation itself.
I know it might be inconvenient to use the C API as you're probably more used to the Python API (who isn't), but the C API is definitely the right call here: for the Python API, Cython generates extra code, like fetching the numpy module, fetching and invoking the function, and more checks, whereas with the C API it just calls the function.
That's great, I think separating the two changes would be better.
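The "fixed overhead vs. computation" argument above can be illustrated in plain Python (a rough sketch; `np.sin(img).sum()` merely stands in for the interpolation kernel, it is not the PR's code):

```python
import timeit
import numpy as np

def kernel(img):
    # Stand-in for interpolation work: cost grows with the image size.
    return np.sin(img).sum()

for n in (16, 1024):
    img = np.zeros((n, n))
    setup = timeit.timeit(lambda: np.ascontiguousarray(img), number=1000)
    compute = timeit.timeit(lambda: kernel(img), number=1000)
    # For the small image the fixed setup cost is a visible fraction of
    # the total; for the large one it is negligible.
    print(f"{n:>4}x{n}: setup = {100 * setup / (setup + compute):.1f}% of total")
```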
Description
This PR speeds up warping for 3D images (3D or multichannel 2D) up to 5x by adding `_warp_fast_batch`, which handles a batch of 2D images directly. The original approach had some performance problems, like:

And also an API problem: the input had to have its channel axis placed last, and overcoming that required transposing the image before and after the computation.
The proposed code works on images with format [plane, row, column] or [row, column, channel] by specifying the `channel_axis` parameter, which also controls whether the iteration over the channel axis is the innermost or the outermost loop; this mainly impacts caching.

I've run some benchmarks and here are the results.

Each figure has 3 rows (one per interpolation order) and 5 columns (one per image size n×n) of subplots, each comparing number of channels vs. speedup.
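As a rough NumPy-level illustration of the `channel_axis` idea (the names below are hypothetical, not the PR's actual Cython code), a batch routine can normalize any layout to [batch, row, column], warp each 2D slice, and restore the layout:

```python
import numpy as np

def warp_batch(image, warp2d, channel_axis=0):
    # Move the channel/plane axis to the front: [batch, row, column].
    batched = np.moveaxis(image, channel_axis, 0)
    out = np.stack([warp2d(plane) for plane in batched])
    # Restore the caller's original axis layout.
    return np.moveaxis(out, 0, channel_axis)

# Usage: flip rows of every channel of a [row, column, channel] image.
img = np.arange(24, dtype=float).reshape(2, 4, 3)
res = warp_batch(img, lambda p: p[::-1], channel_axis=-1)
print(res.shape)  # (2, 4, 3)
```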
A bunch of TODOs:
Reviews and change requests are most welcome
Checklist

- `./doc/examples` (new features only)
- `./benchmarks`, if your changes aren't covered by an existing benchmark

For reviewers

- later.
- `__init__.py`.
- `doc/release/release_dev.rst`.

@meeseeksdev backport to v0.14.x