JIT or AOT compiler for faster processing #224

Closed
ErichZimmer opened this issue Oct 26, 2021 · 25 comments

@ErichZimmer
Contributor

Is your feature request related to a problem? Please describe.
On large images or vector fields, the standard SciPy routines become very slow and, at times, take longer than the correlation itself. Furthermore, individual functions (like deformation and subpixel approximation) can be accelerated for more efficient processing.

Describe the solution you'd like
Use Numba to accelerate functions deemed "slow".

Describe alternatives you've considered
Keep everything the same: when using multiprocessing, batch processing is already on par with optimized open-source C++ PIV software.

Additional context
If we do create a numba accelerated version of OpenPIV, let's make it its own package.

@alexlib
Member

alexlib commented Oct 27, 2021

It's a good idea to add Numba. I am not sure whether a separate package would confuse our users. I hope there's a way to make it an optional part of the installation and an optional flag to use at run time. Could you please see how trackpy handles it in their standard installation and use? It's probably something like: if Numba is installed properly, run the Numba path; otherwise run the plain Python version.

@alexlib
Member

alexlib commented Oct 28, 2021

From reading http://soft-matter.github.io/trackpy/v0.5.0/tutorial/performance.html I understand it's possible to keep two versions, Python and Numba, make Numba the default, and fall back to Python if Numba is not installed. We just need to learn the technique from trackpy.
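
For reference, that pattern could look roughly like the sketch below: try to import Numba, JIT-compile the hot function when it is available, and otherwise keep the plain Python implementation. The function names here are hypothetical, not existing OpenPIV API.

```python
# Rough sketch of the optional-Numba pattern (hypothetical function names):
# use numba.njit when Numba is importable, otherwise fall back to pure Python.
try:
    from numba import njit
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False


def _find_peak_python(corr):
    """Return the integer position of the maximum of a 2D correlation map."""
    best_i, best_j = 0, 0
    best_val = corr[0, 0]
    for i in range(corr.shape[0]):
        for j in range(corr.shape[1]):
            if corr[i, j] > best_val:
                best_val = corr[i, j]
                best_i, best_j = i, j
    return best_i, best_j


# The public name points to the fastest available implementation.
find_peak = njit(cache=True)(_find_peak_python) if HAVE_NUMBA else _find_peak_python
```

Callers always import `find_peak`; whether it is the compiled or the interpreted version is decided once, at import time.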

@alexlib
Member

alexlib commented Oct 28, 2021

@ErichZimmer
Contributor Author

I think two versions should be made for computationally intensive functions: an accelerated one using Pythran and Numba, and a fallback with vectorized NumPy. This should keep the performance acceptable when a user doesn't have any AOT or JIT compilers. Maybe something like specialized functions that can only run when Pythran or Numba is present and raise an ImportError otherwise? Maybe we can do something like the FluidImage package?
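
A minimal sketch of that idea, with hypothetical names: a decorator that raises ImportError when Numba is missing, so accelerator-only functions fail loudly instead of silently running slow code.

```python
import functools

try:
    import numba
except ImportError:
    numba = None


def requires_numba(func):
    """Mark a function as accelerator-only; raise ImportError if Numba is absent."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if numba is None:
            raise ImportError(
                f"{func.__name__} requires Numba; install numba or use the NumPy fallback."
            )
        return func(*args, **kwargs)
    return wrapper


@requires_numba
def deform_windows_fast(frame, displacement_field):
    """Placeholder for a hypothetical Numba-accelerated window deformation routine."""
    ...
```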

@ErichZimmer
Contributor Author

This would leave three versions of each computationally intensive function, though (looped, vectorized NumPy, AOT/JIT)...

@alexlib
Member

alexlib commented Dec 3, 2021

The problem is how to make it installable on the fly and how to support and debug it. Speed is an issue, but it is secondary to usefulness.
Is it simple for fluidimage?

@ErichZimmer
Contributor Author

Playing around with Julia, I was able to get ~2x faster subpixel estimation, and using Julia's AbstractFFTs package, the cross-correlation algorithm was also ~3x faster on a single thread. Hopefully, Numba could attain the same performance boost, as both use the LLVM compiler, but Numba's limitations still impose quite a constraint on performance and on how the code has to be written.

On fluidimage, they use Pythran and Theano. Their application of these packages is not simple, as their code was written with the compilers in mind.
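
As a concrete example of the kind of routine being discussed, here is a minimal sketch of a three-point Gaussian subpixel estimator under `@njit`; the function name and the assumption of a strictly positive correlation map around the peak are illustrative, not existing OpenPIV code.

```python
import numpy as np
from numba import njit


@njit(cache=True)
def gaussian_subpixel(corr, i, j):
    """Three-point Gaussian fit around the integer peak (i, j) of a correlation map.

    Assumes the peak is not on the border and the five sampled values are positive.
    """
    c = np.log(corr[i, j])
    cl = np.log(corr[i, j - 1])
    cr = np.log(corr[i, j + 1])
    cd = np.log(corr[i - 1, j])
    cu = np.log(corr[i + 1, j])
    dx = (cl - cr) / (2.0 * (cl - 2.0 * c + cr))
    dy = (cd - cu) / (2.0 * (cd - 2.0 * c + cu))
    return i + dy, j + dx
```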

@alexlib
Member

alexlib commented Feb 6, 2022

I vote for simplicity over speed. I'm sure Julia is great software and it already has some PIV code; we'll have to focus on the Python advantage and stay calm.

@ErichZimmer
Contributor Author

Yes, and focus on documentation. I just need a better internet connection to add more documentation :D

@timdewhirst
Member

This is a particular area of interest for me, and I was wondering if you have some benchmarks or a breakdown of the areas that are slow. From the thread it sounds as though peak discovery and fitting are slow because they're coded in pure Python, but FFT-based correlation is relatively fast as it relies heavily on NumPy. It would also be useful to understand what has been tried (PyPy, Cython, Numba, ...) and what the aim is - is there a particular scenario in mind? Finally, could this be an opportunity to join the C++ and Python OpenPIV branches - core functionality in C++, exposed to Python using pybind11 or similar?

@alexlib
Member

alexlib commented Feb 7, 2022

In the past we used Cython and left this direction because of the debugging complexity and, mainly, the installation issues on various Windows platforms.
I think @ErichZimmer also tried Numba.

The idea to merge the C++ with Python is an old dream. It can happen only if we have an easy installation, simple debugging, and a fallback to Python for the users that cannot compile, etc. Plus we need to keep an option for online cloud computing, e.g. AWS or Google Cloud.

@ErichZimmer
Contributor Author

I'm for a C++11 extension, but if we do go that route, we should place the C++11 files in a folder named process (or something similar). Hopefully, we could get the same level of compatibility as NumPy and SciPy if we use setup.py "correctly", with a fallback to Python in case anything goes wrong.

@alexlib
Member

alexlib commented Feb 7, 2022

I'm for a C++11 extension, but if we do go that route, we should place the C++11 files in a folder named process (or something similar). Hopefully, we could get the same level of compatibility as NumPy and SciPy if we use setup.py "correctly", with a fallback to Python in case anything goes wrong.

let's see some demos on some simple examples. i am not sure it's quite simple. the user should be able to run the code without noticing any difficulties or differences.

@timdewhirst
Member

Certainly adding C++ (or any compiled language with a non-trivial toolchain) is going to be a little challenging, but I wonder at this point if it's significantly more complex than some of the approaches mentioned above; in addition, I would imagine that if this route were to be considered, using a Docker image to build the package would be the sensible approach.

@alexlib
Member

alexlib commented Feb 7, 2022

Certainly adding C++ (or any compiled language with a non-trivial toolchain) is going to be a little challenging, but I wonder at this point if it's significantly more complex than some of the approaches mentioned above; in addition, I would imagine that if this route were to be considered, using a Docker image to build the package would be the sensible approach.

do you mean to use docker to build binary wheels for all platforms? or use docker on all platforms to run the package?

@timdewhirst
Member

Sorry, that was unclear - I think using Docker to run on all platforms would be the simplest approach. I would have to look into using Docker to build platform-specific wheels; I'm not sure what's involved at all! Will do some reading!

@alexlib
Member

alexlib commented Feb 7, 2022

Docker is a neat idea - but we need to be careful with the GUI. I've found it quite difficult to manage for the openptv project. Hopefully you'll find some trick to make it work on all platforms.

@ErichZimmer
Contributor Author

What about pybind11? It seems like another viable and somewhat flexible option.

@timdewhirst
Member

This problem is really split into a couple of different and independent tasks:

  • create Python bindings for the C++ code: for this, pybind11, Boost.Python, SWIG, etc. can be used
  • packaging:
    • building the C++ code into a wheel that can run cross-platform: for this, we need access to multiple build environments; something like https://github.com/pypa/cibuildwheel looks very promising
    • making a Docker image to give a uniform build and packaging: this has the advantage of being runnable in a cloud environment without further modification, potentially at the expense of some performance (?)

I'm planning on taking a look at creating a dockerfile for openpiv-c, and also at pybind11 for python bindings. This should hopefully help answer a couple of questions!
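
On the packaging side, a rough sketch of what the Python half could look like with pybind11's setuptools helpers is below; the extension name and C++ source path are placeholders, not the actual openpiv-c layout.

```python
# Hypothetical setup.py fragment using pybind11's setuptools helpers; the
# module name and source file are placeholders, not the real openpiv-c layout.
from setuptools import setup
from pybind11.setup_helpers import Pybind11Extension, build_ext

ext_modules = [
    Pybind11Extension(
        "openpiv._process",             # placeholder extension module name
        ["src/process_bindings.cpp"],   # placeholder C++ binding source
        cxx_std=11,
    ),
]

setup(
    name="openpiv",
    ext_modules=ext_modules,
    cmdclass={"build_ext": build_ext},
)
```

cibuildwheel could then run this same build in its per-platform environments to produce binary wheels.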

@alexlib
Member

alexlib commented Feb 8, 2022 via email

@timdewhirst
Member

agreed - taking this approach will help keep the ideas/strategies independent.

@ErichZimmer
Contributor Author

It would be nice if the C++11 version were modular, i.e., multiple functions to do the evaluation like the current Python version. This would increase flexibility, as some users might want something like ensemble correlation, which can easily be implemented with the current Python version.

@timdewhirst
Member

That’s the pattern the code currently follows - the evaluation strategy should be defined in Python, the computationally heavy work done in C++; the lib at the moment is a lightweight set of classes that allow this.
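
To illustrate that split, a hypothetical sketch: the ensemble logic lives in Python and simply averages correlation maps returned by whatever backend (compiled C++ binding, Numba-jitted function, or plain NumPy fallback) is passed in.

```python
# Hypothetical sketch of keeping the evaluation strategy in Python while the
# heavy correlation work is delegated to a backend.
def ensemble_correlation(image_pairs, window_size, correlate):
    """Average the correlation maps of a list of (frame_a, frame_b) pairs."""
    mean_corr = None
    for frame_a, frame_b in image_pairs:
        corr = correlate(frame_a, frame_b, window_size)  # backend does the heavy work
        mean_corr = corr if mean_corr is None else mean_corr + corr
    return mean_corr / len(image_pairs)
```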

@ErichZimmer
Contributor Author

For compiling code in setup.py, distutils or NumPy's distutils could be used, as they seem quite simple. In case there is an error while compiling the code, a try/except could be used to fall back.
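
One common form of that fallback, sketched below with a custom build_ext rather than a bare try/except around setup(); the module name and source file are hypothetical.

```python
# Sketch of an optional-extension build: try to compile, and if the compiler
# is missing or fails, install the pure-Python package instead.
from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext
from distutils.errors import CCompilerError, DistutilsExecError, DistutilsPlatformError


class OptionalBuildExt(build_ext):
    def build_extension(self, ext):
        try:
            super().build_extension(ext)
        except (CCompilerError, DistutilsExecError, DistutilsPlatformError):
            print(f"Skipping {ext.name}: compilation failed, using the Python fallback.")


setup(
    name="openpiv",
    ext_modules=[Extension("openpiv._process", ["src/process.c"])],  # hypothetical
    cmdclass={"build_ext": OptionalBuildExt},
)
```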

@ErichZimmer
Contributor Author

Due to the complexity of using Pythran/Numba and the ongoing development of an OpenPIV version with a Python frontend and C++ backend, I will close this issue.
