Parallelism during Cython compilation of third party projects #2841
Is there any easy way to parallelize Cython compilation? I have a system with 192 CPUs and 1 TB of RAM, but it still takes ~7 minutes to build Pandas (even with ccache).
I opened an issue with Pandas and the devs pointed to Cython, though there wasn’t any certainty that anything could be done to improve the situation within Cython.
Is there any easy way to cause a third party project to Cythonize in parallel? Is this even a Cython issue?
Thanks, and I apologize if this isn’t a Cython issue, or is one that just can’t be trivially addressed by Cython devs. I’ve not had much opportunity to work with Cython, which I hope explains my lack of direction in trying to chase this down.
I did modify the Pandas setup.py to use ‘nthreads’, but I’m looking for something more like ‘make -j’ behavior. It seems ‘nthreads’ only optimizes for one file at a time, whereas I want multiple files compiled simultaneously.
I’m wondering if the logic in each third party project would need to be rewritten to look more like this: https://github.com/cython/cython/blob/master/Tools/cystdlib.py
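To make the ‘make -j’ idea concrete, here is a rough sketch of what I have in mind: run the .pyx-to-C translation for several files at once, one subprocess per file. The function names (`cythonize_one`, `cythonize_parallel`) are hypothetical, and the sketch assumes the `cython` command-line tool is on PATH. It only covers the Cython translation step, not the C compilation that follows, and a real project would likely need extra options such as include paths.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def cythonize_one(pyx_path):
    """Translate one .pyx file to C by running the `cython` command-line tool."""
    # Assumption: the `cython` executable is on PATH; "-3" selects Python 3 semantics.
    completed = subprocess.run(["cython", "-3", pyx_path])
    return completed.returncode


def cythonize_parallel(pyx_paths, jobs, compile_one=cythonize_one):
    """Run compile_one over pyx_paths with at most `jobs` running at once.

    Returns a dict mapping each path to the value compile_one returned for it.
    """
    pyx_paths = list(pyx_paths)
    # Threads are enough here: each worker mostly waits on a child process.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return_codes = list(pool.map(compile_one, pyx_paths))
    return dict(zip(pyx_paths, return_codes))
```

Calling `cythonize_parallel(list_of_pyx_files, jobs=16)` would then give back a per-file exit code, so any failed translations can be reported after the whole batch finishes.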
Oh, and just to be clear: I’m not talking about runtime parallelism, the GIL, etc. I’m purely talking about the process of Cythonizing code that uses Cython, which as I understand it takes place at build/install time of the third party project (in this example Pandas, probably the project using Cython most heavily).
And I understand it’s not for Cython devs to send PRs to third parties if such functionality exists; I’m happy to do that myself.
In Python 3.4 (or 3.5?) and later, you can use something like
Other than that, yes, passing