Try compiling parts of mypy with cython and see if it's faster #3408
Comments
I will work on this. |
I wonder how Nuitka would do... |
Results from the first quick and dirty test with Cython:
I profiled a mypy self-check and found that
I compiled those two modules using Cython (with the
The first benchmark runs the partially-cythonized mypy over an unmodified mypy code tree, and the second runs the unmodified mypy over itself. The partially-Cythonized mypy passes I'm sure we can squeeze out more by compiling more modules, or quite a bit more if we were willing to introduce cython-specific syntax. |
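For anyone wanting to repeat the profiling step mentioned above, here is a minimal sketch using only the standard library's `cProfile` and `pstats`. The functions `stand_in_workload` and `helper` are hypothetical placeholders, not mypy code; to profile a real self-check you would pass a callable that invokes mypy instead. It only shows how to get a "hottest functions" table, not what that table looks like for mypy.

```python
import cProfile
import io
import pstats


def helper(i: int) -> int:
    # Hypothetical hot function: cheap, but called very often.
    return isinstance(i, int) + i % 7


def stand_in_workload(n: int = 20000) -> int:
    """Hypothetical placeholder for the real work (e.g. a mypy self-check)."""
    total = 0
    for i in range(n):
        total += helper(i)
    return total


def hottest_functions(func, limit: int = 5) -> str:
    """Run func under cProfile and return a report sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" ranks functions by time spent in their own body,
    # excluding time spent in the functions they call.
    stats.sort_stats("tottime").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    print(hottest_functions(stand_in_workload))
```

Sorting by `tottime` rather than `cumtime` is a choice made here because it points at the functions whose own bodies are expensive, which are the natural candidates for compiling or hand-optimizing.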
I would much prefer mypy to stay pure Python. Maybe we can use some of the profile data to optimize some of mypy's Python code? |
(Also, I once ran mypy with Nuitka and there wasn't much improvement there either.) |
I already do this. The point is that the protocols PR #3132 touches the hottest function |
If there isn't interest in the Cython direction, I won't experiment further. I think that the gains here will be only incremental and probably not worth the release hassles unless we were willing to add Cython-specific annotations. It would be great if mypy and its dependencies (cough typed-ast) were all pure Python, so that we could explore speed improvements from running under pypy. But it looks like typed-ast will make that very difficult. |
I experimented with pypy before switching to typed-ast and it was actually slower than CPython when type checking mypy at least. When I self-checked mypy in a loop, I think that the first iteration was about half as fast as CPython, and then maybe iteration 5 or so was up to around 2 times faster. This was a while back, though, and I haven't tried with a recent PyPy release. It would be easy enough to check out an old mypy commit without typed-ast and see how it behaves with a recent pypy, but it's unlikely that we are actually going to move away from typed-ast. |
I just checked mypy v471, which I believe was the last version supporting the old parser, and compared it to CPython. I also tried Nuitka. Here is what I got:
$ python3 -V
Python 3.5.2
$ pypy3 -V
Python 3.5.3 (2875f328eae2, Apr 02 2017, 17:58:49)
[PyPy 5.7.1-beta0 with GCC 6.2.0 20160901]
$ /usr/bin/time pypy3 -m mypy mypy
20.17 user
0.34 system
0:20.57 elapsed
99% CPU
$ /usr/bin/time python3 -m mypy mypy
12.04 user
0.29 system
0:12.37 elapsed
99% CPU
$ /usr/bin/time python3 -m mypy --fast-parse mypy
10.87 user
0.46 system
0:11.37 elapsed
99% CPU
$ /usr/bin/time ./mypy.exe mypy
12.81 user
1.15 system
0:13.96 elapsed
100% CPU
$ /usr/bin/time ./mypy.exe --fast-parse mypy
11.37 user
1.14 system
0:12.52 elapsed
99% CPU
I can provide any further timing upon request. Based on this, I can surmise that PyPy is not a good path to go down, as it performs much worse than even the non-fast-parse implementation. I presume that one could extend their ast to support type comments and essentially implement fast-parse on top of that; however, I do not see this as being a good path to go down. Similarly, Nuitka provided the same or worse performance, and I see no way to improve this. Also as a side note, I find it very strange it created an So I think it is safe to say that neither of these is a possible alternative. In general, having tried PyPy, Cython, and Nuitka on different projects in the past, I have found that PyPy can lead to impressive speedups for some but not all workloads, Cython without special typing can lead to a 5-10% speedup, and Nuitka never seems to add much of any advantage.
$ cat test.py
from mypy.api import run
import time
for i in range(5):
    start = time.time()
    run('mypy')
    print(time.time() - start)
$ pypy3 test.py
21.934080839157104
13.73313283920288
9.043972730636597
10.095704317092896
9.367397785186768
This is actually a nice improvement, faster even than the fastparse of 0.471. However, I don't think this means we should consider PyPy. My thinking is two-fold. First, most things using mypy are almost certainly running on CPython, as PyPy does not support PEP 484 or 526. Second, mypy is most often called in a one-off fashion. It does not make much sense for it to be called multiple times like in my test. I think the cost of the startup is too great to compensate for any speed gained by it. However, this is a fair bit of my opinion! Interpret the numbers how you will. |
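To put a rough number on the startup-cost argument, here is a small sketch that accumulates the PyPy loop timings above and compares them against repeated CPython runs. It assumes the 11.37 s elapsed `--fast-parse` figure from the earlier `/usr/bin/time` output as the CPython baseline; that figure is whole-process wall-clock time while the loop timings are in-process, so treat this as an approximation rather than a careful benchmark. The helper `break_even` is a name made up for this illustration.

```python
from itertools import accumulate
from typing import Optional, Sequence

# In-process PyPy timings from the loop above (seconds per iteration).
PYPY_RUNS = [21.934080839157104, 13.73313283920288, 9.043972730636597,
             10.095704317092896, 9.367397785186768]
# Assumed baseline: elapsed time of one CPython --fast-parse run,
# taken from the earlier /usr/bin/time output.
CPYTHON_FAST_PARSE = 11.37


def break_even(runs: Sequence[float], baseline: float) -> Optional[int]:
    """Return the first run count n at which the cumulative time of `runs`
    is no greater than n * baseline, or None if that never happens."""
    for n, total in enumerate(accumulate(runs), start=1):
        if total <= n * baseline:
            return n
    return None


if __name__ == "__main__":
    for n, total in enumerate(accumulate(PYPY_RUNS), start=1):
        cpython_total = n * CPYTHON_FAST_PARSE
        print(f"after {n} run(s): PyPy {total:6.2f}s vs CPython {cpython_total:6.2f}s")
    print("break-even within measured runs:",
          break_even(PYPY_RUNS, CPYTHON_FAST_PARSE))
```

Under these assumptions the cumulative PyPy time stays above the CPython total for all five measured iterations, so the warm-up cost is not recovered within the measured runs even though the later iterations are individually faster.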
Closing as I believe we decided we aren't going to do this. |
It seems to be a hard-to-defeat myth that this is necessary. It isn't. It's more like renaming a I do not know mypy's implementation, but the run profile above suggests that the main bottleneck is Python's Then, the method names suggest that the next bottleneck could be a visitor implementation. In Cython, we achieved a visible speedup of the compiler by converting the main visitor base class(es) into extension types. See (Again, I'm mostly guessing here, but it seems unlikely that the designs of Cython and mypy are entirely different.) |
Yes, since writing that, I've learned that it isn't needed :) Also, thank you for taking a look at the profiling data, your recommendations seem useful. There definitely are things we can do to optimize regardless of whether we choose to use Cython or not, so I appreciate your comment. |
Maybe we could use Cython to run mypy faster. We probably don't want to use any Cython type annotations, but Cython can speed up straight Python code as well by removing interpretation overhead.
Note that using Cython may make it significantly harder to generate mypy wheels, and cause other potential issues, so we'd likely only want to use Cython if there is a significant performance improvement, and even then it might be unclear if we really want to use Cython.
Here is a potential way to run an experiment (note that I don't have much experience with Cython):
...