Made PrimePY a LOT faster #9
Conversation
You probably don't want to merge this, as it significantly changes the spirit of the piece of code, but I wanted to show that Python can be VERY fast in some cases.
This closes #3 by being almost twice as fast. Nice job!
Btw, it would have been nice to post the numbers you were getting with C# and C++ as well, for comparison. All machines are different, so we don't know what "25939 passes" actually means.
This is tested on Arch Linux with kernel 5.11.6, 32GB DDR4 RAM, i7-9750H.

[benchmark screenshots for C++, C#, and Python]
This is quite impressive. But here's a question: it's using multiple libraries instead of native Python. Do those additional libraries use native code instead of pure Python? If they do, it feels a bit like cheating, and perhaps there should be two versions: one with pure Python and built-in libraries, and the other taking advantage of extra libraries for better performance.
Both Numpy and Numba use native code (although none of Numpy's native code is actually executed here). Numba contains quite a significant amount of native code, and actually produces machine code at runtime. Is it cheating? Maybe, but not because the libraries use C code. The whole runtime is written in C, so using that as a qualifier is either meaningless or incredibly ambiguous. I think the benchmark should be whether we use special-purpose native code, and that is NOT the case. I do, however, see a case for keeping two separate solutions: one using only the pure standard library (in which case we can expect performance that is on par with the dotnet performance) and one where we use Numba's JIT (or another Python JIT for that matter).
Fair reasoning!
Well, think of Numba kind of like the old days when people were writing C or C++ but the compilers weren't as well optimized as today's, and if they had something performance-critical they would insert some assembly code to make it as fast as possible. It's sort of like that, except code written for Numba is much, much closer to Python than assembly is to C. It's basically just Python with types. So in this respect I wouldn't consider using Numba cheating. It's still Python. It's like choosing a different compiler for your C code because it's a bit better optimized.
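To make "Python with types" concrete, here is a minimal sketch (the function and signature are just illustrative, not from this PR):

```python
from numba import njit

# Supplying an explicit type signature compiles the function eagerly at
# decoration time; the body is still ordinary Python, just typed.
@njit("int64(int64)")
def square(n):
    return n * n

print(square(12))  # runs the compiled machine code
```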
Thanks! I was afraid it would come across as a bit ranty. I'm happy my point got across despite that.
Don't worry about it! It didn't. I just wanted to make sure that I explained my position properly, and wanted to explain why I see it differently.
Yes, that's where the largest speed gain is.
CPython doesn't by default. There are interpreters (like PyPy and Jython) that do, but they are oftentimes slower than using Numba. They do provide a more "Pythonic" experience: Numba restricts the amount of Python we can write so that it can optimize it better.
Python built-ins are implemented in C. The source code for Python can be found here. Many Numpy ops just link to the Python implementation (e.g., numpy.sum() is just sum()). Numpy still gets called through the interpreter though, which means everything has to happen in memory and you cannot optimize across multiple / overlapping functions. For example, something along these lines (a sketch of the pattern, assuming an np.arange fed into np.sum):
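```python
import numpy as np

# illustrative sketch: build a huge array, then sum it; the interpreter
# materializes the full array before sum() ever sees it
def f():
    return np.sum(np.arange(100_000_000))
```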
will literally create a 100,000,000-element array in memory without any consideration of how sum will use it. This is pretty expensive, but if you decorate it with Numba,
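```python
from numba import njit
import numpy as np

# same sketch, now JIT-compiled by Numba
@njit
def f():
    return np.sum(np.arange(100_000_000))
```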
f() gets JIT compiled and no array generation occurs. The Numba JIT compilation is potentially cheating in the sense that Numba is doing some extra manual translation work in the frontend. It's sort of comparable to clang (which is also just an LLVM frontend), but I think the translation from C/C++ to LLVM IR in clang is more deterministic and automatic. The Numba team literally re-implements builtins and Numpy functions in their own language, and then maps the Python code to "something they know how to compile" before translating the Numba IR to LLVM IR. So there is probably a side effect here where Numba implicitly rewrites poor Python implementations to be optimal before lowering to LLVM IR, in a way that is more rigid than clang. I could be wrong though. Generally, the name of the game for compiling Python is either to translate Python to C (which can then be compiled by clang/gcc) or to translate Python to LLVM IR (which can be compiled directly by the LLVM backend). There are dozens of mature libraries and tools for JIT and AOT compilation, but the methodology is essentially bimodal: usually the former for AOT and the latter for JIT. While it is possible to make Python as fast as the C languages in 2021 using JIT compilers, I concede that the comparison is somewhat beyond the scope of Dan's video. The C languages are designed to be compiled, and Python is designed to be interpreted. It would be equivalent to benchmarking native Python vs. C++ run via a third-party C interpreter.
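If anyone wants to see that second path concretely, Numba's compiled dispatchers let you dump the LLVM IR they produce; a small sketch (the function and prints are just illustrative):

```python
from numba import njit

@njit
def add(a, b):
    return a + b

add(1, 2)  # first call triggers JIT compilation for (int64, int64)

# inspect_llvm() returns the LLVM IR Numba generated, keyed by signature
for sig, llvm_ir in add.inspect_llvm().items():
    print(sig)
    print(llvm_ir[:300])  # show just the start of the IR dump
```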
I'm wondering how this relates to #2?
Closing due to lack of contributor response. Can be reopened on the drag-race branch if desired.