Benchmark results expand2b, SymEngine

Sumith edited this page Jul 9, 2015 · 23 revisions

A more recent version can be found in the SymEngine Wiki here.

This page keeps track of the results of the expand2b benchmark of SymEngine.

Benchmark used

f = (x + y + z + w)**15
f * (f + w)

The result is a 6272-term expression.
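The term count can be checked by hand, since f * (f + w) = f**2 + f*w and the two parts have different total degrees. The following is a small Python sketch of that counting argument; the function names are made up for illustration and are not part of SymEngine.

```python
from math import comb

def monomials_of_degree(degree, nvars=4):
    """Number of distinct monomials of exactly this total degree in nvars variables."""
    return comb(degree + nvars - 1, nvars - 1)

def expand2_term_count(power=15, nvars=4):
    # f*(f + w) = f**2 + f*w, with f = (x + y + z + w)**power.
    # f**2 is homogeneous of degree 2*power and every monomial of that
    # degree appears, because all multinomial coefficients are positive.
    square_terms = monomials_of_degree(2 * power, nvars)
    # f*w has exactly as many terms as f: each monomial of f times w.
    shifted_terms = monomials_of_degree(power, nvars)
    # The two parts have different degrees, so no terms merge or cancel.
    return square_terms + shifted_terms

print(expand2_term_count())  # 5456 + 816 = 6272
```

This agrees with the "number of terms: 6272" line printed by the benchmark runs.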

#####Results of expand2
expand2 uses regular SymEngine expressions

######On master

836ms
831ms
845ms
830ms
836ms
829ms
839ms
831ms
839ms
837ms

Maximum: 845ms
Minimum: 829ms
Average: 835.3ms

######On packint

1106ms
1110ms
1133ms
1105ms
1103ms
1109ms
1105ms
1101ms
1107ms
1119ms

Maximum: 1133ms
Minimum: 1101ms
Average: 1109.8ms

#####Results of expand2b
expand2b uses the structure and poly_mul that reside in rings.cpp
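As a rough illustration of what a sparse polynomial multiplication over a hash map looks like, here is a minimal Python sketch. It is not SymEngine's poly_mul: the names poly_mul_sketch and poly_pow_sketch are hypothetical, and the representation (a dict from exponent tuples to integer coefficients) is an assumption chosen for brevity.

```python
from collections import defaultdict

def poly_mul_sketch(a, b):
    """Multiply two sparse polynomials stored as {exponent tuple: coefficient}."""
    result = defaultdict(int)
    for exp_a, coeff_a in a.items():
        for exp_b, coeff_b in b.items():
            # Multiplying two monomials adds their exponent vectors.
            exp = tuple(i + j for i, j in zip(exp_a, exp_b))
            result[exp] += coeff_a * coeff_b
    # Drop terms whose coefficients cancelled to zero.
    return {exp: c for exp, c in result.items() if c != 0}

def poly_pow_sketch(base, n):
    """Raise a polynomial to a non-negative integer power by repeated multiplication."""
    nvars = len(next(iter(base)))
    result = {(0,) * nvars: 1}  # the constant polynomial 1
    for _ in range(n):
        result = poly_mul_sketch(result, base)
    return result

# x + y + z + w, with exponents ordered as (x, y, z, w).
linear = {(1, 0, 0, 0): 1, (0, 1, 0, 0): 1, (0, 0, 1, 0): 1, (0, 0, 0, 1): 1}
w = (0, 0, 0, 1)

f = poly_pow_sketch(linear, 15)
g = dict(f)
g[w] = g.get(w, 0) + 1  # g = f + w
product = poly_mul_sketch(f, g)
print("number of terms:", len(product))  # number of terms: 6272
```

The inner loop does one exponent-vector addition and one hash-map update per pair of input terms, which is why the choice of exponent container and hash table matters for the timings on this page.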
######On master
Example run:

sumith@sumith-Lenovo-Z50-70:~/github/csympy/benchmarks$ sudo nice -n -19 ./expand2b
poly_mul start
poly_mul stop
95ms
number of terms: 6272

Results of 10 executions:

94ms
97ms
94ms
94ms
94ms
97ms
96ms
93ms
93ms
94ms

Maximum: 97ms
Minimum: 93ms
Average: 94.6ms

######On packint
Example run:

sumith@sumith-Lenovo-Z50-70:~/github/csympy/benchmarks$ sudo nice -n -19 ./expand2b
poly_mul start
poly_mul stop
114ms
number of terms: 6272

Results of 10 executions:

106ms
105ms
108ms
106ms
110ms
106ms
106ms
107ms
106ms
106ms

Maximum: 110ms
Minimum: 105ms
Average: 106.6ms

#####Why is there a slowdown in the packint branch for expand2 and expand2b?

#####Results of expand2c
The most recent expand2c uses the structure that uses piranha::integer from Piranha. The new rings.cpp can be found here.
Example run:

sumith@sumith-Lenovo-Z50-70:~/github/csympy/benchmarks$ sudo nice -n -19 ./expand2c
poly_mul start
poly_mul stop
32ms
number of terms: 6272

Results of 10 executions:

27ms
27ms
26ms
26ms
27ms
26ms
26ms
26ms
27ms
26ms

Maximum: 27ms
Minimum: 26ms
Average: 26.4ms

#####Results of expand2d
Example run:

sumith@sumith-Lenovo-Z50-70:~/github/csympy/benchmarks$ sudo nice -n -19 ./expand2d
poly_mul start
poly_mul stop
23ms
number of terms: 6272

Here, the evaluate_sparsity() gave the following result for the hash_set

0,11488
1,3605
2,1206
3,85
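The four rows above can be read as a bucket occupancy histogram. The Python sketch below assumes each row means "bucket size, number of buckets with that size"; that reading is an assumption, but it is consistent with the totals, which come out to the 6272 terms of the result spread over 16384 buckets.

```python
# Rows copied from the evaluate_sparsity() output above.
# Assumed meaning of each row: "bucket size, number of buckets of that size".
raw = """0,11488
1,3605
2,1206
3,85"""

histogram = {}
for line in raw.splitlines():
    size, count = line.split(",")
    histogram[int(size)] = int(count)

buckets = sum(histogram.values())
elements = sum(size * count for size, count in histogram.items())
load_factor = elements / buckets
# Elements that share their bucket with at least one other element.
collided = sum(size * count for size, count in histogram.items() if size > 1)

print(buckets, elements, load_factor, collided)
# 16384 6272 0.3828125 2667
```

Under this reading the table is about 38% full, and no bucket holds more than 3 elements.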

Results of 10 executions:

14ms
14ms
14ms
15ms
14ms
15ms
14ms
14ms
15ms
14ms

Maximum: 15ms
Minimum: 14ms
Average: 14.3ms

#####Piranha results
The fateman1_perf was re-written with the following benchmark.
Example run:

sumith@sumith-Lenovo-Z50-70:~/github/piranha/tests$ sudo nice -n -19 ./fateman1_perf 1
Running 1 test case...
 0.013577s wall, 0.010000s user + 0.000000s system = 0.010000s CPU (73.7%)

*** No errors detected
Freeing MPFR caches.
Setting shutdown flag.

Results of 10 executions:

0.013577s wall
0.013190s wall
0.013875s wall
0.012964s wall
0.013724s wall
0.013539s wall
0.013469s wall
0.013343s wall
0.013011s wall
0.013515s wall

Average: 13.421ms
Maximum: 13.875ms
Minimum: 12.964ms
The wall time is used for comparison and stats.
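The statistics above can be reproduced from the listed wall times. This is a small Python sketch; the helper name wall_seconds is made up for illustration, and the input is simply the ten "wall" lines copied from this page.

```python
from statistics import mean

# Wall times copied from the 10 Piranha runs listed above.
raw = """0.013577s wall
0.013190s wall
0.013875s wall
0.012964s wall
0.013724s wall
0.013539s wall
0.013469s wall
0.013343s wall
0.013011s wall
0.013515s wall"""

def wall_seconds(line):
    # "0.013577s wall" -> 0.013577
    return float(line.split("s wall")[0])

times_ms = [wall_seconds(line) * 1000 for line in raw.splitlines() if line.strip()]

print(f"Average: {mean(times_ms):.3f}ms")  # Average: 13.421ms
print(f"Maximum: {max(times_ms):.3f}ms")   # Maximum: 13.875ms
print(f"Minimum: {min(times_ms):.3f}ms")   # Minimum: 12.964ms
```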

Note: All of the above are the first 10 results of execution.
Inputs received from Ondřej Čertík and Francesco Biscani

On a new branch:
Changes: Used arr_int4 (std::array<int, 4>) instead of vec_int for monomial mul.
Result: A noticeable speedup, with the runs below averaging 80.5ms.

81ms
79ms
81ms
81ms
80ms
80ms
80ms
82ms
81ms
80ms

Max: 82ms
Min: 79ms
Average: 80.5ms

However, using std::valarray resulted in a slowdown, averaging around 112ms. There were very few instances where the syntactic sugar came in handy. We are assuming that the bottleneck is in memory allocation time, so valarray will probably not bring much over vector. Still, there might be situations in which it is worth using it over vector, so it is something to keep in mind.
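To put the arr_int4 and valarray timings in perspective, the Python sketch below computes the relative change in run time. The baseline is an assumption: the page does not say which run the new branch was compared against, so the 94.6ms expand2b average on master is used here.

```python
def relative_change(baseline_ms, new_ms):
    """Signed percentage change in run time; negative means faster."""
    return (new_ms - baseline_ms) / baseline_ms * 100

baseline = 94.6   # expand2b average on master (assumed baseline, see above)
arr_int4 = 80.5   # average with std::array<int, 4>
valarray = 112.0  # approximate average with std::valarray

print(f"arr_int4: {relative_change(baseline, arr_int4):+.1f}%")  # arr_int4: -14.9%
print(f"valarray: {relative_change(baseline, valarray):+.1f}%")  # valarray: +18.4%
```

If a different baseline applies, only the first argument to relative_change needs to change.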
