Speed improvements #18
Conversation
1. Achieved a 5X speed improvement in `mse` by using `np.square` rather than `np.power(x, 2)`, and by reusing intermediate buffers via the universal functions' `out=` keyword.

   ```
   In [2]: # %load mse.py
      ...: import numpy as np
      ...:
      ...: def mse(x, xhat):
      ...:     """ Calculate mse between vector or matrix x and xhat """
      ...:     sum_ = np.sum(np.power(x - xhat, 2))
      ...:     return sum_ / x.size
      ...:
      ...: def mse2(x, xhat):
      ...:     """ Calculate mse between vector or matrix x and xhat """
      ...:     sum_ = np.sum(np.square(x - xhat))
      ...:     return sum_ / x.size
      ...:
      ...: def mse3(x, xhat):
      ...:     """ Calculate mse between vector or matrix x and xhat """
      ...:     sum_ = x - xhat
      ...:     np.square(sum_, out=sum_)
      ...:     return np.sum(sum_) / x.size
      ...:

   In [3]: v = np.random.uniform(0, 1, size=10**6)

   In [4]: m = np.random.uniform(0, 1, size=(2048, 2048))

   In [5]: %timeit mse(v, 0.5)
   14.7 ms ± 6.88 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

   In [6]: %timeit mse2(v, 0.5)
   4.4 ms ± 2.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

   In [7]: %timeit mse3(v, 0.5)
   1.58 ms ± 734 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

   In [8]: %timeit mse(m, 0.5)
   67 ms ± 44.6 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

   In [9]: %timeit mse2(m, 0.5)
   24.7 ms ± 37.5 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

   In [10]: %timeit mse3(m, 0.5)
   15.1 ms ± 113 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

   In [11]: np.equal(mse(v, 0.5), mse2(v, 0.5))
   Out[11]: True

   In [12]: np.equal(mse(v, 0.5), mse3(v, 0.5))
   Out[12]: True

   In [13]: np.all(np.equal(mse(m, 0.5), mse3(m, 0.5)))
   Out[13]: True

   In [14]: np.all(np.equal(mse(m, 0.5), mse2(m, 0.5)))
   Out[14]: True

   In [15]: def mse4(x, xhat):
       ...:     """ Calculate mse between vector or matrix x and xhat """
       ...:     buf_ = x - xhat
       ...:     np.square(buf_, out=buf_)
       ...:     sum_ = np.sum(buf_)
       ...:     sum_ /= x.size
       ...:     return sum_
       ...:

   In [16]: %timeit mse4(m, 0.5)
   14.4 ms ± 9.14 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

   In [17]: %timeit mse4(v, 0.5)
   1.6 ms ± 392 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

   In [18]: %timeit mse4(v, 0.5)
   1.59 ms ± 430 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
   ```

2. The routine `get_rotation_matrix` was computing the composition of 3 rotations in the YZ plane through dot products of 3x3 matrices. Since this is equivalent to a single rotation in the YZ plane through the sum of the 3 angles, the rotation matrix is now constructed directly. If the routine was intended to construct a rotation matrix from Euler angles, it did not do so correctly; I left the old code commented out.

3. Modified other routines to utilize intermediate buffers to speed them up: less copying, and use of `broadcast_to` to avoid creation of intermediate arrays. Improved the efficiency of `theta_from_eigendecomp` and `x_from_eigendecomp` through more efficient use of broadcasting in NumPy.
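Point 2 rests on the fact that successive rotations in the same plane compose by adding their angles. A minimal numerical check of that claim, assuming the YZ-plane rotation form that appears in the commented-out code (`[[1, 0, 0], [0, c, s], [0, -s, c]]`); the helper name `rot_yz` is illustrative, not the pylocus API:

```python
import numpy as np

def rot_yz(theta):
    # Rotation in the YZ plane (about the x axis) by angle theta,
    # in the same form as the commented-out Rz in the diff.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0],
                     [0, c, s],
                     [0, -s, c]])

a, b, c = 0.3, -1.1, 0.7

# Three rotations composed via matrix products...
composed = rot_yz(a).dot(rot_yz(b)).dot(rot_yz(c))
# ...equal one rotation through the sum of the angles.
direct = rot_yz(a + b + c)

assert np.allclose(composed, direct)
```

Building `direct` replaces two 3x3 matrix products with a single `cos`/`sin` evaluation, which is where the speedup comes from.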
Hi, thank you very much for the contribution! I am reviewing it now, will get back ASAP.
Very valuable contribution, learned a few tricks while reading this! If you could just remove the few lines of code that you commented out to replace them with something more efficient, that'd be great. I added notes for the two places where the comments should be left. Also sorry for the late answer, I was not notified about this pull request (thank you @fakufaku for the notice).
pylocus/basics.py
Outdated
```
# cz, sz = np.cos(theta_z), np.sin(theta_z)
# Rz = np.array([[1, 0, 0], [0, cz, sz], [0, -sz, cz]])
# return Rx.dot(Ry.dot(Rz))
# N.B.:
```
good point. All above "N.B.:" can be deleted
```
@@ -62,20 +72,21 @@ def eigendecomp(G, d):
     N = G.shape[0]
     lamda, u = np.linalg.eig(G)
     # test decomposition of G.
-    G_hat = np.dot(np.dot(u, np.diag(lamda)), u.T)
+    #G_hat = np.dot(np.dot(u, np.diag(lamda)), u.T)
```
leave this comment (for later testing)
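For context, the check being commented out verifies that the eigendecomposition reconstructs `G`. A standalone sketch of the same test (the random symmetric `G` here is illustrative, not pylocus's Gram matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
G = A + A.T                      # symmetric matrix, so the spectrum is real

lamda, u = np.linalg.eig(G)

# Reconstruction check, same expression as the commented-out line:
G_hat = np.dot(np.dot(u, np.diag(lamda)), u.T)

assert np.allclose(G, G_hat)
```

Keeping the line as a comment lets this sanity check be re-enabled cheaply during later testing, while skipping two matrix products in production runs.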
```
if (theta):
    return theta_from_eigendecomp(factor, u)
else:
    return x_from_eigendecomp(factor, u, dim)
```
nice find, didn't notice this ugly duplicate code.
Avoided creation of an unneeded copy of `x_est` in mds.py by performing the subtraction of its minimum in place.
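The pattern is sketched below; the array name `x_est` follows the description above, but the data and surrounding code are illustrative, not the actual mds.py:

```python
import numpy as np

x_est = np.array([[3.0, 4.0],
                  [1.0, 2.0]])

# Copying version: the binary subtraction allocates a new array.
shifted_copy = x_est - np.min(x_est)

# In-place version: augmented assignment writes into the existing buffer,
# avoiding the extra allocation (fine when the caller owns the array).
x_inplace = x_est.copy()
x_inplace -= np.min(x_inplace)

assert np.array_equal(shifted_copy, x_inplace)
```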
Thank you for the contribution, oleksandr-pavlyk. Changes are now merged.
I just stumbled into this project, and noticed that functions in `basics` could be sped up a bit with more efficient use of NumPy. For example, `mse` on vectors is 9X faster, and 5X faster on matrices. Also sped up code in MDS.
Please let me know what you think.
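One of the techniques used throughout, `np.broadcast_to`, can be illustrated in isolation (the example data is hypothetical, unrelated to pylocus): it returns a read-only view with zero strides along the broadcast dimensions, so no tiled copy is ever materialised.

```python
import numpy as np

row = np.arange(4.0)

tiled = np.tile(row, (1000, 1))          # allocates a full 1000x4 array
view = np.broadcast_to(row, (1000, 4))   # no data copied, just a view

assert np.array_equal(tiled, view)       # same values element-wise
assert view.strides[0] == 0              # broadcast axis has stride 0
```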