This repository was archived by the owner on Dec 6, 2023. It is now read-only.

Conversation

@vene (Collaborator) commented Jul 18, 2016

Switch everything to typed memoryviews and inline direct solver update.
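For readers skimming the diff, the heart of the change looks roughly like the following; a minimal sketch with hypothetical names, not the actual polylearn code:

    cimport cython

    # Typed memoryviews (double[:, ::1] etc.) compile to direct pointer
    # arithmetic, so the inner loop runs at C speed with no per-access
    # Python buffer overhead; cdef inline lets the C compiler inline
    # the update into the solver loop.
    @cython.boundscheck(False)
    @cython.wraparound(False)
    cdef inline double _update(double[:, ::1] P,
                               double[::1] data,
                               int[::1] indices) nogil:
        cdef Py_ssize_t ii
        cdef double acc = 0
        for ii in range(indices.shape[0]):
            acc += P[0, indices[ii]] * data[ii]
        return acc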

20 newsgroups
=============
X_train.shape = (11314, 130107)
X_train.format = csc
X_train.dtype = float64
X_train density = 0.001214353154362896
y_train (11314,)
X_test (7532, 130107)
X_test.format = csc
X_test.dtype = float64
y_test (7532,)

Classifier Training
===================
/home/vlad/code/polylearn/polylearn/factorization_machine.py:118: UserWarning: Objective did not converge. Increase max_iter.
  warnings.warn("Objective did not converge. Increase max_iter.")
Training fm-2 ... done
Training fm-3 ... done
/home/vlad/code/polylearn/polylearn/polynomial_network.py:99: UserWarning: Objective did not converge. Increase max_iter.
  warnings.warn("Objective did not converge. Increase max_iter.")
Training polynet-2 ... done
Training polynet-3 ... done
Classification performance:
===========================

>>> on branch master

Classifier            train       test         f1   accuracy
------------------------------------------------------------
fm-3               36.3608s    0.3642s     0.4009     0.9663
fm-2               31.7401s    0.1225s     0.6126     0.9740
polynet-3           6.6819s    0.0685s     0.6992     0.9796
polynet-2           5.9581s    0.0693s     0.7310     0.9807

>>> on branch speedup

Classifier            train       test         f1   accuracy
------------------------------------------------------------
fm-3               11.7805s    0.3608s     0.4009     0.9663
fm-2                7.1945s    0.1218s     0.6126     0.9740
polynet-3           6.5308s    0.0676s     0.6992     0.9796
polynet-2           5.9655s    0.0675s     0.7310     0.9807

@coveralls: Coverage remained the same at 97.886% when pulling 025c323e6d0788e712276d952846e97bf3cffb87 on speedup into 191fb42 on master.

@coveralls: Coverage remained the same at 97.886% when pulling 89bdd3e803cc9a17611a16df3028895439a3a159 on speedup into 191fb42 on master.

@coveralls: Coverage remained the same at 97.886% when pulling 41fdc4ed485ee125b7fee52175d69008919496fb on speedup into 191fb42 on master.

@vene (Collaborator, author) commented Jul 19, 2016

@fabianp could you give me a hand to figure out why I'm getting these win32/py2 failures? It looks like the cd_direct_fast.pyx solver has numerical issues in this configuration, but I can't pinpoint the cause.

@fabianp (Member) commented Jul 20, 2016

Hey. I went through cd_direct_fast.pyx but I don't know what could be causing such failures. Any chance you could put together a minimal example that reproduces them?

@mblondel (Member) commented

Since the win32 failures are independent of this PR, I think it should be merged: it is a big improvement. I would also add a benchmark/ folder to help track the improvements over time.

Regarding the failures, could it be that randomkit behaves differently on win32?

@fabianp (Member) commented Jul 25, 2016

If that is the case, adding l2 regularization and/or more iterations should solve the problem.
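A sketch of what that would look like, assuming the beta (l2 regularization strength for the factors) and max_iter parameters that polylearn's estimator exposes; the data and values are purely illustrative:

    import numpy as np
    from polylearn import FactorizationMachineClassifier

    rng = np.random.RandomState(0)
    X = rng.randn(100, 20)
    y = (X[:, 0] * X[:, 1] > 0).astype(int)  # a degree-2 interaction target

    clf = FactorizationMachineClassifier(
        degree=2,
        beta=10.0,        # stronger l2 regularization on the factors
        max_iter=20000,   # give the solver more room to converge
        random_state=0,
    )
    clf.fit(X, y)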


@mblondel (Member) commented

Remember that FMs and PNs (factorization machines and polynomial networks) are non-convex :)

@mblondel (Member) commented

@ogrisel From past experience with scikit-learn, any idea what the win32 failures could come from?

By the way, the problem is only on Python 2.7. Python 3.5 is fine.

@fabianp (Member) commented Jul 25, 2016

D'oh!


Inline review comment on the solver signature in cd_direct_fast.pyx:

    Py_ssize_t order,
    double* out,
    int s,
    int degree):

It's weird to use Py_ssize_t for a mathematical variable. Py_ssize_t should only be used to index arrays.

@ogrisel commented Jul 25, 2016

Py_ssize_t means different things on different platforms: it is a signed 32-bit int on a 32-bit platform and a signed 64-bit int on a 64-bit platform. It should only be used to index data structures (e.g. arrays or Python lists). I do not understand why the problem does not exist with Python 3, though.
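Concretely, the convention looks like this; a minimal Cython sketch with hypothetical names:

    cdef double column_power_sum(double[:, ::1] P, Py_ssize_t j, int degree):
        # Py_ssize_t for anything that indexes memory: it is
        # pointer-sized, so 32-bit on win32 and 64-bit on 64-bit builds.
        cdef Py_ssize_t s
        cdef double total = 0
        # degree stays a plain int: it is a mathematical quantity whose
        # meaning should not depend on the platform's pointer width.
        for s in range(P.shape[0]):
            total += P[s, j] ** degree
        return total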

@vene (Collaborator, author) commented Jul 25, 2016

That makes sense. Thanks, @ogrisel! However, the problem still existed when I was using int everywhere. I only replaced it with Py_ssize_t in an attempt to fix it, and it seems I got a bit overzealous and replaced too many variables.

I'll change it to use Py_ssize_t just where it is appropriate.


@coveralls: Coverage increased (+0.003%) to 97.888% when pulling 64c8876 on speedup into e0d79c2 on master.

@vene (Collaborator, author) commented Jul 25, 2016

@mblondel I think it's not an RNG issue. The y generated in tests/test_factorization_machine.py:check_fit for both degree=2 and degree=3 is the same on AppVeyor as on my 64-bit Ubuntu machine. I haven't directly checked the initial P_, but I doubt it would be different.

I also fixed the types as suggested by @ogrisel, with no difference.

So it's probably something numerical. It's strange that no errors occur with the lifted solver, though.

Maybe I should try a 32-bit py2 environment on Linux and see if I can reproduce it.


@coveralls: Coverage decreased (-0.005%) to 97.88% when pulling e78961e on speedup into e0d79c2 on master.

@coveralls: Coverage decreased (-0.005%) to 97.88% when pulling 49a3d7e on speedup into e0d79c2 on master.

@coveralls (Jul 26, 2016): Coverage decreased (-0.005%) to 97.88% when pulling 024e08d on speedup into e0d79c2 on master.

@coveralls: Coverage increased (+0.2%) to 98.13% when pulling e8d175b on speedup into e0d79c2 on master.

Inline review comment (Member) on cd_direct_fast.pyx:

    if degree == 2:
        grad_y = d1[i] - p_js * data[ii]
    elif degree == 3:
        grad_y = 0.5 * (d1[i] ** 2 - d2[i])

I am surprised that Cython doesn't optimize this directly. Did it make a significant difference?

Reply from @vene (Collaborator, author):

I was looking at the generated C++ and found that a ** 2 is compiled into pow(a, 2). However, it makes no difference in terms of speed. I guess the compiler inlines pow calls efficiently.
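Concretely, the two variants being compared are (names as in the snippet above):

    grad_y = 0.5 * (d1[i] ** 2 - d2[i])      # Cython emits pow(d1[i], 2)
    grad_y = 0.5 * (d1[i] * d1[i] - d2[i])   # manual squaring

Since the compiler optimizes the pow call, both run at the same speed, and the readable version won out.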

Reply from @mblondel (Member):

If there's no measurable speedup (which confirms my experience), I would favor the readable version.

Reply from @vene (Collaborator, author):

I agree, I'll revert.

Meanwhile, a different win64+py2 failure appeared out of nowhere (?). I'll try actually investigating on a Windows machine.


@mblondel (Member) commented

Is there a way to reduce the verbosity of coveralls? I receive one email per commit, which makes it a bit difficult to follow real comments.

@vene (Collaborator, author) commented Jul 26, 2016

OK, I think I got coveralls to be quiet.

@vene (Collaborator, author) commented Aug 1, 2016

I fixed the Windows bug! It was my fault, of course: I forgot to initialize a double scalar to 0, and it turns out this doesn't behave well under MSVC (oddly, only for py2, not py3; presumably because py2 and py3.5 extensions are built with different MSVC versions).

AppVeyor is green now for both win32 and win64. (I canceled a duplicate build, triggered by the fact that this branch lives on scikit-learn-contrib rather than on my fork; feel free to ignore that.)

I'll remove the ugly _sq thing, update the benchmarks, then ping again for a final review before merging.
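For the record, the bug pattern was roughly the following; a minimal sketch with hypothetical names, not the actual solver code:

    # Buggy: acc is never initialized, so it starts from whatever
    # happens to be on the stack. The gcc builds evidently got lucky
    # with zeroed memory; MSVC under py2 did not, which is why the
    # failure was platform-specific.
    cdef double _sum_buggy(double[::1] data) nogil:
        cdef double acc
        cdef Py_ssize_t ii
        for ii in range(data.shape[0]):
            acc += data[ii]
        return acc

    # Fixed: initialize the scalar explicitly.
    cdef double _sum_fixed(double[::1] data) nogil:
        cdef double acc = 0
        cdef Py_ssize_t ii
        for ii in range(data.shape[0]):
            acc += data[ii]
        return acc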

@vene changed the title from "Speedup" to "[MRG] Speedup" on Aug 1, 2016
@vene (Collaborator, author) commented Aug 1, 2016

This is now ready to merge.

@mblondel (Member) commented Aug 2, 2016

> I fixed the Windows bug!

Hurray! Squash and merge?

@vene (Collaborator, author) commented Aug 2, 2016

Merged by rebase. 🍻

@vene closed this on Aug 2, 2016