FM's on simple Sklearn's boston data giving NaN's #3

silkspace · 2015-02-17T00:42:49Z

This is giving errors, am I missing something?

from scipy import sparse
from sklearn.datasets import load_boston
import pylibfm

# instantiate FM instance with 7 latent factors
fm = pylibfm.FM(num_factors=7, num_iter=6, verbose=True)

# load dataset
boston = load_boston()

# fit FM, making sure to wrap the ndarray as a sparse csr
fm.fit(sparse.csr_matrix(boston.data), boston.target)

Creating validation dataset of 0.01 of training for adaptive regularization
-- Epoch 1
Training log loss: nan
-- Epoch 2
Training log loss: nan
-- Epoch 3
Training log loss: nan
-- Epoch 4
Training log loss: nan
-- Epoch 5
Training log loss: nan
-- Epoch 6
Training log loss: nan

fm.v is also all nan.

trunghlt · 2015-02-22T16:49:02Z

+1

ruifpmaia · 2015-02-25T17:44:51Z

Did anyone actually run the MovieLens example? I'm getting a kernel crash, possibly an access violation, and that's just from running the example code and data (MovieLens 100k).

silkspace · 2015-02-25T17:54:49Z

I am forking it and working on a patch. I'll let you know if it works.

If you normalize the boston data set it seems to work...that is strange...

Other sets seem to get alright answers, but there doesn't seem to be much
rhyme or reason to when that happens.

I am working on a few other improvements (and changing some of the
namespace).

Best, Alex

ruifpmaia · 2015-03-02T21:20:15Z

Hello (silkspace)
Did you manage to identify the problem?
regards
maia

coreylynch · 2015-05-06T15:26:04Z

Hi @silkspace, yeah, feature scaling is a fairly typical preprocessing step for many machine learning problems. See this for more detail: https://class.coursera.org/ml-003/lecture/21. My guess is that the unnormalized feature space blew up the gradients, but I'm going to take a closer look at this.

The following code works:

from scipy import sparse
from sklearn.datasets import load_boston
from sklearn.preprocessing import normalize
import pylibfm

boston = load_boston()
# explicitly small initial learning rate
fm = pylibfm.FM(num_factors=7, num_iter=6, verbose=True, initial_learning_rate=0.0001)
# normalize() rescales each sample (row) to unit norm before fitting
fm.fit(normalize(sparse.csr_matrix(boston.data)), boston.target)

Creating validation dataset of 0.01 of training for adaptive regularization
-- Epoch 1
Training log loss: 0.96985
-- Epoch 2
Training log loss: 0.96064
-- Epoch 3
Training log loss: 0.95156
-- Epoch 4
Training log loss: 0.94262
-- Epoch 5
Training log loss: 0.93383
-- Epoch 6
Training log loss: 0.92521

@ruifpmaia I just reran the movielens example on my laptop and wasn't able to see a problem. Would you mind opening a new issue with some steps to reproduce your error? Thanks!
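
To illustrate the guess above that unnormalized features can blow up the gradients, here is a minimal sketch in plain Python. It does not use pylibfm: it runs ordinary SGD on a one-feature linear model, and the synthetic data, the learning rate of 0.01, and the helper names `sgd_linear` and `standardize` are all assumptions made up for the illustration. With the raw feature (values in the hundreds) every update overshoots, the weights overflow, and they end up as nan; with the standardized feature the same learning rate converges.

```python
import math
import random


def sgd_linear(xs, ys, learning_rate, epochs=20):
    """Plain SGD on squared error for the model y ~ w * x + b."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = (w * x + b) - y
            # The step on w scales with x, so large features mean large steps.
            w -= learning_rate * err * x
            b -= learning_rate * err
    return w, b


def standardize(xs):
    """Rescale a feature to zero mean and unit variance."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) * (x - mean) for x in xs) / len(xs))
    return [(x - mean) / std for x in xs]


# Synthetic, noiseless data (an assumption for the sketch, not the Boston set).
random.seed(0)
raw = [random.uniform(200.0, 700.0) for _ in range(200)]
target = [0.05 * x + 3.0 for x in raw]
scaled = standardize(raw)

w_raw, b_raw = sgd_linear(raw, target, learning_rate=0.01)
w_std, b_std = sgd_linear(scaled, target, learning_rate=0.01)

print("raw features:   ", w_raw, b_raw)
print("scaled features:", w_std, b_std)
```

Note that this sketch standardizes each feature (column), whereas `sklearn.preprocessing.normalize` in the snippet above rescales each sample (row); both keep the inputs small, but they are different transformations.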

silkspace · 2015-05-06T15:33:11Z

Hi Corey, Thanks. We rewrote the whole shebang in the ALS formulation and
now everything is fine. Not looking further into this. Thanks!

coreylynch · 2015-05-06T15:40:43Z

@silkspace I forgot to mention also that when trying this out on different datasets, the default settings may not be the best. We typically use cross validation to find suitable values for things like learning rate.
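
As a rough illustration of that suggestion, the sketch below picks a learning rate by k-fold cross-validation. It is plain Python on a toy one-feature linear model, not pylibfm; the candidate grid, the data, and the helper names (`sgd_fit`, `mse`, `cross_val_score`) are hypothetical. Runs that diverge are scored as infinity so they can never be selected.

```python
import math
import random


def sgd_fit(rows, learning_rate, epochs=30):
    """Plain SGD on squared error for y ~ w * x + b over (x, y) pairs."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in rows:
            err = (w * x + b) - y
            w -= learning_rate * err * x
            b -= learning_rate * err
    return w, b


def mse(rows, w, b):
    """Mean squared error; uses e * e so overflow yields inf, not an exception."""
    total = 0.0
    for x, y in rows:
        e = (w * x + b) - y
        total += e * e
    return total / len(rows)


def cross_val_score(rows, learning_rate, k=5):
    """Mean held-out MSE over k folds; diverged runs count as infinity."""
    fold_scores = []
    for i in range(k):
        held_out = rows[i::k]
        train = [r for j, r in enumerate(rows) if j % k != i]
        w, b = sgd_fit(train, learning_rate)
        score = mse(held_out, w, b)
        fold_scores.append(score if math.isfinite(score) else math.inf)
    return sum(fold_scores) / k


# Toy data (assumed for the sketch): y = 1.5 * x - 0.5 plus a little noise.
random.seed(1)
xs = [random.uniform(-2.0, 2.0) for _ in range(100)]
rows = [(x, 1.5 * x - 0.5 + random.gauss(0.0, 0.1)) for x in xs]

candidates = [1.0, 0.1, 0.01, 0.001, 0.0001]
scores = {lr: cross_val_score(rows, lr) for lr in candidates}
best_lr = min(scores, key=scores.get)

print("cross-validated MSE per learning rate:", scores)
print("selected learning rate:", best_lr)
```

The same loop could wrap any model with a fit and a scoring step; only the candidate grid and the metric would change.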

silkspace · 2015-05-06T16:05:02Z

Thanks Corey,

We did the same (grid search + cross-validation) to find the hyperparameters. What is
interesting is that the ALS version, which does not learn the hyperparameters, seems
to consistently give the lowest log loss. I found this surprising, but it
turns out that Rendle et al. have already observed this
(http://www.ismll.uni-hildesheim.de/pub/pdfs/Rendle_et_al2011-Context_Aware.pdf);
however, I am not sure whether this is a general result for all data sets.

Best, Silk

MLnick · 2015-05-25T12:35:11Z

@silkspace are you going to release your fork? Or maybe PR against this repo for adding the ALS formulation?

silkspace · 2015-05-25T14:39:59Z

Hi All,

We started over from scratch (Matt and I). It's not the same code.

I just recently noticed this,

https://github.com/ibayer/fastFM

Best, Alex

MLnick · 2015-05-25T14:55:05Z

Thanks - will take a look at that link.

Still, are you gonna open-source your new version?

silkspace · 2015-05-25T18:15:52Z

Hi Corey,

Not sure as it was a work product that my company now technically owns. We
have been thinking about open sourcing our toolkit. I'll let you know.

Best, Alex

MLnick · 2015-05-25T19:41:50Z

Cool (I'm Nick by the way - Corey is the library author :)

The lib you linked to is pretty comprehensive, looks really good. Will test it out.

Thanks!

silkspace · 2015-05-25T19:48:50Z

Hi Nick, sorry for the mix up!

Yeah, I think the lib I linked to has more functionality than my current
implementation. I'm looking to build that out, but client work gets in the way ;->
(a good problem to have I suppose).

Best, Alex

tkuTM · 2015-07-29T10:42:14Z

Hey guys,
I also ran into the NaN issue, but with my own dataset.
The cause was that my initial_learning_rate parameter was set too high, and as the number of training samples increased the parameters exploded.
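
One way to catch this early is to check the training loss after every epoch and retry with a smaller learning rate as soon as it stops being finite. The sketch below shows that idea in plain Python on a toy linear model. It is not pylibfm code: `train_with_backoff`, the factor of 10, and the data are assumptions for illustration. A finite loss is only a necessary condition, so the returned model should still be validated.

```python
import math


def train_with_backoff(rows, learning_rate, epochs=10, max_retries=8):
    """Run SGD; if the epoch loss becomes inf or nan, restart with lr / 10.

    Returns (w, b, learning_rate_used) for the first attempt that stays finite.
    """
    for attempt in range(max_retries + 1):
        w, b = 0.0, 0.0
        diverged = False
        for epoch in range(epochs):
            loss = 0.0
            for x, y in rows:
                err = (w * x + b) - y
                loss += err * err  # multiplication overflows to inf, no exception
                w -= learning_rate * err * x
                b -= learning_rate * err
            if not math.isfinite(loss):
                diverged = True
                break
        if not diverged:
            return w, b, learning_rate
        learning_rate /= 10.0
    raise RuntimeError("no stable learning rate found")


# Toy data (assumed): unscaled feature values 1..50, target y = 2 * x + 1.
rows = [(float(x), 2.0 * x + 1.0) for x in range(1, 51)]
w, b, used_lr = train_with_backoff(rows, learning_rate=0.1)

print("learning rate that stayed finite:", used_lr)
print("fitted parameters:", w, b)
```

Each extra sample adds one more SGD update per epoch, so an unstable step size has more chances to compound on a larger training set, which is consistent with the behaviour described above.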

arogozhnikov · 2016-02-16T12:25:30Z

Just in case - I've run some simple benchmarks of pylibFM vs other LibFM implementations without tuning parameters and it gives bad results (much slower, fails on large datasets, etc.)

Post with comparison and results

Sadly, original LibFM easily won this competition.

If developers of pylibFM are interested, the code of benchmarks may be found here.

coreylynch closed this as completed May 6, 2015

chezou mentioned this issue Feb 20, 2016

Why do I get better results with libfm? ibayer/fastFM#28

Open
