Readable model stores three values per feature even with --sgd #596

martinpopel · 2015-04-24T00:22:22Z

Let's say I want non-adaptive, non-normalized sgd. So I use either --sgd or --invariant (depending on whether I want invariant updates or not).

vw -d rcv1_small.dat --save_resume --sgd --readable_model readable.model

Surprisingly, the readable.model still contains three parameters per features, although I would expect just one parameter (the weight):

34876:-1.216224 0.097365 0.808201
34877:0.097365 0.808201 -0.190733
34878:0.808201 -0.190733 -1.410626
34879:-0.190733 -1.410626 0.495522

You can notice that the third parameter of the feature n is actually the second parameter of feature n+1 and the first parameter (weight) of feature n+2.

At https://github.com/JohnLangford/vowpal_wabbit/blob/master/vowpalwabbit/gd.cc#L815
you can see

if (stride == 2) { /*either adaptive or normalized*/}
else {/*adaptive and normalized*/}

This seems that the possibility stride=1 (stride_shift=0) is not expected in the code. (--sgd --save_resume is not a common use case, but someone could use it for the effect of aggregated average loss, so this still looks like a bug.)

I tried vw -d rcv1_small.dat --save_resume --adaptive --readable_model readable.model and here I would expect two parameters per feature (the weight and the AdaGrad sum of squared gradients), but I still see three parameters in readable.model.

Note that this probably affects also binary model and also reading the model (few lines above in the code), not only writing, which may have more dangerous consequences.

The text was updated successfully, but these errors were encountered:

JohnLangford · 2015-05-06T18:11:31Z

This was a nasty rare bug---thanks for finding it. It should be fixed in
the master now.

-John

On Thu, Apr 23, 2015 at 8:22 PM, Martin Popel notifications@github.com
wrote:

Let's say I want non-adaptive, non-normalized sgd. So I use either --sgd
or --invariant (depending on whether I want invariant updates or not).

vw -d rcv1_small.dat --save_resume --sgd --readable_model readable.model

Surprisingly, the readable.model still contains three parameters per
features, although I would expect just one parameter (the weight):

34876:-1.216224 0.097365 0.808201
34877:0.097365 0.808201 -0.190733
34878:0.808201 -0.190733 -1.410626
34879:-0.190733 -1.410626 0.495522

You can notice that the third parameter of the feature n is actually the
second parameter of feature n+1 and the first parameter (weight) of feature
n+2.

At
https://github.com/JohnLangford/vowpal_wabbit/blob/master/vowpalwabbit/gd.cc#L815
you can see

if (stride == 2) { /either adaptive or normalized/}else {/adaptive and normalized/}

This seems that the possibility stride=0 is not expected in the code. (--sgd
--save_resume is not a common use case, but someone could use it for the
effect of aggregated average loss, so this still looks like a bug.)

I tried vw -d rcv1_small.dat --save_resume --adaptive --readable_model
readable.model and here I would expect two parameters per feature (the
weight and the AdaGrad sum of squared gradients), but I still see three
parameters in readable.model.

Note that this probably affects also binary model and also reading the
model (few lines above in the code), not only writing, which may have more
dangerous consequences.

Reply to this email directly or view it on GitHub
#596.

JohnLangford closed this as completed May 6, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Readable model stores three values per feature even with --sgd #596

Readable model stores three values per feature even with --sgd #596

martinpopel commented Apr 24, 2015

JohnLangford commented May 6, 2015

Readable model stores three values per feature even with --sgd #596

Readable model stores three values per feature even with --sgd #596

Comments

martinpopel commented Apr 24, 2015

JohnLangford commented May 6, 2015