Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Readable model stores three values per feature even with --sgd #596

Closed
martinpopel opened this issue Apr 24, 2015 · 1 comment
Closed

Readable model stores three values per feature even with --sgd #596

martinpopel opened this issue Apr 24, 2015 · 1 comment

Comments

@martinpopel
Copy link
Contributor

Let's say I want non-adaptive, non-normalized sgd. So I use either --sgd or --invariant (depending on whether I want invariant updates or not).

vw -d rcv1_small.dat --save_resume --sgd --readable_model readable.model

Surprisingly, the readable.model still contains three parameters per features, although I would expect just one parameter (the weight):

34876:-1.216224 0.097365 0.808201
34877:0.097365 0.808201 -0.190733
34878:0.808201 -0.190733 -1.410626
34879:-0.190733 -1.410626 0.495522

You can notice that the third parameter of the feature n is actually the second parameter of feature n+1 and the first parameter (weight) of feature n+2.

At https://github.com/JohnLangford/vowpal_wabbit/blob/master/vowpalwabbit/gd.cc#L815
you can see

if (stride == 2) { /*either adaptive or normalized*/}
else {/*adaptive and normalized*/}

This seems that the possibility stride=1 (stride_shift=0) is not expected in the code. (--sgd --save_resume is not a common use case, but someone could use it for the effect of aggregated average loss, so this still looks like a bug.)

I tried vw -d rcv1_small.dat --save_resume --adaptive --readable_model readable.model and here I would expect two parameters per feature (the weight and the AdaGrad sum of squared gradients), but I still see three parameters in readable.model.

Note that this probably affects also binary model and also reading the model (few lines above in the code), not only writing, which may have more dangerous consequences.

@JohnLangford
Copy link
Member

This was a nasty rare bug---thanks for finding it. It should be fixed in
the master now.

-John

On Thu, Apr 23, 2015 at 8:22 PM, Martin Popel notifications@github.com
wrote:

Let's say I want non-adaptive, non-normalized sgd. So I use either --sgd
or --invariant (depending on whether I want invariant updates or not).

vw -d rcv1_small.dat --save_resume --sgd --readable_model readable.model

Surprisingly, the readable.model still contains three parameters per
features, although I would expect just one parameter (the weight):

34876:-1.216224 0.097365 0.808201
34877:0.097365 0.808201 -0.190733
34878:0.808201 -0.190733 -1.410626
34879:-0.190733 -1.410626 0.495522

You can notice that the third parameter of the feature n is actually the
second parameter of feature n+1 and the first parameter (weight) of feature
n+2.

At
https://github.com/JohnLangford/vowpal_wabbit/blob/master/vowpalwabbit/gd.cc#L815
you can see

if (stride == 2) { /either adaptive or normalized/}else {/adaptive and normalized/}

This seems that the possibility stride=0 is not expected in the code. (--sgd
--save_resume is not a common use case, but someone could use it for the
effect of aggregated average loss, so this still looks like a bug.)

I tried vw -d rcv1_small.dat --save_resume --adaptive --readable_model
readable.model and here I would expect two parameters per feature (the
weight and the AdaGrad sum of squared gradients), but I still see three
parameters in readable.model.

Note that this probably affects also binary model and also reading the
model (few lines above in the code), not only writing, which may have more
dangerous consequences.

Reply to this email directly or view it on GitHub
#596.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants