-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some feedback on vw and vw-hypersearch #406
Comments
@trufanov-nok thanks so much for all this feedback. It is very helpful. I wrote most of that problematic code in |
Nice to hear that! Looking forward for next VW update. That's a great project! |
This is helpful, thanks. The min_label and max_label for logistic loss are actually set to -50 The -b segfault is a real bug, now fixed. Can you update the documentation for the random seed? It's a wiki. The incoherence in dealing with arguments is an issue. Internally, In order to use logistic loss, you need to specify --loss_function logistic. The string in a namespace affects the hash value of all features in the -John On 09/24/2014 09:01 AM, trufanov-nok wrote:
|
Small update on this. I've fixed all the vw-hypersearch issues except Brent. Still doing testing. Will submit a pull request soon. |
Hi all! Very long explanations of how best constants are calculated is published and my understanding of how it was done before could be found here: Except for constants calculation the important piece is my investigation of usage --initial_t in example count calculation (end of p.3). I suspect it was done in a wrong way. Current implementation removes the influence of --initial_t and its value is not affecting best constant anymore. Some tests are published here: As fixed best constant functionality supports all loss functions - it's quite big piece of code. I would consider storing it in a separate best_constant.cc file to keep main.cc small, but I have no VS on my linux machine and won't risk to edit project/make files manually. If there won't be any problems with that I'll make a pull request for it as main.cc modification. |
This sounds very useful. Please place it in a separate .cc file and -John On 11/17/2014 04:29 AM, Alexander Trufanov wrote:
|
Alex, I'm very impressed by this thorough work on a more general and correct best constant. Thank you! |
Yes, thanks. |
It looks like IV is the only issue remaining in the core VW? |
Issue IV might be fixed now(?) Anyways, closing this because too much has changed since then. |
Hi,
I've used VW for a first time for one of kaggle's competition and would like to share some feedback. That's a big post as most of following are minor buglets or my wishes - no any serious problems found. Being a inexperienced user I could be wrong in some cases:
I. best constant's loss.
best constant's loss value won't be printed out by VW in case of '--loss_function logistic' is specified. Although the best constant itself is printed. The corresponding code is following:
if (all->sd->min_label == 0. && all->sd->max_label == 1. && best_constant < 1. && best_constant > 0.)
cerr << endl << "best constant's loss = " << constant_loss;
https://github.com/JohnLangford/vowpal_wabbit/blob/master/vowpalwabbit/main.cc#L97
LogLoss enforces usage of [-1,1] labels so I suspect that min_label == -1 in this case. Also would be great if the value will be postprocessed according to --link function specified in command line (I have used '--link:logistic').
II. Crash in case of -b mismatch
Although there is a -b values check in the code:
if (all.default_bits != true && all.num_bits != local_num_bits)
{
cout << "vw: -b bits mismatch: command-line " << all.num_bits << " != " << local_num_bits << " stored in model" << endl;
throw exception();
}
https://github.com/JohnLangford/vowpal_wabbit/blob/master/vowpalwabbit/parse_regressor.cc#L100
It's not working for some reason. Example:
$ cat dummy.vw
-1 | price:.23 sqft:.25 age:.05 2006
1 2 second_house| price:.18 sqft:.15 age:.35 1976
-1 1 0.5 third_house| price:.53 sqft:.32 age:.87 1924
$ vw -d dummy.vw -b 20 -f dummy.model
Num weight bits = 20
average since example example current current current
loss last counter weight label predict features
1.000000 1.000000 1 1.0 -1.0000 0.0000 5
3.000000 4.000000 2 3.0 1.0000 -1.0000 5
$ vw -t dummy.vw -b 21 -i dummy.model
only testing
Num weight bits = 21
average since example example current current current
loss last counter weight label predict features
0.595190 0.595190 1 1.0 -1.0000 -0.2285 5
0.667309 0.703368 2 3.0 1.0000 0.1613 5
average loss = 0.807461
$ vw -t dummy.vw -b 10 -i dummy.model
only testing
Num weight bits = 10
learning rate = 10
initial_t = 1
power_t = 0.5
Seg fault (mery dump was saved) <-- the app crashed, no any '-b bits mismatch' message in terminal.
III. random_weights' arg
Official tutorial isn't clear on --random_weights arg meaning.
https://github.com/JohnLangford/vowpal_wabbit/wiki/Command-line-arguments
And in zinkov's manual it's described as
Weights unless stated are all initialized at 0, they can start at another value with --initial_weight w or be randomized with the --random_weights seed option.
http://zinkov.com/posts/2013-08-13-vowpal-tutorial/
I would expect the same - that arg shall be seed. And it works with any seed value specified. But from the code it could be seen that its arg is a bool value.
https://github.com/JohnLangford/vowpal_wabbit/blob/master/vowpalwabbit/parse_args.cc#L907
And there is another option --random_seed to specify seed value. So if you are inattentive reader like me you can end up with passing seed with --random_weights instead of --random_seed and it's hard to realise that it's not working as expected bcs any value converted to bool correcly. Would be great if documentation could be a bit more clear on that.
IV. Inconsistency in param specification for vw [-d -f] /[-t -i] steps
Example:
$ vw -d dummy.vw -f dummy.model --ngram 2
Generating 2-grams for all namespaces.
....
$ vw -t dummy.vw -i dummy.model --ngram 2
Generating 2-grams for all namespaces.
Generating 2-grams for all namespaces.
only testing
...
Ok, I've got ngram generation message twice. So that info is stored in model file and I shouldn't specify ngram for -t step anymore. After that I'm getting only one line for -t. Ok but:
$ vw -d dummy.vw -f dummy.model -qab
creating quadratic features for pairs: ab
...
$ vw -t dummy.vw -i dummy.model
only testing
Num weight bits = 18
...
For -q vw doesn't print out any info message on -t -i step, but I'm sure tat q value was saved in model file too (according to printed feature numbers). That could confuse user and make him unnecessary specify -q or unnecessary remove --ngram for -t -i step.
3rd case: --loss:logistic
$ vw -d dummy.vw -f dummy.model --loss:logistic
...
$ vw -t dummy.vw -i dummy.model --loss:logistic
only testing
...
vw: unrecognised option '--loss:logistic'
You can't specify --loss param with -t at all! VW halts! And I know its value is stored in model file too.
I think all 3 cases shall be aligned.
V. Namespaces
According to https://github.com/JohnLangford/vowpal_wabbit/wiki/Input-format and http://zinkov.com/posts/2013-08-13-vowpal-tutorial/ namespace is a string. And following sample was given:
1 1.0 |MetricFeatures:3.28 height:1.5 length:2.0 |Says black
That's incorrect. Code investigation and practice demonstrates that namespace's string could contain only one literal. Like 'n' or 'Q'. Or perhaps only first latter of string is used.
It can't be even a number like '2' as in that case --ngram 22 become ambiguous. And thus please correct the documentation.
P.S. I was very upset by the fact I can't use one --ngram specification for several namespaces, like --ngram ab2. I have to use --ngram a2 --ngram b2. Same for --skips.
VI. brents search
I found brent's search implementation in vw-hypersearch script:
https://github.com/JohnLangford/vowpal_wabbit/blob/master/utl/vw-hypersearch#L263
It's commented with following:
As transcribed from wikipedia.
Doesn't give a better result than GoldenSection (yet)
Need to figure out why.
and hidden from the end user. Looks like mentioned wikipedia's code is:
http://en.wikipedia.org/wiki/Brent%27s_method
and wiki's line
if f(a) f(b) >= 0 then exit function because the root is not bracketed.
corresponds to script's:
if ($fa * $fb >= 0) {
if ($fa < $fb) {
return opt_value($a);
} else {
return opt_value($b);
}
}
on https://github.com/JohnLangford/vowpal_wabbit/blob/master/utl/vw-hypersearch#L277
Well, I suspect it just shouldn't work. Didn't go deep into that, but looks like wiki's code is for finding roots of f(x) = 0. And " As with the bisection method, we need to initialize Dekker's method with two points, say a0 and b0, such that f(a0) and f(b0) have opposite signs. If f is continuous on [a0, b0], the intermediate value theorem guarantees the existence of a solution between a0 and b0.".
In our case f(x) is an average loss function and it's positive for all x.
I would suggest to transcribe brents search from c++ code of boost library instead:
http://svn.boost.org/svn/boost/trunk/boost/math/tools/minima.hpp
VII. Hypersearch with LogLoss.
In case -t option is specified for vw-hypersearch:
vw-hypersearch -t test.vw 0..1 vw train.vw
the script takes "vw ..." part of the command and adds " -f temp.model && vw -t test.vw -i temp.model" to it. After its execution completion it searches for "Average loss: " line from bottom to top bcs both vw -d and vw -t will print out such line and we are optimising the -t one. Generally temp.model filename is generated at runtime but if it was specified in vw -d part - it'll be copied to vw -t part.
Unfortunately that's all that script could automatically copy from vw -d to vw -t. That means that vw -t step will always use default --loss_function value. Perhaps this value is stored in the model file (i doubt so) but still script works with vw -t output printed to stdout, and square loss influence clearly presence there.
To fix that in my project I've replaced this line with hardcoded value
push(@argv, '&&', $ARGV[0], '-t', '-i', $ModelFile, $opt_t);
new one:
push(@argv, '&&', $ARGV[0], '-t', '--loss_function logistic', '-i', $ModelFile, $opt_t);
https://github.com/JohnLangford/vowpal_wabbit/blob/master/utl/vw-hypersearch#L570
But would be great if this will be fixed for any --log_loss values specified for vw -d step.
VIII.Hypersearch without input file.
Once I launched vw-hyperseach without -d param or input file. Simple example:
$ ../vowpal_wabbit/utl/vw-hypersearch 0.1 1.0 vw -l %
trying 0.656230589874905 .........
My command was long, contained a lot of params including -t file for vw-hypersearch, so i didn't notice that there is no -d value. And I was sure that everything is ok next 10 mins as I saw initial progress printed out by script. Eventually I realised that dots progress got stuck for some reason and checked my task manager to find that vw doesn't consume CPU. It was waiting for stdin data. Would be great if the script at least print something like 'using stdin' in that case or won't print these dots.
IX. Hyper if hypersearch and --holdout_off
Hypersearch doesn't work if user has specified --passes without --holdout_off in vw -d command line part. It enforces --holdout_off automatically. The related code is there:
https://github.com/JohnLangford/vowpal_wabbit/blob/master/utl/vw-hypersearch#L546
There is a comment:
Not very nice, but unfortunately, average_loss when
hold-out is in effect (default), is printed as 0.
I couldn't reproduce the problem author refers to. Average result is not 0 and the script shall work for average_loss values between 0 and 1 (if author meant that). I have just commented that check. I suppose that there was a problem earlier bcs with holdout option the average result is printed with suffix 'h' at the end. But currently used regexp ("/^average loss\s_=\s_(\S+)/") in this script process it normally as suffix is space separated. So my assumption is that there was a problem with parsing average result before, but since some vw update or script's regexp tweak there is no such problem anymore. But the unnecessary holdout enforcement is still there. My suggestion is just to comment lines 550-551 - it should work.
X. Hypersearch and --quiet
Would be nice if vw-hypersearch will automatically remove --quiet param that could be accidentally passed in vw -d part of it's arguments. With --quiet it stucks without progress as it doesn't get any output from vw. Example:
../vowpal_wabbit/utl/vw-hypersearch 0.1 1.0 vw dummy.vw -l % --quiet
trying 0.656230589874905
The text was updated successfully, but these errors were encountered: