Fix to save the state of FTRL models #1912
…n by average length of the feature vectors
…than default one in ooa and cbify
@@ -153,8 +153,9 @@ def do_test(filename, args, verbose=None, repeat_args=None, known_failure=False)
errors += do_test(filename, '--loss_function logistic --link logistic')
errors += do_test(filename, '--nn 2')
errors += do_test(filename, '--binary')
errors += do_test(filename, '--ftrl', known_failure=True)
errors += do_test(filename, '--ftrl')
errors += do_test(filename, '--pistol', known_failure=True)
Are pistol and coin still in known failure?
Yes. As I wrote above, the data structure is saved correctly, but the weights differ between a model trained continuously and one that is trained, saved, and resumed. The difference does not appear immediately, but only after some samples. This is why I suspect numerical issues.
This violates my understanding of computers so I suspect there is something that we're missing. I probably won't be able to debug before the next release, but it should get done...
I am trying to debug this issue a bit more
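The precision-loss hypothesis raised in this thread can be checked in isolation. The following is a minimal sketch, not VW code: it assumes weights are stored as 32-bit floats, and the helper names (`to_f32`, `save_binary`, `save_text`, and so on) are hypothetical. It shows that a raw binary round trip is bit-exact, whereas a text round trip is exact only when enough significant digits are written.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def save_binary(values):
    """Serialize values as raw little-endian 32-bit floats (hypothetical format)."""
    return struct.pack(f"<{len(values)}f", *values)


def load_binary(data):
    """Read back raw little-endian 32-bit floats."""
    return list(struct.unpack(f"<{len(data) // 4}f", data))


def save_text(values, digits):
    """Serialize values as decimal text with a fixed number of significant digits."""
    return " ".join(f"{v:.{digits}g}" for v in values)


def load_text(text):
    """Parse decimal text and store each value as a 32-bit float again."""
    return [to_f32(float(token)) for token in text.split()]


# Example weights that are already exactly representable as 32-bit floats.
weights = [to_f32(v) for v in (1 / 3, 2 / 7, 0.001234567, 123.456)]

print("binary exact:      ", load_binary(save_binary(weights)) == weights)
print("text, 9 digits:    ", load_text(save_text(weights, 9)) == weights)
print("text, 6 digits:    ", load_text(save_text(weights, 6)) == weights)
```

Under these assumptions the binary and 9-digit round trips reproduce the weights exactly and only the 6-digit text round trip does not. If the model file is written in binary, a divergence after resuming would therefore point to state that is missing or restored differently rather than to rounding.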
vowpalwabbit/global_data.h
Outdated
@@ -565,6 +565,7 @@ struct vw
bool adaptive; // Should I use adaptive individual learning rates?
bool normalized_updates; // Should every feature be normalized
bool invariant_updates; // Should we use importance aware/safe updates
uint32_t ftrl_size;
Could we pass this as an argument to save_load instead? That seems more elegant than sticking it in the global data structure.
Done.
Oops, I merged. Should I revert, or do you want to do a separate pull request?
Found the bug; I'm not sure what the best way to fix it is, see mail.
The test 67 failure is fixed in the current patch that I'm working on.
This reverts commit 538e9bb.
* tag '8.7.0': (354 commits) Update version to 8.7.0 (VowpalWabbit#1926) Fix misconfiguration (VowpalWabbit#1925) Ataymano ataymano/warnings fixes (VowpalWabbit#1924) Update new version script (VowpalWabbit#1922) Run clang-format over codebase (VowpalWabbit#1921) change semantics of lambda (VowpalWabbit#1920) Bremen79 fix save ftrl (VowpalWabbit#1919) fix for daemon race condition (VowpalWabbit#1918) Revert "Fix to save the state of FTRL models (VowpalWabbit#1912)" (VowpalWabbit#1916) fix static library build (VowpalWabbit#1913) more warnings (VowpalWabbit#1915) Fix to save the state of FTRL models (VowpalWabbit#1912) remove warnings (VowpalWabbit#1911) fix closing invalid file descriptor with memory_io_buf (VowpalWabbit#1910) Optional exception (VowpalWabbit#1906) Contextual Memory Tree (VowpalWabbit#1799) Coin betting (VowpalWabbit#1903) Ataymano/c wrapper fix2 (VowpalWabbit#1859) Use Appveyor MSBuildLogger (VowpalWabbit#1904) fix for no label confidence (VowpalWabbit#1901) ...
I have made a fix to save the state of FTRL models. Currently the state is not saved at all, because FTRL calls the gd save function, which does not save the state unless it finds a gd structure, and that structure is absent in FTRL models.
I have added an integer to the vw structure that indicates the size of the FTRL model to save, and changed the gd read/write code accordingly.
Note that saving for Pistol and Coin betting still does not work: the state is saved correctly, but the test fails. I am not sure why; I think it is the loss in precision due to the saving, combined with the exponential nature of these algorithms. On the other hand, saving the state for FTRL Proximal now works perfectly.
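To illustrate why the optimizer state has to be written alongside the weights, here is a minimal sketch of the textbook per-coordinate FTRL-Proximal update with squared loss. It is not the VW implementation: the class `FtrlProximal`, its `save_full`/`load_full` methods, and the toy data are all hypothetical, and it assumes each weight carries two accumulators (`z` and `n`) as in the standard formulation.

```python
import math


class FtrlProximal:
    """Textbook per-coordinate FTRL-Proximal for squared loss on dense inputs."""

    def __init__(self, dim, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * dim  # accumulated adjusted gradients
        self.n = [0.0] * dim  # accumulated squared gradients

    def weights(self):
        """Weights are derived from (z, n); they are not stored independently."""
        out = []
        for z, n in zip(self.z, self.n):
            if abs(z) <= self.l1:
                out.append(0.0)
            else:
                sign = 1.0 if z > 0 else -1.0
                denom = (self.beta + math.sqrt(n)) / self.alpha + self.l2
                out.append(-(z - sign * self.l1) / denom)
        return out

    def update(self, x, y):
        w = self.weights()
        pred = sum(wi * xi for wi, xi in zip(w, x))
        for i, xi in enumerate(x):
            g = (pred - y) * xi  # gradient of 0.5 * (pred - y)^2
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w[i]
            self.n[i] += g * g

    def save_full(self):
        """Snapshot both accumulators, i.e. the full per-weight state."""
        return {"z": list(self.z), "n": list(self.n)}

    def load_full(self, state):
        self.z, self.n = list(state["z"]), list(state["n"])


data = [([1.0, 0.5], 1.0), ([0.2, -1.0], 0.0), ([1.5, 1.0], 2.0), ([-0.3, 0.8], 0.5)]
first, second = data[:2], data[2:]

continuous = FtrlProximal(2)
for x, y in data:
    continuous.update(x, y)

# Resume with the full state restored.
part = FtrlProximal(2)
for x, y in first:
    part.update(x, y)
resumed = FtrlProximal(2)
resumed.load_full(part.save_full())
for x, y in second:
    resumed.update(x, y)

# Resume with the accumulators lost (reset to zero), as if they were never saved.
lossy = FtrlProximal(2)
for x, y in second:
    lossy.update(x, y)

print("full state matches continuous:", resumed.weights() == continuous.weights())
print("lost state matches continuous:", lossy.weights() == continuous.weights())
```

In this sketch, restoring both accumulators makes resumed training reproduce continuous training exactly, while resetting them does not, which is the behaviour the `--ftrl` save/resume test is meant to check.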