-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: store l1 and l2 as model state and not command line arguments #3654
refactor: store l1 and l2 as model state and not command line arguments #3654
Conversation
vowpalwabbit/vw_versions.h
Outdated
@@ -59,5 +59,9 @@ constexpr VW::version_struct VERSION_PASS_UINT64{8, 3, 3}; | |||
|
|||
/// Added serialized seen min and max labels in the --active reduction | |||
constexpr VW::version_struct VERSION_FILE_WITH_ACTIVE_SEEN_LABELS{9, 0, 0}; | |||
|
|||
/// Moved option values from command line to model data | |||
constexpr VW::version_struct VERSION_FILE_WITH_L1_AND_L2_STATE_IN_MODEL_DATA{8, 11, 0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decide if okay for 9 or later and update accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make sure we decide this before merging the PR, since it could be a little awkward to have to change this once it is in the repo, and it could cause odd situations where a source-build does the wrong thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The source build should do the right thing according to version.txt. It is just whether or not this functionality is "turned on"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update this to 9.0.0 prior to merging
36e7563
to
880153a
Compare
Wondering about a RunTest (or unit test) to verify things since this is a gd change. Are there some tests the cover save/resume with non trivial l1, l2 values? |
@@ -1211,12 +1230,10 @@ base_learner* setup(VW::setup_base_i& stack_builder) | |||
.default_value(0.f) | |||
.help("Degree of l2 regularization applied to activated sparse parameters")) | |||
.add(make_option("l1_state", all.sd->gravity) | |||
.keep(all.save_resume) | |||
.default_value(0.) | |||
.default_value(L1_STATE_DEFAULT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the model file does not have l1,l2 (i.e. older model), and we don't pick up l1, l2 values from command line will this be breaking change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, if neither the model file or command line contains l1/l2 information then the default value will be used as before. There is no change in behavior
Test 1 in slow.vwtest.json handles saving and loading models with --l1 and --l2 which causes l1_state and l2_state to be modified. ( vowpal_wabbit/test/save_resume_test.py Line 190 in 2f189ca
I can add more tests though. I'll test the override behavior. |
This change is important as these options being serialized as a float or double matters as enough precision is lost to affect save_resume model in the float case. Moving this to the model resolves the todo, makes it consistent with all other state and removes the need for options to understand double precision.
This change automatically supports "upgrading" models. If an old keep style model is loaded it will update the state accordingly and on subsequent save/load will use the save load data section
When loading value is
Must merge after #3656