-
Notifications
You must be signed in to change notification settings - Fork 280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CLIPPER-105] Refactor selection policy to support single-model applications with default output #89
[CLIPPER-105] Refactor selection policy to support single-model applications with default output #89
Conversation
a269c57
to
614dafd
Compare
Test PASSed. |
Test FAILed. |
614dafd
to
33e8d84
Compare
Test PASSed. |
Regarding required changes to Reviewing this now... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having trouble running the end-to-end benchmark with these policy changes. Can you take a look at this?
In general, the changes make sense - comments are mostly nits and documentation.
@@ -25,7 +25,7 @@ size_t state_key_hash(const StateKey& key); | |||
// Threadsafe, non-copyable state storage | |||
class StateDB { | |||
public: | |||
StateDB(); | |||
explicit StateDB(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A zero-argument constructor doesn't benefit from the explicit
keyword
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I added a single-argument constructor then deleted it and forgot to remove explicit.
std::function<size_t(const VersionedModelId&)>>; | ||
|
||
class BanditPolicyState { | ||
class SelectionState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have some documentation here explaining the role of selection state in enacting selection policies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
SelectionState(SelectionState&&) = default; | ||
SelectionState& operator=(SelectionState&&) = default; | ||
virtual ~SelectionState() = default; | ||
virtual std::string get_debug_string() const = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would this be called? Is there an intended format for the debug string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is called from the admin frontend. There's no particular format any policy needs to conform to. Whatever the implementer thinks would be useful information to expose for debugging/runtime inspection.
DefaultOutputSelectionPolicy& operator=(DefaultOutputSelectionPolicy&&) = | ||
default; | ||
~DefaultOutputSelectionPolicy() = default; | ||
std::shared_ptr<SelectionState> init_state(Output default_output) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this method do? It seems like state initialization will be necessary for many selection policies; we should consider defining a virtual init_state
method in SelectionPolicy
and overriding it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with defining a virtual init_state method is that it's likely the arguments to it will vary based on the selection policy. So I'm not sure what the function signature would be. For the 0.1 release, we only have one selection policy (the DefaultOutputSelectionPolicy). init_state is called each time a new application is created, which is the correct behavior for this selection policy. When we expand our functionality in 0.2 to support more sophisticated types of selection policies, we will need to revisit this. But I already knew we would need to refine our selection policy API anyway. CLIPPER-99 will track this effort.
// Turn y_hat into either 0 or 1 | ||
if (y_hat < 0.5) { | ||
y_hat = 0.0; | ||
int num_candidate_models = query.candidate_models_.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this should be a size_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/libclipper/test/redis_test.cpp
Outdated
@@ -147,17 +147,19 @@ TEST_F(RedisTest, AddApplication) { | |||
std::make_pair("simple_svm", 2), std::make_pair("music_cnn", 4)}; | |||
InputType input_type = InputType::Doubles; | |||
std::string policy = "exp3_policy"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer an accurate name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
auto bytes = Exp3Policy::serialize_state(state); | ||
auto new_state = Exp3Policy::deserialize_state(bytes); | ||
ASSERT_EQ(state.weight_sum_, new_state.weight_sum_); | ||
TEST_F(DefaultOutputSelectionPolicyTest, InitState) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an empty test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. Removed it. There's not much to test there.
ASSERT_EQ(state.weight_sum_, new_state.weight_sum_); | ||
TEST_F(DefaultOutputSelectionPolicyTest, InitState) {} | ||
|
||
TEST_F(DefaultOutputSelectionPolicyTest, TestSelectPredictTasks) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add some documentation or use a more descriptive test name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split the SelectPredictTask and CombinePredictions tests into individual test cases with more descriptive test names.
src/frontends/src/query_frontend.hpp
Outdated
// selection policies have a default output? | ||
// | ||
// Initialize selection state for this application | ||
if (policy == "DefaultOutputSelectionPolicy") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should define a static get_name()
function within each selection policy.
policy == DefaultOutputSelectionPolicy::get_name()
is cleaner and less error prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
std::shared_ptr<SelectionPolicy> current_policy = current_policy_iter->second; | ||
|
||
auto state_opt = state_db_->get(StateKey{query.label_, query.user_id_, 0}); | ||
if (!state_opt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After changing the specified selection policy in the end-to-end benchmark from "EXP3" to "DefaultOutputSelectionPolicy", the benchmark is still logging an error here:
[04:22:23.560][error] [QUERYPR...] No selection state found for query with label: test
Test FAILed. |
e29ee8c
to
055b914
Compare
I fixed the |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. We shouldn't merge this until CLIPPER-111 is addressed and the tutorial is fixed.
055b914
to
8c66acf
Compare
Test PASSed. |
Once #115 is merged, this can be rebased on develop and will be ready for merge. |
Note: Tutorial calls to |
8c66acf
to
5beb668
Compare
@Corey-Zumar I rebased on develop to include the model versioning PR #115 and fixed the tutorial to specify a default output instead of a selection policy. I believe this is ready to go. Definitely do a last pass over the code, and run the tutorial to make sure it works (I ran it as well without any problems). When you run the tutorial, note that you'll need to build the Clipper docker containers locally to test. |
Test PASSed. |
Test PASSed. |
Confirmed that the |
This is a big PR, but it's not quite as big as it looks. I refactored the JSON utils to split it between a .hpp and a .cpp file and I fixed some formatting with the code formatter. The big changes are in the query processor and the selection policies.
I refactored the selection policy API to be an abstract base class that implementations inherit from. This makes adding a new policy much simpler than the templates-based API we had previously and cleans up the QueryProcessor implementation substantially. I also ripped out all the existing policies and added the DefaultOutputSelectionPolicy which returns the model prediction if it arrives in time, and otherwise returns a static default output that is configured on a per-application basis.
Currently this PR does not update
clipper_manager.py
or the tutorial to the new selection policy implementation, which means they are broken.Do you think we should update them here and fix this all in one PR so that we go from working codebase to working codebase? Or should we separate them into to PRs because they are somewhat distinct changes?