[Serve] Reconfigure backend class at runtime #11709

Merged: 11 commits, Nov 9, 2020

Contributor

@architkulkarni architkulkarni commented Oct 29, 2020

Allows a user to add a reconfigure method to their backend class and then update the backend (all current and future replicas) by setting the user_config field of BackendConfig. The new user_config value is passed to the reconfigure method. The following example from the docs shows the usage:

import random

import requests
from ray.serve import BackendConfig

# Assumes `client` is an existing Ray Serve client (for example, the
# object returned by serve.start()).


class Threshold:
    def __init__(self):
        # self.model won't be changed by reconfigure.
        self.model = random.Random()  # Imagine this is some heavyweight model.

    def reconfigure(self, config):
        # This will be called when the class is created and when
        # the user_config field of BackendConfig is updated.
        self.threshold = config["threshold"]

    def __call__(self, request):
        return self.model.random() > self.threshold


backend_config = BackendConfig(user_config={"threshold": 0.01})
client.create_backend("threshold", Threshold, config=backend_config)
client.create_endpoint("threshold", backend="threshold", route="/threshold")
print(requests.get("http://127.0.0.1:8000/threshold").text)  # true, probably

backend_config = BackendConfig(user_config={"threshold": 0.99})
client.update_backend_config("threshold", backend_config)
print(requests.get("http://127.0.0.1:8000/threshold").text)  # false, probably
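
The lifecycle described in the example's comments (reconfigure runs once at creation and again on every user_config update, while state built in __init__ is left alone) can be illustrated without a Ray cluster. The following is a minimal sketch under stated assumptions, not Serve's implementation: `SketchReplica` and its `update_user_config` method are hypothetical stand-ins for a single replica.

```python
import random


class Threshold:
    def __init__(self):
        # Stands in for a heavyweight model; built exactly once.
        self.model = random.Random(0)
        self.threshold = None

    def reconfigure(self, config):
        self.threshold = config["threshold"]

    def __call__(self, request=None):
        return self.model.random() > self.threshold


class SketchReplica:
    """Hypothetical stand-in for one replica; not part of Ray Serve."""

    def __init__(self, backend_cls, user_config):
        self.backend = backend_cls()          # expensive init happens once
        self.update_user_config(user_config)  # reconfigure at creation

    def update_user_config(self, user_config):
        # Only the user-defined reconfigure hook runs on an update.
        self.backend.reconfigure(user_config)


replica = SketchReplica(Threshold, {"threshold": 0.01})
model_before = replica.backend.model

replica.update_user_config({"threshold": 0.99})
```

After the update, the replica holds the new threshold but still uses the same model object, which is the point of separating reconfigure from __init__.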

Why are these changes needed?

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@architkulkarni architkulkarni changed the title [WIP] [Serve] Reconfigure backend [WIP] [Serve] Allow user to reconfigure backend Oct 29, 2020
@architkulkarni architkulkarni changed the title [WIP] [Serve] Allow user to reconfigure backend [WIP] [Serve] Allow user to reconfigure backend class at runtime Oct 29, 2020
@architkulkarni architkulkarni changed the title [WIP] [Serve] Allow user to reconfigure backend class at runtime [WIP] [Serve] Reconfigure backend class at runtime Oct 29, 2020
Contributor

@edoakes edoakes left a comment

Looks good at a high level to me. Can we mark this as experimental or something like that for now? Just have a note in the docs.

@@ -34,3 +34,6 @@
    2000,
    5000,
]

#: Name of backend reconfiguration method implemented by user.
BACKEND_RECONFIGURE_METHOD = "reconfigure"
Contributor

shouldn't this be user_reconfigure?

Contributor Author

Sure, I forgot we had picked that name, but it makes sense.

Contributor

Oh I don't think we settled on a name, that's just the method I saw in other places in this PR

Contributor Author

Ah I see. Before I had user_reconfigure as our internal method and reconfigure as the name of the method the user provides in the class. I think I'll just make them both reconfigure for consistency.

@architkulkarni architkulkarni changed the title [WIP] [Serve] Reconfigure backend class at runtime [Serve] Reconfigure backend class at runtime Nov 3, 2020
@architkulkarni architkulkarni marked this pull request as ready for review November 3, 2020 00:26
@@ -249,6 +257,10 @@ def create_backend(
                update={"internal_metadata": metadata})
        else:
            raise TypeError("config must be a BackendConfig or a dictionary.")
        if backend_config.user_config and not inspect.isclass(func_or_class):
Contributor

Let's also make sure reconfigure is a method in the class so the user gets this error immediately.

Contributor Author

Sounds good, we do that check in backend_worker currently. That way, it's checked upon calling create_backend, as well as upon calling update_backend_config. I will move the inspect.isclass check into backend_worker as well, since that should also happen for both create_backend and update_backend_config.
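
The two checks discussed in this thread (the backend must be a class, and the class must define the reconfigure method) could be combined into one fail-fast helper. This is a rough sketch only: `validate_user_config` is a hypothetical name, the exception types and messages are assumptions, and it tests `user_config is None` where the diff above tests truthiness. The constant mirrors the one added in this PR.

```python
import inspect

# Mirrors the constant added in this PR.
BACKEND_RECONFIGURE_METHOD = "reconfigure"


def validate_user_config(func_or_class, user_config):
    """Hypothetical helper: fail fast if user_config cannot be delivered."""
    if user_config is None:
        return  # nothing to deliver, so any backend is acceptable

    # A plain function has no instance on which to call reconfigure.
    if not inspect.isclass(func_or_class):
        raise TypeError("user_config requires a class-based backend.")

    # Look the method up by name so the name lives in one place.
    method = getattr(func_or_class, BACKEND_RECONFIGURE_METHOD, None)
    if not callable(method):
        raise AttributeError(
            f"user_config was set, but {func_or_class.__name__} does not "
            f"define a '{BACKEND_RECONFIGURE_METHOD}' method.")
```

Calling the same helper from both the create and update paths would give the user the same error in either case, which is the behavior the comment above is after.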

    """

    internal_metadata: BackendMetadata = BackendMetadata()
    num_replicas: PositiveInt = 1
    max_batch_size: Optional[PositiveInt] = None
    batch_wait_timeout: float = 0
    max_concurrent_queries: Optional[int] = None
    user_config: Dict[str, Any] = None
Contributor

Shouldn't this just be Any instead of Dict? We are just passing it to the user's method directly.
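
To illustrate the suggestion: if the framework only forwards user_config to the user's method, the field can be typed as Any and the user decides its shape. The sketch below uses a standard-library dataclass rather than the real config class; `SketchBackendConfig` and `Scaler` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SketchBackendConfig:
    """Hypothetical, simplified stand-in for BackendConfig."""

    num_replicas: int = 1
    # Opaque to the framework: passed through to reconfigure unchanged.
    user_config: Any = None


class Scaler:
    def reconfigure(self, config):
        # The user chooses the shape; here a bare number is enough.
        self.factor = float(config)


cfg = SketchBackendConfig(user_config=2.5)
scaler = Scaler()
scaler.reconfigure(cfg.user_config)
```

With a Dict[str, Any] annotation, a validating config class would reject the bare float above even though the user's method handles it fine.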

@@ -48,6 +48,42 @@ def function(flask_request):
    assert resp == "POST"


def test_backend_user_config(serve_instance):
    client = serve_instance

Contributor

let's use SignalActor to synchronize this instead of sending 100 queries

Contributor Author

@simon-mo Actually I'm not sure about this--I don't think we need any synchronization here, we just want a way of asking each replica if it got updated. Since replicas are called in round-robin, it's enough to call handle.remote() three (or 100) times, and these calls don't have to happen simultaneously. Does that make sense?

If we did want to use SignalActor, I'm not sure how that would work, since we don't have a "handle" for each replica, we just have one for the endpoint.

Contributor

Good point.
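
The argument in this thread is that, under strict round-robin routing, sequential calls through a single endpoint handle reach every replica, so no cross-replica synchronization is needed to check that each one was updated. A toy model of that reasoning, assuming strict round-robin as the comment does: `StubReplica` and `RoundRobinEndpoint` are hypothetical and do not reflect Serve's router.

```python
import itertools


class StubReplica:
    def __init__(self):
        self.threshold = None

    def reconfigure(self, config):
        self.threshold = config["threshold"]

    def handle(self):
        # Report the currently configured value, like a test backend would.
        return self.threshold


class RoundRobinEndpoint:
    """Hypothetical router that cycles through replicas in order."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(replicas)

    def call(self):
        return next(self._cycle).handle()


replicas = [StubReplica() for _ in range(3)]
endpoint = RoundRobinEndpoint(replicas)

# Simulate a user_config update reaching every replica.
for replica in replicas:
    replica.reconfigure({"threshold": 0.99})

# One sequential call per replica visits each replica exactly once.
observed = [endpoint.call() for _ in range(len(replicas))]
```

If routing were not strictly round-robin, a fixed number of calls would no longer guarantee coverage of every replica, which is presumably why the test sends more queries than there are replicas.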

@architkulkarni architkulkarni added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Nov 3, 2020
@architkulkarni architkulkarni added tests-ok The tagger certifies test failures are unrelated and assumes personal liability. and removed @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. labels Nov 5, 2020
Contributor

edoakes commented Nov 6, 2020

@architkulkarni ah, looks like there are some conflicts now. Mind resolving them so we can merge?

@architkulkarni architkulkarni added @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. tests-ok The tagger certifies test failures are unrelated and assumes personal liability. and removed tests-ok The tagger certifies test failures are unrelated and assumes personal liability. @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. labels Nov 6, 2020
@edoakes edoakes merged commit adcaabc into ray-project:master Nov 9, 2020
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.