
Disable autoscaling tracking on agent for models that are fixed #4501

Merged
merged 13 commits into from Dec 17, 2022
Conversation

sakoush
Member

@sakoush sakoush commented Dec 14, 2022

What this PR does / why we need it:
Previously the agent would track autoscaling metrics for all models, regardless of whether the user wanted to autoscale them, and the scheduler would later reject any autoscaling events. The drawback of this strategy is that unnecessary gRPC messages flow from agent to scheduler.

This PR disables tracking of autoscaling metrics for models that are fixed (i.e. autoscaling is not set on them), so no events are fired from agent to scheduler. This is done by the scheduler setting an enable-autoscaling flag during model load (see the sketch below). There is a caveat with this strategy though:

  • we cannot set autoscaling later, after the model (replica) has been loaded onto a particular server; the user would have to create a new version and do a rolling update

This PR also adds a disable-autoscaling flag to the scheduler command-line arguments to disable the (model) autoscaling service. This flag is set by default in the local (docker-compose) deployment, as it is not possible to add model replicas given we have only one server replica (so far).
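For illustration, here is a minimal Go sketch of the load-time flag idea described above. The type and field names (`ModelSettings`, `AutoscalingEnabled`, etc.) are made up for this example and are not the actual Seldon Core v2 agent or scheduler API:

```go
package main

import "fmt"

// ModelSettings mirrors the idea above: the scheduler sets an
// autoscaling-enabled flag on the load-model message, and the agent only
// tracks autoscaling metrics for models where the flag is set.
// Names here are illustrative, not the real Seldon Core v2 types.
type ModelSettings struct {
	Name               string
	AutoscalingEnabled bool // set by the scheduler at model load time
}

// Agent holds the set of models whose stats are tracked for autoscaling.
type Agent struct {
	tracked map[string]bool
}

func NewAgent() *Agent {
	return &Agent{tracked: make(map[string]bool)}
}

// LoadModel registers a model replica; tracking is only enabled when the
// scheduler flagged the model as autoscalable, so fixed models never emit
// autoscaling events back to the scheduler.
func (a *Agent) LoadModel(m ModelSettings) {
	if m.AutoscalingEnabled {
		a.tracked[m.Name] = true
	}
}

// RecordInference would feed autoscaling stats; for untracked (fixed)
// models it is a no-op, so no agent->scheduler traffic is generated.
func (a *Agent) RecordInference(modelName string) {
	if !a.tracked[modelName] {
		return
	}
	fmt.Printf("tracking autoscaling stats for %s\n", modelName)
}

func main() {
	agent := NewAgent()
	agent.LoadModel(ModelSettings{Name: "fixed-model", AutoscalingEnabled: false})
	agent.LoadModel(ModelSettings{Name: "scalable-model", AutoscalingEnabled: true})

	agent.RecordInference("fixed-model")    // no-op: autoscaling disabled at load time
	agent.RecordInference("scalable-model") // tracked: may fire scaling events
}
```

This also shows why the caveat above exists in this sketch: the decision is taken once at load time, so enabling autoscaling afterwards requires loading a new model version.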

@sakoush sakoush added the v2 label Dec 14, 2022
@seldondev
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: cliveseldon
To complete the pull request process, please assign
You can assign the PR to them by writing /assign in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ukclivecox ukclivecox merged commit bf57bc5 into SeldonIO:v2 Dec 17, 2022
ukclivecox added a commit that referenced this pull request Jan 5, 2023
* initial commit for pipeline inputs from other pipelines

* Add extended example

* add trigger example

* fix kafka header test - headers now lowercase

* Fix cli x-seldon-route and also step tensor map and add step pipeline example

* Update notebook

* Ensure tensorMap works across pipelines by sending pipeline name in tensormap

* Update docs/source/contents/pipelines/index.md

Co-authored-by: Sherif Akoush <sherif.akoush@gmail.com>

* Update docs/source/contents/pipelines/index.md

Co-authored-by: Sherif Akoush <sherif.akoush@gmail.com>

* [SCv2] Improve inference docs re routing and headers (#4481)

* Capitalise Seldon when used as proper noun

* Formatting, capitalisation, etc.

* Add detail to inference docs page

* Formatting and typo fixes

* Use consistent capitalisation of model & pipeline through inference docs page

* Fix typo in inference docs for Kafka topics

* Specify use of inference v2 protocol high up in inference docs

* Add mention of headers to sync inference introduction

* Formatting + minor rewording for clarity

* Use tabs for Compose vs. k8s methods for finding the seldon-mesh endpoint

* Add note on port-forwarding seldon-mesh svc for inference requests

* Add note on service meshes for sending inference requests

* Add section on inference request routing with headers

* Add section on path-based routing for inference endpoints

* Add subsection header for Seldon routing (vs. ingress routing)

* Add section on routing from ingress -> seldon-mesh for inference calls

* Add links to RFCs for host & authority headers

* Update link to RFC for HTTP/1 Host header

RFC-7230 obsoletes RFC-2616, the previous link.

* Add line describing virtual hosts vs. physical ones

* Use tabs for alternate ways of making inference requests

* Add inference request example with Seldon CLI

* Use consistent capitalisation of v2 for inference protocol

* Add note on Kafka headers for pipelines

* Use ordinal numbering for bullet points

* Update URI for consistency and to avoid confusion

* Move section on making requests above section on routing

* Use interpolation syntax to clarify usage of path-based routing in Seldon mesh

* Add second form of path-based routing for pipelines in Seldon mesh

* Clarify wording re virtual endpoints in SCv2

* Add section for header-based routing examples

This section builds on the examples from the prior section on making inference requests.

* Update basic examples to exclude routing headers

Routing headers are then given in the examples relevant to that section.

* Formatting

* Use group-tabs for example requests with different clients

* Add emphasis to header lines in examples for header-based routing

* Add notes on support for subdomain-based routing

* Add example snippets for subdomain routing

* Add Open Inference schema for iris model for examples

* Move pipeline inference tip lower for better flow

* Fix datatype for iris model inputs

* Bump MLServer version to 1.2.1 (#4503)

* add a notebook test for changing model replicas (#4504)

* Disable autoscaling tracking on agent for models that are fixed (#4501)

* add flag for autoscaling in grpc msg

* autogen files

* extract helper function

* adjust comment

* wire up autoscaling flag in server

* wire up autoscaling in agent client

* set thresholds for scaling in local deployment

* add autoscaling flag to scheduler

* add a toggle for autoscaling service

* revert autoscaling envs set in local deployment

* disable scaling for local deployment

* use a disable toggle instead

* do not disable by default scaling service

* Upgrading docker compose CLI command (#4498)

Not sure if this is necessary, but it actually took me some time to figure out, as I was sure I already had `docker compose` installed. According to the [Docker documentation](https://docs.docker.com/compose/reference/), the spaced version looks like the newer one, so perhaps the makefile should be updated accordingly.

* Ensure x-request-id header matches kafka key (#4511)

* Fix possible SIGSEGV after producer close in modelgateway (#4515)

* Fix possible SIGSEGV after producer close in modelgateway

* Set running after setup

* review comments

* Link how to install docker compose v2 from github releases (#4516)

* link compose github for easier installation

* Update docs/source/contents/getting-started/docker-installation/index.md

Co-authored-by: Alex Rakowski <20504869+agrski@users.noreply.github.com>

Co-authored-by: Alex Rakowski <20504869+agrski@users.noreply.github.com>

* review comments

Co-authored-by: Sherif Akoush <sherif.akoush@gmail.com>
Co-authored-by: Alex Rakowski <20504869+agrski@users.noreply.github.com>
Co-authored-by: Adrian Gonzalez-Martin <agm@seldon.io>
Co-authored-by: Sherif Akoush <sa@seldon.io>
Co-authored-by: Saeid <s.ghafouri@qmul.ac.uk>
Co-authored-by: RafalSkolasinski <r.j.skolasinski@gmail.com>