[serve] Make HTTP proxy (almost) a regular Serve deployment #15820

Closed
wants to merge 24 commits into from


edoakes
Contributor

@edoakes edoakes commented May 14, 2021

Why are these changes needed?

Entirely removes http_state.py and instead expresses the HTTP proxy as a regular deployment. This reduces the number of core concepts we need to maintain. It also offers a clear avenue to make the HTTP settings declarative and updateable in the future.

We do need some special casing here to handle the "EveryNode" deployment strategy. We should be able to remove this in the future if we can lean on placement group policies.

Related issue number

Closes #15561

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@edoakes
Contributor Author

edoakes commented May 19, 2021

Blocking this on #15909 so we can wait for config changes to propagate without needing a hacky solution

@edoakes edoakes changed the title [serve] Make HTTP proxy (almost) a regular Serve deployment [WIP][serve] Make HTTP proxy (almost) a regular Serve deployment May 19, 2021
@edoakes edoakes changed the title [WIP][serve] Make HTTP proxy (almost) a regular Serve deployment [serve] Make HTTP proxy (almost) a regular Serve deployment Jul 1, 2021
@@ -31,7 +31,7 @@ class BackendConfig(BaseModel):
for shutdown. Defaults to 20s.
"""

-    num_replicas: PositiveInt = 1
+    num_replicas: Any = 1
Contributor

Union[str, PositiveInt]
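
As a rough illustration of that suggestion, the sketch below validates a `num_replicas` value that is either a positive integer or the string "EveryNode". It uses only the standard library rather than pydantic, and `ReplicaSettings` and `EVERY_NODE` are hypothetical names, not part of Serve.

```python
from dataclasses import dataclass
from typing import Union

EVERY_NODE = "EveryNode"  # hypothetical constant for the special strategy


@dataclass(frozen=True)
class ReplicaSettings:
    """Stand-in for the num_replicas field of BackendConfig (illustrative only)."""

    num_replicas: Union[str, int] = 1

    def __post_init__(self) -> None:
        value = self.num_replicas
        if isinstance(value, str):
            # The only string accepted is the special per-node strategy.
            if value != EVERY_NODE:
                raise ValueError(f"Unknown replica strategy: {value!r}")
        elif isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            # Reject bools, non-integers, zero, and negative counts.
            raise ValueError("num_replicas must be a positive integer")
```

Compared with `Any`, a narrowed type like this rejects values such as `0` or `"AllNodes"` at configuration time instead of deep inside the scaling logic.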

raise TimeoutError(
    f"HTTP proxies not available after {HTTP_PROXY_TIMEOUT}s.")
client = Client(controller, controller_name, detached=detached)
_set_global_client(client)
Member

Are we calling these lines twice? See line 730.

@@ -81,6 +88,8 @@ def actor_handle(self) -> ActorHandle:

def start_or_update(self, backend_info: BackendInfo):
    self._actor_resources = backend_info.replica_config.resource_dict
    if self._node_resource is not None:
        self._actor_resources[self._node_resource] = 0.001
Member

What does 0.001 mean here?
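
One plausible reading, stated here as an assumption rather than the author's answer: Ray exposes a per-node custom resource (named like `node:<ip>`, with capacity 1.0), and requesting a tiny fraction of it pins an actor to that node while leaving nearly all of the resource free. The toy feasibility check below is plain Python, not Ray's API, and every name in it is hypothetical.

```python
from typing import Dict, List

NODE_PIN_AMOUNT = 0.001  # tiny fraction: constrains placement, barely consumes capacity


def feasible_nodes(
    request: Dict[str, float], nodes: Dict[str, Dict[str, float]]
) -> List[str]:
    """Return the names of nodes whose available resources cover the request."""
    return [
        name
        for name, available in nodes.items()
        if all(available.get(res, 0.0) >= amount for res, amount in request.items())
    ]


# Two nodes; each advertises a resource that is unique to itself.
cluster = {
    "node-a": {"CPU": 4.0, "node:10.0.0.1": 1.0},
    "node-b": {"CPU": 4.0, "node:10.0.0.2": 1.0},
}

# Requesting a sliver of node-b's resource makes node-b the only feasible target.
pinned_request = {"CPU": 1.0, "node:10.0.0.2": NODE_PIN_AMOUNT}
```

Under this reading the exact value matters little; it only needs to be small enough that many actors can share the same node resource.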

@@ -837,8 +864,14 @@ def _scale_backend_replicas(
"""
assert (backend_tag in self._backend_metadata
), "Backend {} is not registered.".format(backend_tag)
assert target_replicas >= 0, ("Number of replicas must be"
" greater than or equal to 0.")
if isinstance(target_replicas, str):
Member

This changes the type of target_replicas from int (which its name also implies), as defined on line 853. Maybe we can assign it a special value such as -1, or introduce a new flag to control this behavior?
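
The sentinel variant of this suggestion could look roughly like the following, which keeps `target_replicas` an int throughout. `EVERY_NODE_SENTINEL` and `validate_target_replicas` are hypothetical names used only for illustration.

```python
EVERY_NODE_SENTINEL = -1  # hypothetical special value meaning "one replica per node"


def validate_target_replicas(target_replicas: int) -> bool:
    """Check the value and report whether it requests one replica per node.

    Returns True for the sentinel and False for an ordinary replica count.
    """
    if isinstance(target_replicas, bool) or not isinstance(target_replicas, int):
        raise TypeError("target_replicas must be an int")
    if target_replicas == EVERY_NODE_SENTINEL:
        return True
    if target_replicas < 0:
        raise ValueError("Number of replicas must be greater than or equal to 0.")
    return False
```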

assert target_replicas >= 0, ("Number of replicas must be"
" greater than or equal to 0.")
if isinstance(target_replicas, str):
assert target_replicas == "EveryNode"
Member

Sorry, I don't have full context on this concept yet. Does "EveryNode" mean the target replica count equals the number of nodes in the Ray cluster? If so, it might be simpler to handle it outside of this function call and simply pass in a number.
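
A minimal sketch of that alternative: resolve "EveryNode" to a concrete count before calling the scaling function, so the function itself only ever sees an integer. `resolve_target_replicas` and its `num_nodes` argument are hypothetical; how the node count would be obtained from the cluster is left out.

```python
from typing import Union


def resolve_target_replicas(num_replicas: Union[str, int], num_nodes: int) -> int:
    """Translate the configured value into a concrete, non-negative replica count."""
    if isinstance(num_replicas, str):
        if num_replicas != "EveryNode":
            raise ValueError(f"Unsupported num_replicas value: {num_replicas!r}")
        return num_nodes  # one replica per node currently in the cluster
    if num_replicas < 0:
        raise ValueError("Number of replicas must be greater than or equal to 0.")
    return num_replicas
```

One trade-off with this approach is that the resolved count would need to be recomputed whenever nodes join or leave the cluster.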

@jiaodong
Member

Is this WIP or blocked on review?

@edoakes
Contributor Author

edoakes commented Aug 10, 2021

@jiaodong feel free to review, the only thing it's really blocked on is better handling of constructor failure so we can raise an error if the HTTP proxy fails to start up (e.g., bad HTTP host or port)
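
To illustrate the failure mode being discussed, here is a hedged sketch of a proxy-like class whose constructor fails fast when the host or port cannot be bound. `ToyProxy` and `ProxyStartupError` are hypothetical and unrelated to Serve's actual HTTP proxy or to how the controller would surface the error.

```python
import socket


class ProxyStartupError(RuntimeError):
    """Raised when the toy proxy cannot bind its listening socket."""


class ToyProxy:
    def __init__(self, host: str, port: int) -> None:
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            self._sock.bind((host, port))
            self._sock.listen()
        except (OSError, OverflowError) as exc:
            # OSError covers address-in-use and unresolvable hosts;
            # OverflowError covers ports outside the 0-65535 range.
            self._sock.close()
            raise ProxyStartupError(f"Cannot listen on {host}:{port}: {exc}") from exc
        self.address = self._sock.getsockname()

    def close(self) -> None:
        self._sock.close()
```

The point of raising from the constructor is that whoever creates the proxy can report the bad host or port immediately, rather than waiting for a startup timeout.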

@jiaodong
Member

I see. Let me make sure to actually put some time into the constructor failure issue to unblock this.

@jiaodong
Member

Is this still WIP?

@edoakes
Contributor Author

edoakes commented Sep 20, 2021

@jiaodong this is unblocked now that you made the constructor failure PR, but there are a ton of merge conflicts from the refactors. I will try to clean it up in the background; otherwise we can prioritize it sometime in the coming weeks.

@bveeramani
Member

‼️ ACTION REQUIRED ‼️

We've switched our code formatter from YAPF to Black (see #21311).

To prevent issues with merging your code, here's what you'll need to do:

  1. Install Black
     pip install -I black==21.12b0
  2. Format changed files with Black
     curl -o format-changed.sh https://gist.githubusercontent.com/bveeramani/42ef0e9e387b755a8a735b084af976f2/raw/7631276790765d555c423b8db2b679fd957b984a/format-changed.sh
     chmod +x ./format-changed.sh
     ./format-changed.sh
     rm format-changed.sh
  3. Commit your changes.
     git add --all
     git commit -m "Format Python code with Black"
  4. Merge master into your branch.
     git pull upstream master
  5. Resolve merge conflicts (if necessary).

After running these steps, you'll have the updated format.sh.

@simon-mo
Contributor

simon-mo commented Feb 1, 2022

@edoakes should we close this as a prototype and I can open a new PR to tackle this?

@edoakes
Contributor Author

edoakes commented Feb 1, 2022

@simon-mo sounds good to me

@edoakes edoakes closed this Feb 1, 2022
Development

Successfully merging this pull request may close these issues.

[Serve] Make HTTPProxy a regular Serve Deployment
4 participants