
Contributor

@CompuIves CompuIves commented Feb 12, 2025

Have you read the Contributing Guidelines?

Describe your changes

Adds the ability to create, start, stop, update, delete, and list dedicated endpoints from the CLI:

Usage: together endpoints [OPTIONS] COMMAND [ARGS]...

  Endpoints API commands

Options:
  --help  Show this message and exit.

Commands:
  create    Create a new dedicated inference endpoint.
  delete    Delete a dedicated inference endpoint.
  get       Get a dedicated inference endpoint.
  hardware  List all available hardware options, optionally filtered by...
  list      List all inference endpoints (includes both dedicated and...
  start     Start a dedicated inference endpoint.
  stop      Stop a dedicated inference endpoint.
  update    Update a dedicated inference endpoint's configuration.

This approach differs from how the other resources are implemented. Here, I generated an OpenAPI client from our public OpenAPI spec and use it within the CLI (and the resource) for every dedicated-endpoint call. Because of this, I added a make job that generates the client, and I updated the spec (here: togethercomputer/openapi#64). Both PRs need to be merged for dedicated endpoint support.

@CompuIves CompuIves requested a review from orangetin February 12, 2025 19:58
@CompuIves CompuIves requested a review from mojojoji February 12, 2025 20:08

@mojojoji mojojoji left a comment

Added some comments, but we can merge this and tackle those in upcoming PRs.

def stop(client: Together, endpoint_id: str) -> None:
    """Stop a dedicated inference endpoint."""
    client.endpoints.update(endpoint_id, state="STOPPED")
    click.echo("Successfully stopped endpoint", err=True)

This does not stop the endpoint instantly. It moves to the STOPPING state and, after some time, to STOPPED. We can change the message here to indicate that stopping has been initiated. Maybe we can also add a --wait option to wait for the STOPPED state.
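
For illustration, a --wait option could be backed by a small polling loop. The sketch below is not code from this PR: `wait_for_state` and its parameters are hypothetical names, and it assumes only that the current state can be read as a string such as STOPPING or STOPPED.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    target: str,
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it returns target or timeout seconds elapse.

    Hypothetical helper: get_state is any zero-argument callable that
    returns the endpoint's current state as a string.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == target:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"endpoint still {state!r} after {timeout}s, expected {target!r}"
            )
        # Pause between polls so the API is not queried in a tight loop.
        sleep(interval)
```

In the CLI, `get_state` might wrap whatever call backs the `get` command and read the state field from its result; the exact attribute name is an assumption that would need checking against the generated client. Passing `sleep` as a parameter keeps the helper easy to test without real delays.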

def start(client: Together, endpoint_id: str) -> None:
    """Start a dedicated inference endpoint."""
    client.endpoints.update(endpoint_id, state="STARTED")
    click.echo("Successfully started endpoint", err=True)

Same as stop. On start, the endpoint moves to PENDING, then STARTING, and then STARTED. So we can change the message and maybe add a --wait option.

    if min_replicas is not None or max_replicas is not None:
        current_min = min_replicas
        current_max = max_replicas
        if current_min is None or current_max is None:

This check is not needed, as the API supports updating just one of min or max replicas. There is no harm in doing the check, but without it we can avoid the get call.

Contributor Author

In that case, I think we need to update the OpenAPI spec, because the autoscaling object currently marks min_replicas and max_replicas as required. So that's not actually the case?
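
To make the question concrete: if the API does accept a partial autoscaling object, as the review comment suggests, the update could send only the fields the user passed and skip the extra get call. This is a rough sketch under that unconfirmed assumption; `build_autoscaling_patch` is a hypothetical helper, and the payload shape is inferred only from the field names mentioned in this thread.

```python
from typing import Any, Dict, Optional


def build_autoscaling_patch(
    min_replicas: Optional[int] = None,
    max_replicas: Optional[int] = None,
) -> Dict[str, Any]:
    """Build an update payload containing only the replica bounds provided.

    Hypothetical helper: assumes the API accepts a partial autoscaling
    object, which the current OpenAPI spec does not yet state.
    """
    autoscaling: Dict[str, int] = {}
    if min_replicas is not None:
        autoscaling["min_replicas"] = min_replicas
    if max_replicas is not None:
        autoscaling["max_replicas"] = max_replicas

    # When both bounds are given, they can be validated locally without a get call.
    if (
        min_replicas is not None
        and max_replicas is not None
        and min_replicas > max_replicas
    ):
        raise ValueError("min_replicas must not exceed max_replicas")

    return {"autoscaling": autoscaling} if autoscaling else {}
```

If the spec keeps both fields required, the existing approach of fetching the current values first remains necessary.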

@CompuIves
Contributor Author

YAY! The tests pass! Ultimately I had to include the generated OpenAPI client files in the repo because, whatever I tried, Poetry (or pip) would not include them. I tried MANIFEST.in, pyproject.toml changes, and changes to the generator. Nothing worked. Adding them to source control finally worked.

Member

@orangetin orangetin left a comment

lgtm

@orangetin orangetin merged commit 5a20155 into main Feb 13, 2025
10 of 11 checks passed
@orangetin orangetin deleted the feat/dedicated branch February 13, 2025 20:40