Skip to content

[shim] Implement Future API#2141

Merged
un-def merged 2 commits intomasterfrom
issue_1780_shim_future_api
Dec 29, 2024
Merged

[shim] Implement Future API#2141
un-def merged 2 commits intomasterfrom
issue_1780_shim_future_api

Conversation

@un-def
Copy link
Copy Markdown
Collaborator

@un-def un-def commented Dec 25, 2024

  • Added Future API endpoints /api/tasks{/...}
  • Added a mechanism to restore shim state on restarts
  • Reworked error reporting — methods now return error instead bool or (bool, error) with error category (ErrNotFound, ErrInternal, etc.) wrapped.
  • Simplifed API routing — a new Router struct with a simpler interface added
  • CPU and memory resource limits added

Part-of: #1780

@un-def un-def requested a review from r4victor December 25, 2024 11:28
Comment on lines +51 to +56
// Future API
r.AddHandler("GET", "/api/tasks", s.TaskListHandler)
r.AddHandler("GET", "/api/tasks/{id}", s.TaskInfoHandler)
r.AddHandler("PUT", "/api/tasks", s.TaskSubmitHandler)
r.AddHandler("POST", "/api/tasks/{id}/terminate", s.TaskTerminateHandler)
r.AddHandler("DELETE", "/api/tasks/{id}", s.TaskRemoveHandler)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: switch from REST-like API to RPC (replace with POSTs) since it's more flexible for internal APIs and doesn't carry extra semantics.

// | | |
// v v v
// terminated terminated terminated
// pending -> preparing -> pulling -> creating -> running -> terminated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@un-def, could you elaborate why preparing status is needed?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to split Run() method into two — submission and run operations, Submit(), which should work as fast as possible, only puts a task into storage with pending status and returns an error on conflict, which is returned to the server via HTTP API. Run(), in turn, performs some preparation steps before it can start a container (puts authorized_keys, prepares volumes, etc.). Previously, these steps were executed with pending status (and with legacy API — pulling state), which is not quite right semantically.

In short, penging has not Run() yet, preparing — just first status in the Run() sequence (prepare, pull image, create container, run, mark terminated)

It's not a REST API anyway
@un-def un-def merged commit 4ece690 into master Dec 29, 2024
@un-def un-def deleted the issue_1780_shim_future_api branch December 29, 2024 12:09
pranitnaik43 pushed a commit to bahaal-tech/dstack that referenced this pull request Feb 9, 2025
pranitnaik43 pushed a commit to bahaal-tech/dstack that referenced this pull request Mar 4, 2025
pranitnaik43 pushed a commit to bahaal-tech/dstack that referenced this pull request Mar 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants