feat(model)!: adopt containerized model serving #542

Merged on Apr 8, 2024 (14 commits)

heiruwu (Member) commented Apr 1, 2024

Because

  • we are moving entirely to serving models as container images

This commit

  • retire controller-model
  • retire model-repository
  • retire the caching mechanism
  • retire the GitHub, Hugging Face, and ArtiVC model definitions
  • retire the GitHub PAT
  • refactor the create and deploy/undeploy methods to be synchronous calls
  • add a model version instance under the namespace
  • refactor the create/deploy/undeploy/trigger methods to use the version instance concept
  • move the deploy/undeploy endpoints to private
  • support accelerator type
  • add a detail message for model instance status
  • support async model trigger with Temporal

resolves INS-3724
resolves INS-3715
resolves INS-3714
resolves INS-3713
resolves INS-4050

codecov bot commented Apr 1, 2024

Codecov Report

Attention: Patch coverage is 0%, with 434 lines in your changes missing coverage. Please review.

Project coverage is 1.20%. Comparing base (d4ed219) to head (7427f34).

❗ Current head 7427f34 differs from pull request most recent head 585418d. Consider uploading reports for the commit 585418d to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| pkg/service/service.go | 0.00% | 210 Missing ⚠️ |
| pkg/handler/public.go | 0.00% | 137 Missing ⚠️ |
| pkg/handler/private.go | 0.00% | 38 Missing ⚠️ |
| pkg/service/worker.go | 0.00% | 22 Missing ⚠️ |
| pkg/handler/stream.go | 0.00% | 15 Missing ⚠️ |
| pkg/handler/create.go | 0.00% | 11 Missing ⚠️ |
| pkg/service/collect.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff            @@
##            main    #542      +/-   ##
========================================
+ Coverage   0.99%   1.20%   +0.21%     
========================================
  Files         15      14       -1     
  Lines       5839    3653    -2186     
========================================
- Hits          58      44      -14     
+ Misses      5773    3605    -2168     
+ Partials       8       4       -4     

| Flag | Coverage Δ |
|---|---|
| unittests | 1.20% <0.00%> (+0.21%) ⬆️ |

Flags with carried forward coverage won't be shown.

linear bot commented Apr 2, 2024

Because

- we are going to support asynchronous model triggering

This commit

- supports async model trigger with a Temporal workflow
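
The difference between a synchronous and an asynchronous trigger can be sketched roughly as follows. This is a stdlib-only illustration of the general pattern (return an operation ID at once, fetch the result later), not the Temporal-based implementation in this PR. All names here (`AsyncRunner`, `TriggerAsync`, `GetOperation`, `Operation`) are hypothetical and do not come from the codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// Operation is a hypothetical record of one asynchronous trigger.
type Operation struct {
	ID     string
	Done   bool
	Output string
}

// Trigger stands in for a model inference call (assumed signature).
type Trigger func(input string) string

// AsyncRunner tracks in-flight operations in memory.
// A workflow engine such as Temporal would persist this state instead.
type AsyncRunner struct {
	mu   sync.Mutex
	ops  map[string]*Operation
	wg   sync.WaitGroup
	next int
}

func NewAsyncRunner() *AsyncRunner {
	return &AsyncRunner{ops: make(map[string]*Operation)}
}

// TriggerAsync registers an operation, runs fn in a goroutine,
// and returns the operation ID without waiting for the result.
func (r *AsyncRunner) TriggerAsync(fn Trigger, input string) string {
	r.mu.Lock()
	r.next++
	id := fmt.Sprintf("op-%d", r.next)
	r.ops[id] = &Operation{ID: id}
	r.mu.Unlock()

	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		out := fn(input)
		r.mu.Lock()
		r.ops[id].Done = true
		r.ops[id].Output = out
		r.mu.Unlock()
	}()
	return id
}

// GetOperation returns a snapshot of the operation state, if it exists.
func (r *AsyncRunner) GetOperation(id string) (Operation, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	op, ok := r.ops[id]
	if !ok {
		return Operation{}, false
	}
	return *op, true
}

// Wait blocks until every started operation has finished.
func (r *AsyncRunner) Wait() { r.wg.Wait() }

func main() {
	r := NewAsyncRunner()
	id := r.TriggerAsync(func(in string) string { return "echo:" + in }, "hello")
	r.Wait() // a real client would poll GetOperation instead of blocking
	op, _ := r.GetOperation(id)
	fmt.Println(op.ID, op.Done, op.Output)
}
```

The in-memory map keeps the sketch short, but operation state held this way is lost on restart. Durable tracking of that state is one reason a workflow engine is a natural fit for this kind of call.
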
@heiruwu heiruwu marked this pull request as ready for review April 8, 2024 07:30
@heiruwu heiruwu merged commit 3c80f39 into main Apr 8, 2024
12 checks passed
@heiruwu heiruwu deleted the containerized-support branch April 8, 2024 07:42
heiruwu pushed a commit that referenced this pull request Jun 6, 2024
🤖 I have created a release *beep* *boop*
---


## [0.24.0-alpha](v0.23.0-alpha...v0.24.0-alpha) (2024-06-06)


### ⚠ BREAKING CHANGES

* **model:** adopt containerized model serving ([#542](#542))

### Features

* **handler:** implement get latest operation ([#589](#589)) ([33d2395](33d2395))
* **handler:** support listing available regions for model deployment ([#561](#561)) ([52c2172](52c2172))
* **handler:** support model profile image ([#566](#566)) ([0c8dbba](0c8dbba))
* **model:** add permission field in model object ([#576](#576)) ([2d36a58](2d36a58))
* **model:** add task schema in model struct ([#578](#578)) ([647069d](647069d))
* **model:** adopt containerized model serving ([#542](#542)) ([3c80f39](3c80f39))
* **model:** embed sample input/output in model proto message ([#558](#558)) ([5fba538](5fba538))
* **model:** support latest model version trigger ([#580](#580)) ([47cb36c](47cb36c))
* **model:** support resource spec in model definition ([#557](#557)) ([fee6e4b](fee6e4b))
* **model:** support search/filter with list endpoints ([#559](#559)) ([7b17393](7b17393))
* **model:** support watch latest model and `order_by` for list endpoints ([#586](#586)) ([1a5e48c](1a5e48c))
* **prediction:** implement sync/async prediction records ([#555](#555)) ([8d58eda](8d58eda))
* **ray:** support containerized model deployment ([#529](#529)) ([4dcab05](4dcab05))
* **ray:** support custom accelerator type ([#547](#547)) ([f0cc0d7](f0cc0d7))


### Bug Fixes

* **acl:** fix wrong type name ([#560](#560)) ([89d09a5](89d09a5))
* **dockerfile:** update deploy config yaml path ([#590](#590)) ([ee369e0](ee369e0))
* **model:** fix missing package in test models ([#552](#552)) ([a28a21b](a28a21b))
* **ray:** check CDI availability for model container ([#538](#538)) ([28bad42](28bad42))
* **server:** add missing message size option ([#597](#597)) ([d0a0aac](d0a0aac))
* **service:** fix list model version pagination ([#569](#569)) ([d8fb04a](d8fb04a))
* **service:** fix list model version return list size ([#556](#556)) ([9b69f9c](9b69f9c))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).