@heiruwu heiruwu commented Mar 13, 2024

Because

  • We are going to support containerized model serving with Instill Model

This commit

  • add deployment handle return
  • add modules for building and pushing model image
  • expose cpu/gpu/memory resource allocation configs
  • update `instill_deployable` decorator

heiruwu and others added 8 commits March 14, 2024 01:00
Because

- `model-backend` needs the `Ray` CLI to deploy dockerized applications

This commit

- return deployment handle for CLI to reference

Because

- we need to provide an easy-to-use script for users to build and push
containerized models to the desired registry

This commit

- add `docker` dependency
- add `build` module script for easy image building and pushing
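
As a rough illustration of what a build helper could look like, here is a minimal sketch. It shells out to the `docker` CLI through `subprocess` rather than using the `docker` Python package this commit adds, and the function names, the image name, and the tag are all hypothetical, not the module's actual interface.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(model_dir: str, image: str, tag: str = "latest") -> List[str]:
    """Return the `docker build` argument list for a model directory.

    The directory is used as the build context, so everything in it
    (weights, config, model.py) is available to the Dockerfile.
    """
    context = Path(model_dir)
    return ["docker", "build", "--tag", f"{image}:{tag}", str(context)]


def build_image(model_dir: str, image: str, tag: str = "latest") -> None:
    """Run the build; raises CalledProcessError if `docker build` fails."""
    subprocess.run(build_command(model_dir, image, tag), check=True)
```

Keeping the command construction separate from its execution makes the helper easy to check without a Docker daemon.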

Because

- we need to copy model weight files along with the config and `model.py`

This commit

- update `dockerfile` to copy all files in the same directory

Because

- It is not practical to determine VRAM usage solely from the model file
size

This commit

- expose cpu/gpu/ram resource allocation config to user
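
One way user-supplied resource settings might be turned into Ray options is sketched below. The keys `num_cpus`, `num_gpus`, and `memory` are the ones Ray accepts in `ray_actor_options`; the helper name and its parameters are assumptions for illustration, and the real `instill_deployable` interface is not shown in this PR description.

```python
from typing import Optional


def ray_actor_options(
    num_cpus: float = 1.0,
    num_gpus: float = 0.0,
    memory_bytes: Optional[int] = None,
) -> dict:
    """Build a dict suitable for Ray's `ray_actor_options` (hypothetical helper)."""
    if num_cpus < 0 or num_gpus < 0:
        raise ValueError("CPU and GPU counts must be non-negative")
    options = {"num_cpus": num_cpus, "num_gpus": num_gpus}
    # Only set a memory reservation when the user asked for one.
    if memory_bytes is not None:
        if memory_bytes <= 0:
            raise ValueError("memory_bytes must be positive")
        options["memory"] = memory_bytes
    return options
```

A fractional `num_gpus` (for example `0.5`) lets two replicas share one GPU, which is the kind of decision that cannot be inferred from the weight file size alone.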

Because

- It is hard to know what went wrong without build logs
- `pip install` tends to time out when installing large packages

This commit

- print build logs
- add default timeout for pip package installation
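
The two changes above could be sketched roughly as follows. The log chunks are assumed to be dicts with optional `stream` and `error` keys, the shape the `docker` package yields for decoded build output; the function names and the timeout value are hypothetical. `--default-timeout` is a standard pip option.

```python
from typing import Iterable, Iterator, List


def iter_build_log_lines(chunks: Iterable[dict]) -> Iterator[str]:
    """Yield printable lines from decoded build output, failing on errors."""
    for chunk in chunks:
        if "error" in chunk:
            # Surface the build failure instead of silently dropping it.
            raise RuntimeError(chunk["error"])
        for line in chunk.get("stream", "").splitlines():
            if line.strip():
                yield line.rstrip()


def pip_install_command(packages: List[str], timeout_seconds: int = 1000) -> List[str]:
    """Return a pip command with a raised socket timeout (value is an assumption)."""
    return ["pip", "install", "--default-timeout", str(timeout_seconds), *packages]
```

A caller would print each yielded line as it arrives, for example `for line in iter_build_log_lines(logs): print(line)`.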

Because

- we removed the restriction on the model folder structure after
deprecating Triton model support

This commit

- update custom model guide

Because

- users may want to push to multiple registries, so defining the registry
in `instill.yaml` is undesirable

This commit

- split `build` and `push` into separate scripts
- remove `registry` from model config
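
With the registry no longer fixed in the model config, pushing one locally built image to several registries might look like the sketch below. It only constructs `docker tag` and `docker push` command lists; the function names and the example registry hosts are hypothetical, and the actual `push` script may work differently.

```python
from typing import List


def image_reference(registry: str, repository: str, tag: str) -> str:
    """Compose a fully qualified image reference for one registry."""
    return f"{registry.rstrip('/')}/{repository}:{tag}"


def push_commands(
    local_image: str, registries: List[str], repository: str, tag: str
) -> List[List[str]]:
    """Return the tag and push commands needed for each target registry."""
    commands = []
    for registry in registries:
        remote = image_reference(registry, repository, tag)
        commands.append(["docker", "tag", local_image, remote])
        commands.append(["docker", "push", remote])
    return commands
```

Each returned command can be run with `subprocess.run(cmd, check=True)`, so a failed push to one registry stops the sequence instead of being ignored.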
@heiruwu heiruwu merged commit 05863a0 into main Mar 13, 2024
@heiruwu heiruwu deleted the dockerized branch March 13, 2024 17:16
heiruwu added a commit that referenced this pull request Mar 20, 2024
Because

- We are going to support containerized model serving with `Instill
Model`

This commit

- add deployment handle return
- add modules for building and pushing model image
- expose cpu/gpu/memory resource allocation configs
- update `instill_deployable` decorator