Documentation to support the MultiModel deployment feature [DO NOT MERGE] #554
Conversation

**VipulMascarenhas** left a comment:
added some minor comments
> MultiModel inference and serving refers to efficiently hosting and managing multiple large language models simultaneously to serve inference requests using shared resources. The Data Science service has a prebuilt **vLLM service container** that makes deploying and serving multiple large language models on a **single GPU compute shape** easy, simplifying the deployment process and reducing operational complexity. This container comes with a preinstalled [**LiteLLM proxy server**](https://docs.litellm.ai/docs/simple_proxy), which routes requests to the appropriate model, ensuring seamless prediction.
>
> **Multi-Model Deployment is currently in beta and is only available through the CLI. At this time, only base service LLM models are supported; fine-tuned/registered models cannot be deployed.**
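To illustrate the routing behavior described above, a request to a multi-model endpoint would typically select its target via the `model` field in the payload, which the LiteLLM proxy uses to dispatch to the right backend. This is a minimal sketch only; the endpoint URL, auth header, and model name are placeholders, not part of this documentation:

```bash
# Hypothetical request to a MultiModel deployment endpoint. The LiteLLM proxy
# routes it to the model named in the "model" field of the payload.
# <deployment-url> and <model_name> are placeholders; auth is shown as a bearer
# token for brevity, while actual deployments may use OCI request signing.
curl -s "<deployment-url>/predict" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "<model_name>",
        "prompt": "Hello, world",
        "max_tokens": 100
      }'
```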
nit: use either "Multi-Model" or "MultiModel" throughout to be consistent in terminology.
> ```bash
> ads aqua deployment list_shapes
> ```
>
> ### Example
I think the Example section would be redundant here; it shows the same command as Usage.
> ##### CLI Output
>
> ```json
> ...
> ```
To keep it short, let's reduce the result list to a couple of items.
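For instance, a trimmed listing might look like the sketch below. This is illustrative only; the field names are assumptions, not actual `list_shapes` output:

```json
[
  {"name": "VM.GPU.A10.2", "gpu_count": 2},
  {"name": "BM.GPU4.8", "gpu_count": 8}
]
```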
> If no primary model is provided, the GPU allocation for A, B, C could be [2, 4, 2], [2, 2, 4], or [4, 2, 2] (assuming the shape has 8 GPUs in total).
> If B is the primary model, the GPU allocation is [2, 4, 2], as B always gets the maximum GPU count.
> `**kwargs`
I think we don't need to mention `**kwargs` here. We could rather show:

    compartment_id: [str]
        The compartment OCID to retrieve the models and available model deployment shapes.
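If we go that route, the parameter could be exercised from the CLI roughly like this. The flag name is assumed to mirror the parameter name, and the OCID is a placeholder:

```bash
# Assumed flag form; the compartment OCID is a placeholder.
ads aqua deployment list_shapes --compartment_id "ocid1.compartment.oc1..<ocid>"
```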
> ## List MultiModel Deployments
I think there is currently no difference between listing single and multi-model deployments. We can give a reference to the CLI tips like we did for the model list. We can still mention that MultiModel deployments have the tag `"aqua_multimodel": "true"` associated with them.
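To illustrate, MultiModel deployments could then be picked out of the regular list with a filter like the sketch below. It assumes the list command emits JSON and that the tag lands under a `freeform_tags` key; both are assumptions, not documented behavior:

```bash
# Hypothetical filter: keep only deployments carrying the MultiModel tag.
ads aqua deployment list | \
  jq '.[] | select(.freeform_tags.aqua_multimodel == "true")'
```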
> - [Create Model Evaluation](#create-model-evaluations)
> # Introduction to MultiModel Deployment and Serving
Would it be helpful to add a Prerequisites, Info or Limitations section where we can outline all the current limitations?
> ### Example
>
> ```bash
> ads aqua evaluation create --evaluation_source_id "ocid1.datasciencemodeldeployment.oc1.iad.<ocid>" --evaluation_name "test_evaluation" --dataset_path "oci://<bucket>@<namespace>/path/to/the/dataset.jsonl" --report_path "oci://<bucket>@<namespace>/report/path/" --model_parameters '{"model":"<model_name>","max_tokens": 500, "temperature": 0.7, "top_p": 1.0, "top_k": 50}' --shape_name "VM.Standard.E4.Flex" --block_storage_size 50 --metrics '[{"name": "bertscore", "args": {}}, {"name": "rouge", "args": {}}]'
> ```
Nit: For longer examples, could we split them across multiple lines for better readability?
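For example, the command quoted above could be reflowed with shell line continuations, which changes nothing about its behavior:

```bash
ads aqua evaluation create \
  --evaluation_source_id "ocid1.datasciencemodeldeployment.oc1.iad.<ocid>" \
  --evaluation_name "test_evaluation" \
  --dataset_path "oci://<bucket>@<namespace>/path/to/the/dataset.jsonl" \
  --report_path "oci://<bucket>@<namespace>/report/path/" \
  --model_parameters '{"model":"<model_name>","max_tokens": 500, "temperature": 0.7, "top_p": 1.0, "top_k": 50}' \
  --shape_name "VM.Standard.E4.Flex" \
  --block_storage_size 50 \
  --metrics '[{"name": "bertscore", "args": {}}, {"name": "rouge", "args": {}}]'
```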