@dipatidar
Contributor

No description provided.

@oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) on Mar 18, 2025
@dipatidar changed the title from "Documentation to support the MultiModel deployment feature" to "Documentation to support the MultiModel deployment feature [DO NOT MERGE]" on Mar 25, 2025
Contributor

@VipulMascarenhas left a comment

added some minor comments


MultiModel inference and serving refers to efficiently hosting and managing multiple large language models simultaneously to serve inference requests using shared resources. The Data Science server has a prebuilt **vLLM service container** that makes it easy to deploy and serve multiple large language models on a **single GPU Compute shape**, simplifying the deployment process and reducing operational complexity. This container comes with a preinstalled [**LiteLLM proxy server**](https://docs.litellm.ai/docs/simple_proxy), which routes each request to the appropriate model.
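
To make the routing idea concrete, here is a minimal sketch of dispatching a request by its `model` field. It is a toy illustration only, not the LiteLLM implementation; the model names, ports, and the `route_request` helper are all hypothetical.

```python
def route_request(payload, backends):
    """Return the backend registered for the model named in the payload."""
    model_name = payload.get("model")
    if model_name not in backends:
        raise KeyError(f"No backend registered for model {model_name!r}")
    return backends[model_name]


# Hypothetical model names and local ports, for illustration only.
backends = {
    "model-a": "http://localhost:8001",
    "model-b": "http://localhost:8002",
}

request = {"model": "model-b", "prompt": "Hello"}
print(route_request(request, backends))  # → http://localhost:8002
```

The point of the sketch is that the caller picks a model by name and the proxy layer resolves it, so the client never needs to know where each model is served.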

**Multi-Model Deployment is currently in beta and is only available through the CLI. At this time, only base service LLM models are supported, and fine-tuned/registered models cannot be deployed.**

nit: use either "Multi-Model" or "MultiModel" throughout to be consistent in terminology.

```bash
ads aqua deployment list_shapes
```
### Example

I think the Example section would be redundant in this case. It shows the same command as the Usage section.

##### CLI Output

To keep this short, let's reduce the result list to a couple of items.

If no primary model is provided, the GPU allocation for A, B, C could be [2, 4, 2], [2, 2, 4], or [4, 2, 2].
If B is the primary model, the GPU allocation is [2, 4, 2], as B always gets the maximum GPU count.
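
The allocation rule above can be sketched as a small enumeration. This is an illustration under stated assumptions, not the actual allocation logic: it assumes an 8-GPU shape and that each model may take either 2 or 4 GPUs (values chosen to match the example), and `candidate_allocations` is a hypothetical helper.

```python
from itertools import product


def candidate_allocations(models, total_gpus, allowed_counts, primary=None):
    """Enumerate per-model GPU counts that use exactly total_gpus.

    If a primary model is given, keep only the allocations in which it
    holds the largest GPU count.
    """
    results = []
    for counts in product(sorted(allowed_counts), repeat=len(models)):
        if sum(counts) != total_gpus:
            continue
        if primary is not None and counts[models.index(primary)] != max(counts):
            continue
        results.append(list(counts))
    return results


models = ["A", "B", "C"]
# Assumed: 8 GPUs in total, each model gets 2 or 4 of them.
print(candidate_allocations(models, 8, [2, 4]))
# → [[2, 2, 4], [2, 4, 2], [4, 2, 2]]
print(candidate_allocations(models, 8, [2, 4], primary="B"))
# → [[2, 4, 2]]
```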

`**kwargs`

I don't think we need to mention `**kwargs` here. We could show this instead:

-- compartment_id: [str]

The compartment OCID to retrieve the models and available model deployment shapes.

## List MultiModel Deployments

I think there is currently no difference between listing single-model and multi-model deployments. We can reference the CLI tips, as we did for the model list. We can still mention that MultiModel deployments have the tag `"aqua_multimodel": "true"` associated with them.
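
As a rough illustration of how that tag could be used, the sketch below filters a list of deployment records by it. The record layout, the `freeform_tags` key, and the `is_multimodel` helper are assumptions made for the example; only the `"aqua_multimodel": "true"` tag comes from the comment above.

```python
def is_multimodel(deployment):
    """Return True if the record carries the aqua_multimodel tag set to true."""
    # Assumption: tags are exposed as a dict under "freeform_tags".
    tags = deployment.get("freeform_tags") or {}
    return str(tags.get("aqua_multimodel", "")).lower() == "true"


# Hypothetical deployment records, for illustration only.
deployments = [
    {"display_name": "single-model-md", "freeform_tags": {}},
    {"display_name": "multi-model-md", "freeform_tags": {"aqua_multimodel": "true"}},
]

multi = [d["display_name"] for d in deployments if is_multimodel(d)]
print(multi)  # → ['multi-model-md']
```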

- [Create Model Evaluation](#create-model-evaluations)


# Introduction to MultiModel Deployment and Serving

Would it be helpful to add a Prerequisites, Info or Limitations section where we can outline all the current limitations?

### Example

```bash
ads aqua evaluation create --evaluation_source_id "ocid1.datasciencemodeldeployment.oc1.iad.<ocid>" --evaluation_name "test_evaluation" --dataset_path "oci://<bucket>@<namespace>/path/to/the/dataset.jsonl" --report_path "oci://<bucket>@<namespace>/report/path/" --model_parameters '{"model":"<model_name>","max_tokens": 500, "temperature": 0.7, "top_p": 1.0, "top_k": 50}' --shape_name "VM.Standard.E4.Flex" --block_storage_size 50 --metrics '[{"name": "bertscore", "args": {}}, {"name": "rouge", "args": {}}]'
```

Nit: For longer examples, could we split them across multiple lines for better readability?
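
One possible way to keep such a long command readable and correctly quoted is to assemble it programmatically. The sketch below is an illustration only: it reuses the flags and placeholder values from the quoted example, and `build_evaluation_command` is a hypothetical helper, not part of the `ads` CLI.

```python
import json
import shlex


def build_evaluation_command(model_parameters, metrics):
    """Return the argument list for the evaluation command quoted above."""
    return [
        "ads", "aqua", "evaluation", "create",
        "--evaluation_source_id", "ocid1.datasciencemodeldeployment.oc1.iad.<ocid>",
        "--evaluation_name", "test_evaluation",
        "--dataset_path", "oci://<bucket>@<namespace>/path/to/the/dataset.jsonl",
        "--report_path", "oci://<bucket>@<namespace>/report/path/",
        # JSON-valued flags are serialized once, so the quoting stays consistent.
        "--model_parameters", json.dumps(model_parameters),
        "--shape_name", "VM.Standard.E4.Flex",
        "--block_storage_size", "50",
        "--metrics", json.dumps(metrics),
    ]


model_parameters = {
    "model": "<model_name>",
    "max_tokens": 500,
    "temperature": 0.7,
    "top_p": 1.0,
    "top_k": 50,
}
metrics = [{"name": "bertscore", "args": {}}, {"name": "rouge", "args": {}}]

args = build_evaluation_command(model_parameters, metrics)
# shlex.join quotes each argument for a POSIX shell.
command = shlex.join(args)
print(command)
```

Building the JSON arguments with `json.dumps` avoids hand-balancing quotes inside the shell string, which is where long one-line examples tend to break.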

@mrDzurb requested a review from elizjo on March 29, 2025 00:01
@mrDzurb merged commit 50d8d2f into oracle-samples:main on Apr 23, 2025
1 check passed
3 participants