Member

@elizjo elizjo commented Feb 5, 2025

  • Updated the GET aqua/deployments/<model_ocid>/params endpoint to return the vLLM parameters for a specific number of GPUs
  • The endpoint now accepts gpu_count as an optional query parameter
  • Uses the new AQUA Service Managed Model Configs with the 'multi_model_deployment' key

Request
GET aqua/deployments/<model_ocid>/params?instance_shape=<shape_name>&gpu_count=<gpu_count_value>

Output

{
    "data": [
        "--max-model-len 4096"
    ]
} 
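As a rough illustration of the request shape above, here is a minimal sketch that builds the relative URL with and without the optional gpu_count. The helper name build_params_url and the sample OCID and shape values are hypothetical and are not part of this PR.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_params_url(model_ocid: str, instance_shape: str,
                     gpu_count: Optional[int] = None) -> str:
    """Build the relative URL for the deployment params request.

    Hypothetical helper for illustration only; it is not part of the PR.
    """
    query = {"instance_shape": instance_shape}
    # gpu_count is optional, so it is only added when the caller supplies it.
    if gpu_count is not None:
        query["gpu_count"] = gpu_count
    path = f"aqua/deployments/{quote(model_ocid, safe='')}/params"
    return f"{path}?{urlencode(query)}"


if __name__ == "__main__":
    # Sample values are placeholders, not real resources.
    print(build_params_url("ocid1.example", "VM.GPU.A10.2", gpu_count=2))
    print(build_params_url("ocid1.example", "VM.GPU.A10.2"))
```

Leaving gpu_count out of the query string when it is None mirrors the description: the parameter is optional, and omitting it requests the single-model parameters.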

Modified the test_get_deployment_default_params() unit test. It now covers cases such as:

  • gpu_count = 1: returns the vLLM params specified by the gpu_count: 1 entry under the 'multi_model_deployment' key in the AQUA Service Managed Model Configs
  • gpu_count = 2: returns the vLLM params specified by the gpu_count: 2 entry under the 'multi_model_deployment' key in the AQUA Service Managed Model Configs
  • gpu_count is None: always returns the single-model vLLM params
  • TGI is the model's inference container and gpu_count is specified: always returns data: [] or the single-model TGI params, since TGI is not supported by multi-model deployment
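The cases above can be sketched as a small selection function. This is only an illustration under stated assumptions: the config layout (a "parameters" dict plus a "multi_model_deployment" list of gpu_count entries), the VLLM_PARAMS / TGI_PARAMS key names, the function names, and the sample values are all hypothetical and may not match the actual AQUA Service Managed Model Configs or the code in this PR. Returning an empty list for a gpu_count with no matching entry is also an assumption.

```python
import re
from typing import Any, Dict, List, Optional

# Hypothetical per-shape config; the real layout may differ.
EXAMPLE_SHAPE_CONFIG: Dict[str, Any] = {
    "parameters": {"VLLM_PARAMS": "--max-model-len 16384 --trust-remote-code"},
    "multi_model_deployment": [
        {"gpu_count": 1, "parameters": {"VLLM_PARAMS": "--max-model-len 4096"}},
        {"gpu_count": 2, "parameters": {"VLLM_PARAMS": "--max-model-len 8192"}},
    ],
}


def split_params(raw: str) -> List[str]:
    """Split a flag string into one 'flag value' item per '--' flag."""
    return [p for p in re.split(r"\s+(?=--)", raw.strip()) if p]


def select_params(shape_config: Dict[str, Any], container_family: str,
                  gpu_count: Optional[int] = None) -> List[str]:
    """Pick the default params for a container family and optional GPU count."""
    key = f"{container_family.upper()}_PARAMS"
    single = shape_config.get("parameters", {}).get(key, "")
    # No gpu_count, or a container other than vLLM (e.g. TGI, which multi-model
    # deployment does not support): fall back to the single-model params.
    if gpu_count is None or container_family.lower() != "vllm":
        return split_params(single)
    for entry in shape_config.get("multi_model_deployment", []):
        if entry.get("gpu_count") == gpu_count:
            return split_params(entry.get("parameters", {}).get(key, ""))
    # Assumption: no matching gpu_count entry yields an empty list.
    return []


if __name__ == "__main__":
    print(select_params(EXAMPLE_SHAPE_CONFIG, "vllm", 1))
    print(select_params(EXAMPLE_SHAPE_CONFIG, "vllm"))
    print(select_params(EXAMPLE_SHAPE_CONFIG, "tgi", 2))
```

In this sketch the TGI case returns an empty list only because the sample config defines no TGI params; with single-model TGI params present, those would be returned instead, matching the last test case.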

@oracle-contributor-agreement oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) Feb 5, 2025
@elizjo elizjo requested a review from lu-ohai February 5, 2025 20:01
Contributor

@darenr darenr left a comment

Also really nice code.

Member

@VipulMascarenhas VipulMascarenhas left a comment

lgtm 👍

@VipulMascarenhas VipulMascarenhas merged commit bc2e0b7 into feature/multi_model_deployment Feb 6, 2025
1 check passed


4 participants