GitHub - cloudera/CML_Deploy-Llama2-CML-Native-Model: CML_Deploy-Llama2-CML-Native-Model

Deploy LLM as a model within CML

This project walks through deploying and hosting a Large Language Model (LLM) within CML. The project can be cloned into CML directly or launched as an Applied Machine Learning Prototype (AMP).

Site settings prerequisites

Go to Site Administration > Settings > Ephemeral Storage Limit (in GB) and set it to 20 GB.

Deploy the model as an AMP

Add the catalog entry in Site Administration, then navigate to AMPs --> "Shared LLM Model for Hands on Lab"

Deploy the model manually

Deploy the model by:

Navigate to Model Deployments
Click New Model
Give it a Name and Description
Disable Authentication (for convenience)
Select File launch_model_*.py
Set Function Name api_wrapper
- This is the function in the Python script that wraps text inference with the llama2-chat model
Set the sample JSON payload
```
 {
 "prompt": "test prompt hello"
 }
```
Pick Runtime
- PBJ Workbench -- Python 3.9 -- Nvidia GPU -- 2023.08
Set Resource Profile
- At least 4 vCPU / 16 GiB memory
- 1 GPU
Click Deploy Model
Wait until it is Deployed
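
The Function Name step above points CML at `api_wrapper`. As a rough illustration of the shape such a function can take, here is a minimal sketch. It assumes CML passes the parsed JSON request to the function as a single dictionary, and it replaces the real llama2-chat inference with a hypothetical `_generate` stub. The `response` key in the return value is also an assumption; check the launch script in this project for the actual format.

```python
from typing import Any, Dict


def _generate(prompt: str) -> str:
    """Hypothetical stand-in for llama2-chat inference (not the project's code)."""
    return f"(model output for: {prompt})"


def api_wrapper(args: Dict[str, Any]) -> Dict[str, Any]:
    """Entry point named in the 'Function Name' field of the model deployment.

    `args` is the request payload, e.g. {"prompt": "test prompt hello"}.
    """
    prompt = args.get("prompt")
    # Reject payloads that do not carry a usable prompt.
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("request must contain a non-empty 'prompt' string")
    # The "response" key is an illustrative choice, not confirmed by the source.
    return {"response": _generate(prompt)}
```

Calling `api_wrapper({"prompt": "test prompt hello"})` locally is a quick way to check the payload handling before deploying.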

Test the Model
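
Besides the Test button in the CML UI, the deployed model can be called over HTTP. The sketch below assumes the request envelope shown in the sample request on the model's Overview page: a JSON body with an `accessKey` that identifies the model and a `request` object holding the payload. The endpoint URL and access key are placeholders to be copied from that page, and `build_model_request` and `call_model` are hypothetical helper names.

```python
import json
import urllib.request


def build_model_request(endpoint_url: str, access_key: str, prompt: str) -> urllib.request.Request:
    """Build a POST request carrying the prompt in the assumed CML envelope."""
    body = {
        "accessKey": access_key,        # placeholder: copy from the model's Overview page
        "request": {"prompt": prompt},  # same shape as the sample JSON payload
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_model(endpoint_url: str, access_key: str, prompt: str, timeout: float = 120.0) -> dict:
    """Send the request and return the decoded JSON reply."""
    req = build_model_request(endpoint_url, access_key, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `call_model(url, key, "test prompt hello")` sends the same payload as the sample above. The generous default timeout is a guess to allow for slow LLM inference; adjust it to your deployment.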

Note on compute instances:

g4dn.4xlarge is the recommended GPU instance type on AWS. It has 16 vCPUs, which leaves headroom for any overhead on top of the 4 vCPUs that the model deployment needs.
