# Example notebooks for the Llama 2 7B model on Databricks

This folder contains the following examples for Llama 2 models; short code sketches for several of these workflows follow the table:

| File | Description | Model Used | GPU Minimum Requirement |
| --- | --- | --- | --- |
| `01_load_inference` | Environment setup and suggested configurations for running inference with Llama 2 models on Databricks. | Llama-2-7b-chat-hf | 1xA10-24GB |
| `02_mlflow_logging_inference` | Save, register, and load Llama 2 models with MLflow, and create a Databricks model serving endpoint. | Llama-2-7b-chat-hf | 1xA10-24GB |
| `02_[chat]_mlflow_logging_inference` | Save, register, and load Llama 2 models with MLflow, and create a Databricks model serving endpoint for chat completion. | Llama-2-7b-chat-hf | 1xA10-24GB |
| `03_serve_driver_proxy` | Serve Llama 2 models on the cluster driver node using Flask. | Llama-2-7b-chat-hf | 1xA10-24GB |
| `03_[chat]_serve_driver_proxy` | Serve Llama 2 models for chat completion on the cluster driver node using Flask. | Llama-2-7b-chat-hf | 1xA10-24GB |
| `04_langchain` | Integrate a serving endpoint or cluster driver proxy app with LangChain and query it. | N/A | N/A |
| `04_[chat]_langchain` | Integrate a serving endpoint and set up a LangChain chat model. | N/A | N/A |
| `05_fine_tune_deepspeed` | Fine-tune Llama 2 base models with DeepSpeed. | Llama-2-7b-hf | 4xA10 or 2xA100-80GB |
| `06_fine_tune_qlora` | Fine-tune Llama 2 base models with QLoRA. | Llama-2-7b-hf | 1xA10 |
| `07_ai_gateway` | Manage an MLflow AI Gateway route that accesses a Databricks model serving endpoint. | N/A | N/A |
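
The notebooks cover these workflows end to end; the sketches below only illustrate the core API calls. For `01_load_inference`, a minimal sketch of loading Llama-2-7b-chat-hf with Hugging Face `transformers`, assuming a single-GPU cluster (e.g. 1xA10-24GB) and a Hugging Face token with access to the gated `meta-llama` repository:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # half precision so the 7B model fits on a 24GB A10
    device_map="auto",
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator("What is Databricks?", max_new_tokens=128)[0]["generated_text"])
```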
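
For `02_mlflow_logging_inference`, a sketch of logging and registering the model with MLflow and loading it back for inference. It reuses the `generator` pipeline from the previous sketch; the registered model name and artifact path are illustrative:

```python
import mlflow

with mlflow.start_run():
    mlflow.transformers.log_model(
        transformers_model=generator,             # pipeline from the sketch above
        artifact_path="model",
        registered_model_name="llama-2-7b-chat",  # illustrative registry name
    )

# Load the registered model back as a generic pyfunc model for inference.
loaded = mlflow.pyfunc.load_model("models:/llama-2-7b-chat/1")
print(loaded.predict(["What is MLflow?"]))
```

The notebook goes further and creates a Databricks model serving endpoint from the registered model.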
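
For `03_serve_driver_proxy`, a sketch of exposing the same pipeline behind a small Flask app running on the cluster driver node; the route and port are illustrative:

```python
from flask import Flask, jsonify, request

app = Flask("llama-2-7b-chat")

@app.route("/", methods=["POST"])
def generate():
    # Expects a JSON payload such as {"prompt": "What is Databricks?"}.
    prompt = request.json["prompt"]
    output = generator(prompt, max_new_tokens=128)[0]["generated_text"]
    return jsonify({"text": output})

app.run(host="0.0.0.0", port=7777)
```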
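
For `04_langchain`, a sketch of querying a Databricks model serving endpoint through LangChain's `Databricks` LLM wrapper; the endpoint name is illustrative:

```python
from langchain.llms import Databricks

# Point the wrapper at a model serving endpoint in the same workspace.
llm = Databricks(endpoint_name="llama-2-7b-chat")

# For the 03_serve_driver_proxy pattern, the wrapper can instead target the
# driver proxy port, e.g. Databricks(cluster_driver_port=7777).
print(llm("What is a lakehouse?"))
```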
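
For `06_fine_tune_qlora`, a sketch of the QLoRA setup: the base model is loaded in 4-bit with bitsandbytes and wrapped with a LoRA adapter from `peft`. The hyperparameters are illustrative, and the training loop (e.g. with `transformers.Trainer`) is omitted:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # confirms only the LoRA weights are trainable
```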