[Doc][Example] Fine-tune vicuna-13b-v1.3 with LightningTrainer + DeepSpeed (ray-project#37016)

Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
Signed-off-by: Yunxuan Xiao <xiaoyunxuan1998@gmail.com>
Signed-off-by: matthewdeng <matthew.j.deng@gmail.com>
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
2 people authored and arvind-chandra committed Aug 31, 2023
1 parent b20e982 commit 7a9f3e0
Showing 10 changed files with 1,515 additions and 1 deletion.
2 changes: 2 additions & 0 deletions doc/source/_toc.yml
@@ -139,6 +139,8 @@ parts:
title: "Torch Data Prefetching Benchmark"
- file: train/examples/pytorch/pytorch_resnet_finetune
title: "PyTorch Finetuning ResNet Example"
- file: train/examples/lightning/vicuna_13b_lightning_deepspeed_finetune
title: "Fine-tune Vicuna-13B with DeepSpeed and PyTorch Lightning"
- file: train/faq
- file: train/api/api

7 changes: 7 additions & 0 deletions doc/source/ray-overview/examples.rst
@@ -1618,3 +1618,10 @@ Ray Examples
.. button-ref:: /serve/tutorials/streaming

Using Ray Serve to deploy a chatbot

.. grid-item-card:: :bdg-secondary:`Code example`
:class-item: gallery-item training llm

.. button-ref:: /train/examples/lightning/vicuna_13b_lightning_deepspeed_finetune

Fine-tune vicuna-13b-v1.3 with DeepSpeed and LightningTrainer
8 changes: 8 additions & 0 deletions doc/source/train/examples.rst
@@ -72,6 +72,14 @@ Distributed Training Examples using Ray Train
.. button-ref:: dolly_lightning_fsdp_finetuning

Fine-tune LLM with AIR LightningTrainer and FSDP

.. grid-item-card::
:img-top: /images/pytorch_lightning_small.png
:class-img-top: pt-2 w-75 d-block mx-auto fixed-height-img

.. button-ref:: vicuna_lightning_deepspeed_finetuning

Fine-tune vicuna-13b-v1.3 with Deepspeed and LightningTrainer


Ray Train Examples Using Loggers & Callbacks
5 changes: 4 additions & 1 deletion doc/source/train/examples/lightning/BUILD
@@ -10,7 +10,10 @@ filegroup(
 py_test_run_all_notebooks(
     size="large",
     include=["*.ipynb"],
-    exclude=["lightning_exp_tracking.ipynb"],
+    exclude=[
+        "lightning_exp_tracking.ipynb",  # CPU test
+        "vicuna_13b_lightning_deepspeed_finetune.ipynb",  # Release Test
+    ],
     data=["//doc/source/train/examples/lightning:lightning_examples"],
     tags=["exclusive", "team:ml", "gpu", "ray_air"],
 )

Large diffs are not rendered by default.

@@ -0,0 +1,20 @@
cloud_id: {{env["ANYSCALE_CLOUD_ID"]}}
region: us-west-2

head_node_type:
  name: head_node
  instance_type: g5.16xlarge

worker_node_types:
  - name: worker_node
    instance_type: g5.4xlarge
    min_workers: 15
    max_workers: 15
    use_spot: false

aws:
  TagSpecifications:
    - ResourceType: "instance"
      Tags:
        - Key: ttl-hours
          Value: '24'
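
As a rough sketch of what this compute config provisions, the snippet below tallies the GPUs across the head node and the 15 fixed workers. The per-instance GPU count and memory are assumptions taken from the published AWS G5 specifications (one 24 GiB NVIDIA A10G on both g5.4xlarge and g5.16xlarge), not from this commit, and `NodeType`, `GPUS_PER_INSTANCE`, and `total_gpus` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    """One node group from the compute config (hypothetical helper type)."""
    name: str
    instance_type: str
    count: int


# Assumption: GPU counts per AWS G5 instance type, from AWS documentation,
# not from the commit itself.
GPUS_PER_INSTANCE = {"g5.4xlarge": 1, "g5.16xlarge": 1}
# Assumption: each A10G GPU has 24 GiB of memory.
GPU_MEMORY_GIB = 24


def total_gpus(nodes):
    """Sum the GPUs over every node group in the cluster."""
    return sum(GPUS_PER_INSTANCE[n.instance_type] * n.count for n in nodes)


# Values mirror the config: one head node plus min_workers == max_workers == 15.
cluster = [
    NodeType("head_node", "g5.16xlarge", 1),
    NodeType("worker_node", "g5.4xlarge", 15),
]

gpus = total_gpus(cluster)
print(f"{gpus} GPUs, {gpus * GPU_MEMORY_GIB} GiB of GPU memory in total")
```

Because `min_workers` equals `max_workers` and `use_spot` is false, the cluster size is fixed, so under these assumptions the total does not vary between runs.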
@@ -0,0 +1,27 @@
base_image: {{ env["RAY_IMAGE_ML_NIGHTLY_GPU"] | default("anyscale/ray:nightly-py38-cu118") }}
env_vars: {}
debian_packages:
  - curl

python:
  pip_packages:
    - datasets==2.13.1
    - evaluate==0.4.0
    - scikit-learn==1.3.0
    - boto3==1.28.5
    - myst-parser==0.15.2
    - myst-nb==0.13.1
    - jupytext==1.13.6
    - typing-extensions<4.6.0
  conda_packages: []

post_build_cmds:
  - pip uninstall -y ray || true && pip3 install -U {{ env["RAY_WHEELS"] | default("ray") }}
  - {{ env["RAY_WHEELS_SANITY_CHECK"] | default("echo No Ray wheels sanity check") }}
  - echo "sudo lsblk -f" >> ~/.bashrc
  - echo "yes N | sudo mkfs -t ext4 /dev/nvme1n1 || true" >> ~/.bashrc
  - echo "mkdir -p /mnt/local_storage" >> ~/.bashrc
  - echo "sudo chmod 0777 /mnt/local_storage" >> ~/.bashrc
  - echo "sudo mount /dev/nvme1n1 /mnt/local_storage || true" >> ~/.bashrc
  - pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  - pip3 install "pytorch_lightning==2.0.3" "transformers==4.30.2" "accelerate==0.20.3" "deepspeed==0.9.4"
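
The last command pins four training libraries to exact versions. A minimal sketch of how one might confirm, from inside the built environment, that the installed versions match those pins is shown below. It uses only the standard-library `importlib.metadata`; `PINS` and `find_mismatches` are hypothetical names and are not part of the commit or of Ray's release tooling.

```python
from importlib import metadata

# Pins copied from the final post_build_cmds entry above.
PINS = {
    "pytorch_lightning": "2.0.3",
    "transformers": "4.30.2",
    "accelerate": "0.20.3",
    "deepspeed": "0.9.4",
}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (expected, installed)} for every pin that is not met.

    `installed` is None when the package is not installed at all.
    `get_version` is injectable so the check can be exercised without
    the real packages being present.
    """
    mismatches = {}
    for package, expected in pins.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches
```

Calling `find_mismatches(PINS)` returns an empty dict when every pin is satisfied, so it could serve as a quick sanity check before starting a long fine-tuning run.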
21 changes: 21 additions & 0 deletions release/release_tests.yaml
@@ -967,6 +967,27 @@

# variations: TODO(jungong): add GCP variation.

- name: air_example_vicuna_13b_lightning_deepspeed_finetuning
  group: AIR examples
  working_dir: air_examples/vicuna_13b_lightning_deepspeed_finetuning

  python: "3.8"

  frequency: weekly
  team: ml
  cluster:
    byod:
      type: cu118
      pip:
        - myst-parser==0.15.2
        - myst-nb==0.13.1
        - jupytext==1.13.6
    cluster_env: vicuna_13b_deepspeed_env.yaml
    cluster_compute: vicuna_13b_deepspeed_compute_aws.yaml

  run:
    timeout: 4700
    script: python test_myst_doc.py --path vicuna_13b_lightning_deepspeed_finetune.ipynb

#####################################
# Workspace templates release tests #