Training and Inference of Hugging Face models on Azure Databricks

This repository contains the code for the blog post series Optimized Training and Inference of Hugging Face Models on Azure Databricks.

To reproduce the Databricks notebooks, first set up your environment by following the steps below:

Create an Azure Databricks Workspace: you can create one by following these instructions. The Standard pricing tier is sufficient.
Create a Cluster: you can follow these instructions to create your cluster. Your cluster configuration should be based on nodes of the type Standard_NC4as_T4_v3. Make sure your subscription has enough vCPU quota for that VM type; otherwise, work with your Azure subscription administrator to request a quota increase. Use the information below when creating your cluster:

Databricks runtime version should be at least 11.2 ML (GPU, Scala 2.12, Spark 3.3.0)
Worker type should be Standard_NC4as_T4_v3 and the number of workers should be at least 2 (the notebooks here were run with 8 worker nodes)
Driver type should be the same as worker type
Disable autoscaling
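
The cluster settings above can also be written down as a cluster definition, for example to pass to the Databricks Clusters API or CLI. The following is a minimal sketch, not the configuration used by the authors: the cluster name is hypothetical, and the `spark_version` key `11.2.x-gpu-ml-scala2.12` is an assumption about how the 11.2 ML GPU runtime is identified in your workspace, so check the runtime list there before using it.

```python
import json


def build_cluster_spec(num_workers=8):
    """Build a cluster definition that follows the settings listed above."""
    if num_workers < 2:
        raise ValueError("the setup steps ask for at least 2 workers")
    node_type = "Standard_NC4as_T4_v3"
    return {
        "cluster_name": "hf-training-inference",  # hypothetical name
        # Assumed runtime key for 11.2 ML (GPU, Scala 2.12, Spark 3.3.0).
        "spark_version": "11.2.x-gpu-ml-scala2.12",
        "node_type_id": node_type,
        "driver_node_type_id": node_type,  # driver type same as worker type
        # A fixed worker count with no "autoscale" entry keeps autoscaling off.
        "num_workers": num_workers,
    }


print(json.dumps(build_cluster_spec(), indent=2))
```

The default of 8 workers mirrors the number of worker nodes the notebooks were run with; any value of 2 or more satisfies the stated minimum.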

Install Python libraries in your cluster: you can follow these instructions to install the following PyPI libraries:

transformers==4.20.1
sentencepiece
datasets
deepspeed
mpi4py
ninja
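
Once the libraries are installed, a quick check from a notebook cell can confirm that the cluster has them. The sketch below uses only the Python standard library; the helper name `check_requirements` is hypothetical and not part of this repository.

```python
from importlib import metadata

# The PyPI libraries listed above; only transformers is pinned.
REQUIRED = [
    "transformers==4.20.1",
    "sentencepiece",
    "datasets",
    "deepspeed",
    "mpi4py",
    "ninja",
]


def check_requirements(requirements=REQUIRED):
    """Map each package name to 'ok', 'missing', or a version mismatch note."""
    report = {}
    for requirement in requirements:
        name, _, pinned = requirement.partition("==")
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if pinned and installed != pinned:
            report[name] = f"wrong version (found {installed})"
        else:
            report[name] = "ok"
    return report


for name, status in check_requirements().items():
    print(f"{name}: {status}")
```

This only inspects the Python environment of the node where the cell runs (the driver), so it does not replace installing the libraries at the cluster level.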

Install a cluster-scoped init script in your cluster. This is needed to install the ninja Linux library on the cluster nodes. The script to install is ninja_install.sh. You can follow these instructions to learn how to install it.

The notebooks should be run in the following order:

data_preparation.ipynb: downloads and prepares the datasets needed for model training and inference.
model_training_hvd.ipynb: performs distributed fine-tuning of the pre-trained Hugging Face model using PyTorch and Horovod.
model_training_hvd_deepspeed.ipynb: performs distributed fine-tuning of the pre-trained Hugging Face model using PyTorch and Horovod, optimized with DeepSpeed.
model_inference_hvd.ipynb: performs distributed inference on the fine-tuned model using PyTorch and Horovod.
model_inference_hvd_deepspeed.ipynb: performs distributed inference on the fine-tuned model using PyTorch and Horovod, optimized with DeepSpeed.
model_inference_pudf.ipynb: performs distributed inference on the fine-tuned model using the Transformers pipeline and a pandas UDF.
model_inference_pudf_deepspeed.ipynb: performs distributed inference on the fine-tuned model using the Transformers pipeline and a pandas UDF, optimized with DeepSpeed.
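
If you prefer to drive the notebooks from a single place instead of opening them one by one, the order above can be encoded in a small helper. This is a sketch under assumptions: the helper `run_in_order` is hypothetical, the notebook names assume the `.ipynb` extension is dropped when the files are imported into the workspace, and the paths assume the driver notebook sits in the same folder as the others.

```python
# Notebook names in the order given above (without the .ipynb extension).
NOTEBOOK_ORDER = [
    "data_preparation",
    "model_training_hvd",
    "model_training_hvd_deepspeed",
    "model_inference_hvd",
    "model_inference_hvd_deepspeed",
    "model_inference_pudf",
    "model_inference_pudf_deepspeed",
]


def run_in_order(run_notebook, notebooks=NOTEBOOK_ORDER, timeout_seconds=0):
    """Run each notebook in sequence and collect what each run returns.

    `run_notebook` is any callable taking (path, timeout_seconds). An
    exception raised by one run stops the sequence, so later notebooks
    never start with missing inputs.
    """
    results = {}
    for name in notebooks:
        results[name] = run_notebook(name, timeout_seconds)
    return results
```

Inside a Databricks notebook, `run_in_order(dbutils.notebook.run)` would be one way to call it, assuming `dbutils.notebook.run` accepts a path and a timeout in seconds (with 0 meaning no timeout) as its first two arguments. Passing the runner as a parameter keeps the helper usable outside Databricks, for example with a stub that only logs the names.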
