Optimum for Intel Gaudi is the interface between the Transformers and Diffusers libraries and Intel® Gaudi® AI Accelerators (HPUs). It provides a set of tools for loading models, training them, and running inference in single- and multi-HPU settings across various downstream tasks, as shown in the tables below.
HPUs offer fast model training and inference as well as a great price-performance ratio. For concrete examples, check out this blog post about BERT pre-training and this post benchmarking Intel Gaudi 2 against NVIDIA A100 GPUs. If you are not familiar with HPUs, we recommend you take a look at our conceptual guide.
The following model architectures, tasks and device distributions have been validated for Optimum for Intel Gaudi:
In the tables below, ✅ means single-card, multi-card and DeepSpeed have all been validated.
- Transformers:
- Diffusers:
| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| Stable Diffusion | | | |
| Stable Diffusion XL | | | |
| LDM3D | | | |
- PyTorch Image Models/TIMM:
| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| FastViT | | | |
- TRL:
| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| Llama 2 | ✅ | | |
| Llama 2 | ✅ | | |
| Stable Diffusion | ✅ | | |
Other models and tasks supported by the 🤗 Transformers and 🤗 Diffusers libraries may also work. You can refer to this section for using them with 🤗 Optimum Habana. In addition, this page explains how to modify any example from the 🤗 Transformers library to make it work with 🤗 Optimum Habana.
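As a rough illustration of what adapting a 🤗 Transformers example usually involves, the sketch below swaps `Trainer` and `TrainingArguments` for `GaudiTrainer` and `GaudiTrainingArguments` from `optimum.habana`. The helper names `gaudi_training_kwargs` and `build_gaudi_trainer`, the output directory, and the choice of `Habana/bert-base-uncased` as Gaudi configuration are assumptions made for this example; the right configuration depends on your model. It is a minimal sketch, not a complete training script.

```python
from typing import Any, Dict


def gaudi_training_kwargs(output_dir: str, gaudi_config_name: str) -> Dict[str, Any]:
    """Hypothetical helper: the HPU-specific arguments added to a regular setup."""
    return {
        "output_dir": output_dir,
        "use_habana": True,      # run on HPUs instead of CPU/GPU
        "use_lazy_mode": True,   # execute in lazy mode rather than eager mode
        "gaudi_config_name": gaudi_config_name,  # Gaudi configuration to load
    }


def build_gaudi_trainer(
    model,
    train_dataset,
    output_dir: str = "./out",  # illustrative path
    gaudi_config_name: str = "Habana/bert-base-uncased",  # assumed for the example
):
    """Hypothetical helper: build a GaudiTrainer in place of a regular Trainer."""
    # Imported lazily so the module can be imported on machines that do not
    # have optimum-habana (or Gaudi hardware) available.
    from optimum.habana import GaudiTrainer, GaudiTrainingArguments

    args = GaudiTrainingArguments(**gaudi_training_kwargs(output_dir, gaudi_config_name))
    return GaudiTrainer(model=model, args=args, train_dataset=train_dataset)
```

Calling `build_gaudi_trainer(model, train_dataset).train()` would then start training in the same way as with a regular `Trainer`, provided the script runs on a machine with HPUs and the Intel Gaudi software stack installed.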
Learn the basics and become familiar with training transformers on HPUs with 🤗 Optimum. Start here if you are using 🤗 Optimum Habana for the first time!
Practical guides to help you achieve a specific goal. Take a look at these guides to learn how to use 🤗 Optimum Habana to solve real-world problems.
High-level explanations for building a better understanding of important topics such as HPUs.
Technical descriptions of how the Habana classes and methods of 🤗 Optimum Habana work.