# Optimum for Intel Gaudi

Optimum for Intel Gaudi is the interface between the Transformers and Diffusers libraries and Intel® Gaudi® AI Accelerators (HPUs). It provides a set of tools that enable easy model loading, training and inference on single- and multi-HPU settings for various downstream tasks, as shown in the tables below.
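
For instance, training largely mirrors the stock Transformers `Trainer` API. Below is a minimal sketch, assuming a Gaudi machine with `optimum-habana` installed; the model name, toy dataset and Gaudi configuration are illustrative placeholders:

```python
from datasets import Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

model_name = "bert-base-uncased"  # any validated architecture from the tables below
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tiny toy dataset, only to keep the sketch self-contained
dataset = Dataset.from_dict({"text": ["great movie", "terrible movie"], "label": [1, 0]})
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length", max_length=32),
    batched=True,
)

args = GaudiTrainingArguments(
    output_dir="./results",
    use_habana=True,                               # run on HPU rather than CPU/GPU
    use_lazy_mode=True,                            # Gaudi lazy execution mode
    gaudi_config_name="Habana/bert-base-uncased",  # ready-made Gaudi config from the Hub
)

trainer = GaudiTrainer(model=model, args=args, train_dataset=dataset)
trainer.train()
```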

HPUs offer fast model training and inference as well as a great price-performance ratio. Check out this blog post about BERT pre-training and this post benchmarking Intel Gaudi 2 with NVIDIA A100 GPUs for concrete examples. If you are not familiar with HPUs, we recommend you take a look at our conceptual guide.

The following model architectures, tasks and device distributions have been validated for Optimum for Intel Gaudi:

In the tables below, ✅ means single-card, multi-card and DeepSpeed have all been validated.

- Transformers:

| Architecture | Training | Inference | Tasks |
|:-------------|:--------:|:---------:|:------|
| BERT | ✅ | ✅ | text classification, question answering, language modeling, text feature extraction |
| RoBERTa | ✅ | ✅ | question answering, language modeling |
| ALBERT | ✅ | ✅ | question answering, language modeling |
| DistilBERT | ✅ | ✅ | question answering, language modeling |
| GPT2 | ✅ | ✅ | language modeling, text generation |
| BLOOM(Z) | | DeepSpeed | text generation |
| StarCoder / StarCoder2 | ✅ | Single card | language modeling, text generation |
| GPT-J | DeepSpeed | Single card, DeepSpeed | language modeling, text generation |
| GPT-NeoX | DeepSpeed | DeepSpeed | language modeling, text generation |
| OPT | | DeepSpeed | text generation |
| Llama 2 / CodeLlama / Llama 3 / Llama Guard / Granite | ✅ | ✅ | language modeling, text generation, question answering, text classification (Llama Guard) |
| StableLM | | Single card | text generation |
| Falcon | LoRA | ✅ | text generation |
| CodeGen | | Single card | text generation |
| MPT | | Single card | text generation |
| Mistral | | Single card | text generation |
| Phi | ✅ | Single card | language modeling, text generation |
| Mixtral | | Single card | text generation |
| Gemma | ✅ | Single card | language modeling, text generation |
| Qwen2 | Single card | Single card | language modeling, text generation |
| Persimmon | | Single card | text generation |
| T5 / Flan T5 | ✅ | ✅ | summarization, translation, question answering |
| BART | | Single card | summarization, translation, question answering |
| ViT | ✅ | ✅ | image classification |
| Swin | ✅ | ✅ | image classification |
| Wav2Vec2 | ✅ | ✅ | audio classification, speech recognition |
| Whisper | ✅ | ✅ | speech recognition |
| SpeechT5 | | Single card | text to speech |
| CLIP | ✅ | ✅ | contrastive image-text training |
| BridgeTower | ✅ | ✅ | contrastive image-text training |
| ESMFold | | Single card | protein folding |
| Blip | | Single card | visual question answering, image to text |
| OWLViT | | Single card | zero shot object detection |
| ClipSeg | | Single card | object segmentation |
| Llava / Llava-next | | Single card | image to text |
| SAM | | Single card | object segmentation |
| VideoMAE | | Single card | video classification |
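
As a sketch of what inference with the text-generation models above can look like — assuming a Gaudi machine with `optimum-habana` and the Habana PyTorch bridge installed; the `gpt2` checkpoint and the prompt are placeholders, and the repository's example scripts add HPU graphs and further optimizations on top of this:

```python
import habana_frameworks.torch.core as htcore  # loads the Habana PyTorch bridge ("hpu" device)
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

# Patch Transformers with Gaudi-optimized model implementations
adapt_transformers_to_gaudi()

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval().to("hpu")

inputs = tokenizer("Intel Gaudi accelerators are", return_tensors="pt").to("hpu")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```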
- Diffusers:

| Architecture | Training | Inference | Tasks |
|:-------------|:--------:|:---------:|:------|
| Stable Diffusion | textual inversion, ControlNet | Single card | text-to-image generation |
| Stable Diffusion XL | fine-tuning | Single card | text-to-image generation |
| LDM3D | | Single card | text-to-image generation |
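
On the Diffusers side, a minimal text-to-image sketch, again assuming a Gaudi machine with `optimum-habana` installed (the checkpoint and prompt are placeholders):

```python
from optimum.habana.diffusers import GaudiStableDiffusionPipeline

pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    use_habana=True,                         # run on HPU
    use_hpu_graphs=True,                     # capture HPU graphs for faster inference
    gaudi_config="Habana/stable-diffusion",  # ready-made Gaudi config from the Hub
)

image = pipeline("An astronaut riding a horse on Mars").images[0]
image.save("astronaut.png")
```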
- PyTorch Image Models/TIMM:

| Architecture | Training | Inference | Tasks |
|:-------------|:--------:|:---------:|:------|
| FastViT | | Single card | image classification |
- TRL:

| Architecture | Training | Inference | Tasks |
|:-------------|:--------:|:---------:|:------|
| Llama 2 | ✅ | | DPO Pipeline |
| Llama 2 | ✅ | | PPO Pipeline |
| Stable Diffusion | ✅ | | DDPO Pipeline |
Other models and tasks supported by the 🤗 Transformers and 🤗 Diffusers libraries may also work. You can refer to this section for using them with 🤗 Optimum Habana. In addition, this page explains how to modify any example from the 🤗 Transformers library to make it work with 🤗 Optimum Habana.
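
The gist of that adaptation is swapping the stock `Trainer` classes for their Gaudi counterparts; a minimal sketch of the typical change (the Gaudi config name is a placeholder):

```python
# A stock Transformers example typically starts with:
#   from transformers import Trainer, TrainingArguments
# The Gaudi-enabled version swaps in the drop-in replacements and
# adds the HPU-specific options:
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,                               # run on HPU
    use_lazy_mode=True,                            # Gaudi lazy execution mode
    gaudi_config_name="Habana/bert-base-uncased",  # placeholder: pick one matching your model
)
# ...then build GaudiTrainer exactly as you would build Trainer.
```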