This repository contains samples for AWS Neuron, the software development kit (SDK) that enables machine learning (ML) inference and training workloads on the AWS ML accelerator chips Inferentia and Trainium.
The samples in this repository illustrate the types of deep learning models that can be used with Trainium and Inferentia; they are not an exhaustive list of supported models. To contribute additional model samples, please submit a pull request following the repository's contribution guidelines.
Samples are organized by use case (training first, then inference) and by deep learning framework (PyTorch, TensorFlow) in the two tables below:
Framework | Description | Instance Type |
---|---|---|
PyTorch Neuron (torch-neuronx) | Sample training scripts for training various PyTorch models on AWS Trainium | Trn1 |
Framework | Description | Instance Type |
---|---|---|
PyTorch Neuron (torch-neuron) | Sample Jupyter notebooks demonstrating model compilation and inference for various PyTorch models on AWS Inferentia | Inf1 |
PyTorch Neuron (torch-neuronx) | Sample Jupyter notebooks demonstrating model compilation and inference for various PyTorch models on AWS Trainium | Trn1 |
PyTorch Neuron (transformers-neuronx) | Sample Jupyter notebooks demonstrating tensor parallel inference for various PyTorch large language models (LLMs) on AWS Inferentia and Trainium | Inf2 & Trn1 |
TensorFlow Neuron (tensorflow-neuron) | Sample Jupyter notebooks demonstrating model compilation and inference for various TensorFlow models on AWS Inferentia | Inf1 |
If you encounter issues with any of the samples in this repository, please open an issue via the GitHub Issues feature.
Please refer to the CONTRIBUTING document for details on contributing additional samples to this repository.
For a history of changes to the samples, please refer to the Change Log.
The following samples currently have known issues:
Model | Framework | Training/Inference | Instance Type | Status |
---|---|---|---|---|
Fairseq | PyTorch | Inference | Inf1 | `RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!` |
Yolof | PyTorch | Inference | Inf1 | `RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!` |