QuickStart

A typical Neuron developer flow consists of a compilation phase followed by deployment (inference) on one or more Inf1 instances.

To quickly start developing with Neuron:

Set up your environment to compile and deploy on one or more Inf1 instances:
- ec2-then-ec2-setenv

Run a tutorial for one of the leading machine learning frameworks supported by Neuron:
- pytorch-tutorials
- tensorflow-tutorials
- mxnet-tutorials

Explore more development flows with Neuron:
- neuron-devflows

Customers can train their models anywhere, then migrate their ML applications to Neuron and run high-performance production predictions on Inferentia. Once a model is trained to the required accuracy, it is compiled to an optimized binary form, the Neuron Executable File Format (NEFF), which the Neuron runtime driver loads to execute inference requests on the Inferentia chips. Developers can train their models in fp16, or keep training in 32-bit floating point for best accuracy; in the latter case Neuron auto-casts the 32-bit trained model to run at 16-bit speed using bfloat16.
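
To give a feel for what casting a 32-bit value to bfloat16 costs in precision, here is a minimal sketch in plain Python. It relies only on the fact that bfloat16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE 754 float32. The function names are hypothetical, and the simple truncation used here is an assumption made for illustration; this page does not say which rounding mode Neuron applies, and the sketch does not call any Neuron API.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern obtained by truncating x's float32 encoding.

    Assumption: plain truncation (drop the low 16 bits). Real hardware or
    compilers may round to nearest instead.
    """
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value


def cast_to_bfloat16(x: float) -> float:
    """Round-trip x through bfloat16 to see the value that survives the cast."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))


if __name__ == "__main__":
    for sample in (1.0, 3.14159265, 1e30):
        print(sample, "->", cast_to_bfloat16(sample))
```

Because bfloat16 keeps the full float32 exponent, large magnitudes such as 1e30 stay in range, while values such as 3.14159265 lose low-order digits (it becomes 3.140625 under truncation). That is the trade-off behind training in 32-bit for accuracy and running inference at 16-bit speed.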
