diff --git a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/_index.md b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/_index.md
index da8b5b5949..fe5d7bd7c6 100644
--- a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/_index.md
+++ b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/_index.md
@@ -1,23 +1,19 @@
 ---
 title: Deploy SqueezeNet 1.0 INT8 model with ONNX Runtime on Azure Cobalt 100
-draft: true
-cascade:
-  draft: true
-
+
 minutes_to_complete: 60
 
-who_is_this_for: This Learning Path introduces ONNX deployment on Microsoft Azure Cobalt 100 (Arm-based) virtual machines. It is designed for developers deploying ONNX-based applications on Arm-based machines.
+who_is_this_for: This Learning Path is for developers deploying ONNX-based applications on Arm-based machines.
 
 learning_objectives:
-  - Provision an Azure Arm64 virtual machine using Azure console, with Ubuntu Pro 24.04 LTS as the base image.
-  - Deploy ONNX on the Ubuntu Pro virtual machine.
-  - Perform ONNX baseline testing and benchmarking on Arm64 virtual machines.
+  - Provision an Azure Arm64 virtual machine using Azure console, with Ubuntu Pro 24.04 LTS as the base image
+  - Perform ONNX baseline testing and benchmarking on Arm64 virtual machines
 
 prerequisites:
-  - A [Microsoft Azure](https://azure.microsoft.com/) account with access to Cobalt 100 based instances (Dpsv6).
-  - Basic understanding of Python and machine learning concepts.
-  - Familiarity with [ONNX Runtime](https://onnxruntime.ai/docs/) and Azure cloud services.
+  - A [Microsoft Azure](https://azure.microsoft.com/) account with access to Cobalt 100 based instances (Dpsv6)
+  - Basic understanding of Python and machine learning concepts
+  - Familiarity with [ONNX Runtime](https://onnxruntime.ai/docs/) and Azure cloud services
 
 author: Pareena Verma
diff --git a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/background.md b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/background.md
index 1aef38fe2f..9f1bb80efc 100644
--- a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/background.md
+++ b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/background.md
@@ -8,14 +8,33 @@ layout: "learningpathall"
 
 ## Azure Cobalt 100 Arm-based processor
 
-Azure’s Cobalt 100 is built on Microsoft's first-generation, in-house Arm-based processor: the Cobalt 100. Designed entirely by Microsoft and based on Arm’s Neoverse N2 architecture, this 64-bit CPU delivers improved performance and energy efficiency across a broad spectrum of cloud-native, scale-out Linux workloads. These include web and application servers, data analytics, open-source databases, caching systems, and more. Running at 3.4 GHz, the Cobalt 100 processor allocates a dedicated physical core for each vCPU, ensuring consistent and predictable performance.
-To learn more about Cobalt 100, refer to the blog [Announcing the preview of new Azure virtual machine based on the Azure Cobalt 100 processor](https://techcommunity.microsoft.com/blog/azurecompute/announcing-the-preview-of-new-azure-vms-based-on-the-azure-cobalt-100-processor/4146353).
+Azure Cobalt 100 is Microsoft's first-generation, in-house Arm-based processor. Designed entirely by Microsoft and based on Arm’s Neoverse N2 architecture, it is a 64-bit CPU that delivers improved performance and energy efficiency across a broad spectrum of cloud-native, scale-out Linux workloads.
+
+You can use Cobalt 100 for:
+
+- Web and application servers
+- Data analytics
+- Open-source databases
+- Caching systems
+- Many other scale-out workloads
+
+Running at 3.4 GHz, the Cobalt 100 processor allocates a dedicated physical core for each vCPU, ensuring consistent and predictable performance. You can learn more about Cobalt 100 in the Microsoft blog [Announcing the preview of new Azure virtual machine based on the Azure Cobalt 100 processor](https://techcommunity.microsoft.com/blog/azurecompute/announcing-the-preview-of-new-azure-vms-based-on-the-azure-cobalt-100-processor/4146353).
 
 ## ONNX
-ONNX (Open Neural Network Exchange) is an open-source format designed for representing machine learning models.
-It provides interoperability between different deep learning frameworks, enabling models trained in one framework (such as PyTorch or TensorFlow) to be deployed and run in another.
-ONNX models are serialized into a standardized format that can be executed by the ONNX Runtime, a high-performance inference engine optimized for CPU, GPU, and specialized hardware accelerators. This separation of model training and inference allows developers to build flexible, portable, and production-ready AI workflows.
+ONNX (Open Neural Network Exchange) is an open-source format designed for representing machine learning models.
+
+You can use ONNX to:
+
+- Move models between different deep learning frameworks, such as PyTorch and TensorFlow
+- Deploy models trained in one framework to run in another
+- Build flexible, portable, and production-ready AI workflows
+
+ONNX models are serialized into a standardized format that you can execute with ONNX Runtime, a high-performance inference engine optimized for CPU, GPU, and specialized hardware accelerators. This separation of model training and inference lets you deploy models efficiently across cloud, edge, and mobile environments.
+
+To learn more, see the [ONNX official website](https://onnx.ai/) and the [ONNX Runtime documentation](https://onnxruntime.ai/docs/).
+
+## Next steps for ONNX on Azure Cobalt 100
 
-ONNX is widely used in cloud, edge, and mobile environments to deliver efficient and scalable inference for deep learning models. Learn more from the [ONNX official website](https://onnx.ai/) and the [ONNX Runtime documentation](https://onnxruntime.ai/docs/).
+Now that you understand the basics of Azure Cobalt 100 and ONNX Runtime, you are ready to deploy and benchmark ONNX models on Arm-based Azure virtual machines. This Learning Path will guide you step by step through setting up an Azure Cobalt 100 VM, installing ONNX Runtime, and running machine learning inference on Arm64 infrastructure.
diff --git a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/baseline.md b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/baseline.md
index 08f727beb4..2c7a50e4da 100644
--- a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/baseline.md
+++ b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/baseline.md
@@ -36,19 +36,24 @@ python3 baseline.py
 You should see output similar to:
 ```output
 Inference time: 0.0026061534881591797
-```
-{{% notice Note %}}Inference time is the amount of time it takes for a trained machine learning model to make a prediction (i.e., produce output) after receiving input data.
-input tensor of shape (1, 3, 224, 224):
-- 1: batch size
-- 3: color channels (RGB)
-- 224 x 224: image resolution (common for models like SqueezeNet)
+```
+{{% notice Note %}}
+Inference time is how long it takes for a trained machine learning model to make a prediction after it receives input data.
+
+The input tensor shape `(1, 3, 224, 224)` means:
+- `1`: One image is processed at a time (batch size)
+- `3`: Three color channels (red, green, blue)
+- `224 x 224`: Each image is 224 pixels wide and 224 pixels tall (standard for SqueezeNet)
 {{% /notice %}}
 
 This indicates the model successfully executed a single forward pass through the SqueezeNet INT8 ONNX model and returned results.
 
-#### Output summary:
+## Output summary
 
 Single inference latency (0.00260 sec): This is the time required for the model to process one input image and produce an output. The first run includes graph loading, memory allocation, and model initialization overhead. Subsequent inferences are usually faster due to caching and optimized execution. This demonstrates that the setup is fully working, and ONNX Runtime efficiently executes quantized models on Arm64.
+
+Great job! You've completed your first ONNX Runtime inference on Arm-based Azure infrastructure. This baseline test confirms your environment is set up correctly and ready for more advanced benchmarking.
+
+Next, you'll use a dedicated benchmarking tool to capture more detailed performance statistics and further optimize your deployment.
diff --git a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/benchmarking.md b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/benchmarking.md
index d3a18d7050..5a429269e8 100644
--- a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/benchmarking.md
+++ b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/benchmarking.md
@@ -1,19 +1,25 @@
 ---
-title: Benchmarking via onnxruntime_perf_test
+title: Benchmark ONNX Runtime performance with onnxruntime_perf_test
 weight: 6
 
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---
 
-Now that you have validated ONNX Runtime with Python-based timing (e.g., SqueezeNet baseline test), you can move to using a dedicated benchmarking utility called `onnxruntime_perf_test`. This tool is designed for systematic performance evaluation of ONNX models, allowing you to capture more detailed statistics than simple Python timing.
-This helps evaluate the ONNX Runtime efficiency on Azure Arm64-based Cobalt 100 instances and other x86_64 instances. architectures.
+## Benchmark ONNX model inference on Azure Cobalt 100
+Now that you have validated ONNX Runtime with Python-based timing (for example, the SqueezeNet baseline test), you can move to using a dedicated benchmarking utility called `onnxruntime_perf_test`. This tool is designed for systematic performance evaluation of ONNX models, allowing you to capture more detailed statistics than simple Python timing.
+
+This approach helps you evaluate ONNX Runtime efficiency on Azure Arm64-based Cobalt 100 instances and compare results with other architectures if needed.
+
+You are ready to run benchmarks, which is a key skill for optimizing real-world deployments.
+
 ## Run the performance tests using onnxruntime_perf_test
-The `onnxruntime_perf_test` is a performance benchmarking tool included in the ONNX Runtime source code. It is used to measure the inference performance of ONNX models and supports multiple execution providers (like CPU, GPU, or other execution providers). on Arm64 VMs, CPU execution is the focus.
+The `onnxruntime_perf_test` tool is included in the ONNX Runtime source code. You can use it to measure the inference performance of ONNX models and compare different execution providers (such as CPU or GPU). On Arm64 VMs, CPU execution is the focus.
 
-### Install Required Build Tools
-Before building or running `onnxruntime_perf_test`, you will need to install a set of development tools and libraries. These packages are required for compiling ONNX Runtime and handling model serialization via Protocol Buffers.
+
+## Install required build tools
+Before building or running `onnxruntime_perf_test`, you need to install a set of development tools and libraries. These packages are required for compiling ONNX Runtime and handling model serialization via Protocol Buffers.
 
 ```console
 sudo apt update
@@ -29,35 +35,48 @@ You should see output similar to:
 ```output
 libprotoc 3.21.12
 ```
-### Build ONNX Runtime from Source:
+## Build ONNX Runtime from source
 
-The benchmarking tool `onnxruntime_perf_test`, isn’t available as a pre-built binary for any platform. So, you will have to build it from the source, which is expected to take around 40 minutes.
+The benchmarking tool `onnxruntime_perf_test` isn’t available as a pre-built binary for any platform, so you will need to build it from source. This process can take up to 40 minutes.
 
-Clone onnxruntime repo:
+Clone the ONNX Runtime repository:
 ```console
 git clone --recursive https://github.com/microsoft/onnxruntime
 cd onnxruntime
 ```
+
 Now, build the benchmark tool:
 ```console
 ./build.sh --config Release --build_dir build/Linux --build_shared_lib --parallel --build --update --skip_tests
 ```
-You should see the executable at:
+If the build completes successfully, you should see the executable at:
 ```output
 ./build/Linux/Release/onnxruntime_perf_test
 ```
+
 ## Run the benchmark
 Now that you have built the benchmarking tool, you can run inference benchmarks on the SqueezeNet INT8 model:
 ```console
 ./build/Linux/Release/onnxruntime_perf_test -e cpu -r 100 -m times -s -Z -I ../squeezenet-int8.onnx
 ```
+
 Breakdown of the flags:
- -e cpu → Use the CPU execution provider.
- -r 100 → Run 100 inference passes for statistical reliability.
- -m times → Run in “repeat N times” mode. Useful for latency-focused measurement.
+
+- `-e cpu`: use the CPU execution provider.
+- `-r 100`: run 100 inference passes for statistical reliability.
+- `-m times`: run in “repeat N times” mode for latency-focused measurement.
+- `-s`: show detailed per-run latency statistics.
+- `-Z`: disable intra-op thread spinning, which reduces wasted CPU time when the benchmark is idle between runs.
+- `-I ../squeezenet-int8.onnx`: path to your ONNX model file.
+
+You should see output with latency and throughput statistics. If you encounter build errors, check that you have enough memory (at least 8 GB recommended) and all dependencies are installed. For missing dependencies, review the installation steps above.
+
+If the benchmark runs successfully, you are ready to analyze and optimize your ONNX model performance on Arm-based Azure infrastructure.
+
+Well done! You have completed a full benchmarking workflow. Continue to the next section to explore further optimizations or advanced deployment scenarios.
-s → Show detailed per-run statistics (latency distribution).
-Z → Disable intra-op thread spinning. Reduces CPU waste when idle between runs, especially on high-core systems like Cobalt 100.
-I → Input the ONNX model path directly, skipping pre-generated test data.
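+
+If you want to cross-check these numbers from Python, the short sketch below times repeated runs of the same model with ONNX Runtime and computes similar percentile statistics. This is a minimal illustration rather than part of the official tooling, and it assumes the model sits at `../squeezenet-int8.onnx`, matching the path used above:
+
+```python
+# percentiles.py: hypothetical cross-check of onnxruntime_perf_test results
+import time
+import numpy as np
+import onnxruntime as ort
+
+session = ort.InferenceSession("../squeezenet-int8.onnx", providers=["CPUExecutionProvider"])
+input_name = session.get_inputs()[0].name
+# SqueezeNet expects a (1, 3, 224, 224) float32 input tensor
+data = np.random.rand(1, 3, 224, 224).astype(np.float32)
+
+latencies = []
+for _ in range(100):  # mirrors -r 100
+    start = time.perf_counter()
+    session.run(None, {input_name: data})
+    latencies.append(time.perf_counter() - start)
+
+for p in (50, 95, 99):
+    print(f"P{p} latency: {np.percentile(latencies, p):.6f} s")
+print(f"Average latency: {np.mean(latencies):.6f} s")
+```
+
+The numbers will not match `onnxruntime_perf_test` exactly, but they should fall in the same range.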
@@ -86,17 +105,17 @@ P95 Latency: 0.00187393 s
 P99 Latency: 0.00190312 s
 P999 Latency: 0.00190312 s
 ```
-### Benchmark Metrics Explained
+## Benchmark metrics explained
 
- * Average Inference Time: The mean time taken to process a single inference request across all runs. Lower values indicate faster model execution.
- * Throughput: The number of inference requests processed per second. Higher throughput reflects the model’s ability to handle larger workloads efficiently.
- * CPU Utilization: The percentage of CPU resources used during inference. A value close to 100% indicates full CPU usage, which is expected during performance benchmarking.
- * Peak Memory Usage: The maximum amount of system memory (RAM) consumed during inference. Lower memory usage is beneficial for resource-constrained environments.
- * P50 Latency (Median Latency): The time below which 50% of inference requests complete. Represents typical latency under normal load.
- * Latency Consistency: Describes the stability of latency values across all runs. "Consistent" indicates predictable inference performance with minimal jitter.
+ * Average inference time: the mean time taken to process a single inference request across all runs. Lower values indicate faster model execution.
+ * Throughput: the number of inference requests processed per second. Higher throughput reflects the model’s ability to handle larger workloads efficiently.
+ * CPU utilization: the percentage of CPU resources used during inference. A value close to 100% indicates full CPU usage, which is expected during performance benchmarking.
+ * Peak memory usage: the maximum amount of system memory (RAM) consumed during inference. Lower memory usage is beneficial for resource-constrained environments.
+ * P50 latency (median latency): the time below which 50% of inference requests complete. Represents typical latency under normal load.
+ * Latency consistency: describes the stability of latency values across all runs. "Consistent" indicates predictable inference performance with minimal jitter.
 
-### Benchmark summary on Arm64:
-Here is a summary of benchmark results collected on an Arm64 **D4ps_v6 Ubuntu Pro 24.04 LTS virtual machine**.
+## Benchmark summary on Arm64
+Here is a summary of benchmark results collected on an Arm64 D4ps_v6 Ubuntu Pro 24.04 LTS virtual machine.
 
 | **Metric** | **Value** |
 |----------------------------|-------------------------------|
@@ -113,12 +132,9 @@ Here is a summary of benchmark results collected on an Arm64 **D4ps_v6 Ubuntu Pr
 | **Latency Consistency** | Consistent |
 
 
-### Highlights from Benchmarking on Azure Cobalt 100 Arm64 VMs
+## Highlights from benchmarking on Azure Cobalt 100 Arm64 VMs
+
-The results on Arm64 virtual machines demonstrate:
-- Low-Latency Inference: Achieved consistent average inference times of ~1.86 ms on Arm64.
-- Strong and Stable Throughput: Sustained throughput of over 538 inferences/sec using the `squeezenet-int8.onnx` model on D4ps_v6 instances.
-- Lightweight Resource Footprint: Peak memory usage stayed below 37 MB, with CPU utilization around 96%, ideal for efficient edge or cloud inference.
-- Consistent Performance: P50, P95, and Max latency remained tightly bound, showcasing reliable performance on Azure Cobalt 100 Arm-based infrastructure.
+These results on Arm64 virtual machines demonstrate low-latency inference, with consistent average inference times of approximately 1.86 ms. Throughput remains strong and stable, sustaining over 538 inferences per second using the `squeezenet-int8.onnx` model on D4ps_v6 instances. The resource footprint is lightweight, as peak memory usage stays below 37 MB and CPU utilization is around 96%, making this setup ideal for efficient edge or cloud inference. Performance is also consistent, with P50, P95, and maximum latency values tightly grouped, showcasing reliable results on Azure Cobalt 100 Arm-based infrastructure.
 
 You have now successfully benchmarked inference time of ONNX models on an Azure Cobalt 100 Arm64 virtual machine.
diff --git a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/create-instance.md b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/create-instance.md
index 420b6ea4b8..31f2fb1e30 100644
--- a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/create-instance.md
+++ b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/create-instance.md
@@ -1,12 +1,12 @@
 ---
-title: Create an Arm-based Azure VM with Cobalt 100
+title: Create an Arm-based Azure Cobalt 100 virtual machine
 weight: 3
 
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---
 
-## Set up your development environment
+## Set up your Arm-based Azure Cobalt 100 virtual machine
 
 There is more than one way to create an Arm-based Cobalt 100 virtual machine:
 
@@ -20,37 +20,46 @@ You will focus on the general-purpose virtual machines in the D-series. For furt
 While the steps to create this instance are included here for convenience, for further information on setting up Cobalt on Azure, see [Deploy a Cobalt 100 virtual machine on Azure Learning Path](/learning-paths/servers-and-cloud-computing/cobalt/).
 
-#### Create an Arm-based Azure Virtual Machine
+## Create an Arm-based Azure virtual machine
 
-Creating a virtual machine based on Azure Cobalt 100 is no different from creating any other virtual machine in Azure. To create an Azure virtual machine, launch the Azure portal and navigate to "Virtual Machines".
-1. Select "Create", and click on "Virtual Machine" from the drop-down list.
-2. Inside the "Basic" tab, fill in the Instance details such as "Virtual machine name" and "Region".
-3. Choose the image for your virtual machine (for example, Ubuntu Pro 24.04 LTS) and select “Arm64” as the VM architecture.
-4. In the “Size” field, click on “See all sizes” and select the D-Series v6 family of virtual machines. Select “D4ps_v6” from the list.
-![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/instance.png "Figure 1: Select the D-Series v6 family of virtual machines")
-5. Select "SSH public key" as an Authentication type. Azure will automatically generate an SSH key pair for you and allow you to store it for future use. It is a fast, simple, and secure way to connect to your virtual machine.
-6. Fill in the Administrator username for your VM.
-7. Select "Generate new key pair", and select "RSA SSH Format" as the SSH Key Type. RSA could offer better security with keys longer than 3072 bits. Give a Key pair name to your SSH key.
-8. In the "Inbound port rules", select HTTP (80) and SSH (22) as the inbound ports.
+To launch an Arm-based virtual machine on Azure, you will use the Azure portal to create a Linux VM powered by the Cobalt 100 processor. This process is similar to creating any other Azure VM, but you will specifically select the Arm64 architecture and the D-Series v6 (D4ps_v6) size for optimal performance on Arm.
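+
+If you prefer the command line, you can create an equivalent VM with the Azure CLI instead of the portal steps below. The following is a sketch rather than a required step: the resource group name, VM name, and image URN are placeholders, and you should look up the exact Ubuntu Pro 24.04 Arm64 image URN for your region (for example, with `az vm image list`):
+
+```console
+az vm create \
+  --resource-group my-resource-group \
+  --name my-cobalt100-vm \
+  --image <ubuntu-pro-24.04-arm64-image-urn> \
+  --size Standard_D4ps_v6 \
+  --admin-username azureuser \
+  --generate-ssh-keys
+```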
-![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/instance1.png "Figure 2: Allow inbound port rules")
+Follow these steps to deploy a Linux-based Azure Cobalt 100 VM:
 
-9. Click on the "Review + Create" tab and review the configuration for your virtual machine. It should look like the following:
+- Select **Create**, and click on **Virtual Machine** from the drop-down list.
+- Inside the **Basic** tab, fill in the instance details such as **Virtual machine name** and **Region**.
+- Choose the image for your virtual machine (for example, Ubuntu Pro 24.04 LTS) and select **Arm64** as the VM architecture.
+- In the **Size** field, click on **See all sizes** and select the D-Series v6 family of virtual machines. Select **D4ps_v6** from the list.
 
-![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/ubuntu-pro.png "Figure 3: Review and Create an Azure Cobalt 100 Arm64 VM")
+![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/instance.png "Select the D-Series v6 family of virtual machines")
 
-10. Finally, when you are confident about your selection, click on the "Create" button, and click on the "Download Private key and Create Resources" button.
+- Select **SSH public key** as an Authentication type. Azure will automatically generate an SSH key pair for you and allow you to store it for future use. It is a fast, simple, and secure way to connect to your virtual machine.
+- Fill in the **Administrator username** for your VM.
+- Select **Generate new key pair**, and select **RSA SSH Format** as the SSH Key Type. RSA can offer better security with keys longer than 3072 bits. Give a **Key pair name** to your SSH key.
+- In the **Inbound port rules**, select **HTTP (80)** and **SSH (22)** as the inbound ports.
 
-![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/instance4.png "Figure 4: Download Private key and Create Resources")
+![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/instance1.png "Allow inbound port rules")
 
-11. Your virtual machine should be ready and running within no time. You can SSH into the virtual machine using the private key, along with the Public IP details.
+Click on the **Review + Create** tab and review the configuration for your virtual machine. It should look like the following:
 
-![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/final-vm.png "Figure 5: VM deployment confirmation in Azure portal")
+![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/ubuntu-pro.png "Review and Create an Azure Cobalt 100 Arm64 VM")
+
+When you are confident about your selection, click on the **Create** button, and click on the **Download Private key and Create Resources** button.
+
+![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/instance4.png "Download Private key and Create Resources")
+
+Your virtual machine should be ready and running within a few minutes. You can SSH into the virtual machine using the private key, along with the Public IP details.
 
-{{% notice Note %}}
-To learn more about Arm-based virtual machine in Azure, refer to “Getting Started with Microsoft Azure” in [Get started with Arm-based cloud instances](/learning-paths/servers-and-cloud-computing/csp/azure).
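+
+For example, a typical connection command looks like this, where the key file, username, and IP address are placeholders for the values from your own deployment:
+
+```console
+ssh -i <your-key-pair-name>.pem <admin-username>@<vm-public-ip>
+```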
+You should see your Arm-based Azure Cobalt 100 VM listed as **Running** in the Azure portal. If you have trouble connecting, double-check your SSH key and ensure the correct ports are open. If the VM creation fails, check your Azure quota, region availability, or try a different VM size. For more troubleshooting tips, see the [Deploy a Cobalt 100 virtual machine on Azure Learning Path](/learning-paths/servers-and-cloud-computing/cobalt/).
+
+![Azure portal VM creation - Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/final-vm.png "VM deployment confirmation in Azure portal")
+
+Nice work! You have successfully provisioned an Arm-based Azure Cobalt 100 virtual machine. This setup is ideal for deploying Linux workloads, running ONNX Runtime, and benchmarking machine learning models on Arm64 infrastructure. You are now ready to continue with ONNX Runtime installation and performance testing in the next steps.
+
+{{% notice Note %}}
+For further information or alternative setup options, see “Getting Started with Microsoft Azure” in [Get started with Arm-based cloud instances](/learning-paths/servers-and-cloud-computing/csp/azure).
 {{% /notice %}}
diff --git a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/deploy.md b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/deploy.md
index ed9ff8e35e..a9550eefd7 100644
--- a/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/deploy.md
+++ b/content/learning-paths/servers-and-cloud-computing/onnx-on-azure/deploy.md
@@ -8,14 +8,18 @@
 ## ONNX Installation on Azure Ubuntu Pro 24.04 LTS
 
-To work with ONNX models on Azure, you will need a clean Python environment with the required packages. The following steps install Python, set up a virtual environment, and prepare for ONNX model execution using ONNX Runtime.
+To work with ONNX models on Azure, you will need a clean Python environment with the required packages. The following steps show you how to install Python, set up a virtual environment, and prepare for ONNX model execution using ONNX Runtime.
 
-### Install Python and Virtual Environment:
+
+## Install Python and virtual environment
+
+To get started, update your package list and install Python 3 along with the tools needed to create a virtual environment:
 
 ```console
 sudo apt update
-sudo apt install -y python3 python3-pip python3-virtualenv python3-venv
+sudo apt install -y python3 python3-pip python3-venv
 ```
+
 Create and activate a virtual environment:
 
 ```console
@@ -24,28 +28,35 @@ source onnx-env/bin/activate
 ```
 {{% notice Note %}}Using a virtual environment isolates ONNX and its dependencies to avoid system conflicts.{{% /notice %}}
 
-### Install ONNX and Required Libraries:
+Once your environment is active, you're ready to install the required libraries.
+
+
+## Install ONNX and required libraries
 Upgrade pip and install ONNX with its runtime and supporting libraries:
 ```console
 pip install --upgrade pip
 pip install onnx onnxruntime fastapi uvicorn numpy
 ```
-This installs ONNX libraries along with FastAPI (web serving) and NumPy (for input tensor generation).
+This installs ONNX libraries, FastAPI (for web serving, if you want to deploy models as an API), Uvicorn (ASGI server for FastAPI), and NumPy (for input tensor generation).
+
+If you encounter errors during installation, check your internet connection and ensure you are using the activated virtual environment. For missing dependencies, try updating pip or installing system packages as needed.
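+
+As a preview of why FastAPI and Uvicorn are included, the hypothetical `serve.py` below wraps an ONNX Runtime session in a minimal web API. It assumes the `squeezenet-int8.onnx` model you download later in this Learning Path, so treat it as a sketch to return to rather than a step to run now:
+
+```python
+# serve.py: hypothetical example of serving an ONNX model with FastAPI
+from fastapi import FastAPI
+import numpy as np
+import onnxruntime as ort
+
+app = FastAPI()
+# Assumes the SqueezeNet model downloaded in a later step
+session = ort.InferenceSession("squeezenet-int8.onnx", providers=["CPUExecutionProvider"])
+input_name = session.get_inputs()[0].name
+
+@app.get("/predict")
+def predict():
+    # A random tensor stands in for a real preprocessed image
+    data = np.random.rand(1, 3, 224, 224).astype(np.float32)
+    outputs = session.run(None, {input_name: data})
+    return {"output_shape": list(outputs[0].shape)}
+```
+
+You would start it with `uvicorn serve:app --port 8000` once the model file is in place.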
+
+After installation, you're ready to validate your setup.
 
-### Validate ONNX and ONNX Runtime:
-Once the libraries are installed, you should verify that both ONNX and ONNX Runtime are correctly set up on your VM.
+
+## Validate ONNX and ONNX Runtime
+Once the libraries are installed, verify that both ONNX and ONNX Runtime are correctly set up on your VM.
 
 Create a file named `version.py` with the following code:
 
 ```python
 import onnx
 import onnxruntime
-print("ONNX version:", onnx.__version__)
-print("ONNX Runtime version:", onnxruntime.__version__)
+print("ONNX version:", onnx.__version__)
+print("ONNX Runtime version:", onnxruntime.__version__)
 ```
-Run the script:
-
+Run the script:
 ```console
 python3 version.py
 ```
@@ -54,15 +65,20 @@ You should see output similar to:
 ```
 ONNX version: 1.19.0
 ONNX Runtime version: 1.23.0
 ```
-With this validation, you have confirmed that ONNX and ONNX Runtime are installed and ready on your Azure Cobalt 100 VM. This is the foundation for running inference workloads and serving ONNX models.
+If you see version numbers for both ONNX and ONNX Runtime, your environment is ready. If you get an ImportError, double-check that your virtual environment is activated and the libraries are installed.
+
+Great job! You have confirmed that ONNX and ONNX Runtime are installed and ready on your Azure Cobalt 100 VM. This is the foundation for running inference workloads and serving ONNX models.
+
+
+## Download and validate ONNX model: SqueezeNet
+SqueezeNet is a lightweight convolutional neural network (CNN) architecture designed to provide accuracy close to AlexNet while using 50x fewer parameters and a much smaller model size. This makes it well-suited for benchmarking ONNX Runtime.
 
-### Download and Validate ONNX Model - SqueezeNet:
-SqueezeNet is a lightweight convolutional neural network (CNN) architecture designed to provide accuracy close to AlexNet while using 50x fewer parameters and a much smaller model size. This makes it well-suited for benchmarking ONNX Runtime.
+Now that your environment is set up and validated, you're ready to download and test the SqueezeNet model.
 
 Download the quantized model:
 ```console
 wget https://github.com/onnx/models/raw/main/validated/vision/classification/squeezenet/model/squeezenet1.0-12-int8.onnx -O squeezenet-int8.onnx
 ```
-#### Validate the model:
+## Validate the model
 After downloading the SqueezeNet ONNX model, the next step is to confirm that it is structurally valid and compliant with the ONNX specification. ONNX provides a built-in checker utility that verifies the graph, operators, and metadata.
 
 Create a file named `validation.py` with the following code:
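+
+A minimal version of this script, using the built-in checker described above, might look like the following sketch (the exact code used in this Learning Path may differ):
+
+```python
+# validation.py: minimal sketch using the built-in ONNX checker
+import onnx
+
+model = onnx.load("squeezenet-int8.onnx")
+onnx.checker.check_model(model)  # raises an exception if the model is invalid
+print("The model is valid!")
+```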