diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_index.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_index.md index f76e3568d7..50f1b1fe9d 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_index.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_index.md @@ -1,23 +1,23 @@ --- -title: Create and train a PyTorch model for digit classification +title: Create and train a PyTorch model for digit classification using the MNIST dataset minutes_to_complete: 160 -who_is_this_for: This is an advanced topic for software developers interested in learning how to use PyTorch to create and train a feedforward neural network for digit classification. You will also learn how to use the trained model in an Android application. Finally, you will apply model optimizations. +who_is_this_for: This is an advanced topic for software developers interested in learning how to use PyTorch to create and train a feedforward neural network for digit classification, and also software developers interested in learning how to use and apply optimizations to the trained model in an Android application. learning_objectives: - Prepare a PyTorch development environment. - Download and prepare the MNIST dataset. - - Create a neural network architecture using PyTorch. - - Train a neural network using PyTorch. - - Create an Android app and loading the pre-trained model. + - Create and train a neural network architecture using PyTorch. + - Create an Android app and load the pre-trained model. - Prepare an input dataset. - Measure the inference time. - Optimize a neural network architecture using quantization and fusing. - - Use an optimized model in the Android application. + - Deploy an optimized model in an Android application. prerequisites: - - A computer that can run Python3, Visual Studio Code, and Android Studio. 
The OS can be Windows, Linux, or macOS. + - A machine that can run Python3, Visual Studio Code, and Android Studio. + - For the OS, you can use Windows, Linux, or macOS. author_primary: Dawid Borycki diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_next-steps.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_next-steps.md index 82cf1f985b..4c12b745e1 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_next-steps.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_next-steps.md @@ -4,7 +4,7 @@ # ================================================================================ next_step_guidance: > - Proceed to Use Keras Core with TensorFlow, PyTorch, and JAX backends to continue exploring Machine Learning. + To continue exploring Machine Learning, you can now learn about using Keras Core with TensorFlow, PyTorch, and JAX backends. # 1-3 sentence recommendation outlining how the reader can generally keep learning about these topics, and a specific explanation of why the next step is being recommended. diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_review.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_review.md index 8347d010f0..c25b83c564 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_review.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_review.md @@ -15,31 +15,31 @@ review: question: > Does the input layer of the model flatten the 28x28 pixel image into a 1D array of 784 elements? answers: - - "Yes" - - "No" + - "Yes." + - "No." correct_answer: 1 explanation: > Yes, the model uses nn.Flatten() to reshape the 28x28 pixel image into a 1D array of 784 elements for processing by the fully connected layers. 
- questions: question: > - Will the model make random predictions if it’s run before training? + Will the model make random predictions if it is run before training? answers: - - "Yes" - - "No" + - "Yes." + - "No." correct_answer: 1 explanation: > - Yes, however in such the case the model will produce random outputs, as the network has not been trained to recognize any patterns from the data. + Yes. In this scenario, the model produces random outputs because the network has not been trained to recognize any patterns from the data. - questions: question: > - Which loss function was used to train the PyTorch model on the MNIST dataset? + Which loss function did you use to train the PyTorch model on the MNIST dataset in this Learning Path? answers: - - Mean Squared Error Loss - - Cross Entropy Loss - - Hinge Loss + - Mean Squared Error Loss. + - Cross-Entropy Loss. + - Hinge Loss. - Binary Cross-Entropy Loss correct_answer: 2 explanation: > - Cross Entropy Loss was used to train the model because it is suitable for multi-class classification tasks like digit classification. It measures the difference between the predicted probabilities and the true class labels, helping the model learn to make accurate predictions. + Cross-Entropy Loss was used to train the model because it is suitable for multi-class classification tasks such as digit classification. It measures the difference between the predicted probabilities and the true class labels, helping the model learn to make accurate predictions. 
# ================================================================================ # FIXED, DO NOT MODIFY diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/app.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/app.md index d591afe578..5848fe0386 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/app.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/app.md @@ -7,29 +7,31 @@ weight: 10 layout: "learningpathall" --- -You are now ready to run the Android application. You can use an emulator or a physical device. - -The screenshots below show an emulator. +You are now ready to run the Android application. The screenshots below show an emulator, but you can also use a physical device. To run the app in Android Studio using an emulator, follow these steps: 1. Configure the Emulator: -* Go to Tools > Device Manager (or click the Device Manager icon on the toolbar). -* Click Create Device to set up a new virtual device (if you haven’t done so already). -* Choose a device model, such as Pixel 4, and click Next. -* Select a system image, such as Android 11, API level 30, and click Next. -* Review the settings and click Finish to create the emulator. + +* Go to **Tools** > **Device Manager**, or click the Device Manager icon on the toolbar. +* Click **Create Device** to set up a new virtual device, if you haven’t done so already. +* Choose a device model, such as the Pixel 4, and click **Next**. +* Select a system image, such as Android 11, API level 30, and click **Next**. +* Review the settings, and click **Finish** to create the emulator. 2. Run the App: -* Make sure the emulator is selected in the device dropdown menu in the toolbar (next to the “Run” button). -* Click the Run button (a green triangle). Android Studio will build the app, install it on the emulator, and launch it. -3. 
View the App on the Emulator: Once the app is installed, it will automatically open on the emulator screen, allowing you to interact with it as if it were running on a real device. +* Make sure the emulator is selected in the device drop-down menu in the toolbar, next to the **Run** button. +* Click the **Run** button, which is a green triangle. Android Studio builds the app, installs it on the emulator, and then launches it. + +3. View the App on the Emulator: + +* Once the app is installed, it automatically opens on the emulator screen, allowing you to interact with it as if it were running on a real device. -Once the application is started, click the Load Image button. It will load a randomly selected image. Then, click Run Inference to recognize the digit. The application will display the predicted label and the inference time as shown below: +Once the application starts, click the **Load Image** button. It loads a randomly-selected image. Then, click **Run Inference** to recognize the digit. The application displays the predicted label and the inference time as shown below: -![img](Figures/05.png) +![img alt-text#center](Figures/05.png "Figure 7. Digit Recognition 1") -![img](Figures/06.png) +![img alt-text#center](Figures/06.png "Figure 8. Digit Recognition 2") -In the next step you will learn how to further optimize the model. +In the next step of this Learning Path, you will learn how to further optimize the model. 
diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/datasets-and-training.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/datasets-and-training.md index d1e4991135..ce888b7f2a 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/datasets-and-training.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/datasets-and-training.md @@ -1,6 +1,6 @@ --- # User change -title: "Perform training and save the model" +title: "Perform Training and Save the Model" weight: 5 @@ -9,9 +9,9 @@ layout: "learningpathall" ## Prepare the MNIST data -Start by downloading the MNIST dataset. Proceed as follows: +Start by downloading the MNIST dataset. -1. Open the pytorch-digits.ipynb you created earlier. +1. Open the `pytorch-digits.ipynb` you created earlier. 2. Add the following statements: @@ -42,9 +42,15 @@ train_dataloader = DataLoader(training_data, batch_size=batch_size) test_dataloader = DataLoader(test_data, batch_size=batch_size) ``` -The above code snippet downloads the MNIST dataset, transforms the images into tensors, and sets up data loaders for training and testing. Specifically, the `datasets.MNIST` function is used to download the MNIST dataset, with `train=True` indicating training data and `train=False` indicating test data. The `transform=transforms.ToTensor()` argument converts each image in the dataset into a PyTorch tensor, which is necessary for model training and evaluation. +Using this code enables you to: -The DataLoader wraps the datasets and allows efficient loading of data in batches. It handles data shuffling, batching, and parallel loading. Here, the train_dataloader and test_dataloader are created with a batch_size of 32, meaning they will load 32 images per batch during training and testing. +* Download the MNIST dataset. +* Transform the images into tensors. 
+* Set up data loaders for training and testing. + +Specifically, the `datasets.MNIST` function downloads the MNIST dataset, with `train=True` indicating training data and `train=False` indicating test data. The `transform=transforms.ToTensor()` argument converts each image in the dataset into a PyTorch tensor, which is necessary for model training and evaluation. + +The `DataLoader` wraps the datasets and enables efficient loading of data in batches. It handles data shuffling, batching, and parallel loading. Here, the `train_dataloader` and `test_dataloader` are created with a `batch_size` of 32, meaning they will load 32 images per batch during training and testing. This setup prepares the training and test datasets for use in a machine learning model, enabling efficient data handling and model training in PyTorch. @@ -54,19 +60,21 @@ To run the above code, you will need to install certifi package: pip install certifi ``` -The certifi Python package provides the Mozilla root certificates, which are essential for ensuring the SSL connections are secure. If you’re using macOS, you may also need to install the certificates by running: +The certifi Python package provides the Mozilla root certificates, which are essential for ensuring the SSL connections are secure. If you’re using macOS, you might also need to install the certificates by running: ```console /Applications/Python\ 3.x/Install\ Certificates.command ``` -Make sure to replace `x` with the number of Python version you have installed. +{{% notice Note %}} +Make sure to replace 'x' with the version number of Python that you have installed. +{{% /notice %}} -After running the code you see output similar to the screenshot below: +After running the code, you will see output similar to Figure 5: -![image](Figures/01.png) +![image alt-text#center](Figures/01.png "Figure 5. Output") 
-# Train the model +## Train the Model To train the model, specify the loss function and the optimizer: @@ -77,7 +85,7 @@ loss_fn = nn.CrossEntropyLoss() optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate) ``` -Use CrossEntropyLoss as the loss function and the Adam optimizer for training. The learning rate is set to 1e-3. +Use `CrossEntropyLoss` as the loss function and the Adam optimizer for training. The learning rate is set to 1e-3. Next, define the methods for training and evaluating the feedforward neural network: @@ -111,7 +119,7 @@ def test_loop(dataloader, model, loss_fn): print(f"Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n") ``` -The first method, `train_loop`, uses the backpropagation algorithm to optimize the trainable parameters and minimize the prediction error of the neural network. The second method, `test_loop`, calculates the neural network error using the test images and displays the accuracy and loss values. +The first method, `train_loop`, uses the backpropagation algorithm to optimize the trainable parameters and minimize the prediction error rate of the neural network. The second method, `test_loop`, calculates the neural network error rate using the test images, and displays the accuracy and loss values. You can now invoke these methods to train and evaluate the model using 10 epochs. @@ -124,9 +132,9 @@ for t in range(epochs): test_loop(test_dataloader, model, loss_fn) ``` -After running the code, you see the following output showing the training progress. +After running the code, you see the following output showing the training progress, as displayed in Figure 2. -![image](Figures/02.png) +![image alt-text#center](Figures/02.png "Figure 2. Output 2") Once the training is complete, you see output similar to: @@ -139,13 +147,13 @@ The output shows the model achieved around 95% accuracy. # Save the model -Once the model is trained, you can save it. There are various approaches for this. 
In PyTorch, you can save both the model’s structure and its weights to the same file using the `torch.save()` function. Alternatively, you can save only the weights (parameters) of the model, not the model architecture itself. This requires you to have the model’s architecture defined separately when loading. To save the model weights, you can use the following command: +Once the model is trained, you can save it. There are various approaches for this. In PyTorch, you can save both the model’s structure and its weights to the same file using the `torch.save()` function. Alternatively, you can save only the weights of the model, not the model architecture itself. This requires you to have the model’s architecture defined separately when loading. To save the model weights, you can use the following command: ```Python torch.save(model.state_dict(), "model_weights.pth"). ``` -However, PyTorch does not save the definition of the class itself. When you load the model using `torch.load()`, PyTorch needs to know the class definition to recreate the model object. +However, PyTorch does not save the definition of the class itself. When you load the model using `torch.load()`, PyTorch requires the class definition to recreate the model object. Therefore, when you later want to use the saved model for inference, you will need to provide the definition of the model class. @@ -164,16 +172,22 @@ traced_model = torch.jit.trace(model, torch.rand(1, 1, 28, 28)) traced_model.save("model.pth") ``` -The above commands set the model to evaluation mode, trace the model, and save it. Tracing is useful for converting models with static computation graphs to TorchScript, making them portable and independent of the original class definition. +The above commands perform the following tasks: + +* They set the model to evaluation mode. +* They trace the model. +* They save it. 
+ +Tracing is useful for converting models with static computation graphs to TorchScript, making them flexible and independent of the original class definition. Setting the model to evaluation mode before tracing is important for several reasons: -1. Behavior of Layers like Dropout and BatchNorm: - * Dropout. During training, dropout randomly zeroes out some of the activations to prevent overfitting. During evaluation dropout is turned off, and all activations are used. +1. Behavior of Layers like Dropout and BatchNorm: + * Dropout. During training, dropout randomly zeroes out some of the activations to prevent overfitting. During evaluation, dropout is turned off, and all activations are used. * BatchNorm. During training, Batch Normalization layers use batch statistics to normalize the input. During evaluation, they use running averages calculated during training. -2. Consistent Inference Behavior. By setting the model to eval mode, you ensure that the traced model will behave consistently during inference, as it will not use dropout or batch statistics that are inappropriate for inference. +2. Consistent Inference Behavior. By setting the model to eval mode, you ensure that the traced model behaves consistently during inference, as it does not use dropout or batch statistics that are inappropriate for inference. -3. Correct Tracing. Tracing captures the operations performed by the model using a given input. If the model is in training mode, the traced graph may include operations related to dropout and batch normalization updates. These operations can affect the correctness and performance of the model during inference. +3. Correct Tracing. Tracing captures the operations performed by the model using a given input. If the model is in training mode, the traced graph might include operations related to dropout and batch normalization updates. These operations can affect the correctness and performance of the model during inference. 
In the next step, you will use the saved model for ML inference. diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/inference.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/inference.md index 9aed5754ee..343f3a822c 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/inference.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/inference.md @@ -1,32 +1,35 @@ --- # User change -title: "Use the model for inference" +title: "Deploy the Model for Inference" weight: 6 layout: "learningpathall" --- -The inference process involves using a trained model to make predictions on new, unseen data. It typically follows these steps: +You can use a trained model to make predictions on new, unseen data. This process is called inference, and it typically follows these steps: -1. **Load the Trained Model**: the model, along with its learned parameters - weights and biases - is loaded from a saved file. -2. **Prepare the Input Data**: the input data is pre-processed in the same way as during training, for example, normalization and tensor conversion, to ensure compatibility with the model. -3. **Make Predictions**: the pre-processed data is fed into the model, which computes the output based on its trained parameters. The output is often a probability distribution over possible classes. -4. **Interpret the Results**: the predicted class is usually the one with the highest probability. The results can then be used for further analysis or decision-making. +1. **Load the Trained Model**: load the trained model, along with its learned parameters (weights and biases), from a saved file. + +2. **Prepare the Input Data**: pre-process the input data in the same way as during training, for example with normalization and tensor conversion, to ensure compatibility with the model. + +3. 
**Feed Pre-Processed Data into the Model to compute predictions**: feed the pre-processed data into the model, which then computes the output based on its trained parameters. The output is often a probability distribution over possible classes. + +4. **Interpret the Results**: finally, you can interpret the results. The predicted class is usually the one with the highest probability. You can also use the results for further analysis or decision-making. This process allows the model to generalize its learned knowledge to make accurate predictions on new data. # Running inference in PyTorch -You can inference in PyTorch using the previously saved model. To display results, you can use matplotlib. +You can run inference in PyTorch using the previously-saved model. You can then use `matplotlib` to display the results. -Start by installing matplotlib package: +Start by installing the `matplotlib` package: ```console pip install matplotlib ``` -Use Visual Studio Code to create a new file named `pytorch-digits-inference.ipynb` and modify the file to include the code below: +Use Visual Studio Code to create a new file named `pytorch-digits-inference.ipynb`, and modify the file to include the code: ```python import torch @@ -83,32 +86,38 @@ plt.tight_layout() plt.show() ``` -The above code performs inference on the saved PyTorch model using 16 randomly-selected images from the MNIST test dataset and displays them along with their actual and predicted labels. +This code performs inference on the saved PyTorch model using 16 randomly-selected images from the MNIST test dataset, and then displays them alongside their predicted and actual labels. + +As before, start by importing the necessary Python libraries: -As before, start by importing the necessary Python libraries: torch, datasets, transforms, matplotlib.pyplot, and random. Torch is used for loading the model and performing tensor operations. 
Datasets and transforms from torchvision are used for loading and transforming the MNIST dataset. Use matplotlib.pyplot for plotting and displaying images, and random is used for selecting random images from the dataset. +* `torch` - for loading the model and performing tensor operations. +* `datasets` - for loading the MNIST dataset. +* `transforms` - for transforming the MNIST dataset. +* `matplotlib.pyplot` - for plotting and displaying images. +* `random` - for selecting random images from the dataset. -Next, load the MNIST test dataset using datasets.MNIST() with train=False to specify that it’s the test data. The dataset is automatically downloaded if it’s not available locally. +Next, load the MNIST test dataset using `datasets.MNIST()` with `train=False` to specify that it is the test data. The dataset is automatically downloaded if it is not available locally. -Load the saved model using torch.jit.load("model.pth") and set the model to evaluation mode using model.eval(). This ensures that layers like dropout and batch normalization behave appropriately during inference. +Load the saved model using `torch.jit.load("model.pth")` and set the model to evaluation mode using `model.eval()`. This ensures that layers like dropout and batch normalization behave appropriately during inference. -Subsequently, select 16 random images and create a 4x4 grid of subplots using plt.subplots(4, 4, figsize=(12, 12)) for displaying the images. +Then select 16 random images and create a 4x4 grid of subplots using `plt.subplots(4, 4, figsize=(12, 12))` for displaying the images. -Afterwards, perform inference and display the images in a loop. Specifically, for each of the 16 selected images, the image and its label are retrieved from the dataset using the random index. The image tensor is expanded to include a batch dimension (image.unsqueeze(0)) because the model expects a batch of images. Inference is performed with model(image_batch) to get the prediction. 
The predicted label is determined using torch.argmax() to find the index of the maximum probability in the output. Each image is displayed in its respective subplot with the actual and predicted labels. We use plt.tight_layout() to ensure that the layout is adjusted nicely, and plt.show() to display the 16 images with their actual and predicted labels. +Afterwards, perform inference and display the images in a loop. Specifically, for each of the 16 selected images, the image and its label are retrieved from the dataset using the random index. The image tensor is expanded to include a batch dimension (`image.unsqueeze(0)`) because the model expects a batch of images. Inference is performed with `model(image_batch)` to get the prediction. The predicted label is determined using `torch.argmax()` to find the index of the maximum probability in the output. Each image is displayed in its respective subplot with the actual and predicted labels. You can use `plt.tight_layout()` to ensure that the layout is well-adjusted, and `plt.show()` to display the 16 images with their predicted and actual labels. This code demonstrates how to use a saved PyTorch model for inference and visualization of predictions on a subset of the MNIST test dataset. After running the code, you should see results similar to the following figure: -![image](Figures/03.png) +![image](Figures/03.png "Figure 6. Results Displayed") -# What have you learned? +### What have you learned? -You have completed the process of training and using a PyTorch model for digit classification on the MNIST dataset. Using the training dataset, you optimized the model’s weights and biases over multiple epochs. You employed the CrossEntropyLoss function and the Adam optimizer to minimize prediction errors and improve accuracy. You periodically evaluated the model on the test dataset to monitor its performance, ensuring it was learning effectively without overfitting. 
+You have completed the process of training and using a PyTorch model for digit classification on the MNIST dataset. Using the training dataset, you optimized the model’s weights and biases over multiple epochs. You employed the `CrossEntropyLoss` function and the Adam optimizer to minimize prediction errors and improve accuracy. You periodically evaluated the model on the test dataset to monitor its performance, ensuring it was learning effectively without overfitting. -After training, you saved the model using TorchScript, which captures both the model’s architecture and its learned parameters. This made the model portable and independent of the original class definition, simplifying deployment. +After training, you saved the model using TorchScript, which captures both the model’s architecture and its learned parameters. This made the model portable and able to function independently of the original class definition, which simplifies deployment. Next, you performed inference. You loaded the saved model and set it to evaluation mode to ensure that layers like dropout and batch normalization behaved correctly during inference. You randomly selected 16 images from the MNIST test dataset to evaluate the model’s performance on unseen data. For each selected image, you used the model to predict the digit, comparing the predicted labels with the actual ones. You displayed the images alongside their actual and predicted labels in a 4x4 grid, visually assessing the model’s accuracy and performance. This comprehensive process, from model training and saving to inference and visualization, illustrates the end-to-end workflow for building and deploying a machine learning model in PyTorch. It demonstrates how to train a model, save it in a portable format, and then use it to make predictions on new data. -In the next step, you will learn how to use the model in an Android application. 
\ No newline at end of file +In the next step, you will learn how to use the model in an Android application. diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro-android.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro-android.md index 0e29ad2515..5af70ca8af 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro-android.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro-android.md @@ -1,6 +1,6 @@ --- # User change -title: "Understand inference on Android" +title: "Learn about Inference on Android" weight: 7 @@ -15,11 +15,13 @@ Arm provides a wide range of hardware and software accelerators designed to opti Running a machine learning model on Android involves a few key steps. -First, you train and save the model in a mobile-friendly format, such as TensorFlow Lite, ONNX, or TorchScript, depending on the framework you are using. +* You train and save the model in a mobile-friendly format, such as TensorFlow Lite, ONNX, or TorchScript, depending on the framework you are using. -Next, you add the model file to your Android project's assets directory. In your application's code, use the corresponding framework's Android library, such as TensorFlow Lite or PyTorch Mobile, to load the model. +* You add the model file to your Android project's assets directory. In your application's code, use the corresponding framework's Android library, such as TensorFlow Lite or PyTorch Mobile, to load the model. -You then prepare the input data, ensuring it is formatted and preprocessed in the same way as during model training. The input data is passed through the model, and the output predictions are retrieved and interpreted accordingly. For improved performance, you can leverage hardware acceleration using Android’s Neural Networks API (NNAPI) or use GPU support if available. 
This process enables the Android app to make real-time predictions and execute complex machine learning tasks directly on the device. +* You prepare the input data, ensuring it is formatted and preprocessed in the same way as during model training. The input data is passed through the model, and the output predictions are retrieved and interpreted accordingly. + +For improved performance, you can leverage hardware acceleration using Android’s Neural Networks API (NNAPI) or use GPU support if available. This process enables the Android app to make real-time predictions and execute complex machine learning tasks directly on the device. In this Learning Path, you will learn how to perform inference in an Android application using the pre-trained digit classifier from the previous sections. @@ -27,7 +29,7 @@ In this Learning Path, you will learn how to perform inference in an Android app Before you begin make [Android Studio](https://developer.android.com/studio/install) is installed on your system. -## Project source code +## Project Source Code The following steps explain how to build an Android application for MNIST inference. The application can be constructed from scratch, but there are two GitHub repositories available if you need to copy any files from them as you learn how to create the Android application. 
diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro-opt.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro-opt.md index 870aa445da..4a429b88c5 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro-opt.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro-opt.md @@ -1,36 +1,36 @@ --- # User change -title: "Optimizing neural network models in PyTorch" +title: "Optimizing Neural Network Models in PyTorch" weight: 11 layout: "learningpathall" --- -## Optimizing models +## Optimizing Models Optimizing models is crucial to achieving efficient performance while minimizing resource consumption. -Because mobile and edge devices can have limited computational power, memory, and energy availability, various strategies are used to ensure that ML models can run effectively in these constrained environments. +As mobile and edge devices can have limited computational power, memory, and energy availability, various strategies can be deployed to ensure that ML models can run effectively in these constrained environments. ### Quantization -Quantization is one of the most widely used techniques, which reduces the precision of the model's weights and activations from floating-point to lower-bit representations, such as int8 or float16. This not only reduces the model size but also accelerates inference speed on hardware that supports lower precision arithmetic. +Quantization is one of the most widely used techniques, which reduces the precision of the model's weights and activations from floating-point to lower-bit representations, such as int8 or float16. This not only reduces the model size but also accelerates inference speed on hardware that supports low-precision arithmetic. 
-### Layer fusion +### Layer Fusion -Another key optimization strategy is layer fusion, where multiple operations, such as combining linear layers with their subsequent activation functions (like ReLU), into a single layer. This reduces the number of operations that need to be executed during inference, minimizing latency and improving throughput. +Another key optimization strategy is layer fusion. Layer fusion involves combining linear layers with their subsequent activation functions, such as ReLU, into a single layer. This reduces the number of operations that need to be executed during inference, minimizing latency and improving throughput. ### Pruning -In addition to these techniques, pruning, which involves removing less important weights or neurons from the model, can help in creating a leaner model that requires fewer resources without significantly affecting accuracy. +In addition to these techniques, pruning, which involves removing less significant weights or neurons from the model, can help in creating a leaner model that requires fewer resources without markedly affecting accuracy. ### Android NNAPI Leveraging hardware-specific optimizations, such as the Android Neural Networks API (NNAPI) allows you to take full advantage of the underlying hardware acceleration available on edge devices. -### More on optimization +### More on Optimization By employing these strategies, you can significantly enhance the efficiency of ML models for deployment on mobile and edge platforms, ensuring a balance between performance and resource utilization. @@ -46,7 +46,7 @@ PyTorch’s integration with hardware acceleration libraries, such as NNAPI for Overall, PyTorch provides a comprehensive ecosystem that empowers developers to implement effective optimizations for mobile and edge deployment, enhancing both speed and efficiency. 
-### Optimization Next steps +### Optimization Next Steps In the following sections, you will delve into the techniques of quantization and fusion using the previously created neural network model and Android application. @@ -62,4 +62,4 @@ After adjusting the training pipeline to produce an optimized version of the mod Once these changes are made, you will modify the Android application to load either the original or the optimized model based on user input, allowing you to switch between them dynamically. -This setup enables you to compare the inference speed of both models on the device, providing valuable insights into the performance benefits of model optimization techniques in real-world scenarios. \ No newline at end of file +This setup enables you to compare the inference speed of both models on the device, providing valuable insights into the performance benefits of model optimization techniques in real-world scenarios. diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro.md index 536228cccf..9d61aacf4b 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro.md @@ -1,6 +1,6 @@ --- # User change -title: "Prepare a PyTorch development environment" +title: "Prepare a PyTorch Development Environment" weight: 2 @@ -9,16 +9,19 @@ layout: "learningpathall" ## Introduction to PyTorch -PyTorch is an open-source deep learning framework that is developed by Meta AI and is now part of the Linux Foundation. +Meta AI has designed an open-source deep learning framework called PyTorch, which is now part of the Linux Foundation. -PyTorch is designed to provide a flexible and efficient platform for building and training neural networks.
It is widely used due to its dynamic computational graph, which allows users to modify the architecture during runtime, making debugging and experimentation easier. +PyTorch provides a flexible and efficient platform for building and training neural networks. It has a dynamic computational graph that allows users to modify the architecture during runtime, making debugging and experimentation easier, and therefore makes it popular amongst developers. -PyTorch's objective is to provide a more flexible, user-friendly deep learning framework that addresses the limitations of static computational graphs found in earlier tools like TensorFlow. +PyTorch provides a more flexible, user-friendly deep learning framework that reduces the limitations of static computational graphs found in earlier tools, such as TensorFlow. -Prior to PyTorch, many frameworks used static computation graphs that require the entire model structure to be defined before training, making experimentation and debugging cumbersome. PyTorch introduced dynamic computational graphs, also known as “define-by-run”, that allow the graph to be constructed dynamically as operations are executed. This flexibility significantly improves ease of use for researchers and developers, enabling faster prototyping, easier debugging, and more intuitive code. +Prior to PyTorch, many frameworks used static computational graphs that require the entire model structure to be defined before training, which makes experimentation and debugging cumbersome. PyTorch introduced dynamic computational graphs, also known as “define-by-run”, that allow the graph to be constructed dynamically as operations are executed. This flexibility significantly improves ease of use for researchers and developers, enabling: +* Faster prototyping. +* Easier debugging. +* More intuitive code. -Additionally, PyTorch seamlessly integrates with Python, encouraging a native coding experience. 
Its deep integration with GPU acceleration also makes it a powerful tool for both research and production environments. This combination of flexibility, usability, and performance has contributed to PyTorch’s rapid adoption, especially in academic research, where experimentation and iteration are crucial. +PyTorch also seamlessly integrates with Python, which creates a native coding experience. Its deep integration with GPU acceleration also makes it a powerful tool for both research and production environments. This combination of flexibility, usability, and performance has ensured PyTorch’s rapid adoption, particularly in academic research, where experimentation and iteration are crucial activities. A typical process for creating a feedforward neural network in PyTorch involves defining a sequential stack of fully-connected layers, which are also known as linear layers. Each layer transforms the input by applying a set of weights and biases, followed by an activation function like ReLU. PyTorch supports this process using the torch.nn module, where layers are easily defined and composed. @@ -28,31 +31,31 @@ In this Learning Path, you will explore how to use PyTorch to create and train a ## Before you begin -Before you begin make sure Python3 is installed on your system. You can check this by running: +Before you begin, make sure Python3 is installed on your system. You can check this by running: ```console python3 --version ``` -The expected output is the Python version, for example: +You should then see the Python version printed in the output, for example: ```output Python 3.11.2 ``` -If Python3 is not installed, download and install it from [python.org](https://www.python.org/downloads/). +If Python3 is not installed, you can download and install it from [python.org](https://www.python.org/downloads/). Alternatively, you can also install Python3 using package managers such as Homebrew or APT. 
-If you are using Windows on Arm you can refer to the [Python install guide](https://learn.arm.com/install-guides/py-woa/). +If you are using Windows on Arm, see the [Python install guide](https://learn.arm.com/install-guides/py-woa/). -Next, download and install [Visual Studio Code](https://code.visualstudio.com/download). +Next, if you do not already have it, download and install [Visual Studio Code](https://code.visualstudio.com/download). ## Install PyTorch and additional Python packages -To prepare a virtual Python environment, install PyTorch, and the additional tools you will need for this Learning Path: +To prepare a virtual Python environment, first you need to install PyTorch, and then move on to installing the additional tools that you will need for this Learning Path. -1. Open a terminal or command prompt and navigate to your project directory. +1. Open a terminal or command prompt, and navigate to your project directory. 2. Create a virtual environment by running: @@ -60,29 +63,29 @@ To prepare a virtual Python environment, install PyTorch, and the additional too python -m venv pytorch-env ``` -This will create a virtual environment named pytorch-env. +This creates a virtual environment called `pytorch-env`. 3. Activate the virtual environment: -* On Windows: +* On Windows, run the following: ```console pytorch-env\Scripts\activate ``` -* On macOS or Linux: +* On macOS or Linux, run this code: ```console source pytorch-env/bin/activate ``` -Once activated, you see the virtual environment name `(pytorch-env)` before your terminal prompt. +Once activated, you can see the virtual environment name `(pytorch-env)` before your terminal prompt. -3. Install PyTorch using Pip: +4. Install PyTorch using Pip: ```console pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu ``` -4. Install torchsummary, Jupyter and IPython Kernel: +5. 
Install torchsummary, Jupyter and IPython Kernel: ```console pip install torchsummary @@ -90,28 +93,28 @@ pip install jupyter pip install ipykernel ``` -5. Register your virtual environment as a new kernel: +6. Register your virtual environment as a new kernel: ```console python3 -m ipykernel install --user --name=pytorch-env ``` -6. Install the Jupyter Extension in VS Code: +7. Install the Jupyter Extension in VS Code: -* Open VS Code and go to the Extensions view (click on the Extensions icon or press Ctrl+Shift+X). +* Open VS Code and go to the **Extensions** view, by clicking on the **Extensions** icon or pressing Ctrl+Shift+X. * Search for “Jupyter” and install the official Jupyter extension. -* Optionally, also install the Python extension if you haven’t already, as it improves Python language support in VS Code. +* Optionally, also install the Python extension if you have not already, as it improves Python language support in VS Code. -To ensure everything is set up correctly: +To ensure everything is set up correctly, follow these next steps: 1. Open Visual Studio Code. -2. Click New file, and select `Jupyter Notebook .ipynb Support`. +2. Click **New file**, and select `Jupyter Notebook .ipynb Support`. 3. Save the file as `pytorch-digits.ipynb`. -4. Select the Python kernel you created earlier (pytorch-env). To do so, click Kernels in the top right corner. Then, click Jupyter Kernel..., and you will see the Python kernel as shown below: +4. Select the Python kernel you created earlier, `pytorch-env`. To do so, click **Kernels** in the top right-hand corner. Then, click **Jupyter Kernel...**, and you will see the Python kernel as shown below: -![img1](Figures/1.png) +![img1 alt-text#center](Figures/1.png "Figure 1: Python kernel.") 5. 
In your Jupyter notebook, run the following code to verify PyTorch is working correctly: @@ -121,6 +124,6 @@ print(torch.__version__) ``` It will look as follows: -![img2](Figures/2.png) +![img2 alt-text#center](Figures/2.png "Figure 2: Jupyter Notebook.") -With your development environment created, you can proceed to creating a PyTorch model. +Now that you have set up your development environment, you can move on to creating a PyTorch model. diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro2.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro2.md index c1e3a10219..dbbdd5b621 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro2.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro2.md @@ -1,27 +1,35 @@ --- # User change -title: "About PyTorch model training" +title: "About PyTorch Model Training" weight: 4 layout: "learningpathall" --- -## PyTorch model training +## Training -In the previous section, you created a feedforward neural network for digit classification using the MNIST dataset. The network was left untrained and lacks the ability to make accurate predictions. +In the previous section, you created a feedforward neural network for digit classification using the MNIST dataset. To enable the network to recognize handwritten digits effectively and make accurate predictions, you now need to train it. -To enable the network to recognize handwritten digits effectively, training is needed. Training in PyTorch involves configuring the network's parameters, such as weights and biases, by exposing the model to labeled data and iteratively adjusting these parameters to minimize prediction errors. This process allows the model to learn the patterns in the data, enabling it to make accurate classifications on new, unseen inputs.
+Training in PyTorch involves exposing the model to labeled data and iteratively configuring the network's parameters. These parameters, such as the weights and biases, can be adjusted to reduce the number of prediction errors. This process allows the model to learn the patterns in the data, enabling it to make accurate classifications on new, unseen inputs. -The typical approach to training a neural network in PyTorch involves several key steps. +The typical approach to training a neural network in PyTorch involves several key steps: -First, obtain and preprocess the dataset, which usually includes normalizing the data and converting it into a format suitable for the model. +* Preprocess the dataset, for example normalize the data and convert it into a suitable format. -Next, the dataset is split into training and testing subsets. Training data is used to update the model's parameters, while testing data evaluates its performance. During training, feed batches of input data through the network, calculate the prediction error or loss using a loss function (such as cross-entropy for classification tasks), and optimize the model's weights and biases using backpropagation. Backpropagation involves computing the gradient of the loss with respect to each parameter and then updating the parameters using an optimizer, like Stochastic Gradient Descent (SGD) or Adam. This process is repeated for multiple epochs until the model achieves satisfactory performance, balancing accuracy and generalization. +* Divide the dataset into training and testing subsets. You can use training data to update the model's parameters, and testing data to evaluate its performance. + +* Feed batches of input data through the network. + +* Calculate the prediction error or loss using a loss function, such as Cross-Entropy for classification tasks. + +* Optimize the model's weights and biases using backpropagation. 
Backpropagation involves computing the gradient of the loss with respect to each parameter and then updating the parameters using an optimizer, like Stochastic Gradient Descent (SGD) or Adam. + +* Repeat the process for multiple epochs until the model achieves satisfactory performance, balancing accuracy and generalization. ### Loss, gradients, epoch and backpropagation -Loss is a measure of how well a model's predictions match the true labels of the data. It quantifies the difference between the predicted output and the actual output. The lower the loss, the better the model's performance. In classification tasks, a common loss function is Cross-Entropy Loss, while Mean Squared Error (MSE) is often used for regression tasks. The goal of training is to minimize the loss, which indicates that the model's predictions are getting closer to the actual labels. +Loss is a measure of how well a model's predictions match the true labels of the data. It quantifies the difference between the predicted output and the actual output. The lower the loss, the better the model's performance. In classification tasks, a common loss function is Cross-Entropy Loss, while Mean Squared Error (MSE) is often used for regression tasks. The goal of training is to minimize the loss, and get the model's predictions closer to the actual labels. Gradients represent the rate of change of the loss with respect to each of the model's parameters (weights and biases). They are used to update the model's parameters in the direction that reduces the loss. Gradients are calculated during the backpropagation step, where the loss is propagated backward through the network to compute how each parameter contributes to the overall loss. Optimizers like SGD or Adam use these gradients to adjust the parameters, effectively “teaching” the model to improve its predictions. 
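The relationship between loss, gradients, and parameter updates described above can be sketched in plain Python. This is a toy illustration, not the Learning Path's training code: it treats three logits as the only trainable parameters, and the helper names (`softmax`, `cross_entropy`, `grad_wrt_logits`, `train`) and the learning rate are assumptions chosen for the example. In PyTorch, `loss.backward()` and an optimizer perform the equivalent steps for every weight and bias in the network.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    # Loss: negative log-probability assigned to the true class.
    return -math.log(softmax(logits)[label])


def grad_wrt_logits(logits, label):
    # Gradient of the cross-entropy loss with respect to each logit:
    # softmax probability minus the one-hot encoded label.
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


def train(logits, label, lr=0.5, epochs=5):
    # Toy loop: each "epoch" records the loss, then takes one
    # gradient-descent step on the logits themselves.
    losses = []
    for _ in range(epochs):
        losses.append(cross_entropy(logits, label))
        grads = grad_wrt_logits(logits, label)
        logits = [v - lr * g for v, g in zip(logits, grads)]
    return logits, losses


final_logits, losses = train([0.0, 0.0, 0.0], label=0)
print(losses)  # the loss shrinks from one epoch to the next
```

Each pass lowers the loss because the update moves the parameters against the gradient, which is the behavior that optimizers such as SGD or Adam build on.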
diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/mobile-app.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/mobile-app.md index fe897f8171..cfbc922d4d 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/mobile-app.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/mobile-app.md @@ -21,9 +21,9 @@ Start by modifying the `activity_main.xml` by adding a `CheckBox` to use the opt android:textSize="16sp"/> ``` -Copy the optimized model to the `assets` folder of the Android project. +Copy the optimized model to the `assets` folder in the Android project. -Replace the `MainActivity.kt` by the following code: +Replace the code in the `MainActivity.kt` file with the following Kotlin code: ```Kotlin package com.arm.armpytorchmnistinference @@ -214,7 +214,7 @@ class MainActivity : AppCompatActivity() { } ``` -The updated version of the Android application includes modifications to the Android Activity to dynamically load the model based on the state of the `CheckBox`. +The updated version of the Android application includes modifications to the Android Activity source code to dynamically load the model based on the state of the `CheckBox`. When the `CheckBox` is selected, the app loads the optimized model, which is quantized and fused for improved performance. @@ -222,22 +222,22 @@ If the `CheckBox` is not selected, the app loads the original model. After the model is loaded, the inference is run. To better estimate the execution time, the `runInference()` method executes the inference 100 times in a loop. This provides a more reliable measure of the average inference time by smoothing out any inconsistencies from single executions. -The results for a run on a physical device are shown below.
These results indicate that, on average, the optimized model reduced the inference time to about 65% of the original model's execution time, showing a significant improvement in performance. +The results for a run on a physical device are shown below. These results indicate that, on average, the optimized model reduced the inference time to about 65% of the original model's execution time, which demonstrates a significant improvement in performance. This optimization showcases the benefits of quantization and layer fusion for mobile inference, and there is further potential for enhancement by enabling hardware acceleration on supported devices. -This would allow the model to take full advantage of the device's computational capabilities, potentially reducing the inference time even more. +This would allow the model to take full advantage of the device's computational capabilities, potentially further reducing the inference time. -![fig](Figures/07.jpg) +![fig alt-text#center](Figures/07.jpg "Figure 9.") -![fig](Figures/08.jpg) +![fig alt-text#center](Figures/08.jpg "Figure 10.") -# What have you learned? +### What have you learned? You have successfully optimized a neural network model for mobile inference using quantization and layer fusion. -Quantization and layer fusion removed unnecessary elements such as dropout layers during inference. +Quantization and layer fusion remove unnecessary elements such as dropout layers during inference. By running multiple iterations of the inference process, you learned that the optimized model significantly reduced the average inference time to around 65% of the original time. -You also learned that there is potential for further performance improvements by leveraging hardware acceleration. \ No newline at end of file +You also learned that there is potential for further performance improvements by leveraging hardware acceleration.
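The averaging approach used by `runInference()` can be illustrated with a short Python sketch. The app itself is written in Kotlin, so this is not its code: the function names (`average_time_ms`, `relative_time`), the stand-in workload, and the example timings of 6.5 ms and 10.0 ms are all hypothetical. They only show the arithmetic behind an averaged measurement and a comparison such as "about 65% of the original time".

```python
import time


def average_time_ms(fn, runs=100):
    # Call fn repeatedly and return the mean wall-clock time per call
    # in milliseconds, which smooths out variation between single runs.
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


def relative_time(optimized_ms, original_ms):
    # Fraction of the original inference time that the optimized model needs.
    return optimized_ms / original_ms


def fake_inference():
    # Hypothetical stand-in for a model call.
    return sum(i * i for i in range(1000))


print(f"Average per run: {average_time_ms(fake_inference):.4f} ms")

# Illustrative numbers only, not measurements from the device.
print(f"Relative time: {relative_time(6.5, 10.0):.0%}")
```

Timing many runs and dividing by the run count is a simple way to reduce the effect of one-off delays, such as a cold cache on the first call.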
diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model-opt.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model-opt.md index dc5b8556eb..76703398e3 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model-opt.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model-opt.md @@ -7,15 +7,19 @@ weight: 12 layout: "learningpathall" --- -You can create and train an optimized feedforward neural network to classify handwritten digits from the MNIST dataset. As a reminder, the dataset contains 70,000 images, comprising 60,000 training and 10,000 testing images, of handwritten numerals (0-9), each with dimensions of 28x28 pixels. - -This time you will introduce several changes to enable model quantization and fusing. +You can create and train an optimized feedforward neural network to classify handwritten digits from the MNIST dataset. This time you will introduce several changes to enable model quantization and fusing. # Model architecture -Start by creating a new notebook named `pytorch-digits-model-optimisations.ipynb`. +Start by creating a new notebook named `pytorch-digits-model-optimizations.ipynb`. + +Then define the model architecture using the code below. + +{{% notice Note %}} +You can also find the source code on [GitHub](https://github.com/dawidborycki/Arm.PyTorch.MNIST.Inference.Python). +{{% /notice %}} + -Then define the model architecture using the code below. You can also find the source code on [GitHub](https://github.com/dawidborycki/Arm.PyTorch.MNIST.Inference.Python) ```python import torch @@ -53,13 +57,13 @@ class NeuralNetwork(nn.Module): return x # Outputs raw logits ``` -This code defines a neural network in PyTorch for digit classification, consisting of three linear layers with ReLU activations and optional dropout layers for regularization.
The network first flattens the input (a 28x28 image) and passes it through two linear layers, each followed by a ReLU activation and a dropout layer (if enabled). The final layer produces raw logits as the output. Notably, the softmax layer has been removed to enable quantization and layer fusion during model optimization, allowing better performance when deploying the model on mobile or edge devices. +This code defines a neural network in PyTorch for digit classification, consisting of three linear layers with ReLU activations and optional dropout layers for regularization. The network first flattens the input, which is a 28x28 image, and passes it through two linear layers, each followed by a ReLU activation and, if enabled, a dropout layer. The final layer produces raw logits as the output. Notably, the softmax layer has been removed to enable quantization and layer fusion during model optimization, allowing better performance when deploying the model on mobile or edge devices. The output is left as logits, and the softmax function can be applied during post-processing, particularly during inference. This model includes dropout layers, which are used during training to randomly set a portion of the neurons to zero in order to prevent overfitting and improve generalization. -The `use_dropout` parameter allows you to enable or disable dropout, with the option to bypass dropout by replacing it with an `nn.Identity` layer when set to `False`, which is typically done during inference or quantization for more consistent behavior. +The `use_dropout` parameter allows you to enable or disable dropout, with the option to bypass dropout by replacing it with an `nn.Identity` layer when set to `False`, which is typically done during inference or quantization for more consistent behavior.
Add the following lines to display the model architecture: @@ -69,7 +73,7 @@ model = NeuralNetwork() summary(model, (1, 28, 28)) ``` -After running the code, you see the following output: +After running the code, you will see the following output: ```output ---------------------------------------------------------------- @@ -98,14 +102,14 @@ Estimated Total Size (MB): 0.41 The output shows the structure of the neural network, including the layers, their output shapes, and the number of parameters. * The network starts with a Flatten layer, which reshapes the input from [1, 28, 28] to [1, 784] without adding any parameters. -* This is followed by two Linear (fully connected) layers with ReLU activations and optional Dropout layers in between, contributing to the parameter count. -* The first linear layer (from 784 to 96 units) has 75,360 parameters, while the second (from 96 to 256 units) has 24,832 parameters. +* This is followed by two fully-connected linear layers, which contribute to the parameter count, with ReLU activations and optional Dropout layers in between. +* The first linear layer, from 784 to 96 units, has 75,360 parameters, while the second, from 96 to 256 units, has 24,832 parameters. * The final linear layer, which outputs raw logits for the 10 classes, has 2,570 parameters. -* The total number of trainable parameters in the model is 102,762, with no non-trainable parameters. +* The total number of trainable parameters in the model is 102,762, with no non-trainable parameters. # Training the model -Now add the data loading, train, and test loops to actually train the model. This proceeds exactly the same as in the original model: +Now add the data loading, training, and testing loops to train the model.
This process is the same as with the original model: ``` from torchvision import transforms, datasets @@ -175,21 +179,21 @@ for t in range(epochs): test_loop(test_dataloader, model, loss_fn) ``` -You begin by preparing the MNIST dataset for training and testing our neural network model. +Begin by preparing the MNIST dataset for training and testing the neural network model. -Using the torchvision library, you download the MNIST dataset and apply a transformation to convert the images into tensors, making them suitable for input into the model. +Using the torchvision library, download the MNIST dataset and apply a transformation to convert the images into tensors, making them suitable for input into the model. Next, create two data loaders: one for the training set and one for the test set, each configured with a batch size of 32. These data loaders allow you to easily feed batches of images into the model during training and testing. -Next, define a training loop, which is the core of the model’s learning process. For each batch of images and labels, the model generates predictions, and you calculate the cross-entropy loss to measure how far off the predictions are from the true labels. +Next, define a training loop, which is at the core of the model’s learning process. For each batch of images and labels, the model generates predictions, and you calculate the cross-entropy loss to measure how far off the predictions are from the true labels. The Adam optimizer is used to perform backpropagation, updating the model's weights to reduce this error. The process repeats for every batch in the training dataset, gradually improving model accuracy over time. -To ensure the model is learning effectively, you also define a testing loop. +To ensure the model is learning effectively, you need to define a testing loop. -Here, the model is evaluated on a separate set of test images that it hasn't seen during training. 
You calculate both the average loss and the accuracy of the predictions, giving a clear sense of how well the model is performing. Importantly, this evaluation is done without updating the model's weights, as the goal is simply to measure its performance. +Here, the model is evaluated on a separate set of test images that it has not seen during training. You can calculate both the average loss and the accuracy of the predictions, and it will give you a clear sense of how well the model is performing. This evaluation must be done without updating the model's weights, as the goal is simply to measure its performance. -Finally, run the training and testing loops over the course of 10 epochs. With each epoch, the model trains on the full training dataset, and afterward, you test it to monitor its progress. By the end of the process, the model has learned to classify the MNIST digits with a high degree of accuracy, as reflected in the final test results. +Finally, run the training and testing loops over the course of 10 epochs. With each epoch, the model trains on the full training dataset, and afterwards, you can test it to monitor its progress. By the end of the process, the model has learned to classify the MNIST digits with a high degree of accuracy, as reflected in the final test results. This setup efficiently trains and evaluates the model for digit classification, providing feedback after each epoch on accuracy and loss. @@ -227,8 +231,8 @@ Epoch 10: Accuracy: 96.5%, Avg loss: 0.137004 ``` -The above shows a similar accuracy as the original model. +These results show a similar rate of accuracy as the original model. You now have the trained model with the modified architecture. -In the next step you will optimize it for mobile inference. \ No newline at end of file +In the next step you will optimize it for mobile inference. 
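The parameter counts reported in the model summary can be checked by hand: a fully-connected layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights plus `n_out` biases. The following is a minimal sketch, where `linear_params` is a helper name assumed for this example and the layer sizes are the ones used by the network in this Learning Path:

```python
def linear_params(n_in, n_out):
    # Weights (n_in * n_out) plus one bias per output unit.
    return n_in * n_out + n_out


# (inputs, outputs) for the three linear layers of the digit classifier.
layers = [(784, 96), (96, 256), (256, 10)]

counts = [linear_params(n_in, n_out) for n_in, n_out in layers]
total = sum(counts)

print(counts)  # [75360, 24832, 2570]
print(total)   # 102762
```

Flatten, ReLU, and Dropout layers have no trainable parameters, so only the linear layers appear in the total.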
diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model.md index 1c03a3e1f5..1db5d1e793 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model.md @@ -7,23 +7,23 @@ weight: 3 layout: "learningpathall" --- -You can create and train a feedforward neural network to classify handwritten digits from the MNIST dataset. This dataset contains 70,000 images, comprised of 60,000 training images and 10,000 testing images, of handwritten numerals (0-9), each with dimensions of 28x28 pixels. Some representative MNIST digits with their corresponding labels are shown below. +You can create and train a feedforward neural network to classify handwritten digits from the MNIST dataset. This dataset contains 70,000 images, comprising 60,000 training images and 10,000 testing images of handwritten numerals (0-9), each with dimensions of 28x28 pixels. Some representative MNIST digits with their corresponding labels are shown in Figure 3: -![img3](Figures/3.png) +![img3 alt-text#center](Figures/3.png "Figure 3: MNIST Digits and Labels.") -The neural network begins with an input layer containing 28x28 = 784 input nodes, with each node accepting a single pixel from an MNIST image. +The neural network begins with an input layer containing 28x28 = 784 input nodes, with each node accepting a single pixel from an MNIST image. -You will add a linear hidden layer with 96 nodes, using the hyperbolic tangent (tanh) activation function. To prevent overfitting, a dropout layer is applied, randomly setting 20% of the nodes to zero. +You will add a linear hidden layer with 96 nodes, using the hyperbolic tangent (tanh) activation function.
To prevent overfitting, you will apply a dropout layer, randomly setting 20% of the nodes to zero. -You will then include another hidden layer with 256 nodes, followed by a second dropout layer that again removes 20% of the nodes. Finally, the output layer consists of ten nodes, each representing the probability of recognizing one of the digits (0-9). +You will then include another hidden layer with 256 nodes, followed by a second dropout layer that again removes 20% of the nodes. Finally, you will add an output layer of ten nodes, each representing the probability of recognizing one of the digits (0-9). The total number of trainable parameters for this network is calculated as follows: -* First hidden layer: 784 x 96 + 96 = 75,360 parameters (weights + biases). +* First hidden layer: 784 x 96 + 96 = 75,360 parameters (weights and biases). * Second hidden layer: 96 x 256 + 256 = 24,832 parameters. * Output layer: 256 x 10 + 10 = 2,570 parameters. -In total, the network will have 102,762 trainable parameters. +In total, the network has 102,762 trainable parameters. # Implementation @@ -58,7 +58,9 @@ class NeuralNetwork(nn.Module): return logits ``` -To build the neural network in PyTorch, define a class that inherits from PyTorch’s nn.Module. This approach is similar to TensorFlow’s subclassing API. In this case, define a class named NeuralNetwork, which consists of two main components: +To build the neural network in PyTorch, define a class that inherits from PyTorch’s nn.Module. This approach is similar to TensorFlow’s subclassing API. + +Define a class named NeuralNetwork, which consists of two main components: 1. __init__ method @@ -69,13 +71,14 @@ First initialize the nn.Module with super(NeuralNetwork, self).__init__(). Insid Next, create a sequential stack of layers using nn.Sequential. The network consists of: + * A fully-connected (Linear) layer with 96 nodes, followed by the Tanh activation function. 
* A Dropout layer with a 20% dropout rate to prevent overfitting. * A second Linear layer, with 256 nodes, followed by the Sigmoid activation function. * Another Dropout layer, that removes 20% of the nodes. * A final Linear layer, with 10 nodes (matching the number of classes in the dataset), followed by a Softmax activation function that outputs class probabilities. -2. forward method +2. Forward method This method defines the forward pass of the network. It takes an input tensor x, flattens it using self.flatten, and then passes it through the defined sequential stack of layers (self.linear_stack). @@ -89,9 +92,9 @@ model = NeuralNetwork() summary(model, (1, 28, 28)) ``` -After running the notebook, you will see the following output: +After running the notebook, you will see the output as shown in Figure 4: -![img4](Figures/4.png) +![img4 alt-text#center](Figures/4.png "Figure 4: Notebook Output.") You will see a detailed summary of the NeuralNetwork model’s architecture, including the following information: diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/optimisation.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/optimisation.md index 06778bf470..67569d0789 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/optimisation.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/optimisation.md @@ -7,7 +7,7 @@ weight: 13 layout: "learningpathall" --- -To optimize the model use the `pytorch-digits-model-optimisations.ipynb` to add the following lines: +To optimize the model, use the `pytorch-digits-model-optimizations.ipynb` to add the following lines: ```python from torch.utils.mobile_optimizer import optimize_for_mobile @@ -62,4 +62,4 @@ Finally, the traced model is optimized for mobile using `optimize_for_mobile()`, The optimized model is saved in a format suitable for the PyTorch Lite Interpreter for 
efficient deployment on mobile platforms. -The result is an optimized and quantized model stored as `"optimized_model.ptl"`, ready for deployment. \ No newline at end of file +The result is an optimized and quantized model stored as `"optimized_model.ptl"`, ready for deployment. diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/prepare-data.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/prepare-data.md index c16cb4c208..4affa57970 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/prepare-data.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/prepare-data.md @@ -1,21 +1,21 @@ --- # User change -title: "Prepare Test Data" +title: "Prepare the Test Data" weight: 9 layout: "learningpathall" --- -In this section you will add the pre-trained model and copy the bitmap image data to the Android project. +In this section, you will add the pre-trained model and copy the bitmap image data to the Android project. ## Model To add the model, create a folder named `assets` in the `app/src/main` folder. -Copy the pre-trained model you created in the previous steps, `model.pth` to the `assets` folder. +Copy the pre-trained model, named `model.pth`, to the `assets` folder. -The model is also available in the [GitHub repository](https://github.com/dawidborycki/Arm.PyTorch.MNIST.Inference.git) if you need to copy it. +The model is also available in the [GitHub repository](https://github.com/dawidborycki/Arm.PyTorch.MNIST.Inference.git) if you require it. ## Image data @@ -66,16 +66,18 @@ for i, (image, label) in enumerate(test_data): break ``` -The above code processes the MNIST test dataset to generate and save bitmap images for digit classification. +This code processes the MNIST test dataset to generate and save bitmap images for digit classification. 
It defines constants for the number of unique digits (0-9) and the number of examples to collect per digit. The dataset is loaded using `torchvision.datasets` with a transformation to convert images to tensors. A directory named `mnist_bitmaps` is created to store the images. A dictionary tracks the number of collected examples for each digit. The code iterates through the dataset, converting each image tensor back to a PIL image, and saves two examples of each digit in the format `digit_index_example_index.png`. -The loop breaks once the specified number of examples per digit is saved, ensuring that exactly 20 images (2 per digit) are generated and stored in the specified directory. +The loop breaks once the specified number of examples per digit is saved, ensuring that exactly 20 images, two per digit, are generated and stored in the specified directory. -For your convenience the data is included in the [GitHub repository](https://github.com/dawidborycki/Arm.PyTorch.MNIST.Inference.git) +{{% notice Note %}} +This data is included in the [GitHub repository](https://github.com/dawidborycki/Arm.PyTorch.MNIST.Inference.git). +{{% /notice %}} Copy the `mnist_bitmaps` folder to the `assets` folder. -Once you have the `model.pth` and the `mnist_bitmaps` folder in the `assets` folder continue to the next step to run the Android application. \ No newline at end of file +Once you have the `model.pth` and the `mnist_bitmaps` folder in the `assets` folder, continue to the next step to run the Android application. 
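The collection logic described in this hunk (a per-digit counter, two examples per digit, and an early exit once all 20 images are found) can be sketched without PyTorch as follows. This is a simplified stand-in under stated assumptions: `select_examples`, the constant names, and the label stream are hypothetical, and the zero-padded file name pattern is inferred from the `07_00.png` example used elsewhere in this Learning Path.

```python
# Assumed constants, mirroring the description (10 digits, 2 examples each).
NUM_DIGITS = 10
EXAMPLES_PER_DIGIT = 2

def select_examples(labels):
    """Return file names for the first EXAMPLES_PER_DIGIT samples of each digit."""
    collected = {digit: 0 for digit in range(NUM_DIGITS)}
    names = []
    for label in labels:
        if collected[label] < EXAMPLES_PER_DIGIT:
            # Assumed naming: zero-padded digit, then zero-padded example index.
            names.append(f"{label:02d}_{collected[label]:02d}.png")
            collected[label] += 1
        if all(count >= EXAMPLES_PER_DIGIT for count in collected.values()):
            break  # stop as soon as every digit has enough examples
    return names

# Stand-in label stream; the real code iterates over (image, label) pairs.
fake_labels = [i % 10 for i in range(50)]
names = select_examples(fake_labels)
print(len(names), names[:3])
```

In the real script, the body of the `if` branch would also convert the tensor to a PIL image and save it under the generated name.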
diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/user-interface.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/user-interface.md index bcf84520bb..05156e79d8 100644 --- a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/user-interface.md +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/user-interface.md @@ -1,6 +1,6 @@ --- # User change -title: "Create an Android application" +title: "Create an Android Application" weight: 8 @@ -17,25 +17,28 @@ The application runs an inference on the image and predicts the digit value. Start by creating a project: -1. Open Android Studio and create a new project with an “Empty Views Activity.” +1. Open Android Studio and create a new project with an **Empty Views Activity**. -2. Set the project name to **ArmPyTorchMNISTInference**, set the package name to: **com.arm.armpytorchmnistinference**, select **Kotlin** as the language, and set the minimum SDK to **API 27 ("Oreo" Android 8.1)**. +2. Configure as follows: + * Set the project name to **ArmPyTorchMNISTInference**. + * Set the package name to **com.arm.armpytorchmnistinference**. + * Select **Kotlin** as the language. + * Set the minimum SDK to **API 27 ("Oreo" Android 8.1)**. + * API level 27 (Android 8.1) is the minimum because this version introduced NNAPI, which provides a standard interface for running computationally intensive machine learning models on Android devices. -Set the API to Android 8.1 (API level 27) because this version introduced NNAPI, providing a standard interface for running computationally intensive machine learning models on Android devices. - -Devices with hardware accelerators can leverage NNAPI to offload ML tasks to specialized hardware, such as NPUs (Neural Processing Units), DSPs (Digital Signal Processors), or GPUs (Graphics Processing Units). 
+Devices with hardware accelerators can leverage NNAPI to offload ML tasks to specialized hardware, such as Neural Processing Units (NPUs), Digital Signal Processors (DSPs), or Graphics Processing Units (GPUs). ## User interface design -The user interface design contains the following: +The user interface design contains different components: - A header. - `ImageView` and `TextView` sections to display the image and its true label. - A button to load the image. - A button to run inference. -- Two `TextView` controls to display the predicted label and inference time. +- Two `TextView` controls to display the predicted label and the inference time. -Use the Android Studio editor to replace the contents of `activity_main.xml`, located in `src/main/res/layout` with the following code: +Use the editor in Android Studio to replace the contents of `activity_main.xml`, located in `src/main/res/layout` with the following code: ```XML @@ -109,9 +112,9 @@ Use the Android Studio editor to replace the contents of `activity_main.xml`, lo ``` -The above XML code defines a user interface layout for an Android activity using a vertical `LinearLayout`. It includes several UI components arranged vertically with padding and centered alignment. +The XML code above defines a user interface layout for an Android activity using a vertical `LinearLayout`. It includes several UI components arranged vertically with padding and centered alignment. -At the top, there is a `TextView` acting as a header, displaying the text `Digit Recognition` in bold and with a large font size. +At the top, there is a `TextView` acting as a header, displaying the text **Digit Recognition** in bold and with a large font size. Below the header, an `ImageView` displays an image, with a default source set to `sample_image`. 
@@ -313,20 +316,51 @@ class MainActivity : AppCompatActivity() { } ``` -The above Kotlin code defines an Android app activity called `MainActivity` that performs inference on the MNIST dataset using a pre-trained PyTorch model. The app allows the user to load a random MNIST image from the `assets` folder and runs the model to classify the image. +This Kotlin code defines an Android app activity called `MainActivity` that performs inference on the MNIST dataset using a pre-trained PyTorch model. The app allows the user to load a random MNIST image from the `assets` folder and run the model to classify the image. + +The `MainActivity` class contains several methods: + +* The `onCreate()` method is called when the activity is first created. It sets up the user interface by inflating the layout defined in `activity_main.xml` and initializes several UI components, including an `ImageView` to display the image, `TextView` controls to show the true label and predicted label, and two buttons, `selectImageButton` and `runInferenceButton`, to select an image and run inference. This method then loads the PyTorch model from the `assets` folder using the `assetFilePath()` function, and sets up click listeners for the buttons. The `selectImageButton` is configured to select a random image from the `mnist_bitmaps` folder, while the `runInferenceButton` runs the inference on the selected image. + +* The `selectRandomImageFromAssets()` method is responsible for selecting a random image from the `mnist_bitmaps` folder in `assets`. It lists all the files in the folder, picks one at random, and loads it as a bitmap. This method then does the following: + + * It extracts the true label from the filename. For example, 07_00.png implies a true label of 7. + * It displays the selected image in the `ImageView`. + * It updates the `trueLabel TextView` with the correct label. 
+ +If there is an error loading the image or the folder is empty, an appropriate error message is displayed in the `trueLabel TextView`. + +* The `createTensorFromBitmap()` method converts a grayscale bitmap of size 28x28 (an image from the MNIST dataset) into a PyTorch Tensor, through the following steps: + + * The method begins by verifying that the bitmap has the correct dimensions. + * Then it extracts pixel data from the bitmap. + * It normalizes each pixel value to a float in the range [0, 1], and stores the values in a float array. + * Then it constructs and returns a tensor with the shape [1, 1, 28, 28], where 1 is the batch size, 1 is the number of channels (for grayscale), and 28 represents the width and height of the image. This is required to match the input expected by the model. + +* The `runInference()` method accepts a bitmap as input and performs inference using the pre-trained PyTorch model, through the following steps: -The MainActivity class contains several methods. The first one, `onCreate()` is called when the activity is first created. It sets up the user interface by inflating the layout defined in `activity_main.xml` and initializes several UI components, including an `ImageView` to display the image, `TextView` controls to show the true label and predicted label, and two buttons (`selectImageButton` and `runInferenceButton`) to select an image and run inference. The method then loads the PyTorch model from the assets folder using the `assetFilePath()` function and sets up click listeners for the buttons. The `selectImageButton` is configured to select a random image from the `mnist_bitmaps` folder, while the `runInferenceButton` runs the inference on the selected image. + * First, it converts the bitmap to a tensor using the `createTensorFromBitmap()` method. + * Then, it measures the time taken to run the forward pass of the model using the `measureTimeMicros()` method. 
+ * The output tensor from the model, which contains the scores for each digit class, is then processed to determine the predicted label. + * The predicted label is displayed in the `predictedLabel TextView`. + * The method also updates the `inferenceTime TextView` with the time taken for the inference in microseconds. -Next, the `selectRandomImageFromAssets()` method is responsible for selecting a random image from the `mnist_bitmaps` folder in the assets. It lists all the files in the folder, picks one at random, and loads it as a bitmap. The method then extracts the true label from the filename (e.g., 07_00.png implies a true label of 7), displays the selected image in the `ImageView`, and updates the `trueLabel TextView` with the correct label. If there is an error loading the image or the folder is empty, an appropriate error message is displayed in the `trueLabel TextView`. +* The inline function `measureTimeMicros()` is a utility method that measures the execution time of the given code block in microseconds: + + * It uses the `measureNanoTime()` function to get the execution time in nanoseconds. + * It converts the resultant execution time to microseconds by dividing the result by 1000. + * This method is used to measure the time taken for model inference in the `runInference()` method. -Afterward, the `createTensorFromBitmap()` converts a grayscale bitmap of size 28x28 (an image from the MNIST dataset) into a PyTorch Tensor. First, the method verifies that the bitmap has the correct dimensions. Then, it extracts pixel data from the bitmap, normalizes each pixel value to a float in the range [0, 1], and stores the values in a float array. The method finally constructs and returns a tensor with the shape [1, 1, 28, 28], where 1 is the batch size, 1 is the number of channels (for grayscale), and 28 represents the width and height of the image. This is required to match the input expected by the model. 
+* The `assetFilePath()` method is a helper function that copies a file from the assets folder to the application's internal storage and returns the absolute path of the copied file. This is necessary because PyTorch’s `Module.load()` method requires a file path, not an InputStream. The `assetFilePath()` method does the following: -Subsequently, we have the `runInference()` method. It accepts a bitmap as input and performs inference using the pre-trained PyTorch model. It first converts the bitmap to a tensor using the `createTensorFromBitmap()` method. Then, it measures the time taken to run the forward pass of the model using the `measureTimeMicros()` method. The output tensor from the model, which contains the scores for each digit class, is processed to determine the predicted label. This predicted label is displayed in the `predictedLabel TextView`. The method also updates the `inferenceTime TextView` with the time taken for the inference in microseconds. + * The function reads the specified asset file. + * It writes its contents to a file in the internal storage. + * It returns the path to this file. -Also, we have an inline function `measureTimeMicros()`. It is a utility method that measures the execution time of the provided code block in microseconds. It uses the `measureNanoTime()` function to get the execution time in nanoseconds and then converts it to microseconds by dividing the result by 1000. This method is used to measure the time taken for model inference in the `runInference()` method. +This method is used in `onCreate()` to load the PyTorch model file, `model.pth`, from the `assets` folder. -The `assetFilePath()` method is a helper function that copies a file from the assets folder to the application's internal storage and returns the absolute path of the copied file. This is necessary because PyTorch’s `Module.load()` method requires a file path, not an InputStream. 
The function reads the specified asset file, writes its contents to a file in the internal storage, and returns the path to this file. This method is used in `onCreate()` to load the PyTorch model file, `model.pth`, from the `assets` folder. +Overall, the `MainActivity` class initializes the UI components, loads a pre-trained PyTorch model, and allows the user to select random MNIST images and run inference on them. -The `MainActivity` class initializes the UI components, loads a pre-trained PyTorch model, and allows the user to select random MNIST images and run inference on them. Each method is designed to handle a specific aspect of the functionality, such as loading images, converting them to tensors, running inference, and measuring execution time. The code is modular and organized, making it easy to understand and maintain. +Each method is designed to handle a specific aspect of the functionality, such as loading images, converting them to tensors, running inference, and measuring execution time. The code is modular and organized, making it easy to understand and maintain. -To be able to successfully run the application you need to add the model and prepare the bitmaps. Continue to see how to prepare the data. \ No newline at end of file +To run the application successfully, you need to add the model and prepare the bitmaps. Continue with this Learning Path to learn how to prepare the data.
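
The data handling steps that the `MainActivity` description walks through can be sketched in Python, the language used by the notebooks in this Learning Path. This is an illustrative analogue only; the app itself implements these steps in Kotlin. All function names here are hypothetical, and dividing by 255 is an assumed way of mapping 8-bit grayscale values into the range [0, 1].

```python
import time

def label_from_filename(name):
    """Extract the true label from a name such as '07_00.png'."""
    return int(name.split("_")[0])

def normalize_pixels(pixels):
    """Scale 8-bit grayscale values (0-255) to floats in [0, 1] (assumed scaling)."""
    return [p / 255.0 for p in pixels]

def predicted_label(scores):
    """Index of the highest score, which is the predicted digit."""
    return max(range(len(scores)), key=scores.__getitem__)

def measure_time_micros(block):
    """Run block() and return the elapsed time in whole microseconds."""
    start = time.perf_counter_ns()
    block()
    return (time.perf_counter_ns() - start) // 1000  # nanoseconds to microseconds

# Made-up inputs: a 28x28 image flattened to 784 values, and ten class scores.
pixels = normalize_pixels([0, 128, 255] + [0] * (28 * 28 - 3))
scores = [0.01, 0.02, 0.9, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
elapsed = measure_time_micros(lambda: predicted_label(scores))
print(label_from_filename("07_00.png"), predicted_label(scores), elapsed)
```

The 784 normalized values correspond to the contents of the `[1, 1, 28, 28]` tensor, and the integer division by 1000 matches the nanosecond-to-microsecond conversion described for `measureTimeMicros()`.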