diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/01.png b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/01.png new file mode 100644 index 0000000000..bc0c2cbffe Binary files /dev/null and b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/01.png differ diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/02.png b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/02.png new file mode 100644 index 0000000000..54eacae4e1 Binary files /dev/null and b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/02.png differ diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/03.png b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/03.png new file mode 100644 index 0000000000..2dea1eff2c Binary files /dev/null and b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/03.png differ diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/1.png b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/1.png new file mode 100644 index 0000000000..9fd24961b7 Binary files /dev/null and b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/1.png differ diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/2.png b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/2.png new file mode 100644 index 0000000000..881080cdf0 Binary files /dev/null and b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/2.png differ diff --git 
a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/3.png b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/3.png new file mode 100644 index 0000000000..8faba5d3dc Binary files /dev/null and b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/3.png differ diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/4.png b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/4.png new file mode 100644 index 0000000000..14c3ede70e Binary files /dev/null and b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/Figures/4.png differ diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_index.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_index.md new file mode 100644 index 0000000000..23ac166cbc --- /dev/null +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_index.md @@ -0,0 +1,45 @@ +--- +title: Create and train a PyTorch model for digit classification + +minutes_to_complete: 80 + +who_is_this_for: This is an introductory topic for software developers interested in learning how to use PyTorch to create and train a feedforward neural network for digit classification. + +learning_objectives: + - Prepare a PyTorch development environment. + - Download and prepare the MNIST dataset. + - Create a neural network architecture using PyTorch. + - Train a neural network using PyTorch. + +prerequisites: + - A computer that can run Python3 and Visual Studio Code. The OS can be Windows, Linux, or macOS. 
+ + +author_primary: Dawid Borycki + +### Tags +skilllevels: Introductory +subjects: ML +armips: + - Cortex-A + - Cortex-X + - Neoverse +operatingsystems: + - Windows + - Linux + - macOS +tools_software_languages: + - Android Studio + - Coding +shared_path: true +shared_between: + - servers-and-cloud-computing + - laptops-and-desktops + - smartphones-and-mobile + +### FIXED, DO NOT MODIFY +# ================================================================================ +weight: 1 # _index.md always has weight of 1 to order correctly +layout: "learningpathall" # All files under learning paths have this same wrapper +learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content. +--- diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_next-steps.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_next-steps.md new file mode 100644 index 0000000000..82cf1f985b --- /dev/null +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_next-steps.md @@ -0,0 +1,42 @@ +--- +# ================================================================================ +# Edit +# ================================================================================ + +next_step_guidance: > + Proceed to Use Keras Core with TensorFlow, PyTorch, and JAX backends to continue exploring Machine Learning. + +# 1-3 sentence recommendation outlining how the reader can generally keep learning about these topics, and a specific explanation of why the next step is being recommended. + +recommended_path: "/learning-paths/servers-and-cloud-computing/keras-core/" + +# Link to the next learning path being recommended(For example this could be /learning-paths/servers-and-cloud-computing/mongodb). + + +# further_reading links to references related to this path. 
Can be: + # Manuals for a tool / software mentioned (type: documentation) + # Blog about related topics (type: blog) + # General online references (type: website) + +further_reading: + - resource: + title: PyTorch + link: https://pytorch.org + type: documentation + - resource: + title: MNIST + link: https://en.wikipedia.org/wiki/MNIST_database + type: website + - resource: + title: Visual Studio Code + link: https://code.visualstudio.com + type: website + + +# ================================================================================ +# FIXED, DO NOT MODIFY +# ================================================================================ +weight: 21 # set to always be larger than the content in this path, and one more than 'review' +title: "Next Steps" # Always the same +layout: "learningpathall" # All files under learning paths have this same wrapper +--- diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_review.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_review.md new file mode 100644 index 0000000000..fb1980742f --- /dev/null +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/_review.md @@ -0,0 +1,50 @@ +--- +# ================================================================================ +# Edit +# ================================================================================ + +# Always 3 questions. Should try to test the reader's knowledge, and reinforce the key points you want them to remember. + # question: A one sentence question + # answers: The correct answers (from 2-4 answer options only). Should be surrounded by quotes. + # correct_answer: An integer indicating what answer is correct (index starts from 0) + # explanation: A short (1-3 sentence) explanation of why the correct answer is correct. 
Can add additional context if desired + + +review: + - questions: + question: > + Does the input layer of the model flatten the 28x28 pixel image into a 1D array of 784 elements? + answers: + - "Yes" + - "No" + correct_answer: 1 + explanation: > + Yes, the model uses nn.Flatten() to reshape the 28x28 pixel image into a 1D array of 784 elements for processing by the fully connected layers. + - questions: + question: > + Will the model make random predictions if it’s run before training? + answers: + - "Yes" + - "No" + correct_answer: 1 + explanation: > + Yes. Before training, the model produces essentially random outputs because the network has not yet learned to recognize any patterns in the data. + - questions: + question: > + Which loss function was used to train the PyTorch model on the MNIST dataset? + answers: + - Mean Squared Error Loss + - CrossEntropyLoss + - Hinge Loss + - Binary Cross-Entropy Loss + correct_answer: 2 + explanation: > + The CrossEntropyLoss function was used to train the model because it is suitable for multi-class classification tasks like digit classification. It measures the difference between the predicted probabilities and the true class labels, helping the model learn to make accurate predictions. 
+ +# ================================================================================ +# FIXED, DO NOT MODIFY +# ================================================================================ +title: "Review" # Always the same title +weight: 20 # Set to always be larger than the content in this path +layout: "learningpathall" # All files under learning paths have this same wrapper +--- diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/datasets-and-training.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/datasets-and-training.md new file mode 100644 index 0000000000..d50b6d3c42 --- /dev/null +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/datasets-and-training.md @@ -0,0 +1,177 @@ +--- +# User change +title: "Datasets and training" + +weight: 5 + +layout: "learningpathall" +--- + +Start by downloading the MNIST dataset. Proceed as follows: + +1. Open the pytorch-digits.ipynb you created earlier. + +2. Add the following statements: + +```python +from torchvision import transforms, datasets +from torch.utils.data import DataLoader + +# Training data +training_data = datasets.MNIST( + root="data", + train=True, + download=True, + transform=transforms.ToTensor() +) + +# Test data +test_data = datasets.MNIST( + root="data", + train=False, + download=True, + transform=transforms.ToTensor() +) + +# Dataloaders +batch_size = 32 + +train_dataloader = DataLoader(training_data, batch_size=batch_size) +test_dataloader = DataLoader(test_data, batch_size=batch_size) +``` + +The above code snippet downloads the MNIST dataset, transforms the images into tensors, and sets up data loaders for training and testing. Specifically, the `datasets.MNIST` function is used to download the MNIST dataset, with `train=True` indicating training data and `train=False` indicating test data. 
The `transform=transforms.ToTensor()` argument converts each image in the dataset into a PyTorch tensor, which is necessary for model training and evaluation. + +The DataLoader wraps the datasets and allows efficient loading of data in batches. It can also shuffle the data and load it in parallel, although this example keeps the defaults (no shuffling, loading in the main process). Here, the train_dataloader and test_dataloader are created with a batch_size of 32, meaning they will load 32 images per batch during training and testing. + +This setup prepares the training and test datasets for use in a machine learning model, enabling efficient data handling and model training in PyTorch. + +To run the above code, you will need to install the certifi package: + +```console +pip install certifi +``` + +The certifi Python package provides the Mozilla root certificates, which are essential for ensuring that SSL connections are secure. If you’re using macOS, you may also need to install the certificates by running: + +```console +/Applications/Python\ 3.x/Install\ Certificates.command +``` + +Make sure to replace `x` with the minor version number of the Python release you have installed. + +After running the code, you will see output similar to the following: + +![image](Figures/01.png) + +# Train the model + +To train the model, specify the loss function and the optimizer: + +```Python +learning_rate = 1e-3 + +loss_fn = nn.CrossEntropyLoss() +optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate) +``` + +Use CrossEntropyLoss as the loss function and the Adam optimizer for training. The learning rate is set to 1e-3. 
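To make the loss function less of a black box, the following is a minimal pure-Python sketch of the quantity that cross-entropy loss computes for a single sample when it is given raw scores (logits): the negative log of the softmax probability assigned to the true class. The helper name `cross_entropy` and the example scores are illustrative assumptions and are not part of the Learning Path code. `nn.CrossEntropyLoss` additionally averages this value over the batch by default.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class for one sample."""
    # Subtract the maximum score for numerical stability before exponentiating
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]

# Hypothetical scores for a 3-class problem
scores = [2.0, 0.5, -1.0]
print(cross_entropy(scores, 0))  # smaller loss: class 0 has the highest score
print(cross_entropy(scores, 2))  # larger loss: class 2 has the lowest score
```

The loss shrinks as the score of the true class grows relative to the others, which is the signal the optimizer uses when it adjusts the weights.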
+ +Next, define the functions for training and evaluating the feedforward neural network: + +```Python +def train_loop(dataloader, model, loss_fn, optimizer): + model.train() # Enable training behavior such as dropout + for batch, (x, y) in enumerate(dataloader): + # Compute prediction and loss + pred = model(x) + loss = loss_fn(pred, y) + + # Backpropagation + optimizer.zero_grad() + loss.backward() + optimizer.step() + +def test_loop(dataloader, model, loss_fn): + size = len(dataloader.dataset) + num_batches = len(dataloader) + test_loss, correct = 0, 0 + model.eval() # Turn off dropout so that evaluation is deterministic + with torch.no_grad(): + for x, y in dataloader: + pred = model(x) + test_loss += loss_fn(pred, y).item() + correct += (pred.argmax(1) == y).type(torch.float).sum().item() + + test_loss /= num_batches + correct /= size + + print(f"Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n") +``` + +The first function, `train_loop`, uses the backpropagation algorithm to optimize the trainable parameters and minimize the prediction error of the neural network. The second function, `test_loop`, calculates the neural network error using the test images and displays the accuracy and loss values. + +You can now invoke these functions to train and evaluate the model for 10 epochs. + +```Python +epochs = 10 + +for t in range(epochs): + print(f"Epoch {t+1}:") + train_loop(train_dataloader, model, loss_fn, optimizer) + test_loop(test_dataloader, model, loss_fn) +``` + +After running this code, you will see the following output that shows the training progress. + +![image](Figures/02.png) + +Once the training is complete, you will see something like the following: + +```output +Epoch 10: +Accuracy: 95.4%, Avg loss: 1.507491 +``` + +This shows that the model achieved an accuracy of around 95%. + +# Save the model + +Once the model is trained, you can save it. There are various approaches for this. In PyTorch, you can save both the model’s structure and its weights to the same file using the `torch.save()` function. 
Alternatively, you can save only the weights (parameters) of the model, not the model architecture itself. This requires you to have the model’s architecture defined separately when loading. To save the model weights, you can use the following command: + +```Python +torch.save(model.state_dict(), "model_weights.pth") +``` + +With either approach, PyTorch does not save the definition of the model class itself. When you load the model using `torch.load()`, PyTorch needs to know the class definition to recreate the model object. + +Therefore, when you later want to use the saved model for inference, you will need to provide the definition of the model class. + +Alternatively, you can use TorchScript, which serializes both the architecture and weights into a single file that can be loaded without needing the original class definition. This is particularly useful for deploying models to production or sharing models without code dependencies. + +Use TorchScript to save the model using the following commands: + +```Python +# Set model to evaluation mode +model.eval() + +# Trace the model with an example input +traced_model = torch.jit.trace(model, torch.rand(1, 1, 28, 28)) + +# Save the traced model +traced_model.save("model.pth") +``` + +The above commands set the model to evaluation mode, trace the model, and save it. Tracing is useful for converting models with static computation graphs to TorchScript, making them portable and independent of the original class definition. + +Setting the model to evaluation mode before tracing is important for several reasons: + +1. Behavior of Layers like Dropout and BatchNorm: + * Dropout. During training, dropout randomly zeroes out some of the activations to prevent overfitting. During evaluation, dropout is turned off, and all activations are used. + * BatchNorm. During training, Batch Normalization layers use batch statistics to normalize the input. During evaluation, they use running averages calculated during training. + +2. 
Consistent Inference Behavior. By setting the model to eval mode, you ensure that the traced model will behave consistently during inference, as it will not use dropout or batch statistics that are inappropriate for inference. + +3. Correct Tracing. Tracing captures the operations performed by the model using a given input. If the model is in training mode, the traced graph may include operations related to dropout and batch normalization updates. These operations can affect the correctness and performance of the model during inference. + +In the next step, you will use the saved model for inference. diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/inference.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/inference.md new file mode 100644 index 0000000000..c421f037b1 --- /dev/null +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/inference.md @@ -0,0 +1,112 @@ +--- +# User change +title: "Inference" + +weight: 6 + +layout: "learningpathall" +--- + +The inference process involves using a trained model to make predictions on new, unseen data. It typically follows these steps: + +1. **Load the Trained Model**: the model, along with its learned parameters - weights and biases - is loaded from a saved file. +2. **Prepare the Input Data**: the input data is pre-processed in the same way as during training, for example, normalization and tensor conversion, to ensure compatibility with the model. +3. **Make Predictions**: the pre-processed data is fed into the model, which computes the output based on its trained parameters. The output is often a probability distribution over possible classes. +4. **Interpret the Results**: the predicted class is usually the one with the highest probability. The results can then be used for further analysis or decision-making. 
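Steps 3 and 4 can be sketched in plain Python. The sketch below assumes the model returns one raw score per class. It converts the scores into a probability distribution with softmax and then picks the most probable class. The helper names `softmax` and `predict_class` and the example scores are made up for illustration.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(scores):
    """Return the index of the most probable class and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

# Made-up scores for the ten digit classes 0-9
scores = [0.1, 0.3, 0.2, 4.0, 0.1, 0.5, 0.2, 0.1, 0.9, 0.3]
digit, confidence = predict_class(scores)
print(f"Predicted digit: {digit} (probability {confidence:.2f})")
```

Because softmax preserves the ordering of the scores, taking the index of the largest raw score gives the same predicted class as taking the index of the largest probability.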
+ +This process allows the model to generalize its learned knowledge to make accurate predictions on new data. + +# Running inference in PyTorch + +You can run inference in PyTorch using the previously saved model. To display results, you can use matplotlib. + +Start by installing the matplotlib package: + +```console +pip install matplotlib +``` + +Then, in Visual Studio Code, create a new file named `pytorch-digits-inference.ipynb` and modify the file to include the code below: + +```python +import torch +from torchvision import datasets, transforms +import matplotlib.pyplot as plt +import random + +# Define a transformation to convert the image to a tensor +transform = transforms.Compose([ + transforms.ToTensor() +]) + +# Load the test set with transformation +test_data = datasets.MNIST( + root="data", + train=False, + download=True, + transform=transform +) + +# Load the entire model +model = torch.jit.load("model.pth") + +# Set the model to evaluation mode +model.eval() + +# Select 16 random indices from the test dataset +random_indices = random.sample(range(len(test_data)), 16) + +# Plot the 16 randomly selected images +fig, axes = plt.subplots(4, 4, figsize=(12, 12)) # Create a 4x4 grid of subplots + +for i, ax in enumerate(axes.flat): + # Get a random image and its label + index = random_indices[i] + image, label = test_data[index] + + # Add a batch dimension (model expects a batch of images) + image_batch = image.unsqueeze(0) + + # Run inference + with torch.no_grad(): + prediction = model(image_batch) + + # Get the predicted class + predicted_label = torch.argmax(prediction, dim=1).item() + + # Display the image with actual and predicted labels + ax.imshow(image.squeeze(), cmap="gray") + ax.set_title(f"Actual: {label}\nPredicted: {predicted_label}") + ax.axis("off") # Remove axes for clarity + +plt.tight_layout() +plt.show() +``` + +The above code performs inference with the saved PyTorch model using 16 randomly selected images from the MNIST test dataset and 
displays them along with their actual and predicted labels. + +As before, start by importing the necessary Python libraries: torch, datasets, transforms, matplotlib.pyplot, and random. The torch package is used for loading the model and performing tensor operations. The datasets and transforms modules from torchvision are used for loading and transforming the MNIST dataset. The matplotlib.pyplot module is used for plotting and displaying images, and random is used for selecting random images from the dataset. + +Next, load the MNIST test dataset using datasets.MNIST() with train=False to specify that it’s the test data. The dataset is automatically downloaded if it’s not available locally. + +Load the saved model using torch.jit.load("model.pth") and set the model to evaluation mode using model.eval(). This ensures that layers like dropout and batch normalization behave appropriately during inference. + +Subsequently, select 16 random images and create a 4x4 grid of subplots using plt.subplots(4, 4, figsize=(12, 12)) for displaying the images. + +Afterwards, perform inference and display the images in a loop. Specifically, for each of the 16 selected images, the image and its label are retrieved from the dataset using the random index. The image tensor is expanded to include a batch dimension (image.unsqueeze(0)) because the model expects a batch of images. Inference is performed with model(image_batch) to get the prediction. The predicted label is determined using torch.argmax(), which returns the index of the largest value in the output and therefore the most likely digit. Each image is displayed in its respective subplot with the actual and predicted labels. Finally, plt.tight_layout() adjusts the spacing between the subplots, and plt.show() displays the 16 images with their actual and predicted labels. + +This code demonstrates how to use a saved PyTorch model for inference and visualization of predictions on a subset of the MNIST test dataset. 
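To complement the visual check, you could also tally how many of the sampled predictions match their labels. The following is a small sketch in plain Python. The function name `summarize` and both label lists are hypothetical stand-ins for the values collected in the loop above, not output from the trained model.

```python
def summarize(actual, predicted):
    """Return the accuracy and a list of (index, actual, predicted) mistakes."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equally long")
    mistakes = [
        (i, a, p) for i, (a, p) in enumerate(zip(actual, predicted)) if a != p
    ]
    accuracy = 1 - len(mistakes) / len(actual)
    return accuracy, mistakes

# Made-up labels standing in for the 16 sampled images
actual = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5]
predicted = [7, 2, 1, 0, 4, 1, 4, 9, 6, 9, 0, 6, 9, 0, 1, 5]

accuracy, mistakes = summarize(actual, predicted)
print(f"Accuracy on the sample: {accuracy:.1%}")
for index, a, p in mistakes:
    print(f"Image {index}: actual {a}, predicted {p}")
```

With only 16 images, a single mistake changes the sample accuracy noticeably, so the figure reported on the full test set during training remains the more reliable measure.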
+ +After running the code, you should see results similar to the following figure: + +![image](Figures/03.png) + +# What you have learned + +In this exercise, you went through the complete process of training and using a PyTorch model for digit classification on the MNIST dataset. Using the training dataset, you optimized the model’s weights and biases over multiple epochs. You employed the CrossEntropyLoss function and the Adam optimizer to minimize prediction errors and improve accuracy. You periodically evaluated the model on the test dataset to monitor its performance, ensuring it was learning effectively without overfitting. + +After training, you saved the model using TorchScript, which captures both the model’s architecture and its learned parameters. This made the model portable and independent of the original class definition, simplifying deployment. + +Next, you performed inference. You loaded the saved model and set it to evaluation mode to ensure that layers like dropout and batch normalization behaved correctly during inference. You randomly selected 16 images from the MNIST test dataset to evaluate the model’s performance on unseen data. For each selected image, you used the model to predict the digit, comparing the predicted labels with the actual ones. You displayed the images alongside their actual and predicted labels in a 4x4 grid, visually assessing the model’s accuracy and performance. + +This comprehensive process, from model training and saving to inference and visualization, illustrates the end-to-end workflow for building and deploying a machine learning model in PyTorch. It demonstrates how to train a model, save it in a portable format, and then use it to make predictions on new data. 
diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro.md new file mode 100644 index 0000000000..af7cffde58 --- /dev/null +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro.md @@ -0,0 +1,124 @@ +--- +# User change +title: "Prepare a PyTorch development environment" + +weight: 2 + +layout: "learningpathall" +--- + +PyTorch is an open-source deep learning framework that is developed by Meta AI and is now part of the Linux Foundation. + +PyTorch is designed to provide a flexible and efficient platform for building and training neural networks. It is widely used due to its dynamic computational graph, which allows users to modify the architecture during runtime, making debugging and experimentation easier. + +PyTorch's objective is to provide a more flexible, user-friendly deep learning framework that addresses the limitations of static computational graphs found in earlier tools like TensorFlow. + +Prior to PyTorch, many frameworks used static computation graphs that require the entire model structure to be defined before training, making experimentation and debugging cumbersome. PyTorch introduced dynamic computational graphs, also known as “define-by-run”, that allow the graph to be constructed dynamically as operations are executed. This flexibility significantly improves ease of use for researchers and developers, enabling faster prototyping, easier debugging, and more intuitive code. + + +Additionally, PyTorch seamlessly integrates with Python, encouraging a native coding experience. Its deep integration with GPU acceleration also makes it a powerful tool for both research and production environments. This combination of flexibility, usability, and performance has contributed to PyTorch’s rapid adoption, especially in academic research, where experimentation and iteration are crucial. 
+ +A typical process for creating a feedforward neural network in PyTorch involves defining a sequential stack of fully-connected layers, which are also known as *linear layers*. Each layer transforms the input by applying a set of weights and biases, followed by an activation function like ReLU. PyTorch supports this process using the `torch.nn` module, where layers are easily defined and composed. + +To create a model, users subclass the `torch.nn.Module` class, defining the network architecture in the `__init__` method, and implement the forward pass in the `forward` method. PyTorch’s intuitive API and support for GPU acceleration make it ideal for building efficient feedforward networks, particularly in tasks such as image classification and digit recognition. + +In this Learning Path, you will use PyTorch to create a model for digit recognition and then train it. + +## Before you begin + +Before you begin, make sure Python3 is installed on your system. You can check this by running: + +```console +python3 --version +``` + +The expected output is the Python version, for example: + +```output +Python 3.11.2 +``` + +If Python3 is not installed, download and install it from [python.org](https://www.python.org/downloads/). + +Alternatively, you can also install Python3 using package managers such as Brew or APT. + +If you are using Windows on Arm, you can refer to the [Python install guide](https://learn.arm.com/install-guides/py-woa/). + +Next, download and install [Visual Studio Code](https://code.visualstudio.com/download). + +## Install PyTorch and additional Python packages + +Follow these steps to prepare a Python virtual environment and install PyTorch and the additional tools you will need for this Learning Path: + +1. Open a terminal or command prompt and navigate to your project directory. + +2. Create a virtual environment by running: + +```console +python -m venv pytorch-env +``` + +This will create a virtual environment named pytorch-env. + +3. 
Activate the virtual environment: + +* On Windows: +```console +pytorch-env\Scripts\activate +``` + +* On macOS or Linux: +```console +source pytorch-env/bin/activate +``` + +Once activated, you should see the virtual environment name in your terminal prompt. + +4. Install PyTorch using `pip`: + +```console +pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu +``` + +5. Install torchsummary, Jupyter and IPython Kernel: + +```console +pip install torchsummary +pip install jupyter +pip install ipykernel +``` + +6. Register your virtual environment as a new kernel: + +```console +python3 -m ipykernel install --user --name=pytorch-env +``` + +7. Install the Jupyter Extension in VS Code: + +* Open VS Code and go to the Extensions view (click on the Extensions icon or press Ctrl+Shift+X). + +* Search for “Jupyter” and install the official Jupyter extension. + +* Optionally, also install the Python extension if you haven’t already, as it improves Python language support in VS Code. + +To ensure everything is set up correctly: + +1. Open Visual Studio Code. +2. Click New file, and select `Jupyter Notebook .ipynb Support`. +3. Save the file as `pytorch-digits.ipynb`. +4. Select the Python kernel you created earlier (pytorch-env). To do so, click Kernels in the top right corner. Then, click Jupyter Kernel..., and you will see the Python kernel as shown below: + +![img1](Figures/1.png) + +5. In your Jupyter notebook, run the following code to verify PyTorch is working correctly: + +```python +import torch +print(torch.__version__) +``` + +It will look as follows: +![img2](Figures/2.png) + +With your development environment created, you can proceed to creating a PyTorch model. 
diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro2.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro2.md new file mode 100644 index 0000000000..ae6126132d --- /dev/null +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/intro2.md @@ -0,0 +1,53 @@ +--- +# User change +title: "PyTorch model training" + +weight: 4 + +layout: "learningpathall" +--- + +In the previous section, you created a feedforward neural network for digit classification using the MNIST dataset. The network was left untrained and lacks the ability to make accurate predictions. + +To enable the network to recognize handwritten digits effectively, training is needed. Training in PyTorch involves configuring the network's parameters, such as weights and biases, by exposing the model to labeled data and iteratively adjusting these parameters to minimize prediction errors. This process allows the model to learn the patterns in the data, enabling it to make accurate classifications on new, unseen inputs. + +The typical approach to training a neural network in PyTorch involves several key steps. + +First, obtain and preprocess the dataset, which usually includes normalizing the data and converting it into a format suitable for the model. + +Next, the dataset is split into training and testing subsets. Training data is used to update the model’s parameters, while testing data evaluates its performance. During training, feed batches of input data through the network, calculate the prediction error or loss using a loss function (such as cross-entropy for classification tasks), and optimize the model’s weights and biases using backpropagation. Backpropagation involves computing the gradient of the loss with respect to each parameter and then updating the parameters using an optimizer, like Stochastic Gradient Descent (SGD) or Adam. 
This process is repeated for multiple epochs until the model achieves satisfactory performance, balancing accuracy and generalization. + +### Loss, gradients, epoch and backpropagation + +Loss is a measure of how well a model’s predictions match the true labels of the data. It quantifies the difference between the predicted output and the actual output. The lower the loss, the better the model’s performance. In classification tasks, a common loss function is Cross-Entropy Loss, while Mean Squared Error (MSE) is often used for regression tasks. The goal of training is to minimize the loss, which indicates that the model’s predictions are getting closer to the actual labels. + +Gradients represent the rate of change of the loss with respect to each of the model’s parameters (weights and biases). They are used to update the model’s parameters in the direction that reduces the loss. Gradients are calculated during the backpropagation step, where the loss is propagated backward through the network to compute how each parameter contributes to the overall loss. Optimizers like SGD or Adam use these gradients to adjust the parameters, effectively “teaching” the model to improve its predictions. + +An epoch refers to one complete pass through the entire training dataset. During each epoch, the model sees every data point once and updates its parameters accordingly. Multiple epochs are typically required to train a model effectively because, during each epoch, the model learns and fine-tunes its parameters based on the data it processes. The number of epochs is a hyperparameter that you set before training, and increasing it can improve the model’s performance, but too many epochs may lead to overfitting, where the model performs well on training data but poorly on new, unseen data. + +Backpropagation is a fundamental algorithm used in training neural networks to optimize their parameters—weights and biases—by minimizing the loss function. 
It works by propagating the error backward through the network, calculating the gradients of the loss function with respect to each parameter, and updating these parameters accordingly. + +### Training a model in PyTorch + +To train a model in PyTorch, several essential components are required: + +1. **Dataset**: the source of data that the model will learn from. It typically consists of input samples and their corresponding labels. The `torchvision.datasets` module provides easy access to popular datasets like MNIST, CIFAR-10, and ImageNet. You can also create custom datasets using the `torch.utils.data.Dataset` class. + +2. **DataLoader**: used to efficiently load and batch the data during training. It handles data shuffling, batching, and parallel loading, making it easier to feed the data into the model in a structured manner. This is crucial for performance, especially when working with large datasets. + +3. **Model**: defines the structure of the neural network. As you learned, PyTorch models are typically created by subclassing `torch.nn.Module` and defining the network layers and forward pass. This includes specifying the input and output dimensions and the sequence of layers, such as linear layers, activation functions, and dropout. + +4. **Loss Function**: measures how far the model’s predictions are from the actual targets. It guides the optimization process by providing a signal that tells the model how to adjust its parameters. Common loss functions include Cross-Entropy Loss for classification tasks and Mean Squared Error (MSE) Loss for regression tasks. You can select a predefined loss function from `torch.nn` or define your own. + +5. **Optimizer**: updates the model’s parameters based on the gradients computed during backpropagation. It determines how the model learns from the data. Popular optimizers include Stochastic Gradient Descent (SGD) and Adam, which are available in the `torch.optim` module. 
You need to specify the learning rate (a hyperparameter that controls how much to change the parameters in response to the gradient) and other hyperparameters when creating the optimizer. + +6. **Training Loop**: where the actual learning happens. For each iteration of the loop: + * A batch of data is fetched from the DataLoader. + * The model performs a forward pass to generate predictions. + * The loss is calculated using the predictions and the true labels. + * The gradients are computed via backpropagation. + * The optimizer updates the model’s parameters based on the gradients. + +This process is repeated for a specified number of epochs to gradually reduce the loss and improve the model’s performance. + +In the next step you will see how to perform model training. diff --git a/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model.md b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model.md new file mode 100644 index 0000000000..abfc9f117f --- /dev/null +++ b/content/learning-paths/cross-platform/pytorch-digit-classification-arch-training/model.md @@ -0,0 +1,142 @@ +--- +# User change +title: "Create a PyTorch model for MNIST" + +weight: 3 + +layout: "learningpathall" +--- + +You can create and train a feedforward neural network to classify handwritten digits from the MNIST dataset. This dataset contains 70,000 images, comprising 60,000 training and 10,000 testing images, of handwritten numerals (0-9), each with dimensions of 28x28 pixels. Some representative MNIST digits with their corresponding labels are shown below. + +![img3](Figures/3.png) + +The neural network begins with an input layer containing 28x28 = 784 input nodes, with each node accepting a single pixel from an MNIST image. + +You will add a linear hidden layer with 96 nodes, using the hyperbolic tangent (tanh) activation function. To prevent overfitting, a dropout layer is applied, randomly setting 20% of the nodes to zero. 
+ +You will then include another hidden layer with 256 nodes, followed by a second dropout layer that again removes 20% of the nodes. Finally, the output layer consists of ten nodes, each representing the probability of recognizing one of the digits (0-9). + +The total number of trainable parameters for this network is calculated as follows: + +* First hidden layer: 784 x 96 + 96 = 75,360 parameters (weights + biases). +* Second hidden layer: 96 x 256 + 256 = 24,832 parameters. +* Output layer: 256 x 10 + 10 = 2,570 parameters. + +In total, the network will have 102,762 trainable parameters. + +# Implementation + +To implement the model, supplement the `pytorch-digits.ipynb` notebook with the following statements: + +```Python +from torch import nn +from torchsummary import summary + +class_names = range(10) + +class NeuralNetwork(nn.Module): + def __init__(self): + super(NeuralNetwork, self).__init__() + self.flatten = nn.Flatten() + self.linear_stack = nn.Sequential( + nn.Linear(28*28, 96), + nn.Tanh(), + nn.Dropout(.2), + + nn.Linear(96, 256), + nn.Sigmoid(), + nn.Dropout(.2), + + nn.Linear(256, len(class_names)), + nn.Softmax(dim=1) + ) + + def forward(self, x): + x = self.flatten(x) + logits = self.linear_stack(x) + return logits +``` + +To build the neural network in PyTorch, define a class that inherits from PyTorch’s nn.Module. This approach is similar to TensorFlow’s subclassing API. In this case, define a class named NeuralNetwork, which consists of two main components: + +1. **__init__** method + +This method serves as the constructor for the class. + +First initialize the nn.Module with super(NeuralNetwork, self).__init__(). Inside this method, define the architecture of the feedforward neural network. The input is first flattened from its original 28x28 pixel format into a 1D array of 784 elements using nn.Flatten(). + +Next, create a sequential stack of layers using nn.Sequential. 
+ +The network consists of: +* A fully-connected (Linear) layer with 96 nodes, followed by the Tanh activation function. +* A Dropout layer with a 20% dropout rate to prevent overfitting. +* A second Linear layer, with 256 nodes, followed by the Sigmoid activation function. +* Another Dropout layer that randomly zeroes 20% of the nodes. +* A final Linear layer, with 10 nodes (matching the number of classes in the dataset), followed by a Softmax activation function that outputs class probabilities. + +2. **forward** method + +This method defines the forward pass of the network. It takes an input tensor `x`, flattens it using `self.flatten`, and then passes it through the defined sequential stack of layers (`self.linear_stack`). + +The return value is named `logits` in the code, but because the stack ends with a Softmax layer, it already holds class probabilities for the digit prediction rather than raw, unnormalized scores. + +The next step initializes the model and displays the summary using the torchsummary package: + +```Python +model = NeuralNetwork() + +summary(model, (1, 28, 28)) +``` + +After running the notebook, you will see the following output: + +![img4](Figures/4.png) + +You will see a detailed summary of the NeuralNetwork model’s architecture, including the following information: + +1. Layer Details + +The summary lists each layer of the network sequentially, including: + +* The Flatten layer, which reshapes the 28x28 input images into a 784-element vector. +* The Linear layers with 96 and 256 nodes, respectively, along with the activation functions (Tanh and Sigmoid) applied after each linear transformation. +* The Dropout layers that randomly deactivate 20% of the neurons in the respective layers. +* The final Linear layer with 10 nodes, corresponding to the output probabilities for the 10 digit classes, followed by the Softmax function. + +2. Input and Output Shapes + +For each layer, the summary shows the shape of the input and output tensors, helping to trace how the data flows through the network. 
For example, the input shape starts as (1, 28, 28) for the image, which gets flattened to (1, 784) after the Flatten layer. + +3. Parameter Counts + +The summary provides the number of trainable parameters in each layer, including both weights and biases. + +This includes: + +* 75,360 parameters for the first Linear layer (784 inputs × 96 nodes + 96 biases). +* 24,832 parameters for the second Linear layer (96 inputs × 256 nodes + 256 biases). +* 2,570 parameters for the output Linear layer (256 inputs × 10 output nodes + 10 biases). +* At the end, the summary reports the total for the whole model: 102,762 trainable parameters. + +This summary provides a clear overview of the model architecture, the dimensional transformations happening at each layer, and the number of parameters that will be optimized during training. + +Running the model now will produce random outputs, as the network has not been trained to recognize any patterns from the data. The next step is to train the model using a dataset and an optimization process, such as gradient descent, so that it can learn to make accurate predictions. + +At this point, the model can make predictions, but they are unreliable. PyTorch initializes the network’s weights randomly using its default initialization methods, so the output probabilities from the softmax layer are essentially arbitrary. + +The output is still a probability distribution over the 10 digit classes (0-9), but the values do not correspond to the images, because the model has not learned the patterns from the MNIST dataset. + +Technically, the code will run without errors as long as you provide it with an input image of the correct dimensions, which is 28x28 pixels. The model can accept input, pass it through the layers, and return a prediction: a vector of 10 probabilities. However, the results are not useful until the model is trained. 
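The per-layer parameter counts reported in the summary can be reproduced with plain arithmetic. The following is a minimal sketch in standard Python, with no PyTorch required; the helper name `linear_params` is introduced here for illustration only and is not part of the notebook.

```python
# Layer sizes taken from the network described above: 784 -> 96 -> 256 -> 10.
layer_sizes = [28 * 28, 96, 256, 10]


def linear_params(n_in: int, n_out: int) -> int:
    """Hypothetical helper: parameters of one fully connected layer.

    One weight per input/output pair, plus one bias per output node.
    """
    return n_in * n_out + n_out


# Pair consecutive sizes to get (inputs, outputs) for each Linear layer.
per_layer = [
    linear_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total = sum(per_layer)

print(per_layer)  # [75360, 24832, 2570]
print(total)      # 102762
```

The Flatten, activation, and Dropout layers do not appear in this calculation because they have no trainable parameters.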
+ +# What you have learned so far + +You have successfully defined and initialized a feedforward neural network using PyTorch. + +The model was designed to classify handwritten digits from the MNIST dataset, and details of the architecture were printed using the `summary()` function. + +The network consists of input flattening, two hidden layers with activation functions and dropout for regularization, and an output layer with a softmax function to predict the digit class probabilities. + +You also confirmed that the model has a total of 102,762 trainable parameters. + +The next step is to train the model using the MNIST dataset, which involves feeding the data through the network, calculating the loss, and updating the weights using the gradients computed by backpropagation to improve the model's accuracy in digit classification.
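As a preview of what training involves, here is a minimal sketch of a few training steps. It is not the training code of this Learning Path and makes several assumptions: random tensors stand in for a batch of MNIST images and labels, the Adam optimizer with a learning rate of `1e-3` and a batch size of 32 are arbitrary choices, and the stand-in model omits the final Softmax layer because `nn.CrossEntropyLoss` applies log-softmax internally and expects raw scores.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the random stand-in data reproducible

# Stand-in model (assumption): same layer sizes as the network above,
# but without the final Softmax, since CrossEntropyLoss expects raw scores.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 96),
    nn.Tanh(),
    nn.Dropout(0.2),
    nn.Linear(96, 256),
    nn.Sigmoid(),
    nn.Dropout(0.2),
    nn.Linear(256, 10),
)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed learning rate

# Random tensors standing in for one batch of 32 MNIST images and labels.
images = torch.rand(32, 1, 28, 28)
labels = torch.randint(0, 10, (32,))

model.train()  # training mode, so the Dropout layers are active
losses = []
for step in range(5):
    optimizer.zero_grad()           # clear gradients left from the previous step
    scores = model(images)          # forward pass: one row of 10 scores per image
    loss = loss_fn(scores, labels)  # measure the prediction error
    loss.backward()                 # backpropagation: compute the gradients
    optimizer.step()                # update the weights and biases
    losses.append(loss.item())

print(losses)
```

In real training, the batches would come from a `DataLoader` over the MNIST training set, and the loop would repeat over the whole dataset for several epochs.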