
# Basic Neural Network with PyTorch

This notebook provides a step-by-step guide to building a basic neural network using PyTorch. We'll use the Breast Cancer dataset from Scikit-Learn to demonstrate a binary classification task.

By the end of this notebook, you'll understand how to:

1. Load and preprocess data for a neural network
2. Define and train a neural network using PyTorch
3. Evaluate and make predictions with the trained model


## Step 1: Import Libraries

In [1]:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np


## Understanding the Breast Cancer Dataset

In this example, we're using the **Breast Cancer dataset** from Scikit-Learn, a well-known dataset for binary classification. Its features are computed from digitized images of fine needle aspirates (FNA) of breast masses. Specifically, it provides **30 features** that describe the characteristics of the cell nuclei in each image. The goal is to predict whether a tumor is **malignant** (cancerous) or **benign** (non-cancerous).

### Breakdown of the Dataset

1. **Features (Inputs)**:
   - Each sample (or row) represents one breast mass sample.
   - Each sample includes 30 features. Ten measurements of the cell nuclei are each summarized as a mean, a standard error, and a "worst" (largest) value:
     - **Radius**: Mean of the distances from the center to points on the perimeter.
     - **Texture**: Standard deviation of gray-scale values.
     - **Perimeter** and **Area**: Measurements that give insights into nucleus size.
     - **Smoothness**: Local variation in radius lengths.
     - **Compactness**, **Concavity**, and **Concave points**: Shape-related metrics.
     - **Symmetry** and **Fractal dimension**: Provide structural information.

   These features help the model learn patterns that distinguish between malignant and benign tumors.

2. **Target (Output)**:
   - The target variable (label) is binary, with two possible values:
     - **0** for malignant (cancerous).
     - **1** for benign (non-cancerous).
   - Our model learns to predict this label based on the input features, identifying patterns that distinguish benign from malignant tumors.

### Example Data Sample

Each data sample might look something like this (simplified for clarity):

| Radius | Texture | Perimeter | Area | Smoothness | Compactness | Concavity | ... | Target |
|--------|---------|-----------|------|------------|-------------|-----------|-----|--------|
| 17.99  | 10.38   | 122.8     | 1001 | 0.1184     | 0.2776      | 0.3001    | ... | 0 (Malignant) |
| 13.54  | 14.36   | 87.46     | 566  | 0.09779    | 0.08129     | 0.06664   | ... | 1 (Benign)    |

Each value in the columns represents a measurement from a cell image. The model uses these values to classify new samples as either malignant or benign.
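
The label encoding is easy to get backwards, so it can be worth checking it against the dataset itself. The sketch below assumes scikit-learn is installed and reads the class names from `target_names`, where position `i` names label `i`; the variable names `names` and `counts` are chosen here for illustration.

```python
from collections import Counter

from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

# target_names[i] is the class name for label i
names = [str(name) for name in data.target_names]
print(names)  # ['malignant', 'benign']

# Count how many samples carry each label
counts = Counter(int(label) for label in data.target)
for label, name in enumerate(names):
    print(f"{label} = {name}: {counts[label]} samples")
```

The counts also show that the classes are not perfectly balanced, which is useful context when reading an accuracy figure later on.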

### Why This Data?

The Breast Cancer dataset is widely used because:
- It has real-world medical relevance, making it useful for applications in healthcare.
- It’s small enough for efficient training, ideal for demonstrating machine learning concepts.
- It provides a clear binary classification task, making it easy to interpret the network's output and evaluate model performance.

### Summary

This dataset allows us to create a model that learns patterns in cell measurements to predict if a tumor is cancerous. By training on these features and labels, our model can generalize well enough to classify new, unseen samples. This is a valuable capability in real-world medical applications, where accurate predictions can assist in diagnosis and treatment.


In [2]:
# Import necessary libraries
from sklearn.datasets import load_breast_cancer
import pandas as pd

# Load the Breast Cancer dataset
data = load_breast_cancer()

# Print dataset description
print("Dataset Description:\n")
print(data.DESCR)

# Convert the dataset to a DataFrame for easier viewing
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target  # Add the target column

# Display the first few rows of the DataFrame
print("\nFirst Few Rows of the Dataset:\n")
print(df.head())

# Display summary statistics for the DataFrame
print("\nSummary Statistics:\n")
print(df.describe())

# Display target distribution
print("\nTarget Distribution:\n")
print(df['target'].value_counts())

# Print feature and target names for reference
print("\nFeature Names:\n", data.feature_names)
print("\nTarget Names:\n", data.target_names)



Dataset Description:

.. _breast_cancer_dataset:

Breast cancer wisconsin (diagnostic) dataset
--------------------------------------------

**Data Set Characteristics:**

:Number of Instances: 569

:Number of Attributes: 30 numeric, predictive attributes and the class

:Attribute Information:
    - radius (mean of distances from center to points on the perimeter)
    - texture (standard deviation of gray-scale values)
    - perimeter
    - area
    - smoothness (local variation in radius lengths)
    - compactness (perimeter^2 / area - 1.0)
    - concavity (severity of concave portions of the contour)
    - concave points (number of concave portions of the contour)
    - symmetry
    - fractal dimension ("coastline approximation" - 1)

    The mean, standard error, and "worst" or largest (mean of the three
    worst/largest values) of these features were computed for each image,
    resulting in 30 features.  For instance, field 0 is Mean Radius, field
    10 is Radius SE, field 20 is


## Step 2: Load and Prepare the Data

In this step, we load the breast cancer dataset, split it into training and testing sets, and normalize it for faster convergence.


In [3]:

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Normalize the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Convert data to PyTorch tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32)


### Code Explanation: Preparing the Breast Cancer Dataset for Neural Network Training

The code above prepares the **Breast Cancer dataset** for training a neural network by following essential steps: data loading, splitting, normalization, and conversion to PyTorch tensors.

1. **Load Dataset**:
   - The function `load_breast_cancer()` loads the Breast Cancer dataset from Scikit-Learn, returning a special data structure called a **Bunch**.
   - `data.data`: This contains the feature values for each sample in the dataset. It’s a 2D array with shape `(samples, features)`, where each row is a sample, and each column is a feature (like mean radius or smoothness).
   - `data.target`: This contains the labels or targets (0 for malignant and 1 for benign) associated with each sample.
   
   **Variables Created**:
   - `X`: Stores the features (predictors) from the dataset.
   - `y`: Stores the target labels.

---

2. **Split the Data**:
   - `train_test_split(X, y, test_size=0.2, random_state=42)`: This function from Scikit-Learn splits the data into **training** and **testing** sets.
     - `test_size=0.2`: Reserves 20% of the data for testing and 80% for training.
     - `random_state=42`: Ensures reproducibility of the split (the same data split each time).
   - **Why Split?** The model is trained on the training set and evaluated on the test set to check its generalization to unseen data.

   **Variables Created**:
   - `X_train` and `y_train`: The features and labels for the training set.
   - `X_test` and `y_test`: The features and labels for the test set.
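
As a rough check on the split sizes, the arithmetic can be worked out by hand. The sketch below assumes the 569 instances reported in the dataset description and mirrors how scikit-learn handles a fractional `test_size` when no `train_size` is given; the variable names are illustrative.

```python
import math

n_samples = 569   # number of instances reported in the dataset description
test_size = 0.2

# For a fractional test_size, scikit-learn rounds the test count up
# and gives the remaining samples to the training set.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 455 114
```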

---

3. **Normalize the Data**:
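   - Neural networks usually train better when the input features are on comparable scales. In this dataset, area values are in the hundreds or thousands while smoothness values are around 0.1.
   - `StandardScaler()`: Standardizes each feature to have a mean of 0 and a standard deviation of 1.
     - `scaler.fit_transform(X_train)`: Computes each feature's mean and standard deviation on `X_train` and transforms the data.
     - `scaler.transform(X_test)`: Transforms `X_test` using the mean and standard deviation learned from `X_train`. The scaler is never fitted on the test set, so no information about the test data leaks into training.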

   **Variables Updated**:
   - `X_train` and `X_test` are now scaled, improving the model’s performance and helping the neural network converge faster during training.
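
To make the scaling step concrete, here is a minimal pure-Python sketch of what standardization does to a single feature. The toy values are made up for illustration; like `StandardScaler`, it uses the population standard deviation and reuses the training statistics for the test values.

```python
from statistics import fmean, pstdev

train = [10.0, 12.0, 14.0, 16.0]   # toy training values for one feature
test = [13.0, 20.0]                # toy test values for the same feature

# "fit": learn the statistics from the training data only
mu = fmean(train)
sigma = pstdev(train)              # population standard deviation

# "transform": apply the same statistics to both sets
train_scaled = [(v - mu) / sigma for v in train]
test_scaled = [(v - mu) / sigma for v in test]

print(train_scaled)
print(test_scaled)
```

The scaled training values have mean 0 and standard deviation 1 by construction. The scaled test values generally do not, which is expected: they are measured relative to the training distribution.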

---

4. **Convert Data to PyTorch Tensors**:
   - PyTorch models require data to be in **tensor** format.
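   - `torch.tensor(data, dtype=torch.float32)`: Converts `X_train`, `y_train`, `X_test`, and `y_test` to PyTorch tensors with a `float32` data type, PyTorch's default floating-point type and the type of the model's weights. The explicit `dtype` matters: the NumPy feature arrays are `float64` and the labels are integers, while `nn.Linear` and `nn.BCELoss` expect floating-point tensors that match the model's `float32` parameters.
   - `float32` is also cheaper in memory and compute than `float64` (double precision), whose extra precision is not needed here.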

   **Variables Created**:
   - `X_train_tensor`, `y_train_tensor`, `X_test_tensor`, and `y_test_tensor`: These PyTorch tensors are the input data prepared and ready for training a neural network model.

---

### Summary

These steps ensure that:
1. The data is split into training and test sets to allow model evaluation.
2. Features are standardized to zero mean and unit variance, making training more efficient.
3. Data is converted to tensors for compatibility with PyTorch models, setting the stage for model training.

This preparation is critical for creating a robust and efficient neural network training pipeline.



## Step 3: Create DataLoader for Batch Processing

Using DataLoader allows us to load data in batches, which helps in efficient training.


In [4]:

# Create TensorDataset and DataLoader
train_data = TensorDataset(X_train_tensor, y_train_tensor)
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)

test_data = TensorDataset(X_test_tensor, y_test_tensor)
test_loader = DataLoader(test_data, batch_size=32, shuffle=False)


This code snippet prepares the data for efficient training and evaluation of the neural network by creating **TensorDataset** and **DataLoader** objects. These tools are useful for managing and feeding data into the neural network during training and testing in PyTorch.

### Explanation of Each Step

1. **TensorDataset**:
   - `TensorDataset` is a PyTorch utility that groups input data (`X_train_tensor` and `X_test_tensor`) and their corresponding target labels (`y_train_tensor` and `y_test_tensor`) together.
   - By creating `TensorDataset` objects (`train_data` and `test_data`), we combine each feature set (input) with its label (target), making it easier to manage and feed to the model during training and testing.
   - `train_data` contains all training samples and their labels, while `test_data` contains all test samples and labels.

   **Example**:
   - `train_data[0]` would return a tuple containing the first training sample and its corresponding label.

2. **DataLoader**:
   - `DataLoader` takes a dataset (like `train_data` or `test_data`) and provides an efficient way to iterate over the data in **batches**.
   - It loads data in **batches** of a specified size (`batch_size=32` in this example), meaning each iteration of the DataLoader yields up to 32 samples; the final batch is smaller when the dataset size is not a multiple of 32. Batch processing keeps memory use bounded and lets the model update its weights many times per epoch.
   - **`shuffle=True`** in `train_loader`: Randomly shuffles the data before each epoch, which helps prevent the model from learning any order-based patterns and generally leads to better model generalization.
   - **`shuffle=False`** in `test_loader`: The test set doesn’t need shuffling because it is only used for evaluation, not learning. Keeping it in the same order allows for consistent evaluation.

3. **Batch Size**:
   - **Batch Size** (`batch_size=32`): Refers to the number of samples the model processes before updating the weights. A batch size of 32 is common, balancing training speed and memory efficiency.
   - When `DataLoader` iterates, it yields up to 32 samples at a time, allowing the model to calculate the gradients and update weights based on each batch rather than the entire dataset.
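
The number of batches per epoch follows from simple arithmetic. The sketch below assumes the 455/114 train/test split that results from 569 samples with `test_size=0.2`, and a loader that keeps the final partial batch (the `DataLoader` default, `drop_last=False`); the `summary` dictionary is just an illustrative container.

```python
import math

batch_size = 32
sizes = {"train": 455, "test": 114}   # assumed split sizes

summary = {}
for name, n in sizes.items():
    n_batches = math.ceil(n / batch_size)            # what len(loader) would report
    last_batch = n - (n_batches - 1) * batch_size    # size of the final batch
    summary[name] = (n_batches, last_batch)
    print(f"{name}: {n_batches} batches, last batch has {last_batch} samples")
```

With these sizes, the training loop performs 15 weight updates per epoch, and neither loader ends on a batch of exactly one sample.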

### Summary
- `train_loader` and `test_loader` now allow the model to iterate over the training and test datasets in efficient batches, which helps with:
  - **Faster training**: Processing small batches of data instead of the entire dataset at once.
  - **Memory efficiency**: By not loading the entire dataset at once.
  - **Shuffling**: Improving model generalization for training batches (only applied to the training set).



## Step 4: Define the Neural Network Model

We'll define a simple neural network with a single hidden layer. We'll use ReLU activation in the hidden layer and Sigmoid activation in the output layer for binary classification.


In [5]:

class SimpleNN(nn.Module):
    def __init__(self, input_size):
        super(SimpleNN, self).__init__()
        self.hidden = nn.Linear(input_size, 16)  # Hidden layer with 16 neurons
        self.output = nn.Linear(16, 1)           # Output layer for binary classification

    def forward(self, x):
        x = torch.relu(self.hidden(x))           # ReLU activation for hidden layer
        x = torch.sigmoid(self.output(x))        # Sigmoid activation for output layer
        return x


This code defines a simple neural network in PyTorch. Let's break down each part to understand how it works.

### Class Definition: `SimpleNN`

The `SimpleNN` class inherits from `nn.Module`, which is the base class for all neural network modules in PyTorch. This inheritance allows us to define custom architectures and layers, making it easy to structure and extend neural networks.

1. **`__init__` Method (Initialization)**:
   - **Purpose**: Defines the layers of the neural network. In this case, the network has one hidden layer and one output layer.
   - **Parameters**:
     - `input_size`: The number of input features in each sample. This allows the network to accept input data with a specific number of features.
   - **Layers**:
     - `self.hidden`: Creates a **hidden layer** with `input_size` inputs and 16 neurons (outputs). The `nn.Linear` layer performs a linear transformation \( xW^T + b \), where `x` is the input, `W` is the weight matrix of shape `(16, input_size)`, and `b` is the bias.
     - `self.output`: Creates an **output layer** with 16 inputs (matching the hidden layer’s outputs) and 1 neuron. The single neuron is suitable for **binary classification**, outputting a single probability.
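
To see what these two layers contain, the sketch below builds them on their own, assuming 30 input features as in this dataset, and inspects their shapes. It also recomputes the hidden layer's output by hand to illustrate that `nn.Linear` stores its weight as `(out_features, in_features)`. The names `hidden`, `output`, and `manual` are local to this example.

```python
import torch
import torch.nn as nn

hidden = nn.Linear(30, 16)   # 30 input features, 16 hidden neurons
output = nn.Linear(16, 1)    # 16 hidden neurons, 1 output

# nn.Linear stores its weight as (out_features, in_features)
print(hidden.weight.shape)   # torch.Size([16, 30])
print(hidden.bias.shape)     # torch.Size([16])

# Total trainable parameters: 30*16 + 16 + 16*1 + 1
n_params = sum(p.numel() for layer in (hidden, output) for p in layer.parameters())
print(n_params)              # 513

# The layer output matches x @ W^T + b computed by hand
x = torch.randn(4, 30)
manual = x @ hidden.weight.T + hidden.bias
print(torch.allclose(hidden(x), manual, atol=1e-5))
```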

### `forward` Method (Forward Pass)

The `forward` method defines how data flows through the network. Each layer applies transformations to the data as it passes through.

2. **Activation Functions**:
   - **ReLU Activation (`torch.relu`)**:
     - Applied to the hidden layer. ReLU (Rectified Linear Unit) sets negative values to zero and keeps positive values the same, introducing non-linearity and helping the network learn complex patterns.
     - This operation takes the output of `self.hidden(x)` and applies ReLU. Without a non-linearity between them, the hidden and output layers would collapse into a single linear transformation.
   - **Sigmoid Activation (`torch.sigmoid`)**:
     - Applied to the output layer. Sigmoid squashes the output to a range between 0 and 1, making it interpretable as a probability in binary classification tasks.
     - This means that for each input sample, the output is a probability between 0 (class 0) and 1 (class 1).

3. **Return Statement**:
   - The function returns the output of the sigmoid, which represents the probability that the input belongs to class 1. In this dataset, class 1 is benign.
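
The two activation functions are simple enough to write out by hand. The following is a pure-Python sketch for single numbers, not how PyTorch implements them; `relu` and `sigmoid` here are local helper functions.

```python
import math

def relu(z):
    """Keep positive values, replace negative values with zero."""
    return max(0.0, z)

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  relu={relu(z):.1f}  sigmoid={sigmoid(z):.4f}")
```

An input of 0 maps to a sigmoid output of exactly 0.5, which is why 0.5 is the natural decision threshold later in the notebook.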

### Summary

This network structure is straightforward:
- **Input Layer** → **Hidden Layer** with 16 neurons → **Output Layer** with 1 neuron.
- It uses **ReLU** for the hidden layer and **Sigmoid** for the output, a common choice for binary classification.
- We only define the layers in `__init__` and the forward computation in `forward`; PyTorch's autograd derives the backward pass automatically, which keeps training code short.



## Step 5: Instantiate Model, Loss Function, and Optimizer

We initialize the model, specify binary cross-entropy as the loss function, and use the Adam optimizer.


In [6]:

input_size = X_train.shape[1]  # Number of features in the dataset
model = SimpleNN(input_size)

criterion = nn.BCELoss()  # Binary cross-entropy loss for binary classification
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam optimizer


### Explanation

#### Determining `input_size`:

- `input_size = X_train.shape[1]`: Sets `input_size` to the number of features in the training dataset.
- `X_train.shape[1]` gives the number of columns (features) in `X_train`, which represents the number of input features the model will need.
- This value is passed to the `SimpleNN` model so the input layer matches the number of features in each sample.

#### Initializing the Model:

- `model = SimpleNN(input_size)`: Instantiates the neural network model using the `SimpleNN` class, with `input_size` as an argument.
- This line creates the model object (`model`) and defines its architecture based on `SimpleNN`, with layers and activations that we defined previously.

#### Setting the Loss Function (`criterion`):

- `criterion = nn.BCELoss()`: Defines the Binary Cross-Entropy Loss as the loss function.
- **Binary Cross-Entropy (BCE) Loss**: A common loss function for binary classification tasks, calculating the difference between the predicted probabilities and actual labels.
- BCE penalizes confident wrong predictions most heavily: the loss grows rapidly as the predicted probability of the true class approaches 0, which pushes the model to separate the two classes (benign vs. malignant).
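
For a single prediction `p` and label `y`, binary cross-entropy is `-(y*log(p) + (1-y)*log(1-p))`. The sketch below evaluates that formula in plain Python for a few probabilities; `bce` is a helper defined only for this illustration, whereas `nn.BCELoss` computes the same quantity and, by default, averages it over the batch.

```python
import math

def bce(p, y):
    """Binary cross-entropy for one predicted probability p and one label y (0 or 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# True label is 1; the loss rises sharply as the prediction moves away from it
for p in (0.9, 0.5, 0.1):
    print(f"p={p:.1f}, y=1 -> loss={bce(p, 1):.4f}")
```

A prediction of 0.5 gives a loss of ln 2 ≈ 0.6931. A freshly initialized model tends to output probabilities near 0.5, so a first-epoch loss in that neighborhood is a plausible starting point.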

#### Choosing the Optimizer (`optimizer`):

- `optimizer = optim.Adam(model.parameters(), lr=0.001)`: Initializes the Adam optimizer with a learning rate of `0.001`.
- **Adam (Adaptive Moment Estimation)**: A popular optimizer that adapts the step size for each parameter using running estimates of the first and second moments of its gradients, which often makes it converge quickly with little tuning.
- `model.parameters()`: Passes all parameters (weights and biases) of `model` to the optimizer, so Adam can adjust them to minimize the loss.
- **Learning Rate (`lr=0.001`)**: Scales the size of each parameter update. Smaller values make training slower but more stable; larger values train faster but can overshoot a good solution. `0.001` is Adam's default in PyTorch.

### Summary

- **Model Initialization**: `SimpleNN` is instantiated with the correct input size based on the number of features.
- **Loss Function (`criterion`)**: Binary Cross-Entropy Loss is chosen for binary classification.
- **Optimizer (`optimizer`)**: Adam is initialized with a learning rate of `0.001` and adapts its step sizes during training.

This setup ensures that the model, loss function, and optimizer are configured and ready for training.



## Step 6: Train the Model

Here we train the model for 10 epochs. During each epoch, we run the forward pass, compute the loss, perform the backward pass, and update the weights for every batch.


In [7]:

num_epochs = 10

for epoch in range(num_epochs):
    model.train()  # Set the model to training mode
    running_loss = 0.0
    
    for X_batch, y_batch in train_loader:
        # Forward pass
        outputs = model(X_batch).squeeze()  # Remove extra dimension
        loss = criterion(outputs, y_batch)
        
        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
    
    avg_loss = running_loss / len(train_loader)
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {avg_loss:.4f}')


Epoch [1/10], Loss: 0.6733
Epoch [2/10], Loss: 0.5622
Epoch [3/10], Loss: 0.4635
Epoch [4/10], Loss: 0.3928
Epoch [5/10], Loss: 0.3342
Epoch [6/10], Loss: 0.2954
Epoch [7/10], Loss: 0.2461
Epoch [8/10], Loss: 0.2248
Epoch [9/10], Loss: 0.1947
Epoch [10/10], Loss: 0.1822


Here's an explanation of each part of this training loop. This code iteratively trains the neural network over a set number of epochs by calculating the loss, performing backpropagation, and updating the model's parameters.

### Explanation

#### Setting the Number of Epochs:

- `num_epochs = 10`: Specifies the number of times the model will iterate over the entire training dataset. In this case, the model will train for 10 epochs.

#### Outer Loop (Epoch Loop):

- `for epoch in range(num_epochs)`: Loops through each epoch. During each epoch, the model will see every sample in the training dataset once.
- `model.train()`: Puts the model into "training mode." In PyTorch, this enables training-specific behavior in layers such as dropout and batch normalization. `SimpleNN` uses neither, so the call has no effect here, but it is a good habit.

#### Tracking Loss:

- `running_loss = 0.0`: Initializes a variable to accumulate the total loss for each epoch, which is later used to calculate the average loss for that epoch.

#### Inner Loop (Batch Loop):

- `for X_batch, y_batch in train_loader`: Loops over each batch of data in the `train_loader`.
- Each iteration loads a small subset (batch) of data (`X_batch` and `y_batch`) from the training dataset. This batch size is specified when we defined `train_loader` earlier.

#### Forward Pass:

- `outputs = model(X_batch).squeeze()`: Passes the batch of inputs (`X_batch`) through the model to obtain predictions (`outputs`).
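- `.squeeze()`: Removes the trailing dimension of size 1 so that `outputs` has shape `(batch_size,)`, matching `y_batch` as the loss function requires. With no argument, `.squeeze()` removes every dimension of size 1, so a batch containing a single sample would be reduced to a scalar; `.squeeze(1)` avoids this by removing only the last dimension.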
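
The effect of `.squeeze()` on output shapes can be checked directly. In this small sketch, zero tensors stand in for model outputs; the names `single` and `batch` are illustrative.

```python
import torch

single = torch.zeros(1, 1)       # stand-in for the model output of a one-sample batch
print(single.squeeze().shape)    # torch.Size([])   batch dimension is removed as well
print(single.squeeze(1).shape)   # torch.Size([1])  only the last dimension is removed

batch = torch.zeros(32, 1)       # stand-in for the model output of a full batch
print(batch.squeeze().shape)     # torch.Size([32])
print(batch.squeeze(1).shape)    # torch.Size([32])
```

For batches with more than one sample the two calls agree, which is why the difference is easy to overlook until a dataset happens to end on a batch of one.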

#### Calculate Loss:

- `loss = criterion(outputs, y_batch)`: Calculates the difference between the predicted outputs and the actual targets (`y_batch`) using the loss function defined earlier (Binary Cross-Entropy Loss).

#### Backward Pass and Optimization:

- `optimizer.zero_grad()`: Resets the gradients of all model parameters to zero before backpropagation. This is necessary because, by default, PyTorch accumulates gradients.
- `loss.backward()`: Performs backpropagation, calculating the gradients of the loss with respect to each model parameter.
- `optimizer.step()`: Updates the model parameters based on the gradients, using the optimization algorithm defined (Adam).
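
Gradient accumulation is easiest to see on a single scalar parameter. The sketch below is a toy example, separate from the notebook's model: it calls `backward()` twice without a reset, then resets the gradient manually, which plays the role that `optimizer.zero_grad()` plays in the training loop.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(w * w).backward()
first = w.grad.item()     # d(w^2)/dw = 2w = 4.0

(w * w).backward()        # no reset: the new gradient is added to the old one
second = w.grad.item()    # 4.0 + 4.0 = 8.0

w.grad.zero_()            # manual reset of the stored gradient
(w * w).backward()
third = w.grad.item()     # back to 4.0

print(first, second, third)  # 4.0 8.0 4.0
```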

#### Accumulate Loss:

- `running_loss += loss.item()`: Accumulates the batch loss for calculating the average loss for the epoch. `loss.item()` returns the scalar value of the loss.

#### Calculate and Print Average Loss:

- `avg_loss = running_loss / len(train_loader)`: Calculates the average loss for the entire epoch by dividing the accumulated loss by the number of batches.
- `print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {avg_loss:.4f}')`: Prints the current epoch and the average loss for that epoch, which is a useful metric to monitor training progress.

### Summary

The code trains the model for 10 epochs.  
In each epoch, the model:
- Passes batches of data through the network to compute predictions.
- Calculates the loss and updates model parameters to minimize the loss.
- Tracks and displays the average training loss for each epoch, which shows whether optimization is progressing. It does not measure performance on unseen data; that is what the test set is for.



## Step 7: Evaluate the Model

After training, we evaluate the model's accuracy on the test set.


In [8]:

model.eval()  # Set the model to evaluation mode
with torch.no_grad():  # No need to track gradients
    correct = 0
    total = 0
    for X_batch, y_batch in test_loader:
        outputs = model(X_batch).squeeze()
        predicted = (outputs > 0.5).float()  # Convert probabilities to binary labels
        total += y_batch.size(0)
        correct += (predicted == y_batch).sum().item()
        
    accuracy = correct / total
    print(f'Test Accuracy: {accuracy * 100:.2f}%')


Test Accuracy: 96.49%


### Explanation

#### Setting the Model to Evaluation Mode:

- `model.eval()`: Switches the model to evaluation mode.
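- In evaluation mode, dropout layers are turned off and batch normalization layers use their stored running statistics instead of per-batch statistics, which makes predictions deterministic. `SimpleNN` contains neither layer type, so the call changes nothing here, but it is good practice.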
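
The difference between the two modes shows up clearly on a layer that actually behaves differently in each. The sketch below uses a standalone `nn.Dropout` layer, which is not part of `SimpleNN`, purely as an illustration.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()
train_out = drop(x)   # each element is either 0 (dropped) or 2 (kept, rescaled by 1/(1-p))

drop.eval()
eval_out = drop(x)    # in evaluation mode dropout passes the input through unchanged

print(train_out)
print(eval_out)
```

The training-mode output is random, so it differs from run to run; the evaluation-mode output is always identical to the input.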

#### Disabling Gradient Calculation:

- `with torch.no_grad()`: Disables gradient tracking within this code block.
- During evaluation, we don’t need to calculate gradients or update weights, so this saves memory and computation.

#### Initializing Counters:

- `correct = 0` and `total = 0`: These variables count the number of correct predictions and the total number of samples, respectively, to calculate the accuracy.

#### Loop Over Test Data:

- `for X_batch, y_batch in test_loader`: Iterates over each batch in the `test_loader`.
- Each batch contains a subset of test data (`X_batch` with inputs and `y_batch` with labels).

#### Forward Pass:

- `outputs = model(X_batch).squeeze()`: Passes the batch of inputs through the model to get predictions. The `.squeeze()` removes any extra dimensions in `outputs` to match the labels' dimensions.

#### Convert Probabilities to Binary Labels:

- `predicted = (outputs > 0.5).float()`: Converts the output probabilities to binary labels.
  - Since this is a binary classification task, a threshold of 0.5 is used. Probabilities above 0.5 are classified as 1 (benign in this dataset), and those 0.5 or below are classified as 0 (malignant).
  - `.float()` converts the Boolean result to floating-point numbers to match the label format.

#### Calculate Accuracy:

- `total += y_batch.size(0)`: Increments the total count by the number of samples in the batch.
- `correct += (predicted == y_batch).sum().item()`: Compares `predicted` to `y_batch` and sums the number of correct predictions.
  - `(predicted == y_batch)` creates a tensor of Boolean values (True for correct predictions and False for incorrect).
  - `.sum()` counts the number of `True` values, and `.item()` converts this count to a Python number.
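
The thresholding and counting logic can be traced on a handful of numbers. This is a pure-Python sketch with made-up probabilities and labels, mirroring the tensor operations above on plain lists.

```python
probs = [0.91, 0.20, 0.63, 0.50]   # made-up model outputs
labels = [1, 0, 0, 0]              # made-up true labels

# Strictly greater than 0.5 becomes 1; exactly 0.5 becomes 0
preds = [1 if p > 0.5 else 0 for p in probs]

correct = sum(int(p == y) for p, y in zip(preds, labels))
accuracy = correct / len(labels)

print(preds)      # [1, 0, 1, 0]
print(accuracy)   # 0.75
```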

#### Compute and Print Accuracy:

- `accuracy = correct / total`: Calculates the accuracy by dividing the number of correct predictions by the total number of samples.
- `print(f'Test Accuracy: {accuracy * 100:.2f}%')`: Prints the accuracy as a percentage to two decimal places, providing a clear metric of the model's performance on the test set.

### Summary

This evaluation code:
- Switches the model to evaluation mode and turns off gradient tracking to save memory and computation.
- Iterates over the test dataset in batches, making predictions and calculating accuracy by comparing predicted labels with actual labels.
- Prints the final accuracy on the test set, providing a clear performance metric for the model.



## Step 8: Make Predictions

Finally, we make a prediction on a sample from the test set to see the model in action.


In [9]:

with torch.no_grad():
    # Use a sample from the test set for prediction
    sample = X_test_tensor[0].unsqueeze(0)  # Reshape to match model input
    prediction = model(sample).item()  # Get probability
    label = 1 if prediction > 0.5 else 0  # Convert to binary label
    print(f'Predicted Label: {label}, Probability: {prediction:.4f}')


Predicted Label: 1, Probability: 0.7318


### Explanation

#### Disabling Gradient Calculation:

- `with torch.no_grad()`: Disables gradient calculation within this block.
- Since we’re only making predictions (not training), we don’t need gradients, which reduces memory usage and speeds up computations.

#### Selecting and Reshaping a Sample:

- `sample = X_test_tensor[0].unsqueeze(0)`: Selects the first sample in the test set (`X_test_tensor[0]`) and reshapes it to add a batch dimension.
- `.unsqueeze(0)`: Adds an extra dimension to `sample`, making it compatible with the model’s expected input shape.
- This reshaping ensures that the sample is treated as a single batch, matching the model’s input format even when using only one sample.
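
The shape change from `.unsqueeze(0)` can be verified on a dummy sample. In the sketch below, a single untrained `nn.Linear` layer stands in for the trained model and a zero vector stands in for a test sample, so the numeric output is meaningless; only the shapes matter.

```python
import torch
import torch.nn as nn

layer = nn.Linear(30, 1)         # stand-in for the trained model
sample = torch.zeros(30)         # one sample with 30 features: shape (30,)
batched = sample.unsqueeze(0)    # add a batch dimension: shape (1, 30)

print(sample.shape)              # torch.Size([30])
print(batched.shape)             # torch.Size([1, 30])

out = layer(batched)
print(out.shape)                 # torch.Size([1, 1])

value = out.item()               # works because the tensor holds exactly one element
print(type(value))
```

Keeping the `(batch_size, features)` layout means a single sample is handled exactly like the batches used during training and evaluation.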

#### Making a Prediction:

- `prediction = model(sample).item()`: Passes the sample through the model to generate a prediction.
- `model(sample)` outputs a probability value since our model’s output layer uses a sigmoid function.
- `.item()`: Extracts this probability as a standard Python number, which is easier to work with.

#### Converting the Probability to a Binary Label:

- `label = 1 if prediction > 0.5 else 0`: Converts the probability to a binary label based on a threshold of 0.5.
- If `prediction` is greater than 0.5, the sample is classified as `1` (benign in this dataset); otherwise, it is classified as `0` (malignant).

#### Printing the Result:

- `print(f'Predicted Label: {label}, Probability: {prediction:.4f}')`: Displays the predicted label (binary) and the prediction probability (formatted to four decimal places).
- This output gives a clear idea of the model’s confidence in its prediction.

### Summary

- This code selects a single sample from the test set, reshapes it, and uses the model to make a prediction.
- The output of the model is interpreted as the probability of class `1` (benign), with probabilities above 0.5 classified as `1` and all others as `0` (malignant).
- Disabling gradient calculation and reshaping the sample make the prediction efficient and compatible with the model’s input requirements.
