# Building the Network

MindSpore provides the APIs for building network layers in the `mindspore.nn` module.
You build different types of neural network layers by instantiating the classes in this module.

## Step 1: Build a Fully-Connected Layer
In this step, we build a fully-connected layer using the `mindspore.nn.Dense` class.

- **Input Tensor:** A 2x3 matrix (`input_a`), representing two samples with three features each.
- **Fully-Connected Layer:** 
  - **in_channels:** 3 (matching the number of input features)
  - **out_channels:** 3 (the number of neurons in the layer)
  - **weight_init:** Initialized with the value `1` for simplicity.

The fully-connected layer (`Dense`) performs a linear transformation on the input tensor, producing an output tensor. The shape of the output tensor is determined by the number of samples (rows) and the `out_channels` specified in the layer.

This layer is a fundamental building block in neural networks, used to learn linear relationships between input features and produce the desired output dimensions.
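To make the linear transformation concrete, the following NumPy sketch reproduces the arithmetic by hand. It is an illustration rather than MindSpore's implementation, and it assumes a zero bias; the default bias initialization of `nn.Dense` may differ between MindSpore versions, so the values printed by the real layer could deviate.

```python
import numpy as np

# Same input as in the text: two samples with three features each.
x = np.array([[1, 1, 1], [2, 2, 2]], dtype=np.float32)

# weight_init=1 fills the (out_channels, in_channels) weight matrix with ones.
weight = np.ones((3, 3), dtype=np.float32)

# Assumption for this sketch: the bias is zero.
bias = np.zeros(3, dtype=np.float32)

# A fully-connected layer computes x @ W^T + b.
y = x @ weight.T + bias

print(y.shape)  # (2, 3): two samples, three output channels
print(y)        # each entry is the sum of that sample's features: 3 and 6
```

With all weights equal to one, every output neuron simply sums the input features, which is why each row contains a single repeated value.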

In [1]:
import mindspore as ms 
import mindspore.nn as nn 
from mindspore import Tensor 
import numpy as np 

# Construct the input tensor. 
input_a = Tensor(np.array([[1, 1, 1], [2, 2, 2]]), ms.float32) 
print(input_a) 

# Construct a fully-connected network. Set both in_channels and out_channels to 3. 
net = nn.Dense(in_channels=3, out_channels=3, weight_init=1) 
output = net(input_a) 
print(output)

[[1. 1. 1.]
 [2. 2. 2.]]


## Step 2: Build a Convolutional Layer

### Convolutional Layer Example

In this example, a 2D convolutional layer is created using MindSpore's `nn.Conv2d` class. The parameters for the convolutional layer are as follows:

- **Input Channels:** 1 (e.g., a grayscale image)
- **Output Channels:** 6 (number of filters)
- **Kernel Size:** 5x5
- **Bias:** Not used (`has_bias=False`)
- **Weight Initialization:** Normal distribution (`weight_init='normal'`)
- **Padding Mode:** 'Valid', meaning no padding is added (`pad_mode='valid'`)

An input tensor `input_x` is created with the shape `[1, 1, 32, 32]`, representing a batch size of 1, with a single channel, and a 32x32 spatial dimension (e.g., a 32x32 grayscale image).

The shape of the output from the convolutional layer is then printed.

The output shape will be `[1, 6, 28, 28]`, where:
- 1 is the batch size,
- 6 is the number of output channels (filters),
- 28x28 is the spatial dimension of the output, reduced because the 5x5 kernel is applied without padding (32 - 5 + 1 = 28).
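The spatial size can be checked with the usual output-size formula for an unpadded convolution. The helper `conv_output_size` below is a hypothetical name introduced only for this sketch; it is not part of MindSpore.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Spatial output size of a convolution with no padding ('valid' mode)."""
    return (input_size - kernel_size) // stride + 1


# 32x32 input, 5x5 kernel, stride 1 (the stride is assumed to be the default of 1).
side = conv_output_size(32, 5)

# Batch size and number of filters are taken from the example in the text.
output_shape = [1, 6, side, side]
print(output_shape)  # [1, 6, 28, 28]
```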



In [3]:
conv2d = nn.Conv2d(1, 6, 5, has_bias=False, weight_init='normal', pad_mode='valid')
input_x = Tensor(np.ones([1, 1, 32, 32]), ms.float32)

print(conv2d(input_x).shape)

## Step 3: Build a ReLU Layer

In this step, we apply the Rectified Linear Unit (ReLU) activation function using the `mindspore.nn.ReLU` class.

- **ReLU Function:** This activation function outputs the input directly if it is positive; otherwise, it outputs zero. It is commonly used in neural networks to introduce non-linearity.

- **Input Tensor:** A 1D tensor (`input_x`) with values `[-1, 2, -3, -1]`, represented as `float16`.

- **Output:** After applying the ReLU function, all negative values in the input tensor are replaced with `0`, resulting in `[0, 2, 0, 0]`.

The ReLU function helps in addressing the vanishing gradient problem and is widely used in deep learning models due to its simplicity and effectiveness.
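As a quick cross-check of the expected result, ReLU is just an element-wise `max(x, 0)`. The NumPy sketch below mirrors the input used in this step; it illustrates the function itself and does not call MindSpore.

```python
import numpy as np

# Same values and dtype as the input tensor in this step.
x = np.array([-1, 2, -3, -1], dtype=np.float16)

# ReLU keeps positive values and replaces negative values with zero.
y = np.maximum(x, 0)

print(y)  # negative entries become 0; the positive entry 2 is unchanged
```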

In [5]:
relu = nn.ReLU()
input_x = Tensor(np.array([-1, 2, -3, -1]), ms.float16)

output = relu(input_x)

print(output)

## Step 4: Build a Pooling Layer
In this step, we use the `mindspore.nn.MaxPool2d` class to apply the max pooling operation on a 2D input tensor.

- **Max Pooling:** This operation reduces the spatial dimensions of the input by selecting the maximum value from each region defined by the `kernel_size`. Pooling itself has no learnable parameters; by downsampling the feature maps it reduces the computation and the number of parameters needed in later layers, which also helps control overfitting.

- **Kernel Size:** 2x2, meaning that the pooling operation will consider non-overlapping 2x2 regions from the input tensor.
- **Stride:** 2, which dictates the step size for the pooling window, effectively halving the spatial dimensions.

- **Input Tensor:** A 4D tensor (`input_x`) with a shape of `[1, 6, 28, 28]`, representing:
  - 1 batch size,
  - 6 channels (e.g., feature maps),
  - 28x28 spatial dimensions.

- **Output Shape:** After applying the max pooling operation, the spatial dimensions are reduced by half, resulting in an output shape of `[1, 6, 14, 14]`. The number of channels remains unchanged.

Max pooling is commonly used in convolutional neural networks (CNNs) to reduce the spatial dimensions of feature maps while retaining the most important information.
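The effect of a 2x2 window with stride 2 is easier to see on a small array. The following NumPy sketch is one possible way to express non-overlapping max pooling; `max_pool_2x2` is a hypothetical helper written for this illustration and is not how MindSpore implements `nn.MaxPool2d`.

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on an (N, C, H, W) array; H and W must be even."""
    n, c, h, w = x.shape
    # Split each spatial axis into (number of blocks, 2) so every 2x2 block gets its own axes.
    blocks = x.reshape(n, c, h // 2, 2, w // 2, 2)
    # Take the maximum inside each 2x2 block.
    return blocks.max(axis=(3, 5))


# A 4x4 example with values 0..15 makes the selected maxima easy to verify.
small = np.arange(16, dtype=np.float32).reshape(1, 1, 4, 4)
pooled_small = max_pool_2x2(small)
print(pooled_small[0, 0])  # maxima of the four blocks: 5, 7, 13, 15

# The input shape from this step is halved along both spatial axes.
pooled = max_pool_2x2(np.ones((1, 6, 28, 28), dtype=np.float32))
print(pooled.shape)  # (1, 6, 14, 14)
```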

In [7]:
max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
input_x = Tensor(np.ones([1, 6, 28, 28]), ms.float32)

print(max_pool2d(input_x).shape)

## Step 5: Build a Flatten Layer
In this step, we use the `mindspore.nn.Flatten` class to flatten a multi-dimensional input tensor into a 2D tensor.

- **Flatten Operation:** The flattening operation reshapes the input tensor by collapsing all dimensions except the batch size into a single dimension. This is typically used before feeding the data into fully connected layers.

- **Input Tensor:** A 4D tensor (`input_x`) with a shape of `[1, 16, 5, 5]`, representing:
  - 1 batch size,
  - 16 channels,
  - 5x5 spatial dimensions.

- **Output Shape:** After flattening, the output tensor has a shape of `[1, 400]`, where 400 is the product of the remaining dimensions (`16 * 5 * 5`).

Flattening is essential in transitioning from convolutional layers to fully connected layers in a neural network, as it converts the spatial data into a format suitable for further processing.
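In array terms, flattening is a reshape that keeps the batch axis and merges everything else. A minimal NumPy sketch of the same shape change, for illustration only:

```python
import numpy as np

# Same shape as the input tensor in this step: (batch, channels, height, width).
x = np.ones((1, 16, 5, 5), dtype=np.float32)

# Keep the batch axis and collapse the remaining axes into one.
flat = x.reshape(x.shape[0], -1)

print(flat.shape)  # (1, 400), since 16 * 5 * 5 = 400
```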

In [9]:
flatten = nn.Flatten()
input_x = Tensor(np.ones([1, 16, 5, 5]), ms.float32)
output = flatten(input_x)

print(output.shape)

## Define a Model Class and View Parameters
The `Cell` class is the base class for all networks in MindSpore and the basic unit from which a network is built. To define a neural network, create a class that inherits from `nn.Cell` and override its `__init__` and `construct` methods.

In this step, we define a `LeNet5` model in this way.

### Key Components:
- **Inheritance from `nn.Cell`:** The `LeNet5` class inherits from `nn.Cell`, which allows us to leverage MindSpore's framework for building and managing neural networks.
- **`__init__` Method:** The constructor method is overridden to define the network's layers and operations, including convolutional layers, fully connected layers, ReLU activation, max pooling, and flattening.
- **`construct` Method:** This method is overridden to specify how the input data flows through the network, defining the forward pass of the model.

### Network Structure:
- **Convolutional Layers (`conv1`, `conv2`):** These layers extract features from the input using filters.
- **Max Pooling (`max_pool2d`):** Reduces the spatial dimensions while retaining the most important features.
- **Fully Connected Layers (`fc1`, `fc2`, `fc3`):** These layers combine the features and map them to the final output.
- **ReLU Activation (`relu`):** Introduces non-linearity to the model.
- **Flattening (`flatten`):** Converts each sample's feature maps (channels x height x width) into a 1D vector for the fully connected layers.

### Model Instantiation and Viewing Parameters:
After defining the `LeNet5` model, we instantiate it and use the `parameters_and_names` method to view the model's parameters. This method provides insight into the model's internal structure, including the weights and biases of each layer.

The `LeNet5` network follows the classic LeNet design for image classification and is often used as a baseline model for simple datasets such as MNIST.
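To know roughly what to expect from the parameter listing, the sizes can be estimated by hand. The sketch below is plain Python arithmetic, not MindSpore code. It assumes that the convolutional layers have no bias (as when `has_bias` is left at `False`) and that each `Dense` layer has a bias; check these assumptions against your MindSpore version. The helpers `conv_params` and `dense_params` are hypothetical names used only here.

```python
def conv_params(in_channels, out_channels, kernel_size, has_bias=False):
    """Number of learnable values in a 2D convolution with a square kernel."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if has_bias else 0)


def dense_params(in_features, out_features, has_bias=True):
    """Number of learnable values in a fully connected layer."""
    weights = in_features * out_features
    return weights + (out_features if has_bias else 0)


# Layer sizes as defined in the LeNet5 class, with num_class=10 and num_channel=1.
counts = {
    "conv1": conv_params(1, 6, 5),
    "conv2": conv_params(6, 16, 5),
    "fc1": dense_params(16 * 4 * 4, 120),
    "fc2": dense_params(120, 84),
    "fc3": dense_params(84, 10),
}

for name, count in counts.items():
    print(name, count)

total = sum(counts.values())
print("total", total)  # 44404 under the assumptions above
```

Most of the parameters sit in `fc1`, which is typical for small convolutional networks that flatten into a dense layer.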

### Description:
- **Input Image:** A grayscale image with dimensions 1x28x28.
- **Conv1:** First convolutional layer with 6 filters of size 5x5, outputting feature maps of size 6x24x24.
- **ReLU Activation:** Applies the ReLU function to introduce non-linearity.
- **Max Pooling:** Reduces spatial dimensions by applying 2x2 pooling, resulting in feature maps of size 6x12x12.
- **Conv2:** Second convolutional layer with 16 filters of size 5x5, outputting feature maps of size 16x8x8.
- **ReLU Activation:** Applies the ReLU function.
- **Max Pooling:** Further reduces dimensions to 16x4x4.
- **Flatten:** Converts the 3D feature maps into a 1D vector of size 256.
- **FC1:** Fully connected layer with 120 neurons.
- **ReLU Activation:** Applies the ReLU function.
- **FC2:** Fully connected layer with 84 neurons.
- **ReLU Activation:** Applies the ReLU function.
- **FC3:** Final fully connected layer with a number of neurons equal to the number of output classes (`num_class`).
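The shapes listed above can be verified by tracing them through the layers. The following plain-Python sketch assumes unpadded convolutions with stride 1 and 2x2 pooling with stride 2, matching the description; `conv_valid` and `pool` are hypothetical helpers introduced for this check.

```python
def conv_valid(shape, out_channels, kernel_size):
    """Shape (C, H, W) after an unpadded, stride-1 convolution."""
    _, h, w = shape
    return (out_channels, h - kernel_size + 1, w - kernel_size + 1)


def pool(shape, kernel_size=2, stride=2):
    """Shape (C, H, W) after pooling; the channel count is unchanged."""
    c, h, w = shape
    return (c, (h - kernel_size) // stride + 1, (w - kernel_size) // stride + 1)


shape = (1, 28, 28)  # grayscale input image
trace = {"input": shape}

shape = conv_valid(shape, 6, 5)
trace["conv1"] = shape
shape = pool(shape)
trace["pool1"] = shape
shape = conv_valid(shape, 16, 5)
trace["conv2"] = shape
shape = pool(shape)
trace["pool2"] = shape

# Flatten multiplies the remaining dimensions together.
flat_size = shape[0] * shape[1] * shape[2]

for name, s in trace.items():
    print(name, s)
print("flatten", flat_size)  # 256, the input size expected by fc1
```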

Below is a graphical representation of the LeNet5 architecture defined in the code:

```plaintext
Input (1x28x28)       # Input image with 1 channel (e.g., grayscale) and 28x28 pixels

       |
       v

+-----------------+
| Conv2d (6x5x5)  |  # 6 filters of size 5x5, valid padding
| Output: 6x24x24 |
+-----------------+

       |
       v

+-----------------+
| ReLU            |  # Activation function
+-----------------+

       |
       v

+-----------------+
| MaxPool2d (2x2) |  # Pooling with 2x2 kernel, stride 2
| Output: 6x12x12 |
+-----------------+

       |
       v

+-----------------+
| Conv2d (16x5x5) |  # 16 filters of size 5x5, valid padding
| Output: 16x8x8  |
+-----------------+

       |
       v

+-----------------+
| ReLU            |  # Activation function
+-----------------+

       |
       v

+-----------------+
| MaxPool2d (2x2) |  # Pooling with 2x2 kernel, stride 2
| Output: 16x4x4  |
+-----------------+

       |
       v

+-----------------+
| Flatten         |  # Flatten the output to 1D
| Output: 16*4*4  |
+-----------------+

       |
       v

+-----------------+
| Dense (120)     |  # Fully connected layer with 120 neurons
| Output: 120     |
+-----------------+

       |
       v

+-----------------+
| ReLU            |  # Activation function
+-----------------+

       |
       v

+-----------------+
| Dense (84)      |  # Fully connected layer with 84 neurons
| Output: 84      |
+-----------------+

       |
       v

+-----------------+
| ReLU            |  # Activation function
+-----------------+

       |
       v

+-------------------+
| Dense (num_class) |  # Fully connected layer with `num_class` neurons
| Output: num_class |
+-------------------+

       |
       v

Output (num_class)  # Final output: one raw score per class (10 for MNIST)
```



In [11]:
class LeNet5(nn.Cell): 
    """ Lenet network structure """ 
    def __init__(self, num_class=10, num_channel=1): 
        super(LeNet5, self).__init__() 
        
        # Define the required operations. 
        self.conv1 = nn.Conv2d(num_channel, 6, 5, pad_mode='valid') 
        self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid') 
        self.fc1 = nn.Dense(16 * 4 * 4, 120)
        self.fc2 = nn.Dense(120, 84) 
        self.fc3 = nn.Dense(84, num_class) 
        self.relu = nn.ReLU() 
        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2) 
        self.flatten = nn.Flatten()

    def construct(self, x):
        # Use the defined operations to build a feedforward network. 
        x = self.conv1(x) 
        x = self.relu(x) 
        x = self.max_pool2d(x) 
        x = self.conv2(x) 
        x = self.relu(x) 
        x = self.max_pool2d(x) 
        x = self.flatten(x) 
        x = self.fc1(x) 
        x = self.relu(x) 
        x = self.fc2(x) 
        x = self.relu(x) 
        x = self.fc3(x) 
        return x


# Instantiate the model and use the parameters_and_names method to view the model parameters.
modelle = LeNet5() 
for m in modelle.parameters_and_names(): 
    print(m)