<a href="https://colab.research.google.com/github/codebizpro/deep_learning_using_neural_networks/blob/main/Deep_Learning_using_Neural_Networks_A.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# @title
# Start tracking time when the notebook begins executing.
# At the end, we use it to measure the total execution time.
import time
start_time = time.time()

**Deep Learning:**

Deep learning is a branch of machine learning that uses multi-layer `neural network` models, loosely inspired by the human brain, to learn patterns directly from large amounts of data. It is widely used in image and speech recognition, natural language processing, and autonomous driving. Its rapid progress has been driven by growth in available data, advances in computational power, and interdisciplinary collaboration.

**Tensor**:

Deep learning frameworks are built around a data structure called the `tensor`: a multi-dimensional array of numbers. Tensors are well suited to multi-dimensional data such as `images`, `videos`, and `sequential data`, which are common in deep learning tasks, and they are used to represent the `inputs`, `weights`, and `activations` within `neural networks`. Frameworks such as PyTorch and TensorFlow can also place tensors on hardware accelerators like GPUs, which speeds up the computations needed to train complex models on large datasets.

**PyTorch:**

PyTorch is a widely used deep learning framework, known for its intuitive design and flexibility. Developed by Facebook's AI Research lab (FAIR), it builds its computation graph dynamically as the code runs, which makes experimentation, iterative model development, and debugging easier. Together with its libraries, tools, and active community, this lets practitioners tackle tasks ranging from image and speech recognition to natural language processing and reinforcement learning. PyTorch is used in research, industrial applications, and education.

## Set up Environment

In [2]:
# Import PyTorch Package
import torch

# Import other packages
import numpy as np


## Create your tensors

We can create tensors from either a Python list or a NumPy array.

In [3]:
# Create tensor from a list

data_01 = [ [1,2,3,4],
              [5,6,7,8]]
tensor_01 = torch.tensor(data_01)

print (tensor_01)

# Create tensor from numpy array

data_02 = [ [8,2,1,7],
              [0,4,6,1]]
data_numpy = np.array(data_02)
tensor_02 = torch.from_numpy(data_numpy)

print (tensor_02)

tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])
tensor([[8, 2, 1, 7],
        [0, 4, 6, 1]])
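
The two creation routes above differ in one detail worth knowing: `torch.from_numpy` shares memory with the NumPy array, while `torch.tensor` makes a copy. The sketch below illustrates this; the names `source_array`, `shared_tensor`, and `copied_tensor` are hypothetical and not part of the notebook.

```python
import numpy as np
import torch

source_array = np.array([[8, 2, 1, 7],
                         [0, 4, 6, 1]])

shared_tensor = torch.from_numpy(source_array)   # shares memory with source_array
copied_tensor = torch.tensor(source_array)       # makes an independent copy

# Modify the NumPy array in place
source_array[0, 0] = 100

print(shared_tensor[0, 0].item())   # reflects the change made to the array
print(copied_tensor[0, 0].item())   # keeps the original value
```

If the tensor must not change when the array does, `torch.tensor` is the safer choice.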


## Tensor Attributes

Like other data structures, tensors have several attributes, such as shape and data type (`dtype`).


In [4]:
print(tensor_01.shape)

torch.Size([2, 4])


In [5]:
print (tensor_02.dtype)

torch.int64
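
The `dtype` depends on the values used to create the tensor, and it can also be set explicitly. A minimal sketch, assuming PyTorch's default floating-point type has not been changed; the variable names are hypothetical.

```python
import torch

int_tensor = torch.tensor([[1, 2, 3, 4]])             # Python ints -> torch.int64
float_tensor = torch.tensor([[1.0, 2.0, 3.0, 4.0]])   # Python floats -> torch.float32 by default

# Request a dtype at creation time, or convert an existing tensor
explicit_tensor = torch.tensor([[1, 2, 3, 4]], dtype=torch.float32)
converted_tensor = int_tensor.to(torch.float32)

print(int_tensor.dtype, float_tensor.dtype)
print(explicit_tensor.dtype, converted_tensor.dtype)
print(int_tensor.shape, int_tensor.ndim)
```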


Tensors are optimized for high-performance computing. They can live on either the 'cpu' or the 'gpu'.

In [6]:
print (tensor_01.device)

cpu


In [7]:
print (tensor_02.device)

cpu


By default, tensors are created on the 'cpu'. However, we can always create tensors on the 'gpu' or copy existing tensors to it.

In Google Colab, we have to change the runtime settings manually.

Step 1: Select Runtime

Step 2: Select Change Runtime Type

Step 3: Select available GPU under Hardware Accelerator

**Warning** Changing the hardware accelerator deletes the existing runtime and starts a new one.

If you are using an NVIDIA GPU-based environment, you can check the status of your GPU.

In [8]:
# Check GPU information for the runtime

!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


CUDA (Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) model created by NVIDIA.

If you just want the name of the current GPU device and its number of streaming multiprocessors (each of which contains many CUDA cores), you can query this information from PyTorch.

In [9]:
# GPU name and streaming multiprocessor (SM) count

if torch.cuda.is_available():
    device = torch.cuda.current_device()
    properties = torch.cuda.get_device_properties(device)

    print("GPU Name:", properties.name)
    # multi_processor_count is the number of SMs, not the number of CUDA cores
    print("Number of streaming multiprocessors:", properties.multi_processor_count)
else:
    print("CUDA is not available.")

CUDA is not available.


The default tensor device is 'cpu'. You can change this to 'cuda'. You can also copy existing tensors to 'cuda'.

In [10]:
# Check if GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

print("Using device:", device)


Using device: cpu


**Warning** Setting the device to 'cuda' does not transfer existing tensors that live on the 'cpu'.

In [11]:
print (tensor_01.device)

cpu


We can copy 'cpu' based Tensor to 'cuda' based tensor, and vice versa.

In [12]:
tensor_01 = tensor_01.to(device)
print (tensor_01.device)

cpu
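
Besides copying with `.to(device)`, a tensor can be created directly on a device and moved back to the CPU when needed. The sketch below assumes the same device-selection pattern as above and runs on either kind of runtime; `on_device` and `back_on_cpu` are hypothetical names.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Create a tensor directly on the selected device
on_device = torch.tensor([[1, 2], [3, 4]], device=device)

# Copy it back to the CPU, for example before converting it to NumPy
back_on_cpu = on_device.cpu()

print(on_device.device.type)
print(back_on_cpu.device)
print(back_on_cpu.numpy())
```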


## Tensor Operations

We can run arithmetic operations between tensors if they have compatible dimensions.

**Warning** Make sure that all the tensors in an operation are on the same device.

In [13]:
# Copy tensor_02 to the selected device ('cuda' if available, otherwise 'cpu')

tensor_02 = tensor_02.to(device)

print (tensor_01 + tensor_02)
print (tensor_01 - tensor_02)
print (tensor_01 * tensor_02)
print (tensor_01 / tensor_02)


tensor([[ 9,  4,  4, 11],
        [ 5, 10, 13,  9]])
tensor([[-7,  0,  2, -3],
        [ 5,  2,  1,  7]])
tensor([[ 8,  4,  3, 28],
        [ 0, 24, 42,  8]])
tensor([[0.1250, 1.0000, 3.0000, 0.5714],
        [   inf, 1.5000, 1.1667, 8.0000]])
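
"Compatible dimensions" does not always mean identical shapes: PyTorch can broadcast a smaller tensor across a larger one when their trailing dimensions match. A small sketch with hypothetical tensors, showing one shape pair that works and one that does not:

```python
import torch

matrix = torch.tensor([[1, 2, 3, 4],
                       [5, 6, 7, 8]])     # shape (2, 4)
row = torch.tensor([10, 20, 30, 40])      # shape (4,)

# The row is broadcast across both rows of the matrix
broadcast_sum = matrix + row
print(broadcast_sum)

# Shapes (2, 4) and (3,) are not compatible, so PyTorch raises an error
mismatched = torch.tensor([1, 2, 3])
try:
    matrix + mismatched
    shapes_compatible = True
except RuntimeError as error:
    shapes_compatible = False
    print("Incompatible shapes:", error)
```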


## Create Neural Network

In [14]:
# @title
from IPython.display import Image

# Display the image of Neural Network
Image(url="https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Colored_neural_network.svg/250px-Colored_neural_network.svg.png")


This is a simple neural network, with an input layer, a hidden layer and an output layer.

Let's build a similar network.

In [15]:
# Import neural network from torch

import torch.nn as nn

In [16]:
# Create an input tensor with three features

input_tensor = torch.tensor(
    [[0.5, 0.4, -0.2]]
)

print (input_tensor)

tensor([[ 0.5000,  0.4000, -0.2000]])


In [17]:
# Create a hidden layer with 3 input features and 2 output features

hidden_layer = nn.Linear(3, 2)


In [18]:
# Pass the input through the hidden layer and check its output

output_layer = hidden_layer(input_tensor)

print (output_layer)

tensor([[0.1113, 0.4929]], grad_fn=<AddmmBackward0>)


We have used a linear layer as the hidden layer. We can check the weights and bias of the layer.

In [19]:
print (hidden_layer.weight)

Parameter containing:
tensor([[ 0.0873, -0.5654, -0.1424],
        [ 0.3423,  0.5612, -0.1398]], requires_grad=True)


In [20]:
print (hidden_layer.bias)

Parameter containing:
tensor([0.2654, 0.0694], requires_grad=True)


In a neural network layer, the concepts of 'weights' and 'biases' play pivotal roles in shaping the model's ability to learn and make predictions. The **weights represent the strength of connections between neurons** in adjacent layers, essentially determining the impact of input features on the output. Each connection is associated with a weight, which is adjusted during the training process to minimize the difference between predicted and actual outputs. **Biases, on the other hand, represent the offset or baseline value added to the weighted sum of inputs before passing through the activation function**. They allow the model to capture patterns that may not be directly influenced by the input data. Both weights and biases are learnable parameters that are iteratively updated through optimization algorithms like gradient descent, enabling the neural network to gradually improve its performance over time by effectively capturing the underlying patterns in the data.

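In PyTorch, where each row of the input is one sample and the weight matrix W0 has shape (output features, input features):

output = input @ W0.T + b0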
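
We can check the layer's computation by hand. The sketch below builds a fresh `nn.Linear(3, 2)` (so its random weights differ from the ones printed above) and compares the layer's output with a manual matrix product on a row-vector input; `layer`, `x`, and `manual_output` are hypothetical names.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)                  # weight shape (2, 3), bias shape (2,)
x = torch.tensor([[0.5, 0.4, -0.2]])     # one sample with 3 features

# Manual computation: multiply by the transposed weight matrix, then add the bias
manual_output = x @ layer.weight.T + layer.bias
layer_output = layer(x)

print(layer.weight.shape, layer.bias.shape)
print(torch.allclose(manual_output, layer_output, atol=1e-6))
```

The weight matrix is stored with one row per output feature, which is why it is transposed when the input is a row vector.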

In general use cases, we use neural networks with multiple layers. We can add as many layers as our use case requires and our hardware supports.

These layers are stacked using the Sequential class in the torch.nn package.

Let's create a neural network with 8 input features, 2 output features, and 3 linear layers (two hidden layers and an output layer).

In [21]:
# Create input tensor

input_tensor = torch.tensor([[0.5, 0.4, -0.2, 0.33, -0.67, -0.16, 0.55, 0.9]])

# Create first hidden layer with 8 input features and 6 output features
layer_a = nn.Linear (8, 6)

# Create second hidden layer with 6 input features and 4 output features
layer_b = nn.Linear (6,4)

# Create third layer (the output layer) with 4 input features and 2 output features
layer_c = nn.Linear (4,2)

# Stack sequence of layers to create a model

model = nn.Sequential (layer_a, layer_b, layer_c)

# Check output of the model
output_tensor = model(input_tensor)

print (output_tensor)



tensor([[-0.3206,  0.2988]], grad_fn=<AddmmBackward0>)
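
To see what "learnable parameters" means for this stacked model, we can list the shape of every weight and bias and count them. This is a minimal sketch that rebuilds a model with the same layer sizes; `total_parameters` is a hypothetical name.

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 6), nn.Linear(6, 4), nn.Linear(4, 2))

# Each Linear layer holds in_features * out_features weights plus out_features biases
for name, parameter in model.named_parameters():
    print(name, tuple(parameter.shape))

# (8*6 + 6) + (6*4 + 4) + (4*2 + 2) = 92
total_parameters = sum(parameter.numel() for parameter in model.parameters())
print("Total learnable parameters:", total_parameters)
```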


## Activation Functions

So far, we have only used linear layers. However many linear layers we stack, the result is still a single linear transformation of the input.

Activation functions are essential components of neural networks: they introduce non-linearity, allowing the network to learn complex patterns in data. Functions like `ReLU` (Rectified Linear Unit), `sigmoid`, and `tanh` are applied element-wise to a tensor, transforming each value independently.

Activation functions can be placed between hidden layers and also at the output layer (last layer) of the network. The examples below apply an activation function at the output layer.

**Interested to learn more?** Use this Wikipedia link. https://en.wikipedia.org/wiki/Activation_function
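
As a quick illustration of element-wise behavior, the sketch below applies three common activation functions to a small tensor and then places a `ReLU` between two linear layers. The tensor values and layer sizes are hypothetical, chosen only for illustration.

```python
import torch
import torch.nn as nn

values = torch.tensor([[-1.0, 0.0, 2.0]])

print(torch.relu(values))      # negatives become 0, positives pass through
print(torch.sigmoid(values))   # each value is squashed into (0, 1)
print(torch.tanh(values))      # each value is squashed into (-1, 1)

# Hypothetical model with a ReLU between two linear layers
model = nn.Sequential(
    nn.Linear(3, 4),
    nn.ReLU(),
    nn.Linear(4, 2),
)
output = model(values)
print(output.shape)
```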





**Binary Classification using sigmoid**

The choice of the final activation function depends on the use case, as it transforms the final output of the neural network. One simple example is `binary classification`, where the most commonly used activation function is `sigmoid`.

In [22]:
# Define input tensor
input_tensor = torch.tensor([[0.3, 0.4, -0.8, 0.6]])

# Define 3 linear layers (two hidden layers and an output layer)
layer_01 = nn.Linear(4,4)
layer_02 = nn.Linear(4,2)
layer_03 = nn.Linear(2,1)

# Define final activation layer
layer_activation = nn.Sigmoid()

# Use Sequential to create layered network
model = nn.Sequential (layer_01, layer_02, layer_03, layer_activation)

# Use the model for input_tensor transformation
output = model(input_tensor)

print (output)

tensor([[0.3985]], grad_fn=<SigmoidBackward0>)


The output is similar to that of logistic regression: a value between 0 and 1. By defining a threshold value, we can classify the output as class A or class B. Recall the famous example of `cat` vs. `dog` image classification as an analogy.

Rerun the above code multiple times. You will observe that the output is not the same; it changes every time. That is because the weights and biases are initialized randomly.
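
If repeatable results are needed, the random initialization can be fixed with a seed. The sketch below also applies a threshold to the sigmoid output; the helper `build_and_run`, the seed value, the 0.5 threshold, and the mapping of outputs to "class A" or "class B" are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

def build_and_run(seed):
    # Fixing the seed makes the random weight initialization repeatable
    torch.manual_seed(seed)
    model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2), nn.Linear(2, 1), nn.Sigmoid())
    return model(torch.tensor([[0.3, 0.4, -0.8, 0.6]]))

first_run = build_and_run(seed=0)
second_run = build_and_run(seed=0)
print(torch.equal(first_run, second_run))   # same seed, same output

# Apply a threshold (0.5 is an assumed, commonly used default) to pick a class
threshold = 0.5
predicted_class = "class B" if first_run.item() >= threshold else "class A"
print(first_run.item(), predicted_class)
```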

## Multi Class Classification using `softmax`

Similar to sigmoid, we can use softmax for multi-class classification.

Let us do a classification for 4 classes (4 element vector of output layer).

In [23]:
# Define input tensor
input_tensor = torch.tensor([[0.3, 0.4, -0.8, 0.6]])

# Define 3 linear layers (two hidden layers and an output layer)
layer_01 = nn.Linear(4,6)
layer_02 = nn.Linear(6,6)
layer_03 = nn.Linear(6,4)

# Define final activation layer
layer_activation = nn.Softmax(dim=-1) # dim=-1 applies softmax along the last dimension of the tensor (the class scores)

# Use Sequential to create layered network
model = nn.Sequential (layer_01, layer_02, layer_03, layer_activation)

# Use the model for input_tensor transformation
output = model(input_tensor)

print (output)

tensor([[0.2519, 0.2039, 0.2457, 0.2985]], grad_fn=<SoftmaxBackward0>)


Instead of only `cat` and `dog`, this time we have two more classes, such as `rabbit` and `deer`. Use this analogy to understand the above code.

**Fun Fact** The outputs are probabilities of the classes, so the sum of the output vector is always 1. This holds for any number of output classes, so we are free to choose how many classes the network predicts.
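
The sum-to-one property, and how a class is picked from the probabilities, can be checked directly. In this sketch the raw scores (`logits`) and the order of `class_names` are hypothetical; a real model would produce the scores from its last linear layer.

```python
import torch

# Hypothetical raw scores (logits) for 4 classes
logits = torch.tensor([[1.0, 2.0, 0.5, 3.0]])
class_names = ["cat", "dog", "rabbit", "deer"]

probabilities = torch.softmax(logits, dim=-1)
predicted_index = torch.argmax(probabilities, dim=-1).item()

print(probabilities)
print(probabilities.sum(dim=-1))       # sums to 1 (up to floating-point rounding)
print(class_names[predicted_index])    # class with the highest probability
```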

In [26]:
# @title
# Record the end time to compute the total execution time of the notebook.
end_time = time.time()

execution_time = end_time - start_time
print("Total execution time: {:.2f} seconds".format(execution_time))


Total execution time: 190.07 seconds
