# What is Deep Learning?

To understand what deep learning is, we first need to understand the relationship deep learning has with machine learning, neural networks, and artificial intelligence.

**The problem**

In traditional programming, the programmer formulates rules and conditions in their code that their program can then use to predict or classify in the correct manner. 

For example, to build a classifier the traditional way, a programmer comes up with a set of rules and programs them. To classify something, you give the program a piece of data, and the rules select the category. This approach works for a variety of problems, but NOT ALL of them! Why not? Let’s see.

![aa.png](attachment:aa.png)

Image classification is one such problem that is impractical to solve with the traditional programming method. Imagine how you would write the code for image classification. How could a programmer write rules to classify images the program has never seen before?
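To make the contrast concrete, here is a minimal sketch of the rule-based approach on a toy problem. The function name, features, and thresholds are all invented for illustration.

```python
# Hypothetical hand-written rules for a toy fruit classifier.
# The features (weight, color) and thresholds are made up for illustration.
def classify_fruit(weight_grams: float, color: str) -> str:
    if color == "yellow" and weight_grams > 100:
        return "banana"
    if color == "red" and weight_grams > 120:
        return "apple"
    if color == "red":
        return "cherry"
    # No rule matched: the programmer never anticipated this case
    return "unknown"


print(classify_fruit(150, "red"))    # apple
print(classify_fruit(8, "red"))      # cherry
print(classify_fruit(150, "green"))  # unknown (no rule covers a green apple)
```

Rules like these are manageable for two hand-picked features. An image, however, arrives as a grid of raw pixel values with no ready-made "color" or "weight" field to test, which is why writing such rules by hand breaks down.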

**Solution Found! What is it?**

Deep learning is the solution to the above problem. It is a very effective technique for recognizing patterns through trial and error. We train a deep neural network by providing a huge amount of data along with feedback on its performance. Over a huge number of iterations, the network works out its own set of conditions for acting in the correct way.

Deep learning is a branch of machine learning in which higher-level features are extracted from the data using an artificial neural network, a model inspired by the workings of the nervous system in the human body.

Broadly speaking, deep learning is a more approachable name for working with artificial neural networks that have many layers. The “deep” in deep learning refers to the depth of the network, that is, its number of layers. An artificial neural network can also be very shallow.

Neural networks are inspired by the structure of the cerebral cortex. At the basic level is the perceptron, the mathematical representation of a biological neuron. Like in the cerebral cortex, there can be several layers of interconnected perceptrons.

The first layer is the input layer. Each node in this layer takes an input and then passes its output as the input to each node in the next layer. There are generally no connections between nodes in the same layer, and the last layer produces the outputs.

We call the middle part the hidden layer. These neurons have no connection to the outside (e.g. input or output) and are only activated by nodes in the previous layer.




**Important Definition**

What is Deep Learning?

- Deep learning is a subset of machine learning that is essentially a neural network with three or more layers. DL is mainly used to deal with unstructured data.

- These neural networks attempt to simulate the behavior of the human brain, albeit far from matching its ability, which allows them to “learn” from large amounts of data.

Note: The artificial neural network (ANN) is the core concept behind everything in deep learning (DL), computer vision (CV), and neural networks in general.


## Why is Deep Learning Important?

![1.webp](attachment:1.webp)

Computers have long had techniques for recognizing features inside of images. The results weren’t always great. Computer vision has been a main beneficiary of deep learning. Computer vision using deep learning now rivals humans on many image recognition tasks.


Facebook has had great success with identifying faces in photographs by using deep learning. It’s not just a marginal improvement, but a game changer: “Asked whether two unfamiliar photos of faces show the same person, a human being will get it right 97.53 percent of the time. New software developed by researchers at Facebook can score 97.25 percent on the same challenge, regardless of variations in lighting or whether the person in the picture is directly facing the camera.”


Speech recognition is another area that has felt deep learning’s impact. Spoken languages are vast and ambiguous. Baidu, one of the leading search engines in China, has developed a voice recognition system that is faster and more accurate than humans at producing text on a mobile phone, in both English and Mandarin.


What is particularly fascinating is that generalizing across the two languages didn’t require much additional design effort: “Historically, people viewed Chinese and English as two vastly different languages, and so there was a need to design very different features,” says Andrew Ng, chief scientist at Baidu. “The learning algorithms are now so general that you can just learn.”

Google is now using deep learning to manage the energy at the company’s data centers. They’ve cut their energy needs for cooling by 40%. That translates to about a 15% improvement in power usage effectiveness (PUE) for the company and hundreds of millions of dollars in savings.

## Neural Network

- Neuron – Just as a neuron forms the basic element of our brain, a neuron forms the basic structure of a neural network. Think of what we do when we get new information: we process it and then generate an output. Similarly, in a neural network, a neuron receives an input, processes it, and generates an output that is either sent to other neurons for further processing or is the final output.


- Weights – When input enters the neuron, it is multiplied by a weight. For example, if a neuron has two inputs, then each input has an associated weight assigned to it. We initialize the weights randomly, and they are updated during the model training process. After training, the neural network assigns a higher weight to the inputs it considers more important than to the ones it considers less important. A weight of zero denotes that the particular feature is insignificant.


- Let’s assume the input is a and its associated weight is W1. After passing through the node, the input becomes a*W1.
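The weighting step described above can be sketched in a few lines of Python. The helper name `weighted_inputs` and the numbers are illustrative assumptions, not part of any library.

```python
def weighted_inputs(inputs, weights):
    """Multiply each input by its associated weight (a * W1 for every pair)."""
    return [a * w for a, w in zip(inputs, weights)]


# Two inputs: the second has a weight of zero, so it contributes nothing
print(weighted_inputs([2.0, 5.0], [0.5, 0.0]))  # [1.0, 0.0]
```

Whatever value the second input takes, its weighted value stays at zero, which is what "a weight of zero denotes an insignificant feature" means in practice.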



![bp](https://cdn.analyticsvidhya.com/wp-content/uploads/2017/05/21145421/Perceptron.bmp)

![0.gif](attachment:0.gif)

A neural network consists of interconnected neurons transferring information to each other, much like the human brain. Each neuron is assigned a value. 

A neural network has many layers. Each layer performs a specific function, and the more complex the network is, the more layers it has. That’s why a neural network is also called a multi-layer perceptron.

The purest form of a neural network has three layers:

1. The input layer
2. The hidden layer
3. The output layer

As the names suggest, each of these layers has a specific purpose. These layers are made up of nodes. There can be multiple hidden layers in a neural network according to the requirements. The input layer picks up the input signals and transfers them to the next layer. It gathers the data from the outside world.

The hidden layer performs all the back-end calculations. In principle, a network can have zero hidden layers (a single-layer perceptron), but the multi-layer networks discussed here have at least one. The output layer transmits the final result of the hidden layer’s calculation.

Like other machine learning models, a neural network has to be trained on some training data before you give it a particular problem to solve. But before we go deeper into how a neural network solves a problem, you should first know how perceptron layers work:

#### Note

- Input Layer
This is the initial layer of the network which takes in an input that will be used to produce an output.


- Hidden Layer(s)
The network needs to have at least one hidden layer. The hidden layer(s) perform computations and operations on the input data to produce something meaningful.


- Output Layer
The neurons in this layer display a meaningful output.

## What are Neural Networks Used For?
Neural networks are employed across various domains for:

- Identifying objects, faces, and understanding spoken language in applications like self-driving cars and voice assistants.


- Analyzing and understanding human language, enabling sentiment analysis, chatbots, language translation, and text generation.


- Diagnosing diseases from medical images, predicting patient outcomes, and drug discovery.


- Predicting stock prices, credit risk assessment, fraud detection, and algorithmic trading.


- Personalizing content and recommendations in e-commerce, streaming platforms, and social media.


- Powering robotics and autonomous vehicles by processing sensor data and making real-time decisions.


- Enhancing game AI, generating realistic graphics, and creating immersive virtual environments.


- Monitoring and optimizing manufacturing processes, predictive maintenance, and quality control.


- Analyzing complex datasets, simulating scientific phenomena, and aiding in research across disciplines.


- Generating music, art, and other creative content.

## Types of Neural Network in Machine Learning
Explore different kinds of neural networks in machine learning in this section:

##### (i) ANN
ANN stands for artificial neural network. In this context, it refers to a plain feed-forward neural network, in which the inputs are passed only in the forward direction. It can contain hidden layers, which make the model deeper. Its inputs have a fixed length, as specified by the programmer. It is typically used for textual or tabular data. A widely cited real-life application is facial recognition. On image data and sequence data, it is generally less effective than CNNs and RNNs, respectively.

##### (ii) CNN
Convolutional neural networks (CNNs) are mainly used for image data and computer vision. A real-life application is object detection in autonomous vehicles. A CNN combines convolutional layers with ordinary layers of neurons. For image tasks, it is generally more powerful than a plain ANN or an RNN.

##### (iii) RNN
RNN stands for recurrent neural network. It is used to process and interpret time series and other sequential data. In this type of model, the output from a processing node is fed back into nodes in the same or previous layers. The best-known type of RNN is the LSTM (Long Short-Term Memory) network.

## Open Source Deep Learning Frameworks

1. TensorFlow

2. PyTorch

3. Theano

4. Caffe2

5. Keras

## What is Perceptron?
A perceptron is an algorithm for supervised learning of binary classifiers. It enables neurons to learn by processing the elements of the training set one at a time. A perceptron is also understood as an artificial neuron, or neural network unit, that performs computations on its input data to detect features, for example in business intelligence.

![2.webp](attachment:2.webp)

### The perceptron consists of 4 parts

**Input value or One input layer**: The input layer of the perceptron is made of artificial input neurons and takes the initial data into the system for further processing.

**Weights and Bias:**

**Weight:** It represents the strength of the connection between units. If the weight from node 1 to node 2 is larger, then neuron 1 has a greater influence on neuron 2.

**Bias:** It is the same as the intercept added in a linear equation. It is an additional parameter whose task is to shift the output along with the weighted sum of the inputs to the neuron.

**Net sum:** It calculates the weighted sum of the inputs and adds the bias.

**Activation Function:** It determines whether a neuron is activated or not. The activation function takes the net sum (the weighted sum plus the bias) and transforms it to give the neuron’s output.
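The four parts can be put together in a minimal sketch. It assumes a simple step activation, and the weights and bias are hand-picked (not learned) so that the perceptron behaves like a logical AND gate; the function names are illustrative.

```python
def step(z: float) -> int:
    """Step activation: fire (1) if the net sum is non-negative, else 0."""
    return 1 if z >= 0 else 0


def perceptron(inputs, weights, bias):
    # Net sum: weighted sum of the inputs plus the bias
    net_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The activation function decides whether the neuron fires
    return step(net_sum)


# Hand-picked parameters (an assumption for this example, not trained values)
weights, bias = [1.0, 1.0], -1.5

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "->", perceptron(x, weights, bias))
```

Only the input `[1, 1]` pushes the net sum above zero (1 + 1 - 1.5 = 0.5), so it is the only case in which the neuron fires.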

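**Question:** Is it mandatory to use activation functions?

The answer is YES! Activation functions are among the most vital components of an ANN. Without them, every layer would compute only a linear function of its inputs, so the whole network would collapse into a single linear model, no matter how many layers it had.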
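A small sketch can illustrate why. It uses one-input "layers" with made-up weights, and ReLU is chosen here only as one example of a nonlinear activation.

```python
import math


def layer(x, w, b):
    """A one-input linear layer: weight times input plus bias."""
    return w * x + b


def two_linear_layers(x):
    # Layer 1 (w=2, b=1) followed by layer 2 (w=3, b=-4), no activation
    return layer(layer(x, 2.0, 1.0), 3.0, -4.0)


def single_linear_layer(x):
    # 3 * (2x + 1) - 4 simplifies to 6x - 1: one layer does the same job
    return layer(x, 6.0, -1.0)


def two_layers_with_relu(x):
    # Same weights, but with a ReLU activation between the layers
    hidden = max(0.0, layer(x, 2.0, 1.0))
    return layer(hidden, 3.0, -4.0)


for x in (-2.0, 0.0, 1.5):
    same = math.isclose(two_linear_layers(x), single_linear_layer(x))
    print(x, two_linear_layers(x), single_linear_layer(x), same)

# With the nonlinearity in place, the stack no longer reduces to 6x - 1
print(two_layers_with_relu(-2.0), single_linear_layer(-2.0))
```

The first loop shows the two stacked linear layers matching the single layer at every input, while the ReLU version gives a different result for negative inputs.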

## Multi-Layer Perceptron ( MLP )

A **Multilayer Perceptron**, also known as an **Artificial Neural Network**, consists of more than one perceptron, grouped together to form a multi-layer neural network.

![2.png](attachment:2.png)

###### In the above image, the artificial neural network consists of four layers interconnected with each other:

- An input layer, with 6 input nodes
- Hidden Layer 1, with 4 hidden nodes/4 perceptrons
- Hidden layer 2, with 4 hidden nodes
- Output layer with 1 output node
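As a rough sketch, the size of such a network can be worked out from its layer widths. This assumes every layer is fully connected to the next and every non-input neuron has one bias, which the image suggests but the text does not state explicitly.

```python
# Layer widths from the list above: input, hidden 1, hidden 2, output
layer_sizes = [6, 4, 4, 1]


def count_parameters(sizes):
    """Count weights and biases, assuming fully connected layers."""
    total = 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = n_in * n_out  # one weight per connection
        biases = n_out          # one bias per receiving neuron
        total += weights + biases
    return total


print(count_parameters(layer_sizes))  # 53
```

The total breaks down as 6×4 + 4 = 28 for the first hidden layer, 4×4 + 4 = 20 for the second, and 4×1 + 1 = 5 for the output layer.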

![3.png](attachment:3.png)

## Step by Step Working of the Artificial Neural Network 

![4.jpg](attachment:4.jpg)

1. In the first step, the input units are passed to the hidden layer, i.e., the data is passed with some weights attached to it. We can have any number of hidden layers. In the above image, the inputs x1, x2, x3, …, xn are passed.


2. Each hidden layer consists of neurons. All the inputs are connected to each neuron.


3. After the inputs are passed on, all the computation is performed in the hidden layer (the blue oval in the picture).

The computation performed in the hidden layers is done in two steps:

- First of all, all the inputs are multiplied by their weights. Weight is the gradient or coefficient of each variable. It shows the strength of the particular input. After assigning the weights, a bias variable is added. Bias is a constant that helps the model to fit in the best way possible.

$$Z_1 = W_1 \cdot \mathrm{In}_1 + W_2 \cdot \mathrm{In}_2 + W_3 \cdot \mathrm{In}_3 + W_4 \cdot \mathrm{In}_4 + W_5 \cdot \mathrm{In}_5 + b$$

W1, W2, W3, W4, W5 are the weights assigned to the inputs In1, In2, In3, In4, In5, and b is the bias.

- In the second step, the activation function is applied to the linear equation Z1. The activation function is a nonlinear transformation applied to the weighted sum before it is sent to the next layer of neurons. Its purpose is to introduce nonlinearity into the model.

4. The whole process described in point 3 is performed in each hidden layer. After passing through every hidden layer, we move to the last layer, i.e., the output layer, which gives us the final output.

The process explained above is known as forward propagation.

5. After getting the predictions from the output layer, the error is calculated, i.e., the difference between the actual and the predicted output.

If the error is large, steps are taken to minimize it, and for this purpose back propagation is performed.
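The forward pass and the error calculation described in the steps above can be sketched as follows. The network shape (3 inputs, 2 hidden neurons, 1 output), the sigmoid activation, and all weight values are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """A common nonlinear activation that squashes z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """For each neuron: weighted sum of inputs plus bias, then activation."""
    outputs = []
    for neuron_weights, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        outputs.append(sigmoid(z))
    return outputs


# Made-up inputs and parameters (normally the weights start out random)
inputs = [0.5, -1.0, 2.0]
hidden_w = [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]]  # 2 neurons, 3 weights each
hidden_b = [0.0, 0.1]
output_w = [[0.7, -0.8]]                         # 1 neuron, 2 weights
output_b = [0.2]

# Forward propagation: input -> hidden layer -> output layer
hidden = dense_layer(inputs, hidden_w, hidden_b)
prediction = dense_layer(hidden, output_w, output_b)[0]

# Error: difference between the actual and the predicted output
actual = 1.0
error = actual - prediction
print("prediction:", prediction)
print("error:", error)
```

Each call to `dense_layer` performs both steps from point 3 (the linear equation, then the activation), and chaining the calls is what moves the data forward through the network.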


#### How it works

1. Information is fed into the input layer which transfers it to the hidden layer


2. The connections between the two layers assign a weight to each input, initialized randomly


3. Each input is multiplied by its weight, and a bias is added to the weighted sum


4. The weighted sum is transferred to the activation function


5. The activation function determines which nodes should fire for feature extraction


6. The model applies an activation function to the output layer to deliver the output


7. The error is back-propagated, and the weights are adjusted to minimize it

The model uses a cost function to measure the error so that it can be reduced. The weights have to be changed over repeated training passes:

1. The model compares the output with the original result


2. It repeats the process to improve accuracy


The model adjusts the weights in every iteration to enhance the accuracy of the output.
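The text does not say which cost function is used. As one common choice, a sketch of the mean squared error looks like this (the sample values are made up):

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


actual = [1.0, 0.0, 1.0]
predicted = [0.9, 0.2, 0.6]
print(mean_squared_error(actual, predicted))  # roughly 0.07
```

A lower value means the outputs are closer to the original results, so each round of weight adjustments aims to push this number down.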

## What is Back Propagation and How Does it Work?
**Back Propagation is the process of finding and updating the optimal values of the weights or coefficients, which helps the model minimize the error, i.e., the difference between the actual and predicted values.**

But the question here is: how are the weights updated and the new weights calculated?

**The weights are updated with the help of optimizers**. Optimizers are the methods or mathematical formulations used to change the attributes of a neural network, i.e., its weights, to minimize the error.

##### Back Propagation with Gradient Descent 
Gradient Descent is one of the optimizers which helps in calculating the new weights. Let’s understand step by step how Gradient Descent optimizes the cost function.

In the image below, the curve is our cost function, and our aim is to minimize the error so that Jmin, i.e., the global minimum, is reached.

![5.png](attachment:5.png)

#### Steps to achieve the global minimum:

1. First, **the weights are initialized randomly**, i.e., random values for the weights and intercepts are assigned to the model. Forward propagation is then performed, and the errors are calculated after all the computation (as discussed above).


2. Then the **gradient is calculated, i.e., the derivative of the error w.r.t. the current weights**.


3. Then new weights are calculated using the formula below, where **a is the learning rate**, a parameter also known as the step size that controls the size of each update step. It gives additional control over how fast we move along the curve toward the global minimum.


![6.png](attachment:6.png)


4. This process of calculating the new weights, then the errors from the new weights, and then updating the weights **continues until we reach the global minimum and the loss is minimized.**
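The four steps can be sketched for the simplest possible model, a single weight `w` in `y = w * x`. This assumes the usual update rule `new_w = old_w - a * gradient` and a mean squared error cost. The data is made up, and the starting weight is fixed at 0.0 instead of random so the run is reproducible.

```python
# Toy data generated by y = 2x, so the best weight is 2.0
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def gradient(w):
    """Derivative of the mean squared error of y = w*x w.r.t. w."""
    n = len(xs)
    return (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))


def train(w=0.0, learning_rate=0.05, steps=50):
    for _ in range(steps):
        # Step 2: gradient of the error at the current weight
        grad = gradient(w)
        # Step 3: move against the gradient, scaled by the learning rate
        w = w - learning_rate * grad
    # Step 4: the loop above repeats until the step budget is used up
    return w


print(train())
```

With these settings the learned weight should end up very close to 2.0. In a real network the same update is applied to every weight, with back propagation supplying the gradients.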

A point to note here is that the learning rate, i.e., a in our weight update equation, should be chosen wisely. The learning rate is the step size taken toward the global minimum. It should not be so small that the model takes a very long time to converge, nor so large that the model overshoots and never reaches the global minimum at all. Therefore, the learning rate is a hyperparameter that we have to choose based on the model.
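The effect of the learning rate can be seen in a small experiment. It minimizes the toy cost `J(w) = w**2`, whose minimum is at `w = 0`; the cost function and the three rates are assumptions chosen to show each regime.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimize J(w) = w**2, whose gradient is 2*w, and return the final w."""
    w = start
    for _ in range(steps):
        gradient = 2 * w
        w = w - learning_rate * gradient
    return w


for rate in (0.001, 0.1, 1.1):
    final_w = gradient_descent(rate)
    print(f"learning rate {rate}: w after 20 steps = {final_w:.4f}")
```

With the smallest rate, `w` has barely moved from its starting value of 1.0 after 20 steps. With the middle rate it is already close to 0. With the largest rate every step overshoots the minimum by more than it corrects, so `w` moves further away instead of converging.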

![7.png](attachment:7.png)

# Happy Learning!