# **Gradient Descent**

## **What are Optimizers?**

Optimizers are algorithms that adjust a model's parameters to reduce its loss, that is, the error made by a deep learning model. Generally, the lower the error, the better the model performs. Several optimizers are available when compiling a model, including gradient descent, stochastic gradient descent, and Adam. All of them serve the same purpose: improving the performance of the model. The optimizer is usually specified after the model structure has been defined, at the point where the model is compiled.

## **What is Gradient Descent? How does it work?**


Gradient descent is one of the most widely used algorithms for optimizing a deep learning model. It is an iterative method that uses the derivative (gradient) of a function to move step by step toward a minimum of that function. The size of each step is set by the learning rate. If the learning rate is too large, the updates can overshoot the minimum and may never settle on it; if it is too small, the algorithm needs many iterations and takes a long time to reach the target.
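
To make the effect of the learning rate concrete, here is a minimal sketch on the toy function f(x) = x², whose minimum is at x = 0. The helper name `run_descent`, the starting point, and the three learning-rate values are illustrative assumptions, not part of the original notebook.

```python
def run_descent(lr, start=5.0, steps=20):
    """Run plain gradient descent on f(x) = x**2 and return the final x."""
    x = start
    for _ in range(steps):
        g = 2 * x          # derivative of x**2
        x = x - lr * g     # step against the gradient
    return x

small = run_descent(lr=0.001)   # tiny steps: x barely moves in 20 steps
good = run_descent(lr=0.1)      # moderate steps: x approaches 0
large = run_descent(lr=1.1)     # oversized steps: x overshoots and grows

print(f"lr=0.001 -> x = {small:.4f}")
print(f"lr=0.1   -> x = {good:.4f}")
print(f"lr=1.1   -> x = {large:.4f}")
```

With the small rate the iterate is still close to its starting value, with the moderate rate it is near zero, and with the large rate each step jumps past the minimum by more than it started with, so the iterate moves away from it.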

The derivative tells the algorithm in which direction to change each weight: its sign shows whether increasing the weight would increase or decrease the loss function (also called the cost function), so the weight is moved in the direction that lowers the loss. A neural network cannot be trained without an optimizer and a loss function. Both are mandatory settings when compiling a deep learning model.
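
In symbols, a single gradient descent step for one weight can be written as follows. The notation is assumed here for illustration: $w$ is the weight, $L$ the loss, and $\eta$ the learning rate.

$$
w_{\text{new}} = w - \eta \, \frac{\partial L}{\partial w}
$$

When $\partial L / \partial w$ is positive, the weight is decreased; when it is negative, the weight is increased. In both cases the step moves toward lower loss, provided $\eta$ is small enough.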

## **How to implement Gradient Descent in Python?**

Now we will see how gradient descent can be implemented in Python. We start by installing and importing the libraries needed for numerical calculation and for plotting graphs.

In [None]:
!python -m pip install pip --upgrade --user -q
!python -m pip install numpy pandas seaborn matplotlib scipy statsmodels scikit-learn --user -q

In [None]:
import IPython
IPython.Application.instance().kernel.do_shutdown(True)

In [1]:
import numpy as np
import matplotlib.pyplot as plt
# Define a quadratic function f(x) = a[2]*x^2 + a[1]*x + a[0]
# and a second function that computes its gradient.
def function(x, a):
    # a holds the coefficients [a0, a1, a2]
    f = a[2]*x*x + a[1]*x + a[0]
    return f

def grad(x, a):
    # Derivative of f with respect to x
    g = 2*a[2]*x + a[1]
    return g
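
A quick way to gain confidence in a hand-written gradient is to compare it with a finite-difference estimate. The sketch below repeats the two functions so that it runs on its own; `numerical_grad`, the step `h`, and the test points `xs` are hypothetical additions, not part of the original notebook.

```python
import numpy as np

def function(x, a):
    return a[2]*x*x + a[1]*x + a[0]

def grad(x, a):
    return 2*a[2]*x + a[1]

def numerical_grad(x, a, h=1e-6):
    """Central-difference estimate of df/dx (hypothetical helper)."""
    return (function(x + h, a) - function(x - h, a)) / (2 * h)

a = np.array([-3, -2, 3])              # same coefficients the notebook uses
xs = np.array([-2.0, 0.0, 1.5, 8.0])   # arbitrary test points

analytic = grad(xs, a)
numeric = numerical_grad(xs, a)
max_error = np.max(np.abs(analytic - numeric))
print(max_error)
```

If `max_error` is tiny compared with the gradient values themselves, the analytic gradient and the function agree.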

Before computing its minimum, we plot the function using the code below.

In [None]:
x = np.array([-3,-2,-1,0,1,2,3,4,5,6])  # sample points for the plot
a = np.array([-3, -2, 3])               # coefficients [a0, a1, a2]: f(x) = 3x^2 - 2x - 3
f = function(x,a)
plt.scatter(x,f)
plt.plot(x,f)
plt.xlabel('X')
plt.ylabel('f(X)')
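
For a quadratic with a positive leading coefficient, the minimum can also be found exactly by setting the gradient to zero, which gives a reference value to compare gradient descent against. This is a small sketch; the names `x_star` and `f_star` are introduced here for illustration.

```python
import numpy as np

a = np.array([-3, -2, 3])        # coefficients [a0, a1, a2]

# Solve 2*a2*x + a1 = 0 for x
x_star = -a[1] / (2 * a[2])
f_star = a[2]*x_star**2 + a[1]*x_star + a[0]

print(x_star, f_star)
```

Here the minimizer is x = 1/3, which lies between the sample points 0 and 1 in the plot.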

The plot shows x on the horizontal axis and f(x) on the vertical axis. Next, we set up gradient descent to find the minimum. We first define the starting point, the learning rate, and the stopping parameters: a maximum number of iterations and a threshold below which a change in x is treated as no change.

In [None]:
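x = 8                 # starting point
lr = 0.001            # learning rate (step size)
change = 1e-5         # stop when x changes by less than this
max_iteration = 500   # upper limit on the number of iterations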
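
For a quadratic, these settings can be reasoned about before running anything. Because the gradient is 2·a2·(x − x*), where x* is the minimizer, each update multiplies the distance to x* by the constant factor 1 − 2·a2·lr. The sketch below applies that relationship to the values above. The variable names `factor` and `remaining` are assumptions made for this illustration.

```python
a2 = 3              # leading coefficient from a = [-3, -2, 3]
lr = 0.001
x0 = 8
x_star = 1 / 3      # exact minimizer of 3x^2 - 2x - 3
max_iteration = 500

# Distance to the minimizer shrinks by this factor on every step
factor = 1 - 2 * a2 * lr

# Distance still left after max_iteration steps
remaining = abs(x0 - x_star) * factor ** max_iteration

print(factor, remaining)
```

A factor close to 1 means slow progress, so with a small learning rate the iteration limit may be reached before x gets very near the minimizer. A larger `lr` or a larger `max_iteration` reduces the remaining distance.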

We use the variable `series` to record how the value of x changes. Inside the loop, we evaluate the function f at the current point (x, a), compute its gradient, and obtain the new value of x by subtracting the product of the learning rate and the gradient from the current value of x. The loop stops when the change in x falls below the `change` threshold or when the maximum number of iterations defined earlier is exceeded. Along the way, we plot some of the intermediate points.

In [None]:
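series = [x]      # history of x values, used to inspect convergence
iterations = 1
while True:
    f = function(x,a)
    g = grad(x,a)
    new_x = x - lr * g            # gradient descent update
    # Stop when the update is smaller than the tolerance
    if abs(new_x - x) < change:
        break
    # Stop when the iteration limit is exceeded
    if iterations > max_iteration:
        break
    # Mark the current point ten times over the course of the run
    if iterations % (max_iteration // 10) == 0:
        plt.scatter(x, f, marker='*')
        plt.xlabel('X')
        plt.ylabel('f(X)')
    iterations += 1
    x = new_x
    series = np.concatenate((series,[x]))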
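
The same loop can be wrapped in a reusable function, which makes it easier to try different starting points and learning rates. This is only a sketch: the function name `gradient_descent`, its parameter names, and the learning rate of 0.1 used in the call are assumptions chosen so the example converges quickly, not values from the original notebook.

```python
import numpy as np

def gradient_descent(grad_fn, x0, lr=0.001, tol=1e-5, max_iter=500):
    """Minimise a 1-D function given its gradient; returns (x, history)."""
    x = float(x0)
    history = [x]
    for _ in range(max_iter):
        new_x = x - lr * grad_fn(x)     # gradient descent update
        if abs(new_x - x) < tol:        # update too small: treat as converged
            break
        x = new_x
        history.append(x)
    return x, np.array(history)

a = np.array([-3, -2, 3])               # coefficients [a0, a1, a2]
x_min, history = gradient_descent(lambda x: 2*a[2]*x + a[1], x0=8, lr=0.1)

print(x_min, len(history))
```

Using a `for` loop bounded by `max_iter` makes the iteration limit impossible to bypass, and returning the history keeps plotting separate from the optimization itself.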

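Now let us look at the value of x that gradient descent reached. Because x starts to the right of the minimum and decreases at every step, the smallest value in `series` is the final estimate. For this quadratic the exact minimizer is x = 1/3 (where the gradient 6x - 2 is zero), so the printed value should be close to that if the loop ran long enough to converge.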

In [None]:
print(series.min())

For more details, see the article [here](https://analyticsindiamag.com/gradient-descent-everything-you-need-to-know-with-implementation-in-python/).