## __Gradient Descent with Momentum__

Problem with the gradient descent algorithm:

- The search can oscillate across the search space, because each update follows only the gradient at the current point.
- These oscillations slow convergence, particularly in optimization problems where the overall trend or shape of the search space matters more than the specific gradients encountered along the way.

Momentum:

- Momentum extends the gradient descent optimization algorithm by incorporating historical information into the parameter update equation, with the aim of speeding up the optimization.
- It does this by adding a fraction of the previous update, which itself accumulates the gradients encountered in earlier iterations, to the current update.
- An additional hyperparameter, the momentum, controls how much of the previous update is carried over.
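
The update described above can be written compactly. The symbols are notation chosen for this sketch, not taken from the notebook: $x_t$ is the solution after iteration $t$, $\alpha$ is the step size, $\beta$ is the momentum hyperparameter, and $\Delta_t$ is the update applied at iteration $t$, with $\Delta_0 = 0$.

$$
\Delta_t = \alpha \, f'(x_{t-1}) + \beta \, \Delta_{t-1}, \qquad x_t = x_{t-1} - \Delta_t
$$

As a worked example, assume $f(x) = x^2$ (so $f'(x) = 2x$), $\alpha = 0.1$, $\beta = 0.3$, and an illustrative starting point $x_0 = 1.0$:

$$
\begin{aligned}
\Delta_1 &= 0.1 \cdot 2.0 + 0.3 \cdot 0 = 0.2, & x_1 &= 1.0 - 0.2 = 0.8 \\
\Delta_2 &= 0.1 \cdot 1.6 + 0.3 \cdot 0.2 = 0.22, & x_2 &= 0.8 - 0.22 = 0.58
\end{aligned}
$$

Without momentum ($\beta = 0$), the second step would reach only $x_2 = 0.8 - 0.16 = 0.64$, so the carried-over update moves the search closer to the minimum at $x = 0$.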


## Steps to Be Followed:
1.  Importing the required libraries
2.  Defining the objective function
3.  Defining the gradient descent algorithm


### Step 1: Importing the Required Libraries

- Import **numpy.asarray** to convert input data into an array
- Import **numpy.random.rand** to generate random numbers from a uniform distribution
- Import **numpy.random.seed** to set the seed for reproducible random number generation
- Import **numpy.arange** to create an array of values within a specified range
- Import **matplotlib.pyplot** to create plots and visualizations


In [None]:
from numpy import asarray
from numpy.random import rand
from numpy.random import seed
from numpy import arange
from matplotlib import pyplot

### Step 2: Defining the Objective Function
- The function **objective** takes a single input parameter **x**.
- It returns the square of the input value as the output, representing the objective function.

In [None]:
def objective(x):
    return x**2.0


### Step 3: Defining the Gradient Descent Algorithm

- The objective function, defined in Step 2, returns the square of **x**.
- The **derivative(x)** function computes the derivative of the objective function with respect to **x**, which is 2x.
- The **gradient_descent(objective, derivative, bounds, n_iter, step_size, momentum)** function implements gradient descent with momentum. It draws a random starting point within the specified bounds. At each iteration it computes the gradient, adds the previous update scaled by the momentum, and subtracts the result from the current solution. The function also records each solution and its objective score.
- The random seed is set to 4 using seed(4) to ensure reproducibility.
- The bounds variable defines the lower and upper bounds for the solution space.
- Parameters such as the number of iterations (n_iter), step size (step_size), and momentum (momentum) are specified.
- An array of input values (inputs) is generated using **arange** across the defined bounds in increments of 0.1.
- The objective function values (results) are computed for the input values.
- The objective function curve is plotted using **pyplot.plot** with inputs on the x-axis and results on the y-axis, and displayed with **pyplot.show()**.
- The **gradient_descent** function is called with the provided arguments, and the resulting solutions and scores are stored.
- The curve is plotted again, and the optimization path is overlaid by plotting the solutions and scores as red dots connected by lines using **pyplot.plot**.
- Finally, **pyplot.show()** is called to display the second plot.
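
The derivative described above can be sanity-checked numerically before it drives the optimizer. The sketch below restates `objective` and `derivative` so that it runs on its own. The helper `numerical_derivative`, its step `h`, and the test points are hypothetical additions for illustration, not part of the notebook.

```python
def objective(x):
    # Same objective as in Step 2: f(x) = x^2
    return x ** 2.0

def derivative(x):
    # Analytic derivative of the objective: f'(x) = 2x
    return x * 2.0

def numerical_derivative(f, x, h=1e-6):
    # Central finite difference (hypothetical helper, assumed step h)
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare the analytic and numerical derivatives at a few assumed test points
test_points = (-1.0, -0.3, 0.0, 0.5, 1.0)
max_error = 0.0
for x in test_points:
    approx = numerical_derivative(objective, x)
    exact = derivative(x)
    max_error = max(max_error, abs(approx - exact))
    print('x = %+.2f  analytic = %+.5f  numerical = %+.5f' % (x, exact, approx))

print('largest absolute difference: %.2e' % max_error)
```

If the two columns disagree noticeably, the analytic derivative is likely wrong, and the optimizer would step in the wrong direction or by the wrong amount.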

In [None]:

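def derivative(x):
    # Derivative of the objective x**2 with respect to x
    return x * 2.0

def gradient_descent(objective, derivative, bounds, n_iter, step_size, momentum):
    # Track every solution and its objective score
    solutions, scores = list(), list()
    # Draw a random starting point within the bounds
    solution = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
    # Previous update; zero before the first iteration
    change = 0.0
    for i in range(n_iter):
        # Gradient at the current solution
        gradient = derivative(solution)
        # New update: gradient step plus a fraction of the previous update
        new_change = step_size * gradient + momentum * change
        # Take the step
        solution = solution - new_change
        # Remember this update for the next iteration
        change = new_change
        # Evaluate and record the new solution
        solution_eval = objective(solution)
        solutions.append(solution)
        scores.append(solution_eval)
        # bounds has a single row, so solution_eval is a one-element array; index it for %f formatting
        print('>%d f(%s) = %.5f' % (i, solution, solution_eval[0]))
    return [solutions, scores]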


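# Seed the pseudorandom number generator for reproducibility
seed(4)
# Lower and upper bounds of the search space
bounds = asarray([[-1.0, 1.0]])
# Number of iterations
n_iter = 30
# Step size and momentum
step_size = 0.1
momentum = 0.3

# Sample the objective across the bounds in increments of 0.1 and plot the curve
inputs = arange(bounds[0,0], bounds[0,1] + 0.1, 0.1)
results = objective(inputs)
pyplot.plot(inputs, results)
pyplot.show()

# Run gradient descent with momentum, then overlay the search path in red
solutions, scores = gradient_descent(objective, derivative, bounds, n_iter, step_size, momentum)
pyplot.plot(inputs, results)
pyplot.plot(solutions, scores, '.-', color='red')
pyplot.show()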

**Observation**
- The first plot shows the objective function on its own. The second overlays the solution found at each iteration in red, showing the search moving along the curve toward the minimum at x = 0.
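
To see the effect of the momentum term in numbers rather than in a plot, the sketch below counts how many updates are needed to get close to the minimum, with and without momentum. It reuses the notebook's objective, step size (0.1), and momentum (0.3). The helper `iterations_to_converge`, the fixed starting point 1.0, the tolerance `tol`, and the cap `max_iter` are assumptions made for this illustration.

```python
def derivative(x):
    # Derivative of the objective f(x) = x^2
    return x * 2.0

def iterations_to_converge(start, step_size, momentum, tol=1e-6, max_iter=1000):
    # Hypothetical helper: run the momentum update until |x| < tol
    # and return the number of updates taken (max_iter if never reached).
    x, change = start, 0.0
    for i in range(1, max_iter + 1):
        # Same update rule as gradient_descent: gradient step plus carried-over update
        change = step_size * derivative(x) + momentum * change
        x = x - change
        if abs(x) < tol:
            return i
    return max_iter

# momentum = 0.0 reduces the update to plain gradient descent
plain = iterations_to_converge(start=1.0, step_size=0.1, momentum=0.0)
with_momentum = iterations_to_converge(start=1.0, step_size=0.1, momentum=0.3)

print('updates without momentum: %d' % plain)
print('updates with momentum 0.3: %d' % with_momentum)
```

On this simple quadratic, the run with momentum needs fewer updates than the run without it. That holds for these particular settings and is not a general guarantee: a momentum value that is too large can make the search overshoot and oscillate around the minimum.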