- Problem 1 demonstrates the use of second-order methods to compute an optimal learning rate for gradient descent and examines how convergence is affected by the choice of learning rate; a sketch of the step-size rule appears after this list.
- Problems 2 and 3 compare several optimization algorithms: plain gradient descent, gradient descent with Polyak's learning rate, Nesterov's accelerated gradient descent, and the Adam optimizer. All of them are implemented from scratch and applied to regression on two bivariate functions with MSE loss; compact update-rule sketches follow this list.
- Problem 4 shows how data normalization can lead to faster training, with further analysis of dataset structure and of what makes a 'good' learning rate; see the normalization sketch below.
- Problem 5 explores gradient ascent for locating local maxima of functions; a short sketch is given below.
- Problem 6 applies Rprop and Quickprop to a regression task. We compare single-hidden-layer neural networks with varying numbers of hidden neurons and different activation functions, trained with standard batch backpropagation, Rprop, and Quickprop; sketches of the two update rules appear below. The dataset used for this problem is the Concrete Compressive Strength dataset from the UCI Machine Learning Repository.
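The step-size idea behind Problem 1 can be illustrated with steepest descent on a quadratic, where the Hessian gives the exactly optimal learning rate along the gradient direction. This is only a sketch on an assumed quadratic objective; the functions, starting points, and iteration counts in the actual notebook may differ.

```python
# Minimal sketch: use second-order information (the Hessian) to pick a
# per-step learning rate for gradient descent on an assumed quadratic
# f(x) = 0.5 x^T A x - b^T x, whose Hessian is A.
import numpy as np

A = np.array([[3.0, 0.5],
              [0.5, 1.0]])          # positive definite, so A is the Hessian
b = np.array([1.0, -2.0])

def grad(x):
    return A @ x - b                # gradient of the quadratic objective

x = np.zeros(2)
for step in range(20):
    g = grad(x)
    # Second-order "optimal" step: minimizes f(x - eta * g) exactly along -g
    eta = (g @ g) / (g @ A @ g)
    x = x - eta * g
    print(f"step {step:2d}  eta = {eta:.4f}  ||grad|| = {np.linalg.norm(g):.2e}")
```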
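The following sketch gives plausible from-scratch versions of the four update rules compared in Problems 2 and 3, applied to MSE regression on synthetic, noiseless bivariate data. The data, model, and hyperparameters are illustrative placeholders rather than the repository's exact setup.

```python
# Sketch of the four optimizers on a simple MSE regression problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                  # bivariate inputs
w_true = np.array([2.0, -1.0])
y = X @ w_true                                 # noiseless, so the optimal MSE is 0
                                               # (this matches Polyak's assumed f* = 0)

def mse(w):  return np.mean((X @ w - y) ** 2)
def grad(w): return 2 * X.T @ (X @ w - y) / len(y)

def vanilla_gd(w, steps=200, lr=0.1):
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def polyak_gd(w, steps=200, f_star=0.0):
    # Polyak's step size: (f(w) - f*) / ||grad||^2
    for _ in range(steps):
        g = grad(w)
        w = w - (mse(w) - f_star) / (g @ g + 1e-12) * g
    return w

def nesterov_gd(w, steps=200, lr=0.1, mu=0.9):
    v = np.zeros_like(w)
    for _ in range(steps):
        g = grad(w + mu * v)                   # gradient at the look-ahead point
        v = mu * v - lr * g
        w = w + v
    return w

def adam(w, steps=200, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = np.zeros_like(w); v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g              # first-moment estimate
        v = b2 * v + (1 - b2) * g ** 2         # second-moment estimate
        w = w - lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
    return w

w0 = np.zeros(2)
for name, opt in [("GD", vanilla_gd), ("Polyak", polyak_gd),
                  ("Nesterov", nesterov_gd), ("Adam", adam)]:
    print(f"{name:9s} final MSE = {mse(opt(w0.copy())):.6f}")
```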
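A minimal illustration of the Problem 4 observation, assuming z-score normalization and a synthetic, badly scaled feature matrix: normalization shrinks the condition number of X^T X, so one fixed learning rate serves every direction well and gradient descent makes progress much faster.

```python
# Sketch: gradient descent with a fixed step on raw vs. z-score-normalized features.
import numpy as np

rng = np.random.default_rng(1)
X_raw = rng.normal(size=(500, 2)) * np.array([100.0, 0.1])   # badly scaled features
y = X_raw @ np.array([0.03, 5.0]) + 0.01 * rng.normal(size=500)

def gd_mse(X, y, steps=200):
    H = 2 * X.T @ X / len(y)                   # Hessian of the MSE loss
    lr = 1.0 / np.linalg.eigvalsh(H).max()     # largest fixed step that is still stable
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)
    return np.mean((X @ w - y) ** 2), np.linalg.cond(X.T @ X)

X_std = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)      # z-score normalization
for name, X in [("raw", X_raw), ("normalized", X_std)]:
    err, cond = gd_mse(X, y)
    print(f"{name:10s} cond(X^T X) = {cond:.1e}   MSE after 200 steps = {err:.4f}")
```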
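Problem 5's gradient ascent can be sketched as follows on an assumed concave bivariate function; the functions, step size, and stopping rule used in the notebook may differ.

```python
# Sketch: gradient ascent climbs along the gradient to a local maximum.
import numpy as np

def f(p):
    x, y = p
    return -(x - 1) ** 2 - 2 * (y + 0.5) ** 2 + 3   # maximum at (1, -0.5)

def grad_f(p):
    x, y = p
    return np.array([-2 * (x - 1), -4 * (y + 0.5)])

p = np.array([-3.0, 2.0])       # arbitrary starting point
lr = 0.1
for _ in range(100):
    p = p + lr * grad_f(p)      # ascend: move *along* the gradient, not against it
print("approximate local maximum at", p, "with value", f(p))
```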
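The per-weight update rules at the heart of Problem 6 can be sketched in isolation. The snippet below applies a simplified Rprop- variant and a Quickprop step with standard safeguards to a toy quadratic loss rather than to the full one-hidden-layer network; all hyperparameters are common defaults, not necessarily the notebook's.

```python
# Sketch of the Rprop- and Quickprop update rules on a toy quadratic loss.
import numpy as np

A = np.diag([10.0, 1.0])
def loss(w): return 0.5 * w @ A @ w
def grad(w): return A @ w

def rprop(w, steps=60, eta_plus=1.2, eta_minus=0.5, d0=0.1, d_min=1e-6, d_max=50.0):
    # Rprop-: adapt a per-weight step size from the sign of successive gradients
    delta = np.full_like(w, d0)
    g_prev = np.zeros_like(w)
    for _ in range(steps):
        g = grad(w)
        same_sign = g * g_prev
        delta = np.where(same_sign > 0, np.minimum(delta * eta_plus, d_max), delta)
        delta = np.where(same_sign < 0, np.maximum(delta * eta_minus, d_min), delta)
        w = w - np.sign(g) * delta            # only the sign of the gradient is used
        g_prev = g
    return w

def quickprop(w, steps=60, lr=0.05, mu=1.75):
    # Quickprop: per-weight secant step toward the minimum of a locally fitted
    # parabola, growth-limited by mu, with a plain gradient step as fallback
    g_prev = grad(w)
    dw_prev = -lr * g_prev                    # bootstrap with an ordinary gradient step
    w = w + dw_prev
    for _ in range(steps):
        g = grad(w)
        denom = g_prev - g
        safe = np.where(np.abs(denom) > 1e-12, denom, 1.0)
        dw = np.where(np.abs(denom) > 1e-12, dw_prev * g / safe, -lr * g)
        dw = np.clip(dw, -mu * np.abs(dw_prev), mu * np.abs(dw_prev))  # growth limit
        w = w + dw
        g_prev, dw_prev = g, dw
    return w

w0 = np.array([4.0, -3.0])
print("Rprop     final loss:", loss(rprop(w0.copy())))
print("Quickprop final loss:", loss(quickprop(w0.copy())))
```

In the full Problem 6 setting these updates are applied element-wise to the hidden- and output-layer weight matrices after each batch-backpropagation pass, instead of to a single parameter vector.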