- Problem 1 demonstrates the use of second-order methods to compute an optimal learning rate for gradient descent and examines how convergence is affected by the choice of learning rate; a sketch of the step-size rule appears after this list.
- Problems 2 and 3 compare several optimization algorithms: plain gradient descent, gradient descent with Polyak's learning rate, Nesterov's accelerated gradient descent, and the Adam optimizer. All of them are implemented from scratch and applied to regression on two bivariate functions with MSE loss; compact update-rule sketches follow this list.
- Problem 4 shows how data normalization can lead to faster training, with further analysis of dataset structure and of what makes a 'good' learning rate; see the normalization sketch below.
- Problem 5 explores gradient ascent for locating local maxima of functions; a short sketch is given below.
- Problem 6 applies Rprop and Quickprop to a regression task. We compare single-hidden-layer neural networks with varying numbers of hidden neurons and different activation functions, trained with standard batch backpropagation, Rprop, and Quickprop; sketches of the two update rules appear below. The dataset used for this problem is the Concrete Compressive Strength dataset from the UCI Machine Learning Repository.
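The step-size idea behind Problem 1 can be illustrated with steepest descent on a quadratic, where the Hessian gives the exactly optimal learning rate along the gradient direction. This is only a sketch on an assumed quadratic objective; the functions, starting points, and iteration counts in the actual notebook may differ.

```python
# Minimal sketch: use second-order information (the Hessian) to pick a
# per-step learning rate for gradient descent on an assumed quadratic
# f(x) = 0.5 x^T A x - b^T x, whose Hessian is A.
import numpy as np

A = np.array([[3.0, 0.5],
              [0.5, 1.0]])          # positive definite, so A is the Hessian
b = np.array([1.0, -2.0])

def grad(x):
    return A @ x - b                # gradient of the quadratic objective

x = np.zeros(2)
for step in range(20):
    g = grad(x)
    # Second-order "optimal" step: minimizes f(x - eta * g) exactly along -g
    eta = (g @ g) / (g @ A @ g)
    x = x - eta * g
    print(f"step {step:2d}  eta = {eta:.4f}  ||grad|| = {np.linalg.norm(g):.2e}")
```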
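The following sketch gives plausible from-scratch versions of the four update rules compared in Problems 2 and 3, applied to MSE regression on synthetic, noiseless bivariate data. The data, model, and hyperparameters are illustrative placeholders rather than the repository's exact setup.

```python
# Sketch of the four optimizers on a simple MSE regression problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                  # bivariate inputs
w_true = np.array([2.0, -1.0])
y = X @ w_true                                 # noiseless, so the optimal MSE is 0
                                               # (this matches Polyak's assumed f* = 0)

def mse(w):  return np.mean((X @ w - y) ** 2)
def grad(w): return 2 * X.T @ (X @ w - y) / len(y)

def vanilla_gd(w, steps=200, lr=0.1):
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def polyak_gd(w, steps=200, f_star=0.0):
    # Polyak's step size: (f(w) - f*) / ||grad||^2
    for _ in range(steps):
        g = grad(w)
        w = w - (mse(w) - f_star) / (g @ g + 1e-12) * g
    return w

def nesterov_gd(w, steps=200, lr=0.1, mu=0.9):
    v = np.zeros_like(w)
    for _ in range(steps):
        g = grad(w + mu * v)                   # gradient at the look-ahead point
        v = mu * v - lr * g
        w = w + v
    return w

def adam(w, steps=200, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = np.zeros_like(w); v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g              # first-moment estimate
        v = b2 * v + (1 - b2) * g ** 2         # second-moment estimate
        w = w - lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
    return w

w0 = np.zeros(2)
for name, opt in [("GD", vanilla_gd), ("Polyak", polyak_gd),
                  ("Nesterov", nesterov_gd), ("Adam", adam)]:
    print(f"{name:9s} final MSE = {mse(opt(w0.copy())):.6f}")
```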
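A minimal illustration of the Problem 4 observation, assuming z-score normalization and a synthetic, badly scaled feature matrix: normalization shrinks the condition number of X^T X, so one fixed learning rate serves every direction well and gradient descent makes progress much faster.

```python
# Sketch: gradient descent with a fixed step on raw vs. z-score-normalized features.
import numpy as np

rng = np.random.default_rng(1)
X_raw = rng.normal(size=(500, 2)) * np.array([100.0, 0.1])   # badly scaled features
y = X_raw @ np.array([0.03, 5.0]) + 0.01 * rng.normal(size=500)

def gd_mse(X, y, steps=200):
    H = 2 * X.T @ X / len(y)                   # Hessian of the MSE loss
    lr = 1.0 / np.linalg.eigvalsh(H).max()     # largest fixed step that is still stable
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)
    return np.mean((X @ w - y) ** 2), np.linalg.cond(X.T @ X)

X_std = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)      # z-score normalization
for name, X in [("raw", X_raw), ("normalized", X_std)]:
    err, cond = gd_mse(X, y)
    print(f"{name:10s} cond(X^T X) = {cond:.1e}   MSE after 200 steps = {err:.4f}")
```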
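Problem 5's gradient ascent can be sketched as follows on an assumed concave bivariate function; the functions, step size, and stopping rule used in the notebook may differ.

```python
# Sketch: gradient ascent climbs along the gradient to a local maximum.
import numpy as np

def f(p):
    x, y = p
    return -(x - 1) ** 2 - 2 * (y + 0.5) ** 2 + 3   # maximum at (1, -0.5)

def grad_f(p):
    x, y = p
    return np.array([-2 * (x - 1), -4 * (y + 0.5)])

p = np.array([-3.0, 2.0])       # arbitrary starting point
lr = 0.1
for _ in range(100):
    p = p + lr * grad_f(p)      # ascend: move *along* the gradient, not against it
print("approximate local maximum at", p, "with value", f(p))
```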
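The per-weight update rules at the heart of Problem 6 can be sketched in isolation. The snippet below applies a simplified Rprop- variant and a Quickprop step with standard safeguards to a toy quadratic loss rather than to the full one-hidden-layer network; all hyperparameters are common defaults, not necessarily the notebook's.

```python
# Sketch of the Rprop- and Quickprop update rules on a toy quadratic loss.
import numpy as np

A = np.diag([10.0, 1.0])
def loss(w): return 0.5 * w @ A @ w
def grad(w): return A @ w

def rprop(w, steps=60, eta_plus=1.2, eta_minus=0.5, d0=0.1, d_min=1e-6, d_max=50.0):
    # Rprop-: adapt a per-weight step size from the sign of successive gradients
    delta = np.full_like(w, d0)
    g_prev = np.zeros_like(w)
    for _ in range(steps):
        g = grad(w)
        same_sign = g * g_prev
        delta = np.where(same_sign > 0, np.minimum(delta * eta_plus, d_max), delta)
        delta = np.where(same_sign < 0, np.maximum(delta * eta_minus, d_min), delta)
        w = w - np.sign(g) * delta            # only the sign of the gradient is used
        g_prev = g
    return w

def quickprop(w, steps=60, lr=0.05, mu=1.75):
    # Quickprop: per-weight secant step toward the minimum of a locally fitted
    # parabola, growth-limited by mu, with a plain gradient step as fallback
    g_prev = grad(w)
    dw_prev = -lr * g_prev                    # bootstrap with an ordinary gradient step
    w = w + dw_prev
    for _ in range(steps):
        g = grad(w)
        denom = g_prev - g
        safe = np.where(np.abs(denom) > 1e-12, denom, 1.0)
        dw = np.where(np.abs(denom) > 1e-12, dw_prev * g / safe, -lr * g)
        dw = np.clip(dw, -mu * np.abs(dw_prev), mu * np.abs(dw_prev))  # growth limit
        w = w + dw
        g_prev, dw_prev = g, dw
    return w

w0 = np.array([4.0, -3.0])
print("Rprop     final loss:", loss(rprop(w0.copy())))
print("Quickprop final loss:", loss(quickprop(w0.copy())))
```

In the full Problem 6 setting these updates are applied element-wise to the hidden- and output-layer weight matrices after each batch-backpropagation pass, instead of to a single parameter vector.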