GradientDescentImplementation

This project is an implementation of basic machine learning algorithms.
The problem statement included a diabetes dataset.
The dataset included the following columns:
- Pregnancies
- Glucose
- BloodPressure
- SkinThickness
- Insulin
- BMI
- DiabetesPedigreeFunction
- Age
- Outcome (whether the patient has diabetes or not)

Keywords: Gradient Descent, Polynomial Regression, Classification

Libraries Used:

- numpy was used for faster computation and matrix operations
- sklearn was used for splitting data into training and testing sets
- matplotlib was used for data visualization

Gradient Descent

Use: This algorithm uses the definition of the gradient to update the weights of a regression problem. The gradient points in the direction of steepest increase of the error function, so the algorithm moves the weights in the opposite direction. With a suitable learning rate, the weights thus move closer to a minimum.
The sum of squared errors was used as the error function.
We performed linear regression to predict the values of the Outcome column. We then used a discriminant function that assigns samples with a predicted outcome greater than or equal to 0.5 to class 1 and all other samples to class 0.
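The discriminant step described above can be sketched in a few lines. This is a minimal illustration rather than the notebook's actual code; the function name `discriminant` and the sample scores are hypothetical.

```python
import numpy as np

def discriminant(scores, threshold=0.5):
    """Assign class 1 when the regression output is >= threshold, else class 0."""
    return (np.asarray(scores, dtype=float) >= threshold).astype(int)

# Hypothetical regression outputs for four samples.
labels = discriminant([0.12, 0.5, 0.73, 0.49])
print(labels)
```

A score of exactly 0.5 falls into class 1, matching the "greater than or equal to" rule.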

Batch-Gradient:

The weights are updated once per epoch, using the whole dataset
Computation was done in the form of matrices
The gradient is calculated as J(w) = X^T X w - X^T y, where X is the input matrix, w is the weight vector and y is the output vector. This is the gradient of half the sum of squared errors, (1/2)*||Xw - y||^2
After calculating J(w), the weights are updated as w = w - a*J(w), where a is the learning rate
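The batch update can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the notebook's code: the function name, the zero initialization, the fixed epoch count, and the small noise-free synthetic data are all made up for illustration.

```python
import numpy as np

def batch_gradient_descent(X, y, lr=0.01, epochs=2000):
    """Minimize (1/2)*||Xw - y||^2 with one full-dataset update per epoch."""
    w = np.zeros(X.shape[1])  # assumption: start from zero weights
    for _ in range(epochs):
        grad = X.T @ (X @ w) - X.T @ y  # J(w) = X^T X w - X^T y
        w = w - lr * grad               # w = w - a*J(w)
    return w

# Hypothetical noise-free data: y = 0.5 + 2.0 * x, with a bias column of ones.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, 50)])
true_w = np.array([0.5, 2.0])
y = X @ true_w

w = batch_gradient_descent(X, y)
print(w)
```

The learning rate has to be small relative to the largest eigenvalue of X^T X, otherwise the updates diverge; 0.01 works for this toy data but is not a general recommendation.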

Stochastic-Gradient:

The weights are updated once for each sample in the dataset, in every epoch
The weights are modified using the equation w = w - a*(y' - y)*x_i, where x_i is the i^th data sample, y' is the predicted value for that sample and y is its true value
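A per-sample version of the update might look like the sketch below. Shuffling the sample order each epoch is an assumption added here (the text does not say whether the notebook shuffles), and the function name and toy data are hypothetical.

```python
import numpy as np

def stochastic_gradient_descent(X, y, lr=0.05, epochs=200, seed=0):
    """Update the weights after every single sample, in every epoch."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Assumption: visit the samples in a fresh random order each epoch.
        for i in rng.permutation(len(y)):
            y_pred = X[i] @ w                    # y' for the i-th sample
            w = w - lr * (y_pred - y[i]) * X[i]  # w = w - a*(y' - y)*x_i
    return w

# Hypothetical noise-free data: y = 0.5 + 2.0 * x, with a bias column of ones.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, 50)])
true_w = np.array([0.5, 2.0])
y = X @ true_w

w = stochastic_gradient_descent(X, y)
print(w)
```

Compared with the batch variant, each step is cheaper but noisier; on real data with noise the weights hover around the minimum instead of settling exactly on it.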

Polynomial Regression

Use: This algorithm is used when the relation between the input variable and the output variable is polynomial. It uses the same intuition as ordinary linear regression, except that the features are powers of the input variable.
Regularization: As the degree of the polynomial increases, the hypothesis function becomes more flexible and the model starts to fit the random noise in the data points. This over-fitting leads to poor performance on unseen data. To keep it in check, an additional penalty term is added to the usual error function. A procedure similar to gradient descent was carried out, and the achieved accuracy was 75% with a quadratic polynomial.
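To illustrate "features are powers of the variable", the sketch below expands a single input into polynomial features and fits the coefficients. For brevity it solves the least-squares problem directly with `np.linalg.lstsq` instead of gradient descent; the helper name `polynomial_features` and the toy data are hypothetical.

```python
import numpy as np

def polynomial_features(x, degree):
    """Return the matrix with columns x^0, x^1, ..., x^degree."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, degree + 1, increasing=True)

# Hypothetical data generated from y = 1 + 2x + 3x^2.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1 + 2 * x + 3 * x ** 2

Phi = polynomial_features(x, degree=2)
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(Phi)
print(coef)
```

Once the feature matrix is built, the model is still linear in the weights, so the same gradient descent machinery applies unchanged.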

L1 regularization: The penalty term added is proportional to the sum of the magnitudes of the weights. The constant of proportionality is called the regularization rate and is denoted by the Greek letter lambda
In such a case the new error function becomes E(w) + (regularization rate)*||w||₁
It is also called Lasso Regularization
L2 regularization: The penalty term is proportional to the sum of the squares of the magnitudes of the weights
In such a case the new error function becomes E(w) + (regularization rate)*||w||₂²
It is also called Ridge Regularization
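The two penalized error functions can be written out as below. This is a sketch under assumptions: E(w) is taken to be half the sum of squared errors, the bias weight is penalized like any other weight, and all function names are hypothetical.

```python
import numpy as np

def sse(w, X, y):
    """E(w): half the sum of squared errors (the 1/2 factor is an assumption)."""
    r = X @ w - y
    return 0.5 * (r @ r)

def l1_error(w, X, y, lam):
    """Lasso: E(w) + lambda * sum of |w_j|."""
    return sse(w, X, y) + lam * np.sum(np.abs(w))

def l2_error(w, X, y, lam):
    """Ridge: E(w) + lambda * sum of w_j^2."""
    return sse(w, X, y) + lam * np.sum(w ** 2)

def l2_gradient(w, X, y, lam):
    """Gradient of the ridge error, usable in a gradient descent loop."""
    return X.T @ (X @ w - y) + 2 * lam * w

# Hypothetical example where w fits the data exactly, so only the penalty remains.
X = np.eye(2)
y = np.array([1.0, -2.0])
w = np.array([1.0, -2.0])
lam = 0.1

l1_value = l1_error(w, X, y, lam)   # 0.1 * (1 + 2)
l2_value = l2_error(w, X, y, lam)   # 0.1 * (1 + 4)
ridge_grad = l2_gradient(w, X, y, lam)
```

The L1 penalty is not differentiable at zero, so a gradient-style procedure for Lasso typically uses a subgradient such as `lam * np.sign(w)`.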

Logistic Regression

Use: This algorithm is used to classify samples into classes. It uses the sigmoid function to calculate the probability that a sample belongs to class 1; the probability that the sample belongs to class 0 is then 1 - sigmoid(x). The sample is assigned to class 0 if the sigmoid output is less than 0.5 and to class 1 otherwise. We achieved an accuracy of 84% using this algorithm.
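The sigmoid and the 0.5 decision rule can be sketched as follows. The training loop uses plain gradient descent on the mean cross-entropy loss, which is an assumption since the text does not state how the weights were fitted; the function names and the four-sample toy dataset are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Map any real value to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, w):
    """Probability that each sample belongs to class 1."""
    return sigmoid(X @ w)

def predict(X, w, threshold=0.5):
    """Class 0 if the probability is below the threshold, else class 1."""
    return (predict_proba(X, w) >= threshold).astype(int)

def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Assumed training procedure: gradient descent on mean cross-entropy."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (predict_proba(X, w) - y) / len(y)
        w = w - lr * grad
    return w

# Hypothetical separable data: bias column plus one feature.
X = np.column_stack([np.ones(4), np.array([-2.0, -1.0, 1.0, 2.0])])
y = np.array([0, 0, 1, 1])

w = fit_logistic(X, y)
predictions = predict(X, w)
print(predictions)
```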

Least Squares Classification

Use: For this algorithm we use a one-hot encoding to specify which class a particular sample belongs to. This gives an n x k target matrix, where n is the number of samples in the dataset and k is the number of classes. We achieved an accuracy of 82% using this algorithm.
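A minimal sketch of the idea: build the n x k one-hot target matrix, fit one weight column per class by least squares, and predict the class with the largest output. Solving with `np.linalg.lstsq` and picking the class by argmax are assumptions made here for illustration; the function names and toy data are hypothetical.

```python
import numpy as np

def one_hot(labels, k):
    """Build the n x k target matrix with a single 1 per row."""
    T = np.zeros((len(labels), k))
    T[np.arange(len(labels)), labels] = 1.0
    return T

def fit_least_squares(X, labels, k):
    """Fit a weight matrix W (features x k) minimizing ||XW - T||^2."""
    T = one_hot(labels, k)
    W, *_ = np.linalg.lstsq(X, T, rcond=None)
    return W

def predict(X, W):
    """Assign each sample to the class whose output is largest."""
    return np.argmax(X @ W, axis=1)

# Hypothetical two-class data: bias column plus one feature.
X = np.column_stack([np.ones(4), np.array([-2.0, -1.0, 1.0, 2.0])])
labels = np.array([0, 0, 1, 1])

T = one_hot(labels, k=2)
W = fit_least_squares(X, labels, k=2)
predictions = predict(X, W)
print(predictions)
```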
