yashashwita20/Algorithms

A collection of some intriguing algorithms for building a foundation in Data Science.

Overview

Python Notebooks
├── Power Method
├── Inverse Power Method
├── Least Squares Regression
├── Multiclass Classification on MNIST
├── Pagerank Algorithm
└── Evaluating Robustness of Neural Networks

R Notebooks
├── Confidence Band for Linear Regression
├── Linear Regression Assumptions
├── Polynomial Regression and Piecewise Constant Fit
├── ANOVA, F-Test, Hypothesis Test and Polynomial Regression
├── Studentized Bootstrap and Student Confidence Intervals
├── Logistic Regression, Forward Selection and Bootstrapping
├── Cross-validation and Sequential Model Selection
└── Analysis of Chemical Plant Data

Description

Python Notebooks

1. Power Method

In this notebook, you'll find an implementation of the Power Method, a numerical algorithm used to find the dominant eigenvalue and its corresponding eigenvector of a square matrix. The Power Method is particularly useful for large matrices and plays a significant role in various applications, such as page ranking and principal component analysis.
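
The notebook contains the full implementation; as a quick reference, a minimal NumPy sketch of the iteration (repeatedly applying the matrix and re-normalizing) might look like this, with the matrix `A` below chosen purely for illustration:

```python
import numpy as np

def power_method(A, num_iters=1000, tol=1e-10):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix A."""
    v = np.random.default_rng(0).standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    eigenvalue = 0.0
    for _ in range(num_iters):
        w = A @ v                          # apply the matrix
        v_new = w / np.linalg.norm(w)      # re-normalize to avoid overflow/underflow
        new_eig = v_new @ A @ v_new        # Rayleigh quotient estimate
        if abs(new_eig - eigenvalue) < tol:
            return new_eig, v_new
        eigenvalue, v = new_eig, v_new
    return eigenvalue, v

A = np.array([[2.0, 1.0], [1.0, 3.0]])     # illustrative symmetric matrix
print(power_method(A)[0])                  # ~3.618, the dominant eigenvalue
```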

2. Inverse Power Method

The Inverse Power Method, presented in this notebook, is an extension of the Power Method used to find the smallest (in magnitude) eigenvalue and corresponding eigenvector of a matrix. It is often employed in solving systems of linear equations and eigenvalue problems.
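
Again as a rough sketch rather than the notebook's exact code, the same iteration can be run on A⁻¹ by solving a linear system at each step instead of forming the inverse explicitly:

```python
import numpy as np

def inverse_power_method(A, num_iters=1000, tol=1e-10):
    """Estimate the smallest-magnitude eigenvalue and eigenvector of A."""
    v = np.random.default_rng(0).standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    eigenvalue = 0.0
    for _ in range(num_iters):
        w = np.linalg.solve(A, v)          # power iteration on A^-1 without forming the inverse
        v_new = w / np.linalg.norm(w)
        new_eig = v_new @ A @ v_new        # Rayleigh quotient with respect to A itself
        if abs(new_eig - eigenvalue) < tol:
            return new_eig, v_new
        eigenvalue, v = new_eig, v_new
    return eigenvalue, v

A = np.array([[2.0, 1.0], [1.0, 3.0]])
print(inverse_power_method(A)[0])          # ~1.382, the smallest eigenvalue of A
```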

3. Least Squares Regression

This notebook delves into the concept of Least Squares Regression, a popular linear regression technique used to model the relationship between variables by minimizing the sum of the squares of the differences between observed and predicted values. It is implemented from scratch for both linear and non-linear estimates.
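
For reference, the core of a from-scratch fit is just the normal equations; the sketch below (on synthetic data, not the notebook's dataset) shows a linear fit and a non-linear one obtained by adding basis columns:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 + 3.0 * x + 0.1 * rng.standard_normal(50)    # synthetic data for illustration

# Linear fit: minimize ||X @ beta - y||^2 via the normal equations X^T X beta = X^T y.
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)                                          # roughly [2, 3]

# A non-linear (here quadratic) estimate reuses the same machinery with extra basis columns.
X_quad = np.column_stack([np.ones_like(x), x, x**2])
beta_quad = np.linalg.solve(X_quad.T @ X_quad, X_quad.T @ y)
```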

4. Multiclass Classification on MNIST

Here, we explore multiclass classification on the MNIST dataset, a classic collection of handwritten digit images. The notebook implements multiclass logistic regression from scratch, training it with both mean squared error (L2) loss and cross-entropy (CE) loss using full-batch gradient descent (GD) as well as stochastic/mini-batch gradient descent (SGD). Finally, it trains a 2-hidden-layer neural network on the same image data.
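
As a condensed sketch of the cross-entropy/SGD part only (the notebook works on real MNIST images; the random data below just stands in for the 784-dimensional inputs):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)            # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax_sgd(X, y, num_classes, lr=0.1, epochs=20, batch=64, seed=0):
    """Multiclass logistic regression trained with cross-entropy loss and mini-batch SGD."""
    rng = np.random.default_rng(seed)
    W = np.zeros((X.shape[1], num_classes))
    Y = np.eye(num_classes)[y]                      # one-hot labels
    for _ in range(epochs):
        order = rng.permutation(len(X))
        for start in range(0, len(X), batch):
            b = order[start:start + batch]
            P = softmax(X[b] @ W)                   # predicted class probabilities
            W -= lr * X[b].T @ (P - Y[b]) / len(b)  # gradient of the cross-entropy loss
    return W

# Random stand-in for MNIST-shaped data (784 features, 10 classes).
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 784))
y = rng.integers(0, 10, size=1000)
W = train_softmax_sgd(X, y, num_classes=10)
```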

5. Pagerank Algorithm

The PageRank algorithm is famously known for powering Google's search engine. This notebook provides insight into how the algorithm uses the link structure between articles to rank them by importance.
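
A compact sketch of the idea, power iteration on the damped transition matrix of a small made-up link graph:

```python
import numpy as np

def pagerank(adj, damping=0.85, num_iters=100):
    """Power iteration on the damped transition matrix of a link graph."""
    n = adj.shape[0]
    out_degree = adj.sum(axis=1, keepdims=True)
    out_degree[out_degree == 0] = 1.0                 # crude handling of dangling nodes
    P = adj / out_degree                              # row-stochastic transition matrix
    rank = np.full(n, 1.0 / n)
    for _ in range(num_iters):
        rank = (1.0 - damping) / n + damping * (P.T @ rank)
    return rank / rank.sum()

# Four articles where everything links to article 0.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 0]], dtype=float)
print(pagerank(adj))                                  # article 0 gets the largest score
```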

6. Evaluating Robustness of Neural Networks

Neural networks are powerful tools in machine learning, but they are susceptible to adversarial attacks and may not always generalize well. In this notebook, I implement FGSM (Fast Gradient Sign Method) and PGD (Projected Gradient Descent) adversarial attacks on an MNIST neural network model and visualize those attacks. I also verify the model using Interval Bound Propagation (IBP).
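
As an illustration of the FGSM step only, here is a rough PyTorch sketch (not the notebook's exact code); the tiny linear model and random images are placeholders for a trained MNIST classifier, and PGD iterates a smaller version of this step while projecting back into an ε-ball:

```python
import torch
import torch.nn as nn

def fgsm_attack(model, x, y, epsilon):
    """Fast Gradient Sign Method: one signed-gradient step that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()     # move each pixel in the direction that hurts the model
    return x_adv.clamp(0, 1).detach()       # keep pixels in the valid image range

# Placeholder model and random "images"; a real run would use a trained MNIST classifier.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(8, 1, 28, 28)
y = torch.randint(0, 10, (8,))
x_adv = fgsm_attack(model, x, y, epsilon=0.1)
```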

R Notebooks

1. Confidence Band for Linear Regression

This notebook focuses on creating confidence bands for linear regression models, which help visualize the uncertainty around the regression line and predictions. Understanding confidence bands is crucial in drawing accurate conclusions from regression analyses.
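
The notebook itself is written in R; purely for illustration, a NumPy/SciPy sketch of a pointwise 95% confidence band for the mean response (on synthetic data) looks like this. A simultaneous band would use a wider critical value.

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 40)
y = 1.0 + 0.5 * x + rng.standard_normal(40)

# Fit y = b0 + b1 * x by least squares.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
n, p = X.shape
y_hat = X @ beta
sigma2 = np.sum((y - y_hat) ** 2) / (n - p)        # residual variance estimate

# Pointwise 95% band for the mean response: y_hat +/- t * se(x), se(x)^2 = sigma2 * x^T (X^T X)^-1 x.
se_mean = np.sqrt(sigma2 * np.einsum('ij,jk,ik->i', X, np.linalg.inv(X.T @ X), X))
t_crit = stats.t.ppf(0.975, df=n - p)
lower, upper = y_hat - t_crit * se_mean, y_hat + t_crit * se_mean
```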

2. Linear Regression Assumptions

Linear Regression comes with several assumptions that need to be validated before trusting the model's results. This notebook covers how to check these assumptions and interpret the results.
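
The notebook does this in R with the usual diagnostic plots and tests; as a rough Python illustration of two of the checks (normality and constant variance of the residuals) on synthetic data:

```python
import numpy as np
from scipy import stats

# Fit a simple model and inspect the residuals (synthetic data for illustration).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 100)
y = 2.0 + 0.7 * x + rng.standard_normal(100)
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
residuals = y - fitted

# Normality of residuals: Shapiro-Wilk test (a large p-value shows no evidence against normality).
print(stats.shapiro(residuals))

# Constant variance (homoscedasticity): |residuals| should be uncorrelated with the fitted values.
print(np.corrcoef(np.abs(residuals), fitted)[0, 1])
```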

3. Polynomial Regression and Piecewise Constant Fit

In this notebook, we explore Polynomial Regression, a method that extends linear regression to capture nonlinear relationships between variables. Additionally, we discuss Piecewise Constant Fit, an alternative approach for modeling data with abrupt changes.
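
The notebook is in R; as a quick Python illustration of the two ideas on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 80)
y = np.sin(x) + 0.2 * rng.standard_normal(80)

# Polynomial regression: a cubic least-squares fit.
coeffs = np.polyfit(x, y, deg=3)
y_poly = np.polyval(coeffs, x)

# Piecewise constant fit: average y within each of 8 equal-width bins of x (a step function).
edges = np.linspace(0.0, 4.0, 9)
bin_id = np.digitize(x, edges[1:-1])
y_step = np.array([y[bin_id == b].mean() for b in bin_id])
```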

4. ANOVA, F-Test, Hypothesis Test and Polynomial Regression

Analysis of Variance (ANOVA) is a statistical method used to compare means across two or more groups. In this notebook, we'll cover ANOVA, the F-test, and hypothesis testing, and incorporate polynomial regression into the ANOVA framework.
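
The notebook works in R; the one-way ANOVA F-test itself is easy to illustrate in Python on synthetic groups:

```python
import numpy as np
from scipy import stats

# One-way ANOVA on three synthetic groups: do the group means differ?
rng = np.random.default_rng(0)
g1 = rng.normal(0.0, 1.0, 30)
g2 = rng.normal(0.5, 1.0, 30)
g3 = rng.normal(1.0, 1.0, 30)
f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f_stat, p_value)        # a small p-value rejects "all group means are equal"

# The same F-test logic also compares nested regression models (e.g. linear vs. cubic):
# F = ((RSS_small - RSS_big) / extra_params) / (RSS_big / (n - p_big)).
```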

5. Studentized Bootstrap and Student Confidence Intervals

Bootstrap methods are powerful tools for estimating uncertainties and constructing confidence intervals. This notebook explores Studentized Bootstrap, a variant of the bootstrap that provides more accurate confidence intervals for certain statistics.
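
The notebook uses R; a rough Python sketch of the bootstrap-t (studentized bootstrap) interval for a sample mean, on synthetic skewed data:

```python
import numpy as np

def studentized_bootstrap_ci(data, num_boot=2000, alpha=0.05, seed=0):
    """Bootstrap-t (studentized bootstrap) confidence interval for the sample mean."""
    rng = np.random.default_rng(seed)
    n = len(data)
    mean_hat = data.mean()
    se_hat = data.std(ddof=1) / np.sqrt(n)
    t_star = np.empty(num_boot)
    for b in range(num_boot):
        resample = rng.choice(data, size=n, replace=True)
        se_b = resample.std(ddof=1) / np.sqrt(n)
        t_star[b] = (resample.mean() - mean_hat) / se_b        # studentized statistic
    q_lo, q_hi = np.quantile(t_star, [alpha / 2, 1 - alpha / 2])
    return mean_hat - q_hi * se_hat, mean_hat - q_lo * se_hat  # note the reversed quantiles

data = np.random.default_rng(1).exponential(scale=2.0, size=50)
print(studentized_bootstrap_ci(data))   # 95% interval for the mean of a skewed sample
```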

6. Logistic Regression, Forward Selection and Bootstrapping

Logistic Regression is widely used for binary classification problems. In this notebook, we'll learn how to build and interpret logistic regression models. Additionally, we'll cover forward selection, a feature selection technique, and explore how bootstrapping can improve model evaluation.
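
The notebook does this in R; for illustration only, a greedy forward-selection loop around a logistic model can be sketched in Python (scikit-learn is just a convenient stand-in here, with synthetic data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary-classification data: only features 0 and 2 actually matter.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
y = (X[:, 0] + 0.5 * X[:, 2] + 0.3 * rng.standard_normal(200) > 0).astype(int)

# Greedy forward selection: repeatedly add the feature that most improves CV accuracy.
selected, remaining, best_score = [], list(range(X.shape[1])), 0.0
while remaining:
    scores = {j: cross_val_score(LogisticRegression(), X[:, selected + [j]], y, cv=5).mean()
              for j in remaining}
    j_best = max(scores, key=scores.get)
    if scores[j_best] <= best_score:
        break                                    # no remaining feature improves the model
    best_score = scores[j_best]
    selected.append(j_best)
    remaining.remove(j_best)

print(selected, best_score)                      # typically picks features 0 and 2 first
```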

7. Cross-validation and Sequential Model Selection

Cross-validation is essential for estimating the generalization performance of a model. This notebook explains various cross-validation techniques and demonstrates how to perform sequential model selection to identify the best model among alternatives.
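
The notebook is in R; a small Python sketch of k-fold cross-validation used to pick a polynomial degree (synthetic data, illustrative only):

```python
import numpy as np

def kfold_mse(x, y, degree, k=5, seed=0):
    """Average held-out MSE of a degree-`degree` polynomial fit under k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train], y[train], degree)
        errors.append(np.mean((y[test] - np.polyval(coeffs, x[test])) ** 2))
    return float(np.mean(errors))

# Sequential model selection: grow the model (degree) and keep the one with the lowest CV error.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 3.0, 120)
y = 1.0 + 2.0 * x - 0.5 * x**2 + 0.3 * rng.standard_normal(120)
cv_errors = {d: kfold_mse(x, y, d) for d in range(1, 6)}
print(min(cv_errors, key=cv_errors.get))   # degree with the lowest CV error (likely 2 here)
```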

8. Analysis of Chemical Plant Data

In this notebook, we'll analyze data from a chemical plant, applying the statistical techniques covered in the previous notebooks to variables describing the plant's operation and drawing meaningful insights from the data.

I hope you find these notebooks insightful and beneficial for your learning and data analysis journey. Feel free to explore, experiment, and adapt the code to suit your specific needs. Happy coding!
