machine-learning

Various ML coding projects

HW - 1 LR

Generating correlated data and running several algorithms on it to see how well each performs under different conditions.
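As a rough illustration of the "correlated data" step, the sketch below draws pairs of standard normal values with a chosen target correlation and then measures the sample correlation. It is a dependency-free example, not the assignment's actual code: the function names `make_correlated_pairs` and `pearson`, the sample size, and the target value 0.8 are all assumptions made for illustration.

```python
import math
import random


def make_correlated_pairs(n, rho, seed=0):
    """Draw n (x, y) pairs of standard normals with target correlation rho.

    Hypothetical helper for illustration; not taken from the assignment.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)  # noise drawn independently of x
        # Mixing x with independent noise gives corr(x, y) = rho in expectation.
        y = rho * x + math.sqrt(1.0 - rho ** 2) * z
        pairs.append((x, y))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


pairs = make_correlated_pairs(n=5000, rho=0.8, seed=42)
sample_rho = pearson(pairs)
print(f"target rho = 0.8, sample rho = {sample_rho:.3f}")
```

Varying `rho` between 0 and 1 is one simple way to produce the "various situations" against which different algorithms can be compared.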

HW - 2 EDA

The purpose of this assignment is to conduct exploratory data analysis (EDA) on a merged dataset. EDA uncovers trends, relationships, and patterns in the data, and it can also identify parts of the data that need to be cleaned. The assignment builds experience in handling, analyzing, and visualizing messy, real-world-style data.

HW - 3 MLR

The goal of this project was to explore MLR and logistic regression in depth. This involved setting up the full ML pipeline, manipulating data, and experimenting with the models to optimize them. It provided experience in iterating on a model, as we would for a real-life problem.

HW - 4 SGD and KNN

The purpose of this task is to gain experience with KNN and SGD regression and classification models, and then to put theory into practice by fully analyzing the results and data.
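For readers unfamiliar with the two techniques, here is a minimal, dependency-free sketch of each: a k-nearest-neighbors classifier that takes a majority vote among the closest training points, and a linear regression fitted by stochastic gradient descent on squared error. The function names, the toy data, and the hyperparameters (`k`, `lr`, `epochs`) are assumptions chosen for illustration and are not taken from the assignment.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def sgd_linear_fit(xs, ys, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w * x + b with per-sample gradient steps on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]  # gradient of 0.5 * err**2 with respect to w
            b -= lr * err          # gradient of 0.5 * err**2 with respect to b
    return w, b


# Toy classification data: two small, well-separated groups.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]
pred_low = knn_predict(train_X, train_y, (1.1, 1.0), k=3)
pred_high = knn_predict(train_X, train_y, (5.1, 5.0), k=3)

# Toy regression data generated from the noise-free line y = 2x + 1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(pred_low, pred_high, round(w, 3), round(b, 3))
```

KNN stores the training data and defers all work to prediction time, while SGD learns a small set of parameters incrementally, which is why the two behave very differently as the dataset grows.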

HW - 5 Tree Based Classifiers

The purpose of this task is to gain experience with tree-based and KNN classification models, and then to put theory into practice by fully analyzing the results and data from the EPA.

HW - 6 Comprehensive Supervised Learning

The overall purpose of this assignment is to tie together a variety of supervised learning techniques to analyze several questions related to the Behavioral Risk Factor Surveillance System. More specifically, the goal is to use supervised ML to identify patterns of comorbidity among the survey respondents. This is a comparative exercise focusing on pre-COVID (2019) and post-COVID (2021) health.

HW - 7 Unsupervised Learning (clustering)

This assignment applies several clustering algorithms to the MNIST data to group similar observations together based on their characteristics. The aim is to identify patterns and outliers in the data. The clustering algorithms used are k-means, mini-batch k-means, DBSCAN, and HDBSCAN, each applied to both the original and the noisy data. The essential steps are importing and exploring the data, standardizing it, identifying and filtering out outliers, clustering with the chosen techniques, selecting 10-cluster solutions for all techniques, and tuning the algorithms to compare the clusters with the actual response. The final output is a fully executed Jupyter notebook that comments on the best outcomes and the algorithms that produced them.
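The standardize-then-cluster steps can be sketched on a tiny scale as follows. This is a dependency-free illustration of plain k-means (Lloyd's algorithm) only. It uses an invented two-dimensional toy dataset with two clusters rather than MNIST with ten, and it does not cover mini-batch k-means, DBSCAN, HDBSCAN, or outlier filtering. The function names and the data are assumptions made for the example.

```python
import math
import random
import statistics


def standardize(rows):
    """Rescale each column to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        for p in points
    ]


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means: alternate assignment and center updates until stable."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(statistics.fmean(d) for d in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # no center moved, so the assignment is stable
        centers = new_centers
    return assign(points, centers), centers


# Invented toy data: two groups of four points each.
raw = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
points = standardize(raw)
labels, centers = kmeans(points, k=2, seed=0)
print(labels)
```

Because cluster indices are arbitrary, comparing clusters with the actual response means checking which points are grouped together rather than matching label values directly.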
