Projects by Beck Edwards
This repository contains projects completed for INDE 577: Data Science & Machine Learning at Rice University. Each concept has a Jupyter notebook walkthrough that explains the machine learning technique. Feel free to modify and experiment with any of the notebooks; this repository is intended for public use.
The concepts covered in this repo include:
Supervised learning involves learning a function that maps input data to output labels or values, using labeled datasets.
- The Perceptron: A fundamental linear classifier for binary classification problems.
- Linear Regression: Predicts continuous values by modeling relationships between features and a target variable.
- Logistic Regression: Used for binary classification by modeling the probability of a class using a logistic function.
- Neural Networks: Layered models of weighted units and non-linear activations that learn complex, non-linear relationships between inputs and outputs.
- K Nearest Neighbors (KNN): A simple algorithm that classifies a point based on the majority class of its nearest neighbors.
- Decision Trees / Regression Trees: Tree-based models for classification and regression tasks that split the data based on feature thresholds.
- Random Forests: An ensemble of decision trees, each trained on a bootstrap sample of the data, whose combined predictions reduce overfitting relative to a single tree.
- Other Ensemble Methods, including Boosting: Techniques like AdaBoost and Gradient Boosting that combine weak learners into a strong predictive model.
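As a small illustration of the first item in the list above, here is a minimal perceptron sketch in plain Python. It is not taken from the notebooks: the function names `train_perceptron` and `predict`, the learning rate, the epoch cap, and the toy logical-AND dataset are all assumptions chosen for illustration.

```python
def train_perceptron(samples, labels, epochs=50, lr=1.0):
    """Train a perceptron on labels in {-1, +1}; returns (weights, bias)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            # A point is misclassified (or on the boundary) when y * activation <= 0.
            if y * activation <= 0:
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
                mistakes += 1
        if mistakes == 0:
            # A full pass with no updates means the data are separated.
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Toy data (assumed for illustration): logical AND, which is linearly separable.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)  # matches labels on this toy set
```

The update rule only fires on misclassified points, so training stops early once the data are separated. On data that are not linearly separable the loop would simply run until the epoch cap.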
Unsupervised learning identifies patterns and structures in unlabeled datasets.
- K-Means Clustering: Groups data points into clusters based on feature similarity using a distance metric.
- DBSCAN: A density-based clustering algorithm that identifies clusters of arbitrary shapes and marks outliers as noise.
- Principal Component Analysis (PCA): Reduces dimensionality by finding directions of maximum variance in the data.
- Image Compression with the Singular Value Decomposition (SVD): Compresses images by keeping only the largest singular values and their corresponding singular vectors, which gives a low-rank approximation of the original image.
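To make the clustering idea from the list above concrete, here is a minimal k-means sketch in plain Python. It is an illustration rather than the notebook implementation: the `kmeans` function, the choice to initialize centroids from the first `k` points, and the toy dataset are assumptions made to keep the example deterministic.

```python
import math


def kmeans(points, k, iterations=100):
    """Cluster points with Lloyd's algorithm; returns (centroids, assignments)."""
    # Assumption: use the first k points as initial centroids so the result
    # is deterministic. Random or k-means++ initialization is more common.
    centroids = [points[i] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the algorithm has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignments


# Toy data (assumed for illustration): two well-separated groups in 2D.
points = [
    (1.0, 1.0), (8.0, 8.0), (1.2, 0.8),
    (8.2, 7.9), (0.8, 1.1), (7.9, 8.3),
]

centroids, assignments = kmeans(points, k=2)
print(assignments)
print(centroids)
```

Euclidean distance is used here as the distance metric. Swapping `math.dist` for another metric changes the assignment step, but the centroid-mean update is only guaranteed to reduce the objective under squared Euclidean distance.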
All notebooks were written to run on Python 3.11.5.