Collection of ML use-cases, implementations, and cleaned datasets for reference.

ccmilne/ml-approaches

ML References

A collection of ML use-cases, implementations, and cleaned datasets for reference. Scripts come from personal projects and coursework.

Folder Structure

Directories are grouped by ML library, followed by a separate tree of coursework examples organized by course.

|--PyTorch
|   |-- Classifying the Political Framing of Campaign Emails (Logistic Regression)
|   |-- Train a Word2Vec model on Wikipedia Biographies with debiasing (Tensorboard)
|
|--Keras
|   |-- Computer Vision with CNN
|
|--HuggingFace (Transformers)
|   |-- Predicting Helpful Stack Overflow Answers and Data Annotation/Measuring Annotation Quality
|   |-- Pattern-Based Learning (Exploitation Training) for Toxic Language
|
|--Scikit-Learn
|
|--Coursework Examples
|   |--SI630 - Natural Language Processing
|   |   |-- Classifying the Political Framing of Campaign Emails (Logistic Regression)
|   |   |-- Train a Word2Vec model on Wikipedia Biographies with debiasing (Tensorboard)
|   |   |-- Predicting Helpful Stack Overflow Answers and Data Annotation/Measuring Annotation Quality (HuggingFace)
|   |   |-- Pattern-Based Learning (Exploitation Training) for Toxic Language
|   |
|   |--SI670 - Applied Machine Learning
|   |   |--TBD
|   |
|   |--SI671 - Data Mining
|   |   |-- Mining and Evaluating Frequent Itemsets on Twitter Emojis
|   |   |-- Time Series analysis of COVID-19 trends for G7 Nations
|   |   |-- Social Network Analysis for Amazon Product Reviews

Use-Cases (excluding Neural Net Approaches)

Entries follow the pattern: project focus (dataset: methods and topics).

  • Supervised Learning

    • Regression Models (Linear, Logistic, Ridge, Lasso)

    • Tree-Based Models (Decision Tree, Random Forests, Gradient Boosting Regression, XGBoost, LightGBM Regressor)

      • Credit Risk (Statlog German Credit Data: Decision Tree, Random Forest)

  • Unsupervised Learning

    • Clustering (K-Means, Hierarchical Clustering, Gaussian Mixture Models)

    • Association (Apriori algorithm)

  • Other

    • Time Series

    • NLP
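As a reference point for the Association entry above, here is a minimal pure-Python sketch of the Apriori idea: count frequent single items, then repeatedly join and prune to build larger frequent itemsets. It is not taken from the repository's scripts. The function name `apriori`, the `min_support` parameter, and the toy baskets (plain strings standing in for emojis) are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    Hypothetical helper for illustration. `transactions` is a list of
    iterables of hashable items; support is the fraction of transactions
    that contain every item of the itemset.
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain the whole itemset.
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1: frequent single items.
    current = {}
    for item in {item for basket in baskets for item in basket}:
        candidate = frozenset([item])
        s = support(candidate)
        if s >= min_support:
            current[candidate] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count step: keep only candidates that meet the support threshold.
        next_level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                next_level[c] = s
        current = next_level
    return frequent


# Made-up toy data: each list is one "tweet" reduced to its set of symbols.
toy = [
    ["joy", "heart", "fire"],
    ["joy", "heart"],
    ["joy", "fire"],
    ["heart", "fire"],
    ["joy", "heart", "fire"],
]
result = apriori(toy, min_support=0.6)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
```

With a threshold of 0.6 the three single items and the three pairs survive, while the full triple (present in 2 of 5 baskets) is dropped. The prune step relies on the fact that a superset can never be more frequent than any of its subsets. For real datasets, a library implementation is likely a better choice than this sketch.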
