yaswanthteja/30DaysMachine-Learning-Challange-roadmap

30DaysMachine-Learning-Challange-roadmap

This is a tentative roadmap for our 30-day machine learning challenge. I will add more information along the way.

  • Core concepts of Machine Learning

  • Machine Learning Process

  • Basic Data Exploration

  • Working with Missing Data

a. Examining Missing Data

b. Dropping Missing Data

c. Imputing Data

d. Adding Indicator Columns
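
The four missing-data steps above can be sketched in a few lines. This is only an illustration, assuming a made-up toy dataset in which `None` marks a missing value. The column names, the values, and the `_na` suffix are hypothetical, and the challenge itself may well use a library such as pandas instead of plain Python.

```python
from statistics import median

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 22.0, "fare": 7.25},
    {"age": None, "fare": 71.28},
    {"age": 26.0, "fare": None},
    {"age": 35.0, "fare": 53.10},
]
columns = ["age", "fare"]

# a. Examine: count the missing values in each column.
missing_counts = {c: sum(r[c] is None for r in rows) for c in columns}

# b. Drop: keep only the rows that have no missing values.
complete_rows = [r for r in rows if all(r[c] is not None for c in columns)]

# c. Impute with the column median, and
# d. add an indicator column recording where a value was missing.
medians = {c: median(r[c] for r in rows if r[c] is not None) for c in columns}
imputed_rows = []
for r in rows:
    new_row = {}
    for c in columns:
        new_row[c + "_na"] = r[c] is None  # indicator column
        new_row[c] = medians[c] if r[c] is None else r[c]
    imputed_rows.append(new_row)

print(missing_counts)
print(len(complete_rows))
print(imputed_rows[1])
```

Dropping is the simplest option but discards data. Imputing keeps every row, and the indicator column preserves the fact that a value was originally missing.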

  • Your first machine-learning model

  • Working with Cleaning Data

a. Column Names

b. Replacing Missing Values

  • Model Validation

  • Data Exploration

a. Data Size

b. Summary Stats

c. Histogram

d. Scatter Plot

e. Joint Plot

f. Pair Grid

g. Box and Violin Plots

h. Comparing Two Ordinal Values

i. Correlation

j. RadViz

k. Parallel Coordinates

  • Preprocessing Data

a. Standardize

b. Scale to Range

c. Dummy Variables

d. Label Encoder

e. Frequency Encoding

f. Pulling Categories from Strings

g. Other Categorical Encoding

h. Date Feature Engineering

i. Add col_na Feature

j. Manual Feature Engineering
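
A few of the transformations listed above (standardize, scale to range, dummy variables, frequency encoding) can be sketched with the standard library alone. The sample values and the `color` feature are invented for illustration. Standardizing here divides by the population standard deviation, which is one common convention, not necessarily the one the challenge will use.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical numeric feature.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Standardize: subtract the mean and divide by the standard deviation.
mu, sigma = mean(values), pstdev(values)
standardized = [(v - mu) / sigma for v in values]

# Scale to range: map the minimum to 0 and the maximum to 1.
lo, hi = min(values), max(values)
scaled = [(v - lo) / (hi - lo) for v in values]

# Hypothetical categorical feature.
colors = ["red", "green", "red", "blue"]

# Dummy variables: one 0/1 column per category.
categories = sorted(set(colors))
dummies = [{f"color_{c}": int(v == c) for c in categories} for v in colors]

# Frequency encoding: replace each category with its relative frequency.
counts = Counter(colors)
frequency = [counts[v] / len(colors) for v in colors]

print(standardized)
print(scaled)
print(dummies[0])
print(frequency)
```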

  • Feature Selection

a. Collinear Columns

b. Lasso Regression

c. Recursive Feature Elimination

d. Mutual Information

e. Principal Component Analysis

f. Feature Importance

  • Dealing with Imbalanced Classes

a. Use a Different Metric

b. Tree-based Algorithms and Ensembles

c. Penalize Models

d. Upsampling Minority

e. Generate Minority Data

f. Downsampling Majority

g. Upsampling Then Downsampling
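
As a rough sketch of the two resampling ideas above, the snippet below upsamples a minority class by drawing with replacement and downsamples a majority class by drawing without replacement. The data, the class labels, and the seed are all made up. In practice resampling is normally applied to the training split only, so that evaluation data keeps its original class balance.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical imbalanced data: 8 majority rows and 2 minority rows.
majority = [("neg", i) for i in range(8)]
minority = [("pos", i) for i in range(2)]

# Upsampling minority: draw extra minority rows with replacement
# until both classes have the same size.
extra = rng.choices(minority, k=len(majority) - len(minority))
upsampled = majority + minority + extra

# Downsampling majority: draw a subset of majority rows without
# replacement, matching the minority size.
downsampled = rng.sample(majority, k=len(minority)) + minority

print(len(upsampled), len(downsampled))
```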

  • Classification Algorithms
  • Model Selection
  • Metrics and Classification Evaluation

a. Confusion Matrix

b. Metrics

c. Accuracy

d. Recall

e. Precision

f. F1

g. Classification Report

h. ROC

i. Precision-Recall Curve

j. Cumulative Gains Plot

k. Lift Curve

l. Class Balance

m. Class Prediction Error

n. Discrimination Threshold
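
To show how the first few metrics above relate to each other, here is a minimal sketch that builds a binary confusion matrix by hand and derives accuracy, precision, recall, and F1 from it. The label vectors are invented for illustration, with 1 treated as the positive class.

```python
# Hypothetical true labels and model predictions (1 = positive class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# Confusion matrix cells.
pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

# Metrics derived from the four cells.
accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were found
f1 = 2 * precision * recall / (precision + recall)

print(tp, tn, fp, fn)
print(accuracy, precision, recall, f1)
```

This sketch assumes at least one predicted positive and one actual positive. Otherwise precision or recall would divide by zero and would need a guard.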

Day Thirteen:

  • Explaining Classification Model

Day Fourteen:

  • Regression Algorithms

Day Fifteen:

  • Metrics and Regression Evaluation

Day Sixteen:

  • Explaining Regression Model

Day Seventeen:

  • Dimensionality Reduction

Day Eighteen:

  • Clustering

Day Nineteen:

  • Implementing Pipeline

Day Twenty:

  • Neural networks

  • Artificial neural networks (ANN)

Day Twenty-one:

  • Project:

  • ANN walkthrough: Predicting Stock Prices

Day Twenty-two:

  • Natural Language Processing (NLP)

Day Twenty-three:

  • Project:

  • NLP walkthrough: Mining Newsgroups Dataset

Day Twenty-four:

  • Deep Learning Basics

Day Twenty-five:

  • Problems and Solutions

Day Twenty-six:

  • Machine Learning best practices

Day Twenty-seven:

  • Project:

  • Building a Movie Recommendation Engine

Day Twenty-eight:

  • Project:

  • Recognizing Faces

Day Twenty-nine:

  • Project: Predicting Online Ad Click-Through: Tree-based Algorithm

Day Thirty:

  • Project: NewsGroups Dataset with Clustering and Topic Modeling

Credit