kirankotari/mlcourse

mlcourse.ai – Open Machine Learning Course

ODS stickers

Russian version 🇷🇺

Current session: February 11th - April 26th, 2019. You can join at any point: fill in this form to participate, and please explore the main page mlcourse.ai as well.

The next (and final) session launches on September 2, 2019.

Mirrors (🇬🇧-only): mlcourse.ai (main site), Kaggle Dataset (same notebooks as Kernels)

Outline

This is the list of published articles on medium.com 🇬🇧, habr.com 🇷🇺, and jqr.com 🇨🇳. Icons are clickable. Links to Kaggle Kernels (in English) are also given, so you can reproduce everything without installing a single package.

  1. Exploratory Data Analysis with Pandas 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
  2. Visual Data Analysis with Python 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2
  3. Classification, Decision Trees and k Nearest Neighbors 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
  4. Linear Classification and Regression 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2, part3, part4, part5
  5. Bagging and Random Forest 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2, part3
  6. Feature Engineering and Feature Selection 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
  7. Unsupervised Learning: Principal Component Analysis and Clustering 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
  8. Vowpal Wabbit: Learning with Gigabytes of Data 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
  9. Time Series Analysis with Python, part 1 🇬🇧 🇷🇺 🇨🇳. Predicting future with Facebook Prophet, part 2 🇬🇧, 🇨🇳 Kaggle Kernels: part1, part2
  10. Gradient Boosting 🇬🇧 🇷🇺, 🇨🇳, Kaggle Kernel

Lectures

Video lectures are uploaded to this YouTube playlist. Introduction, video, slides

  1. Exploratory data analysis with Pandas, video
  2. Visualization, main plots for EDA, video
  3. Decision trees: theory and practical part
  4. Logistic regression: theoretical foundations, practical part (baselines in the "Alice" competition)
  5. Ensembles and Random Forest – part 1. Classification metrics – part 2. Example of a business task, predicting a customer payment – part 3
  6. Linear regression and regularization - theory, LASSO & Ridge, LTV prediction - practice
  7. Unsupervised learning - Principal Component Analysis and Clustering
  8. Stochastic Gradient Descent for classification and regression - part 1, part 2 TBA
  9. Time series analysis with Python (ARIMA, Prophet) - video
  10. Gradient boosting: basic ideas - part 1, key ideas behind Xgboost, LightGBM, and CatBoost + practice - part 2

Spring 2019 assignments

  1. Exploratory Data Analysis (EDA) of US flights, nbviewer. Deadline: February 24, 20:59 GMT
  2. In Assignment 2, you'll be beating baselines in the first two competitions:
    • Part 1. User Identification with Logistic Regression (beating baselines in the "Alice" competition), nbviewer. Deadline: March 10, 20:59 GMT
    • Part 2. Predicting Medium articles popularity with Ridge Regression (beating baselines in the "Medium" competition), nbviewer. Deadline: March 10, 20:59 GMT
  3. Decision trees, Random Forest, and gradient boosting. Deadline: March 31, 20:59 GMT
    • Part 1. "Decision trees for classification and regression", nbviewer
    • Part 2. "Random Forest and Logistic Regression in credit scoring and movie reviews classification", nbviewer
    • Part 3. "Flight delays" competition, Kernel starter
  4. Time series analysis, nbviewer. Deadline: April 7, 20:59 GMT

Demo assignments, just for practice

The following are demo versions. Full versions are announced during course sessions.

  1. Exploratory data analysis with Pandas, nbviewer, Kaggle Kernel, solution
  2. Analyzing cardiovascular disease data, nbviewer, Kaggle Kernel, solution
  3. Decision trees with a toy task and the UCI Adult dataset, nbviewer, Kaggle Kernel, solution
  4. Sarcasm detection, Kaggle Kernel, solution. Linear Regression as an optimization problem, nbviewer, Kaggle Kernel
  5. Logistic Regression and Random Forest in the credit scoring problem, nbviewer, Kaggle Kernel, solution
  6. Exploring OLS, Lasso and Random Forest in a regression task, nbviewer, Kaggle Kernel, solution
  7. Unsupervised learning, nbviewer, Kaggle Kernel, solution
  8. Implementing online regressor, nbviewer, Kaggle Kernel, solution
  9. Time series analysis, nbviewer, Kaggle Kernel, solution
  10. Beating baseline in a competition, Kaggle kernel

Kaggle competitions

  1. Catch Me If You Can: Intruder Detection through Webpage Session Tracking. Kaggle Inclass
  2. How good is your Medium article? Kaggle Inclass
  3. DotA 2 winner prediction Kaggle Inclass

Rating

Throughout the course we maintain a student rating, which takes into account credits scored in assignments and Kaggle competitions. Reportedly, the rating is a strong motivation to finish the course. Top students (according to the final rating) are listed on a special page.

Community

Discussions between students are held in the #mlcourse_ai channel of the OpenDataScience Slack team. Fill in this form to get an invitation: you can join the current session at any point before it ends on April 26, 2019, and invitations for a new session are sent closer to its start. The form will also ask you some personal questions, don't hesitate 👋

The course is free, but you can support the organizers by making a pledge on Patreon (monthly support) or a one-time payment on Ko-fi. This way you'll foster the spread of Machine Learning in the world!
