mlcourse.ai – Open Machine Learning Course
🇷🇺 Russian version 🇷🇺
❗ The next session is planned to launch in February 2019. Please wait for an announcement in January ❗
Mirrors (🇬🇧-only): mlcourse.ai (main site), Kaggle Dataset (same notebooks as Kernels)
Below is the list of articles published on medium.com 🇬🇧, habr.com 🇷🇺, and jqr.com 🇨🇳. The icons are clickable. Links to Kaggle Kernels (in English) are also given, so everything can be reproduced without installing a single package.
- Exploratory Data Analysis with Pandas 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Visual Data Analysis with Python 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2
- Classification, Decision Trees and k Nearest Neighbors 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Linear Classification and Regression 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2, part3, part4, part5
- Bagging and Random Forest 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2, part3
- Feature Engineering and Feature Selection 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Unsupervised Learning: Principal Component Analysis and Clustering 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Vowpal Wabbit: Learning with Gigabytes of Data 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Time Series Analysis with Python, part 1 🇬🇧 🇷🇺 🇨🇳. Predicting the future with Facebook Prophet, part 2 🇬🇧 🇨🇳. Kaggle Kernels: part1, part2 (a minimal Prophet sketch follows this list)
- Gradient Boosting 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
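To give a flavor of the time-series topic above, here is a minimal Prophet sketch. It is not taken from the course notebooks, and ads.csv with its date/value columns is a hypothetical placeholder; at the time of these materials the package was installed as fbprophet (later renamed to prophet).

```python
# Minimal forecasting sketch with Prophet.
# NOTE: "ads.csv" and its "date"/"value" columns are hypothetical placeholders;
# Prophet only needs a DataFrame with columns "ds" (timestamp) and "y" (value).
import pandas as pd
from fbprophet import Prophet  # installed as "prophet" in newer releases

df = pd.read_csv("ads.csv", parse_dates=["date"])
df = df.rename(columns={"date": "ds", "value": "y"})

model = Prophet()
model.fit(df)

# Extend the timeline by 30 periods and predict
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```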
Video lectures are uploaded to this YouTube playlist.
- Introduction, video, slides
- Exploratory data analysis with Pandas, video
- Visualization, main plots for EDA, video
- Decision trees: theory and practical part
- Logistic regression: theoretical foundations, practical part (baselines in the "Alice" competition)
- Ensembles and Random Forest – part 1. Classification metrics – part 2. Example of a business task, predicting customer payments – part 3 (see the toy sketch after this list)
- Linear regression and regularization - theory, LASSO & Ridge, LTV prediction - practice
- Unsupervised learning - Principal Component Analysis and Clustering
- Stochastic Gradient Descent for classification and regression - part 1, part 2 TBA
- Time series analysis with Python (ARIMA, Prophet) - video
- Gradient boosting: basic ideas - part 1, key ideas behind XGBoost, LightGBM, and CatBoost + practice - part 2
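As a toy illustration of the lectures on decision trees and ensembles, here is a self-contained scikit-learn sketch. It uses a built-in dataset rather than the course data and only hints at the ideas covered in the videos.

```python
# Compare a single decision tree with a random forest via cross-validation.
# Uses scikit-learn's built-in breast cancer dataset, not the course data.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(max_depth=5, random_state=17)
forest = RandomForestClassifier(n_estimators=100, random_state=17)

# 5-fold cross-validated accuracy for each model; the ensemble usually wins
print("Decision tree:", cross_val_score(tree, X, y, cv=5).mean())
print("Random forest:", cross_val_score(forest, X, y, cv=5).mean())
```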
The following are demo versions of the course assignments; full versions are announced during course sessions.
- Exploratory data analysis with Pandas, nbviewer, Kaggle Kernel
- Analyzing cardiovascular disease data, nbviewer, Kaggle Kernel
- Decision trees with a toy task and the UCI Adult dataset, nbviewer, Kaggle Kernel
- Linear Regression as an optimization problem, nbviewer, Kaggle Kernel
- Logistic Regression and Random Forest in the credit scoring problem, nbviewer, Kaggle Kernel
- Exploring OLS, Lasso and Random Forest in a regression task, nbviewer, Kaggle Kernel
- Unsupervised learning, nbviewer, Kaggle Kernel
- Implementing an online regressor, nbviewer, Kaggle Kernel
- Time series analysis, nbviewer, Kaggle Kernel
- Beating the baseline in a competition, Kaggle Kernel
- Catch Me If You Can: Intruder Detection through Webpage Session Tracking. Kaggle Inclass (a toy baseline sketch follows this list)
- How good is your Medium article? Kaggle Inclass
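For the "Catch Me If You Can" competition, here is a heavily simplified "bag of sites" baseline sketch in the spirit of the logistic regression lecture. The file train_sessions.csv, the site1...site10 columns, and the target column are hypothetical placeholders; adapt them to the actual data description on the competition page.

```python
# Toy "bag of sites" + logistic regression baseline sketch.
# NOTE: file and column names are hypothetical placeholders,
# not the actual competition files.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

train = pd.read_csv("train_sessions.csv")

site_cols = [f"site{i}" for i in range(1, 11)]
# Represent each session as a space-separated string of visited site ids
sessions = train[site_cols].fillna(0).astype(int).astype(str).apply(" ".join, axis=1)

# Bag-of-sites features + a linear classifier, scored by ROC AUC
X = CountVectorizer().fit_transform(sessions)
y = train["target"]
logit = LogisticRegression(solver="liblinear")
print(cross_val_score(logit, X, y, cv=5, scoring="roc_auc").mean())
```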
Throughout the course, we maintain a student rating that takes into account credits scored in assignments and Kaggle competitions. Top students (according to the final rating) are listed on a special Wiki page.
Discussions between students are held in the #mlcourse_ai channel of the OpenDataScience Slack team. Fill in this form to get an invitation. The form will also ask you some personal questions; don't hesitate 👋
Go to mlcourse.ai
The course is free, but you can support the organizers by making a pledge on Patreon (monthly support) or a one-time payment on Ko-fi.