"It's tough to make predictions, especially about the future."
– Yogi Berra
In this course, you'll create end-to-end solutions to machine learning problems. This course will cover popular applied techniques in both supervised and unsupervised machine learning, such as regression, classification, and clustering. You'll learn how to properly engineer features, apply algorithms, and evaluate model performance. The focus of the course will be Python's scikit-learn library.
Instructor: Brian Spiering
- Working knowledge of probability and statistics.
- Introductory knowledge of linear algebra (e.g., determinants and singular value decomposition).
- Intermediate level of Python (e.g., ability to create classes).
- No previous knowledge of machine learning required.
By the end of the course, you should be able to:
- Build end-to-end machine learning systems to answer meaningful data science questions.
- Write idiomatic code in Python's scikit-learn package to model data.
- Recognize when to apply machine learning techniques and when not to.
- Complete data science take-home challenges that you might encounter during job interviews.
- Welcome
- Machine learning workflow
- Scikit-learn API Overview (Estimators, Transformers, Pipelines)
- Build your first ML model
- Preprocessing
- Feature extraction
- Feature selection
- Principal Component Analysis (PCA)
- Model Selection
- Classifiers (binary classification and multiclass classification)
- Handling class imbalance with SMOTE resampling
- Classification Metrics
- Ensembling
- Feature Importance
- Creating custom classes in scikit-learn
- Clustering
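
As a rough illustration of the estimator, transformer, and pipeline pattern named in the topics above, a minimal sketch might look like the following. The iris dataset, the choice of `StandardScaler` and `LogisticRegression`, and the split parameters are assumptions made for this example, not material taken from the course.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumed example data: the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a test set so the model is evaluated on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A Pipeline chains a transformer (fit/transform) with a final estimator (fit/predict).
pipeline = Pipeline(
    [
        ("scale", StandardScaler()),  # transformer: standardizes each feature
        ("classify", LogisticRegression(max_iter=1000)),  # estimator: the classifier
    ]
)

# Fitting the pipeline fits the scaler on the training data, then the classifier.
pipeline.fit(X_train, y_train)

# score() applies the fitted scaler to the test data and reports accuracy.
accuracy = pipeline.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

Wrapping the scaler and the classifier in one pipeline means the scaling statistics are learned from the training split only, which helps avoid leaking information from the test split into the model.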
There are five hands-on assignments to practice applying course concepts to real-world data.
There is a final project where you choose a dataset and complete an end-to-end machine learning project.