Machine Learning Fundamentals Part I - Classification Models

This repository contains the lab instructions, demonstration scripts, Python notebooks, and data for the course Machine Learning Fundamentals Part I - Classification Models.

The Python notebooks are designed to be run using the Google Colab environment.

About the Course

This five-day instructor-led course is intended for IT professionals who wish to learn the fundamental principles of machine learning classification models. Students are typically developers or data analysts who have some programming skills and a mathematical background. In this course, students will learn about classification models and the algorithms that underpin them. Students will also learn why data quality matters, how selecting the right data helps to build more accurate models, and how to test and validate models.

This course is generic and doesn’t depend on any particular platform (Azure, AWS, Google Cloud, etc.), but it does require some basic familiarity with Python. Students should also be familiar with the use of matrices and vectors in algebra, and have a basic understanding of differential calculus, probability, and statistics.

Note: This course focuses on classification models. Two further courses cover clustering models and regression models. This course does not cover deep learning or neural networks, although the material in this course can provide a foundation for further study in those areas.

Audience

This course is intended for developers and analysts who are new to machine learning, want to understand how machine learning models work, and need to understand how to build high-quality models. Attendees must have some familiarity with Python, an understanding of matrix and vector arithmetic, and some basic familiarity with probability, statistics, and differential calculus.

At Course Completion

After completing this course, students will be able to:

  • Explain the uses of different types of machine learning models (classification, clustering, and regression).
  • Understand common machine learning classification algorithms.
  • Create and test a classification machine learning model.
  • Create non-binary classification models.
  • Summarize the statistical concepts utilized by many machine learning models.
  • Select features for a classification model.
  • Measure the performance of a classification model.
  • Build classification models that can handle imbalanced datasets.

Prerequisites

Before attending this course, students should have:

  • Familiarity with the Python programming language.
  • An understanding of matrix and vector arithmetic.
  • An understanding of basic probability and statistics.
  • Ideally, some familiarity with the basics of differential calculus.

Course Outline

The course comprises the following modules:

  1. Introduction to Machine Learning Models

    This module introduces machine learning and the classification, clustering, and regression model types. Students will learn the purpose of these models and the types of problems to which they can be applied.

    Click here for the demonstration notebooks and data files.

  2. Understanding Classification Algorithms

    This module describes different algorithms that are commonly used to create a classification machine learning model. It provides a tour through the algorithms, summarizing their strengths and weaknesses, and when each is most appropriate.

  3. Creating a Classification Model

    This module provides an overview of the essential steps in building a machine learning model: data preparation, model construction and tuning, and testing and validation.

    Click here for the demonstration notebooks and data files.

    Click here for the lab instructions, notebooks, and data files.

  4. Understanding Binary and Non-Binary Classification

    This module describes the differences between binary and multi-valued classification and shows how to create a multi-class classification model.

    Click here for the demonstration notebooks and data files.

    Click here for the lab instructions, notebooks, and data files.

  5. Reviewing Statistics Concepts

    This module summarizes key statistics terminology, and some common techniques used to analyze the distribution, scale, and relationships between items in a dataset. This information is essential to understanding the validity of a machine learning model.

    Click here for the demonstration notebooks and data files.

  6. Exploring Data and Selecting Features and Algorithms

    This module explains how to refine a machine learning model by selecting the most relevant features from the dataset, examining the distribution of values, investigating correlation between features, normalizing data, and removing bias.

    Click here for the demonstration notebooks and data files.

  7. Measuring the Performance of a Classification Model

    This module describes how to assess the accuracy and performance of a classification model, and how to balance precision and recall where appropriate.

    Click here for the lab instructions, notebooks, and data files.

  8. Understanding Imbalanced Classification

    This module discusses the problems that can arise when using an imbalanced dataset to create a classification model, how to recognize potential problems, and how to address them.

    Click here for the lab instructions, notebooks, and data files.
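
As a small taste of the ideas in modules 7 and 8, the sketch below shows why accuracy alone can mislead on an imbalanced dataset. It is not taken from the course notebooks: the function names, the toy labels, and both sets of predictions are hypothetical and exist only for illustration. It uses plain Python so that it runs in Google Colab without extra packages.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced dataset: 95 negative labels followed by 5 positive labels.
y_true = [0] * 95 + [1] * 5

# Baseline that always predicts the majority (negative) class.
always_negative = [0] * 100

# Hypothetical model: 6 false alarms, and 4 of the 5 positives found.
y_model = [1] * 6 + [0] * 89 + [1] * 4 + [0] * 1

baseline = classification_metrics(y_true, always_negative)
model = classification_metrics(y_true, y_model)

print("baseline:", baseline)
print("model:   ", model)
```

In this toy data the always-negative baseline scores a higher accuracy than the model, yet its recall is zero because it never finds a positive case. Comparing precision and recall side by side makes that trade-off visible, which is the motivation for looking beyond accuracy when classes are imbalanced.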
