- Presentation
- Course overview
- AI/ML/DL/DS/RL...
- History of ML
- **ML paradigm: X+rule = y vs X+y = rule**
- Supervised/unsupervised
- Current status and limits of AI
- What does an ML project look like?

# Machine Learning Introduction

Vincent Vandenbussche

- PhD in Physics defended in 2014
- Experience in large companies: GE Healthcare, Renault, L'Oréal
- Experience in research: CNRS, CEA
- Experience in startups: Easyrecrue, Suricog, Sweesp
- Founded startup: Vivadata, a data science bootcamp

## Course overview

Introduction to Machine Learning:
- Big picture introduction to Machine Learning
- Reminders of Linear Regression and Logistic Regression
- Regularization and bias/variance tradeoff
- Classification with SVMs, k-NNs, Naive Bayes
- Data Preparation
- Model Evaluation
- Model Optimization
- Imbalanced Datasets
- Dimensionality Reduction
- Decision Trees and Random Forest
- Boosting and Gradient Boosting
- Anomaly Detection

# What is ML?

Let's have a quiz: ML or not ML?
- Google Translate
- Netflix movie recommendation
- Snapchat/TikTok filters
- Uber driver selection
- Face recognition
- Siri / Google Home / Amazon Alexa...

What are the differences between:
- Artificial intelligence
- Machine Learning
- Deep Learning
- Data Science

## Introduction to Machine Learning 🤖

___

Today, we will **look behind** the hype surrounding **Machine Learning**. In particular, we will define most of the terms used in the field (some of which are often misused), and grasp the intuition behind ML algorithms.

Here we go!

![](https://drive.google.com/uc?export=view&id=1WXxaPikcf5u76j5cg1dLDvZK3fQPI9qr)

___

# I. What is Machine Learning?

## I.1. A.I vs Machine Learning vs Deep Learning

### Welcome to A.I...

![](https://drive.google.com/uc?export=view&id=1wkrks6AXOG2bNaX1vUq82C_cIoE7GcHs)

A lot of hype, for a lot of buzzwords. But what do they mean exactly?

In particular, what is the difference between **Artificial Intelligence**, **Machine Learning** and **Deep Learning**?

### A.I vs Machine Learning vs Deep Learning

![](https://drive.google.com/uc?export=view&id=1jij9_VDnCAKts-D0qu7vP3hwT6k5Lk99)

## I.2. Why does it matter now?

It all started with these guys...

### Alan Turing

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=16I3OtO2dgcy7zHW0nvcRYQNyLWewAWXm">
</p>

Turing is widely considered to be the **father of theoretical computer science and artificial intelligence**.

In 1950 Alan Turing published a landmark paper in which he speculated about the possibility of creating machines that think. If a machine could carry on a conversation that was indistinguishable from a conversation with a human being, then it was reasonable to say that the machine was "thinking".

The [Turing Test](https://en.wikipedia.org/wiki/Turing_test) was the first serious proposal in the **philosophy of artificial intelligence**. 

> “I do not see why it (the machine) should not enter any one of the fields normally covered by the human intellect, and eventually compete on equal terms. I do not think you can even draw the line about sonnets, though the comparison is perhaps a little bit unfair because a sonnet written by a machine will be better appreciated by another machine.”
> 
> Alan Turing, _The London Times_, 1949


### Walter Pitts & Warren McCulloch

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1bD4ITkBz0y69JXppGhCZ42lgxDp-oxo5" width="400">
</p>

Inspired by work in neurology, **Walter Pitts** and **Warren McCulloch** analyzed networks of idealized artificial neurons and showed how they might perform simple logical functions.

They were the first to describe what later researchers would call a **neural network**.


### Marvin Minsky

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1_4Yg4RPJqqfNIIy7hfSeWH1De1TgYUZ8" width="200">
</p>

One of the students inspired by Pitts and McCulloch was a young **Marvin Minsky**, then a 24-year-old graduate student.

In 1951 (with Dean Edmonds) he built the first **neural net machine**, the SNARC. With the SNARC, Minsky built a first ANN (Artificial Neural Network) that could simulate a rat finding its way through a maze.

At that time, Minsky’s work and theory went largely unnoticed by the public and was rejected by most AI researchers until computing power had reached a level where results could be clearly demonstrated.

> *Within our lifetime, machines may surpass us in general intelligence.*
>
> Marvin Minsky, 1951

### A brief timeline of the A.I. field

A.I. has a long history behind it, with a series of successes, failures and disappointments (see [A.I. winter](https://en.wikipedia.org/wiki/AI_winter) for example).

> 📚 **Resources**: The history behind A.I. is long and fascinating. Check [History of AI on Wikipedia](https://en.wikipedia.org/wiki/History_of_artificial_intelligence) to dig deeper.

![](https://drive.google.com/uc?export=view&id=1kUh0wO82soPDiXzGo8TEevfgzSkGZlto)

We have seen that most algorithms are not new, so why have we been observing such a boom in recent years?

### Reasons for the current A.I. revolution

Two main reasons for this current boom:
- Increase of computing power ([GPU](https://en.wikipedia.org/wiki/Graphics_processing_unit) - thanks to gamers who want good graphics 🎮)
- Exponential **increase of the volume of data collected**

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1ZYyfHa4IvLt6O_Kvl3Y1rMZQFZKzDFqd" width="50%">
</p>

### Reasons for the continuation of this boom

- Highly **active** and **exposed** research field
- Better access to the **knowledge** ([Thanks Vivadata 🙏🏻](https://vivadata.org/)) and **tools**

![](https://drive.google.com/uc?export=view&id=18oVX21ZazQKzWuzHQwIu9KmiVwnBuaca)

## I.3. Examples of application

### 🚙 Autonomous driving

![](https://drive.google.com/uc?export=view&id=1lDltwLk02Am5bsINiqk4u33oABQfawcA)

### 🚙 Autonomous driving

![](https://drive.google.com/uc?export=view&id=1gdJKYzGMtA3KSOD2JS8s53Wp3pzfDwVr)

### 🌆 Smart City

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1D7JgBRhPLk-83ozMeDVh30mxWlsV3RnT">
</p>

### 🔬 Health

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1hn0rFBFMpE1PqpjTrl61xMnbbGmQ8hss" width="70%">
</p>

### 🏪 Cashierless stores

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1RQ1rBiICTtyRTvsAHMcj_0Zu-xb4CKrL">
</p>

### 🎙 Home Assistants

![](https://drive.google.com/uc?export=view&id=1RpkMLUsAyIgzJvdGnUi7d5UUEcajEY4G)

Etcetera, etcetera. I could carry on, but as you can read or hear in the press, all fields are impacted and could, one day or another, be disrupted by Artificial Intelligence.

But how does it work exactly? To answer that question, we first need to understand how **Machine Learning (ML)** algorithms work.

___

# II. The ML problem

## II.0. Let's play a game 🦁

Machine Learning is all about **learning from examples**.

Let's illustrate that by looking at which animal is `acerous` or not...

![](https://drive.google.com/uc?export=view&id=16HRksBF0rt8W5dHnzvsfQjzgcXBB2FGU)

## II.1. The intuition behind Machine Learning

Historically, computers were given algorithms that follow explicit sets of rules (for example in a chess game, "if the pieces are in this position on the board, play this move", or "launch this rocket with this given speed and angle").

But those programs were really bad at some tasks (for example in computer vision, recognizing a cat in an image proved tricky for traditional algorithms, as cats can have many different appearances, colors, positions, etc.).

Then came ML, with a completely different mindset: in ML, you feed **data** to a piece of software (called a **model**), and by going through the process of **training** this model, you can then use it to make **predictions** on a data set, especially on previously unseen data.

![](https://drive.google.com/uc?export=view&id=1EMSRYv8KcKuUSIZEtUl8b0pN41amLIBx)
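
To make the contrast concrete, here is a minimal sketch. The toy data and the function names `rule_based` and `fit_line` are illustrative assumptions, not part of the course material. The first function hard-codes a rule (X + rule = y); the second is given only examples and recovers the rule with an ordinary least-squares fit (X + y = rule).

```python
# Classical programming: a human writes the rule, the computer applies it.
def rule_based(x):
    return 2 * x + 1  # hand-written rule (illustrative)


# Machine Learning: the computer receives X and y and has to find the rule.
def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b on 1-D data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


# Toy examples generated from the rule above (made-up data).
X = [0, 1, 2, 3, 4]
y = [rule_based(x) for x in X]

a, b = fit_line(X, y)      # "training": learn the rule from examples
prediction = a * 10 + b    # prediction on an input never seen in training
print(a, b, prediction)
```

On this noiseless toy data the fit recovers the slope and intercept of the hidden rule; with real, noisy data it would only approximate them.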

## II.2. Different types of ML problems 

### There are *different types* of Machine Learning problems

Machine Learning problems can be grouped into several categories.

In general, a machine learning problem considers a set of n samples of data and then tries to predict properties of unknown data. If each sample is more than a single number and, for instance, a multi-dimensional entry (aka multivariate data), it is said to have several **attributes** or **features**.

Learning problems fall into a few categories:

- **Supervised learning**: the data comes with additional attributes (called **labels**) that we want to predict. A supervised learning problem can be either:

  - **Classification**: samples belong to two or more classes and we want to learn from already labelled data how to predict the class of unlabelled data.

    > An example of a classification problem would be handwritten digit recognition, in which the aim is to assign each input vector to one of a finite number of discrete categories. Another way to think of classification is as a discrete (as opposed to continuous) form of supervised learning where one has a limited number of categories and for each of the n samples provided, one is trying to label them with the correct category or class.

  - **Regression**: if the desired output consists of one or more continuous variables, then the task is called regression.

    > An example of a regression problem would be the prediction of the length of a salmon as a function of its age and weight.

- **Unsupervised Learning**: here, the training data consists of a set of input vectors x without any corresponding target values. The goal in such problems may be to discover groups of similar examples within the data, where it is called **clustering**, or to project the data from a high-dimensional space down to two or three dimensions for the purpose of visualization, called **dimensionality reduction**.


![](https://drive.google.com/uc?export=view&id=1YW1fZEV-a3n4EavCf4EwjQiEaqoG5IZX)
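
As a rough illustration of these categories, the sketch below shows what the data looks like in each case. All values are made up, and `two_means_1d` is a hypothetical helper written for this example: a tiny 1-D version of k-means with two clusters, used here as one possible clustering method.

```python
# Supervised, classification: features plus a discrete label per sample.
X_clf = [[5.1, 3.5], [6.2, 2.9], [4.9, 3.0]]   # 3 samples x 2 features (made up)
y_clf = ["cat", "dog", "cat"]

# Supervised, regression: features plus a continuous target per sample.
X_reg = [[1.0, 0.8], [2.0, 2.1], [3.0, 3.9]]   # e.g. age, weight (made up)
y_reg = [32.0, 48.5, 61.0]                      # e.g. length (made up)

# Unsupervised: features only, no labels at all.
X_unsup = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]


def two_means_1d(values, n_iter=10):
    """Tiny 1-D k-means with k=2: returns a cluster index (0 or 1) per value."""
    centers = [min(values), max(values)]  # simple initialisation
    for _ in range(n_iter):
        # Assign each value to its closest center.
        assign = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Move each center to the mean of its members.
        for c in (0, 1):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assign


clusters = two_means_1d(X_unsup)
print(clusters)  # groups found without ever seeing a label
```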

> 📚 **Resources**: What about **Reinforcement Learning**?
>
> An additional branch of machine learning is **reinforcement learning (RL)**, which differs from the other types of machine learning: in RL you don't collect examples with labels. Imagine you want to teach a machine to play a very basic video game and never lose. You set up the model (often called an **agent** in RL) with the game, and you tell it not to get a "game over" screen. During training, the agent receives a **reward** when it performs this task; the rule that assigns rewards is called the **reward function**. With reinforcement learning, agents have learned to outperform humans at some games.
>
> The lack of a data requirement makes RL a tempting approach. However, designing a good reward function is difficult, and RL models are less stable and predictable than supervised approaches. Additionally, you need to provide a way for the agent to interact with the game to produce data, which means either building a physical agent that can interact with the real world or a virtual agent and a virtual world, either of which is a big challenge.
>
> Reinforcement learning is an active field of ML research, but in this course we'll focus on the problems detailed above because they are better understood, more stable, and result in simpler systems.
>
> For comprehensive information on RL, check out [Reinforcement Learning: An Introduction by Sutton and Barto](http://incompleteideas.net/book/RLbook2018.pdf).

## II.3. Identifying the type of problem

> 🏆 **Quiz**: Which **type of ML problem** do those examples correspond to? Supervised (Classification / Regression) or Unsupervised?
>
> - Photo tagging on Facebook 📸
> - Loan approvals 🏦
> - Targeted online ads 👗
> - Paypal fraud detection 💰
> - Speech recognition (ex: Siri) 🎙
> - Market segmentation 👪
> - Preventive maintenance 🔧

___

# III. The ML framework

Solving a Machine Learning problem requires a completely different mindset.

Let's see what this mindset is and what a **good framework** for solving an ML problem could look like.

Let's follow a **series of 5 steps**.

## III.1. Define objective/hypothesis/data required

- Which question do we want to answer? What is the objective of our application?
- Which hypothesis can we formulate?
- What data is required? Labeled or not?

**Example**: We want to build a spam filter 📩

## III.2. Collect data

**Data is the sinews of war**. Think about clever ways to collect it. Some ideas:

- Open data: [Kaggle datasets](https://www.kaggle.com/datasets), [Awesome Public Datasets](https://github.com/caesar0301/awesome-public-datasets), [datahub.io](http://datahub.io/), etc.
- APIs (cf. Cambridge Analytica 👀)
- Scraping (Beautiful Soup, Scrapy, etc.)
- Internal databases (SQL & SQLAlchemy, NoSQL, etc.)
- Other creative ideas:

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1ZDmkv-SFcJH6M4FCN337qNqBf6MvuuWo" width="60%">
</p>

## III.3. Clean, enrich, transform, explore & visualize data

- What to do with missing/incorrect/duplicate values? 
- Which columns (features) should you use to train your model? Can you create new interesting features?
- Explore your data and collect insights
- Visualize your data to understand it better
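
The questions above can be sketched on a toy table. This is a minimal, standard-library-only illustration with made-up records; in practice this step is usually done with a dataframe library such as pandas, and the choices made here (dropping exact duplicates, imputing with the median, the `is_paris` feature) are assumptions for the example, not recommendations for every dataset.

```python
from statistics import median

# Made-up raw records: one exact duplicate and one missing value (None).
raw = [
    {"id": 1, "age": 34, "city": "Paris"},
    {"id": 2, "age": None, "city": "Lyon"},
    {"id": 1, "age": 34, "city": "Paris"},   # exact duplicate of the first row
    {"id": 3, "age": 51, "city": "Paris"},
]

# 1. Drop exact duplicates while keeping the original order.
seen = set()
rows = []
for record in raw:
    key = tuple(sorted(record.items()))
    if key not in seen:
        seen.add(key)
        rows.append(dict(record))  # copy, so `raw` is left untouched

# 2. Fill missing ages with the median of the known ages.
known_ages = [r["age"] for r in rows if r["age"] is not None]
fill_value = median(known_ages)
for r in rows:
    if r["age"] is None:
        r["age"] = fill_value

# 3. Create a new feature from an existing column.
for r in rows:
    r["is_paris"] = int(r["city"] == "Paris")

for r in rows:
    print(r)
```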

## III.4. Choose and train one (or several) model(s)

We will learn about classical ML models and how to train them really soon 🤖

## III.5. Evaluate performance - conclude & iterate

Evaluate your model: is it performing well at the task it has been trained to do?

If not, investigate why - go back to the first step and iterate! 🔄
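
One common way to answer this question is to hold out part of the data and measure accuracy on it. The sketch below assumes that setup; `holdout_split` and `accuracy` are hypothetical helpers written for this example, the messages are made up, and the "model" is only a placeholder that always predicts the most frequent training label.

```python
import random


def holdout_split(samples, labels, test_ratio=0.25, seed=0):
    """Shuffle the indices and hold out a fraction of the data for testing."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_ratio))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([samples[i] for i in train_idx], [samples[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labelled messages (1 = spam, 0 = not spam).
X = ["win money now", "meeting at 10", "cheap pills", "lunch today?",
     "free prize inside", "project update", "claim your reward", "see you soon"]
y = [1, 0, 1, 0, 1, 0, 1, 0]

X_train, X_test, y_train, y_test = holdout_split(X, y)

# Placeholder "model": always predicts the most frequent training label.
majority = max(set(y_train), key=y_train.count)
y_pred = [majority for _ in X_test]

print(f"Baseline accuracy on held-out data: {accuracy(y_test, y_pred):.2f}")
```

A trained model should beat such a baseline on data it has never seen; if it does not, that is a signal to go back and iterate.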

___

# IV. Example of an ML classification algorithm: k-Nearest Neighbors

#### Nearest Neighbors

The **Nearest Neighbors** model is very rarely used in practice, but it will allow us to get an idea about the basic approach to a **classification** problem.

The idea is very simple: for one test data point, we find the closest point in our training dataset and we use its label as the predicted label for our test data point.

For example, suppose we want to classify an image into one of three classes (ex: cat, dog or horse). The nearest neighbor classifier will take a test image, compare it to every single one of the training images, and predict the label of the closest training image.

If we visualize our training data points on a 2-dimensional chart, we can plot **decision boundaries** which indicate the **predicted label** of any new test data point we choose.
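
The idea can be sketched in a few lines. This is a minimal illustration on made-up 2-D points, assuming Euclidean distance as the notion of "closest" (other distances are possible); `nearest_neighbor_predict` is a hypothetical name chosen for this example.

```python
import math


def nearest_neighbor_predict(X_train, y_train, x_test):
    """Return the label of the training point closest to x_test."""
    best_label, best_dist = None, math.inf
    for x, label in zip(X_train, y_train):
        d = math.dist(x, x_test)  # Euclidean distance (assumption)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Made-up training points in 2 dimensions, with their labels.
X_train = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5), (9.0, 1.0)]
y_train = ["cat", "cat", "dog", "dog", "horse"]

# The closest training point to (5.5, 5.0) is (5.0, 5.0), labelled "dog".
print(nearest_neighbor_predict(X_train, y_train, (5.5, 5.0)))
```

Note that there is no real training step: the "model" is simply the stored training set, and all the work happens at prediction time.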

#### k-Nearest Neighbors

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1N6_QZd5aKFQ0m0OhS2x9CW0vC0JN_oXx">
</p>


k-Nearest Neighbors is a classical Machine Learning algorithm that *generalizes* the Nearest Neighbors algorithm.

This time, for one test data point, we find the **k closest points**, and have them **vote on the label** of the test data point.

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1h9gHU9tb6BTegXM8-Da0zXnVjBxrvYPc">
</p>


> 🔦 **Hint**: In particular, when k = 1, we recover the Nearest Neighbor classifier. Intuitively, higher values of k have a smoothing effect that makes the classifier more resistant to outliers.
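
The smoothing effect of k can be seen on a small made-up example. The sketch below is one possible shape for the voting step, assuming Euclidean distance and a simple majority vote; `knn_predict` is a hypothetical name, and your own implementation may well be structured differently.

```python
import math
from collections import Counter


def knn_predict(X_train, y_train, x_test, k=3):
    """Majority vote among the k training points closest to x_test."""
    by_distance = sorted(zip(X_train, y_train),
                         key=lambda pair: math.dist(pair[0], x_test))
    k_labels = [label for _, label in by_distance[:k]]
    return Counter(k_labels).most_common(1)[0][0]


# Made-up data: a group of "cat" points containing one "dog" outlier,
# and a separate group of "dog" points far away.
X_train = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.1), (5.0, 5.0), (5.2, 4.9)]
y_train = ["cat", "cat", "cat", "dog", "dog", "dog"]

x_test = (1.1, 1.15)  # sits right next to the outlier

print(knn_predict(X_train, y_train, x_test, k=1))  # follows the single outlier
print(knn_predict(X_train, y_train, x_test, k=3))  # outvoted by its neighbors
```

With k = 1 the prediction copies the outlier's label, while with k = 3 the two surrounding "cat" points win the vote.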


> 📚 *Resources*: [Live visualization of k-NN algorithm](http://vision.stanford.edu/teaching/cs231n-demos/knn/)

Now it is time for the challenge! 🚀

And guess what? Today, your goal is to **rebuild the k-NN algorithm from scratch**.