# Machine Learning with Python

***By Alex Borio***

## Introduction to Machine Learning

### What is Machine Learning? 

"Machine learning is a subset of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn."

***-IBM***

<u>Thanks to machine learning, we can build systems that learn from historical data, identify patterns, and make logical decisions with little to no human intervention.</u>

### Why is it so important nowadays?

- <u>Machine learning is growing in importance because of the ever-larger volume and variety of available data, cheaper and more accessible computational power, and widespread high-speed Internet.</u>

- Together, these factors make it possible to rapidly and automatically **develop models that can quickly and accurately analyze extraordinarily large and complex data sets.**

### Machine learning is behind:

- Chatbots and predictive text
- Language translation apps
- The shows Netflix suggests to you and how your social media feeds are presented
- Fraud detection
- Self-driving cars
- Healthcare
- Process automation

## Different types of Machine Learning

### 1. Supervised Machine Learning

The main difference between supervised and unsupervised learning: Labeled data!

<u>Supervised learning uses labeled input and output data</u>, while an unsupervised learning algorithm does not.

<u>To put it simply, labeled data contains a collection of variables (features) and a specific output that we are trying to predict.</u>

It is called supervised machine learning because at least part of <u>this approach requires human oversight</u>: someone has to provide the correct labels that the algorithm learns from.

Supervised Learning deals with **two types of problem:** <u>Regression problems and Classification problems.</u>


**Examples of supervised learning algorithms include:**

- Linear regression
- Logistic regression
- K-Nearest Neighbors (KNN) 
- Support vector machines (SVM)
- Decision trees
- Random Forest
- Naive Bayes

#### Regression VS Classification
    
**Regression and Classification algorithms are Supervised Learning algorithms**. Both are used for prediction in machine learning and work with labeled datasets.

<u>The main difference between Regression and Classification algorithms</u> is that **<u>Regression algorithms are used to predict continuous values</u>** such as price, salary, or age, while **<u>Classification algorithms are used to predict (classify) discrete values</u>** such as Male or Female, True or False, or Spam or Not Spam.

<img src="https://miro.medium.com/max/775/1*Qn4eJPhkvrEQ62CtmydLZw.png" alt="300" width="450" align="left"/>
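To make the distinction concrete, here is a minimal sketch using K-Nearest Neighbors, one of the algorithms listed above, written with the standard library only. The toy data and the function names (`knn_classify`, `knn_regress`) are assumptions made up for illustration. The same neighbours are used in both cases; only the way the answer is produced changes.

```python
from collections import Counter

def nearest_neighbors(train, query, k):
    """Return the k training rows whose feature value is closest to `query`."""
    return sorted(train, key=lambda row: abs(row[0] - query))[:k]

def knn_classify(train, query, k=3):
    """Classification: predict a discrete label by majority vote."""
    labels = [label for _, label in nearest_neighbors(train, query, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Regression: predict a continuous value by averaging the neighbours."""
    values = [value for _, value in nearest_neighbors(train, query, k)]
    return sum(values) / len(values)

# Hypothetical labeled data: (years of experience, target)
seniority = [(1, "junior"), (2, "junior"), (3, "junior"),
             (7, "senior"), (8, "senior"), (9, "senior")]
salary = [(1, 30.0), (2, 34.0), (3, 38.0),
          (7, 60.0), (8, 65.0), (9, 70.0)]

predicted_class = knn_classify(seniority, 7.5)   # a discrete label
predicted_salary = knn_regress(salary, 7.5)      # a continuous number
print(predicted_class, predicted_salary)
```

In both datasets each row pairs a feature with a known target, which is exactly what makes the data "labeled".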


### 2. Unsupervised Machine Learning

Unsupervised machine learning is the training of models on raw, unlabeled training data.

It is often <u>used to identify patterns and trends in raw datasets, or to cluster similar data into a specific number of groups.</u>

It is also often used in the early exploratory phase to better understand a dataset.

**Examples of unsupervised learning include:**

- K-Means clustering
- Principal Component Analysis (PCA)
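As an illustration of clustering without labels, the following is a minimal sketch of the K-Means idea on one-dimensional data, using only the standard library. The function name `kmeans_1d`, the toy points, and the fixed starting centroids are assumptions chosen so the result is reproducible; real implementations usually pick the starting centroids at random and work in many dimensions.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids (1-D data)."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabeled data with two visible groups
points = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
centroids, clusters = kmeans_1d(points, centroids=[1.0, 10.0])
print(centroids, clusters)
```

Note that no target values are passed in: the algorithm only sees the points and the number of groups to look for.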

### 3. Reinforcement Learning

<u>Reinforcement learning is a technique that provides training feedback using a reward mechanism.</u>

Unlike supervised learning, reinforcement learning does not require labeled data. It does not even use an unlabeled dataset, as unsupervised learning would.

Learning happens as a machine, or **Agent, interacts with an environment and tries a variety of actions to reach an outcome.** <u>The Agent is rewarded or punished when it reaches a desirable or undesirable State.</u>

<u>There is always a start state and an end state. However, there may be several different paths that lead to the end state.</u>

***In a Reinforcement Learning problem, an agent tries to manipulate the environment.***

The agent travels from one state to another. It receives a reward (appreciation) on success and receives no reward, or a penalty, on failure. <u>**In this way, the agent learns from the environment.**</u>
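The agent, state, and reward loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm (the text above does not name a specific one, so this choice is an assumption). The environment is a made-up corridor of five states, with the start at state 0 and the end state at 4; the reward is 1 on reaching the end state and 0 otherwise. The constants `ALPHA`, `GAMMA`, and `EPSILON` are illustrative values, not recommendations.

```python
import random

N_STATES = 5          # states 0..4 in a corridor; 0 is the start, 4 is the end state
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate

def step(state, action):
    """Environment: move, stay inside the corridor, reward 1 only at the end state."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Explore with probability EPSILON (or on a tie), otherwise exploit.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward
            # reward + discounted best value of the next state.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for every non-terminal state
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)
```

The agent is never told that "right" is the correct move; it discovers this only through the rewards it collects while interacting with the environment.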

**Some applications of reinforcement learning are:**

- Robotics
- Autonomous driving
- Gaming

<img src="https://github.com/lorisliusso/Machine-Learning-with-Python/blob/master/Machine_Learning_map.png?raw=true" alt="300" width="800" align="left"/>

## Linear regression

Linear regression is a Supervised Machine Learning model that tries to find the best possible linear relationship between the input features (X) and the target variable (y). Formula:

**y = aX + b + 𝜀**

(equivalently, yᵢ = β₀ + β₁xᵢ + ϵᵢ for each observation i)

**It optimizes slope a and intercept b by reducing the residuals between the actual y and the predicted y.**

To get the best weights/parameters, we usually minimize the sum of squared residuals (SSR) for all observations.

With a single input feature, the linear relation between the input and the output in <u>2D is simply a line.</u>

***Formula - Legend:***

- y is the dependent variable, the value we want to predict for any given value of the independent variable (X).
- b is the intercept, the predicted value of y when X is 0.
- a is the regression coefficient (slope): how much we expect y to change when X increases by one unit.
- X is the independent variable (the variable we expect to influence y).
- 𝜀 is the error term: the vertical distance between each point and the line, i.e. the variation in the dependent variable not explained by the independent variable.
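The fit described above can be sketched in a few lines of plain Python. For a single feature, the slope and intercept that minimize the sum of squared residuals have a well-known closed form: a = cov(X, y) / var(X) and b = mean(y) − a · mean(X). The function name `fit_line` and the toy data are assumptions for illustration; in practice a library such as scikit-learn (`LinearRegression`) would typically be used.

```python
def fit_line(xs, ys):
    """Return slope a and intercept b that minimise the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = (co-variation of x and y) / (variation of x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    b = mean_y - a * mean_x
    return a, b

# Hypothetical observations that roughly follow y = 2x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
residuals = [y - (a * x + b) for x, y in zip(xs, ys)]   # the error term for each point
ssr = sum(r ** 2 for r in residuals)                    # sum of squared residuals
print(a, b, ssr)
```

Any other choice of a and b gives a larger SSR on the same data, which is what "best possible linear relationship" means here.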


<img src="https://github.com/lorisliusso/Machine-Learning-with-Python/blob/master/formula%20linear_reg%202d.png?raw=true" alt="400" width="600" align="left"/>


## Multiple Linear Regression

<u>Multiple linear regression (sometimes loosely called multivariate linear regression) is linear regression with two or more independent variables.</u>

y = β₀ + β₁X₁ + β₂X₂ + … + βₙXₙ + ϵ
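With several features, the least-squares coefficients can be found by solving the normal equations (XᵀX)β = Xᵀy. The sketch below does this with the standard library only, using a small Gaussian elimination routine; the helper names `solve` and `fit_multiple` and the noise-free toy data are assumptions for illustration, and numerical libraries handle this far more robustly.

```python
def solve(A, b):
    """Solve the square system A·x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_multiple(X, y):
    """Least-squares coefficients [β0, β1, ..., βn] from the normal equations."""
    rows = [[1.0] + list(features) for features in X]   # leading 1 carries the intercept β0
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * target for r, target in zip(rows, y)) for i in range(k)]
    return solve(XtX, Xty)

# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 with no noise
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [1, 3, 4, 6, 8, 9]

beta = fit_multiple(X, y)   # approximately [1, 2, 3]
print(beta)
```

Because the toy targets contain no noise, the recovered coefficients match the generating formula up to floating-point rounding.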


<img src="https://miro.medium.com/max/1400/0*fr2NtfEx-ZrVMwz4" alt="400" width="800" align="left"/>


## The coefficient of determination 𝑅²

The coefficient of determination, denoted as 𝑅², tells you what proportion of the variation in 𝑦 can be explained by the independent variable(s) X under the particular regression model. A larger 𝑅² indicates a better fit: the model explains more of the variation of the output across different inputs.

**The value 𝑅² = 1 corresponds to SSR = 0. That is the perfect fit, since the predicted and actual responses match exactly.**
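A minimal sketch of the calculation, assuming the usual definition 𝑅² = 1 − SSR / SST, where SST (a name introduced here, not used in the text above) is the total sum of squared deviations of 𝑦 from its mean. The function name `r_squared` and the numbers are illustrative only.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSR / SST."""
    mean_y = sum(actual) / len(actual)
    ssr = sum((y - p) ** 2 for y, p in zip(actual, predicted))   # unexplained variation
    sst = sum((y - mean_y) ** 2 for y in actual)                 # total variation
    return 1 - ssr / sst

actual = [3.0, 5.0, 7.0, 9.0]

perfect = r_squared(actual, [3.0, 5.0, 7.0, 9.0])    # SSR = 0, so R² = 1
baseline = r_squared(actual, [6.0, 6.0, 6.0, 6.0])   # always predicts the mean of y
close = r_squared(actual, [3.5, 4.5, 7.5, 8.5])      # small errors on every point
print(perfect, baseline, close)
```

The `baseline` case shows the other end of the scale: a model that always predicts the mean of 𝑦 explains none of the variation, so its 𝑅² is 0.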

<img src="https://github.com/lorisliusso/Machine-Learning-with-Python/blob/master/linear%20regression.png?raw=true" alt="400" width="800" align="left"/>
