# Linear Classifier
### Goal of Lesson
- Learn about Supervised Learning
- Explore how to use it for classification
- Understand the Perceptron classifier
- Use the Perceptron as a linear classifier

## Supervised Learning
- Given a dataset of input-output pairs, learn a function to map inputs to outputs
- There are different supervised learning tasks; we start by focusing on **Classification**

### Classification

- The supervised learning task of learning a function that maps an input point to a discrete category

### Example
- Predict if it is going to rain or not
- We have historical data to train our model

| Date       | Humidity  | Pressure  | Rain      |
| :--------- |:---------:| ---------:| :---------|
| Jan. 1     | 93%       | 999.7     | Rain      |
| Jan. 2     | 49%       | 1015.5    | No Rain   |
| Jan. 3     | 79%       | 1031.1    | No Rain   |
| Jan. 4     | 65%       | 984.9     | Rain      |
| Jan. 5     | 90%       | 975.2     | Rain      |

- This is supervised learning because every row comes with a label (the Rain column)

### The task of Supervised Learning
- Simply put, the task in the example above is to find a function $f$ as follows.

**Ideally**: $f(\text{humidity}, \text{pressure})$ returns the correct label for every input

Examples:
- $f(93, 999.7) =$ Rain
- $f(49, 1015.5) =$ No Rain
- $f(79, 1031.1) =$ No Rain

**Goal**: Approximate the function $f$. The approximating function is often denoted $h$.

### Linear Classifier
- A linear classifier makes a classification decision based on the value of a linear combination of the input characteristics (features). ([wiki](https://en.wikipedia.org/wiki/Linear_classifier))

![linear_classifier.png](attachment:linear_classifier.png)

### Linear Classifier (math)
- $x_1$: Humidity
- $x_2$: Pressure
- $h(x_1, x_2) = w_0 + w_1 x_1 + w_2 x_2$

### Written differently (vector form)
- Weight vector $w = (w_0, w_1, w_2)$
- Input vector $x = (1, x_1, x_2)$
- Dot product: $w\cdot x = w_0 + w_1 x_1 + w_2 x_2$
- $h_w(x) = w\cdot x$
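
The dot-product form of $h_w(x)$ can be sketched in NumPy. The weights below are hypothetical values picked only for illustration (they are not learned from the data), and the helper name `h` simply mirrors the notation above.

```python
import numpy as np

# Hypothetical weights (w_0, w_1, w_2), chosen for illustration only.
w = np.array([10.0, 0.5, -0.05])

def h(humidity, pressure, weights=w):
    """Return the linear score w . x for the input vector x = (1, x_1, x_2)."""
    x = np.array([1.0, humidity, pressure])  # the leading 1 pairs with the bias w_0
    return float(np.dot(weights, x))

# Score for the first row of the example table (humidity 93, pressure 999.7).
score = h(93, 999.7)
print(score)
```

Prepending the constant 1 to the input lets the bias $w_0$ be handled by the same dot product as the other weights.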

### Perceptron Classifier
- The Perceptron is a linear algorithm that can be applied to binary classification

### Perceptron Learning Rule
- Given a data point $(x, y)$, update each weight according to
    - $w_i = w_i + \alpha(y - h_w(x))\times x_i$
    - $w_i = w_i + \alpha(\text{actual value} - \text{estimate})\times x_i$
        - $\alpha$: learning rate
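
A minimal sketch of the learning rule above, under one assumption the text does not spell out: the estimate $h_w(x)$ is thresholded to 1 when $w\cdot x \ge 0$ and to 0 otherwise. The function names (`predict`, `perceptron_update`, `train`) and the tiny AND-style dataset are made up for illustration; this is not how `sklearn` implements its `Perceptron`.

```python
import numpy as np

def predict(w, x):
    # Assumption: hard threshold at 0 turns the linear score into a 0/1 estimate.
    return 1 if np.dot(w, x) >= 0 else 0

def perceptron_update(w, x, y, alpha):
    # w_i = w_i + alpha * (actual value - estimate) * x_i, applied to all i at once.
    return w + alpha * (y - predict(w, x)) * x

def train(X, y, alpha=1.0, epochs=20):
    # Prepend a column of ones so that w_0 acts as the bias term.
    X_aug = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(X_aug.shape[1])
    for _ in range(epochs):
        for x_i, y_i in zip(X_aug, y):
            w = perceptron_update(w, x_i, y_i, alpha)
    return w

# Toy, linearly separable data (logical AND), invented for this sketch.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_toy = np.array([0, 0, 0, 1])

weights = train(X_toy, y_toy)
predictions = [predict(weights, np.array([1.0, *row])) for row in X_toy]
print(weights, predictions)
```

When the estimate already equals the actual value, the term $(y - h_w(x))$ is zero and the weights stay unchanged; only misclassified points move the decision boundary.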

> #### Programming Notes:
> - Libraries used
>     - **pandas** - a data analysis and manipulation tool
>     - **numpy** - scientific computing with Python
>     - **matplotlib** - visualization with Python
>     - **sklearn** - tools for predictive data analysis
> - Functionality and concepts used
>     - **CSV** file
>     - **read_csv()** read a comma-separated values (csv) file into **pandas** DataFrame.
>     - **List Comprehension** to convert data
>     - **isnull()** Detect missing values
>     - **sum()** Return the sum of the values over the requested axis (can sum number of True-statements).
>     - **dropna()** clean the **pandas** DataFrame
>     - **train_test_split** from **sklearn**
>     - **Perceptron** to train (fit) the model
>     - **metrics.accuracy_score** to get the accuracy of the predictions
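
The cleaning steps listed in the notes (**isnull()**, **sum()**, **dropna()** and the list comprehension) can be tried on a small frame first. The values below are invented for illustration; only the column names are taken from the code in this lesson.

```python
import numpy as np
import pandas as pd

# Toy data with two missing values, made up for this sketch.
toy = pd.DataFrame({
    "Humidity3pm": [93.0, np.nan, 79.0, 65.0],
    "Pressure3pm": [999.7, 1015.5, np.nan, 984.9],
    "RainTomorrow": ["Yes", "No", "No", "Yes"],
})

# isnull() marks missing cells as True; sum() counts the True values per column.
missing_per_column = toy.isnull().sum()

# dropna() removes every row that contains at least one missing value.
clean = toy.dropna()

# List comprehension converting the text label into 0 ('No') or 1 (otherwise).
labels = np.array([0 if value == "No" else 1 for value in clean["RainTomorrow"]])

print(missing_per_column)
print(clean)
print(labels)
```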

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.linear_model import Perceptron
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
data = pd.read_csv('files/weather.csv', parse_dates=True, index_col=0)
data.head()

In [None]:
data.isnull().sum()

In [None]:
# Keep only the two features and the label, dropping rows with missing values
dataset = data[['Humidity3pm', 'Pressure3pm', 'RainTomorrow']].dropna()

In [None]:
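# Features: humidity and pressure at 3pm
X = dataset[['Humidity3pm', 'Pressure3pm']]
y = dataset['RainTomorrow']
# Encode the label as 0 for 'No' and 1 for any other value
y = np.array([0 if value == 'No' else 1 for value in y])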

In [None]:
# Hold out part of the data for testing (train_test_split uses 25% by default)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

In [None]:
# Fit the Perceptron on the training set
clf = Perceptron(random_state=0)
clf.fit(X_train, y_train)
# Fraction of test points that are classified correctly
y_pred = clf.predict(X_test)
accuracy_score(y_test, y_pred)

In [None]:
# Baseline: share of 'No' labels, i.e. the accuracy of always predicting no rain
sum(y == 0)/len(y)

In [None]:
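fig, ax = plt.subplots()
X_data = X.to_numpy()

# Predict on the DataFrame so the feature names match those seen in fit()
y_all = clf.predict(X)
# Color each point by its predicted class
ax.scatter(x=X_data[:,0], y=X_data[:,1], c=y_all, alpha=.25)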

In [None]:
fig, ax = plt.subplots()

# Same scatter plot, colored by the actual labels for comparison
ax.scatter(x=X_data[:,0], y=X_data[:,1], c=y, alpha=.25)