# MACHINE LEARNING 

---

> Supervised Learning:

- The machine learns from examples that are supplied to it.
- Each example comes with the right answer (the output label).
- It predicts outputs for new inputs based on the supplied data.
- `Regression` is one type of supervised learning.
- `Classification` is another type of supervised learning.
- In a classification problem, the learning algorithm has to fit a boundary line that separates the classes.

> Unsupervised Learning:

- Its `Clustering Algorithm` finds something interesting (such as groups) in unlabelled data.
- Data comes only with inputs x, not output labels y.
- The algorithm has to find structure in the data.
- `Anomaly Detection` is used to find unusual data points.
- `Dimensionality Reduction` compresses the data using fewer numbers.

> Regression Model:

- **x:** Input variable (feature)
- **y:** Output variable (target)
- **m:** Number of training examples

A regression model predicts a continuous value based on input features. It learns the relationship between x and y from the training data, aiming to minimize the difference between predicted and actual values.

> Linear Model:

- 1. **Model:** f(x) = wx + b
- 2. **Parameters:** w, b
- 3. **Cost function:** To fit such a line to the data, we use a cost function, which measures how well the line matches the training examples. On the graph we have:
    - ŷ, which is the prediction
    - y, which is the actual data point (target)
    - ŷ - y, which is called the error: the vertical distance between the prediction and the target

    Squared Error Cost Function: `J(w, b) = 1/(2m) ∑(ŷ - y)²`

    Since ŷ is the model's prediction f(x), we can rewrite the cost function as

    Squared Error Cost Function: `J(w, b) = 1/(2m) ∑(f(x) - y)²`

- 4. **Objective:** Find the parameters that minimize the cost function. When w is the only parameter (b is not present), the cost is quadratic in w, so its graph is a parabola and the best value of w sits at the minimum of that parabola.
    - For functions with both parameters w and b, the cost function is shaped more like a soup bowl.
    - We can also flatten this into a contour plot, where the innermost contour surrounds the minimum value.

    - <img src="image.png" width="350"/>
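The parabola shape described above can be checked numerically. The following is a minimal sketch, assuming a hypothetical toy dataset generated by y = 2x and a helper named `squared_error_cost` (neither is part of the original notes). It evaluates the cost for a few values of w with b fixed at 0.

```python
import numpy as np

def squared_error_cost(w, b, x, y):
    """Return J(w, b) = 1/(2m) * sum((w*x + b - y)^2)."""
    m = x.shape[0]                # number of training examples
    predictions = w * x + b       # f(x) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

# Hypothetical toy data generated by y = 2x, so the best fit is w = 2, b = 0.
x_train = np.array([1.0, 2.0, 3.0])
y_train = np.array([2.0, 4.0, 6.0])

# Sweep w with b fixed at 0: the cost falls to zero at w = 2 and rises
# symmetrically on either side, which is the parabola described above.
for w in [0.0, 1.0, 2.0, 3.0]:
    print(w, squared_error_cost(w, 0.0, x_train, y_train))
```

With real data the minimum cost is usually above zero, because a straight line rarely passes through every point.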

> Linear Regression Gradient Descent:

Gradient descent is an optimization algorithm. In linear regression it is used to find the best-fit line for a set of data points by minimizing the cost function (usually Mean Squared Error).

How it works:

- The linear regression model predicts y = wx + b
- The cost function measures how far the predictions are from the actual values.
- Gradient descent iteratively updates the parameters w (weight) and b (bias) to reduce the cost:
    - Compute the gradient (slope) of the cost function with respect to w and b.
    - Update w and b in the direction opposite to the gradient, scaled by the learning rate α:
        - w = w - α(∂J/∂w)
        - b = b - α(∂J/∂b)
    - This process repeats until the cost function converges to its minimum, meaning the model fits the data as well as a straight line can.
- <img src="image2.png" width="350"/>

Note: w and b must be updated simultaneously. An incorrect implementation updates w first and then uses the new w when computing the derivative for b. The correct update is:

```py
tmp_w = w - alpha * dJ_dw  # dJ_dw: ∂J/∂w evaluated at the old w, b
tmp_b = b - alpha * dJ_db  # dJ_db: ∂J/∂b evaluated at the old w, b
w = tmp_w
b = tmp_b
```

This is because both derivatives of J must be evaluated at the old values of w and b, so we assign the new values only after both derivatives have been computed.
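The two partial derivatives follow from the squared error cost by the chain rule. A short derivation, assuming $f_{w,b}(x) = wx + b$ and writing $x^{(i)}, y^{(i)}$ for the $i$-th training example (the superscript index is notation added here for clarity):

$J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)^2$

$\frac{\partial J}{\partial w} = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) \frac{\partial f_{w,b}(x^{(i)})}{\partial w} = \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)}$

$\frac{\partial J}{\partial b} = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) \frac{\partial f_{w,b}(x^{(i)})}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)$

The factor $\frac{1}{2}$ in the cost cancels the 2 that comes from differentiating the square, which is why the cost is defined with $\frac{1}{2m}$ rather than $\frac{1}{m}$.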

Simplifying the derivatives we get: 

```py
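import numpy as np
# f(x) = w * x + b, with x and y NumPy arrays holding the m training examples
tmp_w = w - alpha * np.mean((w * x + b - y) * x)
tmp_b = b - alpha * np.mean(w * x + b - y)
w = tmp_w
b = tmp_b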
```
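Putting the update rule into a loop gives a complete training procedure. The following is a minimal sketch, assuming a hypothetical toy dataset generated by y = 2x + 1 and illustrative values for the learning rate and iteration count. The function name `gradient_descent` and its defaults are assumptions, not part of the original notes.

```python
import numpy as np

def gradient_descent(x, y, alpha=0.05, iterations=5000):
    """Fit f(x) = w*x + b by batch gradient descent on the squared error cost."""
    w, b = 0.0, 0.0
    m = x.shape[0]
    for _ in range(iterations):
        error = w * x + b - y           # f(x) - y for every example, using the old w and b
        dj_dw = np.sum(error * x) / m   # dJ/dw
        dj_db = np.sum(error) / m       # dJ/db
        # Simultaneous update: both gradients were computed before either parameter changes.
        w, b = w - alpha * dj_dw, b - alpha * dj_db
    return w, b

# Hypothetical toy data generated by y = 2x + 1.
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([3.0, 5.0, 7.0, 9.0])

w, b = gradient_descent(x_train, y_train)
print(w, b)  # should be close to 2 and 1
```

The learning rate matters: if `alpha` is too large the updates overshoot and the cost grows instead of shrinking, and if it is too small convergence takes many more iterations.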

> Multiple features instead of one:

If we have multiple features $x_1, x_2, \ldots, x_n$ instead of a single feature $x$, the model becomes:

$f_{w,b}(\vec{x}) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$

Or, in vector form:

$f_{w,b}(\vec{x}) = \vec{w} \cdot \vec{x} + b$

where

$\vec{w} = [w_1, w_2, \ldots, w_n] \quad$
$\vec{x} = [x_1, x_2, \ldots, x_n]$
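The vector form maps directly onto a dot product in NumPy. The following is a minimal sketch with hypothetical weights and feature values (the numbers and the function name `predict` are assumptions chosen for illustration). It compares an explicit loop over the features with the vectorized form.

```python
import numpy as np

def predict(x, w, b):
    """Return f_{w,b}(x) = w . x + b for one example with n features."""
    return np.dot(w, x) + b

# Hypothetical parameters and a single example with n = 3 features.
w = np.array([0.5, -1.0, 2.0])
b = 4.0
x = np.array([2.0, 3.0, 1.0])

# Term-by-term sum w_1*x_1 + ... + w_n*x_n + b, written as an explicit loop.
loop_result = sum(w[j] * x[j] for j in range(len(w))) + b

# The same computation as a single dot product.
vector_result = predict(x, w, b)

print(loop_result, vector_result)
```

Both forms give the same value. The dot product is shorter and typically faster for large n, because NumPy performs the multiply-and-sum in compiled code.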

```py
import pandas as pd
import numpy as np
from IPython.display import display  # available by default in notebooks; imported so the code also runs as a script

def m_print(text):
    # Print a section label in green using ANSI escape codes
    print('\033[92m' + "\n" + text + '\033[0m')

df = pd.read_csv("co2.csv")
m_print("Original Database: ")
display(df.head())

m_print("Original Data Types: ")
print(df.dtypes)

# Some columns are text, so we map each distinct value to an integer code
def conv_integer(datacolumn):
    if df[datacolumn].dtype not in ['int64', 'float64', 'int32', 'float32']:
        # set() has no fixed order for strings, so the codes can differ between runs
        indexing = dict(enumerate(list(set(df[datacolumn]))))
        value_to_int = {v: k for k, v in indexing.items()}
        df[datacolumn] = df[datacolumn].map(value_to_int)
        return indexing
    # Numeric columns are left unchanged (returns None)
    return None

m_print("The indexing is: ")
print(np.array(list(map(conv_integer, list(df.columns)))))

m_print("Modified Data Types: ")
print(df.dtypes)

# Now let's look at the encoded data
m_print("Modified Database: ")
display(df.head())

# We take y as CO2 (the last column); all other columns are the independent variables x
x = np.array(df[df.columns[0:-1]])
y = np.array(df[df.columns[-1]])

# Add a leading column of constant 1 to the matrix so b[0] acts as the intercept
x = np.insert(x, 0, 1, axis=1)

# Now we compute the coefficient vector
# By the normal equation, B = (XᵀX)⁻¹XᵀY
b = np.linalg.inv(np.dot(x.T, x)).dot(x.T).dot(y)

# y = b0 + b1*x1 + b2*x2 + ...
# The test row has no leading 1, so the intercept b[0] is added separately

m_print("Final testing with a row from original dataset: ")
x_testing = df.iloc[0][:-1]
print(x_testing)
y_prediction = b[0] + np.dot(x_testing, b[1:])

m_print("Prediction Result: \t\tOriginal Data Result:")
print(y_prediction, end="\t\t")
print(df.iloc[0, -1])
```

Original Database:

| | Make | Model | Vehicle Class | Engine Size(L) | Cylinders | Transmission | Fuel Type | Fuel Consumption City (L/100 km) | Fuel Consumption Hwy (L/100 km) | Fuel Consumption Comb (L/100 km) | Fuel Consumption Comb (mpg) | CO2 Emissions(g/km) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ACURA | ILX | COMPACT | 2.0 | 4 | AS5 | Z | 9.9 | 6.7 | 8.5 | 33 | 196 |
| 1 | ACURA | ILX | COMPACT | 2.4 | 4 | M6 | Z | 11.2 | 7.7 | 9.6 | 29 | 221 |
| 2 | ACURA | ILX HYBRID | COMPACT | 1.5 | 4 | AV7 | Z | 6.0 | 5.8 | 5.9 | 48 | 136 |
| 3 | ACURA | MDX 4WD | SUV - SMALL | 3.5 | 6 | AS6 | Z | 12.7 | 9.1 | 11.1 | 25 | 255 |
| 4 | ACURA | RDX AWD | SUV - SMALL | 3.5 | 6 | AS6 | Z | 12.1 | 8.7 | 10.6 | 27 | 244 |


Original Data Types:

| Column | dtype |
|---|---|
| Make | object |
| Model | object |
| Vehicle Class | object |
| Engine Size(L) | float64 |
| Cylinders | int64 |
| Transmission | object |
| Fuel Type | object |
| Fuel Consumption City (L/100 km) | float64 |
| Fuel Consumption Hwy (L/100 km) | float64 |
| Fuel Consumption Comb (L/100 km) | float64 |
| Fuel Consumption Comb (mpg) | int64 |
| CO2 Emissions(g/km) | int64 |

The indexing is:

[{0: 'AUDI', 1: 'JAGUAR', 2: 'ROLLS-ROYCE', 3: 'SMART', 4: 'SCION', 5: 'CHRYSLER', 6: 'TOYOTA', 7: 'DODGE', 8: 'MITSUBISHI', 9: 'MASERATI', 10: 'KIA', 11: 'INFINITI', 12: 'LEXUS', 13: 'GENESIS', 14: 'JEEP', 15: 'NISSAN', 16: 'HONDA', 17: 'FIAT', 18: 'SRT', 19: 'FORD', 20: 'MINI', 21: 'BMW', 22: 'LAND ROVER', 23: 'ASTON MARTIN', 24: 'ALFA ROMEO', 25: 'RAM', 26: 'BENTLEY', 27: 'BUICK', 28: 'LAMBOR

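Modified Database:

| | Make | Model | Vehicle Class | Engine Size(L) | Cylinders | Transmission | Fuel Type | Fuel Consumption City (L/100 km) | Fuel Consumption Hwy (L/100 km) | Fuel Consumption Comb (L/100 km) | Fuel Consumption Comb (mpg) | CO2 Emissions(g/km) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 32 | 1953 | 6 | 2.0 | 4 | 6 | 0 | 9.9 | 6.7 | 8.5 | 33 | 196 |
| 1 | 32 | 1953 | 6 | 2.4 | 4 | 13 | 0 | 11.2 | 7.7 | 9.6 | 29 | 221 |
| 2 | 32 | 1983 | 6 | 1.5 | 4 | 14 | 0 | 6.0 | 5.8 | 5.9 | 48 | 136 |
| 3 | 32 | 1762 | 9 | 3.5 | 6 | 4 | 0 | 12.7 | 9.1 | 11.1 | 25 | 255 |
| 4 | 32 | 1606 | 9 | 3.5 | 6 | 4 | 0 | 12.1 | 8.7 | 10.6 | 27 | 244 |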


Final testing with a row from original dataset (row 0, dtype float64):

| Feature | Value |
|---|---|
| Make | 32.0 |
| Model | 1953.0 |
| Vehicle Class | 6.0 |
| Engine Size(L) | 2.0 |
| Cylinders | 4.0 |
| Transmission | 6.0 |
| Fuel Type | 0.0 |
| Fuel Consumption City (L/100 km) | 9.9 |
| Fuel Consumption Hwy (L/100 km) | 6.7 |
| Fuel Consumption Comb (L/100 km) | 8.5 |
| Fuel Consumption Comb (mpg) | 33.0 |

| Prediction Result | Original Data Result |
|---|---|
| 207.9054967829702 | 196 |
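
The closed-form solution used above inverts $X^\top X$ explicitly, which can become numerically unreliable when features are strongly correlated. The following is a minimal sketch on hypothetical synthetic data (not the `co2.csv` dataset) that compares the explicit normal equation with `np.linalg.lstsq`. The least-squares solver is a swapped-in alternative, not the method used in the notebook, and the coefficients and random seed are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical synthetic data: 50 examples, 2 features, no noise.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 2))
true_coeffs = np.array([3.0, 1.5, -2.0])   # intercept b0, then b1 and b2

# Add a leading column of ones so the first coefficient acts as the intercept.
X = np.insert(features, 0, 1.0, axis=1)
y = X @ true_coeffs

# Normal equation, as in the notebook: B = (X^T X)^-1 X^T y
b_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Least-squares solver: minimizes ||X b - y|| without forming the inverse explicitly.
b_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print(b_normal)
print(b_lstsq)
```

On well-conditioned data like this both approaches recover the same coefficients. The least-squares solver is generally the safer choice when columns are nearly redundant, which may be the case for the several fuel consumption columns in the dataset above.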
