# Modeling

### Importing important libraries

In [1]:
### importing the important libraries
import pandas as pd
import numpy as np

### Train Test Split
from sklearn.model_selection import train_test_split

### for modeling
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.svm import SVR

### Evaluation metrics
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

### Serialization
import pickle

### Data Loading

In [2]:
### specify the path of the dataset (relative to this notebook)
dataset_path = "../dataset/ads.csv"

### importing the dataset
df = pd.read_csv(dataset_path)

### data preview
df.head(3)

Unnamed: 0.1,Unnamed: 0,TV,radio,newspaper,sales
0,1,230.1,37.8,69.2,22.1
1,2,44.5,39.3,45.1,10.4
2,3,17.2,45.9,69.3,9.3


### Selecting Target and features

### Single Feature "radio"

In [3]:
features = df.loc[:,["radio"]]
target = df["sales"]

### Train Test Split
* The dataset is split into training and testing sets using train_test_split(). Here, we use 80% of the data for training and 20% for testing.

In [4]:
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.2, random_state=42)

In [5]:
### checking the shape of train dataset
X_train.shape, y_train.shape

((160, 1), (160,))

In [6]:
### checking the shape of test dataset
X_test.shape, y_test.shape

((40, 1), (40,))
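The shapes above follow from applying `test_size=0.2` to 200 rows. The sketch below reproduces the same split sizes on a synthetic stand-in frame (the values are made up and are not the advertising data) and checks that a fixed `random_state` gives the same split on every run.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in with 200 rows; an assumption for illustration only.
rng = np.random.default_rng(0)
demo = pd.DataFrame({"radio": rng.uniform(0, 50, 200)})
demo["sales"] = 9 + 0.2 * demo["radio"] + rng.normal(0, 4, 200)

# Same call as in the notebook: 80% train, 20% test, fixed seed.
X_a, X_b, y_a, y_b = train_test_split(
    demo[["radio"]], demo["sales"], test_size=0.2, random_state=42
)

# Repeating the call with the same seed should select the same rows.
X_c, X_d, y_c, y_d = train_test_split(
    demo[["radio"]], demo["sales"], test_size=0.2, random_state=42
)

print(X_a.shape, X_b.shape)
print("same training rows:", X_a.index.equals(X_c.index))
```

Changing or omitting `random_state` would shuffle the rows differently, so the metrics reported later could shift slightly from run to run.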

### Modeling

### 1. LinearRegression:
* The model is trained on the training data using fit().

In [7]:
lr_model = LinearRegression()
lr_model.fit(X_train, y_train)

#### Prediction using LinearRegression
* We make predictions on the test set using predict()

In [8]:
lr_model_pred = lr_model.predict(X_test)

#### Evaluation
* Finally, we evaluate the model using r2_score, MAE, and MSE. These metrics provide insights into how well the model performs.
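To make the three metrics concrete, here is a minimal sketch that computes them by hand with NumPy on a tiny made-up example and compares the results with the scikit-learn functions. The arrays `y_true` and `y_pred` are illustrative values, not taken from the dataset.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Illustrative values only (not from the advertising data).
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 6.0, 10.0])

residuals = y_true - y_pred

# MAE: average size of the errors, in the units of the target.
mae = np.mean(np.abs(residuals))

# MSE: average squared error, which penalises large errors more heavily.
mse = np.mean(residuals ** 2)

# R2: 1 minus (residual sum of squares / total sum of squares around the mean).
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(mae, mean_absolute_error(y_true, y_pred))
print(mse, mean_squared_error(y_true, y_pred))
print(r2, r2_score(y_true, y_pred))
```

Lower MAE and MSE are better, while an R2 score closer to 1 is better; an R2 of 0 corresponds to always predicting the mean of the target.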

#### a. Mean Absolute Error

In [9]:
lr_model_mae = mean_absolute_error(y_true=y_test, y_pred=lr_model_pred)
print(f"The mean absolute error of linearregression with feature radio is {lr_model_mae}")

The mean absolute error of linearregression with feature radio is 3.9298787572224847


#### b. Mean Square Error

In [10]:
lr_model_mse = mean_squared_error(y_true=y_test, y_pred=lr_model_pred)
print(f"The mean squared error of linearregression with feature radio is {lr_model_mse}")

The mean squared error of linearregression with feature radio is 23.248766588129108


#### c. R2 Score

In [11]:
lr_model_r2_score = r2_score(y_true=y_test, y_pred=lr_model_pred)
print(f"The r2 score of linearregression with feature radio is {lr_model_r2_score}")

The r2 score of linearregression with feature radio is 0.2634309396999791


### 2. Lasso Regression:
* The model is trained on the training data using fit().

In [12]:
lasso_model = Lasso()
lasso_model.fit(X_train, y_train)

#### Prediction using Lasso Regression
* We make predictions on the test set using predict()

In [16]:
lasso_model_pred = lasso_model.predict(X_test)

#### Evaluation
*  Finally, we evaluate the model using r2_score, MAE, and MSE. These metrics provide insights into how well the model performs.

#### a. Mean Absolute Error

In [17]:
lasso_model_mae = mean_absolute_error(y_true=y_test, y_pred=lasso_model_pred)
print(f"The mean absolute error of lasso regression with feature radio is {lasso_model_mae}")

The mean absolute error of lasso regression with feature radio is 3.936001663248816


#### b. Mean Square Error

In [18]:
lasso_model_mse = mean_squared_error(y_true=y_test, y_pred=lasso_model_pred)
print(f"The mean squared error of lasso regression with feature radio is {lasso_model_mse}")

The mean squared error of lasso regression with feature radio is 23.231491957556592


#### c. R2 Score

In [19]:
lasso_model_r2_score = r2_score(y_true=y_test, y_pred=lasso_model_pred)
print(f"The r2 score of lasso regression with feature radio is {lasso_model_r2_score}")

The r2 score of lasso regression with feature radio is 0.26397823576231405


### 3. Ridge Regression:
* The model is trained on the training data using fit().

In [20]:
ridge_model = Ridge()
ridge_model.fit(X_train, y_train)

#### Prediction using Ridge Regression
* We make predictions on the test set using predict()

In [21]:
ridge_model_pred = ridge_model.predict(X_test)

#### Evaluation
*  Finally, we evaluate the model using r2_score, MAE, and MSE. These metrics provide insights into how well the model performs.

#### a. Mean Absolute Error

In [22]:
ridge_model_mae = mean_absolute_error(y_true=y_test, y_pred=ridge_model_pred)
print(f"The mean absolute error of ridge regression with feature radio is {ridge_model_mae}")

The mean absolute error of ridge regression with feature radio is 3.9298865792724698


#### b. Mean Square Error

In [23]:
ridge_model_mse = mean_squared_error(y_true=y_test, y_pred=ridge_model_pred)
print(f"The mean squared error of ridge regression with feature radio is {ridge_model_mse}")

The mean squared error of ridge regression with feature radio is 23.24873844022347


#### c. R2 Score

In [24]:
ridge_model_r2_score = r2_score(y_true=y_test, y_pred=ridge_model_pred)
print(f"The r2 score of ridge regression with feature radio is {ridge_model_r2_score}")

The r2 score of ridge regression with feature radio is 0.2634318314839079


### 4. Support Vector Regression:
* The model is trained on the training data using fit().

In [25]:
svr_model = SVR()
svr_model.fit(X_train, y_train)

#### Prediction using Support Vector Regression
* We make predictions on the test set using predict()

In [26]:
svr_model_pred = svr_model.predict(X_test)

#### Evaluation
*  Finally, we evaluate the model using r2_score, MAE, and MSE. These metrics provide insights into how well the model performs.

#### a. Mean Absolute Error

In [27]:
svr_model_mae = mean_absolute_error(y_true=y_test, y_pred=svr_model_pred)
print(f"The mean absolute error of support vector regression with feature radio is {svr_model_mae}")

The mean absolute error of support vector regression with feature radio is 4.040463881301298


#### b. Mean Square Error

In [28]:
svr_model_mse = mean_squared_error(y_true=y_test, y_pred=svr_model_pred)
print(f"The mean squared error of support vector regression with feature radio is {svr_model_mse}")

The mean squared error of support vector regression with feature radio is 26.153948062787624


#### c. R2 Score

In [29]:
svr_model_r2_score = r2_score(y_true=y_test, y_pred=svr_model_pred)
print(f"The r2 score of support vector regression with feature radio is {svr_model_r2_score}")

The r2 score of support vector regression with feature radio is 0.1713887756273752


# Metrics table for radio

In [30]:
metrics_table ={
    "Regression_Algorithms": ["LinearRegression","LassoRegression","RidgeRegression", "SupportVectorRegression"],
    "Mean_Absolute_Error": [lr_model_mae, lasso_model_mae, ridge_model_mae, svr_model_mae],
    "Mean_Squared_Error": [lr_model_mse, lasso_model_mse, ridge_model_mse, svr_model_mse],
    "R2_score": [lr_model_r2_score, lasso_model_r2_score, ridge_model_r2_score, svr_model_r2_score]
}

In [31]:
metrics_df = pd.DataFrame(metrics_table)
metrics_df

Unnamed: 0,Regression_Algorithms,Mean_Absolute_Error,Mean_Squared_Error,R2_score
0,LinearRegression,3.929879,23.248767,0.263431
1,LassoRegression,3.936002,23.231492,0.263978
2,RidgeRegression,3.929887,23.248738,0.263432
3,SupportVectorRegression,4.040464,26.153948,0.171389
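The fit, predict, and evaluate steps repeated for each model can also be condensed into a single loop that builds the metrics table directly. The sketch below is one possible way to do this; it runs on synthetic stand-in data (an assumption, so that it is self-contained), which means its numbers will not match the table for the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic stand-in for the single-feature setup; values are made up.
rng = np.random.default_rng(0)
X = pd.DataFrame({"radio": rng.uniform(0, 50, 200)})
y = 9 + 0.2 * X["radio"] + rng.normal(0, 4, 200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same four estimators as in the notebook, all with default hyperparameters.
models = {
    "LinearRegression": LinearRegression(),
    "LassoRegression": Lasso(),
    "RidgeRegression": Ridge(),
    "SupportVectorRegression": SVR(),
}

rows = []
for name, model in models.items():
    # Fit on the training split, then score on the held-out test split.
    pred = model.fit(X_train, y_train).predict(X_test)
    rows.append({
        "Regression_Algorithms": name,
        "Mean_Absolute_Error": mean_absolute_error(y_test, pred),
        "Mean_Squared_Error": mean_squared_error(y_test, pred),
        "R2_score": r2_score(y_test, pred),
    })

metrics_demo = pd.DataFrame(rows)
print(metrics_demo)
```

Keeping the estimators in a dictionary means each label is written once, which avoids mislabeled print statements when cells are copied between models.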


The metrics table shows that the three linear models (Linear Regression, Lasso Regression, and Ridge Regression) perform almost identically on the test set, with an MAE of about 3.93, an MSE of about 23.2, and an R2 score of about 0.26. Support Vector Regression is slightly worse on every metric (MAE 4.04, MSE 26.15, R2 0.17). Any of the three linear models is therefore a reasonable choice for the single feature radio. Because their scores are nearly identical, the choice among them can rest on other factors such as interpretability, computational efficiency, or the specific requirements of the problem at hand.

An R2 score of about 0.26 also means that radio alone accounts for only roughly a quarter of the variance in sales on this test split. These results come from a single 80/20 split, so further analysis such as cross-validation could provide a more robust evaluation of model performance.
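As a rough sketch of the cross-validation mentioned above, the example below scores `LinearRegression` with 5-fold cross-validation using `cross_val_score`. The data is a synthetic stand-in (an assumption, so that the example runs on its own), and the choice of 5 shuffled folds is illustrative rather than something the notebook prescribes.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the single-feature setup; values are made up.
rng = np.random.default_rng(0)
X = pd.DataFrame({"radio": rng.uniform(0, 50, 200)})
y = 9 + 0.2 * X["radio"] + rng.normal(0, 4, 200)

# Five shuffled folds: each row is used for testing exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=42)

# One R2 score per fold instead of a single score from one split.
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")

print("R2 per fold:", scores)
print("mean:", scores.mean(), "std:", scores.std())
```

The spread of the per-fold scores gives a sense of how much a single train/test split can vary, which helps when the models being compared differ only slightly.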