# 📊 Analysis of advertising spend and sales
This notebook applies **multiple linear regression** to predict sales from advertising spend on TV, radio, and newspapers.

In [3]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.preprocessing import StandardScaler

## 🔹 Loading the data

In [4]:
dataset = pd.read_csv("Advertising.csv", index_col=0)
dataset.head()

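|   | TV | Radio | Newspaper | Sales |
|---|----|-------|-----------|-------|
| 1 | 230.1 | 37.8 | 69.2 | 22.1 |
| 2 | 44.5 | 39.3 | 45.1 | 10.4 |
| 3 | 17.2 | 45.9 | 69.3 | 9.3 |
| 4 | 151.5 | 41.3 | 58.5 | 18.5 |
| 5 | 180.8 | 10.8 | 58.4 | 12.9 |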


## 🔹 Separating features and target

In [None]:
X = dataset.iloc[:, :3]  # features: TV, Radio, Newspaper
y = dataset.iloc[:, 3]   # target: Sales
y                        # display the target series


1      22.1
2      10.4
3       9.3
4      18.5
5      12.9
       ... 
196     7.6
197     9.7
198    12.8
199    25.5
200    13.4
Name: Sales, Length: 200, dtype: float64

## 🔹 Train/test split

In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=23)

## 🔹 Linear regression with `scikit-learn`

In [5]:
model = LinearRegression()
model.fit(X_train, y_train)

bias = model.intercept_
W = model.coef_
print("Intercept (bias):", bias)
print("Weights (coefficients):", W)

# Evaluate on the held-out test set.
yhat_sklearn = model.predict(X_test)
mse1 = mean_squared_error(y_test, yhat_sklearn)
print("MSE (sklearn):", round(mse1, 2))

Intercept (bias): 2.9281328114225094
Weights (coefficients): [4.64851464e-02 1.81416165e-01 7.90438566e-05]
MSE (sklearn): 2.82
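
The imports at the top of the notebook also bring in `mean_absolute_error` and `r2_score`, which the cell above does not use. As a minimal sketch, the helper below reports all three metrics for a set of predictions. The name `regression_report` is hypothetical, and the synthetic data only stands in for the Advertising split, so the printed values are not the notebook's results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def regression_report(y_true, y_pred):
    """Return MSE, RMSE, MAE and R² for a set of predictions."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mse": float(mse),
        "rmse": float(np.sqrt(mse)),
        "mae": float(mean_absolute_error(y_true, y_pred)),
        "r2": float(r2_score(y_true, y_pred)),
    }


# Synthetic stand-in for the advertising data (assumption: 200 rows, 3 features).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 100, size=(200, 3))
y_demo = 3.0 + X_demo @ np.array([0.05, 0.2, 0.0]) + rng.normal(0, 1.5, size=200)

Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.2, random_state=23)
demo_model = LinearRegression().fit(Xtr, ytr)
report = regression_report(yte, demo_model.predict(Xte))
print({name: round(value, 3) for name, value in report.items()})
```

RMSE and MAE are in the same units as the target, which makes them easier to read than the MSE. R² gives the share of variance explained.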


## 🔹 Standardizing the data for the from-scratch model

In [6]:
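scaler = StandardScaler()
# The scaler is fitted on all of X, so the test rows contribute to the mean and
# standard deviation; fitting on the training split alone would avoid that.
X_scaled = scaler.fit_transform(X)

# Same test_size and random_state as before, so the rows match the earlier split.
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=23)

# One 1-D array per standardized feature: TV, Radio, Newspaper.
X1_train, X2_train, X3_train = X_train[:, 0], X_train[:, 1], X_train[:, 2]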
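
The cell above fits the scaler on the full feature matrix before splitting, so the test rows influence the mean and standard deviation. A common alternative is to split first and fit the scaler on the training rows only. The sketch below shows that order on synthetic data. The `*_demo` names and the generated values are assumptions, not the Advertising dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: 200 rows, 3 features).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 300, size=(200, 3))
y_demo = rng.normal(14, 5, size=200)

# Split first, then standardize.
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=23)

demo_scaler = StandardScaler()
X_tr_scaled = demo_scaler.fit_transform(X_tr)  # statistics come from the training rows only
X_te_scaled = demo_scaler.transform(X_te)      # test rows reuse the training mean and std

print("train mean:", X_tr_scaled.mean(axis=0).round(3))
print("train std :", X_tr_scaled.std(axis=0).round(3))
print("test mean :", X_te_scaled.mean(axis=0).round(3))
```

The test columns will generally not have exactly zero mean or unit variance. That is expected, since they are scaled with statistics they did not contribute to.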

## 🔹 From-scratch implementation with gradient descent
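
For reference, the updates in this section's loop follow from differentiating the mean squared error. The symbols are chosen here for illustration: $n$ training samples, standardized features $x_{ij}$, bias $b$, weights $w_j$ and learning rate $\alpha$. In the code they correspond to `n`, `X1_train` to `X3_train`, `bias`, `w1` to `w3` and `learning_rate`.

$$
\hat{y}_i = b + \sum_{j=1}^{3} w_j x_{ij}, \qquad
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2
$$

$$
\frac{\partial\,\mathrm{MSE}}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right), \qquad
\frac{\partial\,\mathrm{MSE}}{\partial w_j} = -\frac{2}{n} \sum_{i=1}^{n} x_{ij} \left(y_i - \hat{y}_i\right)
$$

$$
b \leftarrow b - \alpha\, \frac{\partial\,\mathrm{MSE}}{\partial b}, \qquad
w_j \leftarrow w_j - \alpha\, \frac{\partial\,\mathrm{MSE}}{\partial w_j}
$$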

In [7]:
n = len(y_train)
learning_rate = 0.01
epochs = 1000

# Arbitrary starting values for the bias and the three weights.
bias = 1
w1, w2, w3 = 1, 2, 3

for it in range(epochs):
    # Forward pass: predictions with the current parameters.
    yhat = bias + w1 * X1_train + w2 * X2_train + w3 * X3_train
    mse = mean_squared_error(y_train, yhat)
    if it % 100 == 0:
        print(f"it:{it} --> MSE: {round(mse, 2)}")

    # Partial derivatives of the MSE with respect to each parameter.
    gradient_bias = -(2/n) * np.sum(y_train - yhat)
    gradient_w1 = -(2/n) * np.sum(X1_train * (y_train - yhat))
    gradient_w2 = -(2/n) * np.sum(X2_train * (y_train - yhat))
    gradient_w3 = -(2/n) * np.sum(X3_train * (y_train - yhat))

    # Step against the gradient.
    bias -= learning_rate * gradient_bias
    w1 -= learning_rate * gradient_w1
    w2 -= learning_rate * gradient_w2
    w3 -= learning_rate * gradient_w3

it:0 --> MSE: 186.04
it:100 --> MSE: 6.34
it:200 --> MSE: 2.87
it:300 --> MSE: 2.8
it:400 --> MSE: 2.79
it:500 --> MSE: 2.79
it:600 --> MSE: 2.79
it:700 --> MSE: 2.79
it:800 --> MSE: 2.79
it:900 --> MSE: 2.79
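
The loop above keeps one variable per weight. The same batch update can be written with a weight vector and a matrix product, which works for any number of features. The sketch below is one possible vectorized form, not the notebook's code. The function name `gradient_descent`, the zero initialization and the synthetic data are assumptions. It checks the result against NumPy's least-squares solver.

```python
import numpy as np


def gradient_descent(X, y, learning_rate=0.01, epochs=1000):
    """Fit y ≈ bias + X @ weights by batch gradient descent on the MSE."""
    n, n_features = X.shape
    bias = 0.0
    weights = np.zeros(n_features)
    for _ in range(epochs):
        residual = y - (bias + X @ weights)
        # Same gradients as the scalar loop, computed for all weights at once.
        bias -= learning_rate * (-(2 / n) * residual.sum())
        weights -= learning_rate * (-(2 / n) * (X.T @ residual))
    return bias, weights


# Synthetic, already standardized features (assumption: stands in for X_train).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(160, 3))
y_demo = 14.0 + X_demo @ np.array([4.0, 2.8, 0.0]) + rng.normal(0, 1.0, size=160)

b_gd, w_gd = gradient_descent(X_demo, y_demo)

# Reference solution: least squares on the same data, with a column of ones for the bias.
A = np.column_stack([np.ones(len(y_demo)), X_demo])
ref, *_ = np.linalg.lstsq(A, y_demo, rcond=None)
print("gradient descent:", np.round(np.r_[b_gd, w_gd], 3))
print("least squares   :", np.round(ref, 3))
```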


In [8]:
X1_test, X2_test, X3_test = X_test[:,0], X_test[:,1], X_test[:,2]
yhat_scratch = bias + w1 * X1_test + w2 * X2_test + w3 * X3_test
mse2 = mean_squared_error(y_test, yhat_scratch)
print("MSE (from scratch):", round(mse2, 2))

MSE (from scratch): 2.82
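
The from-scratch weights are learned on standardized features, so they are not in the same units as the `sklearn` coefficients printed earlier. `StandardScaler` computes `z = (x - mean_) / scale_`, so a model `b + Σ w_j z_j` can be rewritten in raw units as `(b - Σ w_j mean_j / scale_j) + Σ (w_j / scale_j) x_j`. The sketch below applies that conversion on synthetic data. The helper name `unscale_coefficients` is hypothetical, and a `LinearRegression` fitted on the scaled features stands in for the gradient-descent weights.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler


def unscale_coefficients(bias_scaled, weights_scaled, scaler):
    """Convert a model fitted on standardized features to raw-feature units."""
    weights_raw = weights_scaled / scaler.scale_
    bias_raw = bias_scaled - np.sum(weights_raw * scaler.mean_)
    return bias_raw, weights_raw


# Synthetic stand-in data (assumption: 200 rows, 3 features).
rng = np.random.default_rng(0)
X_raw = rng.uniform(0, 300, size=(200, 3))
y_demo = 3.0 + X_raw @ np.array([0.05, 0.2, 0.0]) + rng.normal(0, 1.5, size=200)

demo_scaler = StandardScaler()
X_std = demo_scaler.fit_transform(X_raw)

model_std = LinearRegression().fit(X_std, y_demo)  # stands in for the from-scratch weights
model_raw = LinearRegression().fit(X_raw, y_demo)  # fitted directly on raw features

b_raw, w_raw = unscale_coefficients(model_std.intercept_, model_std.coef_, demo_scaler)
print("recovered:", np.round(b_raw, 4), np.round(w_raw, 4))
print("direct   :", np.round(model_raw.intercept_, 4), np.round(model_raw.coef_, 4))
```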


## ✅ Conclusion
- The `sklearn` model reaches a test MSE of 2.82. `LinearRegression` solves the ordinary least squares problem directly, with no learning rate or epoch count to tune.
- The **from-scratch** model reaches the same test MSE (2.82) given a suitable learning rate and enough epochs, which confirms that the gradient descent implementation works.
- Writing the training loop by hand helps show **how a linear model is actually trained**.