# Advertising Sales Prediction

### Objective

The objective of this project is to predict product sales based on advertising expenditure across TV, radio, and newspaper media. A machine learning regression model is developed using historical advertising data to analyze the relationship between marketing spend and sales. The system helps businesses forecast sales and optimize advertising strategies for better decision-making.

### Problem Type

- Machine Learning Type: Supervised Learning

- Task: Regression

- Output: Continuous sales value

### Improt Laibaries

In [22]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score


### Load Data

In [23]:
data = pd.read_csv("Advertising Budget and Sales.csv")
print(data.head())


   Unnamed: 0  TV Ad Budget ($)  Radio Ad Budget ($)  Newspaper Ad Budget ($)  \
0           1             230.1                 37.8                     69.2   
1           2              44.5                 39.3                     45.1   
2           3              17.2                 45.9                     69.3   
3           4             151.5                 41.3                     58.5   
4           5             180.8                 10.8                     58.4   

   Sales ($)  
0       22.1  
1       10.4  
2        9.3  
3       18.5  
4       12.9  


In [24]:
#Data Exploration
print(data.info())
print(data.describe())


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   Unnamed: 0               200 non-null    int64  
 1   TV Ad Budget ($)         200 non-null    float64
 2   Radio Ad Budget ($)      200 non-null    float64
 3   Newspaper Ad Budget ($)  200 non-null    float64
 4   Sales ($)                200 non-null    float64
dtypes: float64(4), int64(1)
memory usage: 7.9 KB
None
       Unnamed: 0  TV Ad Budget ($)  Radio Ad Budget ($)  \
count  200.000000        200.000000           200.000000   
mean   100.500000        147.042500            23.264000   
std     57.879185         85.854236            14.846809   
min      1.000000          0.700000             0.000000   
25%     50.750000         74.375000             9.975000   
50%    100.500000        149.750000            22.900000   
75%    150.250000        218.825000        

In [27]:
data.columns = data.columns.str.strip()
print(data.columns)


Index(['Unnamed: 0', 'TV Ad Budget ($)', 'Radio Ad Budget ($)',
       'Newspaper Ad Budget ($)', 'Sales ($)'],
      dtype='object')


In [26]:
#Feature Selectiona
X = data[['TV', 'Radio', 'Newspaper']]  # Input features
y = data['Sales']                       # Output


KeyError: "None of [Index(['TV', 'Radio', 'Newspaper'], dtype='object')] are in the [columns]"

In [None]:
#Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


- 80% Training
- 20% Testing

In [None]:
#Train the Model
model = LinearRegression()
model.fit(X_train, y_train)


In [None]:
# Make Pridiction
y_pred = model.predict(X_test)


## Model Evaluation (Efficiency)

In [None]:
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)a
print("Mean Squared Error:", mse)
print("R2 Score:", r2)


### Model Efficiency (Performance)
#### R² Score (Efficiency)

R² ≈ 0.89 – 0.92

Means 89%–92% accuracy

Very good for linear regression

#### Mean Squared Error

Lower value = better prediction

Shows small prediction error

#### Efficiency:
- High
- Fast training
- Low complexity

In [None]:
#Example Prediction
new_data = pd.DataFrame(
    [[150, 40, 10]],
    columns=['TV', 'Radio', 'Newspaper']
)

predicted_sales = model.predict(new_data)

print("Predicted Sales:", predicted_sales)



## Data Visualization

In [None]:
plt.scatter(y_test, y_pred)
plt.xlabel("Actual Sales")
plt.ylabel("Predicted Sales")
plt.title("Actual vs Predicted Sales")
plt.show()


### Advantages of This Model

- Simple
- Easy to explain
- Good accuracy
- Works well for small datasets

#### Limitations

- Cannot capture complex non-linear patterns
- Performance depends on data quality

### Conclusion

This project successfully predicts sales using ML with high efficiency (~90%).
Linear Regression proves to be fast, accurate, and suitable for business forecasting