## Title : Advertisement Analysis
* The goal of this project is to analyze the relationship between different modes of advertising and product sales.
* The project will also identify which advertising mode has the highest impact on sales, providing insights for strategic ad budget allocations.

**Dataset Description**

The Advertisement dataset records product sales as a function of the type of advertisement used.
Three advertising modes are covered: television, radio, and newspaper.

Attributes: 
*   TV(X1)
*   Radio(X2)
*   Newspaper(X3)
*   Sales(Y)


In [1]:
import numpy as np
import pandas as pd
from sympy import symbols,Eq,solve
from scipy.stats import pearsonr
from sklearn.metrics import r2_score

In [3]:
df=pd.read_csv("advertising.csv")

In [4]:
df.head(5)

Unnamed: 0,TV,Radio,Newspaper,Sales
0,230.1,37.8,69.2,22.1
1,44.5,39.3,45.1,10.4
2,17.2,45.9,69.3,12.0
3,151.5,41.3,58.5,16.5
4,180.8,10.8,58.4,17.9
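
The `Unnamed: 0` column in the preview looks like a row index that was written out when the CSV was saved. The following is a minimal sketch of one way to keep it out of the data columns, assuming that reading is correct. The inline sample reuses only the five rows shown above as a stand-in for `advertising.csv`, and the empty first header name is an assumption about how the file was written.

```python
import io
import pandas as pd

# Stand-in for advertising.csv: the five preview rows, with an unnamed
# first column (assumed here to be a saved row index).
sample_csv = io.StringIO(
    ",TV,Radio,Newspaper,Sales\n"
    "0,230.1,37.8,69.2,22.1\n"
    "1,44.5,39.3,45.1,10.4\n"
    "2,17.2,45.9,69.3,12.0\n"
    "3,151.5,41.3,58.5,16.5\n"
    "4,180.8,10.8,58.4,17.9\n"
)

# index_col=0 makes the first column the DataFrame index instead of a
# data column named "Unnamed: 0".
sample_df = pd.read_csv(sample_csv, index_col=0)
print(sample_df.columns.tolist())
```

None of the later cells use `Unnamed: 0`, so this only tidies the frame and does not change any fitted value.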


In [5]:
X1=list(df['TV'])
X2=list(df['Radio'])
X3=list(df['Newspaper'])
Y=list(df['Sales'])
n=len(Y)
p=3


In [6]:
X1Y=np.multiply(X1,Y)
X2Y=np.multiply(X2,Y)
X3Y=np.multiply(X3,Y)
X1sq=np.square(X1)
X2sq=np.square(X2)
X3sq=np.square(X3)

**Least Squares Regression**

In [7]:
print("Values From LSR Method")
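# Each slope is the simple (single-predictor) least-squares slope of Sales on one channel
m1=(n*sum(X1Y)-sum(X1)*sum(Y))/(n*sum(X1sq)-sum(X1)*sum(X1))
m2=(n*sum(X2Y)-sum(X2)*sum(Y))/(n*sum(X2sq)-sum(X2)*sum(X2))
m3=(n*sum(X3Y)-sum(X3)*sum(Y))/(n*sum(X3sq)-sum(X3)*sum(X3))
# Intercept chosen so the combined model passes through the sample means
c=(sum(Y)-m1*sum(X1)-m2*sum(X2)-m3*sum(X3))/n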

print(f"m1 = {m1}\nm2 = {m2}\nm3 = {m3}\nc  = {c}")

Values From LSR Method
m1 = 0.05546477046955868
m2 = 0.1244316555033846
m3 = 0.03832399510524219
c  = 2.9090921081536107
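
Each slope above comes from a one-predictor regression of Sales on a single channel, so the three slopes are estimated separately and then combined. The sketch below checks that slope formula against `np.polyfit`, using only the five preview rows from `df.head(5)` (not the full dataset). `simple_slope` is a hypothetical helper name introduced for illustration.

```python
import numpy as np

def simple_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    numerator = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    denominator = n * np.sum(x ** 2) - np.sum(x) ** 2
    return numerator / denominator

# Five preview rows only; illustrative, not the full dataset.
tv = [230.1, 44.5, 17.2, 151.5, 180.8]
sales = [22.1, 10.4, 12.0, 16.5, 17.9]

slope_formula = simple_slope(tv, sales)
# A degree-1 polyfit returns [slope, intercept].
slope_polyfit = np.polyfit(tv, sales, 1)[0]
print(slope_formula, slope_polyfit)
```

Because the channels are fitted one at a time, any correlation between them is ignored, which is one plausible reason this combined model scores lower than the joint fits later in the notebook.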


In [9]:
print("R squared and Adjusted R squared value of LSR")
Y_pred=[]
for i in range(n):
  y_hat=m1*X1[i]+m2*X2[i]+m3*X3[i]+c
  Y_pred.append(y_hat)
R_sq=r2_score(Y,Y_pred)
Adj_R_sq=1-((1-R_sq)*(n-1)/(n-p-1))

print(f"R_squared = {R_sq}\nAdjusted R_square = {Adj_R_sq}")


R squared and Adjusted R squared value of LSR
R_squared = 0.869588724038402
Adjusted R_square = 0.8675926330798062
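
The same two metrics are recomputed for every model in this notebook, so they can be wrapped in small helpers. This is a sketch only: `r_squared` and `adjusted_r2` are hypothetical names, and `n = 200` is an assumption (the notebook does not print `n`; 200 is the value that reproduces the reported adjusted figures with `p = 3`).

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(r2, n, p):
    """Adjusted R squared for n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Reported LSR R squared from the output above; n = 200 is assumed.
lsr_adjusted = adjusted_r2(0.869588724038402, n=200, p=3)
print(lsr_adjusted)
```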


In [10]:
X1X2=np.multiply(X1,X2)
X2X3=np.multiply(X2,X3)
X1X3=np.multiply(X1,X3)

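**Gradient Descent**

Note: the cell below does not iterate. It sets the gradient of the squared-error loss to zero and solves the resulting four equations exactly with `sympy`, which gives the point that an iterative gradient descent would converge to.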

In [12]:
print("Values From GD Method")
gd_m1,gd_m2,gd_m3,gd_c = symbols('gd_m1,gd_m2,gd_m3,gd_c')
eq1=Eq(gd_m1*sum(X1sq)+gd_m2*sum(X1X2)+gd_m3*sum(X1X3)+gd_c*sum(X1)-sum(X1Y),0)
eq2=Eq(gd_m2*sum(X2sq)+gd_m1*sum(X1X2)+gd_m3*sum(X2X3)+gd_c*sum(X2)-sum(X2Y),0)
eq3=Eq(gd_m3*sum(X3sq)+gd_m1*sum(X1X3)+gd_m2*sum(X2X3)+gd_c*sum(X3)-sum(X3Y),0)
eq4=Eq(gd_m1*sum(X1)+gd_m2*sum(X2)+gd_m3*sum(X3)+n*gd_c-sum(Y),0)

sol_eq=solve((eq1,eq2,eq3,eq4),(gd_m1,gd_m2,gd_m3,gd_c))
gd_m1=sol_eq[gd_m1]
gd_m2=sol_eq[gd_m2]
gd_m3=sol_eq[gd_m3]
gd_c =sol_eq[gd_c]
print(f"m1 = {gd_m1}\nm2 = {gd_m2}\nm3 = {gd_m3}\nc  = {gd_c}")


Values From GD Method
m1 = 0.0544457803375709
m2 = 0.107001228238703
m3 = 0.000335657922331233
c  = 4.62512407880865
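
For comparison, here is a minimal sketch of an iterative gradient descent on the same kind of model, checked against a direct least-squares solve. It is not the notebook's method. The data are synthetic stand-ins (not the advertising dataset), and the function names, learning rate, and iteration count are all assumptions chosen for illustration. Features are standardized inside the loop so that a single learning rate works for all of them.

```python
import numpy as np

def fit_exact(X, y):
    """Direct least-squares fit with intercept; returns [m1, ..., mp, c]."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def fit_gradient_descent(X, y, lr=0.1, n_iter=5000):
    """Batch gradient descent on mean squared error; returns [m1, ..., mp, c]."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sigma              # standardized features
    n = len(y)
    w = np.zeros(X.shape[1])          # slopes in standardized units
    b = 0.0                           # intercept in standardized units
    for _ in range(n_iter):
        resid = Z @ w + b - y
        w -= lr * (2.0 / n) * (Z.T @ resid)
        b -= lr * (2.0 / n) * resid.sum()
    m = w / sigma                     # convert slopes back to original units
    c = b - np.sum(m * mu)            # convert intercept back as well
    return np.append(m, c)

# Synthetic stand-in data, NOT the advertising dataset.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 100, size=(200, 3))
y_demo = 0.05 * X_demo[:, 0] + 0.1 * X_demo[:, 1] + 4.6 + rng.normal(0, 1, 200)

coef_exact = fit_exact(X_demo, y_demo)
coef_gd = fit_gradient_descent(X_demo, y_demo)
print(coef_exact)
print(coef_gd)
```

On a well-conditioned problem like this one, the two coefficient vectors should agree closely, which is why solving the equations directly is a reasonable substitute for iterating.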


In [21]:
print("R squared and Adjusted R squared value of GD")
gd_Y_pred=[]
for i in range(n):
  gd_y_hat=(gd_m1*X1[i]+gd_m2*X2[i]+gd_m3*X3[i]+gd_c)
  gd_Y_pred.append(gd_y_hat)
gd_R_sq=r2_score(Y,gd_Y_pred)
gd_Adj_R_sq=1-((1-gd_R_sq)*(n-1)/(n-p-1))

print(f"R_squared = {gd_R_sq}\nAdjusted R_square = {gd_Adj_R_sq}")

R squared and Adjusted R squared value of GD
R_squared = 0.9025912899684558
Adjusted R_square = 0.9011003403251159




**Euler's Form**

In [16]:
x1_sq=sum(X1sq)-sum(X1)**2/n
x2_sq=sum(X2sq)-sum(X2)**2/n
x3_sq=sum(X3sq)-sum(X3)**2/n
x1_y=sum(X1Y)-sum(X1)*sum(Y)/n
x2_y=sum(X2Y)-sum(X2)*sum(Y)/n
x3_y=sum(X3Y)-sum(X3)*sum(Y)/n
x1_x2=sum(X1X2)-sum(X1)*sum(X2)/n
x1_x3=sum(X1X3)-sum(X1)*sum(X3)/n
x2_x3=sum(X2X3)-sum(X2)*sum(X3)/n


In [17]:
print("Values From Euler's Form")
ef_m1,ef_m2,ef_m3=symbols('ef_m1,ef_m2,ef_m3')
ef_eq1=Eq((ef_m1*x1_sq + ef_m2*x1_x2 + ef_m3*x1_x3 - x1_y),0)
ef_eq2=Eq((ef_m2*x2_sq + ef_m1*x1_x2 + ef_m3*x2_x3 - x2_y),0)
ef_eq3=Eq((ef_m3*x3_sq + ef_m1*x1_x3 + ef_m2*x2_x3 - x3_y),0)

ef_sol = solve((ef_eq1,ef_eq2,ef_eq3),(ef_m1,ef_m2,ef_m3))

ef_m1=ef_sol[ef_m1]
ef_m2=ef_sol[ef_m2]
ef_m3=ef_sol[ef_m3]

ef_c=(sum(Y)-ef_m1*sum(X1)-ef_m2*sum(X2)-ef_m3*sum(X3))/n
print(f"m1 = {ef_m1}\nm2 = {ef_m2}\nm3 = {ef_m3}\nc  = {ef_c}")


Values From Euler's Form
m1 = 0.0544457803375708
m2 = 0.107001228238702
m3 = 0.000335657922330636
c  = 4.62512407880869
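
The coefficients match the previous section to many digits because subtracting the means removes the intercept from the system, leaving three equations in the slopes, with the intercept recovered afterwards. A numeric sketch of that equivalence follows. The data are synthetic (not the advertising dataset), and `fit_centered` and `fit_full` are hypothetical helper names.

```python
import numpy as np

def fit_centered(X, y):
    """Solve the centered (deviation-form) equations, then recover the intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)           # deviations from the column means
    yc = y - y.mean()
    # Entries of Xc.T @ Xc equal sum(xi*xj) - sum(xi)*sum(xj)/n
    m = np.linalg.solve(Xc.T @ Xc, Xc.T @ yc)
    c = y.mean() - X.mean(axis=0) @ m
    return m, c

def fit_full(X, y):
    """Solve the full normal equations with the intercept as a fourth unknown."""
    A = np.column_stack([X, np.ones(len(y))])
    sol = np.linalg.solve(A.T @ A, A.T @ y)
    return sol[:-1], sol[-1]

# Synthetic stand-in data, NOT the advertising dataset.
rng = np.random.default_rng(1)
X_demo = rng.uniform(0, 100, size=(200, 3))
y_demo = 0.05 * X_demo[:, 0] + 0.1 * X_demo[:, 1] + 4.6 + rng.normal(0, 1, 200)

m_centered, c_centered = fit_centered(X_demo, y_demo)
m_full, c_full = fit_full(X_demo, y_demo)
print(m_centered, c_centered)
print(m_full, c_full)
```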


In [18]:
print("R squared and Adjusted R squared value of Euler's Form")
ef_Y_pred=[]
for i in range(n):
  # Convert the sympy result to a plain Python float before scoring
  ef_y_hat=float(ef_m1*X1[i]+ef_m2*X2[i]+ef_m3*X3[i]+ef_c)
  ef_Y_pred.append(ef_y_hat)
ef_R_sq=r2_score(Y,ef_Y_pred)
ef_Adj_R_sq=1-((1-ef_R_sq)*(n-1)/(n-p-1))

print(f"R_squared = {ef_R_sq}\nAdjusted R_square = {ef_Adj_R_sq}")

R squared and Adjusted R squared value of Euler's Form
R_squared = 0.9025912899684558
Adjusted R_square = 0.9011003403251159




**Correlation Method**

In [24]:
print("Values From Correlation")
r1,_=pearsonr(X1,Y)
r2,_=pearsonr(X2,Y)
r3,_=pearsonr(X3,Y)

cr_m1=r1*(sum(Y)/sum(X1))
cr_m2=r2*(sum(Y)/sum(X2))
cr_m3=r3*(sum(Y)/sum(X3))
cr_c=(sum(Y)-cr_m1*sum(X1)-cr_m2*sum(X2)-cr_m3*sum(X3))/n

print(f"m1 = {cr_m1}\nm2 = {cr_m2}\nm3 = {cr_m3}\nc  = {cr_c}")

Values From Correlation
m1 = 0.09273323244790398
m2 = 0.2273939697934377
m3 = 0.07822262799427612
c  = -6.185333821232566
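
For reference, the textbook link between Pearson's r and a single-predictor slope is slope = r * s_y / s_x, where s denotes a standard deviation. The cell above scales r by a ratio of sums instead, which is a different quantity. The sketch below compares the two on the five preview rows from `df.head(5)` only (not the full dataset); the variable names are introduced for illustration.

```python
import numpy as np

# Five preview rows only; illustrative, not the full dataset.
tv = np.array([230.1, 44.5, 17.2, 151.5, 180.8])
sales = np.array([22.1, 10.4, 12.0, 16.5, 17.9])

r = np.corrcoef(tv, sales)[0, 1]

# Textbook form: r times the ratio of standard deviations.
slope_from_r = r * sales.std() / tv.std()

# Reference value: slope of a degree-1 least-squares fit.
slope_polyfit = np.polyfit(tv, sales, 1)[0]

# Variant used in the cell above: r times the ratio of sums.
slope_sum_ratio = r * sales.sum() / tv.sum()

print(slope_from_r, slope_polyfit, slope_sum_ratio)
```

On this small sample the standard-deviation form reproduces the least-squares slope while the sum-ratio form does not, which may help explain the low R squared reported for the Correlation model.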


In [27]:
print("R squared and Adjusted R squared value of Correlation")
cr_Y_pred=[]
for i in range(n):
  cr_y_hat=cr_m1*X1[i]+cr_m2*X2[i]+cr_m3*X3[i]+cr_c
  cr_Y_pred.append(cr_y_hat)
cr_R_sq=r2_score(Y,cr_Y_pred)
cr_Adj_R_sq=1-((1-cr_R_sq)*(n-1)/(n-p-1))

print(f"R_squared = {cr_R_sq}\nAdjusted R_square = {cr_Adj_R_sq}")

R squared and Adjusted R squared value of Correlation
R_squared = 0.1754774775287048
Adjusted R_square = 0.16285723483781755


**Model Evaluation**



1.   Gradient Descent: R_squared = 0.9025912899684558, Adjusted R_square = 0.9011003403251159
2.   MVM Euler's Form: R_squared = 0.9025912899684558, Adjusted R_square = 0.9011003403251159
3.   LSR: R_squared = 0.869588724038402, Adjusted R_square = 0.8675926330798062
4.   Correlation: R_squared = 0.1754774775287048, Adjusted R_square = 0.16285723483781755
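
The scores can also be collected into a small table and sorted, which makes the ranking explicit. This is a sketch using only the values reported in this notebook; the frame and column names are assumptions.

```python
import pandas as pd

# Scores copied from the outputs reported in this notebook.
scores = pd.DataFrame(
    {
        "method": ["Gradient Descent", "MVM Euler's Form", "LSR", "Correlation"],
        "r_squared": [
            0.9025912899684558,
            0.9025912899684558,
            0.869588724038402,
            0.1754774775287048,
        ],
        "adjusted_r_squared": [
            0.9011003403251159,
            0.9011003403251159,
            0.8675926330798062,
            0.16285723483781755,
        ],
    }
)

# Highest R squared first.
ranked = scores.sort_values("r_squared", ascending=False, kind="stable")
ranked = ranked.reset_index(drop=True)
print(ranked)
```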







The four estimation methods give different results on this dataset. Ranked by R squared and adjusted R squared:
GD = MVM Euler's Form > LSR > Correlation

GD and Euler's Form perform best and give identical scores, since both solve the same least-squares equations. LSR performs well but not as well as those two, and the Correlation model is a poor fit for the advertisement dataset.

The analysis suggests that television and radio advertising have a stronger relationship with sales than newspaper advertising: in the GD and Euler's Form fits, the newspaper coefficient (about 0.0003) is far smaller than the coefficients for TV (about 0.054) and radio (about 0.107). These findings can help in formulating optimized advertising strategies for better market penetration and sales performance.