## 2.8.1 Exercises
**Exercise 2** For the marketing dataset, carry out hypothesis tests using the ANOVA table on:
1) One of the parameters, justifying which one.
2) All the parameters.
3) A subset of the parameters, justifying which one.

Use the ANOVA table to compute the statistic $F_0$ and find the associated p-value using pf(F_0, df1, df2, lower.tail = FALSE).

Justify your answer and upload your Python code to GitHub.
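
The exercise quotes R's `pf(F_0, df1, df2, lower.tail = FALSE)`. Since the solution is written in Python, the following is a minimal sketch of the equivalent upper-tail call, assuming SciPy's `stats.f.sf` (the survival function of the F distribution). The helper name `upper_tail_p_value` and the numbers passed to it are illustrative and are not part of the notebook.

```python
from scipy import stats


def upper_tail_p_value(f0, df1, df2):
    """P(F > f0) for an F(df1, df2) distribution, like R's pf(..., lower.tail = FALSE)."""
    return stats.f.sf(f0, df1, df2)


# Illustrative values only: an F statistic of 4.0 with 2 and 30 degrees of freedom.
p_value = upper_tail_p_value(4.0, 2, 30)

# The survival function is the complement of the CDF.
assert abs(p_value - (1.0 - stats.f.cdf(4.0, 2, 30))) < 1e-12
print(f"p-value: {p_value:.4f}")
```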

## 0. Imports

In [90]:
import pandas as pd

import numpy as np
from scipy import stats

import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

from itertools import combinations

In [2]:
DatasetMarketing = pd.read_csv('./marketing_dataset.csv')

DatasetMarketing.head()

| | youtube | facebook | newspaper | sales |
|---|---|---|---|---|
| 0 | 276.12 | 45.36 | 83.04 | 26.52 |
| 1 | 53.4 | 47.16 | 54.12 | 12.48 |
| 2 | 20.64 | 55.08 | 83.16 | 11.16 |
| 3 | 181.8 | 49.56 | 70.2 | 22.2 |
| 4 | 216.96 | 12.96 | 70.08 | 15.48 |


## 1. Simple Model

An F test comparing the null model (intercept only) against each of the possible simple models shows that the best simple model is the one that uses `youtube`: it has the largest $F_0$ (312.14) and the smallest p-value. Its summary also shows that both of its coefficients are significant.
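
For reference, the comparison between a reduced model $R$ and a larger model $C$ that contains it uses the partial F statistic. Here $SCE$ is the residual sum of squares and $df$ the residual degrees of freedom; the subscripts $R$ and $C$ are notation introduced only for this note.

$$
F_0 = \frac{(SCE_R - SCE_C)/(df_R - df_C)}{SCE_C/df_C} \sim F_{df_R - df_C,\; df_C} \quad \text{under } H_0 .
$$

For `youtube` against the null model, using the sums of squares reported in the ANOVA table of this section ($SCE_R = 4773.05016 + 3027.64404 = 7800.69420$, $df_R = 199$, $SCE_C = 3027.64404$, $df_C = 198$):

$$
F_0 = \frac{(7800.69420 - 3027.64404)/1}{3027.64404/198} = \frac{4773.05016}{15.291132} \approx 312.14 .
$$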

In [79]:
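def CompareLinearModels(
        TableAnova_1,
        TableAnova_2,
    ):
    """
    Partial F test between two nested models from their ANOVA tables.
    TableAnova_1 belongs to the reduced model and TableAnova_2 to the larger one.
    Returns the statistic F0 and its upper-tail p-value.
    """
    # The last row of each table is the residual: column 0 is df, column 1 is sum_sq
    SCE_1 = TableAnova_1.iloc[-1,1]
    Df_1 = TableAnova_1.iloc[-1,0]

    SCE_2 = TableAnova_2.iloc[-1,1]
    Df_2 = TableAnova_2.iloc[-1,0]

    # Drop in residual sum of squares per extra parameter, scaled by the larger model's MSE
    F0 = ((SCE_1-SCE_2)/(Df_1-Df_2))/(SCE_2/Df_2)
    return F0 , stats.f.sf(F0,Df_1-Df_2,Df_2)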

In [3]:
# Every column except the last is a predictor; the last column (sales) is the target
*FeaturesLabels , TargetLabel = DatasetMarketing.columns

In [14]:
# Intercept-only model, used as the baseline in the F tests
LinearModel_Nulo = smf.ols(f"{TargetLabel} ~ 1",DatasetMarketing).fit()
TableAnova_Nulo = anova_lm(LinearModel_Nulo)

In [89]:
for _feature in FeaturesLabels:
    _linear_model = smf.ols(f"{TargetLabel} ~ {_feature}",DatasetMarketing).fit()
    _table_anova = anova_lm(_linear_model)

    _F0_feature , _p_F_feature = CompareLinearModels(TableAnova_Nulo,_table_anova)
    print(f"{_feature} F_0 :: {_F0_feature:.8f} p-valor :: {_p_F_feature}")

youtube F_0 :: 312.14499437 p-valor :: 1.4673897001945903e-42
facebook F_0 :: 98.42158757 p-valor :: 4.354966001766402e-19
newspaper F_0 :: 10.88729908 p-valor :: 0.0011481958688881629
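
The manual comparison in `CompareLinearModels` can be cross-checked against statsmodels itself. The sketch below runs on synthetic data (the names `demo`, `x1`, `x2` and `y` are made up for illustration and are not the marketing dataset). It passes two nested fitted models to `anova_lm` and also calls `compare_f_test` on the larger model; all three routes should give the same F statistic.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Synthetic data for illustration only; not the marketing dataset.
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
demo["y"] = 1.0 + 2.0 * demo["x1"] + 0.5 * demo["x2"] + rng.normal(size=n)

restricted = smf.ols("y ~ x1", demo).fit()
full = smf.ols("y ~ x1 + x2", demo).fit()

# Passing nested models (smallest first) makes anova_lm compare them directly.
comparison = anova_lm(restricted, full)
f_from_table = comparison["F"].iloc[1]
p_from_table = comparison["Pr(>F)"].iloc[1]

# Same statistic from residual sums of squares and residual degrees of freedom.
f_manual = ((restricted.ssr - full.ssr) / (restricted.df_resid - full.df_resid)) / (
    full.ssr / full.df_resid
)

# compare_f_test returns (F statistic, p-value, difference in degrees of freedom).
f_direct, p_direct, df_diff = full.compare_f_test(restricted)

print(comparison)
```

Using the built-in comparison avoids indexing ANOVA tables by position, which is the part of the manual approach most likely to break if the table layout changes.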


In [56]:
LinearModel_Simple = smf.ols(f"{TargetLabel} ~ youtube ",DatasetMarketing).fit()
LinearModel_Simple.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | sales | R-squared: | 0.612 |
| Model: | OLS | Adj. R-squared: | 0.61 |
| Method: | Least Squares | F-statistic: | 312.1 |
| Date: | Mon, 01 Sep 2025 | Prob (F-statistic): | 1.47e-42 |
| Time: | 17:13:16 | Log-Likelihood: | -555.51 |
| No. Observations: | 200 | AIC: | 1115.0 |
| Df Residuals: | 198 | BIC: | 1122.0 |
| Df Model: | 1 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Intercept | 8.4391 | 0.549 | 15.360 | 0.000 | 7.356 | 9.523 |
| youtube | 0.0475 | 0.003 | 17.668 | 0.000 | 0.042 | 0.053 |

| | | | |
|---|---|---|---|
| Omnibus: | 0.531 | Durbin-Watson: | 1.935 |
| Prob(Omnibus): | 0.767 | Jarque-Bera (JB): | 0.669 |
| Skew: | -0.089 | Prob(JB): | 0.716 |
| Kurtosis: | 2.779 | Cond. No. | 406.0 |


In [58]:
TableAnova_Simple = anova_lm(LinearModel_Simple)
TableAnova_Simple

| | df | sum_sq | mean_sq | F | PR(>F) |
|---|---|---|---|---|---|
| youtube | 1.0 | 4773.05016 | 4773.05016 | 312.144994 | 1.46739e-42 |
| Residual | 198.0 | 3027.64404 | 15.291132 | | |


## 2. Full Model

The overall F test, which compares the null model against the full model, is highly significant ($F_0 \approx 570.27$), so using all the variables improves the prediction of `sales`. The $R^2_{adj}$ metric also rises from 0.61 for the best simple model to 0.896. Only `newspaper` does not contribute significantly to the model (p-value 0.860 in the summary), so it could be removed without losing information useful for the model's predictions.

In [59]:
LinearModel_Completo = smf.ols(
    f"{TargetLabel} ~ " + ' + '.join(FeaturesLabels),
    data = DatasetMarketing,
).fit()

LinearModel_Completo.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | sales | R-squared: | 0.897 |
| Model: | OLS | Adj. R-squared: | 0.896 |
| Method: | Least Squares | F-statistic: | 570.3 |
| Date: | Mon, 01 Sep 2025 | Prob (F-statistic): | 1.58e-96 |
| Time: | 17:19:03 | Log-Likelihood: | -422.65 |
| No. Observations: | 200 | AIC: | 853.3 |
| Df Residuals: | 196 | BIC: | 866.5 |
| Df Model: | 3 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Intercept | 3.5267 | 0.374 | 9.422 | 0.000 | 2.789 | 4.265 |
| youtube | 0.0458 | 0.001 | 32.809 | 0.000 | 0.043 | 0.049 |
| facebook | 0.1885 | 0.009 | 21.893 | 0.000 | 0.172 | 0.206 |
| newspaper | -0.0010 | 0.006 | -0.177 | 0.860 | -0.013 | 0.011 |

| | | | |
|---|---|---|---|
| Omnibus: | 60.414 | Durbin-Watson: | 2.084 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 151.241 |
| Skew: | -1.327 | Prob(JB): | 1.44e-33 |
| Kurtosis: | 6.332 | Cond. No. | 545.0 |


In [65]:
TableANova_Completo = anova_lm(LinearModel_Completo)
TableANova_Completo

| | df | sum_sq | mean_sq | F | PR(>F) |
|---|---|---|---|---|---|
| youtube | 1.0 | 4773.05016 | 4773.05016 | 1166.730757 | 1.809337e-84 |
| facebook | 1.0 | 2225.687908 | 2225.687908 | 544.050126 | 1.882722e-58 |
| newspaper | 1.0 | 0.127753 | 0.127753 | 0.031228 | 0.8599151 |
| Residual | 196.0 | 801.828379 | 4.090961 | | |


In [88]:
# Null model against the full model (overall significance of the regression)
_F0_Completo , _p_F_Completo = CompareLinearModels(TableAnova_Nulo,TableANova_Completo)
print(f"Completo F_0 :: {_F0_Completo:.8f} p-valor :: {_p_F_Completo}")

Completo F_0 :: 570.27070366 p-valor :: 1.5752272560924532e-96
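
As a check on the overall test, the same $F_0$ can be recomputed by hand from the rows of the sequential ANOVA table of the full model. The snippet below is a minimal sketch that hard-codes the sums of squares reported in that table; the variable names are illustrative.

```python
from scipy import stats

# Values copied from the sequential ANOVA table of the full model.
ss_terms = {"youtube": 4773.05016, "facebook": 2225.687908, "newspaper": 0.127753}
ss_residual, df_residual = 801.828379, 196

ss_regression = sum(ss_terms.values())  # explained sum of squares
df_regression = len(ss_terms)           # one degree of freedom per predictor

# Regression mean square over residual mean square.
f0_overall = (ss_regression / df_regression) / (ss_residual / df_residual)
p_overall = stats.f.sf(f0_overall, df_regression, df_residual)

print(f"F_0 = {f0_overall:.2f}, p-value = {p_overall:.3g}")
```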


In [None]:
SCE_Completo = TableANova_Completo.iloc[-1,1]
Df_Completo = TableANova_Completo.iloc[-1,0]
SCE_newspaper = TableANova_Completo.iloc[2,1]
Df_newspaper = TableANova_Completo.iloc[2,0]

# newspaper is the last sequential term, so its partial F is its mean square over the residual mean square
_F0_Completo_Newspaper = (SCE_newspaper/Df_newspaper)/(SCE_Completo/Df_Completo)
_p_F_Completo_Newspaper = stats.f.sf(_F0_Completo_Newspaper,Df_newspaper,Df_Completo)
print(f"Newspaper en Completo F_0 :: {_F0_Completo_Newspaper:.6f} p-valor :: {_p_F_Completo_Newspaper:.4f}")

Newspaper en Completo F_0 :: 0.031228 p-valor :: 0.8599
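
For a single term entered last in the sequential ANOVA table, the partial F statistic is the term's mean square divided by the residual mean square, and it equals the square of the coefficient's t statistic. With the values reported for `newspaper` in the full-model ANOVA table and summary:

$$
F_0 = \frac{0.127753/1}{801.828379/196} = \frac{0.127753}{4.090961} \approx 0.0312, \qquad t^2 = (-0.177)^2 \approx 0.0313 .
$$

The small difference comes from the rounding of $t$ in the summary. The p-value (0.860) is the same in both tables.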


## 3. Best-Subset Model

The best subset is the one formed by the variables `youtube` and `facebook`. There is no significant loss of information relative to the previous best model (the full model), since the F test against it gives $F_0 \approx 0.031$ with a p-value of about 0.86. The $R^2_{adj}$ values of the two models agree to three decimals (0.896), every coefficient is significant, and the regression as a whole is significant.

In [95]:
for _features_set in combinations(FeaturesLabels,2):
    _linear_model = smf.ols(
        f"{TargetLabel} ~ " + ' + '.join(_features_set),
        DatasetMarketing,
    ).fit()
    _table_anova = anova_lm(_linear_model)

    # Reduced two-variable model against the full model; rsquared_adj is the adjusted R^2
    _F0_feature , _p_F_feature = CompareLinearModels(_table_anova,TableANova_Completo)
    print(f"{_features_set} F_0 :: {_F0_feature:.8f} p-valor :: {_p_F_feature} adj. r^2 {_linear_model.rsquared_adj}")

('youtube', 'facebook') F_0 :: 0.03122805 p-valor :: 0.8599150500805359 adj. r^2 0.8961505479974428
('youtube', 'newspaper') F_0 :: 479.32516964 p-valor :: 1.5053389205756397e-54 adj. r^2 0.6422399150864775
('facebook', 'newspaper') F_0 :: 1076.40583684 p-valor :: 1.5099599548144894e-81 adj. r^2 0.32593061728991957
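
The near-identical $R^2_{adj}$ of the full model and the `youtube` + `facebook` model can be reproduced from the sums of squares in the full-model ANOVA table. The sketch below assumes the usual definition $R^2_{adj} = 1 - \frac{SCE/(n-p-1)}{SCT/(n-1)}$, with $SCT$ the total sum of squares and $p$ the number of predictors; the function name `adjusted_r_squared` is illustrative.

```python
def adjusted_r_squared(sse, sst, n_obs, n_predictors):
    """Adjusted R^2 from the residual (sse) and total (sst) sums of squares."""
    return 1.0 - (sse / (n_obs - n_predictors - 1)) / (sst / (n_obs - 1))


# Values taken from the full-model ANOVA table.
ss_youtube, ss_facebook, ss_newspaper = 4773.05016, 2225.687908, 0.127753
sse_full = 801.828379
sst = ss_youtube + ss_facebook + ss_newspaper + sse_full

# Dropping newspaper (the last sequential term) moves its sum of squares into the residual.
sse_subset = sse_full + ss_newspaper

adj_full = adjusted_r_squared(sse_full, sst, n_obs=200, n_predictors=3)
adj_subset = adjusted_r_squared(sse_subset, sst, n_obs=200, n_predictors=2)
print(f"full: {adj_full:.4f}, youtube + facebook: {adj_subset:.4f}")
```

The subset model comes out marginally higher because removing `newspaper` gives back one residual degree of freedom while adding almost nothing to the residual sum of squares.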


In [96]:
LinearModel_Subset = smf.ols(f"{TargetLabel} ~ youtube + facebook",DatasetMarketing).fit()
LinearModel_Subset.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | sales | R-squared: | 0.897 |
| Model: | OLS | Adj. R-squared: | 0.896 |
| Method: | Least Squares | F-statistic: | 859.6 |
| Date: | Mon, 01 Sep 2025 | Prob (F-statistic): | 4.83e-98 |
| Time: | 17:49:37 | Log-Likelihood: | -422.66 |
| No. Observations: | 200 | AIC: | 851.3 |
| Df Residuals: | 197 | BIC: | 861.2 |
| Df Model: | 2 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Intercept | 3.5053 | 0.353 | 9.919 | 0.000 | 2.808 | 4.202 |
| youtube | 0.0458 | 0.001 | 32.909 | 0.000 | 0.043 | 0.048 |
| facebook | 0.1880 | 0.008 | 23.382 | 0.000 | 0.172 | 0.204 |

| | | | |
|---|---|---|---|
| Omnibus: | 60.022 | Durbin-Watson: | 2.081 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 148.679 |
| Skew: | -1.323 | Prob(JB): | 5.19e-33 |
| Kurtosis: | 6.292 | Cond. No. | 510.0 |
