# SVR - Exercise - Advertising and Sales

**Context**  
This dataset records the sales obtained as a function of the amount invested in advertising across different channels.

**Content**  
The dataset comes from Kaggle: [Advertising dataset](https://www.kaggle.com/datasets/tawfikelmetwally/advertising-dataset).  
It contains 200 rows with the following columns:
* TV: TV advertising spend, in thousands of USD
* Radio: radio advertising spend, in thousands of USD
* Newspaper: newspaper advertising spend, in thousands of USD
* **Sales**: sales obtained, in millions of USD **-> (target variable)**

**Problem statement**  
The goal is to determine which channels have the greatest impact on sales, by modeling sales as a function of the advertising spend in each channel.

In [1]:
# Import libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn import metrics

## Load Data

In [2]:
# Load the data
df = pd.read_csv('Advertising.csv')
df.head()

,Unnamed: 0,TV,Radio,Newspaper,Sales
0,1,230.1,37.8,69.2,22.1
1,2,44.5,39.3,45.1,10.4
2,3,17.2,45.9,69.3,9.3
3,4,151.5,41.3,58.5,18.5
4,5,180.8,10.8,58.4,12.9


In [3]:
# Rename columns (the first column, 'renglón', is the row number from the CSV)
df.columns = ['renglón', 'tv', 'radio', 'periodico', 'ventas']
df.head()

,renglón,tv,radio,periodico,ventas
0,1,230.1,37.8,69.2,22.1
1,2,44.5,39.3,45.1,10.4
2,3,17.2,45.9,69.3,9.3
3,4,151.5,41.3,58.5,18.5
4,5,180.8,10.8,58.4,12.9


## Normalization

In [4]:
# Independent variables
X = df[['tv', 'radio', 'periodico']]
X.head()

,tv,radio,periodico
0,230.1,37.8,69.2
1,44.5,39.3,45.1
2,17.2,45.9,69.3
3,151.5,41.3,58.5
4,180.8,10.8,58.4


In [5]:
# Standardize the features (zero mean, unit variance per column)
scaler = StandardScaler()
X_adj = scaler.fit_transform(X)
print(X_adj)

[[ 9.69852266e-01  9.81522472e-01  1.77894547e+00]
 [-1.19737623e+00  1.08280781e+00  6.69578760e-01]
 [-1.51615499e+00  1.52846331e+00  1.78354865e+00]
 [ 5.20496822e-02  1.21785493e+00  1.28640506e+00]
 [ 3.94182198e-01 -8.41613655e-01  1.28180188e+00]
 [-1.61540845e+00  1.73103399e+00  2.04592999e+00]
 [-1.04557682e+00  6.43904671e-01 -3.24708413e-01]
 [-3.13436589e-01 -2.47406325e-01 -8.72486994e-01]
 [-1.61657614e+00 -1.42906863e+00 -1.36042422e+00]
 [ 6.16042873e-01 -1.39530685e+00 -4.30581584e-01]
 [-9.45155670e-01 -1.17923146e+00 -2.92486143e-01]
 [ 7.90028350e-01  4.96973404e-02 -1.22232878e+00]
 [-1.43908760e+00  7.99208859e-01  1.62704048e+00]
 [-5.78501712e-01 -1.05768905e+00 -1.07502697e+00]
 [ 6.66253447e-01  6.50657027e-01  7.11007392e-01]
 [ 5.64664612e-01  1.65000572e+00  1.02862691e+00]
 [-9.25304978e-01  9.00494200e-01  3.84117072e+00]
 [ 1.56887609e+00  1.10306488e+00  1.16211917e+00]
 [-9.08957349e-01 -1.86635121e-01 -5.64073843e-01]
 [ 3.00679600e-03  4.29449843e-
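As a rough illustration of what `StandardScaler` does in the cell above, the sketch below reproduces the transformation by hand. It uses only the five rows shown by `X.head()`, so the statistics (and therefore the scaled values) differ from the full 200-row output; the name `X_sample` is introduced here purely for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# First five rows of (tv, radio, periodico), copied from X.head().
# Assumption: this small sample stands in for the full 200-row X.
X_sample = np.array([
    [230.1, 37.8, 69.2],
    [44.5, 39.3, 45.1],
    [17.2, 45.9, 69.3],
    [151.5, 41.3, 58.5],
    [180.8, 10.8, 58.4],
])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_sample)

# Same computation by hand: subtract each column's mean and divide by its
# population standard deviation (ddof=0, which is what StandardScaler uses).
X_manual = (X_sample - X_sample.mean(axis=0)) / X_sample.std(axis=0)

print(np.allclose(X_scaled, X_manual))
print(X_scaled.mean(axis=0).round(6))
print(X_scaled.std(axis=0).round(6))
```

Scaling matters for SVR because kernels such as RBF depend on distances between samples, so a feature with a much larger range would otherwise dominate.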

In [6]:
# Dependent variable
y = df[['ventas']]
y.head()

,ventas
0,22.1
1,10.4
2,9.3
3,18.5
4,12.9


In [7]:
# Standardize the target, then flatten it to a 1-D array
scaler_y = StandardScaler()
y_adj = scaler_y.fit_transform(y)
y_adj = y_adj.ravel()
print(y_adj)

[ 1.55205313 -0.69604611 -0.90740587  0.86033029 -0.21568303 -1.31091086
 -0.42704278 -0.15803946 -1.77205942 -0.65761706 -1.04190753  0.64897053
 -0.92662039 -0.83054778  0.9564029   1.6096967  -0.29254112  1.99398717
 -0.5231154   0.11096387  0.76425767 -0.29254112 -1.61834324  0.28389458
 -0.83054778 -0.38861374  0.18782197  0.36075268  0.93718838 -0.67683159
  1.41755147 -0.40782826 -0.8497623   0.64897053 -0.86897682 -0.23489755
  2.1861324   0.1301784  -0.75368968  1.43676599  0.49525434  0.59132696
  1.2830498  -0.21568303 -1.06112206  0.16860744 -0.65761706  1.76341289
  0.14939292 -0.83054778 -0.50390088 -0.63840254  1.64812575  1.37912242
  1.18697718  1.8594855  -1.63755776 -0.15803946  1.87870003  0.84111576
 -1.13798015  1.95555812  0.32232363 -0.00432327  0.76425767 -0.90740587
 -0.86897682 -0.11961041  0.93718838  1.59048218  0.82190124 -0.31175564
 -1.00347849 -0.58075897  0.57211244 -1.02269301 -1.36855443  0.03410578
 -1.67598681 -0.58075897 -0.42704278 -0.33097017 -0
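The target goes through two shapes in the cell above: `StandardScaler` works on a 2-D column, while `ravel()` yields the 1-D array that is passed to the model later. The sketch below traces those shapes and shows one way to map standardized values back to the original sales units. It uses only the five values from `y.head()`, and `y_sample`, `y_flat` and `y_restored` are names introduced here for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# First five 'ventas' values from y.head(), as a single-column 2-D array.
# Assumption: this small sample stands in for the full target column.
y_sample = np.array([[22.1], [10.4], [9.3], [18.5], [12.9]])

scaler_y = StandardScaler()
y_scaled = scaler_y.fit_transform(y_sample)  # 2-D, shape (5, 1)
y_flat = y_scaled.ravel()                    # 1-D, shape (5,)

# inverse_transform needs a 2-D column again, hence the reshape.
y_restored = scaler_y.inverse_transform(y_flat.reshape(-1, 1)).ravel()

print(y_scaled.shape, y_flat.shape)
print(y_restored)
```

The same `inverse_transform` call can be applied to model predictions if results are wanted in the original sales units rather than in standardized units.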

In [8]:
print('X:', len(X_adj), 'y:', len(y_adj))

X: 200 y: 200


## Modeling

In [9]:
# Training and test sets
## Write your code here

In [10]:
print('X_train:', len(X_train), 'y_train:', len(y_train))
print('X_test:',  len(X_test),  'y_test:',  len(y_test))

X_train: 140 y_train: 140
X_test: 60 y_test: 60
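The 140/60 sizes printed above are consistent with a 70/30 split of the 200 rows. One possible way to obtain them is sketched below with `train_test_split`. The random arrays are stand-ins for `X_adj` and `y_adj` so the snippet runs on its own, and `random_state=42` is an assumed seed, not one taken from the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in arrays with the notebook's shapes (200 rows, 3 features).
# Assumption: in the notebook these would be the real X_adj and y_adj.
rng = np.random.default_rng(0)
X_adj = rng.normal(size=(200, 3))
y_adj = rng.normal(size=200)

# test_size=0.3 gives 60 test rows out of 200; random_state is an assumed
# seed that only makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X_adj, y_adj, test_size=0.3, random_state=42
)

print('X_train:', len(X_train), 'y_train:', len(y_train))
print('X_test:', len(X_test), 'y_test:', len(y_test))
```

A different seed would select different rows, so metrics obtained this way would not necessarily match the ones shown later in the notebook.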


### Select the kernel of your choice

In [None]:
# Training
## Write your code here
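A minimal sketch of the training step, assuming an RBF kernel; the exercise leaves the kernel open, and `'linear'`, `'poly'` and `'sigmoid'` are the other built-in options. The synthetic arrays stand in for `X_train` and `y_train`, and the name `model` is chosen here for illustration.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic standardized stand-in for X_train / y_train.
# Assumption: the notebook would pass the arrays from its own split instead.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(140, 3))
y_train = (0.8 * X_train[:, 0] + 0.5 * X_train[:, 1]
           + rng.normal(scale=0.1, size=140))

# kernel='rbf' is an assumed choice; C and epsilon are written out at
# scikit-learn's default values so they are easy to tune.
model = SVR(kernel='rbf', C=1.0, epsilon=0.1)
model.fit(X_train, y_train)

# R^2 on the training data, as a quick sanity check of the fit.
print(model.score(X_train, y_train))
```

With `kernel='linear'`, the fitted model also exposes `coef_`, one weight per standardized feature, which is one rough way to compare the channels; the other kernels do not provide it.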

In [None]:
# Predictions
## Write your code here

array([-0.69170643, -1.14941575, -0.95559414,  2.06455833, -0.40549575,
       -1.171261  , -0.69345491, -0.14640786, -0.99780644,  0.50531125,
        2.02670284, -0.73456785, -0.51827431,  0.3562584 , -0.41826171,
       -0.18592055,  0.68267741, -1.10083597, -0.00326008,  0.67341734,
        2.01871622, -0.59056446,  0.25712856, -0.14427521, -1.22289883,
        0.31645945, -0.30301723,  1.57995572, -0.4458552 , -1.12730269,
        0.05751972,  1.71844578,  0.22012285,  1.56318565, -1.2533326 ,
       -1.44099393, -0.95207672, -0.22841299, -0.1915274 , -1.3712778 ,
       -0.81046777, -1.19303939,  0.36356672,  0.44447853,  0.50334092,
       -0.10232195, -1.26071444, -0.4460115 ,  0.38691961, -0.76497027,
       -0.44572605,  1.23732163,  0.56893756,  0.28023907, -0.88349034,
       -0.98965799,  1.14955824,  0.0509043 ,  0.37315209, -1.29650462])
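An array like the one above can be produced by calling `predict` on the test features. The sketch below shows the call on synthetic stand-in data with an assumed RBF kernel; because the model is trained on a standardized target, the predictions are in standardized units as well.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic stand-in data with the notebook's split sizes (140 / 60).
# Assumption: the notebook would use its own X_train, X_test and y_train.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 0.8 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)
X_train, X_test = X[:140], X[140:]
y_train, y_test = y[:140], y[140:]

# Fit with an assumed RBF kernel, then predict one value per test row.
model = SVR(kernel='rbf').fit(X_train, y_train)
y_pred = model.predict(X_test)

print(y_pred.shape)
print(y_pred[:5])
```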

## Evaluation

In [None]:
## Write your code here

MAE: 0.15454311970624224
MSE: 0.06304742973839171
RMSE: 0.25109247248452454
R2: 0.9375018846144175
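The four metrics above can be computed with `sklearn.metrics`, which the notebook already imports. The sketch below uses a small hand-made pair of arrays in place of `y_test` and `y_pred` so it runs on its own; the values are illustrative and are not taken from the notebook.

```python
import numpy as np
from sklearn import metrics

# Small illustrative arrays standing in for y_test and y_pred.
# Assumption: in the notebook these come from the split and from predict().
y_test = np.array([0.5, -1.0, 1.5, 0.0])
y_pred = np.array([0.4, -0.8, 1.2, 0.1])

mae = metrics.mean_absolute_error(y_test, y_pred)
mse = metrics.mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)  # RMSE is the square root of MSE
r2 = metrics.r2_score(y_test, y_pred)

print('MAE:', mae)
print('MSE:', mse)
print('RMSE:', rmse)
print('R2:', r2)
```

Since the target was standardized, MAE, MSE and RMSE are expressed in standard deviations of sales rather than in USD; R2 is unaffected by that scaling.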
