In [10]:
import pandas as pd
import numpy as np
import datetime as dt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split, cross_val_score, TimeSeriesSplit
import plotly.express as px

**Loading the file and a first look**

The dataset consists of a time variable, the investment in different media channels, and the sales for each date (one row per week).

In [11]:
data = pd.read_csv('./Data/mmm.csv')
data

Unnamed: 0,Date,TV,Radio,Banners,Sales
0,2018-01-07,13528.10,0.00,0.00,9779.80
1,2018-01-14,0.00,5349.65,2218.93,13245.19
2,2018-01-21,0.00,4235.86,2046.96,12022.66
3,2018-01-28,0.00,3562.21,0.00,8846.95
4,2018-02-04,0.00,0.00,2187.29,9797.07
...,...,...,...,...,...
195,2021-10-03,0.00,0.00,1691.68,9030.17
196,2021-10-10,11543.58,4615.35,2518.88,15904.11
197,2021-10-17,0.00,4556.16,1919.19,12839.29
198,2021-10-24,0.00,0.00,1707.65,9063.45


## EDA

The dataset has no null values.
It contains 5 variables/features:
- 1: time, "Date"
- 2: investment, "TV"
- 3: investment, "Radio"
- 4: investment, "Banners"
- 5: sales, "Sales"
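
The no-nulls claim can be checked with a per-column count of missing values. The sketch below is self-contained, so it rebuilds only the first five rows shown in the table above as an illustrative `sample` frame (an assumption for the example); in the notebook the same call would be applied to the full `data` frame.

```python
import pandas as pd

# Illustrative sample: the first five rows displayed above.
# In the notebook, replace `sample` with the full `data` frame.
sample = pd.DataFrame({
    "Date": ["2018-01-07", "2018-01-14", "2018-01-21", "2018-01-28", "2018-02-04"],
    "TV": [13528.10, 0.00, 0.00, 0.00, 0.00],
    "Radio": [0.00, 5349.65, 4235.86, 3562.21, 0.00],
    "Banners": [0.00, 2218.93, 2046.96, 0.00, 2187.29],
    "Sales": [9779.80, 13245.19, 12022.66, 8846.95, 9797.07],
})

# Count missing values per column; all counts are 0 when nothing is null.
null_counts = sample.isna().sum()
has_nulls = bool(null_counts.any())

print(null_counts)
print("Any nulls:", has_nulls)
```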

In [12]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   Date     200 non-null    object 
 1   TV       200 non-null    float64
 2   Radio    200 non-null    float64
 3   Banners  200 non-null    float64
 4   Sales    200 non-null    float64
dtypes: float64(4), object(1)
memory usage: 7.9+ KB
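
The `info()` output reports `Date` as `object`, meaning the dates are stored as strings. A minimal sketch of converting them and checking the spacing between observations follows; it uses only the first five dates shown earlier (an illustrative subset, not the full column). In the notebook the equivalent would be `data['Date'] = pd.to_datetime(data['Date'])`, or passing `parse_dates=['Date']` to `pd.read_csv`.

```python
import pandas as pd

# Illustrative subset: the first five dates displayed in the table above.
dates = pd.Series(
    ["2018-01-07", "2018-01-14", "2018-01-21", "2018-01-28", "2018-02-04"],
    name="Date",
)

# Convert strings to datetime64 values.
parsed = pd.to_datetime(dates, format="%Y-%m-%d")

# Differences between consecutive observations (first diff is NaT, so drop it).
gaps = parsed.diff().dropna()
is_weekly = bool((gaps == pd.Timedelta(days=7)).all())

print(parsed.dtype)
print("Weekly spacing:", is_weekly)
```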


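The summary statistics below show the following:

- TV: no investment in at least 50% of the weeks (the median is 0)
- Radio: no investment in at least 50% of the weeks (the median is 0)
- Banners: investment in at least 75% of the weeks (the 25th percentile is already positive)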
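
Instead of reading these proportions off the quantiles, the fraction of weeks with zero spend can be computed directly. The helper `zero_share` below is a hypothetical name introduced for this sketch, and the `sample` frame holds only the first five rows shown earlier, so its results illustrate the calculation and do not reproduce the full-dataset proportions.

```python
import pandas as pd

def zero_share(df, columns):
    """Return, per column, the fraction of rows where the value is exactly 0."""
    return (df[columns] == 0).mean()

# Illustrative sample: the first five rows displayed earlier.
# In the notebook, call zero_share(data, ['TV', 'Radio', 'Banners']) instead.
sample = pd.DataFrame({
    "TV": [13528.10, 0.00, 0.00, 0.00, 0.00],
    "Radio": [0.00, 5349.65, 4235.86, 3562.21, 0.00],
    "Banners": [0.00, 2218.93, 2046.96, 0.00, 2187.29],
})

sample_zero_share = zero_share(sample, ["TV", "Radio", "Banners"])
print(sample_zero_share)
```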


In [13]:
(data.describe().T).to_csv('describe.csv',sep='|')

In [14]:
data.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
TV,200.0,2946.20765,4749.646908,0.0,0.0,0.0,7938.5275,13901.55
Radio,200.0,2213.58505,2505.967886,0.0,0.0,0.0,4624.0275,7696.22
Banners,200.0,1520.72255,870.764354,0.0,1657.195,1918.99,2069.7675,2518.88
Sales,200.0,10668.1415,2700.706683,4532.33,8396.9425,10853.105,12566.995,17668.34


In [15]:
col_medios = ['TV', 'Radio', 'Banners']
data_melt = pd.melt(data, id_vars=['Date'], value_vars=col_medios, var_name='Medio', value_name='Inversion')

In [16]:
data_melt.head()

Unnamed: 0,Date,Medio,Inversion
0,2018-01-07,TV,13528.1
1,2018-01-14,TV,0.0
2,2018-01-21,TV,0.0
3,2018-01-28,TV,0.0
4,2018-02-04,TV,0.0


Total investment in TV is higher than in Radio, which in turn is higher than in Banners.

In [17]:
fig = px.pie(data_melt, 
            values='Inversion', 
            names='Medio',
            title='Share of Investment by Media')
fig.show()
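
The ranking that the pie chart visualizes can also be checked numerically. As a sketch, the totals below are reconstructed from the `describe()` output (mean weekly spend times the 200 reported weeks); in the notebook the same totals would come from `data_melt.groupby('Medio')['Inversion'].sum()`.

```python
import pandas as pd

# Mean weekly spend per channel, copied from the describe() output.
mean_spend = pd.Series({"TV": 2946.20765, "Radio": 2213.58505, "Banners": 1520.72255})
n_weeks = 200  # row count reported by describe()

# Total spend per channel and each channel's share of the overall budget.
total_spend = mean_spend * n_weeks
share = total_spend / total_spend.sum()
ranking = share.sort_values(ascending=False).index.tolist()

print(share.round(3))
print(ranking)
```

With these figures TV accounts for roughly 44% of total spend, Radio for about 33%, and Banners for about 23%.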

Banners are the channel that is active in most weeks.

In [18]:
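data2 = data.copy()
# Weekly share of each channel: divide each row by that week's total spend
# (weeks with zero total spend produce NaN shares).
data2[['TV_%', 'Radio_%', 'Banners_%']] = data2[['TV', 'Radio', 'Banners']].apply(lambda x: x / x.sum(), axis=1)
px.bar(data2, x="Date", y=['TV_%', 'Radio_%', 'Banners_%'], title='Share of Investment by Media')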

Sales correlate with the investments in the media channels.

In [20]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(go.Bar(x=data['Date'], y=data['TV'], name='TV'), secondary_y=False)
fig.add_trace(go.Bar(x=data['Date'], y=data['Radio'], name='Radio'), secondary_y=False)
fig.add_trace(go.Bar(x=data['Date'], y=data['Banners'], name='Banners'), secondary_y=False)
fig.add_trace(go.Scatter(x=data['Date'], y=data['Sales'], name='Sales'), secondary_y=True)

fig.show()

In [21]:
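# Restrict to numeric columns: 'Date' is a string column, and recent pandas versions raise on non-numeric data in corr().
px.imshow(data[['TV', 'Radio', 'Banners', 'Sales']].corr(), color_continuous_scale=px.colors.diverging.Tealrose)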

## Creating a Saturation Effect

We want to create a transformation (a mathematical function) with the following properties:
- If the spend is 0, the saturated spend is also 0.
- The transformation is monotonically increasing: the higher the input spend, the higher the saturated output.
- The saturated values do not grow without bound. Instead, they are bounded above by some number, say 1.
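
One function that satisfies all three properties is the exponential form 1 - exp(-a * x). The sketch below is only one possible choice, not necessarily the one this notebook goes on to use. The function name `exponential_saturation`, the rate parameter `a`, and its default value are assumptions made for illustration.

```python
import numpy as np

def exponential_saturation(spend, a=1e-4):
    """Map non-negative spend to a saturated value via 1 - exp(-a * spend).

    `a` (hypothetical parameter) controls how quickly the curve flattens:
    larger values reach saturation at lower spend.
    """
    spend = np.asarray(spend, dtype=float)
    return 1.0 - np.exp(-a * spend)

# Evaluate on a few spend levels spanning the range seen in the data and beyond.
spend_grid = np.array([0.0, 1000.0, 5000.0, 10000.0, 50000.0])
saturated = exponential_saturation(spend_grid)

print(saturated)
```

Zero spend maps to zero, the output increases with spend, and it never exceeds 1, which matches the three properties listed above.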
