## ARIMA models vs. neural networks

Classical statistical models (Holt-Winters, ARIMA-based models, exponential smoothing) have the following shortcomings:
- they do not handle missing values
- they assume a linear relationship in the data, so the trend can only be positive or negative; that is, they do not adapt to changes in the data's variability over time. A seasonal ARIMA model, for instance, might cope with this
- they work on univariate data, that is, they describe only a single characteristic of each individual
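
The first point can be illustrated with a hand-written simple exponential smoothing recursion. This is only a minimal sketch: the function `simple_exp_smoothing`, the toy values and the linear interpolation used to fill the gap are assumptions made for illustration, not part of the notebook or of any library.

```python
import numpy as np

def simple_exp_smoothing(values, alpha=0.5):
    """Hypothetical helper: s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    smoothed = np.empty(len(values), dtype=float)
    smoothed[0] = values[0]
    for t in range(1, len(values)):
        smoothed[t] = alpha * values[t] + (1 - alpha) * smoothed[t - 1]
    return smoothed

# Toy series with one missing observation (illustrative values)
series = np.array([10.0, 12.0, np.nan, 11.0, 13.0])

# The recursion carries the NaN forward: every value from the gap onward is NaN
with_gap = simple_exp_smoothing(series)

# One possible workaround (an assumption, not the notebook's method):
# fill the gap by linear interpolation before smoothing
idx = np.arange(len(series))
known = ~np.isnan(series)
filled = np.interp(idx, idx[known], series[known])
without_gap = simple_exp_smoothing(filled)

print(with_gap)
print(without_gap)
```

Because each smoothed value depends on the previous one, a single gap contaminates the rest of the series, which is why the data has to be imputed or aggregated before fitting this kind of model.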

Following the example:
- https://www.kaggle.com/ymlai87416/web-traffic-time-series-forecast-with-4-model
- Dataset info: https://www.kaggle.com/jihyeseo/online-retail-data-set-from-uci-ml-repo

In [1]:
import pandas as pd
import numpy as np

In [2]:
oretail = pd.read_excel('C:/Users/Lucatic/Documents/Datasets/timeseries/Online Retail.xlsx') 
oretail[:4]

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom


In [3]:
# Count missing values per column
pd.DataFrame(data=oretail.isna().sum())

Unnamed: 0,0
InvoiceNo,0
StockCode,0
Description,1454
Quantity,0
InvoiceDate,0
UnitPrice,0
CustomerID,135080
Country,0


In [4]:
print(oretail.shape[0])
oretail.dtypes

541909


InvoiceNo              object
StockCode              object
Description            object
Quantity                int64
InvoiceDate    datetime64[ns]
UnitPrice             float64
CustomerID            float64
Country                object
dtype: object

In [5]:
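# CustomerID contains NaN, so the cast to int fails; with errors='ignore' the
# column is returned unchanged (still float64), as the output below shows.
oretail = oretail.astype({'CustomerID': int}, errors='ignore')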
oretail[:5]

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
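
If integer customer IDs are actually wanted, pandas' nullable `Int64` dtype is one option, since it can hold missing values. The sketch below uses a toy frame (`customers`, with illustrative values) rather than the real dataset, and it is not a step the notebook performs.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the CustomerID column (illustrative values)
customers = pd.DataFrame({"CustomerID": [17850.0, np.nan, 13047.0]})

# Nullable integer dtype: whole numbers are stored as integers, NaN becomes <NA>
customers["CustomerID"] = customers["CustomerID"].astype("Int64")

print(customers.dtypes)
print(customers)
```

The missing entries stay missing, so a check such as `isna()` keeps working after the conversion.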


In [6]:
# Unknown customers
oretail[oretail.CustomerID.isna()]

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
622,536414,22139,,56,2010-12-01 11:52:00,0.00,,United Kingdom
1443,536544,21773,DECORATIVE ROSE BATHROOM BOTTLE,1,2010-12-01 14:32:00,2.51,,United Kingdom
1444,536544,21774,DECORATIVE CATS BATHROOM BOTTLE,2,2010-12-01 14:32:00,2.51,,United Kingdom
1445,536544,21786,POLKADOT RAIN HAT,4,2010-12-01 14:32:00,0.85,,United Kingdom
1446,536544,21787,RAIN PONCHO RETROSPOT,2,2010-12-01 14:32:00,1.66,,United Kingdom
...,...,...,...,...,...,...,...,...
541536,581498,85099B,JUMBO BAG RED RETROSPOT,5,2011-12-09 10:26:00,4.13,,United Kingdom
541537,581498,85099C,JUMBO BAG BAROQUE BLACK WHITE,4,2011-12-09 10:26:00,4.13,,United Kingdom
541538,581498,85150,LADIES & GENTLEMEN METAL SIGN,1,2011-12-09 10:26:00,4.96,,United Kingdom
541539,581498,85174,S/4 CACTI CANDLES,1,2011-12-09 10:26:00,10.79,,United Kingdom


In [7]:
oretail[oretail.UnitPrice==0]

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
622,536414,22139,,56,2010-12-01 11:52:00,0.0,,United Kingdom
1970,536545,21134,,1,2010-12-01 14:32:00,0.0,,United Kingdom
1971,536546,22145,,1,2010-12-01 14:33:00,0.0,,United Kingdom
1972,536547,37509,,1,2010-12-01 14:33:00,0.0,,United Kingdom
1987,536549,85226A,,1,2010-12-01 14:34:00,0.0,,United Kingdom
...,...,...,...,...,...,...,...,...
536981,581234,72817,,27,2011-12-08 10:33:00,0.0,,United Kingdom
538504,581406,46000M,POLYESTER FILLER PAD 45x45cm,240,2011-12-08 13:58:00,0.0,,United Kingdom
538505,581406,46000S,POLYESTER FILLER PAD 40x40cm,300,2011-12-08 13:58:00,0.0,,United Kingdom
538554,581408,85175,,20,2011-12-08 14:06:00,0.0,,United Kingdom


In [8]:
import matplotlib.pyplot as plt
# import datetime

In [9]:
# Split the date into year and month, and compute the total per invoice line
oretail = oretail.assign(
    Year=oretail.InvoiceDate.dt.year,
    Month=oretail.InvoiceDate.dt.month,
    total_in_eur=oretail.UnitPrice * oretail.Quantity,
)

oretail[:3]

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country,Year,Month,total_in_eur
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom,2010,12,15.3
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,2010,12,20.34
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom,2010,12,22.0


In [10]:
# Year-month label; months are not zero-padded, so '2011-10' sorts before '2011-2'
oretail = oretail.assign(ano_mes=oretail.InvoiceDate.dt.year.astype(str)+'-'+oretail.InvoiceDate.dt.month.astype(str))

oretail[:3]

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country,Year,Month,total_in_eur,ano_mes
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom,2010,12,15.3,2010-12
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,2010,12,20.34,2010-12
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom,2010,12,22.0,2010-12
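
A zero-padded label would sort chronologically even when compared as text. The following is a minimal sketch on a toy frame; the names `invoices` and `year_month` are hypothetical, and the notebook itself keeps the unpadded `ano_mes` labels.

```python
import pandas as pd

# Toy invoice dates (illustrative values)
invoices = pd.DataFrame({
    "InvoiceDate": pd.to_datetime([
        "2011-10-03 09:15",
        "2011-02-14 12:00",
        "2010-12-01 08:26",
    ])
})

# strftime pads the month to two digits: '2011-02' rather than '2011-2'
invoices["year_month"] = invoices["InvoiceDate"].dt.strftime("%Y-%m")

# Plain string sorting now matches chronological order
labels = sorted(invoices["year_month"].unique())
print(labels)
```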


In [11]:
# Monthly totals per country, keeping only rows with a positive unit price
TABLA_1 = (
    oretail
    .query('UnitPrice > 0')
    .pivot_table(columns='ano_mes', index='Country', values='total_in_eur', aggfunc='sum')
    .reset_index()
)

TABLA_1

ano_mes,Country,2010-12,2011-1,2011-10,2011-11,2011-12,2011-2,2011-3,2011-4,2011-5,2011-6,2011-7,2011-8,2011-9
0,Australia,1005.1,9017.71,17150.53,6805.99,,14627.47,17055.29,333.4,13628.51,25164.77,4767.57,22489.2,5031.73
1,Austria,257.04,,1043.78,1329.78,683.2,518.36,1708.12,680.78,1249.43,-24.2,1191.95,1516.08,
2,Bahrain,205.74,-205.74,,,,,,,548.4,,,,
3,Belgium,1809.91,1154.05,5651.38,6229.41,1409.43,2161.32,3333.58,1954.15,2727.0,4273.17,2473.62,3536.12,4197.82
4,Brazil,,,,,,,,1143.6,,,,,
5,Canada,,,,,,,140.54,,534.24,1171.46,1768.58,51.56,
6,Channel Islands,363.53,645.08,2623.32,1495.17,194.15,1784.71,3509.33,293.0,903.79,2060.03,,4892.53,1321.65
7,Cyprus,1590.82,547.5,4216.52,460.89,-91.25,4013.55,938.39,-35.8,,1109.32,,,196.35
8,Czech Republic,,,277.48,-61.51,,549.26,,-57.51,,,,,
9,Denmark,1281.5,,1438.11,2699.57,168.9,399.22,3978.99,,515.7,3261.15,376.24,78.6,4570.16


In [16]:
# Sorting the labels as strings does not give chronological order
columnas = TABLA_1.columns.to_list()
columnas.sort()
columnas

['2010-12',
 '2011-1',
 '2011-10',
 '2011-11',
 '2011-12',
 '2011-2',
 '2011-3',
 '2011-4',
 '2011-5',
 '2011-6',
 '2011-7',
 '2011-8',
 '2011-9',
 'Country']

In [18]:
# Reorder the month columns chronologically
TABLA_1 = TABLA_1[[
    'Country',
    '2010-12', '2011-1', '2011-2', '2011-3', '2011-4', '2011-5', '2011-6',
    '2011-7', '2011-8', '2011-9', '2011-10', '2011-11', '2011-12',
]]
TABLA_1

ano_mes,Country,2010-12,2011-1,2011-2,2011-3,2011-4,2011-5,2011-6,2011-7,2011-8,2011-9,2011-10,2011-11,2011-12
0,Australia,1005.1,9017.71,14627.47,17055.29,333.4,13628.51,25164.77,4767.57,22489.2,5031.73,17150.53,6805.99,
1,Austria,257.04,,518.36,1708.12,680.78,1249.43,-24.2,1191.95,1516.08,,1043.78,1329.78,683.2
2,Bahrain,205.74,-205.74,,,,548.4,,,,,,,
3,Belgium,1809.91,1154.05,2161.32,3333.58,1954.15,2727.0,4273.17,2473.62,3536.12,4197.82,5651.38,6229.41,1409.43
4,Brazil,,,,,1143.6,,,,,,,,
5,Canada,,,,140.54,,534.24,1171.46,1768.58,51.56,,,,
6,Channel Islands,363.53,645.08,1784.71,3509.33,293.0,903.79,2060.03,,4892.53,1321.65,2623.32,1495.17,194.15
7,Cyprus,1590.82,547.5,4013.55,938.39,-35.8,,1109.32,,,196.35,4216.52,460.89,-91.25
8,Czech Republic,,,549.26,,-57.51,,,,,,277.48,-61.51,
9,Denmark,1281.5,,399.22,3978.99,,515.7,3261.15,376.24,78.6,4570.16,1438.11,2699.57,168.9
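
The column order could also be derived programmatically instead of being typed out by hand. This is a minimal sketch: `year_month_key` is a hypothetical helper, and `table` is a toy frame holding a few of Belgium's totals from the output above.

```python
import pandas as pd

def year_month_key(label):
    """Hypothetical helper: turn a 'YYYY-M' label into a (year, month) tuple."""
    year, month = label.split("-")
    return int(year), int(month)

# Toy table with the month columns deliberately out of order
table = pd.DataFrame({
    "Country": ["Belgium"],
    "2011-10": [5651.38],
    "2010-12": [1809.91],
    "2011-2": [2161.32],
    "2011-1": [1154.05],
})

# Sort the month columns numerically by (year, month), keeping Country first
month_cols = sorted(
    (col for col in table.columns if col != "Country"),
    key=year_month_key,
)
table = table[["Country"] + month_cols]
print(list(table.columns))
```

Sorting on the numeric tuple avoids the lexicographic trap where '2011-10' comes before '2011-2', and it keeps working when new months are added.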
