## Store Sales - Time Series Forecasting

### **Descriptions**

**train.csv:**
* store_nbr = identifies the store at which the products are sold.
* family = identifies the type of product sold.
* sales = total sales for a product family at a particular store on a given date. Fractional values are possible since products can be sold in fractional units (1.5 kg of cheese, for instance, as opposed to 1 bag of chips).
* onpromotion = the total number of items in a product family that were being promoted at a store on a given date.

**test.csv:**
* Same features as the training data. The task is to predict the target sales for the dates in this file.
* The dates in the test data are the 15 days after the last date in the training data.
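
The claim about the test window can be checked before modelling. The sketch below is one possible sanity check, not part of the original notebook: `forecast_horizon` is a hypothetical helper, and the toy dates are made up so that the block runs without the CSV files.

```python
import pandas as pd


def forecast_horizon(train_dates: pd.Series, test_dates: pd.Series) -> int:
    """Return the number of distinct test days, after checking that they
    form an unbroken daily run starting the day after the last training date."""
    train_end = pd.to_datetime(train_dates).max()
    test_days = pd.DatetimeIndex(pd.to_datetime(test_dates).unique()).sort_values()
    expected = pd.date_range(
        train_end + pd.Timedelta(days=1), periods=len(test_days), freq="D"
    )
    if list(test_days) != list(expected):
        raise ValueError("test dates do not directly follow the training dates")
    return len(test_days)


# Toy stand-ins for train['date'] and test['date'] (illustrative values only)
toy_train_dates = pd.Series(["2020-01-30", "2020-01-31"])
toy_test_dates = pd.Series(
    pd.date_range("2020-02-01", periods=15, freq="D").strftime("%Y-%m-%d")
)
horizon = forecast_horizon(toy_train_dates, toy_test_dates)
print(horizon)
```

With the real files, the same function could be called as `forecast_horizon(train['date'], test['date'])`.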

**stores.csv**

* Store metadata: city, state, type and cluster.
* Cluster is a grouping of similar stores.

**oil.csv**

* Daily oil price. Includes values during both the train and test data timeframes.
* Ecuador is an oil-dependent country and its economic health is highly vulnerable to shocks in oil prices.
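
If the oil series turns out to have missing days or missing prices, it may need to be aligned to a full daily calendar before it is joined to the sales data. The following is a minimal sketch under that assumption; `fill_daily_oil` is a hypothetical helper, the price column name `dcoilwtico` is assumed rather than taken from this notebook, and the toy prices are invented.

```python
import pandas as pd


def fill_daily_oil(oil: pd.DataFrame, price_col: str = "dcoilwtico") -> pd.DataFrame:
    """Reindex the oil prices to every calendar day and linearly interpolate gaps."""
    out = oil.copy()
    out["date"] = pd.to_datetime(out["date"])
    out = out.set_index("date").sort_index()
    # Build a complete daily calendar between the first and last observation
    full_range = pd.date_range(out.index.min(), out.index.max(), freq="D")
    out = out.reindex(full_range)
    # Fill missing prices; limit_direction="both" also covers leading/trailing gaps
    out[price_col] = out[price_col].interpolate(method="linear", limit_direction="both")
    out.index.name = "date"
    return out


# Toy data with a missing weekend and one missing price (made-up values)
toy_oil = pd.DataFrame(
    {
        "date": ["2013-01-04", "2013-01-07", "2013-01-08"],
        "dcoilwtico": [93.0, None, 95.0],
    }
)
daily_oil = fill_daily_oil(toy_oil)
print(daily_oil)
```

Linear interpolation is only one option; a forward fill would avoid using future prices and may be preferable for forecasting.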


**holidays_events.csv**

* Holidays and events, with metadata.
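* Transferred column: a holiday marked as transferred officially falls on that calendar day but was moved to another date by the government. A transferred day is therefore more like a normal day than a holiday.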
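
One way to act on the `transferred` flag is to drop the flagged rows, so that only dates that actually behaved like a holiday remain. This is a sketch of that idea, not the notebook's method; `effective_holidays` is a hypothetical helper and the rows below are invented.

```python
import pandas as pd


def effective_holidays(events: pd.DataFrame) -> pd.DataFrame:
    """Return the events whose date was not transferred elsewhere."""
    transferred = events["transferred"].astype(bool)
    return events.loc[~transferred].reset_index(drop=True)


# Toy rows using the column names seen in holidays_events.csv (values are made up)
toy_events = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-05-01", "2020-05-04"],
        "type": ["Holiday", "Holiday", "Holiday"],
        "description": ["Example A", "Example B", "Example B (moved)"],
        "transferred": [False, True, False],
    }
)
observed = effective_holidays(toy_events)
print(observed)
```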

**Additional Notes**
* Wages in the public sector are paid every two weeks, on the 15th and on the last day of the month. Supermarket sales could be affected by this. (Seasonality?)
* A magnitude 7.8 earthquake struck Ecuador on `April 16, 2016`. People rallied in relief efforts, donating water and other basic necessities, which greatly affected supermarket sales for several weeks after the earthquake.
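
Both notes can be turned into simple calendar features. The sketch below is one possible encoding: `add_calendar_flags` is a hypothetical helper, and the 30-day earthquake window is an assumed stand-in for "several weeks", not a value from the data.

```python
import pandas as pd

EARTHQUAKE_DATE = pd.Timestamp("2016-04-16")


def add_calendar_flags(dates: pd.Series, quake_window_days: int = 30) -> pd.DataFrame:
    """Flag public-sector paydays and the period right after the earthquake."""
    d = pd.to_datetime(dates)
    flags = pd.DataFrame({"date": d})
    # Paydays: the 15th and the last day of each month
    flags["is_payday"] = (d.dt.day == 15) | d.dt.is_month_end
    # Days elapsed since the earthquake; negative before it happened
    days_since = (d - EARTHQUAKE_DATE).dt.days
    # quake_window_days is an assumption for how long the effect lasted
    flags["post_earthquake"] = days_since.between(0, quake_window_days)
    return flags


toy_dates = pd.Series(["2016-04-15", "2016-04-16", "2016-04-30", "2016-06-01"])
toy_flags = add_calendar_flags(toy_dates)
print(toy_flags)
```

Whether a binary flag or a "days since payday" counter works better is an open question to test during modelling.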

In [36]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

### 1st Part: **EDA**

In [5]:
# Paths
train_pth = r"C:\Users\willi\OneDrive\Documentos\5. Estudos\Kaggle_Notebooks\Store_Sales\data\train.csv"
test_pth = r"C:\Users\willi\OneDrive\Documentos\5. Estudos\Kaggle_Notebooks\Store_Sales\data\test.csv"
stores_pth = r"C:\Users\willi\OneDrive\Documentos\5. Estudos\Kaggle_Notebooks\Store_Sales\data\stores.csv"
oil_pth = r"C:\Users\willi\OneDrive\Documentos\5. Estudos\Kaggle_Notebooks\Store_Sales\data\oil.csv"
transactions_pth = r"C:\Users\willi\OneDrive\Documentos\5. Estudos\Kaggle_Notebooks\Store_Sales\data\transactions.csv"
holidays_events_pth = r"C:\Users\willi\OneDrive\Documentos\5. Estudos\Kaggle_Notebooks\Store_Sales\data\holidays_events.csv"

In [73]:
# Datasets
train = pd.read_csv(train_pth)
stores = pd.read_csv(stores_pth)
oil = pd.read_csv(oil_pth)
events = pd.read_csv(holidays_events_pth)
transactions = pd.read_csv(transactions_pth)

# Merge train, stores and transactions
df = pd.merge(train, stores, on='store_nbr')
df = pd.merge_ordered(df, transactions, on=['date', 'store_nbr'])

# Merge with holidays/events; the outer join also keeps event dates that have no sales rows
df = pd.merge(df, events, on=['date'], how='outer')
df.dropna(how='all', inplace=True)

# Drop the row id and use the date as the index
df.drop('id', axis=1, inplace=True)
df = df.set_index('date')
df

| date | store_nbr | family | sales | onpromotion | city | state | type_x | cluster | transactions | type_y | locale | locale_name | description | transferred |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2013-01-01 | 1.0 | AUTOMOTIVE | 0.0 | 0.0 | Quito | Pichincha | D | 13.0 | NaN | Holiday | National | Ecuador | Primer dia del ano | False |
| 2013-01-01 | 1.0 | BABY CARE | 0.0 | 0.0 | Quito | Pichincha | D | 13.0 | NaN | Holiday | National | Ecuador | Primer dia del ano | False |
| 2013-01-01 | 1.0 | BEAUTY | 0.0 | 0.0 | Quito | Pichincha | D | 13.0 | NaN | Holiday | National | Ecuador | Primer dia del ano | False |
| 2013-01-01 | 1.0 | BEVERAGES | 0.0 | 0.0 | Quito | Pichincha | D | 13.0 | NaN | Holiday | National | Ecuador | Primer dia del ano | False |
| 2013-01-01 | 1.0 | BOOKS | 0.0 | 0.0 | Quito | Pichincha | D | 13.0 | NaN | Holiday | National | Ecuador | Primer dia del ano | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2017-12-22 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Additional | National | Ecuador | Navidad-3 | False |
| 2017-12-23 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Additional | National | Ecuador | Navidad-2 | False |
| 2017-12-24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Additional | National | Ecuador | Navidad-1 | False |
| 2017-12-25 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Holiday | National | Ecuador | Navidad | False |
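
The output shows two side effects of the merges: the store and event `type` columns collide and become `type_x` / `type_y`, and the outer join keeps event dates with no sales rows. A possible alternative, sketched here with left joins and explicit suffixes, avoids both. `merge_sales_tables` is a hypothetical helper, the suffix names are illustrative, and the toy frames are invented so the block runs on its own.

```python
import pandas as pd


def merge_sales_tables(train, stores, transactions, events):
    """Left-join store, transaction and event data onto the sales rows."""
    # validate= raises if a store or (date, store) key is duplicated on the right
    df = train.merge(stores, on="store_nbr", how="left", validate="many_to_one")
    df = df.merge(
        transactions, on=["date", "store_nbr"], how="left", validate="many_to_one"
    )
    # Explicit suffixes replace the default type_x / type_y names
    df = df.merge(events, on="date", how="left", suffixes=("_store", "_event"))
    return df.set_index("date")


# Toy frames with the same column names as the CSV files (values are made up)
toy_train = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-01-02"],
        "store_nbr": [1, 1],
        "family": ["BOOKS", "BOOKS"],
        "sales": [0.0, 3.0],
        "onpromotion": [0, 0],
    }
)
toy_stores = pd.DataFrame(
    {"store_nbr": [1], "city": ["Quito"], "type": ["D"], "cluster": [13]}
)
toy_transactions = pd.DataFrame(
    {"date": ["2020-01-02"], "store_nbr": [1], "transactions": [100]}
)
toy_events = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-12-25"],
        "type": ["Holiday", "Holiday"],
        "transferred": [False, False],
    }
)
merged = merge_sales_tables(toy_train, toy_stores, toy_transactions, toy_events)
print(merged)
```

Note that if several events share a date, the final join still duplicates the matching sales rows, so the events table may need to be reduced to one row per date first.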
