# Time Series Data

This notebook prepares time series data for training forecasting models. The Walmart Sales dataset is obtained and pre-processed in the following Jupyter notebook:

- [Data Wrangling for Walmart Sales Datasets](https://github.com/nphan20181/walmart_sales/blob/master/00_walmart_data_wrangling.ipynb).

In [1]:
import pandas as pd
import datetime

# load pre-processed data
weekly_sales = pd.read_pickle('data/weekly_sales.pkl')

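# extract the ISO week number of the year from Date
# (element 1 of isocalendar() is the week; indexing also works on Python < 3.9, where .week is unavailable)
weekly_sales['Week'] = weekly_sales['Date'].map(lambda x: datetime.date(x.year, x.month, x.day).isocalendar()[1])

# turn the default 0-based index into a 1-based 'Time Series Index' column
weekly_sales.reset_index(inplace=True)
weekly_sales['index'] = weekly_sales['index'] + 1
weekly_sales.rename(columns={'index': 'Time Series Index'}, inplace=True)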
weekly_sales.head()

| | Time Series Index | Date | IsHoliday | Week | Month | Quarter | Year | Weekly Sales (Million) |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2010-01-08 | False | 1 | 1 | 1 | 2010 | 43.865605 |
| 1 | 2 | 2010-01-15 | False | 2 | 1 | 1 | 2010 | 41.348378 |
| 2 | 3 | 2010-01-22 | False | 3 | 1 | 1 | 2010 | 41.367822 |
| 3 | 4 | 2010-01-29 | False | 4 | 1 | 1 | 2010 | 39.717414 |
| 4 | 5 | 2010-02-05 | False | 5 | 2 | 1 | 2010 | 49.75074 |
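
The week extraction and 1-based index from cell 1 can also be sketched with pandas' vectorized `dt.isocalendar()` accessor. This is only an illustration: it assumes the `Date` column holds `datetime64` values (the notebook does not confirm the dtype), and the `sample` frame is a hypothetical stand-in built from the five dates shown in the output above.

```python
import pandas as pd

# Hypothetical stand-in for weekly_sales, using the five dates from the output above
sample = pd.DataFrame({
    'Date': pd.to_datetime([
        '2010-01-08', '2010-01-15', '2010-01-22', '2010-01-29', '2010-02-05',
    ])
})

# Series.dt.isocalendar() returns a DataFrame with 'year', 'week' and 'day' columns;
# cast the nullable UInt32 week column to a plain integer dtype
sample['Week'] = sample['Date'].dt.isocalendar().week.astype(int)

# 1-based position of each observation in the series
sample.insert(0, 'Time Series Index', range(1, len(sample) + 1))

print(sample)
```

The accessor avoids a Python-level `lambda` per row, which may matter on longer series; for a dataset of this size either approach should be adequate.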


In [2]:
import numpy as np

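# add log- and square-root-transformed copies of the sales series
# (np.log is the natural logarithm; both transforms require non-negative sales, and log requires strictly positive)
weekly_sales['Log of Weekly Sales (Million)'] = np.log(weekly_sales['Weekly Sales (Million)'])
weekly_sales['Square Root of Weekly Sales (Million)'] = np.sqrt(weekly_sales['Weekly Sales (Million)'])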
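
A model trained on a transformed column produces values on the transformed scale, so they need to be mapped back to millions before being compared with actual sales. The following is a minimal sketch of that round trip, not part of the original notebook; the `sales` values are copied from the first rows of the output above, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative values copied from the first rows of the output above
sales = pd.Series([43.865605, 41.348378, 41.367822], name='Weekly Sales (Million)')

# Forward transforms, as applied in cell 2
log_sales = np.log(sales)
sqrt_sales = np.sqrt(sales)

# Inverse transforms: exp undoes the natural log, squaring undoes the square root
from_log = np.exp(log_sales)
from_sqrt = np.square(sqrt_sales)

# Both round trips should recover the original values up to floating-point error
print(np.allclose(from_log, sales), np.allclose(from_sqrt, sales))
```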

In [3]:
# save the dataset to a CSV file
weekly_sales.to_csv('data/ts_dataset.csv', index=False)
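
Unlike the pickle file loaded in cell 1, CSV does not store column dtypes, so `Date` comes back as plain text unless it is parsed on load. The sketch below illustrates this with a small hypothetical frame written to a temporary directory; it assumes `Date` was a datetime column before saving, which the notebook does not state explicitly.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row stand-in for the saved dataset
sample = pd.DataFrame({
    'Time Series Index': [1, 2],
    'Date': pd.to_datetime(['2010-01-08', '2010-01-15']),
    'Weekly Sales (Million)': [43.865605, 41.348378],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'ts_dataset.csv')
    sample.to_csv(path, index=False)

    # Default read: Date is loaded as text
    plain = pd.read_csv(path)

    # parse_dates restores a datetime64 column
    reloaded = pd.read_csv(path, parse_dates=['Date'])

print(plain['Date'].dtype, reloaded['Date'].dtype)
```

With the real file, the equivalent call would be `pd.read_csv('data/ts_dataset.csv', parse_dates=['Date'])`.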