# Time Series with WTI Crude

West Texas Intermediate, or WTI, has spot price data for several locations. For this example we are pulling from one of my favorite sources of third-party data, the St. Louis Federal Reserve's economic research database, FRED. The data can be downloaded as a .csv from [WTI FRED Data](https://fred.stlouisfed.org/series/DCOILWTICO).

### So what's so significant about WTI Crude?
In short, this is America's oil. WTI is found in oil plays all around the United States. As a grade of crude oil, WTI is sometimes known as Texas light sweet. It's light because of its low density, and it's sweet because of its low sulfur content.

The major trading hub for WTI is a small town in Oklahoma called Cushing. Make no bones about it, Cushing is a strategically important hub in U.S. energy.

Now, WTI can be quoted at three major locations:
* Midland, TX - This is the heart of Texas Oil Production
* Cushing, OK - Major transportation and storage hub
* Houston, TX - Where oil moves on to international markets

As someone who grew up and lived in Oklahoma, I know how vital the oil and natural gas industry is to the citizens of that state. Places like the Dakotas, Alaska, rural Texas, and Wyoming depend on oil and natural gas for jobs and a strong macroeconomy.

We often see that when oil prices are up, those economies do well and unemployment may be 3% or even around 2%. In other parts of the country, however, consumers see the price of one of their major commodities rise and have the inverse problem, as wages often don't move along with rising energy costs.

There are many reasons why we would want to forecast the price of WTI, and I have outlined a few of them. I did not cover geopolitics or banking; I'll leave that to another person.

## The Data

As explained above, this data is the spot price at Cushing, OK, via FRED. The data is daily in nature. This is where we start coding.

In [6]:
# Packages
import warnings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
warnings.filterwarnings("ignore")
plt.style.use('fivethirtyeight')
import matplotlib
# ARIMA and other stat toys
import statsmodels.api as sm
# New shiny toy for forecasting
# (releases from 1.0 onward are published as `prophet`: from prophet import Prophet)
from fbprophet import Prophet

In [46]:
# Local copy of the FRED series, saved as an Excel workbook
df = pd.read_excel("wtiPrice.xlsx")

# Get our first few rows to check
df.head()

        DATE  DCOILWTICO
0 2013-10-01      102.09
1 2013-10-02      104.15
2 2013-10-03      103.29
3 2013-10-04      103.83
4 2013-10-07      103.07


So right off the bat we can see that this is business day data. No big deal.
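
The business-day observation can also be checked in code rather than by eye. This is a minimal sketch, assuming the five dates from the `head()` output stand in for the full `DATE` column; the variable names are illustrative and not part of the original notebook.

```python
import pandas as pd

# Stand-in for the DATE column: the five dates shown in the head() output
dates = pd.to_datetime(
    ["2013-10-01", "2013-10-02", "2013-10-03", "2013-10-04", "2013-10-07"]
)

# dayofweek is 0 for Monday through 6 for Sunday, so values below 5 are weekdays
all_weekdays = bool((dates.dayofweek < 5).all())

# Business days inside the covered range that have no observation
# (on the full series these would typically be market holidays)
expected = pd.bdate_range(dates.min(), dates.max())
missing = expected.difference(dates)

print(all_weekdays, len(missing))
```

On the full series, `missing` would list the weekdays with no quote, which is a quick way to see how many holidays fall inside the sample.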

I'm not going to create a daily forecast. We can let day traders or financial institutions do that, or, more accurately, attempt it. Daily forecasting of commodities is tough. We are going to focus on longer-term forecasts here. I'll dive a little deeper into forecasting as an art and a science a little later.

#### Data Types

In [47]:
# Types of data
df.dtypes

DATE          datetime64[ns]
DCOILWTICO           float64
dtype: object
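
The date column already arrives as a datetime type here because the workbook stores real dates. If the series were read from the .csv download instead, the dates would come in as strings unless parsed. The sketch below is only an illustration: the excerpt is hypothetical (the last row is made up), and the `.` placeholder for a missing quote is an assumption about the file layout that should be checked against the actual download.

```python
import io
import pandas as pd

# Hypothetical excerpt in the layout of a FRED-style .csv;
# the final row is made up to show a day with no quote, marked by "."
csv_text = """DATE,DCOILWTICO
2013-10-01,102.09
2013-10-02,104.15
2013-10-03,103.29
2013-12-25,.
"""

sample = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["DATE"],  # parse to a datetime column instead of plain strings
    na_values=["."],       # read the placeholder as NaN so the prices stay float64
)

print(sample.dtypes)
print(sample["DCOILWTICO"].isna().sum())
```

Without `na_values`, a single placeholder would turn the whole price column into strings, and the later `mean()` calls would fail.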

In [48]:
# How much of a time range do we have?
# Use min and max on the DATE column
df['DATE'].min(), df['DATE'].max()

(Timestamp('2013-10-01 00:00:00'), Timestamp('2018-10-01 00:00:00'))

In [50]:
# One row per date; mean rather than sum, so a duplicated date would not inflate the price
df = df.groupby('DATE')['DCOILWTICO'].mean().reset_index()

# Set index 
df = df.set_index('DATE')
df.index

DatetimeIndex(['2013-10-01', '2013-10-02', '2013-10-03', '2013-10-04',
               '2013-10-07', '2013-10-08', '2013-10-09', '2013-10-10',
               '2013-10-11', '2013-10-14',
               ...
               '2018-09-18', '2018-09-19', '2018-09-20', '2018-09-21',
               '2018-09-24', '2018-09-25', '2018-09-26', '2018-09-27',
               '2018-09-28', '2018-10-01'],
              dtype='datetime64[ns]', name='DATE', length=1259, freq=None)
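
The groupby step guarantees one row per date before the index is set. With unique dates the choice of aggregate makes no difference, but it matters if a date is ever repeated. Here is a small sketch on made-up rows (the duplicate is hypothetical, not something observed in this data):

```python
import pandas as pd

# Made-up rows with 2013-10-02 repeated, to mimic an accidental duplicate
raw = pd.DataFrame(
    {
        "DATE": pd.to_datetime(["2013-10-01", "2013-10-02", "2013-10-02"]),
        "DCOILWTICO": [102.09, 104.15, 104.15],
    }
)

summed = raw.groupby("DATE")["DCOILWTICO"].sum()
averaged = raw.groupby("DATE")["DCOILWTICO"].mean()

dup = pd.Timestamp("2013-10-02")
print(summed.loc[dup])    # the duplicate doubles the price
print(averaged.loc[dup])  # the duplicate leaves the price unchanged

# After grouping, the index has one entry per date, in date order
print(averaged.index.is_unique, averaged.index.is_monotonic_increasing)
```

For a price series, averaging is usually the safer aggregate, since a sum of prices has no natural meaning.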

In [53]:
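# Average the business-day prices within each month; 'MS' labels each month by its first day
y = df['DCOILWTICO'].resample('MS').mean()
# Look at the 2016 monthly averages
y.loc['2016']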

DATE
2016-01-01    31.683158
2016-02-01    30.323000
2016-03-01    37.546364
2016-04-01    40.755238
2016-05-01    46.712381
2016-06-01    48.757273
2016-07-01    44.651500
2016-08-01    44.724348
2016-09-01    45.182381
2016-10-01    49.775238
2016-11-01    45.660952
2016-12-01    51.970476
Freq: MS, Name: DCOILWTICO, dtype: float64
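
To make the monthly resampling concrete, here is a small sketch of what `resample('MS').mean()` does. It uses the five October 2013 prices from the `head()` output plus one made-up November value, added only so that two months appear.

```python
import pandas as pd

# Five real rows from the head() output, plus one made-up November price
daily = pd.Series(
    [102.09, 104.15, 103.29, 103.83, 103.07, 95.00],
    index=pd.to_datetime(
        [
            "2013-10-01",
            "2013-10-02",
            "2013-10-03",
            "2013-10-04",
            "2013-10-07",
            "2013-11-01",
        ]
    ),
)

# Each calendar month becomes one bin, labelled with the first day of that month,
# and the value is the plain average of the daily prices that fall in the bin
monthly = daily.resample("MS").mean()
print(monthly)

october = monthly.loc[pd.Timestamp("2013-10-01")]
```

Here `october` is the average of the five October prices, about 103.286. Each monthly value is an average over however many trading days that month had, so months with more holidays rest on fewer observations.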