# Analyzing the WTI candles

## Analysis

We will test several classification algorithms for predicting the **direction** of the next price movement (up or down) after the weekly EIA inventory news, as published at [investing.com](https://www.investing.com/). We start with basic classification rules used in ML. Later on, we will implement these predictions using more advanced algorithms (e.g., SVM, neural networks) in the Azure ML suite.

<br>
### The problem statement

The problem can be stated, mathematically, as follows:

*Given a set of $M$ features of the form $\mathbf{X}_j = (x_{1j},x_{2j},\ldots,x_{ij},\ldots,x_{Nj})$ and given the target feature to be predicted, $\mathbf{Y} = (y_{1},y_{2},\ldots,y_{i},\ldots,y_{N})$, where each $j$-feature is a time-ordered set (i.e., a time series) and $N$ is the number of such $i$-timeframes; predict the next $y_i$, i.e., predict $y_{N+1}$.*

<br>
### Definitions

For now let us assume the features are:
- The time domain, the only feature that is not itself a time series ($\mathbf{X}_1$).
- The price (look for the price with the smallest entropy in the prediction; $\mathbf{X}_2$).
- The volume ($\mathbf{X}_3$).
- The past EIA crude inventory ($\mathbf{X}_4$).
- The actual EIA crude inventory ($\mathbf{X}_5$).

Therefore there will be $M=5$ features.

The target feature, $\mathbf{Y}$, will simply be $y_i = \mathrm{sign}\left(x_{i2}-x_{(i-1)2}\right)$, that is, the sign (positive or negative) of the difference in price ($\mathbf{X}_2$) between the $i$-timeframe and the previous one.
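
The target definition above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper name `direction` is hypothetical, the daily CLOSE is taken as the price (the choice of price is still open, as noted in the feature list), and the sample values are the first five closes from the candle listing later in this notebook.

```python
def direction(prices):
    """Return sign(p_i - p_{i-1}) for every i >= 1 as +1, -1 or 0."""
    signs = []
    for prev, curr in zip(prices, prices[1:]):
        diff = curr - prev
        # (diff > 0) - (diff < 0) evaluates to +1, -1 or 0
        signs.append((diff > 0) - (diff < 0))
    return signs


# Assumption: the daily CLOSE is used as the price feature.
closes = [60.305, 61.861, 61.888, 61.531, 61.887]
y = direction(closes)
print(y)  # [1, 1, -1, 1]
```

Note that the first timeframe has no previous price, so $N$ prices yield only $N-1$ target values. A zero difference is neither up nor down and would need its own convention.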

<br><br><hr>
## Datasets

There will be two main datasets:
- The WTI candles from OANDA,
- The inventory news from investing.com.

Later on, we may test whether other financial indicators (demand, exports, imports, etc.) are needed.

The characteristics of the WTI candles will be as follows:
- Daily sampling from January 2018 to the present,
- Timezone set to `America/New_York`,
- ISO date-time strings.

Download the investing.com news over the same time span, using the same formatting as the WTI dataset.
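
Since both datasets are meant to share one date-time format, a quick parsing check can help when aligning them. The sketch below uses only the standard library; the format string is the one passed as `datetime_fmt` in the candle request later in this notebook, and the sample timestamp is the first row of the candle listing. The variable names are illustrative.

```python
from datetime import datetime, timedelta

# Format string taken from the `datetime_fmt` argument used for the WTI candles.
DATETIME_FMT = "%Y-%m-%d %H:%M:%S.%f %z"

# Sample timestamp copied from the first candle in the listing.
raw = "2018-01-01 17:00:00.000000 -0500"
stamp = datetime.strptime(raw, DATETIME_FMT)

# The parsed value is timezone-aware: UTC-5 is New York standard time.
print(stamp.isoformat())  # 2018-01-01T17:00:00-05:00
print(stamp.utcoffset() == timedelta(hours=-5))  # True
```

Parsing both datasets with the same format string yields timezone-aware values that can be compared directly.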

<br><br><hr>
## Algorithms

This section describes the algorithms in general terms, that is, without reference to the problem at hand as described in [the problem statement](#The-problem-statement).

<br>
### [One-rule classification](one-rule-classification.ipynb)

This algorithm consists of determining which feature $\mathbf{X}_j$ is the most accurate classifier,
given a dataset and a target feature $\mathbf{Y}$. The algorithm is fairly simple:

```
for each X_j in X:
    for each possible value v in X_j:
        count how often each value of Y occurs together with v
        assign the most frequent value of Y to v
    compute the error of this assignment: (number of misclassified instances) / (total number of instances)
select the feature with the smallest total error
```

The best feature selected according to this algorithm will be the only predictor.
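
The procedure above could be sketched in plain Python as follows. This is a minimal illustration, not the implementation in the linked notebook: the function name `one_rule`, the feature names and the toy values are all hypothetical, and the features are assumed to be categorical already (continuous ones such as price or volume would have to be binned first).

```python
from collections import Counter, defaultdict


def one_rule(features, target):
    """Pick the single feature whose value -> class rule makes the fewest errors.

    features: dict mapping a feature name to a list of categorical values.
    target:   list of class labels, aligned with the feature lists.
    Returns (feature_name, rule, n_errors).
    """
    best = None
    for name, values in features.items():
        # For each feature value, count how often each class occurs with it.
        counts = defaultdict(Counter)
        for value, label in zip(values, target):
            counts[value][label] += 1
        # Assign the most frequent class to each feature value.
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        # Errors are the instances that do not belong to that majority class.
        errors = sum(sum(c.values()) - c.most_common(1)[0][1] for c in counts.values())
        if best is None or errors < best[2]:
            best = (name, rule, errors)
    return best


# Hypothetical toy data, for illustration only.
features = {
    "inventory_change": ["build", "draw", "draw", "build", "draw", "build"],
    "volume_level": ["high", "low", "high", "low", "low", "high"],
}
target = ["down", "up", "up", "down", "up", "up"]

name, rule, errors = one_rule(features, target)
print(name, rule, errors)  # inventory_change {'build': 'down', 'draw': 'up'} 1
```

Here `inventory_change` misclassifies one instance out of six while `volume_level` misclassifies two, so the former is kept as the only predictor.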

<br>
### [Naïve Bayes](naive-bayes.ipynb)

This algorithm assigns a class $\hat{c}_k$, from all the possible classes in $\mathbf{C}$, to each $y_i$ in $\mathbf{Y}$ using Bayes' theorem as follows:

$$
P\left(\mathbf{C}\,\middle|\,\{\mathbf{X}_j\}\right) = \frac{\mathcal{L}\left(\{\mathbf{X}_j\}\,\middle|\,\mathbf{C}\right) \times P\left(\mathbf{C}\right)}{P\left(\{\mathbf{X}_j\}\right)}
$$

where:

$P\left(\mathbf{C}\,\middle|\,\{\mathbf{X}_j\}\right)$ is the posterior probability distribution, that is, the probability of each class in $\mathbf{C}$ given the observed feature set $\{\mathbf{X}_j\}$.

$\mathcal{L}\left(\{\mathbf{X}_j\}\,\middle|\,\mathbf{C}\right)$ is the likelihood *function* (not distribution) which holds information on the probability of having observed the feature set $\{\mathbf{X}_j\}$ taking as ground truth the classes $\mathbf{C}$.

$P\left(\mathbf{C}\right)$ is the prior probability distribution of $\mathbf{C}$, that is, the probability of each class in the absence of any observation in the feature set.

$P\left(\{\mathbf{X}_j\}\right)$ is the likelihood weighted by the prior and summed over all classes. It is also called the evidence, and it normalizes the posterior.

Thus, given a set of features $\{\mathbf{X}_j\}$ and our prior beliefs about the possible classes $\mathbf{C}$, this algorithm seeks the class $\hat{c}_k$ that maximizes the posterior probability for the set $\{\mathbf{X}_j\}$.
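
As an illustration of the maximization described above, here is a minimal categorical sketch in plain Python. Several things in it are assumptions rather than part of the description: the "naïve" step of treating features as conditionally independent given the class (so the likelihood factorizes into one term per feature), the Laplace smoothing parameter `alpha`, the function names `fit` and `predict`, and the toy data. The evidence is left out because it is the same for every class and does not change which class attains the maximum.

```python
import math
from collections import Counter, defaultdict


def fit(rows, labels, alpha=1.0):
    """Count classes and per-class feature values (alpha: Laplace smoothing, an assumption)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    vocab = [set() for _ in rows[0]]     # distinct values seen per feature
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(label, j)][value] += 1
            vocab[j].add(value)
    return class_counts, value_counts, vocab, alpha


def predict(model, row):
    """Return the class with the highest log prior + log likelihood."""
    class_counts, value_counts, vocab, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        log_post = math.log(n_c / total)  # log prior P(c)
        for j, value in enumerate(row):
            # Smoothed estimate of P(x_j = value | c); features assumed independent.
            num = value_counts[(c, j)][value] + alpha
            den = n_c + alpha * len(vocab[j])
            log_post += math.log(num / den)
        scores[c] = log_post
    return max(scores, key=scores.get)


# Hypothetical toy data, for illustration only.
rows = [("build", "high"), ("draw", "low"), ("draw", "high"),
        ("build", "low"), ("draw", "low"), ("build", "high")]
labels = ["down", "up", "up", "down", "up", "up"]

model = fit(rows, labels)
print(predict(model, ("draw", "low")))    # up
print(predict(model, ("build", "high")))  # down
```

Working in log space avoids numerical underflow when many per-feature probabilities are multiplied together.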


In [1]:
import os
import seaborn as sns
import pandas as pd
import pickle as pk
from pyCBT.common.path import exist
from pyCBT.providers.eia import series
from pyCBT.providers.oanda import account
from pyCBT.providers.oanda import historical

In [2]:
# initialize the API OANDA client
oanda_api = account.Client("101-004-7835907-001")
# initialize the WTI candles
wti_candles = historical.Candles(
    client=oanda_api,
    instrument="WTICO_USD",
    resolution="D",
    from_date="2018-01-01",
    to_date="2018-03-14",
    datetime_fmt="%Y-%m-%d %H:%M:%S.%f %z",
    timezone="America/New_York"
)

In [3]:
wti_candles.as_dataframe()

DATETIME,OPEN,HIGH,LOW,CLOSE,VOLUME
2018-01-01 17:00:00.000000 -0500,60.139,60.694,60.068,60.305,26482.0
2018-01-02 17:00:00.000000 -0500,60.345,61.916,60.246,61.861,26571.0
2018-01-03 17:00:00.000000 -0500,61.906,62.167,61.558,61.888,28299.0
2018-01-04 17:00:00.000000 -0500,61.858,62.009,61.06,61.531,25745.0
2018-01-07 17:00:00.000000 -0500,61.68,61.936,61.306,61.887,23428.0
2018-01-08 17:00:00.000000 -0500,61.887,63.438,61.768,63.428,35972.0
2018-01-09 17:00:00.000000 -0500,63.418,63.649,63.06,63.46,33032.0
2018-01-10 17:00:00.000000 -0500,63.46,64.741,63.411,63.522,37956.0
2018-01-11 17:00:00.000000 -0500,63.542,64.474,63.043,64.384,37709.0
2018-01-14 17:00:00.000000 -0500,64.41,64.884,64.089,64.819,23457.0
