<a href="https://colab.research.google.com/github/microprediction/endersnotebooks/blob/main/mean_reversion_attacker.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [4]:
!pip install --upgrade git+https://github.com/microprediction/endersgame.git


# Mean Reversion Attacker Tutorial
This notebook demonstrates how to create an "attacker" (see [README.md](https://github.com/microprediction/endersgame/tree/main/endersgame/attackers)), and test it.

We use the stream generator (see this [notebook](https://github.com/microprediction/endersnotebooks/blob/main/enders_data_generator.ipynb)) to train it.

## What should an attacker do?

An attacker tries to predict `up` or `down`, but not too often: most of the time it should abstain.

Our attacker will consume a univariate sequence of numerical data points $x_1, x_2, \dots, x_t$ and try to exploit deviations from the [martingale property](https://en.wikipedia.org/wiki/Martingale_(probability_theory)), which is to say that for a horizon of $k$ steps we expect the series $x_t$ to satisfy:

$$ E[x_{t+k}] \approx x_t $$

roughly. Of course, there's no such thing in this world as a perfect martingale and it is your job to indicate when

$$ E[x_{t+k}] > x_t $$

or conversely.
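
To make the martingale idea concrete, here is a small sketch using only the standard library. It builds a simulated random walk (zero-mean increments) and a copy with an artificial upward drift, then compares the average $k$-step change of each. The helper `mean_k_step_change`, the drift of 0.05 per step, and the seed are illustrative assumptions, not part of `endersgame`.

```python
import random

def mean_k_step_change(xs, k):
    """Average of x[t+k] - x[t] over all valid t."""
    diffs = [xs[t + k] - xs[t] for t in range(len(xs) - k)]
    return sum(diffs) / len(diffs)

rng = random.Random(0)                 # fixed seed so the run is repeatable
walk = [0.0]
for _ in range(20000):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))   # zero-mean increments

# Same walk plus a deterministic upward drift of 0.05 per step (assumption)
drifting = [x + 0.05 * t for t, x in enumerate(walk)]

k = 10
print(mean_k_step_change(walk, k))       # typically small for a long walk
print(mean_k_step_change(drifting, k))   # larger than the line above by 0.05 * k
```

The second series is the kind of departure an attacker hopes to detect: on average $x_{t+k}$ sits above $x_t$, so an `up` call would pay off.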

## Overview
We will


1.   Start with an attacker that already has some accounting logic
2.   Modify the default `tick` and `predict` methods
3.   Run the attacker on mock data
4.   Run the attacker on real data
5.   Set up an optimization to tune the attacker's parameters
6.   See if it helps on the test set


## Imports


In [1]:
from endersgame.attackers.attackerwithsimplepnl import AttackerWithSimplePnL
from endersgame.rivertransformers.macd import MACD
from endersgame.datasources.streamgenerator import stream_generator
from river import stats
import numpy as np
import math
import types
from pprint import pprint
import json

## Step 1: Decide what state to maintain
Let's first implement the `tick` method. This should quickly respond to an incoming data point by modifying a rapidly changing `state`. Here we choose to maintain the current value and also an exponentially weighted moving average of historical values.

In [2]:
from endersgame.attackers.attackerwithsimplepnl import AttackerWithSimplePnL

class MyAttacker(AttackerWithSimplePnL):

    def __init__(self, a=0.01, **kwargs):
        super().__init__(**kwargs)
        self.state = {'running_avg': None,
                      'current_value': None}
        self.params = {'a': a}

    def tick(self, x: float):
        # Maintain an exponentially weighted moving average of the data
        self.state['current_value'] = x
        if not np.isnan(x):
            if self.state['running_avg'] is None:
                self.state['running_avg'] = x
            else:
                self.state['running_avg'] = (1-self.params['a'])*self.state['running_avg'] + self.params['a']*x


### Testing tick
We are halfway there. Let's check the state maintenance:

In [3]:
x_train_stream = stream_generator(stream_id=0,category='train')
attacker = MyAttacker()
for x in x_train_stream:
    attacker.tick(x)

print(f"After processing the entire stream, the current value is  {attacker.state['current_value']} and the moving average is {attacker.state['running_avg']}")
attacker.state

After processing the entire stream, the current value is  9537.392857140881 and the moving average is 9540.56231901366


{'running_avg': 9540.56231901366, 'current_value': 9537.392857140881}
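
The update inside `tick` is the usual exponentially weighted moving average recursion, `avg = (1 - a) * avg + a * x`. The standalone sketch below (the function name `ewma` is ours, not the library's) reproduces it on tiny inputs so the effect of `a` is easy to see.

```python
def ewma(xs, a):
    """Exponentially weighted moving average with smoothing weight a in (0, 1]."""
    avg = None
    for x in xs:
        # First observation seeds the average; afterwards blend old and new
        avg = x if avg is None else (1 - a) * avg + a * x
    return avg

print(ewma([10.0, 20.0], a=0.5))            # 0.5*10 + 0.5*20 = 15.0

# Step response: after n steps at 1.0 (starting from 0.0) the average is 1 - (1 - a)**n
print(ewma([0.0] + [1.0] * 100, a=0.01))    # about 0.634
print(1 - 0.99 ** 100)
```

With the default `a=0.01`, the average has only closed about 63% of a level shift after 100 points, so it moves slowly relative to the data. Larger values of `a` track the current value more closely.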

## Making an `up` or `down` decision
Next we implement `predict` using a mean reversion strategy.

In [12]:
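def predict(self, horizon:int=None)->float:
    # Mean reversion: bet against moves more than 2 units away from the running average
    if self.state['current_value'] is None or self.state['running_avg'] is None:
        return 0      # No data seen yet, so no opinion
    if self.state['current_value'] > self.state['running_avg'] + 2:
        return -1     # Well above average: expect a fall
    if self.state['current_value'] < self.state['running_avg'] - 2:
        return 1      # Well below average: expect a rise
    return 0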

attacker = MyAttacker()
attacker.predict = types.MethodType(predict, attacker) # <-- Attach the predict method to our existing instance of attacker


Let's check that when the current value is well above the running average, the attacker predicts a fall:

In [13]:
attacker.state['current_value'] = 10
attacker.state['running_avg'] = 5
print(attacker.predict(horizon=6))

-1
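
The same decision rule can be written as a plain function, which makes it easy to check all three outcomes at once. This is a sketch: the name `mean_reversion_signal` and the `threshold` argument are illustrative (the `predict` method above hard-codes the value 2).

```python
def mean_reversion_signal(current, running_avg, threshold=2.0):
    """Return -1 (expect a fall), 1 (expect a rise), or 0 (no opinion)."""
    if current is None or running_avg is None:
        return 0                       # nothing observed yet
    gap = current - running_avg
    if gap > threshold:
        return -1
    if gap < -threshold:
        return 1
    return 0

for current in (10, 5, 0, 6.5):
    print(current, mean_reversion_signal(current, running_avg=5))
# 10 -> -1, 5 -> 0, 0 -> 1, 6.5 -> 0
```

Note that a fixed threshold of 2 is in the units of the data, so it will fire far more often on a series trading near 9500 than on one near 5.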


## Run the attacker on mock data
Let's put these together to create an attacker with both `tick` and `predict`.

In [15]:
horizon=100                           # Prediction horizon
attacker = MyAttacker()         # Always reset an attacker
attacker.predict = types.MethodType(predict, attacker)  # <-- If you find this awkward, you can always just put predict() in the class itself.

xs = [1,3,4,2,4,5,1,5,2,5,10]*100
for x in xs:
    y = attacker.tick_and_predict(x=x, horizon=horizon)
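
The loop above leaves the bookkeeping to `tick_and_predict`. To get a feel for what resolving decisions after `horizon` steps might involve, here is a standalone sketch. It assumes that a decision `d` made at time `t` earns `d * (x[t + horizon] - x[t])`. That accounting is a simplification chosen for illustration and is not necessarily what `AttackerWithSimplePnL` does. The name `toy_pnl` is hypothetical.

```python
from collections import deque

def toy_pnl(xs, horizon, a=0.01, threshold=2.0):
    """Total profit of the threshold rule under a simplified accounting (assumption)."""
    avg = None
    pending = deque()     # (resolve_index, decision, value_at_decision)
    total = 0.0
    for t, x in enumerate(xs):
        # Resolve any decision whose horizon has elapsed
        while pending and pending[0][0] == t:
            _, d, x0 = pending.popleft()
            total += d * (x - x0)
        # Update the moving average, then decide
        avg = x if avg is None else (1 - a) * avg + a * x
        gap = x - avg
        d = -1 if gap > threshold else (1 if gap < -threshold else 0)
        if d != 0:
            pending.append((t + horizon, d, x))
    return total

# A spike to 10 is followed by a return to 0, so the `down` call earns 10
print(toy_pnl([0, 0, 0, 10, 0], horizon=1, a=0.5, threshold=2.0))   # 10.0
```

Decisions made within the last `horizon` points are never resolved in this sketch, which is one reason a count of resolved decisions can be lower than the number of decisions made.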

## Run the attacker on real data

In [16]:
horizon = 100       # Prediction horizon
x_stream = stream_generator(stream_id=1,category='train')   # A real stream from the training category
attacker = MyAttacker()
attacker.predict = types.MethodType(predict, attacker)     #  <-- If you get sick of doing this then put the method in the class at the outset
for x in x_stream:
    y = attacker.tick_and_predict(x=x,horizon=horizon)

attacker.state

{'running_avg': 6441.850010946288, 'current_value': 6439.799999998536}

## Check the attacker's profit and loss


In [17]:
pprint(attacker.pnl.summary())

{'current_ndx': 181700,
 'losses': 76,
 'num_resolved_decisions': 181,
 'profit_per_decision': 0.05552486187847094,
 'standardized_profit_per_decision': 0.03370947839520563,
 'total_profit': 10.050000000003239,
 'win_loss_ratio': 1.381578947368421,
 'wins': 105}
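
The ratios in this summary can be recomputed from the counts, which is a handy sanity check. The sketch below simply redoes the arithmetic on the numbers printed above. It is an observation about these figures, not a description of how the library computes them, and `win_rate` is an extra quantity we add for illustration.

```python
summary = {'wins': 105, 'losses': 76, 'num_resolved_decisions': 181,
           'total_profit': 10.050000000003239}

win_loss_ratio = summary['wins'] / summary['losses']
profit_per_decision = summary['total_profit'] / summary['num_resolved_decisions']
win_rate = summary['wins'] / summary['num_resolved_decisions']

print(round(win_loss_ratio, 4))        # 1.3816
print(round(profit_per_decision, 4))   # 0.0555
print(round(win_rate, 4))              # 0.5801
```

So the attacker spoke up only 181 times over roughly 180,000 data points, and was right on about 58% of those occasions.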


## Train (globally) using many streams
Let's create a function that evaluates the attacker for a choice of parameter `a`:

In [18]:
def total_profit_objective(a, category='train', verbose=True):
    # Returns the negated total profit across streams for smoothing parameter a
    NUM_STREAMS = 20
    horizon = 100                 # Prediction horizon
    total_profit = 0
    for stream_id in range(NUM_STREAMS):
        attacker = MyAttacker(a=a)     # Fresh attacker for every stream
        attacker.predict = types.MethodType(predict, attacker)
        x_stream = stream_generator(stream_id=stream_id,category=category)
        for x in x_stream:
            y = attacker.tick_and_predict(x=x,horizon=horizon)
        pnl = attacker.pnl.summary()
        total_profit += pnl['total_profit']
    if verbose:
        print(f'Using a={a} the total profit on the {category} data is {total_profit}')
    return -total_profit         # So smaller is better for the optimizer

# Let's try it out
profit = -total_profit_objective(a=0.06)

Using a=0.06 the total profit on the train data is 2.8920028024899036


Now we can pass this to an optimizer. Because the objective returns the negated profit, minimizing it maximizes profit.

In [None]:
import scipy.optimize as opt
result = opt.minimize_scalar(total_profit_objective, bounds=(0.001, 0.2), method='bounded', options={'maxiter': 5})

# Print the result
print(f"Optimal value of a: {result.x}")
print(f"Maximum total profit: {-result.fun}")  # Re-negate to get the actual profit

Using a=0.07701123623877092 the total profit on the train data is 123.64407981738506
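
With `maxiter` set to 5 the bounded search only looks at a handful of values of `a`, and there is no guarantee that the profit is a smooth, single-peaked function of `a`. A coarse grid search is a simple alternative that makes no such assumption. The sketch below is self-contained: `grid_search` is our own helper and `toy_objective` is a hypothetical stand-in for `total_profit_objective`.

```python
def grid_search(objective, lo, hi, num_points=20):
    """Evaluate objective on an evenly spaced grid; return (best_a, best_value)."""
    step = (hi - lo) / (num_points - 1)
    candidates = [lo + i * step for i in range(num_points)]
    values = [objective(a) for a in candidates]
    best = min(range(num_points), key=values.__getitem__)   # index of smallest value
    return candidates[best], values[best]

def toy_objective(a):
    # Hypothetical stand-in with its minimum at a = 0.08
    return (a - 0.08) ** 2

best_a, best_value = grid_search(toy_objective, lo=0.001, hi=0.2, num_points=200)
print(best_a, best_value)
```

In this notebook one could pass `lambda a: total_profit_objective(a, verbose=False)` as the objective instead. Each evaluation runs through 20 full streams, so a much smaller `num_points` would be sensible.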


## Oh wait ... does it work on the test set too?

In [None]:
test_profit = -total_profit_objective(a=0.13, category='test')
if test_profit < 0:
    print('Back to the drawing board!')

Using a=0.13 the total profit is -1.722777777775736
Back to the drawing board!
