<a href="https://colab.research.google.com/github/microprediction/endersnotebooks/blob/main/pandas_ta_attacker.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install --upgrade git+https://github.com/microprediction/endersgame.git
!pip install pandas_ta
# It's probably fine to use the simpler import by the time your read this :)
#!pip install --upgrade endersgame

Collecting git+https://github.com/microprediction/endersgame.git
  Cloning https://github.com/microprediction/endersgame.git to /tmp/pip-req-build-1met_r19
  Running command git clone --filter=blob:none --quiet https://github.com/microprediction/endersgame.git /tmp/pip-req-build-1met_r19
  Resolved https://github.com/microprediction/endersgame.git to commit 0ccd0e66c6171baa04bafc36886e836b5e4aceb5
  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting river (from endersgame==0.4.3)
  Downloading river-0.21.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.0 kB)
Downloading river-0.21.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m3.1/3.1 MB[0m [31m22.7 MB/s[0m eta [36m0:00:00[0m
[?25hBuilding wheels for collected packages: endersgame
  Building wheel for endersgame (setup.py) ... [?25l[?25hdone
  Created wheel for endersgame: filename=endersgame-0.4.3-py3-none-an

# Pandas Technical Analysis Attacker
This notebook demonstrates how to create an `Attacker` described in [attacker.md](https://github.com/microprediction/endersgame/blob/main/endersgame/attackers/attacker.md). You may want to glance at this [notebook](https://github.com/microprediction/endersnotebooks/blob/main/mean_reversion_attacker.ipynb) also, if you seek more context or wish to know how these attackers can be used in a new tournament.

Here we'll use:

*   The `pandas_ta` package to generate features from lags.
*   The `river` package to update a running regression.




In [3]:
from endersgame import Attacker, HORIZON, EPSILON
from river import linear_model
from collections import deque
from endersgame import stream_generator_generator
from pprint import pprint
import pandas as pd
import pandas_ta as ta
from endersgame.accounting.pnlutil import zero_pnl_summary, add_pnl_summaries

### Creating an Attacker driven by technical analysis signals
We derive from `Attacker` and use `linear_model.LinearRegression` from the river package to maintain a regression estimate of the value `HORIZON` steps ahead. Then, we `buy` if the prediction is considerably higher than `EPSILON` above the current value, and conversely.



In [4]:
class MyAttacker(Attacker):
    """
    An attacker that computes technical indicators based on recent history and uses
    an online linear regression model to predict future values and make trading decisions.

    Remarks:
       - pip install pandas_ta
       - Performance is hindered because features are computed repeatedly every data point
       - But it supports many features https://github.com/twopirllc/pandas-ta?tab=readme-ov-file#indicators-by-category

    """

    def __init__(self, max_history_len=500, threshold: float = 1.0, burn_in=1000, **kwargs):
        """
        Initializes the attacker.

        Parameters:
        - max_history_len (int): Number of recent data points to use for computing technical indicators.
        - threshold (float): Multiplier for EPSILON to decide when to act.
        - burn_in (int): Number of initial observations to skip before making predictions.
        """
        super().__init__(max_history_len=max_history_len, **kwargs)
        self.num_lags = max_history_len                # Number of recent values to use for technical indicators
        self.model = linear_model.LinearRegression(    # Online linear regression model
            intercept_init=0.0,                        # Initialize intercept to 0
            intercept_lr=0.0                            # Freeze the intercept (no learning)
        )
        self.input_queue = deque()                     # Queue to store input vectors and time indices
        self.current_ndx = 0                           # Observation index
        self.threshold = threshold
        self.burn_in = burn_in

    def compute_indicators(self, data: pd.Series):
        """
        Computes several technical indicators based on the recent history.

        Parameters:
        - data (pd.Series): The recent historical data as a pandas Series.

        Returns:
        - indicators (dict): A dictionary of computed technical indicators.
        """
        indicators = {}

        # Compute Relative Strength Index (RSI)
        rsi = ta.rsi(data, length=14)
        if rsi is not None and not rsi.empty:
            indicators['rsi'] = rsi.iloc[-1]
        else:
            indicators['rsi'] = 0  # Default value if not enough data

        # Compute Simple Moving Average (SMA)
        sma_50 = ta.sma(data, length=50)
        if sma_50 is not None and not sma_50.empty:
            indicators['sma_50'] = sma_50.iloc[-1]
        else:
            indicators['sma_50'] = 0  # Default value if not enough data

        # Compute Exponential Moving Average (EMA)
        ema_20 = ta.ema(data, length=20)
        if ema_20 is not None and not ema_20.empty:
            indicators['ema_20'] = ema_20.iloc[-1]
        else:
            indicators['ema_20'] = 0  # Default value if not enough data

        return indicators

    def tick(self, x):
        """
        Processes the new data point.

        - Maintains a queue of input vectors.
        - When the future value arrives after HORIZON steps, updates the model.

        Parameters:
        - x (float): The new data point.
        """
        self.current_ndx += 1  # Increment the observation index

        # Get recent history and convert to pandas Series
        history = self.get_recent_history(n=self.num_lags)
        if len(history) >= self.num_lags:
            history_series = pd.Series(history)

            # Compute technical indicators from the history
            indicators = self.compute_indicators(history_series)

            # Store the indicators and current index in the input queue
            self.input_queue.append({'ndx': self.current_ndx, 'indicators': indicators})

        # Check if we can update the model with data from HORIZON steps ago
        while self.input_queue and self.input_queue[0]['ndx'] <= self.current_ndx - HORIZON:
            # Retrieve the indicator vector and its time index
            past_data = self.input_queue.popleft()
            X_past = past_data['indicators']

            # The target value y is the data point at 'time_past + HORIZON'
            y = x  # Current data point is the target for the input from HORIZON steps ago

            # Update the model incrementally
            self.model.learn_one(X_past, y)

    def predict(self, horizon=HORIZON):
        """
        Makes a prediction for HORIZON steps ahead and decides whether to buy, sell, or hold.

        Parameters:
        - horizon (int): The prediction horizon (should be HORIZON).

        Returns:
        - int: 1 for buy, -1 for sell, 0 for hold.
        """
        if self.current_ndx < self.burn_in:
            return 0  # Not enough data for model to be reliable

        # Get recent history and convert to pandas Series
        history = self.get_recent_history(n=self.num_lags)
        if len(history) >= self.num_lags:
            history_series = pd.Series(history)

            # Compute technical indicators for the prediction
            indicators = self.compute_indicators(history_series)

            # Predict the future value HORIZON steps ahead
            y_pred = self.model.predict_one(indicators)

            # Get the last known value
            last_value = history_series.iloc[-1]

            # Calculate the expected profit
            expected_profit = y_pred - last_value

            # Decide based on whether expected profit exceeds threshold * EPSILON
            if expected_profit > self.threshold * EPSILON:
                return 1  # Buy
            elif expected_profit < -self.threshold * EPSILON:
                return -1  # Sell
            else:
                return 0  # Hold
        else:
            return 0  # Not enough history to make a prediction


### Explanation

### `tick` Method

The `tick` method processes each new incoming data point and updates the attacker's state accordingly:

- **Retrieve Recent History**:
  - Uses `get_recent_history(n=self.num_lags)` to fetch the most recent `num_lags` data points from the history maintained by the parent `Attacker` class.

- **Compute Technical Indicators**:
  - Converts the recent history into a pandas Series.
  - Calls `compute_indicators(history_series)` to calculate technical indicators such as RSI, SMA, and EMA based on the recent data.

- **Queue Indicators for Future Training**:
  - Appends a dictionary containing the current index (`'ndx'`) and the computed indicators (`'indicators'`) to `self.input_queue`.
  - This queue ensures that each set of indicators is paired with the correct future data point after `HORIZON` steps.

- **Update the Model with Historical Data**:
  - Checks if there are any indicator sets in `self.input_queue` that are now `HORIZON` steps old.
  - If such data exists, it:
    - Removes the oldest indicator set from the queue.
    - Uses the current data point `x` as the target value `y` corresponding to the past indicators.
    - Updates the online linear regression model incrementally with the feature-target pair using `self.model.learn_one(X_past, y)`.

### `predict` Method

The `predict` method uses the current state of the model to make trading decisions based on predicted future values:

- **Burn-in Period Check**:
  - If the number of processed data points (`self.current_ndx`) is less than `burn_in`, the method returns `0` (hold) to allow the model to stabilize and gather sufficient training data.

- **Ensure Sufficient History**:
  - Checks if there are enough data points (`num_lags`) to compute the required technical indicators.
  - If not, it returns `0` (hold) as there isn't enough information to make a reliable prediction.

- **Compute Indicators for Prediction**:
  - Retrieves the most recent `num_lags` data points using `get_recent_history(n=self.num_lags)` and converts them into a pandas Series.
  - Calls `compute_indicators(history_series)` to calculate the necessary technical indicators based on the latest data.

- **Make a Prediction**:
  - Uses the online linear regression model to predict the future value `HORIZON` steps ahead based on the computed indicators: `y_pred = self.model.predict_one(indicators)`.

- **Calculate Expected Profit**:
  - Determines the expected profit by subtracting the last known value from the predicted value: `expected_profit = y_pred - last_value`.

- **Decision Logic**:
  - **Buy (`1`)**: If the expected profit exceeds `threshold * EPSILON`, indicating a significant positive change.
  - **Sell (`-1`)**: If the expected profit is below `-threshold * EPSILON`, indicating a significant negative change.
  - **Hold (`0`)**: If the expected profit is within the range `[-threshold * EPSILON, threshold * EPSILON]`, indicating no substantial change.

This method ensures that trading decisions are only made when the model predicts a sufficiently large movement in the target variable, thereby potentially increasing the effectiveness of the trading strategy by avoiding minor fluctuations.



## Run the attacker on mock data
We use `tick_and_predict` from the parent class as this will track profit and loss for us.

In [5]:
attacker = MyAttacker()               # Always reset an attacker

xs = [1,3,4,2,4,5,1,5,2,5,10]*100
for x in xs:
   y = attacker.tick_and_predict(x=x)

## Run the attacker on real data
We reset the attacker every time it encounters a new stream, but track aggregate statistics.

In [6]:
gen_gen = stream_generator_generator(category='test')    # <-- You might want to change 'train' to 'test'
attacker = MyAttacker(max_history_len=20, threshold=2.0, burn_in=1000)
total_pnl = zero_pnl_summary()
for stream in gen_gen:
    for message in stream:
        attacker.tick_and_predict(x=message['x'])
    stream_pnl = attacker.pnl.summary()
    total_pnl = add_pnl_summaries(total_pnl,stream_pnl)

total_pnl.update({'profit_per_decision':total_pnl['total_profit']/total_pnl['num_resolved_decisions']})
pprint(total_pnl)

{'current_ndx': 4254823,
 'losses': 2125005,
 'num_resolved_decisions': 4193143,
 'profit_per_decision': -0.14755682896724368,
 'total_profit': -618726.884486195,
 'wins': 2068138}


And that's all we have. Again, you may want to refer to this [notebook](https://github.com/microprediction/endersnotebooks/blob/main/mean_reversion_attacker.ipynb) also.