# TARGET DETERMINATION FOR PIP MINER MODEL

This experiment extends the `parameters` experiment. Given the range of parameters with a stable Martin ratio:
- Which cluster identities should be selected, and how can they be combined into a strategy?
- What exit rule should the combined strategy use?

In [14]:
# Import Necessary Libraries, Define the parameters
import logging
from pathlib import Path
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pandas_ta as ta  # noqa
import plotly.graph_objects as go  # noqa
import quantstats as qt
import seaborn as sns
from quantminer import Miner

logger = logging.getLogger('optuna')
logger.setLevel(logging.WARNING)

data_dir = Path.cwd().parent / 'data'


### STEP 0 : DATA PREPARATION AND MODEL TRAINING
- Asset : EURUSD, 1-hour
- Parameters
  - n_pivots : 3; 4
  - n_clusters : 16; 15
  - n_lookback : 8; 14
  - hold_period : 2; 3

In [15]:
# Read Price Data
data_path = data_dir / 'eur_h1.parquet'
raw_data = pd.read_parquet(data_path)

# Clean the data
data = raw_data.copy()
data = data.dropna()

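# Feature Engineering
# 'returns' is the bar-to-bar change in close (price units, not percent)
data['returns'] = data['close'].diff().fillna(0)
# 'returns+1' is the next bar's change aligned to the current bar (NaN on the last row)
data['returns+1'] = data['returns'].shift(-1)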

# Prepare the training data
train_daterange = pd.date_range('2010-01-01', '2021-12-31', freq='1h')
train_df = data[data.index.isin(train_daterange)]
train_data = np.array(train_df['close'])

In [16]:
# Parameters
n_pivots = 3
n_clusters = 24
n_lookback = 15
hold_period = 3

miner = Miner(
    n_pivots=n_pivots,
    n_clusters=n_clusters,
    n_lookback=n_lookback,
    hold_period=hold_period,
    model_type='sequential'
)

# Fit the model
miner.fit(train_data)

11.722329145998637

In [17]:
# Create a feature for the predicted labels
data['cluster_labels'] = miner.transform(data['close']).astype(int)
# Re-slice the training frame so that it carries the new 'cluster_labels' column
train_df = data[data.index.isin(train_daterange)]

### EXPERIMENT ONE : STRATEGY SELECTION
For this experiment, we select the clusters that beat a benchmark (Buy-and-Hold) on the following thresholds:
- Profit Factor : 1
- Sharpe ratio : 
- Ulcer Performance Index : From base data
- Average Drawdown : From base data
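
The thresholds marked "From base data" can be derived from the Buy-and-Hold returns. The following is a minimal sketch, not the `quantminer` implementation. It assumes additive per-bar returns (such as the `returns` column), an equity curve equal to their cumulative sum, and drawdowns measured in absolute units rather than percent. It also assumes the common definitions of the Ulcer Index (root mean square drawdown) and the Ulcer Performance Index (mean return over Ulcer Index, also known as the Martin ratio, with no risk-free rate). The names `drawdown` and `benchmark_metrics` are hypothetical.

```python
import numpy as np


def drawdown(returns):
    """Drawdown of the cumulative-sum equity curve from its running peak (values <= 0)."""
    equity = np.cumsum(np.asarray(returns, dtype=float))
    # The running peak starts at 0, the equity before the first bar
    peak = np.maximum(np.maximum.accumulate(equity), 0.0)
    return equity - peak


def benchmark_metrics(returns):
    """Reference metrics for a returns array (assumed definitions, see lead-in)."""
    returns = np.asarray(returns, dtype=float)
    dd = drawdown(returns)
    gains = returns[returns > 0].sum()
    losses = -returns[returns < 0].sum()
    ulcer = np.sqrt(np.mean(dd ** 2))
    return {
        # Gross gains over gross losses
        'profit_factor': gains / losses if losses > 0 else np.inf,
        'ulcer_index': ulcer,
        # Also called the Martin ratio
        'ulcer_performance_index': returns.mean() / ulcer if ulcer > 0 else np.inf,
        # Mean drawdown over all bars, including bars at a new peak
        'avg_drawdown': dd.mean(),
    }
```

In this notebook, a call such as `benchmark_metrics(train_df['returns'])` would give the Buy-and-Hold reference values against which each cluster is compared.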

#### PROCEDURE
1. For each label/cluster, compute the returns array and Martin ratio. Keep only those that meet the requirement (beat the benchmark, i.e. the Buy-and-Hold returns), and map each returns array to its label and direction.
2. Select the best label by Martin ratio.
3. Compute the drawdown correlation between the returns of the best label and the returns of every other label. Select and store the returns whose correlation is below a threshold value (default = 0.4).
4. Combine the returns in one of two ways:
   - STRATEGY 1 : by precedence, in order of descending Martin ratio
   - STRATEGY 2 : concurrent returns are allowed
5. Test the strategies on test_data.
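
The procedure above could be sketched roughly as follows. This is an illustration under stated assumptions, not the `quantminer` implementation. It assumes a position opened on a cluster label is held for `hold_period` bars, returns are additive per bar, and the Martin ratio is the mean return divided by the root mean square drawdown. All function names are hypothetical.

```python
import numpy as np
import pandas as pd


def drawdown(returns):
    """Drawdown of the cumulative-sum equity curve from its running peak."""
    equity = returns.cumsum()
    return equity - equity.cummax().clip(lower=0.0)


def martin_ratio(returns):
    """Mean return over Ulcer Index (assumed definition, no risk-free rate)."""
    ulcer = np.sqrt((drawdown(returns) ** 2).mean())
    if ulcer == 0:
        return np.inf if returns.mean() > 0 else 0.0
    return returns.mean() / ulcer


def cluster_position(labels, label, direction, hold_period):
    """+1/-1 while a signal from `label` is within its holding window, else 0."""
    entry = (labels == label).astype(float)
    held = entry.rolling(hold_period, min_periods=1).max()
    return held * direction


def select_positions(labels, next_returns, benchmark, hold_period, max_corr=0.4):
    """Steps 1-3: score each (label, direction), then filter by drawdown correlation."""
    scored = []
    for label in np.unique(labels):
        for direction, side in ((1, 'long'), (-1, 'short')):
            pos = cluster_position(labels, label, direction, hold_period)
            rets = pos * next_returns
            score = martin_ratio(rets)
            if score > benchmark:  # step 1: must beat the benchmark
                scored.append((score, f'{label}_{side}', pos, rets))
    if not scored:
        return pd.DataFrame(index=labels.index)
    scored.sort(key=lambda item: item[0], reverse=True)  # step 2: best first
    best_dd = drawdown(scored[0][3])
    keep = {scored[0][1]: scored[0][2]}
    for _, name, pos, rets in scored[1:]:
        # Step 3: a NaN correlation (flat drawdown) compares False and is skipped
        if best_dd.corr(drawdown(rets)) < max_corr:
            keep[name] = pos
    return pd.DataFrame(keep)  # columns are in descending Martin ratio order


def combine_positions(positions, concurrent):
    """Step 4: sum overlapping positions, or keep only the highest-ranked active one."""
    if positions.shape[1] == 0:
        return pd.Series(0.0, index=positions.index)
    if concurrent:  # STRATEGY 2
        return positions.sum(axis=1)
    # STRATEGY 1: first non-zero position per row, in column (ranking) order
    first_active = positions.where(positions != 0).bfill(axis=1).iloc[:, 0]
    return first_active.fillna(0.0)
```

With the columns built earlier in this notebook, the inputs would be `train_df['cluster_labels']` and `train_df['returns+1'].fillna(0)`, with `benchmark` set to the Martin ratio of the Buy-and-Hold returns. Multiplying the combined position by the next-bar returns gives the strategy returns, and the same selected columns can then be evaluated on the test period (step 5).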