## Strategy
How the Strategy module will be used:

It will take:
- df (the start and end dates are provided within the dataframe)
- the type of objective function to use ('Sharpe Ratio', 'Multiple', ... any metric)


It will contain methods:
- that will perform dynamic universe selection
- that will contain the trading strategy (will take the parameters as input)
- that will optimize for the best parameters given the objective function (will call the trading strategy method)\
-> Make sure to enforce the use of discrete parameters (by using an integer space (not real))\
-> Use BayesOptCV (cross validation, not Bayesian Optimization)
- that will perform the walk forward analysis (from sklearn.model_selection import TimeSeriesSplit)

It will output the strategy return column, position, cumulative return, trades, sessions, cumulative session return.
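
As a rough illustration of two of the points above (choosing the objective by name and keeping the optimized parameters on an integer grid), a minimal sketch could look like the following. The names `OBJECTIVES`, `sharpe_ratio`, `multiple`, `make_objective` and `toy_strategy` are hypothetical and not part of the module, the Sharpe ratio assumes a zero risk-free rate and daily returns, and the optimizer is assumed to propose real-valued parameters as keyword arguments.

```python
import math
from statistics import mean, stdev
from typing import Callable, Dict, List


def sharpe_ratio(returns: List[float], periods_per_year: int = 365) -> float:
    """Annualized Sharpe ratio (assumes zero risk-free rate and daily returns)."""
    if len(returns) < 2:
        return 0.0
    sd = stdev(returns)
    if sd == 0:
        return 0.0
    return mean(returns) / sd * math.sqrt(periods_per_year)


def multiple(returns: List[float]) -> float:
    """Final value of one unit of capital compounded over simple returns."""
    return math.prod(1.0 + r for r in returns)


# Hypothetical registry mapping the objective name to its metric function
OBJECTIVES: Dict[str, Callable[[List[float]], float]] = {
    "Sharpe Ratio": sharpe_ratio,
    "Multiple": multiple,
}


def make_objective(strategy: Callable[..., List[float]], metric: str) -> Callable[..., float]:
    """Wrap a strategy so that it is only ever evaluated on integer parameters."""
    score = OBJECTIVES[metric]

    def objective(**params: float) -> float:
        # Round every proposed value to the nearest integer before trading
        int_params = {name: int(round(value)) for name, value in params.items()}
        return score(strategy(**int_params))

    return objective


def toy_strategy(window: int) -> List[float]:
    """Stand-in for the trading strategy method: returns per-period returns."""
    return [0.01 * window, -0.005, 0.02]


objective = make_objective(toy_strategy, "Multiple")
result = objective(window=2.4)  # evaluated with window=2
```

Rounding inside the wrapper is only one way to enforce discreteness; an optimizer that supports a native integer search space would make the wrapper unnecessary.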


======================================================

Practical Recommendations:

- Low-dimensional problems (<5 dimensions): init_points: 5–10, n_iter: 10–30
- Moderate-dimensional problems (5–10 dimensions): init_points: 10–15, n_iter: 30–50
- High-dimensional problems (>10 dimensions): Bayesian optimization might struggle due to the curse of dimensionality. Consider alternatives such as random search or evolutionary algorithms if the number of dimensions is very high.
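
The ranges above can be captured in a small lookup helper. This is only a sketch: `suggest_budget` is a hypothetical name, and `init_points` and `n_iter` are assumed to mean the number of initial random evaluations and the number of subsequent optimization iterations.

```python
from typing import Dict, Optional, Tuple


def suggest_budget(n_dims: int) -> Optional[Dict[str, Tuple[int, int]]]:
    """Return (min, max) ranges for init_points and n_iter, or None when
    the search space is large enough that another optimizer may be preferable."""
    if n_dims < 1:
        raise ValueError("n_dims must be at least 1")
    if n_dims < 5:
        return {"init_points": (5, 10), "n_iter": (10, 30)}
    if n_dims <= 10:
        return {"init_points": (10, 15), "n_iter": (30, 50)}
    # More than 10 dimensions: consider random search or evolutionary algorithms
    return None


budget = suggest_budget(3)
```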

======================================================

The key is to perform separate walk-forward analysis for both the strategies and the rebalancing process
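
To make the walk-forward idea concrete, the sketch below generates expanding-window train/test splits in plain Python. It is intended to mirror the default splitting scheme of `TimeSeriesSplit` without depending on scikit-learn; `walk_forward_splits` is a hypothetical helper. The strategy and the rebalancing process would each run their own loop over such splits, optimizing on the train indices and evaluating on the test indices.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, n_splits: int = 3
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists with an expanding training window."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("Not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        train = list(range(test_start))  # everything before the test window
        test = list(range(test_start, test_start + test_size))
        yield train, test


splits = list(walk_forward_splits(n_samples=10, n_splits=3))
for train, test in splits:
    print(f"train={train} test={test}")
```

Each test window starts right after its training window ends, so no future observation is used while optimizing.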

---

In [1]:
import requests
import json
import math
import pandas as pd
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
from qgridnext import show_grid
from datetime import datetime, timedelta
import sys
import os
import pandas_ta as ta
import sklearn as sk

# Ensure the directories are in the system path
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..', '..', 'Data_Management'))) #We have a double .. as we are in the Strategy subfolder
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..', '..', 'Universe_Selection')))
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..', '..', 'Signal_Generation')))
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..', '..', 'Risk_Management')))

# Import the modules
from data import Data
from calculations import Calculations, Metrics
from coarse import Coarse_1 as Coarse
from fine import Fine_1 as Fine
from entry_signal import Trend_Following, Mean_Reversion
from tail_risk import Stop_Loss, Take_Profit
from manage_trade import Manage_Trade
from position import Position

Skipping category 'layer-1', already processed.
Skipping category 'depin', already processed.
Skipping category 'proof-of-work-pow', already processed.
Skipping category 'proof-of-stake-pos', already processed.
Skipping category 'meme-token', already processed.
Skipping category 'dog-themed-coins', already processed.
Skipping category 'eth-2-0-staking', already processed.
Skipping category 'non-fungible-tokens-nft', already processed.
Skipping category 'governance', already processed.
Skipping category 'artificial-intelligence', already processed.
Skipping category 'infrastructure', already processed.
Skipping category 'layer-2', already processed.
Skipping category 'zero-knowledge-zk', already processed.
Skipping category 'storage', already processed.
Skipping category 'oracle', already processed.
Skipping category 'bitcoin-fork', already processed.
Skipping category 'restaking', already processed.
Skipping category 'rollup', already processed.
Skipping category 'metaverse', already p

Importing all_data.csv file for all types of data

In [9]:
# Specify the relative or absolute path to the CSV file
file_path = r"C:\Users\yassi\OneDrive\Documents\GitHub\Portfolio_1\Technical_Portfolio\Data_Management\all_data.csv"

# Read the CSV file
data = pd.read_csv(file_path, index_col=['date', 'coin'], parse_dates=['date'])
data

| date | coin | close | creturns | high | log_return | low | open | price | returns | volume | volume_in_dollars |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2017-08-17 05:00:00 | BTCUSDT | 4315.320000 | 1.506209e-03 | 4328.690000 | -6.498159 | 4291.370000 | 4308.830000 | 4315.320000 | 0.001506 | 2.323492e+01 | 1.002661e+05 |
| 2017-08-17 05:00:00 | ETHUSDT | 303.100006 | 4.940270e-03 | 303.279999 | -5.310335 | 300.000000 | 301.609985 | 303.100006 | 0.004940 | 3.776725e+02 | 1.144725e+05 |
| 2017-08-17 06:00:00 | BTCUSDT | 4324.350000 | 3.151810e-06 | 4345.450000 | -6.169374 | 4309.370000 | 4330.290000 | 4324.350000 | 0.002093 | 7.229691e+00 | 3.126371e+04 |
| 2017-08-17 07:00:00 | BTCUSDT | 4349.990000 | 1.868776e-08 | 4349.990000 | -5.127863 | 4287.410000 | 4316.620000 | 4349.990000 | 0.005929 | 4.443249e+00 | 1.932809e+04 |
| 2017-08-17 07:00:00 | ETHUSDT | 307.959991 | 8.617874e-05 | 307.959991 | -4.048752 | 302.600006 | 302.679993 | 307.959991 | 0.017444 | 7.547451e+02 | 2.324313e+05 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-12-27 23:00:00 | SYSUSDT | 0.112200 | 0.000000e+00 | 0.112400 | -5.632999 | 0.111100 | 0.111600 | 0.112200 | 0.003578 | 7.290790e+05 | 8.180266e+04 |
| 2024-12-27 23:00:00 | TRXUSDT | 0.259400 | 0.000000e+00 | 0.259600 | -7.860292 | 0.258600 | 0.259400 | 0.259400 | 0.000386 | 1.499596e+07 | 3.889953e+06 |
| 2024-12-27 23:00:00 | VETUSDT | 0.045760 | 0.000000e+00 | 0.045760 | -5.479996 | 0.045370 | 0.045580 | 0.045760 | 0.004169 | 1.004372e+07 | 4.596008e+05 |
| 2024-12-27 23:00:00 | WAXPUSDT | 0.040910 | 0.000000e+00 | 0.040940 | -5.915972 | 0.040590 | 0.040770 | 0.040910 | 0.002696 | 1.056349e+06 | 4.321524e+04 |


In [8]:
start_time = dt.datetime(2024, 1, 1)
end_time = dt.datetime(2024, 2, 1)
timeframes = ['1w', '1d', '4h', '1h', '30m','15m', '5m', '1m']
index = 3 #It is better to choose the highest frequency for the backtest to be able to downsample
interval = timeframes[index]
symbols = ['BTCUSDT', 'ETHUSDT', 'BNBUSDT', 'ADAUSDT', 'XRPUSDT']
data = Data(symbols, start_time, end_time, interval).df
data

AttributeError: 'Data' object has no attribute 'df'

The cell below is for the objective function.

In [4]:


#Generate a signal
tf = Trend_Following()

_df = tf.supertrend_signals(df, str_length, str_mult)

pos = Position(_df, _min_pos, _max_pos)
_df = pos.initialize_position()
sl = Stop_Loss(_df, sl_type, sl_ind_length, sl_ind_mult, sl_signal_only)
_df = sl.apply_stop_loss(fixed_sl, plot = True)
tp = Take_Profit(_df, tp_type, tp_mult, tp_signal_only)
_df = tp.apply_take_profit(fixed_tp, plot = True)

_df = cal.merge_cols(_df, common = 'exit_signal', use_clip = True)
_df = pos.calculate_position(_df)

mt = Manage_Trade(_df)
_df = mt.erw_actual_allocation(max_perc_risk, max_dollar_allocation)

#########################

_df = cal.update_all(_df)
_df

ValueError: If using all scalar values, you must pass an index

In [3]:
_df.session

date                        
2024-01-01 09:00:00  BTCUSDT     0.0
                     ETHUSDT     0.0
2024-01-01 10:00:00  BTCUSDT     0.0
                     ETHUSDT     0.0
2024-01-01 11:00:00  BTCUSDT     0.0
                                ... 
2024-01-31 16:00:00  ETHUSDT     7.0
2024-01-31 17:00:00  BTCUSDT    14.0
                     ETHUSDT     7.0
2024-01-31 18:00:00  BTCUSDT    14.0
                     ETHUSDT     8.0
Name: session, Length: 1460, dtype: float64

---

## Dynamic Universe Selection Strategy

current_universe = {}\
max_positions = 4
```pseudocode
for each row (time index):
	for each coin in current_universe:
		if the current position of the coin == 0:
			remove it from current_universe

	if len(current_universe) < max_positions:

		current_coins = coins at the current index
		available_coins = current_coins - current_universe => all coins not yet in the universe

		filter = above_ema, volume_rank < 50 (could be optimized), std_rank < 4 (should be a FINAL constant), entry_signal.shift() == 1
		potential_coins = available_coins with the filter applied => potential coins that could be added to the universe
		potential_coins = potential_coins.sort(based on std_rank)

		missing_positions = max_positions - len(current_universe)
		to_be_added = potential_coins[:missing_positions]

		current_universe = current_universe + to_be_added #Update the current universe

	for each coin in the current row:
		if coin is in current_universe:
			df[(time, coin), 'in_universe'] = True => mark it as part of the universe

return df[df['in_universe']]

```



In [4]:
def create_test_df(num_times: int = 5, num_coins: int = 10):
    """Creates a multi-index DataFrame for testing."""
    times = pd.date_range('2024-01-01', periods=num_times, freq='D')
    coins = [f"Coin_{i}" for i in range(num_coins)]
    index = pd.MultiIndex.from_product([times, coins], names=['time', 'coin'])
    
    df = pd.DataFrame(index=index)
    df['above_ema'] = np.random.choice([True, False], size=len(df))
    df['volume_rank'] = np.random.randint(1, 100, size=len(df))
    df['std_rank'] = np.random.randint(1, 10, size=len(df))
    df['entry_signal'] = np.random.randint(0, 2, size=len(df)) # 0 or 1
    df['position'] = np.random.randint(0, 2, size=len(df))
    return df

# Example usage to create a test DataFrame:
test_df = create_test_df(num_times=4, num_coins=8)
test_df.head()

| time | coin | above_ema | volume_rank | std_rank | entry_signal | position |
|---|---|---|---|---|---|---|
| 2024-01-01 | Coin_0 | False | 59 | 3 | 1 | 1 |
| 2024-01-01 | Coin_1 | False | 89 | 2 | 1 | 1 |
| 2024-01-01 | Coin_2 | False | 19 | 5 | 1 | 1 |
| 2024-01-01 | Coin_3 | True | 98 | 1 | 0 | 0 |
| 2024-01-01 | Coin_4 | True | 7 | 5 | 0 | 1 |


In [6]:
import pandas as pd
import numpy as np
from typing import List, Set, Tuple

def update_universe(df: pd.DataFrame, max_positions: int = 4) -> Tuple[pd.Series, Set[str]]:
    """
    Tracks a dynamic universe of coins over time.
    Expects the lower-frequency data (daily, weekly, etc.) as a stacked
    dataframe indexed by (time, coin). Adds an 'in_universe' column to df in place.
    Returns the 'in_universe' flags lagged by one period, and the final universe.
    """
    current_universe = set()
    df['in_universe'] = False

    for time_index in df.index.get_level_values(0).unique():
        # Drop coins whose position is closed *at this time index*
        coins_to_remove = [
            coin for coin in current_universe
            if (time_index, coin) in df.index and df.loc[(time_index, coin), 'position'] == 0
        ]
        current_universe.difference_update(coins_to_remove)

        current_coins = df.loc[time_index].index
        available_coins = set(current_coins) - current_universe

        if len(current_universe) < max_positions and available_coins:
            temp_df = df.loc[(time_index, list(available_coins)), :].copy()

            # No shift here: shifting this slice would move values across coins.
            # The one-period lag is applied per coin after the loop.
            # Note: the pseudocode specifies std_rank < 4; this cell uses < 10.
            filter_condition = (
                (temp_df['above_ema']) &
                (temp_df['volume_rank'] < 50) &
                (temp_df['std_rank'] < 10) &
                (temp_df['entry_signal'] == 1)
            )

            potential_coins_df = temp_df[filter_condition]

            if not potential_coins_df.empty:
                # Keep the std_rank ordering (a set would discard it) so the
                # lowest-ranked coins fill the open slots first
                potential_coins_df = potential_coins_df.sort_values(by='std_rank')
                potential_coins: List[str] = list(potential_coins_df.index.get_level_values(1))
                missing_positions = max_positions - len(current_universe)
                to_be_added = potential_coins[:missing_positions]
                current_universe.update(to_be_added)

        df.loc[(time_index, list(current_universe)), 'in_universe'] = True

    # Lag the flag by one period per coin to avoid look-ahead
    in_universe = df.groupby(level=1)['in_universe'].shift()
    return in_universe, current_universe

test_df['in_universe'], current_universe = update_universe(test_df)

print(current_universe)
test_df

{'Coin_2'}




| time | coin | above_ema | volume_rank | std_rank | entry_signal | position | in_universe |
|---|---|---|---|---|---|---|---|
| 2024-01-01 | Coin_0 | False | 59 | 3 | 1 | 1 | NaN |
| 2024-01-01 | Coin_1 | False | 89 | 2 | 1 | 1 | NaN |
| 2024-01-01 | Coin_2 | False | 19 | 5 | 1 | 1 | NaN |
| 2024-01-01 | Coin_3 | True | 98 | 1 | 0 | 0 | NaN |
| 2024-01-01 | Coin_4 | True | 7 | 5 | 0 | 1 | NaN |
| 2024-01-01 | Coin_5 | False | 39 | 7 | 0 | 1 | NaN |
| 2024-01-01 | Coin_6 | True | 39 | 2 | 1 | 0 | NaN |
| 2024-01-01 | Coin_7 | False | 29 | 1 | 0 | 1 | NaN |
| 2024-01-02 | Coin_0 | False | 69 | 1 | 0 | 0 | False |
| 2024-01-02 | Coin_1 | False | 33 | 1 | 1 | 1 | False |


In [7]:
###### To Optimize ######
#All parameters:
all_frequency = ['1W', '1D', '4h','1h', '30min','15min', '5min', '1min'] #All possible frequencies for the resampling
low_freq_index = 1 #The index of the lowest frequency for the resampling
low_freq = all_frequency[low_freq_index] #The lowest frequency for the resampling
max_dollar_allocation = 10000
std_window = 2
mean_window = 2
ema_window = 2
high_freq_index = 3 #The index of the highest frequency for the resampling
high_freq = all_frequency[high_freq_index] #The highest frequency for the resampling
str_length = 10
str_mult = 3
_min_pos = 0
_max_pos = 1
sl_type = 'atr'
sl_ind_length = 14
sl_ind_mult = 3
sl_signal_only = True
fixed_sl = True
tp_type = 'rr'
tp_mult = 2
tp_ind_length = 0
tp_signal_only = True
fixed_tp = True
max_perc_risk = 0.01




#Downsample the data
cal = Calculations()
df = cal.downsample(data, low_freq)

#Perform coarse analysis and filtering
coarse = Coarse()
df = coarse.volume_flag(df, max_dollar_allocation) #Use the downsampled df, not the raw data
df = coarse.sort_by_volume(df)
df = coarse.sort_by_std(df, std_window, mean_window)
fine = Fine()
df = fine.above_ema(df, ema_window)

#Apply update_universe
df['in_universe'], current_universe = update_universe(df)


NameError: name 'data' is not defined

In [34]:
#Join the universe selection data with high frequency data
test_df_ = test_df.unstack().reindex(_df.unstack().index)

In [39]:
test_df_.ffill().head(50)

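| date | above_ema / Coin_0 | above_ema / Coin_1 | above_ema / Coin_2 | above_ema / Coin_3 | above_ema / Coin_4 | above_ema / Coin_5 | above_ema / Coin_6 | above_ema / Coin_7 | volume_rank / Coin_0 | volume_rank / Coin_1 | ... | position / Coin_6 | position / Coin_7 | in_universe / Coin_0 | in_universe / Coin_1 | in_universe / Coin_2 | in_universe / Coin_3 | in_universe / Coin_4 | in_universe / Coin_5 | in_universe / Coin_6 | in_universe / Coin_7 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2024-01-01 09:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2024-01-01 10:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2024-01-01 11:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2024-01-01 12:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2024-01-01 13:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2024-01-01 14:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2024-01-01 15:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2024-01-01 16:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2024-01-01 17:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2024-01-01 18:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |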
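
One possible reason for the empty leading rows: `reindex` keeps only rows whose timestamps match exactly, so daily rows stamped at midnight have no counterpart in an hourly index that starts at 09:00, and a later `ffill()` has nothing to carry forward until the next exact match. The sketch below uses made-up flags and hypothetical names (`daily`, `hourly_index`) to contrast this with `reindex(..., method='ffill')`, which takes the last low-frequency row at or before each high-frequency timestamp.

```python
import pandas as pd

# Low-frequency (daily) universe flags, one column per coin (made-up values)
daily = pd.DataFrame(
    {"Coin_0": [1, 0], "Coin_1": [0, 1]},
    index=pd.to_datetime(["2024-01-01", "2024-01-02"]),
)

# High-frequency (hourly) index starting at 09:00, ending at 02:00 the next day
hourly_index = pd.date_range(
    "2024-01-01 09:00", periods=18, freq=pd.Timedelta(hours=1)
)

# Exact-match reindex, then forward fill: rows before the first exact match stay NaN
exact = daily.reindex(hourly_index).ffill()

# As-of reindex: each hourly row takes the latest daily row at or before it
asof = daily.reindex(hourly_index, method="ffill")

print(exact.head(3))
print(asof.head(3))
```

The as-of variant requires the low-frequency index to be sorted. If the low-frequency flags were not already lagged by one period, spreading a daily row across the hours of the same day may introduce look-ahead.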


---

### Objective Function