## FX Strategy using CAI prediction

This strategy trades the EUR/USD exchange rate. The hypothesis is that USD rises against EUR during European business hours and falls during US business hours. This is known as the time-of-day effect and is attributed to high-frequency order flow and returns (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2099321).

In [1]:
from AlgorithmImports import *
from datetime import datetime, time

In [2]:
from QuantConnect.PredictNowNET import PredictNowClient
from QuantConnect.PredictNowNET.Models import *

In [3]:
client = PredictNowClient("email", "username")
client.connected

True

### Prepare the data
In this notebook, we will create a strategy that shorts EUR.USD when Europe is open and goes long when Europe is closed and the US is open. We will aggregate the daily returns of this static strategy, which is active every day, and use CAI to predict whether the strategy is profitable on a given date. We will follow this On/Off signal to create a dynamic strategy and benchmark its performance.
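
As a minimal sketch of the static rule described above, the hypothetical helper `time_of_day_signal` below maps a bar's time of day to a position. The hour boundaries (03:00 to 09:00 for the short leg, 11:00 to 15:00 for the long leg, both exclusive) are the ones used later in this notebook; the helper name is illustrative only, and the times are assumed to be in the same time zone as the price history index.

```python
from datetime import time


def time_of_day_signal(t: time) -> int:
    """Return -1 (short EUR.USD), +1 (long EUR.USD), or 0 (flat) for a bar time."""
    # Short while Europe is open (strictly between 03:00 and 09:00).
    if time(3) < t < time(9):
        return -1
    # Long while Europe is closed and the US is open (strictly between 11:00 and 15:00).
    if time(11) < t < time(15):
        return 1
    # Flat otherwise.
    return 0


# Signal for each hourly bar stamped on the hour.
hourly_signals = {hour: time_of_day_signal(time(hour)) for hour in range(24)}
```

With hourly bars stamped on the hour, this rule is short for the 04:00 to 08:00 bars and long for the 12:00 to 14:00 bars.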

In [4]:
# load minute bar data of EURUSD
qb = QuantBook()
symbol = qb.add_forex("EURUSD").symbol
df_price = qb.History(symbol, datetime(2020,1,1), datetime(2021,1,1)).loc[symbol]


In [5]:
# resample to hourly returns
minute_returns = df_price["close"].pct_change()
hourly_returns = (minute_returns + 1).resample('h').prod() - 1  # 'h' replaces the deprecated 'H' alias
df_hourly_returns = hourly_returns.to_frame()
df_hourly_returns['time'] = df_hourly_returns.index.time

# generate buy and sell signals and get strategy returns
# Sell EUR.USD when Europe is open
sell_eur = ((df_hourly_returns['time'] > time(3)) & (df_hourly_returns['time'] < time(9)))

# Buy EUR.USD when Europe is closed and US is open
buy_eur = ((df_hourly_returns['time'] > time(11)) & (df_hourly_returns['time'] < time(15)))

# signals as 1 and -1
ones = pd.DataFrame(1, index=df_hourly_returns.index, columns=['signals'])
minus_ones = pd.DataFrame(-1, index=df_hourly_returns.index, columns=['signals'])
signals = minus_ones.where(sell_eur, ones.where(buy_eur, 0))

# strategy returns
strategy_returns = df_hourly_returns['close'] * signals['signals']
strategy_returns = (strategy_returns + 1).resample('D').prod() - 1
df_strategy_returns = strategy_returns.to_frame().ffill()
df_strategy_returns = df_strategy_returns.reset_index()  # move the 'time' index into a column
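
The resampling steps above aggregate simple returns by compounding, that is, `prod(1 + r) - 1`, rather than by summing. A small worked sketch with made-up minute returns (the numbers and the helper name `compound` are illustrative, not taken from the data):

```python
import math


def compound(returns):
    """Compound a sequence of simple returns into one return: prod(1 + r) - 1."""
    return math.prod(1.0 + r for r in returns) - 1.0


# Three illustrative minute returns (not real market data).
minute_returns = [0.001, -0.0005, 0.002]

compounded = compound(minute_returns)  # what resample(...).prod() - 1 computes per bucket
summed = sum(minute_returns)           # naive sum, slightly different
```

For small returns the two values are close, but compounding is the exact aggregation for simple returns.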



### Save the data
We will label the data and save it to disk (ObjectStore) with the model name. This file will be uploaded to PredictNow.

In [6]:
# Define the model name and data label
model_name = "fx-time-of-day"
label = "strategy_ret"

# Label the data and save it to the object store
df_strategy_returns = df_strategy_returns.rename(columns={0: label, 'time': 'date'})
parquet_path = qb.object_store.get_file_path(f'{model_name}.parquet')
df_strategy_returns.to_parquet(parquet_path)
df_strategy_returns

          date  strategy_ret
0   2020-01-01      0.000000
1   2020-01-02      0.001234
2   2020-01-03      0.000757
3   2020-01-04      0.000000
4   2020-01-05      0.000000
..         ...           ...
361 2020-12-27      0.000000
362 2020-12-28      0.001719
363 2020-12-29     -0.000267
364 2020-12-30     -0.001754


### Create the Model
Create the model by sending the parameters to PredictNow.

In [7]:
model_parameters = ModelParameters(
    mode=Mode.TRAIN, 
    type=ModelType.CLASSIFICATION, 
    feature_selection=FeatureSelection.SHAP, 
    analysis=Analysis.SMALL, 
    boost=Boost.GBDT, 
    testsize=len(df_strategy_returns) / 2,  # testsize < 1 --> ratio of rows, > 1 --> exact number of rows
    timeseries=False,
    probability_calibration=False,    # set to True to calibrate the predicted probabilities
    exploratory_data_analysis=False,
    weights="no")                     # one of "yes", "no", "custom"

str(client.create_model(model_name, model_parameters))

'{"model_name":"","message":"Successfully stored the model","success":true}'
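
The `testsize` comment above says that values below 1 are read as a ratio and values above 1 as a row count. The sketch below spells out that rule with a hypothetical helper, `resolve_test_rows`, which is not part of the PredictNow client. How the service rounds a fractional row count such as 182.5 is an assumption here (truncation), chosen because it is consistent with the 183-row cross-validation frame shown in the training output of this notebook.

```python
def resolve_test_rows(n_rows: int, testsize: float) -> int:
    """Translate a testsize setting into a number of test rows (illustrative only)."""
    if testsize <= 0:
        raise ValueError("testsize must be positive")
    if testsize < 1:
        # Interpreted as a fraction of the data set.
        return int(n_rows * testsize)
    # Interpreted as an exact row count; truncation of fractions is an assumption.
    return int(testsize)


n_rows = 365                                      # daily rows for 2020-01-01 .. 2020-12-30
test_rows = resolve_test_rows(n_rows, n_rows / 2)  # same expression as in the cell above
cv_rows = n_rows - test_rows
```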

### Train the Model
Provide the path to the data, and its label.
This task may take several minutes.

In [8]:
train_request_result = client.train(model_name, parquet_path, label)
str(train_request_result)

'{"train_id":"d7c2c66d-54b2-4182-8714-cfd0f3dde25c","model_name":"saved_model_fx-time-of-day.pkl","message":"Training the model is successfully requested.","success":true}'

### Get the training result
We can create dataframes from the training results.

In [9]:
from time import sleep
while True:
    training_result = client.get_training_result(model_name, train_request_result.train_id)
    if training_result.completed:
        break
    print(training_result)
    sleep(5)

2025-02-01T17:32:28.0404716Z: PROGRESS (0/6) | In Progress...
2025-02-01T17:32:32.9930560Z: PROGRESS (1/6) | Performing preprocessing/building of the model
2025-02-01T17:32:34.1801730Z: PROGRESS (3/6) | Performing hyperparameter optimization...
2025-02-01T17:32:34.1801730Z: PROGRESS (3/6) | Performing hyperparameter optimization...
2025-02-01T17:32:44.4762650Z: PROGRESS (4/6) | Checking for feature selection methods...
2025-02-01T17:32:44.4762650Z: PROGRESS (4/6) | Checking for feature selection methods...
2025-02-01T17:32:57.4904390Z: PROGRESS (5/6) | Making predictions...
2025-02-01T17:33:00.7998190Z: PROGRESS (6/6) | Saving files...
2025-02-01T17:33:00.7998190Z: PROGRESS (6/6) | Saving files...
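
The polling loop above runs until the service reports completion. A minimal sketch of the same pattern with a timeout is shown below, so that a stalled job does not block the notebook indefinitely. The helper `poll_until_complete` and the stand-in responses are hypothetical; only the `completed` attribute is taken from the loop above.

```python
from time import monotonic, sleep
from types import SimpleNamespace


def poll_until_complete(fetch, interval=5.0, timeout=600.0):
    """Call fetch() every `interval` seconds until result.completed or `timeout` elapses."""
    deadline = monotonic() + timeout
    while True:
        result = fetch()
        if result.completed:
            return result
        if monotonic() >= deadline:
            raise TimeoutError("training did not complete in time")
        print(result)
        sleep(interval)


# Stand-in for the service: two in-progress responses, then a completed one.
_responses = iter([
    SimpleNamespace(completed=False),
    SimpleNamespace(completed=False),
    SimpleNamespace(completed=True),
])
result = poll_until_complete(lambda: next(_responses), interval=0.0, timeout=1.0)
```

In this notebook, `fetch` would be `lambda: client.get_training_result(model_name, train_request_result.train_id)`.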


In [10]:
from io import StringIO

# predicted probability (float between 0 and 1) for validation/training data set
# the last column notes the probability that it's a "1", i.e. positive return
predicted_prob_cv = pd.read_json(StringIO(training_result.predicted_prob_cv))
print("predicted_prob_cv")
print(predicted_prob_cv)

# predicted probability (float between 0 and 1) for the testing data set
predicted_prob_test = pd.read_json(StringIO(training_result.predicted_prob_test))
print("predicted_prob_test")
print(predicted_prob_test)

# predicted label, 0 or 1, for validation/training data set. Classified as class 1 if probability > 0.5
predicted_targets_cv = pd.read_json(StringIO(training_result.predicted_targets_cv))
print("predicted_targets_cv")
print(predicted_targets_cv)

# predicted label, 0 or 1, for testing data set. Classified as class 1 if probability > 0.5
predicted_targets_test = pd.read_json(StringIO(training_result.predicted_targets_test))
print("predicted_targets_test")
print(predicted_targets_test)

# feature importance scores show which features drive the prediction
# they are more informative when you supply your own features
# and are only available when feature_selection is set to SHAP or CMDA
if training_result.feature_importance:
    feature_importance = pd.read_json(StringIO(training_result.feature_importance))
    print("feature_importance")
    print(feature_importance)

# performance metrics in terms of accuracies and so on
performance_metrics = pd.read_json(StringIO(training_result.performance_metrics))
print("performance_metrics")
print(performance_metrics)

predicted_prob_cv
     Unnamed: 0       date       0.0       1.0
0             0 2020-01-01  0.336371  0.663629
1             1 2020-01-02  0.326334  0.673666
2             2 2020-01-03  0.852360  0.147640
3             3 2020-01-04  0.852360  0.147640
4             4 2020-01-05  0.852360  0.147640
..          ...        ...       ...       ...
178         178 2020-06-27  0.589632  0.410368
179         179 2020-06-28  0.589632  0.410368
180         180 2020-06-29  0.568739  0.431261
181         181 2020-06-30  0.841845  0.158155
182         182 2020-07-01  0.694885  0.305115

[183 rows x 4 columns]
predicted_prob_test
          date       0.0       1.0
0   2020-07-02  0.472320  0.527680
1   2020-07-03  0.490312  0.509688
2   2020-07-04  0.490312  0.509688
3   2020-07-05  0.490312  0.509688
4   2020-07-06  0.435553  0.564447
..         ...       ...       ...
178 2020-12-27  0.371432  0.628568
179 2020-12-28  0.236488  0.763512
180 2020-12-29  0.252852  0.747148
181 2020-12-30  0.456329
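
The predicted probabilities can be turned into the On/Off signal described at the start of this notebook by applying the 0.5 threshold and keeping the static strategy's return only on days classified as 1. The sketch below is a minimal illustration of that arithmetic, using the first three rows of `predicted_prob_cv` and `strategy_ret` shown above; the variable names are illustrative, and three in-sample rows say nothing about real performance.

```python
import math

dates = ["2020-01-01", "2020-01-02", "2020-01-03"]
prob_positive = [0.663629, 0.673666, 0.147640]   # column "1.0" of predicted_prob_cv
strategy_ret = [0.000000, 0.001234, 0.000757]    # static strategy daily returns

threshold = 0.5
# On (1) when the predicted probability of a positive return exceeds the threshold.
on_off = [1 if p > threshold else 0 for p in prob_positive]

# The dynamic strategy earns the static return only on "On" days.
dynamic_ret = [flag * r for flag, r in zip(on_off, strategy_ret)]

# Compound both return streams for a side-by-side comparison.
static_total = math.prod(1 + r for r in strategy_ret) - 1
dynamic_total = math.prod(1 + r for r in dynamic_ret) - 1
```

A fair benchmark would apply the same steps to the out-of-sample rows in `predicted_prob_test` rather than to the cross-validation rows used here.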

### Start predicting with the trained model

In [11]:
predict_result = client.predict(model_name, parquet_path, exploratory_data_analysis=False, probability_calibration=False)
str(predict_result)

'{"eda":null,"filename":"input_live_2025-02-01_17.33.21_merged_data.parquet","labels":"{\\"pred_target\\":{\\"0\\":0.0,\\"1\\":0.0,\\"2\\":0.0,\\"5\\":0.0,\\"6\\":0.0,\\"7\\":0.0,\\"8\\":0.0,\\"9\\":0.0,\\"12\\":0.0,\\"13\\":0.0,\\"14\\":0.0,\\"15\\":0.0,\\"16\\":0.0,\\"20\\":0.0,\\"21\\":0.0,\\"22\\":0.0,\\"23\\":0.0,\\"26\\":0.0,\\"27\\":0.0,\\"28\\":0.0,\\"29\\":0.0,\\"30\\":0.0,\\"33\\":0.0,\\"34\\":0.0,\\"35\\":0.0,\\"36\\":0.0,\\"37\\":0.0,\\"40\\":0.0,\\"41\\":0.0,\\"42\\":0.0,\\"43\\":0.0,\\"44\\":0.0,\\"48\\":0.0,\\"49\\":0.0,\\"50\\":0.0,\\"51\\":0.0,\\"54\\":0.0,\\"55\\":0.0,\\"56\\":0.0,\\"57\\":0.0,\\"58\\":0.0,\\"61\\":0.0,\\"62\\":0.0,\\"63\\":0.0,\\"64\\":0.0,\\"65\\":0.0,\\"68\\":0.0,\\"69\\":0.0,\\"70\\":0.0,\\"71\\":0.0,\\"72\\":0.0,\\"75\\":0.0,\\"76\\":0.0,\\"77\\":0.0,\\"78\\":0.0,\\"79\\":0.0,\\"82\\":0.0,\\"83\\":0.0,\\"84\\":0.0,\\"85\\":0.0,\\"86\\":0.0,\\"89\\":0.0,\\"90\\":0.0,\\"91\\":0.0,\\"92\\":0.0,\\"93\\":0.0,\\"96\\":0.0,\\"97\\":0.0,\\"98\\":0.0,\\"9

In [12]:
labels = pd.read_json(StringIO(predict_result.labels))
print("labels")
print(labels)
probabilities = pd.read_json(StringIO(predict_result.probabilities))
print("probabilities")
print(probabilities)

labels
     pred_target
0              0
1              0
2              0
5              0
6              0
..           ...
359            0
362            0
363            0
364            0
365            0

[257 rows x 1 columns]
probabilities
     1970-01-01 00:00:00  1970-01-01 00:00:01
0               0.721428             0.278572
1               0.721428             0.278572
2               0.721428             0.278572
5               0.721428             0.278572
6               0.721428             0.278572
..                   ...                  ...
359             0.721428             0.278572
362             0.721428             0.278572
363             0.721428             0.278572
364             0.721428             0.278572
365             0.721428             0.278572

[257 rows x 2 columns]
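
The column headers `1970-01-01 00:00:00` and `1970-01-01 00:00:01` in the output above are most likely the class labels 0 and 1 parsed as epoch timestamps, because `pd.read_json` tries to convert axis labels by default. A minimal sketch of one way around this is to pass `convert_axes=False` and restore the labels by hand. The JSON payload below is a made-up stand-in; the real key format of `predict_result.probabilities` is an assumption.

```python
from io import StringIO

import pandas as pd

# Hypothetical payload shaped like a column-oriented probabilities frame.
payload = '{"0.0": {"0": 0.721428, "1": 0.721428}, "1.0": {"0": 0.278572, "1": 0.278572}}'

# convert_axes=False keeps the row and column labels as the original strings.
probs = pd.read_json(StringIO(payload), convert_axes=False)

# Restore readable column names and an integer row index.
probs.columns = [f"prob_{int(float(c))}" for c in probs.columns]
probs.index = [int(i) for i in probs.index]
```

After this step the frame has the columns `prob_0` and `prob_1`, and its integer index can be aligned with the rows of the input data.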


