![QuantConnect Logo](https://cdn.quantconnect.com/web/i/icon.png)
<hr>

# FOREX Strategy using Corrective Artificial Intelligence (CAI)

This notebook connects to PredictNow, trains a model, and generates predictions.

The model hypothesis is that the USD rises against the EUR during European business hours and falls during US business hours. This is known as the time-of-day effect, and it has been linked to high-frequency order flow and returns (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2099321).

### Connect to PredictNow

In [1]:
from AlgorithmImports import *
from QuantConnect.PredictNowNET import PredictNowClient
from QuantConnect.PredictNowNET.Models import *
from datetime import datetime, time
from io import StringIO
from time import sleep
import pandas as pd

qb = QuantBook()
client = PredictNowClient("jared@quantconnect.com", "jared_broad")
client.connected

20250203 18:20:46.260 TRACE:: QuantBook started; Is Python: True


True

### Prepare the Data
In this notebook, we create a strategy that shorts EURUSD when Europe is open and goes long when Europe is closed and the US is open. We aggregate the daily returns of this static strategy, which is active every day, and use CAI to predict whether the strategy is profitable on a given date. We then follow this on/off signal to create a dynamic strategy and benchmark its performance.
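To make the static rule concrete, here is a minimal standard-library sketch of the same logic for a single day. The helper names `signal_for_hour` and `daily_strategy_return` and the sample returns are illustrative assumptions, not part of the notebook or of the PredictNow API. The hour boundaries mirror the strict comparisons used in the cell below, so the short window covers hourly bars 4 through 8 and the long window covers bars 12 through 14.

```python
from math import prod


def signal_for_hour(hour: int) -> int:
    """Return -1 (short EURUSD), +1 (long EURUSD), or 0 (flat) for an hourly bar.

    Uses the same strict bounds as the notebook: 3 < hour < 9 sells,
    11 < hour < 15 buys, every other hour is flat.
    """
    if 3 < hour < 9:
        return -1
    if 11 < hour < 15:
        return 1
    return 0


def daily_strategy_return(hourly_returns: dict[int, float]) -> float:
    """Compound the signed hourly returns of one day into a single daily return."""
    return prod(1 + signal_for_hour(h) * r for h, r in hourly_returns.items()) - 1


# Hypothetical hourly returns for one day, keyed by hour of day (made-up values).
example_day = {2: 0.0010, 5: -0.0004, 8: -0.0002, 13: 0.0003, 16: -0.0020}
print(daily_strategy_return(example_day))
```

Only the bars at hours 5, 8, and 13 contribute here. The moves at hours 2 and 16 fall outside both windows, so the strategy is flat and ignores them.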

In [2]:
# load minute bar data of EURUSD
symbol = qb.add_forex("EURUSD").symbol
df_price = qb.History(symbol, datetime(2020,1,1), datetime(2021,1,1)).loc[symbol]

# resample to hourly returns
minute_returns = df_price["close"].pct_change()
hourly_returns = (minute_returns + 1).resample('h').prod() - 1  # 'h' replaces the deprecated 'H' alias
df_hourly_returns = hourly_returns.to_frame()
df_hourly_returns['time'] = df_hourly_returns.index.time

# generate buy and sell signals and get strategy returns
# Sell EUR.USD when Europe is open
sell_eur = ((df_hourly_returns['time'] > time(3)) & (df_hourly_returns['time'] < time(9)))

# Buy EUR.USD when Europe is closed and US is open
buy_eur = ((df_hourly_returns['time'] > time(11)) & (df_hourly_returns['time'] < time(15)))

# signals as 1 and -1
ones = pd.DataFrame(1, index=df_hourly_returns.index, columns=['signals'])
minus_ones = pd.DataFrame(-1, index=df_hourly_returns.index, columns=['signals'])
signals = minus_ones.where(sell_eur, ones.where(buy_eur, 0))

# strategy returns
strategy_returns = df_hourly_returns['close'] * signals['signals']
strategy_returns = (strategy_returns + 1).resample('D').prod() - 1
df_strategy_returns = strategy_returns.to_frame().ffill()
df_strategy_returns.reset_index(level=None, drop=False, inplace=True, col_level=0, col_fill="")





### Save the Data
We will label the data and save it to disk (ObjectStore) with the model name. This file will be uploaded to PredictNow.

In [3]:
# Define the model name and data label
model_name = "fx-time-of-day"
label =  "strategy_ret"

# Label the data and save it to the object store
df_strategy_returns = df_strategy_returns.rename(columns={0: label, 'time': 'date'})
parquet_path = qb.object_store.get_file_path(f'{model_name}.parquet')
df_strategy_returns.to_parquet(parquet_path)
df_strategy_returns

|     | date       | strategy_ret |
| --- | ---------- | ------------ |
| 0   | 2020-01-01 | 0.000000     |
| 1   | 2020-01-02 | 0.001234     |
| 2   | 2020-01-03 | 0.000757     |
| 3   | 2020-01-04 | 0.000000     |
| 4   | 2020-01-05 | 0.000000     |
| ... | ...        | ...          |
| 361 | 2020-12-27 | 0.000000     |
| 362 | 2020-12-28 | 0.001719     |
| 363 | 2020-12-29 | -0.000267    |
| 364 | 2020-12-30 | -0.001754    |


### Create the Model
Create the model by sending the parameters to PredictNow.

In [4]:
model_parameters = ModelParameters(
    mode=Mode.TRAIN, 
    type=ModelType.CLASSIFICATION, 
    feature_selection=FeatureSelection.SHAP, 
    analysis=Analysis.SMALL, 
    boost=Boost.GBDT, 
    testsize=42.0,
    timeseries=False,
    probability_calibration=False,    # True to calibrate the predicted probabilities
    exploratory_data_analysis=False,  # True to run exploratory data analysis
    weights="no")                     # one of: yes, no, custom

create_model_result = client.create_model(model_name, model_parameters)
str(create_model_result)

'{"model_name":"","message":"Successfully stored the model","success":true}'

### Train the Model
Provide the path to the data, and its label.
This task may take several minutes.

In [5]:
train_request_result = client.train(model_name, parquet_path, label)
str(train_request_result)

'{"train_id":"2bf645ce-88f7-47bd-b459-a5b715c444b5","model_name":"saved_model_fx-time-of-day.pkl","message":"Training the model is successfully requested.","success":true}'

### Get the training result
The training results include dataframes with performance metrics, predicted probabilities, and predicted labels.

In [6]:
while True:
    training_result = client.get_training_result(model_name, train_request_result.train_id)
    if training_result.completed:
        break
    print(training_result)
    sleep(5)

2025-02-03T18:21:03.6190255Z: PROGRESS (0/6) | In Progress...
2025-02-03T18:21:08.4484680Z: PROGRESS (1/6) | Performing preprocessing/building of the model
2025-02-03T18:21:09.5431400Z: PROGRESS (3/6) | Performing hyperparameter optimization...
2025-02-03T18:21:09.5431400Z: PROGRESS (3/6) | Performing hyperparameter optimization...
2025-02-03T18:21:09.5431400Z: PROGRESS (3/6) | Performing hyperparameter optimization...
2025-02-03T18:21:26.5403730Z: PROGRESS (4/6) | Checking for feature selection methods...
2025-02-03T18:21:26.5403730Z: PROGRESS (4/6) | Checking for feature selection methods...
2025-02-03T18:21:26.5403730Z: PROGRESS (4/6) | Checking for feature selection methods...
2025-02-03T18:21:44.1895160Z: PROGRESS (5/6) | Making predictions...
2025-02-03T18:21:50.0857160Z: PROGRESS (6/6) | Saving files...
2025-02-03T18:21:50.0857160Z: PROGRESS (6/6) | Saving files...
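The polling loop above runs until the job reports completion, so a stalled job would block the notebook indefinitely. One option is to add a deadline. The sketch below is a generic standard-library helper. The name `poll_until` and its parameters are assumptions made for illustration, not part of the PredictNow client.

```python
from time import monotonic, sleep
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(fetch: Callable[[], T], is_done: Callable[[T], bool],
               interval: float = 5.0, timeout: float = 600.0) -> T:
    """Call fetch() every `interval` seconds until is_done(result) is true.

    Raises TimeoutError if the next wait would go past `timeout` seconds.
    """
    deadline = monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if monotonic() + interval > deadline:
            raise TimeoutError(f"job not finished after {timeout} seconds")
        sleep(interval)


# Stand-in for a remote job that completes on the third poll (hypothetical).
_states = iter([False, False, True])
print(poll_until(lambda: next(_states), lambda done: done, interval=0.01, timeout=1.0))
```

In this notebook, the same helper could wrap the existing call, for example `poll_until(lambda: client.get_training_result(model_name, train_request_result.train_id), lambda r: r.completed)`.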


In [7]:
# Predicted probability (float between 0 and 1) for validation/training data set
# the last column notes the probability that it's a "1", i.e. positive return
predicted_prob_cv = pd.read_json(StringIO(training_result.predicted_prob_cv))
print("predicted_prob_cv")
print(predicted_prob_cv)

# Predicted probability (float between 0 and 1) for the testing data set
predicted_prob_test = pd.read_json(StringIO(training_result.predicted_prob_test))
print("predicted_prob_test")
print(predicted_prob_test)

# Predicted label, 0 or 1, for validation/training data set. Classified as class 1 if probability > 0.5
predicted_targets_cv = pd.read_json(StringIO(training_result.predicted_targets_cv))
print("predicted_targets_cv")
print(predicted_targets_cv)

# Predicted label, 0 or 1, for testing data set. Classified as class 1 if probability > 0.5
predicted_targets_test = pd.read_json(StringIO(training_result.predicted_targets_test))
print("predicted_targets_test")
print(predicted_targets_test)

# Feature importance score, shows what features are being used in the prediction
# More helpful when you include your features
# and only works when you set feature_selection to FeatureSelection.SHAP or FeatureSelection.CMDA
if training_result.feature_importance:
    feature_importance = pd.read_json(StringIO(training_result.feature_importance))
    print("feature_importance")
    print(feature_importance)

# Performance metrics in terms of accuracies
performance_metrics = pd.read_json(StringIO(training_result.performance_metrics))
print("performance_metrics")
print(performance_metrics)

predicted_prob_cv
     Unnamed: 0       date       0.0       1.0
0             0 2020-01-01  0.101022  0.898978
1             1 2020-01-02  0.728072  0.271928
2             2 2020-01-03  0.670253  0.329747
3             3 2020-01-04  0.616022  0.383978
4             4 2020-01-05  0.616022  0.383978
..          ...        ...       ...       ...
319         319 2020-11-15  0.895370  0.104630
320         320 2020-11-16  0.406761  0.593239
321         321 2020-11-17  0.826305  0.173695
322         322 2020-11-18  0.724556  0.275444
323         323 2020-11-19  0.155107  0.844893

[324 rows x 4 columns]
predicted_prob_test
         date       0.0       1.0
0  2020-11-20  0.448291  0.551709
1  2020-11-21  0.533450  0.466550
2  2020-11-22  0.533450  0.466550
3  2020-11-23  0.634685  0.365315
4  2020-11-24  0.196221  0.803779
5  2020-11-25  0.541116  0.458884
6  2020-11-26  0.269221  0.730779
7  2020-11-27  0.128720  0.871280
8  2020-11-28  0.678627  0.321373
9  2020-11-29  0.678627  0.321373
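The on/off signal described earlier can be applied by keeping the static strategy's return on days where the predicted probability of class 1 exceeds 0.5 and staying flat otherwise. The following is a minimal sketch of that gating step, not the notebook's own benchmark. The function names and the daily returns are hypothetical, and only the probabilities are copied from the `predicted_prob_test` output above.

```python
from math import prod


def gated_returns(strategy_returns, prob_positive, threshold=0.5):
    """Keep a day's return when P(class 1) exceeds the threshold, else stay flat."""
    return [r if p > threshold else 0.0
            for r, p in zip(strategy_returns, prob_positive)]


def cumulative_return(returns):
    """Compound a sequence of simple returns into one cumulative return."""
    return prod(1 + r for r in returns) - 1


# Class-1 probabilities for 2020-11-20 through 2020-11-24 (from predicted_prob_test).
prob_positive = [0.551709, 0.466550, 0.466550, 0.365315, 0.803779]
# Hypothetical daily returns of the static strategy for the same five days.
static_returns = [0.0010, 0.0, 0.0, -0.0008, 0.0005]

dynamic_returns = gated_returns(static_returns, prob_positive)
print("static :", cumulative_return(static_returns))
print("dynamic:", cumulative_return(dynamic_returns))
```

For a fair benchmark, each day's prediction would need to be available before that day's trading starts, and the comparison should use out-of-sample dates only.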


### Start Predicting with the Trained Model

In [8]:
predict_result = client.predict(model_name, parquet_path, exploratory_data_analysis=False, probability_calibration=False)

In [9]:
# Display the prediction results
labels = pd.read_json(StringIO(predict_result.labels))
print("labels")
print(labels)
probabilities = pd.read_json(StringIO(predict_result.probabilities))
print("probabilities")
print(probabilities)

labels
     pred_target
0              0
1              0
2              0
3              0
5              0
..           ...
359            0
362            0
363            0
364            0
365            0

[308 rows x 1 columns]
probabilities
     1970-01-01 00:00:00  1970-01-01 00:00:01
0                0.89321              0.10679
1                0.89321              0.10679
2                0.89321              0.10679
3                0.89321              0.10679
5                0.89321              0.10679
..                   ...                  ...
359              0.89321              0.10679
362              0.89321              0.10679
363              0.89321              0.10679
364              0.89321              0.10679
365              0.89321              0.10679

[308 rows x 2 columns]


