# Sector rotation 

In [18]:
%load_ext autoreload
%autoreload 2
import sector_rot
import pandas as pd
from pathlib import Path

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In Simonian et al. (2019), the Fama–French–Carhart (FFC) factor **realisations for the current month *t*** are fed directly into a Random-Forest (RF) model as predictor variables (“features”) to generate a point estimate of each sector’s excess return, which the paper calls the “RF-predicted return”.

That RF-predicted return is then used **as a trading signal for the *next* month (*t + 1*)** inside the association-rule-learning (ARL) overlay that powers the sector-rotation strategy:

> “the signals are the RF-predicted return of a sector … and the ratio of volatilities … If … the RF-predicted return for **next month** is greater than a designated threshold value, then we will own the sector for the month”.

So the workflow is:

1. **Month *t***

   * Observe the four FFC factor returns (MKT, SMB, HML, MOM).
   * Feed them into the trained RF to obtain a *contemporaneous* predicted sector return.

2. **Month *t + 1***

   * Treat that predicted value (together with a volatility-ratio signal) as an input to ARL rules that decide whether the sector is held during month *t + 1*.
   * Evaluate the realised return over month *t + 1*.
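The timing above can be made concrete with a small sketch. It is written in plain Python so it runs without dependencies; the function names (`next_period_positions`, `strategy_returns`), the zero threshold and the numbers are illustrative assumptions, not taken from the paper or from `sector_rot`. Only the prediction-threshold rule is shown, because the quoted passage elides the volatility-ratio condition.

```python
def next_period_positions(predicted, threshold=0.0):
    """Lag a prediction-based signal by one period.

    predicted[i] is the model output available at the end of period i.
    positions[i] is 1 if the sector is held during period i, and it
    depends only on predicted[i - 1].
    """
    signals = [1 if p > threshold else 0 for p in predicted]
    # No prediction exists before the first period, so start out of the market.
    return [0] + signals[:-1]


def strategy_returns(positions, realised):
    """Return earned in each period given the held / not-held positions."""
    return [pos * r for pos, r in zip(positions, realised)]


# Made-up numbers purely for illustration (hypothetical, not from the data).
predicted = [0.012, -0.004, 0.007, 0.001]
realised = [0.020, 0.010, -0.015, 0.005]

positions = next_period_positions(predicted, threshold=0.0)
print(positions)  # [0, 1, 0, 1]
print(strategy_returns(positions, realised))
```

The one-period shift is the important part: the position held in a period never uses a prediction formed from that same period's factor returns.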

### What the model **does not** do

* It never forecasts the factor returns themselves for *t + 1*; it simply uses the observed factor values at *t*.
* The risk-decomposition (pseudo-beta) exercise appears later in the article and is presented only as an interpretability device that translates RF feature importances into something that looks like traditional betas. Those pseudo-betas are **not** fed back into the predictive model or the trading rules.

**Bottom line:** the authors use the month-*t* FFC factor returns directly to produce a model-based prediction of sector returns, and that prediction becomes one of the signals for trading one month ahead; factor-risk decompositions are used solely for ex-post interpretation, not for forecasting.


In [24]:
%cd ..

/Users/minhquangngo/Documents


In [36]:
data_dir = Path.cwd()/'data'

df_dict = {
    file.stem.replace("sector_","") : pd.read_parquet(file)
    for file in data_dir.glob("sector_*.parquet")
}



In [37]:
df_dict['10'].tail(50)

index,vol,ret,shrout,prc,askhi,bidlo,put_volume,call_volume,put_call_ratio,vix_close,...,enhanced_baker,news_sent,mktcap,turn_sd,sect_mktcap,mvel1,dolvol,daily_illq,excess_ret,excess_mkt_ret
2018-10-17,7090816.0,-0.006811,1790268.0,82.103614,82.844743,81.218036,5257.450597,9271.779058,0.738169,17.4,...,1.677,0.03,147554200.0,3.072879,146987500.0,18.809706,582181600.0,1.169872e-05,-0.006891,-0.00088
2018-10-18,9135598.0,-0.004694,1799754.0,81.919688,82.662846,81.06046,7247.380047,14073.035953,0.803405,20.059999,...,1.677,0.02,148641500.0,3.072879,147435300.0,18.817048,748385400.0,6.27151e-06,-0.004774,-0.01548
2018-10-19,8483855.0,-0.007025,1813272.0,81.210296,82.805008,80.770177,9356.581814,13284.470162,0.809722,19.889999,...,1.677,-0.01,149770700.0,3.072879,147256400.0,18.824616,688976400.0,1.01968e-05,-0.007105,-0.00258
2018-10-22,6825382.0,-0.011078,1813387.0,80.534156,81.618499,79.809177,5923.296388,10345.804992,0.831708,19.639999,...,1.677,0.0,148468700.0,3.072879,146039600.0,18.815885,549676400.0,2.015399e-05,-0.011158,-0.00388
2018-10-23,9712357.0,-0.026617,1821805.0,78.276509,79.388233,76.988453,10144.860877,17747.840518,0.586361,20.709999,...,1.677,-0.01,145669900.0,3.072879,142604500.0,18.796854,760249400.0,3.501068e-05,-0.026697,-0.00628
2018-10-24,10290280.0,-0.037594,1834264.0,75.423177,78.905061,75.282592,9503.541755,19322.373138,0.623939,25.23,...,1.677,0.0,142271100.0,3.072879,138346000.0,18.773245,776125700.0,4.843839e-05,-0.037674,-0.03338
2018-10-25,8117874.0,0.011599,1832241.0,76.491694,77.563637,75.648372,8404.785245,11966.378354,1.098704,24.219999,...,1.677,-0.01,143720200.0,3.072879,140151200.0,18.783379,620950000.0,1.868014e-05,0.011519,0.01922
2018-10-26,10690920.0,-0.007505,1832931.0,76.014522,77.041966,74.575708,8527.553713,10198.959978,0.853735,24.16,...,1.677,-0.01,142854600.0,3.072879,139329400.0,18.777338,812665200.0,9.235131e-06,-0.007585,-0.01658
2018-10-29,9453560.0,-0.018507,1844366.0,74.526563,76.830017,73.56057,8728.831942,8719.024538,0.958609,24.700001,...,1.677,0.02,141794900.0,3.072879,137454300.0,18.769892,704541400.0,2.626841e-05,-0.018587,-0.00778
2018-10-30,9881714.0,0.023403,1842134.0,76.252335,76.529745,74.214967,4641.044555,12393.716677,1.43911,23.35,...,1.677,0.01,144696200.0,3.072879,140467000.0,18.790147,753503800.0,3.10589e-05,0.023323,0.01652


# Playground 

In [8]:
!pwd


/Users/minhquangngo/Documents/vsc/erasmus/msc_thesis


In [25]:
%cd vsc/erasmus/msc_thesis/

/Users/minhquangngo/Documents/vsc/erasmus/msc_thesis


In [10]:
test_df_2018 = df_dict['25'].loc[df_dict['25'].index.year == 2018]

In [11]:
test_df_2018

index,vol,ret,shrout,prc,askhi,bidlo,put_volume,call_volume,put_call_ratio,vix_close,...,enhanced_baker,news_sent,mktcap,turn_sd,sect_mktcap,mvel1,dolvol,daily_illq,excess_ret,excess_mkt_ret
2018-01-02,3.843326e+06,0.009714,666221.882551,443.274607,444.518792,436.431993,14600.031628,20244.065038,0.874721,9.770000,...,1.896,0.26,1.969452e+08,3.815204,2.953192e+08,19.098436,1.703649e+09,5.701971e-06,0.009654,0.00844
2018-01-03,4.374425e+06,0.008385,665910.076465,451.353288,452.356247,444.997086,17015.462603,21165.022284,1.046441,9.150000,...,1.896,0.28,1.995966e+08,3.815204,3.005607e+08,19.111809,1.974411e+09,4.247062e-06,0.008325,0.00584
2018-01-04,4.503065e+06,0.003689,668200.809774,452.080903,455.708103,449.769374,16662.991761,25260.083437,0.857916,9.220000,...,1.896,0.25,2.008192e+08,3.815204,3.020808e+08,19.117916,2.035750e+09,1.812009e-06,0.003629,0.00414
2018-01-05,4.860183e+06,0.009985,669017.043735,459.909595,460.398489,453.594002,23319.332560,33928.315151,0.831814,9.220000,...,1.896,0.25,2.045234e+08,3.815204,3.076874e+08,19.136193,2.235245e+09,4.467258e-06,0.009925,0.00654
2018-01-08,4.979417e+06,0.003876,667497.474912,469.060371,472.270243,463.681354,20143.662740,25810.308010,0.773483,9.520000,...,1.896,0.28,2.080289e+08,3.815204,3.130966e+08,19.153188,2.335647e+09,1.659476e-06,0.003816,0.00184
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2018-12-21,1.217874e+07,-0.027350,647179.535480,580.785647,619.743481,575.273451,61379.034294,76345.500563,1.263877,30.110001,...,2.409,-0.04,2.654431e+08,3.815204,3.758726e+08,19.396911,7.073236e+09,3.866652e-06,-0.027460,-0.02181
2018-12-24,5.085851e+06,-0.020962,644367.737901,567.539548,587.330367,554.823112,24830.817952,28718.724097,1.259637,36.070000,...,2.409,-0.04,2.584838e+08,3.815204,3.657042e+08,19.370343,2.886421e+09,7.262120e-06,-0.021072,-0.02561
2018-12-26,7.816316e+06,0.064628,642084.465412,627.160864,628.012613,585.035323,39573.805199,47691.823645,0.841862,30.410000,...,2.409,-0.08,2.871421e+08,3.815204,4.026902e+08,19.475488,4.902088e+09,1.318379e-05,0.064518,0.05049
2018-12-27,6.964319e+06,0.002377,642970.331346,621.639877,624.195905,593.515679,38048.274760,54933.158339,1.107463,29.959999,...,2.409,-0.07,2.842983e+08,3.815204,3.996960e+08,19.465535,4.329298e+09,5.491548e-07,0.002267,0.00769


In [12]:
test_result_dict = sector_rot.rolling_pred(
    208039388113350502,                  # MLflow experiment ID
    "0c861f5f9a874e05b04e43bb6341bd96",  # MLflow run ID (listed as '25_rf')
    df=test_df_2018,
    lookback_time=50,
    vol_threshold=1.0,
    pred_thresh=0.0,
    excess_ret_pred_threshold=0.0,
    sr=2,
    lr=50,
    experiment_name='pre_test',
).fit()

MLRuns path: py/mlruns/208039388113350502
Meta path: py/mlruns/208039388113350502/meta.yaml
Meta path exists: True
MLRuns path: py/mlruns/208039388113350502
Meta path: py/mlruns/208039388113350502/meta.yaml
Meta path exists: True

=== Debugging _extract_features ===
Looking for factors file at: py/mlruns/208039388113350502/0c861f5f9a874e05b04e43bb6341bd96/params/factors
Features loaded: ['excess_mkt_ret', 'smb', 'hml', 'umd']
Experiment 490636618977650074 created
Dumping models

=== Debugging _dump_model ===

=== Debugging _extract_model_pkl ===
Checking if meta.yaml exists: True
Meta content: {'artifact_location': 'mlflow-artifacts:/208039388113350502', 'creation_time': 1748272532371, 'experiment_id': '208039388113350502', 'last_update_time': 1748272532371, 'lifecycle_stage': 'active', 'name': 'rf'}
meta.yaml name extract: rf
RF path: py/mlartifacts/208039388113350502/0c861f5f9a874e05b04e43bb6341bd96/artifacts/rf_model/*.pkl
Surr path: py/mlartifacts/208039388113350502/0c861f5f9a874e0
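The call above passes `sr`, `lr` and `vol_threshold`, and the paper's second signal is a ratio of volatilities. How `sector_rot` computes that ratio is not shown in this notebook, so the following is only a guess at one common construction: the trailing short-window standard deviation divided by the trailing long-window one. The function name `volatility_ratio`, the reading of `sr`/`lr` as short-run and long-run window lengths, and the sample returns are all assumptions.

```python
from statistics import stdev


def volatility_ratio(returns, short_window, long_window):
    """Trailing short-run / long-run sample standard deviation.

    Entries are None until `long_window` observations are available
    (or when the long-run volatility is zero).
    """
    if short_window < 2 or long_window < short_window:
        raise ValueError("need 2 <= short_window <= long_window")
    ratios = []
    for end in range(1, len(returns) + 1):
        if end < long_window:
            ratios.append(None)
            continue
        short_vol = stdev(returns[end - short_window:end])
        long_vol = stdev(returns[end - long_window:end])
        ratios.append(short_vol / long_vol if long_vol > 0 else None)
    return ratios


# Hypothetical daily returns, chosen so volatility rises at the end.
sample_returns = [0.01, -0.01, 0.01, -0.01, 0.03, -0.03]
ratios = volatility_ratio(sample_returns, short_window=2, long_window=4)
print(ratios)
```

A ratio above 1 means recent volatility exceeds its longer-run level. Whether the strategy holds or avoids a sector in that case is a rule choice that this sketch deliberately leaves open.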

In [25]:
test_result_dict['rf_signal_set'].tail(100)

index
2018-08-07    1
2018-08-08    1
2018-08-09    1
2018-08-10    0
2018-08-13    0
             ..
2018-12-21    0
2018-12-24    0
2018-12-26    0
2018-12-27    0
2018-12-28    0
Name: signal, Length: 100, dtype: int64

In [10]:
ols_pred

index
1998-01-02   NaN
1998-01-05   NaN
1998-01-06   NaN
1998-01-07   NaN
1998-01-08   NaN
              ..
1999-12-20   NaN
1999-12-21   NaN
1999-12-22   NaN
1999-12-23   NaN
1999-12-27   NaN
Length: 500, dtype: float64

In [142]:
rf_pred

index
2018-10-17         NaN
2018-10-18         NaN
2018-10-19         NaN
2018-10-22         NaN
2018-10-23         NaN
2018-10-24         NaN
2018-10-25         NaN
2018-10-26         NaN
2018-10-29         NaN
2018-10-30         NaN
2018-10-31         NaN
2018-11-01         NaN
2018-11-02         NaN
2018-11-05         NaN
2018-11-06         NaN
2018-11-07         NaN
2018-11-08         NaN
2018-11-09         NaN
2018-11-12         NaN
2018-11-13         NaN
2018-11-14         NaN
2018-11-15         NaN
2018-11-16         NaN
2018-11-19         NaN
2018-11-20         NaN
2018-11-21         NaN
2018-11-23         NaN
2018-11-26         NaN
2018-11-27         NaN
2018-11-28         NaN
2018-11-29         NaN
2018-11-30   -0.006386
2018-12-03    0.006566
2018-12-04    0.008382
2018-12-06   -0.002721
2018-12-07   -0.003864
2018-12-10   -0.013041
2018-12-11   -0.001130
2018-12-12   -0.006713
2018-12-13    0.005548
2018-12-14   -0.007802
2018-12-17   -0.006931
2018-12-18   -0.010017
2018-

In [148]:
# Create matched dataframe with rf predictions and excess returns
matched_df = pd.DataFrame({
    'excess_ret': test_df['excess_ret'],
    'preds': rf_pred
}).dropna()


In [149]:
matched_df

index,excess_ret,preds
2018-11-30,-0.002323,-0.006386
2018-12-03,0.022875,0.006566
2018-12-04,-0.028615,0.008382
2018-12-06,-0.017293,-0.002721
2018-12-07,-0.006089,-0.003864
2018-12-10,-0.01616,-0.013041
2018-12-11,0.000425,-0.00113
2018-12-12,0.003378,-0.006713
2018-12-13,0.004085,0.005548
2018-12-14,-0.023679,-0.007802
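A quick sanity check on a table like this is the directional hit rate, that is, how often the prediction and the realised excess return share a sign. The sketch below uses only the ten rows displayed above (the full `matched_df` may be longer), and `hit_rate` is a hypothetical helper rather than part of `sector_rot`.

```python
# (date, realised excess return, RF prediction), copied from the rows shown above.
rows = [
    ("2018-11-30", -0.002323, -0.006386),
    ("2018-12-03", 0.022875, 0.006566),
    ("2018-12-04", -0.028615, 0.008382),
    ("2018-12-06", -0.017293, -0.002721),
    ("2018-12-07", -0.006089, -0.003864),
    ("2018-12-10", -0.01616, -0.013041),
    ("2018-12-11", 0.000425, -0.00113),
    ("2018-12-12", 0.003378, -0.006713),
    ("2018-12-13", 0.004085, 0.005548),
    ("2018-12-14", -0.023679, -0.007802),
]


def hit_rate(pairs):
    """Share of observations where prediction and realisation have the same sign."""
    hits = sum((actual > 0) == (pred > 0) for actual, pred in pairs)
    return hits / len(pairs)


rate = hit_rate([(actual, pred) for _, actual, pred in rows])
print(rate)  # 0.7
```

Ten observations are far too few to judge the model; the point is only to show the calculation.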


# Extracting signals

In [15]:
import mlflow
import os
import subprocess
import time

In [16]:
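def kill_process_on_port(port):
    """Force-kill every process using the given TCP port (requires `lsof`)."""
    try:
        # `lsof -t` prints one PID per line and exits non-zero when nothing matches.
        result = subprocess.check_output(["lsof", "-ti", f"tcp:{port}"], text=True)
    except subprocess.CalledProcessError:
        print(f"No process found on port {port}")
        return
    for pid in result.split():
        print(f"Killing process {pid} on port {port}")
        # Argument list instead of a shell string, so nothing is shell-interpreted.
        subprocess.run(["kill", "-9", pid], check=False)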

def start_mlflow_ui(port=5000):
    kill_process_on_port(port)
    print(f"Starting MLflow UI on port {port} ...")
    subprocess.Popen(
        ["mlflow", "ui", "--host", "127.0.0.1", "--port", str(port)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL
    )
    time.sleep(3)
    print(f"MLflow UI running at http://127.0.0.1:{port}")

start_mlflow_ui()

Killing process 6976 on port 5000
Killing process 6988 on port 5000
Killing process 6989 on port 5000
Killing process 6991 on port 5000
Killing process 6992 on port 5000
Killing process 6993 on port 5000
Starting MLflow UI on port 5000 ...
MLflow UI running at http://127.0.0.1:5000
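The helper above sleeps for a fixed three seconds and then reports the UI as running without checking. A more defensive variant polls the port until it accepts a TCP connection. The sketch below uses only the standard library; `wait_for_port` and its default timeout and interval are assumptions for illustration, not something MLflow provides.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.25):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait briefly, then try again.
            time.sleep(interval)
    return False
```

Inside `start_mlflow_ui`, the `time.sleep(3)` line could then be replaced by something like `if not wait_for_port("127.0.0.1", port): raise RuntimeError("MLflow UI did not start")`.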


In [17]:
mlflow.set_tracking_uri(uri="http://127.0.0.1:5000/")


In [26]:
path_rf = sector_rot.all_runs(208039388113350502).get_run_folders()

In [27]:
path_rf

[{'f5b7855c3eae48f18c61879afbc7e95e': '20_rf'},
 {'d3514248163147a9bed8b9bc43be3e7e': '30_rf'},
 {'aa59f403dba240a8b7b2beecfbe40e7e': '10_rf'},
 {'d3ade145c212426c8744ff2f269c7cc0': '15_rf'},
 {'8c314a99cfe34959b48d84568f7d7af7': '20_rf'},
 {'0c861f5f9a874e05b04e43bb6341bd96': '25_rf'},
 {'0f699eb2796a47448ecf4d285475e2d8': '35_rf'},
 {'5d5153d80ed14485979ed50ce95318c9': '40_rf'},
 {'184a3886e7e44ff2b029ee3062dd7d53': '55_rf'},
 {'f78b833577c7409b8303018a0b2c7d67': '60_rf'},
 {'7f1f5fa8974d49f7a3bd602b0d3c98a5': '10_rf'},
 {'31e61872b7294107ad15a8e688063482': '55_rf'},
 {'5b5ebd3629434c22b2eeb7ca658bc1f4': '50_rf'},
 {'09f893bd6910426e902f4395887ea5bf': '50_rf'},
 {'22b774e8e1b9450cb58e11d1d1d87746': '30_rf'},
 {'b4b9220535f1404db27e15c36e3b2774': '15_rf'},
 {'fad7fcfbed2248ce819c0b9dc98e2cc6': '40_rf'},
 {'5f1c187b34a042f3b62d7eb56755ecfe': '35_rf'},
 {'73981100838d4d8a82b4c939e64def9a': '45_rf'},
 {'51a38a614dcb45f78e8972c2b314f672': '60_rf'},
 {'fbb9762c39a24e828e0f6af9ae0f5db4': '4
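The listing shows some sector labels more than once (for example `20_rf`), so any loop over `path_rf` will process those sectors once per run. If that is not intended, grouping run IDs by sector first makes the duplicates visible. The sketch below uses the first five entries from the listing; `runs_by_sector` is a hypothetical helper, not part of `sector_rot`.

```python
from collections import defaultdict

# First five entries copied from the listing above.
path_rf_sample = [
    {"f5b7855c3eae48f18c61879afbc7e95e": "20_rf"},
    {"d3514248163147a9bed8b9bc43be3e7e": "30_rf"},
    {"aa59f403dba240a8b7b2beecfbe40e7e": "10_rf"},
    {"d3ade145c212426c8744ff2f269c7cc0": "15_rf"},
    {"8c314a99cfe34959b48d84568f7d7af7": "20_rf"},
]


def runs_by_sector(run_folders):
    """Map each sector code (the part before '_') to its list of run IDs."""
    grouped = defaultdict(list)
    for entry in run_folders:
        for run_id, label in entry.items():
            sector_code = label.split("_", 1)[0]
            grouped[sector_code].append(run_id)
    return dict(grouped)


grouped = runs_by_sector(path_rf_sample)
for sector_code, run_ids in sorted(grouped.items()):
    print(sector_code, len(run_ids), run_ids)
```

From there, one run per sector can be chosen explicitly (for instance the most recent) instead of relying on list order.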

In [38]:
for run_folder in path_rf:
    for run_id, sector in run_folder.items():
        print(f"Training ARL for run {run_id} of sector {sector}")
        # Labels look like '20_rf'; keep only the digits to get the df_dict key.
        sector_numb = ''.join(filter(str.isdigit, sector))
        arl_run = sector_rot.rolling_pred(
            208039388113350502,
            run_id,
            df=df_dict[sector_numb],
            lookback_time=365,
            vol_threshold=1.0,
            pred_thresh=0.0,
            excess_ret_pred_threshold=0.0,
            sr=21,
            lr=126,
            experiment_name='rf_test_arl',
        ).fit()


Training ARL for run f5b7855c3eae48f18c61879afbc7e95e of sector 20_rf
MLRuns path: py/mlruns/208039388113350502
Meta path: py/mlruns/208039388113350502/meta.yaml
Meta path exists: True
MLRuns path: py/mlruns/208039388113350502
Meta path: py/mlruns/208039388113350502/meta.yaml
Meta path exists: True

=== Debugging _extract_features ===
Looking for factors file at: py/mlruns/208039388113350502/f5b7855c3eae48f18c61879afbc7e95e/params/factors
Features loaded: ['excess_mkt_ret', 'smb', 'hml', 'umd']
Experiment already exists with ID: 317094863492494700
Dumping models

=== Debugging _dump_model ===

=== Debugging _extract_model_pkl ===
Checking if meta.yaml exists: True
Meta content: {'artifact_location': 'mlflow-artifacts:/208039388113350502', 'creation_time': 1748272532371, 'experiment_id': '208039388113350502', 'last_update_time': 1748272532371, 'lifecycle_stage': 'active', 'name': 'rf'}
meta.yaml name extract: rf
RF path: py/mlartifacts/208039388113350502/f5b7855c3eae48f18c61879afbc7e95e

KeyboardInterrupt: 

In [29]:
sector_numb

'25'

In [13]:
# NOTE: path_rf lists run f5b7855c... as '20_rf', but the data passed here is sector '25'.
first_rf_pass = sector_rot.rolling_pred(
    208039388113350502,
    "f5b7855c3eae48f18c61879afbc7e95e",
    df=df_dict['25'],
    lookback_time=365,
    vol_threshold=1.0,
    pred_thresh=0.0,
    excess_ret_pred_threshold=0.0,
    sr=21,
    lr=126,
    experiment_name='rf_test_arl',
).fit()


MLRuns path: py/mlruns/208039388113350502
Meta path: py/mlruns/208039388113350502/meta.yaml
Meta path exists: True
MLRuns path: py/mlruns/208039388113350502
Meta path: py/mlruns/208039388113350502/meta.yaml
Meta path exists: True

=== Debugging _extract_features ===
Looking for factors file at: py/mlruns/208039388113350502/f5b7855c3eae48f18c61879afbc7e95e/params/factors
Features loaded: ['excess_mkt_ret', 'smb', 'hml', 'umd']
Experiment 317094863492494700 created
Dumping models

=== Debugging _dump_model ===

=== Debugging _extract_model_pkl ===
Checking if meta.yaml exists: True
Meta content: {'artifact_location': 'mlflow-artifacts:/208039388113350502', 'creation_time': 1748272532371, 'experiment_id': '208039388113350502', 'last_update_time': 1748272532371, 'lifecycle_stage': 'active', 'name': 'rf'}
meta.yaml name extract: rf
RF path: py/mlartifacts/208039388113350502/f5b7855c3eae48f18c61879afbc7e95e/artifacts/rf_model/*.pkl
Surr path: py/mlartifacts/208039388113350502/f5b7855c3eae48f