# 📘 Label from Top Rule-Based Configs
This notebook loads the best-performing rule-based strategies and uses their trades as high-confidence positive labels for training a new machine learning model.

## 🔧 Setup
Import required packages and define paths to results and configs.

In [39]:
import pandas as pd
import json
from pathlib import Path

RESULTS_DIR = Path('../results')
CONFIG_PATH = Path('../configs/top_10_configs.json')
TRADES_PATH = RESULTS_DIR / 'batch_backtest_results.csv'

## 📥 Load Top Config IDs
Read the top 10 rule-based configurations from the updated JSON file.

In [40]:
with open(CONFIG_PATH) as f:
    top_configs = json.load(f)

top_config_ids = [cfg['Config_ID'] for cfg in top_configs]
print(f"Loaded {len(top_config_ids)} top configs:", top_config_ids)

Loaded 10 top configs: ['CFG_097', 'CFG_101', 'CFG_113', 'CFG_121', 'CFG_105', 'CFG_117', 'CFG_125', 'CFG_109', 'CFG_081', 'CFG_089']


## 📊 Load All Backtest Trades
Import all trades generated by all filter combinations.

In [41]:
df = pd.read_csv(TRADES_PATH, parse_dates=['Entry Date', 'Exit Date'])
print(f"Total trades loaded: {len(df)}")
df.head()

Total trades loaded: 16716


Unnamed: 0,Ticker,Entry Date,Exit Date,Buy Price,Sell Price,Return %,Config_ID,USE_CONFIRMATION_CANDLE,USE_RSI_FILTER,USE_MACD_HIST_FILTER,USE_FORCE_INDEX_FILTER,USE_ATR_FILTER,USE_MACD_DIVERGENCE,USE_TRAILING_EXIT
0,AAPL,2008-01-24,2008-01-31,135.6,135.36,-0.176991,CFG_009,False,False,False,True,False,False,False
1,AAPL,2008-01-25,2008-02-01,130.01,133.75,2.876702,CFG_009,False,False,False,True,False,False,False
2,AAPL,2008-01-28,2008-02-04,130.01,131.65,1.261441,CFG_009,False,False,False,True,False,False,False
3,AAPL,2008-01-29,2008-02-05,131.54,129.36,-1.657291,CFG_009,False,False,False,True,False,False,False
4,AAPL,2008-02-07,2008-02-14,121.24,127.46,5.13032,CFG_009,False,False,False,True,False,False,False


## ✅ Filter for Top Config Trades
Keep only the trades that come from one of the top 10 configs.

In [42]:
df_top = df[df['Config_ID'].isin(top_config_ids)].copy()
print(f"Trades from top configs: {len(df_top)}")
df_top.head()

Trades from top configs: 2092


Unnamed: 0,Ticker,Entry Date,Exit Date,Buy Price,Sell Price,Return %,Config_ID,USE_CONFIRMATION_CANDLE,USE_RSI_FILTER,USE_MACD_HIST_FILTER,USE_FORCE_INDEX_FILTER,USE_ATR_FILTER,USE_MACD_DIVERGENCE,USE_TRAILING_EXIT
11750,AAPL,2008-01-28,2008-02-04,130.01,131.65,1.261441,CFG_081,True,False,True,False,False,False,False
11751,AAPL,2008-01-29,2008-02-05,131.54,129.36,-1.657291,CFG_081,True,False,True,False,False,False,False
11752,AAPL,2008-02-07,2008-02-14,121.24,127.46,5.13032,CFG_081,True,False,True,False,False,False,False
11753,AAPL,2008-10-09,2008-10-16,88.74,101.89,14.818571,CFG_081,True,False,True,False,False,False,False
11754,AAPL,2010-08-27,2010-09-03,241.62,258.77,7.097922,CFG_081,True,False,True,False,False,False,False


## 🏷️ Assign Labels
Label these trades as high-confidence positives (1 = bounce occurred).

In [43]:
df_top['Label'] = 1
df_top[['Ticker', 'Entry Date', 'Exit Date', 'Return %', 'Label']].head()

Unnamed: 0,Ticker,Entry Date,Exit Date,Return %,Label
11750,AAPL,2008-01-28,2008-02-04,1.261441,1
11751,AAPL,2008-01-29,2008-02-05,-1.657291,1
11752,AAPL,2008-02-07,2008-02-14,5.13032,1
11753,AAPL,2008-10-09,2008-10-16,14.818571,1
11754,AAPL,2010-08-27,2010-09-03,7.097922,1


## 💾 Save Labeled Data
Save the result to a CSV file for training the new ML model.

In [44]:
out_path = RESULTS_DIR / 'rule_based_labeled_trades.csv'
df_top.to_csv(out_path, index=False)
print(f"Saved labeled dataset to: {out_path}")

Saved labeled dataset to: ..\results\rule_based_labeled_trades.csv
