## Hull Tactical Market Prediction - Understanding the Metric

## 1. Setup

In [None]:
from pathlib import Path
import numpy as np
import pandas as pd

In [None]:
competition_dataset_directory = Path('/kaggle/input/hull-tactical-market-prediction')

pd.set_option('display.float_format', '{:.6f}'.format)
pd.set_option('display.max_rows', 50)
pd.set_option('display.max_columns', 50)

In [None]:
df = pd.read_csv(competition_dataset_directory / 'train.csv')
columns = ['date_id', 'forward_returns', 'risk_free_rate', 'market_forward_excess_returns']
df = df.loc[:, columns]
print(f'Training Set Shape: {df.shape}')
display(df)

## 2. Baseline Strategy

#### 2.1 Moving-Average Crossover on Synthetic Price

This baseline trading strategy is a simple trend-following model built on a synthetic price series derived from forward returns. The goal is to dynamically adjust market exposure (position) based on short/long-term momentum signals.

#### 2.2 Constructing the Synthetic Price

Since the dataset provides forward returns (future market returns for each time step), we can reconstruct a notional price series by compounding these returns:

$$
P_t = \prod_{i=1}^{t} (1 + r^{\text{forward}}_i)
$$

This represents how the market’s price would evolve if those forward returns were realized sequentially.
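
As a small illustration of the compounding formula, the sketch below applies `cumprod` to three made-up returns. The values and the `toy_` names are hypothetical and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical forward returns, chosen only for illustration
toy_forward_returns = pd.Series([0.01, -0.02, 0.03])

# P_t = (1 + r_1) * (1 + r_2) * ... * (1 + r_t)
toy_synthetic_price = (1 + toy_forward_returns).cumprod()
print(toy_synthetic_price.tolist())
```

Each entry is the running product of the growth factors, so the last value equals `1.01 * 0.98 * 1.03`.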

#### 2.3 Rolling Averages

To capture both short-term and long-term market trends, two simple moving averages (SMAs) are calculated on the lagged synthetic price:

* a 10-day mean to represent short-term trend
* a 50-day mean to represent long-term trend

#### 2.4 Defining the Trading Signal

The signal is defined as the ratio between the short-term and long-term moving averages:

$$
\text{signal} = \frac{\text{SMA}_{10}}{\text{SMA}_{50}}
$$

This ratio measures short-term momentum relative to the long-term trend:

* Values greater than 1 indicate upward momentum (bullish regime).
* Values less than 1 indicate downward momentum (bearish regime).

#### 2.5 Positioning Logic

The continuous signal is discretized into three possible investment regimes:

| Signal condition | Market view      | Position | Interpretation                 |
| ---------------- | ---------------- | -------- | ------------------------------ |
| signal > 1.01    | Strong uptrend   | 2        | Leveraged long (200% exposure) |
| signal < 0.99    | Strong downtrend | 0        | Fully defensive (cash only)    |
| otherwise        | Neutral          | 1        | Fully invested (100% exposure) |

This setup emulates a 2x/1x/0x exposure model depending on the strength of the detected momentum.
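
The mapping in the table can be checked on a handful of hand-picked signal values. This is a minimal sketch with hypothetical inputs; note that both comparisons are strict, so a signal of exactly 1.01 or 0.99 falls into the neutral regime.

```python
import numpy as np

# Hypothetical signal values, chosen to hit each row of the table
toy_signal = np.array([1.02, 1.00, 0.98, 1.01, 0.99])

# First matching condition wins; anything else gets the default
toy_position = np.select(
    [toy_signal > 1.01, toy_signal < 0.99],
    [2, 0],
    default=1,
)
print(toy_position.tolist())  # [2, 1, 0, 1, 1]
```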

In [None]:
df['synthetic_price'] = (1 + df['forward_returns']).cumprod()
df['synthetic_price_lag_1'] = df['synthetic_price'].shift(1).bfill()
df['synthetic_price_lag_1_rolling_mean_10'] = df['synthetic_price_lag_1'].rolling(10, min_periods=1).mean()
df['synthetic_price_lag_1_rolling_mean_50'] = df['synthetic_price_lag_1'].rolling(50, min_periods=1).mean()
df['signal'] = df['synthetic_price_lag_1_rolling_mean_10'] / df['synthetic_price_lag_1_rolling_mean_50']
df['position'] = np.select(
    [df['signal'] > 1.01, df['signal'] < 0.99],
    [2, 0],
    default=1
)

## 3. Validation

For evaluation, the last **151 trading days** of the dataset (from `date_id` 8839 to 8989) are used as the validation period:

$$
\text{Validation window: } 8989 - 8839 + 1 = 151 \text{ days}
$$

This period is chosen to represent approximately six months of trading activity, which aligns with the competition’s submission horizon.

In [None]:
# Inclusive window 8839..8989, i.e. 151 rows, matching the description above
df_validation = df.loc[(df['date_id'] >= 8839) & (df['date_id'] <= 8989)].copy(deep=True)

## 4. Metric

#### 4.1 Strategy Returns

For each period, portfolio returns are computed as a blend of risk-free and market returns, weighted by the chosen position:

$$
r^{\text{strategy}}_t
= r^{f}_t (1 - \text{position}_t) + r^{\text{forward}}_t \, \text{position}_t
$$

- A position of 0 means the strategy is fully in cash, earning only the risk-free rate.
- A position of 1 means fully invested in the market.
- A position of 2 implies a leveraged long exposure, where borrowed capital incurs the risk-free cost.
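
To make the three cases concrete, the sketch below evaluates the blend for a single hypothetical day with a 0.01% risk-free rate and a 1% market return. The input values are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical single-day inputs (not from the dataset)
toy_risk_free_rate = 0.0001
toy_forward_return = 0.01

toy_positions = np.array([0, 1, 2])

# Cash weight is (1 - position), market weight is position
toy_strategy_returns = (
    toy_risk_free_rate * (1 - toy_positions) + toy_positions * toy_forward_return
)
print(toy_strategy_returns)
```

With a position of 2 the cash weight is -1, so the strategy earns twice the market return minus one unit of the risk-free rate: `2 * 0.01 - 0.0001 = 0.0199`.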

#### 4.2 Excess Returns

To measure true active performance relative to a risk-free benchmark, both strategy and market returns are expressed in excess form:

$$
r^{\text{strategy excess}}_t = r^{\text{strategy}}_t - r^{f}_t
$$

$$
r^{\text{market excess}}_t = r^{\text{forward}}_t - r^{f}_t
$$

In [None]:
df_validation['strategy_returns'] = df_validation['risk_free_rate'] * (1 - df_validation['position']) + df_validation['position'] * df_validation['forward_returns']
df_validation['strategy_excess_returns'] = df_validation['strategy_returns'] - df_validation['risk_free_rate']
df_validation['market_excess_returns'] = df_validation['forward_returns'] - df_validation['risk_free_rate']

#### 4.3 Trading Days Assumption

Financial markets typically have about **252 trading days** per year, excluding weekends and holidays. This constant is used to annualize daily statistics, such as volatility and Sharpe ratio, so that results from shorter validation windows can be interpreted on a yearly scale.

$\text{trading\_days} = 252$

#### 4.4 Cumulative Excess Returns

The cumulative excess return measures the total compounded growth of the strategy above the risk-free rate, expressed as a growth factor:

$$
\prod_{t=1}^{T} (1 + r^{\text{excess}}_t) = (1 + r^{\text{excess}}_1)(1 + r^{\text{excess}}_2)\dots(1 + r^{\text{excess}}_T)
$$

A value greater than 1.0 indicates positive growth relative to a risk-free investment; subtracting 1 converts the factor into a cumulative return.

This is computed for both:

* the strategy (`strategy_excess_returns`)
* and the market (`market_excess_returns`)

#### 4.5 Mean Excess Return (Geometric)

To get an average daily excess return that accounts for compounding, we use the geometric mean:

$$
\bar{r} = \left(\prod_{t=1}^{T}(1 + r_t^{\text{excess}})\right)^{1/T} - 1
$$

This represents the average daily rate of return (above risk-free) that would produce the same total growth if earned consistently every day.
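
The difference from an arithmetic mean shows up in a two-day example where a 10% gain is followed by a 10% loss. This is a minimal sketch with hypothetical returns.

```python
import numpy as np

# Hypothetical daily excess returns: +10% followed by -10%
toy_excess_returns = np.array([0.10, -0.10])

# Growth factor is 1.1 * 0.9 = 0.99, so the two days lose 1% overall
toy_growth_factor = np.prod(1 + toy_excess_returns)

toy_geometric_mean = toy_growth_factor ** (1 / len(toy_excess_returns)) - 1
toy_arithmetic_mean = toy_excess_returns.mean()

print(f'Geometric mean:  {toy_geometric_mean:.6f}')
print(f'Arithmetic mean: {toy_arithmetic_mean:.6f}')
```

The arithmetic mean is zero, while the geometric mean is slightly negative, reflecting the 1% loss actually realized over the two days.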

#### 4.6 Standard Deviation of Returns

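The standard deviation of daily returns measures volatility, that is, day-to-day fluctuation:

$$
\sigma = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}(r_t - \mu)^2}, \qquad \mu = \frac{1}{T}\sum_{t=1}^{T} r_t
$$

Here $\mu$ is the arithmetic mean of the daily returns, not the geometric mean $\bar{r}$ from the previous section. The standard deviation is computed separately for the strategy’s realized returns and for the market’s forward returns; both are raw returns, not excess returns.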

#### 4.7 Annualized Volatility

Volatility is then annualized by scaling daily standard deviation with the square root of the number of trading days per year:

$$
\sigma_{\text{annual}} = \sigma_{\text{daily}} \times \sqrt{252}
$$

and converted into percentage form. This represents the expected range of annual return fluctuations, assuming daily returns are independent and identically distributed.
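
As a quick numeric check of the scaling, the sketch below annualizes the standard deviation of four hypothetical daily returns. The inputs are made up; `Series.std()` uses the sample estimator (`ddof=1`) by default, which matches the $T-1$ denominator used in this notebook.

```python
import numpy as np
import pandas as pd

toy_trading_days = 252

# Hypothetical daily returns (not from the dataset)
toy_daily_returns = pd.Series([0.01, -0.01, 0.02, -0.02])

toy_daily_std = toy_daily_returns.std()  # sample standard deviation, ddof=1
toy_annual_volatility_pct = toy_daily_std * np.sqrt(toy_trading_days) * 100

print(f'Daily std:             {toy_daily_std:.6f}')
print(f'Annual volatility (%): {toy_annual_volatility_pct:.2f}')
```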

In [None]:
trading_days = 252

strategy_excess_cumulative_returns = (1 + df_validation['strategy_excess_returns']).prod()
strategy_mean_excess_return = strategy_excess_cumulative_returns ** (1 / len(df_validation)) - 1
strategy_std = df_validation['strategy_returns'].std()
strategy_volatility = float(strategy_std * np.sqrt(trading_days) * 100)

market_excess_cumulative_returns = (1 + df_validation['market_excess_returns']).prod()
market_mean_excess_return = market_excess_cumulative_returns ** (1 / len(df_validation)) - 1
market_std = df_validation['forward_returns'].std()
market_volatility = float(market_std * np.sqrt(trading_days) * 100)

print(
f'''
Strategy
--------
Excess Cumulative Returns: {strategy_excess_cumulative_returns:.6f}
Mean Excess Return (Geometric, Daily): {strategy_mean_excess_return:.6f}
Standard Deviation: {strategy_std:.6f}
Annual Volatility (%): {strategy_volatility:.6f}

Market
------
Excess Cumulative Returns: {market_excess_cumulative_returns:.6f}
Mean Excess Return (Geometric, Daily): {market_mean_excess_return:.6f}
Standard Deviation: {market_std:.6f}
Annual Volatility (%): {market_volatility:.6f}
'''
)

#### 4.8 Adjusted Sharpe Ratio and Penalty Components

After computing the basic performance metrics for both the strategy and the market, this section adjusts the Sharpe ratio by penalizing excessive volatility and underperformance relative to the market benchmark. The goal is to ensure that the evaluation metric rewards stable, market-comparable performance rather than aggressive or overleveraged behavior.

#### 4.9 Volatility Penalty

To prevent models from taking on excessive risk, the strategy’s volatility is compared to that of the market:

$$
\text{Excess Volatility} = \max\left(0, \frac{\sigma_{\text{strategy}}}{\sigma_{\text{market}}} - 1.2\right)
$$

If the strategy’s volatility is at most **1.2x the market volatility**, no penalty is applied. Above that threshold, the excess grows **linearly** with the volatility ratio.

The final volatility penalty factor is then:

$$
\text{Volatility Penalty} = 1 + \text{Excess Volatility}
$$

Because the Sharpe ratio is divided by this factor, any volatility beyond the 1.2x threshold directly reduces the adjusted Sharpe score.
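
The two formulas can be wrapped in a small helper to see how the penalty behaves at different volatility ratios. This is a sketch: `compute_volatility_penalty` is a hypothetical name, and returning 1.0 when the market volatility is not positive is an assumption that mirrors the guard used in the notebook cell.

```python
def compute_volatility_penalty(strategy_volatility, market_volatility, threshold=1.2):
    """Return 1 + max(0, strategy/market volatility ratio - threshold)."""
    if market_volatility <= 0:
        # Assumption: no penalty when the ratio is undefined
        return 1.0
    excess_volatility = max(0.0, strategy_volatility / market_volatility - threshold)
    return 1.0 + excess_volatility


# Hypothetical annualized volatilities in percent
print(compute_volatility_penalty(15.0, 15.0))  # ratio 1.0, below the threshold
print(compute_volatility_penalty(30.0, 15.0))  # ratio 2.0, excess of about 0.8
```

A constant 2x position roughly doubles the market's volatility, so a fully leveraged strategy would see its Sharpe ratio divided by about 1.8.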

#### 4.10 Return Penalty

The return gap penalizes strategies that achieve a lower mean excess return than the market:

$$
\text{Return Gap} = \max\left(0, (\bar{r}_{\text{excess\_market}} - \bar{r}_{\text{excess\_strategy}}) \times 100 \times 252\right)
$$

The difference between market and strategy mean excess returns (in daily terms) is annualized and converted into percentage points.
If the strategy outperforms the market, the penalty is zero.

The return penalty is **quadratic**, meaning the penalty increases more sharply for larger gaps:

$$
\text{Return Penalty} = 1 + \frac{(\text{Return Gap})^2}{100}
$$

This ensures that moderate underperformance has a small effect, while large underperformance significantly reduces the final score.
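
A worked example makes the quadratic growth visible. The sketch below uses a hypothetical helper, `compute_return_penalty`, and made-up daily mean excess returns.

```python
def compute_return_penalty(market_mean_excess_return, strategy_mean_excess_return, trading_days=252):
    """Return 1 + gap^2 / 100, with the gap in annualized percentage points."""
    return_gap = max(
        0.0,
        (market_mean_excess_return - strategy_mean_excess_return) * 100 * trading_days,
    )
    return 1.0 + return_gap ** 2 / 100


# Hypothetical daily mean excess returns: the strategy lags the market by 0.02% per day,
# which is a gap of 0.0002 * 100 * 252 = 5.04 annualized percentage points
print(compute_return_penalty(0.0005, 0.0003))

# Outperformance produces no penalty
print(compute_return_penalty(0.0003, 0.0005))
```

A 5.04-point gap gives a penalty of about 1.25, while doubling the gap to 10.08 points would give a penalty of about 2.02.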

#### 4.11 Sharpe Ratio

The Sharpe ratio measures annualized risk-adjusted return:

$$
\text{Sharpe} = \frac{\bar{r}_{\text{strategy}}^{\text{excess}}}{\sigma_{\text{strategy}}} \times \sqrt{252}
$$

It reflects how much excess return the strategy delivers per unit of volatility.

#### 4.12 Adjusted Sharpe Ratio

The adjusted Sharpe ratio incorporates both penalties to discourage excessive risk-taking and below-market returns:

$$
\text{Adjusted Sharpe} =
\frac{\text{Sharpe}}
{\text{Volatility Penalty} \times \text{Return Penalty}}
$$

This ensures that:

* High-volatility strategies (over 1.2× market risk) are penalized,
* Underperforming strategies (below market return) are penalized,
* Only risk-efficient, market-consistent strategies achieve high scores.


In [None]:
excess_volatility = np.maximum(0, strategy_volatility / market_volatility - 1.2) if market_volatility > 0 else 0
volatility_penalty = 1 + excess_volatility

return_gap = max(0, (market_mean_excess_return - strategy_mean_excess_return) * 100 * trading_days)
return_penalty = 1 + (return_gap ** 2) / 100

sharpe = strategy_mean_excess_return / strategy_std * np.sqrt(trading_days)
adjusted_sharpe = sharpe / (volatility_penalty * return_penalty)

print(
f'''
Excess Volatility: {excess_volatility:.6f}
Volatility Penalty: {volatility_penalty:.6f}

Return Gap: {return_gap:.6f}
Return Penalty: {return_penalty:.6f}

Sharpe: {sharpe:.6f}
Adjusted Sharpe: {adjusted_sharpe:.6f}
'''
)