# **Project Description**

This notebook implements an advanced portfolio optimization strategy using Deep Learning models (specifically a **Gated Recurrent Unit (GRU)** network) trained directly to maximize the **Sharpe Ratio**.

The core of the approach is the **hybrid feature set** and the **custom loss function**:
1.  **Hybrid Feature Engineering:** The model input is a combination of:
    * **Low-Rank Risk Factors:** Principal Components (PCA) extracted from the covariance matrix of returns to capture market-wide risk exposures.

    * **Technical Indicators:** Standard momentum and volatility signals (RSI, MACD, etc.).

2.  **Custom Sharpe Ratio Loss:** The model is trained with a custom loss function that minimizes the **Negative Sharpe Ratio** and adds a **concentration penalty** (a hinge on the largest single-asset weight, active above 50%) to discourage over-allocation to any one asset.
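As a rough illustration of the Sharpe objective in item 2, the sketch below computes the negative Sharpe ratio of a softmax-weighted portfolio on toy data using only the standard library. The helper names (`softmax`, `negative_sharpe`) and the toy numbers are assumptions made for illustration. The notebook's own loss is the `SharpeRatioLoss` class, which also adds the concentration penalty omitted here.

```python
import math
import statistics

def softmax(scores):
    """Map raw scores to long-only weights that sum to 1."""
    shifted = [s - max(scores) for s in scores]  # subtract the max for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def negative_sharpe(weights_by_day, returns_by_day, risk_free_rate=0.0, epsilon=1e-6):
    """Negative Sharpe ratio of the daily portfolio returns (lower is better)."""
    portfolio_returns = [
        sum(w * r for w, r in zip(weights, returns))
        for weights, returns in zip(weights_by_day, returns_by_day)
    ]
    mean_return = statistics.mean(portfolio_returns)
    std_dev = statistics.stdev(portfolio_returns)  # sample std, like torch.std's default
    return -(mean_return - risk_free_rate) / (std_dev + epsilon)

# Toy data (hypothetical): 3 days, 2 assets, identical raw scores each day.
weights_by_day = [softmax([0.0, 0.0])] * 3  # equal weights [0.5, 0.5]
returns_by_day = [[0.02, 0.00], [0.00, 0.02], [0.04, 0.00]]
loss = negative_sharpe(weights_by_day, returns_by_day)
print(f"negative Sharpe on toy data: {loss:.4f}")
```

Because the ratio is computed on daily returns without annualization, its magnitude is much smaller than the annualized Sharpe ratios reported in the results.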

## **Methodology Stages**

1.  **Data Acquisition:** Historical prices for a curated set of 87 European stocks are downloaded.

2.  **Feature Engineering:** Technical indicators are calculated, and risk factors are extracted via PCA (an eigendecomposition of the returns covariance matrix).

3.  **Model Architecture:** A sequential GRU model and a Feedforward Neural Network (FNN) benchmark are trained to output long-only (Softmax-constrained) daily portfolio weights.

4.  **Out-of-Sample Simulation:** The models' performance is evaluated on unseen test data (post-2024-01-01) based on key financial metrics.
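The factor extraction in stage 2 keeps the smallest number of principal components whose eigenvalues explain at least a chosen share of total variance (90% by default in the notebook). The sketch below shows that selection rule on made-up eigenvalues. The function name `count_factors` and the toy values are assumptions; the notebook itself obtains the eigenvalues from `np.linalg.eigh` on the returns covariance matrix.

```python
from itertools import accumulate

def count_factors(eigen_values, variance_threshold=0.90):
    """Smallest k whose top-k eigenvalues explain at least the threshold share of variance."""
    ordered = sorted(eigen_values, reverse=True)  # largest eigenvalue first
    total_variance = sum(ordered)
    for k, running_sum in enumerate(accumulate(ordered), start=1):
        if running_sum / total_variance >= variance_threshold:
            return k
    return len(ordered)

# Hypothetical eigenvalues of a 5-asset covariance matrix (unsorted on purpose).
toy_eigen_values = [0.5, 4.0, 1.0, 3.0, 1.5]
k = count_factors(toy_eigen_values)
print(f"factors kept at 90% variance: {k}")
```

In the run recorded in this notebook, the same rule keeps 45 factors out of 87 assets.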

WARNING:

This notebook is not a financial advisor.



In [1]:
pip install ta

Collecting ta
  Downloading ta-0.11.0.tar.gz (25 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: ta
  Building wheel for ta (setup.py) ... done
  Created wheel for ta: filename=ta-0.11.0-py3-none-any.whl size=29412 sha256=95e266d7bd5b7c20079e719d7fcfd36184443fd0b9bf2b844dad18572e8accfa
  Stored in directory: /root/.cache/pip/wheels/5c/a1/5f/c6b85a7d9452057be4ce68a8e45d77ba34234a6d46581777c6
Successfully built ta
Installing collected packages: ta
Successfully installed ta-0.11.0


In [2]:
import yfinance as yf
import pandas as pd
import numpy as np
import ta
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import MinMaxScaler
from datetime import datetime

In [3]:




TICKERS_BY_COUNTRY = {
    'Germany': ['ALV.DE', 'DBK.DE', 'CBK.DE', 'HAG.DE', 'DB1.DE', 'FPE.DE', 'DHER.DE', 'MUV2.DE', 'VNA.DE', 'SDF.DE'],
    'France': ['BNP.PA', 'ACA.PA', 'GLE.PA', 'CS.PA', 'OR.PA', 'ENGI.PA', 'SCR.PA', 'CA.PA', 'PUB.PA', 'SAN.PA'],
    'UK': ['LLOY.L', 'BARC.L', 'HSBA.L', 'NWG.L', 'SSE.L', 'AV.L', 'PRU.L', 'LGEN.L', 'AHT.L', 'BP.L'],
    'Spain': ['SAN.MC', 'BBVA.MC', 'CABK.MC', 'AMS.MC', 'MAP.MC', 'SAB.MC', 'ELE.MC', 'ENG.MC', 'IBE.MC', 'IAG.MC'],
    'Italy': ['ISP.MI', 'UCG.MI', 'BAMI.MI', 'BMED.MI', 'FBK.MI', 'G.MI', 'AZM.MI', 'PST.MI', 'RACE.MI', 'IP.MI'],
    'Netherlands': ['INGA.AS', 'ADYEN.AS', 'ABN.AS', 'WKL.AS', 'AD.AS', 'ASML.AS', 'HEIA.AS', 'TKWY.AS', 'KPN.AS'],
    'Sweden': ['NDA-SE.ST', 'SEB-A.ST', 'SHB-A.ST', 'SWED-A.ST', 'GETI-B.ST', 'VOLV-B.ST', 'AZN.ST', 'TELIA.ST', 'ESSITY-B.ST', 'ERIC-B.ST'],
    'Switzerland': ['UBSG.SW', 'VETN.SW', 'ZURN.SW', 'NESN.SW', 'SGSN.SW', 'CFR.SW', 'GIVN.SW', 'SREN.SW', 'NOVN.SW'],
    'Belgium': ['KBC.BR', 'ABI.BR', 'UCB.BR', 'ACKB.BR', 'SOLB.BR', 'TUB.BR', 'ELI.BR', 'WDP.BR', 'COLR.BR']
}

TICKERS = [ticker for sublist in TICKERS_BY_COUNTRY.values() for ticker in sublist]
START_DATE = '2021-01-01'
END_DATE = '2025-09-30'
NUM_ASSETS = len(TICKERS)

SEQ_LENGTH = 10
TRAIN_TEST_SPLIT_DATE = '2024-01-01'
K_FACTORS = 5  # Placeholder; overwritten by the value returned from Stage 3

print(f"Total number of assets: **{NUM_ASSETS}**")


def stage_1_acquire_and_prepare_data(tickers, start, end):

    print("\n--- Stage 1: Data Acquisition and Preparation ---")


    data = yf.download(tickers, start=start, end=end, progress=False)


    close_prices_all = data['Close'].copy()


    # Forward-fill gaps, then back-fill any leading NaNs (back-filling copies later prices backward).
    close_prices_filled = close_prices_all.ffill().bfill()


    daily_returns_all = close_prices_filled.pct_change()


    daily_returns_for_check = daily_returns_all.iloc[1:]
    valid_tickers = daily_returns_for_check.columns[daily_returns_for_check.notna().all()].tolist()

    if not valid_tickers:
        print("CRITICAL ERROR: No assets with complete daily returns found after filling NaNs and checking for completeness.")
        print("This often happens if all chosen tickers were delisted, had extremely sparse data, or the date range is too strict.")

        return pd.DataFrame(), pd.DataFrame()


    print(f"Number of clean assets: **{len(valid_tickers)}** (from {len(tickers)} initial)")


    data_filtered_by_tickers = data.loc[:, (slice(None), valid_tickers)]


    data_clean = data_filtered_by_tickers.dropna()


    close_prices_clean = data_clean['Close']
    daily_returns_clean = close_prices_clean.pct_change().dropna(how='all')


    print(f"Clean OHLCV Data shape: {data_clean.shape}")
    print(f"Clean Daily Returns shape: {daily_returns_clean.shape}")

    return data_clean, daily_returns_clean

full_price_data, daily_returns = stage_1_acquire_and_prepare_data(TICKERS, START_DATE, END_DATE)


def stage_2_feature_engineering(full_price_data):

    print("\n--- Stage 2: Feature Engineering (Technical Indicators) ---")

    features = []

    asset_names = full_price_data.columns.get_level_values(1).unique().tolist()


    for ticker in asset_names:

        df = full_price_data.loc[:, (slice(None), ticker)].copy()


        df.columns = df.columns.droplevel(1)


        df.loc[:, 'MA20'] = df['Close'].rolling(window=20).mean()
        df.loc[:, 'EMA50'] = ta.trend.ema_indicator(df['Close'], window=50)
        macd = ta.trend.MACD(df['Close'])
        df.loc[:, 'MACD'] = macd.macd_diff()


        df.loc[:, 'RSI'] = ta.momentum.rsi(df['Close'], window=14)

        stoch = ta.momentum.StochasticOscillator(high=df['High'], low=df['Low'], close=df['Close'])
        df.loc[:, 'STOCH_K'] = stoch.stoch()


        bb = ta.volatility.BollingerBands(close=df['Close'], window=20)
        df.loc[:, 'BB_WIDTH'] = bb.bollinger_wband()

        df.loc[:, 'ATR'] = ta.volatility.AverageTrueRange(high=df['High'], low=df['Low'], close=df['Close'], window=14).average_true_range()



        indicator_cols = ['MA20', 'EMA50', 'MACD', 'RSI', 'STOCH_K', 'BB_WIDTH', 'ATR']
        asset_features = df[indicator_cols].copy()
        asset_features.columns = [f'{ticker}_{col}' for col in indicator_cols]
        features.append(asset_features)


    feature_matrix = pd.concat(features, axis=1)


    feature_matrix_clean = feature_matrix.dropna()
    print(f"Feature Matrix shape: {feature_matrix_clean.shape}")
    print(f"Total features: **{len(feature_matrix_clean.columns)}** (7 indicators * {len(asset_names)} assets)")

    return feature_matrix_clean

feature_matrix = stage_2_feature_engineering(full_price_data)


def stage_3_low_rank_feature_extraction(returns_df, variance_threshold=0.90):

    print("\n--- Stage 3: Low-Rank Feature Extraction (Compression) ---")


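    # Align returns with the feature dates. NOTE: this reads the module-level `feature_matrix` from Stage 2.
    aligned_returns = returns_df.loc[feature_matrix.index]

    # Covariance over the full sample; this includes the test period, so the factors are not strictly out-of-sample.
    covariance_matrix = aligned_returns.cov()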
    print(f"Covariance Matrix shape: {covariance_matrix.shape}")


    eigen_values, eigen_vectors = np.linalg.eigh(covariance_matrix.values)


    sorted_indices = np.argsort(eigen_values)[::-1]
    eigen_values = eigen_values[sorted_indices]
    eigen_vectors = eigen_vectors[:, sorted_indices]


    total_variance = np.sum(eigen_values)
    cumulative_variance_ratio = np.cumsum(eigen_values) / total_variance


    k_factors = np.where(cumulative_variance_ratio >= variance_threshold)[0][0] + 1

    print(f"Total Variance: {total_variance:.4f}")
    print(f"Number of factors (k) retaining >= {variance_threshold*100}% variance: **{k_factors}**")


    factor_vectors = eigen_vectors[:, :k_factors]


    returns_values = aligned_returns.values
    factor_returns_values = returns_values @ factor_vectors


    factor_returns = pd.DataFrame(
        factor_returns_values,
        index=aligned_returns.index,
        columns=[f'Factor_{i+1}' for i in range(k_factors)]
    )

    print(f"Factor Returns shape: {factor_returns.shape}")


    final_input_df = pd.concat([factor_returns, feature_matrix.loc[factor_returns.index]], axis=1)

    return final_input_df, aligned_returns, k_factors

final_input_data, aligned_returns, K_FACTORS = stage_3_low_rank_feature_extraction(daily_returns)


INPUT_SIZE = final_input_data.shape[1]
HIDDEN_SIZE_RNN = 64
NUM_LAYERS_RNN = 2
NUM_ASSETS_FINAL = aligned_returns.shape[1]


class SharpeRatioLoss(nn.Module):

    def __init__(self, risk_free_rate=0.0, concentration_lambda=0.1):
        super().__init__()
        self.risk_free_rate = risk_free_rate
        self.concentration_lambda = concentration_lambda

    def forward(self, weights, returns):

        portfolio_returns = torch.sum(weights * returns, dim=1)


        mean_return = torch.mean(portfolio_returns)
        std_dev = torch.std(portfolio_returns)


        epsilon = 1e-6


        sharpe_ratio = (mean_return - self.risk_free_rate) / (std_dev + epsilon)
        negative_sharpe_ratio = -sharpe_ratio


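        # Hinge penalty: only the part of each day's largest weight above 50% is penalized,
        # then averaged over the batch and scaled by 100.
        max_weight = torch.max(weights, dim=1).values
        penalty = torch.relu(max_weight - 0.50)
        concentration_penalty = torch.mean(penalty) * 100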

        total_loss = negative_sharpe_ratio + self.concentration_lambda * concentration_penalty

        return total_loss


class PortfolioGRU(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super().__init__()

        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)

        self.fc = nn.Linear(hidden_size, output_size)

        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):

        gru_out, _ = self.gru(x)

        last_step_out = gru_out[:, -1, :]
        raw_weights = self.fc(last_step_out)

        weights = self.softmax(raw_weights)
        return weights


class PortfolioFNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()

        self.layer1 = nn.Linear(input_size * SEQ_LENGTH, hidden_size)
        self.relu = nn.ReLU()
        self.layer2 = nn.Linear(hidden_size, hidden_size)
        self.fc_out = nn.Linear(hidden_size, output_size)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):

        x = x.reshape(x.size(0), -1)

        x = self.relu(self.layer1(x))
        x = self.relu(self.layer2(x))
        raw_weights = self.fc_out(x)
        weights = self.softmax(raw_weights)
        return weights

def prepare_dl_data(data, returns, seq_length, train_split_date):



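    # Scale every input column to [0, 1].
    # NOTE: the scaler is fit on the full sample, so test-period minima and maxima also shape the training inputs.
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled_data = scaler.fit_transform(data)
    scaled_df = pd.DataFrame(scaled_data, index=data.index, columns=data.columns)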

    X, Y = [], []
    for i in range(seq_length, len(scaled_df)):

        X.append(scaled_df.iloc[i-seq_length:i].values)

        Y.append(returns.iloc[i].values)

    X = np.array(X)
    Y = np.array(Y)


    dates = scaled_df.index[seq_length:]


    train_idx = dates < train_split_date
    test_idx = dates >= train_split_date

    X_train, Y_train = X[train_idx], Y[train_idx]
    X_test, Y_test = X[test_idx], Y[test_idx]


    X_train_t = torch.tensor(X_train, dtype=torch.float32)
    Y_train_t = torch.tensor(Y_train, dtype=torch.float32)
    X_test_t = torch.tensor(X_test, dtype=torch.float32)
    Y_test_t = torch.tensor(Y_test, dtype=torch.float32)

    print(f"\nTraining set size: {X_train_t.shape[0]} days")
    print(f"Testing set size: {X_test_t.shape[0]} days")


    test_returns = returns.loc[dates[test_idx]]

    return X_train_t, Y_train_t, X_test_t, Y_test_t, test_returns

X_train, Y_train, X_test, Y_test, test_returns_df = prepare_dl_data(
    final_input_data, aligned_returns, SEQ_LENGTH, TRAIN_TEST_SPLIT_DATE
)


def train_model(model, X_train, Y_train, epochs=200, lr=0.001):

    criterion = SharpeRatioLoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)


    for epoch in range(epochs):
        model.train()


        weights = model(X_train)

        loss = criterion(weights, Y_train)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (epoch + 1) % 50 == 0:
            print(f'Epoch [{epoch+1}/{epochs}], Loss (Neg Sharpe): {loss.item():.4f}')

    return model


print("\n--- Training Model 1: GRU-based Optimizer ---")
gru_model = PortfolioGRU(INPUT_SIZE, HIDDEN_SIZE_RNN, NUM_LAYERS_RNN, NUM_ASSETS_FINAL)
gru_model = train_model(gru_model, X_train, Y_train, epochs=200)

print("\n--- Training Model 2: FNN Benchmarking Model ---")
fnn_model = PortfolioFNN(INPUT_SIZE, HIDDEN_SIZE_RNN, NUM_ASSETS_FINAL)
fnn_model = train_model(fnn_model, X_train, Y_train, epochs=200)


def simulate_portfolio(model, X_test, returns_df):

    model.eval()
    with torch.no_grad():

        weights_t = model(X_test).numpy()


    portfolio_returns = np.sum(weights_t * returns_df.values, axis=1)


    portfolio_returns_series = pd.Series(portfolio_returns, index=returns_df.index)

    return portfolio_returns_series, weights_t


gru_portfolio_returns, gru_weights = simulate_portfolio(gru_model, X_test, test_returns_df)
fnn_portfolio_returns, fnn_weights = simulate_portfolio(fnn_model, X_test, test_returns_df)


print("\n--- Stage 5: Portfolio Simulation Complete ---")
print(f"GRU Portfolio Returns simulated over {len(gru_portfolio_returns)} days.")

def calculate_metrics(returns_series, weights_array=None, name="Portfolio"):



    annual_returns = returns_series.mean() * 252
    annual_volatility = returns_series.std() * np.sqrt(252)
    sharpe_ratio = annual_returns / annual_volatility

    cumulative_return = (1 + returns_series).prod() - 1


    cumulative_wealth = (1 + returns_series).cumprod()
    peak = cumulative_wealth.expanding(min_periods=1).max()
    drawdown = (cumulative_wealth / peak) - 1
    max_drawdown = drawdown.min()

    metrics = {
        'Name': name,
        'Sharpe Ratio': sharpe_ratio,
        r'Annual Volatility ($\sigma_p$)': annual_volatility,  # raw string: '\s' is not a valid escape sequence
        'Cumulative Return': cumulative_return,
        'Maximum Drawdown (MDD)': max_drawdown
    }


    if weights_array is not None:

        daily_hhi = np.sum(weights_array**2, axis=1)
        mean_hhi = np.mean(daily_hhi)


        mean_ena = 1 / mean_hhi

        metrics['Mean HHI'] = mean_hhi
        metrics['Mean ENA'] = mean_ena

    return metrics


gru_metrics = calculate_metrics(gru_portfolio_returns, gru_weights, "GRU Model Portfolio")
fnn_metrics = calculate_metrics(fnn_portfolio_returns, fnn_weights, "FNN Model Portfolio")



report_df = pd.DataFrame([gru_metrics, fnn_metrics])


report_df['Sharpe Ratio'] = report_df['Sharpe Ratio'].round(4)
report_df[r'Annual Volatility ($\sigma_p$)'] = (report_df[r'Annual Volatility ($\sigma_p$)'] * 100).round(2).astype(str) + '%'
report_df['Cumulative Return'] = (report_df['Cumulative Return'] * 100).round(2).astype(str) + '%'
report_df['Maximum Drawdown (MDD)'] = (report_df['Maximum Drawdown (MDD)'].abs() * 100).round(2).astype(str) + '%'
if 'Mean HHI' in report_df.columns:
    report_df['Mean HHI'] = report_df['Mean HHI'].round(4)
if 'Mean ENA' in report_df.columns:
    report_df['Mean ENA'] = report_df['Mean ENA'].round(2)


print("\n--- Stage 6: Performance Report (Out-of-Sample) ---")
print(report_df.to_markdown(index=False))


best_model_name = report_df.iloc[report_df['Sharpe Ratio'].idxmax()]['Name']
print(f"\n **Best Performing Model (Highest Sharpe Ratio): {best_model_name}**")



Total number of assets: **87**

--- Stage 1: Data Acquisition and Preparation ---
Number of clean assets: **87** (from 87 initial)
Clean OHLCV Data shape: (1158, 435)
Clean Daily Returns shape: (1157, 87)

--- Stage 2: Feature Engineering (Technical Indicators) ---
Feature Matrix shape: (1109, 609)
Total features: **609** (7 indicators * 87 assets)

--- Stage 3: Low-Rank Feature Extraction (Compression) ---
Covariance Matrix shape: (87, 87)
Total Variance: 0.0303
Number of factors (k) retaining >= 90.0% variance: **45**
Factor Returns shape: (1109, 45)

Training set size: 677 days
Testing set size: 422 days

--- Training Model 1: GRU-based Optimizer ---
Epoch [50/200], Loss (Neg Sharpe): -0.3271
Epoch [100/200], Loss (Neg Sharpe): -0.5326
Epoch [150/200], Loss (Neg Sharpe): -0.7390
Epoch [200/200], Loss (Neg Sharpe): -0.8805

--- Training Model 2: FNN Benchmarking Model ---
Epoch [50/200], Loss (Neg Sharpe): -0.1517
Epoch [100/200], Loss (Neg Sharpe): -0.2383
Epoch [150/200], Loss (Neg

# **Results**

## **Key Results**

The custom `SharpeRatioLoss` function includes an explicit concentration penalty ($L_{concentration}$):

$$\text{penalty} = \text{torch.relu}(\max(\text{weights}) - 0.50)$$

This term contributes to the total loss only when a single asset's weight exceeds 50%.


By minimizing this loss, the model is discouraged from placing more than half of the portfolio in any single asset.
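To make the size of the penalty concrete, here is a small worked example in plain Python. It follows the arithmetic of the notebook's loss (hinge at 0.50, batch mean, scale of 100, `concentration_lambda` of 0.1), but the function name `concentration_penalty` and the weight vectors are hypothetical.

```python
def concentration_penalty(weights_by_day, cap=0.50, scale=100.0):
    """Mean hinge on each day's largest weight above `cap`, multiplied by `scale`."""
    hinges = [max(0.0, max(weights) - cap) for weights in weights_by_day]
    return scale * sum(hinges) / len(hinges)

# Hypothetical weights: day 1 is concentrated, day 2 is diversified.
weights_by_day = [
    [0.70, 0.20, 0.10],  # largest weight 0.70 -> hinge of about 0.20
    [0.40, 0.30, 0.30],  # largest weight 0.40 -> hinge of 0.00
]
penalty = concentration_penalty(weights_by_day)  # about 10.0
loss_contribution = 0.1 * penalty                # about 1.0 with concentration_lambda = 0.1
print(f"penalty={penalty:.2f}, contribution to loss={loss_contribution:.2f}")
```

A contribution of roughly 1.0 is larger in magnitude than any of the training losses logged above, so even a brief breach of the 50% cap would dominate the objective. A portfolio split 50/50 between two assets incurs no penalty at all.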

The **GRU Model Portfolio** outperformed the FNN benchmark on every reported metric over the out-of-sample test set:

| Metric | GRU Model Portfolio | FNN Model Portfolio |
| :--- | :--- | :--- |
| **Sharpe Ratio** | **2.0118 (Outstanding)** | 1.2080 |
| Cumulative Return | **64.40%** | 35.06% |
| Max Drawdown (MDD) | **12.49%** | 17.48% |
| Mean ENA (Diversification) | **4.86** | 3.89 |
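The Mean ENA row is the reciprocal of the mean daily HHI (the sum of squared weights), as computed in `calculate_metrics`. The sketch below reproduces that definition on two extreme portfolios. The helper names and the example weights are assumptions for illustration.

```python
def hhi(weights):
    """Herfindahl-Hirschman Index: sum of squared portfolio weights."""
    return sum(w * w for w in weights)

def effective_number_of_assets(weights_by_day):
    """Reciprocal of the mean daily HHI, matching the notebook's 'Mean ENA'."""
    mean_hhi = sum(hhi(w) for w in weights_by_day) / len(weights_by_day)
    return 1.0 / mean_hhi

# Two hypothetical one-day portfolios over four assets.
equal_weight = [[0.25, 0.25, 0.25, 0.25]]
single_asset = [[1.0, 0.0, 0.0, 0.0]]
ena_equal = effective_number_of_assets(equal_weight)
ena_single = effective_number_of_assets(single_asset)
print(ena_equal, ena_single)
```

Read this way, a Mean ENA of 4.86 means the GRU portfolio is about as concentrated as an equal-weight portfolio of five stocks, out of the 87 available.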

**WARNING**:

This notebook is not a financial advisor.

