# Case Study and Empirical Analysis of Portfolio Optimization Strategies Described in the Thesis

### 🔍 Case Study Overview

This notebook presents the case study referenced in the thesis, focusing on an empirical evaluation of portfolio optimization methods using the WIG20 index. The objective is to assess the practical performance of various optimization techniques and compare them to benchmark strategies such as equally weighted and inverse-volatility portfolios.

We begin by retrieving historical data for the WIG20 index and its constituent stocks. This dataset forms the basis for all subsequent analyses, including return computation, risk assessment, and backtesting of portfolio strategies.

In [2]:
import pandas as pd
import yfinance as yf
import datetime

from typing import List, Optional

### 📊 Data Retrieval

The historical WIG20 data used in this analysis was retrieved from [Yahoo Finance](https://finance.yahoo.com) using the `yfinance` Python library. This library provides a convenient interface for programmatically downloading financial time series data.

To obtain the relevant dataset, a list of WIG20 stock tickers must be defined, formatted according to Yahoo Finance conventions (stocks listed on the Warsaw Stock Exchange carry the `.WA` suffix). The user can specify either:

- A **start date** and **end date**, defining the time range for historical data, or  
- A fixed **period** (e.g., `"1y"` for one year), which automatically selects the date range from today going backward.
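
As a rough illustration of the second option, the sketch below maps a relative period onto an explicit date range counted backward from a given day. The helper `period_to_range` is hypothetical and is not part of `yfinance`; it only handles the `d`, `mo` and `y` units, and the library's own handling of periods may differ in detail.

```python
import calendar
import datetime


def period_to_range(period: str, today: datetime.date):
    """Hypothetical helper: turn a period such as "3y" into (start, end) dates."""
    # Split e.g. "3mo" into the count (3) and the unit ("mo").
    digits = "".join(ch for ch in period if ch.isdigit())
    unit = period[len(digits):]
    count = int(digits)

    if unit == "d":
        start = today - datetime.timedelta(days=count)
    elif unit in ("mo", "y"):
        months = count * (12 if unit == "y" else 1)
        # Step back whole months, clamping the day to the target month's length.
        total = today.year * 12 + (today.month - 1) - months
        year, month_index = divmod(total, 12)
        day = min(today.day, calendar.monthrange(year, month_index + 1)[1])
        start = datetime.date(year, month_index + 1, day)
    else:
        raise ValueError(f"unsupported period unit: {unit!r}")
    return start, today


print(period_to_range("3y", datetime.date(2025, 6, 1)))
```

Under these assumptions, a three-year period ending on 1 June 2025 starts on 1 June 2022, which is the explicit range used in this case study.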

The `yfinance` library handles data downloading and returns well-structured historical data, including daily open, high, low, close, adjusted close prices, and trading volume for each specified asset.

We specify the tickers, as well as the start date and end date.

In [39]:
WIG20 = [
         "ALR.WA", "ALE.WA", "BDX.WA", "CCC.WA", "CDR.WA",
         "CPS.WA", "DNP.WA", "KTY.WA", "JSW.WA", "KGH.WA",
         "KRU.WA", "LPP.WA", "MBK.WA", "OPL.WA", "PEO.WA",
         "PGE.WA", "PKN.WA", "PKO.WA", "PZU.WA", "PCO.WA"
     ]

start = datetime.datetime(2022,6,1)
end = datetime.datetime(2025,6,1)
# Alternatively, pass period="3y" to request the last three years relative to today.

### 📥 Data Download Utility Function

For the case study and application purposes, I developed a custom function to streamline the process of downloading financial data via the `yfinance` library. This function allows users to specify either explicit start and end dates or a relative period (e.g., last year), providing flexible data retrieval options.

This utility serves as the core data-fetching component both in the notebook analysis and the accompanying application, ensuring consistent and efficient access to historical market data.


In [4]:
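def gather_data(tickers: List[str], 
                start_date: Optional[str] = None, 
                end_date: Optional[str] = None, 
                period: str = '1y') -> pd.DataFrame:
    """Download closing prices for `tickers`, sorted from newest to oldest.

    Dates may be given as "YYYY-MM-DD" strings or as datetime objects.
    """
    # Explicit start and end dates take precedence over the relative period.
    if start_date and end_date:
        data = yf.download(tickers, group_by="column", start=start_date, end=end_date)
    else:
        data = yf.download(tickers, group_by="column", period=period)

    # Keep only the closing prices (one column per ticker).
    data = data['Close']
    # If a single ticker yields a Series, convert it to a one-column DataFrame.
    if isinstance(data, pd.Series):
        data = data.to_frame(name=tickers[0])

    # Make sure the index holds proper timestamps before sorting by date.
    data.index = pd.to_datetime(data.index)
    return data.sort_index(ascending=False)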

The ten most recent daily closing prices are shown below.

In [40]:
wig20 = gather_data(WIG20, start_date= start, end_date= end)
wig20.head(10)



| Date | ALE.WA | ALR.WA | BDX.WA | CCC.WA | CDR.WA | CPS.WA | DNP.WA | JSW.WA | KGH.WA | KRU.WA | KTY.WA | LPP.WA | MBK.WA | OPL.WA | PCO.WA | PEO.WA | PGE.WA | PKN.WA | PKO.WA | PZU.WA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2025-05-30 | 34.445 | 103.949997 | 623.799988 | 218.399994 | 221.800003 | 16.700001 | 549.599976 | 22.66 | 122.900002 | 392.100006 | 866.5 | 14420.0 | 820.0 | 9.642 | 17.09 | 184.300003 | 9.338 | 73.540001 | 75.279999 | 61.18 |
| 2025-05-29 | 34.599998 | 104.599998 | 629.0 | 218.800003 | 220.5 | 16.139999 | 546.0 | 22.780001 | 123.449997 | 406.0 | 868.5 | 14740.0 | 847.599976 | 9.872 | 17.115 | 185.0 | 9.706 | 72.629997 | 76.019997 | 61.040001 |
| 2025-05-28 | 35.205002 | 107.550003 | 657.0 | 229.5 | 219.100006 | 16.5 | 553.0 | 23.18 | 124.650002 | 414.399994 | 877.5 | 15350.0 | 856.400024 | 9.826 | 17.65 | 186.600006 | 9.472 | 74.550003 | 77.800003 | 61.900002 |
| 2025-05-27 | 33.549999 | 107.300003 | 661.400024 | 238.199997 | 220.199997 | 16.639999 | 551.0 | 22.879999 | 125.849998 | 420.5 | 872.0 | 15590.0 | 842.0 | 9.84 | 17.280001 | 184.600006 | 9.244 | 72.599998 | 77.080002 | 61.0 |
| 2025-05-26 | 33.514999 | 106.650002 | 658.799988 | 235.300003 | 220.100006 | 16.700001 | 544.799988 | 23.049999 | 127.300003 | 421.0 | 867.5 | 15625.0 | 826.400024 | 9.804 | 16.965 | 181.800003 | 9.2 | 71.900002 | 75.82 | 60.5 |
| 2025-05-23 | 33.055 | 103.449997 | 640.0 | 225.800003 | 218.199997 | 16.6 | 527.200012 | 22.98 | 123.599998 | 399.200012 | 834.5 | 15020.0 | 791.599976 | 9.722 | 16.33 | 176.5 | 8.976 | 70.550003 | 74.120003 | 58.619999 |
| 2025-05-22 | 32.224998 | 104.099998 | 643.200012 | 229.100006 | 220.0 | 16.610001 | 527.400024 | 22.799999 | 122.099998 | 392.5 | 834.5 | 15495.0 | 802.799988 | 9.848 | 16.700001 | 178.850006 | 9.21 | 70.519997 | 74.900002 | 60.099998 |
| 2025-05-21 | 32.974998 | 105.400002 | 645.0 | 228.199997 | 225.0 | 17.004999 | 527.200012 | 23.15 | 122.349998 | 394.200012 | 848.0 | 15600.0 | 800.0 | 9.926 | 18.395 | 179.399994 | 9.478 | 71.68 | 75.18 | 60.099998 |
| 2025-05-20 | 33.799999 | 106.699997 | 634.200012 | 226.699997 | 229.899994 | 17.02 | 533.799988 | 23.16 | 123.599998 | 402.700012 | 852.0 | 15830.0 | 803.400024 | 9.888 | 18.615 | 181.5 | 9.506 | 72.989998 | 76.0 | 62.400002 |
| 2025-05-19 | 33.330002 | 105.5 | 624.200012 | 229.800003 | 232.199997 | 17.16 | 532.400024 | 23.190001 | 124.150002 | 404.100006 | 863.0 | 15980.0 | 806.599976 | 9.692 | 18.299999 | 178.75 | 9.476 | 72.230003 | 75.599998 | 62.560001 |


In [8]:
wig20.head(10).to_latex()


With the price dataset in place, we can begin the empirical analysis of portfolio optimization. We start by testing the portfolio construction techniques described in Chapter 3.

In [10]:
from skfolio.preprocessing import prices_to_returns
from skfolio.model_selection import WalkForward, cross_val_predict
from skfolio.optimization import InverseVolatility, EqualWeighted

In [45]:
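# prices_to_returns computes returns between consecutive rows in the order given.
# gather_data returns the rows newest-first, so pass wig20.sort_index() instead
# if the returns should run forward in time.
returns = prices_to_returns(wig20)

# Walk-forward backtest: fit on 120 observations, evaluate on the next 21, then roll forward.
EW_portfolio = cross_val_predict(EqualWeighted(), returns, cv=WalkForward(test_size=21, train_size=120))
IV_portfolio = cross_val_predict(InverseVolatility(), returns, cv=WalkForward(test_size=21, train_size=120))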
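
To make the walk-forward scheme concrete, the sketch below generates the training and test row ranges for a rolling window in plain Python. The function `walk_forward_splits` is a hypothetical illustration and not the `skfolio` implementation, which may treat details such as an incomplete final window differently.

```python
def walk_forward_splits(n_observations, train_size, test_size):
    """Yield (train, test) index ranges for a rolling walk-forward backtest."""
    start = 0
    # Stop as soon as a full training window plus a full test window no longer fits.
    while start + train_size + test_size <= n_observations:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        # Advance by one test window so that test periods never overlap.
        start += test_size


for train, test in walk_forward_splits(162, train_size=120, test_size=21):
    print(f"train rows {train.start}-{train.stop - 1}, test rows {test.start}-{test.stop - 1}")
```

Each model is refitted on the most recent 120 rows only, and the out-of-sample test windows are then chained into a single backtest path.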

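We first evaluate the performance of the two benchmark portfolios, Equal Weighted and Inverse Volatility. The helper functions below print the performance metrics of a portfolio or collect them in a table.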

In [None]:
def print_performance_metrics(portfolio):
    print("📊 Portfolio Performance Metrics")
    print("-" * 45)
    print(f"Sharpe Ratio:                    {portfolio.sharpe_ratio:.4f}")
    print(f"Annualized Sharpe Ratio:         {portfolio.annualized_sharpe_ratio:.4f}")
    print(f"Variance:                        {portfolio.variance:.4f}")
    print(f"Annualized Variance:             {portfolio.annualized_variance:.4f}")
    print(f"Volatility:                      {portfolio.standard_deviation:.4f}")
    print(f"Annualized Volatility:           {portfolio.annualized_standard_deviation:.4f}")
    print(f"Mean Return:                     {portfolio.mean:.4f}")
    print(f"Annualized Mean Return:          {portfolio.annualized_mean:.4f}")
    print(f"Cumulative Return:               {portfolio.cumulative_returns[-1]:.4f}")
    print("-" * 45)

In [56]:
def create_performance_metrics_table(portfolio, name="Portfolio 1"):
    metrics = {
        "Sharpe Ratio": round(portfolio.sharpe_ratio, 4),
        "Annualized Sharpe": round(portfolio.annualized_sharpe_ratio, 4),
        "Variance": round(portfolio.variance, 6),
        "Annualized Variance": round(portfolio.annualized_variance, 6),
        "Standard Deviation": round(portfolio.standard_deviation, 4),
        "Annualized Std Dev": round(portfolio.annualized_standard_deviation, 4),
        "Mean Return": round(portfolio.mean, 4),
        "Annualized Mean": round(portfolio.annualized_mean, 4),
        "Cumulative Return": round(portfolio.cumulative_returns[-1], 4)
    }

    df = pd.DataFrame(list(metrics.items()), columns=["Metrics", name])
    return df

In [74]:
EW_performance = create_performance_metrics_table(EW_portfolio, 'EqualWeighted')
IV_performance = create_performance_metrics_table(IV_portfolio, 'Inverse Volatility')
performance_comparison = pd.merge(EW_performance, IV_performance, on='Metrics')
performance_comparison

| Metrics | EqualWeighted | Inverse Volatility |
|---|---|---|
| Sharpe Ratio | -0.0247 | -0.0296 |
| Annualized Sharpe | -0.3922 | -0.4699 |
| Variance | 0.00018 | 0.000171 |
| Annualized Variance | 0.045343 | 0.043194 |
| Standard Deviation | 0.0134 | 0.0131 |
| Annualized Std Dev | 0.2129 | 0.2078 |
| Mean Return | -0.0003 | -0.0004 |
| Annualized Mean | -0.0835 | -0.0977 |
| Cumulative Return | -0.2018 | -0.236 |
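
The relationship between the daily and annualized rows can be sketched as follows. The factor of 252 trading days per year is an assumption here, although the ratios in the table above are consistent with it (for example, 0.045343 / 0.00018 is close to 252). The function `annualize` is a hypothetical helper, not a `skfolio` call.

```python
import math

TRADING_DAYS = 252  # assumed number of trading days per year


def annualize(daily_mean, daily_std, periods=TRADING_DAYS):
    """Scale a daily mean and standard deviation to annual figures."""
    annual_mean = daily_mean * periods           # the mean scales linearly with time
    annual_std = daily_std * math.sqrt(periods)  # volatility scales with the square root of time
    annual_sharpe = annual_mean / annual_std     # equals the daily Sharpe ratio times sqrt(periods)
    return annual_mean, annual_std, annual_sharpe


# Rounded daily figures of the equally weighted portfolio, taken from the table.
mean, std, sharpe = annualize(daily_mean=-0.0003, daily_std=0.0134)
print(f"{mean:.4f} {std:.4f} {sharpe:.4f}")
```

Because the daily inputs are rounded to four decimals, the output only approximates the annualized values reported in the table.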


We create a minimum-variance model (`ObjectiveFunction.MINIMIZE_RISK` with variance as the risk measure) that uses the empirical mean for the expected returns and a denoised estimate of the covariance matrix.

In [59]:
from skfolio import RiskMeasure
from skfolio.moments import DenoiseCovariance, EmpiricalMu
from skfolio.optimization import MeanRisk, ObjectiveFunction
from skfolio.prior import EmpiricalPrior

In [65]:
model = MeanRisk(
    objective_function = ObjectiveFunction.MINIMIZE_RISK,
    risk_measure = RiskMeasure.VARIANCE,
    risk_free_rate= 0.05,
    prior_estimator= EmpiricalPrior(
        mu_estimator= EmpiricalMu(), 
        covariance_estimator = DenoiseCovariance()
        ),
    portfolio_params=dict(name="Minimum Variance Portfolio"),
)

MV_portfolio = cross_val_predict(model, returns, cv=WalkForward(test_size=21, train_size=120))
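
For intuition, the minimum-variance problem has a closed-form solution in the two-asset case. The sketch below is a simplified illustration under the assumption of fully invested, long-only weights; it is not the solver that `MeanRisk` uses, and the function names are hypothetical.

```python
def min_variance_two_assets(var_a, var_b, cov_ab):
    """Long-only minimum-variance weights (w_a, w_b) for two assets."""
    denom = var_a + var_b - 2.0 * cov_ab  # variance of the return difference a - b
    if denom <= 0:
        # Degenerate case: every split yields the same variance, so split evenly.
        return 0.5, 0.5
    w_a = (var_b - cov_ab) / denom
    # Clamp to [0, 1] to respect the long-only constraint.
    w_a = min(1.0, max(0.0, w_a))
    return w_a, 1.0 - w_a


def portfolio_variance(w_a, w_b, var_a, var_b, cov_ab):
    """Variance of a two-asset portfolio."""
    return w_a**2 * var_a + w_b**2 * var_b + 2.0 * w_a * w_b * cov_ab


w_a, w_b = min_variance_two_assets(var_a=0.04, var_b=0.09, cov_ab=0.01)
print(w_a, w_b, portfolio_variance(w_a, w_b, 0.04, 0.09, 0.01))
```

The less volatile asset receives the larger weight, and the resulting variance is lower than that of an equal split.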

In [75]:
MV_performance = create_performance_metrics_table(MV_portfolio, "Minimum Variance Portfolio")
performance_comparison = pd.merge(performance_comparison, MV_performance, on='Metrics')
performance_comparison

| Metrics | EqualWeighted | Inverse Volatility | Minimum Variance Portfolio |
|---|---|---|---|
| Sharpe Ratio | -0.0247 | -0.0296 | -0.0497 |
| Annualized Sharpe | -0.3922 | -0.4699 | -0.7885 |
| Variance | 0.00018 | 0.000171 | 0.000139 |
| Annualized Variance | 0.045343 | 0.043194 | 0.034958 |
| Standard Deviation | 0.0134 | 0.0131 | 0.0118 |
| Annualized Std Dev | 0.2129 | 0.2078 | 0.187 |
| Mean Return | -0.0003 | -0.0004 | -0.0006 |
| Annualized Mean | -0.0835 | -0.0977 | -0.1474 |
| Cumulative Return | -0.2018 | -0.236 | -0.3563 |


## Black-Litterman Model

In [66]:
from skfolio.prior import BlackLitterman

Suppose we can form accurate views about future market outcomes. We expect PKO (`PKO.WA`) to return 25% p.a. (absolute view) and to outperform Orlen (`PKN.WA`) by 22% p.a. (relative view). We also expect KGHM (`KGH.WA`) to outperform Orlen by 15% p.a. (relative view). Converting these annualized estimates into daily estimates, so that they are consistent with the daily input returns, gives:

In [69]:
analyst_views = [
    "PKO.WA == 0.00098",
    "PKO.WA - PKN.WA == 0.00086",
    "KGH.WA - PKN.WA == 0.00059",
]
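
The conversion from annual to daily views can be sketched as a simple division by the number of trading days. The value of 255 days is an assumption inferred from the figures above (for example, 0.25 / 255 ≈ 0.00098), and `to_daily_view` is a hypothetical helper.

```python
TRADING_DAYS = 255  # assumed: the daily views are consistent with about 255 trading days

annual_views = {
    "PKO.WA": 0.25,            # absolute view
    "PKO.WA - PKN.WA": 0.22,   # relative view
    "KGH.WA - PKN.WA": 0.15,   # relative view
}


def to_daily_view(expression, annual_return, periods=TRADING_DAYS):
    """Format a view string with the annual return divided evenly across trading days."""
    daily = annual_return / periods
    return f"{expression} == {daily:.5f}"


views = [to_daily_view(expr, r) for expr, r in annual_views.items()]
print(views)
```

This uses simple (non-compounded) scaling; a geometric conversion would give slightly smaller daily figures.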

In [72]:
model_bl = MeanRisk(
    risk_measure = RiskMeasure.VARIANCE,
    objective_function = ObjectiveFunction.MINIMIZE_RISK,
    prior_estimator = BlackLitterman(
        views = analyst_views,
        risk_free_rate= 0.05
        ),
    portfolio_params = dict(name="Black & Litterman"),
)
BL_portfolio = cross_val_predict(model_bl, returns, cv=WalkForward(test_size=21, train_size=120))

In [76]:
BL_performance = create_performance_metrics_table(BL_portfolio, "Black & Litterman")
performance_comparison = pd.merge(performance_comparison, BL_performance, on='Metrics')
performance_comparison

| Metrics | EqualWeighted | Inverse Volatility | Minimum Variance Portfolio | Black & Litterman |
|---|---|---|---|---|
| Sharpe Ratio | -0.0247 | -0.0296 | -0.0497 | -0.0484 |
| Annualized Sharpe | -0.3922 | -0.4699 | -0.7885 | -0.7678 |
| Variance | 0.00018 | 0.000171 | 0.000139 | 0.000139 |
| Annualized Variance | 0.045343 | 0.043194 | 0.034958 | 0.034999 |
| Standard Deviation | 0.0134 | 0.0131 | 0.0118 | 0.0118 |
| Annualized Std Dev | 0.2129 | 0.2078 | 0.187 | 0.1871 |
| Mean Return | -0.0003 | -0.0004 | -0.0006 | -0.0006 |
| Annualized Mean | -0.0835 | -0.0977 | -0.1474 | -0.1436 |
| Cumulative Return | -0.2018 | -0.236 | -0.3563 | -0.3471 |


## Risk Parity Model

In [77]:
from skfolio.optimization import RiskBudgeting

In [79]:
model = RiskBudgeting(
    risk_measure=RiskMeasure.VARIANCE,
    risk_free_rate= 0.05,
    prior_estimator= EmpiricalPrior(
        mu_estimator= EmpiricalMu(), 
        covariance_estimator = DenoiseCovariance()
        ),
    portfolio_params=dict(name="Risk Parity - Variance"),
)
RP_portfolio = cross_val_predict(model, returns, cv=WalkForward(test_size=21, train_size=120))
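
The idea behind risk parity is that every asset contributes the same share of total portfolio variance. The sketch below computes these shares for given weights; it is an illustration with hypothetical helper names, not the optimization that `RiskBudgeting` performs. In the special case of uncorrelated assets, inverse-volatility weights already equalize the contributions.

```python
import math


def inverse_volatility_weights(variances):
    """Weights proportional to 1 / volatility, normalized to sum to one."""
    inv = [1.0 / math.sqrt(v) for v in variances]
    total = sum(inv)
    return [x / total for x in inv]


def risk_contributions(weights, cov):
    """Share of portfolio variance attributable to each asset."""
    n = len(weights)
    # Marginal risk of asset i: the i-th entry of (covariance matrix @ weights).
    marginal = [sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]
    total = sum(weights[i] * marginal[i] for i in range(n))
    return [weights[i] * marginal[i] / total for i in range(n)]


cov = [[0.04, 0.0],
       [0.0, 0.01]]  # two uncorrelated assets with volatilities of 20% and 10%
weights = inverse_volatility_weights([cov[0][0], cov[1][1]])
print(weights, risk_contributions(weights, cov))
```

When assets are correlated, the two approaches diverge, because risk parity also accounts for the covariance terms.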

In [81]:
RP_performance = create_performance_metrics_table(RP_portfolio, "Risk Parity")
# Run this merge only once: merging the same column a second time makes pandas
# append _x/_y suffixes to the duplicated "Risk Parity" columns.
performance_comparison = pd.merge(performance_comparison, RP_performance, on='Metrics')
performance_comparison

| Metrics | EqualWeighted | Inverse Volatility | Minimum Variance Portfolio | Black & Litterman | Risk Parity |
|---|---|---|---|---|---|
| Sharpe Ratio | -0.0247 | -0.0296 | -0.0497 | -0.0484 | -0.0302 |
| Annualized Sharpe | -0.3922 | -0.4699 | -0.7885 | -0.7678 | -0.48 |
| Variance | 0.00018 | 0.000171 | 0.000139 | 0.000139 | 0.000167 |
| Annualized Variance | 0.045343 | 0.043194 | 0.034958 | 0.034999 | 0.042047 |
| Standard Deviation | 0.0134 | 0.0131 | 0.0118 | 0.0118 | 0.0129 |
| Annualized Std Dev | 0.2129 | 0.2078 | 0.187 | 0.1871 | 0.2051 |
| Mean Return | -0.0003 | -0.0004 | -0.0006 | -0.0006 | -0.0004 |
| Annualized Mean | -0.0835 | -0.0977 | -0.1474 | -0.1436 | -0.0984 |
| Cumulative Return | -0.2018 | -0.236 | -0.3563 | -0.3471 | -0.2379 |


## Hierarchical Risk Parity

In [None]:
from skfolio.distance import KendallDistance
from skfolio.optimization import HierarchicalRiskParity

In [85]:
model = HierarchicalRiskParity(
    risk_measure=RiskMeasure.VARIANCE, portfolio_params=dict(name="HRP-Variance"),
    distance_estimator=KendallDistance(absolute=True),
    prior_estimator = EmpiricalPrior(
        mu_estimator= EmpiricalMu(), 
        covariance_estimator = DenoiseCovariance()
        )
)
HRP_portfolio = cross_val_predict(model, returns, cv=WalkForward(test_size=21, train_size=120))
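
Hierarchical Risk Parity clusters assets using a distance derived from their co-dependence. The sketch below computes a Kendall rank correlation and turns it into a distance in plain Python. The mapping `sqrt(1 - |tau|)` is an assumption about what an absolute-correlation distance looks like; the exact formula used by `KendallDistance(absolute=True)` should be checked against the `skfolio` documentation.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall rank correlation (tau-a), assuming there are no tied values."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both series move in the same direction
        elif sign < 0:
            discordant += 1   # the series move in opposite directions
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs


def absolute_distance(tau):
    """Assumed distance: strongly co-moving assets (in either direction) are close."""
    return math.sqrt(1.0 - abs(tau))


tau = kendall_tau([0.01, -0.02, 0.03], [0.02, 0.05, 0.04])
print(tau, absolute_distance(tau))
```

With an absolute distance, perfectly negatively correlated assets are treated as being as close as perfectly positively correlated ones.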

In [86]:
HRP_performance = create_performance_metrics_table(HRP_portfolio, "Hierarchical Risk Parity")
performance_comparison = pd.merge(performance_comparison, HRP_performance, on='Metrics')
performance_comparison

| Metrics | EqualWeighted | Inverse Volatility | Minimum Variance Portfolio | Black & Litterman | Risk Parity | Hierarchical Risk Parity |
|---|---|---|---|---|---|---|
| Sharpe Ratio | -0.0247 | -0.0296 | -0.0497 | -0.0484 | -0.0302 | -0.0328 |
| Annualized Sharpe | -0.3922 | -0.4699 | -0.7885 | -0.7678 | -0.48 | -0.5212 |
| Variance | 0.00018 | 0.000171 | 0.000139 | 0.000139 | 0.000167 | 0.000158 |
| Annualized Variance | 0.045343 | 0.043194 | 0.034958 | 0.034999 | 0.042047 | 0.039724 |
| Standard Deviation | 0.0134 | 0.0131 | 0.0118 | 0.0118 | 0.0129 | 0.0126 |
| Annualized Std Dev | 0.2129 | 0.2078 | 0.187 | 0.1871 | 0.2051 | 0.1993 |
| Mean Return | -0.0003 | -0.0004 | -0.0006 | -0.0006 | -0.0004 | -0.0004 |
| Annualized Mean | -0.0835 | -0.0977 | -0.1474 | -0.1436 | -0.0984 | -0.1039 |
| Cumulative Return | -0.2018 | -0.236 | -0.3563 | -0.3471 | -0.2379 | -0.251 |
