fix(docs): docs and examples updated
HugoDelatte committed Jan 3, 2024
1 parent d05b661 commit 0f5575e
Showing 15 changed files with 89 additions and 64 deletions.
2 changes: 1 addition & 1 deletion docs/user_guide/cluster.rst
@@ -7,7 +7,7 @@
Clustering Estimators
*********************

The `skfolio.cluster` module complement `sklearn.cluster` with additional clustering
The `skfolio.cluster` module complements `sklearn.cluster` with additional clustering
estimators including the :class:`HierarchicalClustering` that forms hierarchical
clusters from a distance matrix. It is used in the following portfolio optimizations:

20 changes: 11 additions & 9 deletions docs/user_guide/optimization.rst
@@ -165,9 +165,9 @@ Maximum Sharpe Ratio portfolio:
Prior Estimator
===============

Every optimization estimator has a parameter named `prior_estimator`.
Every portfolio optimization has a parameter named `prior_estimator`.
The :ref:`prior estimator <prior>` fits a :class:`~skfolio.prior.PriorModel` containing
the estimation of assets' expected returns, covariance matrix, returns and Cholesky
the estimation of assets expected returns, covariance matrix, returns and Cholesky
decomposition of the covariance. It represents the investor’s prior beliefs about the
model used to estimate such distribution.

@@ -216,8 +216,8 @@ This example is **purposely complex** to demonstrate how multiple estimators can
combined.

The model below is a Maximum Sharpe Ratio optimization using a Factor Model for the
estimation of the assets' expected returns and covariance matrix. A Black & Litterman
model is used for the estimation of the factors' expected returns and covariance matrix,
estimation of the **assets** expected returns and covariance matrix. A Black & Litterman
model is used for the estimation of the **factors** expected returns and covariance matrix,
incorporating the analyst's views on the factors. Finally, the Black & Litterman prior
expected returns are estimated using an equal-weighted market equilibrium with a risk
aversion of 2 and a denoised prior covariance matrix:
@@ -264,7 +264,7 @@ aversion of 2 and a denoised prior covariance matrix:
Custom Estimator
================
It is very common to use a custom implementation for the prior estimator. For
It is very common to use a custom implementation for the moments estimators. For
example, you may want to use an in-house estimation for the covariance or a predictive
model for the expected returns.

@@ -719,11 +719,13 @@ Stacking Optimization
*********************

:class:`StackingOptimization` is an ensemble method that consists in stacking the output
of individual optimization estimators with a final optimization estimator.
of individual portfolio optimizations with a final portfolio optimization.

The weights are the dot-product of individual estimators' weights with the final
estimator's weights. Stacking allows using the strength of each individual estimator
by using their output as input of a final estimator.
The weights are the dot-product of individual optimizations weights with the final
optimization weights.

Stacking allows using the strength of each individual portfolio optimization by
using their output as input of a final portfolio optimization.

To avoid data leakage, out-of-sample estimates are used to fit the outer
optimization.
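
The dot-product described above can be illustrated with a small NumPy sketch. The weight values and variable names below are hypothetical and chosen only for illustration; this is not the `StackingOptimization` implementation, just the arithmetic the sentence describes, assuming each inner optimization produces one weight vector over the same assets.

```python
import numpy as np

# Hypothetical weights of two inner optimizations over three assets
# (one row per inner optimization, one column per asset).
inner_weights = np.array([
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
])

# Hypothetical weights assigned by the final optimization to each inner one.
final_weights = np.array([0.4, 0.6])

# Asset-level weights of the stacked portfolio: dot-product of the two.
stacked_weights = final_weights @ inner_weights
print(stacked_weights)
```

When both the inner and the final weights sum to one, the stacked asset weights also sum to one.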
2 changes: 1 addition & 1 deletion docs/user_guide/population.rst
@@ -74,7 +74,7 @@ Let's explore some of the methods:
population.plot_composition()
A `Population` is returned by the `predict` method of some Optimization that
A `Population` is returned by the `predict` method of some portfolio optimization that
supports multi-outputs.

For example, fitting :class:`~skfolio.optimization.MeanRisk` with parameter
17 changes: 11 additions & 6 deletions docs/user_guide/portfolio.rst
@@ -11,9 +11,13 @@ Portfolio

`Portfolio` classes implement a large set of attributes and methods intended for
portfolio analysis. They are returned by the `predict` method of
:ref:`optimization estimators <optimization>`. They are also data-containers (calling
:ref:`portfolio optimizations <optimization>`.

They are also data-containers (calling
:python:`np.asarray(portfolio)` returns the portfolio returns) making them compatible
with `sklearn.model_selection` tools. They use `slots` for improved performance.
with `sklearn.model_selection` tools.

They use `slots` for improved performance.

Base Portfolio
**************
@@ -94,19 +98,20 @@ Portfolio
:class:`Portfolio` inherits from :class:`BasePortfolio`. The portfolio returns are the
dot product of the assets weights with the assets returns minus costs:

.. math:: r_p = R \cdot w^{T} - c^{T} \cdot | w - w_{prev} |
.. math:: r_p = R \cdot w^{T} - c^{T} \cdot | w - w_{prev} | - f^{T} \cdot w

with :math:`r_p` the vector of portfolio returns, :math:`R` the matrix of assets
returns, :math:`w` the vector of assets weights, :math:`c` the vector of assets costs
and :math:`w_{prev}` the assets previous weights.
returns, :math:`w` the vector of assets weights, :math:`c` the vector of assets
transaction costs, :math:`f` the vector of assets management fees and :math:`w_{prev}`
the assets previous weights.

.. warning::

The :class:`Portfolio` formulation is **homogeneous** to the convex optimization
problems for coherent analysis. It's important to note that this portfolio
formulation is **not perfectly replicable** due to weight drift when asset prices
move. The only case where it would be perfectly replicable is with periodic
rebalancing with zero or constant transaction cost. In practice, portfolios are
rebalancing with zero costs. In practice, portfolios are
rebalanced frequently enough, so this weight drift becomes negligible with regard to
model analysis and selection. Before trading, a full replicability analysis should
be performed, which is another topic left to the investor.
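
As a rough illustration, the updated return formula (including the management fees term) can be transcribed literally into NumPy. All input values below are hypothetical, and the sketch only mirrors the equation as written; how the library scales costs and fees per observation internally is not shown in this diff and is not assumed here.

```python
import numpy as np

# Hypothetical toy inputs: 2 observations (rows) of 2 assets (columns).
R = np.array([
    [0.01, 0.02],
    [0.03, -0.01],
])                                # matrix of asset returns
w = np.array([0.6, 0.4])          # asset weights
w_prev = np.array([0.5, 0.5])     # previous asset weights
c = np.array([0.001, 0.002])      # transaction costs per asset
f = np.array([0.0001, 0.0002])    # management fees per asset

# r_p = R w - c^T |w - w_prev| - f^T w
# Both cost terms are scalars, deducted from every observation.
r_p = R @ w - c @ np.abs(w - w_prev) - f @ w
print(r_p)
```

Setting `c` and `f` to zero reduces the expression to the plain dot product of weights and returns.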
8 changes: 4 additions & 4 deletions docs/user_guide/prior.rst
@@ -133,14 +133,14 @@ separately.
Combining Multiple Prior Estimators
***********************************
Prior estimators can be combined. For example, it is possible to create a Black &
Litterman Factor Model by simply using a :class:`BlackLitterman` estimator for the prior
Litterman Factor Model by using a :class:`BlackLitterman` estimator for the prior
estimator of the :class:`FactorModel`:

**Example:**

Factor model for the estimation of the assets expected returns and covariance matrix
with a Black & Litterman model for the estimation of the factors expected returns and
covariance matrix, incorporating the analyst views on the factors.
Factor model for the estimation of the **assets** expected returns and covariance matrix
with a Black & Litterman model for the estimation of the **factors** expected returns and
covariance matrix, incorporating the analyst views on the **factors**.

.. code-block:: python
6 changes: 3 additions & 3 deletions docs/user_guide/uncertainty_set.rst
@@ -6,7 +6,7 @@
Uncertainty Set Estimator
*************************

The :ref:`Uncertainty Set estimator <uncertainty_set_ref>` build an ellipsoidal
The :ref:`Uncertainty Set estimator <uncertainty_set_ref>` builds an ellipsoidal
:class:`UncertaintySet` of the distribution moments.

An ellipsoidal uncertainty set is defined by its size :math:`\kappa` and
@@ -29,11 +29,11 @@ attribute.
`X` can be any array-like structure (numpy array, pandas DataFrame, etc.)


Available estimators for the expected returns are:
Available estimators for the expected returns:
* :class:`EmpiricalMuUncertaintySet`
* :class:`BootstrapMuUncertaintySet`

Available estimators for the covariance are:
Available estimators for the covariance:
* :class:`EmpiricalCovarianceUncertaintySet`
* :class:`BootstrapCovarianceUncertaintySet`

3 changes: 2 additions & 1 deletion examples/1_mean_risk/plot_4_mean_variance_cdar.py
@@ -119,7 +119,8 @@
# %%
# Let's compare the average and standard-deviation of the Sharpe Ratio and CDaR Ratio of
# the portfolios on the training set versus the test set:
# | Train:
#
# Train:
print(population_train.measures_mean(measure=RatioMeasure.ANNUALIZED_SHARPE_RATIO))
print(population_train.measures_std(measure=RatioMeasure.ANNUALIZED_SHARPE_RATIO))

6 changes: 3 additions & 3 deletions examples/1_mean_risk/plot_9_uncertainty_set.py
@@ -103,9 +103,9 @@
#
# The model parameters to tune are:
#
# * CVaR target (upper constraint): `max_cvar`
# * CVaR confidence level: `cvar_beta`
# * Mu uncertainty set confidence level: `confidence_level` of the :class:`~skfolio.uncertainty_set.EmpiricalMuUncertaintySet`
# * `max_cvar`: CVaR target (upper constraint)
# * `cvar_beta`: CVaR confidence level
# * `confidence_level`: Mu uncertainty set confidence level of the :class:`~skfolio.uncertainty_set.EmpiricalMuUncertaintySet`
#
# For embedded parameters in the `GridSearchCV`, you need to use a double underscore:
# `mu_uncertainty_set_estimator__confidence_level`
4 changes: 2 additions & 2 deletions examples/5_clustering/plot_3_hrp_vs_herc.py
@@ -154,8 +154,8 @@
cv = CombinatorialPurgedCV(n_folds=16, n_test_folds=14)

# %%
# We choose `n_folds` and `n_test_folds` to get more than 100 test paths and an average
# training size around 255 days:
# We choose `n_folds` and `n_test_folds` to obtain more than 100 test paths and an average
# training size of approximately 255 days:
cv.summary(X_test)
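
The "more than 100 test paths" figure can be checked by hand with the usual combinatorial cross-validation counting argument. The helper below is a hypothetical sketch, not part of skfolio; it assumes the number of splits is the number of ways to choose the test folds, and that the splits recombine into `n_splits * n_test_folds / n_folds` test paths. Whether `cv.summary` reports exactly these quantities is an assumption.

```python
from math import comb


def cpcv_counts(n_folds: int, n_test_folds: int) -> tuple[int, int]:
    """Return (number of splits, number of recombined test paths)."""
    # Each split picks n_test_folds folds out of n_folds as the test set.
    n_splits = comb(n_folds, n_test_folds)
    # Each fold is tested in n_splits * n_test_folds / n_folds splits,
    # which is the number of distinct test paths that can be rebuilt.
    n_test_paths = n_splits * n_test_folds // n_folds
    return n_splits, n_test_paths


print(cpcv_counts(16, 14))
```

With `n_folds=16` and `n_test_folds=14`, this gives 120 splits and 105 test paths, consistent with the comment above.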

# %%
8 changes: 4 additions & 4 deletions examples/5_clustering/plot_5_nco_grid_search.py
@@ -4,7 +4,7 @@
=============================
The previous tutorial introduced the
:class:`~skfolio.optimization.NestedClustersOptimization` optimization.
:class:`~skfolio.optimization.NestedClustersOptimization`.
In this tutorial, we will perform hyperparameter search using `GridSearch` and
distribution analysis with `CombinatorialPurgedCV`.
@@ -136,8 +136,8 @@
cv = CombinatorialPurgedCV(n_folds=9, n_test_folds=7)

# %%
# We choose `n_folds` and `n_test_folds` to get more than 30 test paths and an average
# training size around 255 days:
# We choose `n_folds` and `n_test_folds` to obtain more than 30 test paths and an average
# training size of approximately 255 days:
cv.summary(X_test)

# %%
@@ -159,7 +159,7 @@
# We plot the out-of-sample distribution of Sharpe Ratio for the NCO model:
pred_nco.plot_distribution(
measure_list=[RatioMeasure.ANNUALIZED_SHARPE_RATIO]
).show()
)

# %%
# Let's print the average and standard-deviation of out-of-sample Sharpe Ratios:
36 changes: 18 additions & 18 deletions examples/6_ensemble/plot_1_stacking.py
@@ -3,19 +3,18 @@
Stacking Optimization
=====================
This tutorial introduces the :class:`~skfolio.optimization.StackingOptimization`
optimization.
This tutorial introduces the :class:`~skfolio.optimization.StackingOptimization`.
Stacking Optimization is an ensemble method that consists in stacking the output of
individual optimization estimators with a final optimization estimator.
individual portfolio optimizations with a final portfolio optimization.
The weights are the dot-product of individual estimators weights with the final
estimator weights.
The weights are the dot-product of individual optimizations weights with the final
optimization weights.
Stacking allows using the strength of each individual estimator by using their
output as input of a final estimator.
Stacking allows using the strength of each individual portfolio optimization by using
their output as input of a final portfolio optimization.
To avoid data leakage, out-of-sample estimates are used to fit the outer estimator.
To avoid data leakage, out-of-sample estimates are used to fit the outer optimization.
.. note ::
The `estimators_` are fitted on the full `X` while `final_estimator_` is trained
@@ -26,7 +25,7 @@
# Data
# ====
# We load the FTSE 100 dataset. This dataset is composed of the daily prices of 64
# assets from the FTSE 100 Index composition starting from 2010-01-04 up to 2023-05-31:
# assets from the FTSE 100 Index composition starting from 2000-01-04 up to 2023-05-31:
from plotly.io import show
from sklearn.model_selection import GridSearchCV, train_test_split

@@ -94,7 +93,7 @@
# %%
# Parameter Tuning
# ================
# To show you how parameter tuning works in a stacking model, we find the model
# To demonstrate how parameter tuning works in a stacking model, we find the model
# parameters that maximize the out-of-sample Calmar Ratio using `GridSearchCV` with
# `WalkForward` cross-validation on the training set.
# The `WalkForward` is chosen to simulate a three-month (60 business days) rolling
@@ -125,7 +124,8 @@
# %%
# Prediction
# ==========
# We evaluate the two models using the same `WalkForward` object on the test set:
# We evaluate the Stacking model and the Benchmark using the same `WalkForward` object
# on the test set:
pred_bench = cross_val_predict(
benchmark,
X_test,
@@ -157,7 +157,7 @@
# %%
# Analysis
# ========
# The Stacking model outperforms the Benchmark for the below ratios:
# The Stacking model outperforms the Benchmark on the test set for the below ratios:
for ptf in population:
print("=" * 25)
print(" " * 8 + ptf.name)
@@ -181,8 +181,8 @@
cv = CombinatorialPurgedCV(n_folds=20, n_test_folds=18)

# %%
# We choose `n_folds` and `n_test_folds` to get more than 100 test paths and an average
# training size around 252 days:
# We choose `n_folds` and `n_test_folds` to obtain more than 100 test paths and an
# average training size of approximately 252 days:
cv.summary(X_test)

# %%
@@ -196,12 +196,12 @@

# %%
# The predicted object is a `Population` of `MultiPeriodPortfolio`. Each
# `MultiPeriodPortfolio` represents one testing path of a rolling portfolio.
# `MultiPeriodPortfolio` represents one test path of a rolling portfolio.

# %%
# Distribution
# ============
# We plot the out-of-sample distribution of Sharpe Ratio for the Stacking model:
# Let's plot the out-of-sample distribution of Sharpe Ratio for the Stacking model:
pred_stacking.plot_distribution(
measure_list=[RatioMeasure.ANNUALIZED_SHARPE_RATIO], n_bins=40
)
@@ -217,7 +217,7 @@
)

# %%
# Now let's analyze how the sub-models would have performed independently and compare
# Now, let's analyze how the sub-models would have performed independently and compare
# their distribution with the Stacking model:
population = Population([])
for model_name, model in model_stacking.estimators:
@@ -243,4 +243,4 @@
# ==========
# The Stacking model outperforms the Benchmark on the historical test set. The
# distribution analysis on the recombined (non-historical) test sets shows that the
# Stacking model continue to outperform the Benchmark on average.
# Stacking model continues to outperform the Benchmark on average.
21 changes: 18 additions & 3 deletions examples/7_pre_selection/plot_1_drop_correlated.py
@@ -8,7 +8,8 @@
the optimization.
Highly correlated assets tend to increase the instability of mean-variance optimization.
In this example we will compare a mean-variance optimization with and without
In this example, we will compare a mean-variance optimization with and without
pre-selection.
"""

@@ -88,8 +89,8 @@
cv = CombinatorialPurgedCV(n_folds=10, n_test_folds=6)

# %%
# We choose `n_folds` and `n_test_folds` to get more than 100 test paths and an average
# training size around 800 days:
# We choose `n_folds` and `n_test_folds` to obtain more than 100 test paths and an average
# training size of approximately 800 days:
cv.summary(X_test)

# %%
@@ -125,6 +126,9 @@
show(fig)

# %%
# |
#
# Model 1:
print(
"Average of Sharpe Ratio:"
f" {pred_1.measures_mean(measure=RatioMeasure.ANNUALIZED_SHARPE_RATIO):0.2f}"
@@ -133,3 +137,14 @@
"Std of Sharpe Ratio:"
f" {pred_1.measures_std(measure=RatioMeasure.ANNUALIZED_SHARPE_RATIO):0.2f}"
)

# %%
# Model 2:
print(
"Average of Sharpe Ratio:"
f" {pred_2.measures_mean(measure=RatioMeasure.ANNUALIZED_SHARPE_RATIO):0.2f}"
)
print(
"Std of Sharpe Ratio:"
f" {pred_2.measures_std(measure=RatioMeasure.ANNUALIZED_SHARPE_RATIO):0.2f}"
)
6 changes: 3 additions & 3 deletions examples/7_pre_selection/plot_2_select_best_performers.py
@@ -7,8 +7,8 @@
:class:`~skfolio.pre_selection.SelectKExtremes` to select the `k` best or the `k` worst
assets according to a given measure before the optimization.
In this example we will use a `Pipeline` to assemble the pre-selection step with a
minimum variance optimization. Then we will use cross-validation to find the optimal
In this example, we will use a `Pipeline` to assemble the pre-selection step with a
minimum variance optimization. Then, we will use cross-validation to find the optimal
number of pre-selected assets to maximize the mean out-of-sample Sharpe Ratio.
"""

@@ -69,7 +69,7 @@
# %%
# Parameter Tuning
# ================
# To show how parameter tuning works in a Pipeline model, we find the number of
# To demonstrate how parameter tuning works in a Pipeline model, we find the number of
# pre-selected assets `k` that maximizes the out-of-sample Sharpe Ratio using
# `GridSearchCV` with `WalkForward` cross-validation on the training set. The
# `WalkForward` is chosen to simulate a three-month (60 business days) rolling
