major updates to docs

robertmartin8 · Mar 18, 2020 · 80b3db9
1 parent 1d07a55
commit 80b3db9
Showing 14 changed files with 195 additions and 128 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -2,7 +2,7 @@
 
 Some of the things that I'd love for people to help with:
 
-- Improve performance of existing code (but not at the cost of readability) – I realise that most optimisation projects in python use `cvxopt` rather than `scipy.optimize`, but the latter is far cleaner and much more readable. If it transpires that performance differs by orders of magnitude, I will definitely consider switching.
+- Improve performance of existing code (but not at the cost of readability) – are there any nice numpy hacks I've missed?
 - Add new optimisation objectives. For example, if you think that the best performance metric has not been included, write an optimiser! (or suggest it in [Issues](https://github.com/robertmartin8/PyPortfolioOpt/issues) and I will have a go).
 - Help me write more tests! If you are someone learning about quant finance and/or unit testing in python, what better way to practice than to write some tests on an open-source project! Feel free to check for edge cases, or test performance on a dataset with more stocks.
 

diff --git a/README.md b/README.md
@@ -180,6 +180,10 @@ Funds remaining: $8.42
 
 Harry Markowitz's 1952 paper is the undeniable classic, which turned portfolio optimisation from an art into a science. The key insight is that by combining assets with different expected returns and volatilities, one can decide on a mathematically optimal allocation which minimises the risk for a target return – the set of all such optimal portfolios is referred to as the **efficient frontier**.
 
+<center>
+<img src="https://github.com/robertmartin8/PyPortfolioOpt/blob/master/media/efficient_frontier.png" style="width:80%;"/>
+</center>
+
 Although much development has been made in the subject, more than half a century later, Markowitz's core ideas are still fundamentally important, and see daily use in many portfolio management firms.
 The main drawback of mean-variance optimisation is that the theoretical treatment requires knowledge of the expected returns and the future risk-characteristics (covariance) of the assets. Obviously, if we knew the expected returns of a stock, life would be much easier, but the whole game is that stock returns are notoriously hard to forecast. As a substitute, we can derive estimates of the expected return and covariance based on historical data – though we do lose the theoretical guarantees provided by Markowitz, the closer our estimates are to the real values, the better our portfolio will be.
 

diff --git a/docs/EfficientFrontier.rst b/docs/EfficientFrontier.rst
@@ -4,34 +4,63 @@
 Efficient Frontier Optimisation
 ###############################
 
-The implementation of efficient frontier optimisation in PyPortfolioOpt is separated
-into the :py:mod:`objective_functions` and :py:mod:`efficient_frontier` modules. It
-was designed this way because in my mind there is a clear conceptual separation
-between the optimisation objective and the actual optimisation method – if we
-wanted to use something other than mean-variance optimisation via quadratic programming,
-these objective functions would still be applicable.
-
-It should be noted that while efficient frontier optimisation is technically a very
+Mathematical optimisation is a very difficult problem in general, particularly when we are dealing
+with complex objectives and constraints. However, **convex optimisation** problems are a well-understood
+class of problems, which happen to be incredibly useful for finance. A convex problem has the following form:
+
+.. math::
+
+    \begin{equation*}
+    \begin{aligned}
+    & \underset{\mathbf{x}}{\text{minimise}} & & f(\mathbf{x}) \\
+    & \text{subject to} & & g_i(\mathbf{x}) \leq 0, i = 1, \ldots, m\\
+    &&& A\mathbf{x} = b,\\
+    \end{aligned}
+    \end{equation*}
+
+where :math:`\mathbf{x} \in \mathbb{R}^n`, and :math:`f(\mathbf{x}), g_i(\mathbf{x})` are convex functions. [1]_
+
+Fortunately, portfolio optimisation problems (with standard objectives and constraints) are convex. This
+allows us to immediately apply the vast body of theory as well as the refined solving routines -- accordingly,
+the main difficulty is inputting our specific problem into a solver.
+
+PyPortfolioOpt aims to do the hard work for you, allowing for one-liners like ``ef.min_volatility()``
+to generate a portfolio that minimises the volatility, while at the same time allowing for more
+complex problems to be built up from modular units. This is all possible thanks to 
+`cvxpy <https://www.cvxpy.org/>`_, the *fantastic* python-embedded modelling
+language for convex optimisation upon which PyPortfolioOpt's efficient frontier functionality lies.
+
+As a brief aside, I should note that while "efficient frontier" optimisation is technically a very
 specific method, I tend to use it as a blanket term (interchangeably with mean-variance
 optimisation) to refer to anything similar, such as minimising variance.
 
-Optimisation
-============
+Structure
+=========
 
-PyPortfolioOpt uses `scipy.optimize <https://docs.scipy.org/doc/scipy/reference/optimize.html>`_.
-I realise that most python optimisation projects use `cvxopt <https://cvxopt.org/>`_
-instead, but I do think that scipy.optimize is far cleaner and much more readable
-(as per the Zen of Python, "Readability counts"). That being said, scipy.optimize
-arguably has worse documentation, though ultimately I felt that it was intuitive
-enough to justify the lack of explained examples. Because they are both based on
-`LAPACK <http://www.netlib.org/lapack/>`_, I don't see why performance should
-differ significantly, but if it transpires that cvxopt is faster by an order of
-magnitude, I will definitely consider switching.
+As shown in the definition of a convex problem, there are essentially two things we need to specify:
+the optimisation objective, and the optimisation constraints. For example, the classic portfolio
+optimisation problem is to **minimise risk** subject to a **return constraint** (i.e. the portfolio
+must return more than a certain amount). From an implementation perspective, however, there is
+not much difference between an objective and a constraint. Consider a similar problem, which is to
+**maximise return** subject to a **risk constraint** -- now, the roles of risk and return have swapped.
 
-.. tip::
+To that end, PyPortfolioOpt defines an :py:mod:`objective_functions` module that contains objective functions
+(which can also act as constraints, as we have just seen). The actual optimisation occurs in the :py:class:`efficient_frontier.EfficientFrontier` class.
+This class provides straightforward methods for optimising different objectives (all documented below).
+
+However, PyPortfolioOpt was designed so that you can easily add new constraints or objective terms to an existing problem.
+For example, adding a regularisation objective (explained below) to a minimum volatility objective is as simple as::
 
-    If you would like to plot the efficient frontier, take a look at the :ref:`cla`.  
+    ef = EfficientFrontier(expected_returns, cov_matrix)  # setup
+    ef.add_objective(objective_functions.L2_reg)  # add a secondary objective
+    ef.min_volatility()  # find the portfolio that minimises volatility and L2_reg
+
+.. tip::
+
+    If you would like to plot the efficient frontier, take a look at the :ref:`cla`.
 
+Basic Usage
+===========
 
 .. automodule:: pypfopt.efficient_frontier
 
@@ -46,18 +75,25 @@ magnitude, I will definitely consider switching.
                 As of v0.5.0, you can pass a collection (list or tuple) of (min, max) pairs
                 representing different bounds for different assets.
 
+            .. tip::
+
+                If you want to generate short-only portfolios, there is a quick hack. Multiply
+                your expected returns by -1, then optimise a long-only portfolio.
+
+
         .. automethod:: max_sharpe
 
-            .. note::
+            .. caution::
+
+                Because ``max_sharpe()`` makes a variable substitution, additional objectives may
+                not work as intended. 
 
-                If you want to generate short-only portfolios, there is a quick hack. Multiply
-                your expected returns by -1, then maximise a long-only portfolio.
 
-        .. automethod:: max_unconstrained_utility
+        .. automethod:: max_quadratic_utility
 
             .. note::
 
-                pypfopt.BlackLitterman provides a method for calculating the market-implied
+                ``pypfopt.black_litterman`` provides a method for calculating the market-implied
                 risk-aversion parameter, which gives a useful estimate in the absence of other
                 information!
 
@@ -67,27 +103,42 @@ magnitude, I will definitely consider switching.
     :py:meth:`efficient_return`, the optimiser will fail silently and return
     weird weights. *Caveat emptor* applies!
 
+Adding objectives and constraints
+=================================
+
+EfficientFrontier inherits from the BaseConvexOptimizer class. In particular, the functions to
+add constraints and objectives are documented below:
+
+
+    .. class:: pypfopt.base_optimizer.BaseConvexOptimizer
+
+        .. automethod:: add_constraint
+
+        .. automethod:: add_objective
+
+
 Objective functions
 ===================
 
 .. automodule:: pypfopt.objective_functions
     :members:
 
+
 One of the experimental features implemented in PyPortfolioOpt is the L2 regularisation
 parameter ``gamma``, which is discussed below.
 
 .. _L2-Regularisation:
 
-L2 Regularisation
-=================
+More on L2 Regularisation
+=========================
 
 As has been discussed in the :ref:`user-guide`, efficient frontier optimisation often
 results in many weights being negligible, i.e. the efficient portfolio does not end up
 including most of the assets. This is expected behaviour, but it may be undesirable
 if you need a certain number of assets in your portfolio.
 
 In order to coerce the efficient frontier optimiser to produce more non-negligible
-weights, I have added what can be thought of as a "small weights penalty" to all
+weights, we add what can be thought of as a "small weights penalty" to all
 of the objective functions, parameterised by :math:`\gamma` (``gamma``). Considering
 the minimum variance objective for instance, we have:
 
@@ -113,36 +164,31 @@ used to make them larger).
     universes, or if you want more non-negligible weights in the final portfolio,
     increase ``gamma``.
 
-.. _custom-objectives:
+.. _custom-optimisation:
 
-Custom objectives
-=================
+Custom optimisation problems
+============================
 
-Though it is simple enough to modify ``objective_functions.py`` to implement
-a custom objective (indeed, this is the recommended approach for long-term use),
-I understand that most users would find it much more convenient to pass a
-custom objective into the optimiser without having to edit the source files.
+Previously we described an API for adding constraints and objectives to one of the core
+optimisation problems in the ``EfficientFrontier`` class. However, what if you aren't interested
+in anything related to ``max_sharpe()``, ``min_volatility()``, ``efficient_risk()`` etc. and want to
+set up a completely new problem to optimise for some custom objective?
 
-Thus, v0.2.0 introduces a simple API within the ``EfficientFrontier`` object for
-optimising your own objective function.
+The ``EfficientFrontier`` class inherits from the ``BaseConvexOptimizer``, which allows you to
+define your own optimisation problem. You can either optimise some generic ``convex_objective``
+(which *must* be built using ``cvxpy`` atomic functions -- see `here <https://www.cvxpy.org/tutorial/functions/index.html>`_)
+or a ``nonconvex_objective``, which uses ``scipy.optimize`` as the backend and thus has a completely
+different API.
 
-The first step is to define the objective function, which must take an array
-of weights as input (with optional additional arguments), and return a single
-float corresponding to the cost. As an example, we will pretend that L2
-regularisation is not built-in and re-implement it below::
+    .. class:: pypfopt.base_optimizer.BaseConvexOptimizer
 
-    def my_objective_function(weights, cov_matrix, k):
-        variance = np.dot(weights.T, np.dot(cov_matrix, weights))
-        return variance + k * (weights ** 2).sum()
+        .. automethod:: convex_objective
+
+        .. automethod:: nonconvex_objective
 
-Next, we instantiate the ``EfficientFrontier`` object, and pass the objectives
-function (and all required arguments) into ``custom_objective()``::
 
-    ef = EfficientFrontier(mu, S)
-    weights = ef.custom_objective(my_objective_function, ef.cov_matrix, 0.3)
+References
+==========
 
+.. [1] Boyd, S.; Vandenberghe, L. (2004). `Convex Optimization <https://web.stanford.edu/~boyd/cvxbook/>`_.
 
-.. caution::
-    It is assumed that the objective function you define will be solvable
-    by sequential quadratic programming. If this isn't the case, you may
-    experience silent failure.
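
To make the "small weights penalty" from the L2 regularisation section concrete, here is a minimal pure-Python sketch of a variance objective with the penalty added. It mirrors the removed ``my_objective_function`` example (variance plus ``k`` times the sum of squared weights). The function names and the two-asset covariance matrix are hypothetical, and this is not a claim about how ``objective_functions.L2_reg`` is implemented.

```python
def portfolio_variance(weights, cov_matrix):
    """Return w^T S w for a list of weights and a nested-list covariance matrix."""
    n = len(weights)
    return sum(
        weights[i] * cov_matrix[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )


def l2_penalty(weights, gamma=1.0):
    """Return gamma * w^T w, the 'small weights penalty'."""
    return gamma * sum(w * w for w in weights)


def regularised_objective(weights, cov_matrix, gamma=1.0):
    """Variance plus the L2 penalty, i.e. the quantity to be minimised."""
    return portfolio_variance(weights, cov_matrix) + l2_penalty(weights, gamma)


# Hypothetical two-asset covariance matrix (illustrative numbers only).
cov = [[0.04, 0.00],
       [0.00, 0.09]]

concentrated = [1.0, 0.0]  # everything in the first asset
balanced = [0.5, 0.5]      # evenly spread

# The penalty term alone is smaller for the evenly spread portfolio.
penalty_concentrated = l2_penalty(concentrated)  # 1.0
penalty_balanced = l2_penalty(balanced)          # 0.5
```

Because the penalty shrinks as weight is spread across assets, a larger ``gamma`` pushes the minimiser towards portfolios with more non-negligible weights.
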
diff --git a/docs/ExpectedReturns.rst b/docs/ExpectedReturns.rst
@@ -27,13 +27,8 @@ superior models and feed them into the optimiser.
     .. autofunction:: mean_historical_return
 
         This is probably the default textbook approach. It is intuitive and easily interpretable,
-        however the estimates are unlikely to be accurate. This is a problem especially in the
-        context of a quadratic optimiser, which will maximise the erroneous inputs, In some informal
-        backtests, I've found that vanilla efficient frontier portfolios (using mean historical
-        returns and sample covariance) actually do have a statistically significant outperformance
-        over the S&P500 (in the order of 3-5%), though the same isn't true for cryptoasset portfolios.
-        At some stage, I may redo these backtests rigorously and add them to the repo
-        (see the :ref:`roadmap` page for more).
+        however the estimates are subject to large uncertainty. This is a problem especially in the
+        context of a quadratic optimiser, which will maximise the erroneous inputs.
 
 
     .. autofunction:: ema_historical_return
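
As a rough illustration of the textbook approach described for ``mean_historical_return``, the sketch below annualises the arithmetic mean of simple period-on-period returns. The function name, the 252-day ``frequency`` default, and the price series are assumptions made for illustration; the library's own function works on pandas data and may differ in detail.

```python
def mean_historical_return_sketch(prices, frequency=252):
    """Annualised arithmetic mean of simple period-on-period returns."""
    if len(prices) < 2:
        raise ValueError("need at least two prices to compute a return")
    # Simple return between each consecutive pair of prices.
    returns = [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]
    # Scale the per-period mean up to a yearly figure.
    return sum(returns) / len(returns) * frequency


# Hypothetical daily closing prices that rise by 1% per day.
prices = [100.0, 101.0, 102.01]
annualised = mean_historical_return_sketch(prices)
```

With so few observations the estimate is dominated by noise, which is the kind of uncertainty the text warns about.
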

diff --git a/docs/OtherOptimisers.rst b/docs/OtherOptimisers.rst
@@ -36,8 +36,12 @@ The advantages of this are that it does not require inversion of the covariance
 matrix as with traditional quadratic optimisers, and seems to produce diverse
 portfolios that perform well out of sample.
 
+.. image:: ../media/dendrogram.png
+   :width: 80%
+   :align: center
 
-.. automodule:: pypfopt.hierarchical_portfolios
+
+.. automodule:: pypfopt.hierarchical_portfolio
 
     .. autoclass:: HRPOpt
         :members:
@@ -54,6 +58,10 @@ that is especially advantageous when we apply linear inequalities. Unlike generi
 the CLA is specially designed for portfolio optimisation. It is guaranteed to converge after a certain
 number of iterations, and can efficiently derive the entire efficient frontier.
 
+.. image:: ../media/cla_plot.png
+   :width: 80%
+   :align: center
+
 .. tip:: 
 
     In general, unless you have specific requirements e.g. you would like to efficiently compute the entire
@@ -74,13 +82,13 @@ the same API, though as of v0.5.0 we only support ``max_sharpe()`` and ``min_vol
 Implementing your own optimiser
 ===============================
 
-Please note that this is quite different to implementing :ref:`custom-objectives`, because in
+Please note that this is quite different to implementing :ref:`custom-optimisation`, because in
 that case we are still using the same quadratic optimiser. However, HRP and CLA optimisation
 have a fundamentally different optimisation method. In general, these are much more difficult
 to code up compared to custom objective functions.
 
 To implement a custom optimiser that is compatible with the rest of PyPortfolioOpt, just
-extend ``BaseOptimizer`` (or ``BaseConvexOptimizer`` if you want to use ``cvxpy`` or ``scipy.optimize``),
+extend ``BaseOptimizer`` (or ``BaseConvexOptimizer`` if you want to use ``cvxpy``),
 both of which can be found in ``base_optimizer.py``. This gives you access to utility
 methods like ``clean_weights()``, as well as making sure that any output is compatible
 with ``portfolio_performance()`` and post-processing methods.
@@ -151,4 +159,3 @@ References
 
 .. [1] López de Prado, M. (2016). `Building Diversified Portfolios that Outperform Out of Sample <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2708678>`_. The Journal of Portfolio Management, 42(4), 59–69.
 .. [2] Bailey and López de Prado (2013). `An Open-Source Implementation of the Critical-Line Algorithm for Portfolio Optimization <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2197616>`_.
-.. [3] Rockafellar and Uryasev (2011) `Optimization of conditional value-at-risk <http://www.ise.ufl.edu/uryasev/files/2011/11/CVaR1_JOR.pdf>`_.
diff --git a/docs/RiskModels.rst b/docs/RiskModels.rst
@@ -84,7 +84,7 @@ covariance.
         recent data when calculating covariance, in the same way that the exponential
         moving average price is often preferred to the simple average price. For a full
         explanation of how this estimator works, please refer to the
-        `blog post <http://reasonabledeviations.com/2018/08/15/exponential-covariance/>`_
+        `blog post <https://reasonabledeviations.com/2018/08/15/exponential-covariance/>`_
         on my academic website.
 
     .. autofunction:: min_cov_determinant
@@ -102,11 +102,6 @@ covariance.
 
     .. autofunction:: cov_to_corr
 
-        .. note::
-
-            This is especially useful when it comes to visualise the 'correlation matrices' that
-            are associated with (shrunk) covariance matrices, using Matplotlib's ``imshow`` or
-            Seaborn's ``heatmap``.
 
     .. autofunction:: correlation_plot
 

diff --git a/docs/Roadmap.rst b/docs/Roadmap.rst
@@ -23,15 +23,23 @@ have any other feature requests, please raise them using GitHub
 1.0.0
 =====
 
-Please see HERE for full details 
+- Migrated backend from ``scipy`` to ``cvxpy`` and made significant breaking changes to the API
 
-- Migrated backend from ``scipy`` to ``cvxpy``. 
-- changed portfolio_performance API 
+  - PyPortfolioOpt is now significantly more robust and numerically stable.
+  - These changes will not affect basic users, who can still access features like ``max_sharpe()``.
+  - However, additional objectives and constraints (including L2 regularisation) are now 
+    explicitly added before optimising some 'primary' objective.
 
-Breaking changes
-----------------
+- Added basic plotting capabilities for the efficient frontier, hierarchical clusters, 
+  and HRP dendrograms.
+- Added a basic transaction cost objective.
+- Made breaking changes to some modules and classes so that PyPortfolioOpt is easier to extend
+  in future:
+
+  - Replaced ``BaseScipyOptimizer`` with ``BaseConvexOptimizer``
+  - ``hierarchical_risk_parity`` was replaced by ``hierarchical_portfolio`` to leave the door open for other hierarchical methods.
+  - Sadly, removed CVaR optimisation for the time being until I can properly fix it.
 
-- No more ``gamma`` parameter – you must add the appropriate ``L2_reg`` objective. 
 
 0.5.0
 =====
@@ -127,7 +135,7 @@ fixing a bug in the arguments of a call to ``portfolio_performance``.
 0.3.3
 -----
 
-Migrated the project internally to use the ``poetry`` dependency manager. Will still keep ``setup.py`` and ``requirements.txt``, but ``poetry`` is now the recommended way to interact with ``PyPortfolioOpt``
+Migrated the project internally to use the ``poetry`` dependency manager. Will still keep ``setup.py`` and ``requirements.txt``, but ``poetry`` is now the recommended way to interact with PyPortfolioOpt.
 
 0.3.4
 -----