In [1]:
/**
 * @file portfolio-optimization-moead-cpp.ipynb
 *
 * A simple practical application of the Multi-Objective Evolutionary Algorithm
 * based on Decomposition - Differential Evolution variant (MOEA/D-DE) to
 * portfolio optimization. This example lets the user choose any set of stocks;
 * a helper function then generates the corresponding CSV file on request.
 *
 * The algorithm tries to optimize the trade-off between the returns and
 * volatility of the requested stocks.
 *
 * Data from Pandas Datareader library (https://pandas-datareader.readthedocs.io/en/latest/).
 */

In [2]:
#include <mlpack/xeus-cling.hpp>

#include <ensmallen.hpp>
#include "../utils/portfolio.hpp"

In [3]:
// Header files to create and show the plot.
#define WITHOUT_NUMPY 1
#include "matplotlibcpp.h"
#include "xwidgets/ximage.hpp"

namespace plt = matplotlibcpp;

In [4]:
using namespace ens;

In [5]:
using namespace ens::test;

### 1. Set the Model Parameters

In this section, we select the parameters for the example: the stock symbols, the start date, the end date, and the finance API source.

In [6]:
//! Declare user specified data.
std::string stocks, startDate, endDate, dataSource;

In [7]:
stocks = "AAPL,NKE,GOOGL,AMZN";

//! Uncomment to set custom stocks.
// std::cout << "Enter the stock symbols as comma-separated values (no spaces)" << std::endl;
// std::cin >> stocks;

We set the data source to the Yahoo Finance API by default. For a custom data source, refer to the pandas-datareader documentation for the full list of available API sources.

In [8]:
dataSource = "yahoo";

//! Uncomment to set custom data-source.
//std::cin >> dataSource;

In [9]:
startDate = "01/01/2015";

//! Uncomment to set custom start-date.
// std::cout << "Starting Date (YYYY/MM/DD or DD/MM/YYYY)" << std::endl;
// std::cin >> startDate;

In [10]:
endDate = "31/12/2019";

//! Uncomment to set custom end-date.
// std::cout << "End Date (YYYY/MM/DD or DD/MM/YYYY)" << std::endl;
// std::cin >> endDate;

### 2. Loading the Dataset

In this section, we create a helper class that generates the CSV file from the parameters provided in the previous section. The class also defines the two objective functions: returns and volatility. Ideally, we want to maximize returns and minimize volatility. Since the optimizer minimizes all objectives, we negate the returns objective to turn it into a minimization problem.

In [11]:
class PortfolioFunction
{
  public:
    PortfolioFunction(const std::string& stocks,
                      const std::string& dataSource,
                      const std::string& startDate,
                      const std::string& endDate)
    {
      //! Generate the requested csv file.
      Portfolio(stocks, dataSource, startDate, endDate, "portfolio.csv");
      returns.load("portfolio.csv", arma::csv_ascii);
      returns.shed_col(0);

      assets = returns.n_cols;
    }

    //! Get the starting point.
    arma::mat GetInitialPoint()
    {
      return arma::Col<double>(assets, 1, arma::fill::zeros);
    }
    
    struct VolatilityObjective
    {
        VolatilityObjective(const arma::mat& returns) : returns(returns) {}

        double Evaluate(const arma::mat& coords)
        {
          const double portfolioVolatility = arma::as_scalar(arma::sqrt(
                coords.t() * arma::cov(returns) * 252 * coords));
          return portfolioVolatility;
        }

        arma::mat returns;
    };

    struct ReturnsObjective
    {
        ReturnsObjective(const arma::mat& returns) : returns(returns) {}

        double Evaluate(const arma::mat& coords)
        {
          const double portfolioReturns = arma::accu(arma::mean(returns) %
              coords.t()) * 252;
          
          //! Negative sign appended to convert to minimization problem.
          return -portfolioReturns;
        }

        arma::mat returns;
    };


    //! Get objective functions.
    std::tuple<VolatilityObjective, ReturnsObjective> GetObjectives()
    {
      return std::make_tuple(VolatilityObjective(returns), ReturnsObjective(returns));
    }

    arma::mat returns;
    size_t assets;
};


//! The constructor will generate the csv file.
PortfolioFunction pf(stocks, dataSource, startDate, endDate);

const double lowerBound = 0;
const double upperBound = 1;

DefaultMOEAD opt(150, // Population size.
                 300,  // Max generations.
                 1.0,  // Crossover probability.
                 0.9, // Probability of sampling from neighbor.
                 20, // Neighborhood size.
                 20, // Perturbation index.
                 0.5, // Differential weight.
                 2, // Max children to replace parents.
                 1E-10, // epsilon.
                 lowerBound, // Lower bound.
                 upperBound // Upper bound.
                );

arma::mat coords = pf.GetInitialPoint();
auto objectives = pf.GetObjectives();

### 3. Optimization 

MOEA/D-DE (Multi-Objective Evolutionary Algorithm based on Decomposition - Differential Evolution) is a multi-objective optimization algorithm that works via decomposition. Unlike traditional algorithms such as NSGA-II, it does not rank solutions by dominance. Instead, a set of "Reference Directions" is generated, which lets the user explicitly control the distribution of the final Pareto Front. With the help of decomposition functions, each reference direction frames a scalar optimization problem that "pulls" the population towards the true Pareto Front.
In practice, MOEA/D-DE is often faster than NSGA-II and can produce a high-quality Pareto Front in relatively few generations.

MOEAD offers several decomposition functions and reference direction generators via template parameters. For our case, we use the default configuration, `DefaultMOEAD`. Read the class documentation for other options.

We would like to track the optimization process over the generations. To do that, let's create a container to store the Pareto Front at each queried generation.

In [12]:
std::vector<arma::cube> paretoFrontArray{};

This data structure is then passed to the `QueryFront` callback, which tracks the evolution for us.

Begin optimization! (This will take a fair amount of time.)

In [13]:
opt.Optimize(objectives, coords, QueryFront(2, paretoFrontArray));

Let's collect the results and inspect the first solution on the Pareto Front.

In [14]:
arma::cube paretoFront = opt.ParetoFront();

std::cout << paretoFront.slice(0) << std::endl;

   9.9965e-06
  -1.2723e-05



Let's create an array to store the X and Y coordinates of all the Pareto Fronts.

In [15]:
size_t numQuery = 300 / 2; // maxGeneration / queryRate.

std::vector<std::vector<double>> frontArrayX(numQuery);
std::vector<std::vector<double>> frontArrayY(numQuery);

Convert the fronts to the data structure required for plotting.

In [16]:
void FillFront(std::vector<double>& frontX,
               std::vector<double>& frontY,
               arma::cube& paretoFront)
{
    size_t numPoints = paretoFront.n_slices;

    //! Store the X, Y coordinates of the Pareto Front.
    frontX.resize(numPoints);
    frontY.resize(numPoints);

    for (size_t idx = 0; idx < numPoints; ++idx)
    {

        frontX[idx] = paretoFront.slice(idx)(0);
        // Negate again to restore the original maximization objective.
        frontY[idx] = -paretoFront.slice(idx)(1);
    }
}

In [17]:
for (size_t idx = 0; idx < numQuery; ++idx)
    FillFront(frontArrayX[idx], frontArrayY[idx], paretoFrontArray[idx]);

### 4. Plotting

As said before, we want higher returns and lower volatility. The generated Pareto Front is a set of solutions in which higher returns can only be obtained by accepting higher volatility, and vice versa. Hence, every solution on the front is "optimal" in the Pareto sense: none can be improved in one objective without worsening the other. Users can choose a solution from the generated front based on their own preference.

The Axis Labels are as follows:

X-Axis: Volatility

Y-Axis: Returns

We expect an increase in volatility with increase in returns.

In [85]:
plt::figure_size(800, 800);

for (size_t idx = 0; idx < numQuery; ++idx)
    plt::scatter(frontArrayX[idx], frontArrayY[idx], 50);

plt::xlabel("Volatility");
plt::ylabel("Returns");

plt::title("The Pareto Front");
plt::legend();

plt::save("./plot.png");
auto im = xw::image_from_file("plot.png").finalize();
im

[Figure: scatter plot titled "The Pareto Front"; X-axis: Volatility, Y-axis: Returns]

### 5. Final Thoughts

In this notebook, we've seen how a multi-objective optimization algorithm can help with investing in stocks. We specified our stocks and watched the algorithm optimize the returns vs. volatility trade-off over the generations. Feel free to experiment with different stocks, start dates, and end dates to see how the outcomes change.