# Cookbook Example 0: Portfolio Optimization

In this cookbook example we will:

1. Import data
2. Perform basic wrangling
3. Build a portfolio optimization model
4. Plot our optimal portfolio
5. Get the optimal weights

Here, we go through the basics of this workflow and explain the main Dal-io objects (Translator, Pipe, Model and Application) in that context.

This is the same workflow shown in the documentation index, though here we focus more on the actual process than on introducing concepts.

**If you have a local copy of the repository, append the main directory to the Python path (`sys.path`)**

In [None]:
import sys
sys.path.append("..")

import warnings
warnings.filterwarnings("ignore")
warnings.simplefilter("ignore")

Now we'll import numpy and the base Dal-io submodules.

In [None]:
import numpy as np

import dalio.external as de
import dalio.translator as dt
import dalio.pipe as dp
import dalio.model as dm
import dalio.application as da

Here we define the set of ticker symbols we will be using throughout the analysis and set up the main stock input.

Here we picked competitor pairs from different industries to make the weights neater, though you can specify any ticker symbol available from your source (in this case Yahoo! Finance). Go ahead and try it!

`External`s are often only instantiated once, as they are essentially useless on their own. Here we instantiate one just as the input to the `Translator`.

In [None]:
# Ford Motor Company trades as "F"; "FORD" is a different company (Forward Industries)
tickers = ["GOOG", "MSFT", "ATVI", "TTWO", "GM", "F", "SPY"]

stocks = dt.YahooStockTranslator()\
    .set_input(de.YahooDR())

Now we run the translator to ensure the data is all present (missing stock symbols will cause problems later on) and well-formatted.

In [None]:
stocks.run(ticker=tickers)

Here we go through several examples of `Pipe` subclasses. Each one has a single input and can be chained into a `PipeLine`, as is the case with `close`. We can also instantiate objects directly inside the `PipeLine`, though we will not have access to them later on. For that reason, we instantiate `time_conf` outside of the `PipeLine` and then place it in there; this allows us to control the date selection from any point of the analysis. Think of this piece as a sort of _remote control_ for date selection.

Notice that while a good amount of work is done behind the scenes, you, as a designer, will still need to make several decisions so that the output is as you require.

Let's see an example of this with `annual_rets`. Here we calculate annual returns as the percent change between consecutive years' last closing prices. We do this by first applying a period function to get each year's last price, followed by a percent change pipe. One could just as well use `lambda x: (x[-1] - x[0])/x[0]` inside of the `Period` pipe to calculate the change between the first and last prices of each year. Both approaches are reasonable, yet they are different computations and give slightly different results.

In [None]:
time_conf = dp.DateSelect()

close = dp.PipeLine(
    dp.ColSelect(columns="close"),
    time_conf
)(stocks)

annual_rets = close + \
    dp.Period("Y", agg_func=lambda x: x[-1]) + \
    dp.Change(strategy="pct_change")

cov = dp.Custom(lambda df: df.cov(), strategy="pipe")\
    .with_input(annual_rets)

exp_rets = annual_rets + dp.Custom(np.mean)
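
To make the difference between the two return definitions concrete, here is a minimal sketch in plain pandas rather than Dal-io pipes. The tickers `AAA` and `BBB` and their prices are made up for illustration, and the grouping by calendar year is an assumption about what a yearly period aggregation amounts to, not a description of how `Period` is implemented.

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices for two made-up tickers over three calendar years.
dates = pd.date_range("2017-01-02", "2019-12-31", freq="B")
steps = np.arange(len(dates))
prices = pd.DataFrame(
    {"AAA": 100.0 + 0.05 * steps, "BBB": 50.0 + 0.02 * steps},
    index=dates,
)

# Approach 1: last close of each year, then percent change between year-ends.
year_end = prices.groupby(prices.index.year).last()
year_end_rets = year_end.pct_change().dropna()

# Approach 2: first-to-last change within each calendar year.
within_year_rets = prices.groupby(prices.index.year).agg(
    lambda x: (x.iloc[-1] - x.iloc[0]) / x.iloc[0]
)

# Inputs an optimizer would need: sample covariance and mean annual return.
cov_matrix = year_end_rets.cov()
mean_rets = year_end_rets.mean()
```

The first approach loses one year (there is no earlier year-end to compare the first year against), while the second keeps every year but ignores the move between one year's last close and the next year's first.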

Now we are introduced to two `Model` subclasses and one `_Builder`.

As you can see, the `Model` instances take in multiple inputs, each with a dedicated name. You can learn more about what the inputs to each `Model` are and their requirements by checking their documentation. Notice also that these models can serve as input to any other `_Transformer` instance.

`_Builder` instances, on top of the standard class inputs, have pieces that must be set. Each of these pieces has its own set of options and parameters, which often map to the parameters of some underlying function or object initialization.

In [None]:
ef = dm.MakeEfficientFrontier(weight_bounds=(-0.5, 1))\
    .set_input("sample_covariance", cov)\
    .set_input("expected_returns", exp_rets)

weights = dp.OptimumWeights()(ef)\
    .set_piece("strategy", "max_sharpe", risk_free_rate=0.0)

opt_port = dm.OptimumPortfolio()\
    .set_input("weights_in", weights)\
    .set_input("data_in", close)
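
For intuition about what a `max_sharpe` strategy is looking for, the sketch below computes the textbook unconstrained tangency portfolio with NumPy. It is not how `MakeEfficientFrontier` works internally: it ignores `weight_bounds` entirely, and the expected returns and covariance matrix are made-up numbers.

```python
import numpy as np

# Made-up expected annual returns and covariance matrix for three assets.
mu = np.array([0.10, 0.12, 0.07])
sigma = np.array([
    [0.0400, 0.0060, 0.0020],
    [0.0060, 0.0900, 0.0040],
    [0.0020, 0.0040, 0.0225],
])
risk_free_rate = 0.0


def sharpe_ratio(w):
    """Excess return per unit of volatility for weight vector w."""
    return (w @ mu - risk_free_rate) / np.sqrt(w @ sigma @ w)


# Unconstrained tangency portfolio: w is proportional to inv(sigma) @ (mu - rf),
# rescaled so the weights sum to 1.
raw = np.linalg.solve(sigma, mu - risk_free_rate)
tangency = raw / raw.sum()

# Equal weights, as a baseline to compare against.
equal = np.full(3, 1 / 3)
```

Comparing `sharpe_ratio(tangency)` with `sharpe_ratio(equal)` shows the improvement over a naive allocation. Once bounds such as `(-0.5, 1)` are imposed, the problem generally needs a numerical optimizer instead of this closed form.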

The final stage of a graph is often an `Application` instance, as we can see below. These are actually subclasses of `Model` and thus have the same input structure, as well as a similarly specified set of outputs. These outputs are instances of `External`, which can write to external resources just as they read input from them.

In this case, the `External` instance manages a plot figure, while the `Application` instance processes its input and guides the `External` instance on how to plot it.

In [None]:
graph = da.PandasXYGrapher(x=None, y="close", legend="upper right")\
    .set_input("data_in", dp.Index(100)(opt_port))\
    .set_output("data_out", de.PyPlotGraph(figsize=(16, 8)))
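
As a rough picture of what gets plotted, the sketch below builds a weighted portfolio series from prices and rebases it to start at 100. It uses plain pandas with made-up prices and weights, and the buy-and-hold weighting is an assumption; `OptimumPortfolio` and `Index` may compute these values differently.

```python
import pandas as pd

# Made-up closing prices and portfolio weights (hypothetical values).
prices = pd.DataFrame(
    {"AAA": [10.0, 11.0, 12.0], "BBB": [20.0, 19.0, 22.0]},
    index=pd.to_datetime(["2019-01-02", "2019-01-03", "2019-01-04"]),
)
weights = pd.Series({"AAA": 0.6, "BBB": 0.4})

# Normalize each asset to 1.0 at the first date, then take the weighted sum.
# This is one simple (buy-and-hold) way to form a portfolio value series.
portfolio = (prices / prices.iloc[0]).mul(weights, axis=1).sum(axis=1)

# Rebase so the series starts at 100.
indexed = portfolio / portfolio.iloc[0] * 100
```

Rebasing to a common starting value makes series with very different price levels comparable on one chart.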

This is the reason we kept that `DateSelect` instance in the Python environment when creating `close`. Now, and at any point of the analysis, we have access to it and can specify the time range on which the analysis will be conducted.

In [None]:
time_conf.set_start("2016-01-01")
time_conf.set_end("2019-12-31")
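
The _remote control_ idea works because the pipeline holds a reference to the same object we keep in a variable, and nothing is computed until the graph is run. The sketch below illustrates that pattern with a hypothetical `DateWindow` class and `make_pipeline` helper; both are stand-ins written for this example and are not part of Dal-io.

```python
import pandas as pd


class DateWindow:
    """Hypothetical stand-in for a date-selection step; not Dal-io's code."""

    def __init__(self):
        self.start = None
        self.end = None

    def __call__(self, df):
        # Label-based slicing; None leaves that side of the range open.
        return df.loc[self.start:self.end]


def make_pipeline(*steps):
    """Compose steps into one function that applies them in order."""
    def run(df):
        for step in steps:
            df = step(df)
        return df
    return run


data = pd.DataFrame(
    {"close": range(6)},
    index=pd.date_range("2019-01-01", periods=6, freq="D"),
)

window = DateWindow()  # keep a reference outside the pipeline
pipeline = make_pipeline(lambda df: df[["close"]], window)

full = pipeline(data)  # no bounds set yet: all 6 rows
window.start, window.end = "2019-01-02", "2019-01-04"
subset = pipeline(data)  # same pipeline object, now 3 rows
```

Because evaluation is deferred until the pipeline is called, changing the window after the pipeline is built still affects every later run.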

Now we can finally run the graph, which will plot the indexed portfolio.

We can also get any output from the models and pipes we have created, for example to get the optimum weights used in the plotted portfolio.

Feel free to tweak any part of the analysis and re-run the application to layer the new output on top of the old.

Some things to try are:

* Change the date range to see how this portfolio would have been optimized differently for a different time period.

* Use a different set of tickers (make sure they are available from the data input).

* Use different data inputs for your analysis.

* Set different objectives and weight constraints on the `MakeEfficientFrontier` model.

In [None]:
graph.run(ticker=tickers)

In [None]:
weights.run(ticker=tickers)