## The Markowitz Portfolio Allocation Problem: Risky and Risk-Free Assets
[Markowitz portfolio allocation](https://en.wikipedia.org/wiki/Markowitz_model) identifies the weights $w_{i}$ of the assets in a portfolio such that the portfolio variance (risk) is minimized for a specified minimum rate of portfolio return (reward). The Markowitz allocation problem for a portfolio $\mathcal{P}$ composed of both risky and risk-free assets (savings accounts, certificates of deposit, bonds, etc.) is given by:

\begin{eqnarray*}
\text{minimize}~\sigma_{\mathcal{P}}^2 &=& \sum_{i\in\mathcal{P}}\sum_{j\in\mathcal{P}}w_{i}w_{j}
\text{cov}\left(r_{i},r_{j}\right) \\
\text{subject to}~\mathbb{E}(r_{\mathcal{P}})& = & \sum_{i\in\mathcal{P}}w_{i}\cdot\mathbb{E}(r_{i})\geq{R^{*}}\\
w_{f}+\sum_{i\in\mathcal{P}}w_{i} & = & 1\\
\text{and}~w_{i}&\geq&{0}\qquad{\forall{i}\in\mathcal{P}}
\end{eqnarray*}

The term $w_{i}\geq{0}$ denotes the fraction of risky asset $i$ in the portfolio $\mathcal{P}$, the quantity $w_{f}$ denotes the weight of the risk-free assets in $\mathcal{P}$, and $R^{*}$ is the minimum required return for $\mathcal{P}$. The term $\sigma_{\mathcal{P}}^2$ denotes the portfolio variance, $r_{i}$ denotes the return of asset $i$, and $\text{cov}\left(r_{i},r_{j}\right)$ denotes the [covariance](https://en.wikipedia.org/wiki/Covariance) between the returns of assets $i$ and $j$ in the portfolio. The non-negativity of the fractions $w_{i}$ forbids short selling. This constraint can be relaxed if borrowing (short selling) is allowed.
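
To make the notation concrete, the double sum in the objective can be written as the quadratic form $w^{\top}\Sigma{w}$, where $\Sigma$ holds the covariances. The following is a minimal sketch with made-up numbers for two risky assets; the values of `w`, `μ`, and `Σ` are assumptions for illustration and do not come from the dataset used later.

```julia
using LinearAlgebra

# Hypothetical two-asset example; none of these numbers come from the dataset
w = [0.6, 0.3]            # weights of the risky assets, w_i ≥ 0
w_f = 1.0 - sum(w)        # the remainder is held in the risk-free asset
μ = [0.10, 0.05]          # expected returns E(r_i)
Σ = [0.04 0.01;
     0.01 0.09]           # covariance matrix cov(r_i, r_j)

# Portfolio variance: the double sum written as the quadratic form w'Σw
portfolio_variance = dot(w, Σ * w)

# Expected return of the risky holdings, as it appears in the return constraint
portfolio_return = dot(w, μ)
```

The risk-free weight `w_f` does not appear in the variance because it multiplies no covariance term in the objective.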

### Learning objectives
In this example, we'll compute the [Efficient Frontier](https://en.wikipedia.org/wiki/Efficient_frontier) for a portfolio of risky assets, where the expected return and the covariance of the returns are calculated from a historical dataset. 

* First, we'll load the historical dataset. The data we'll explore is daily open-high-low-close values for firms in the [S&P500 index](https://en.wikipedia.org/wiki/S%26P_500) over the last five years.
* Next, we'll compute the expected returns and the covariance matrix from the historical dataset.
* Finally, we'll compute the efficient frontier by solving the optimization problem described above: minimize the risk for a specified minimum reward value.
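
The last step can be illustrated on a toy problem before any real data is loaded. The sketch below is not the solver used in this example: it is a brute-force grid search over two hypothetical risky assets, with no risk-free asset, that keeps the lowest-variance feasible portfolio for each minimum return $R^{*}$. The inputs `μ` and `Σ` and the helper `min_variance_portfolio` are illustrative assumptions, and the one-argument-function form of `argmin` requires Julia 1.7 or later.

```julia
using LinearAlgebra

# Hypothetical inputs for two risky assets (illustrative values only)
μ = [0.10, 0.05]
Σ = [0.04 0.01;
     0.01 0.09]

# Candidate portfolios w = [a, 1 - a] with a on a grid in [0, 1] (no short selling)
candidates = [[a, 1.0 - a] for a in 0.0:0.01:1.0]

# For a minimum required return R_star, return the feasible candidate
# with the smallest variance (or nothing if no candidate is feasible)
function min_variance_portfolio(R_star, candidates, μ, Σ)
    feasible = filter(w -> dot(w, μ) >= R_star, candidates)
    isempty(feasible) && return nothing
    return argmin(w -> dot(w, Σ * w), feasible)
end

# Sweep the minimum required return to trace out the frontier
targets = 0.05:0.01:0.09
frontier = [(R, min_variance_portfolio(R, candidates, μ, Σ)) for R in targets]
```

Because raising $R^{*}$ only shrinks the feasible set, the minimum variance found can never decrease along the sweep.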

## Setup

In [1]:
include("Include.jl");

[32m[1m  Activating[22m[39m project at `~/Desktop/julia_work/CHEME-5760-Labs-F23`
[32m[1m    Updating[22m[39m registry at `~/.julia/registries/General.toml`
[32m[1m    Updating[22m[39m git-repo `https://github.com/varnerlab/VLDecisionsPackage.jl.git`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-5760-Labs-F23/Project.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-5760-Labs-F23/Manifest.toml`


### Historical dataset
We gathered a daily open-high-low-close `dataset` for each firm in the [S&P500](https://en.wikipedia.org/wiki/S%26P_500) for the past five trading years (a maximum of `1256` data points per firm). However, not all the firms in the `dataset` have the maximum number of trading days, i.e., some firms are missing information for various reasons; for example, they may have been acquired, merged, or delisted. We will exclude these firms from the `dataset`.

We load the price `dataset` by calling the `MyPortfolioDataSet()` function:

In [2]:
dataset = MyPortfolioDataSet() |> x-> x["dataset"];
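
The exclusion of firms with incomplete records is not shown in the cells here. A minimal sketch of one way to do it follows, assuming a toy dictionary that maps a firm index to a plain vector of prices; the names `toy_dataset`, `max_number_of_records`, and `complete_dataset` are hypothetical, and the real `dataset` may store a different type per firm, in which case the length check would need to change accordingly.

```julia
# Hypothetical stand-in for the real dataset: firm index => vector of daily close prices
toy_dataset = Dict(
    1 => [10.0, 10.5, 10.2, 10.8],
    2 => [20.0, 19.5],              # incomplete record
    3 => [5.0, 5.1, 5.3, 5.2],
)

# Number of observations in a complete record (1256 for the real dataset)
max_number_of_records = 4

# Keep only the firms whose record has the full number of observations
complete_dataset = filter(pair -> length(last(pair)) == max_number_of_records, toy_dataset)
```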

Each firm is assigned an `index` (the `keys` of the `dataset` dictionary). Let's get these `keys`, sort them, and then store these firm indices in the `firms` array:

In [3]:
firms = keys(dataset) |> collect |> sort;

In [8]:
number_of_firms = length(firms);

### Compute the expected return and covariance matrices
First, we compute the (annualized) log returns by passing the `dataset` and a list of the firms we are interested in (held in the $N\times{1}$ `firms` array) to the `log_return_matrix(...)` method. The result is stored in the `return_matrix` variable, a $(T-1)\times{N}$ array of log return values, where $T$ is the number of trading days.
Each row of `return_matrix` corresponds to a time step, while each column corresponds to a firm:

In [5]:
return_matrix = log_return_matrix(dataset,firms);
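
The implementation of `log_return_matrix(...)` is not shown here. As a rough sketch of the idea, the log return between consecutive days is $\ln\left(P_{t}/P_{t-1}\right)$, which can be annualized by dividing by a time step $\Delta{t}$. The function `sketch_log_return_matrix`, the toy `prices` matrix, and the choice $\Delta{t} = 1/252$ (one trading day in years) are assumptions for illustration, not the package's actual code.

```julia
# Hypothetical price matrix: T = 4 trading days (rows) × N = 2 firms (columns)
prices = [100.0 50.0;
          101.0 49.0;
          103.0 49.5;
          102.0 51.0]

# Log return between consecutive days, scaled by 1/Δt to annualize.
# Δt = 1/252 assumes 252 trading days per year.
function sketch_log_return_matrix(prices::Matrix{Float64}; Δt::Float64 = 1.0 / 252.0)
    T, N = size(prices)
    returns = Matrix{Float64}(undef, T - 1, N)
    for j in 1:N, t in 2:T
        returns[t - 1, j] = (1 / Δt) * log(prices[t, j] / prices[t - 1, j])
    end
    return returns
end

toy_returns = sketch_log_return_matrix(prices)
```

With $T$ price rows there are only $T-1$ consecutive pairs, which is why the result has one fewer row than the input.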

Next, we estimate the `covariance_matrix` from the `return_matrix` using the `cov(...)` function, which is exported by the [Statistics.jl package](https://docs.julialang.org/en/v1/stdlib/Statistics/):

In [6]:
covariance_matrix = cov(return_matrix);
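
When given a matrix, `cov` treats each column as a variable and each row as an observation, so a $(T-1)\times{N}$ return matrix yields an $N\times{N}$ covariance matrix. The small check below uses a made-up `toy_returns` matrix (an assumption, not data from the notebook) to show the shape and the link between the diagonal and the per-firm variances.

```julia
using Statistics, LinearAlgebra

# Hypothetical 4 × 2 return matrix: rows are observations, columns are firms
toy_returns = [ 0.01  0.02;
               -0.02  0.01;
                0.03 -0.01;
                0.00  0.02]

# Each column is one variable, so the result is N × N (here 2 × 2)
toy_covariance = cov(toy_returns)

# The diagonal entries are the per-firm sample variances
diagonal_matches = diag(toy_covariance) ≈ vec(var(toy_returns, dims=1))
```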

Finally, we estimate the expected return for each firm from the `return_matrix` using the `mean(...)` function, which is also exported by the [Statistics.jl package](https://docs.julialang.org/en/v1/stdlib/Statistics/):

In [9]:
# mean(..., dims=1) averages over rows and returns a 1×N matrix; vec flattens it to an N-element vector
expected_return = vec(mean(return_matrix, dims=1));