# Case Study: Portfolio Optimization
<figure>
  <img src="stocks.png" alt="Historical rates of return of five technology stocks since the beginning of 2012" />
  <figcaption><center>Figure: The historical rate of return of five technology stocks from the beginning of 2012.</center></figcaption>
</figure>

[Modern portfolio theory](http://en.wikipedia.org/wiki/Modern_portfolio_theory) is based on the
[Markowitz](http://en.wikipedia.org/wiki/Harry_Markowitz) model 
for determining a portfolio of stocks with a desired expected rate of
return that has the smallest amount of variance. The main idea is that
by <i>diversifying</i> (investing in a mixture of different stocks),
one can guard against large amounts of variance in the rates of return
of the individual stocks.

Suppose $p_1, \ldots, p_m$ are the historical prices of a stock over
some period of time. We define the $rate\ of\ return$ at time $t$,
relative to the initial price $p_1$ by

\begin{equation}
	r_t := (p_t - p_1)/p_1, \quad \text{for } t = 1,\ldots,m. \qquad (1)
\end{equation}

The $expected\ rate\ of\ return$ is the mean $\mu$ of the rates of
return, and the $risk$ is defined as the $standard\ deviation$ $\sigma$ of the rates of return:

\begin{equation}
	\mu := \frac{1}{m}\sum_{t = 1}^m r_t
	\ \ \ \text{and} \ \ \ 
	\sigma := \sqrt{\frac{1}{m} \sum_{t=1}^m (r_t - \mu)^2}.
\end{equation}
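As a minimal sketch of these definitions, the following Julia snippet computes the rates of return, the expected rate of return, and the risk for one stock. The price vector `p` holds made-up values chosen only for illustration, and the variable names `rt`, `mu`, and `sigma` are assumptions rather than names used elsewhere in this notebook.

```julia
# Toy closing prices for one hypothetical stock (illustrative values only).
p = [100.0, 102.0, 101.0, 105.0]
m = length(p)

# Rate of return relative to the initial price p[1], as in equation (1).
rt = (p .- p[1]) ./ p[1]

# Expected rate of return: the mean of the rates of return.
mu = sum(rt) / m

# Risk: the standard deviation with the 1/m normalization used in the text.
sigma = sqrt(sum((rt .- mu) .^ 2) / m)

println("rates of return: ", rt)
println("mu = ", mu, ", sigma = ", sigma)
```

Note that Julia's built-in `std` divides by $m - 1$ by default, so it would need `corrected=false` to match the $1/m$ definition above.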

Given a collection of $n$ stocks, let $r^i_t$ be the rate of return
of stock $i$ at time $t$. Let $r$ be the $n \times 1$ vector of the
expected rates of return of the $n$ stocks. In addition, let
$\Sigma$ be the $n \times n$ $covariance\ matrix$ of the rates of
return of the $n$ stocks. Thus, $r_i$ is the mean of the rates of
return of stock $i$, $\Sigma_{ii}$ is the variance of the rates of
return of stock $i$, and $\Sigma_{ij}$ is the covariance of the rates
of return of stocks $i$ and $j$:

\begin{equation}
	r_i := \frac{1}{m}\sum_{t = 1}^m r_t^i
    \ \ \ \text{and} \ \ \
	\Sigma_{ij} := \frac{1}{m}\sum_{t = 1}^m (r^i_t - r_i)(r^j_t - r_j).
\end{equation}

We let $x_i$ be the fraction of our investment money we put into stock
$i$, for $i = 1,\ldots,n$. For the sake of this study, we assume
there is no $short\ selling$ (i.e., holding a stock in negative
quantity). Thus, $x$ is a vector of length $n$ that has nonnegative
entries that sum to one (i.e., $x \geq 0$ and $\sum_{i=1}^n x_i =
1$). The vector $x$ represents our $portfolio$ of investments.
The expected rate of return and standard deviation of a portfolio $x$ are then given by

\begin{equation}
	\mu := r^Tx \ \ \ \text{and}\ \ \ \sigma := \sqrt{x^T \Sigma x}.
\end{equation}
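The two formulas can be checked on a small example. The sketch below assumes a hypothetical vector `r` and covariance matrix `Sigma` for three stocks; the numbers are invented for illustration and are not taken from the exercise data.

```julia
# Hypothetical expected rates of return and covariance matrix for n = 3 stocks
# (illustrative numbers, not taken from the exercise data).
r = [0.10, 0.05, 0.08]
Sigma = [0.040 0.006 0.010;
         0.006 0.010 0.002;
         0.010 0.002 0.020]

# A portfolio without short selling: nonnegative weights that sum to one.
x = [0.5, 0.3, 0.2]
@assert all(x .>= 0) && abs(sum(x) - 1) < 1e-12

# Expected rate of return and standard deviation of the portfolio.
mu = r' * x
sigma = sqrt(x' * Sigma * x)

println("mu = ", mu, ", sigma = ", sigma)
```

In this toy example the portfolio's standard deviation (about 0.125) is below the weighted average of the individual standard deviations (about 0.158), which illustrates the effect of diversification.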




### Exercise 1 (in class)

1.
Download financial data (**csv** files) from [Yahoo! Canada Finance](http://ca.finance.yahoo.com) for the following twenty stocks:

* Technology:  AAPL, IBM, GOOG, MSFT, AABA
* Services:  AMZN, COST, EBAY, TGT, WMT
* Financial:  BMO, BNS, JPM, RY, TD
* Energy:  BP, CVX, IMO, TOT, XOM

Store the **csv** files in a directory called **'data'**.


2.
Complete the function $load\_stocks(dirname, startdate, enddate)$:

This function must read the $adjusted\ closing\ prices$ of all stocks in the given directory between the start date and end date, and compute the rates of return as in equation (1).

In [None]:
function load_stocks(dirname, startdate, enddate)
    filenames = readdir(dirname)
    X = []
    dates = []
    names = []
    for i = 1:length(filenames)
        file = filenames[i]
        data = readdlm( string( dirname, "/", file), ',')
        # complete the code 
        ...
    end
        
    return Array{Float64,2}(X), dates, Array{String,1}(names)
end

Use the following start and end dates:
```
startdate = "2017-01-03"; enddate = "2017-12-31";
```
After you complete $load\_stocks(dirname, startdate, enddate)$, use the following code to plot your results with the given function $disp\_stocks$ in $preprocess.jl$:

In [None]:
include("preprocess.jl")
using PyPlot
X, dates, names = load_stocks("data", "2017-01-03", "2017-12-31")
disp_stocks(X, dates, names)

3.
Complete the function $meancov(X)$ that returns the $n \times 1$ vector $r$ of means and the $n \times n$ covariance matrix $Sig$ of the rates of return of the $n$ stocks given by $X$:

In [None]:
function meancov(X)
    # complete the code 
end

After you complete $meancov(X)$, run the following code to get estimated returns and covariance.

In [None]:
r, Sig = meancov(X);

4.
Complete the function $portfolio\_scatter(r, Sig, num)$.

This function must generate random portfolios and make a scatter plot
of their expected rates of return and standard deviation. Each random
portfolio is generated by randomly allocating a fraction of the
overall investment among a small set of 5 randomly chosen stocks. Make
a scatter plot with $num = 1000$ points.

In [None]:
function portfolio_scatter(r, Sig, num)
    n = length(r)
    randmu = zeros(num)
    randSig = zeros(num)
    figure()
    
    # complete the code 
    
    plot(randSig, randmu, "b+", markersize = 5)
    xlabel("Std. Dev.")
    ylabel("Expected Rate of Return")

end


Run the following code when you complete the function $portfolio\_scatter$.

In [None]:
portfolio_scatter(r, Sig, 1000)

### Exercise 2 (Homework)

1.
Use [**JuMP**](https://jump.readthedocs.io/en/latest/) (please read the documentation) to compute the portfolio with minimum risk. What are
  the expected rate of return and standard deviation of this
  portfolio? Plot the rate of return of this portfolio over the entire
  time period. What is the portfolio with maximum possible expected
  rate of return? Complete the function $return\_range$ that
  returns $num$ linearly spaced rates of return between the
  rate of return of the portfolio with minimum risk and the maximum
  possible rate of return:
  
More about **JuMP**: you can install it with
```
Pkg.add("JuMP")
```
Note that **JuMP** needs a solver to optimize your problem. Here you need to formulate the portfolio optimization problem as a quadratic program, which can be solved by the free solver Ipopt. We recommend that you install Ipopt and use it as your solver:
```
Pkg.add("Ipopt")
```
If you are using Ipopt as your solver, you can initialize the model by
```
m = Model(solver = IpoptSolver())
```
When you finish debugging, set `m = Model(solver = IpoptSolver(print_level=0))` to silence IPOPT output.


In [None]:
using JuMP
using Ipopt
function return_range(r, Sig, num)
    n = length(r)
    # complete the code 

end

Run the following code when you complete `return_range`.

In [None]:
rrange = return_range(r, Sig, 12)

2.
Given a desired expected rate of return, we can see from the
  scatter plot that there are many portfolios that we can choose that
  have this expected rate of return. However, each of these portfolios
  has a different level of risk, or standard deviation. Among these,
  the most $efficient$ portfolio is the one giving us the least
  amount of risk.

  Each expected rate of return determines a different efficient
  portfolio. Plotting the expected rate of return and standard
  deviation of each of the efficient portfolios will give us a curve
  called the $efficient\ frontier$.
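
The efficient portfolio for a target expected rate of return can be written as an optimization problem. As a sketch in the notation above, with $\rho$ a symbol introduced here for the target rate (an assumption, not part of the original notation), one common way to state it is:

\begin{equation}
	\begin{array}{ll}
	\underset{x}{\text{minimize}} & x^T \Sigma x \\
	\text{subject to} & r^T x = \rho, \quad \sum_{i=1}^n x_i = 1, \quad x \geq 0.
	\end{array}
\end{equation}

Because the square root is increasing, minimizing $x^T \Sigma x$ gives the same portfolio as minimizing $\sigma$. The objective is quadratic and the constraints are linear, which is what makes this a quadratic program.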
  
  Complete the function `efficient_frontier(r, Sig, num)`, which is called as follows:
```
Y, rates, sigs = efficient_frontier(r, Sig, num)
```
This function will compute $num$ efficient portfolios with
linearly spaced rates of return (obtained from
$return\_range$). These portfolios will be stored in the
$n \times num$ matrix $Y$, and their
corresponding expected rates of return and standard deviation in
vectors $rates$ and $sigs$. Plot $sigs$ and
$rates$ on the scatter plot, with $num = 12$:


In [None]:
function efficient_frontier(r, Sig, num)
    n = length(r)
    rrange = return_range(r, Sig, num)
    
    sqrtSig = sqrtm(Sig)
    
    Y = zeros(n, num)
    
    for jj = 1:num
        # complete the code
        ...
        
        Y[:,jj] = ...
        
    end
    
    rates = ...
    sigs = ...
    
    return Y, rates, sigs
end

Run the following code after you complete the function `efficient_frontier`.

In [None]:
figure()
portfolio_scatter(r, Sig, 1000)
num = 12
Y, rates, sigs = efficient_frontier(r, Sig, num);
plot(sigs, rates, "ro-"); 
ylim(0, 0.5)
xlim(0, maximum(sigs))

Display your results using the given function `disp_portfolios(Y, rates, sigs, names)`:

In [None]:
include("preprocess.jl")
disp_portfolios(Y, rates, sigs, names)

### Exercise 3 (in class 2)

Add a risk-free investment called 'RF', with a 3% rate of return, to the collection of stocks. Use your `efficient_frontier` code from Exercise 2 to determine the new efficient frontier and plot it on the same plot as the original efficient frontier. You will notice that the new efficient frontier has two pieces: (1) a linear piece, and (2) a nonlinear piece that coincides with the original efficient frontier. What does the linear piece represent? 

In [None]:
num = 12
f = 0.03
# complete the code 

The portfolio where these two pieces join is called the $market\ portfolio$. Complete the function `market_portfolio(f, r, Sig)` that computes the market portfolio corresponding to a risk-free rate of return $f$:

Use `IpoptSolver` for the function `market_portfolio` (quadratic programming with linear constraints) and use `SCSSolver` for the function `risk_free_rate` (linear programming with quadratic constraints).

You can use `Model(solver = SCSSolver(verbose=0))` to silence SCS output.

In [None]:
#Exercise 3
function market_portfolio(f, r, Sig)
    
    # Define function func such that func(sig) = 0 when risk_free_rate(sig, r, Sig) = f.
    func = ...
    
    # Compute the minimum value of sig
    sig1 = ...
    
    # Compute the maximum value of sig
    sig2 = ...
    
    #Use BinarySearch to solve func(sig) = 0
    sig = BinarySearch(func, sig1, sig2)
    
    # The market portfolio is the portfolio on the efficient frontier with risk
    # equal to the sig satisfying   risk_free_rate(sig, r, Sig) = f.
    
    ...
    
end

function risk_free_rate(sig, r, Sig)
    n = length(r)
    sqrtSig = sqrtm(Sig)
    
    # Dual multiplier lambda gives the slope of the efficient frontier at the
    # point (sqrt(x'*Sig*x), r'*x), where x is the portfolio with maximum
    # expected rate of return with risk at most sig.
    m = Model(solver = SCSSolver(verbose=0))
    @variable(m, x[1:n] >= 0)
    @objective(m, Max, r'*x )
    @constraint(m, sum(x) == 1)
    @constraint(m, soc, norm(sqrtSig*x) <= sig)

    status = solve(m)
    weights = getvalue(x)
    dualvar = getdual(soc)
    lambda = dualvar[1]
    
    # The risk-free rate is the y-intercept of the line tangent to the 
    # efficient frontier at the point (sqrt(x'*Sig*x), r'*x).
    rate = ...  # complete the code 
    
    return rate
 
end

function BinarySearch(func, x1, x2)
    # This is a generic binary search routine.
    # Given x1 and x2 with
    #   func(x1) < 0  and  func(x2) > 0, or
    #   func(x1) > 0  and  func(x2) < 0,
    # returns x with abs(func(x)) < 1e-6, or the last midpoint tried
    # once the iteration limit is reached.
    @printf "Binary search:\n"
    if func(x1) > 0 && func(x2) < 0
        # Swap x1 and x2
        tmp = x1
        x1 = x2
        x2 = tmp
    end
    ii = 0; y = Inf
    @printf "%4s%10s%10s\n" "Iter" "x" "|f(x)|"
    x = 0
    while abs(y) > 1e-6 && ii <= 30
        ii += 1; x = (x1 + x2)/2; y = func(x)
        @printf "%4d%10.2e%10.2e\n" ii x abs(y)
        if y < 0
            x1 = x
        else
            x2 = x
        end
    end
    @printf "Done.\n"
    return x
end

Run the following code after you complete the above functions.

In [None]:
using SCS
f = 0.03
x = market_portfolio(f, r, Sig)

Plot the line that is tangent to the original efficient frontier at the market portfolio. What does the top half of this tangent line represent?


In [None]:
# complete the code and plot the figure