### FE630 - Midterm Project

**Author**: Sid Bhatia

**Date**: March 25th, 2023

**Pledge**: I pledge my honor that I have abided by the Stevens Honor System.

**Professor**: Papa Momar Ndiaye

##### Question 1. (15 pts)

The supplied $\texttt{data.zip}$ file contains 30 space-delimited text files with price and volume data for 30 companies. Each row of each file contains, in order, the date, opening price, closing price, high price, low price, volume, and adjusted closing price (last column). This data is needed for Question 1.

Write a program $\texttt{processdata}$ to:

1. Read all daily price files;
2. Create a price matrix $\texttt{P}$ by aligning the data’s dates and placing the adjusted closing prices side-by-side in columns;
3. From the matrix $\texttt{P}$, create a matrix of simple (not logarithmic) daily returns $\texttt{R}$;
4. Compute the vector of average daily returns $\texttt{mu}$ for the companies using the $\texttt{mean}$ function (do not use loops);
5. Compute the covariance matrix $\texttt{Q}$ from the return matrix using the $\texttt{cov}$ function; and
6. Save the return vector $\texttt{mu}$ and the covariance matrix $\texttt{Q}$ in the native format for your programming language in a file called $\texttt{inputs.ext}$, where $\texttt{ext}$ is the appropriate extension for a binary file in your language.
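
For reference, steps 3 to 5 correspond to the formulas below. The notation is an assumption made for illustration and is not part of the assignment: $P_{t,i}$ is the adjusted closing price of company $i$ on day $t$, and $T$ is the number of daily return observations.

$$R_{t,i} = \frac{P_{t,i} - P_{t-1,i}}{P_{t-1,i}}, \qquad \mu_i = \frac{1}{T}\sum_{t=1}^{T} R_{t,i}, \qquad Q_{ij} = \frac{1}{T-1}\sum_{t=1}^{T} (R_{t,i} - \mu_i)(R_{t,j} - \mu_j)$$

The $T-1$ divisor is the sample-covariance convention that pandas' `DataFrame.cov` applies by default.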

In [14]:
import pandas as pd
import numpy as np
import os
from typing import List

def processdata(data_dir: str = 'data') -> None:
    """
    Processes stock price data to compute and save the average daily return vector and covariance matrix.
    
    This function reads stock price data from text files, each containing data for a company, then:
    1. Creates a price matrix with adjusted close prices,
    2. Calculates the daily return matrix,
    3. Computes the vector of average daily returns for each company,
    4. Computes the covariance matrix of the return matrix,
    5. Saves the average daily returns vector and the covariance matrix to binary files.
    
    Parameters:
    - data_dir (str): The directory containing the stock price files. Default is 'data'.
    
    Returns:
    - None. The function saves two files: 'inputs_mu.pkl' and 'inputs_Q.pkl' with the results.
    """
    # List to store the adjusted close price data for each company.
    price_data: List[pd.Series] = []

    # Loop through each file in the specified directory.
    for file in os.listdir(data_dir):
        if file.endswith('.txt'):
            filepath = os.path.join(data_dir, file)
            # Read data, assuming space-separated values without an explicit header.
            df = pd.read_csv(filepath, sep=' ', header=None,
                             names=['Date', 'Open', 'Close', 'High', 'Low', 'Volume', 'Adj Close'])
            # Set date as the index for easy alignment later.
            df.set_index('Date', inplace=True)
            # Append the adjusted close price series to the list.
            price_data.append(df['Adj Close'])

    # Concatenate all the adjusted close prices side-by-side, aligning by date.
    P = pd.concat(price_data, axis=1)
    P.sort_index(inplace=True)  # Ensure the dates are in order.

    # Calculate daily returns by comparing each price with the previous day's price.
    R = P.pct_change().dropna()  # Drop the first row since its percentage change is undefined.

    # Calculate the vector of average daily returns for each company.
    mu = R.mean(axis=0)

    print(mu)

    # Calculate the covariance matrix of the daily returns.
    Q = R.cov()

    print(Q)

    # Save the vector of average daily returns and the covariance matrix as binary files.
    mu.to_pickle('inputs_mu.pkl')
    Q.to_pickle('inputs_Q.pkl')

processdata()

Adj Close    0.000225
Adj Close    0.000473
Adj Close    0.000945
Adj Close    0.000540
Adj Close   -0.000340
Adj Close    0.000563
Adj Close   -0.000295
Adj Close    0.000339
Adj Close    0.001099
Adj Close    0.000425
Adj Close    0.001033
Adj Close    0.001010
Adj Close   -0.000270
Adj Close    0.000669
Adj Close    0.000521
Adj Close    0.000637
Adj Close    0.000240
Adj Close    0.000290
Adj Close    0.000705
Adj Close    0.000468
Adj Close    0.000901
Adj Close    0.000491
Adj Close    0.000233
Adj Close    0.000139
Adj Close    0.000595
Adj Close    0.001255
Adj Close    0.000200
Adj Close    0.000226
Adj Close    0.000022
Adj Close   -0.000107
dtype: float64
           Adj Close  Adj Close  Adj Close  Adj Close  Adj Close  Adj Close
Adj Close   0.000342   0.000079   0.000080   0.000097   0.000104   0.000069
Adj Close   0.000079   0.000144   0.000066   0.000088   0.000062   0.000051
Adj Close   0.000080   0.000066   0.000169   0.000076   0.000058   0.000056
...
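
Every label in the printed output reads `Adj Close` because each series keeps its original column name when the columns are concatenated. The sketch below shows one way the columns could be labeled per company and how `mu` and `Q` could be stored together in a single `inputs.pkl`, in line with the single `inputs.ext` file the question describes. It assumes that each file stem identifies the company. The helper names `load_prices` and `save_inputs`, the tickers `AAA` and `BBB`, and the toy prices are all hypothetical.

```python
import os
import pickle
import tempfile

import pandas as pd

COLUMNS = ['Date', 'Open', 'Close', 'High', 'Low', 'Volume', 'Adj Close']

def load_prices(data_dir: str) -> pd.DataFrame:
    """Build the price matrix P with one column per file, labeled by file stem."""
    series = []
    for name in sorted(os.listdir(data_dir)):  # sorted for a reproducible column order
        if name.endswith('.txt'):
            df = pd.read_csv(os.path.join(data_dir, name), sep=' ', header=None,
                             names=COLUMNS, index_col='Date')
            # Assumption: the file stem (e.g. 'AAA' for 'AAA.txt') identifies the company.
            series.append(df['Adj Close'].rename(os.path.splitext(name)[0]))
    return pd.concat(series, axis=1).sort_index()

def save_inputs(P: pd.DataFrame, path: str = 'inputs.pkl') -> None:
    """Compute mu and Q from simple daily returns and pickle both into one file."""
    R = P.pct_change().dropna()
    with open(path, 'wb') as fh:
        pickle.dump({'mu': R.mean(axis=0), 'Q': R.cov()}, fh)

# Toy demonstration with two made-up companies written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    toy_prices = {'AAA': [10.0, 11.0, 12.1], 'BBB': [20.0, 19.0, 19.0]}
    for ticker, prices in toy_prices.items():
        with open(os.path.join(tmp, f'{ticker}.txt'), 'w') as fh:
            for day, p in enumerate(prices, start=1):
                fh.write(f'2020-01-0{day} {p} {p} {p} {p} 1000 {p}\n')
    P = load_prices(tmp)
    out_path = os.path.join(tmp, 'inputs.pkl')
    save_inputs(P, out_path)
    with open(out_path, 'rb') as fh:
        inputs = pickle.load(fh)

print(inputs['mu'])
print(inputs['Q'])
```

Loading the file back returns a dictionary, so a later cell could read `inputs['mu']` and `inputs['Q']` without tracking two separate files.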

##### Question 2. (15 pts)

Write a function called $\texttt{port}$ that uses a standard quadratic programming library to:

- Take the input parameters $\texttt{mu}$ (mean vector $\mu$), $\texttt{Q}$ (covariance matrix $Q$), and $\texttt{tau}$ (risk tolerance $\tau$), and return the vector $h$ that maximizes the utility function $$U(h) = -\frac{1}{2}h^T Q h + \tau h^T \mu$$ subject to the constraints $$0 \leq h_i \leq 0.1 \; \text{for all} \; i, \; \text{and}$$ $$\sum_{i=1}^n h_i = h^T e = 1,$$ where $n$ is the number of securities in the portfolio and $e$ is the $n$-vector of ones.
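
Many quadratic programming solvers expect a minimization problem, so it can help to restate the task in that form. Negating $U$ gives the equivalent problem

$$\min_{h} \; \frac{1}{2} h^T Q h - \tau \mu^T h \quad \text{subject to} \quad e^T h = 1, \quad 0 \leq h_i \leq 0.1 \; \text{for all} \; i.$$

Since each weight is capped at $0.1$ and the weights must sum to $1$, the feasible set is non-empty only when $n \geq 10$. With the 30 companies from Question 1, the problem is feasible.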

In [17]:
import cvxpy as cp

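def port(mu, Q, tau):
    """Return the weight vector h that maximizes U(h) for risk tolerance tau (not yet implemented)."""
    pass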