Python for Finance --- Final Exam
----

**MSc in Mathematics and Finance, Imperial College London**

Autumn Term 2023-2024

Tuesday 10 December 2024
***


## GENERAL INSTRUCTIONS


- For each question, you are asked to create a function with specific inputs and outputs.

- You should copy / paste all your functions, one after the other, into a single file named `CID.py`
    + You will find a sample `CID.py` on Blackboard

- You may only use the libraries below

- Grading details:
    + Clarity of the code (name of temporary variables, comments)
    + Correct solution
    
- At the end of the examination, you should upload your `CID.py` file to Blackboard, renaming it with your CID, e.g. `12345678.py`

---

In [1]:
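import platform
print("Current Python Version",platform.python_version())
# Compare version numbers as integers: as strings, "3.9" < "3.11" is False
major, minor = map(int, platform.python_version_tuple()[:2])
if (major, minor) < (3, 11):
    print("ERROR: you are using a Python version lower than 3.11")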

Current Python Version 3.12.3


### Allowed libraries ONLY

In [2]:
import numpy as np
from abc import ABC,abstractmethod
from scipy.stats import norm
import time as time
import pandas as pd

# PROBLEM I : OOP and Backtesting (Q1 10 POINTS | Q2 10 POINTS | Q3 10 POINTS)
---

Complete the `Backtester` class below by implementing the required methods. The goal is to write a Python program to simulate a simple portfolio backtester. The input will be daily stock prices in the form of a 2D `np.array()`, where each row represents a day and each column represents a different stock.

For each task, carefully follow the instructions, constraints, and input/output requirements.

The abstract class `Backtester` has the following instance attributes:
- `self.prices`: a matrix   $P\in \mathbb{R}^{T\times M}$ where $P_{i,j}$ represents the price at the $i$-th time step of the $j$-th stock for $i\in\{0,1,...,T\}$ and $j\in\{1,...,M\}$.
- `self.portfolio_values`: a 1D array $V\in\mathbb{R}^{T}$, where $V_i$ represents the portfolio value at time $i$
- `self.portfolio_shares`:  a matrix  $\text{shares}\in \mathbb{R}^{T\times M}$ where $\text{shares}_{i,j}$ represents the number of shares held at time $i$ for stock $j$.
- `self.proportional_fees`:  $p\in\mathbb{R}_+$ representing the proportional transaction fees

In [3]:
class Backtester(ABC):
    def __init__(self, prices: np.array):
        """
        Initialize the Backtester with daily stock prices.
        
        Args:
            prices (2D np.array): A matrix where each row represents daily prices of stocks,
                                        and each column represents a stock.
        """
        self.prices = prices.copy() #deep copy of prices
        self.portfolio_values=np.zeros(prices.shape[0]) #portfolio value at each time step
        self.portfolio_shares=np.zeros((prices.shape[0],prices.shape[1])) # shares held in the portfolio for each time steps
        self.proportional_fees=0
    def display_portfolio(self):
        print("Portfolio values:",self.portfolio_values)
        print("Protfolio shares",self.portfolio_shares)



    def compute_fees(self,rebalance_period: int)->np.array:
        """
        Compute the fees incurred by rebalancing
        
        Output:
            array of daily fees
        """
        # Run the backtest first so that portfolio values and shares are up to date
        self.construct_equally_weighted_portfolio(rebalance_period)
        
        return 
    @abstractmethod
    def construct_equally_weighted_portfolio(self,rebalance_period: int):
        pass

class Nofees(Backtester):
    def __init__(self,prices):
       super().__init__(prices)
    def compute_rebalance_shares(self,portfolio_value: float,day_index: int)-> np.array:
        """
        Compute the numbers of shares for each stock to construct an equally weighted portfolio
        
        Args:
            portfolio_value (float): latest total value of the portfolio
            day_index (int): The index of the day for which we want to perform a rebalance
        
        Returns:
            np.array: An array where each element is the number of shares to hold to construct an equally weighted portfolio
        """
        return 

    def construct_equally_weighted_portfolio(self, rebalance_period: int):
        """
        Implement an equally weighted portfolio with n-day rebalancing with an initial value of $1000
        
        Args:
            rebalance_period (int): Number of days after which to rebalance the portfolio.
        
        Returns:
            
        """
        
            
class Proportionalfees(Backtester):
    def __init__(self,prices,p):
       super().__init__(prices)
       self.proportional_fees=p
    def compute_delta_change_in_share_value(self,shares:np.array,day_index: int)->np.array:
        """
        Computes the dollar value change on each stock required by the rebalance (Eq:5)
        
        Args:
            shares (np.array): Number of shares held before rebalance
            day_index (int): The index of the day for which we want to perform a rebalance
        
        Returns:
            np.array: An array with the dollar value change on each stock
        """
       


    def compute_target_shares(self,shares:np.array,day_index: int)->np.array:
        """
        Computes the target shares to obtain an equally weighted portfolio after proportional costs are deducted (Eq:4)
        
        Args:
            shares (np.array): Number of shares held before rebalance
            day_index (int): The index of the day for which we want to perform a rebalance
        
        Returns:
            np.array: An array with the target number of shares to hold for each stock
        """
       
        
      
    
    def construct_equally_weighted_portfolio(self, rebalance_period: int):
        """
        Add proportional fees to the backtest. Fees are applied during rebalancing.

        Args:
             rebalance_period (int): Number of days after which to rebalance the portfolio.

        Returns:
           
           
        """
       
       
    


## Question 1: Equally weighted portfolio without fees
 We will implement a rebalancing procedure to make a portfolio equally-weighted at a given time, with an initial wealth of 1000$. 

 In this setup, we can represent each asset price by $ P_{i,j}$ where $j\in\{1,2,3,...,M\}$ represents the stock and $i\in\{0,1,2,...,T\}$ represents time. For clarity, $M\in\mathbb{N}$ represents the number of stocks and  $T\in\mathbb{N}$ represents the number of days. Finally, $V_i$ denotes the portfolio value at time $i\in\{0,1,2,...,T\}$. We note that $V_0=1000$.

 Then, in order to compute the number of shares to make the portfolio equally weighted at any point we just need to perform:

 $$\text{shares}_{i,j}=\begin{cases} \frac{V_{i} }{  P_{i,j}\times M} \quad & \text{if} \quad  i \equiv 0 \;(\mathrm{mod}\; n) \\
 \text{shares}_{i-k,j}, \; \text{where}\quad k= i \bmod n \quad  & \text{otherwise} \end{cases}         (Eq:1)
$$
 where $n\in\mathbb{N}$ is the `rebalance_period (int)`, so that $i-k$ is the most recent rebalancing day. And

 $$V_i=\begin{cases} 1000, \quad &if \quad &i=0\\ \sum_{j=1}^M \text{shares}_{i-1,j} \times P_{i,j}\quad &if\quad  &i\in\{1,...,T\}\end{cases}  (Eq:2)$$

## Task A) 
 Implement the `compute_rebalance_shares(self,portfolio_value: float,day_index: int)-> np.array` method, which computes the number of shares needed to be held at a given day indexed by $i$.

 $$\text{shares}_{i,j}= \frac{V_{i} }{  P_{i,j}\times M}, \quad for \quad j=1,...,M$$
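 The formula can be checked on a single day with a few lines of `numpy`. This is only a minimal sketch, not the required class method: the function name `equal_weight_shares` and its arguments are hypothetical.

```python
import numpy as np

def equal_weight_shares(portfolio_value: float, day_prices: np.ndarray) -> np.ndarray:
    """Shares per stock so that every position is worth portfolio_value / M."""
    nb_stocks = day_prices.shape[0]  # M in the formula
    return portfolio_value / (day_prices * nb_stocks)

day_prices = np.array([100.0, 200.0, 300.0, 400.0])
shares = equal_weight_shares(1000.0, day_prices)
print(shares)               # number of shares per stock
print(shares * day_prices)  # dollar value of each position: 1000 / 4 each
```

 Each position is worth $V_i/M$ by construction, which is the equal-weight property.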
## Task B) 

 Using the method from the previous task, implement an equally weighted portfolio with initial value of 1000$ and with n-day rebalancing as part of the:

 `construct_equally_weighted_portfolio(self, rebalance_period: int)` subclass method. 

 The input `rebalance_period (int)` represents the number of days after which to rebalance the portfolio.

 The method should **update**:
- `self.portfolio_values`: which should represent the portfolio values at each time point following the rebalance strategy  (Eq:2) above
- `self.portfolio_shares`: which should represent the shares held to make the portfolio equally weighted at each time step (Eq:1) above




### **Notes:** 

 1) We will assume that the initial value of the portfolio is $1000

 2) Note that the following relation must hold:

 $$\frac{1}{M}=\frac{\text{shares}_{i,j}\times  P_{i,j}}{V_i}, \quad if \quad  i \equiv 0 \;(\mathrm{mod}\; n)$$

## Example:
 prices = np.array([
 
    Day 1 prices [100, 200, 300,400],  
    Day 2 prices [101, 198, 305,410],  
    Day 3 prices [102, 202, 310,405],  
    Day 4 prices [101, 198, 305,410], 
 ])

 **A)** Let's assume $n=1$ e.g. rebalancing every day

 Then the portfolio value should be: $V=$`[1000, 1010.4166666666667, 1019.0813257925988, 1010.5747342262915]`
 And the number of shares
 shares= [
 
    Day 1 shares   [2.5       , 1.25      , 0.83333333, 0.625     ],
    Day 2 shares   [2.50103135, 1.27577862, 0.82821038, 0.61610772],                          
    Day 3 shares   [2.49774835, 1.26123926, 0.82183978, 0.62906255],
    Day 4 shares   [2.50142261, 1.2759782 , 0.82833995, 0.61620411],
 ]
 
 **B)** Let's assume $n=2$ e.g. rebalancing every two days

 Then the portfolio value should be: $V=$`[1000., 1010.41666667, 1018.95833333, 1010.45276842]`
 And the number of shares
 shares= [
 
    Day 1 shares   [2.5       , 1.25      , 0.83333333, 0.625     ],  
    Day 2 shares   [2.5       , 1.25      , 0.83333333, 0.625     ],                           
    Day 3 shares   [2.4974469 , 1.26108705, 0.82174059, 0.62898663],
    Day 4 shares   [2.4974469 , 1.26108705, 0.82174059, 0.62898663],
 ]
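 The recursion behind these numbers can be written as a short standalone function. This is a hedged sketch of one possible reading of the equations, not the expected class-based solution: the name `backtest_equal_weight` and the `initial_value` argument are hypothetical, and the day loop is a choice made for readability.

```python
import numpy as np

def backtest_equal_weight(prices: np.ndarray, rebalance_period: int,
                          initial_value: float = 1000.0):
    """Equally weighted portfolio, rebalanced every `rebalance_period` days, no fees."""
    nb_days, nb_stocks = prices.shape
    values = np.zeros(nb_days)
    shares = np.zeros((nb_days, nb_stocks))
    values[0] = initial_value
    shares[0] = initial_value / (nb_stocks * prices[0])
    for day in range(1, nb_days):
        # Value of yesterday's holdings at today's prices
        values[day] = shares[day - 1] @ prices[day]
        if day % rebalance_period == 0:
            # Rebalance: split the current value equally across stocks
            shares[day] = values[day] / (nb_stocks * prices[day])
        else:
            # No trade today: carry the holdings forward
            shares[day] = shares[day - 1]
    return values, shares

prices = np.array([[100, 200, 300, 400], [101, 198, 305, 410],
                   [102, 202, 310, 405], [101, 198, 305, 410]], dtype=float)
values_daily, shares_daily = backtest_equal_weight(prices, 1)
values_2day, shares_2day = backtest_equal_weight(prices, 2)
print(values_daily)
print(values_2day)
```

 On rebalancing days every position is worth $V_i/M$, so the relation in Note 2 holds by construction.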

In [4]:
prices = np.array([[100, 200, 300,400],[101, 198, 305,410],[102, 202, 310,405],[101, 198, 305,410]],dtype=float)

In [5]:
BT = Nofees(prices)

In [6]:
BT.construct_equally_weighted_portfolio(1)
BT.display_portfolio()

Portfolio values: [1000.         1010.41666667 1019.08132579 1010.57473423]
Protfolio shares [[2.5        1.25       0.83333333 0.625     ]
 [2.50103135 1.27577862 0.82821038 0.61610772]
 [2.49774835 1.26123926 0.82183978 0.62906255]
 [2.50142261 1.2759782  0.82833995 0.61620411]]


In [7]:
BT.construct_equally_weighted_portfolio(2)
BT.display_portfolio()

Portfolio values: [1000.         1010.41666667 1018.95833333 1010.45276842]
Protfolio shares [[2.5        1.25       0.83333333 0.625     ]
 [2.5        1.25       0.83333333 0.625     ]
 [2.4974469  1.26108705 0.82174059 0.62898663]
 [2.4974469  1.26108705 0.82174059 0.62898663]]


## Question 2: Equally weighted portfolio with proportional transaction fees
 We will implement a rebalancing procedure to make a portfolio equally-weighted at a given time, with an initial wealth of 1000$ **and proportional transaction fees**

 Using the same notation as before, we need to adjust the portfolio value by subtracting proportional fees. Let's denote by $p\geq 0$ the proportional cost of trading. Then the value of the portfolio with fees is constructed in the following way:

 $$V_i=\begin{cases} 1000(1-p), \quad &if \quad &i=0\\ \sum_{j=1}^M \left(\text{shares}_{i,j} \times P_{i,j} \right)\quad &if\quad  &i\in\{1,...,T\}\end{cases} (Eq:3)$$

 where, on rebalancing days, $$\text{shares}_{i,j}=\frac{\left(\sum_{k=1}^M P_{i,k} \text{shares}_{i-1,k}\right)-p\left(\sum_{k=1}^M \left |\Delta\text{share value}_{i,k}\right|\right)}{M  P_{i,j}} (Eq:4)$$

 $$\Delta\text{share value}_{i,j}= \frac{\sum_{k=1}^M\left(\text{shares}_{i-1,k} \times  P_{i,k} \right) }{  M}-\text{shares}_{i-1,j} P_{i,j}, \quad for \quad j=1,...,M \quad (Eq:5)$$


## Task A) 
 Implement
 `compute_delta_change_in_share_value(self,shares:np.array,day_index: int)->np.array:` method, which computes equation $(Eq:5)$ above        

 The input `shares (np.array)` represents the number of shares held before the rebalance, i.e. $\text{shares}_{i-1}$
 The input `day_index (int)` represents the index of the day for which we want to perform a rebalance, i.e. variable $i$ in $(Eq:5)$
        
 The output should be a   `np.array` with the dollar value change on each stock 
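 To make $(Eq:5)$ concrete, here is a small numerical sketch. It is written as a free function rather than the required method, and the names `delta_share_value`, `shares_before` and `day_prices` are hypothetical. The inputs are the Day 1 holdings and Day 2 prices of the example with $p=0.01$.

```python
import numpy as np

def delta_share_value(shares_before: np.ndarray, day_prices: np.ndarray) -> np.ndarray:
    """Dollar amount to buy (+) or sell (-) per stock to reach equal weights."""
    position_values = shares_before * day_prices          # shares_{i-1,j} * P_{i,j}
    target_value = position_values.sum() / position_values.size
    return target_value - position_values

shares_before = np.array([2.475, 1.2375, 0.825, 0.61875])
day_prices = np.array([101.0, 198.0, 305.0, 410.0])
delta = delta_share_value(shares_before, day_prices)
print(delta)
# Proportional fee charged on the traded dollar amounts, with p = 0.01
print(0.01 * np.abs(delta).sum())
```

 The deltas sum to zero because the rebalance only moves value between stocks. The fees are charged on the absolute amounts traded.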

## Task B) 

 Implement the `compute_target_shares(self,shares:np.array,day_index: int)->np.array` method, which computes equation $(Eq:4)$ above        
 The input `shares (np.array)` represents the number of shares held before the rebalance, i.e. $\text{shares}_{i-1}$
 The input `day_index (int)` represents the index of the day for which we want to perform a rebalance, i.e. variable $i$ in $(Eq:4)$
        
 The output should be a `np.array` with the target number of shares to hold for each stock 


## Task C) 

 Using the methods from the previous tasks, implement an equally weighted portfolio with initial value of 1000$ and with n-day rebalancing and proportional transaction costs as part of the:

 `construct_equally_weighted_portfolio(self, rebalance_period: int)` subclass method. 

 The input `rebalance_period (int)` represents the number of days after which to rebalance the portfolio.

 The method should **update**:
- `self.portfolio_values`: which should represent the portfolio values at each time point following the rebalance strategy  (Eq:3) above
- `self.portfolio_shares`: which should represent the shares held to make the portfolio equally weighted at each time step (Eq:4) above



### **Notes/Hints:** 

 1) We will assume that the initial value of the portfolio is $1000

 2) You can (and are encouraged to) use methods from Question 1, although this is not compulsory

 3) Note that the following relation must hold:

 $$\frac{1}{M}=\frac{\text{shares}_{i,j}\times P_{i,j}}{V_i}, \quad if \quad  i \equiv 0 \;(\mathrm{mod}\; n)$$

 4) You can use Question 1 to compare results: you should observe that portfolio values are smaller for any $p>0$ and should match the previous question if $p=0$

 5) Note that the values $V_i$ represent the value of the portfolio immediately after trading

## Example:
 prices = np.array([
 
    Day 1 prices [100, 200, 300,400],  
    Day 2 prices [101, 198, 305,410],  
    Day 3 prices [102, 202, 310,405],  
    Day 4 prices [101, 198, 305,410], 
 ])

**A)** Let's assume $n=1$ e.g. rebalancing every day and 1% transaction fees e.g. p=0.01

 Then the portfolio value should be: $V=$`[ 990. , 1000.209375 , 1008.68262919, 1000.15847549]`
 And the number of shares


 shares= [                            
 
    Day 1 shares   [2.475      ,1.2375     ,0.825      ,0.61875   ],
    Day 2 shares   [2.47576578 ,1.26289063 ,0.81984375 ,0.60988377],           
    Day 3 shares   [2.47226135 ,1.24836959 ,0.81345373 ,0.6226436],          
    Day 4 shares   [2.47563979 ,1.26282636 ,0.81980203 ,0.60985273],
 ]

**B)** Let's assume $n=2$ e.g. rebalancing every two days and 1% transaction fees e.g. p=0.01

 Then the portfolio value should be: $V=$`[  990. ,  1000.3125 ,1008.6924375,1000.27256524]`
 And the number of shares


 shares= [                            
 
    Day 1 shares   [2.475      ,1.2375     ,0.825      ,0.61875   ],
    Day 2 shares   [2.475      ,1.2375     ,0.825      ,0.61875   ],           
    Day 3 shares   [2.47228539 ,1.24838173 ,0.81346164 ,0.62264965],           
    Day 4 shares   [2.47228539 ,1.24838173 ,0.81346164 ,0.62264965], 
 ]
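 The example values can be reproduced with a short standalone function. This is a hedged sketch under the reading that $V_i$ is the value immediately after trading (Note 5) and that fees are only charged on rebalancing days. The name `backtest_with_fees` and the `initial_value` argument are hypothetical, and this is not the expected class-based solution.

```python
import numpy as np

def backtest_with_fees(prices: np.ndarray, rebalance_period: int, p: float,
                       initial_value: float = 1000.0):
    """Equally weighted portfolio with proportional fees p on traded value."""
    nb_days, nb_stocks = prices.shape
    values = np.zeros(nb_days)
    shares = np.zeros((nb_days, nb_stocks))
    values[0] = initial_value * (1.0 - p)       # fee paid on the initial purchase
    shares[0] = values[0] / (nb_stocks * prices[0])
    for day in range(1, nb_days):
        position_values = shares[day - 1] * prices[day]
        pre_trade_value = position_values.sum()
        if day % rebalance_period == 0:
            # Dollar amount traded on each stock to get back to equal weights
            delta = pre_trade_value / nb_stocks - position_values
            values[day] = pre_trade_value - p * np.abs(delta).sum()
            shares[day] = values[day] / (nb_stocks * prices[day])
        else:
            values[day] = pre_trade_value
            shares[day] = shares[day - 1]
    return values, shares

prices = np.array([[100, 200, 300, 400], [101, 198, 305, 410],
                   [102, 202, 310, 405], [101, 198, 305, 410]], dtype=float)
values_daily, _ = backtest_with_fees(prices, 1, 0.01)
values_2day, _ = backtest_with_fees(prices, 2, 0.01)
values_no_fee, _ = backtest_with_fees(prices, 1, 0.0)
print(values_daily)
print(values_2day)
```

 With `p = 0.0` the function reduces to the fee-free backtest of Question 1, which is a convenient cross-check.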

In [8]:
BT = Proportionalfees(prices,0.01)
BT.construct_equally_weighted_portfolio(1)
BT.display_portfolio()

Portfolio values: [ 990.         1000.209375   1008.68262919 1000.15847549]
Protfolio shares [[2.475      1.2375     0.825      0.61875   ]
 [2.47576578 1.26289063 0.81984375 0.60988377]
 [2.47226135 1.24836959 0.81345373 0.6226436 ]
 [2.47563979 1.26282636 0.81980203 0.60985273]]


In [9]:
BT = Proportionalfees(prices,0.01)
BT.construct_equally_weighted_portfolio(2)
BT.display_portfolio()

Portfolio values: [ 990.         1000.3125     1008.6924375  1000.27256524]
Protfolio shares [[2.475      1.2375     0.825      0.61875   ]
 [2.475      1.2375     0.825      0.61875   ]
 [2.47228539 1.24838173 0.81346164 0.62264965]
 [2.47228539 1.24838173 0.81346164 0.62264965]]


## Question 3: Transaction cost calculation

We will implement the base class method `compute_fees(self,rebalance_period: int)->np.array`. This method will compute the daily fees incurred by the portfolio rebalancing procedure. The following formula can be used to compute the incurred fees:
$$\text{fees}_{i}=\begin{cases} \left(\frac{1}{1-p}-1\right)\sum_{j=1}^M P_{i,j}\,\text{shares}_{i,j}, \quad & \text{if } i=0 \\
          \sum_{j=1}^M P_{i,j}\,\text{shares}_{i-1,j}-\sum_{j=1}^M P_{i,j}\,\text{shares}_{i,j}, \quad & \text{if } i>0\end{cases}$$


## Example:
 prices = np.array([
 
    Day 1 prices [100, 200, 300,400],  
    Day 2 prices [101, 198, 305,410],  
    Day 3 prices [102, 202, 310,405],  
    Day 4 prices [101, 198, 305,410], 
 ])

**A)** Let's assume $n=1$ e.g. rebalancing every day and 5% transaction fees e.g. p=0.05

 Then the fees should be: $\text{fees}=$`[50.        ,  0.49479167,  0.4981799 ,  0.50032038]`



**B)** Let's assume $n=2$ e.g. rebalancing every two days and 1% transaction fees e.g. p=0.01

 Then the fees should be: $\text{fees}=$`[10.       ,  0.       ,  0.0763125,  0.       ]`
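 The fee formula only needs the prices and the holdings, so it can be evaluated without loops. The sketch below is a free function rather than the `compute_fees` method. The name `fees_from_holdings` is hypothetical, and the holdings are rebuilt by hand from example B of Question 2 ($n=2$, $p=0.01$, with $V_0=990$ and $V_2=1008.6924375$).

```python
import numpy as np

def fees_from_holdings(prices: np.ndarray, shares: np.ndarray, p: float) -> np.ndarray:
    """Daily fees implied by the holdings, following the formula above."""
    held_value = np.sum(prices * shares, axis=1)               # sum_j P_ij * shares_ij
    carried_value = np.sum(prices[1:] * shares[:-1], axis=1)   # sum_j P_ij * shares_(i-1)j
    fees = np.empty(prices.shape[0])
    fees[0] = (1.0 / (1.0 - p) - 1.0) * held_value[0]          # fee on the initial purchase
    fees[1:] = carried_value - held_value[1:]                  # value lost when trading
    return fees

prices = np.array([[100, 200, 300, 400], [101, 198, 305, 410],
                   [102, 202, 310, 405], [101, 198, 305, 410]], dtype=float)
shares = np.zeros_like(prices)
shares[0] = shares[1] = 990.0 / (4 * prices[0])          # holdings after the day-0 trade
shares[2] = shares[3] = 1008.6924375 / (4 * prices[2])   # holdings after the day-2 rebalance
fees = fees_from_holdings(prices, shares, 0.01)
print(fees)
```

 On days without a rebalance the holdings are unchanged, so the two sums coincide and the fee is zero.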


In [10]:
BT = Proportionalfees(prices,0.05)
BT.compute_fees(1)

array([50.        ,  0.49479167,  0.4981799 ,  0.50032038])

In [11]:
BT = Proportionalfees(prices,0.01)
BT.compute_fees(2)

array([10.       ,  0.       ,  0.0763125,  0.       ])

# PROBLEM II: The Bergomi model (Q1 10 Points | Q2 10 Points | Q3 10 POINTS | Q4 10 POINTS)
---

The Bergomi model is one of the key stochastic volatility models used in option pricing on Equity markets. 
Under the risk-neutral measure (and assuming no interest and no dividend for simplicity), the asset price process $(S_t)_{t\geq 0}$ satisfies the following dynamics:
$$
\log (S_t)-\log(S_0) =- \frac{1}{2}\int_0^t \sigma^2_s ds+ \int_0^t \sigma_s dB_s\approx - \frac{1}{2}\sum_{i=1}^{\lfloor nt\rfloor}\sigma^2_{t_{i-1}} (t_i-t_{i-1}) +\sum_{i=1}^{\lfloor nt\rfloor}\sigma_{t_{i-1}} (B_{t_i}-B_{t_{i-1}}) ,
$$
starting from $S_0>0$, where $B$ is a standard Brownian motion, and where $(\sigma_t)_{t\geq 0}$ is another stochastic process representing the instantaneous volatility satisfying
$$
\sigma_{t} = \sigma_{0}\exp\Big\{\nu W_t\Big\},
$$
for some strictly positive constants $\sigma_0,\nu>0$ and where $W$ is another standard Brownian motion.

The two Brownian motions $W$ and $B$ are correlated with correlation $\rho \in [-1,1]$. Said otherwise, we can write 
$$
B_t = \rho W_t + \sqrt{1-\rho^2} W_t^{\perp},
$$
where $(W_t^{\perp})_{t\geq 0}$ is a standard Brownian motion independent of $W$. We define $t_i=\frac{iT}{N-1}$ for $i=0,\dots,N-1$, where $N\in\mathbb{N}$ is the number of time steps.

## **Question 1**
Write a function `generate_two_brownians` that generates the two correlated Brownian motions $W$ and $B$, subject to the notes below.

*Notes:*
- Given a time horizon $T>0$, the discretisation time grid should be $\left\{t_{i} = \frac{iT}{nb\_steps-1}\right\}_{i=0,\dots,nb\_steps-1}$, so that both B and W should be of size (nb_steps, nb_paths) and start from 0 at time 0.
- Loops are not allowed, only `numpy` computations are. 
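One loop-free way to build such paths is to draw independent Gaussian increments, mix them using $B = \rho W + \sqrt{1-\rho^2}\, W^{\perp}$, and take cumulative sums. The block below is a sketch under these assumptions and not the reference solution: the name `correlated_brownians` and the `seed` argument are hypothetical, and `np.random.default_rng` is just one possible random generator.

```python
import numpy as np

def correlated_brownians(rho: float, T: float, nb_steps: int, nb_paths: int, seed=None):
    """Return (B, W), each of shape (nb_steps, nb_paths), with corr(B, W) = rho."""
    rng = np.random.default_rng(seed)
    dt = T / (nb_steps - 1)                       # grid t_i = i T / (nb_steps - 1)
    # Independent Gaussian increments with variance dt
    dW = rng.normal(0.0, np.sqrt(dt), size=(nb_steps - 1, nb_paths))
    dW_perp = rng.normal(0.0, np.sqrt(dt), size=(nb_steps - 1, nb_paths))
    dB = rho * dW + np.sqrt(1.0 - rho**2) * dW_perp
    # Prepend a row of zeros so that both paths start from 0 at time 0
    start = np.zeros((1, nb_paths))
    W = np.vstack([start, np.cumsum(dW, axis=0)])
    B = np.vstack([start, np.cumsum(dB, axis=0)])
    return B, W

B, W = correlated_brownians(rho=-0.7, T=1.0, nb_steps=50, nb_paths=20000, seed=0)
print(B.shape, W.shape)
print(np.corrcoef(B[-1], W[-1])[0, 1])  # empirical correlation of the terminal values
```

The empirical correlation of the terminal values should be close to $\rho$, and the variance of $W_T$ close to $T$, up to Monte Carlo error.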


In [12]:
def generate_two_brownians(rho:float, T:float, nb_steps:int, nb_paths:int):
    """
        #Inputs:
        rho (float): correlation between the two Brownian motions
        T (float): time horizon
        nb_steps (int): number of time steps
        nb_paths (int): number of paths
        
        #Outputs:
        paths of B and W: np.array, np.array
    """
    
    return 

## **Question 2** 
Write a function `generate_S_paths` that generates the paths of the stock price $S$. 

*Notes:*
- You are required to use the function `generate_two_brownians` above
- Loops are not allowed, only `numpy` computations are. 
- S should be of size (nb_steps, nb_paths).
- The list `params` is params = $[S_0, \sigma_0, \nu, \rho, T]$.
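The discretised log-price scheme can be vectorised with cumulative sums. The block below is a self-contained sketch and not the expected solution: it regenerates the Brownian increments inline instead of calling `generate_two_brownians`, and the name `simulate_bergomi_paths` and the `seed` argument are hypothetical.

```python
import numpy as np

def simulate_bergomi_paths(S0, sigma0, nu, rho, T, nb_steps, nb_paths, seed=None):
    """Log-Euler paths of S, shape (nb_steps, nb_paths), with S[0] = S0."""
    rng = np.random.default_rng(seed)
    dt = T / (nb_steps - 1)
    dW = rng.normal(0.0, np.sqrt(dt), size=(nb_steps - 1, nb_paths))
    dW_perp = rng.normal(0.0, np.sqrt(dt), size=(nb_steps - 1, nb_paths))
    dB = rho * dW + np.sqrt(1.0 - rho**2) * dW_perp
    start = np.zeros((1, nb_paths))
    W = np.vstack([start, np.cumsum(dW, axis=0)])
    # Volatility sigma_t = sigma0 * exp(nu * W_t), evaluated at the left end of each step
    sigma_left = sigma0 * np.exp(nu * W[:-1])
    log_increments = -0.5 * sigma_left**2 * dt + sigma_left * dB
    log_S = np.log(S0) + np.vstack([start, np.cumsum(log_increments, axis=0)])
    return np.exp(log_S)

S = simulate_bergomi_paths(100.0, 0.3, 0.2, -0.7, 1.0, nb_steps=100, nb_paths=20000, seed=1)
print(S.shape)
print(S[-1].mean())  # should be close to S0, since S is a martingale here
```

Because each step multiplies $S$ by a factor of conditional mean one, the average of $S_T$ should stay close to $S_0$, which gives a quick sanity check.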

In [13]:
def generate_S_paths(params: np.array, nb_steps:int, nb_paths:int):
    """
        #Inputs:
        params = (S0, sigma0, nu, rho, T)
            - S0 (float): initial value of the stock price
            - sigma0 (float): initial value of the volatility
            - nu (float): volatility of volatility parameter
            - rho (float): correlation between the two Brownian motions
            - T (float): time horizon
        nb_steps (int): number of time steps
        nb_paths (int): number of paths
        
        #Outputs:
        paths of S: np.array
    """
  
    
    return 


## **Question 3** 
Write a function `call_price_bergomi` that outputs the price of a Call option in the Bergomi model by Monte Carlo, namely
$$
\mathbb{E}\Big[\max(S_T - K,0)\Big].
$$

You are of course required to use the function `generate_S_paths` above.

In [14]:
def call_price_bergomi(K:float, params: np.array, nb_steps:int, nb_paths:int):
    """
        #Inputs:
        K (float): strike of the option
        params = (S0, sigma0, nu, rho, T)
            - S0 (float): initial value of the stock price
            - sigma0 (float): initial value of the volatility
            - nu (float): volatility of volatility parameter
            - rho (float): correlation between the two Brownian motions
            - T (float): time horizon
        nb_steps (int): number of time steps
        nb_paths (int): number of paths
        
        #Outputs:
        call Price: float
    """

   
    
    return 


As a consistency check, you may want to check that, with the values
$$
S_0 = 100, \quad
\sigma_0 = 30\%, \quad
\nu = 0.2, \quad
K = 99, \quad
T = 1, \quad
\rho = -0.7,
$$
your function gives a price close to $12.36$.
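A further sanity check, added here as a suggestion rather than as part of the question: when $\nu=0$ the volatility is constant, $\sigma_t=\sigma_0$, so the model reduces to Black-Scholes with zero rates and the Monte Carlo price can be compared with the closed-form formula. The function name `black_scholes_call_zero_rate` below is hypothetical.

```python
import numpy as np
from scipy.stats import norm

def black_scholes_call_zero_rate(S0: float, K: float, sigma: float, T: float) -> float:
    """Black-Scholes call price with zero interest rate and no dividends."""
    total_vol = sigma * np.sqrt(T)
    d1 = (np.log(S0 / K) + 0.5 * total_vol**2) / total_vol
    d2 = d1 - total_vol
    return S0 * norm.cdf(d1) - K * norm.cdf(d2)

benchmark = black_scholes_call_zero_rate(100.0, 99.0, 0.3, 1.0)
print(benchmark)
```

A Monte Carlo pricer run with $\nu=0$ should agree with this benchmark up to sampling error.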

## **Question 4** 
Using the function `generate_S_paths`, write a function `barrier_price_bergomi` that outputs the price of a Barrier Call option in the Bergomi model with payoff
$$
\mathbb{E}\Big[\max(S_T - K,0)\boldsymbol{1}_{S_T > B}\Big],
$$
for some barrier level $B>0$.

Notes: 
- You are required to use only `np` functions and no `if/then/else` statements.
- the function should also return the percentage of samples that yield a non-zero payoff using `map`.
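
The two notes can be illustrated on a toy array of terminal prices. This is only a sketch of the payoff step: the function name `barrier_call_from_terminal` and the toy inputs are hypothetical, and a full solution would take the terminal prices from `generate_S_paths`.

```python
import numpy as np

def barrier_call_from_terminal(S_T: np.ndarray, K: float, B: float):
    """Monte Carlo estimate and fraction of non-zero payoffs from terminal prices."""
    # Multiplying by the boolean mask replaces an if/else on each path
    payoffs = np.maximum(S_T - K, 0.0) * (S_T > B)
    price = np.mean(payoffs)
    # `map` flags the non-zero payoffs; summing the flags counts them
    nb_nonzero = sum(map(lambda payoff: payoff > 0, payoffs))
    return price, nb_nonzero / payoffs.size

toy_terminal_prices = np.array([90.0, 105.0, 120.0, 130.0])
price, perc_valid = barrier_call_from_terminal(toy_terminal_prices, K=99.0, B=110.0)
print(price, perc_valid)  # payoffs are [0, 0, 21, 31], so 13.0 and 0.5
```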

With a barrier equal to $110$, you should be expecting a price close to $11.62$ and the percentage of valid paths around $34\%$.

In [15]:
def barrier_price_bergomi(K:float, B:float, params: np.array, nb_steps:int, nb_paths:int):
    """
        #Inputs:
        K (float): strike of the option
        B (float): barrier of the option
        params = (S0, sigma0, nu, rho, T)
            - S0 (float): initial value of the stock price
            - sigma0 (float): initial value of the volatility
            - nu (float): volatility of volatility parameter
            - rho (float): correlation between the two Brownian motions
            - T (float): time horizon
        nb_steps (int): number of time steps
        nb_paths (int): number of paths
        
        #Outputs:
        barrier_price: float
        perc_valid: float
    """

   
    
    return 


In [16]:
S0, sigma0, nu, rho, T = 100., .3, .2, -.7, 1.
K = 99
nb_steps, nb_paths = 900, 50000
params = [S0, sigma0, nu, rho, T]

In [17]:
call_price_bergomi(K, params, nb_steps, nb_paths)

12.345189295939221

In [18]:
barrier = 110
barrier_price_bergomi(K, barrier, params, nb_steps, nb_paths)

(11.821840826369074, 0.34874)

# PROBLEM III: Pandas and Equity data (Q1:20 POINTS  | Q2:10 POINTS)
---

The file `stock_data.csv` contains financial data of stock prices for different symbols.

The columns of the csv file represent different stocks or indices:
- "SP500" is the S&P 500 index
- "TSLA","V","AMD","COST" are single stocks

You can read the file into pandas using the code below

In [19]:
df=pd.read_csv("stock_data.csv",index_col=0)
df.head()

Date,SP500,TSLA,V,AMD,COST
2010-06-29,1041.23999,1.592667,17.862499,7.48,55.630001
2010-06-30,1030.709961,1.588667,17.6875,7.32,54.830002
2010-07-01,1027.369995,1.464,18.215,7.39,54.900002
2010-07-02,1022.580017,1.28,18.295,7.17,54.23
2010-07-06,1028.060059,1.074,18.067499,7.04,54.0


## Question 1: Calculate the Moving Average and Identify Crossovers

**A)** Implement a function `calculate_moving_average(df: pd.DataFrame, n: int) -> pd.DataFrame:` with the following specifications:

Input:
    
- df: A pandas DataFrame with dates as the index and stock names as columns. The values represent the stock's closing prices.
- n: The number of days for the moving average (integer, e.g., 5 for a 5-day moving average).
    
Output:
    
- A pandas DataFrame with the same structure as the input but containing the n-day moving averages

  Given a sequence of prices $(S_0,....,S_N)$, the Moving Average (MA) process with a rolling window of size $n$ is defined as $(MA(S_{n-1},n),....,MA(S_N,n))$

  $$MA(S_i,n)=\frac{1}{n}\sum_{j=i-n+1}^{i} S_j,\quad for \; i\geq n-1$$

  **Note**: If $i<n-1$, then the convention is to set it to **NaN**, i.e. $ MA(S_i,n)=\text{NaN}$ for $i<n-1$.
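
  The NaN convention matches what the pandas rolling window does by default, since a fixed-size window requires `n` observations before producing a value. A minimal sketch on toy data follows. The function name `rolling_mean` and the toy DataFrame are hypothetical.

```python
import pandas as pd

def rolling_mean(prices: pd.DataFrame, n: int) -> pd.DataFrame:
    """n-day moving average; rows with fewer than n observations are NaN."""
    return prices.rolling(window=n).mean()

toy_prices = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0]})
moving_average = rolling_mean(toy_prices, 3)
print(moving_average)  # first two rows are NaN, then 2.0, 3.0, 4.0
```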


**B)** Write a function `detect_crossovers(df: pd.DataFrame, moving_averages: pd.DataFrame) -> pd.DataFrame` that detects when the price crosses the moving average for each stock. The function should return a DataFrame where True indicates a crossover event and False otherwise. A crossover occurs when the stock price crosses above the moving average from below or below the moving average from above.

Inputs:
- df: A pandas DataFrame with stock prices as described in the main task.
- moving_averages: A pandas DataFrame containing the n-day moving averages for each stock.

Outputs:
- A pandas DataFrame where each cell is True if a crossover occurs on that day, and False otherwise.

Mathematically, we say that a cross-over occurs if either:

- $ MA(S_i,n)< S_i$ and $ MA(S_{i+1},n)> S_{i+1}$

- $ MA(S_i,n)> S_i$ and $ MA(S_{i+1},n)< S_{i+1}$

**Remark**: If NaN is being compared the convention will be to set the value to False
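
The two conditions can be checked with shifted boolean frames. The sketch below rests on one assumption that the text does not settle: the crossover is flagged on day $i+1$, the day on which the price is observed on the other side of the moving average. The function name `flag_crossovers` and the toy data are hypothetical.

```python
import pandas as pd

def flag_crossovers(prices: pd.DataFrame, moving_averages: pd.DataFrame) -> pd.DataFrame:
    """True where the price is on the opposite side of its moving average vs the day before."""
    above = prices > moving_averages   # comparisons with NaN evaluate to False
    below = prices < moving_averages
    # State on the previous day; the first row has no previous day, so fill with False
    above_yesterday = above.shift(1, fill_value=False).astype(bool)
    below_yesterday = below.shift(1, fill_value=False).astype(bool)
    return (above_yesterday & below) | (below_yesterday & above)

toy_prices = pd.DataFrame({"A": [1.0, 2.0, 3.0, 2.0, 1.0, 4.0]})
toy_ma = toy_prices.rolling(window=2).mean()
crossovers = flag_crossovers(toy_prices, toy_ma)
print(crossovers["A"].tolist())  # [False, False, False, True, False, True]
```

Since any comparison involving NaN is False, the rows where the moving average is not yet defined are never flagged, in line with the remark above.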


In [20]:
def calculate_moving_average(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """
    following specification
    Input:
    - df: A pandas DataFrame with dates as the index and stock names as columns. The values represent the stock's closing prices.
    - n: The number of days for the moving average (integer, e.g., 5 for a 5-day moving average).

    Output:
    - A pandas DataFrame with the same structure as the input but containing the n-day moving averages.

    """
    

def detect_crossovers(df: pd.DataFrame, moving_averages: pd.DataFrame) -> pd.DataFrame:
    """
    following specification
    Input:
    - df: A pandas DataFrame with dates as the index and stock names as columns. The values represent the stock's closing prices.
    - moving_averages: A pandas DataFrame containing the n-day moving averages for each stock.

    Output:
    - A pandas DataFrame where each cell is True if a crossover occurs on that day, and False otherwise.

    """
    
    

## Example 1) moving average

In [21]:
calculate_moving_average(df.iloc[:20,:],10)

Date,SP500,TSLA,V,AMD,COST
2010-06-29,,,,,
2010-06-30,,,,,
2010-07-01,,,,,
2010-07-02,,,,,
2010-07-06,,,,,
2010-07-07,,,,,
2010-07-08,,,,,
2010-07-09,,,,,
2010-07-12,,,,,
2010-07-13,1053.252997,1.272267,18.52225,7.338,55.222


## Example 2) detect crossover

In [22]:
detect_crossovers(df.iloc[:20,:],calculate_moving_average(df.iloc[:20,:],10))

Date,SP500,TSLA,V,AMD,COST
2010-06-29,False,False,False,False,False
2010-06-30,False,False,False,False,False
2010-07-01,False,False,False,False,False
2010-07-02,False,False,False,False,False
2010-07-06,False,False,False,False,False
2010-07-07,False,False,False,False,False
2010-07-08,False,False,False,False,False
2010-07-09,False,False,False,False,False
2010-07-12,False,False,False,False,False
2010-07-13,False,False,False,False,False


## Question 2: Anomaly detection

**A)** Implement a function `detect_anomalies(df: pd.DataFrame, z_score_threshold: float) -> pd.DataFrame:` with the following specifications:

Input:
    
- df: A pandas DataFrame with dates as the index and stock names as columns. The values represent the stock's closing prices.
- z_score_threshold: The z-score threshold for anomaly flagging 
    
Output:
    
- A pandas DataFrame with the same structure that flags True for an anomaly and False for a "normal" data point. An anomaly is understood as a stock return value exceeding a z-score threshold

  We use percentage returns defined as $R_t=(S_{t}-S_{t-1})/S_{t-1}$. Then the z-score for a given return is defined as:
  
  $$\text{z-score}(R_i)=\frac{R_i-\mathbb{E}[R]}{\sqrt{\mathbb{V}[R]}},$$
  where R is the random variable representing the returns for a specific stock. In our case, we will estimate $\mathbb{E}[R]$ and $\mathbb{V}[R]$ using 
`.mean()` and `.var()` pandas functions

**Remark**: The convention is to remove the first date, since it has NaN values: anomaly detection works on returns rather than prices.
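
A minimal sketch of the z-score computation on toy data follows. It makes one assumption that the text leaves open: a return is flagged when the absolute value of its z-score exceeds the threshold. The function name `flag_return_anomalies` and the toy prices are hypothetical. Note that pandas `.var()` uses the sample variance (`ddof=1`) by default.

```python
import numpy as np
import pandas as pd

def flag_return_anomalies(prices: pd.DataFrame, z_score_threshold: float) -> pd.DataFrame:
    """True where the absolute z-score of the daily return exceeds the threshold."""
    # Percentage returns; the first row has no previous price and is dropped
    returns = (prices / prices.shift(1) - 1.0).iloc[1:]
    z_scores = (returns - returns.mean()) / np.sqrt(returns.var())
    return z_scores.abs() > z_score_threshold

toy_prices = pd.DataFrame({"A": [100.0, 100.0, 100.0, 100.0, 100.0, 150.0]})
anomalies = flag_return_anomalies(toy_prices, 1.5)
print(anomalies["A"].tolist())  # only the final 50% jump is flagged
```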

In [4]:
def detect_anomalies(df: pd.DataFrame, z_score_threshold:float) -> pd.DataFrame:
    """
    following specification
    Input:
    - df: A pandas DataFrame with dates as the index and stock names as columns. The values represent the stock's closing prices.
    - z_score_threshold: The z-score threshold for anomaly flagging 

    Output:
    - A pandas DataFrame with the same structure that flags True for an anomaly and False for a "normal" data point

    """
    
    

In [24]:
detect_anomalies(df.iloc[:20,:],1.5)

Date,SP500,TSLA,V,AMD,COST
2010-06-30,False,False,False,False,False
2010-07-01,False,False,False,False,False
2010-07-02,False,False,False,False,False
2010-07-06,False,False,False,False,False
2010-07-07,True,False,False,True,False
2010-07-08,False,True,False,False,True
2010-07-09,False,False,False,False,False
2010-07-12,False,False,False,False,False
2010-07-13,False,False,False,False,False
2010-07-14,False,False,False,False,False
