# Modeling the Risk Neutral Measure

## Statistics Refreshers
* Basic financial arithmetic with problem solving
* Significance tests, confidence intervals, model explanatory power, null/alternative hypotheses
* Exponential/logarithmic functions
* PMF: probability mass function
* CDF: cumulative distribution function
* Martingales and measure theory

    
### Stochastic Differential Equation (SDE)
* Generic: In these equations we can see that the underlying asset price, or stochastic process, St, evolves according to two terms. The first term is a drift term, that defines the deterministic movement of the asset over time. The second term then represents the volatility in the stochastic process, and leverages the Brownian Motion building block.
* Bachelier: Perhaps the simplest SDE specification is the Bachelier model where we assume that a Brownian Motion defines the volatility in the stochastic process and also allow for drift in the underlying asset.
* Black-Scholes: This specification leads to the well-known Black-Scholes formula. Specifying the Black-Scholes dynamics, as in (2.37), yields asset prices St that are log-normally distributed, in contrast to the normally distributed asset prices of the Bachelier model.
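
As a minimal sketch of the contrast between the last two specifications, the snippet below simulates terminal prices under Bachelier-style dynamics (here assumed to be dS = mu dt + sigma dW) and Black-Scholes dynamics (dS = mu S dt + sigma S dW). The function names, parameter values and the exact form of the Bachelier drift are illustrative assumptions, not taken from the text.

```python
import math
import random


def simulate_bachelier(s0, mu, sigma, t, n_steps, rng):
    """Euler scheme for dS = mu*dt + sigma*dW (assumed arithmetic form)."""
    dt = t / n_steps
    s = s0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        s += mu * dt + sigma * dw
    return s


def simulate_black_scholes(s0, mu, sigma, t, n_steps, rng):
    """Steps log S forward for dS = mu*S*dt + sigma*S*dW."""
    dt = t / n_steps
    log_s = math.log(s0)
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        log_s += (mu - 0.5 * sigma ** 2) * dt + sigma * dw
    return math.exp(log_s)


rng = random.Random(42)  # fixed seed so the run is reproducible
# Bachelier volatility is in price units; Black-Scholes volatility is proportional.
bachelier_terminal = [
    simulate_bachelier(100.0, 5.0, 20.0, 1.0, 50, rng) for _ in range(2000)
]
bs_terminal = [
    simulate_black_scholes(100.0, 0.05, 0.2, 1.0, 50, rng) for _ in range(2000)
]

# Bachelier prices are normally distributed and can in principle go negative;
# Black-Scholes prices are log-normal and stay strictly positive.
print(min(bachelier_terminal), min(bs_terminal))
```

Stepping the logarithm of the price in the Black-Scholes case is a design choice that guarantees positive prices at every step, whatever the step size.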

### Ito's Lemma
Ito’s formula, or Ito’s lemma, is perhaps the most important result in stochastic calculus. It is how we calculate differentials of complex functions in stochastic calculus and, along those lines, is the stochastic calculus equivalent of the chain rule in ordinary calculus.
* Taylor series
* drift-diffusion model
* O(dy/dx)
* closed-form solution
* log-normally distributed
* correction term
* partial differential equations (PDEs)
* Feynman-Kac formula
* Girsanov’s theorem: In the realm of fixed income, and in particular interest rate modeling, we will find that the risk-neutral measure is not always the most convenient pricing measure. This is inherently due to interest rates being a stochastic quantity, making the discounting terms themselves stochastic.
* Radon-Nikodym derivative
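
To make the correction term concrete: applying Ito's lemma to f(S) = ln S under Black-Scholes dynamics gives d(ln S) = (mu - sigma^2/2) dt + sigma dW, whereas the ordinary chain rule would suggest a drift of mu. The sketch below checks this numerically by simulating S directly and comparing the average log return with both predictions. All parameter values are illustrative assumptions.

```python
import math
import random

mu, sigma, t, s0 = 0.08, 0.3, 1.0, 100.0  # illustrative parameters (assumed)
n_steps, n_paths = 50, 5000
dt = t / n_steps
rng = random.Random(0)

log_returns = []
for _ in range(n_paths):
    s = s0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Euler step on S itself: no correction term is inserted by hand.
        s += mu * s * dt + sigma * s * dw
    log_returns.append(math.log(s / s0))

mean_log_return = sum(log_returns) / n_paths
ito_prediction = (mu - 0.5 * sigma ** 2) * t  # drift of ln S from Ito's lemma
naive_prediction = mu * t                     # drift from the ordinary chain rule

print(mean_log_return, ito_prediction, naive_prediction)
```

The simulated mean should land near the Ito prediction rather than the naive one, up to Monte Carlo noise and a small discretization error.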

Notes
* Mathematical finance - Trusted journal subscription

# Modeling the Physical Measure
In this chapter we transition from working in the risk-neutral measure, where we
can replicate a contingent claim using a combination of simpler assets, to the physical measure, where investors are risk-averse, replication arguments are no longer possible, and we must instead rely on our ability to forecast. In the physical measure we are no longer interested in market-implied prices, as determined by a set of options; we are interested in the actual future outcomes of the assets themselves.

## Efficient Market Hypothesis
The Efficient Market Hypothesis (EMH) is a key tenet of financial market theory and
states that asset prices reflect all available information. This means that, according to the EMH, investors should not be able to consistently outperform on a risk-adjusted
basis.

The efficient market hypothesis comes in three forms (weak, semi-strong and
strong market efficiency), each of which is described below:
* Weak: Current prices reflect all information available in historical prices. If this
form of efficiency is true, then technical analysis and quant strategies based on
momentum and mean-reversion should fail to generate excess profits.
* Semi-Strong: Current prices reflect all publicly available information about a
firm or asset. If this form of market efficiency is correct, then investors cannot
generate profits by relying on fundamental data for a company, trading based on
news releases, earnings statements, or any other information that is accessible
to a broad set of investors.
* Strong: Current prices reflect all public and private information related to a
firm. If this form of market efficiency is to be believed, then even insider trading
cannot generate excess returns.

## Market Anomalies
These behavioral phenomena (biases) give rise to the notion of underlying risk premia. Behavioral finance and risk premia are introduced and explored further in the next section.
* **momentum effect**
* Another documented anomaly is that firms with high book-to-market values tend to outperform firms with low book-to-market values over longer periods [74]. This phenomenon is often referred to as the value premium, or **value effect**.
* Kahneman [114], for example, postulated that investors are inherently characterized by loss aversion, meaning they are more concerned with avoiding losses than they are with achieving gains. A canonical example of a risk premium is the premium earned from selling insurance. When buying homeowner’s insurance, for example, homeowners are generally willing to pay the prevailing market rate without asking questions. Fundamentally, this is to avoid large lump sum payments.


## Linear Regression
In a forecasting regression, the dependent variable is a future return, and there are several variables
that we postulate have some bearing on that future return. The main assumptions embedded in a linear regression model are:
1. Linear relationship between dependent and explanatory variables
2. Homoscedasticity & Independence of Residuals
3. Lack of High Correlation Between Explanatory Variables

A common example of this is when we build an expected return model. In this case, we will build expected returns that are conditional on a set of explanatory variables. In the equity markets, these explanatory variables may be motivated by a company's balance sheet, earnings estimates, press releases or other fundamental data. For example, a commonly used value signal uses current price-to-earnings or price-to-book as a forecast for subsequent returns. In many cases the explanatory variable may be a return from a previous period.

Additionally, we often rely on regression techniques to build factor models that are based on contemporaneous regressions.
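
As a minimal sketch of an expected return model with a single explanatory variable, the snippet below fits an ordinary least squares line to synthetic data. The `signal` series stands in for a standardized value signal, and the data-generating parameters are assumptions chosen only so the fit can be checked against known values.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


rng = random.Random(1)
true_alpha, true_beta = 0.002, 0.01  # assumed parameters for the synthetic data
signal = [rng.gauss(0.0, 1.0) for _ in range(5000)]
future_return = [
    true_alpha + true_beta * s + rng.gauss(0.0, 0.02) for s in signal
]

alpha_hat, beta_hat = ols_fit(signal, future_return)
print(alpha_hat, beta_hat)
```

The noise term is deliberately large relative to the signal, which mirrors how weak return forecasts tend to be; the slope is still recoverable only because the sample is large.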

## Time Series
Time series analysis models the evolution of a series whose observations are indexed by time and arrive in a specific order.
Perhaps the most fundamental defining feature of time series data is whether it is
stationary or non-stationary. Non-stationary means that the distribution evolves over time, which creates challenges when modeling the process. Stationary processes, on the other hand, have the same distributional properties regardless of time. This makes them comparatively easier to model: all observations emanate from the
same distribution, so the distribution we are looking to forecast is more tractable.

In many cases, it is convenient to first de-trend the series and remove components such as seasonality prior to analyzing it.

An important feature of stationary asset price processes is that they imply a mean-reverting component. This means that divergences from a trend or baseline are likely to revert, which is the idea behind mean-reversion signals.

Conversely, if the data is non-stationary, as it is in a random walk, then it is
commonplace to difference the data. The differenced series is often stationary.
This is the case for a random walk, which is non-stationary but, when differenced,
becomes a simple white noise process and is therefore stationary. These building blocks are explored
further in the next section. In practice, we often observe that an asset's price process is non-stationary. This implies a lack of mean-reversion in asset prices and is in line with the EMH. In these cases, we difference the data and work with the return process, which is usually found to be stationary.
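
The effect of differencing can be sketched with a simulated random walk: the level series is highly persistent, while its first difference behaves like white noise. The helper name `lag1_autocorr` and the sample size are illustrative choices.

```python
import random


def lag1_autocorr(series):
    """Sample autocorrelation between consecutive observations."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i - 1] - mean) for i in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


rng = random.Random(7)
shocks = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Random walk: each level is the previous level plus a white noise shock.
walk = []
level = 0.0
for shock in shocks:
    level += shock
    walk.append(level)

# First difference of the walk recovers the shocks.
diffs = [walk[i] - walk[i - 1] for i in range(1, len(walk))]

print(lag1_autocorr(walk), lag1_autocorr(diffs))
```

The first number should be close to one and the second close to zero, which is the pattern we would expect when moving from prices to returns.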

* ARMA models: 
In an ARMA, or Autoregressive Moving Average, model we combine parts of AR and MA models. Like AR and MA models, ARMA models are defined by the number of lags that are included in the process. For an ARMA model, we need to specify the number of both AR and MA lags. The model is written ARMA(n, q), where n is the number of AR lags, q is the number of MA lags, and ϕi and θi are the model parameters for the ith AR or MA term, respectively.

    * Autoregressive (AR) processes allow for momentum or mean-reversion in the return process for an asset, a notable deviation from a random walk. More generally, an AR(n) process can be constructed by including lagged terms up to n. The Dickey-Fuller and Augmented Dickey-Fuller tests are commonly used to test for a unit root, that is, for non-stationarity.
    * Moving Average (MA)
    
One approach to choosing the number of lags is to use the **Akaike information criterion (AIC)**, which is a technique for model selection that helps balance model complexity vs. model fit. Additionally, plots of the **autocorrelation** and **partial autocorrelation** functions can be useful for identifying the number of significant lags.

    * Autocorrelation Function (ACF): measures the total autocorrelation between the process at time t and t + s without controlling for other lags.
    * Partial Autocorrelation Function (PACF): measures the partial autocorrelation between the process at time t and t + s, removing the effects of the lags at time t + 1 through t + s − 1.
    
* state space representation
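
To illustrate how the ACF and PACF described above help identify lags, the sketch below simulates an AR(1) process and computes the sample ACF at lags 1 and 2 along with the lag-2 partial autocorrelation. For an AR(1) process the ACF decays geometrically, while the PACF is approximately zero beyond lag 1. The coefficient `phi` and the helper names are assumptions made for illustration.

```python
import random


def sample_acf(series, lag):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    den = sum((v - mean) ** 2 for v in series)
    num = sum(
        (series[i] - mean) * (series[i - lag] - mean) for i in range(lag, n)
    )
    return num / den


def pacf_lag2(series):
    """Lag-2 partial autocorrelation, controlling for the lag-1 effect."""
    r1 = sample_acf(series, 1)
    r2 = sample_acf(series, 2)
    return (r2 - r1 ** 2) / (1.0 - r1 ** 2)


phi = 0.6  # assumed AR(1) coefficient
rng = random.Random(3)
x = [0.0]
for _ in range(20000):
    x.append(phi * x[-1] + rng.gauss(0.0, 1.0))

acf1 = sample_acf(x, 1)
acf2 = sample_acf(x, 2)
pacf2 = pacf_lag2(x)
print(acf1, acf2, pacf2)
```

A significant ACF at many lags combined with a PACF that cuts off after lag 1 is the usual signature that a single AR lag is enough.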

## Panel Regression
In many financial datasets we observe a cross-section at each point in time, and have a time series for each asset in the cross-section. That is, we are often working with panel data. In the following list, we briefly describe the most common techniques
for working with panel data:
* **Fixed Effects:** Assume a constant slope across the panel but allow the intercept to vary. This can be equivalently formulated as including a dummy or indicator variable for each asset. 
* **Random Effects:** The distinguishing feature of a random effects model is that it allows for the effects in different assets, or groups, to be random, rather than a constant as in the fixed effects model. This random effect is assumed to be independent of the explanatory variables.
* **Mixed Effects:** Incorporates a combination of fixed and random effects into the panel regression model for different variables.
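
A fixed effects regression with a common slope can be sketched through the within transformation: subtracting each asset's own mean from its observations is equivalent to including a dummy variable per asset. The synthetic panel below is an assumption for illustration, built so that each asset's intercept is correlated with its explanatory variable and a pooled regression is therefore misleading.

```python
import random


def ols_slope(pairs):
    """Least squares slope for a list of (x, y) pairs."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    return sxy / sxx


def fixed_effects_slope(panel):
    """Demean x and y within each asset, then run one pooled regression."""
    demeaned = []
    for observations in panel.values():
        n = len(observations)
        x_bar = sum(x for x, _ in observations) / n
        y_bar = sum(y for _, y in observations) / n
        demeaned.extend((x - x_bar, y - y_bar) for x, y in observations)
    return ols_slope(demeaned)


rng = random.Random(5)
true_slope = 0.5
intercepts = {"A": 1.0, "B": -2.0, "C": 5.0}  # hypothetical asset-level effects
panel = {}
for asset, alpha in intercepts.items():
    rows = []
    for _ in range(200):
        x = rng.gauss(alpha, 1.0)  # x is correlated with the asset's intercept
        y = alpha + true_slope * x + rng.gauss(0.0, 0.1)
        rows.append((x, y))
    panel[asset] = rows

pooled = ols_slope([row for rows in panel.values() for row in rows])
within = fixed_effects_slope(panel)
print(pooled, within)
```

The pooled slope absorbs the differences in intercepts and overstates the relationship, while the within estimate recovers the common slope.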

## Portfolio & Investment Concepts
One of the core concepts in investing is that of a portfolio. Few if any investors choose to invest their money in a single asset. Instead, they rely on a portfolio, or set of assets. This enables them to benefit from the seminal concept of diversification. Some have even gone so far as to describe diversification as the only available free
lunch in financial markets. 
* **Time Value of Money:** The concept of time value of money is centered on the idea that a dollar today is worth more than a dollar tomorrow. This concept is closely related to the idea of opportunity costs: if we have the dollar today, we can invest it and earn interest on it, or spend it and earn utility from our purchase. We can use the concept of time value of money to establish an equivalence between present and future cashflows.
* **Compounding returns:** Another central concept in the realm of portfolio management is that of compounding, which refers to the idea that over time, investment gains are magnified as we not only earn interest on initial capital, but we also earn interest on previously earned interest.
* **Portfolio Calculations:**  Two of the most common portfolio statistics that we will need to compute are the return and volatility, or standard deviation of a portfolio. In conjunction, these two metrics help us identify the risk-adjusted return of a portfolio which is perhaps the most common mechanism of gauging a portfolio’s attractiveness.
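
The three concepts above reduce to short formulas, sketched here under simple assumptions: a flat interest rate for discounting and compounding, and a known covariance matrix for the portfolio. The function names and the two-asset inputs are hypothetical.

```python
import math


def present_value(cashflow, rate, years):
    """Discount a future cashflow back to today at a flat annual rate."""
    return cashflow / (1.0 + rate) ** years


def future_value(principal, rate, years, periods_per_year=1):
    """Grow principal with interest compounded periods_per_year times a year."""
    periods = periods_per_year * years
    return principal * (1.0 + rate / periods_per_year) ** periods


def portfolio_return(weights, returns):
    """Portfolio return is the weighted average of asset returns."""
    return sum(w * r for w, r in zip(weights, returns))


def portfolio_volatility(weights, cov):
    """Portfolio volatility is sqrt(w' * Cov * w)."""
    n = len(weights)
    variance = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return math.sqrt(variance)


weights = [0.5, 0.5]
expected_returns = [0.08, 0.04]          # hypothetical annual returns
cov = [[0.04, 0.005], [0.005, 0.01]]     # vols of 20% and 10%, correlation 0.25

port_ret = portfolio_return(weights, expected_returns)
port_vol = portfolio_volatility(weights, cov)
print(present_value(110.0, 0.10, 1), future_value(100.0, 0.10, 2))
print(port_ret, port_vol, port_ret / port_vol)
```

The portfolio volatility comes out below the 15% weighted average of the two asset volatilities, which is the diversification benefit in numbers. The last printed ratio is a simple return per unit of risk.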


## Bootstrapping
Bootstrapping is a non-parametric technique for creating sample paths from an empirical distribution. The idea behind bootstrapping is that each realization we observe is in actuality just a draw from some unobserved density.
As bootstrapping works by re-sampling paths from an observed dataset, it is inherently a method for generating synthetic data. This synthetic data can be of great use in finance, where we are generally lacking data. Additionally, bootstrapping can be useful in that it gives a broad sense of the range of potential outcomes based on limited parametric assumptions about the observed
data.
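
A minimal sketch of the idea, assuming the simplest variant in which individual returns are re-sampled independently with replacement (this ignores any serial dependence in the data). The observed returns below are hypothetical placeholder values, not real market data.

```python
import random


def bootstrap_paths(observed, n_paths, path_length, rng):
    """Build synthetic paths by drawing observed returns with replacement."""
    return [
        [rng.choice(observed) for _ in range(path_length)]
        for _ in range(n_paths)
    ]


def cumulative_return(path):
    """Compound a path of periodic returns into a total return."""
    growth = 1.0
    for r in path:
        growth *= 1.0 + r
    return growth - 1.0


# Hypothetical monthly returns standing in for an observed dataset.
observed_returns = [
    0.021, -0.013, 0.008, 0.034, -0.027, 0.015,
    -0.004, 0.011, -0.019, 0.026, 0.003, -0.008,
]

rng = random.Random(11)
paths = bootstrap_paths(observed_returns, 1000, 12, rng)
outcomes = sorted(cumulative_return(p) for p in paths)

# Rough 5th and 95th percentiles of the one-year outcome distribution.
p05 = outcomes[int(0.05 * len(outcomes))]
p95 = outcomes[int(0.95 * len(outcomes))]
print(p05, p95)
```

The spread between the two percentiles gives a sense of the range of potential outcomes without assuming a parametric distribution for the returns.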

## Principal Component Analysis
Principal component analysis (PCA) is a matrix decomposition technique. It has particular appeal as it provides insight into the structure of the underlying matrix. It can help us identify, for example, the most important drivers of variance within a set of assets.
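
As a rough sketch, the largest driver of variance can be found by applying power iteration to a covariance matrix, which converges to the eigenvector with the largest eigenvalue (the first principal component). The covariance values are hypothetical, and power iteration is used here only to keep the example dependency-free; it assumes a distinct largest eigenvalue and a starting vector that is not orthogonal to the leading eigenvector.

```python
import math
import random


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def first_principal_component(cov, n_iter=200, seed=0):
    """Return (eigenvalue, unit eigenvector) of the dominant component."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in cov]  # random starting direction
    for _ in range(n_iter):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' * Cov * v gives the eigenvalue for unit-length v.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return eigenvalue, v


cov = [[0.04, 0.018], [0.018, 0.09]]  # hypothetical two-asset covariance matrix
eigenvalue, direction = first_principal_component(cov)

# Share of total variance (the trace) explained by the first component.
explained_share = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print(eigenvalue, direction, explained_share)
```

In practice a full eigendecomposition from a numerical library would return all components at once; the loop above only shows where the first one comes from.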



In the last two chapters, we have detailed the foundational tools for modeling the risk-neutral and physical measure, respectively. In many ways, the skillset in the two worlds appears quite different. In the risk-neutral world, we rely on hedging arguments, replication and must be well versed in stochastic calculus. In the physical measure, by contrast, these replication arguments no longer help us.

# Python: OOP
There are four main tenets of object-oriented programming: inheritance,
encapsulation, abstraction and polymorphism. Each is briefly summarized
in the following list:
* Inheritance: allows us to create class hierarchies and have classes inherit functionality from other classes.
* Encapsulation: an object’s internal representation is hidden from outside the object.
* Abstraction: exposing necessary functionality from an object while abstracting other details from the user of a class.
* Polymorphism: the ability of sub-classes to take on different forms, either defining their own behavior or inheriting behavior from a parent.
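
The four tenets can be seen together in a small class hierarchy. The `Option`, `Call` and `Put` classes below are hypothetical names chosen for illustration; note that Python enforces encapsulation by convention (a leading underscore) rather than by access modifiers.

```python
class Option:
    """Base class holding state and behavior shared by all options."""

    def __init__(self, strike):
        # Encapsulation: the underscore marks this as an internal detail.
        self._strike = strike

    @property
    def strike(self):
        """Read-only access to the strike."""
        return self._strike

    def payoff(self, spot):
        raise NotImplementedError("derived classes define the payoff")

    def describe(self, spot):
        # Abstraction: callers use describe() without knowing how each
        # payoff is computed.
        name = type(self).__name__
        return f"{name}(K={self._strike}) pays {self.payoff(spot)} at S={spot}"


class Call(Option):  # Inheritance: Call reuses __init__, strike and describe.
    def payoff(self, spot):
        return max(spot - self._strike, 0.0)


class Put(Option):
    def payoff(self, spot):
        return max(self._strike - spot, 0.0)


# Polymorphism: the same call dispatches to a different payoff per class.
options = [Call(100.0), Put(100.0)]
payoffs = [option.payoff(110.0) for option in options]
print(payoffs)
print(options[0].describe(110.0))
```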

## Data Structures

## Classes
Attributes & methods, global functions, operators, constructors & destructors

## Design Patterns
Design patterns are a specified set of coding structures or guidelines that lead to
re-usable, generic code. This set of design patterns is intended to make creating
objects, structuring them and communicating between them more seamless. The set of available design patterns is broken into three logical groups: creational,
structural and behavioral. A brief definition of each type of design pattern is
given below:
* **Creational:** deals with optimal creation/initialization of objects.
* **Structural:** deals with organizing classes to form larger objects that provide new functionality.
* **Behavioral:** deals with communications between two or more objects.

* **Abstract Base** classes are classes within an inheritance hierarchy that aren’t meant to be created themselves, but instead are meant to provide a shell with pre-determined functionality for derived classes to implement.
* **Factory Pattern** consists of a single global function that initializes and returns an instance of the appropriate base or derived class type based on a given input parameter. It is useful for helping to create a class within a class hierarchy with many derived classes. It is a creational design pattern where we enable the user to specify the type of class within a hierarchy that they would like to create, along with some required additional arguments to create the object.
* **Singleton Pattern** is also a creational pattern that ensures that only one instance of a class can be created each time the application is run. This paradigm is useful for things that we want to make sure are unique, such as connections to databases and log files.
* **Template Method** is a pattern in which we define a sketch of an algorithm in an abstract base class and leave certain details of the functionality to be defined by derived classes.
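
The four patterns above can be sketched in one short example. All class and function names here (`Payoff`, `make_payoff`, `Logger`) are hypothetical, and the singleton shown is just one of several ways to enforce a single instance in Python.

```python
from abc import ABC, abstractmethod


class Payoff(ABC):
    """Abstract base class: a shell that cannot be instantiated itself."""

    def evaluate(self, spot, strike):
        # Template method: the skeleton is fixed here, while the
        # formula step is left to derived classes.
        if spot < 0 or strike < 0:
            raise ValueError("spot and strike must be non-negative")
        return self._formula(spot, strike)

    @abstractmethod
    def _formula(self, spot, strike):
        """Derived classes supply the payoff formula."""


class CallPayoff(Payoff):
    def _formula(self, spot, strike):
        return max(spot - strike, 0.0)


class PutPayoff(Payoff):
    def _formula(self, spot, strike):
        return max(strike - spot, 0.0)


def make_payoff(kind):
    """Factory: a global function mapping an input to the right class."""
    registry = {"call": CallPayoff, "put": PutPayoff}
    if kind not in registry:
        raise ValueError(f"unknown payoff type: {kind!r}")
    return registry[kind]()


class Logger:
    """Singleton: every construction returns the same shared instance."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.messages = []
        return cls._instance

    def log(self, message):
        self.messages.append(message)


call_value = make_payoff("call").evaluate(110.0, 100.0)
Logger().log(f"call payoff: {call_value}")
print(call_value, Logger().messages)
```

Keeping the factory as a plain function with a dictionary registry means a new derived class only needs one new entry, rather than a change to every place objects are created.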

## Search Algorithms: Binary Search
The binary search algorithm is a way of searching for a value in an already sorted
array. We explore methods for sorting arrays in the next section. Binary search works
by splitting the sorted array in half at each step and checking whether the item in
the middle is above or below the search value. The half that cannot contain the value is then discarded.
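
A minimal iterative sketch of this procedure is shown below. Returning `-1` when the value is absent is a convention chosen for this example.

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


values = [1, 3, 5, 7, 9, 11]
print(binary_search(values, 7), binary_search(values, 4))
```

Because the search range halves at each step, the number of comparisons grows with the logarithm of the array length rather than with the length itself.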

## Sort Algorithms
* A **selection sort** algorithm is a naive sort algorithm that sorts by repeatedly finding the smallest value in the array and placing it at the beginning of the array. 
* An **insertion sort** works by iterating sequentially through the array elements and placing the chosen element in the appropriate place in the already sorted portion of the array.
* A **Bubble Sort** is a simple sorting algorithm that works by repeatedly swapping elements that are sorted incorrectly until all elements are sorted in the correct order. Because this algorithm only moves elements one place at a time, it is not an efficient algorithm and many comparisons need to be done in order for the algorithm to know it is finished.
* A **merge sort** is a more involved sorting algorithm based on a divide and conquer approach. The algorithm works by dividing the array in half, and merging the two halves after they have been sorted. This process is called recursively, that is, we break the array into smaller and smaller pieces until we are able to sort them.
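
The four algorithms above can be sketched as follows. Each function returns a new sorted list and leaves its input unchanged, which is a design choice for this example rather than a requirement of the algorithms.

```python
def selection_sort(values):
    """Repeatedly move the smallest remaining value to the front."""
    a = list(values)
    for i in range(len(a)):
        smallest = i
        for j in range(i + 1, len(a)):
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a


def insertion_sort(values):
    """Insert each element into the already sorted portion on its left."""
    a = list(values)
    for i in range(1, len(a)):
        current = a[i]
        j = i - 1
        while j >= 0 and a[j] > current:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = current
    return a


def bubble_sort(values):
    """Swap adjacent out-of-order pairs until a full pass makes no swaps."""
    a = list(values)
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(a) - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return a


def merge_sort(values):
    """Split in half, sort each half recursively, then merge the halves."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


data = [5, 2, 9, 1, 5, 6]
print(selection_sort(data), insertion_sort(data))
print(bubble_sort(data), merge_sort(data))
```

The first three algorithms need a number of comparisons that grows with the square of the array length in the worst case, whereas merge sort grows in proportion to n log n.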