## Quant Roles of Interest
* **Asset Management Quants:** A quant at an asset manager will focus heavily
on portfolio construction and portfolio optimization techniques. As such, an
understanding of the theory of optimization, and of the various ways to apply
it to investment portfolios, is a critical skill set. Asset management quants may also
be responsible for building alpha models and other signals, and in doing so will
leverage econometric modeling tools. Asset management quants will rely heavily on
the material covered in section IV of this text.
* **Data Scientist:** Financial technology is a burgeoning area of growth within the
finance industry and is a natural place for quants. Technological advances have led to
a proliferation of data over the last decade, creating new opportunities for quants
to analyze the data and try to extract signals from it. At these firms, quants generally
serve in data-scientist-type roles, applying machine learning techniques and solving big
data problems. For example, a quant may be responsible for building or applying
Natural Language Processing algorithms to parse company press releases and
extract a meaningful signal to market to buy-side institutions.

### Ito's Lemma
Ito’s formula, or Ito’s lemma, is perhaps the most important result in stochastic calculus. It is how we calculate differentials of functions of stochastic processes, and in that sense it is the stochastic calculus equivalent of the chain rule in ordinary calculus.
* Taylor series
* drift-diffusion model
* O(dy/dx)
* closed-form solution
* log-normally distributed
* correction term
* partial differential equations (PDEs)
* Feynman-Kac formula
* Girsanov’s theorem: In the realm of fixed income, and in particular interest rate modeling, we will find that the risk-neutral measure is not always the most convenient pricing measure. This is inherently due to interest rates being a stochastic quantity, making the discounting terms themselves stochastic.
* Radon-Nikodym derivative
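
The "correction term" and "log-normally distributed" items above can be illustrated numerically. The sketch below assumes a drift-diffusion model `dS = mu * S * dt + sigma * S * dW` (geometric Brownian motion) and simulates it with a simple Euler scheme. For this model Ito's lemma gives `d(log S) = (mu - sigma**2 / 2) * dt + sigma * dW`, so the average log return over a horizon `T` should be close to `(mu - sigma**2 / 2) * T` rather than the `mu * T` that the ordinary chain rule would suggest. The function name, parameter values, and seed are illustrative choices, not taken from the text.

```python
import math
import random


def simulate_mean_log_return(mu, sigma, horizon=1.0, n_steps=25,
                             n_paths=20000, seed=0):
    """Euler-simulate dS = mu*S*dt + sigma*S*dW and return the mean of log(S_T / S_0)."""
    rng = random.Random(seed)      # fixed seed so the run is reproducible
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        s = 1.0                    # S_0 = 1, so log(S_T / S_0) = log(S_T)
        for _ in range(n_steps):
            # One Euler step of the SDE; the Brownian increment is sqrt(dt) * Z.
            s *= 1.0 + mu * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        total += math.log(s)
    return total / n_paths


mu, sigma, horizon = 0.05, 0.20, 1.0   # illustrative parameter values
simulated = simulate_mean_log_return(mu, sigma, horizon)
ito_prediction = (mu - 0.5 * sigma ** 2) * horizon   # includes the Ito correction
naive_prediction = mu * horizon                      # ordinary chain rule, no correction

print(f"simulated mean log return: {simulated:.4f}")
print(f"with Ito correction:       {ito_prediction:.4f}")
print(f"without correction:        {naive_prediction:.4f}")
```

The simulated mean should land near the corrected value, up to Monte Carlo noise and a small discretization error from the Euler scheme.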

# Working with financial datasets
Some of the most common challenges that we face when working with financial
data include:
* Cleaning data in a robust, consistent way
* Handling gaps in different parts of our data
* Handling extremely large data sets
* Structuring the data in an optimal way
* Building robust data integrity checks to ensure high quality data

## Data Collection
Data sources, types, & quality
web scraping, databases

### Data sets
Pairing them with macroeconomic data
* Stock data: price/returns, time series
* Currency: exchange rates historical
* Futures: Curve, rolling (strategy) because of expiry date
* Options: multi-dimensional (different strike prices & expiries)
* Fixed Income:
    * Interest rates --> yield curve, a representation of the yields of different instruments (e.g. bonds or swaps)
    * Credit --> default rate curve, used to infer the probability that a firm will default at various times
    
### Data Sources
* **Yahoo Finance:** Perhaps the most popular free source of data for aspiring quants. Mainly consists of stock price data with some futures data as well.
* **Ken French’s Website:** Useful historical datasets of the returns for Fama-French factors.
* **FRED:** Federal Reserve Bank of St. Louis website. Contains a significant amount of historical data on economic variables, such as GDP, employment and credit spreads.
* **Treasury.gov:** Historical yield curve data for the US.
* **Quandl:** Contains equity market data as well as data on futures in different asset classes.
* **HistData:** Contains free data from FX markets, including intra-day data.
* **OptionMetrics:** Contains relatively clean options and futures data for equity and other markets. A great, but costly source of options data.
* **CRSP:** A broad, robust, historical database that doesn’t suffer from survivorship bias. Equity prices and other datasets are available via CRSP.

### Cleaning procedure
* **Proper handling of corporate actions**: dividends, stock splits, mergers and acquisitions.
* **Avoiding survivorship bias**
* **Detecting Arbitrage in the Data**: With respect to options data, there is a set of arbitrage conditions that must hold for options of different strikes:
    * Call prices must be monotonically decreasing in strike.
    * Put prices must be monotonically increasing in strike.
    * The rate of change with respect to strike for call prices must be greater than −1 and less than 0.
    * The rate of change with respect to strike for put prices must be greater than 0 and less than 1.
    * Call and put prices must be convex with respect to changes in strike.
* Interpolation & filling forward
* Filling via Regression
* Bootstrapping
* Outlier Detection
    * Single vs. Multi-variate
    * Plotting
    * Standard deviation
    * Density Analysis
    * Distance from kNN
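
The strike arbitrage conditions listed above for call options can be checked mechanically on a single expiry. The following is a minimal sketch, assuming quotes arrive as parallel lists of distinct strikes and call prices. The function name `check_call_strike_arbitrage`, the violation labels, and the sample quotes are hypothetical. It uses finite differences between adjacent strikes and flags only clear violations (weak inequalities with a small tolerance), which is a simplification of the strict bounds stated in the list.

```python
def check_call_strike_arbitrage(strikes, call_prices, tol=1e-12):
    """Return a list of (label, low_strike, high_strike) violations for one expiry.

    Assumes strikes are distinct; duplicates would cause a division by zero.
    """
    pairs = sorted(zip(strikes, call_prices))   # order quotes by strike
    ks = [k for k, _ in pairs]
    cs = [c for _, c in pairs]

    violations = []
    slopes = []
    for i in range(len(ks) - 1):
        # Finite-difference rate of change of the call price in strike.
        slope = (cs[i + 1] - cs[i]) / (ks[i + 1] - ks[i])
        slopes.append(slope)
        if slope > tol:
            # Call prices must not increase with strike.
            violations.append(("monotonicity", ks[i], ks[i + 1]))
        if slope < -1.0 - tol:
            # The price cannot fall faster than the strike rises.
            violations.append(("slope_below_minus_one", ks[i], ks[i + 1]))

    for i in range(len(slopes) - 1):
        # Convexity: successive slopes must be non-decreasing.
        if slopes[i + 1] < slopes[i] - tol:
            violations.append(("convexity", ks[i], ks[i + 2]))
    return violations


# Illustrative quotes (made up for this example).
clean = check_call_strike_arbitrage([90, 100, 110], [12.0, 6.0, 2.5])
rising = check_call_strike_arbitrage([90, 100, 110], [12.0, 6.0, 6.5])
concave = check_call_strike_arbitrage([90, 100, 110], [12.0, 9.0, 2.5])
print(clean, rising, concave)
```

The analogous checks for puts flip the signs: slopes between 0 and 1, with the same convexity test.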
    
## Model Validation
* Model Documentation: prevailing theory, relevant literature, model parameters, algorithm logic, data sourcing/quality/preparation, assumptions
* Code Review: user-friendly, reproducible/scalable
* Unit Tests
* Production Model Change Process