# DataCamp

## Data Engineering

#### Introduction to Data Engineering

#### Building Data Engineering Pipelines in Python

## Programming

#### <span style="color:green">**Introduction to Python**</span>

#### Intermediate Python for Data Science

#### Python Data Science Toolbox (Part 1)

#### Python Data Science Toolbox (Part 2)

#### Writing Functions in Python

#### Data Types for Data Science

#### Writing Efficient Python Code

#### Working with Dates and Times in Python

#### Optimizing Python Code with pandas

#### Object-Oriented Programming in Python

#### Introduction to Data Science in Python

#### Software Engineering for Data Scientists in Python

#### Unit Testing for Data Science in Python

#### Command Line Automation in Python

#### Creating Robust Python Workflows

#### Introduction to PySpark

#### Big Data Fundamentals with PySpark

#### Python for Spreadsheet Users

#### Python for R Users

## Importing and Cleaning Data

#### Importing Data in Python (Part 1)

#### Importing Data in Python (Part 2)

#### Cleaning Data in Python

#### Streamlined Data Ingestion with pandas

#### Cleaning Data with PySpark

## Data Manipulation

#### pandas Foundations

#### Manipulating DataFrames with pandas

#### Manipulating Time Series Data in Python

#### Merging DataFrames with pandas

#### Introduction to Databases in Python

#### Analyzing Social Media Data in Python

#### Introduction to Spark SQL in Python

#### Feature Engineering with PySpark

#### Parallel Computing with Dask

#### pandas Joins for Spreadsheet Users

#### Analyzing IoT Data in Python

## Data Visualization

#### Introduction to Data Visualization with Python

#### Interactive Data Visualization with Bokeh

#### Introduction to Matplotlib

#### Introduction to Seaborn

#### Data Visualization with Seaborn

#### Visualizing Time Series Data in Python

#### Improving Your Data Visualization in Python

## Probability and Statistics

#### Statistical Thinking in Python (Part 1)

#### Statistical Thinking in Python (Part 2)

#### Network Analysis in Python (Part 1)

#### Network Analysis in Python (Part 2)

#### Time Series Analysis in Python

#### Introduction to Linear Modeling in Python

#### Statistical Simulation in Python

#### Case Studies in Statistical Thinking

#### Generalized Linear Models in Python

#### Foundations of Probability in Python

#### Experimental Design in Python

## Machine Learning

#### Supervised Learning with scikit-learn

#### Introduction to Deep Learning in Python

#### Machine Learning for Marketing in Python

#### Unsupervised Learning in Python

#### Machine Learning with Tree-Based Models in Python

#### Linear Classifiers in Python

#### Extreme Gradient Boosting with XGBoost

#### Introduction to TensorFlow in Python

#### Machine Learning for Time Series Data in Python

#### Preprocessing for Machine Learning in Python

#### Introduction to Deep Learning with Keras

#### Machine Learning for Finance in Python

#### Clustering Methods with SciPy

#### Advanced Deep Learning with Keras

#### Foundations of Predictive Analytics in Python (Part 1)

#### Foundations of Predictive Analytics in Python (Part 2)

#### Dimensionality Reduction in Python

#### Introduction to Machine Learning with PyTorch

#### Forecasting Using ARIMA Models in Python

#### Feature Engineering for Machine Learning in Python

#### Advanced NLP with spaCy

#### Machine Learning with PySpark

#### Feature Engineering for NLP in Python

#### AI Fundamentals

#### Hyperparameter Tuning in Python

#### Model Validation in Python

#### Sentiment Analysis in Python

#### Building Recommendation Engines with PySpark

#### Designing Machine Learning Workflows in Python

#### Ensemble Methods in Python

#### Machine Translation in Python

## Applied Finance

#### Intro to Python for Finance

#### Importing and Managing Financial Data in Python

#### Intro to Portfolio Risk Management in Python

#### Introduction to Portfolio Analysis in Python

#### Introduction to Financial Concepts in Python

#### Financial Forecasting in Python

# <span style="color:red">**Advanced Portfolio Construction and Analysis with Python**</span>

***

## <span style="color:blue">**Style and Factors**</span>

***

#### Video: Welcome Video
**Develop programming skills and expertise in data science**  
**Transfer expertise in programming and/or data science to the investment industry**  
**Manage investments better and with greater confidence**  

#### Video: Introduction to Factor Investing
**Indexation - creating new indices that capture a portion of active management; they are rules-based, systematic, and designed to outperform the cap-weighted benchmark in the long run**  
Factor - a variable that influences asset returns  
Factors have an extrinsic commonality and impact on asset returns  
Exposure to factor risk is rewarded over the long run, and that reward is quantified by a factor risk premium  
**Three Categories of Factors**  
Macro - Growth, Inflation, etc. (not generally used in investment management)  
Statistical - something extracted from the data that may or may not be identifiable  
Style (Intrinsic) - Value-Growth, Momentum, Low Volatility (research has shown that everybody is a factor investor)

#### Video: Factor Models and the CAPM
**General form of a factor model**  
R(i) = B(1)f(1) + B(2)f(2) + ... + B(n)f(n) + alpha + epsilon  
**CAPM:** E[r(i)] - r(f) = cov(r(i), r(m)) / var(r(m)) * (E[r(m)] - r(f)) = B(i) * (E[r(m)] - r(f))  
The CAPM does a fairly poor job of explaining anomalies observed in the data  
Fama-French multi-factor model - an extension of the CAPM - tries to explain away these anomalies
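
The CAPM formula above can be checked numerically. The following is a minimal sketch, assuming returns are held in NumPy arrays; the function names `capm_beta` and `capm_expected_return` and all numbers are made up for illustration and do not come from the course.

```python
import numpy as np

def capm_beta(asset_returns, market_returns):
    """Beta = cov(r_i, r_m) / var(r_m), using sample (ddof=1) estimates."""
    asset_returns = np.asarray(asset_returns, dtype=float)
    market_returns = np.asarray(market_returns, dtype=float)
    cov = np.cov(asset_returns, market_returns, ddof=1)[0, 1]
    return cov / np.var(market_returns, ddof=1)

def capm_expected_return(beta, rf, expected_market_return):
    """E[r_i] = r_f + beta * (E[r_m] - r_f)."""
    return rf + beta * (expected_market_return - rf)

# Illustrative data: the asset moves exactly 1.5x the market
market_returns = np.array([0.01, -0.02, 0.03, 0.015, -0.005])
asset_returns = 1.5 * market_returns

beta = capm_beta(asset_returns, market_returns)
expected = capm_expected_return(beta, rf=0.02, expected_market_return=0.08)
print(beta, expected)
```

Because the illustrative asset is an exact multiple of the market, the estimated beta equals that multiple; with real data the estimate would carry sampling noise.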

#### Video: Multi-Factor Models and Fama-French

The Fama-French model is an extension of the CAPM  
The FF model includes Size (market cap) and Value (price-to-book) as additional factors  
Momentum is commonly included as an additional factor  
FF --> E[r(i)] = r(f) + Beta(i, mkt)(E[r(m)] - r(f)) + Beta(i, smb)E[smb] + Beta(i, hml)E[hml]
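
The betas in the Fama-French equation are usually estimated by regressing an asset's excess returns on the factor returns. Below is a minimal sketch using ordinary least squares via `np.linalg.lstsq`. The helper name `factor_loadings`, the synthetic factor data, and the "true" betas are assumptions for illustration only, not course data.

```python
import numpy as np

def factor_loadings(asset_excess, factors):
    """OLS regression of excess returns on factor returns.

    Returns (alpha, betas), where betas has one entry per factor column.
    """
    y = np.asarray(asset_excess, dtype=float)
    X = np.asarray(factors, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef[0], coef[1:]

# Synthetic monthly factor returns; columns stand in for Mkt-RF, SMB, HML
rng = np.random.default_rng(0)
factors = rng.normal(0.0, 0.03, size=(120, 3))

# Hypothetical asset built without noise, so the loadings are recovered exactly
true_betas = np.array([1.1, 0.4, -0.2])
asset_excess = 0.001 + factors @ true_betas

alpha, betas = factor_loadings(asset_excess, factors)
print(alpha, betas)
```

With real return data the fit would not be exact, and the intercept would be the estimate of alpha left unexplained by the three factors.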

#### Video: Factor Benchmarks and Style Analysis
R(m) = W(1)R(i,1) + W(2)R(i,2) + W(3)R(i,3) + alpha + epsilon  
  W(i) >= 0 and sum(W(i)) = 1  
Solved through quadratic programming and repeated over a sliding 1-3 year window  
Quality of fit can be measured through the pseudo R^2  
Manager value added can also be assessed  
Manager style can be inferred
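
The constrained regression above can be sketched with a general-purpose optimizer. This example assumes SciPy is available and uses `scipy.optimize.minimize` with the SLSQP method as a stand-in for a dedicated quadratic programming solver. The function name `style_analysis`, the synthetic index returns, and the manager's "true" weights are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def style_analysis(manager_returns, index_returns):
    """Find long-only weights summing to 1 that best replicate the manager."""
    y = np.asarray(manager_returns, dtype=float)
    X = np.asarray(index_returns, dtype=float)
    n = X.shape[1]

    def squared_error(w):
        # Sum of squared differences between manager and replicating portfolio
        return np.sum((y - X @ w) ** 2)

    result = minimize(
        squared_error,
        x0=np.full(n, 1.0 / n),                # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,               # no short positions
        constraints=({"type": "eq", "fun": lambda w: np.sum(w) - 1.0},),
        options={"ftol": 1e-12},
    )
    return result.x

# Synthetic example: a manager who is exactly a 60/30/10 mix of three indices
rng = np.random.default_rng(42)
index_returns = rng.normal(0.005, 0.04, size=(60, 3))
true_weights = np.array([0.6, 0.3, 0.1])
manager_returns = index_returns @ true_weights

weights = style_analysis(manager_returns, index_returns)
print(weights.round(3))
```

Running the same function over a rolling window of returns would give the time-varying style exposures described in the notes.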

#### Video: Shortcomings of Cap-Weighted Indices
EW (equal-weighted) beats CW (cap-weighted)  
CW indices tend to be heavily concentrated, poorly diversified portfolios; this leads to inefficient diversification of unrewarded, specific risks  
Smart (weighted) benchmarks have been introduced to fix the problem  
These include - equal weighted, min-var, risk-parity
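
To make the concentration point concrete, here is a small sketch comparing cap weights with equal weights. The market caps are invented. The "effective number of constituents" measure, 1 / sum(w_i^2), is one common way to quantify concentration and is brought in here as an assumption; it is not taken from the notes above.

```python
import numpy as np

def cap_weights(market_caps):
    """Weights proportional to market capitalization."""
    caps = np.asarray(market_caps, dtype=float)
    return caps / caps.sum()

def equal_weights(n):
    """Each of the n constituents gets weight 1/n."""
    return np.full(n, 1.0 / n)

def effective_n(weights):
    """Effective number of constituents: 1 / sum(w_i^2)."""
    w = np.asarray(weights, dtype=float)
    return 1.0 / np.sum(w ** 2)

# Invented market caps: one giant stock and three small ones
caps = np.array([800.0, 100.0, 50.0, 50.0])
cw = cap_weights(caps)
ew = equal_weights(len(caps))

print(effective_n(cw), effective_n(ew))
```

In this toy index the cap-weighted portfolio behaves like fewer than two equally sized positions, while the equal-weighted one uses all four.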

#### Video: From Cap-Weighted Benchmarks to Smart-Weighted Benchmarks  
CW indices provide an inefficient exposure to rewarded systematic risks  
In other words, holding the CW leads to paying for instead of cashing in on the risk premia associated with factor exposure  
Smart factor indices - (1) select your desired factor exposures

#### Video: Module 1 Lab Session - Foundations

```python
import pandas as pd

# Daily BRK.A returns, indexed by date
brka_d = pd.read_csv("data/brka_d_ret.csv", parse_dates=True, index_col=0)
brka_d.head()
```

| DATE | BRKA |
|---|---|
| 2018-12-24 | -0.018611 |
| 2018-12-26 | 0.0432 |
| 2018-12-27 | 0.012379 |
| 2018-12-28 | 0.013735 |
| 2018-12-31 | 0.011236 |

```python
import edhec_risk_kit_201 as erk

# IPython magics: reload edited modules automatically (notebook only)
%load_ext autoreload
%autoreload 2

# Compound the daily returns within each month, then label rows by month
brka_m = brka_d.resample("M").apply(erk.compound).to_period("M")
brka_m.head()
```

| DATE | BRKA |
|---|---|
| 1990-01 | -0.140634 |
| 1990-02 | -0.030852 |
| 1990-03 | -0.069204 |
| 1990-04 | -0.003717 |
| 1990-05 | 0.067164 |

#### Discussion Prompt: Cap-Weighted Indices and Equity Benchmarks

#### Quiz: Module 1 Graded Quiz

## <span style="color:blue">**Robust Estimates for the Covariance Matrix**</span>

***

#### Reading: Module 2 Key Points

#### Video: The Curse of Dimensionality

#### Video: Estimating the Covariance Matrix with a Factor Model

#### Video: Honey I Shrunk the Covariance Matrix!

#### Video: Portfolio Construction with Time-Varying Risk Parameters

#### Video: Exponentially Weighted Average

#### Video: ARCH and GARCH Models

#### Video: Module 2 Lab Session - Covariance Estimation

#### Discussion Prompt: Covariance Matrix Estimation

#### Quiz: Module 2 Graded Quiz

## <span style="color:blue">**Robust Estimates for Expected Returns**</span>

***

#### Reading: Module 3 Key Points

#### Video: Lack of Robustness of Expected Return Estimates

#### Video: Agnostic Priors on Expected Return Estimates

#### Video: Using Factor Models to Estimate Expected Returns

#### Video: Extracting Implied Expected Returns

#### Video: Introducing Active Views

#### Video: Black-Litterman Analysis

#### Reading: The Intuition Behind Black-Litterman Model Portfolios

#### Video: Module 3 Lab Session - Black-Litterman

#### Discussion Prompt: Expected Returns

#### Quiz: Module 3 Graded Quiz

## <span style="color:blue">**Portfolio Optimization in Practice**</span>

***

#### Reading: Module 4 Key Points

#### Reading: Survey - Alternative Equity Beta Investing

#### Video: Naive Diversification

#### Video: Scientific Diversification

#### Video: Measuring Risk Contributions

#### Video: Simplified Risk Parity Portfolio

#### Video: Risk Parity Portfolio

#### Video: Comparing Diversification Options

#### Video: Module 4 Lab Session - Risk Contribution and Risk Parity

#### Reading: Dive into Heuristic Diversification

#### Discussion Prompt: Portfolio Construction Methodologies

#### Quiz: Module 4 Graded Quiz

#### Reading: To Be Continued (2)

# <span style="color:blue">**DataCamp (Non-Essential)**</span>

#### Python for MATLAB Users

#### Web Scraping in Python

#### Analyzing Police Activity with pandas

#### Regular Expressions in Python

#### Dealing with Missing Data in Python

#### Preparing for Coding Interview Questions in Python

#### Credit Risk Modeling in Python

#### Machine Learning with the Experts: School Budgets

#### Exploratory Data Analysis in Python

#### Predicting Customer Churn in Python

#### Analyzing Marketing Campaigns with pandas

#### Analyzing US Census Data in Python

#### HR Analytics in Python: Predicting Employee Churn

#### Customer Segmentation in Python

#### Introduction to Natural Language Processing in Python

#### Building Chatbots in Python

#### Convolutional Neural Networks for Image Processing

#### Image Processing in Python

#### Preparing for Machine Learning Interview Questions in Python

#### Winning a Kaggle Competition in Python

#### Fraud Detection in Python

#### Supply Chain Analytics in Python

#### Recurrent Neural Networks for Language Modeling in Python

#### Preparing for Statistics Interview Questions in Python

#### Customer Analytics and A/B Testing in Python

#### Visualizing Geospatial Data in Python

#### Introduction to MongoDB in Python

#### Biomedical Image Analysis in Python

#### Working with Geospatial Data in Python

#### Spoken Language Processing in Python

#### Introduction to AWS Boto in Python