## Harnessing factor data using pandas_datareader

In [1]:
#!pip install pandas_datareader


In [2]:
import warnings

In [3]:
import pandas_datareader as pdr
from IPython.display import display

In [4]:
warnings.filterwarnings("ignore")

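Fetch the Fama-French research data factors and store them in 'factors'. The result is a dictionary: the integer keys hold DataFrames (here 0 for the monthly factors and 1 for the annual factors), and the "DESCR" key holds a text description of the dataset. With no dates given, only recent history is returned (59 monthly rows in this run).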

In [13]:
factors = pdr.get_data_famafrench(
    "F-F_Research_Data_Factors",
)
factors

{0:          Mkt-RF   SMB    HML    RF
 Date                              
 2020-02   -8.13  1.07  -3.80  0.12
 2020-03  -13.39 -4.79 -13.88  0.13
 2020-04   13.65  2.45  -1.34  0.00
 2020-05    5.58  2.49  -4.85  0.01
 2020-06    2.46  2.69  -2.23  0.01
 2020-07    5.77 -2.30  -1.44  0.01
 2020-08    7.63 -0.28  -2.88  0.01
 2020-09   -3.63 -0.03  -2.65  0.01
 2020-10   -2.10  4.27   4.31  0.01
 2020-11   12.47  5.72   2.15  0.01
 2020-12    4.63  4.79  -1.34  0.01
 2021-01   -0.03  7.50   2.85  0.01
 2021-02    2.78  2.07   7.10  0.00
 2021-03    3.08 -2.28   7.27  0.00
 2021-04    4.93 -3.20  -0.95  0.00
 2021-05    0.29 -0.27   7.13  0.00
 2021-06    2.75  1.60  -7.75  0.00
 2021-07    1.27 -3.94  -1.81  0.00
 2021-08    2.91 -0.46  -0.10  0.00
 2021-09   -4.37  0.67   5.10  0.00
 2021-10    6.65 -2.37  -0.45  0.00
 2021-11   -1.55 -1.32  -0.41  0.00
 2021-12    3.10 -1.64   3.22  0.01
 2022-01   -6.25 -5.96  12.80  0.00
 2022-02   -2.29  2.19   3.10  0.00
 2022-03    3.06 -1.66  -

In [6]:
display(factors["DESCR"])

'F-F Research Data Factors\n-------------------------\n\nThis file was created by CMPT_ME_BEME_RETS using the 202412 CRSP database. The 1-month TBill rate data until 202405 are from Ibbotson Associates. Starting from 202406, the 1-month TBill rate is from ICE BofA US 1-Month Treasury Bill Index. Copyright 2024 Eugene F. Fama and Kenneth R. French\n\n  0 : (59 rows x 4 cols)\n  1 : Annual Factors: January-December (5 rows x 4 cols)'
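The description string lists each table in the returned dictionary by its integer key. As a minimal sketch, the helper below pulls those index lines out of a DESCR string with a regular expression. `list_tables` is a hypothetical name, not part of pandas_datareader, and the pattern assumes the `N : label (R rows x C cols)` layout seen above; other datasets may format their descriptions differently.

```python
import re

# Shortened copy of the DESCR text shown above (the provenance paragraph is left out).
descr = (
    "F-F Research Data Factors\n-------------------------\n\n"
    "  0 : (59 rows x 4 cols)\n"
    "  1 : Annual Factors: January-December (5 rows x 4 cols)"
)


def list_tables(descr_text):
    """Map each table key to (label, rows, cols) parsed from a DESCR string."""
    pattern = re.compile(
        r"^\s*(\d+)\s*:\s*(.*?)\s*\((\d+) rows x (\d+) cols\)\s*$"
    )
    tables = {}
    for line in descr_text.splitlines():
        match = pattern.match(line)
        if match:
            key, label, rows, cols = match.groups()
            tables[int(key)] = (label, int(rows), int(cols))
    return tables


tables = list_tables(descr)
for key, (label, rows, cols) in tables.items():
    # The monthly table carries no label in this description.
    print(key, label or "(unlabeled)", rows, cols)
```

This makes it easy to check which key holds which frequency before indexing into 'factors'.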

Display the first five rows of the monthly dataset, factors[0]

In [7]:
data = factors[0].head()

In [8]:
display(data)

         Mkt-RF   SMB    HML    RF
Date                              
2020-02   -8.13  1.07  -3.80  0.12
2020-03  -13.39 -4.79 -13.88  0.13
2020-04   13.65  2.45  -1.34  0.00
2020-05    5.58  2.49  -4.85  0.01
2020-06    2.46  2.69  -2.23  0.01


Display the first five rows of the annual dataset, factors[1]

In [9]:
data = factors[1].head()

In [10]:
display(data)

      Mkt-RF    SMB    HML    RF
Date                            
2020   23.66  12.72 -46.10  0.45
2021   23.57  -3.78  25.39  0.04
2022  -21.58  -7.04  25.97  1.43
2023   21.69  -3.28 -13.70  4.95
2024   19.78 -11.13  -9.10  5.26
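The factor values are quoted in percent, and Mkt-RF is the market return in excess of the risk-free rate RF. A short sketch of two common first steps follows: converting to decimal returns and adding RF back to recover the total market return. The `annual` frame here is a hand-typed stand-in for factors[1] so the example runs without a network call, and the `Mkt` column name is an illustrative choice, not something the library provides.

```python
import pandas as pd

# Stand-in for factors[1]: the annual rows shown above, typed in by hand.
annual = pd.DataFrame(
    {
        "Mkt-RF": [23.66, 23.57, -21.58, 21.69, 19.78],
        "SMB": [12.72, -3.78, -7.04, -3.28, -11.13],
        "HML": [-46.10, 25.39, 25.97, -13.70, -9.10],
        "RF": [0.45, 0.04, 1.43, 4.95, 5.26],
    },
    index=pd.Index([2020, 2021, 2022, 2023, 2024], name="Date"),
)

# Values are in percent; divide by 100 to get decimal returns.
annual_dec = annual / 100

# Mkt-RF is an excess return, so adding RF back gives the total market return.
annual_dec["Mkt"] = annual_dec["Mkt-RF"] + annual_dec["RF"]

print(annual_dec[["Mkt-RF", "RF", "Mkt"]].round(4))
```

Working in decimals avoids scale mistakes when the factors are later combined with asset returns that are not expressed in percent.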


Fetch the Fama-French research data factors for a specified date range (January 2000 through December 2023) and store them in 'factors'

In [14]:
factors = pdr.get_data_famafrench(
    "F-F_Research_Data_Factors", start="2000-01-01", end="2023-12-31"
)

In [15]:
display(factors)

{0:          Mkt-RF    SMB   HML    RF
 Date                              
 2000-01   -4.74   5.77 -1.88  0.41
 2000-02    2.45  21.36 -9.59  0.43
 2000-03    5.20 -17.20  8.13  0.47
 2000-04   -6.40  -6.68  7.26  0.46
 2000-05   -4.42  -6.05  4.75  0.50
 ...         ...    ...   ...   ...
 2023-08   -2.39  -3.20 -1.08  0.45
 2023-09   -5.24  -2.49  1.45  0.43
 2023-10   -3.18  -3.88  0.19  0.47
 2023-11    8.83  -0.03  1.66  0.44
 2023-12    4.87   6.36  4.92  0.43
 
 [288 rows x 4 columns],
 1:       Mkt-RF    SMB    HML    RF
 Date                            
 2000  -17.60  -4.60  44.98  5.89
 2001  -15.21  18.16  18.52  3.83
 2002  -22.76   4.39   8.09  1.65
 2003   30.75  26.49   4.67  1.02
 2004   10.72   4.45   7.61  1.20
 2005    3.09  -2.36   9.41  2.98
 2006   10.60   0.09  11.93  4.80
 2007    1.04  -7.44 -17.18  4.66
 2008  -38.34   2.40   1.05  1.60
 2009   28.26   9.18  -9.65  0.10
 2010   17.37  14.15  -5.15  0.12
 2011    0.44  -5.73  -8.41  0.04
 2012   16.27  -1.40  1
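Once a long history has been fetched, narrower windows can be cut locally instead of calling the data source again. The sketch below assumes the monthly frame is indexed by monthly periods, which matches the `2000-01` style labels in the output above but is not confirmed here. The `monthly` frame is a hand-typed stand-in for factors[0] holding the first five rows of that output.

```python
import pandas as pd

# Stand-in for factors[0]: the first five monthly rows shown above.
# Assumption: the real frame is also indexed by monthly periods.
monthly = pd.DataFrame(
    {
        "Mkt-RF": [-4.74, 2.45, 5.20, -6.40, -4.42],
        "SMB": [5.77, 21.36, -17.20, -6.68, -6.05],
        "HML": [-1.88, -9.59, 8.13, 7.26, 4.75],
        "RF": [0.41, 0.43, 0.47, 0.46, 0.50],
    },
    index=pd.period_range("2000-01", periods=5, freq="M", name="Date"),
)

# Label-based slicing on a PeriodIndex includes both endpoints.
window = monthly.loc["2000-02":"2000-04"]
print(window)
```

The same `.loc` pattern applies to a full 288-row frame, so one broad fetch can serve several sub-period analyses.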

**Jason Strimpel** is the founder of <a href='https://pyquantnews.com/'>PyQuant News</a> and co-founder of <a href='https://www.tradeblotter.io/'>Trade Blotter</a>. His career in algorithmic trading spans 20+ years. He previously traded for a Chicago-based hedge fund, was a risk manager at JPMorgan, and managed production risk technology for an energy derivatives trading firm in London. In Singapore, he served as APAC CIO for an agricultural trading firm and built the data science team for a global metals trading firm. Jason holds degrees in Finance and Economics and a Master's in Quantitative Finance from the Illinois Institute of Technology. His career spans America, Europe, and Asia. He shares his expertise through the <a href='https://pyquantnews.com/subscribe-to-the-pyquant-newsletter/'>PyQuant Newsletter</a> and social media, and has taught algorithmic trading with Python to over 1,000 students in his popular course **<a href='https://gettingstartedwithpythonforquantfinance.com/'>Getting Started With Python for Quant Finance</a>**. All code is for educational purposes only. Nothing provided here is financial advice. Use at your own risk.