<a href="https://colab.research.google.com/github/boyerb/Investments/blob/master/Ex07-WRDS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Investments: Theory, Fundamental Analysis, and Data Driven Analytics**, Bates, Boyer, and Fletcher

# Example Chapter 7: The WRDS API
In this example we illustrate how you can access CRSP data using the WRDS API. You will first need to obtain a WRDS account through your institution with a username and password, and set up dual authentication through the WRDS site.

### Imports and Setup
We install and import four packages. Two of them, the helper module `simple_finance.py` and the `wrds` package, are not included in Colab by default.

* We use the `!curl` command to download the custom helper module `simple_finance.py` from the course GitHub repo. By packaging code into reusable functions, this helper module keeps the Python examples clean, minimizes distractions from secondary coding details, and keeps the focus on the core objectives. You can see the complete file and code at [`https://github.com/boyerb/Investments/blob/master/functions/simple_finance.py`](https://github.com/boyerb/Investments/blob/master/functions/simple_finance.py).

* We also use the `!pip install`  command to install the WRDS package.  Libraries like `NumPy` and `Pandas` are already installed in most Python environments, so they can be imported directly without any extra steps.
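
Before installing anything, it can help to check which packages the current environment already provides. The snippet below is a minimal sketch that uses the standard library's `importlib.util.find_spec`; the helper name `is_installed` is our own and is not part of `simple_finance.py`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# Report which of the packages used in this example are already available.
for name in ["numpy", "pandas", "wrds"]:
    status = "available" if is_installed(name) else "missing (install with pip)"
    print(f"{name}: {status}")
```

In a fresh Colab session we would expect `wrds` to be reported as missing, which is why the `!pip install` step is needed.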

In [1]:
!curl -O https://raw.githubusercontent.com/boyerb/Investments/master/functions/simple_finance.py
import simple_finance as sf
!pip -q install wrds
import wrds
import numpy as np
import pandas as pd



### Establish a WRDS Connection
When you run this block, you will be prompted to enter your WRDS username and password and then to complete dual authentication on your phone. You will then be asked whether you want to create a `.pgpass` file. In local environments such as PyCharm or Visual Studio, a `.pgpass` file stored on your PC allows you to use the WRDS API without re-entering your credentials each time. In Google Colab, however, the file is saved in the temporary home directory (`/root/.pgpass`) and is erased whenever you start a new session, so you can simply respond `n`.

In [2]:
db = wrds.Connection()

Enter your WRDS username [root]:boyerdude1
Enter your password:··········
WRDS recommends setting up a .pgpass file.
Create .pgpass file now [y/n]?: n
You can create this file yourself at any time with the create_pgpass_file() function.
Loading library list...
Done
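
If you work in a local environment and are unsure whether a `.pgpass` file is already in place, you can check for one. The sketch below assumes a Unix-like home directory, the standard PostgreSQL password-file layout (`hostname:port:database:username:password`), and that the WRDS server is listed under the host name `wrds-pgdata.wharton.upenn.edu`. The helper `has_wrds_entry` is hypothetical and is not part of the `wrds` package.

```python
from pathlib import Path

# Assumption: host name under which the WRDS PostgreSQL server appears in .pgpass.
WRDS_HOST = "wrds-pgdata.wharton.upenn.edu"


def has_wrds_entry(pgpass_text: str) -> bool:
    """Return True if any non-comment line lists WRDS_HOST in its host field."""
    for line in pgpass_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        host = line.split(":", 1)[0]
        if host == WRDS_HOST:
            return True
    return False


# Read the file only if it exists; otherwise treat it as empty.
pgpass_file = Path.home() / ".pgpass"
text = pgpass_file.read_text() if pgpass_file.is_file() else ""
print(f"{pgpass_file}: WRDS entry found = {has_wrds_entry(text)}")
```

In Colab this check will normally report that no entry was found at the start of each new session, consistent with the temporary home directory described above.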


### Controlling DataFrame Display
By default, Pandas may hide some columns or wrap output across multiple lines when displaying a wide DataFrame. The settings below change this behavior so that wide DataFrames display in full.

In [3]:
pd.set_option('display.max_columns', None)   # Show all columns without truncation
pd.set_option('display.width', 1000)   # Set the display width so output stays on one line
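
The `pd.set_option` calls change the display settings for the rest of the session. If you prefer to change them only for a single block of code, Pandas also provides the `pd.option_context` context manager. The sketch below uses a made-up 30-column DataFrame (`wide`) purely for illustration.

```python
import pandas as pd

# Made-up wide DataFrame with 30 columns, used only to illustrate the display settings.
wide = pd.DataFrame({f"col_{i}": [i, i + 1] for i in range(30)})

default_max_columns = pd.get_option("display.max_columns")

# Inside the block, all columns are shown and the width limit is raised.
with pd.option_context("display.max_columns", None, "display.width", 1000):
    inside_max_columns = pd.get_option("display.max_columns")
    wide_repr = repr(wide)
    print(wide_repr)

# Outside the block, the previous setting is restored automatically.
restored_max_columns = pd.get_option("display.max_columns")
```

This keeps the wider display local to the places where you need it and leaves the defaults untouched elsewhere.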

### Download Data
After creating a list of identifiers and defining the start and end dates for the data you wish to use, the `sf.get_crsp_msf_by_ids` function will pull the data for you.  

**Inputs**
* WRDS database connection object (`db`)
* List of identifiers
* Start date
* End date  

**Output**
* Pandas DataFrame with the desired data

Identifiers can be either **tickers** or **PERMNOs**. The function will automatically detect what kind of identifier you are using. Because tickers can change over time, the choice of identifier matters when pulling data over long periods. To address this, CRSP assigns each security a permanent identifier, the PERMNO, that does not change. In addition, WRDS provides an [online tool](https://wrds-www.wharton.upenn.edu/pages/get-data/center-research-security-prices-crsp/annual-update/tools/translate-to-permcopermno/) to translate tickers into PERMCO/PERMNO identifiers. The CRSP PERMCO is a permanent *company* identifier, while the PERMNO is a permanent *security* identifier. Several securities may be associated with the same company since firms can issue various classes of equity.
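
To make the idea of automatic detection concrete, here is one way such a check could work. This is only a sketch under a simple assumption (all-digit values are PERMNOs, anything else is a ticker); it is not the logic used inside `sf.get_crsp_msf_by_ids`, and the helper names `classify_identifier` and `classify_ids` are hypothetical.

```python
def classify_identifier(identifier) -> str:
    """Hypothetical rule: all-digit values are PERMNOs, everything else is a ticker."""
    text = str(identifier).strip()
    return "permno" if text.isdigit() else "ticker"


def classify_ids(ids) -> str:
    """Return the single identifier type used in the list, or raise if it is empty or mixed."""
    kinds = {classify_identifier(i) for i in ids}
    if not kinds:
        raise ValueError("The identifier list is empty")
    if len(kinds) > 1:
        raise ValueError("This sketch does not support mixing tickers and PERMNOs")
    return kinds.pop()


print(classify_ids(["IBM", "GOOG"]))   # ticker
print(classify_ids([10001, "10002"]))  # permno (arbitrary placeholder numbers)
```

Rejecting mixed lists is a design choice of this sketch; the helper module may handle that case differently.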

In [5]:
ids = ['IBM', 'GOOG']   # Identifiers: tickers or PERMNOs
start = '1990-01-01'    # Start date
end = '2024-12-31'      # End date
data = sf.get_crsp_msf_by_ids(db, ids, start, end)   # Returns a Pandas DataFrame

In [None]:
print(data)
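
Beyond printing the full DataFrame, a quick summary can confirm that the download looks as expected. The sketch below runs on a small stand-in DataFrame with made-up values so that it works without a WRDS connection. The column names `permno`, `date`, and `ret` are assumptions and may differ from what `sf.get_crsp_msf_by_ids` actually returns, and the `summarize` helper is hypothetical.

```python
import pandas as pd

# Stand-in data with made-up values; the real column names may differ.
sample = pd.DataFrame(
    {
        "permno": [11111, 11111, 22222],
        "date": pd.to_datetime(["2024-01-31", "2024-02-29", "2024-01-31"]),
        "ret": [0.01, -0.02, 0.03],
    }
)


def summarize(df: pd.DataFrame) -> dict:
    """Collect the row count, column names, and total number of missing values."""
    return {
        "rows": len(df),
        "columns": list(df.columns),
        "missing": int(df.isna().sum().sum()),
    }


summary = summarize(sample)
print(summary)
print(sample.head())  # first few rows
```

With the real download, you would call `summarize(data)` in place of `summarize(sample)`.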