## Probit models to forecast binary outcomes such as recessions
<div style="text-align: right"> Fogli Alessandro </div>
<div style="text-align: right"> ID 231273 </div>
<div style="text-align: right"> Project #3 </div>

### Import packages

In [137]:
from scipy import stats
import pandas as pd
import numpy as np
import statsmodels.api as sm
import yfinance as yf
import pandas_datareader as pdr
from IPython.display import display, HTML
import datetime as dt
import getFamaFrenchFactors as gff
import quandl
from fredapi import Fred
import config
# API keys are read from a local config module so they stay out of the notebook
fred = Fred(api_key=config.fred_api)
QUANDL_KEY = config.quandl_key
quandl.ApiConfig.api_key = QUANDL_KEY

### Data (all quarterly)

Get the spread between the 10-year and 3-month U.S. Treasury yields (FRED series `T10Y3M`)

In [139]:
term = fred.get_series('T10Y3M', observation_start="1955-02-01", observation_end="2022-02-01", frequency='q')
term = term.tolist()
# Note: the returned series starts at 1982-01-01, later than the requested start date

Get the federal funds rate (FRED series `FEDFUNDS`)

In [23]:
funds_rate = fred.get_series('FEDFUNDS', observation_start="1955-02-01", observation_end="2022-02-01", frequency='q')
funds_rate = funds_rate.tolist()

Get GDP growth (percent change of the FRED `GDP` series, requested with `units='pch'`)

In [49]:
gdp = fred.get_series('GDP', observation_start="1955-02-01", observation_end="2022-03-01", frequency='q', units='pch')
gdp1 = gdp.tolist()
# Note: the first quarter of 2022 is missing from the returned series

NBER-based recession indicator for the United States (FRED series `USREC`: 1 during a recession, 0 otherwise)

In [143]:
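nber = fred.get_series('USREC', observation_start="1955-02-01", observation_end="2022-03-01", frequency='q')
# USREC is monthly; if the quarterly aggregate is an average of the monthly flags,
# astype(int) truncates fractional values, so only fully flagged quarters map to 1
nber = nber.astype(int)
nber = nber.tolist()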

Get S&P 500 prices and compute quarterly returns (in percent)

In [114]:
sp500 = quandl.get("MULTPL/SP500_REAL_PRICE_MONTH", start_date='1954-07-01', end_date='2022-03-01', collapse="monthly")
#sp500_rtn = sp500.pct_change()
#sp500_rtn.fillna(0, inplace=True)

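# Average the monthly prices over 3-month windows
sp500_quarter_rtn = sp500.resample("3M").mean()

# Percent change between consecutive windows
sp500_quarter_rtn = sp500_quarter_rtn.pct_change()
# Drop the first two rows
sp500_quarter_rtn = sp500_quarter_rtn.iloc[2:, :]
# Express the returns in percent
sp500_quarter_rtn = sp500_quarter_rtn * 100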
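The return calculation can be tried offline on a toy series. The sketch below is only an illustration: the prices are made up, and it groups by calendar quarter with `to_period("Q")`, which is an assumption and is not necessarily identical to the 3-month windows produced by `resample("3M")` above.

```python
import pandas as pd

# Made-up monthly prices for one year (not market data)
dates = pd.to_datetime([f"2020-{m:02d}-01" for m in range(1, 13)])
prices = pd.Series(
    [100, 102, 104, 110, 110, 110, 99, 99, 99, 108.9, 108.9, 108.9],
    index=dates,
    dtype=float,
)

# Average price within each calendar quarter (assumed grouping)
quarterly_mean = prices.groupby(prices.index.to_period("Q")).mean()

# Quarter-over-quarter change, expressed in percent
quarterly_rtn = quarterly_mean.pct_change() * 100

print(quarterly_rtn)
```

The first entry is `NaN` because there is no earlier quarter to compare against, which is one reason to drop leading rows before using the series as a regressor.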

### Probit model

A probit regression is a generalized linear model for dichotomous outcome variables. It models the inverse standard normal cumulative distribution function (CDF), $\Phi^{-1}$, of the success probability as a linear combination of the predictors. The binary outcome variable $Y_i$ is assumed to follow a Bernoulli distribution with success probability $p_i \in (0,1)$. Hence, the probit link function is:
$$ \operatorname{probit}(p_i) = \Phi^{-1}(p_i) = \sum_{k=0}^n \beta_{k} x_{ik} $$
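To make the link function concrete, the minimal sketch below maps a linear predictor to a probability with the standard normal CDF and then recovers it with the inverse CDF. The coefficients and the predictor values are hypothetical numbers chosen for illustration; they are not estimates from the data above.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical coefficients (intercept, slope): illustrative only
beta = np.array([-1.0, 0.5])

# One observation: a constant term plus a single predictor value
x_i = np.array([1.0, 1.0])

# Linear predictor: sum over k of beta_k * x_ik
eta = float(beta @ x_i)

# Inverse link: the standard normal CDF turns the linear predictor into a probability
p = norm.cdf(eta)

# Link: the inverse CDF (probit) recovers the linear predictor from the probability
eta_back = norm.ppf(p)

print(f"linear predictor = {eta:.3f}, probability = {p:.3f}")
```

Whatever value the linear predictor takes, the CDF keeps the implied probability strictly between 0 and 1, which is what makes the model suitable for a binary outcome.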

The probit model assumes that the probability of a recession is given by the cumulative standard normal distribution rather than the logistic distribution. The two models are closely related: rescaling the argument of the logistic distribution by an appropriate constant gives a close approximation to the probit distribution.
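The relationship between the two distributions can be checked numerically. The sketch below compares the standard normal CDF with the logistic CDF, with and without rescaling. The constant 1.702 is an assumption made for this illustration (it is a commonly quoted value for this approximation); the result is a close match, not an exact one.

```python
import numpy as np
from scipy.special import expit  # logistic CDF: 1 / (1 + exp(-z))
from scipy.stats import norm

# Scaling constant: an assumption for illustration, not taken from the notebook
SCALE = 1.702

# Grid of values for the linear predictor
z = np.linspace(-4.0, 4.0, 801)

probit_cdf = norm.cdf(z)             # probit: standard normal CDF
logit_cdf_raw = expit(z)             # logistic CDF, no rescaling
logit_cdf_scaled = expit(SCALE * z)  # logistic CDF with rescaled argument

# Largest absolute gap between the two curves on the grid
max_gap_raw = np.max(np.abs(probit_cdf - logit_cdf_raw))
max_gap_scaled = np.max(np.abs(probit_cdf - logit_cdf_scaled))

print(f"max gap without rescaling: {max_gap_raw:.4f}")
print(f"max gap with rescaling:    {max_gap_scaled:.4f}")
```

Because the curves nearly coincide after rescaling, probit and logit fits usually give similar predicted probabilities even though their coefficients are on different scales.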