# Equity Research Report - Classification Ratings

Every equity researcher examines a company's financial statements. Each statement may be adjusted to reflect certain items (depending on the industry), for example by adding back non-recurring items or adjusting for depreciation and amortization. Some of these adjustments are essential, so the footnotes in SEC filings deserve close attention.

With that in mind, is it possible to predict an analyst's rating classification from the financial statements and the researchers' output?

This notebook analyzes more than 10 years of CFRA reports together with the corresponding quarterly filings.

### Process

- Find every equity research report and its corresponding financial statement filings
- Using domain knowledge, select the relevant features from the three statements (income statement, balance sheet, cash flow statement)
- (future) The SEC has started providing footnotes; we will need to map each footnote ID to its company

The ratings are also broken down into another feature called `delt`, which captures whether the analyst indicates a change in position. For example: "...maintain buy", "...upgrade from buy to strong buy", "...downgrade from strong buy to buy", etc.
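
As a rough illustration of how such a change feature could be derived from analyst wording, the sketch below maps a phrase to a numeric delta. The ordinal scale in `RATING_SCALE` and the `rating_delta` helper are assumptions made up for this example; the actual `delt` encoding used in the dataset is not shown in this notebook.

```python
import re

# Hypothetical ordinal scale; the real CFRA encoding may differ.
RATING_SCALE = {"strong sell": 1, "sell": 2, "hold": 3, "buy": 4, "strong buy": 5}

# Longest names first so "strong buy" is tried before "buy".
_RATING_PATTERN = "|".join(sorted(RATING_SCALE, key=len, reverse=True))
_CHANGE_RE = re.compile(
    rf"(?:upgrade|downgrade)\s+from\s+({_RATING_PATTERN})\s+to\s+({_RATING_PATTERN})"
)
_MAINTAIN_RE = re.compile(rf"maintain\s+({_RATING_PATTERN})")


def rating_delta(text):
    """Return new_rating - old_rating, 0 for 'maintain', None if nothing matches."""
    lowered = text.lower()
    change = _CHANGE_RE.search(lowered)
    if change:
        old, new = change.groups()
        return RATING_SCALE[new] - RATING_SCALE[old]
    if _MAINTAIN_RE.search(lowered):
        return 0
    return None


for phrase in ["...maintain buy",
               "...upgrade from buy to strong buy",
               "...downgrade from strong buy to buy"]:
    print(phrase, "->", rating_delta(phrase))
```

A positive value means an upgrade, a negative value a downgrade, and `None` flags text the patterns do not recognize, which is easier to audit than silently defaulting to zero.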


### Word on Models
The modelling is deliberately simplified so that certain features can be tested first.

- The classes are imbalanced (of course!)
- Imbalance occurs when one class has significantly more or fewer samples than the others
- Four classification models are tested (logistic regression, gradient boosting, random forest, decision tree)
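
To see why imbalance matters for the metrics reported later, here is a small sketch with made-up labels. It compares plain accuracy with a macro-averaged F1 (the same kind of averaging requested via `average='macro'` in the code below) for a "model" that always predicts the majority class. The `macro_f1` helper and the label counts are illustrative assumptions, not part of the dataset.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (each class counts equally)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up, heavily imbalanced labels: 90 "hold", 6 "buy", 4 "sell".
y_true = ["hold"] * 90 + ["buy"] * 6 + ["sell"] * 4

# A "model" that always predicts the majority class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.2f}, macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

On this toy data the accuracy is 0.90 while the macro F1 is only about 0.32, because the two minority classes each score zero. This is why a fold can show high accuracy and a low F1 at the same time.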

In [16]:
import pandas as pd
import numpy as np

# visualization imports
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('fivethirtyeight')

%matplotlib inline

# modelling imports

from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import precision_recall_fscore_support, roc_auc_score, confusion_matrix, accuracy_score
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

from imblearn.over_sampling import SMOTE
import xgboost as xgb

from sklearn.model_selection import learning_curve

In [25]:
# data was preorganized and cleaned by matching equity research report ticker/rating to 3 statements
df = pd.read_csv("./../all_simplified_ver_data.csv")
features = ['Revenue', 'Cost of Revenue',
       'Net Income', 'Gross Profit', 'Operating Expenses',
       'Selling, General & Administrative', 'Research & Development',
       'Depreciation & Amortization_x', 'Operating Income (Loss)',
       'Non-Operating Income (Loss)', 'Interest Expense, Net',
       'Net Extraordinary Gains (Losses)',
       'Cash, Cash Equivalents & Short Term Investments',
       'Accounts & Notes Receivable', 'Inventories', 'Total Current Assets',
       'Property, Plant & Equipment, Net',
       'Long Term Investments & Receivables', 'Other Long Term Assets',
       'Total Noncurrent Assets', 'Total Assets', 'Payables & Accruals',
       'Short Term Debt', 'Total Current Liabilities', 'Long Term Debt',
       'Total Noncurrent Liabilities',
       'Share Capital & Additional Paid-In Capital', 'Treasury Stock',
       'Retained Earnings', 'Total Equity', 'Net Income/Starting Line',
       'Depreciation & Amortization_y', 'Non-Cash Items',
       'Change in Working Capital', 'Change in Accounts Receivable',
       'Change in Inventories', 'Change in Accounts Payable',
       'Change in Other', 'Net Cash from Operating Activities',
       'Change in Fixed Assets & Intangibles',
       'Net Change in Long Term Investment',
       'Net Cash from Acquisitions & Divestitures',
       'Net Cash from Investing Activities', 'Dividends Paid',
       'Cash from (Repayment of) Debt', 'Cash from (Repurchase of) Equity',
       'Net Cash from Financing Activities', 'Net Change in Cash',
       'Rating_Change']
target = 'Rating'

In [27]:
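# Fill NaNs: first with each ticker's own mean, then with placeholder values
df[features] = df[features].fillna(df.groupby('Ticker')[features].transform('mean'))
# 1000 is a deliberately absurd value that marks a missing rating or rating change
df['Rating_Change'] = df['Rating_Change'].fillna(1000)
df['Rating'] = df['Rating'].fillna(1000)
# Any feature still missing (no value observed for that ticker at all) becomes 0
df[features] = df[features].fillna(0)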

In [28]:
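# Hold out everything from 2019 onward; train on earlier reports
# (the string comparison assumes 'Report Date' is stored as YYYY-MM-DD)
df_hold = df[df['Report Date'] >= '2019-01-01']
df_train = df[df['Report Date'] < '2019-01-01']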
X = df_train[features]
y = df_train[target]

### Imbalance

In [36]:
oversample = SMOTE()
X, y = oversample.fit_resample(X, y)
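
One caveat, offered tentatively: resampling the whole training set before it is split means synthetic rows can land in the test folds, and those rows are not tied to any report date. A common alternative is to oversample inside each fold, on the training portion only. The sketch below illustrates that idea using plain random oversampling as a simple stand-in for SMOTE; `oversample_train_only` is a hypothetical helper and the toy labels are invented.

```python
import random
from collections import Counter


def oversample_train_only(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy, time-ordered data: the last 4 rows play the role of a test fold.
rows = list(range(12))
labels = ["hold"] * 6 + ["buy"] * 2 + ["hold"] * 3 + ["buy"]
train_rows, train_labels = rows[:8], labels[:8]
test_rows, test_labels = rows[8:], labels[8:]

# Only the training portion is resampled; the test fold stays untouched.
bal_rows, bal_labels = oversample_train_only(train_rows, train_labels)
print(Counter(bal_labels), "test fold size:", len(test_rows))
```

The same pattern would apply with `SMOTE`: call `fit_resample` on each fold's training rows inside the cross-validation loop rather than once up front.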

### CV Classification Models
Let's see which model performs the best.

Since financial data is time-dependent, we split the dataset using `TimeSeriesSplit`. Unlike ordinary k-fold, `TimeSeriesSplit` keeps the rows in sequential order, so each fold trains on earlier rows and tests on the rows that follow them.
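
To make the sequential behaviour concrete, here is a hand-rolled sketch of an expanding-window split. It is meant to mimic what `TimeSeriesSplit` does with its default settings, not to replace it; the function name `expanding_window_splits` is made up for this illustration.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) with training always preceding testing."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        train = list(range(test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


for train, test in expanding_window_splits(n_samples=12, n_splits=3):
    print("TRAIN:", train, "TEST:", test)
```

With 12 rows and 3 splits, the test windows are rows 3 to 5, 6 to 8, and 9 to 11, and each training window contains every row before its test window. No fold is ever evaluated on data older than what it was trained on.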

In [30]:
tscv = TimeSeriesSplit(n_splits=10)

#### Each model is placed into a dictionary

In [31]:
def cross_validation_folds(X, y, n_splits=5):
    """Fit each model on every time-ordered fold and collect its scores."""
    tscv = TimeSeriesSplit(max_train_size=None, n_splits=n_splits)
    result = []
    models = {
        "logistic": LogisticRegression(),
        "xgb": xgb.XGBClassifier(n_estimators=550, random_state=0),
        "random_forest": RandomForestClassifier(n_estimators=25),
        "Decision_Tree": DecisionTreeClassifier(),
    }
    iteration = 1

    for train_index, test_index in tscv.split(X):
        print("TRAIN: ", train_index, "TEST:", test_index)
        X_train, X_test = X.iloc[train_index], X.iloc[test_index]
        y_train, y_test = y.iloc[train_index], y.iloc[test_index]

        for name, model in models.items():
            # fit() retrains from scratch, so one instance can be reused across folds
            model.fit(X_train, y_train)
            y_predict = model.predict(X_test)
            acc_score = accuracy_score(y_test, y_predict)
            # macro average: every class counts equally, regardless of its size
            precision, recall, f1, _ = precision_recall_fscore_support(y_test, y_predict, average='macro')
            result.append({'iter': iteration, 'model': name, 'acc_score': acc_score,
                           'precision': precision, 'recall': recall, 'f1': f1})
        iteration += 1
    return pd.DataFrame(result)

In [37]:
model_result = cross_validation_folds(X,y)
model_result.sort_values(by="f1", ascending=False)

| | iter | model | acc_score | precision | recall | f1 |
|---|---|---|---|---|---|---|
| 13 | 4 | xgb | 0.647302 | 0.358622 | 0.315292 | 0.322838 |
| 14 | 4 | random_forest | 0.653757 | 0.31832 | 0.286089 | 0.28986 |
| 1 | 1 | xgb | 0.444186 | 0.38349 | 0.255222 | 0.277648 |
| 15 | 4 | Decision_Tree | 0.554953 | 0.314082 | 0.249641 | 0.26403 |
| 5 | 2 | xgb | 0.557277 | 0.321157 | 0.257272 | 0.262235 |
| 18 | 5 | random_forest | 0.747741 | 0.269439 | 0.211577 | 0.232296 |
| 17 | 5 | xgb | 0.699458 | 0.262098 | 0.200322 | 0.216634 |
| 6 | 2 | random_forest | 0.498064 | 0.26874 | 0.214856 | 0.206076 |
| 3 | 1 | Decision_Tree | 0.365952 | 0.225248 | 0.231745 | 0.205383 |
| 19 | 5 | Decision_Tree | 0.62458 | 0.254402 | 0.180367 | 0.202985 |


### Taking a closer look at xgboost
For my own interest, here is XGBoost on its own, using the 10-split `tscv` defined earlier.

In [None]:
# running xgboost on each of the 10 time-ordered folds
xg_result = []

for train_index, test_index in tscv.split(X):
    print("TRAIN: ", train_index, "TEST:", test_index)
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]

    # Rating has more than two classes, so a multiclass objective is stated
    # explicitly instead of 'binary:logistic'
    model = xgb.XGBClassifier(objective='multi:softprob', booster='gbtree', max_leaves=7)
    # note: early stopping is monitored on the test fold itself, which flatters the scores
    eval_set = [(X_test, y_test)]
    model.fit(X_train, y_train, early_stopping_rounds=10, eval_set=eval_set)
    y_predict = model.predict(X_test)
    acc_score = accuracy_score(y_test, y_predict)
    precision, recall, f1, _ = precision_recall_fscore_support(y_test, y_predict, average='macro')
    print(confusion_matrix(y_test, y_predict))
    xg_result.append({'acc_score': acc_score, 'precision': precision, 'recall': recall, 'f1': f1})

In [34]:
pd.DataFrame(xg_result)

| | acc_score | precision | recall | f1 |
|---|---|---|---|---|
| 0 | 0.463553 | 0.352324 | 0.234325 | 0.253923 |
| 1 | 0.440202 | 0.444172 | 0.251383 | 0.284631 |
| 2 | 0.905806 | 0.284938 | 0.192766 | 0.219275 |
| 3 | 0.924582 | 0.285531 | 0.221041 | 0.245191 |
| 4 | 0.637267 | 0.221219 | 0.102148 | 0.124369 |
| 5 | 0.667245 | 0.176283 | 0.097921 | 0.115883 |
| 6 | 0.799148 | 0.199159 | 0.131506 | 0.147538 |
| 7 | 0.924897 | 0.198461 | 0.180049 | 0.188568 |
| 8 | 0.734301 | 0.220215 | 0.169371 | 0.161892 |
| 9 | 0.60366 | 0.175805 | 0.120954 | 0.137753 |


### Overall Score

Across the ten XGBoost folds, macro F1 ranges from about 0.12 to 0.28. Not bad for not reading a single financial statement.
I believe the model can be improved. Part of the future work is to improve the rating predictions by integrating SEC footnotes and technical indicators such as ATR and volatility factors.
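
For reference, a minimal sketch of how an ATR (average true range) feature could be computed from daily price bars. The true range of a bar is the largest of high minus low, |high minus previous close|, and |low minus previous close|. This version averages it with a simple moving mean, whereas Wilder's original ATR uses a smoothed average; the function name, the `period` default, and the sample prices are assumptions for illustration.

```python
def average_true_range(highs, lows, closes, period=14):
    """Simple-moving-average ATR; one value per bar once `period` true ranges exist."""
    true_ranges = []
    for i in range(1, len(closes)):
        prev_close = closes[i - 1]
        true_ranges.append(max(
            highs[i] - lows[i],
            abs(highs[i] - prev_close),
            abs(lows[i] - prev_close),
        ))
    return [
        sum(true_ranges[i - period + 1:i + 1]) / period
        for i in range(period - 1, len(true_ranges))
    ]


# Invented prices, just to exercise the function.
highs = [10, 12, 13, 13]
lows = [9, 10, 12.5, 11]
closes = [9.5, 11, 13, 12]
print(average_true_range(highs, lows, closes, period=2))
```

To use this as a model feature, the ATR would need to be sampled at each report date so that it lines up with the quarterly rows.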

Another consideration is applying NLP to the three statements to examine relevancy (that's another project 😊)