# Earnings Call Analysis: Running All Models

This notebook orchestrates the complete workflow for analyzing earnings call transcripts and predicting stock returns. It integrates multiple components of the project:

1. **Data Preparation**: Loads and prepares earnings call transcript data along with financial metrics
2. **Baseline Models**: Runs traditional machine learning approaches to establish performance benchmarks:
   - Random baseline model
   - Finance-only logistic regression
   - TF-IDF with logistic regression (transcript-only)
   - TF-IDF + finance features with logistic regression
3. **BERT-based Models**: Implements and evaluates advanced deep learning architectures:
    - **AttnMLPPoolClassifier**: Uses attention-based pooling on BERT embeddings for transcript-only analysis
    - **AttnPoolTwoTower**: Combines transcript embeddings with financial features in a two-tower architecture
    - **MeanPoolClassifier**: Uses simple mean pooling of BERT embeddings as an alternative approach
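
The model classes live in `finbert_models_utils` and are not shown in this notebook, so the following is only a sketch of the difference between the two pooling strategies named above. It assumes each transcript is represented as a matrix of chunk embeddings (one 768-dimensional row per chunk) with a padding mask. The function names, the parameters `W`, `b`, `v`, and the tanh scoring layer are assumptions made for illustration; the real classifiers may score chunks differently.

```python
import numpy as np

def mean_pool(H, mask):
    """Average the embeddings of the real (non-padding) chunks."""
    m = mask[:, None].astype(float)
    return (H * m).sum(axis=0) / m.sum()

def attention_pool(H, mask, W, b, v):
    """Score each chunk with a one-hidden-layer MLP, softmax over the real
    chunks, and return the weighted sum of embeddings plus the weights."""
    scores = np.tanh(H @ W + b) @ v            # shape: (n_chunks,)
    scores = np.where(mask, scores, -np.inf)   # padding gets zero weight
    weights = np.exp(scores - scores[mask].max())
    weights /= weights.sum()
    return weights @ H, weights

# Toy example: 6 chunk slots (4 real, 2 padding), dim=768, attn_hidden=256
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 768))
mask = np.array([True, True, True, True, False, False])
W = rng.normal(scale=0.02, size=(768, 256))
b = np.zeros(256)
v = rng.normal(scale=0.02, size=256)

pooled, weights = attention_pool(H, mask, W, b, v)
print(pooled.shape, weights.round(3))
```

With all scores equal, the attention weights become uniform over the real chunks and attention pooling reduces to mean pooling, which is why the mean-pooled model is a natural point of comparison.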

The notebook compares model performance across architectures and input configurations (transcript-only vs. transcript + financial data) for 1-day and 5-day return predictions. Each model is scored by test-set AUC, reported with its confidence interval and standard error; the baselines also report accuracy.
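
The notebook reports each AUC with a confidence interval and a standard error but does not show how they are computed. One common approach, assumed here purely for illustration, is a percentile bootstrap over the test set. The function names `roc_auc` and `bootstrap_auc`, the 1000 resamples, and the 95% level are all assumptions, not the project's confirmed settings.

```python
import random

def roc_auc(y_true, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity, averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_auc(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (auc, (ci_low, ci_high), se) from resampling the test set."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC needs both classes in the resample
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    ci_low = aucs[int((alpha / 2) * n_boot)]
    ci_high = aucs[int((1 - alpha / 2) * n_boot) - 1]
    mean = sum(aucs) / n_boot
    se = (sum((a - mean) ** 2 for a in aucs) / (n_boot - 1)) ** 0.5
    return roc_auc(y_true, scores), (ci_low, ci_high), se

# Toy usage with synthetic labels and uninformative scores
demo_rng = random.Random(42)
y = [demo_rng.randint(0, 1) for _ in range(300)]
s = [demo_rng.random() for _ in range(300)]
auc, ci, se = bootstrap_auc(y, s)
print(f"AUC: {auc:.4f} ± {se:.4f}, CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```

With a test set of a few hundred examples, a bootstrap standard error of roughly 0.03 is typical for an AUC near 0.5, so small differences between models should be read with the interval in mind.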

In [None]:
# Necessary Imports
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from config import * 
from data_cleaning_util import prepare_earnings_data
from baseline_models import call_baseline_model
from finbert_models_utils import call_model, call_model_fin

In [2]:
# Baseline Models
baselinemodels = ["random", "finance_only", "tfidf", "finance_tfidf"]
returndays = [1,5]
raw_data = prepare_earnings_data()
baseline_results = {}
for model in baselinemodels:
    for days in returndays:
        result_df = call_baseline_model(raw_data, model, days)
        baseline_results[(model, days)] = result_df

# Add model and return period information to baseline results
baseline_results_list = []
for (model_name, return_period), result_df in baseline_results.items():
    result_df = result_df.copy()
    result_df['Model'] = model_name
    result_df['Return Period'] = return_period
    baseline_results_list.append(result_df)

baseline_results_df = pd.concat(baseline_results_list, ignore_index=True)

# Reorder columns to put Model and Return Period first
cols = ['Model', 'Return Period', 'accuracy', 'auc', 'ci', 'se']
baseline_results_df = baseline_results_df[cols]

print("Baseline Model Results - 1-Day and 5-Day Returns")
print("=" * 50)
baseline_results_df


Random Model 1-day returns- Accuracy: 0.4647, AUC: 0.4751 ± 0.0328, CI: (np.float64(0.410089224433768), np.float64(0.5372236738418149))
Random Model 5-day returns- Accuracy: 0.4615, AUC: 0.4470 ± 0.0328, CI: (np.float64(0.3843331271902701), np.float64(0.5134099616858239))
Finance Only Model 1-day returns- Accuracy: 0.4904, AUC: 0.4782 ± 0.0330, CI: (np.float64(0.41318267419962335), np.float64(0.5436946668361079))
Finance Only Model 5-day returns- Accuracy: 0.4936, AUC: 0.4614 ± 0.0334, CI: (np.float64(0.3907394380767162), np.float64(0.525623040105628))
Best C for TF-IDF model: 100
TF-IDF Model 1-day returns- Accuracy: 0.4647, AUC: 0.4686 ± 0.0339, CI: (np.float64(0.4046624331550802), np.float64(0.5373219373219373))
Best C for TF-IDF model: 10
TF-IDF Model 5-day returns- Accuracy: 0.5994, AUC: 0.6171 ± 0.0306, CI: (np.float64(0.5585793929382614), np.float64(0.6756006628003313))
Best C for Finance + TF-IDF model: 0.01
Finance + TF-IDF Model 1-day returns- Accuracy: 0.5128, AUC: 0.4801 ± 

Unnamed: 0,Model,Return Period,accuracy,auc,ci,se
0,random,1,0.464744,0.475084,"(0.410089224433768, 0.5372236738418149)",0.032757
1,random,5,0.461538,0.446955,"(0.3843331271902701, 0.5134099616858239)",0.032828
2,finance_only,1,0.490385,0.478241,"(0.41318267419962335, 0.5436946668361079)",0.03305
3,finance_only,5,0.49359,0.461358,"(0.3907394380767162, 0.525623040105628)",0.033397
4,tfidf,1,0.464744,0.468603,"(0.4046624331550802, 0.5373219373219373)",0.033864
5,tfidf,5,0.599359,0.617119,"(0.5585793929382614, 0.6756006628003313)",0.030639
6,finance_tfidf,1,0.512821,0.480051,"(0.4153768293197251, 0.5448135705694496)",0.033128
7,finance_tfidf,5,0.567308,0.590741,"(0.5324487433862434, 0.6521614069690993)",0.031482
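
An AUC of 0.5 corresponds to random ranking, so a model whose interval contains 0.5 cannot be distinguished from chance at that interval's confidence level. The sketch below applies this check to the intervals in the table above, hard-coded and rounded to four decimals; the helper name `excludes_chance` is hypothetical.

```python
# (model, return period) -> AUC confidence interval, copied from the table above
baseline_ci = {
    ("random", 1): (0.4101, 0.5372),
    ("random", 5): (0.3843, 0.5134),
    ("finance_only", 1): (0.4132, 0.5437),
    ("finance_only", 5): (0.3907, 0.5256),
    ("tfidf", 1): (0.4047, 0.5373),
    ("tfidf", 5): (0.5586, 0.6756),
    ("finance_tfidf", 1): (0.4154, 0.5448),
    ("finance_tfidf", 5): (0.5324, 0.6522),
}

def excludes_chance(ci, chance=0.5):
    """True when the whole interval lies strictly above or below chance."""
    low, high = ci
    return low > chance or high < chance

distinguishable = [key for key, ci in baseline_ci.items() if excludes_chance(ci)]
print(distinguishable)  # -> [('tfidf', 5), ('finance_tfidf', 5)]
```

In this run, only the two TF-IDF-based baselines on 5-day returns have intervals that sit entirely above 0.5.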


### BERT-based Models Evaluation



#### 1-Day Return Prediction Results:

In [None]:
# Train and evaluate the two attention-pooling architectures on 1-day returns:
# AttnMLPPoolClassifier (transcript only) and AttnPoolTwoTower (transcript + finance).
# Each call returns the model, test loss, test AUC, AUC confidence interval, and standard error.

model_bert, test_loss_bert, test_auc_bert, test_auc_ci_bert, test_se_bert = call_model(
    Model="AttnMLPPoolClassifier",
    dim=768,
    attn_hidden=256,
    hidden=256,
    dropout=0.2,
    return_period=1
)

model_fin, test_loss_fin, test_auc_fin, test_auc_ci_fin, test_se_fin = call_model_fin(
    Model="AttnPoolTwoTower",
    dim=768,
    fin_dim=4,
    hidden=256,
    dropout=0.2,
    return_period=1
)

epoch 01 | train_loss=0.7204 | val_loss=0.6936 | val_auc=0.442
epoch 02 | train_loss=0.7162 | val_loss=0.7271 | val_auc=0.479
epoch 03 | train_loss=0.6945 | val_loss=0.7292 | val_auc=0.483
epoch 04 | train_loss=0.7255 | val_loss=0.7512 | val_auc=0.483
epoch 05 | train_loss=0.8794 | val_loss=0.7574 | val_auc=0.478
epoch 06 | train_loss=0.6780 | val_loss=0.7210 | val_auc=0.481
epoch 07 | train_loss=0.7207 | val_loss=0.7013 | val_auc=0.476
epoch 08 | train_loss=0.7151 | val_loss=0.7188 | val_auc=0.480
epoch 09 | train_loss=0.7323 | val_loss=0.7073 | val_auc=0.472
epoch 10 | train_loss=0.6117 | val_loss=0.7397 | val_auc=0.480
Early stopping on AUC.
epoch 01 | train_loss=0.7370 | val_loss=0.7154 | val_auc=0.521
epoch 02 | train_loss=0.6804 | val_loss=0.7051 | val_auc=0.493
epoch 03 | train_loss=0.6376 | val_loss=0.6936 | val_auc=0.495
epoch 04 | train_loss=0.6757 | val_loss=0.7000 | val_auc=0.480
epoch 05 | train_loss=0.6906 | val_loss=0.7100 | val_auc=0.509
epoch 06 | train_loss=0.7004 | v

In [16]:
# Train and evaluate the MeanPoolClassifier on 1-day returns

model_bert_mean, test_loss_bert_mean, test_auc_bert_mean, test_auc_ci_bert_mean, test_se_bert_mean = call_model(
    Model="MeanPoolClassifier",
    dim=768,
    attn_hidden=256,
    hidden=256,
    dropout=0.2,
    return_period=1
)

epoch 01 | train_loss=0.7333 | val_loss=0.7146 | val_auc=0.444
epoch 02 | train_loss=0.6893 | val_loss=0.7122 | val_auc=0.450
epoch 03 | train_loss=0.7016 | val_loss=0.7135 | val_auc=0.468
epoch 04 | train_loss=0.7627 | val_loss=0.7200 | val_auc=0.452
epoch 05 | train_loss=0.6906 | val_loss=0.6985 | val_auc=0.441
epoch 06 | train_loss=0.6748 | val_loss=0.7110 | val_auc=0.442
epoch 07 | train_loss=0.6728 | val_loss=0.7009 | val_auc=0.441
epoch 08 | train_loss=0.7135 | val_loss=0.6988 | val_auc=0.444
epoch 09 | train_loss=0.7114 | val_loss=0.7128 | val_auc=0.439
epoch 10 | train_loss=0.7000 | val_loss=0.7065 | val_auc=0.442
Early stopping on AUC.
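
The log above ends with "Early stopping on AUC." The stopping rule itself is implemented in `finbert_models_utils` and is not shown, so the following is only a sketch of a patience-based rule on validation AUC. The function name is hypothetical, and the patience of 7 is an inference: it reproduces the stopping epochs of the complete logs in this notebook, but the actual setting is not confirmed.

```python
def early_stop_epoch(val_aucs, patience=7):
    """Return the 1-based epoch at which training stops, or None if the
    validation AUC never goes `patience` epochs without a new best."""
    best = float("-inf")
    stale = 0
    for epoch, auc in enumerate(val_aucs, start=1):
        if auc > best:
            best, stale = auc, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None

# Validation AUCs from the MeanPoolClassifier 1-day log above
mean_pool_1d = [0.444, 0.450, 0.468, 0.452, 0.441, 0.442, 0.441, 0.444, 0.439, 0.442]
print(early_stop_epoch(mean_pool_1d))  # -> 10
```

Under this rule the best validation AUC (0.468, epoch 3) is followed by seven epochs without improvement, which matches the stop after epoch 10.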


In [None]:
# Combine all results into a single dataframe for 1-day returns
combined_results = pd.DataFrame({
    'Model': ['AttnMLPPoolClassifier (Transcript Only)', 
              'AttnPoolTwoTower (Transcript + Finance)', 
              'MeanPoolClassifier (Transcript Only)', 
    ],
    'Test AUC': [
        test_auc_bert,
        test_auc_fin,
        test_auc_bert_mean,
    ],
    'Test AUC CI': [
        test_auc_ci_bert,
        test_auc_ci_fin,
        test_auc_ci_bert_mean,
    ],
    'Test SE': [
        test_se_bert,
        test_se_fin,
        test_se_bert_mean,
    ]
})

print("Results for 1-Day Return Prediction")
print("=" * 50)
combined_results

Results for 1-Day Return Prediction


Unnamed: 0,Model,Test AUC,Test AUC CI,Test SE
0,AttnMLPPoolClassifier (Transcript Only),0.509768,"(0.44134351743047395, 0.5808038615056159)",0.0357
1,AttnPoolTwoTower (Transcript + Finance),0.426317,"(0.36121229102456337, 0.4928456363119493)",0.033727
2,MeanPoolClassifier (Transcript Only),0.480464,"(0.4144984830376023, 0.5507735996866432)",0.035041


#### 5-Day Return Prediction Results:

In [None]:
# Train and evaluate the two attention-pooling architectures on 5-day returns:
# AttnMLPPoolClassifier (transcript only) and AttnPoolTwoTower (transcript + finance).
# Each call returns the model, test loss, test AUC, AUC confidence interval, and standard error.

model_bert_5d, test_loss_bert_5d, test_auc_bert_5d, test_auc_ci_bert_5d, test_se_bert_5d = call_model(
    Model="AttnMLPPoolClassifier",
    dim=768,
    attn_hidden=256,
    hidden=256,
    dropout=0.2,
    return_period=5
)

model_fin_5d, test_loss_fin_5d, test_auc_fin_5d, test_auc_ci_fin_5d, test_se_fin_5d = call_model_fin(
    Model="AttnPoolTwoTower",
    dim=768,
    fin_dim=4,
    hidden=256,
    dropout=0.2,
    return_period=5
)

epoch 01 | train_loss=0.6968 | val_loss=0.7143 | val_auc=0.425
epoch 02 | train_loss=0.7192 | val_loss=0.7106 | val_auc=0.476
epoch 03 | train_loss=0.7115 | val_loss=0.6948 | val_auc=0.424
epoch 04 | train_loss=0.6787 | val_loss=0.7115 | val_auc=0.442
epoch 05 | train_loss=0.7169 | val_loss=0.7044 | val_auc=0.442
epoch 06 | train_loss=0.7259 | val_loss=0.7177 | val_auc=0.442
epoch 07 | train_loss=0.6287 | val_loss=0.7183 | val_auc=0.447
epoch 08 | train_loss=0.6446 | val_loss=0.7031 | val_auc=0.464
epoch 09 | train_loss=0.7059 | val_loss=0.7179 | val_auc=0.471
Early stopping on AUC.
epoch 01 | train_loss=0.6176 | val_loss=0.7134 | val_auc=0.434
epoch 02 | train_loss=0.6898 | val_loss=0.7064 | val_auc=0.474
epoch 03 | train_loss=0.7496 | val_loss=0.7091 | val_auc=0.456
epoch 04 | train_loss=0.6756 | val_loss=0.7049 | val_auc=0.479
epoch 05 | train_loss=0.7085 | val_loss=0.6976 | val_auc=0.496
epoch 06 | train_loss=0.7346 | val_loss=0.7046 | val_auc=0.495
epoch 07 | train_loss=0.7533 | v

In [8]:
# Train and evaluate the MeanPoolClassifier on 5-day returns

model_bert_mean_5d, test_loss_bert_mean_5d, test_auc_bert_mean_5d, test_auc_ci_bert_mean_5d, test_se_bert_mean_5d = call_model(
    Model="MeanPoolClassifier",
    dim=768,
    attn_hidden=256,
    hidden=256,
    dropout=0.2,
    return_period=5
)

epoch 01 | train_loss=0.6943 | val_loss=0.6940 | val_auc=0.447
epoch 02 | train_loss=0.6920 | val_loss=0.7051 | val_auc=0.448
epoch 03 | train_loss=0.7404 | val_loss=0.7061 | val_auc=0.444
epoch 04 | train_loss=0.7071 | val_loss=0.6932 | val_auc=0.451
epoch 05 | train_loss=0.6986 | val_loss=0.7036 | val_auc=0.445
epoch 06 | train_loss=0.6561 | val_loss=0.7046 | val_auc=0.446
epoch 07 | train_loss=0.6932 | val_loss=0.6934 | val_auc=0.447
epoch 08 | train_loss=0.6661 | val_loss=0.7088 | val_auc=0.444
epoch 09 | train_loss=0.6619 | val_loss=0.7025 | val_auc=0.443
epoch 10 | train_loss=0.7434 | val_loss=0.7122 | val_auc=0.450
epoch 11 | train_loss=0.6678 | val_loss=0.7038 | val_auc=0.446
Early stopping on AUC.


In [None]:
# Combine all results into a single dataframe for 5-day returns
combined_results_5d = pd.DataFrame({
    'Model': ['AttnMLPPoolClassifier (Transcript Only)', 
              'AttnPoolTwoTower (Transcript + Finance)', 
              'MeanPoolClassifier (Transcript Only)', 
    ],
    'Test AUC': [
        test_auc_bert_5d,
        test_auc_fin_5d,
        test_auc_bert_mean_5d,
    ],
    'Test AUC CI': [
        test_auc_ci_bert_5d,
        test_auc_ci_fin_5d,
        test_auc_ci_bert_mean_5d,
    ],
    'Test SE': [
        test_se_bert_5d,
        test_se_fin_5d,
        test_se_bert_mean_5d,
    ]
})

print("Results for 5-Day Return Prediction")
print("=" * 50)
combined_results_5d

Results for 5-Day Return Prediction


Unnamed: 0,Model,Test AUC,Test AUC CI,Test SE
0,AttnMLPPoolClassifier (Transcript Only),0.47558,"(0.4066281884192888, 0.5424307634730539)",0.035368
1,AttnPoolTwoTower (Transcript + Finance),0.482139,"(0.42587895233600753, 0.5418581359726641)",0.029751
2,MeanPoolClassifier (Transcript Only),0.524749,"(0.4599539037741285, 0.5944713593616415)",0.034054


### Combine all results

In [33]:
# Combine baseline and BERT-based model results for both 1-day and 5-day returns
# First, prepare baseline results for 1-day returns
baseline_1d = baseline_results_df[baseline_results_df['Return Period'] == 1].copy()
baseline_1d['Model'] = baseline_1d['Model'].apply(lambda x: f"{x} (Baseline)")
baseline_1d = baseline_1d.rename(columns={'auc': 'Test AUC', 'ci': 'Test AUC CI', 'se': 'Test SE'})
baseline_1d = baseline_1d[['Model', 'Test AUC', 'Test AUC CI', 'Test SE']]

# Combine with BERT-based results for 1-day returns
all_results_1d = pd.concat([baseline_1d, combined_results], ignore_index=True)
all_results_1d['Return Period'] = '1-Day'

# Prepare baseline results for 5-day returns
baseline_5d = baseline_results_df[baseline_results_df['Return Period'] == 5].copy()
baseline_5d['Model'] = baseline_5d['Model'].apply(lambda x: f"{x} (Baseline)")
baseline_5d = baseline_5d.rename(columns={'auc': 'Test AUC', 'ci': 'Test AUC CI', 'se': 'Test SE'})
baseline_5d = baseline_5d[['Model', 'Test AUC', 'Test AUC CI', 'Test SE']]

# Combine with BERT-based results for 5-day returns
all_results_5d = pd.concat([baseline_5d, combined_results_5d], ignore_index=True)
all_results_5d['Return Period'] = '5-Day'

# Combine all results into one comprehensive dataframe
comprehensive_results = pd.concat([all_results_1d, all_results_5d], ignore_index=True)

print("Comprehensive Model Results - All Models for 1-Day and 5-Day Returns")
print("=" * 80)
# Display float columns with 6 decimal places. This option does not reach the floats
# inside the Test AUC CI tuples, so those are formatted separately below.
pd.options.display.float_format = '{:.6f}'.format

# Format the Test AUC CI column to display tuples with 6 decimal places
comprehensive_results['Test AUC CI'] = comprehensive_results['Test AUC CI'].apply(
    lambda x: f"({x[0]:.6f}, {x[1]:.6f})"
)

comprehensive_results

Comprehensive Model Results - All Models for 1-Day and 5-Day Returns


Unnamed: 0,Model,Test AUC,Test AUC CI,Test SE,Return Period
0,random (Baseline),0.475084,"(0.410089, 0.537224)",0.032757,1-Day
1,finance_only (Baseline),0.478241,"(0.413183, 0.543695)",0.03305,1-Day
2,tfidf (Baseline),0.523948,"(0.460347, 0.590715)",0.033366,1-Day
3,finance_tfidf (Baseline),0.463047,"(0.402240, 0.530189)",0.033209,1-Day
4,AttnMLPPoolClassifier (Transcript Only),0.509768,"(0.441344, 0.580804)",0.0357,1-Day
5,AttnPoolTwoTower (Transcript + Finance),0.426317,"(0.361212, 0.492846)",0.033727,1-Day
6,MeanPoolClassifier (Transcript Only),0.480464,"(0.414498, 0.550774)",0.035041,1-Day
7,random (Baseline),0.446955,"(0.384333, 0.513410)",0.032828,5-Day
8,finance_only (Baseline),0.461358,"(0.390739, 0.525623)",0.033397,5-Day
9,tfidf (Baseline),0.6207,"(0.559945, 0.679597)",0.030598,5-Day
