In [None]:
import streamlit as st
st.markdown("""
<div style="background-color: #000020; color: white; text-align: center; padding: 20px">
  <h1 style="margin: 0; color: white"><b>Questioning the Answers: LLMs enter the Boardroom</b></h1>
  <h2 style="margin: 0; color: white"><b>Using Gen AI Tools to Harness Alpha from Earnings Calls</b></h2>
  <h3 style="margin: 0; color: white"><b>by S&P Global Market Intelligence's Quantitative Research & Solutions (QRS) Group</b></h3>
</div>
""", unsafe_allow_html=True)

Read this QuickStart to set up an environment to run this notebook:

# 1. Overview

Earnings calls play a pivotal role in shaping investor perceptions. The quality of communication between executives and analysts can significantly influence company performance. On-topic and proactive executives, who anticipate market queries in their presentations and give clear, on-topic answers to analysts’ questions, consistently outperform their peers. Conversely, off-topic and reactive executives, who fail to address analysts’ key inquiries during presentations and give off-topic responses, significantly underperform.

Executives' ability to anticipate investor concerns and maintain a focused dialogue fosters confidence and strategic communication. In contrast, failing to provide clarity when analysts seek additional information can lead to misalignment and breakdowns in transparency. A portfolio that is long firms with on-topic and proactive executives and short firms with off-topic and reactive executives generates +515 bps of annualized alpha.

This notebook serves as the blueprint for the research detailed in Quantitative Research & Solutions’ recent publication, ["Questioning the Answers: LLMs enter the Boardroom."](https://www.spglobal.com/market-intelligence/en/news-insights/research/questioning-the-answers-llms-enter-the-boardroom) It analyzes executive on-topicness and proactiveness using the analysts' questions, the executives' answers, and the LLM's answers. The research harnesses alpha using LLM tools, including vector embeddings, vector cosine similarity, and LLM question answering.

# 2. Libraries & User Inputs
Import libraries required for the workflow

## 2.1 Libraries

In [None]:
from snowflake.snowpark.context import get_active_session
session = get_active_session()

## 2.2 User Inputs

This research involves an embedding model and a completion model. The defaults are "snowflake-arctic-embed-m" for embedding and "llama3.1-8b" for completion. This user input section gives you the flexibility to choose your own models for the task.

In [None]:
# Which of the two embedding functions we want to use
snf_embed_text_func = "SNOWFLAKE.CORTEX.EMBED_TEXT_768" 

# Which embedding model we want to use, see https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions#availability 
# for available embedding models in Snowflake
embedding_model = "snowflake-arctic-embed-m" 

# Which LLM we want to use, see https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions#availability 
# for available LLMs in Snowflake
completion_model = "llama3.1-8b" 

# Name of the database created in the Setup Snowflake step
sp_llm_qs_location = "QRSLLM_POC_DB.QTA_SCHEMA"

# Name of the shared database created at the "Request the S&P Global Market Intelligence QuickStart dataset" step
sp_qs_share_location = "SPGLOBALXPRESSCLOUD_SPGMIQRS.XPRESSFEED"

# 3. Datasets

This notebook uses the Q&A pairing table from the S&P Global Q4 2024 earnings call held on Feb 11. The data can be found in the GitHub directory.

In [None]:
session.sql(f'''SELECT * FROM {sp_llm_qs_location}.SAMPLE_QA_PAIR ORDER BY ANSWERORDER ASC''')

# 4. Working with the Data: LLM Ready Data

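Using an LLM to answer analysts' questions based only on the prepared remarks and the previous questions and answers indicates whether executives are proactive. If the LLM can answer a question from that material alone, the executives had already addressed the topic before being asked.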

## 4.1 Using Snowflake Cortex AI
When calling the Snowflake Cortex COMPLETE function, messages are organized into distinct roles—system, user, and assistant—to structure and guide interactions. Each role serves a specific purpose:

System: Provides instructions that define the context or behavior of the model. It's like setting the rules or tone for the conversation. Example: "You are a helpful assistant that answers questions about technology in a concise manner."

User: Represents the input or queries made by the person interacting with the model. These are the prompts or requests that the model responds to. Example: "What is the purpose of the OpenAI API?"

Assistant: Reflects the model's response to the user's query, shaped by the system's instructions and the user's input. Example: "The OpenAI API is designed to enable developers to integrate language models into their applications for tasks like answering questions, generating content, and more."

In our research, executive prepared remarks are labelled as assistant messages, analysts' questions as user messages, and executive answers as assistant messages.
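
The role mapping described above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual prompt-building code (which is not shown here): the helper `build_messages`, its arguments, and the placeholder texts are all hypothetical. It assumes each message is an object with `role` and `content` keys.

```python
import json


def build_messages(prepared_remarks, prior_qa, question):
    """Assemble a role-tagged message list for one analyst question.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # Prepared remarks come first, labelled as an assistant message (per the text above).
    messages = [{"role": "assistant", "content": prepared_remarks}]
    # Earlier Q&A pairs: analyst question -> user, executive answer -> assistant.
    for prior_question, prior_answer in prior_qa:
        messages.append({"role": "user", "content": prior_question})
        messages.append({"role": "assistant", "content": prior_answer})
    # The question the LLM must answer comes last, as a user message.
    messages.append({"role": "user", "content": question})
    return messages


# Placeholder texts, not taken from any real transcript.
example = build_messages(
    prepared_remarks="Revenue grew this quarter, driven by subscription products.",
    prior_qa=[("What drove margin expansion?", "Mostly operating leverage.")],
    question="How should we think about revenue growth next year?",
)
print(json.dumps(example, indent=2))
```

Because the executive's actual answer to the final question is left out, the model can only draw on what was said earlier in the call.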

### 4.1.1 Apply LLM Completion
We apply the SNOWFLAKE.CORTEX.COMPLETE function to the prompt column using the model defined by `completion_model` and collect the LLM response.

In [None]:
df = session.sql(f"""

    select *,  SNOWFLAKE.CORTEX.COMPLETE('{completion_model}', prompt, {{'temperature': 0}}) as LLMAnswer 
    from {sp_llm_qs_location}.SAMPLE_QA_PAIR
          
""")

df.write.save_as_table(f"{sp_llm_qs_location}.TranscriptComponents_targetLLMAnswer", mode="overwrite", table_type='transient')
df.limit(5)
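
A note on the doubled braces in the query above: inside a Python f-string, `{{` and `}}` are escapes for literal braces, so the options argument reaches Snowflake as `{'temperature': 0}`. The sketch below renders the same f-string locally, without a Snowflake session, to show the SQL text that would be sent. The variable values are copied from the User Inputs section.

```python
# Values copied from the User Inputs section of this notebook.
completion_model = "llama3.1-8b"
sp_llm_qs_location = "QRSLLM_POC_DB.QTA_SCHEMA"

# Single braces are substituted by Python; doubled braces become literal braces.
query = f"""
    select *, SNOWFLAKE.CORTEX.COMPLETE('{completion_model}', prompt, {{'temperature': 0}}) as LLMAnswer
    from {sp_llm_qs_location}.SAMPLE_QA_PAIR
"""
print(query)
```

Printing the rendered query before passing it to `session.sql` is a cheap way to check the substitutions when swapping in a different model or schema.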

### 4.1.2 Clean Up LLM Response

Extract only the message text from each LLM response.

In [None]:
df = session.sql(f"""

    select 
        callDate, tradingItemId, transcriptId, headline, 
        questionTranscriptPersonName, questionTranscriptPersonId, questionProId, answerTranscriptPersonName, answerTranscriptPersonId, answerProId,
        questionTranscriptComponentId, answerTranscriptComponentId, question, answer,
        REPLACE(LLMANSWER: "choices"[0]: "messages", '"', '') as cleanLLMAnswer 
    from {sp_llm_qs_location}.TranscriptComponents_targetLLMAnswer
          
""")

df.write.save_as_table(f"{sp_llm_qs_location}.TranscriptComponents_targetLLMAnswer_Clean", mode="overwrite", table_type='transient')
df.limit(5)
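
For readers who want to inspect a single response outside SQL, the same extraction can be sketched in plain Python. The sample string below is hypothetical and only mimics the `"choices"[0]."messages"` path used in the query above; a real response carries additional fields.

```python
import json

# Hypothetical raw response shaped like the path used in the SQL above.
raw_response = '{"choices": [{"messages": " The company said \\"growth\\" was strong."}]}'


def extract_message(raw):
    """Parse the JSON response and return only the generated text."""
    parsed = json.loads(raw)
    # Same path as the SQL: first element of "choices", then its "messages" field.
    return parsed["choices"][0]["messages"].strip()


clean = extract_message(raw_response)
print(clean)
```

One difference worth noting: parsing the JSON keeps quotation marks that are part of the answer text, whereas the SQL `REPLACE` removes every double-quote character from the extracted value.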

## 4.2 Summarize Text

Use the Snowflake Cortex SUMMARIZE function to summarize the question, the answer, and the LLM answer.

In [None]:
df = session.sql(f"""

    select 
        *, 
        SNOWFLAKE.CORTEX.SUMMARIZE(question) as summarizeQuestion,
        SNOWFLAKE.CORTEX.SUMMARIZE(answer) as summarizeAnswer,
        SNOWFLAKE.CORTEX.SUMMARIZE(cleanLLMAnswer) as summarizeCleanLLMAnswer
    from {sp_llm_qs_location}.TranscriptComponents_targetLLMAnswer_Clean
          
""")

df.write.save_as_table(f"{sp_llm_qs_location}.TranscriptComponents_targetLLMAnswer_Summarize", mode="overwrite", table_type='transient')
df.limit(5)

# 5. Working with the Data: Factor Construction
## 5.1 Executive On/Off Topic Factor
When an executive answer is semantically similar (dissimilar) to the analyst’s question, it suggests that the answer uses language and concepts similar to (different from) the analyst question, indicating it is on-topic (off-topic). To determine semantic closeness, we vector-embed the question-and-answer texts and calculate a cosine similarity score between the two vectors.
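
To make the similarity score concrete, here is a minimal pure-Python sketch of cosine similarity, the dot product of two vectors divided by the product of their lengths. The three-element vectors are toy values chosen for illustration; real embeddings from the function configured above have hundreds of dimensions, and the notebook computes the score in SQL rather than with this helper.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors, not real embeddings.
question_vec = [1.0, 2.0, 0.0]
on_topic_answer_vec = [2.0, 4.0, 0.0]   # same direction as the question
off_topic_answer_vec = [0.0, 0.0, 3.0]  # orthogonal to the question

print(cosine_similarity(question_vec, on_topic_answer_vec))   # close to 1
print(cosine_similarity(question_vec, off_topic_answer_vec))  # 0
```

The score depends only on the direction of the vectors, not their length, which is why the on-topic toy answer scores near 1 even though it is twice as long as the question vector.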

### 5.1.1 Question vs Executive Answer Cosine Similarity

In [None]:
df = session.sql(f'''

    with vec as(
    select *
        , {snf_embed_text_func}('{embedding_model}', summarizeQuestion) as questionVec
        , {snf_embed_text_func}('{embedding_model}', summarizeAnswer) as answerVec 
    from {sp_llm_qs_location}.TranscriptComponents_targetLLMAnswer_Summarize
    )
    select *, VECTOR_COSINE_SIMILARITY(questionVec, answerVec) as execOnOffTopicFactor from vec 
               
''')

df.write.save_as_table(f"{sp_llm_qs_location}.TranscriptComponents_execOnOffTopicFactor", mode="overwrite", table_type='transient')
df.limit(5)

### 5.1.2 Transcript Mean Executive On/Off Topic Factor
Cosine similarity scores are averaged at the transcript level. A high (low) Cosine Similarity Score indicates an On (Off) Topic Executive.



In [None]:
select avg(execOnOffTopicFactor) as transcriptLevelExecOnOffTopicFactor 
from {{sp_llm_qs_location}}.TranscriptComponents_execOnOffTopicFactor
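
The sample table holds a single call, so the query above averages every row. With several transcripts in one table, the average would presumably be taken per transcript. The sketch below shows that grouping in plain Python under that assumption; the rows and scores are made-up values, and `transcript_means` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows: one cosine similarity score per question-answer pair.
rows = [
    {"transcriptId": 1, "execOnOffTopicFactor": 0.80},
    {"transcriptId": 1, "execOnOffTopicFactor": 0.60},
    {"transcriptId": 2, "execOnOffTopicFactor": 0.40},
]


def transcript_means(rows, factor_col):
    """Average a pair-level factor within each transcript."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["transcriptId"]].append(row[factor_col])
    return {transcript_id: mean(values) for transcript_id, values in scores.items()}


means = transcript_means(rows, "execOnOffTopicFactor")
for transcript_id, value in means.items():
    print(transcript_id, round(value, 4))
```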

## 5.2 Executive Proactive/Reactive Factor
### 5.2.1 Question vs LLM Answer Cosine Similarity
Since the LLM answers only within the context of information provided in the prepared remarks, a high (low) cosine similarity score indicates that the LLM answers are semantically similar (dissimilar) to the questions, suggesting that the executives are proactive (reactive).

In [None]:
df = session.sql(f'''

    with vec as(
    select 
        *, 
        {snf_embed_text_func}('{embedding_model}', summarizeQuestion) as questionVec, 
        {snf_embed_text_func}('{embedding_model}', summarizeCleanLLMAnswer) as LLMAnswerVec 
    from {sp_llm_qs_location}.TranscriptComponents_targetLLMAnswer_Summarize
    )
    select *, VECTOR_COSINE_SIMILARITY(questionVec, LLMAnswerVec) as execProactiveReactiveFactor from vec 
               
''')

df.write.save_as_table(f"{sp_llm_qs_location}.TranscriptComponents_execProactiveReactiveFactor", mode="overwrite", table_type='transient')
df.limit(5)


### 5.2.2 Transcript Mean Executive Proactive/Reactive Factor
Similar to the construction of the Executive On/Off Topic factor, both the LLM answers and questions are summarized, vector-embedded and cosine similarity scores are averaged at the transcript level.

In [None]:
select avg(execProactiveReactiveFactor) as transcriptLevelexecProactiveReactiveFactor 
from {{sp_llm_qs_location}}.TranscriptComponents_execProactiveReactiveFactor

# 6. Results & Summary
This research underscores the significant impact of executive communication styles during earnings calls on firm performance. Proactive executives who anticipate market concerns and provide concise, on-topic responses foster transparency, aligning with investor expectations and driving superior returns. The findings demonstrate that firms with Efficient Communicators achieve statistically significant outperformance, while Total Redirectors suffer from diminished confidence and underperformance. These insights validate the critical role of strategic communication in shaping investor perceptions and influencing market outcomes.

Advanced analytical tools, such as vector embeddings and cosine similarity metrics, enable nuanced evaluations of executive-analyst interactions, revealing measurable performance effects across different communication styles. While large language models (LLMs) enhance feature extraction, challenges like forward-looking bias and inconsistency highlight the need for caution in time-sensitive tasks. Overall, the integration of proactive, clear, and relevant communication strategies remains paramount in fostering investor trust and maximizing financial success in a competitive marketplace.