<a href="https://colab.research.google.com/github/sheldonkemper/bank_of_england/blob/main/notebooks/modelling/am_boe_topic_modelling_v2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

############################################
# Topic Modeling Using GPT-4-Turbo on Financial Text
############################################
#
# This script processes financial text data from CSV files using GPT-4-Turbo to identify topics discussed in the text.
# It follows these steps:
#
# Step 1: Load the input CSV file into a pandas DataFrame.
# Step 2: Define a function to interact with OpenAI GPT-4-Turbo to extract topics and their descriptions.
# Step 3: Apply this function to the 'Question_cleaned' column to create topic-related columns.
# Step 4: Apply the same function to the 'Response_cleaned' column.
# Step 5: Save the final DataFrame to a CSV file.
# Step 6: Split the topic descriptions into separate rows or columns and save the breakdown.
#
# Required Libraries:
# - openai: For interacting with GPT-4-Turbo.
# - pandas: For handling data operations.
# - os: For managing environment variables.
# - re: For splitting the model output with regular expressions.
############################################

In [None]:
# Install required libraries (if not already installed)
!pip install -q openai==0.28
!pip install -q pandas


In [None]:
# Import necessary libraries
import openai
import pandas as pd
import os
import re

# Read the OpenAI API key from the environment and register it with the client
openai.api_key = os.getenv("Openai_key")
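
The key is read from the `Openai_key` environment variable above, but nothing stops the notebook from continuing when that variable is missing. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper written for this notebook, not part of the `openai` package.

```python
import os

def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or blank."""
    value = os.getenv(name)
    if value is None or not value.strip():
        raise RuntimeError(f"Environment variable {name!r} is not set.")
    return value.strip()
```

It could be used as `openai.api_key = require_env("Openai_key")`, so that a missing key surfaces here rather than as an "Error" row during topic extraction.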


In [None]:
############################################
# Step 1: Load CSV Files into DataFrame
############################################

# Load the CSV file (assumes it has already been uploaded to /content in Colab)
input_df = pd.read_csv("/content/ubs_qa_df_preprocessed_ver2.csv")
print("CSV file loaded successfully.")

CSV file loaded successfully.
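
The later steps depend on the 'Question_cleaned' and 'Response_cleaned' columns being present. A small sketch of a check that could run right after loading; `missing_columns` and `REQUIRED_COLUMNS` are hypothetical names introduced for illustration.

```python
REQUIRED_COLUMNS = ["Question_cleaned", "Response_cleaned"]

def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `columns`, in order."""
    present = set(columns)
    return [name for name in required if name not in present]

# Toy column list standing in for input_df.columns
print(missing_columns(["Unnamed: 0", "Question_cleaned"]))  # ['Response_cleaned']
```

With the real data this would be called as `missing_columns(input_df.columns)`; an empty list means both columns are available.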


In [None]:
input_df

Unnamed: 0,filename,Quarter,Question,Question_cleaned,Analyst_Bank,Response,Response_cleaned,Executive
0,1q23-earnings-call-remarks.pdf,1Q23,"Chis Hallam, Goldman Sachs Yes. Good morning, ...",['okay thank you capital requirements know sit...,Goldman,"Okay. Thank you. On capital requirements, you ...",['chis hallam goldman sachs yes good morning e...,['Sergio P. Ermotti']
1,1q23-earnings-call-remarks.pdf,1Q23,Yeah. Thanks. Just two questions. The first on...,['so sarah take first question take secondso g...,JPMorgan,"So, Sarah, take the first question. I'll take ...",['yeah thanks two questions first one related ...,"['Sergio P. Ermotti', 'Sarah Youngwood']"
2,1q23-earnings-call-remarks.pdf,1Q23,"Yeah. Thank you. Good morning. Welcome back, S...",['thank you ryan good back interact well look ...,Bank of America,"Thank you, Ryan. It is good to be back to inte...",['yeah thank you good morning welcome back ser...,['Sergio P. Ermotti']
3,1q23-earnings-call-remarks.pdf,1Q23,Yes. Good morning. The first question I wanted...,['so first question terms trends april really ...,Jefferies,"So, on the first question in terms of the tren...",['yes good morning first question wanted ask r...,"['Sergio P. Ermotti', 'Sarah Youngwood']"
4,1q23-earnings-call-remarks.pdf,1Q23,"Good morning. Two questions. Firstly, just on ...",['so first quarter first question seen signifi...,Citi,"So, on the first quarter or the first question...",['good morning two questions firstly slide 10 ...,['Sarah Youngwood']
...,...,...,...,...,...,...,...,...
91,4q24-earnings-call-remarks.pdf,4Q24,Good morning. One follow-up on GWM Americas an...,['first first question respect fa comp changes...,Citi,"So first, on your first question in respect to...",['good morning one follow-up gwm americas perh...,['Todd Tuckner']
92,4q24-earnings-call-remarks.pdf,4Q24,Very helpful. Thank you very much.,[''],Autonomous Research,,['helpful thank much'],[]
93,4q24-earnings-call-remarks.pdf,4Q24,Hi. Thank you and thank you for the clarificat...,['yeah terms expectations really articulated s...,Mediobanca,"Yeah. So in terms of the expectations, we have...",['hi thank thank clarifications capital got co...,['Todd Tuckner']
94,4q24-earnings-call-remarks.pdf,4Q24,Good morning. It's Antonio from Bank of Americ...,['terms mitigants bit headwinds know definitel...,Bank of America,So in terms of the mitigants a bit to the head...,['good morning antonio bank america two questi...,['Todd Tuckner']


In [None]:
############################################
# Step 2: Define Function to Extract Topics using GPT-4-Turbo
############################################
print("Defining GPT-4-Turbo function for topic extraction...")

def extract_topics(text):
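    """
    Send a financial text to GPT-4-Turbo and ask for the topics it discusses.

    Returns a (topic, topic_description) tuple: the first line of the model's
    reply and the remainder of the reply. Empty input yields placeholder strings,
    and any API error is returned as ("Error", <message>).
    """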
    if pd.isna(text) or text.strip() == "":
        return "No Topic", "No Relevant Sentences"

    prompt = f"""
    Analyze the following financial text and extract distinct topics discussed.
    Provide:
    1. The topic names.
    2. Example sentences that helped you identify the topic.

    Text:
    {text}
    """

    try:
        response = openai.ChatCompletion.create(
            model="gpt-4-turbo",
            messages=[{"role": "user", "content": prompt}]
        )

        result = response["choices"][0]["message"]["content"]

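        # Split the reply at the first newline: the first line becomes `topic`
        # (often just the model's preamble) and the rest becomes `topic_description`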
        if "\n" in result:
            parts = result.split("\n", 1)
            topic = parts[0].strip()
            topic_description = parts[1].strip()
        else:
            topic = result.strip()
            topic_description = "Description not available"

        return topic, topic_description
    except Exception as e:
        return "Error", str(e)

Defining GPT-4-Turbo function for topic extraction...
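
The function above turns any exception into an "Error" row, so a transient failure such as a rate limit is never retried. A hedged sketch of a generic retry wrapper with exponential backoff follows. `call_with_retry` is a hypothetical helper; it would have to wrap the raw API call inside `extract_topics`, because `extract_topics` itself never raises.

```python
import time

def call_with_retry(fn, *args, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(*args); after a failure wait base_delay * 2**attempt, then try again.

    Re-raises the last exception once `retries` attempts have failed.
    The `sleep` parameter exists only so the wait can be stubbed out in tests.
    """
    for attempt in range(retries):
        try:
            return fn(*args)
        except Exception:
            if attempt == retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

With the defaults, a call that keeps failing waits 1 second and then 2 seconds before the third and final attempt.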


In [None]:
############################################
# Step 3: Apply Function to 'Question_cleaned' Column
############################################

input_df[['Question_topic', 'Question_topic_description']] = input_df['Question_cleaned'].apply(
    lambda x: pd.Series(extract_topics(x))
)

question_df = input_df.copy()
print("Topic extraction for questions completed.")

Topic extraction for questions completed.
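
Each row triggers one API request, even when the same text appears more than once. A minimal sketch of caching results per distinct value, assuming the function being applied is deterministic enough for reuse to be acceptable; `map_unique` is a hypothetical helper.

```python
def map_unique(values, fn):
    """Apply fn once per distinct value and return results aligned with `values`."""
    cache = {}
    results = []
    for value in values:
        if value not in cache:
            cache[value] = fn(value)  # only the first occurrence calls fn
        results.append(cache[value])
    return results
```

Because `extract_topics` returns a two-item tuple, the result could be loaded with `pd.DataFrame(map_unique(input_df['Question_cleaned'], extract_topics), columns=['Question_topic', 'Question_topic_description'], index=input_df.index)`.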


In [None]:
question_df

Unnamed: 0,filename,Quarter,Question,Question_cleaned,Analyst_Bank,Response,Response_cleaned,Executive,Question_topic,Question_topic_description
0,1q23-earnings-call-remarks.pdf,1Q23,"Chis Hallam, Goldman Sachs Yes. Good morning, ...",['okay thank you capital requirements know sit...,Goldman,"Okay. Thank you. On capital requirements, you ...",['chis hallam goldman sachs yes good morning e...,['Sergio P. Ermotti'],### Topics Analysis from Financial Text:,1. **Capital Requirements and Scope of Perimet...
1,1q23-earnings-call-remarks.pdf,1Q23,Yeah. Thanks. Just two questions. The first on...,['so sarah take first question take secondso g...,JPMorgan,"So, Sarah, take the first question. I'll take ...",['yeah thanks two questions first one related ...,"['Sergio P. Ermotti', 'Sarah Youngwood']",The provided financial text discusses multiple...,1. **Economic Impact Analysis**\n - Example ...
2,1q23-earnings-call-remarks.pdf,1Q23,"Yeah. Thank you. Good morning. Welcome back, S...",['thank you ryan good back interact well look ...,Bank of America,"Thank you, Ryan. It is good to be back to inte...",['yeah thank you good morning welcome back ser...,['Sergio P. Ermotti'],"Based on the text provided, several distinct t...",1. **Topic Name: Asset Management and Restruct...
3,1q23-earnings-call-remarks.pdf,1Q23,Yes. Good morning. The first question I wanted...,['so first question terms trends april really ...,Jefferies,"So, on the first question in terms of the tren...",['yes good morning first question wanted ask r...,"['Sergio P. Ermotti', 'Sarah Youngwood']",The financial text provided discusses several ...,1. **Regional Strengths and Complementarity in...
4,1q23-earnings-call-remarks.pdf,1Q23,"Good morning. Two questions. Firstly, just on ...",['so first quarter first question seen signifi...,Citi,"So, on the first quarter or the first question...",['good morning two questions firstly slide 10 ...,['Sarah Youngwood'],"From the provided financial text, several dist...",1. **Mix Shifts in Financial Systems**\n - *...
...,...,...,...,...,...,...,...,...,...,...
91,4q24-earnings-call-remarks.pdf,4Q24,Good morning. One follow-up on GWM Americas an...,['first first question respect fa comp changes...,Citi,"So first, on your first question in respect to...",['good morning one follow-up gwm americas perh...,['Todd Tuckner'],"From the provided financial text, the followin...",1. **Compensation and Strategy Alignment**\n ...
92,4q24-earnings-call-remarks.pdf,4Q24,Very helpful. Thank you very much.,[''],Autonomous Research,,['helpful thank much'],[],"It appears that the provided text is empty, as...",Description not available
93,4q24-earnings-call-remarks.pdf,4Q24,Hi. Thank you and thank you for the clarificat...,['yeah terms expectations really articulated s...,Mediobanca,"Yeah. So in terms of the expectations, we have...",['hi thank thank clarifications capital got co...,['Todd Tuckner'],"Based on the provided financial text, I have i...",1. **Financial Expectations and Goals**:\n -...
94,4q24-earnings-call-remarks.pdf,4Q24,Good morning. It's Antonio from Bank of Americ...,['terms mitigants bit headwinds know definitel...,Bank of America,So in terms of the mitigants a bit to the head...,['good morning antonio bank america two questi...,['Todd Tuckner'],From the provided text which seems to be part ...,1. **Revenue Growth and Hedging Strategies**\n...
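
Row 92 above shows the model being asked about a text that is effectively empty: the cleaned columns hold stringified lists such as `['']`, which pass the `text.strip() == ""` check in `extract_topics`. A sketch of a stricter emptiness test follows; `is_effectively_empty` is a hypothetical helper, and treating `'nan'` entries as empty is an assumption about how the preprocessing encoded missing values.

```python
import ast

def is_effectively_empty(text):
    """True for None, non-strings (e.g. NaN), blank strings, and lists like "['']" or "['nan']"."""
    if not isinstance(text, str):
        return True
    stripped = text.strip()
    if not stripped:
        return True
    try:
        parsed = ast.literal_eval(stripped)
    except (ValueError, SyntaxError, TypeError):
        return False  # ordinary free text, not a Python literal
    if isinstance(parsed, (list, tuple)):
        # Empty only if every item is blank or the string 'nan'
        return all(str(item).strip().lower() in ("", "nan") for item in parsed)
    return False
```

Calling this before the API request would let such rows return "No Topic" without spending a request.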


In [None]:
############################################
# Step 4: Apply Function to 'Response_cleaned' Column
############################################
print("Applying GPT-4-Turbo function on 'Response_cleaned' column...")

question_df[['Answer_topic', 'Answer_topic_description']] = question_df['Response_cleaned'].apply(
    lambda x: pd.Series(extract_topics(x))
)

final_df = question_df.copy()
print("Topic extraction for responses completed.")

Applying GPT-4-Turbo function on 'Response_cleaned' column...
Topic extraction for responses completed.


In [None]:
final_df

Unnamed: 0,filename,Quarter,Question,Question_cleaned,Analyst_Bank,Response,Response_cleaned,Executive,Question_topic,Question_topic_description,Answer_topic,Answer_topic_description
0,1q23-earnings-call-remarks.pdf,1Q23,"Chis Hallam, Goldman Sachs Yes. Good morning, ...",['okay thank you capital requirements know sit...,Goldman,"Okay. Thank you. On capital requirements, you ...",['chis hallam goldman sachs yes good morning e...,['Sergio P. Ermotti'],### Topics Analysis from Financial Text:,1. **Capital Requirements and Scope of Perimet...,### 1. Topics discussed in the text:,**Topic 1: Capital Requirements and Regulatory...
1,1q23-earnings-call-remarks.pdf,1Q23,Yeah. Thanks. Just two questions. The first on...,['so sarah take first question take secondso g...,JPMorgan,"So, Sarah, take the first question. I'll take ...",['yeah thanks two questions first one related ...,"['Sergio P. Ermotti', 'Sarah Youngwood']",The provided financial text discusses multiple...,1. **Economic Impact Analysis**\n - Example ...,Based on the analysis of the provided financia...,1. **Book Value Per Share Guidance:**\n - Ex...
2,1q23-earnings-call-remarks.pdf,1Q23,"Yeah. Thank you. Good morning. Welcome back, S...",['thank you ryan good back interact well look ...,Bank of America,"Thank you, Ryan. It is good to be back to inte...",['yeah thank you good morning welcome back ser...,['Sergio P. Ermotti'],"Based on the text provided, several distinct t...",1. **Topic Name: Asset Management and Restruct...,"Based on the text provided, here are the topic...",1. **Merger and Acquisition Strategy**\n - *...
3,1q23-earnings-call-remarks.pdf,1Q23,Yes. Good morning. The first question I wanted...,['so first question terms trends april really ...,Jefferies,"So, on the first question in terms of the tren...",['yes good morning first question wanted ask r...,"['Sergio P. Ermotti', 'Sarah Youngwood']",The financial text provided discusses several ...,1. **Regional Strengths and Complementarity in...,"Based on the given financial text, several dis...","1. **Net New Money Trends**\n - _""first ques..."
4,1q23-earnings-call-remarks.pdf,1Q23,"Good morning. Two questions. Firstly, just on ...",['so first quarter first question seen signifi...,Citi,"So, on the first quarter or the first question...",['good morning two questions firstly slide 10 ...,['Sarah Youngwood'],"From the provided financial text, several dist...",1. **Mix Shifts in Financial Systems**\n - *...,The text involves a financial discussion focus...,1. **Net Interest Income (NII) Forecast and Ra...
...,...,...,...,...,...,...,...,...,...,...,...,...
91,4q24-earnings-call-remarks.pdf,4Q24,Good morning. One follow-up on GWM Americas an...,['first first question respect fa comp changes...,Citi,"So first, on your first question in respect to...",['good morning one follow-up gwm americas perh...,['Todd Tuckner'],"From the provided financial text, the followin...",1. **Compensation and Strategy Alignment**\n ...,"From the provided text, the following distinct...",1. **Strategic Changes and Incentives in GWM A...
92,4q24-earnings-call-remarks.pdf,4Q24,Very helpful. Thank you very much.,[''],Autonomous Research,,['helpful thank much'],[],"It appears that the provided text is empty, as...",Description not available,"Based on the provided text ""helpful thank much...",1. **Customer Service or Support**\n - Examp...
93,4q24-earnings-call-remarks.pdf,4Q24,Hi. Thank you and thank you for the clarificat...,['yeah terms expectations really articulated s...,Mediobanca,"Yeah. So in terms of the expectations, we have...",['hi thank thank clarifications capital got co...,['Todd Tuckner'],"Based on the provided financial text, I have i...",1. **Financial Expectations and Goals**:\n -...,"Upon analyzing the provided text, distinct top...",1. **Future Profitability Goals:**\n - **Top...
94,4q24-earnings-call-remarks.pdf,4Q24,Good morning. It's Antonio from Bank of Americ...,['terms mitigants bit headwinds know definitel...,Bank of America,So in terms of the mitigants a bit to the head...,['good morning antonio bank america two questi...,['Todd Tuckner'],From the provided text which seems to be part ...,1. **Revenue Growth and Hedging Strategies**\n...,The analyzed financial text discusses two main...,1. **Capital Management in Banking**\n - Exa...


In [None]:
############################################
# Step 5: Save Final DataFrame to CSV
############################################
print("Saving final DataFrame to CSV...")

final_df.to_csv("tqc_topic_model_jpmorgan.csv", index=False)
print("CSV file saved successfully.")

Saving final DataFrame to CSV...
CSV file saved successfully.


In [None]:
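# Function to split 'Answer_topic_description' into one row per fragment.
# Note: the pattern contains a capturing group, so re.split() returns the captured
# topic names interleaved with the text that follows them; names and their
# descriptions therefore land on alternating rows. Only headings written as
# "N. **Name**:" are matched.
def split_topics(row):
    topics = re.split(r'\n?\d+\.\s\*\*(.*?)\*\*:', row['Answer_topic_description'])
    topics = [t.strip() for t in topics if t.strip()]  # Remove empty entries
    return pd.DataFrame([{**row, 'Answer_topic_description': t} for t in topics])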

# Apply the function to each row and reset the index
expanded_df = pd.concat([split_topics(row) for _, row in final_df.iterrows()], ignore_index=True)

# Display the processed DataFrame
expanded_df

Unnamed: 0,Quarter,Question,Question_cleaned,Analyst,Analyst Role,Response,Response_cleaned,Executive,Executive Role Type,Question_topic,Question_topic_description,Answer_topic,Answer_topic_description
0,4Q24,"Hi. Good morning. Jeremy, I wanted to ask abou...",['hi good morning jeremy wanted ask capital kn...,John McDonald,"Analyst, Truist Securities, Inc.","Yeah. Good question, John, and welcome back, b...",['yeah good question john welcome back way so ...,Jeremy Barnum,CFO,"Based on the provided text, here are the disti...",1. **Capital Management:**\n - Example sente...,"Based on the provided financial text, the foll...",Strategic Financial Decisions & Capital Alloca...
1,4Q24,"Hi. Good morning. Jeremy, I wanted to ask abou...",['hi good morning jeremy wanted ask capital kn...,John McDonald,"Analyst, Truist Securities, Inc.","Yeah. Good question, John, and welcome back, b...",['yeah good question john welcome back way so ...,Jeremy Barnum,CFO,"Based on the provided text, here are the disti...",1. **Capital Management:**\n - Example sente...,"Based on the provided financial text, the foll...",- This topic is identified through the discuss...
2,4Q24,"Hi. Good morning. Jeremy, I wanted to ask abou...",['hi good morning jeremy wanted ask capital kn...,John McDonald,"Analyst, Truist Securities, Inc.","Yeah. Good question, John, and welcome back, b...",['yeah good question john welcome back way so ...,Jeremy Barnum,CFO,"Based on the provided text, here are the disti...",1. **Capital Management:**\n - Example sente...,"Based on the provided financial text, the foll...",Investment Strategy and Focus Areas
3,4Q24,"Hi. Good morning. Jeremy, I wanted to ask abou...",['hi good morning jeremy wanted ask capital kn...,John McDonald,"Analyst, Truist Securities, Inc.","Yeah. Good question, John, and welcome back, b...",['yeah good question john welcome back way so ...,Jeremy Barnum,CFO,"Based on the provided text, here are the disti...",1. **Capital Management:**\n - Example sente...,"Based on the provided financial text, the foll...",- The focus is on high-certainty investment ch...
4,4Q24,"Hi. Good morning. Jeremy, I wanted to ask abou...",['hi good morning jeremy wanted ask capital kn...,John McDonald,"Analyst, Truist Securities, Inc.","Yeah. Good question, John, and welcome back, b...",['yeah good question john welcome back way so ...,Jeremy Barnum,CFO,"Based on the provided text, here are the disti...",1. **Capital Management:**\n - Example sente...,"Based on the provided financial text, the foll...",Operating Efficiency and Cost Management
...,...,...,...,...,...,...,...,...,...,...,...,...,...
206,1Q23,Thank you. So you talked about in your letter ...,['thank you talked letter regulators avoiding ...,Glenn Schorr,"Analyst, Evercore ISI",,['nan'],Jeremy Barnum,CFO & Co.,"Based on the analysis of the provided text, th...",1. **Customer and Corporate Response to Financ...,Based on the provided text which contains only...,Unavailable or Missing Data
207,1Q23,Thank you. So you talked about in your letter ...,['thank you talked letter regulators avoiding ...,Glenn Schorr,"Analyst, Evercore ISI",,['nan'],Jeremy Barnum,CFO & Co.,"Based on the analysis of the provided text, th...",1. **Customer and Corporate Response to Financ...,Based on the provided text which contains only...,Example Sentence
208,1Q23,Thank you. So you talked about in your letter ...,['thank you talked letter regulators avoiding ...,Glenn Schorr,"Analyst, Evercore ISI",,['nan'],Jeremy Barnum,CFO & Co.,"Based on the analysis of the provided text, th...",1. **Customer and Corporate Response to Financ...,Based on the provided text which contains only...,"In the entire input text, the only element ava..."
209,1Q23,Good morning. You guys talked about one of the...,['good morning guys talked one drivers higher ...,Matt O'Connor,"Analyst, Deutsche Bank Securities, Inc.",,['nan'],Jeremy Barnum,CFO & Co.,"Based on the provided financial text, there ar...",1. **Net Interest Income and Credit Card Balan...,The provided text consists solely of the term ...,Description not available
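
In the output above, topic names and their supporting sentences sit on separate rows. A sketch of a parser that keeps each name paired with its body follows, assuming headings of the form `N. **Name**`, optionally with a colon inside or after the bold markers; `parse_topics` and `HEADING` are hypothetical names, and other reply layouts would need their own pattern.

```python
import re

# A numbered heading at the start of a line, e.g. "1. **Capital Management:**"
HEADING = re.compile(r"^\s*\d+\.\s+\*\*(.+?)\*\*:?", re.MULTILINE)

def parse_topics(description):
    """Return [(topic_name, body), ...], one pair per numbered bold heading."""
    matches = list(HEADING.finditer(description))
    pairs = []
    for i, match in enumerate(matches):
        # The body runs from the end of this heading to the start of the next one
        end = matches[i + 1].start() if i + 1 < len(matches) else len(description)
        name = match.group(1).strip().rstrip(":").strip()
        body = description[match.end():end].strip()
        pairs.append((name, body))
    return pairs

sample = '1. **Capital Management:**\n   - Example: "buybacks"\n2. **Cost Control**:\n   - Example: "headcount"'
print(parse_topics(sample))
```

Text without any matching heading, such as "Description not available", yields an empty list, so those rows can be handled separately.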


In [None]:
# Function to spread the numbered topics of one row across Topic_N columns.
# Named topics_to_columns so that it does not overwrite the GPT-based extract_topics() defined in Step 2.
def topics_to_columns(row):
    # Split on numbered markers (1., 2., 3., ...) and keep the text of each section
    sections = re.split(r'\n?\d+\.\s', row['Answer_topic_description'])
    sections = [s.strip() for s in sections if s.strip()]  # Remove empty entries

    # Start from the original row so every existing column is preserved
    new_row = row.to_dict()

    # Assign each section to a new numbered column
    for i, section in enumerate(sections):
        new_row[f"Topic_{i+1}"] = section  # Naming columns as Topic_1, Topic_2, etc.

    return new_row

# Apply the function and expand the DataFrame
expanded_df = pd.DataFrame([topics_to_columns(row) for _, row in final_df.iterrows()])

# Display the processed DataFrame
expanded_df

Unnamed: 0,Quarter,Question,Question_cleaned,Analyst,Analyst Role,Response,Response_cleaned,Executive,Executive Role Type,Question_topic,...,Answer_topic_description,Topic_1,Topic_2,Topic_3,Topic_4,Topic_5,Topic_6,Topic_7,Topic_8,Topic_9
0,4Q24,"Hi. Good morning. Jeremy, I wanted to ask abou...",['hi good morning jeremy wanted ask capital kn...,John McDonald,"Analyst, Truist Securities, Inc.","Yeah. Good question, John, and welcome back, b...",['yeah good question john welcome back way so ...,Jeremy Barnum,CFO,"Based on the provided text, here are the disti...",...,1. **Strategic Financial Decisions & Capital A...,**Strategic Financial Decisions & Capital Allo...,**Investment Strategy and Focus Areas**:\n -...,**Operating Efficiency and Cost Management**:\...,**Technological Modernization and Efficiency**...,**Headcount Management and Organizational Effi...,**Critical Risk Management and Safety**:\n -...,,,
1,4Q24,"Hi. Simple and then more difficult, I guess. J...",['hi simple difficult guess jamie whos success...,Mike Mayo,"Analyst, Wells Fargo Securities LLC",I do love what I do. And answering the second ...,['love do answering second question first look...,Jamie Dimon,CEO,"Based on the provided text, three distinct top...",...,1. **Succession Planning**: \n - Example Sen...,**Succession Planning**: \n - Example Senten...,**Health Concerns and Rational Decisions**:\n ...,**Public Perception and Media Interaction**:\n...,**Talent Review and Internal Decision Making**...,**Uncertainty and Flexibility in Plans**:\n ...,**Leadership Continuity and Company Stability*...,,,
2,4Q24,"Hey. Good morning. Maybe just on regulation, w...",['hey good morning maybe regulation new admini...,Jim Mitchell,"Analyst, Seaport Global Securities LLC","Hey, Jim. I mean, it's obviously something we'...",['hey jim mean obviously something thinking lo...,Jeremy Barnum,CFO,"From the provided text, it seems to discuss to...",...,"- Example Sentence: ""we want a coherent, ratio...","- Example Sentence: ""we want a coherent, ratio...",Topic Name: **Bank's Role and Operations**\n ...,Topic Name: **Challenges to Bank Efficiency**\...,Topic Name: **Capital Management and Basel III...,Topic Name: **Business Sentiment and Loan Grow...,Topic Name: **Market Conditions and Business B...,Topic Name: **Economic Optimism and its Realiz...,,
3,4Q24,"Yes. Hi, good morning. Wanted to follow up on ...",['yes hi good morning wanted follow questions ...,Erika Najarian,"Analyst, UBS Securities LLC","Right, Erika. Okay. You are tempting me with m...",['right erika okay tempting many rabbit holes ...,Jeremy Barnum,CFO,1. **Topic Name: Capital Requirements and Regu...,...,1. **Seasonality and G-SIB Metrics**\n - **E...,**Seasonality and G-SIB Metrics**\n - **Exam...,**Regulatory Reviews and Amendments**\n - **...,**Legal Actions and Industry Response**\n - ...,**Capital Requirements and Economic Growth**\n...,**Scenario Analysis and Capital Projections**\...,,,,
4,4Q24,"Does that conclude your question, Erika?",['conclude question erika'],Erika,Unknown,Very good. We can go to the next question. Tha...,['good go next question thanks yeah'],Jeremy Barnum,CFO,"Based on the given text: ""conclude question er...",...,1. **General Conversation Closure or Transitio...,**General Conversation Closure or Transition**...,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
86,1Q23,"Hey Jeremy, you mentioned a degree of reinterm...",['hey jeremy mentioned degree reintermediation...,Mike Mayo,"Analyst, Wells Fargo Securities LLC",,['nan'],Jamie Dimon,CEO & Co.,1. **Reintermediation in Lending Markets**,...,Description not available,Description not available,,,,,,,,
87,1Q23,Hi. Good morning.\nI do want to unpack the que...,['hi good morning want unpack question possibi...,Betsy L. Graseck,"Analyst, Morgan Stanley & Co. LLC",,['nan'],Jeremy Barnum,CFO & Co.,"Analyzing the provided financial text, the fol...",...,Description not available,Description not available,,,,,,,,
88,1Q23,Thank you. So you talked about in your letter ...,['thank you talked letter regulators avoiding ...,Glenn Schorr,"Analyst, Evercore ISI",,['nan'],Jeremy Barnum,CFO & Co.,"Based on the analysis of the provided text, th...",...,Since the input does not contain any substanti...,Since the input does not contain any substanti...,**Topic Name**: Unavailable or Missing Data,**Example Sentence**: In the entire input text...,,,,,,
89,1Q23,Good morning. You guys talked about one of the...,['good morning guys talked one drivers higher ...,Matt O'Connor,"Analyst, Deutsche Bank Securities, Inc.",,['nan'],Jeremy Barnum,CFO & Co.,"Based on the provided financial text, there ar...",...,Description not available,Description not available,,,,,,,,
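
The wide layout above leaves most `Topic_N` cells empty. If a long layout is easier to analyse, the columns can be unpivoted with `DataFrame.melt`. The sketch below uses a toy frame rather than the real data; the `Topic_slot` and `Topic` column names are assumptions made for illustration.

```python
import pandas as pd

# Toy frame standing in for expanded_df
wide = pd.DataFrame({
    "Quarter": ["4Q24", "1Q23"],
    "Topic_1": ["Capital", "Lending"],
    "Topic_2": ["Costs", None],
})

topic_cols = [c for c in wide.columns if c.startswith("Topic_")]
long_df = (
    wide.melt(id_vars=["Quarter"], value_vars=topic_cols,
              var_name="Topic_slot", value_name="Topic")
        .dropna(subset=["Topic"])   # drop the empty Topic_N cells
        .reset_index(drop=True)
)
print(long_df)
```

Each remaining row holds one topic together with the slot it came from, which makes counting topics per quarter a simple `groupby`.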


In [None]:
############################################
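# Step 6: Save Expanded (Topic Breakdown) DataFrame to CSV
############################################
print("Saving final DataFrame to CSV...")

# Save expanded_df (with the Topic_N columns); final_df was already saved in Step 5
expanded_df.to_csv("tqc_topic_model_jpmorgan_breakdown.csv", index=False)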
print("CSV file saved successfully.")

Saving final DataFrame to CSV...
CSV file saved successfully.


In [None]:
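# Wrap final_df in a new DataFrame object (explode_topics copies it before modifying)
df = pd.DataFrame(final_df)

# Function to split 'Answer_topic_description' on numbered markers and emit one row per piece.
# The pattern requires whitespace before the number, so a leading "1. " stays attached
# to the first piece while later pieces lose their number.
def explode_topics(df):
    df = df.copy()
    df["Answer_topic_description"] = df["Answer_topic_description"].str.split(r'\s\d+\. ')
    # Drop empty fragments left over from the split
    df["Answer_topic_description"] = df["Answer_topic_description"].apply(lambda x: [i for i in x if i])
    return df.explode("Answer_topic_description", ignore_index=True)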

# Applying function
df_exploded = explode_topics(df)

# Displaying the result
df_exploded

Unnamed: 0,filename,Quarter,Question,Question_cleaned,Analyst_Bank,Response,Response_cleaned,Executive,Question_topic,Question_topic_description,Answer_topic,Answer_topic_description
0,1q23-earnings-call-remarks.pdf,1Q23,"Chis Hallam, Goldman Sachs Yes. Good morning, ...",['okay thank you capital requirements know sit...,Goldman,"Okay. Thank you. On capital requirements, you ...",['chis hallam goldman sachs yes good morning e...,['Sergio P. Ermotti'],### Topics Analysis from Financial Text:,1. **Capital Requirements and Scope of Perimet...,### 1. Topics discussed in the text:,**Topic 1: Capital Requirements and Regulatory...
1,1q23-earnings-call-remarks.pdf,1Q23,"Chis Hallam, Goldman Sachs Yes. Good morning, ...",['okay thank you capital requirements know sit...,Goldman,"Okay. Thank you. On capital requirements, you ...",['chis hallam goldman sachs yes good morning e...,['Sergio P. Ermotti'],### Topics Analysis from Financial Text:,1. **Capital Requirements and Scope of Perimet...,### 1. Topics discussed in the text:,Additional Details from the Text:\n\nThe text ...
2,1q23-earnings-call-remarks.pdf,1Q23,Yeah. Thanks. Just two questions. The first on...,['so sarah take first question take secondso g...,JPMorgan,"So, Sarah, take the first question. I'll take ...",['yeah thanks two questions first one related ...,"['Sergio P. Ermotti', 'Sarah Youngwood']",The provided financial text discusses multiple...,1. **Economic Impact Analysis**\n - Example ...,Based on the analysis of the provided financia...,1. **Book Value Per Share Guidance:**\n - Ex...
3,1q23-earnings-call-remarks.pdf,1Q23,Yeah. Thanks. Just two questions. The first on...,['so sarah take first question take secondso g...,JPMorgan,"So, Sarah, take the first question. I'll take ...",['yeah thanks two questions first one related ...,"['Sergio P. Ermotti', 'Sarah Youngwood']",The provided financial text discusses multiple...,1. **Economic Impact Analysis**\n - Example ...,Based on the analysis of the provided financia...,**Client Retention and Impact on Wealth Manage...
4,1q23-earnings-call-remarks.pdf,1Q23,"Yeah. Thank you. Good morning. Welcome back, S...",['thank you ryan good back interact well look ...,Bank of America,"Thank you, Ryan. It is good to be back to inte...",['yeah thank you good morning welcome back ser...,['Sergio P. Ermotti'],"Based on the text provided, several distinct t...",1. **Topic Name: Asset Management and Restruct...,"Based on the text provided, here are the topic...",1. **Merger and Acquisition Strategy**\n - *...
...,...,...,...,...,...,...,...,...,...,...,...,...
276,4q24-earnings-call-remarks.pdf,4Q24,Good morning. It's Antonio from Bank of Americ...,['terms mitigants bit headwinds know definitel...,Bank of America,So in terms of the mitigants a bit to the head...,['good morning antonio bank america two questi...,['Todd Tuckner'],From the provided text which seems to be part ...,1. **Revenue Growth and Hedging Strategies**\n...,The analyzed financial text discusses two main...,**NII Sensitivity and Hedging in Banking**\n ...
277,4q24-earnings-call-remarks.pdf,4Q24,"Yes, good morning, everybody. I just got two f...",['hi piers – take second question first terms ...,HSBC,"Hi Piers, so on – I'll just take your second q...",['yes good morning everybody got two follow-up...,"['Todd Tuckner', 'Sergio P. Ermotti']","Based on the provided text, the conversation a...",1. **Basel III Impact and Internal Management*...,"Based on the provided financial text, here are...",1. **Client Risk Appetite and Banking Activity...
278,4q24-earnings-call-remarks.pdf,4Q24,"Yes, good morning, everybody. I just got two f...",['hi piers – take second question first terms ...,HSBC,"Hi Piers, so on – I'll just take your second q...",['yes good morning everybody got two follow-up...,"['Todd Tuckner', 'Sergio P. Ermotti']","Based on the provided text, the conversation a...",1. **Basel III Impact and Internal Management*...,"Based on the provided financial text, here are...",**Loan Portfolio and Credit Activity**:\n - ...
279,4q24-earnings-call-remarks.pdf,4Q24,"Yes, good morning, everybody. I just got two f...",['hi piers – take second question first terms ...,HSBC,"Hi Piers, so on – I'll just take your second q...",['yes good morning everybody got two follow-up...,"['Todd Tuckner', 'Sergio P. Ermotti']","Based on the provided text, the conversation a...",1. **Basel III Impact and Internal Management*...,"Based on the provided financial text, here are...",**Regulatory Compliance - Basel III Impact**:\...


In [None]:
df_exploded.to_csv("tqc_topic_model_ubs_breakdown1.csv", index=False)
print("CSV file saved successfully.")

CSV file saved successfully.
