# Global Impact of Brexit Uncertainty

In this notebook we explore the global impact of Brexit uncertainty using the text-based method for measuring the cross-border propagation of large shocks at the firm level described by Hassan et al. ([2024](https://doi.org/10.1111/jofi.13293)). The method defines the five firm-level measures below.


```math
BrexitExposure_{i,t} = \frac{1}{B_{i,t}}\sum_{b=1}^{B_{i,t}} 1\left[b=Brexit\right]
```

(Hassan et al., 2024, p. 419)
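
As a rough illustration of the exposure measure, the sketch below assumes that $b$ indexes the words of one transcript and that $B_{i,t}$ is the transcript's total word count, so the measure is the share of words equal to "Brexit". The regex tokenizer and the function names `tokenize` and `brexit_exposure` are illustrative assumptions, not part of the notebook's helper modules.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def brexit_exposure(text: str) -> float:
    """Share of words in the text that are exactly 'brexit'."""
    words = tokenize(text)
    if not words:
        return 0.0
    return sum(word == "brexit" for word in words) / len(words)


sample = "Brexit creates uncertainty. We expect Brexit to affect demand."
print(brexit_exposure(sample))
```

Dividing by the word count keeps long and short calls comparable, which a raw count would not.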


```math
BrexitRisk_{i,t} = \frac{1}{B_{i,t}}\sum_{b=1}^{B_{i,t}} \left\{1\left[b=Brexit\right] \times 1\left[|b-r| \lt 10\right]\right\}
```

(Hassan et al., 2024, p. 420)
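
The risk measure can be sketched in the same way, assuming that $r$ is the position of the nearest synonym for risk or uncertainty, so that a Brexit mention counts only when such a word lies fewer than 10 positions away. The tiny `RISK_WORDS` set is a hypothetical stand-in for the paper's synonym list, and `brexit_risk` is an illustrative name.

```python
import re

# Hypothetical stand-in for the list of risk/uncertainty synonyms (assumption).
RISK_WORDS = {"risk", "risks", "uncertainty", "uncertain"}


def brexit_risk(text: str, window: int = 10) -> float:
    """Share of words that are 'brexit' with a risk word strictly within `window` positions."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    risk_positions = [i for i, w in enumerate(words) if w in RISK_WORDS]
    hits = 0
    for b, word in enumerate(words):
        if word != "brexit":
            continue
        # The indicator 1[|b - r| < 10] fires if any risk word is close enough.
        if any(abs(b - r) < window for r in risk_positions):
            hits += 1
    return hits / len(words)


near = "Brexit risk remains high"
far = "Brexit " + "filler " * 20 + "risk"
print(brexit_risk(near), brexit_risk(far))
```

Checking against every risk position is equivalent to checking the nearest one, and it keeps the sketch short.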


```math
BrexitSentiment_{i,t} = \frac{1}{B_{i,t}}\sum_{b=1}^{B_{i,t}} \left\{1\left[b=Brexit\right] \times \left(\sum_{c=b-10}^{b+10}{S(c)}\right)\right\}
```

(Hassan et al., 2024, p. 420)
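
For the sentiment measure, a minimal sketch follows. It assumes that $S(c)$ is +1 for a positive-tone word, -1 for a negative-tone word, and 0 otherwise, and that the window from $b-10$ to $b+10$ is clipped at the start and end of the transcript. The `POSITIVE` and `NEGATIVE` sets are hypothetical placeholders for the Loughran-McDonald word lists, and the function names are illustrative.

```python
import re

# Hypothetical placeholders for the positive and negative tone word lists (assumption).
POSITIVE = {"good", "strong", "opportunity"}
NEGATIVE = {"bad", "weak", "loss"}


def tone(word: str) -> int:
    """S(c): +1 for positive words, -1 for negative words, 0 otherwise."""
    if word in POSITIVE:
        return 1
    if word in NEGATIVE:
        return -1
    return 0


def brexit_sentiment(text: str, window: int = 10) -> float:
    """Sum of tone in a +/- `window` word neighbourhood of each 'brexit', per word of text."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    total = 0
    for b, word in enumerate(words):
        if word != "brexit":
            continue
        lo = max(0, b - window)               # clip at the start of the text
        hi = min(len(words), b + window + 1)  # inclusive upper bound b + window
        total += sum(tone(c) for c in words[lo:hi])
    return total / len(words)


print(brexit_sentiment("Good Brexit bad weak"))
```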


```math
NonBrexitRisk_{i,t} = \frac{1}{B_{i,t}}\sum_{b=1}^{B_{i,t}} \left\{1\left[b\in\mathbb{R}\right]\right\} - BrexitRisk_{i,t}
```

(Hassan et al., 2024, p. 421)


```math
NonBrexitSentiment_{i,t} = \frac{1}{B_{i,t}}\sum_{b=1}^{B_{i,t}} S(b) - BrexitSentiment_{i,t}
```

(Hassan et al., 2024, p. 421)
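
The two non-Brexit measures subtract the Brexit-specific part from an overall, transcript-wide measure. The sketch below shows that subtraction under the same assumptions as the formulas suggest: the set in the risk indicator is read as the set of risk and uncertainty synonyms, and $S$ scores tone as +1, -1, or 0. The word sets and the function name `non_brexit_measures` are hypothetical.

```python
import re

# Hypothetical word lists standing in for the real dictionaries (assumption).
RISK_WORDS = {"risk", "risks", "uncertainty", "uncertain"}
POSITIVE = {"good", "strong", "opportunity"}
NEGATIVE = {"bad", "weak", "loss"}


def non_brexit_measures(text: str, window: int = 10) -> tuple[float, float]:
    """Return (NonBrexitRisk, NonBrexitSentiment) for one text."""
    words = re.findall(r"[a-z]+", text.lower())
    n = len(words)
    if n == 0:
        return 0.0, 0.0
    tones = [(w in POSITIVE) - (w in NEGATIVE) for w in words]
    risk_pos = [i for i, w in enumerate(words) if w in RISK_WORDS]
    brexit_pos = [i for i, w in enumerate(words) if w == "brexit"]

    # Overall measures: share of risk words, and average tone per word.
    overall_risk = len(risk_pos) / n
    overall_sentiment = sum(tones) / n

    # Brexit-specific parts, using the same +/- window as the formulas above.
    brexit_risk = sum(
        any(abs(b - r) < window for r in risk_pos) for b in brexit_pos
    ) / n
    brexit_sentiment = sum(
        sum(tones[max(0, b - window): b + window + 1]) for b in brexit_pos
    ) / n

    return overall_risk - brexit_risk, overall_sentiment - brexit_sentiment


text = "Brexit risk " + "filler " * 20 + "risk good"
print(non_brexit_measures(text))
```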


In [20]:
%load_ext autoreload
%autoreload 2
import pandas as pd
from process_transcripts import process_transcripts
from master_dictionary import load_masterdictionary
from pathlib import Path

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


Process all the transcripts if we haven't already.


In [21]:
process_transcripts()

Processing transcripts/2020/2020-Apr-29-AZN.L-138376894593-Transcript.txt...
  Skipping 2020-Apr-29-AZN.L-138376894593-Transcript.txt, already processed.
Processing transcripts/2020/2020-Feb-14-AZN.L-139870628566-Transcript.txt...
  Skipping 2020-Feb-14-AZN.L-139870628566-Transcript.txt, already processed.
Processing transcripts/2020/2020-Nov-05-AZN.L-137119949493-Transcript.txt...
  Skipping 2020-Nov-05-AZN.L-137119949493-Transcript.txt, already processed.
Processing transcripts/2020/2020-Jul-30-AZN.L-139316097786-Transcript.txt...
  Skipping 2020-Jul-30-AZN.L-139316097786-Transcript.txt, already processed.
Processing transcripts/2016/2016-Nov-10-AZN.L-140407419257-Transcript.txt...
  Skipping 2016-Nov-10-AZN.L-140407419257-Transcript.txt, already processed.
Processing transcripts/2016/2016-Feb-04-AZN.L-137149101673-Transcript.txt...
  Skipping 2016-Feb-04-AZN.L-137149101673-Transcript.txt, already processed.
Processing transcripts/2016/2016-Jul-29-BARC.L-138523614655-Transcript.txt..

In [22]:
file_name = "Loughran-McDonald_MasterDictionary_1993-2024.csv"
dir_name = "loughran_mcdonald_dictionary"
path = Path("") / dir_name / file_name
md = load_masterdictionary(path)

In [23]:
for key in ["BAD", "GOOD", "UNCERTAIN"]:
    # Look up the dictionary entry without shadowing the loop variable
    entry = md[key]
    print(f"{entry.word}")
    print(f"  Positive: {entry.positive}")
    print(f"  Negative: {entry.negative}")
    print(f"  Uncertain: {entry.uncertainty}")

BAD
  Positive: 0
  Negative: 2009
  Uncertain: 0
GOOD
  Positive: 2009
  Negative: 0
  Uncertain: 0
UNCERTAIN
  Positive: 0
  Negative: 0
  Uncertain: 2009


As an example, read in one specific processed transcript and print it.


In [24]:
azn = Path("") / "processed" / "2016" / "AZN.L-2016-04-29-138460076277.txt"
with open(azn, "r") as f:
    print(f.read())

Hello, everyone, I'm Pascal Soriot, CEO of AstraZeneca.
Welcome to the Q1 2016 results conference call for investors and analysts.
The presentation is posted online for you to download and there is also an audio player.
I'm joined today by Luke Miels, Executive Vice President for Global Production for Product and Portfolio Strategy, Global Medical Affairs and Corporate Affairs; Marc Dunoyer, our CFO; and Sean Bohen, our CMO.
We plan to spend 20 to 25 minutes on the presentation and then leave ample time for Q&A.
In total we have about one hour together.
So please turn to Slide 2, where you see our forward-looking statements.
Moving to Slide 3, you will see there the agenda.
The plan for today for me is to provide a short overview, and then I'll hand over to Luke for an update on our growth platforms and the ongoing launches of new medicines.
As usual, Marc cover the financials and our guidance, and Sean will provide a pipeline and useful update, and I will end up with concluding remark

Create a dictionary of all the file names of the processed transcripts, organized by ticker and year, so we can do statistics on them later.


In [None]:
from collections import defaultdict


def load_transcript_names() -> defaultdict[str, defaultdict[str, list]]:
    """
    Load transcript filenames into a nested dictionary structure.

    Returns:
        defaultdict: A dictionary mapping ticker to year to list of transcript info.
    """
    # Dictionary to map ticker to year to list of transcripts
    transcripts = defaultdict(lambda: defaultdict(list))

    processed_dir = Path("") / "processed"

    # Iterate through all transcript files
    for transcript_file in processed_dir.rglob("*.txt"):
        # Stems look like "AZN.L-2016-04-29-138460076277": ticker first, then year
        ticker, year = transcript_file.stem.split("-")[:2]

        # Store in dictionary
        transcripts[ticker][year].append({"filename": transcript_file.name})
    return transcripts

Load the transcript names.


In [28]:
transcripts = load_transcript_names()

In [29]:
# Print summary for verification
for ticker, years in transcripts.items():
    print(f"Ticker: {ticker}")
    for year, files in years.items():
        print(f"  Year: {year}, Number of transcripts: {len(files)}")

Ticker: AZN.L
  Year: 2020, Number of transcripts: 4
  Year: 2016, Number of transcripts: 4
Ticker: BARC.L
  Year: 2016, Number of transcripts: 4


List the names of the processed transcripts for selected tickers.


In [31]:
tickers = ["AZN.L"]  # or tickers = transcripts.keys() for all of them
# Print out the names of the transcripts for these tickers per year
for ticker in tickers:
    if ticker in transcripts:
        print(f"Transcripts for {ticker}:")
        for year, files in transcripts[ticker].items():
            print(f"  Year: {year}")
            for file_info in files:
                print(f"    {file_info['filename']}")
    else:
        print(f"No transcripts found for {ticker}.")

Transcripts for AZN.L:
  Year: 2020
    AZN.L-2020-02-14-139870628566.txt
    AZN.L-2020-07-30-139316097786.txt
    AZN.L-2020-04-29-138376894593.txt
    AZN.L-2020-11-05-137119949493.txt
  Year: 2016
    AZN.L-2016-11-10-140407419257.txt
    AZN.L-2016-07-28-140055454105.txt
    AZN.L-2016-02-04-137149101673.txt
    AZN.L-2016-04-29-138460076277.txt


Count occurrences of the word "Brexit" in each processed transcript.


In [None]:
def count_brexit_occurrences(transcripts: defaultdict) -> None:
    """
    Count the occurrences of the word "Brexit" in each processed transcript.

    Args:
        transcripts (defaultdict): Dictionary mapping ticker to year to list of transcript file info.

    Side Effects:
        Updates each file info dictionary with a new key "brexit_count" indicating the number of
        occurrences of the word "Brexit" in the corresponding transcript.
    """
    processed_dir = Path("") / "processed"

    for ticker, years in transcripts.items():
        for year, files in years.items():
            for file_info in files:
                # Construct the full path to the transcript file
                transcript_path = processed_dir / year / file_info["filename"]

                # Read the transcript and count "brexit" occurrences (case insensitive)
                with open(transcript_path, "r") as f:
                    content = f.read().lower()
                    brexit_count = content.count("brexit")

                # Add brexit_count to the file_info dictionary
                file_info["brexit_count"] = brexit_count
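
Note that `str.count` matches substrings, so a word such as "brexiteers" also adds to the count. If whole-word matches are wanted, which is closer to the indicator in the exposure formula, one option is a word-boundary regular expression. The helper name `count_whole_word` below is hypothetical and is a sketch rather than a replacement for the function above.

```python
import re


def count_whole_word(text: str, word: str = "brexit") -> int:
    """Count case-insensitive whole-word matches of `word` in `text`."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", flags=re.IGNORECASE)
    return len(pattern.findall(text))


sample = "Brexit, post-Brexit and brexiteers"
print(count_whole_word(sample), sample.lower().count("brexit"))
```

A hyphen counts as a word boundary, so "post-Brexit" still matches while "brexiteers" does not.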

In [33]:
count_brexit_occurrences(transcripts)

In [35]:
for ticker, years in transcripts.items():
    print(f"Ticker: {ticker}")
    for year, files in years.items():
        print(f"  Year: {year}")
        for file_info in files:
            print(
                f"    {file_info['filename']}: {file_info['brexit_count']} occurrences"
            )

Ticker: AZN.L
  Year: 2020
    AZN.L-2020-02-14-139870628566.txt: 0 occurrences
    AZN.L-2020-07-30-139316097786.txt: 0 occurrences
    AZN.L-2020-04-29-138376894593.txt: 0 occurrences
    AZN.L-2020-11-05-137119949493.txt: 0 occurrences
  Year: 2016
    AZN.L-2016-11-10-140407419257.txt: 0 occurrences
    AZN.L-2016-07-28-140055454105.txt: 3 occurrences
    AZN.L-2016-02-04-137149101673.txt: 0 occurrences
    AZN.L-2016-04-29-138460076277.txt: 0 occurrences
Ticker: BARC.L
  Year: 2016
    BARC.L-2016-07-29-138523614655.txt: 7 occurrences
    BARC.L-2016-03-01-137548274670.txt: 0 occurrences
    BARC.L-2016-03-01-139824873187.txt: 3 occurrences
    BARC.L-2016-07-29-139004361518.txt: 27 occurrences
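
To run statistics on these counts, the nested dictionary can be flattened into one row per transcript and then aggregated. The sketch below uses a small hand-built `sample_transcripts` dictionary with the same shape as `transcripts`, with counts copied from the 2016 output above. The helper names `flatten` and `totals_by_ticker_year` are hypothetical.

```python
from collections import defaultdict

# Hand-built stand-in with the same nested shape as `transcripts` (assumption).
sample_transcripts = {
    "AZN.L": {"2016": [
        {"filename": "AZN.L-2016-07-28-140055454105.txt", "brexit_count": 3},
        {"filename": "AZN.L-2016-02-04-137149101673.txt", "brexit_count": 0},
    ]},
    "BARC.L": {"2016": [
        {"filename": "BARC.L-2016-07-29-138523614655.txt", "brexit_count": 7},
        {"filename": "BARC.L-2016-07-29-139004361518.txt", "brexit_count": 27},
    ]},
}


def flatten(transcripts: dict) -> list[dict]:
    """Turn ticker -> year -> [file_info] into one flat row per transcript."""
    rows = []
    for ticker, years in transcripts.items():
        for year, files in years.items():
            for file_info in files:
                rows.append({"ticker": ticker, "year": year, **file_info})
    return rows


def totals_by_ticker_year(rows: list[dict]) -> dict[tuple[str, str], int]:
    """Sum the Brexit counts for each (ticker, year) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["ticker"], row["year"])] += row["brexit_count"]
    return dict(totals)


rows = flatten(sample_transcripts)
print(totals_by_ticker_year(rows))
```

Because `rows` is a list of dictionaries with identical keys, it can also be passed directly to `pd.DataFrame(rows)` for further analysis with pandas.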
