# Impact of central bank interest rate decisions on inflation rates and markets
Project by: *Timo Gumpp*
Course: *Machine Learning for Portfolio Trading*
University: *ENSAE Paris*

## Objective of this project

1. Observe how accompanying press statements regarding decisions of interest rate changes differ (length, sentiment, sophistication, etc.)
2. Observe correlation of different press releases characteristics on the result of the interest change hike
3. Observe impact of characteristics of interest rate announcements (length, sentiment, sophistication, etc.) on inflation development
4. (Optionally, out of interest): Observe potential impact of characteristics of interest rate announcements (length, sentiment, sophistication, etc.) on capital markets (e.g., using the example of S&P 500 index)

### Loading Packages

In [44]:
import pandas as pd
import matplotlib
import nltk
#import spacy
from IPython.display import IFrame
import os
from bs4 import BeautifulSoup
import requests
from datetime import datetime

### Load in text

#### What to look at

I limit the project on the ECB and the FED, thus looking into inflation in the Eurozone and the US respectively.

***
- The ECB's council takes interest rate decisions x times a year and each time releases press statements.
- The Federal Open Market Commitee (FOMC – the FED's council on moneytary policy decisions) holds regularly scheduled meetings 8 times a year and other meetings as needed. Each time a press statement is released. Every second meeting, this seems to be accompanied by a summary of economic projections.

The released statements include xx

An **example for a press statement** from both the ECB and the FED can be viewed here and accessed under the these links: __[ECB_Dec23](https://www.ecb.europa.eu/press/pressconf/shared/pdf/ecb.ds231214~cbcff0882a.en.pdf)__ or __[FOMC_Dec23](https://www.federalreserve.gov/monetarypolicy/files/monetary20231213a1.pdf)__

In [8]:
os.getcwd()

'/Users/tgumpp/DataspellProjects/ICBIRDIRM'

In [12]:
filepath_exp_ecb = "./Documents/ECB/ecb231214.pdf"

In [13]:
IFrame(filepath_exp_ecb, width=600, height=300)

We primarily focus on the developments from 2022 to 2023, and will potentally look into developments in 2020 and 2021. The year 2024 is left out of our analysis.
The also release minutes, which we're not using in our analysis

The FOMC's press releases compose themselves of a press statement and an implementation note, highlighting which monetary measures are being taken.

In [3]:
text_test = "Recent indicators suggest that growth of economic activity has slowed from its strong pace in the third quarter. Job gains have moderated since earlier in the year but remain strong, and the unemployment rate has remained low. Inflation has eased over the past year but remains elevated. The U.S. banking system is sound and resilient. Tighter financial and credit conditions for households and businesses are likely to weigh on economic activity, hiring, and inflation. The extent of these effects remains uncertain. The Committee remains highly attentive to inflation risks. The Committee seeks to achieve maximum employment and inflation at the rate of 2 percent over the longer run. In support of these goals, the Committee decided to maintain the target range for the federal funds rate at 5-1/4 to 5-1/2 percent. The Committee will continue to assess additional information and its implications for monetary policy. In determining the extent of any additional policy firming that may be appropriate to return inflation to 2 percent over time, the Committee will take into account the cumulative tightening of monetary policy, the lags with which monetary policy affects economic activity and inflation, and economic and financial developments. In addition, the Committee will continue reducing its holdings of Treasury securities and agency debt and agency mortgage-backed securities, as described in its previously announced plans. The Committee is strongly committed to returning inflation to its 2 percent objective. In assessing the appropriate stance of monetary policy, the Committee will continue to monitor the implications of incoming information for the economic outlook. The Committee would be prepared to adjust the stance of monetary policy as appropriate if risks emerge that could impede the attainment of the Committee's goals. The Committee's assessments will take into account a wide range of information, including readings on labor market conditions, inflation pressures and inflation expectations, and financial and international developments."

In [4]:
word_count = len(text_test.split())
print(f"Word Count: {word_count}")

Word Count: 310


In [7]:
from nltk import FreqDist
from nltk.tokenize import word_tokenize

# Download NLTK resources (you need to do this once)
nltk.download('punkt')

# Tokenize the text
tokens = word_tokenize(text_test)

# Calculate word frequencies
word_freq = FreqDist(tokens)

# Display the most common words
print(word_freq.most_common(10))

[nltk_data] Downloading package punkt to /Users/tgumpp/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


[('the', 21), ('.', 16), ('and', 15), (',', 13), ('of', 12), ('to', 12), ('Committee', 11), ('The', 8), ('inflation', 8), ('policy', 6)]


#### Web Mining FOMC

Extraction of text is automated, the supply of the corresponding URLs not for now

In [57]:
fomc_statements_url_list = ['https://www.federalreserve.gov/newsevents/pressreleases/monetary20230201a.htm',
                            'https://www.federalreserve.gov/newsevents/pressreleases/monetary20230322a.htm',
                            'https://www.federalreserve.gov/newsevents/pressreleases/monetary20230503a.htm',
                            'https://www.federalreserve.gov/newsevents/pressreleases/monetary20230614a.htm',
                            'https://www.federalreserve.gov/newsevents/pressreleases/monetary20230726a.htm',
                            'https://www.federalreserve.gov/newsevents/pressreleases/monetary20230920a.htm',
                            'https://www.federalreserve.gov/newsevents/pressreleases/monetary20231101a.htm',
                            'https://www.federalreserve.gov/newsevents/pressreleases/monetary20231213a.htm']

In [9]:
# To automate: Cycle through URLs here
url = "https://www.federalreserve.gov/newsevents/pressreleases/monetary20231101a.htm"
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html')

In [None]:
<p class="article__time">December 13, 2023</p>
div_list = {'article__time', 'col-xs-12 col-sm-8 col-md-8'}
<meta property="og:url" content="https://www.federalreserve.gov/newsevents/pressreleases/monetary20231101a.htm">
meta_tag = soup.find('meta', property='og:url').get('content')

In [55]:
target_text = soup.find('div', class_='col-xs-12 col-sm-8 col-md-8')
target_date = soup.find('p', class_='article__time')
target_url = soup.find('meta', property='og:url').get('content')

In [80]:
# TODO: add if / else if mining fails
def process_text(text):
    list = [title.text.strip() for title in text]
    text_final = ' '.join(map(str, list))
    return text_final

def process_date(date):
    raw_date = date.get_text(strip=True)
    #print("Raw Date:", raw_date)

    # Transform the date format
    datetime_object = datetime.strptime(raw_date, '%B %d, %Y')
    formatted_date = datetime_object.strftime('%d.%m.%Y')
    return formatted_date

def process_url(url):
    return url.split('/')[-1].split('.')[0]

In [41]:
target_website

'https://www.federalreserve.gov/newsevents/pressreleases/monetary20231101a.htm'

In [82]:
for x in fomc_statements_url_list:
    page = requests.get(x)
    soup = BeautifulSoup(page.text, 'html')

    target_text = soup.find('div', class_='col-xs-12 col-sm-8 col-md-8')
    target_date = soup.find('p', class_='article__time')
    target_url = soup.find('meta', property='og:url').get('content')

    # run processing and add processed to df
    df_fomc_press_statements = df_fomc_press_statements.append({'Filename': process_url(target_url), 'Date': process_date(target_date), 'Text': process_text(target_text)}, ignore_index=True)

  df_fomc_press_statements = df_fomc_press_statements.append({'Filename': process_url(target_url), 'Date': process_date(target_date), 'Text': process_text(target_text)}, ignore_index=True)
  df_fomc_press_statements = df_fomc_press_statements.append({'Filename': process_url(target_url), 'Date': process_date(target_date), 'Text': process_text(target_text)}, ignore_index=True)
  df_fomc_press_statements = df_fomc_press_statements.append({'Filename': process_url(target_url), 'Date': process_date(target_date), 'Text': process_text(target_text)}, ignore_index=True)
  df_fomc_press_statements = df_fomc_press_statements.append({'Filename': process_url(target_url), 'Date': process_date(target_date), 'Text': process_text(target_text)}, ignore_index=True)
  df_fomc_press_statements = df_fomc_press_statements.append({'Filename': process_url(target_url), 'Date': process_date(target_date), 'Text': process_text(target_text)}, ignore_index=True)
  df_fomc_press_statements = df_fomc_press_statements.a

In [83]:
df_fomc_press_statements

Unnamed: 0,Date,Text,Filename
0,01.02.2023,Recent indicators point to modest growth in s...,monetary20230201a
1,22.03.2023,Recent indicators point to modest growth in s...,monetary20230322a
2,03.05.2023,Economic activity expanded at a modest pace i...,monetary20230503a
3,14.06.2023,Recent indicators suggest that economic activ...,monetary20230614a
4,26.07.2023,Recent indicators suggest that economic activ...,monetary20230726a
5,20.09.2023,Recent indicators suggest that economic activ...,monetary20230920a
6,01.11.2023,Recent indicators suggest that economic activ...,monetary20231101a
7,13.12.2023,Recent indicators suggest that growth of econ...,monetary20231213a


In [61]:
df_fomc_press_statements

Unnamed: 0,Date,Text,Filename


In [14]:


# Print the content of the found div (if any)
if target_div:
    print(target_div.prettify())
else:
    print("Div not found.")

<div class="col-xs-12 col-sm-8 col-md-8">
 <p>
  Recent indicators suggest that economic activity expanded at a strong pace in the third quarter. Job gains have moderated since earlier in the year but remain strong, and the unemployment rate has remained low. Inflation remains elevated.
 </p>
 <p>
  The U.S. banking system is sound and resilient. Tighter financial and credit conditions for households and businesses are likely to weigh on economic activity, hiring, and inflation. The extent of these effects remains uncertain. The Committee remains highly attentive to inflation risks.
 </p>
 <p>
  The Committee seeks to achieve maximum employment and inflation at the rate of 2 percent over the longer run. In support of these goals, the Committee decided to maintain the target range for the federal funds rate at 5-1/4 to 5-1/2 percent. The Committee will continue to assess additional information and its implications for monetary policy. In determining the extent of additional policy firmi

In [29]:
list = [title.text.strip() for title in target_div]
len(list)

15

In [30]:
text = ' '.join(map(str, list))
text

" Recent indicators suggest that economic activity expanded at a strong pace in the third quarter. Job gains have moderated since earlier in the year but remain strong, and the unemployment rate has remained low. Inflation remains elevated.  The U.S. banking system is sound and resilient. Tighter financial and credit conditions for households and businesses are likely to weigh on economic activity, hiring, and inflation. The extent of these effects remains uncertain. The Committee remains highly attentive to inflation risks.  The Committee seeks to achieve maximum employment and inflation at the rate of 2 percent over the longer run. In support of these goals, the Committee decided to maintain the target range for the federal funds rate at 5-1/4 to 5-1/2 percent. The Committee will continue to assess additional information and its implications for monetary policy. In determining the extent of additional policy firming that may be appropriate to return inflation to 2 percent over time, 

In [81]:
df_fomc_press_statements = pd.DataFrame(columns = {"Filename", "Date", "Text"})
df_fomc_press_statements.info()

<class 'pandas.core.frame.DataFrame'>
Index: 0 entries
Data columns (total 3 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   Date      0 non-null      object
 1   Text      0 non-null      object
 2   Filename  0 non-null      object
dtypes: object(3)
memory usage: 0.0+ bytes


In [None]:
def fill_df():

    df_fomc_press_statements['Filename'] = filename
    df_fomc_press_statements['Date'] = date
    df_fomc_press_statements['Text'] = text

### Load in inflation data

### Processing text

Preprocessing text, these are typical steps to pursue to prepare a meaningful dataset for text.

- Lemmanization
- Stemming
- Removal of Stopwords
-

### First text analysis

In [71]:
def get_word_count(text):
    word_count = len(text.split())
    #print(f"Word Count: {word_count}")
    return word_count

def get_char_count(text):
    char_count = len(text)
    # print(f"Character Count: {char_count}")
    return char_count

In [69]:
df_fomc_press_statements = df_fomc_press_statements.assign(WordCount='', CharCount='')

In [75]:
df_fomc_press_statements.head()

Unnamed: 0,Date,Text,Filename,WordCount,CharCount
0,01.02.2023,Recent indicators point to modest growth in s...,monetary20230201a,,
1,22.03.2023,Recent indicators point to modest growth in s...,monetary20230322a,,
2,03.05.2023,Economic activity expanded at a modest pace i...,monetary20230503a,,
3,14.06.2023,Recent indicators suggest that economic activ...,monetary20230614a,,
4,26.07.2023,Recent indicators suggest that economic activ...,monetary20230726a,,


In [74]:
for x in df_fomc_press_statements['Text']:
    df_fomc_press_statements = df_fomc_press_statements.append({'WordCount': get_word_count(x), 'CharCount': get_char_count(x)}, ignore_index=True)

  df_fomc_press_statements = df_fomc_press_statements.append({'WordCount': get_word_count(x), 'CharCount': get_char_count(x)})


TypeError: Can only append a dict if ignore_index=True

In [16]:
#Tokenizing
tokens = nltk.word_tokenize(text_test)

['Recent',
 'indicators',
 'suggest',
 'that',
 'growth',
 'of',
 'economic',
 'activity',
 'has',
 'slowed',
 'from',
 'its',
 'strong',
 'pace',
 'in',
 'the',
 'third',
 'quarter',
 '.',
 'Job',
 'gains',
 'have',
 'moderated',
 'since',
 'earlier',
 'in',
 'the',
 'year',
 'but',
 'remain',
 'strong',
 ',',
 'and',
 'the',
 'unemployment',
 'rate',
 'has',
 'remained',
 'low',
 '.',
 'Inflation',
 'has',
 'eased',
 'over',
 'the',
 'past',
 'year',
 'but',
 'remains',
 'elevated',
 '.',
 'The',
 'U.S.',
 'banking',
 'system',
 'is',
 'sound',
 'and',
 'resilient',
 '.',
 'Tighter',
 'financial',
 'and',
 'credit',
 'conditions',
 'for',
 'households',
 'and',
 'businesses',
 'are',
 'likely',
 'to',
 'weigh',
 'on',
 'economic',
 'activity',
 ',',
 'hiring',
 ',',
 'and',
 'inflation',
 '.',
 'The',
 'extent',
 'of',
 'these',
 'effects',
 'remains',
 'uncertain',
 '.',
 'The',
 'Committee',
 'remains',
 'highly',
 'attentive',
 'to',
 'inflation',
 'risks',
 '.',
 'The',
 'Committ

In [24]:
nltk.pos_tag(tokens)

[('Recent', 'JJ'),
 ('indicators', 'NNS'),
 ('suggest', 'VBP'),
 ('that', 'IN'),
 ('growth', 'NN'),
 ('of', 'IN'),
 ('economic', 'JJ'),
 ('activity', 'NN'),
 ('has', 'VBZ'),
 ('slowed', 'VBN'),
 ('from', 'IN'),
 ('its', 'PRP$'),
 ('strong', 'JJ'),
 ('pace', 'NN'),
 ('in', 'IN'),
 ('the', 'DT'),
 ('third', 'JJ'),
 ('quarter', 'NN'),
 ('.', '.'),
 ('Job', 'NNP'),
 ('gains', 'NNS'),
 ('have', 'VBP'),
 ('moderated', 'VBN'),
 ('since', 'IN'),
 ('earlier', 'RBR'),
 ('in', 'IN'),
 ('the', 'DT'),
 ('year', 'NN'),
 ('but', 'CC'),
 ('remain', 'VBP'),
 ('strong', 'JJ'),
 (',', ','),
 ('and', 'CC'),
 ('the', 'DT'),
 ('unemployment', 'NN'),
 ('rate', 'NN'),
 ('has', 'VBZ'),
 ('remained', 'VBN'),
 ('low', 'JJ'),
 ('.', '.'),
 ('Inflation', 'NN'),
 ('has', 'VBZ'),
 ('eased', 'VBN'),
 ('over', 'IN'),
 ('the', 'DT'),
 ('past', 'JJ'),
 ('year', 'NN'),
 ('but', 'CC'),
 ('remains', 'VBZ'),
 ('elevated', 'JJ'),
 ('.', '.'),
 ('The', 'DT'),
 ('U.S.', 'NNP'),
 ('banking', 'NN'),
 ('system', 'NN'),
 ('is', 'V

In [23]:
nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/tgumpp/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.


True

### Sentiment Analysis

In [20]:
nltk.download('vader_lexicon')

[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /Users/tgumpp/nltk_data...


True

In [68]:
from tqdm.notebook import tqdm
from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

In [86]:
for x in df_fomc_press_statements['Text']:
    #print(type(x))
    print(sia.polarity_scores(x))

{'neg': 0.078, 'neu': 0.846, 'pos': 0.076, 'compound': -0.5994}
{'neg': 0.047, 'neu': 0.885, 'pos': 0.068, 'compound': 0.765}
{'neg': 0.051, 'neu': 0.881, 'pos': 0.068, 'compound': 0.6124}
{'neg': 0.05, 'neu': 0.879, 'pos': 0.072, 'compound': 0.743}
{'neg': 0.051, 'neu': 0.882, 'pos': 0.067, 'compound': 0.6124}
{'neg': 0.062, 'neu': 0.853, 'pos': 0.085, 'compound': 0.775}
{'neg': 0.061, 'neu': 0.853, 'pos': 0.086, 'compound': 0.8225}
{'neg': 0.059, 'neu': 0.847, 'pos': 0.094, 'compound': 0.9042}


In [22]:
sia.polarity_scores(text_test)

{'neg': 0.07, 'neu': 0.828, 'pos': 0.103, 'compound': 0.85}

#### Roberta

In [25]:
from transformers import AutoTokenizer
from transformers import AutoModelForSequenceClassification
from scipy.special import softmax

ModuleNotFoundError: No module named 'transformers'

### Comparing sentiment analysis with inflation rate development