# **FED Textual Analysis Software (Beta version)**

We acknowledge the support of Elia Landini, Jessie Cameron & Lina Avril (Pantheon-Sorbonne University) in the development of this project.

### **INTRODUCTION**

The project aims to conduct textual analysis of the Federal Reserve's (FED) Monetary Policy Reports through the deployment of Python-based software. This report is written semi-annually and provided to Congress containing discussions on the "conduct of monetary policy and economic developments and prospects for the future." 

First, we develop a web scraping script to extract textual data from the FED's website. Subsequently, we use the Natural Language Toolkit (NLTK) package to preprocess the text, including tokenization, stemming, and converting words to lowercase. Next, the Loughran McDonald Sentiment Dictionary is employed to transform the cleaned qualitative text data into a quantitative measure of the FED's communication tone. This communication measure is then be regressed against the output gap and inflation gap, obtained via API, to assess the sensitivity of the FED's communication to these macroeconomic variables. Throughout the project, we employ various visualisation and analysis packages to explore the data and conduct preliminary analysis. 

Finally, we plan to develop a user-friendly interface for easy access and interpretation of our findings. **IF WE GET TIME**

### **INSTALL PACKAGES**

In [226]:
!pip install pandas
!pip install matplotlib
!pip install requests-html
!pip install seaborn
!pip install numpy
!pip install schedule
!pip install statsmodels
!pip install reportlab
!pip install scipy
!pip install linearmodels
!pip install openai
!pip install fredapi



### **1. IMPORT TEXTUAL DATA: FED WEB SCRAPPING**

With the following function, we aim to retrieve and filter text-based sources concerning monetary policy decisions undertaken by the FED itself and released to the public on semiannual press conferences.  
Customize the function to scrape articles from the ECB/Eurostystem website within the folder named "Monetary Policy Report"
The function is also designed to include filtering options to select specific text-based sources according to topic and typology of the publication. However in our specific case we will be interesented only in semiannual reports concerning monetary policy.
Base URL-FED: https://www.federalreserve.gov/monetarypolicy/publications/mpr_default.htm

 #### 1.1 FED press conferences URL retrieval 

In [227]:
import requests
from bs4 import BeautifulSoup

In [228]:
# FED scraping function
# With the following function, we aim to retrieve and filter text-based sources concerning monetary policy decisions undertaken by the FED itself and released to the public on semiannual press conferences.  
# Customize the function to scrape articles from the ECB/Eurostystem website within the folder named "Monetary Policy Report"
# The function is also designed to include filtering options to select specific text-based sources according to topic and typology of the publication. However in our specific case we will be interesented only in semiannual reports concerning monetary policy.
# Base URL-FED: https://www.federalreserve.gov/monetarypolicy/publications/mpr_default.htm    

def fed_get_articles(topic, publication_type, sub_class):
    
    # Base URL settings 
    base_url = f"https://www.federalreserve.gov/{topic}/{publication_type}/mpr_default.htm"
    base_domain = f"https://www.federalreserve.gov"
    
    article_urls = []

    # From the base URL, we now extract all the available URLs on the page by deploying Request and BeatifulSoup packages 
    response = requests.get(base_url)
    soup = BeautifulSoup(response.content, "html.parser")

    # Find and filter article URLs, ruling out other structural URL non-inherent to the analysis
    # We are also interested only in full report publications and not summaries. In our case this difference is highlighted by the subclass "testimony" in which full report publications are stored
    # It is also worth noticing that the publications' URL has slightly changed throughout time (from 2016 onwards)
    for link in soup.find_all("a", href=True):
        article_url = link["href"]

        # To limit the research to the first 5 results, we may want to activate this loop
        # if len(article_urls) >= 5:  
            # break
        if not article_url.startswith("http"):
            article_url = base_domain + article_url
        
        # Source specific customizations
        # Necessary specifications to filter URLs and include even the 2000's press confereneces, which follow a slightly different structure:
        if f"{sub_class}" in article_url or "/hh/2000/February/Testimony.htm" in article_url or "/hh/2000/July/Testimony.htm" in article_url:
            article_urls.append(article_url)
        


    # Delete the first element of the article_urls list representing the general folder    
    for url in article_urls:     
        if url == f"https://www.federalreserve.gov/newsevents/{sub_class}.htm":
            article_urls.remove(url)
            break

    return article_urls

In [229]:
# Retrieving FED semiannual reports' URLs through the previous function by specifying the value of each parameter to fit our research scope
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"

print(fed_get_articles(topic, publication_type, sub_class))

['https://www.federalreserve.gov/newsevents/testimony/powell20240306a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20230307a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20230621a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20220302a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20220622a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20210223a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20210714a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20200211a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20200616a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20190226a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20190710a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20180226a.htm', 'https://www.federalreserve.gov/newsevents/testimony/powell20180717a.htm', 'https://www.federalrese

#### 1.2 Text-based sources retrieving

In [230]:
import requests
from bs4 import BeautifulSoup
import re

In [231]:
# Text-based sources retrieving for the target article (num = ID/Index article)
def fed_article_txt(article_urls, num):
    articles_text = []

    for article_url in article_urls:

        # Fetch article content (the "response" function takes the HTML text from the URL)
        response = requests.get(article_url)
        soup = BeautifulSoup(response.content, "html.parser")
        article_text = soup.get_text()
        articles_text.append(article_text)
    
    return articles_text

In [232]:
# Article 1 example
article_urls = fed_get_articles(topic, publication_type, sub_class)
num = 0
print(fed_article_txt(article_urls, num))



#### 1.3 Text Cleaning

In [233]:
import requests
from bs4 import BeautifulSoup
import re

In [234]:
# Text-based soruces retrieving for the target article (num = ID/Index article)
def fed_article_cleaning(pattern, topic, publication_type, sub_class, num):

    # Combine previous functions to directly pass to input to the new function in order to eventually clean the text from unhelpful and irrelevant texts
    cleaned_text = []
    input_text = fed_article_txt(fed_get_articles(topic, publication_type, sub_class), num)
    extracted_text = re.search(pattern, input_text[num], re.DOTALL)

    if extracted_text:
        cleaned_text.append(extracted_text.group(1).strip()) 
    else:
        cleaned_text.append("Text not found")
        
    return cleaned_text

In [235]:
# Article 0 [Powell, Mar 2024]
# the pattern settings may also be included into the text cleaning fuction, but they result to be quite specific of each publication serie and not broadly applicable from an automated perspective

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 0
fed_CESO0 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 0
fed_MP0 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article0_text = fed_CESO0 + fed_MP0
print(article0_text)

["Economic activity expanded at a strong pace over the past year. For 2023 as a whole, gross domestic product increased 3.1 percent, bolstered by solid consumer demand and improving supply conditions. Activity in the housing sector was subdued over the past year, largely reflecting high mortgage rates. High interest rates also appear to have been weighing on business fixed investment.\nThe labor market remains relatively tight, but supply and demand conditions have continued to come into better balance. Since the middle of last year, payroll job gains have averaged 239,000 jobs per month, and the unemployment rate has remained near historical lows, at 3.7 percent. Strong job creation has been accompanied by an increase in the supply of workers, particularly among individuals aged 25 to 54, and a continued strong pace of immigration. Job vacancies have declined, and nominal wage growth has been easing. Although the jobs-to-workers gap has narrowed, labor demand still exceeds the supply 

In [236]:
# Article 1 [Powell, Mar 2023]

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 1
fed_CESO1 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 1
fed_MP1 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article1_text = fed_CESO1 + fed_MP1
print(article1_text)

["The data from January on employment, consumer spending, manufacturing production, and inflation have partly reversed the softening trends that we had seen in the data just a month ago. Some of this reversal likely reflects the unseasonably warm weather in January in much of the country. Still, the breadth of the reversal along with revisions to the previous quarter suggests that inflationary pressures are running higher than expected at the time of our previous Federal Open Market Committee (FOMC) meeting.\nFrom a broader perspective, inflation has moderated somewhat since the middle of last year but remains well above the FOMC's longer-run objective of 2 percent. The 12-month change in total personal consumption expenditures (PCE) prices has slowed from its peak of 7 percent in June to 5.4 percent in January as energy prices have declined and supply chain bottlenecks have eased.\nOver the past 12 months, core PCE inflation, which excludes the volatile food and energy prices, was 4.7

In [237]:
# Article 2 [Powell, Jun 2023]

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 2
fed_CESO2 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 2
fed_MP2 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article2_text = fed_CESO2 + fed_MP2
print(article2_text)

['The U.S. economy slowed significantly last year, and recent indicators suggest that economic activity has continued to expand at a modest pace. Although growth in consumer spending has picked up this year, activity in the housing sector remains weak, largely reflecting higher mortgage rates. Higher interest rates and slower output growth also appear to be weighing on business fixed investment.\nThe labor market remains very tight. Over the first five months of the year, job gains averaged a robust 314,000 jobs per month. The unemployment rate moved up but remained low in May, at 3.7 percent. There are some signs that supply and demand in the labor market are coming into better balance. The labor force participation rate has moved up in recent months, particularly for individuals aged 25 to 54. Nominal wage growth has shown some signs of easing, and job vacancies have declined so far this year. While the jobs-to-workers gap has narrowed, labor demand still substantially exceeds the su

In [238]:
# Article 3 [Powell, Feb 2022]

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 3
fed_CESO3 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 3
fed_MP3 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article3_text = fed_CESO3 + fed_MP3
print(article3_text)

["Economic activity expanded at a robust 5 1/2 percent pace last year, reflecting progress on vaccinations and the reopening of the economy, fiscal and monetary policy support, and the healthy financial positions of households and businesses. The rapid spread of the Omicron variant led to some slowing in economic activity early this year, but with cases having declined sharply since mid-January, the slowdown seems to have been brief.\nThe labor market is extremely tight. Payroll employment rose by 6.7 million in 2021, and job gains were robust in January. The unemployment rate declined substantially over the past year and stood at 4.0 percent in January, reaching the median of Federal Open Market Committee (FOMC) participants' estimates of its longer-run normal level. The improvements in labor market conditions have been widespread, including for workers at the lower end of the wage distribution as well as for African Americans and Hispanics. Labor demand is very strong, and while labo

In [239]:
# Article 4 [Powell, Jun 2022]

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 4
fed_CESO4 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 4
fed_MP4 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article4_text = fed_CESO4 + fed_MP4
print(article4_text)

["Inflation remains well above our longer-run goal of 2 percent. Over the 12 months ending in April, total PCE (personal consumption expenditures) prices rose 6.3 percent; excluding the volatile food and energy categories, core PCE prices rose 4.9 percent. The available data for May suggest the core measure likely held at that pace or eased slightly last month. Aggregate demand is strong, supply constraints have been larger and longer lasting than anticipated, and price pressures have spread to a broad range of goods and services. The surge in prices of crude oil and other commodities that resulted from Russia's invasion of Ukraine is boosting prices for gasoline and fuel and is creating additional upward pressure on inflation. And COVID-19-related lockdowns in China are likely to exacerbate ongoing supply chain disruptions. Over the past year, inflation also increased rapidly in many foreign economies, as discussed in a box in the June Monetary Policy Report.\nOverall economic activit

In [240]:
# Article 5 [Powell, Feb 2021]

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 5
fed_CESO5 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 5
fed_MP5 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article5_text = fed_CESO5 + fed_MP5
print(article5_text)

['The path of the economy continues to depend significantly on the course of the virus and the measures undertaken to control its spread. The resurgence in COVID-19 cases, hospitalizations, and deaths in recent months is causing great hardship for millions of Americans and is weighing on economic activity and job creation. Following a sharp rebound in economic activity last summer, momentum slowed substantially, with the weakness concentrated in the sectors most adversely affected by the resurgence of the virus. In recent weeks, the number of new cases and hospitalizations has been falling, and ongoing vaccinations offer hope for a return to more normal conditions later this year. However, the economic recovery remains uneven and far from complete, and the path ahead is highly uncertain.\nHousehold spending on services remains low, especially in sectors that typically require people to gather closely, including leisure and hospitality. In contrast, household spending on goods picked up

In [241]:
# Article 6 [Powell, Jul 2021] 

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 6
fed_CESO6 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 6
fed_MP6 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article6_text = fed_CESO6 + fed_MP6
print(article6_text)

["Over the first half of 2021, ongoing vaccinations have led to a reopening of the economy and strong economic growth, supported by accommodative monetary and fiscal policy. Real gross domestic product this year appears to be on track to post its fastest rate of increase in decades. Household spending is rising at an especially rapid pace, boosted by strong fiscal support, accommodative financial conditions, and the reopening of the economy. Housing demand remains very strong, and overall business investment is increasing at a solid pace. As described in the Monetary Policy Report, supply constraints have been restraining activity in some industries, most notably in the motor vehicle industry, where the worldwide shortage of semiconductors has sharply curtailed production so far this year.\nConditions in the labor market have continued to improve, but there is still a long way to go. Labor demand appears to be very strong; job openings are at a record high, hiring is robust, and many w

In [242]:
# Article 7 [Powell, Feb 2020]
# here for example we have a peculiar change in the press conference report structure

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 7
fed_CESO7 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 7
fed_MP7 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article7_text = fed_CESO7 + fed_MP7
print(article7_text)

["The economic expansion is well into its 11th year, and it is the longest on record. Over the second half of last year, economic activity increased at a moderate pace and the labor market strengthened further, as the economy appeared resilient to the global headwinds that had intensified last summer. Inflation has been low and stable but has continued to run below the Federal Open Market Committee's (FOMC) symmetric 2 percent objective.\nJob gains averaged 200,000 per month in the second half of last year, and an additional 225,000 jobs were added in January. The pace of job gains has remained above what is needed to provide jobs for new workers entering the labor force, allowing the unemployment rate to move down further over the course of last year. The unemployment rate was 3.6 percent last month and has been near half-century lows for more than a year. Job openings remain plentiful. Employers are increasingly willing to hire workers with fewer skills and train them. As a result, t

In [243]:
# Article 8 [Powell, Jun 2020]

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 8
fed_CESO8 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy and Federal Reserve Actions to Support the Flow of Credit\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 8
fed_MP8 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article8_text = fed_CESO8 + fed_MP8
print(article8_text)

["Beginning in mid-March, economic activity fell at an unprecedented speed in response to the outbreak of the virus and the measures taken to control its spread. Even after the unexpectedly positive May employment report, nearly 20 million jobs have been lost on net since February, and the reported unemployment rate has risen about 10 percentage points, to 13.3 percent. The decline in real gross domestic product (GDP) this quarter is likely to be the most severe on record. The burden of the downturn has not fallen equally on all Americans. Instead, those least able to withstand the downturn have been affected most. As discussed in the June Monetary Policy Report, low-income households have experienced, by far, the sharpest drop in employment, while job losses of African Americans, Hispanics, and women have been greater than that of other groups. If not contained and reversed, the downturn could further widen gaps in economic well-being that the long expansion had made some progress in 

In [244]:
# Article 9 [Powell, Feb 2019]

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 9
fed_CESO9 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 9
fed_MP9 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article9_text = fed_CESO9 + fed_MP9
print(article9_text)

["The economy grew at a strong pace, on balance, last year, and employment and inflation remain close to the Federal Reserve's statutory goals of maximum employment and stable prices‑‑our dual mandate.\nBased on the available data, we estimate that gross domestic product (GDP) rose a little less than 3 percent last year following a 2.5 percent increase in 2017. Last year's growth was led by strong gains in consumer spending and increases in business investment. Growth was supported by increases in employment and wages, optimism among households and businesses, and fiscal policy actions. In the last couple of months, some data have softened but still point to spending gains this quarter. While the partial government shutdown created significant hardship for government workers and many others, the negative effects on the economy are expected to be fairly modest and to largely unwind over the next several months.\nThe job market remains strong. Monthly job gains averaged 223,000 in 2018, 

In [245]:
# Article 10 [Powell, Jul 2019]
# Here we cope with issues caused by italic styled words 

# Current Economic Situation and Outlook text extraction 1
pattern = "Current Economic Situation and Outlook" + r"(.*?)" + "A box in the July"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 10
fed_CESO101 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Current Economic Situation and Outlook text extraction 2
pattern = "different levels of education." + r"(.*?)" + "address these issues."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 10
fed_CESO102 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction 1
pattern = "Against this backdrop," + r"(.*?)" + "The July"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 10
fed_MP101 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction 2 
pattern = "The FOMC routinely looks" + r"(.*?)" + "The review has started"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 10
fed_MP102 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article10_text = fed_CESO101 + fed_CESO102 + fed_MP101 + fed_MP102
print(article10_text)

["The economy performed reasonably well over the first half of 2019, and the current expansion is now in its 11th year. However, inflation has been running below the Federal Open Market Committee's (FOMC) symmetric 2 percent objective, and crosscurrents, such as trade tensions and concerns about global growth, have been weighing on economic activity and the outlook.\nThe labor market remains healthy. Job gains averaged 172,000 per month from January through June. This number is lower than the average of 223,000 a month last year but above the pace needed to provide jobs for new workers entering the labor force. Consequently, the unemployment rate moved down from 3.9 percent in December to 3.7 percent in June, close to its lowest level in 50 years. Job openings remain plentiful, and employers are increasingly willing to hire workers with fewer skills and train them. As a result, the benefits of a strong job market have been more widely shared in recent years. Indeed, wage gains have bee

In [246]:
# Article 11 [Powell, Feb 2018] 

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 11
fed_CESO11 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 11
fed_MP11 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article11_text = fed_CESO11 + fed_MP11
print(article11_text)

["The U.S. economy grew at a solid pace over the second half of 2017 and into this year. Monthly job gains averaged 179,000 from July through December, and payrolls rose an additional 200,000 in January. This pace of job growth was sufficient to push the unemployment rate down to 4.1 percent, about 3/4 percentage point lower than a year earlier and the lowest level since December 2000. In addition, the labor force participation rate remained roughly unchanged, on net, as it has for the past several years--that is a sign of job market strength, given that retiring baby boomers are putting downward pressure on the participation rate. Strong job gains in recent years have led to widespread reductions in unemployment across the income spectrum and for all major demographic groups. For example, the unemployment rate for adults without a high school education has fallen from about 15 percent in 2009 to 5-1/2 percent in January of this year, while the jobless rate for those with a college deg

In [247]:
# Article 12 [Powell, Jul 2018] 

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 12
fed_CESO12 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 12
fed_MP12 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article12_text = fed_CESO12 + fed_MP12
print(article10_text)

["The economy performed reasonably well over the first half of 2019, and the current expansion is now in its 11th year. However, inflation has been running below the Federal Open Market Committee's (FOMC) symmetric 2 percent objective, and crosscurrents, such as trade tensions and concerns about global growth, have been weighing on economic activity and the outlook.\nThe labor market remains healthy. Job gains averaged 172,000 per month from January through June. This number is lower than the average of 223,000 a month last year but above the pace needed to provide jobs for new workers entering the labor force. Consequently, the unemployment rate moved down from 3.9 percent in December to 3.7 percent in June, close to its lowest level in 50 years. Job openings remain plentiful, and employers are increasingly willing to hire workers with fewer skills and train them. As a result, the benefits of a strong job market have been more widely shared in recent years. Indeed, wage gains have bee

In [248]:
# Article 13 [Yellen, Feb 2017] 

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 13
fed_CESO13 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 13
fed_MP13 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article13_text = fed_CESO13 + fed_MP13
print(article13_text)

["Since my appearance before this Committee last June, the economy has continued to make progress toward our dual-mandate objectives of maximum employment and price stability. In the labor market, job gains averaged 190,000 per month over the second half of 2016, and the number of jobs rose an additional 227,000 in January. Those gains bring the total increase in employment since its trough in early 2010 to nearly 16 million. In addition, the unemployment rate, which stood at 4.8 percent in January, is more than 5 percentage points lower than where it stood at its peak in 2010 and is now in line with the median of the Federal Open Market Committee (FOMC) participants' estimates of its longer-run normal level. A broader measure of labor underutilization, which includes those marginally attached to the labor force and people who are working part time but would like a full-time job, has also continued to improve over the past year. In addition, the pace of wage growth has picked up relati

In [249]:
# Article 14 [Yellen, Jul 2017] 

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 14
fed_CESO14 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = r"Monetary Policy\r?\n(.*?)\r?\nThank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 14
fed_MP14 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article14_text = fed_CESO14 + fed_MP14
print(article14_text)

["Since my appearance before this committee in February, the labor market has continued to strengthen. Job gains have averaged 180,000 per month so far this year, down only slightly from the average in 2016 and still well above the pace we estimate would be sufficient, on average, to provide jobs for new entrants to the labor force. Indeed, the unemployment rate has fallen about 1/4 percentage point since the start of the year, and, at 4.4 percent in June, is 5‑1/2 percentage points below its peak in 2010 and modestly below the median of Federal Open Market Committee (FOMC) participants' assessments of its longer-run normal level. The labor force participation rate has changed little, on net, this year--another indication of improving conditions in the jobs market, given the demographically driven downward trend in this series. A broader measure of labor market slack that includes workers marginally attached to the labor force and those working part time who would prefer full-time work

In [250]:
# Article 15 [Yellen, Feb 2016]

# Current Economic Situation and Outlook text extraction
pattern = r"Current Economic Situation and Outlook\r?\n(.*?)\r?\nMonetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 15
fed_CESO15 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "Turning to monetary policy" + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 15
fed_MP15 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article15_text = fed_CESO15 + fed_MP15
print(article15_text)

["Since my appearance before this Committee last July, the economy has made further progress toward the Federal Reserve's objective of maximum employment. And while inflation is expected to remain low in the near term, in part because of the further declines in energy prices, the Federal Open Market Committee (FOMC) expects that inflation will rise to its 2 percent objective over the medium term.\nIn the labor market, the number of nonfarm payroll jobs rose 2.7 million in 2015, and posted a further gain of 150,000 in January of this year. The cumulative increase in employment since its trough in early 2010, is now more than 13 million jobs. Meanwhile, the unemployment rate fell to 4.9 percent in January, 0.8 percentage point below its level a year ago and in line with the median of FOMC participants' most recent estimates of its longer-run normal level. Other measures of labor market conditions have also shown solid improvement, with noticeable declines over the past year in the number

In [251]:
# Article 16 [Yellen, Jun 2016] 

# Current Economic Situation and Outlook text extraction 1
pattern = "Since my last appearance" + r"(.*?)" + "In addition, as"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 16
fed_CESO161 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Current Economic Situation and Outlook text extraction 2
pattern = "including for African Americans and Hispanics." + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 16
fed_CESO162 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "I will turn next to monetary policy." + r"(.*?)" + "Thank you."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 16
fed_MP16 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article16_text = fed_CESO161 + fed_CESO162 + fed_MP16
print(article16_text)

["before this Committee in February, the economy has made further progress toward the Federal Reserve's objective of maximum employment. And while inflation has continued to run below our 2 percent objective, the Federal Open Market Committee (FOMC) expects inflation to rise to that level over the medium term. However, the pace of improvement in the labor market appears to have slowed more recently, suggesting that our cautious approach to adjusting monetary policy remains appropriate.\r\n    \n\r\n      In the labor market, the cumulative increase in jobs since its trough in early 2010 has now topped 14 million, while the unemployment rate has fallen more than 5 percentage points from its peak.", "Despite these declines, however, it is troubling that unemployment rates for these minority groups remain higher than for the nation overall, and that the annual income of the median African American household is still well below the median income of other U.S. households.\r\n    \n\r\n     

In [252]:
# Article 17 [Yellen, Feb 2015] 

# Current Economic Situation and Outlook text extraction
pattern = "Since my appearance" + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 17
fed_CESO17 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "I will now turn to monetary policy." + r"(.*?)" + "Policy Normalization"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 17
fed_MP17 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Policy normalization text extraction
pattern = "Let me now" + r"(.*?)" + "Summary"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 17
fed_PN17 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Summary text extraction
pattern = ", there has been important" + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 17
fed_S17 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article17_text = fed_CESO17 + fed_MP17 + fed_PN17 + fed_S17
print(article17_text)

["before this Committee last July, the employment situation in the United States has been improving along many dimensions. The unemployment rate now stands at 5.7 percent, down from just over 6 percent last summer and from 10 percent at its peak in late 2009. The average pace of monthly job gains picked up from about 240,000 per month during the first half of last year to 280,000 per month during the second half, and employment rose 260,000 in January. In addition, long-term unemployment has declined substantially, fewer workers are reporting that they can find only part-time work when they would prefer full-time employment, and the pace of quits--often regarded as a barometer of worker confidence in labor market opportunities--has recovered nearly to its pre-recession level. However, the labor force participation rate is lower than most estimates of its trend, and wage growth remains sluggish, suggesting that some cyclical weakness persists. In short, considerable progress has been ac

In [253]:
# Article 18 [Yellen, Jul 2015] 

# Current Economic Situation and Outlook text extraction
pattern = "Since my appearance" + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 18
fed_CESO18 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "Regarding monetary policy," + r"(.*?)" + "Federal Reserve Transparency and Accountability"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 18
fed_MP18 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Federal Reserve Transparency and Accountability text extraction
pattern = "These statements pertaining to policy normalization" + r"(.*?)" + "Summary"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 18
fed_FRTA18 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Summary text extraction
pattern = ", we have seen," + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 18
fed_S18 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article18_text = fed_CESO18 + fed_MP18 + fed_FRTA18 + fed_S18
print(article18_text)

["before this Committee in February, the economy has made further progress toward the Federal Reserve's objective of maximum employment, while inflation has continued to run below the level that the Federal Open Market Committee (FOMC) judges to be most consistent over the longer run with the Federal Reserve's statutory mandate to promote maximum employment and price stability.\r\n    \n\r\n      In the labor market, the unemployment rate now stands at 5.3 percent, slightly below its level at the end of last year and down more than 4-1/2 percentage points from its 10 percent peak in late 2009. Meanwhile, monthly gains in nonfarm payroll employment averaged about 210,000 over the first half of this year, somewhat less than the robust 260,000 average seen in 2014 but still sufficient to bring the total increase in employment since its trough to more than 12 million jobs. Other measures of job market health are also trending in the right direction, with noticeable declines over the past y

In [254]:
# Article 19 [Yellen, Feb 2014] 

# Current Economic Situation and Outlook text extraction 1
pattern = "The economic" + r"(.*?)" + "since the previous"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 19
fed_CESO191 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Current Economic Situation and Outlook text extraction 2 
pattern = "last July," + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 19
fed_CESO192 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "Turning to monetary policy," + r"(.*?)" + "Strengthening the Financial System"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 19
fed_MP19 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Strengthening the Financial System text extraction
pattern = "I will finish with an update" + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 19
fed_SFS19 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article19_text = fed_CESO191 + fed_CESO192 + fed_MP19 + fed_SFS19
print(article19_text)

['recovery gained greater traction in the second half of last year. Real gross domestic product (GDP) is currently estimated to have risen at an average annual rate of more than 3-1/2 percent in the third and fourth quarters, up from a 1-3/4 percent pace in the first half. The pickup in economic activity has fueled further progress in the labor market. About 1-1/4 million jobs have been added to payrolls', "and 3-1/4 million have been added since August 2012, the month before the Federal Reserve began a new round of asset purchases to add momentum to the recovery. The unemployment rate has fallen nearly a percentage point since the middle of last year and 1-1/2 percentage points since the beginning of the current asset purchase program. Nevertheless, the recovery in the labor market is far from complete. The unemployment rate is still well above levels that Federal Open Market Committee (FOMC) participants estimate is consistent with maximum sustainable employment. Those out of a job f

In [255]:
# Article 20 [Yellen, Jul 2014] 

# Current Economic Situation and Outlook text extraction
pattern = "The economy is continuing" + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 20
fed_CESO20 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "I will now turn to monetary policy." + r"(.*?)" + "Financial Stability"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 20
fed_MP20 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Financial Stability text extraction
pattern = "The Committee recognizes" + r"(.*?)" + "Summary"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 20
fed_FS20 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Summary text extraction
pattern = "strengthening the financial system." + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 20
fed_S20 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article20_text = fed_CESO20 + fed_MP20 + fed_FS20 + fed_S20
print(article20_text)

["to make progress toward the Federal Reserve's objectives of maximum employment and price stability.\r\n    \n\r\n       In the labor market, gains in total nonfarm payroll employment averaged about 230,000 per month over the first half of this year, a somewhat stronger pace than in 2013 and enough to bring the total increase in jobs during the economic recovery thus far to more than 9 million. The unemployment rate has fallen nearly 1-1/2 percentage points over the past year and stood at 6.1 percent in June, down about 4 percentage points from its peak. Broader measures of labor utilization have also registered notable improvements over the past year.\r\n    \n\r\n       Real gross domestic product (GDP) is estimated to have declined sharply in the first quarter. The decline appears to have resulted mostly from transitory factors, and a number of recent indicators of production and spending suggest that growth rebounded in the second quarter, but this bears close watching. The housin

In [256]:
# Article 21 [Bernanke, Feb 2013] 

# Current Economic Situation and Outlook text extraction
pattern = "Since I last reported" + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 21
fed_CESO21 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "With unemployment well" + r"(.*?)" + "Thoughts on Fiscal Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 21
fed_MP21 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Thoughts on Fiscal Policy text extraction
pattern = "The Committee recognizes" + r"(.*?)" + "choices we face."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 21
fed_TFP21 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article21_text = fed_CESO21 + fed_MP21 + fed_TFP21 
print(article21_text)

["to this Committee in mid-2012, economic activity in the United States has continued to expand at a moderate if somewhat uneven pace. In particular, real gross domestic product (GDP) is estimated to have risen at an annual rate of about 3 percent in the third quarter but to have been essentially flat in the fourth quarter.1 The pause in real GDP growth last quarter does not appear to reflect a stalling-out of the recovery. Rather, economic activity was temporarily restrained by weather-related disruptions and by transitory declines in a few volatile categories of spending, even as demand by U.S. households and businesses continued to expand. Available information suggests that economic growth has picked up again this year.\r\n    \n\r\n       Consistent with the moderate pace of economic growth, conditions in the labor market have been improving gradually. Since July, nonfarm payroll employment has increased by 175,000 jobs per month on average, and the unemployment rate declined 0.3 

In [257]:
# Article 22 [Bernanke, Jul 2013] 

# Current Economic Situation and Outlook text extraction
pattern = "The economic recovery" + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 22
fed_CESO22 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "With unemployment still" + r"(.*?)" + "Regulatory Reform"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 22
fed_MP22 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Regulatory Reform text extraction
pattern = "I will finish by" + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 22
fed_RF22 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article22_text = fed_CESO22 + fed_MP22 + fed_RF22
print(article22_text)

["has continued at a moderate pace in recent quarters despite the strong headwinds created by federal fiscal policy.\r\n    \n\r\n       Housing has contributed significantly to recent gains in economic activity. Home sales, house prices, and residential construction have moved up over the past year, supported by low mortgage rates and improved confidence in both the housing market and the economy. Rising housing construction and home sales are adding to job growth, and substantial increases in home prices are bolstering household finances and consumer spending while reducing the number of homeowners with underwater mortgages. Housing activity and prices seem likely to continue to recover, notwithstanding the recent increases in mortgage rates, but it will be important to monitor developments in this sector carefully.\r\n    \n\r\n       Conditions in the labor market are improving gradually. The unemployment rate stood at 7.6 percent in June, about a half percentage point lower than i

In [258]:
# Article 23 [Bernanke, Feb 2012] 

# Current Economic Situation and Outlook text extraction
pattern = "The recovery of the U.S." + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 23
fed_CESO23 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "Against this" + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 23
fed_MP23 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article23_text = fed_CESO23 + fed_MP23
print(article23_text)

["economy continues, but the pace of expansion has been uneven and modest by historical standards. After minimal gains in the first half of last year, real gross domestic product (GDP) increased at a 2-1/4 percent annual rate in the second half.1 The limited information available for 2012 is consistent with growth proceeding, in coming quarters, at a pace close to or somewhat above the pace that was registered during the second half of last year.\r\n    \n\r\n       We have seen some positive developments in the labor market. Private payroll employment has increased by 165,000 jobs per month on average since the middle of last year, and nearly 260,000 new private-sector jobs were added in January. The job gains in recent months have been relatively widespread across industries. In the public sector, by contrast, layoffs by state and local governments have continued. The unemployment rate hovered around 9 percent for much of last year but has moved down appreciably since September, reac

In [259]:
# Article 24 [Bernanke, Jul 2012] 

# Current Economic Situation and Outlook text extraction
pattern = "The U.S. economy" + r"(.*?)" + "Risks to the Outlook"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 24
fed_CESO24 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Risks to the Outlook text extraction
pattern = "Participants at the" + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 24
fed_RO24 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "In view of the" + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 24
fed_MP24 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article24_text = fed_CESO24 + fed_RO24 + fed_MP24
print(article24_text)

["has continued to recover, but economic activity appears to have decelerated somewhat during the first half of this year. After rising at an annual rate of 2-1/2 percent in the second half of 2011, real gross domestic product (GDP) increased at a 2 percent pace in the first quarter of 2012, and available indicators point to a still-smaller gain in the second quarter.\r\n    \n\r\n       Conditions in the labor market improved during the latter part of 2011 and early this year, with the unemployment rate falling about a percentage point over that period. However, after running at nearly 200,000 per month during the fourth and first quarters, the average increase in payroll employment shrank to 75,000 per month during the second quarter. Issues related to seasonal adjustment and the unusually warm weather this past winter can account for a part, but only a part, of this loss of momentum in job creation. At the same time, the jobless rate has recently leveled out at just over 8 percent.\

In [260]:
# Article 25 [Bernanke, Mar 2011] 

# Current Economic Situation and Outlook text extraction
pattern = "Following the" + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 25
fed_CESO25 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "As I noted earlier," + r"(.*?)" + "Federal Reserve Transparency"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 25
fed_MP25 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Federal Reserve Transparency text extraction
pattern = "The Congress established" + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 25
fed_FEDT25 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article25_text = fed_CESO25 + fed_MP25 + fed_FEDT25
print(article25_text)

["stabilization of economic activity in mid-2009, the U.S. economy is now in its seventh quarter of growth; last quarter, for the first time in this expansion, our nation's real gross domestic product (GDP) matched its pre-crisis peak. Nevertheless, job growth remains relatively weak and the unemployment rate is still high.\nIn its early stages, the economic recovery was largely attributable to the stabilization of the financial system, the effects of expansionary monetary and fiscal policies, and a strong boost to production from businesses rebuilding their depleted inventories. Economic growth slowed significantly in the spring and early summer of 2010, as the impetus from inventory building and fiscal stimulus diminished and as Europe's debt problems roiled global financial markets. More recently, however, we have seen increased evidence that a self-sustaining recovery in consumer and business spending may be taking hold. Notably, real consumer spending has grown at a solid pace sin

In [261]:
# Article 26 [Bernanke, Jul 2011] 

# Current Economic Situation and Outlook text extraction
pattern = "The U.S. economy" + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 26
fed_CESO26 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "FOMC members' judgments" + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 26
fed_MP26 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article26_text = fed_CESO26 + fed_MP26
print(article26_text)

["has continued to recover, but the pace of the expansion so far this year has been modest. After increasing at an annual rate of 2-3/4 percent in the second half of 2010, real gross domestic product (GDP) rose at about a 2 percent rate in the first quarter of this year, and incoming data suggest that the pace of recovery remained soft in the spring. At the same time, the unemployment rate, which had appeared to be on a downward trajectory at the turn of the year, has moved back above 9 percent.\r\n    \n\r\n       In part, the recent weaker-than-expected economic performance appears to have been the result of several factors that are likely to be temporary. Notably, the run-up in prices of energy, especially gasoline, and food has reduced consumer purchasing power. In addition, the supply chain disruptions that occurred following the earthquake in Japan caused U.S. motor vehicle producers to sharply curtail assemblies and limited the availability of some models. Looking forward, howev

In [262]:
# Article 27 [Bernanke, Feb 2010] 

# Current Economic Situation and Outlook text extraction
pattern = "Although the" + r"(.*?)" + "Monetary Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 27
fed_CESO27 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "Over the past year," + r"(.*?)" + "Federal Reserve Transparency"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 27
fed_MP27 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Federal Reserve Transparency text extraction
pattern = "The Federal Reserve is committed" + r"(.*?)" + "Regulatory Reform"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 27
fed_FEDT27 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Regulatory Reform text extraction
pattern = "Strengthening our financial" + r"(.*?)" + "financial regulatory framework."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 27
fed_RF27 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article27_text = fed_CESO27 + fed_MP27 + fed_FEDT27 + fed_RF27
print(article27_text)

["recession officially began more than two years ago, U.S. economic activity contracted particularly sharply following the intensification of the global financial crisis in the fall of 2008. Concerted efforts by the Federal Reserve, the Treasury Department, and other U.S. authorities to stabilize the financial system, together with highly stimulative monetary and fiscal policies, helped arrest the decline and are supporting a nascent economic recovery. Indeed, the U.S. economy expanded at about a 4 percent annual rate during the second half of last year. A significant portion of that growth, however, can be attributed to the progress firms made in working down unwanted inventories of unsold goods, which left them more willing to increase production. As the impetus provided by the inventory cycle is temporary, and as the fiscal support for economic growth likely will diminish later this year, a sustained recovery will depend on continued growth in private-sector final demand for goods a

In [263]:
# Article 28 [Bernanke, Jul 2010] 

# Current Economic Situation and Outlook text extraction
pattern = "The economic" + r"(.*?)" + "Federal Reserve Policy"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 28
fed_CESO28 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Federal Reserve Policy text extraction
pattern = "The Federal Reserve's" + r"(.*?)" + "Financial Reform Legislation"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 28
fed_FEDP28 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Financial Reform Legislation text extraction
pattern = "Last week," + r"(.*?)" + "Thank you"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 28
fed_FRL28 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article28_text = fed_CESO28 + fed_FEDP28 + fed_FRL28
print(article28_text)

["expansion that began in the middle of last year is proceeding at a moderate pace, supported by stimulative monetary and fiscal policies. Although fiscal policy and inventory restocking will likely be providing less impetus to the recovery than they have in recent quarters, rising demand from households and businesses should help sustain growth. In particular, real consumer spending appears to have expanded at about a 2-1/2 percent annual rate in the first half of this year, with purchases of durable goods increasing especially rapidly. However, the housing market remains weak, with the overhang of vacant or foreclosed houses weighing on home prices and construction.\nAn important drag on household spending is the slow recovery in the labor market and the attendant uncertainty about job prospects. After two years of job losses, private payrolls expanded at an average of about 100,000 per month during the first half of this year, a pace insufficient to reduce the unemployment rate mate

In [264]:
# Article 29 [Bernanke, Feb 2009] 

# Recent Economic and Financial Developments and the Policy Responses text extraction
pattern = "As you are aware," + r"(.*?)" + "Federal Reserve Transparency"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 29
fed_REFDPR29 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Federal Reserve Transparency text extraction
pattern = "The Federal Reserve is committed" + r"(.*?)" + "The Economic Outlook and the FOMC's Quarterly Projections"
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 29
fed_FEDT29 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# The Economic Outlook and the FOMC's Quarterly Projections text extraction
pattern = "In their economic" + r"(.*?)" + "and price stability."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 29
fed_ECOQP29 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article29_text = fed_REFDPR29 + fed_FEDT29 + fed_ECOQP29
print(article29_text)

["the U.S. economy is undergoing a severe contraction.\xa0 Employment has fallen steeply since last autumn, and the unemployment rate has moved up to 7.6 percent.\xa0 The deteriorating job market, considerable losses of equity and housing wealth, and tight lending conditions have weighed down consumer sentiment and spending.\xa0 In addition, businesses have cut back capital outlays in response to the softening outlook for sales as well as the difficulty of obtaining credit.\xa0 In contrast to the first half of last year, when robust foreign demand for U.S. goods and services provided some offset to weakness in domestic spending, exports slumped in the second half as our major trading partners fell into recession and some measures of global growth turned negative for the first time in more than 25 years.\xa0 In all, U.S. real gross domestic product (GDP) declined slightly in the third quarter of 2008, and that decline steepened considerably in the fourth quarter.\xa0 The sharp contracti

In [265]:
# Article 30 [Bernanke, Jul 2009] 

# Economic and Financial Developments in the First Half of 2009 text extraction
pattern = "Aggressive policy actions" + r"(.*?)" + "the next two years."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 30
fed_EFDFH30 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Monetary Policy text extraction
pattern = "In light of the" + r"(.*?)" + "and price stability."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 30
fed_MP30 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Fiscal Policy text extraction
pattern = "Our economy and" + r"(.*?)" + "nor durable economic growth."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 30
fed_FP30 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Trasparency and Accountability text extraction
pattern = "The Congress and the American" + r"(.*?)" + "monetary policy independence."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 30
fed_TA30 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article30_text = fed_EFDFH30 + fed_MP30 + fed_FP30 + fed_TA30
print(article30_text)

["taken around the world last fall may well have averted the collapse of the global financial system, an event that would have had extremely adverse and protracted consequences for the world economy. Even so, the financial shocks that hit the global economy in September and October were the worst since the 1930s, and they helped push the global economy into the deepest recession since World War II. The U.S. economy contracted sharply in the fourth quarter of last year and the first quarter of this year. More recently, the pace of decline appears to have slowed significantly, and final demand and production have shown tentative signs of stabilization. The labor market, however, has continued to weaken. Consumer price inflation, which fell to low levels late last year, remained subdued in the first six months of 2009.\nTo promote economic recovery and foster price stability, the Federal Open Market Committee (FOMC) last year brought its target for the federal funds rate to a historically

In [266]:
# Article 31 [Bernanke, Feb 2008] 

# Press Conference text extraction 
pattern = "in their financial dealings." + r"(.*?)" + "Thank you."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 31
fed_G31 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article31_text = fed_G31
print(article31_text)

["The economic situation has become distinctly less favorable since the time of our July report.\xa0 Strains in financial markets, which first became evident late last summer, have persisted; and pressures on bank capital and the continued poor functioning of markets for securitized credit have led to tighter credit conditions for many households and businesses.\xa0 The growth of real gross domestic product (GDP) held up well through the third quarter despite the financial turmoil, but it has since slowed sharply.\xa0 Labor market conditions have similarly softened, as job creation has slowed and the unemployment rate--at 4.9 percent in January--has moved up somewhat.\r\n    \n\r\n       Many of the challenges now facing our economy stem from the continuing contraction of the U.S. housing market.\xa0 In 2006, after a multiyear boom in residential construction and house prices, the housing market reversed course.\xa0 Housing starts and sales of new homes are now less than half of their 

In [267]:
# Article 32 [Bernanke, Jul 2008] 

# Press Conference text extraction 
pattern = "The U.S. economy" + r"(.*?)" + "Thank you."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 32
fed_G32 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article32_text = fed_G32
print(article32_text)

["and financial system have confronted some significant challenges thus far in 2008.\xa0 The contraction in housing activity that began in 2006 and the associated deterioration in mortgage markets that became evident last year have led to sizable losses at financial institutions and a sharp tightening in overall credit conditions.\xa0 The effects of the housing contraction and of the financial headwinds on spending and economic activity have been compounded by rapid increases in the prices of energy and other commodities, which have sapped household purchasing power even as they have boosted inflation.\xa0 Against this backdrop, economic activity has advanced at a sluggish pace during the first half of this year, while inflation has remained elevated.\nFollowing a significant reduction in its policy rate over the second half of 2007, the Federal Open Market Committee (FOMC) eased policy considerably further through the spring to counter actual and expected weakness in economic growth a

In [268]:
# Article 33 [Bernanke, Feb 2007] 

# Press Conference text extraction 
pattern = "Real activity" + r"(.*?)" + "Thank you."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 33
fed_G33 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article33_text = fed_G33
print(article33_text)

["in the United States expanded at a solid pace in 2006, although the pattern of growth was uneven. After a first-quarter rebound from weakness associated with the effects of the hurricanes that ravaged the Gulf Coast the previous summer, output growth moderated somewhat on average over the remainder of 2006. Real gross domestic product (GDP) is currently estimated to have increased at an annual rate of about 2-3/4 percent in the second half of the year.\r\n    \n\r\n       As we anticipated in our July report, the U.S. economy appears to be making a transition from the rapid rate of expansion experienced over the preceding several years to a more sustainable average pace of growth. The principal source of the ongoing moderation has been a substantial cooling in the housing market, which has led to a marked slowdown in the pace of residential construction. However, the weakness in housing market activity and the slower appreciation of house prices do not seem to have spilled over to an

In [269]:
# Article 34 [Bernanke, Jul 2007] 

# Press Conference text extraction 
pattern = " As you know," + r"(.*?)" + "on these important issues."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 34
fed_G34 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article34_text = fed_G34
print(article34_text)

["this occasion marks the thirtieth year of semiannual testimony on the economy and monetary policy by the Federal Reserve.\xa0 In establishing these hearings, the Congress proved prescient in anticipating the worldwide trend toward greater transparency and accountability of central banks in the making of monetary policy.\xa0 Over the years, these testimonies and the associated reports have proved an invaluable vehicle for the Federal Reserve's communication with the public about monetary policy, even as they have served to enhance the Federal Reserve's accountability for achieving the dual objectives of maximum employment and price stability set for it by the Congress.\xa0 I take this opportunity to reiterate the Federal Reserve's strong support of the dual mandate; in pursuing maximum employment and price stability, monetary policy makes its greatest possible contribution to the general economic welfare.\xa0\r\n    \n\r\n       Let me now review the current economic situation and the

In [270]:
# Article 35 [Bernanke, Feb 2006] 

# Press Conference text extraction 
pattern = "the Congress has charged the Federal Reserve System." + r"(.*?)" + "in the years to come."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 35
fed_G35 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article35_text = fed_G35
print(article35_text)

["The U.S. economy performed impressively in 2005. Real gross domestic product (GDP) increased a bit more than 3 percent, building on the sustained expansion that gained traction in the middle of 2003. Payroll employment rose 2 million in 2005, and the unemployment rate fell below 5 percent. Productivity continued to advance briskly. \nThe economy achieved these gains despite some significant obstacles. Energy prices rose substantially yet again, in response to increasing global demand, hurricane-related disruptions to production, and concerns about the adequacy and reliability of supply. The Gulf Coast region suffered through severe hurricanes that inflicted a terrible loss of life; destroyed homes, personal property, businesses, and infrastructure on a massive scale; and displaced more than a million people. The storms also damaged facilities and disrupted production in many industries, with substantial effects on the energy and petrochemical sectors and on the region's ports. Full r

In [271]:
# Article 36 [Bernanke, Jul 2006] 

# Press Conference text extraction 
pattern = "Over the period" + r"(.*?)" + "Thank you."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 36
fed_G36 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article36_text = fed_G36
print(article36_text)

['since our February report, the U.S. economy has continued to expand. Real gross domestic product (GDP) is estimated to have risen at an annual rate of 5.6 percent in the first quarter of 2006. The available indicators suggest that economic growth has more recently moderated from that quite strong pace, reflecting a gradual cooling of the housing market and other factors that I will discuss. With respect to the labor market, more than 850,000 jobs were added, on net, to nonfarm payrolls over the first six months of the year, though these gains came at a slower pace in the second quarter than in the first. Last month the unemployment rate stood at 4.6 percent.\nInflation has been higher than we had anticipated in February, partly as a result of further sharp increases in the prices of energy and other commodities. During the first five months of the year, overall inflation as measured by the price index for personal consumption expenditures averaged 4.3 percent at an annual rate. Over 

In [272]:
# Article 37 [Greenspan, Feb 2005] 

# Press Conference text extraction 
pattern = "Over the first half of 2004," + r"(.*?)" + "of our nation and its people."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 37
fed_G37 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article37_text = fed_G36
print(article37_text)

['since our February report, the U.S. economy has continued to expand. Real gross domestic product (GDP) is estimated to have risen at an annual rate of 5.6 percent in the first quarter of 2006. The available indicators suggest that economic growth has more recently moderated from that quite strong pace, reflecting a gradual cooling of the housing market and other factors that I will discuss. With respect to the labor market, more than 850,000 jobs were added, on net, to nonfarm payrolls over the first six months of the year, though these gains came at a slower pace in the second quarter than in the first. Last month the unemployment rate stood at 4.6 percent.\nInflation has been higher than we had anticipated in February, partly as a result of further sharp increases in the prices of energy and other commodities. During the first five months of the year, overall inflation as measured by the price index for personal consumption expenditures averaged 4.3 percent at an annual rate. Over 

In [273]:
# Article 38 [Greenspan, Jul 2005] 

# Press Conference text extraction 
pattern = "In mid-February," + r"(.*?)" + "to maintain price stability."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 38
fed_G38 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article38_text = fed_G38
print(article38_text)

['when I presented our last report to the Congress, the economy, supported by\r\nstrong underlying fundamentals, appeared to be on a solid growth path, and\r\nthose circumstances prevailed through March. Accordingly, the Federal Open\r\nMarket Committee (FOMC) continued the process of a measured removal of monetary\r\naccommodation, which it had begun in June 2004, by raising the federal funds\r\nrate 1/4 percentage point at both the February and the March meetings.\r\n\nThe\r\nupbeat picture became cloudier this spring, when data on economic activity\r\nproved to be weaker than most market participants had anticipated and inflation\r\nmoved up in response to the jump in world oil prices. By the time of the May\r\nFOMC meeting, some evidence suggested that the economy might have been entering\r\na soft patch reminiscent of the middle of last year, perhaps as a result of\r\nhigher energy costs worldwide. In particular, employment gains had slowed from\r\nthe strong pace of the end of 20

In [274]:
# Article 39 [Greenspan, Feb 2004] 

# Press Conference text extraction 1
pattern = "When I testified before this committee in July," + r"(.*?)" + "export market more receptive."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 39
fed_G391 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2
pattern = "Although the prospects" + r"(.*?)" + "be thwarted and reversed."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 39
fed_G392 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 3
pattern = "In summary," + r"(.*?)" + "of effective price stability."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 39
fed_G393 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article39_text = fed_G391 + fed_G392 + fed_G393
print(article39_text)

["I reported that conditions had become a good deal more supportive of economic expansion over the previous few months.  A notable reduction in geopolitical concerns, strengthening confidence in economic prospects, and an improvement in financial conditions boded well for spending and production over the second half of the year.  Still, convincing signs of a sustained acceleration in activity were not yet in evidence.  Since then, the picture has brightened.  The gross domestic product expanded vigorously over the second half of 2003 while productivity surged, prices remained stable, and financial conditions improved further.  Overall, the economy has made impressive gains in output and real incomes; however, progress in creating jobs has been limited.  \r\n\r\nLooking forward, the prospects are good for sustained expansion of the U.S. economy.  The household sector's financial condition is stronger, and the business sector has made substantial strides in bolstering balance sheets.  Na

In [275]:
# Article 40 [Greenspan, Jul 2004] 

# Press Conference text extraction 
pattern = "Economic developments" + r"(.*?)" + "nation's continuing prosperity."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 40
fed_G40 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article40_text = fed_G40
print(article40_text)

["in the United States have generally been quite favorable in 2004, lending increasing support to the view that the expansion is self-sustaining.  Not only has economic activity quickened, but the expansion has become more broad-based and has produced notable gains in employment.  The evident strengthening in demand that underlies this improved performance doubtless has been a factor contributing to the rise in inflation this year.  But inflation also seems to have been boosted by transitory factors such as the surge in energy prices.  Those higher prices, by eroding households' disposable income, have accounted for at least some of the observed softness in consumer spending of late, a softness which should prove short-lived.\r\n\n\r\n\tWhen I testified before this Committee in February, many of the signs of the step-up in economic activity were already evident.  Capital spending had increased markedly in the second half of last year, no doubt spurred by significantly improving profits

In [276]:
# Article 41 [Greenspan, Feb 2003] 

# Press Conference text extraction 1
pattern = "When I testified" + r"(.*?)" + "my colleagues at the Federal Reserve."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 41
fed_G411 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2
pattern = "One notable feature" + r"(.*?)" + "the consumption of workers."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 41
fed_G412 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 3
pattern = "These are challenging" + r"(.*?)" + "of all our citizens."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 41
fed_G413 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article41_text = fed_G411 + fed_G412 + fed_G413
print(article41_text)

["before this committee last July, I noted that, while the growth of economic activity over the first half of the year had been spurred importantly by a swing from rapid inventory drawdown to modest inventory accumulation, that source of impetus would surely wind down in subsequent quarters, as it did.  We at the Federal Reserve recognized that a strengthening of final sales was an essential element of putting the expansion on a firm and sustainable track.  To support such a strengthening, monetary policy was set to continue its accommodative stance.\r\n\r\nIn the event, final sales continued to grow only modestly, and business outlays remained soft.  Concerns about corporate governance, which intensified for a time, were compounded over the late summer and into the fall by growing geopolitical tensions.  In particular, worries about the situation in Iraq contributed to an appreciable increase in oil prices.  These uncertainties, coupled with ongoing concerns surrounding macroeconomic 

In [277]:
# Article 42 [Greenspan, Apr 2003] 

# Press Conference text extraction 
pattern = "At that time," + r"(.*?)" + "it meets in six days."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 42
fed_G42 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article42_text = fed_G42
print(article42_text)

["I noted that the economic expansion over the preceding \r\n        year had been modest. Spending by households had contributed importantly \r\n        to the gains in economic activity. The nation's strong underlying productivity \r\n        performance was providing ongoing support for household income. That rise \r\n        in income, combined with low interest rates, reduced taxes, and the availability \r\n        of substantial home equity, had spurred solid gains in consumer spending \r\n        and a robust advance in residential construction. \nIn contrast, although the contraction in capital spending appeared to \r\n        have slowed, we had yet to see any convincing signs that a sustained pickup \r\n        in business spending was emerging. Moreover, heightened geopolitical tensions \r\n        were adding to the already considerable uncertainties that had clouded \r\n        the business outlook over the preceding three years. The general climate \r\n        of caution 

In [278]:
# Article 43 [Greenspan, Jul 2003] 

# Press Conference text extraction 1
pattern = "When in late April" + r"(.*?)" + "stoking inflationary pressures."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 43
fed_G431 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2
pattern = "The prospects for a" + r"(.*?)" + "new motor vehicles."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 43
fed_G432 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 3
pattern = "In addition to balance" + r"(.*?)" + "in the current environment."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 43
fed_G433 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 4
pattern = "Much like households," + r"(.*?)" + "the civilian labor force."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 43
fed_G434 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 5
pattern = "Although forward-looking" + r"(.*?)" + "remains a concern."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 43
fed_G435 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 6
pattern = "Inflation developments have" + r"(.*?)" + "satisfactory economic performance."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 43
fed_G436 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article43_text = fed_G431 + fed_G432 + fed_G433 + fed_G434 + fed_G435 + fed_G436
print(article43_text)

["I last reviewed the economic outlook before this Committee, full-scale military operations in Iraq had concluded, and there were signs that some of the impediments to brisker growth in economic activity in the months leading up to the conflict were beginning to lift.  Many, though by no means all, of the economic uncertainties stemming from the situation in Iraq had been resolved, and that reduction in uncertainty had left an imprint on a broad range of indicators.\nStock prices had risen, risk spreads on corporate bonds had narrowed, oil prices had dropped sharply, and measures of consumer sentiment appeared to be on the mend.  But, as I noted in April, hard data indicating that these favorable developments were quickening the pace of spending and production were not yet in evidence, and it was likely that the extent of the underlying vigor of the economy would become apparent only gradually. \nIn the months since, some of the residual war-related uncertainties have abated further a

In [279]:
# Article 44 [Greenspan, Feb 2002]

# Press Conference text extraction 1
pattern = "Since July, when I" + r"(.*?)" + "goods across our nation."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 44
fed_G441 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2
pattern = "Both deregulation and" + r"(.*?)" + "reason for encouragement."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 44
fed_G442 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article44_text = fed_G441 + fed_G442
print(article44_text)

["last reported to you on the\r\nconduct of monetary policy, the U.S. economy has\r\ngone through a period of considerable strain, with\r\noutput contracting for a time and unemployment rising. \r\nWe in the Federal Reserve System acted vigorously to\r\nadjust monetary policy in an endeavor both to limit the\r\nextent of the downturn and to hasten its completion. \r\nDespite the disruptions engendered by the terrorist\r\nattacks of September 11, the typical dynamics of the\r\nbusiness cycle have re-emerged and are prompting a\r\nfirming in economic activity. An array of influences\r\nunique to this business cycle, however, seems likely to\r\nmoderate the speed of the anticipated recovery.\nAt the time of our last report, the economy was\r\nweakening. Many firms were responding to the\r\nrealization that significant overcapacity had developed. \r\nThe demand for capital goods had dropped sharply,\r\nand inventories were uncomfortably high in many\r\nindustries. In response, businesses s

In [280]:
# Article 45 [Greenspan, Feb 2002] 

# Press Conference text extraction 1
pattern = "Over the four and" + r"(.*?)" + "achieve its full potential."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 45
fed_G451 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2
pattern = "A considerable volume" + r"(.*?)" + ", and trade policies." 
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 45
fed_G452 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article45_text = fed_G451 + fed_G452
print(article45_text)

['one-half months since I last testified before this Committee on monetary policy, the economy has continued to expand, largely along the broad contours we had anticipated at that time.  Although the uncertainties of earlier this year are as yet not fully resolved, the U.S. economy appears to have withstood a set of blows--major declines in equity markets, a sharp retrenchment in investment spending, and the tragic terrorist attacks of last September--that in previous business cycles almost surely would have induced a severe contraction.  The mildness and brevity of the downturn, as I indicated earlier this year, are a testament to the notable improvement in the resilience and flexibility of the U.S. economy.\r\n\r\nBut while the economy has held up remarkably well, not surprisingly the depressing effects of recent events linger.  Spending will continue to adjust for some time to the declines that have occurred in equity prices.  In recent weeks, those prices have fallen further on net

In [281]:
# Article 46 [Greenspan, Feb 2001]

# Press Conference text extraction 1
pattern = "The past decade" + r"(.*?)" + "adjustment of its policy."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 46
fed_G46 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article46_text = fed_G46
print(article46_text)



In [282]:
# Article 47 [Greenspan, Jul 2001]

# Press Conference text extraction
pattern = "Monetary policy this" + r"(.*?)" + "that benefits us all."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 47
fed_G47 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article47_text = fed_G47
print(article47_text)

["year has confronted an economy that slowed sharply late last year and has remained weak this year, following an extraordinary period of buoyant expansion.\r\n\r\n\tBy aggressively easing the stance of monetary policy, the Federal Reserve has moved to support demand and, we trust, help lay the groundwork for the economy to achieve maximum sustainable growth.  Our accelerated action reflected the pronounced downshift in economic activity, which was accentuated by the especially prompt and synchronous adjustment of production by businesses utilizing the faster flow of information coming from the adoption of new technologies.  A rapid and sizable easing was made possible by reasonably well-anchored inflation expectations, which helped to keep underlying inflation at a modest rate, and by the prospect that inflation would remain contained as resource utilization eased and energy prices backed down.  \r\n\r\n\tIn addition to the more accommodative stance of monetary policy, demand should b

In [283]:
# Article 48 [Greenspan, Feb 2000]

# Press Conference text extraction 1
pattern = "There is little evidence" + r"(.*?)" + "economic weakness."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 48
fed_G481 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2
pattern = "Underlying this performance," + r"(.*?)" + "economy in the process."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 48
fed_G482 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 3
pattern = "On a broader front," + r"(.*?)" + "later in this testimony."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 48
fed_G483 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 4
pattern = "On a broader front," + r"(.*?)" + "later in this testimony."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 48
fed_G484 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 5
pattern = "Although the outlook" + r"(.*?)" + "the evolving technologies."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 48
fed_G485 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 6
pattern = "Before closing, I" + r"(.*?)" + "for economic performance."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 48
fed_G486 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 7
pattern = "As the U.S. economy enters" + r"(.*?)" + "into the new millennium."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 48
fed_G487 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article48_text = fed_G481 + fed_G482 + fed_G483 + fed_G484 + fed_G485 + fed_G486 + fed_G487
print(article48_text)

['that the American economy, which grew more than 4 percent in 1999 and surged forward at an even faster pace in the second half of the year, is slowing appreciably.  At the same time, inflation has remained largely contained.  An increase in the overall rate of inflation in 1999 was mainly a result of higher energy prices.  Importantly, unit labor costs actually declined in the second half of the year.  Indeed, still-preliminary data indicate that total unit cost increases last year remained extraordinarily low, even as the business expansion approached a record nine years.  Domestic operating profit margins, after sagging for eighteen months, apparently turned up again in the fourth quarter, and profit expectations for major corporations for the first quarter have been undergoing upward revisions since the beginning of the year--scarcely an indication of imminent', "unprecedented in my half-century of observing the American economy, is a continuing acceleration in productivity.  Nonf

In [284]:
# Article 49 [Greenspan, Jul 2000]

# Press Conference text extraction 1 
pattern = "The Federal Reserve has been" + r"(.*?)" + "the average of recent years."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 49
fed_G491 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2 
pattern = "The last decade has been" + r"(.*?)" + "in pursuit of that goal."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 49
fed_G492 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article49_text = fed_G491 + fed_G492
print(article49_text)

["confronting a complex set of challenges in judging the stance of policy that will best contribute to sustaining the strong and long-running expansion of our economy.  The challenges will be no less in coming months as we judge whether ongoing adjustments in supply and demand will be sufficient to prevent distortions that would undermine the economy's extraordinary performance. \r\n\r\n\tFor some time now, the growth of aggregate demand has exceeded the expansion of production potential.  Technological innovations have boosted the growth rate of potential, but as I noted in my testimony last February, the effects of this process also have spurred aggregate demand.  It has been clear to us that, with labor markets already quite tight, a continuing disparity between the growth of demand and potential supply would produce disruptive imbalances.\r\n\r\n\tA key element in this disparity has been the very rapid growth of consumption resulting from the effects on spending of the remarkable r

In [285]:
# Article 50 [Greenspan, Feb 1999]

# Press Conference text extraction 1 
pattern = "The U.S. economy" + r"(.*?)" + "the economic expansion."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 50
fed_G501 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2 
pattern = "A hallmark of our" + r"(.*?)" + "been significantly reduced."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 50
fed_G502 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 3
pattern = "These recent domestic" + r"(.*?)" + "will begin to accelerate."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 50
fed_G503 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 4
pattern = "At its February meeting," + r"(.*?)" + "nominal GDP growth."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 50
fed_G504 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 5
pattern = "The FOMC at recent" + r"(.*?)" + "for monetary policy."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 50
fed_G505 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 6
pattern = "Before closing," + r"(.*?)" + "planning efforts."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 50
fed_G506 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 7
pattern = "Americans can justifiably" + r"(.*?)" + "progress over time."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 50
fed_G507 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article50_text = fed_G501 + fed_G502 + fed_G503 + fed_G504 + fed_G505 + fed_G506 + fed_G507
print(article50_text)

["over the past year again performed admirably.  Despite the challenges presented by severe economic downturns in a number of foreign countries and episodic financial turmoil abroad and at home, our real GDP grew about 4 percent for a third straight year.  In 1998, 2-3/4 million jobs were created on net, bringing the total increase in payrolls to more than 18 million during the current economic expansion, which late last year became the longest in U.S. peacetime history.  Unemployment edged down further to a 4-1/4 percent rate, the lowest since 1970.\r\n\r\n\tAnd despite taut labor markets, inflation also fell to its lowest rate in many decades by some broad measures, although a portion of this decline owed to decreases in oil, commodity, and other import prices that are unlikely to be repeated.  Hourly labor compensation adjusted for inflation posted further impressive gains.  Real compensation gains have been supported by robust advances in labor productivity, which in turn have part

In [286]:
# Article 51 [Greenspan, Jul 1999]

# Press Conference text extraction 1 
pattern = "To date," + r"(.*?)" + "made can be sustained."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 51
fed_G511 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2 
pattern = "A number of important" + r"(.*?)" + "enhanced incentives to spend."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 51
fed_G512 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 3
pattern = "Even if labor" + r"(.*?)" + "domestic saving rebounds."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 51
fed_G513 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 4
pattern = "Going forward," + r"(.*?)" + "is 2 to 2-1/2 percent."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 51
fed_G514 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 5
pattern = "In its deliberations this year" + r"(.*?)" + "transition to the next expansion."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 51
fed_G515 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 6
pattern = "I would be remiss" + r"(.*?)" + "formal enforcement actions."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 51
fed_G516 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 7
pattern = "As a result of our" + r"(.*?)" + "continuing its remarkable progress."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 51
fed_G517 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article51_text = fed_G511 + fed_G512 + fed_G513 + fed_G514 + fed_G515 + fed_G516 + fed_G517
print(article51_text)

["1999 has been an exceptional year for the American economy, but a challenging one for American monetary policy.  Through the first six months of this year, the U.S. economy has further extended its remarkable performance:  Almost 1-1/4 million jobs were added to payrolls on net, and gross domestic product apparently expanded at a brisk pace, perhaps near that of the prior three years.  \r\n\r\nAt the root of this impressive expansion of economic activity has been a marked acceleration in the productivity of our nation's workforce.  This productivity growth has allowed further healthy advances in real wages and has permitted activity to expand at a robust clip while helping to foster price stability.\r\n\r\nLast fall, the Federal Open Market Committee (FOMC) eased monetary policy to counter a seizing-up of financial markets that threatened to disrupt economic activity significantly.  As those markets recovered, the FOMC had to assess whether that policy stance remained appropriate.  B

In [287]:
# Article 52 [Greenspan, Feb 1998]

# Press Conference text extraction 1 
pattern = "The U.S. economy delivered" + r"(.*?)" + "outlook for 1998 shortly."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 52
fed_G521 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 2 
pattern = "History teaches us that" + r"(.*?)" + "impetus to domestic spending."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 52
fed_G522 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 3
pattern = "There can be no doubt" + r"(.*?)" + "the line on inflation."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 52
fed_G523 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 4
pattern = "The FOMC affirmed the" + r"(.*?)" + "determining its policy stance."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 52
fed_G524 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Press Conference text extraction 5
pattern = "With the current situation" + r"(.*?)" + "to those that occur."
topic = "monetarypolicy"
publication_type = "publications"
sub_class = "testimony"
num = 52
fed_G525 = fed_article_cleaning(pattern, topic, publication_type, sub_class, num)

# Agglomeration 
article52_text = fed_G521 + fed_G522 + fed_G523 + fed_G524 + fed_G525
print(article52_text)

['another exemplary\r\n          performance in 1997.  Over the four quarters of last year,\r\n          real GDP expanded close to 4 percent, its fastest annual\r\n          increase in ten years.  To produce that higher output,\r\n          about 3 million Americans joined the nation\'s payrolls, in\r\n          the process contributing to a reduction in the\r\n          unemployment rate to 4-3/4 percent, its lowest sustained\r\n          level since the late 1960s.  And our factories were working\r\n          more intensively too:  Industrial production increased 5-3/4 percent\r\n\t\t  last year, exceeding robust additions to\r\n          capacity.\r\n\r\n               Those gains were shared widely.  The hourly wage\r\n          and salary structure rose about 4 percent, fueling\r\n          impressive increases in personal incomes.  Unlike some\r\n          prior episodes when faster wage rate increases mainly\r\n          reflected attempts to make up for more rapidly rising\r\

#### 1.4 Data merging

In [288]:
import pandas as pd

In [289]:
# Text-based sources dataframe

FED_text_df = [
    article0_text, article1_text, article2_text, article3_text, article4_text, 
    article5_text, article6_text, article7_text, article8_text, article9_text, 
    article10_text, article11_text, article12_text, article13_text, article14_text, 
    article15_text, article16_text, article17_text, article18_text, article19_text, 
    article20_text, article21_text, article22_text, article23_text, article24_text, 
    article25_text, article26_text, article27_text, article28_text, article29_text, 
    article30_text, article31_text, article32_text, article33_text, article34_text, 
    article35_text, article36_text, article37_text, article38_text, article39_text, 
    article40_text, article41_text, article42_text, article43_text, article44_text, 
    article45_text, article46_text, article47_text, article48_text, article49_text, 
    article50_text, article51_text, article52_text
]

In [290]:
# Date-based dataframe
# We could not avoid to imports dates manually, since publication months were seldom differing from the average and hence, the function [f"{str(month).zfill(2)}/2001" for month in range(1, 54)] would have been imprecise

FED_date_df = [
    "03/2024", "03/2023", "06/2023", "02/2022", "06/2022", "02/2021", "07/2021", "02/2020", 
    "06/2020", "02/2019", "07/2019", "02/2018", "07/2018", "02/2017", "07/2017", "02/2016", 
    "06/2016", "02/2015", "07/2015", "02/2014", "07/2014", "02/2013", "07/2013", "02/2012", 
    "07/2012", "03/2011", "07/2011", "02/2010", "07/2010", "02/2009", "07/2009", "02/2008", 
    "07/2008", "02/2007", "07/2007", "02/2006", "07/2006", "02/2005", "07/2005", "02/2004", 
    "07/2004", "02/2003", "04/2003", "07/2003", "02/2002", "07/2002", "02/2001", "07/2001", 
    "02/2000", "07/2000", "02/1999", "07/1999", "07/1998"
]

In [291]:
# Chairman-based dataframe

FED_chairman_df = [
    "Powell", "Powell", "Powell", "Powell", "Powell", 
    "Powell", "Powell", "Powell", "Powell", "Powell", 
    "Powell", "Powell", "Powell", "Yellen", "Yellen", 
    "Yellen", "Yellen", "Yellen", "Yellen", "Yellen", 
    "Yellen", "Bernanke", "Bernanke", "Bernanke", "Bernanke", 
    "Bernanke", "Bernanke", "Bernanke", "Bernanke", "Bernanke", 
    "Bernanke", "Bernanke", "Bernanke", "Bernanke", "Bernanke", 
    "Bernanke", "Bernanke", "Greenspan", "Greenspan", "Greenspan", 
    "Greenspan", "Greenspan", "Greenspan", "Greenspan", "Greenspan", 
    "Greenspan", "Greenspan", "Greenspan", "Greenspan", "Greenspan", 
    "Greenspan", "Greenspan", "Greenspan"
]

In [292]:
# Final conglomerate data merging 
FED_data_df = {"Article Text": FED_text_df, 
        "Publication Date": FED_date_df, 
        "Chairman": FED_chairman_df}

# Convert to pandas dataframe
FED_df = pd.DataFrame(FED_data_df)

# Sort the DataFrame by publication Date
FED_df["Publication Date"] = pd.to_datetime(FED_df["Publication Date"], format="%m/%Y")
FED_df = FED_df.sort_values(by="Publication Date")

# Reset the index scale
FED_df = FED_df.reset_index(inplace=False, drop=True)

print(FED_df)

                                         Article Text Publication Date  \
0   [another exemplary\r\n          performance in...       1998-07-01   
1   [over the past year again performed admirably....       1999-02-01   
2   [1999 has been an exceptional year for the Ame...       1999-07-01   
3   [that the American economy, which grew more th...       2000-02-01   
4   [confronting a complex set of challenges in ju...       2000-07-01   
5   [has been extraordinary for the American econo...       2001-02-01   
6   [year has confronted an economy that slowed sh...       2001-07-01   
7   [last reported to you on the\r\nconduct of mon...       2002-02-01   
8   [one-half months since I last testified before...       2002-07-01   
9   [before this committee last July, I noted that...       2003-02-01   
10  [I noted that the economic expansion over the ...       2003-04-01   
11  [I last reviewed the economic outlook before t...       2003-07-01   
12  [I reported that conditions had be

### **2. PREPARE AND CLEAN TEXTUAL DATA**

This section cleans the data so that is ready for analysis. It involves doing X, Y, Z.

#### 2.1 Remove Paragraph Headings

#### 2.2 Textual Adjustments

#### 2.3 Convert Text to Lowercase

#### 2.4 Remove Punctuation

#### 2.5 Remove Stopwords

#### 2.6 Lemmatisation

### **3. PRELIMINARY ANALYSIS TEXTUAL DATA**

In [293]:
# Jessie - I have added this to help with structure. 
# These analysis need to be done BEFORE Elia's cleaning just using the raw textual data before any cleaning
# Update the markdown code below once you add all the textual analysis variables (the ones in the thesis file and any other you think of!)
# Describe the dictionary used for stopwords etc and each line of code - the more comments the better 


This section focuses on creating new variables to analyse the textual data from the Federal Reserve (FED). Specifically, it calculates:

**1. Word Count**: Total number of words per statement.

**2. Sentence Count**: Total number of sentences per statement.

**3. Average Words per Sentence**: Average number of words per sentence.

**X. Ratio of Complex Words**: Share of complex words (words with three or more syllables) to the total word count.

**X. Ratio of Stop Words**: Proportion of stop words (common words like "and", "the", "is", etc.) to the total word count.

---

After computing these variables, the data are summarised using descriptive statistics tables and visually. This exploratory analysis provides insights into the textual characteristics of the FED data before conducting regression analysis.


#### 3.1 Create Textual Variables 

##### 3.1.1 Number of Meeting Minutes

##### 3.1.2 Word Count for Each Statement

##### 3.1.3 Sentence Count for Each Statement

##### 3.1.X Stop Words in Each Statement

##### 3.1.X Other Variables....

#### 3.2 Descriptive Statstics 

In [294]:
# Summary table of data
df.describe().round(2)

NameError: name 'df' is not defined

#### 3.3 Visualisation of Textual Variables

In [None]:
## Create charts - maybe instead of single charts you could find a package which lets you view charts side-by-side? Could make the code look cleaner? 
# e.g. word count over time, stop word ratio over time etc.

#### 3.4 Word Frequency

In [None]:
# table with top 20 words, word cloud map

### **4. CREATE COMMUNICATION VARIABLES**

This section converts the qualtative text data to quantiative measure of readability for analysis. 

#### 4.1 Readability Measure

#### 4.2 Sentiment Measure

### **5. IMPORT MACROECONOMIC DATA**

This section imports the macroeconomic variables using an API .... 

In [None]:
# Install Required Packages
!pip install pandas_datareader
import pandas as pd
from pandas_datareader import fred



In [None]:
# Grab data using FredReader
data = fred.FredReader(symbols=['GDPC1', 
                                'FPCPITOTLZGUSA'], 
                       start='1900-01-01', 
                       end=None).read()

# Save the data to a CSV file
data.to_csv('macro_vars.csv')

# Read the data back from the CSV file to a DataFrame
df_macro_vars = pd.read_csv('macro_vars.csv')

# Check data has imported correctly
print(df_macro_vars.tail())

           DATE      GDPC1  FPCPITOTLZGUSA
304  2023-01-01  22112.329             NaN
305  2023-04-01  22225.350             NaN
306  2023-07-01  22490.692             NaN
307  2023-10-01  22679.255             NaN
308  2024-01-01  22768.866             NaN


In [None]:
# Assuming df_vars is your DataFrame
new_column_names = ['date', 'GDP', 'CPI']

# Rename the columns
df_macro_vars = df_macro_vars.rename(columns=dict(zip(df_macro_vars.columns, new_column_names)))
df_macro_vars.head(5)

Unnamed: 0,date,GDP,CPI
0,1947-01-01,2182.681,
1,1947-04-01,2176.892,
2,1947-07-01,2172.432,
3,1947-10-01,2206.452,
4,1948-01-01,2239.682,


### **6. DESCRIPTIVE STATSTICS**

In [None]:
This section does... 

#### 6.1 Summary Table

#### 6.2 Density charts 

#### 6.3 Correlation Analysis

#### 6.4 Visualisation of Variables

### **7. REGRESSION ANALYSIS**

Description of model equations... 

#### 7.1 Model Specification

#### 7.2 Robustness Tests