<a href="https://colab.research.google.com/github/quiet-econ-lab/Quantifying_Beige_Book/blob/main/Quantifying_Beige_Book.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

```Name
Ryotaro Tsuchiya  (UNI: rt302)
MPA Candidate '27 | International Finance and Economic Policy
Columbia University – School of International and Public Affairs (SIPA)
```

---
# Quantifying the Federal Reserve’s Beige Book Using Text Analysis
---



## 1. Introduction

The Federal Reserve’s Beige Book is a qualitative summary of business conditions based on reports from firms, industry contacts, and organizations across the twelve Federal Reserve districts. Released roughly two weeks before each FOMC meeting, it provides timely descriptions of hiring, prices, wages, and demand, which can reveal turning points before they show up in standard macroeconomic indicators.

Despite its importance, the Beige Book is entirely narrative. Policymakers must rely on subjective judgment to determine whether conditions are improving or weakening, which limits its usefulness for empirical analysis. This project examines whether the Beige Book’s tone can be quantified using sentence-level sentiment analysis, and whether such a measure tracks real economic activity.

To investigate this, I scraped the full Beige Book archive, applied sentiment analysis to each sentence, constructed a monthly sentiment index, and compared it with the Philadelphia Fed’s Coincident Economic Activity Index. I also used TF-IDF to identify themes that drive positive and negative sentiment over time.


## 2. Data Collection

### 2.1 Scraping the Beige Book
The [Beige Book archive](https://www.minneapolisfed.org/region-and-community/regional-economic-indicators/beige-book-archive) hosted by the Minneapolis Fed is not published as a structured dataset, so I built a scraper that checks each month from 1970 onward and downloads all available releases. When a release is detected, the scraper collects the publication date, the section name (the national summary or one of the twelve districts), and the full narrative text. The resulting dataset contains several thousand national and district-level narratives describing business conditions across more than five decades.

In [None]:
import requests
import pandas as pd
from datetime import datetime
import time

In [None]:
# Base URL for the Minneapolis Fed Beige Book archive
BASE = "https://www.minneapolisfed.org/beige-book-reports/"

# Page codes: national summary ("su") plus the 12 Federal Reserve districts
regions = ["su", "at", "bo", "ch", "cl", "da", "kc", "mi", "ny", "ph", "ri", "sf", "sl"]

headers = {"User-Agent": "Mozilla/5.0"}

start_year = 1970
current_year = datetime.now().year
url_records = []

# The Beige Book is released about eight times per year,
# but the release months are not fixed and differ across years.
# Because the archive has no index of all release dates,
# we check all 12 months and detect which ones actually contain a release.

for year in range(start_year, current_year + 1):
    for month in range(1, 13):

        # First test if the release exists by checking the national summary page (su)
        test_url = f"{BASE}{year}/{year}-{month:02d}-su"
        try:
            r = requests.get(test_url, headers=headers, timeout=10)
        except requests.RequestException:
            print(f"{year}-{month:02d}: ERROR")
            continue

        if r.status_code == 200:
            print(f"{year}-{month:02d}: FOUND")

            # If the national summary exists, add URLs for all districts
            for region in regions:
                url = f"{BASE}{year}/{year}-{month:02d}-{region}"
                url_records.append({
                    "year": year,
                    "month": month,
                    "region": region,
                    "url": url,
                })

        elif r.status_code == 404:
            print(f"{year}-{month:02d}: NOT FOUND")
        else:
            print(f"{year}-{month:02d}: STATUS {r.status_code}")

        # Sleep briefly to avoid overwhelming the server
        time.sleep(0.1)

In [None]:
# Convert list to DataFrame
df_urls = pd.DataFrame(url_records, columns=["year", "month", "region", "url"])
print("Total URLs collected:", len(df_urls))
print(df_urls.sample(5).to_string())

Total URLs collected: 6227
      year  month region                                                                url
931   1976      5     ny  https://www.minneapolisfed.org/beige-book-reports/1976/1976-05-ny
5184  2015      9     ri  https://www.minneapolisfed.org/beige-book-reports/2015/2015-09-ri
5016  2014      1     sf  https://www.minneapolisfed.org/beige-book-reports/2014/2014-01-sf
3273  1997      3     ri  https://www.minneapolisfed.org/beige-book-reports/1997/1997-03-ri
2746  1992      3     ch  https://www.minneapolisfed.org/beige-book-reports/1992/1992-03-ch


In [None]:
from bs4 import BeautifulSoup

In [None]:
### Helper that scrapes a single Beige Book page (applied to the URL list created above) ###

def scrape_beige_page(url):
    """
    Scrapes one Beige Book page and extracts:
        - year_month : e.g., "August 1970"
        - full_date  : e.g., "August 12, 1970" (exact publication date)
        - region_name: e.g., "National Summary", "Atlanta"
        - url        : source URL
        - content    : main economic narrative (merged paragraphs)

    Returns a dictionary if successful.
    Returns None if the page is missing or has unexpected structure.
    """

    # Attempt request
    try:
        r = requests.get(url, headers=headers, timeout=10)
    except requests.RequestException:
        return None

    # Skip pages that do not exist (404) or return unexpected status
    if r.status_code != 200:
        return None

    soup = BeautifulSoup(r.text, "html.parser")

    # ---------------------------
    # Extract title block <h1>
    # ---------------------------
    h1 = soup.find("h1", class_="i9-c-title-banner__title--title")
    if not h1:
        return None

    title_text = h1.get_text(strip=True)

    # Most pages follow the format:
    #   "<Region>: <Month Year>"
    # Example:
    #   "Boston: August 1970"
    if ":" in title_text:
        region_name, year_month = [x.strip() for x in title_text.split(":", 1)]
    else:
        region_name = title_text
        year_month = ""

    # ---------------------------
    # Extract main text block
    # ---------------------------
    div = soup.find("div", class_="i9-c-rich-text-area")
    if not div:
        return None

    # Extract full_date from <strong>
    strong_tag = div.find("strong")
    full_date = strong_tag.get_text(strip=True) if strong_tag else ""

    # Extract ALL paragraphs (including date paragraph)
    p_tags = div.find_all("p")
    paragraphs = [p.get_text(" ", strip=True) for p in p_tags]

    # Merge into content
    content = "\n\n".join(paragraphs)
    content = content.replace(full_date, "")

    return {
        "year_month": year_month,
        "full_date": full_date,
        "region_name": region_name,
        "url": url,
        "content": content,
    }

In [None]:
### -------------------------------------------------------------
### Loop over all URLs and scrape pages (with progress display)
### -------------------------------------------------------------

beige_records = []   # Will contain all text + metadata

for _, row in df_urls.iterrows():
    url    = row["url"]
    year   = row["year"]
    month  = row["month"]
    region = row["region"]

    # Progress display (important for debugging and long scraping runs)
    print(f"Scraping: {year}-{month:02d}  region={region}  URL={url}")

    page_info = scrape_beige_page(url)

    # Sleep briefly to avoid overwhelming the server, as in the URL check above
    time.sleep(0.1)

    # Skip pages with missing or malformed structure
    if page_info is None:
        print(f"  → skipped (no usable content): {year}-{month:02d}  {region}")
        continue

    beige_records.append(page_info)

In [None]:
df_beige = pd.DataFrame(beige_records)

pd.set_option('display.max_colwidth', None)
pd.set_option('display.width', 0)

df_beige.sample(1)


Unnamed: 0,year_month,full_date,region_name,url,content
6009,November 2023,"November 29, 2023",Kansas City,https://www.minneapolisfed.org/beige-book-reports/2023/2023-11-kc,"\n\nSummary of Economic Activity Economic activity in the Tenth District declined slightly in recent weeks. Consumers were increasingly likely to ""share a roof and share meals"" to manage household budget challenges. Demand for rental housing reportedly shifted away from single-bedroom units toward multi-bedroom housing where rent expenses could be shared with a roommate. Similarly, restaurateurs noted that revenues fell as more customers split dishes and eschewed expensive items. Manufacturing businesses reported little change in activity, though some contacts noted a decline in their expectations of demand over the medium term. Reports of planned capital expenditures were mixed depending on how directly businesses were supported by fiscal spending and municipal projects. Renewable energy activity in the Tenth District continued to expand at a moderate pace, driven by modest growth in wind generation and robust growth in solar installations. The outlook for renewable energy remained positive, but contacts noted skilled labor shortages and limitations on interregional electricity transmission as challenges. The agricultural economy and farm credit conditions in the District softened moderately.\n\nLabor Markets Labor conditions in the Tenth District remained mostly unchanged over the past month. Hiring activity in the service sector was mixed across segments. Transportation contacts reported robust employment growth while most hotel contacts reported contractions in employment. Most contacts expected to increase hiring or maintain the size their workforce over the next year, citing expected sales growth, overworked staff, and an ongoing need for workers with specific skills. 
Few businesses laid off workers, but many contacts reported reducing their workforce through natural attrition.\n\nTo build a skilled workforce, contacts noted raising wages for new hires, upskilling less-qualified workers, and making increased efforts to retain existing employees. Wages continued to grow at a moderate pace. Contacts highlighted raising wages as central to their retention of existing employees and attracting new hires over the past few years. However, some contacts noted an increased number of potential hires have refused the compensation packages offered, indicative of ongoing tightness in the labor market.\n\nPrices Prices grew at a moderate pace. While manufacturing contacts witnessed a moderation in price pressures, service firms are still facing higher prices due to tight labor market conditions. Most firms reported plans to raise prices in coming months. Contacts reported concerns about risks of higher commodity and energy prices. While higher interest rates are raising financing costs for some companies, most District firms reported a majority of their funding coming from cash financing, insulating many District firms from the higher rate environment.\n\nConsumer Spending Consumer spending declined slightly in recent weeks. Contacts suggested consumers were increasingly likely to ""share a roof and share meals"" to manage household budget challenges. Specifically, contacts in multifamily housing reported demand for single-bedroom units softened, shifting toward demand for multiple bedrooms as more renters sought to share rent expenses with roommates. Restaurant owners similarly reported that, while patronage was steady, revenues fell as more customers shared plates and avoided higher cost items. 
Leisure travelers accounted for a smaller share of hotel stays.\n\nCommunity Conditions Organizations serving low- and moderate-income (LMI) populations reported LMI households have largely spent down any savings and are increasingly turning toward credit cards to make ends meet. More households were skipping car payments, rationing medication, and moving in with other families to cut back on expenses. Organizations noted that while most industries have increased wages recently, the growth in earnings at LMI households was insufficient to offset recent and ongoing inflation. As a result, non-profits were experiencing substantially higher demand for assistance. They reported struggling to meet that demand due to decreasing donations.\n\nManufacturing and Other Business Activity Overall business activity declined slightly last month. Contacts in retail and tourism reported moderate declines in sales and revenues. Hoteliers reported occupancy levels remained steady but noted an increase in stays related to business travel. This shift in traveler type raised some concerns regarding future demand, as business travelers are reportedly more sensitive to price and business cycle fluctuations. Contacts in healthcare reported a somewhat lower outlook for use of services through the end of year. With greater enrollment in high-deductible health insurance plans in 2023, more households have yet to meet their deductible despite being late in the year and may forgo care requiring out-of-pocket payment. Manufacturing businesses reported little change in activity, though some contacts noted a decline in their expectations of demand over the medium term. Planned capital spending was mixed across segments with manufacturers reporting softening investment activity. 
Contacts noted the emergence of a firm-specific dichotomy whereby businesses that obtained government or defense contracts are fueling the majority of capital expenditure activity.\n\nReal Estate and Construction Several developers and construction managers reported raw materials costs stabilized recently. They also noted greater ability to push against escalating costs from subcontractors. Public sector funding for municipal projects sustained demand for building materials, somewhat supporting materials prices. Contacts indicated that subcontractors were becoming more available for work, with holes in their backlog schedules for the first time in several years. Though construction labor was somewhat more available, growth in labor costs remain elevated.\n\nCommunity and Regional Banking Loan demand remained tepid at banks across the District as lenders continued to focus on maintaining sound credit quality, while higher rates exerted pressure on customer demand for credit. Though standards across loan types remained unchanged, several contacts expected further deterioration in credit quality over the next six months, particularly in the consumer and commercial real estate segments of their portfolios. Bankers cited higher debt service costs and declining borrower cash flow as key risks facing their CRE books, particularly for loans maturing in the near term. Rising funding costs persisted as deposit balances continued to shift to higher-yielding accounts, with contacts reporting strength in time deposit products.\n\nEnergy Renewable energy activity in the Tenth District continued to grow at a moderate pace, driven by modest growth in wind generation and robust growth in solar installations. Expectations were for a continued moderate pace of growth going into next year, driven mostly by wind generation. While growth in renewable energy in the District is expected to be slightly behind the U.S., Kansas and New Mexico are slated to outpace the U.S. 
average in coming months. Contacts in the renewable energy sector highlighted acute skilled labor shortages and limitations on interregional electricity transmission as key challenges. While higher interest rates are adding to the renewable development costs, most of those higher costs are being passed onto consumers in the form of higher electricity rates. Contacts highlighted the significant boost to renewable development activity expected in the coming years from fiscal stimulus spending, equating that spending to ""throwing gasoline on an already raging fire.""\n\nAgriculture The agricultural economy and farm credit conditions in the District softened last month alongside a moderate decrease in agricultural commodity prices. Agricultural bankers reported borrower liquidity deteriorated slightly from strong levels, and loan repayment rates were slightly lower than a year ago. Farm income declined faster in areas with more intense drought and more corn and wheat production. Agricultural real estate values remained firm. Cattle prices remained strong, supporting credit conditions in other portions of the District. Contacts cited elevated production expenses and high financing costs as ongoing concerns.\n\nFor more information about District economic conditions visit: http://www.KansasCityFed.org/research/regional-research"


### 2.2 Economic Indicators from FRED

To compare Beige Book sentiment with real economic activity, I retrieved the [Philadelphia Fed’s Coincident Economic Activity Index](https://fred.stlouisfed.org/series/USPHCI#) from the FRED API and converted it into year-over-year growth rates so it could be compared with the monthly sentiment index built from the Beige Book releases. The index summarizes the national business cycle using four inputs: nonfarm payroll employment, average hours worked in manufacturing, the unemployment rate, and real wage and salary disbursements. This makes it a useful reference point for evaluating whether qualitative sentiment contains timely economic information.

In [None]:
import os
from getpass import getpass

# Enter your FRED API key.
# If you do not have one, you can request it at:
# https://fred.stlouisfed.org/docs/api/api_key.html
os.environ["FRED_API_KEY"] = getpass("Paste your FRED API key: ")

# Simple check to ensure the key was set correctly
assert os.environ.get("FRED_API_KEY"), "No API key detected. Please run this cell again and enter your key."

Paste your FRED API key: ··········


In [None]:
# ------------------------------------------------------------
# Retrieve USPHCI (Philadelphia Fed's Coincident Economic Activity Index)
# from the FRED API and compute year-over-year percentage change.
# ------------------------------------------------------------

# Load API key stored in environment variable
fred_api_key = os.environ.get("FRED_API_KEY")

# FRED series ID for the national coincident index
series_id = "USPHCI"

# Construct API URL
url = (
    "https://api.stlouisfed.org/fred/series/observations"
    f"?series_id={series_id}&api_key={fred_api_key}&file_type=json"
)

# Request data from FRED; fail early on network or HTTP errors
r = requests.get(url, timeout=30)
r.raise_for_status()

# Extract the list of observations (each contains a date and value)
obs = r.json()["observations"]

# Convert to DataFrame
df_fred = pd.DataFrame(obs)

# Convert date to datetime and value to numeric
df_fred["date"] = pd.to_datetime(df_fred["date"])
df_fred["value"] = pd.to_numeric(df_fred["value"], errors="coerce")

# Keep only valid rows and sort chronologically
df_fred = df_fred[["date", "value"]].dropna()
df_fred = df_fred.sort_values("date").reset_index(drop=True)

# Compute year-over-year percentage change (relative to the value 12 months earlier)
df_fred["pct_change"] = df_fred["value"].pct_change(12) * 100

# Display first two years of data
df_fred.head(24)

Unnamed: 0,date,value,pct_change
0,1979-01-01,44.91,
1,1979-02-01,45.05,
2,1979-03-01,45.3,
3,1979-04-01,45.36,
4,1979-05-01,45.59,
5,1979-06-01,45.72,
6,1979-07-01,45.84,
7,1979-08-01,45.89,
8,1979-09-01,45.98,
9,1979-10-01,46.07,
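
The alignment between the monthly index and the Beige Book release dates can be sketched as follows. This is only an illustration on toy data, not the notebook's actual merge step: the frames `df_index`, `df_releases`, and `df_aligned` and all numbers are hypothetical, and the sketch assumes that `full_date` always follows the "Month D, YYYY" pattern seen in the samples above.

```python
import pandas as pd

# Toy monthly index (hypothetical values, not real USPHCI data)
df_index = pd.DataFrame({
    "date": pd.date_range("2000-01-01", periods=14, freq="MS"),
    "value": [100.0 + i for i in range(14)],
})
# Year-over-year growth: compare each month with the value 12 rows earlier
df_index["pct_change"] = df_index["value"].pct_change(12) * 100

# Toy release dates in the same text format as the scraped full_date column
df_releases = pd.DataFrame({"full_date": ["January 17, 2001", "February 28, 2001"]})
release_dt = pd.to_datetime(df_releases["full_date"], format="%B %d, %Y")

# Map each release to the first day of its month, matching FRED's date convention
df_releases["date"] = release_dt.dt.to_period("M").dt.to_timestamp()

# Left merge keeps every release, even if the index has no value for that month
df_aligned = df_releases.merge(df_index[["date", "pct_change"]], on="date", how="left")
print(df_aligned)
```

In this toy example the January 2001 release is matched with a growth rate of 12.0 percent, because the index rose from 100 to 112 over the preceding twelve months. A left merge is used so that releases from before the index begins are kept with a missing value rather than dropped silently.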


## 3. Data Cleaning and Preparation
The Beige Book narratives were cleaned by standardizing whitespace and removing formatting inconsistencies. Each cleaned document was then split into individual sentences. This step was essential because many paragraphs include both positive and negative assessments, and sentence-level granularity lets each assessment be scored on its own rather than averaged across a mixed passage. These sentences form the core dataset for the sentiment analysis.
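
As a rough illustration of the sentence-splitting step, the sketch below uses a simple regular expression. It is an assumption for exposition, not necessarily the splitter used in this project: the function `split_sentences` and the sample text are hypothetical, and the rule will mis-split after abbreviations such as "U.S." when a capitalized word follows.

```python
import re

def normalize_spaces(text):
    """Collapse every run of whitespace into a single space."""
    return re.sub(r"\s+", " ", text).strip()

def split_sentences(text):
    """Naive splitter: break after ., ! or ? when followed by
    whitespace and then a capital letter or an opening quote."""
    text = normalize_spaces(text)
    if not text:
        return []
    return re.split(r'(?<=[.!?])\s+(?=[A-Z"])', text)

# Hypothetical sample with mixed tone inside one passage
sample = "Hiring picked up in manufacturing.\n\nHowever, retail sales fell. Prices rose 2.5 percent."
for sentence in split_sentences(sample):
    print(sentence)
```

The sample yields three sentences, and the decimal point in "2.5" is not treated as a boundary because no whitespace follows it. A dedicated tokenizer such as NLTK's `sent_tokenize` handles abbreviations more robustly and may be preferable for the full archive.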

To prepare text for identifying economic themes, a second cleaned version of each sentence was created specifically for TF-IDF. Based on the cleaning script in [Krisel (2023)](https://github.com/rskrisel/tfidf_topic_modeling/blob/main/Intro_Text_Analysis_TFIDF_LDA_Inaugurals.ipynb), this process included lowercasing, tokenization, and removal of English stopwords, years, month names, and common Beige Book terms such as “contacts” or “reported.” The purpose was to retain economically meaningful vocabulary while filtering out boilerplate phrasing that appears in nearly every report. The resulting tokenized sentences were then ready for TF-IDF modeling.
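
The TF-IDF cleaning described above might look roughly like the sketch below. The word lists are deliberately tiny and purely illustrative: the notebook's real stopword list and Beige Book term list are not reproduced here, and the helper name `clean_for_tfidf` is hypothetical.

```python
import re

# Illustrative lists only; a real run would use a full English stopword list
STOPWORDS = {"the", "a", "an", "in", "of", "and", "to", "that", "were", "was"}
MONTHS = {"january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"}
BEIGE_TERMS = {"contacts", "reported"}  # boilerplate terms named in the text
YEAR = re.compile(r"(19|20)\d{2}")      # four-digit years such as 1978 or 2023

def clean_for_tfidf(sentence):
    """Lowercase, tokenize, and drop stopwords, month names, years,
    and common Beige Book boilerplate terms."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    drop = STOPWORDS | MONTHS | BEIGE_TERMS
    return [t for t in tokens if t not in drop and not YEAR.fullmatch(t)]

sample = "Contacts reported that retail sales in March 2023 were flat."
print(clean_for_tfidf(sample))  # ['retail', 'sales', 'flat']
```

One side effect of this simple approach is that removing month names also removes the modal verb "may", which is usually acceptable for theme extraction but worth keeping in mind.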

In [None]:
df_beige.sample(1)

Unnamed: 0,year_month,full_date,region_name,url,content
5943,April 2023,"April 19, 2023",Dallas,https://www.minneapolisfed.org/beige-book-reports/2023/2023-04-da,"\n\nSummary of Economic Activity The Eleventh District economy continued to expand modestly. Manufacturing output rose slightly following a mild contraction in the previous period. Growth in the service sector continued at a modest pace, and retail sales and energy activity were flat. Loan demand weakened further, loan volumes fell, and credit conditions tightened. Agricultural conditions remained strained by drought in some areas. Home sales rose. Local nonprofits cited higher demand for assistance. Overall payrolls rose modestly, though hiring slowed sharply in the service sector. Wage growth remained elevated, while price pressures eased notably. Outlooks worsened, and uncertainty surged, partly due to heightened apprehension about the recent banking sector issues and high interest rates, and their spillover effects on the broader economy.\n\nLabor Markets Employment increased modestly during the reporting period. The pace of hiring picked up in manufacturing but slowed in energy and nearly stalled out in services. Difficulty hiring workers remained a top concern for many firms, though a few reported some improvement. Airlines cited capacity constraints due to pilot shortages, and a workforce development contact said some employers were taking a closer look at non-traditional talent pipelines to fill positions. In contrast, staffing firms noted clients were taking longer to make hiring decisions in part due to the increased economic uncertainty, and there were scattered reports of layoffs in construction-related manufacturing and upstream energy.\n\nWage pressures remained elevated, though they have stabilized or moderated in some industries. 
A food manufacturer noted having issues finding workers despite offering a starting salary that was more than twice the minimum wage, while construction contacts noted some easing in pricing for certain trades.\n\nPrices While input costs continued to rise, the pace of increases moderated in energy, construction, and manufacturing. Freight costs dipped. Some manufacturers noted continued price pressures from supply chain constraints, and a few firms said higher borrowing costs were slowing down expansion plans. Selling price pressures decelerated broadly, bringing price growth close to or below its historical average in manufacturing and services. Homebuilders continued to use incentives and discounts to close sales. Airlines said ticket prices remained elevated, while energy firms reported declining rental rates for drilling rigs and said they expect cost inflation to continue slowing. More than a third of firms responding to a March Dallas Fed survey of nearly 400 Texas business executives cited inflation as a primary outlook concern over the next six months.\n\nManufacturing Texas factory output expanded slightly in March after declining in February. New orders for manufactured goods continued to contract, however. Weakness in demand was most pronounced in primary metals and plastics, though construction-related and computer manufacturers cited declines in new orders as well. In contrast, demand for fabricated metals and machinery rose, and chemical and refinery utilization rates increased. Overall, outlooks weakened, with just under two-thirds of contacts noting waning demand and/or recession as a key concern. Other headwinds cited were elevated input costs, labor shortages, and higher labor costs.\n\nRetail Sales Overall retail sales held steady in March. Auto sales rose strongly, though one contact noted a pullback in demand due to high interest rates. Clothing and health and personal care retailers cited higher sales. 
In contrast, electronics and appliance store sales dipped, which some contacts attributed to slow activity in the housing market. Nonstore retailers reported sluggish activity in part due to more people traveling this spring break.\n\nNonfinancial Services Modest expansion continued in the service sector. Revenue growth was the strongest in leisure and hospitality, and activity in professional and business services, education, and transportation services rose as well. Small parcel and air cargo shipments were flat to down, while sea cargo volumes remained robust and were up notably compared with year-ago levels. One contact noted that the recent train derailments had increased supply chain delays. Airlines saw continuing solid demand for leisure travel and some contacts expect business travel revenues to reach pre-pandemic levels this spring. Demand for staffing services was mixed, with firms making white-collar placements seeing continued strong activity while those filling blue-collar positions citing weakness. Health care and real estate rental and leasing firms noted declining revenues on net.\n\nConstruction and Real Estate Single-family housing demand improved further during the reporting period partly due to lower mortgage rates. However, the level of activity remained well below year ago levels. Most contacts reported a solid spring market, with sales, particularly in popular locations at or above plan. Buyer traffic held up, and contract cancellations dipped. Housing starts remained subdued. Outlooks improved but uncertainty remained elevated particularly considering the recent banking challenges. Apartment leasing picked up slightly. Rents were flat and occupancy continued to dip as supply outpaced demand.\n\nDemand for office space was lackluster, and heightened levels of sublease space remained an impediment to market recovery. Activity in the industrial market stayed solid, but vacancy edged up due to the arrival of new properties. 
The higher cost of capital, tighter lending standards, and financial uncertainty has made it challenging to price deals, diminishing investment sales activity. Some contacts voiced concern regarding the renewal of commercial real estate loans, particularly those secured by office properties.\n\nFinancial Services Loan demand continued to decline in March as bankers reported worsening business activity. Loan volumes fell, driven largely by a sharp contraction in consumer loans. Loan performance worsened slightly overall. Credit standards and terms tightened sharply, and marked increases in loan pricing were noted. Banking outlooks continued to deteriorate, with contacts expecting a contraction in loan demand and business activity and an increase in nonperforming loans over the next six months. Increased uncertainty and lack of confidence resulting from the recent banking issues were cited as concerns.\n\nEnergy Energy activity was essentially flat over the past six weeks. The rig count was unchanged as activity shifted between and within basins in part due to lower natural gas prices. Oil and natural gas production increased in the first quarter, and expectations are for drilling and completion activity to rise moderately through the year. Outlooks worsened, however, partly due to uncertainty about the economy.\n\nAgriculture Drought conditions persisted in the western part of the district while soil conditions were quite favorable elsewhere. The La Niña weather pattern has ended, and rainfall is expected to increase moving into summer and fall. Cotton acres are expected to be down significantly this year, with farmers favoring crops with a relatively higher price and drought tolerance. 
On the livestock side, cattle prices increased dramatically over the past six weeks and were up from this time last year, and demand was solid.\n\nCommunity Perspectives Nonprofits saw increased demand for their services, with one contact citing higher activity compared with pre-pandemic levels. Utilization of housing assistance or temporary shelters increased notably, and some nonprofits said that housing assistance was the fastest growing need among their clients. Contacts cited growing financial difficulties for low- to moderate-income families in part due to the recent reduction in SNAP benefits. One nonprofit noted that more middle-class families were seeking financial help as their wages had not kept pace with rising living costs. High or rising operating costs remained a challenge for many nonprofits, and some were concerned that with many companies downsizing, they would not meet their fundraising goals.\n\nFor more information about District economic conditions visit: https://www.dallasfed.org/research/texas"


In [None]:
# -------------------------------------------------------------
# Clean Beige Book text (normalize whitespace)
# -------------------------------------------------------------
import re

In [None]:
# Clean whitespace in content (remove newlines/tabs/double spaces)
def normalize_spaces(text):
    return re.sub(r"\s+", " ", text).strip() if isinstance(text, str) else ""

df_beige["content_clean"] = df_beige["content"].apply(normalize_spaces)

# Clean year_month: keep only the "Month YYYY" part (e.g., "August 1970")
df_beige["year_month"] = df_beige["year_month"].str.extract(r"([A-Za-z]+ \d{4})", expand=False)

df_beige[["year_month", "full_date", "region_name", "content_clean"]].sample(1)

Unnamed: 0,year_month,full_date,region_name,content_clean
1255,June 1978,"June 14, 1978",Richmond,"Fifth District manufacturing activity continued to expand in May as respondents to our monthly survey report further increases in shipments, new orders, and backlogs. Survey responses also included favorable reports on inventories, where accumulation slowed, employment, and weekly hours worked. Despite recent gains in activity, there is no indication of imminent bottlenecks. Comments by the Richmond Directors, however, were less encouraging than our survey results. The Directors view consumer attitudes as essentially neutral and a majority cited inflation as a negative or potentially negative factor. Bank credit activity in the District continues strong, with lending concentrated in the business and consumer sectors. Most banks and S & Ls in the District are offering the new six-month savings certificates and most are paying the legal maximum rate. Customer response, generally, has been less than expected. The gains in manufacturing activity over the past several months have served to bring inventories and plant and equipment capacity nearly into line with desired levels. There remains a number of individual respondents who view current stocks as excessive, but nearly two-thirds consider current levels about right or too low. Nearly all manufacturing respondents view current expansion plans as about right. A majority of our Directors feel that business investment opportunities have changed little over the past six to twelve months. Despite substantial recent improvements in orders and shipments, manufacturing respondents express considerably less optimism about the near-term future than has been the case in other recent surveys. The consensus now appears to be that business activity nationally and locally will be little changed over the next six months. Their expectations, as well as those of our Directors, appear to have been significantly affected by recent price developments. 
Our survey indicates widespread increases in both prices paid and prices received. Directors are almost unanimous in citing inflation as a negative factor in the outlook and note, in particular, its adverse impact on consumer attitudes. An overwhelming majority feel that recent mortgage rate developments have dimmed the outlook for residential construction in the District. The few retailers reporting in our latest survey reflect a somewhat more positive view of near-term prospects than those expressed by manufacturers and Directors. The outlook for automobiles appears especially to have firmed. Reports from banks and thrift institutions suggest that the recent sharp increases in mortgage rates are beginning to have a depressing effect on loan demand, although this effect is still fairly minor. The typical rate for an 80 percent, 30-year mortgage loan in the District is around 9 3/4 - 10 percent. Business lending has been very strong in recent weeks, with loans to durable goods manufacturers, public utilities, and retail trade accounting for the largest portion of new volume. Two-thirds of the respondents to our May survey of bank lending practices report moderately stronger commercial and industrial loan demand over the past quarter, and half expect demand to strengthen further. Rates charged to customers have risen, and there is some evidence of firming in non-price terms of lending, too. Most banks and S&Ls in the District are offering the new six- month savings certificates, and most are paying the legal maximum rate. An exception seems to be South Carolina, where the major banks have not entered the market for these instruments. The thrifts have been more aggressive than the banks in promoting the new certificates, with bank advertising generally taking the form of announcement or notification advertising. In general customer response to the certificates has been less than expected at both banks and S&Ls. 
The expectation at most financial institutions is that the six-month certificates will help prevent the loss of existing deposits as market interest rates rise, but that no significant increase in new money will result. Initial experiences confirm this expectation, as most institutions report that the largest proportion, 75 to 90 percent, of funds being placed in six-month certificates is being transferred from existing accounts. Again, however, South Carolina is an exception. The thrifts in that state have had a good public response to the new certificates and a substantial amount of new money is being generated. There is a feeling at some institutions that funds currently held in long-term deposits will be transferred to the six-month certificates as outstanding savings certificates mature. Most field crops have gotten off to a late start because of the cool, wet spring. Delays in planting have ranged from one to three weeks. Much replanting has been necessary. Some cotton has been plowed up because of poor stands. And shortages of flue-cured tobacco plants have developed in some areas. Peaches are said to be in good condition and sizing well. Fifth District manufacturing activity continued to expand in May as respondents to our monthly survey report further increases in shipments, new orders, and backlogs. Survey responses also included favorable reports on inventories, where accumulation slowed, employment, and weekly hours worked. Despite recent gains in activity, there is no indication of imminent bottlenecks. Comments by the Richmond Directors, however, were less encouraging than our survey results. The Directors view consumer attitudes as essentially neutral and a majority cited inflation as a negative or potentially negative factor. Bank credit activity in the District continues strong, with lending concentrated in the business and consumer sectors. 
Most banks and S & Ls in the District are offering the new six-month savings certificates and most are paying the legal maximum rate. Customer response, generally, has been less than expected. The gains in manufacturing activity over the past several months have served to bring inventories and plant and equipment capacity nearly into line with desired levels. There remains a number of individual respondents who view current stocks as excessive, but nearly two-thirds consider current levels about right or too low. Nearly all manufacturing respondents view current expansion plans as about right. A majority of our Directors feel that business investment opportunities have changed little over the past six to twelve months. Despite substantial recent improvements in orders and shipments, manufacturing respondents express considerably less optimism about the near-term future than has been the case in other recent surveys. The consensus now appears to be that business activity nationally and locally will be little changed over the next six months. Their expectations, as well as those of our Directors, appear to have been significantly affected by recent price developments. Our survey indicates widespread increases in both prices paid and prices received. Directors are almost unanimous in citing inflation as a negative factor in the outlook and note, in particular, its adverse impact on consumer attitudes. An overwhelming majority feel that recent mortgage rate developments have dimmed the outlook for residential construction in the District. The few retailers reporting in our latest survey reflect a somewhat more positive view of near-term prospects than those expressed by manufacturers and Directors. The outlook for automobiles appears especially to have firmed. Reports from banks and thrift institutions suggest that the recent sharp increases in mortgage rates are beginning to have a depressing effect on loan demand, although this effect is still fairly minor. 
The typical rate for an 80 percent, 30-year mortgage loan in the District is around 9 3/4 - 10 percent. Business lending has been very strong in recent weeks, with loans to durable goods manufacturers, public utilities, and retail trade accounting for the largest portion of new volume. Two-thirds of the respondents to our May survey of bank lending practices report moderately stronger commercial and industrial loan demand over the past quarter, and half expect demand to strengthen further. Rates charged to customers have risen, and there is some evidence of firming in non-price terms of lending, too. Most banks and S&Ls in the District are offering the new six- month savings certificates, and most are paying the legal maximum rate. An exception seems to be South Carolina, where the major banks have not entered the market for these instruments. The thrifts have been more aggressive than the banks in promoting the new certificates, with bank advertising generally taking the form of announcement or notification advertising. In general customer response to the certificates has been less than expected at both banks and S&Ls. The expectation at most financial institutions is that the six-month certificates will help prevent the loss of existing deposits as market interest rates rise, but that no significant increase in new money will result. Initial experiences confirm this expectation, as most institutions report that the largest proportion, 75 to 90 percent, of funds being placed in six-month certificates is being transferred from existing accounts. Again, however, South Carolina is an exception. The thrifts in that state have had a good public response to the new certificates and a substantial amount of new money is being generated. There is a feeling at some institutions that funds currently held in long-term deposits will be transferred to the six-month certificates as outstanding savings certificates mature. 
Most field crops have gotten off to a late start because of the cool, wet spring. Delays in planting have ranged from one to three weeks. Much replanting has been necessary. Some cotton has been plowed up because of poor stands. And shortages of flue-cured tobacco plants have developed in some areas. Peaches are said to be in good condition and sizing well. Fifth District manufacturing activity continued to expand in May as respondents to our monthly survey report further increases in shipments, new orders, and backlogs. Survey responses also included favorable reports on inventories, where accumulation slowed, employment, and weekly hours worked. Despite recent gains in activity, there is no indication of imminent bottlenecks. Comments by the Richmond Directors, however, were less encouraging than our survey results. The Directors view consumer attitudes as essentially neutral and a majority cited inflation as a negative or potentially negative factor. Bank credit activity in the District continues strong, with lending concentrated in the business and consumer sectors. Most banks and S & Ls in the District are offering the new six-month savings certificates and most are paying the legal maximum rate. Customer response, generally, has been less than expected. The gains in manufacturing activity over the past several months have served to bring inventories and plant and equipment capacity nearly into line with desired levels. There remains a number of individual respondents who view current stocks as excessive, but nearly two-thirds consider current levels about right or too low. Nearly all manufacturing respondents view current expansion plans as about right. A majority of our Directors feel that business investment opportunities have changed little over the past six to twelve months. 
Despite substantial recent improvements in orders and shipments, manufacturing respondents express considerably less optimism about the near-term future than has been the case in other recent surveys. The consensus now appears to be that business activity nationally and locally will be little changed over the next six months. Their expectations, as well as those of our Directors, appear to have been significantly affected by recent price developments. Our survey indicates widespread increases in both prices paid and prices received. Directors are almost unanimous in citing inflation as a negative factor in the outlook and note, in particular, its adverse impact on consumer attitudes. An overwhelming majority feel that recent mortgage rate developments have dimmed the outlook for residential construction in the District. The few retailers reporting in our latest survey reflect a somewhat more positive view of near-term prospects than those expressed by manufacturers and Directors. The outlook for automobiles appears especially to have firmed. Reports from banks and thrift institutions suggest that the recent sharp increases in mortgage rates are beginning to have a depressing effect on loan demand, although this effect is still fairly minor. The typical rate for an 80 percent, 30-year mortgage loan in the District is around 9 3/4 - 10 percent. Business lending has been very strong in recent weeks, with loans to durable goods manufacturers, public utilities, and retail trade accounting for the largest portion of new volume. Two-thirds of the respondents to our May survey of bank lending practices report moderately stronger commercial and industrial loan demand over the past quarter, and half expect demand to strengthen further. Rates charged to customers have risen, and there is some evidence of firming in non-price terms of lending, too. 
Most banks and S&Ls in the District are offering the new six- month savings certificates, and most are paying the legal maximum rate. An exception seems to be South Carolina, where the major banks have not entered the market for these instruments. The thrifts have been more aggressive than the banks in promoting the new certificates, with bank advertising generally taking the form of announcement or notification advertising. In general customer response to the certificates has been less than expected at both banks and S&Ls. The expectation at most financial institutions is that the six-month certificates will help prevent the loss of existing deposits as market interest rates rise, but that no significant increase in new money will result. Initial experiences confirm this expectation, as most institutions report that the largest proportion, 75 to 90 percent, of funds being placed in six-month certificates is being transferred from existing accounts. Again, however, South Carolina is an exception. The thrifts in that state have had a good public response to the new certificates and a substantial amount of new money is being generated. There is a feeling at some institutions that funds currently held in long-term deposits will be transferred to the six-month certificates as outstanding savings certificates mature. Most field crops have gotten off to a late start because of the cool, wet spring. Delays in planting have ranged from one to three weeks. Much replanting has been necessary. Some cotton has been plowed up because of poor stands. And shortages of flue-cured tobacco plants have developed in some areas. Peaches are said to be in good condition and sizing well."


In [None]:
# -------------------------------------------------------------
# Split cleaned text into individual sentences
# -------------------------------------------------------------
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords

nltk.download("punkt")
nltk.download('punkt_tab')
nltk.download("stopwords")

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


True

In [None]:
# Split each cleaned Beige Book into sentences and store metadata + clean sentence
sent_rows = []

for row in df_beige.itertuples():
    for sent in nltk.sent_tokenize(row.content_clean):
        sent_rows.append({
            "year_month": row.year_month,
            "full_date": row.full_date,
            "region_name": row.region_name,
            "sentence_clean_sent": normalize_spaces(sent),
        })

df_sent = pd.DataFrame(sent_rows)

In [None]:
# -------------------------------------------------------------
# Create TF-IDF clean sentences + token lists
# -------------------------------------------------------------

# Prepare stopwords for TF-IDF
EN_STOP = set(stopwords.words("english"))

# Custom stopwords specific to Beige Book text
CUSTOM_STOP = {
    "contacts", "reported", "noted", "district", "year", "percent"
}

# Add years and month names to custom stopwords
years = {str(y) for y in range(start_year, current_year + 1)}
months = {
    "january","february","march","april","may","june",
    "july","august","september","october","november","december",
    "jan","feb","mar","apr","jun","jul","aug","sep","sept","oct","nov","dec"
}

CUSTOM_STOP = CUSTOM_STOP.union(years).union(months)
STOPWORDS = EN_STOP.union(CUSTOM_STOP)

# Clean tokens for TF-IDF (lowercase, alphabetic only, ≥3 letters)
def simple_clean_tokens(text):
    text = text.lower()
    tokens = word_tokenize(text)
    return [t for t in tokens if t.isalpha() and len(t) >= 3 and t not in STOPWORDS]

df_sent["tokens_clean_tfid"] = df_sent["sentence_clean_sent"].apply(simple_clean_tokens)
df_sent["sentence_clean_tfidf"] = df_sent["tokens_clean_tfid"].apply(lambda t: " ".join(t))

df_sent[
    ["year_month", "region_name", "sentence_clean_sent", "sentence_clean_tfidf", "tokens_clean_tfid"]
].head()

Unnamed: 0,year_month,region_name,sentence_clean_sent,sentence_clean_tfidf,tokens_clean_tfid
0,May 1970,National Summary,"This initial report of economic conditions in the 12 Federal Reserve Districts is based on information gathered from directors of the Reserve Banks, conversations with local bankers, businessmen and economists, regular monthly surveys of manufacturing and trade industries conducted by some of the Reserve Banks, and selected statistical measures of regional economic activity.",initial report economic conditions federal reserve districts based information gathered directors reserve banks conversations local bankers businessmen economists regular monthly surveys manufacturing trade industries conducted reserve banks selected statistical measures regional economic activity,"[initial, report, economic, conditions, federal, reserve, districts, based, information, gathered, directors, reserve, banks, conversations, local, bankers, businessmen, economists, regular, monthly, surveys, manufacturing, trade, industries, conducted, reserve, banks, selected, statistical, measures, regional, economic, activity]"
1,May 1970,National Summary,Reports from the Reserve Banks clearly indicate that the current overriding domestic concern is inflation.,reports reserve banks clearly indicate current overriding domestic concern inflation,"[reports, reserve, banks, clearly, indicate, current, overriding, domestic, concern, inflation]"
2,May 1970,National Summary,Businessmen contacted generally expect that prices will continue to increase at a rapid rate during the remainder of the year.,businessmen contacted generally expect prices continue increase rapid rate remainder,"[businessmen, contacted, generally, expect, prices, continue, increase, rapid, rate, remainder]"
3,May 1970,National Summary,There appears to be considerable skepticism regarding the ability of economic stabilization policies to achieve a significant reduction in the rate of inflation without generating an intolerable level of unemployment or a full-scale recession.,appears considerable skepticism regarding ability economic stabilization policies achieve significant reduction rate inflation without generating intolerable level unemployment recession,"[appears, considerable, skepticism, regarding, ability, economic, stabilization, policies, achieve, significant, reduction, rate, inflation, without, generating, intolerable, level, unemployment, recession]"
4,May 1970,National Summary,"Similarly, there is evidence of extensive concern about the persistence of strong upward wage pressures, despite some easing in labor markets.",similarly evidence extensive concern persistence strong upward wage pressures despite easing labor markets,"[similarly, evidence, extensive, concern, persistence, strong, upward, wage, pressures, despite, easing, labor, markets]"


## 4. Methods

### 4.1 Sentiment Classification
The sentiment analysis uses VADER, a lexicon- and rule-based model originally developed for short social media text. I computed the VADER compound score for every sentence; the score ranges from −1 (most negative) to +1 (most positive). Following the thresholds commonly used with VADER, I classified sentences as positive when the compound score was 0.05 or higher, negative when it was −0.05 or lower, and neutral otherwise.

After labeling all sentences, I aggregated those belonging to the same monthly Beige Book release to construct a national sentiment measure. The Beige Book Sentiment Index for month *t* was defined as:

$$
\text{Index}_t = \frac{\#\,\text{positive}_t - \#\,\text{negative}_t}{\#\,\text{positive}_t + \#\,\text{negative}_t}
$$

This formulation captures the balance of optimistic and pessimistic assessments each month, producing an index that increases when positive evaluations become more frequent and decreases when negative assessments dominate.
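
As a quick check of the formula, the sketch below recomputes the index from positive and negative sentence counts. The helper name `sentiment_index` is hypothetical and not part of the notebook; the counts are the April 1971 and April 1972 rows from the `df_index` preview shown later in this notebook. Neutral sentences do not enter the calculation, and the index is bounded between −1 and +1.

```python
import math

def sentiment_index(n_pos: int, n_neg: int) -> float:
    """(#positive - #negative) / (#positive + #negative).

    Returns NaN when a release has no positive or negative sentences,
    since the ratio is undefined in that case (an assumption of this sketch).
    """
    total = n_pos + n_neg
    if total == 0:
        return math.nan
    return (n_pos - n_neg) / total

# Counts taken from the df_index preview (April 1971 and April 1972)
april_1971 = sentiment_index(154, 120)
april_1972 = sentiment_index(220, 70)

print(round(april_1971, 6), round(april_1972, 6))  # 0.124088 0.517241
```

Both values match the `vader_national` column for those releases.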

In [None]:
# -------------------------------------------------------------
# Compute Beige Book Sentiment Index using VADER
# -------------------------------------------------------------
from nltk.sentiment import SentimentIntensityAnalyzer
nltk.download('vader_lexicon')

[nltk_data] Downloading package vader_lexicon to /root/nltk_data...


True

In [None]:
sia = SentimentIntensityAnalyzer()

df_sent["vader_compound"] = df_sent["sentence_clean_sent"].apply(
    lambda x: sia.polarity_scores(x)["compound"]
)

def vader_label(score):
    if score >= 0.05:
        return "pos"
    elif score <= -0.05:
        return "neg"
    else:
        return "neu"

df_sent["vader_label"] = df_sent["vader_compound"].apply(vader_label)

In [None]:
# Top 3 positive sentences
df_sent.sort_values("vader_compound").tail(3)[["vader_compound", "sentence_clean_sent"]]

Unnamed: 0,vader_compound,sentence_clean_sent
4333,0.9712,"Surveys of businessmen and bankers in the Fifth District indicate general agreement on the following points: (1) some improvement in manufacturers' shipments, volume of new orders, and backlogs of orders; (2) significant further improvement in retail sales, including automobiles; (3) stability in the employment situation, but no clear evidence of improvement; (4) further reductions of prices in manufacturing, but not in retail goods and services; (5) sharp improvement in residential construction, and some increase in nonresidential construction; (6) substantial increases in mortgage loan demand, and slight increases in consumer loan demand, but no significant improvement in business loan demand; and (7) a generally more optimistic outlook regarding future business conditions."
553225,0.9719,"Contacts noted that workers are on track to receive bonuses this year, but bonuses are not expected to be overly generous given the softer labor market, though Wall Street bonuses are expected to be strong."
10661,0.9772,"However, a special survey of a cross-section of prominent businessmen located in the Atlanta, Nashville, and New Orleans areas yielded the following conclusions: support for the NEP remains strong, but has diminished somewhat over the past year; the NEP has been effective in checking at least some price increases; wage inflation has been effectively checked by the NEP, inequities have not been so great as to jeopardize the program; some form of controls should be continued beyond April 1973; inflationary expectations have diminished slightly; and the only economic resource that is in short supply is competent labor."


In [None]:
# Top 3 negative sentences
df_sent.sort_values("vader_compound").head(3)[["vader_compound", "sentence_clean_sent"]]

Unnamed: 0,vader_compound,sentence_clean_sent
189644,-0.9716,But adverse weather delayed or damaged crops in other districts and caused heavy livestock death losses and flood losses in the Minneapolis district.
455224,-0.9571,"In late August and early September, Hurricane Irene and Tropical Storm Lee left dozens of people injured or dead, damaged or destroyed thousands of homes, and cost hundreds of millions of dollars in disruption and damage throughout much of the Third District."
337377,-0.9524,Contacts also mentioned other industry changes resulting from the terrorist attacks such as separate terrorism and war clauses for policies (at additional cost) and closer scrutiny of the solvency of re-insurance providers.


In [None]:
# -------------------------------------------------------
# Aggregate sentiment by Beige Book release (year_month)
# -------------------------------------------------------
df_index = (
    df_sent
    .groupby("year_month", as_index=False)
    .agg(
        n_pos=("vader_label", lambda x: (x == "pos").sum()),
        n_neg=("vader_label", lambda x: (x == "neg").sum()),
        n_total=("vader_label", lambda x: ((x == "pos") | (x == "neg")).sum()),
        full_date=("full_date", "first")  # keep one date for reference
    )
)

# Compute Beige Book Sentiment Index: ( #positive − #negative ) / ( #positive + #negative )
df_index["vader_national"] = (
    (df_index["n_pos"] - df_index["n_neg"]) /
    (df_index["n_pos"] + df_index["n_neg"])
)

### 4.2 TF-IDF Theme Identification
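To understand what drives movements in sentiment, I applied TF-IDF separately to the positive and negative sentences of each monthly release. Within each month-and-label group, every sentence is treated as a document, a vectorizer is fit on that group alone, and terms are ranked by their mean TF-IDF weight across the group's sentences. Because the inverse document frequency is computed within the group rather than across the full corpus, the ranking favors terms that recur in that month's positive or negative sentences while down-weighting words that appear in nearly all of them. This approach highlights the economic concepts that businesses emphasized most strongly in positive or negative reports. By examining these terms over time, it is possible to identify shifts in the themes that shape business sentiment, such as supply-chain issues, labor concerns, or demand conditions.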

In [None]:
# -------------------------------------------------------
# Compute TF–IDF separately for positive and negative sentences
# -------------------------------------------------------
from sklearn.feature_extraction.text import TfidfVectorizer

In [None]:

def top_tfidf_words(group, n=10):
    """
    Compute TF–IDF for a group of sentences (one month)
    and return the top-n weighted terms.
    """
    # Take all TF-IDF-cleaned sentences for this month/label
    texts = group["sentence_clean_tfidf"].tolist()

    # If empty, return blank
    if len(texts) == 0:
        return ""

    # Vectorize using TF-IDF
    tfidf = TfidfVectorizer(max_features=2000)
    X_tfidf = tfidf.fit_transform(texts)

    # Compute average TF-IDF score across all sentences in this group
    scores = X_tfidf.mean(axis=0).A1
    terms = tfidf.get_feature_names_out()

    # Top-n term indices (descending score)
    top_idx = scores.argsort()[::-1][:n]

    # Return selected words as comma-separated string
    return ", ".join([terms[i] for i in top_idx])

In [None]:
# Compute TF–IDF themes for positive sentences
df_pos = df_sent[df_sent["vader_label"] == "pos"]

tfidf_pos = (
    df_pos
    .groupby("year_month")     # compute monthly themes
    .apply(top_tfidf_words, n=10)
    .reset_index(name="top_pos_terms")
)

In [None]:
# Compute TF–IDF themes for negative sentences
df_neg = df_sent[df_sent["vader_label"] == "neg"]

tfidf_neg = (
    df_neg
    .groupby("year_month")
    .apply(top_tfidf_words, n=10)
    .reset_index(name="top_neg_terms")
)

In [None]:
# Merge TF–IDF results into df_index
df_index = df_index.merge(tfidf_pos, on="year_month", how="left")
df_index = df_index.merge(tfidf_neg, on="year_month", how="left")

# Preview the result
df_index.head()

Unnamed: 0,year_month,n_pos,n_neg,n_total,full_date,vader_national,top_pos_terms,top_neg_terms
0,April 1971,154,120,274,"April 6, 1971",0.124088,"construction, demand, rates, consumer, improvement, loans, directors, loan, residential, increased","demand, rates, loan, banks, unemployment, weak, levels, activity, consumer, cut"
1,April 1972,220,70,290,"April 12, 1972",0.517241,"sales, demand, business, strong, construction, expected, strength, loans, loan, gains","demand, business, unemployment, cent, per, directors, new, phase, investment, respondents"
2,April 1973,169,114,283,"April 11, 1973",0.194346,"strong, directors, increase, construction, loan, rates, business, interest, employment, increased","shortages, demand, prices, labor, increases, price, new, unemployment, inflationary, phase"
3,April 1974,192,143,335,"April 10, 1974",0.146269,"business, sales, strong, prices, increase, increased, month, optimistic, rates, demand","shortages, demand, prices, steel, continue, loan, recent, business, however, also"
4,April 1975,202,187,389,"April 9, 1975",0.03856,"sales, increase, however, construction, economy, months, business, new, consumer, one","sales, prices, demand, weak, loan, unemployment, recovery, still, capital, one"


## 5. Results

In [None]:
# ---------------------------------------------
# Build a dynamic Plotly graph
# ---------------------------------------------
import numpy as np
import plotly.graph_objs as go

In [None]:
df_index["date"] = pd.to_datetime(df_index["year_month"], format="%B %Y")
df_index = df_index.sort_values("date").reset_index(drop=True)
df_fred  = df_fred.sort_values("date").reset_index(drop=True)

# ---------------------------------------------
# Merge Beige Book index with FRED growth data
# (pct_change is used directly, no renaming)
# ---------------------------------------------
df_plot = pd.merge_asof(
    df_index.sort_values("date"),
    df_fred[["date", "pct_change"]].sort_values("date"),
    on="date",
    direction="backward"
).reset_index(drop=True)

# ---------------------------------------------
# Prepare custom hover data
# customdata columns:
#   0 = Beige Book Index
#   1 = USPHCI YoY (%)
#   2 = positive keywords
#   3 = negative keywords
# ---------------------------------------------
customdata = np.stack([
    df_plot["vader_national"].values,
    df_plot["pct_change"].values,
    df_plot["top_pos_terms"].fillna("").values,
    df_plot["top_neg_terms"].fillna("").values,
], axis=-1)

# ---------------------------------------------
# Build Plotly figure with dual y-axes
# ---------------------------------------------
fig = go.Figure()

# ----- Beige Book Sentiment Index (left axis) -----
fig.add_trace(
    go.Scatter(
        x=df_plot["date"],
        y=df_plot["vader_national"],
        mode="lines+markers",
        name="Beige Book Sentiment Index",
        yaxis="y1",
        customdata=customdata,
        hovertemplate=(
            "<b>%{x|%Y-%m}</b><br>"
            "Beige Book Index: %{customdata[0]:.3f}<br>"
            "USPHCI YoY: %{customdata[1]:.2f}%<br>"
            "<br><b>Positive keywords</b>: %{customdata[2]}<br>"
            "<b>Negative keywords</b>: %{customdata[3]}<br>"
            "<extra></extra>"
        ),
    )
)

# ----- USPHCI YoY (%) (right axis) -----
fig.add_trace(
    go.Scatter(
        x=df_plot["date"],
        y=df_plot["pct_change"],
        mode="lines+markers",
        name="Philly Fed Economic Activity Index YoY (%)",
        yaxis="y2",
        hovertemplate=(
            "<b>%{x|%Y-%m}</b><br>"
            "USPHCI YoY: %{y:.2f}%<br>"
            "<extra></extra>"
        ),
    )
)

# ---------------------------------------------
# Layout settings
# ---------------------------------------------
fig.update_layout(
    title="Beige Book Sentiment Index vs Economic Activity Index",
    xaxis=dict(title="Date"),
    yaxis=dict(title="Beige Book Sentiment Index"),
    yaxis2=dict(
        title="Philly Fed Economic Activity Index YoY (%)",
        overlaying="y",
        side="right",
        showgrid=False,
    ),
    template="plotly_white",
    hovermode="x unified",
)

fig.show()

<blockquote>

<h3>⚠️ Interactive Plot of the Beige Book Index</h3>

GitHub cannot render interactive Plotly graphs.<br>
Please click
<a href="https://quiet-econ-lab.github.io/Quantifying_Beige_Book/" target="_blank">
<b>here</b>
</a>
to view the fully interactive graph.

</blockquote>

When the Beige Book Sentiment Index is compared with the growth rate of the Philadelphia Fed’s Coincident Economic Activity Index, the two measures tend to move in a similar direction. Periods of stronger sentiment are generally associated with stronger underlying economic conditions, whereas weaker sentiment coincides with slower growth. This alignment indicates that qualitative business narratives contain timely information about economic momentum—an insight that is particularly valuable during periods when official statistics are delayed or unavailable, as in the current government shutdown.
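
One way to put a number on this co-movement is a lead-lag correlation between the two series. The sketch below is a minimal illustration under stated assumptions: it uses synthetic data in place of `df_plot`, the helper `lead_lag_corr` is hypothetical, and the toy sentiment series is constructed to lead activity by two periods, so the result says nothing about the actual Beige Book index.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 120

# Synthetic stand-ins for the two columns of df_plot (not real data)
activity = pd.Series(rng.normal(0, 1, n))
sentiment = activity.shift(-2) + rng.normal(0, 0.1, n)  # leads activity by 2 periods
toy_plot = pd.DataFrame(
    {"vader_national": sentiment, "pct_change": activity}
).dropna().reset_index(drop=True)

def lead_lag_corr(df, x="vader_national", y="pct_change", max_lag=6):
    """Pearson correlation between x at time t and y at time t + k.

    A positive k with a high correlation means x tends to move before y.
    """
    return pd.Series(
        {k: df[x].corr(df[y].shift(-k)) for k in range(-max_lag, max_lag + 1)}
    )

corrs = lead_lag_corr(toy_plot)
best_lag = corrs.idxmax()
print(corrs.round(2))
print("lag with highest correlation:", best_lag)
```

Applied to the real `df_plot`, the same function would show whether the correlation peaks at a positive lag, which is what an early-warning interpretation of the index would require. Because Beige Book releases are not evenly spaced in time, lags here count releases rather than calendar months.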

As the graph above shows, the sentiment index has softened. The background for this decline becomes clearer when viewed alongside the TF-IDF results, which are accessible through the hover labels in the plot. In particular, the decline in May 2025 stands out. The most prominent negative terms for that month (“demand,” “uncertain,” and “tariff”) point to growing concerns about slowing demand and uncertainty related to trade policy.

A similar pattern is visible in November 2025, where negative sentences again emphasize “demand” and “uncertain.” This persistence indicates that caution among businesses has not eased and that uncertainty remains a major factor shaping economic expectations. In addition, “labor” appears as an important negative term. Given the Federal Reserve’s dual mandate of maximum employment and stable prices, the emergence of “labor” as a source of concern carries meaningful implications for monetary policy, as discussed in the next section.

## 6. Policy Implications and Conclusion
For monetary policymakers, a decline in Beige Book sentiment driven by weakening demand and heightened uncertainty provides an early signal of economic slowing. Such developments strengthen the case for policy easing. As noted above, concerns about the labor market are particularly important. When negative terms related to “labor” appear alongside broader signs of weakening demand, they suggest that the Federal Reserve’s goal of maximum employment may be at risk, creating a strong signal in favor of lowering interest rates. Indeed, the Federal Reserve has recently decided to cut rates, consistent with the deterioration in sentiment documented in the Beige Book.

For fiscal policymakers, the economic themes revealed through the TF-IDF analysis are especially valuable because fiscal policy, unlike monetary policy, can be targeted toward specific sectors or groups. For example, if “tariff” appears as a prominent negative term, this indicates that importers facing higher input costs and consumers affected by price pass-through may require targeted relief. In addition, the TF-IDF results for November 2025 show manufacturing among the positive keywords, suggesting that this sector remains relatively resilient. In such circumstances, the analysis also helps clarify how to finance targeted support. If manufacturing is performing comparatively well while importers and consumers are under strain, redistributing income from the stronger sector to those more adversely affected may represent an appropriate fiscal strategy.

In conclusion, this project demonstrates that the Federal Reserve’s qualitative Beige Book can be transformed into a useful numerical indicator through sentence-level sentiment analysis. The resulting index aligns with real economic activity, and the TF-IDF results help reveal the economic themes that drive month-to-month shifts in sentiment. Together, these methods show that narrative economic information—often viewed as anecdotal or subjective—can be converted into data that meaningfully informs monetary and fiscal decision-making.

Several extensions could further enhance the value of this approach. Applying the same methodology at the district level could highlight regional strengths and challenges, enabling not only the Federal Reserve and federal policymakers but also state and local governments to incorporate these insights into their policy processes. Although the smaller volume of text in district-level reports may increase the volatility of sentiment indices, this limitation could be mitigated by using FinBERT, a language model trained specifically on financial and economic text. While VADER was used in this project for its speed and low computational cost, FinBERT may offer more accurate and less volatile sentiment classification, especially when GPU-based parallel processing is available. With such refinements, the Beige Book could become an even more powerful quantitative resource for real-time economic monitoring and policy design.
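
The district-level extension could reuse the same index definition with one extra grouping key. The following is a minimal sketch, assuming the `year_month`, `region_name`, and `vader_label` columns of `df_sent`; the toy rows, the function name `district_index`, and the output column `vader_district` are hypothetical.

```python
import pandas as pd

# Toy stand-in for df_sent (invented labels, not Beige Book results)
toy_sent = pd.DataFrame({
    "year_month": ["May 2025"] * 6,
    "region_name": ["Boston"] * 3 + ["Dallas"] * 3,
    "vader_label": ["pos", "pos", "neg", "neg", "neu", "neg"],
})

def district_index(df):
    """Compute (pos - neg) / (pos + neg) per release and district."""
    counts = (
        df.groupby(["year_month", "region_name"])["vader_label"]
        .value_counts()
        .unstack(fill_value=0)
        # Ensure both columns exist even if a label never occurs
        .reindex(columns=["pos", "neg"], fill_value=0)
    )
    denom = counts["pos"] + counts["neg"]
    # Leave the index as NaN where a district has no polar sentences
    counts["vader_district"] = (counts["pos"] - counts["neg"]) / denom.where(denom > 0)
    return counts.reset_index()

df_district = district_index(toy_sent)
print(df_district)
```

With only a handful of polar sentences per district, a single sentence can swing the index sharply, which illustrates the volatility concern noted above.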