# Fed Press Conference Audio → Sentiment → Intraday SP500 Alignment 

**Notebook purpose:** End‑to‑end pipeline to (1) download a Fed press-conference video, (2) segment audio into fixed windows, (3) transcribe with Whisper, (4) score sentiment with FinBERT, (5) align with SP500 intraday prices with a 15s embargo, and (6) perform a simple Pearson correlation analysis.

> **Inspiration:** This notebook is inspired by Chapter 9 of Generative AI for Trading and Asset Management by Hamlet Medina and Ernest P. Chan. It reproduces the high-level system described there for educational purposes, with additional pragmatic glue code and commentary to make it runnable in practice.

> **Disclaimer:** This notebook is provided solely for educational purposes and does not constitute financial advice.

## 0) Environment, Dependencies and Project Configuration
- Python 3.9+ is recommended.
- ffmpeg must be installed and available on your PATH (for audio I/O).
- A GPU is helpful for Whisper but not required.
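
Before running the later cells, it can help to confirm that the command-line tools are reachable. The check below is a minimal sketch using only the standard library; `find_tool` is a hypothetical helper name introduced here, not part of the original notebook.

```python
import shutil

def find_tool(name):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)

# Both ffmpeg and ffprobe are called later in this notebook.
missing = [tool for tool in ("ffmpeg", "ffprobe") if find_tool(tool) is None]
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("ffmpeg and ffprobe are available.")
```

If either tool is reported missing, install ffmpeg or add its `bin` folder to PATH before continuing.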

In [None]:
import pandas as pd
import datetime as dt
import numpy as np

import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

In [None]:
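from pathlib import Path

YOUTUBE_URL = "https://www.youtube.com/watch?v=u0V3gnOjOi0"
AUDIO_FILE = "audio.mp4"        # raw audio-only download
AUDIO_FILE_CLIP = "clip.mp4"    # excerpt containing only the speech segment
OUT_DIR, CHUNK_DIR = Path("out"), Path("out/chunks")
AUDIO_FILE_CONVERTED = OUT_DIR / "audio.wav"  # mono, 16 kHz copy of the clip
CHUNK, EMBARGO = 60, 15         # chunk length and embargo, both in seconds

# Create the output folders up front; ffmpeg does not create missing directories.
CHUNK_DIR.mkdir(parents=True, exist_ok=True)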


## 1) Download press‑conference audio from YouTube

We grab the **best available audio-only** stream with `pytubefix`.

In [None]:
from pytubefix import YouTube

yt = YouTube(YOUTUBE_URL)

audio_stream = yt.streams.filter(only_audio=True).order_by("abr").desc().first()
audio_stream.download(filename=AUDIO_FILE)

## 2) Segment audio into fixed windows (“audio bars”)

This step (1) clips the audio to the speech segment of interest, moving the clip start 15 seconds earlier (the embargo), (2) converts the clip to mono at 16 kHz for ASR, and (3) splits it into 1-minute chunks.
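
The window arithmetic can be checked without touching any audio. The sketch below reuses the timestamps, chunk length, and embargo from this notebook; `embargoed_start` and `full_chunk_offsets` are hypothetical helper names, and the result is only the nominal count, since the duration reported by ffprobe may differ slightly.

```python
import datetime as dt

FMT = "%H:%M:%S"

def embargoed_start(start, embargo_s):
    """Shift an HH:MM:SS timestamp earlier by embargo_s seconds, clamping at 00:00:00."""
    t = dt.datetime.strptime(start, FMT)
    floor = dt.datetime.strptime("00:00:00", FMT)
    return max(t - dt.timedelta(seconds=embargo_s), floor).strftime(FMT)

def full_chunk_offsets(duration_s, chunk_s):
    """Start offsets (in seconds) of every complete chunk that fits in duration_s."""
    return [i * chunk_s for i in range(int(duration_s // chunk_s))]

clip_start = embargoed_start("00:56:44", 15)
clip_len = (dt.datetime.strptime("01:02:23", FMT)
            - dt.datetime.strptime(clip_start, FMT)).total_seconds()
offsets = full_chunk_offsets(clip_len, 60)
print(clip_start, clip_len, offsets)
```

With these inputs the clip nominally runs from 00:56:29 for 354 seconds, which leaves room for five complete 60-second chunks; the trailing 54 seconds are dropped.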

In [None]:
import subprocess
import ffmpeg

start = "00:56:44"
end = "01:02:23"
fmt = "%H:%M:%S"
start_dt = dt.datetime.strptime(start, fmt)
new_start_dt = max(start_dt - dt.timedelta(seconds=EMBARGO), dt.datetime.strptime("00:00:00", fmt))
new_start = new_start_dt.strftime(fmt)

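# Stream-copy the excerpt without re-encoding; assumes ffmpeg is on PATH.
subprocess.run([
    "ffmpeg", "-y",          # -y: overwrite an existing clip instead of prompting
    "-ss", new_start,
    "-to", end,
    "-i", AUDIO_FILE,
    "-c", "copy",
    AUDIO_FILE_CLIP
], check=True)               # raise if ffmpeg exits with an error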

In [None]:
ffmpeg.input(AUDIO_FILE_CLIP).output(str(AUDIO_FILE_CONVERTED), ac=1, ar=16000).overwrite_output().run()

In [None]:
# Duration (in seconds) of the converted WAV, as reported by ffprobe.
dur = float(subprocess.check_output(
    ["ffprobe", "-v", "error", "-show_entries", "format=duration",
     "-of", "default=noprint_wrappers=1:nokey=1", str(AUDIO_FILE_CONVERTED)]
).decode().strip())

In [None]:
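# Cut the converted (mono, 16 kHz) audio into consecutive CHUNK-second files.
# Only complete chunks are kept; a shorter remainder at the end is dropped.
chunks, start = [], 0
while start + CHUNK <= dur + 1e-6:
    out_wav = CHUNK_DIR / f"chunk_{int(start):04d}.wav"
    (ffmpeg.input(str(AUDIO_FILE_CONVERTED), ss=start, t=CHUNK)
     .output(str(out_wav)).overwrite_output().run())
    chunks.append((out_wav, start))  # (file path, offset in seconds from clip start)
    start += CHUNK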

## 3) Transcribe segments with Whisper


In [None]:
import whisper

whisper_model = whisper.load_model("base")
rows = []
for wav,s0 in chunks:
    txt = whisper_model.transcribe(str(wav), fp16=False, verbose=False).get("text","").strip()
    rows.append({"chunk_start_s":s0,"transcript":txt})
df = pd.DataFrame(rows)

In [None]:
df

In [None]:
df['transcript'].iloc[0]

## 4) Sentiment analysis with FinBERT (ProsusAI/finbert)

In [None]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tok = AutoTokenizer.from_pretrained("ProsusAI/finbert")
clf = AutoModelForSequenceClassification.from_pretrained("ProsusAI/finbert").to("cpu").eval()

labels = clf.config.id2label  # e.g., {0: 'negative', 1: 'neutral', 2: 'positive'}
print(labels)
pos_idx = next(i for i, v in labels.items() if v.lower().startswith("pos"))
neg_idx = next(i for i, v in labels.items() if v.lower().startswith("neg"))

In [None]:
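def sentiment(text):
    """Return P(positive) - P(negative), a net sentiment score in [-1, 1]."""
    # Inputs longer than the 512-token BERT limit are truncated.
    enc = tok(text, return_tensors="pt", truncation=True, max_length=512, padding=True)
    with torch.no_grad():
        p = torch.softmax(clf(**enc).logits, dim=-1).numpy()[0]
    return float(p[pos_idx] - p[neg_idx])

df["sentiment"] = df["transcript"].apply(sentiment)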
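
To make the score concrete, the sketch below reproduces the same calculation (softmax over the logits, then positive minus negative probability) with the standard library only. The logits and the label order are made up for illustration; they are not FinBERT outputs, and `net_sentiment` is a hypothetical name.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def net_sentiment(logits, pos_idx, neg_idx):
    """P(positive) - P(negative): +1 is fully positive, -1 fully negative."""
    p = softmax(logits)
    return p[pos_idx] - p[neg_idx]

# Illustrative logits only; index 0 is treated as positive and index 1 as negative.
examples = {
    "leans positive": [2.0, -1.0, 0.5],
    "leans negative": [-1.0, 2.0, 0.5],
    "undecided": [0.0, 0.0, 0.0],
}
for name, logits in examples.items():
    print(name, round(net_sentiment(logits, pos_idx=0, neg_idx=1), 3))
```

A mostly neutral transcript also lands near zero, because the neutral probability does not enter the difference.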

In [None]:
df

## 5) Align with SP500 intraday prices


In [None]:
sp_data = {
    "time": [
        "2025-07-30 15:31:00",
        "2025-07-30 15:32:00",
        "2025-07-30 15:33:00",
        "2025-07-30 15:34:00",
        "2025-07-30 15:35:00",
        "2025-07-30 15:36:00"
    ],
    "close": [
        6392.41,
        6392.11,
        6391.11,
        6393.62,
        6396.32,
        6389.22
    ]
}
sp = pd.DataFrame(sp_data)
sp["time"] = pd.to_datetime(sp["time"])
sp["log_return"] = np.log(sp["close"] / sp["close"].shift(1))
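
For reference, the log return of bar t is ln(close_t / close_{t-1}), so six closes yield five returns. The sketch below repeats the calculation on the same sample prices with the standard library, as a cross-check on the pandas expression above.

```python
import math

# Same six one-minute closes as in the cell above.
closes = [6392.41, 6392.11, 6391.11, 6393.62, 6396.32, 6389.22]

# Pair each close with its predecessor; the first bar has no return.
log_returns = [math.log(curr / prev) for prev, curr in zip(closes, closes[1:])]

for r in log_returns:
    print(f"{r:+.6f}")

# Log returns are additive: their sum equals the log return over the whole span.
total = math.log(closes[-1] / closes[0])
print(math.isclose(sum(log_returns), total))
```

The additivity shown in the last line is the usual reason for preferring log returns over simple percentage changes when aggregating across bars.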

In [None]:
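# Positional join: row i of the returns pairs with chunk i, so both tables need the same number of rows.
temp = pd.concat([sp.dropna().reset_index(drop=True), df], axis=1)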
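
The positional join above relies on the two tables having matching row order. A keyed alternative is to convert each chunk offset to a wall-clock timestamp and look up the price bar with that timestamp. The sketch below assumes, purely for illustration, that offset 0 of the clip corresponds to 15:31:00 and that each chunk is paired with the bar stamped at the chunk's end; neither assumption comes from the notebook, and `CLIP_START_WALL` and `bar_time` are hypothetical names.

```python
import datetime as dt
import math

# ASSUMPTION: wall-clock time of clip offset 0, chosen only so the example lines up.
CLIP_START_WALL = dt.datetime(2025, 7, 30, 15, 31, 0)
CHUNK_S = 60

def bar_time(chunk_start_s):
    """Timestamp of the price bar that closes when the chunk ends (assumed pairing)."""
    return CLIP_START_WALL + dt.timedelta(seconds=chunk_start_s + CHUNK_S)

# Sample closes from the cell above, keyed by bar timestamp.
times = [dt.datetime(2025, 7, 30, 15, 31 + i) for i in range(6)]
closes = [6392.41, 6392.11, 6391.11, 6393.62, 6396.32, 6389.22]
returns_by_time = {
    t: math.log(curr / prev)
    for t, prev, curr in zip(times[1:], closes, closes[1:])
}

# Join on timestamp; chunks without a matching bar are skipped rather than misaligned.
chunk_offsets = [0, 60, 120, 180, 240]
aligned = [
    (bar_time(s), s, returns_by_time[bar_time(s)])
    for s in chunk_offsets
    if bar_time(s) in returns_by_time
]
for row in aligned:
    print(row)
```

With a real recording, the anchor time would have to come from the broadcast schedule or the video's own timestamps.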

In [None]:
temp

## 6) Pearson correlation analysis between SP500 log returns and sentiment scores

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import pearsonr

In [None]:
plt.figure(figsize=(8, 5))
sns.regplot(x="close", y="sentiment", data=temp, marker="o", line_kws={"color": "red"})

In [None]:
pearsonr(temp["log_return"], temp["sentiment"])
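
`pearsonr` returns both the correlation coefficient and a p-value. As a reminder of what the coefficient measures, the sketch below computes it from the definition (covariance divided by the product of the standard deviations) with the standard library; `pearson_r` is a hypothetical name and the inputs are toy data, not results from this notebook.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (needs non-constant inputs)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Toy data: a perfectly increasing and a perfectly decreasing linear relation.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [3, 2, 1]))
```

With only five aligned observations, as in this notebook, the estimate is very noisy, so the coefficient is better read as an illustration of the pipeline than as evidence of a relationship.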