### Yahoo Finance Analysis LLM Summarizer

##### This analysis scrapes the Yahoo Finance Analysis page for multiple tickers, using Selenium to extract each page's title and text. The script then uses OpenAI's gpt-4o-mini, driven by a system prompt and a user prompt, to summarize the findings of the analysis.

In [1]:
import numpy as np
import pandas as pd
import requests
from bs4 import BeautifulSoup
import pickle
import unicodedata
from dotenv import load_dotenv
from IPython.display import Markdown, display
from openai import OpenAI
import os
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import re

In [2]:
# Load environment variables in a file called .env

load_dotenv(override=True) # looks at .env file and loads in secrets
api_key = os.getenv('OPENAI_API_KEY')

# Check the key

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!
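
The same checks can be wrapped in a small function so they are reusable and easy to test. This is only a sketch, not part of the original notebook: `check_api_key` and its return labels are hypothetical. It tests for stray whitespace before the prefix, so a padded key gets the more specific diagnosis.

```python
def check_api_key(api_key):
    """Return a short diagnostic label for an OpenAI API key (hypothetical helper)."""
    if not api_key:
        return "missing"
    if api_key.strip() != api_key:
        return "whitespace"
    if not api_key.startswith("sk-proj-"):
        return "unexpected prefix"
    return "ok"
```

In the notebook it could be called as `check_api_key(os.getenv('OPENAI_API_KEY'))`.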


In [3]:
openai = OpenAI()

In [57]:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import time

# List of tickers
tickers = ['AAPL', 'MSFT', 'SMCI']

# Setup headless Chrome
options = Options()
options.add_argument('--headless=new')
options.add_argument('--disable-gpu')
options.add_argument('--no-sandbox')
driver = webdriver.Chrome(options=options)

analysis_links = {}

for ticker in tickers:
    base_url = f'https://finance.yahoo.com/quote/{ticker}'
    driver.get(base_url)
    time.sleep(3)  # wait for page to load

    try:
        # Find "Analysis" tab link in navigation menu
        nav_links = driver.find_elements(By.CSS_SELECTOR, 'a[href*="/quote/"][href*="/analysis"]')
        for link in nav_links:
            href = link.get_attribute('href')
            if '/analysis' in href:
                analysis_links[ticker] = href
                break
        else:
            analysis_links[ticker] = "Analysis link not found"
    except Exception as e:
        analysis_links[ticker] = f"Error: {e}"

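driver.quit()

# Print the link found for each ticker
for ticker, link in analysis_links.items():
    print(f"{ticker}: {link}")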

AAPL: https://finance.yahoo.com/quote/AAPL/analysis/
MSFT: https://finance.yahoo.com/quote/MSFT/analysis/
SMCI: https://finance.yahoo.com/quote/SMCI/analysis/
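
The output above suggests the Analysis URL follows a fixed pattern, so the links could also be built directly without opening a browser. The sketch below assumes that pattern holds for every ticker, which the notebook does not verify. `build_analysis_url` and `BASE` are hypothetical names.

```python
from urllib.parse import quote

BASE = "https://finance.yahoo.com/quote"

def build_analysis_url(ticker):
    """Build the Analysis-page URL for a ticker, assuming the pattern seen in the output above."""
    symbol = quote(ticker.strip().upper(), safe="")  # normalize and URL-encode the symbol
    return f"{BASE}/{symbol}/analysis/"

analysis_urls = {t: build_analysis_url(t) for t in ["AAPL", "MSFT", "SMCI"]}
```

The Selenium lookup remains the safer choice if Yahoo changes its URL scheme, since it reads the link from the page itself.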


In [63]:
yahoo_links = list(analysis_links.values())

In [68]:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import time

class Website:
    """Load a URL in headless Chrome and capture the page title and visible body text."""

    def __init__(self, url):
        self.url = url

        options = Options()
        options.add_argument("--headless=new")
        options.add_argument("--disable-gpu")
        options.add_argument("--no-sandbox")
        self.driver = webdriver.Chrome(options=options)

        try:
            self.driver.get(url)
            time.sleep(5)  # let JS render

            self.title = self.driver.title
            body_elem = self.driver.find_element(By.TAG_NAME, "body")
            self.text = body_elem.text
        finally:
            # Always close the browser, even if loading or scraping fails
            self.driver.quit()

In [69]:
site = Website("https://finance.yahoo.com/quote/SMCI/analysis")
print(site.title)
print(site.text[:1000])  # Preview

Super Micro Computer, Inc. (SMCI) Analyst Ratings, Estimates & Forecasts - Yahoo Finance
Yahoo Finance
Mail
Sign in
Summary
News
Research
Chart
Community
Statistics
Historical Data
Profile
Financials
Analysis
Options
Holders
Sustainability
Unlock stock picks and a broker-level newsfeed that powers Wall Street.
Upgrade Now
NasdaqGS - Nasdaq Real Time Price
•
USD
Super Micro Computer, Inc. (SMCI)
Follow
Compare
32.32
-2.77
(-7.89%)
At close: 4:00:00 PM EDT
32.33
+0.01
(+0.03%)
After hours: 7:59:33 PM EDT
Time to buy SMCI?
Estimate Trends
Fair Value
Research Analysis
Earnings Per Share
+0.53 Estimate
  Revenue vs. Earnings
Revenue
5.68B
Earnings
320.6M
Q1'24
Q2'24
Q3'24
Q4'24
0
2B
4B
  Analyst Recommendations
Strong Buy
Buy
Hold
Underperform
Sell
  Analyst Price Targets
15.00
Low
52.48
Average
32.32
Current
93.00
High
Earnings Estimate
Currency in USD Current Qtr. (Mar 2025) Next Qtr. (Jun 2025) Current Year (2025) Next Year (2026)
No. of Analysts 11 11 12 13
Avg. Estimate 0.53 0.73 2.59 
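
As the preview shows, the body text starts with navigation and price widgets before the estimate tables. One possible way to reduce that noise before prompting the model is to keep only the text from a known section heading onward. This is a sketch under assumptions: `trim_to_section` is a hypothetical helper, and the default marker "Earnings Estimate" is taken from the preview above and may not match other pages or future layouts.

```python
def trim_to_section(text, marker="Earnings Estimate"):
    """Return text starting at the first line equal to `marker`; fall back to the full text."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == marker:
            return "\n".join(lines[i:])
    return text  # marker not found: leave the text untouched
```

Note that this also drops the price-target and recommendation widgets that appear before the marker, so the marker should be chosen according to what the summary is meant to cover.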

In [70]:
# Define our system prompt - you can experiment with this later, changing the last sentence to "Respond in markdown in Spanish."

system_prompt = "You are a fundamental analyst that analyzes the contents of a Yahoo Finance analysis page \
and provides a short summary of the analysis, ignoring text that might be navigation related. \
Respond in markdown."

In [71]:
# A function that writes a User Prompt that asks for summaries of websites:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website are as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [72]:
# Build the messages list (system prompt + user prompt) in the format the Chat Completions API expects

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]
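
To inspect the message structure without launching a browser, a stand-in object with `title` and `text` attributes is enough. The sketch below is illustrative only. `build_messages` is a hypothetical, self-contained variant of the two functions above that takes the system prompt as a parameter, and the sample title and text are made up.

```python
from types import SimpleNamespace

def build_messages(website, system_prompt):
    """Return the system/user message list for any object exposing .title and .text."""
    user_prompt = (
        f"You are looking at a website titled {website.title}\n"
        "The contents of this website are as follows; "
        "please provide a short summary of this website in markdown.\n\n"
        f"{website.text}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

# Made-up stand-in for a scraped page
fake_site = SimpleNamespace(title="Example Page", text="Avg. Estimate 0.53")
messages = build_messages(fake_site, "Respond in markdown.")
```

Printing `messages` shows a two-element list: the system message first, then the user message ending with the page text.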

In [73]:
# Call the OpenAI Chat Completions API and return the summary text

def summarize(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model = "gpt-4o-mini",
        messages = messages_for(website)
    )
    return response.choices[0].message.content

In [74]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [75]:
display_summary(yahoo_links[0])

# Apple Inc. (AAPL) Analyst Ratings, Estimates & Forecasts Summary

## Current Stock Price
- **Closing Price:** $203.19 (-9.25%)
- **After Hours Price:** $203.25 (+0.03%)

## Analyst Recommendations
- **Overall Sentiment:** Mixed
- **Ratings:**
  - Strong Buy
  - Buy
  - Hold
  - Underperform
  - Sell

## Price Targets
- **Low Target:** $175.00
- **Average Target:** $250.40
- **Current Price:** $203.19
- **High Target:** $325.00

## Earnings Estimates
- **Upcoming EPS Estimates:**
  - Current Qtr (Mar 2025): $1.61
  - Next Qtr (Jun 2025): $1.49
  - Current Year (2025): $7.30
  - Next Year (2026): $8.17

## Revenue Estimates
- **Revenue Projections:**
  - Current Qtr (Mar 2025): $94.04B
  - Next Qtr (Jun 2025): $89.41B
  - Current Year (2025): $409.21B
  - Next Year (2026): $441.99B

## Analyst Ratings and Upgrades
- Notable Upgrades:
  - **B of A Securities:** Maintained "Buy" on 4/3/2025
  - **Tigress Financial:** Maintained "Strong Buy" on 4/3/2025
  - **Morgan Stanley:** Maintained "Overweight" on 3/12/2025

## Growth Estimates
- **Sales Growth:**
  - Current Year: 8.32%
  - Next Year: 12.04%

## Overall Analyst Sentiment
- Analysts are generally optimistic with several maintaining or upgrading their ratings, reflecting confidence in Apple's future growth potential driven by revenue and earnings estimates. 

## Conclusion
Investors should consider current stock trends, analyst recommendations, and estimated earnings growth when evaluating AAPL's investment potential.
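
The notebook's final cell summarizes only the first link. Because `analysis_links` can also hold placeholder strings such as "Analysis link not found" or an error message, a loop over all tickers would need to skip those entries. A minimal sketch, assuming a `summarize`-style callable is passed in: `summarize_all` is a hypothetical helper, and the stand-in summarizer below exists only so the example runs without a browser or API key.

```python
def summarize_all(links, summarize_fn):
    """Map each ticker to a summary, skipping entries that are not http(s) URLs."""
    results = {}
    for ticker, link in links.items():
        if not link.startswith(("http://", "https://")):
            results[ticker] = None  # placeholder or error string, nothing to fetch
            continue
        results[ticker] = summarize_fn(link)
    return results

demo_links = {
    "AAPL": "https://finance.yahoo.com/quote/AAPL/analysis/",
    "XXXX": "Analysis link not found",
}
# Stand-in summarizer; the real one would call the model
summaries = summarize_all(demo_links, lambda url: f"summary of {url}")
```

In the notebook this could be called as `summarize_all(analysis_links, summarize)`, followed by `display(Markdown(...))` for each non-empty result.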