<a href="https://colab.research.google.com/github/fdac25/trading/blob/main/src/forbesScraper.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Article scraper - scrapes Forbes articles for title, author, publication date, and content

# Imports
import requests
from bs4 import BeautifulSoup
import pandas as pd
import re
import time
import random

In [2]:
# Get critical elements from article webpage.
# Returns (title, date, author, body), or ("0", "0", "0", "0") if the article is skipped.
def scrape(url, wait_times=(1, 5, 30), printErrors=True):
  errors = 0
  # Run until scrape is successful or the article is skipped
  while True:
    soup = None  # reset so the error handler never reads an undefined or stale page
    # Get html and scrape elements
    try:
      # timeout keeps a stalled connection from hanging the whole run
      response = requests.get(url, timeout=30)
      soup = BeautifulSoup(response.content, 'html.parser')
      title = soup.find_all('h1')[0].text.strip()
      date = soup.find_all('time')[0].text.strip()
      p = [elem.text.strip() for elem in soup.find_all('p')]
      # p[0] is treated as the byline: drop its first two characters and its last one
      author = p[0][2:-1]
      # The body is taken to start at the third <p>
      body = "\n".join(p[2:])
      break

    # Print html (if any was fetched) and wait a bit if there's an error
    except Exception:
      errors += 1
      print(f"({errors}) Scraping error")
      if printErrors and soup is not None: print(soup.prettify())
      if len(wait_times) > 1: time.sleep(wait_times[errors])
      else: time.sleep(wait_times[0])

      # Skip article if the article cannot be scraped
      if errors >= len(wait_times) - 1:
        print("(!) Skipping article...")
        return "0", "0", "0", "0"

  time.sleep(wait_times[0])
  return title, date, author, body

# Test - print elements (unpack in the order scrape() returns them: title, date, author, body)
title, date, author, body = scrape("https://www.forbes.com/sites/jonmarkman/2025/11/04/you-missed-nvidia-because-you-think-in-straight-lines/")
print(title)
print("")
print(author)
print("")
print(date)
print("")
print(body)

You Missed Nvidia Because You Think In Straight Lines

Jon Markman

Nov 04, 2025, 10:57am EST

Your greatest risk in this market isn’t what you own. It's the mental model you use to value it.
Investors often struggle with an ingrained inability to move beyond linear thinking. They dismiss transformative technologies as bubbles because their intuition can’t grasp the fundamentals of exponential math. Exponential growth fundamentally differs from linear progress. Small, iterative doublings lead to massive scale quickly, unlike steady, incremental increases.
Some of you may be aware of the A4 paper folding problem. A standard A4 is 1 mm thick. The first few folds are trivial, adding a few millimeters. This is the deceptive early slowness where linear forecasts tell investors to sell. But keep going: Folding the same sheet 50 times leads to thickness that exceeds the distance to the Moon. Exponential processes defy intuition through sudden, vast scale changes.
Our cognitive bias, an expone
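
The retry behaviour inside `scrape` can also be factored out so it can be exercised without touching the network. The following is a minimal sketch, not the notebook's method: `retry`, `fetch`, and the injectable `sleep` parameter are hypothetical names, and unlike `scrape` it treats every entry of `wait_times` as one attempt and does not sleep after the last failure.

```python
import time

def retry(fetch, wait_times=(1, 5, 30), sleep=time.sleep):
    """Call fetch() until it succeeds or the waits run out.

    Hypothetical helper: returns fetch()'s result, or None once
    every attempt has failed.
    """
    for attempt, wait in enumerate(wait_times, start=1):
        try:
            return fetch()
        except Exception as exc:
            print(f"({attempt}) Scraping error: {exc!r}")
            # No point sleeping after the final attempt
            if attempt < len(wait_times):
                sleep(wait)
    return None


# Usage with a stand-in fetcher that fails twice, then succeeds
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("page not ready")
    return "ok"

waits = []  # records the requested waits instead of actually sleeping
result = retry(flaky, wait_times=(1, 5, 30), sleep=waits.append)
```

Passing `sleep` as a parameter is only a testing convenience; in the notebook the default `time.sleep` would be used.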

In [3]:
# Open CSV containing links
df = pd.read_csv("forbes_search.csv")  # read_csv already returns a DataFrame
links = df['Link'].tolist()
numLinks = len(links)

In [4]:
# One placeholder row per link; "0" in the first field marks an article not yet scraped
data = [["0"] for i in range(numLinks)]

# Loop through articles until all articles have been scraped
# Note: this loop never ends if some article can never be scraped
passNum = 1
while True:
  for i in range(numLinks):
    if data[i][0] == "0":
      print(f"Scraping article {i+1} of {numLinks}")
      data[i] = scrape(links[i], [random.uniform(1,3)], False)

  # Post-pass
  passNum += 1
  acquired = 0
  for i in range(numLinks):
    if data[i][0] != "0": acquired += 1
  print(f"Scraped {acquired} out of {numLinks} articles")
  if acquired == numLinks: break
  print(f"Waiting before starting pass {passNum}...")
  time.sleep(5)
print("Scraping Complete")

Scraping article 1 of 300
Scraping article 2 of 300
Scraping article 3 of 300
Scraping article 4 of 300
Scraping article 5 of 300
Scraping article 6 of 300
Scraping article 7 of 300
Scraping article 8 of 300
Scraping article 9 of 300
Scraping article 10 of 300
Scraping article 11 of 300
Scraping article 12 of 300
Scraping article 13 of 300
Scraping article 14 of 300
Scraping article 15 of 300
Scraping article 16 of 300
Scraping article 17 of 300
Scraping article 18 of 300
(1) Scraping error
(!) Skipping article...
Scraping article 19 of 300
Scraping article 20 of 300
Scraping article 21 of 300
Scraping article 22 of 300
Scraping article 23 of 300
Scraping article 24 of 300
Scraping article 25 of 300
Scraping article 26 of 300
Scraping article 27 of 300
Scraping article 28 of 300
Scraping article 29 of 300
Scraping article 30 of 300
Scraping article 31 of 300
Scraping article 32 of 300
Scraping article 33 of 300
Scraping article 34 of 300
Scraping article 35 of 300
Scraping article 36 o
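
Because the loop above repeats until every article succeeds, a single permanently broken link would keep it running indefinitely. One possible variation caps the number of passes and reports what is still missing. This is a sketch under assumptions: `scrape_all`, `scrape_fn`, and `max_passes` are hypothetical names, the `"0"` failure marker follows the convention used by `scrape`, and the pauses between requests and passes are left out for brevity.

```python
def scrape_all(links, scrape_fn, max_passes=3):
    """Scrape every link, retrying failures for at most max_passes passes.

    scrape_fn(link) is assumed to return a 4-tuple whose first field
    is "0" on failure.
    """
    # Full-width placeholder rows so the result always has four columns
    data = [("0", "0", "0", "0")] * len(links)
    for _ in range(max_passes):
        pending = [i for i, row in enumerate(data) if row[0] == "0"]
        if not pending:
            break
        for i in pending:
            data[i] = scrape_fn(links[i])
    failed = [links[i] for i, row in enumerate(data) if row[0] == "0"]
    return data, failed


# Usage with a stand-in scraper: the link "bad" never succeeds
attempts = []

def fake_scrape(link):
    attempts.append(link)
    if link == "bad":
        return ("0", "0", "0", "0")
    return (f"Title of {link}", "date", "author", "body")

data, failed = scrape_all(["a", "bad", "c"], fake_scrape, max_passes=3)
```

Returning the failed links separately leaves the decision to the caller: retry them later, drop them, or keep the placeholder rows.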

In [11]:
# Export data as CSV
df_final = pd.DataFrame(data, columns=['Title', 'Time', 'Author', 'Body'])
df_final.insert(0, 'Link', df['Link'])
df_final.to_csv('forbes_articles.csv', index=True)
df_final.head(10)

Unnamed: 0,Link,Title,Time,Author,Body
0,https://www.forbes.com/sites/solrashidi/2025/1...,OpenAI’s Strategic Shift: What The AMD And Nvi...,"Oct 09, 2025, 09:00am EDT",Sol Rashidi,"AMD, Advanced Micro Devices saw a 23.71% surge..."
1,https://www.forbes.com/sites/greatspeculations...,Nvidia Stock 2x To $350?,"Oct 07, 2025, 05:30am EDT",Trefis Team,Will Nvidia stock (NASDAQ: NVDA) reach $350 in...
2,https://www.forbes.com/sites/devpatnaik/2025/1...,The Lesson from Intel And Nvidia: Choose Actio...,"Oct 06, 2025, 04:42pm EDT",Dev Patnaik,If I told you back in 2010 that a scrappy make...
3,https://www.forbes.com/sites/petercohan/2025/0...,"Nvidia Stock Up 1,124%. Other Winners And Whet...","Sep 24, 2025, 01:31pm EDT",Peter Cohan,"Nvidia's $100B investment in OpenAI, linked to..."
4,https://www.forbes.com/sites/saibala/2025/09/2...,Nvidia-Powered Diligent Robotics Is Optimizing...,"Sep 23, 2025, 08:30am EDT","Dr. Sai Balasubramanian, M.D., J.D.",With rapid advancements in artificial intellig...
5,https://www.forbes.com/sites/greatspeculations...,MRVL Stock vs. NVIDIA,"Sep 23, 2025, 10:15am EDT",Trefis Team,Marvell Technology stock recently jumped 12% i...
6,https://www.forbes.com/sites/iainmartin/2025/0...,Nvidia Keeps Minting New Coreweave-Style AI Da...,"Sep 22, 2025, 01:41pm EDT",Iain Martin,“I’ve never seen a startup take off like that ...
7,https://www.forbes.com/sites/antoniopequenoiv/...,Nvidia/OpenAI Deal: Chipmaker Investing 100 Bi...,"Sep 22, 2025, 02:31pm EDT",Antonio Pequeño IV,Chip designer Nvidia will invest up to $100 bi...
8,https://www.forbes.com/sites/paulocarvao/2025/...,Nvidia/OpenAI $100 Billion Deal Fuels AI As UN...,"Sep 22, 2025, 09:23pm EDT",Paulo Carvão,Nvidia’s $100 billion investment in OpenAI mad...
9,https://www.forbes.com/sites/daniellechemtob/2...,Forbes Daily: Intel Shares Spike After Deal Wi...,"Sep 19, 2025, 07:55am EDT",Danielle Chemtob,"andForbes Daily,\nForbes Staff.\nFans have lon..."
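
The `Time` column is stored as display strings such as `Oct 09, 2025, 09:00am EDT`, which do not sort or compare chronologically. Below is a minimal sketch of one way to turn them into timezone-aware datetimes. The helper `parse_forbes_time` and the `TZ_OFFSETS` table are hypothetical, the fixed offsets cover only the two abbreviations that appear in the output above, and `strptime` is assumed to run under an English locale.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed UTC offsets for the abbreviations seen in the scraped data
TZ_OFFSETS = {
    "EST": timezone(timedelta(hours=-5)),
    "EDT": timezone(timedelta(hours=-4)),
}

def parse_forbes_time(text):
    """Parse e.g. 'Oct 09, 2025, 09:00am EDT'; return None if it does not fit."""
    # Split the trailing timezone abbreviation off the rest of the stamp
    stamp, _, abbrev = text.strip().rpartition(" ")
    tz = TZ_OFFSETS.get(abbrev)
    if tz is None:
        return None
    try:
        naive = datetime.strptime(stamp, "%b %d, %Y, %I:%M%p")
    except ValueError:
        return None
    return naive.replace(tzinfo=tz)


parsed = parse_forbes_time("Oct 09, 2025, 09:00am EDT")
print(parsed.isoformat())
```

Placeholder rows that still hold `"0"` come back as `None`, so they can be filtered out before any date-based analysis.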
