# Tesla & GameStop — Stock vs Revenue Dashboard (Final Project)

*Generated on: 2025-08-24 10:05:11*

This notebook answers Questions 1–7:

1. Extract Tesla stock data using **yfinance**
2. Extract Tesla revenue data using **web scraping**
3. Extract GameStop stock data using **yfinance**
4. Extract GameStop revenue data using **web scraping**
5. Build a dashboard for **Tesla** (Price vs Revenue)
6. Build a dashboard for **GameStop** (Price vs Revenue)
7. Share the assignment notebook

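**Note:** Run this notebook with an active internet connection and with `pandas`, `numpy`, `matplotlib`, `yfinance`, `requests`, and `beautifulsoup4` installed. Run the Setup cell first: later cells depend on its imports and helpers, and a failed Setup cell leads to the `ModuleNotFoundError` and `NameError` outputs recorded in this export.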
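Before running the cells, it can help to confirm that the packages the notebook imports are actually installed. The following is a minimal sketch using only the standard library; `REQUIRED_MODULES` and `missing_modules` are hypothetical names introduced here, and the list is an assumption based on the imports that appear in this notebook.

```python
import importlib.util

# Import names used in this notebook (assumed list; "bs4" is provided by the
# beautifulsoup4 package on PyPI).
REQUIRED_MODULES = ["pandas", "numpy", "matplotlib", "yfinance", "requests", "bs4"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

Any module reported as missing can be installed with `pip install` before re-running the Setup cell.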


## Setup — Imports & Helpers

In [3]:
import sys, warnings, re, io
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Display options
pd.set_option('display.max_rows', 10)
pd.set_option('display.max_columns', 20)
pd.set_option('display.width', 120)

# Helper: safe import of yfinance
def safe_import_yfinance():
    try:
        import yfinance as yf
        return yf
    except ImportError:
        print("yfinance is not available. Please install it with: pip install yfinance")
        raise

# Helper: check internet connectivity (simple DNS test)
def has_internet():
    try:
        import socket
        socket.gethostbyname("www.google.com")
        return True
    except Exception:
        return False

warnings.filterwarnings('ignore')
print("Environment ready.")


ModuleNotFoundError: No module named 'pandas'

## Question 1 — Extracting Tesla Stock Data Using yfinance (2 pts)

In [4]:
yf = safe_import_yfinance()
if not has_internet():
    raise RuntimeError("No internet connection detected. Please connect to the internet and re-run this cell.")

tsla = yf.Ticker("TSLA")
tsla_hist = tsla.history(period="max")
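tsla_hist.reset_index(inplace=True)
# Recent yfinance versions return timezone-aware dates; drop the timezone so the
# CSV reads back as plain datetimes (mixed UTC offsets otherwise break parse_dates)
tsla_hist["Date"] = tsla_hist["Date"].dt.tz_localize(None)
# Keep only relevant columns
tsla_hist = tsla_hist[["Date", "Open", "High", "Low", "Close", "Volume", "Dividends", "Stock Splits"]]
# history() adjusts prices by default (auto_adjust=True), so 'Close' is already the adjusted close
tsla_hist = tsla_hist.rename(columns={"Close": "Adj Close"})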

# Save and preview
tsla_hist.to_csv("TSLA_stock.csv", index=False)
print("Saved TSLA_stock.csv")
tsla_hist.head(10)


NameError: name 'safe_import_yfinance' is not defined

## Question 2 — Extracting Tesla Revenue Data Using Web Scraping (1 pt)

In [5]:
import requests
from bs4 import BeautifulSoup

if not has_internet():
    raise RuntimeError("No internet connection detected. Please connect to the internet and re-run this cell.")

tesla_url = "https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue"
headers = {"User-Agent": "Mozilla/5.0"}
resp = requests.get(tesla_url, headers=headers, timeout=30)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, 'html.parser')

# Find the 'Tesla Quarterly Revenue' table by heading text proximity
tables = soup.find_all("table")
target = None
for t in tables:
    if t.find("th") and "Quarterly Revenue" in t.get_text():
        target = t
        break
if target is None:
    # fallback: look for series in script tag (Macrotrends often embeds in JS); simple fallback omitted for brevity
    raise ValueError("Could not find Tesla Quarterly Revenue table. The page structure may have changed.")

rows = []
for tr in target.find_all("tr"):
    cols = [c.get_text(strip=True) for c in tr.find_all(["td","th"])]
    if len(cols) >= 2 and re.match(r"\d{4}-\d{2}", cols[0]):
        date = cols[0]
        revenue = cols[1].replace("$","").replace(",","")
        try:
            revenue = float(revenue)  # in millions USD typically
        except ValueError:
            continue  # skip rows whose revenue cell is empty or non-numeric
        rows.append((date, revenue))

tsla_revenue = pd.DataFrame(rows, columns=["Date","Revenue (Millions USD)"])
tsla_revenue["Date"] = pd.to_datetime(tsla_revenue["Date"])
tsla_revenue = tsla_revenue.sort_values("Date").reset_index(drop=True)
tsla_revenue.to_csv("TSLA_revenue.csv", index=False)
print("Extracted rows:", len(tsla_revenue))
tsla_revenue.tail(10)


ModuleNotFoundError: No module named 'requests'
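The row-filtering logic in the cell above can be tried without network access by running it on plain lists of cell text. This is a minimal sketch under that assumption: `parse_revenue_rows` and `DATE_PATTERN` are hypothetical names, and the sample rows are made-up values, not scraped data.

```python
import re

# Same date test as the scraping cell: the first cell must start with YYYY-MM.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}")


def parse_revenue_rows(table_rows):
    """Turn rows of cell text into (date, revenue) tuples, skipping unusable rows."""
    parsed = []
    for cols in table_rows:
        if len(cols) < 2 or not DATE_PATTERN.match(cols[0]):
            continue  # header rows and malformed rows
        cleaned = cols[1].replace("$", "").replace(",", "")
        try:
            parsed.append((cols[0], float(cleaned)))
        except ValueError:
            continue  # empty or non-numeric revenue cell
    return parsed


# Made-up rows for illustration only.
sample_rows = [
    ["Date", "Revenue"],        # header row, rejected by the date pattern
    ["2021-03-31", "$1,234"],   # kept, parsed as 1234.0
    ["2020-12-31", ""],         # dropped, empty revenue cell
]
print(parse_revenue_rows(sample_rows))  # [('2021-03-31', 1234.0)]
```

Keeping the parsing separate from the HTTP request makes it possible to check the cleaning rules even when the page cannot be fetched.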

## Question 3 — Extracting GameStop Stock Data Using yfinance (2 pts)

In [6]:
yf = safe_import_yfinance()
if not has_internet():
    raise RuntimeError("No internet connection detected. Please connect to the internet and re-run this cell.")

gme = yf.Ticker("GME")
gme_hist = gme.history(period="max")
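gme_hist.reset_index(inplace=True)
# Drop the timezone so the CSV reads back as plain datetimes
gme_hist["Date"] = gme_hist["Date"].dt.tz_localize(None)
gme_hist = gme_hist[["Date", "Open", "High", "Low", "Close", "Volume", "Dividends", "Stock Splits"]]
# history() adjusts prices by default (auto_adjust=True), so 'Close' is already the adjusted close
gme_hist = gme_hist.rename(columns={"Close": "Adj Close"})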

gme_hist.to_csv("GME_stock.csv", index=False)
print("Saved GME_stock.csv")
gme_hist.head(10)


NameError: name 'safe_import_yfinance' is not defined
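The dashboard function later in this notebook reads the saved stock CSVs and expects a `Date` column plus a price column. A quick header check can catch a missing column before plotting. The sketch below uses only the standard library; `missing_csv_columns` is a hypothetical helper, and the demo file holds made-up values.

```python
import csv
import os
import tempfile


def missing_csv_columns(path, required):
    """Return the required column names that are absent from the CSV header."""
    with open(path, newline="") as f:
        header = next(csv.reader(f), [])
    return [col for col in required if col not in header]


# Demo on a throwaway file with made-up values; in the notebook, pass
# "TSLA_stock.csv" or "GME_stock.csv" instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_stock.csv")
    with open(demo_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Date", "Open", "Adj Close"])
        writer.writerow(["2020-01-02", "1.0", "1.5"])
    demo_missing = missing_csv_columns(demo_path, ["Date", "Adj Close", "Volume"])
    print(demo_missing)  # ['Volume']
```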

## Question 4 — Extracting GameStop Revenue Data Using Web Scraping (1 pt)

In [7]:
import requests
from bs4 import BeautifulSoup

if not has_internet():
    raise RuntimeError("No internet connection detected. Please connect to the internet and re-run this cell.")

gme_url = "https://www.macrotrends.net/stocks/charts/GME/gamestop/revenue"
headers = {"User-Agent": "Mozilla/5.0"}
resp = requests.get(gme_url, headers=headers, timeout=30)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, 'html.parser')

# Find the 'GameStop Quarterly Revenue' table
tables = soup.find_all("table")
target = None
for t in tables:
    if t.find("th") and "Quarterly Revenue" in t.get_text():
        target = t
        break
if target is None:
    raise ValueError("Could not find GameStop Quarterly Revenue table. The page structure may have changed.")

rows = []
for tr in target.find_all("tr"):
    cols = [c.get_text(strip=True) for c in tr.find_all(["td","th"])]
    if len(cols) >= 2 and re.match(r"\d{4}-\d{2}", cols[0]):
        date = cols[0]
        revenue = cols[1].replace("$","").replace(",","")
        try:
            revenue = float(revenue)
        except ValueError:
            continue  # skip rows whose revenue cell is empty or non-numeric
        rows.append((date, revenue))

gme_revenue = pd.DataFrame(rows, columns=["Date","Revenue (Millions USD)"])
gme_revenue["Date"] = pd.to_datetime(gme_revenue["Date"])
gme_revenue = gme_revenue.sort_values("Date").reset_index(drop=True)
gme_revenue.to_csv("GME_revenue.csv", index=False)
print("Extracted rows:", len(gme_revenue))
gme_revenue.tail(10)


ModuleNotFoundError: No module named 'requests'

## Questions 5 & 6 — Dashboards (Price vs Revenue) (2 + 2 pts)

In [8]:
def make_price_revenue_dashboard(stock_csv, revenue_csv, title="Dashboard"):
    stock = pd.read_csv(stock_csv, parse_dates=["Date"])
    rev = pd.read_csv(revenue_csv, parse_dates=["Date"])
    # Use 'Adj Close' if present, else 'Close'
    price_col = "Adj Close" if "Adj Close" in stock.columns else "Close"

    # Resample stock prices to quarterly end to match revenue cadence
    stock_q = stock.set_index("Date").resample("Q")[price_col].last().rename("Price").to_frame().reset_index()
    rev_q = rev.rename(columns={rev.columns[1]: "Revenue"}).copy()
    # Some revenue series are already quarterly by date; ensure quarter end alignment
    rev_q["Date"] = rev_q["Date"] + pd.offsets.QuarterEnd(0)

    df = pd.merge_asof(stock_q.sort_values("Date"), rev_q.sort_values("Date"), on="Date", direction="nearest", tolerance=pd.Timedelta("45D"))
    df.dropna(inplace=True)

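    fig, ax1 = plt.subplots(figsize=(10,5))
    # Set colors explicitly: each axis restarts the default color cycle,
    # so both lines would otherwise be drawn in the same color
    ax1.plot(df["Date"], df["Price"], linewidth=2, color="tab:blue", label="Price")
    ax1.set_xlabel("Date")
    ax1.set_ylabel("Price (USD)")
    ax1.set_title(title)

    ax2 = ax1.twinx()
    ax2.plot(df["Date"], df["Revenue"], linewidth=2, linestyle="--", color="tab:orange", label="Revenue")
    ax2.set_ylabel("Revenue (Millions USD)")
    # Combine the lines from both axes into a single legend
    lines = list(ax1.get_lines()) + list(ax2.get_lines())
    ax1.legend(lines, [line.get_label() for line in lines], loc="upper left")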

    plt.tight_layout()
    plt.show()
    return df

print("Functions loaded. Next cells will build the dashboards.")


Functions loaded. Next cells will build the dashboards.
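The alignment inside `make_price_revenue_dashboard` has two steps: revenue dates are rolled forward to the end of their calendar quarter, and each quarterly price is then paired with the nearest revenue date within 45 days. The pure-Python sketch below illustrates that logic on plain `date` objects. It is not the pandas implementation; `quarter_end` and `nearest_within` are hypothetical names introduced for illustration.

```python
from datetime import date, timedelta


def quarter_end(d):
    """Roll a date forward to the last day of its calendar quarter (Mar/Jun/Sep/Dec)."""
    end_month = ((d.month - 1) // 3 + 1) * 3
    # Take the first day of the month after the quarter, then step back one day.
    if end_month == 12:
        first_of_next = date(d.year + 1, 1, 1)
    else:
        first_of_next = date(d.year, end_month + 1, 1)
    return first_of_next - timedelta(days=1)


def nearest_within(target, candidates, tolerance_days=45):
    """Return the candidate closest to target, or None if none is within the tolerance."""
    if not candidates:
        return None
    best = min(candidates, key=lambda c: abs((c - target).days))
    return best if abs((best - target).days) <= tolerance_days else None


print(quarter_end(date(2021, 2, 15)))    # 2021-03-31
print(quarter_end(date(2021, 12, 31)))   # 2021-12-31 (already a quarter end)
print(nearest_within(date(2021, 3, 31), [date(2020, 12, 31)]))  # None (90 days apart)
```

One consequence worth noting: a revenue row dated in the first month of a quarter is moved to that quarter's end, so it is paired with the price at the end of that quarter rather than the price on the original date.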


### Tesla — Stock vs Revenue Dashboard

In [9]:
_ = make_price_revenue_dashboard("TSLA_stock.csv", "TSLA_revenue.csv", title="Tesla: Price vs Revenue")


NameError: name 'pd' is not defined

### GameStop — Stock vs Revenue Dashboard

In [10]:
_ = make_price_revenue_dashboard("GME_stock.csv", "GME_revenue.csv", title="GameStop: Price vs Revenue")


NameError: name 'pd' is not defined

## Question 7 — Sharing Your Assignment Notebook (2 pts)

- Save this notebook after execution: **File → Save As…** and export as needed (HTML or PDF) for submission.
- Include your screenshots for each question (head/tail previews and rendered dashboards).
- Ensure your environment has internet access when running Questions 1–4.
