# Extracting and Visualizing Stock Data

In [1]:
from plotly.subplots import make_subplots
from bs4 import BeautifulSoup

import yfinance as yf
import pandas as pd
import plotly.graph_objects as go
import plotly.io as pio
import requests
import re

In [2]:
pio.renderers.default = "iframe"
# If you are in mainland China, set a proxy; otherwise yfinance may raise
# `YFRateLimitError: Too Many Requests`. Replace the address below with your own proxy.
yf.set_config("http://127.0.0.1:10808")

## Define Graphing Function

<p>In this section, we define the function <code>make_graph</code>. <b>You don't need to know how the function works; only its inputs matter. It takes a dataframe with stock data (which must contain <code>Date</code> and <code>Close</code> columns), a dataframe with revenue data (which must contain <code>Date</code> and <code>Revenue</code> columns), and the name of the stock.</b></p>

In [3]:
def make_graph(stock_data: pd.DataFrame, revenue_data: pd.DataFrame, stock: str) -> None:
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=("Historical Share Price", "Historical Revenue"), vertical_spacing=0.3)

    stock_data_specific = stock_data[stock_data.Date <= "2021-06-14"]
    revenue_data_specific = revenue_data[revenue_data.Date <= "2021-04-30"]

    fig.add_trace(go.Scatter(x=pd.to_datetime(stock_data_specific.Date, format="%Y-%m-%d"), y=stock_data_specific.Close.astype(float), name="Stock Price"), row=1, col=1)
    fig.add_trace(go.Scatter(x=pd.to_datetime(revenue_data_specific.Date, format="%Y-%m-%d"), y=revenue_data_specific.Revenue.astype(float), name="Revenue"), row=2, col=1)
    fig.update_xaxes(title_text="Date", row=1, col=1, showticklabels=True)
    fig.update_xaxes(title_text="Date", row=2, col=1, showticklabels=True)
    fig.update_yaxes(title_text="Price ($US)", row=1, col=1)
    fig.update_yaxes(title_text="Revenue ($US Millions)", row=2, col=1)
    fig.update_layout(showlegend=False, height=900, title=stock, xaxis_rangeslider_visible=True)
    fig.show()

<p>Use the make_graph function that we’ve already defined. You’ll need to invoke it in questions 5 and 6 to display the graphs and create the dashboard.</p>

> <p><b>Note: You don’t need to redefine the function for plotting graphs anywhere else in this notebook; just use the existing function.</b></p>
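
<p>As a rough illustration of the inputs <code>make_graph</code> expects, the sketch below builds two tiny dataframes with made-up values and checks that the required columns are present. The helper name <code>has_required_columns</code> and the sample numbers are assumptions for illustration, not part of the notebook.</p>

```python
import pandas as pd

# Made-up sample data in the shape make_graph expects.
sample_stock = pd.DataFrame({"Date": ["2021-01-04", "2021-01-05"], "Close": [10.0, 10.5]})
sample_revenue = pd.DataFrame({"Date": ["2020-12-31", "2021-03-31"], "Revenue": ["100", "120"]})


def has_required_columns(df: pd.DataFrame, required: list[str]) -> bool:
    """Return True if every required column name is present in df (hypothetical helper)."""
    return set(required).issubset(df.columns)


stock_ok = has_required_columns(sample_stock, ["Date", "Close"])
revenue_ok = has_required_columns(sample_revenue, ["Date", "Revenue"])
print(stock_ok, revenue_ok)
```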

## Question 1: Use yfinance to Extract Stock Data

<p>Use the <code>Ticker</code> function to create a ticker object for the stock we want to extract data on. The stock is Tesla, and its ticker symbol is <code>TSLA</code>.</p>

In [4]:
tesla = yf.Ticker("TSLA")

<p>Using the ticker object and its <code>history</code> function, extract the stock information and save it in a dataframe named <code>tesla_data</code>. Set the <code>period</code> parameter to <code>"max"</code> so we get information for the maximum amount of time.</p>

In [5]:
tesla_data = tesla.history(period="max")
tesla_data.tail()

Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
2025-07-28 00:00:00-04:00,318.450012,330.48999,315.690002,325.589996,112673800,0.0,0.0
2025-07-29 00:00:00-04:00,325.549988,326.25,318.25,321.200012,87358900,0.0,0.0
2025-07-30 00:00:00-04:00,322.179993,324.450012,311.619995,319.040009,83931900,0.0,0.0
2025-07-31 00:00:00-04:00,319.609985,321.369995,306.100006,308.269989,85270900,0.0,0.0
2025-08-01 00:00:00-04:00,306.209991,309.309998,297.820007,302.630005,88838600,0.0,0.0


<p><b>Reset the index</b> using the <code>reset_index(inplace=True)</code> function on the <code>tesla_data</code> DataFrame and display the first five rows of the <code>tesla_data</code> dataframe using the <code>head()</code> function. Take a screenshot of the results and code from the beginning of Question 1 to the results below.</p>

In [6]:
tesla_data.reset_index(inplace=True)
tesla_data.head()

,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
0,2010-06-29 00:00:00-04:00,1.266667,1.666667,1.169333,1.592667,281494500,0.0,0.0
1,2010-06-30 00:00:00-04:00,1.719333,2.028,1.553333,1.588667,257806500,0.0,0.0
2,2010-07-01 00:00:00-04:00,1.666667,1.728,1.351333,1.464,123282000,0.0,0.0
3,2010-07-02 00:00:00-04:00,1.533333,1.54,1.247333,1.28,77097000,0.0,0.0
4,2010-07-06 00:00:00-04:00,1.333333,1.333333,1.055333,1.074,103003500,0.0,0.0
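
<p>The effect of resetting the index can be seen on a small example. The sketch below uses made-up prices indexed by date, roughly mimicking the shape returned by <code>history</code>; after the reset, <code>Date</code> becomes an ordinary column, which is what <code>make_graph</code> relies on when it reads <code>stock_data.Date</code>.</p>

```python
import pandas as pd

# Made-up prices indexed by date (illustrative values only).
prices = pd.DataFrame(
    {"Close": [1.59, 1.58, 1.46]},
    index=pd.DatetimeIndex(["2010-06-29", "2010-06-30", "2010-07-01"], name="Date"),
)

# Before the reset, Date is the index, not a column.
date_is_column_before = "Date" in prices.columns

prices.reset_index(inplace=True)

# After the reset, Date is a regular column and the index is 0, 1, 2.
date_is_column_after = "Date" in prices.columns
print(prices.columns.tolist())
```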


## Question 2: Use Web Scraping to Extract Tesla Revenue Data

<p>Use the <code>requests</code> library to download the <a href="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/revenue.htm">webpage</a>. Save the text of the response as a variable named <code>html_data</code>.</p>

In [7]:
html_data = requests.get(url="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/revenue.htm").text

<p>Parse the HTML data with <code>BeautifulSoup</code>, using a parser such as <code>html5lib</code> or <code>html.parser</code>.</p>

In [8]:
soup = BeautifulSoup(html_data, "html.parser")

<p>Using <code>BeautifulSoup</code> or the <code>read_html()</code> function, extract the table with <code>Tesla Revenue</code> and store it in a dataframe named <code>tesla_revenue</code>. The dataframe should have the columns <code>Date</code> and <code>Revenue</code>.</p>

In [9]:
tesla_tables = soup.find_all("table")

# Position of the first table whose markup mentions the quarterly revenue text.
tesla_revenue_table_index = next((index for index, table in enumerate(tesla_tables) if "Tesla Quarterly Revenue" in str(table)), None)
# Compare against None explicitly: index 0 is a valid match but is falsy.
if tesla_revenue_table_index is None:
    raise ValueError("No revenue table found.")

rows = []
for row in tesla_tables[tesla_revenue_table_index].tbody.find_all("tr"):
    col = row.find_all("td")
    if not col:
        continue  # skip rows without data cells

    rows.append({"Date": col[0].text, "Revenue": col[1].text})

# Build the dataframe once instead of concatenating inside the loop.
tesla_revenue = pd.DataFrame(rows, columns=["Date", "Revenue"])
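
<p>One detail worth noting when locating a table by position with <code>next(..., None)</code>: a match at index 0 is falsy in Python, so testing the result with <code>not</code> would treat a valid match as missing. A minimal sketch, with made-up strings standing in for the parsed tables:</p>

```python
# Stand-ins for the parsed tables; the strings are made up for illustration.
tables = ["<table>Quarterly Revenue</table>", "<table>Annual Revenue</table>"]

# Position of the first table containing the marker text, or None if absent.
match_index = next((i for i, t in enumerate(tables) if "Quarterly Revenue" in t), None)

found_by_truthiness = bool(match_index)         # False, because index 0 is falsy
found_by_none_check = match_index is not None   # True, a match exists at position 0
print(match_index, found_by_truthiness, found_by_none_check)
```

<p>Comparing against <code>None</code> keeps the "not found" case distinct from a match in the first position.</p>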

<p>Execute the following line to remove the commas and dollar signs from the <code>Revenue</code> column.</p>

In [10]:
tesla_revenue["Revenue"] = tesla_revenue["Revenue"].apply(lambda x: re.sub(r",|\$", "", x))
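
<p>As an alternative sketch (not the notebook's method), the same cleanup can be written with the pandas string accessor instead of <code>apply</code>. The sample values below are made up for illustration.</p>

```python
import pandas as pd

# Made-up revenue strings in the same style as the scraped table.
raw = pd.Series(["$1,667", "$534", ""])

# The character class [$,] matches either a dollar sign or a comma.
cleaned = raw.str.replace(r"[$,]", "", regex=True)
print(cleaned.tolist())
```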

<p>Execute the following lines to remove any null values or empty strings in the <code>Revenue</code> column.</p>

In [11]:
tesla_revenue.dropna(inplace=True)
tesla_revenue = tesla_revenue[tesla_revenue["Revenue"] != ""]
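
<p>The two filtering steps can be tried on a small dataframe. The rows below are made up for illustration; note that the surviving rows keep their original index labels, so gaps in the index are expected after filtering.</p>

```python
import pandas as pd

# Made-up rows: label 1 has an empty revenue string, label 3 a missing value.
revenue = pd.DataFrame({
    "Date": ["2010-03-31", "2009-12-31", "2009-09-30", "2009-06-30"],
    "Revenue": ["21", "", "46", None],
})

revenue = revenue.dropna()                    # drops the row with the missing value
revenue = revenue[revenue["Revenue"] != ""]   # drops the row with the empty string

print(revenue.index.tolist())
```

<p>If consecutive labels are preferred, <code>reset_index(drop=True)</code> renumbers the rows.</p>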

<p>Display the last five rows of the <code>tesla_revenue</code> dataframe using the <code>tail</code> function. Take a screenshot of the results.</p>

In [12]:
tesla_revenue.tail()

,Date,Revenue
48,2010-09-30,31
49,2010-06-30,28
50,2010-03-31,21
52,2009-09-30,46
53,2009-06-30,27


## Question 3: Use yfinance to Extract Stock Data

<p>Use the <code>Ticker</code> function to create a ticker object for the stock we want to extract data on. The stock is GameStop, and its ticker symbol is <code>GME</code>.</p>

In [13]:
gme = yf.Ticker("GME")

<p>Using the ticker object and its <code>history</code> function, extract the stock information and save it in a dataframe named <code>gme_data</code>. Set the <code>period</code> parameter to <code>"max"</code> so we get information for the maximum amount of time.</p>

In [14]:
gme_data = gme.history(period="max")

<p><b>Reset the index</b> using the <code>reset_index(inplace=True)</code> function on the <code>gme_data</code> dataframe and display the first five rows of the <code>gme_data</code> dataframe using the <code>head()</code> function. Take a screenshot of the results and code from the beginning of Question 3 to the results below.</p>

In [15]:
gme_data.reset_index(inplace=True)
gme_data.head()

,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
0,2002-02-13 00:00:00-05:00,1.620129,1.69335,1.603296,1.691667,76216000,0.0,0.0
1,2002-02-14 00:00:00-05:00,1.712707,1.716074,1.670626,1.683251,11021600,0.0,0.0
2,2002-02-15 00:00:00-05:00,1.68325,1.687458,1.658001,1.674834,8389600,0.0,0.0
3,2002-02-19 00:00:00-05:00,1.666418,1.666418,1.578047,1.607504,7410400,0.0,0.0
4,2002-02-20 00:00:00-05:00,1.61592,1.66221,1.603296,1.66221,6892800,0.0,0.0


## Question 4: Use Web Scraping to Extract GME Revenue Data

<p>Use the <code>requests</code> library to download the <a href="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/stock.html">webpage</a>. Save the text of the response as a variable named <code>html_data_2</code>.</p>

In [16]:
html_data_2 = requests.get(url="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/stock.html").text

<p>Parse the HTML data with <code>BeautifulSoup</code>, using a parser such as <code>html5lib</code> or <code>html.parser</code>.</p>

In [17]:
soup = BeautifulSoup(html_data_2, "html.parser")

<p>Using <code>BeautifulSoup</code> or the <code>read_html()</code> function, extract the table with <code>GameStop Revenue</code> and store it in a dataframe named <code>gme_revenue</code>. The dataframe should have the columns <code>Date</code> and <code>Revenue</code>. Make sure the commas and dollar signs are removed from the <code>Revenue</code> column.</p>

In [18]:
gme_tables = soup.find_all("table")

# Position of the first table whose markup mentions the quarterly revenue text.
gme_revenue_table_index = next((index for index, table in enumerate(gme_tables) if "GameStop Quarterly Revenue" in str(table)), None)
# Compare against None explicitly: index 0 is a valid match but is falsy.
if gme_revenue_table_index is None:
    raise ValueError("No revenue table found.")

rows = []
for row in gme_tables[gme_revenue_table_index].tbody.find_all("tr"):
    col = row.find_all("td")
    if not col:
        continue  # skip rows without data cells

    # Strip commas and dollar signs while reading each revenue value.
    rows.append({"Date": col[0].text, "Revenue": re.sub(r",|\$", "", col[1].text)})

# Build the dataframe once instead of concatenating inside the loop.
gme_revenue = pd.DataFrame(rows, columns=["Date", "Revenue"])

# Drop null values and rows whose revenue is an empty string.
gme_revenue.dropna(inplace=True)
gme_revenue = gme_revenue[gme_revenue["Revenue"] != ""]

<p>Display the last five rows of the <code>gme_revenue</code> dataframe using the <code>tail</code> function. Take a screenshot of the results.</p>

In [19]:
gme_revenue.tail()

,Date,Revenue
57,2006-01-31,1667
58,2005-10-31,534
59,2005-07-31,416
60,2005-04-30,475
61,2005-01-31,709
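
<p>The scraped values are still strings at this point; <code>make_graph</code> converts them itself with <code>astype(float)</code> and <code>pd.to_datetime</code>. If typed values are needed elsewhere, the conversion could be done up front. The sketch below uses three rows copied from the output above; the variable name <code>typed</code> and the chronological sort are assumptions for illustration.</p>

```python
import pandas as pd

# Three rows in the scraped format: newest first, values stored as strings.
revenue = pd.DataFrame({
    "Date": ["2005-07-31", "2005-04-30", "2005-01-31"],
    "Revenue": ["416", "475", "709"],
})

# Convert both columns to proper types, then order the rows oldest first.
typed = revenue.assign(
    Date=pd.to_datetime(revenue["Date"], format="%Y-%m-%d"),
    Revenue=revenue["Revenue"].astype(float),
).sort_values("Date", ignore_index=True)

print(typed.dtypes)
```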


## Question 5: Plot Tesla Stock Graph

<p>Use the <code>make_graph()</code> function to graph the Tesla stock data, and provide a title for the graph. Note that the graph will only show data up to June 2021.</p>

In [20]:
make_graph(tesla_data, tesla_revenue, "Tesla")
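
<p>The cutoff comes from <code>make_graph</code>, which keeps stock rows dated up to 2021-06-14 and revenue rows dated up to 2021-04-30. For the revenue dataframe, whose dates are plain strings, this is an ordinary string comparison. A minimal sketch of why that works, using made-up dates:</p>

```python
# Made-up ISO-formatted dates (YYYY-MM-DD), stored as plain strings.
dates = ["2021-07-31", "2021-04-30", "2021-01-31", "2020-10-31"]

cutoff = "2021-04-30"

# Zero-padded ISO dates order the same way as text and as calendar dates,
# so a string comparison keeps everything up to and including the cutoff.
kept = [d for d in dates if d <= cutoff]
print(kept)
```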

## Question 6: Plot GameStop Stock Graph

<p>Use the <code>make_graph()</code> function to graph the GameStop stock data, and provide a title for the graph. The call has the form <code>make_graph(gme_data, gme_revenue, "GameStop")</code>. Note that the graph will only show data up to June 2021.</p>

In [21]:
make_graph(gme_data, gme_revenue, "GameStop")
