<h1>Extracting and Visualizing Stock Data</h1>
<h2>Aims</h2>


Extracting the essential data from a dataset and displaying it is a necessary step in data science before anyone can make sound decisions based on that data. This project extracts stock data and displays it in a graph.


<h2>Table of Contents</h2>
<div class="alert alert-block alert-info" style="margin-top: 20px">
    <ul>
        <li>Define a Function that Makes a Graph</li>
        <li>Use yfinance to Extract Stock Data</li>
        <li>Use Webscraping to Extract Tesla Revenue Data</li>
        <li>Use yfinance to Extract Stock Data</li>
        <li>Use Webscraping to Extract GME Revenue Data</li>
        <li>Plot Tesla Stock Graph</li>
        <li>Plot GameStop Stock Graph</li>
    </ul>
</div>

<hr>


Import libraries

In [2]:
import yfinance as yf
import pandas as pd
import requests
from bs4 import BeautifulSoup
import plotly.graph_objects as go
from plotly.subplots import make_subplots

## Define Graphing Function


In this section, we define the function `make_graph`. It takes a dataframe of stock data (which must contain `Date` and `Close` columns), a dataframe of revenue data (which must contain `Date` and `Revenue` columns), and the name of the stock.


In [22]:
def make_graph(stock_data, revenue_data, stock):
    """Plot historical share price and revenue for one stock in two stacked panels."""
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=("Historical Share Price", "Historical Revenue"), vertical_spacing=.3)
    # Keep only rows up to the cutoff date so both panels cover the same period.
    stock_data_specific = stock_data[stock_data.Date <= '2021-06-14']
    revenue_data_specific = revenue_data[revenue_data.Date <= '2021-06-14']
    # infer_datetime_format is deprecated in recent pandas releases, so it is omitted here.
    fig.add_trace(go.Scatter(x=pd.to_datetime(stock_data_specific.Date), y=stock_data_specific.Close.astype("float"), name="Share Price"), row=1, col=1)
    fig.add_trace(go.Scatter(x=pd.to_datetime(revenue_data_specific.Date), y=revenue_data_specific.Revenue.astype("float"), name="Revenue"), row=2, col=1)
    fig.update_xaxes(title_text="Date", row=1, col=1)
    fig.update_xaxes(title_text="Date", row=2, col=1)
    fig.update_yaxes(title_text="Price ($US)", row=1, col=1)
    fig.update_yaxes(title_text="Revenue ($US Millions)", row=2, col=1)
    fig.update_layout(
        showlegend=False,
        height=900,
        title="{} Share Price & Revenue".format(stock),
        xaxis_rangeslider_visible=True,
    )
    fig.show()
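
The date cutoff inside `make_graph` compares the `Date` column against a date string. As a minimal sketch of the same filter in isolation, assuming the dates may arrive as plain strings (as they do in the scraped revenue table), the helper below parses them first. The name `filter_until` and the sample rows are hypothetical and not part of the notebook.

```python
import pandas as pd


def filter_until(df, cutoff, date_col="Date"):
    """Return the rows of df whose date_col is on or before cutoff."""
    # Parse the column explicitly so the comparison is chronological
    # and does not depend on how the dates happen to be stored.
    dates = pd.to_datetime(df[date_col])
    return df[dates <= pd.Timestamp(cutoff)]


# Made-up rows for illustration only.
sample = pd.DataFrame(
    {
        "Date": ["2021-03-31", "2021-06-14", "2021-09-30"],
        "Revenue": ["10", "12", "14"],
    }
)
filtered = filter_until(sample, "2021-06-14")
print(filtered)
```

Only the first two rows survive the filter, because the third date falls after the cutoff.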

## Use yfinance to Extract Stock Data


Using the `Ticker` class, we can create a ticker object that gives access to Yahoo Finance data for most publicly traded stocks. The first stock we will use is Tesla, whose ticker symbol is `TSLA`.

In [4]:
tesla = yf.Ticker("TSLA")

Using the ticker object and its `history` method, we extract stock information and save it in a dataframe named `tesla_data`. The `period` parameter is set to `max` so that we get the full available history.


In [5]:
tesla_data = tesla.history(period = "max")

Reset the index and display the first five rows of the `tesla_data` dataframe.


In [6]:
tesla_data.reset_index(inplace=True)
tesla_data.head(5)

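|   | Date | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|------|------|------|-----|-------|--------|-----------|--------------|
| 0 | 2010-06-29 | 3.8 | 5.0 | 3.508 | 4.778 | 93831500 | 0 | 0.0 |
| 1 | 2010-06-30 | 5.158 | 6.084 | 4.66 | 4.766 | 85935500 | 0 | 0.0 |
| 2 | 2010-07-01 | 5.0 | 5.184 | 4.054 | 4.392 | 41094000 | 0 | 0.0 |
| 3 | 2010-07-02 | 4.6 | 4.62 | 3.742 | 3.84 | 25699000 | 0 | 0.0 |
| 4 | 2010-07-06 | 4.0 | 4.0 | 3.166 | 3.222 | 34334500 | 0 | 0.0 |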
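
The index is reset because `history` returns a dataframe indexed by date, while `make_graph` reads `Date` as an ordinary column. A small sketch with made-up prices (not real Tesla data) illustrates the effect:

```python
import pandas as pd

# Made-up prices indexed by date, mimicking the shape of a price history.
prices = pd.DataFrame(
    {"Close": [1.0, 1.5, 1.25]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)
prices.index.name = "Date"

# reset_index moves the named index into a regular "Date" column
# and replaces it with a default 0..n-1 integer index.
flat = prices.reset_index()
print(list(flat.columns))  # ['Date', 'Close']
```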


## Use Webscraping to Extract Tesla Revenue Data


Now let's get Tesla's revenue data by web scraping the page https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue.


In [7]:
url = "https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue"
html_data = requests.get(url).text

Parse the HTML data using `BeautifulSoup`.


In [8]:
soup = BeautifulSoup(html_data, "html5lib")

Using Beautiful Soup, we extract the table containing `Tesla Quarterly Revenue` and store it in a dataframe named `tesla_revenue`.

In [9]:
tables = soup.find_all('table')

for index, table in enumerate(tables):
    if "Tesla Quarterly Revenue" in str(table):
        table_index = index
        
tesla_revenue = pd.read_html(str(tables[table_index]), flavor='bs4')[0]
tesla_revenue.columns = ["Date", "Revenue"]
tesla_revenue.tail(5)


|    | Date | Revenue |
|----|------|---------|
| 45 | 2010-03-31 | $21 |
| 46 | 2009-12-31 |  |
| 47 | 2009-09-30 | $46 |
| 48 | 2009-06-30 | $27 |
| 49 | 2008-12-31 |  |
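
The table search above keeps the index of the last matching table and fails with a `NameError` if nothing matches. As a hedged sketch of one alternative, the hypothetical helper `find_table` below returns the first match and raises a descriptive error otherwise. The HTML fragment is invented for illustration, and the sketch uses Python's built-in `html.parser` instead of `html5lib` so that it runs without the extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment; the real page is larger and may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Tesla Annual Revenue</th></tr>
  <tr><td>2009</td><td>$0</td></tr>  <!-- placeholder value -->
</table>
<table>
  <tr><th>Tesla Quarterly Revenue</th></tr>
  <tr><td>2010-03-31</td><td>$21</td></tr>
  <tr><td>2009-09-30</td><td>$46</td></tr>
</table>
"""


def find_table(soup, heading):
    """Return the first <table> whose markup contains heading."""
    for table in soup.find_all("table"):
        if heading in str(table):
            return table
    raise ValueError(f"no table containing {heading!r}")


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = find_table(soup, "Tesla Quarterly Revenue")

# Collect the text of each data row, skipping header-only rows.
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all("td")]
    for tr in table.find_all("tr")
    if tr.find_all("td")
]
print(rows)
```

Raising an error at the point of the search makes a changed page layout easier to diagnose than a failure on a later line.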


Remove the commas and dollar signs from the `Revenue` column.


In [10]:
tesla_revenue['Revenue'] = tesla_revenue['Revenue'].str.replace('$', "", regex=False).str.replace(',', "", regex=False)
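
In a regular expression, `$` is an end-of-string anchor, so how `str.replace('$', "")` behaves can depend on whether pandas treats the pattern as a regex. A minimal sketch, using sample values in the same format as the scraped column, passes `regex=False` to force literal matching and then converts the result to numbers. The conversion with `pd.to_numeric` is an optional extra step that the notebook does not perform.

```python
import pandas as pd

# Sample values in the same format as the scraped Revenue column.
revenue = pd.Series(["$21", "$1,667", "$46"])

# regex=False treats "$" and "," as literal characters, not regex syntax.
cleaned = revenue.str.replace("$", "", regex=False).str.replace(",", "", regex=False)
print(cleaned.tolist())  # ['21', '1667', '46']

# Optional: convert the cleaned strings to numbers.
values = pd.to_numeric(cleaned)
print(values.tolist())  # [21, 1667, 46]
```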


Remove rows with null values or empty strings in the `Revenue` column.


In [11]:
tesla_revenue.dropna(inplace=True)

tesla_revenue = tesla_revenue[tesla_revenue['Revenue'] != ""]
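
Both steps are needed because `dropna` removes missing values but leaves empty strings in place. A small sketch with made-up rows shows the two filters working together:

```python
import pandas as pd

# Made-up rows: one missing value (None) and one empty string.
df = pd.DataFrame(
    {
        "Date": ["2010-03-31", "2009-12-31", "2009-09-30", "2008-12-31"],
        "Revenue": ["21", None, "46", ""],
    }
)

# Step 1: drop rows where any column holds a missing value.
df = df.dropna()
# Step 2: drop rows whose Revenue is an empty string, which dropna ignores.
df = df[df["Revenue"] != ""]
print(df)
```

The surviving rows keep their original index labels (0 and 2 here), which is why the cleaned Tesla output skips index 46.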

Check the results of the cleaning.


In [12]:
tesla_revenue.tail(5)

|    | Date | Revenue |
|----|------|---------|
| 43 | 2010-09-30 | 31 |
| 44 | 2010-06-30 | 28 |
| 45 | 2010-03-31 | 21 |
| 47 | 2009-09-30 | 46 |
| 48 | 2009-06-30 | 27 |


## Use yfinance to Extract Stock Data


Using the `Ticker` class, we also create a ticker object for the second stock we will use: GameStop, whose ticker symbol is `GME`.


In [13]:
gamestop = yf.Ticker('GME')

Extract stock information and save it in a dataframe named `gme_data`.


In [14]:
gme_data = gamestop.history(period = "max")

Reset the index and display the first five rows of the `gme_data` dataframe.


In [15]:
gme_data.reset_index(inplace=True)
gme_data.head(5)

|   | Date | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|------|------|------|-----|-------|--------|-----------|--------------|
| 0 | 2002-02-13 | 6.480512 | 6.773398 | 6.413182 | 6.766665 | 19054000 | 0.0 | 0.0 |
| 1 | 2002-02-14 | 6.850829 | 6.864295 | 6.682504 | 6.733001 | 2755400 | 0.0 | 0.0 |
| 2 | 2002-02-15 | 6.733 | 6.749832 | 6.632005 | 6.699335 | 2097400 | 0.0 | 0.0 |
| 3 | 2002-02-19 | 6.665674 | 6.665674 | 6.312191 | 6.430019 | 1852600 | 0.0 | 0.0 |
| 4 | 2002-02-20 | 6.463682 | 6.648839 | 6.413184 | 6.648839 | 1723200 | 0.0 | 0.0 |


## Use Webscraping to Extract GME Revenue Data


Let's retrieve GME's revenue data by web scraping the page https://www.macrotrends.net/stocks/charts/GME/gamestop/revenue.


In [16]:
url = "https://www.macrotrends.net/stocks/charts/GME/gamestop/revenue"
html_data = requests.get(url).text

Parse the HTML data using `BeautifulSoup`.


In [17]:
soup = BeautifulSoup(html_data, "html5lib")

Using Beautiful Soup, we extract the table containing `GameStop Quarterly Revenue` and store it in a dataframe named `gme_revenue`. We also remove the commas and dollar signs from the `Revenue` column, as we did for Tesla.


In [18]:
tables = soup.find_all("table")

for index, table in enumerate(tables):
    if "GameStop Quarterly Revenue" in str(table):
        table_index = index
        
gme_revenue = pd.read_html(str(tables[table_index]), flavor="bs4")[0]
gme_revenue.columns = ["Date", "Revenue"]

gme_revenue['Revenue'] = gme_revenue['Revenue'].str.replace('$', "", regex=False).str.replace(",", "", regex=False)


Display the last five rows of the `gme_revenue` dataframe.


In [19]:
gme_revenue.tail(5)

|    | Date | Revenue |
|----|------|---------|
| 62 | 2006-01-31 | 1667 |
| 63 | 2005-10-31 | 534 |
| 64 | 2005-07-31 | 416 |
| 65 | 2005-04-30 | 475 |
| 66 | 2005-01-31 | 709 |


## Plot Tesla Stock Graph


We use the `make_graph` function to graph the Tesla stock data. Note that the graph only shows data up to June 2021, to match the available revenue data.

In [23]:
make_graph(tesla_data, tesla_revenue, 'Tesla')

## Plot GameStop Stock Graph


We also graph the GameStop stock data. Note that this graph also only shows data up to June 2021, for the same reason.

In [24]:
make_graph(gme_data, gme_revenue, 'GameStop')