# Getting Started: Extracting and Visualizing Stock Data (Web Scraping)

These are my notes on extracting stock prices with the `yfinance` API and revenue data from [Macrotrends](https://www.macrotrends.net/) via web scraping, then plotting both with `plotly`.


In [None]:
#Libraries that we'll need
#!pip install yfinance
#!pip install pandas
#!pip install requests
#!pip install bs4
#!pip install plotly

In [None]:
import yfinance as yf
import pandas as pd
import requests
from bs4 import BeautifulSoup
import plotly.graph_objects as go
from plotly.subplots import make_subplots

## Define Graphing Function


In [None]:
def make_graph(stock_data, revenue_data, stock):
    # Two stacked panels sharing the x-axis: share price on top, revenue below
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=("Historical Share Price", "Historical Revenue"), vertical_spacing=0.3)
    # infer_datetime_format is deprecated since pandas 2.0; to_datetime infers the format by default
    fig.add_trace(go.Scatter(x=pd.to_datetime(stock_data.Date), y=stock_data.Close.astype("float"), name="Share Price"), row=1, col=1)
    fig.add_trace(go.Scatter(x=pd.to_datetime(revenue_data.Date), y=revenue_data.Revenue.astype("float"), name="Revenue"), row=2, col=1)
    fig.update_xaxes(title_text="Date", row=1, col=1)
    fig.update_xaxes(title_text="Date", row=2, col=1)
    fig.update_yaxes(title_text="Price ($US)", row=1, col=1)
    fig.update_yaxes(title_text="Revenue ($US Millions)", row=2, col=1)
    fig.update_layout(
        showlegend=False,
        height=900,
        title=stock,
        xaxis_rangeslider_visible=True,
    )
    fig.show()

## Use yfinance to Extract Tesla Stock Data


Pass the ticker symbol of the stock we want to the `Ticker` function to create a ticker object. The stock is Tesla and its ticker symbol is `TSLA`.


In [None]:
tesla = yf.Ticker("TSLA")

Using the ticker object and its `history` method, extract the stock information and save it in a dataframe named `tesla_data`. Set the `period` parameter to `max` to get data for the longest available period.


In [None]:
tesla_data = tesla.history(period="max")

In [None]:
tesla_data.reset_index(inplace=True)

In [None]:
tesla_data.head()
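
The reset matters because `make_graph` reads `stock_data.Date`, so the dates have to be an ordinary column. Here is a minimal sketch with made-up prices, assuming the history frame is indexed by dates under the name `Date`:

```python
import pandas as pd

# Made-up prices indexed by date, mimicking the shape of a price-history frame
prices = pd.DataFrame(
    {"Close": [10.0, 10.5, 11.2]},
    index=pd.DatetimeIndex(["2021-01-04", "2021-01-05", "2021-01-06"], name="Date"),
)
print("Date" in prices.columns)  # → False (the dates live in the index)

# Move the index into a regular column so it can be accessed as prices.Date
prices.reset_index(inplace=True)
print(prices.columns.tolist())  # → ['Date', 'Close']
```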

## Webscraping to Extract Tesla Revenue Data


Use the `requests` library to download the webpage [https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue](https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue). Save the text of the response as a variable named `html_data`.


In [None]:
URL = "https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue"

In [None]:
html_data  = requests.get(URL).text
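
A bare `requests.get(URL).text` returns whatever the server sends, including error pages. Below is a slightly more defensive sketch. The helper name `fetch_html`, the `get` parameter, the timeout and the `User-Agent` value are all my own assumptions, not part of the original notes. Some sites reject requests without a browser-like `User-Agent`, though I have not confirmed that this page does.

```python
def fetch_html(url, get=None, timeout=10):
    """Download a page and return its text, raising on HTTP errors."""
    if get is None:
        import requests  # imported lazily so a stand-in getter can be passed instead
        get = requests.get
    # Assumption: a browser-like User-Agent; whether a given site needs it varies
    headers = {"User-Agent": "Mozilla/5.0"}
    response = get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # raise for 4xx/5xx instead of returning an error page
    return response.text
```

With this helper, the cell above would become `html_data = fetch_html(URL)`.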

In [None]:
soup = BeautifulSoup(html_data, "html.parser")

In [None]:
tables = soup.find_all('table') 

This is just to check that I can find the table I want.

In [None]:
table_index = None  # stays None if no table matches
for index, table in enumerate(tables):
    if "Tesla Quarterly Revenue" in str(table):
        table_index = index
print(table_index)
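
The loop above keeps the last matching table. A compact alternative that returns the first match, or `None` when nothing matches, could look like this. The function name `find_table_index` is hypothetical, and plain strings stand in for the parsed tables:

```python
def find_table_index(tables, caption):
    """Return the index of the first table whose markup contains `caption`, or None."""
    return next((i for i, table in enumerate(tables) if caption in str(table)), None)


# Plain strings stand in for the parsed tables here
example_tables = ["<table>Annual Revenue</table>", "<table>Quarterly Revenue</table>"]
print(find_table_index(example_tables, "Quarterly Revenue"))  # → 1
print(find_table_index(example_tables, "Net Income"))         # → None
```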

Using Beautiful Soup, extract the table with `Tesla Quarterly Revenue` and store it in a dataframe named `tesla_revenue`. The dataframe should have the columns `Date` and `Revenue`. Make sure the commas and dollar signs are removed from the `Revenue` column.


In [None]:
# DataFrame.append was removed in pandas 2.0, so collect the rows in a list first
rows = []

for row in tables[table_index].tbody.find_all("tr"):
    col = row.find_all("td")
    date = col[0].text.strip()
    revenue = col[1].text.strip().replace("$", "").replace(",", "")
    rows.append({"Date": date, "Revenue": revenue})

tesla_revenue = pd.DataFrame(rows, columns=["Date", "Revenue"])
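
If numeric values are preferred over cleaned strings, the same clean-up can be wrapped in a small helper. This is only a sketch: `parse_revenue` is a hypothetical name, and returning `None` for blank cells is my own choice.

```python
def parse_revenue(text):
    """Convert a cell such as '$1,234' to a float; return None for blank cells."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    return float(cleaned) if cleaned else None


print(parse_revenue("$1,234"))  # → 1234.0
print(parse_revenue("   "))     # → None
```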

In [None]:
tesla_revenue.head()

Remove the rows in the dataframe that are empty strings or are NaN in the Revenue column. Print the entire `tesla_revenue` DataFrame to see if you have any.


### Are there any missing values or blank entries?

In [None]:
len(tesla_revenue)

In [None]:
tesla_revenue.isnull().values.any()

In [None]:
tesla_revenue.isna().values.any()

In [None]:
tesla_revenue[tesla_revenue['Revenue'] == ""].index

Let's get rid of these blank entries for now.

In [None]:
tesla_revenue = tesla_revenue[tesla_revenue['Revenue'] != ""]
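
Note that `isnull()` and `isna()` do not flag empty strings, which is why the separate `== ""` check is needed. An alternative sketch, using made-up values, converts the column to numbers so that blanks become NaN and can be dropped in one step:

```python
import pandas as pd

# Made-up values; the blank string mimics a quarter with no reported revenue
revenue = pd.DataFrame(
    {"Date": ["2021-03-31", "2020-12-31", "2020-09-30"], "Revenue": ["1234", "", "987"]}
)

# Blank strings cannot be parsed as numbers, so errors="coerce" leaves NaN in their place
revenue["Revenue"] = pd.to_numeric(revenue["Revenue"], errors="coerce")
revenue = revenue.dropna(subset=["Revenue"]).reset_index(drop=True)
print(revenue)
```

This also leaves `Revenue` as a float column.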

In [None]:
tesla_revenue.tail()

## Use yfinance to Extract GameStop Stock Data


Pass the ticker symbol of the stock we want to the `Ticker` function to create a ticker object. The stock is GameStop and its ticker symbol is `GME`.


In [None]:
gamestop = yf.Ticker("GME")

Using the ticker object and its `history` method, extract the stock information and save it in a dataframe named `gme_data`. Set the `period` parameter to `max` to get data for the longest available period.


In [None]:
gme_data = gamestop.history(period='max')

**Reset the index** with `reset_index(inplace=True)` on the `gme_data` DataFrame, then display its first five rows with the `head` function.


In [None]:
gme_data.reset_index(inplace=True)

In [None]:
gme_data.head()

## Webscraping to Extract GME Revenue Data


Use the `requests` library to download the [webpage](https://www.macrotrends.net/stocks/charts/GME/gamestop/revenue) for GameStop revenue.


In [None]:
url = "https://www.macrotrends.net/stocks/charts/GME/gamestop/revenue"

In [None]:
html_data_gme = requests.get(url).text

Parse the HTML data using `BeautifulSoup`.


In [None]:
soup_gme = BeautifulSoup(html_data_gme, 'html.parser')

In [None]:
tables_gme = soup_gme.find_all('table')

In [None]:
len(tables_gme)

In [None]:
table_index_gme = None  # stays None if no table matches
for index, table in enumerate(tables_gme):
    if "GameStop Quarterly Revenue" in str(table):
        table_index_gme = index
print(table_index_gme)

In [None]:
# DataFrame.append was removed in pandas 2.0, so collect the rows in a list first
rows_gme = []

for row in tables_gme[table_index_gme].tbody.find_all("tr"):
    col = row.find_all("td")
    date = col[0].text.strip()
    revenue = col[1].text.strip().replace("$", "").replace(",", "")
    rows_gme.append({"Date": date, "Revenue": revenue})

gme_revenue = pd.DataFrame(rows_gme, columns=["Date", "Revenue"])

In [None]:
gme_revenue.describe()
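
The Tesla and GameStop sections repeat the same scraping steps, so they could be folded into one function. The sketch below is one way to do it, not the method used above: `revenue_table` is a hypothetical name, the sample HTML is made up, and unlike the cells above it walks every `tr` in the table instead of only those under `tbody`.

```python
import pandas as pd
from bs4 import BeautifulSoup


def revenue_table(html, caption):
    """Return a Date/Revenue DataFrame from the first table whose markup contains `caption`."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for table in soup.find_all("table"):
        if caption not in str(table):
            continue
        for tr in table.find_all("tr"):
            cells = tr.find_all("td")
            if len(cells) < 2:
                continue  # header rows use <th>, so they have no <td> cells
            date = cells[0].get_text(strip=True)
            revenue = cells[1].get_text(strip=True).replace("$", "").replace(",", "")
            rows.append({"Date": date, "Revenue": revenue})
        break  # stop after the first matching table
    return pd.DataFrame(rows, columns=["Date", "Revenue"])


# Made-up HTML standing in for a downloaded page
sample_html = """
<table>
  <thead><tr><th colspan="2">Example Quarterly Revenue</th></tr></thead>
  <tbody>
    <tr><td>2021-03-31</td><td>$1,234</td></tr>
    <tr><td>2020-12-31</td><td></td></tr>
  </tbody>
</table>
"""
sample_revenue = revenue_table(sample_html, "Example Quarterly Revenue")
print(sample_revenue)
```

With the downloaded pages, the calls would be `revenue_table(html_data, "Tesla Quarterly Revenue")` and `revenue_table(html_data_gme, "GameStop Quarterly Revenue")`.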

## Plot Tesla Stock Graph


In [None]:
make_graph(tesla_data, tesla_revenue, 'Tesla Stock')

## Plot GameStop Stock Graph


In [None]:
make_graph(gme_data, gme_revenue, 'GameStop')

<h2>About Me:</h2> 

<a href="https://www.linkedin.com/in/bysabrina/">Sabrina Labrador</a> is a Data Analyst and System Engineer focused on Business Research and Customer Behavior Research.