# Extracting and Visualizing Stock Data

## Description

#### Extracting essential data from a dataset and displaying it is a necessary part of data science, because it lets individuals make correct decisions based on the data. In this assignment, you will extract some stock data and then display it in a graph.

##### Table of Contents
Define a Function that Makes a Graph

Question 1: Use yfinance to Extract Stock Data

Question 2: Use Webscraping to Extract Tesla Revenue Data

Question 3: Use yfinance to Extract Stock Data

Question 4: Use Webscraping to Extract GME Revenue Data

Question 5: Plot Tesla Stock Graph

Question 6: Plot GameStop Stock Graph

Estimated Time Needed: 30 min

######  Installing yfinance

In [188]:
!pip install yfinance 



## Define Graphing Function 

###### In this section, we define the function make_graph. You don't need to know how the function works; only the inputs matter. It takes a dataframe with stock data (which must contain Date and Close columns), a dataframe with revenue data (which must contain Date and Revenue columns), and the name of the stock.

In [189]:
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

def make_graph(stock_data, revenue_data, stock):
    # Two stacked panels sharing the x-axis: share price on top, revenue below
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=("Historical Share Price", "Historical Revenue"), vertical_spacing=.3)
    # Dates are parsed to datetimes and values cast to float before plotting
    fig.add_trace(go.Scatter(x=pd.to_datetime(stock_data.Date), y=stock_data.Close.astype("float"), name="Share Price"), row=1, col=1)
    fig.add_trace(go.Scatter(x=pd.to_datetime(revenue_data.Date), y=revenue_data.Revenue.astype("float"), name="Revenue"), row=2, col=1)
    fig.update_xaxes(title_text="Date", row=1, col=1)
    fig.update_xaxes(title_text="Date", row=2, col=1)
    fig.update_yaxes(title_text="Price ($US)", row=1, col=1)
    fig.update_yaxes(title_text="Revenue ($US Millions)", row=2, col=1)
    fig.update_layout(showlegend=False,
                      height=900,
                      title=stock,
                      xaxis_rangeslider_visible=True)
    fig.show()
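
Since make_graph relies only on the column names described above, a quick pre-check can catch a mismatched dataframe before plotting. The sketch below uses a hypothetical helper, missing_columns, which is not part of the original notebook, and two tiny hand-built frames.

```python
import pandas as pd

# Hypothetical helper, not part of the original notebook:
# returns the required column names that a dataframe lacks.
def missing_columns(df, required):
    return [name for name in required if name not in df.columns]

# Tiny hand-built frames for illustration only
stock_sample = pd.DataFrame({"Date": ["2010-06-29", "2010-06-30"], "Close": [4.778, 4.766]})
revenue_sample = pd.DataFrame({"Date": ["2010-06-30"], "Sales": ["28"]})

print(missing_columns(stock_sample, ["Date", "Close"]))      # []
print(missing_columns(revenue_sample, ["Date", "Revenue"]))  # ['Revenue']
```

An empty list means the frame can be passed to make_graph as described; anything else names the columns to rename or add first.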

In [190]:
import yfinance as yf

# Question 1: Use yfinance to Extract Stock Data

###### Using the Ticker function, enter the ticker symbol of the stock we want to extract data on to create a ticker object. The stock is Tesla and its ticker symbol is TSLA.

In [191]:
tesla = yf.Ticker("TSLA")

###### Using the ticker object and its history function, extract the stock information and save it in a dataframe named tesla_data. Setting the period parameter to max returns data for the maximum available time span.

In [192]:
tesla_data = tesla.history(period="max")

###### Reset the index using the reset_index(inplace=True) function on the tesla_data dataframe, then display the first five rows using the head function.

In [193]:
tesla_data.reset_index(inplace=True)
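
The index is reset because history returns a dataframe indexed by date, while make_graph reads Date as a regular column. A minimal sketch on a hand-built two-row frame (the frame is a stand-in built for illustration, not downloaded data):

```python
import pandas as pd

# Hand-built stand-in for the result of history(): the dates live in the index.
sample = pd.DataFrame(
    {"Close": [4.778, 4.766]},
    index=pd.DatetimeIndex(["2010-06-29", "2010-06-30"], name="Date"),
)
print(list(sample.columns))   # ['Close']

# reset_index(inplace=True) moves the index into an ordinary "Date" column
sample.reset_index(inplace=True)
print(list(sample.columns))   # ['Date', 'Close']
```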

In [194]:
tesla_data.head()

| | Date | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|---|---|---|---|---|---|---|---|
| 0 | 2010-06-29 | 3.8 | 5.0 | 3.508 | 4.778 | 93831500 | 0 | 0.0 |
| 1 | 2010-06-30 | 5.158 | 6.084 | 4.66 | 4.766 | 85935500 | 0 | 0.0 |
| 2 | 2010-07-01 | 5.0 | 5.184 | 4.054 | 4.392 | 41094000 | 0 | 0.0 |
| 3 | 2010-07-02 | 4.6 | 4.62 | 3.742 | 3.84 | 25699000 | 0 | 0.0 |
| 4 | 2010-07-06 | 4.0 | 4.0 | 3.166 | 3.222 | 34334500 | 0 | 0.0 |


# Question 2: Use Webscraping to Extract Tesla Revenue Data

###### Importing libraries we will use

In [195]:
from bs4 import BeautifulSoup
import requests
import pandas as pd

###### Using the requests library to download the webpage https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue. Saving the text of the response as a variable named html_data.

In [196]:
url = "https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue"
html_data = requests.get(url).text


###### Parsing the HTML data using BeautifulSoup.

In [197]:
soup = BeautifulSoup(html_data,"html.parser")
soup.find_all("title")

[<title>Tesla Revenue 2009-2021 | TSLA | MacroTrends</title>]

###### Using BeautifulSoup, extract the table with Tesla Quarterly Revenue and store it in a dataframe named tesla_revenue. The dataframe should have columns Date and Revenue. The commas and dollar signs are also removed from the Revenue column.

In [198]:
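# Collect one dict per table row, then build the dataframe once
# (DataFrame.append was removed in pandas 2.0)
rows = []
for row in soup.find_all("tbody")[1].find_all("tr"):
    col = row.find_all("td")
    date = col[0].text
    # Strip the dollar sign and thousands separators, e.g. "$13,757" -> "13757"
    revenue = col[1].text.replace("$","").replace(",","")
    rows.append({"Date":date,"Revenue":revenue})

tesla_revenue = pd.DataFrame(rows, columns=["Date","Revenue"])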
    

###### Dropping rows whose Revenue is NaN or an empty string

In [199]:
tesla_revenue.dropna(inplace=True)

tesla_revenue = tesla_revenue[tesla_revenue["Revenue"] != ""]
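
Both steps are needed because dropna only removes missing values (NaN or None); an empty string counts as a regular value and has to be filtered out separately. A minimal sketch on hypothetical rows:

```python
import pandas as pd

# Hypothetical rows: one real value, one empty string, one missing value.
sample = pd.DataFrame({
    "Date": ["2020-01-01", "2020-04-01", "2020-07-01"],
    "Revenue": ["21", "", None],
})

after_dropna = sample.dropna()
print(len(after_dropna))      # 2: the empty string survives dropna

cleaned = after_dropna[after_dropna["Revenue"] != ""]
print(len(cleaned))           # 1
```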



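###### After dropping rows whose Revenue is an empty string or NaN, the index has gaps (for example 46, 48, 49). The reset_index method renumbers the rows. Note that reset_index(drop=True) returns a new dataframe, so tesla_revenue itself keeps its original index unless the result is assigned back.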

In [200]:
tesla_revenue.reset_index(drop=True)

| | Date | Revenue |
|---|---|---|
| 0 | 2021-09-30 | 13757 |
| 1 | 2021-06-30 | 11958 |
| 2 | 2021-03-31 | 10389 |
| 3 | 2020-12-31 | 10744 |
| 4 | 2020-09-30 | 8771 |
| 5 | 2020-06-30 | 6036 |
| 6 | 2020-03-31 | 5985 |
| 7 | 2019-12-31 | 7384 |
| 8 | 2019-09-30 | 6303 |
| 9 | 2019-06-30 | 6350 |


###### Displaying the last five rows of the tesla_revenue dataframe using the tail function.

In [201]:
tesla_revenue.tail()

| | Date | Revenue |
|---|---|---|
| 44 | 2010-09-30 | 31 |
| 45 | 2010-06-30 | 28 |
| 46 | 2010-03-31 | 21 |
| 48 | 2009-09-30 | 46 |
| 49 | 2009-06-30 | 27 |
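
The tail output still jumps from 46 to 48 because reset_index(drop=True) returns a renumbered copy and leaves the original dataframe untouched unless the result is assigned back. A minimal sketch on a hypothetical frame with a similar gap:

```python
import pandas as pd

# Hypothetical frame whose index has a gap (the row labelled 1 was dropped).
sample = pd.DataFrame({"Revenue": ["21", "46", "27"]}, index=[0, 2, 3])

sample.reset_index(drop=True)            # result is discarded
print(list(sample.index))                # [0, 2, 3]

sample = sample.reset_index(drop=True)   # assign the result back
print(list(sample.index))                # [0, 1, 2]
```

Passing inplace=True, as done earlier for tesla_data, is the other way to make the change stick.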


# Question 3: Use yfinance to Extract Stock Data

###### Using the Ticker function, enter the ticker symbol of the stock we want to extract data on to create a ticker object. The stock is GameStop and its ticker symbol is GME.

In [202]:
GameStop = yf.Ticker("GME")

###### Using the ticker object and its history function, extract the stock information and save it in a dataframe named gme_data. Setting the period parameter to max returns data for the maximum available time span.

In [203]:
gme_data = GameStop.history(period = "max")

###### Reset the index using the reset_index(inplace=True) function on the gme_data dataframe, then display the first five rows of the gme_data dataframe using the head function.

In [204]:
gme_data.reset_index(inplace=True)
gme_data.head()

| | Date | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|---|---|---|---|---|---|---|---|
| 0 | 2002-02-13 | 6.480514 | 6.7734 | 6.413183 | 6.766666 | 19054000 | 0.0 | 0.0 |
| 1 | 2002-02-14 | 6.850828 | 6.864294 | 6.682503 | 6.733001 | 2755400 | 0.0 | 0.0 |
| 2 | 2002-02-15 | 6.733 | 6.749832 | 6.632005 | 6.699335 | 2097400 | 0.0 | 0.0 |
| 3 | 2002-02-19 | 6.66567 | 6.66567 | 6.312188 | 6.430016 | 1852600 | 0.0 | 0.0 |
| 4 | 2002-02-20 | 6.463683 | 6.64884 | 6.413185 | 6.64884 | 1723200 | 0.0 | 0.0 |


# Question 4: Use Webscraping to Extract GME Revenue Data



###### Using the requests library to download the webpage https://www.macrotrends.net/stocks/charts/GME/gamestop/revenue. Saving the text of the response as a variable named html_data.

In [205]:
url = "https://www.macrotrends.net/stocks/charts/GME/gamestop/revenue"
html_data = requests.get(url).text

###### Parsing the HTML data using BeautifulSoup.

In [206]:
soup = BeautifulSoup(html_data,"html.parser")
soup.find_all("title")

[<title>GameStop Revenue 2006-2021 | GME | MacroTrends</title>]

###### Using BeautifulSoup, extract the table with GameStop Quarterly Revenue and store it in a dataframe named gme_revenue. The dataframe should have columns Date and Revenue. The commas and dollar signs are also removed from the Revenue column.

In [207]:
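# Collect one dict per table row, then build the dataframe once
# (DataFrame.append was removed in pandas 2.0)
rows = []
for row in soup.find_all("tbody")[1].find_all("tr"):
    col = row.find_all("td")
    date = col[0].text
    # Strip the dollar sign and thousands separators from the revenue text
    revenue = col[1].text.replace("$","").replace(",","")
    rows.append({"Date":date,"Revenue":revenue})

gme_revenue = pd.DataFrame(rows, columns=["Date","Revenue"])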

###### Deleting rows whose Revenue value is NaN or an empty string

In [208]:
gme_revenue.dropna(inplace=True)
gme_revenue = gme_revenue[gme_revenue["Revenue"]!=""]
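
The Tesla and GameStop extraction steps are identical apart from the URL, so they could be folded into one helper. The sketch below is a hypothetical refactoring, not the notebook's own code: extract_quarterly_revenue and SAMPLE_HTML are made-up names, and the inline HTML is a small stand-in for the downloaded page, assuming (as in the cells above) that the quarterly table is the second tbody.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for a downloaded page: two tables, the second
# one holding quarterly rows, including one with an empty revenue cell.
SAMPLE_HTML = """
<table><tbody>
  <tr><td>2021</td><td>$100</td></tr>
</tbody></table>
<table><tbody>
  <tr><td>2021-09-30</td><td>$13,757</td></tr>
  <tr><td>2021-06-30</td><td>$11,958</td></tr>
  <tr><td>2009-12-31</td><td></td></tr>
</tbody></table>
"""

def extract_quarterly_revenue(html, table_index=1):
    """Return a Date/Revenue dataframe from the tbody at table_index."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tbody")[table_index].find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) < 2:
            continue  # skip rows without both a date and a revenue cell
        revenue = cells[1].text.replace("$", "").replace(",", "").strip()
        if revenue == "":
            continue  # skip rows whose revenue cell is empty
        rows.append({"Date": cells[0].text.strip(), "Revenue": revenue})
    return pd.DataFrame(rows, columns=["Date", "Revenue"])

sample_revenue = extract_quarterly_revenue(SAMPLE_HTML)
print(sample_revenue)
```

Because empty rows are skipped while the list is built, the resulting index is already contiguous and no separate filtering or reset_index step is needed.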

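###### After dropping rows whose Revenue is an empty string or NaN, the index can have gaps. The reset_index method renumbers the rows. Note that reset_index(drop=True) returns a new dataframe, so gme_revenue itself keeps its original index unless the result is assigned back.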

In [209]:
gme_revenue.reset_index(drop = True)

| | Date | Revenue |
|---|---|---|
| 0 | 2021-07-31 | 1183 |
| 1 | 2021-04-30 | 1277 |
| 2 | 2021-01-31 | 2122 |
| 3 | 2020-10-31 | 1005 |
| 4 | 2020-07-31 | 942 |
| ... | ... | ... |
| 62 | 2006-01-31 | 1667 |
| 63 | 2005-10-31 | 534 |
| 64 | 2005-07-31 | 416 |
| 65 | 2005-04-30 | 475 |


###### Displaying the last five rows of the gme_revenue dataframe using the tail function.

In [210]:
gme_revenue.tail()

| | Date | Revenue |
|---|---|---|
| 62 | 2006-01-31 | 1667 |
| 63 | 2005-10-31 | 534 |
| 64 | 2005-07-31 | 416 |
| 65 | 2005-04-30 | 475 |
| 66 | 2005-01-31 | 709 |


# Question 5: Plot Tesla Stock Graph


###### Using the make_graph function to graph the Tesla Stock Data, also providing a title for the graph.

In [211]:
make_graph(tesla_data, tesla_revenue, 'TESLA')


# Question 6: Plot GameStop Stock Graph

###### Using the make_graph function to graph the GameStop Stock Data, also providing a title for the graph.

In [212]:
make_graph(gme_data, gme_revenue, 'GameStop')
