# **US Stock Lab**


In this lab, I extract stock price and revenue data for Tesla and GameStop and compare each company's share price with its revenue. To do this, I use the Pandas and yfinance (Yahoo Finance) libraries. yfinance offers a threaded and Pythonic way to download market data from Yahoo.

**Step 1: Installing dependencies**

In [1]:
!pip install -q -U watermark

In [2]:
%reload_ext watermark
%watermark -v -p pandas

Python implementation: CPython
Python version       : 3.7.13
IPython version      : 7.9.0

pandas: 1.3.5



In [3]:
!pip install yfinance==0.1.67
!pip install beautifulsoup4==4.10.0

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting yfinance==0.1.67
  Downloading yfinance-0.1.67-py2.py3-none-any.whl (25 kB)
Installing collected packages: yfinance
Successfully installed yfinance-0.1.67


**Step 2: Importing libs**

In [4]:
import yfinance as yf
import pandas as pd
import requests
from bs4 import BeautifulSoup
import plotly.graph_objects as go
from plotly.subplots import make_subplots

**Step 3: Define Graphing Function**


In this section, we define the function make_graph. It takes a dataframe with stock data (dataframe must contain Date and Close columns), a dataframe with revenue data (dataframe must contain Date and Revenue columns), and the name of the stock.


In [5]:
def make_graph(stock_data, revenue_data, stock):
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=("Historical Share Price", "Historical Revenue"), vertical_spacing = .3)
    stock_data_specific = stock_data[stock_data.Date <= '2021-06-14']
    revenue_data_specific = revenue_data[revenue_data.Date <= '2021-04-30']
    fig.add_trace(go.Scatter(x=pd.to_datetime(stock_data_specific.Date, infer_datetime_format=True), y=stock_data_specific.Close.astype("float"), name="Share Price"), row=1, col=1)
    fig.add_trace(go.Scatter(x=pd.to_datetime(revenue_data_specific.Date, infer_datetime_format=True), y=revenue_data_specific.Revenue.astype("float"), name="Revenue"), row=2, col=1)
    fig.update_xaxes(title_text="Date", row=1, col=1)
    fig.update_xaxes(title_text="Date", row=2, col=1)
    fig.update_yaxes(title_text="Price ($US)", row=1, col=1)
    fig.update_yaxes(title_text="Revenue ($US Millions)", row=2, col=1)
    fig.update_layout(showlegend=False,
                      height=900,
                      title=stock,
                      xaxis_rangeslider_visible=True)
    fig.show()
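
The date cutoffs inside make_graph compare the Date column against a date string. Here is a minimal sketch of that filter on made-up data. The frame `sample`, its values, and the name `cutoff` are illustrative assumptions, not data from this lab.

```python
import pandas as pd

# Hypothetical sample frame with the two columns make_graph expects.
sample = pd.DataFrame({
    "Date": ["2021-06-11", "2021-06-14", "2021-06-15"],
    "Close": ["200.0", "205.5", "199.9"],
})

# Parse the dates explicitly so the comparison is chronological, not textual.
sample["Date"] = pd.to_datetime(sample["Date"])
cutoff = pd.Timestamp("2021-06-14")

# Keep rows on or before the cutoff and convert the prices to float.
subset = sample[sample["Date"] <= cutoff]
prices = subset["Close"].astype("float")

print(subset["Date"].dt.strftime("%Y-%m-%d").tolist())  # ['2021-06-11', '2021-06-14']
print(prices.tolist())                                  # [200.0, 205.5]
```

Comparing against a pd.Timestamp instead of a raw string makes a malformed date fail at the point where the cutoff is created rather than inside the filter.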

**Step 4: Using the yfinance Library to Extract Tesla Stock**


Using the Ticker function, we pass the ticker symbol of the stock we want to extract data on and get back a ticker object. The stock is Tesla and its ticker symbol is TSLA.

In [6]:
Tesla = yf.Ticker('TSLA')

**Step 5: Using the ticker object and the function history**

Using the ticker object and the function history, we extract stock information and save it in a dataframe named tesla_data. We set the period parameter to max so we get information for the maximum amount of time.

In [7]:
tesla_data = Tesla.history(period = "max")

**Step 6: Resetting Index**

In this step, I reset the index using the reset_index(inplace=True) function on the tesla_data DataFrame and display the first five rows using the head function.
The data is returned as a Pandas DataFrame. With Date as the index, the Open, High, Low, Close, Volume, Dividends, and Stock Splits values are given for each day.

In [8]:
tesla_data.reset_index(inplace = True)
tesla_data.head()

| | Date | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|---|---|---|---|---|---|---|---|
| 0 | 2010-06-29 | 1.266667 | 1.666667 | 1.169333 | 1.592667 | 281494500 | 0 | 0.0 |
| 1 | 2010-06-30 | 1.719333 | 2.028 | 1.553333 | 1.588667 | 257806500 | 0 | 0.0 |
| 2 | 2010-07-01 | 1.666667 | 1.728 | 1.351333 | 1.464 | 123282000 | 0 | 0.0 |
| 3 | 2010-07-02 | 1.533333 | 1.54 | 1.247333 | 1.28 | 77097000 | 0 | 0.0 |
| 4 | 2010-07-06 | 1.333333 | 1.333333 | 1.055333 | 1.074 | 103003500 | 0 | 0.0 |
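
To see what reset_index does in isolation, here is a small sketch on a made-up frame. The name `prices` and its values are hypothetical and only mimic the date-indexed shape described above.

```python
import pandas as pd

# Hypothetical prices indexed by date, mimicking the shape history() returns.
prices = pd.DataFrame(
    {"Close": [1.59, 1.58, 1.46]},
    index=pd.to_datetime(["2010-06-29", "2010-06-30", "2010-07-01"]),
)
prices.index.name = "Date"

# reset_index moves the Date index into a regular column
# and replaces the index with 0, 1, 2, ...
prices.reset_index(inplace=True)

print(list(prices.columns))  # ['Date', 'Close']
print(list(prices.index))    # [0, 1, 2]
```

Having Date as an ordinary column is what lets make_graph refer to it as stock_data.Date.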


In [9]:
tesla_data.describe()

| | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|---|---|---|---|---|---|---|
| count | 3068.0 | 3068.0 | 3068.0 | 3068.0 | 3068.0 | 3068.0 | 3068.0 |
| mean | 54.718161 | 55.943503 | 53.379899 | 54.693276 | 93569130.0 | 0.0 | 0.002608 |
| std | 93.087553 | 95.233898 | 90.668179 | 92.97795 | 82526060.0 | 0.0 | 0.105257 |
| min | 1.076 | 1.108667 | 0.998667 | 1.053333 | 1777500.0 | 0.0 | 0.0 |
| 25% | 8.081666 | 8.279167 | 7.933333 | 8.0875 | 41241000.0 | 0.0 | 0.0 |
| 50% | 15.925 | 16.210334 | 15.664 | 15.933 | 75582000.0 | 0.0 | 0.0 |
| 75% | 23.5285 | 23.833167 | 23.135 | 23.478834 | 117628500.0 | 0.0 | 0.0 |
| max | 411.470001 | 414.496674 | 405.666656 | 409.970001 | 914082000.0 | 0.0 | 5.0 |


**Step 7: Use Webscraping to Extract Tesla Revenue Data**

Web scraping is the process of collecting structured web data in an automated fashion. Web data extraction is used by those who want to make use of the vast amount of publicly available web data to make smarter decisions.

I will apply the requests library to download the specific webpage. Then the text of the response will be saved as a variable named html_data.

In [10]:
url = "https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue"
html_data = requests.get(url).text
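
A plain requests.get(url).text also returns the body of an error page if the site rejects the request. As a hedged sketch, the download could be wrapped in a small helper that fails loudly instead. The function name `fetch_html`, the timeout of 10 seconds, and the User-Agent string are my own assumptions, not part of the lab.

```python
import requests


def fetch_html(url, timeout=10):
    """Download a page and return its text, raising on HTTP errors."""
    # Hypothetical User-Agent value; some sites reject requests without one.
    headers = {"User-Agent": "Mozilla/5.0 (stock-lab example)"}
    response = requests.get(url, headers=headers, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses,
    # so an error page is never parsed as if it were the revenue table.
    response.raise_for_status()
    return response.text
```

With this helper, html_data = fetch_html(url) would stand in for the direct requests.get call.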

**Step 8: Parsing the HTML data using BeautifulSoup**

In [11]:
soup = BeautifulSoup(html_data, "html.parser")
soup.find_all('title')

[<title>Tesla Revenue 2010-2022 | TSLA | MacroTrends</title>]

We use BeautifulSoup or the read_html function to extract the table with Tesla Quarterly Revenue and store it into a dataframe named tesla_revenue. The dataframe should have columns Date and Revenue.

In [12]:
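# Collect one dict per table row, then build the dataframe once
# (DataFrame.append is deprecated in newer pandas releases).
rows = []

for row in soup.find_all("tbody")[1].find_all("tr"):
    col = row.find_all("td")
    date = col[0].text
    # Strip the dollar sign and thousands separator from the revenue text.
    revenue = col[1].text.replace("$", "").replace(",", "")
    rows.append({"Date": date, "Revenue": revenue})

tesla_revenue = pd.DataFrame(rows, columns = ['Date', 'Revenue'])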
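
The same row-by-row extraction can be tried offline on a small HTML fragment. This is only a sketch: `sample_html` and its values are made up, and the fragment has a single tbody, so it uses find("tbody") rather than the second tbody selected above.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical HTML fragment shaped like a two-column revenue table.
sample_html = """
<table>
  <tbody>
    <tr><td>2021-03-31</td><td>$1,234</td></tr>
    <tr><td>2020-12-31</td><td>$987</td></tr>
    <tr><td>2009-12-31</td><td></td></tr>
  </tbody>
</table>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

rows = []
for tr in sample_soup.find("tbody").find_all("tr"):
    cells = tr.find_all("td")
    rows.append({
        "Date": cells[0].text,
        # Remove the dollar sign and the thousands separator.
        "Revenue": cells[1].text.replace("$", "").replace(",", ""),
    })

sample_revenue = pd.DataFrame(rows, columns=["Date", "Revenue"])
print(sample_revenue["Revenue"].tolist())  # ['1234', '987', '']
```

The empty string in the last row is the kind of value that the cleaning cells further down filter out.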

The loop above already removed the **comma** and **dollar** sign from the **Revenue column**. The following line drops any rows that contain **null** values.

In [13]:
tesla_revenue.dropna(inplace=True)

We can execute the following line to remove rows with **empty strings** in the **Revenue column**.

In [14]:
tesla_revenue = tesla_revenue[tesla_revenue['Revenue'] != ""]
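
After these two cleaning steps the Revenue column still holds strings. The sketch below repeats the cleaning on made-up data and then makes the column types explicit. The frame `raw` and the type conversions at the end are my own assumptions and are not part of the lab.

```python
import pandas as pd

# Hypothetical scraped rows: one valid, one empty string, one null.
raw = pd.DataFrame({
    "Date": ["2021-03-31", "2020-12-31", "2009-12-31"],
    "Revenue": ["1234", "", None],
})

# Drop nulls, then drop empty strings, as in the two cells above.
clean = raw.dropna()
clean = clean[clean["Revenue"] != ""].copy()

# Optional extra step (an assumption, not in the lab): explicit types.
clean["Date"] = pd.to_datetime(clean["Date"])
clean["Revenue"] = pd.to_numeric(clean["Revenue"])

print(len(clean))                 # 1
print(clean["Revenue"].iloc[0])   # 1234
```

make_graph already casts Revenue with astype("float"), so this conversion is a convenience for other analysis rather than a requirement.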

We can display the last ten rows of the tesla_revenue dataframe using the tail function.

In [15]:
tesla_revenue.tail(10)

| | Date | Revenue |
|---|---|---|
| 42 | 2011-12-31 | 39 |
| 43 | 2011-09-30 | 58 |
| 44 | 2011-06-30 | 58 |
| 45 | 2011-03-31 | 49 |
| 46 | 2010-12-31 | 36 |
| 47 | 2010-09-30 | 31 |
| 48 | 2010-06-30 | 28 |
| 49 | 2010-03-31 | 21 |
| 51 | 2009-09-30 | 46 |
| 52 | 2009-06-30 | 27 |


**Step 8: Using the yfinance Library to Extract GameStop Stock**


I will use the Ticker function with the ticker symbol of the stock we want to extract data on to create a ticker object. The stock is GameStop and its ticker symbol is GME.

In [16]:
GameStop = yf.Ticker("GME")

**Step 9: Using the ticker object and the function history**

I am going to use the ticker object and the function history to extract stock information and save it in a dataframe named gme_data. The period parameter is set to max so we get information for the maximum amount of time.

In [17]:
gme_data = GameStop.history(period = 'max')

**Step 10: Resetting Index**


I am going to use the reset_index(inplace=True) function on the gme_data DataFrame and display the first five rows of the gme_data dataframe using the head function.

In [18]:
gme_data.reset_index(inplace = True)
gme_data.head()

| | Date | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|---|---|---|---|---|---|---|---|
| 0 | 2002-02-13 | 1.620128 | 1.69335 | 1.603296 | 1.691666 | 76216000 | 0.0 | 0.0 |
| 1 | 2002-02-14 | 1.712707 | 1.716074 | 1.670626 | 1.68325 | 11021600 | 0.0 | 0.0 |
| 2 | 2002-02-15 | 1.68325 | 1.687458 | 1.658002 | 1.674834 | 8389600 | 0.0 | 0.0 |
| 3 | 2002-02-19 | 1.666418 | 1.666418 | 1.578047 | 1.607504 | 7410400 | 0.0 | 0.0 |
| 4 | 2002-02-20 | 1.61592 | 1.66221 | 1.603296 | 1.66221 | 6892800 | 0.0 | 0.0 |


In [19]:
gme_data.describe()

| | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|---|---|---|---|---|---|---|
| count | 5176.0 | 5176.0 | 5176.0 | 5176.0 | 5176.0 | 5176.0 | 5176.0 |
| mean | 6.958454 | 7.208987 | 6.709848 | 6.940891 | 14858040.0 | 0.000462 | 0.001159 |
| std | 10.321294 | 10.986968 | 9.675121 | 10.23014 | 29988380.0 | 0.006271 | 0.062156 |
| min | 0.643843 | 0.672459 | 0.631219 | 0.638794 | 260000.0 | 0.0 | 0.0 |
| 25% | 2.745818 | 2.783703 | 2.702865 | 2.739888 | 6228700.0 | 0.0 | 0.0 |
| 50% | 4.019602 | 4.076833 | 3.945118 | 4.017389 | 10071400.0 | 0.0 | 0.0 |
| 75% | 6.654809 | 6.802219 | 6.51539 | 6.652946 | 15546100.0 | 0.0 | 0.0 |
| max | 94.927498 | 120.75 | 72.877502 | 86.877502 | 788631600.0 | 0.095 | 4.0 |


**Step 11: Using Webscraping to Extract GME Revenue Data**

We can use the requests library to download the webpage https://www.macrotrends.net/stocks/charts/GME/gamestop/revenue. Then we will save the text of the response as a variable named html_data.

In [20]:
url = "https://www.macrotrends.net/stocks/charts/GME/gamestop/revenue"
html_data = requests.get(url).text

**Step 12: Parsing the HTML data using BeautifulSoup**

In [21]:
soup = BeautifulSoup(html_data, "html.parser")
soup.find_all('title')

[<title>GameStop Revenue 2010-2022 | GME | MacroTrends</title>]

Using BeautifulSoup or the read_html function, we extract the table with GameStop Quarterly Revenue and store it in a dataframe named gme_revenue. The dataframe should have columns Date and Revenue. The comma and dollar sign are removed from the Revenue column in the same way as for the Tesla revenue data.

In [22]:
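# Collect one dict per table row, then build the dataframe once
# (DataFrame.append is deprecated in newer pandas releases).
rows = []

for row in soup.find_all("tbody")[1].find_all("tr"):
    col = row.find_all("td")
    date = col[0].text
    # Strip the dollar sign and thousands separator from the revenue text.
    revenue = col[1].text.replace("$", "").replace(",", "")
    rows.append({"Date": date, "Revenue": revenue})

gme_revenue = pd.DataFrame(rows, columns = ['Date', 'Revenue'])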

The loop above already removed the **comma** and **dollar** sign from the **Revenue column**. The following line drops any rows that contain **null** values.

In [23]:
gme_revenue.dropna(inplace=True)

We can execute the following line to remove rows with **empty strings** in the **Revenue column**.

In [24]:
gme_revenue = gme_revenue[gme_revenue['Revenue'] != ""]

We can display the last ten rows of the gme_revenue dataframe using the tail function. 

In [25]:
gme_revenue.dropna(inplace=True)
gme_revenue = gme_revenue[gme_revenue['Revenue'] != ""]
gme_revenue.tail(10)

| | Date | Revenue |
|---|---|---|
| 44 | 2011-04-30 | 2281 |
| 45 | 2011-01-31 | 3693 |
| 46 | 2010-10-31 | 1899 |
| 47 | 2010-07-31 | 1799 |
| 48 | 2010-04-30 | 2083 |
| 49 | 2010-01-31 | 3524 |
| 50 | 2009-10-31 | 1835 |
| 51 | 2009-07-31 | 1739 |
| 52 | 2009-04-30 | 1981 |
| 53 | 2009-01-31 | 3492 |


**Step 13: Plotting Tesla Stock Graph**



We can use the make_graph function to graph the Tesla Stock Data, also provide a title for the graph. The structure to call the make_graph function is make_graph(tesla_data, tesla_revenue, 'Tesla').

We use Plotly because it is a high-level, declarative charting library that renders interactive charts in the browser through its JavaScript library, plotly.js. This example uses a range slider.

In [26]:
!pip install plotly

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [27]:
!pip install -U kaleido

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting kaleido
  Downloading kaleido-0.2.1-py2.py3-none-manylinux1_x86_64.whl (79.9 MB)
Installing collected packages: kaleido
Successfully installed kaleido-0.2.1


In [28]:
import plotly.graph_objects as go

In [29]:
make_graph(tesla_data, tesla_revenue, 'Tesla')

**Step 14: Plotting GameStop Stock Graph**


As before, we can use the make_graph function to graph the GameStop Stock Data and provide a title for the graph.

In [30]:
make_graph(gme_data, gme_revenue, 'GameStop')

**Step 15: More advanced visualization with interactive selection** 

***Candle Stick Chart for TESLA Stock***

In [31]:
# load default data
df = Tesla.history(period= 'max')
df.reset_index(inplace=True)
# Create figure
fig3 = go.Figure(data=[go.Candlestick(x=df['Date'],
                open=df['Open'],
                high=df['High'],
                low=df['Low'],
                close=df['Close'])])

# Set title
fig3.update_layout(title_text="Candle Stick chart with range slider and selectors")

# Add range slider
fig3.update_layout(
    xaxis=dict(
        rangeselector=dict(
            buttons=list([
                dict(count=1,
                     label="1m",
                     step="month",
                     stepmode="backward"),
                dict(count=6,
                     label="6m",
                     step="month",
                     stepmode="backward"),
                dict(count=1,
                     label="YTD",
                     step="year",
                     stepmode="todate"),
                dict(count=1,
                     label="1y",
                     step="year",
                     stepmode="backward"),
                dict(step="all")
            ])
        ),
        rangeslider=dict(
            visible=True
        ),
        type="date"
    )
)

fig3.show()


***Candle Stick Chart for GameStop Stock***

In [32]:


# load default data
df = GameStop.history(period= 'max')
df.reset_index(inplace=True)
# Create figure
fig4 = go.Figure(data=[go.Candlestick(x=df['Date'],
                open=df['Open'],
                high=df['High'],
                low=df['Low'],
                close=df['Close'])])

# Set title
fig4.update_layout(title_text="Candle Stick chart with range slider and selectors")

# Add range slider
fig4.update_layout(
    xaxis=dict(
        rangeselector=dict(
            buttons=list([
                dict(count=1,
                     label="1m",
                     step="month",
                     stepmode="backward"),
                dict(count=6,
                     label="6m",
                     step="month",
                     stepmode="backward"),
                dict(count=1,
                     label="YTD",
                     step="year",
                     stepmode="todate"),
                dict(count=1,
                     label="1y",
                     step="year",
                     stepmode="backward"),
                dict(step="all")
            ])
        ),
        rangeslider=dict(
            visible=True
        ),
        type="date"
    )
)

fig4.show()
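
The two candlestick cells repeat the same x-axis configuration. As a sketch, that configuration could be built once by a small helper and reused for both figures. The function name `range_selector_axis` is hypothetical; the button values mirror the cells above.

```python
def range_selector_axis():
    """Return x-axis settings with range-selector buttons and a range slider."""
    # (count, label, step, stepmode) for each button, mirroring the cells above.
    presets = [
        (1, "1m", "month", "backward"),
        (6, "6m", "month", "backward"),
        (1, "YTD", "year", "todate"),
        (1, "1y", "year", "backward"),
    ]
    buttons = [
        dict(count=count, label=label, step=step, stepmode=stepmode)
        for count, label, step, stepmode in presets
    ]
    buttons.append(dict(step="all"))  # button that shows the full history
    return dict(
        rangeselector=dict(buttons=buttons),
        rangeslider=dict(visible=True),
        type="date",
    )


axis = range_selector_axis()
print([b.get("label", "all") for b in axis["rangeselector"]["buttons"]])
# ['1m', '6m', 'YTD', '1y', 'all']
```

Each figure would then call fig3.update_layout(xaxis=range_selector_axis()) or fig4.update_layout(xaxis=range_selector_axis()) instead of repeating the nested dict.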
