<h2>Analyzing Stock/Revenue Data</h2>
<div class="alert alert-block alert-info" style="margin-top: 20px">
    <ul>
        <li>Define a Function that Makes a Graph</li>
        <li>Question 1: Use yfinance to Extract Stock Data</li>
        <li>Question 2: Use Webscraping to Extract Tesla Revenue Data</li>
        <li>Question 3: Use yfinance to Extract Stock Data</li>
        <li>Question 4: Use Webscraping to Extract GME Revenue Data</li>
        <li>Question 5: Plot Tesla Stock Graph</li>
        <li>Question 6: Plot GameStop Stock Graph</li>
    </ul>
</div>

<hr>

In [1]:
import yfinance as yf
import pandas as pd
import requests
from bs4 import BeautifulSoup
import plotly.graph_objects as go
from plotly.subplots import make_subplots

In [17]:
def make_graph(stock_data, revenue_data, stock):
    # Two stacked panels sharing the x-axis: share price on top, revenue below
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=("Historical Share Price", "Historical Revenue"), vertical_spacing=.3)
    # Keep only the rows up to the fixed cutoff dates
    stock_data_specific = stock_data[stock_data.Date <= '2021-06-14']
    revenue_data_specific = revenue_data[revenue_data.Date <= '2021-04-30']
    fig.add_trace(go.Scatter(x=pd.to_datetime(stock_data_specific.Date), y=stock_data_specific.Close.astype("float"), name="Share Price"), row=1, col=1)
    fig.add_trace(go.Scatter(x=pd.to_datetime(revenue_data_specific.Date), y=revenue_data_specific.Revenue.astype("float"), name="Revenue"), row=2, col=1)
    fig.update_xaxes(title_text="Date", row=1, col=1)
    fig.update_xaxes(title_text="Date", row=2, col=1)
    fig.update_yaxes(title_text="Price ($US)", row=1, col=1)
    fig.update_yaxes(title_text="Revenue ($US Millions)", row=2, col=1)
    fig.update_layout(showlegend=False,
                      height=900,
                      title=stock,
                      xaxis_rangeslider_visible=True)
    fig.show()

### **QUESTION 1**

- Using the `Ticker` function, enter the ticker symbol of the stock we want to extract data on to create a ticker object. The stock is Tesla and its ticker symbol is TSLA.

- Using the ticker object and the `history` function, extract the stock information and save it in a dataframe named tesla_data. Limit the data to the current month (the cell below does this with the `start` and `end` parameters); if you run it yourself, use your own current month.
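
One way to avoid hard-coding the month is to compute its boundaries from today's date. The sketch below is only an illustration: `current_month_range` is a hypothetical helper, not part of yfinance, and it assumes that `history` treats `end` as exclusive, so the first day of the following month is used as the upper bound.

```python
import datetime as dt

def current_month_range(today=None):
    """Return (start, end) ISO date strings covering the month of `today`.

    `end` is the first day of the following month, on the assumption
    that the data source treats the end date as exclusive.
    """
    today = today or dt.date.today()
    start = today.replace(day=1)
    if start.month == 12:
        # December rolls over into January of the next year
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start.isoformat(), end.isoformat()

# Fixed date used here so the result is reproducible
start, end = current_month_range(dt.date(2023, 4, 6))

# The strings could then be passed on, for example:
# tesla_data = yf.Ticker("TSLA").history(start=start, end=end)
```

Calling `current_month_range()` with no argument uses the current date instead of the fixed one.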

In [64]:
tesla = yf.Ticker("TSLA")
tesla_data = tesla.history(start="2023-04-01", end="2023-04-07")
tesla_data.reset_index(inplace=True)
# reset_index moves 'Date' from the index into a regular column of the dataframe
tesla_data

Unnamed: 0,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
0,2023-04-03 00:00:00-04:00,199.910004,202.690002,192.199997,194.770004,169545900,0.0,0.0
1,2023-04-04 00:00:00-04:00,197.320007,198.740005,190.320007,192.580002,126463800,0.0,0.0
2,2023-04-05 00:00:00-04:00,190.520004,190.679993,183.759995,185.520004,133882500,0.0,0.0
3,2023-04-06 00:00:00-04:00,183.080002,186.389999,179.740005,185.059998,123583100,0.0,0.0


### **QUESTION 2**

- Use the requests library to download the webpage https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue. Save the text of the response as a variable named html_data.

- Parse the HTML data using BeautifulSoup.

- Using BeautifulSoup, extract the table with Tesla Quarterly Revenue and store it in a dataframe named tesla_revenue. The dataframe should have the columns Date and Revenue. Make sure the commas and dollar signs are removed from the Revenue column.
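
The table extraction described above can also be done directly with BeautifulSoup. The following is a minimal sketch, not the method used in the cells of this notebook: `sample_html` is a small hand-written stand-in for the downloaded page (the real markup may differ), and `tesla_revenue_bs` is a hypothetical name chosen so it does not clash with `tesla_revenue`.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hand-written stand-in for html_data; the real page layout is an assumption
sample_html = """
<table>
  <thead><tr><th colspan="2">Tesla Quarterly Revenue (Millions of US $)</th></tr></thead>
  <tbody>
    <tr><td>2022-12-31</td><td>$24,318</td></tr>
    <tr><td>2022-09-30</td><td>$21,454</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(sample_html, "html.parser")

rows = []
for table in soup.find_all("table"):
    # Skip tables whose text does not mention the title we are looking for
    if "Tesla Quarterly Revenue" not in table.get_text():
        continue
    for tr in table.find("tbody").find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) != 2:
            continue  # ignore rows that are not (date, revenue) pairs
        date = cells[0].get_text(strip=True)
        # Strip the dollar sign and the thousands separator
        revenue = cells[1].get_text(strip=True).replace("$", "").replace(",", "")
        rows.append({"Date": date, "Revenue": revenue})
    break  # stop after the first matching table

tesla_revenue_bs = pd.DataFrame(rows, columns=["Date", "Revenue"])
```

At this point the Revenue values are still strings; converting them to numbers is a separate step.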

In [79]:
url = "https://www.macrotrends.net/stocks/charts/TSLA/tesla/revenue"
html_data = requests.get(url).text
soup = BeautifulSoup(html_data, "html5lib")

In [80]:
# Locate the "Tesla Quarterly Revenue" table on the page
tesla_revenue= pd.read_html(url, match="Tesla Quarterly Revenue")[0]
tesla_revenue.head()

Unnamed: 0,Tesla Quarterly Revenue (Millions of US $),Tesla Quarterly Revenue (Millions of US $).1
0,2022-12-31,"$24,318"
1,2022-09-30,"$21,454"
2,2022-06-30,"$16,934"
3,2022-03-31,"$18,756"
4,2021-12-31,"$17,719"


In [81]:
# Rename the two columns to 'Date' and 'Revenue'
cols = tesla_revenue.columns
tesla_revenue.rename(columns={cols[0]: 'Date', cols[1]: 'Revenue'}, inplace=True)

In [83]:
# Remove the dollar signs and commas so the values can be converted to numbers later.
# regex=False treats "$" as a literal character rather than a regex anchor.
tesla_revenue['Revenue'] = tesla_revenue['Revenue'].str.replace("$", "", regex=False).str.replace(',', '', regex=False)

In [84]:
# Convert the Revenue values from strings to floats
tesla_revenue['Revenue'] = tesla_revenue['Revenue'].astype(float)

In [88]:
# Check that everything looks good so far
tesla_revenue.head()

Unnamed: 0,Date,Revenue
0,2022-12-31,24318.0
1,2022-09-30,21454.0
2,2022-06-30,16934.0
3,2022-03-31,18756.0
4,2021-12-31,17719.0


In [86]:
# Another step we can take is to clean NaN/null values.
# First, check how many null values each column has.
print("Number of null values by column")
tesla_revenue.isna().sum()

Number of null values by column


Date       0
Revenue    1
dtype: int64

In [89]:
# There is only one null value, so we remove that row
tesla_revenue.dropna(inplace=True)
# and check again
print("Number of null values by column")
tesla_revenue.isna().sum()

Number of null values by column


Date       0
Revenue    0
dtype: int64
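
The cleaning steps above (strip symbols, convert to numbers, drop nulls) can be gathered into one function. This is a sketch under stated assumptions rather than the notebook's own method: `clean_revenue` is a hypothetical helper, it swaps `astype(float)` for `pd.to_numeric(..., errors="coerce")` so that unparseable entries become NaN instead of raising, and the third row of the `raw` sample (its date and missing value) is invented purely for illustration.

```python
import pandas as pd

# Small sample; the third row is made up to show how a missing value is handled
raw = pd.DataFrame({
    "Date": ["2022-12-31", "2022-09-30", "2009-06-30"],
    "Revenue": ["$24,318", "$21,454", None],
})

def clean_revenue(df):
    """Return a copy with Revenue as floats and rows lacking revenue removed."""
    out = df.copy()
    out["Revenue"] = (
        out["Revenue"]
        .str.replace("$", "", regex=False)   # literal dollar sign
        .str.replace(",", "", regex=False)   # thousands separator
    )
    # Anything that cannot be parsed as a number becomes NaN
    out["Revenue"] = pd.to_numeric(out["Revenue"], errors="coerce")
    return out.dropna(subset=["Revenue"]).reset_index(drop=True)

cleaned = clean_revenue(raw)
```

Working on a copy leaves the original dataframe untouched, which makes it easier to rerun a cell without side effects.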