<h1>Extracting and Visualizing Stock Data</h1>


<h2>Description</h2>
Extracting the essential data from a dataset and displaying it is a core part of data science, because it lets people make well-informed decisions based on that data. In this mini project, we extract stock and revenue data for Tesla and GameStop, display the data in graphs, and analyze it.

In [None]:
# Install the libraries used in this notebook
!pip install yfinance==0.2.38
# Pin pandas to 2.2.2 so that yfinance works correctly
!pip install pandas==2.2.2
!pip install nbformat

Collecting pandas==2.2.2
  Downloading pandas-2.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.0 MB)
Installing collected packages: pandas
  Attempting uninstall: pandas
    Found existing installation: pandas 2.0.3
    Uninstalling pandas-2.0.3:
      Successfully uninstalled pandas-2.0.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires pandas==2.0.3, but you have pandas 2.2.2 which is incompatible.
Successfully installed pandas-2.2.2


Importing important libraries

In [None]:
import yfinance as yf
import pandas as pd
import requests
from bs4 import BeautifulSoup
import plotly.graph_objects as go
from plotly.subplots import make_subplots

In [None]:
import warnings
# Suppress FutureWarning messages only; other warning categories are still shown
warnings.filterwarnings("ignore", category=FutureWarning)

## Defining Graphing Function


In [None]:
def make_graph(stock_data, revenue_data, stock):
    """Plot share price (top) and revenue (bottom); `stock` is used as the figure title."""
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=("Historical Share Price", "Historical Revenue"), vertical_spacing = .3)
    # Keep only rows dated on or before 2020-04-30
    stock_data_specific = stock_data[stock_data.Date <= '2020-04-30']
    revenue_data_specific = revenue_data[revenue_data.Date <= '2020-04-30']
    # Note: infer_datetime_format is deprecated in pandas 2.x, so these calls emit a warning
    fig.add_trace(go.Scatter(x=pd.to_datetime(stock_data_specific.Date, infer_datetime_format=True), y=stock_data_specific.Close.astype("float"), name="Share Price"), row=1, col=1)
    fig.add_trace(go.Scatter(x=pd.to_datetime(revenue_data_specific.Date, infer_datetime_format=True), y=revenue_data_specific.Revenue.astype("float"), name="Revenue"), row=2, col=1)
    # Label the axes of both subplots
    fig.update_xaxes(title_text="Date", row=1, col=1)
    fig.update_xaxes(title_text="Date", row=2, col=1)
    fig.update_yaxes(title_text="Price ($US)", row=1, col=1)
    fig.update_yaxes(title_text="Revenue ($US Millions)", row=2, col=1)
    fig.update_layout(showlegend=False,
                      height=900,
                      title=stock,
                      xaxis_rangeslider_visible=True)
    fig.show()

## Part 1: Using yfinance to Extract Tesla Stock Data


In [None]:
tesla = yf.Ticker("TSLA")

In [None]:
tesla_data = tesla.history(period='max')
tesla_data.reset_index(inplace=True)
tesla_data.head(10)

Unnamed: 0,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
0,2010-06-29 00:00:00-04:00,1.266667,1.666667,1.169333,1.592667,281494500,0.0,0.0
1,2010-06-30 00:00:00-04:00,1.719333,2.028,1.553333,1.588667,257806500,0.0,0.0
2,2010-07-01 00:00:00-04:00,1.666667,1.728,1.351333,1.464,123282000,0.0,0.0
3,2010-07-02 00:00:00-04:00,1.533333,1.54,1.247333,1.28,77097000,0.0,0.0
4,2010-07-06 00:00:00-04:00,1.333333,1.333333,1.055333,1.074,103003500,0.0,0.0
5,2010-07-07 00:00:00-04:00,1.093333,1.108667,0.998667,1.053333,103825500,0.0,0.0
6,2010-07-08 00:00:00-04:00,1.076,1.168,1.038,1.164,115671000,0.0,0.0
7,2010-07-09 00:00:00-04:00,1.172,1.193333,1.103333,1.16,60759000,0.0,0.0
8,2010-07-12 00:00:00-04:00,1.196667,1.204667,1.133333,1.136667,33037500,0.0,0.0
9,2010-07-13 00:00:00-04:00,1.159333,1.242667,1.126667,1.209333,40201500,0.0,0.0
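The `Date` values in the output above carry a UTC offset (`-04:00`), while the scraped revenue dates are plain strings. The following is a minimal sketch, on a two-row sample copied from that output, of how the timezone could be dropped so both tables use comparable dates. The `America/New_York` timezone and the `prices` name are assumptions made for illustration.

```python
import pandas as pd

# Two rows copied from the output above; the timezone name is an assumption
prices = pd.DataFrame({
    "Date": pd.to_datetime(["2010-06-29", "2010-06-30"]).tz_localize("America/New_York"),
    "Close": [1.592667, 1.588667],
})

# Drop the timezone but keep the local wall-clock time
prices["Date"] = prices["Date"].dt.tz_localize(None)

print(prices.dtypes)
```

After this step the column is timezone-naive, so it can be merged or compared with other timezone-naive date columns without further conversion.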


## Part 2: Use Webscraping to Extract Tesla Revenue Data


Use the `requests` library to download the webpage https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/revenue.htm. Save the text of the response in a variable named `html_data`, then parse it with `BeautifulSoup`.


In [None]:
url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/revenue.htm"
# Download the page and keep the response body as text
html_data = requests.get(url).text
soup = BeautifulSoup(html_data, "html.parser")
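The download above has no timeout and does not check the HTTP status. The following is a minimal sketch of a more defensive variant. The `fetch_html` helper and its `session` parameter are hypothetical names introduced here, not part of the original notebook. `session` only exists so the helper can be exercised without network access.

```python
import requests


def fetch_html(url, timeout=10.0, session=None):
    """Return the body of `url` as text, raising on HTTP errors.

    `session` is a hypothetical hook: any object with a requests-style
    `get(url, timeout=...)` method can be passed in for offline testing.
    """
    http = session if session is not None else requests
    response = http.get(url, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response.text
```

With this helper, the download cell could read `html_data = fetch_html(url)`, and a failed request would raise an exception instead of silently producing an error page for the parser.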

In [None]:
table_html = soup.find_all('tbody')[1]

In [None]:
tesla_revenue = pd.DataFrame(columns=['Date', 'Revenue'])

In [None]:
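data = []
# Each table row holds the quarter end date and the revenue for that quarter
for row in table_html.find_all("tr"):
    col = row.find_all("td")
    quarter = col[0].text
    # Strip thousands separators and the dollar sign from the revenue text
    revenue = col[1].text.replace(",", "").replace("$", "")
    data.append({'Date': quarter, 'Revenue': revenue})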
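The string cleaning in the loop above can be checked in isolation. The following is a minimal sketch using hand-written sample cells. The raw `$1,021` style formatting and the blank third row are assumptions about how the page renders revenue, and `clean_revenue` is a hypothetical helper name.

```python
def clean_revenue(raw):
    """Strip whitespace, '$' and ',' from a revenue cell; blank cells become ''."""
    return raw.strip().replace("$", "").replace(",", "")


# Hand-written sample cells (assumed formatting), including one blank revenue
sample_cells = [("2020-04-30", "$1,021"), ("2020-01-31", "$2,194"), ("2009-06-30", "")]

records = []
for quarter, raw_revenue in sample_cells:
    revenue = clean_revenue(raw_revenue)
    if revenue:  # skip rows whose revenue cell is empty
        records.append({"Date": quarter, "Revenue": int(revenue)})

print(records)
```

Filtering blank cells inside the loop is one option. The notebook instead keeps every row and removes empty strings from the DataFrame afterwards, which gives the same result.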

In [None]:
tesla_revenue = pd.DataFrame(data)

In [None]:
tesla_revenue.head()

Unnamed: 0,Date,Revenue
0,2020-04-30,1021
1,2020-01-31,2194
2,2019-10-31,1439
3,2019-07-31,1286
4,2019-04-30,1548


Remove any null values and empty strings from the Revenue column.


In [None]:
tesla_revenue.dropna(inplace=True)

tesla_revenue = tesla_revenue[tesla_revenue['Revenue'] != ""]

In [None]:
tesla_revenue.head()

Unnamed: 0,Date,Revenue
0,2020-04-30,1021
1,2020-01-31,2194
2,2019-10-31,1439
3,2019-07-31,1286
4,2019-04-30,1548
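The cleaned `Revenue` column still holds strings. The following is a minimal sketch of an alternative that converts the column to numbers and drops anything unparseable in one pass, using `pd.to_numeric` with `errors="coerce"`. The sample rows are illustrative, and the empty third value is an assumption added to show the filtering.

```python
import pandas as pd

# Illustrative sample; the empty third value is added to show the filtering
revenue = pd.DataFrame({
    "Date": ["2020-04-30", "2020-01-31", "2009-06-30"],
    "Revenue": ["1021", "2194", ""],
})

# Values that cannot be parsed (such as "") become NaN and are then dropped
revenue["Revenue"] = pd.to_numeric(revenue["Revenue"], errors="coerce")
revenue = revenue.dropna(subset=["Revenue"]).reset_index(drop=True)

print(revenue)
```

With a numeric column, the later `astype("float")` call in `make_graph` becomes a no-op rather than a string conversion.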


## Part 3: Use yfinance to Extract GameStop Stock Data


In [None]:
gamestop = yf.Ticker("GME")
gamestop_data = gamestop.history(period='max')

In [None]:
gamestop_data.reset_index(inplace=True)
gamestop_data.head(10)

Unnamed: 0,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
0,2002-02-13 00:00:00-05:00,1.620128,1.69335,1.603296,1.691667,76216000,0.0,0.0
1,2002-02-14 00:00:00-05:00,1.712707,1.716073,1.670626,1.68325,11021600,0.0,0.0
2,2002-02-15 00:00:00-05:00,1.68325,1.687458,1.658001,1.674834,8389600,0.0,0.0
3,2002-02-19 00:00:00-05:00,1.666418,1.666418,1.578047,1.607504,7410400,0.0,0.0
4,2002-02-20 00:00:00-05:00,1.615921,1.66221,1.603296,1.66221,6892800,0.0,0.0
5,2002-02-21 00:00:00-05:00,1.656318,1.670626,1.641169,1.658002,6976800,0.0,0.0
6,2002-02-22 00:00:00-05:00,1.670626,1.670626,1.615921,1.628545,3525600,0.0,0.0
7,2002-02-25 00:00:00-05:00,1.624336,1.653793,1.605821,1.641169,3453600,0.0,0.0
8,2002-02-26 00:00:00-05:00,1.632753,1.658002,1.606662,1.641169,2761600,0.0,0.0
9,2002-02-27 00:00:00-05:00,1.628545,1.629387,1.599088,1.611712,4091200,0.0,0.0


## Part 4: Use Webscraping to Extract GME Revenue Data


In [None]:
url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/stock.html"
data = requests.get(url).text
soup = BeautifulSoup(data, "html.parser")

In [None]:
table_html = soup.find_all('tbody')[1]

In [None]:
gme_revenue = pd.DataFrame(columns=['Date', 'Revenue'])
data = []
for row in table_html.find_all("tr"):
    col = row.find_all("td")
    quarter = col[0].text
    revenue = col[1].text.replace(",", "").replace("$", "")
    data.append({'Date': quarter, 'Revenue': revenue})

In [None]:
gme_revenue = pd.DataFrame(data)

Remove any null values and empty strings from the Revenue column.

In [None]:
gme_revenue.dropna(inplace=True)

# Filter the GameStop frame itself (not tesla_revenue) to drop empty revenue strings
gme_revenue = gme_revenue[gme_revenue['Revenue'] != ""]
gme_revenue.head()

Unnamed: 0,Date,Revenue
0,2020-04-30,1021
1,2020-01-31,2194
2,2019-10-31,1439
3,2019-07-31,1286
4,2019-04-30,1548


## Part 5: Tesla Stock Graph


In [None]:
make_graph(tesla_data, tesla_revenue, 'Tesla Stock Graph')


The argument 'infer_datetime_format' is deprecated and will be removed in a future version. A strict version of it is now the default, see https://pandas.pydata.org/pdeps/0004-consistent-to-datetime-parsing.html. You can safely remove this argument.
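The warning above comes from the `infer_datetime_format=True` argument passed to `pd.to_datetime` inside `make_graph`. The following is a minimal sketch, on a few sample date strings, showing that the conversion works without that argument and that an explicit `format` can be given when the layout is known. The `%Y-%m-%d` format is an assumption based on the revenue dates shown earlier.

```python
import pandas as pd

dates = pd.Series(["2020-04-30", "2020-01-31", "2019-10-31"])

# Default parsing; no infer_datetime_format argument is needed
parsed = pd.to_datetime(dates)

# Explicit format (assumed layout) for stricter parsing
parsed_explicit = pd.to_datetime(dates, format="%Y-%m-%d")

print(parsed.dtype)
```

Dropping the argument from the two `pd.to_datetime` calls in `make_graph` should silence the warning without changing the plotted values.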





## Part 6: GameStop Stock Graph


In [None]:
make_graph(gamestop_data, gme_revenue, 'GameStop Stock Graph')





<h2>Graph Overview</h2>
The graph is structured into two main parts:

1. **Historical Share Price:** This part of the graph shows the stock's closing prices over the plotted period. The x-axis represents the date, and the y-axis represents the closing price in USD. This lets viewers observe how the stock price has changed over time and identify upward, downward, or sideways trends.

2. **Historical Revenue:** This section plots the company's quarterly revenue. As in the share price graph, the x-axis represents the date (specifically the end of each quarter), and the y-axis shows revenue in millions of US dollars. This graph helps in understanding the company's financial performance over time and in relating revenue changes to possible stock price changes.

<h2>Analysis of Trends</h2>

**Stock Price Trends:** By examining the stock price graph, one can identify periods of growth, decline, or stability. For instance, significant upward trends might indicate periods of strong market confidence or positive company developments, while downward trends could suggest market challenges or poor financial performance.

**Revenue Trends:** Changes in the revenue graph can provide insights into the company's operational performance. Sharp increases in revenue might correlate with product launches or market expansion, while declines could reflect decreased sales or adverse market conditions.

<h2>Correlation Between Stock Prices and Revenue</h2>

**Direct Correlation:** Revenue figures and stock prices may move in the same direction. For example, if revenue increases significantly due to successful product launches or market expansion, the stock price might also rise as investor confidence grows.

**Lag or Lead Times:** It's also possible to observe lag or lead times between revenue changes and stock price adjustments. Investors might react to revenue announcements with a delay, or stock prices might anticipate future revenue changes based on market rumors or forecasts.

The graph provides a visual representation that helps in analyzing the financial health and market perception of a particular company over time. By relating stock price trends to revenue data, investors and analysts can gauge the company's performance and form expectations about future movements based on historical data. This analysis supports informed investment decisions, an understanding of market sentiment, and an assessment of how external factors affect the company's stock performance.
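The correlation and lag ideas described in this section can be sketched numerically. The example below uses made-up prices and revenues (not real Tesla or GameStop figures) and assumes both `Date` columns are timezone-naive. It uses `pd.merge_asof` to pick the last closing price on or before each quarter end, then compares a same-quarter correlation with one where revenue is shifted by one quarter.

```python
import pandas as pd

# Made-up sample data for illustration only
stock = pd.DataFrame({
    "Date": pd.to_datetime(["2019-04-30", "2019-07-30", "2019-10-31",
                            "2020-01-30", "2020-04-30"]),
    "Close": [10.0, 11.5, 15.0, 16.0, 21.0],
})
revenue = pd.DataFrame({
    "Date": pd.to_datetime(["2019-04-30", "2019-07-31", "2019-10-31",
                            "2020-01-31", "2020-04-30"]),
    "Revenue": [100.0, 120.0, 150.0, 170.0, 210.0],
})

# For each quarter end, take the last close on or before that date
aligned = pd.merge_asof(revenue.sort_values("Date"), stock.sort_values("Date"), on="Date")

# Pearson correlation between revenue and price in the same quarter
same_quarter = aligned["Revenue"].corr(aligned["Close"])

# Previous quarter's revenue against the current price (a one-quarter lag)
lagged = aligned["Revenue"].shift(1).corr(aligned["Close"])

print(round(same_quarter, 3), round(lagged, 3))
```

With only a handful of quarters, such coefficients are descriptive at best, and a high value does not show that revenue drives the price.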