# 📈 Extracting and Visualizing Stock Data  


### Install Required Libraries in Google Colab

In [1]:
!pip install yfinance
!pip install bs4
!pip install nbformat

Collecting bs4
  Downloading bs4-0.0.2-py2.py3-none-any.whl.metadata (411 bytes)
Downloading bs4-0.0.2-py2.py3-none-any.whl (1.2 kB)
Installing collected packages: bs4
Successfully installed bs4-0.0.2


## Import Necessary Modules

In [2]:
import yfinance as yf
import pandas as pd
import requests
from bs4 import BeautifulSoup
import plotly.graph_objects as go
from plotly.subplots import make_subplots

In [3]:
import warnings
# Suppress FutureWarning messages only; other warning categories are still shown
warnings.filterwarnings("ignore", category=FutureWarning)

## 1: Define the graphing function.

In this section, we define the function `make_graph`, which visualizes stock and revenue data.  




### **Function Overview**  
The function takes in:  
- A dataframe with stock data (`Date` and `Close` columns).  
- A dataframe with revenue data (`Date` and `Revenue` columns).  
- The name of the stock for labeling the graph.  

### **Purpose**  
- Stock Price Visualization: Track trends in stock performance over time.  
- Revenue Analysis: Compare financial growth alongside stock behavior.  

Plotting both series against the same date axis makes it easier to compare share-price movements with changes in revenue.  
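
Because the function relies on specific column names, a quick check before plotting can catch a misnamed column early. The following is a minimal sketch, not part of the original notebook: `missing_columns` is a hypothetical helper, and the two small dataframes contain made-up values that only illustrate the expected shape.

```python
import pandas as pd


def missing_columns(df, required):
    """Return the required column names that are absent from df, in order."""
    return [name for name in required if name not in df.columns]


# Toy frames with made-up values, only to show the expected shape.
stock_example = pd.DataFrame({"Date": ["2021-01-04", "2021-01-05"], "Close": [10.0, 11.0]})
# The first column is deliberately misspelled as "Data".
revenue_example = pd.DataFrame({"Data": ["2020-12-31"], "Revenue": ["100"]})

print(missing_columns(stock_example, ["Date", "Close"]))      # → []
print(missing_columns(revenue_example, ["Date", "Revenue"]))  # → ['Date']
```

An empty list means the dataframe has every column the plotting function expects.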



In [4]:
def make_graph(stock_data, revenue_data, stock):
    # Two stacked panels sharing the date axis: share price on top, revenue below
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, subplot_titles=("Historical Share Price", "Historical Revenue"), vertical_spacing=.3)
    # Keep only the rows up to the fixed cutoff dates
    stock_data_specific = stock_data[stock_data.Date <= '2021-06-14']
    revenue_data_specific = revenue_data[revenue_data.Date <= '2021-04-30']
    # The deprecated infer_datetime_format argument is omitted; pandas infers the format by default
    fig.add_trace(go.Scatter(x=pd.to_datetime(stock_data_specific.Date), y=stock_data_specific.Close.astype("float"), name="Share Price"), row=1, col=1)
    fig.add_trace(go.Scatter(x=pd.to_datetime(revenue_data_specific.Date), y=revenue_data_specific.Revenue.astype("float"), name="Revenue"), row=2, col=1)
    fig.update_xaxes(title_text="Date", row=1, col=1)
    fig.update_xaxes(title_text="Date", row=2, col=1)
    fig.update_yaxes(title_text="Price ($US)", row=1, col=1)
    fig.update_yaxes(title_text="Revenue ($US Millions)", row=2, col=1)
    fig.update_layout(showlegend=False,
                      height=900,
                      title=stock,
                      xaxis_rangeslider_visible=True)
    fig.show()
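
The date cutoff inside `make_graph` can be tried in isolation. The sketch below is an illustration with made-up values, not the notebook's data. It converts the text dates to datetimes before comparing, which is a slight variation on the function above, where the `Date` column is compared with a date string directly.

```python
import pandas as pd

# Made-up quarterly values; dates are stored as text, as they are after scraping.
revenue_example = pd.DataFrame(
    {
        "Date": ["2021-06-30", "2021-03-31", "2020-12-31"],
        "Revenue": ["300", "200", "100"],
    }
)

# Convert to real datetimes first so the filter does not depend on string ordering.
dates = pd.to_datetime(revenue_example["Date"])
revenue_until_cutoff = revenue_example[dates <= pd.Timestamp("2021-04-30")]

print(revenue_until_cutoff["Date"].tolist())  # → ['2021-03-31', '2020-12-31']
```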

## 2: Extract Tesla stock data using yfinance.

To retrieve Tesla's stock data, we use the `yfinance` library.  
The `Ticker` function creates a ticker object from Tesla's ticker symbol, `TSLA`; its `history` method then returns the price history.

In [5]:
# Create a Ticker object for Tesla (TSLA)
Tesla = yf.Ticker("TSLA")

# Extract historical stock data
tesla_data = Tesla.history(period="max")

tesla_data.reset_index(inplace=True)
# Display the first few rows
tesla_data.head()

| | Date | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|---|---|---|---|---|---|---|---|
| 0 | 2010-06-29 00:00:00-04:00 | 1.266667 | 1.666667 | 1.169333 | 1.592667 | 281494500 | 0.0 | 0.0 |
| 1 | 2010-06-30 00:00:00-04:00 | 1.719333 | 2.028 | 1.553333 | 1.588667 | 257806500 | 0.0 | 0.0 |
| 2 | 2010-07-01 00:00:00-04:00 | 1.666667 | 1.728 | 1.351333 | 1.464 | 123282000 | 0.0 | 0.0 |
| 3 | 2010-07-02 00:00:00-04:00 | 1.533333 | 1.54 | 1.247333 | 1.28 | 77097000 | 0.0 | 0.0 |
| 4 | 2010-07-06 00:00:00-04:00 | 1.333333 | 1.333333 | 1.055333 | 1.074 | 103003500 | 0.0 | 0.0 |
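
The `reset_index` step matters because `make_graph` reads `Date` as an ordinary column. The sketch below uses a small dataframe with made-up prices, assumed to mimic the shape of a price history indexed by date, and shows how the date index becomes a column.

```python
import pandas as pd

# Made-up prices indexed by date, standing in for a downloaded price history.
prices = pd.DataFrame(
    {"Close": [10.0, 11.0]},
    index=pd.DatetimeIndex(["2021-01-04", "2021-01-05"], name="Date"),
)
columns_before = list(prices.columns)  # only "Close"; the dates live in the index

# Move the index into a regular column named after the index ("Date").
prices.reset_index(inplace=True)
columns_after = list(prices.columns)

print(columns_before)  # → ['Close']
print(columns_after)   # → ['Date', 'Close']
```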


## 3: Scrape Tesla revenue data using web scraping techniques.

To retrieve **Tesla's revenue data**, we use **web scraping techniques** to download and process structured financial information.  

### Approach  
- Access the **webpage** containing Tesla revenue data.  
- Extract the **HTML content** for parsing and analysis.  
- Store the extracted text in a variable for further processing.  

### Key Insight  
Web scraping pulls the revenue figures **directly from the page's HTML**, so the results reflect whatever the page contains at the time of the request.  


In [6]:
url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/revenue.htm"
html_data = requests.get(url).text

In [7]:
soup = BeautifulSoup(html_data, "html5lib")
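
The notebook goes on to read the tables with `pd.read_html`. As an alternative, table rows can be pulled from a `BeautifulSoup` object directly. The sketch below is illustrative only: `sample_html` is a hypothetical fragment with made-up values, and it uses Python's built-in `"html.parser"` rather than the `"html5lib"` parser used above.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment with made-up values, standing in for the downloaded page.
sample_html = """
<table>
  <tr><th>Date</th><th>Revenue</th></tr>
  <tr><td>2021-03-31</td><td>$1,200</td></tr>
  <tr><td>2020-12-31</td><td>$900</td></tr>
</table>
"""

# "html.parser" ships with Python, so no extra parser package is needed here.
sample_soup = BeautifulSoup(sample_html, "html.parser")

rows = []
for tr in sample_soup.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) == 2:  # the header row has <th> cells only, so it is skipped
        rows.append(cells)

print(rows)  # → [['2021-03-31', '$1,200'], ['2020-12-31', '$900']]
```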




In [8]:
tesla_revenue_table = pd.read_html(url)
tesla_revenue = tesla_revenue_table[1]
tesla_revenue.columns = ['Data', 'Revenue']

In [9]:
# Raw string avoids an invalid-escape warning; the pattern removes commas and dollar signs
tesla_revenue["Revenue"] = tesla_revenue['Revenue'].str.replace(r',|\$', "", regex=True)

In [10]:
tesla_revenue.dropna(inplace=True)

tesla_revenue = tesla_revenue[tesla_revenue['Revenue'] != ""]
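
To see what the cleaning pattern does, it can be applied to a few strings with Python's `re` module, which uses the same regular-expression syntax. The sample values below are made up for illustration.

```python
import re

# Same pattern as in the cleaning cell: match a comma or a literal dollar sign.
pattern = r",|\$"

samples = ["$1,200", "900", ""]  # made-up values
cleaned = [re.sub(pattern, "", s) for s in samples]

print(cleaned)  # → ['1200', '900', '']
```

An empty string stays empty after the substitution, which is why rows with an empty `Revenue` value are filtered out separately.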

In [11]:
tesla_revenue.tail()

| | Data | Revenue |
|---|---|---|
| 48 | 2010-09-30 | 31 |
| 49 | 2010-06-30 | 28 |
| 50 | 2010-03-31 | 21 |
| 52 | 2009-09-30 | 46 |
| 53 | 2009-06-30 | 27 |


## 4: Extract GameStop (GME) stock data using yfinance.

To retrieve GameStop's stock data, we use Yahoo Finance (yfinance), which provides historical market information.  

### Approach  
- Utilize the Ticker function to create an object using GameStop’s symbol: GME.  
- Extract key financial metrics such as stock price, trading volume, and historical trends.  
- Process and visualize the data for further analysis.  

### Key Insight  
Accessing stock data enables investors and analysts to track performance, identify trends, and make informed financial decisions.  


In [12]:
GameStop = yf.Ticker("GME")

In [13]:
gme_data = GameStop.history(period = "max")

In [14]:
gme_data.reset_index(inplace=True)
gme_data.head()

| | Date | Open | High | Low | Close | Volume | Dividends | Stock Splits |
|---|---|---|---|---|---|---|---|---|
| 0 | 2002-02-13 00:00:00-05:00 | 1.620129 | 1.693351 | 1.603297 | 1.691667 | 76216000 | 0.0 | 0.0 |
| 1 | 2002-02-14 00:00:00-05:00 | 1.712708 | 1.716074 | 1.670627 | 1.683251 | 11021600 | 0.0 | 0.0 |
| 2 | 2002-02-15 00:00:00-05:00 | 1.68325 | 1.687458 | 1.658002 | 1.674834 | 8389600 | 0.0 | 0.0 |
| 3 | 2002-02-19 00:00:00-05:00 | 1.666418 | 1.666418 | 1.578047 | 1.607504 | 7410400 | 0.0 | 0.0 |
| 4 | 2002-02-20 00:00:00-05:00 | 1.61592 | 1.66221 | 1.603296 | 1.66221 | 6892800 | 0.0 | 0.0 |


## 5: Scrape GameStop (GME) revenue data using web scraping.

To retrieve GameStop's revenue data, web scraping is used to access and extract financial details from an online source.  

### Approach  
- Use the requests library to **download** the webpage containing revenue data.  
- Extract and **store the HTML content** in a variable for further analysis.  
- Process the retrieved data to make it usable for visualization or insights.  

### Key Insight  
Scraping gives programmatic access to the published revenue table; the data is only as current as the page being scraped.  


In [15]:
url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/stock.html"
html_data_2 = requests.get(url).text

In [16]:
soup = BeautifulSoup(html_data_2, "html5lib")

In [17]:
gme_revenue_table = pd.read_html(url)
gme_revenue = gme_revenue_table[1]
gme_revenue.columns = ['Date', 'Revenue']
gme_revenue["Revenue"] = gme_revenue['Revenue'].str.replace(r',|\$', "", regex=True)
gme_revenue.dropna(inplace=True)

gme_revenue = gme_revenue[gme_revenue['Revenue'] != ""]
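
Since the Tesla and GameStop revenue tables go through the same cleaning steps, those steps could be collected into one helper. The sketch below is one possible arrangement, not the notebook's own code: `clean_revenue` is a hypothetical name, and `raw` holds made-up values.

```python
import pandas as pd


def clean_revenue(df):
    """Return a copy with Date/Revenue columns, symbols stripped, empty rows dropped."""
    out = df.copy()
    out.columns = ["Date", "Revenue"]
    # Remove commas and dollar signs from the revenue text.
    out["Revenue"] = out["Revenue"].str.replace(r",|\$", "", regex=True)
    # Drop rows with missing values, then rows whose revenue is an empty string.
    out = out.dropna()
    return out[out["Revenue"] != ""]


# Made-up two-column table: one usable row, one missing value, one empty string.
raw = pd.DataFrame(
    {"a": ["2021-03-31", "2020-12-31", "2020-09-30"], "b": ["$1,200", None, ""]}
)
cleaned_example = clean_revenue(raw)

print(cleaned_example["Revenue"].tolist())  # → ['1200']
```

Working on a copy leaves the scraped dataframe untouched, so the raw table can still be inspected after cleaning.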

In [18]:
gme_revenue.tail()

| | Date | Revenue |
|---|---|---|
| 57 | 2006-01-31 | 1667 |
| 58 | 2005-10-31 | 534 |
| 59 | 2005-07-31 | 416 |
| 60 | 2005-04-30 | 475 |
| 61 | 2005-01-31 | 709 |


## 6: Plot Tesla stock graph to analyze historical trends.

Using the `make_graph` function, the Tesla stock data is visualized to show trends and performance over time.  

### Approach  
- Pass the Tesla stock **dataframe** (containing `Date` and `Close` columns).  
- Include the Tesla revenue **dataframe** (containing `Date` and `Revenue` columns).  
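- Set the **graph title** to the stock name (`'Tesla'`); `make_graph` itself limits the stock data to dates up to **14 June 2021** and the revenue data to dates up to **30 April 2021**.  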

### Key Insight  
Visualizing stock performance helps in **analyzing market trends, identifying patterns, and making informed financial decisions**.  

Before plotting, the column names of both dataframes are checked. The Tesla revenue dataframe was created with a column named `Data`, so it is renamed to `Date`, the name `make_graph` expects.  


In [24]:
print(tesla_data.columns)
print(tesla_revenue.columns)


Index(['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Dividends',
       'Stock Splits'],
      dtype='object')
Index(['Data', 'Revenue'], dtype='object')


In [25]:
tesla_revenue.rename(columns={"Data": "Date"}, inplace=True)
print(tesla_revenue.columns)

Index(['Date', 'Revenue'], dtype='object')


In [26]:
make_graph(tesla_data, tesla_revenue, 'Tesla')





## Plotting GameStop Stock Graph  

The `make_graph` function is used to **visualize GameStop's stock trends**, showcasing performance up to **June 2021**.  

### Approach  
- Pass the **GameStop stock dataframe** (`gme_data`) containing `Date` and `Close` values.  
- Include the **GameStop revenue dataframe** (`gme_revenue`) containing `Date` and `Revenue` values.  
- Set the **graph title** to the stock name (`'GameStop'`); the June 2021 cutoff is applied inside `make_graph`.  
- Function call structure:  


In [27]:
make_graph(gme_data, gme_revenue, 'GameStop')



