
# **Extracting and Visualizing Stock Data using Python**

# Problem 2: Use Web Scraping to Extract Tesla Revenue Data



This notebook shows how to extract Tesla's revenue data with web scraping and then clean it. `requests` fetches the page's HTML, and `BeautifulSoup` parses it to locate the revenue table. The table is loaded into a pandas DataFrame, the columns are renamed to `Date` and `Revenue`, and the `Revenue` column is cleaned by removing dollar signs and thousands separators and by dropping missing or empty values. The resulting dataset is ready for analyzing Tesla's historical revenue.
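The parse, locate, and clean steps described above can be tried offline before running them against the real page. The following is a minimal sketch, not the notebook's own method: the inline `html_data` string is a hypothetical stand-in for the downloaded page, the empty 2011 cell is made up to exercise the empty-value filter, and the rows are collected by hand instead of with `pd.read_html`.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page (not the real page markup)
html_data = """
<html><body>
<table>
  <thead><tr><th>Tesla Annual Revenue</th><th></th></tr></thead>
  <tbody>
    <tr><td>2013</td><td>$2,013</td></tr>
    <tr><td>2012</td><td>$413</td></tr>
    <tr><td>2011</td><td></td></tr>
  </tbody>
</table>
</body></html>
"""

soup = BeautifulSoup(html_data, "html.parser")
table = soup.find("table")  # first <table> in the document

# Collect the text of every cell in each body row
rows = []
for tr in table.find("tbody").find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    rows.append(cells)

demo = pd.DataFrame(rows, columns=["Date", "Revenue"])

# Remove "$" and ","; regex=True makes pandas treat the pattern as a regular expression
demo["Revenue"] = demo["Revenue"].str.replace(r"[$,]", "", regex=True)

# Drop rows whose revenue is missing or empty
demo = demo.dropna()
demo = demo[demo["Revenue"] != ""]
print(demo)
```

Passing `regex=True` explicitly matters here: in recent pandas versions `Series.str.replace` treats the pattern as a literal string by default, so a pattern such as `,|\$` would otherwise match nothing.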

In [1]:
!pip install bs4
!pip install nbformat

import pandas as pd
import requests
from bs4 import BeautifulSoup
import warnings
# Suppress FutureWarning messages only (other warnings are still shown)
warnings.filterwarnings("ignore", category=FutureWarning)

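# URL of the page with Tesla's revenue data
tesla_url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0220EN-SkillsNetwork/labs/project/revenue.htm"

# Download the page and keep the raw HTML text
html_data = requests.get(tesla_url).text

# Parse the HTML so individual elements can be searched
soup = BeautifulSoup(html_data, "html.parser")
soup.title

# Take the first <table> on the page and load it into a DataFrame
table = soup.find("table")
tesla_revenue = pd.read_html(str(table))[0]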

# Rename columns to 'Date' and 'Revenue'
tesla_revenue.columns = ['Date', 'Revenue']

# Strip "$" and "," from Revenue; regex=True is required for the pattern to be applied as a regular expression
tesla_revenue["Revenue"] = tesla_revenue["Revenue"].str.replace(r",|\$", "", regex=True)

# Drop rows with missing or empty revenue values
tesla_revenue.dropna(inplace=True)
tesla_revenue = tesla_revenue[tesla_revenue["Revenue"] != ""]

print(tesla_revenue.tail())  # Display the last 5 rows

    Date Revenue
8   2013    2013
9   2012     413
10  2011     204
11  2010     117
12  2009     112
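The `Revenue` values are text, so they need to become numbers before any trend analysis. The following is a minimal sketch of that conversion, using the 2009 to 2013 figures from the output above, written in the raw `$2,013` form in which they are scraped. The `sample` and `by_year` names and the `Change` column are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Sample rows copied from the 2009-2013 figures in this notebook (raw scraped format)
sample = pd.DataFrame({
    "Date": [2013, 2012, 2011, 2010, 2009],
    "Revenue": ["$2,013", "$413", "$204", "$117", "$112"],
})

# Strip "$" and "," with an explicit regex, then convert the strings to integers
sample["Revenue"] = (
    sample["Revenue"]
    .str.replace(r"[$,]", "", regex=True)
    .astype(int)
)

# Year-over-year change, with the oldest year first
by_year = sample.sort_values("Date").set_index("Date")
by_year["Change"] = by_year["Revenue"].diff()
print(by_year)
```

Once the column is numeric, operations such as `diff()`, sorting by value, and plotting work without further string handling.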



<div class="alert alert-block alert-info" style="margin-top: 20px">
<div class="row">
<div class="col-md-12">

<div class="col-md-6">
<p> <strong>Author:</strong>  <a href="https://github.com/luqman-cheema" target="_blank">Luqman Cheema</a>, a student of AI & Data and a software engineer by profession. I hold a Master's degree in Information Technology and work with technology and business leaders to resolve complex business problems. I have a strong background in designing, developing, implementing, and migrating end-to-end software solutions for public, private, and multinational organizations.
   </p>
</div>
<div class="col-md-3">
 <img src="https://avatars.githubusercontent.com/u/14842482?v=4" height="100" width="100" /> </div>
</div>
</div>

<div class="row">
<p><a href="https://www.linkedin.com/in/luqman-cheema/" target="_blank">LinkedIn</a></p>
</div>
</div>

<hr>