### Scraping more extensive supercharger data from Supercharge.info

This code uses the Selenium package to scrape the table on the "https://supercharge.info/data" page. It is important to note that it first sets the table's page-length dropdown to 10,000 rows so that every entry appears on a single page. The code then waits for the table to load and uses Selenium to locate and extract the table's headers and rows. The headers are stored in a list, and the data from each row is stored in a nested list. This data is then used to create a Pandas DataFrame, which is displayed as the cell's output.
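
One caveat about the waiting step described above: checking only that the table element is present can succeed before the 10,000-row page has finished rendering, because the table already exists when the dropdown is changed. A minimal sketch of a stricter wait follows. It assumes the data rows sit under `#supercharger-data-table tbody tr` (a guess about the page markup, not verified here), and `table_has_at_least` is a hypothetical helper name, not part of Selenium.

```python
def table_has_at_least(min_rows, table_id="supercharger-data-table"):
    """Build a wait condition that passes once the table body holds min_rows rows.

    Assumption: the rows live in a <tbody> inside the element with this id.
    """
    selector = f"#{table_id} tbody tr"

    def condition(driver):
        # "css selector" is the string value of selenium's By.CSS_SELECTOR
        rows = driver.find_elements("css selector", selector)
        # WebDriverWait.until keeps polling while the result is falsy
        return rows if len(rows) >= min_rows else False

    return condition


# Usage with the notebook's driver (1000 is an arbitrary illustrative threshold):
# rows = WebDriverWait(driver, 60).until(table_has_at_least(1000))
```

Returning the row list rather than `True` lets the caller reuse the located rows directly, since `WebDriverWait.until` returns whatever truthy value the condition produces.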

In [141]:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import re


# Import the necessary modules
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

In [202]:
# Launch a new Chrome browser window
driver = webdriver.Chrome()

# Navigate to the URL
driver.get('https://supercharge.info/data')

# Wait for the select element to be available
select_element = WebDriverWait(driver, 60).until(
    EC.presence_of_element_located((By.NAME, 'supercharger-data-table_length'))
)

# Create a Select object from the select element
select_object = Select(select_element)

# Select the option with value "10000"
select_object.select_by_value('10000')


# wait for the table to load
table_locator = (By.ID, 'supercharger-data-table')
WebDriverWait(driver, 60).until(EC.presence_of_element_located(table_locator))

# extract the table data
table = driver.find_element(*table_locator)
rows = table.find_elements(By.TAG_NAME, 'tr')

# extract headers
header_row = rows[0]
header_cells = header_row.find_elements(By.TAG_NAME, 'th')
headers = [cell.text.strip() for cell in header_cells]

# extract data
data = []
for row in rows[1:]:
    cells = row.find_elements(By.TAG_NAME, 'td')
    row_data = [cell.text.strip() for cell in cells]
    data.append(row_data)

# close the driver
driver.quit()

# create dataframe
df = pd.DataFrame(data, columns=headers)

# display the dataframe (a notebook renders the last expression in a cell)
df


| | Supercharger | Street Address | City | State | Zip | Country | Stalls | kW | GPS | Elev(m) | Status | Open Date | Links |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Newark - Old Baltimore Pike, DE | 865 S Old Baltimore Pike | Newark | DE | 19702 | USA | 8 | 250 | 39.642686, -75.729226 | 19 | OPEN | 2023-03-17 | gmap \| forum \| tesla |
| 1 | Shantou - Dongying, China | Dongying Tesla Authorized Body & Spray Center,... | Shantou | Guangdong | | China | 3 | 250 | 23.406781, 116.692634 | 3 | OPEN | 2023-03-17 | gmap \| forum \| tesla |
| 2 | Cuencame, Mexico | Pits Cuencame Autopista Torreón - Yerbaniz Km.... | Cuencame | Durango | 35834 | Mexico | 4 | 250 | 24.993046, -103.732996 | 1438 | OPEN | 2023-03-16 | gmap \| forum \| tesla |
| 3 | Örebro, Sweden | Boglundsgatan 2 | Örebro | | 703 74 | Sweden | 12 | 250 | 59.297839, 15.206485 | 27 | OPEN | 2023-03-16 | gmap \| forum \| tesla |
| 4 | Portsmouth, VA | 720 London St | Portsmouth | VA | 23704 | USA | 8 | 250 | 36.837402, -76.306478 | 2 | OPEN | 2023-03-16 | gmap \| forum \| tesla |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 5528 | Alamosa, CO | 720 Main Street | Alamosa | CO | 81101 | USA | 8 | 250 | 37.4673976, -105.8683603 | 2299 | PERMIT | | gmap \| forum \| tesla |
| 5529 | Pearland, TX | 11151 Shadow Creek Parkway | Pearland | TX | 77584 | USA | 12 | 250 | 29.580184, -95.392324 | 22 | PERMIT | | gmap \| forum \| tesla |
| 5530 | Sutton Forest, NSW | Sallys Corner Rd | Exeter | NSW | 2579 | Australia | 6 | 250 | -34.60926, 150.229746 | 701 | CONSTRUCTION | | gmap \| forum |
| 5531 | Conshohocken, PA | 400 Alan Wood Rd | Conshohocken | PA | 19428 | USA | 12 | 250 | 40.094384, -75.308501 | 292 | CONSTRUCTION | | gmap \| forum |
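
Because every cell is read through `cell.text`, all columns in the scraped DataFrame are strings, including `Stalls`, `kW`, `Elev(m)`, `GPS` and `Open Date`. The sketch below shows one possible way to convert them to typed columns. The function name `tidy_superchargers` and the new `Latitude` and `Longitude` columns are illustrative choices, not part of the original notebook, and the two sample rows are copied from the output above, with missing cells assumed to arrive as empty strings.

```python
import pandas as pd


def tidy_superchargers(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the scraped frame with numeric, GPS and date columns typed."""
    out = df.copy()

    # "39.642686, -75.729226" -> two float columns; unparseable values become NaN
    gps = out["GPS"].str.extract(
        r"^\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$"
    )
    out["Latitude"] = pd.to_numeric(gps[0], errors="coerce")
    out["Longitude"] = pd.to_numeric(gps[1], errors="coerce")

    # Counts and measurements arrive as text; coerce blanks to NaN
    for col in ["Stalls", "kW", "Elev(m)"]:
        out[col] = pd.to_numeric(out[col], errors="coerce")

    # Sites that are not open yet have no date; they become NaT
    out["Open Date"] = pd.to_datetime(out["Open Date"], errors="coerce")
    return out


# Two rows taken from the scraped output, trimmed to the relevant columns
sample = pd.DataFrame(
    {
        "Supercharger": ["Newark - Old Baltimore Pike, DE", "Sutton Forest, NSW"],
        "Stalls": ["8", "6"],
        "kW": ["250", "250"],
        "GPS": ["39.642686, -75.729226", "-34.60926, 150.229746"],
        "Elev(m)": ["19", "701"],
        "Open Date": ["2023-03-17", ""],
    }
)
tidy = tidy_superchargers(sample)
```

With typed columns, operations such as `tidy["Stalls"].sum()` or filtering on `Open Date` work without further parsing.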
