## Vehicle Registration Counts by State

**Data Source: US Department of Energy - Alternative Fuels Data Center (AFDC)**

URL: https://afdc.energy.gov/vehicle-registration?year=2023

Data Description: "This page provides approximate light-duty vehicle registration counts derived by the National Renewable Energy Laboratory with data from Experian Information Solutions. Counts are rounded to the closest 100 vehicles and reflect the total number of light-duty registered vehicles through the selected year. Fuel types are based on vehicle identification numbers (VINs), which do not reflect aftermarket conversions to use different fuels or power sources."


In [1]:
import pandas as pd
import requests
from bs4 import BeautifulSoup


**Scrape page for each year (2016-2023)**

In [2]:
afdc_url = "https://afdc.energy.gov/vehicle-registration?year={}"
years = range(2016, 2024)

compiled_data = []

for year in years:
    url = afdc_url.format(year)
    afdc_result = requests.get(url)

    # Skip this year if the request failed
    if afdc_result.status_code != 200:
        print(f"Failed to retrieve data for {year}: {afdc_result.status_code} - {afdc_result.reason}")
        continue

    print(f"Scraping data for {year}...")
    page = BeautifulSoup(afdc_result.text, 'html.parser')
    table = page.find('table')

    if table:
        rows = table.find_all('tr')
        print(f"Found {len(rows)} rows in the table for {year}.")

        # Skip the first two rows (assumed to be header rows)
        for row in rows[2:]:
            cols = [col.text.strip() for col in row.find_all('td')]
            compiled_data.append([year] + cols)
    else:
        print(f"No table found on the page for {year}.")


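# Column names come from the `headers` attribute of the cells in the first body row
# of the last table parsed in the loop. BeautifulSoup returns that attribute as a list.
clean_headers = []
if table:
    header_row = table.find('tbody').find_all('tr')[0]
    headers = [td['headers'] for td in header_row.find_all('td')]
    for header in headers:
        name = header[0].strip()
        # Keep acronyms such as PHEV as-is; capitalize lowercase names
        clean_headers.append(name if name.isupper() else name.capitalize())

    print(f"Headers found: {clean_headers}")

compiled_df = pd.DataFrame(compiled_data, columns=["Year"] + clean_headers)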

Scraping data for 2016...
Found 54 rows in the table for 2016.
Scraping data for 2017...
Found 54 rows in the table for 2017.
Scraping data for 2018...
Found 54 rows in the table for 2018.
Scraping data for 2019...
Found 54 rows in the table for 2019.
Scraping data for 2020...
Found 54 rows in the table for 2020.
Scraping data for 2021...
Found 54 rows in the table for 2021.
Scraping data for 2022...
Found 54 rows in the table for 2022.
Scraping data for 2023...
Found 54 rows in the table for 2023.
Headers found: ['State', 'Electric', 'PHEV', 'HEV', 'Biodiesel', 'Flex', 'CNG', 'Propane', 'Hydrogen', 'Methanol', 'Gas', 'Diesel', 'Unknown']
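
The row-extraction step above can be checked offline against a small hand-written table. This is a minimal sketch, assuming the table has two header rows followed by data rows made of `<td>` cells. The HTML snippet and the helper name `extract_rows` are hypothetical and are not taken from the AFDC page; the cell values are copied from the 2016 results.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the AFDC page: two header rows, then data rows
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th colspan="2">Fuel type</th></tr>
  <tr><th></th><th>Electric</th><th>PHEV</th></tr>
  <tr><td>Alabama</td><td>500</td><td>900</td></tr>
  <tr><td>Alaska</td><td>200</td><td>200</td></tr>
</table>
"""

def extract_rows(html, year, header_rows=2):
    """Return one [year, cell, ...] list per data row of the first table."""
    table = BeautifulSoup(html, "html.parser").find("table")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr")[header_rows:]:
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # ignore rows that contain no <td> cells
            rows.append([year] + cells)
    return rows

sample_rows = extract_rows(SAMPLE_HTML, 2016)
print(sample_rows)
```

Keeping the parsing in a function that takes an HTML string makes it possible to test the logic without sending requests to the site.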


**Convert to CSV file**

In [3]:
compiled_df.to_csv('ev_registration_data.csv', index=False)

pd.read_csv('ev_registration_data.csv')

,Year,State,Electric,PHEV,HEV,Biodiesel,Flex,CNG,Propane,Hydrogen,Methanol,Gas,Diesel,Unknown
0,2016,Alabama,500,900,29100,0,428300,20100,0,0,0,3777300,126500,53900
1,2016,Alaska,200,200,5000,0,55700,4900,0,0,0,525900,44800,19400
2,2016,Arizona,4700,4400,89600,0,427300,17500,0,0,100,4805000,179500,112800
3,2016,Arkansas,200,500,19100,0,320500,12600,0,0,0,2097800,96800,22200
4,2016,California,141500,116700,966700,0,1322600,80600,0,1300,400,27241000,710400,115500
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
411,2023,Washington,152100,41200,307200,73800,337700,100,100,0,0,5583000,274200,46700
412,2023,West Virginia,2800,1800,22400,17300,123400,100,0,0,0,1281500,46400,15200
413,2023,Wisconsin,24900,12500,123600,52900,536200,300,0,0,0,4604700,147500,26400
414,2023,Wyoming,1100,800,8400,21200,57700,0,0,0,0,489100,60900,13700
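
Because the scraped cells are collected as text, a type check after loading can catch columns that did not parse as numbers. This is a minimal sketch, assuming the CSV uses the column names shown above. The inline CSV is a hypothetical subset: two 2016 rows and four of the count columns, copied from the output above.

```python
import io

import pandas as pd

# Hypothetical subset of the CSV: two rows and four count columns from the 2016 output
csv_text = """Year,State,Electric,PHEV,HEV,Gas
2016,Alabama,500,900,29100,3777300
2016,Alaska,200,200,5000,525900
"""
df = pd.read_csv(io.StringIO(csv_text))

# Every column other than Year and State should hold a vehicle count
count_cols = [c for c in df.columns if c not in ("Year", "State")]

# Coerce to numbers; any value that cannot be parsed becomes NaN
df[count_cols] = df[count_cols].apply(pd.to_numeric, errors="coerce")
has_missing = bool(df[count_cols].isna().any().any())

# Example aggregate: electric registrations summed per year
electric_by_year = df.groupby("Year")["Electric"].sum()
print(has_missing)
print(electric_by_year)
```

The same check can be pointed at `ev_registration_data.csv` by replacing the `io.StringIO` argument with the file name.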
