## Establishing the Data

### Acquiring and Reading the Data

In [2]:
import pandas as pd
df = pd.read_csv("EV_Data.csv")
df.head()

| Unnamed: 0.2 | Unnamed: 0.1 | Unnamed: 0 | state | year | EV Registrations | Total Vehicles | EV Share (%) | Stations | Total Charging Outlets | Level 1 | ... | personal | reducetax | regulate | worried | price_cents_per_kwh | gasoline_price_per_gallon | Total | Trucks | Trucks_Share | Party |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | Alabama | 2023 | 13000 | 4835900 | 0.27 | 424 | 1096 | 35 | ... | 39.06 | 62.16 | 69.01 | 54.95 | 11.47 | 2.742 | 5446619.0 | 3397137.0 | 62.37 | Republican |
| 1 | 1 | 1 | Alaska | 2023 | 2700 | 559800 | 0.48 | 65 | 124 | 3 | ... | 43.28 | 65.38 | 71.61 | 62.49 | 21.41 | 3.594 | 680974.0 | 517525.0 | 76.0 | Republican |
| 2 | 2 | 2 | Arizona | 2023 | 89800 | 6529000 | 1.38 | 1198 | 3506 | 9 | ... | 46.92 | 64.73 | 73.19 | 64.32 | 12.19 | 3.278 | 6447062.0 | 3868118.0 | 60.0 | Democratic |
| 3 | 3 | 3 | Arkansas | 2023 | 7100 | 2708300 | 0.26 | 334 | 833 | 3 | ... | 39.08 | 63.39 | 68.58 | 56.21 | 9.73 | 2.76 | 3338322.0 | 2291924.0 | 68.65 | Republican |
| 4 | 4 | 4 | California | 2023 | 1256600 | 36850300 | 3.41 | 16381 | 49433 | 648 | ... | 53.19 | 72.08 | 76.3 | 71.24 | 24.87 | 4.731 | 31057329.0 | 16757880.0 | 53.96 | Democratic |


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 408 entries, 0 to 407
Data columns (total 36 columns):
 #   Column                                 Non-Null Count  Dtype  
---  ------                                 --------------  -----  
 0   Unnamed: 0.1                           408 non-null    int64  
 1   Unnamed: 0                             408 non-null    int64  
 2   state                                  408 non-null    object 
 3   year                                   408 non-null    int64  
 4   EV Registrations                       408 non-null    int64  
 5   Total Vehicles                         408 non-null    int64  
 6   EV Share (%)                           408 non-null    float64
 7   Stations                               408 non-null    int64  
 8   Total Charging Outlets                 408 non-null    int64  
 9   Level 1                                408 non-null    int64  
 10  Level 2                                408 non-null    int64  
 11  DC Fas
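
The `Unnamed: 0`-style columns in the preview above are typically leftover row indexes from an earlier `to_csv` call that was saved without `index=False`. A minimal sketch of one way to drop them is shown here; `csv_text` is a made-up stand-in for `EV_Data.csv` that reuses a few values from the preview, and the variable names are only illustrative.

```python
import io

import pandas as pd

# Tiny stand-in for EV_Data.csv (assumed layout, values taken from the preview)
csv_text = """Unnamed: 0.1,Unnamed: 0,state,year,EV Registrations
0,0,Alabama,2023,13000
1,1,Alaska,2023,2700
"""

raw = pd.read_csv(io.StringIO(csv_text))

# Collect every column whose name starts with "Unnamed:" and drop it
artifact_cols = [col for col in raw.columns if col.startswith("Unnamed:")]
clean = raw.drop(columns=artifact_cols)

print(clean.columns.tolist())  # ['state', 'year', 'EV Registrations']
```

Saving the cleaned frame with `clean.to_csv("some_file.csv", index=False)` should keep a new index column from being written the next time the file is stored.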

### Where to Find the Data's CSV
The dataset I used for this project is "EV Adoption USA", posted on Kaggle by the user Suraj Shivakumar. To obtain the data, search Kaggle for the dataset name and user, or follow the kagglehub download steps below.

```python
# Install dependencies as needed:
# pip install kagglehub[pandas-datasets]
import kagglehub
from kagglehub import KaggleDatasetAdapter

# Set the path to the file you'd like to load.
# Assumption: the file has the same name as the CSV read earlier;
# adjust this if the file on Kaggle is named differently.
file_path = "EV_Data.csv"

# Load the latest version
df = kagglehub.load_dataset(
    KaggleDatasetAdapter.PANDAS,
    "surajshivakumar/ev-adoption-usa",
    file_path,
    # Provide any additional arguments like
    # sql_query or pandas_kwargs. See the
    # documentation for more information:
    # https://github.com/Kaggle/kagglehub/blob/main/README.md#kaggledatasetadapterpandas
)

print("First 5 records:", df.head())
```
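
If the exact file name inside the dataset is uncertain, one option is to download the whole dataset with `kagglehub.dataset_download`, which returns a local directory, and then list the CSV files it contains. The sketch below shows only the listing step so that it runs offline; `list_csv_files` is a hypothetical helper, not part of kagglehub, and the commented lines show the assumed way to combine the two.

```python
from pathlib import Path


def list_csv_files(dataset_dir):
    """Return every CSV file under dataset_dir, sorted by path."""
    return sorted(Path(dataset_dir).rglob("*.csv"))


# Intended use (needs network access and the kagglehub package):
# import kagglehub
# import pandas as pd
# dataset_dir = kagglehub.dataset_download("surajshivakumar/ev-adoption-usa")
# csv_files = list_csv_files(dataset_dir)
# print(csv_files)                # choose the file you need from this list
# df = pd.read_csv(csv_files[0])
```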
<br>

### Who Produced the Data and How
Shivakumar's description says the dataset "provides a comprehensive, state-level view of the key factors influencing electric vehicle (EV) adoption across the United States." I chose this dataset because I wanted to explore EV adoption in the USA and understand the factors that may contribute to fluctuating adoption rates, including, but not limited to, location, government incentives, political association, and accessibility of charging stations.

Shivakumar compiled this data from a variety of sources. The columns I used in my analysis come from the following organizations: the US Census Bureau (socioeconomic indicators), the National Renewable Energy Laboratory (NREL) (EV registrations), the Alternative Fuels Data Center (AFDC) (charging stations and incentives), and the Energy Information Administration (EIA) (electricity prices). All of these sources are government-run, so the data is collected, refined, and distributed by a verified source.



The compiled data has 408 rows and 36 columns. The columns are as follows:
    
| Column Name                               | Description                                                                                     | Data Type | Example |
|-------------------------------------------|-------------------------------------------------------------------------------------------------|-----------|---------|
| state                                     | US state                                                                                        | object    | "California" |
| year                                      | Year of observation                                                                             | int       | 2022 |
| EV Registrations                           | Number of registered electric vehicles                                                          | int       | 152000 |
| Total Vehicles                             | Total number of all registered vehicles                                                         | int       | 3000000 |
| EV Share (%)                               | Percentage of total vehicles that are electric                                                  | float     | 5.2 |
| Stations                                   | Number of public EV charging stations                                                           | int       | 450 |
| Total Charging Outlets                     | Total number of individual charging plugs                                                       | int       | 1200 |
| Level 1                                    | Number of Level 1 charging outlets                                                              | int       | 50 |
| Level 2                                    | Number of Level 2 charging outlets                                                              | int       | 900 |
| DC Fast                                    | Number of DC Fast charging outlets                                                              | int       | 250 |
| fuel_economy                               | Average fuel economy of all vehicles (MPG)                                                      | float     | 24.5 |
| Incentives                                 | Presence/details of state-level EV incentives                                                   | float     | 1.0 |
| Number of Metro Organizing Committees      | Number of metropolitan planning organizations                                                    | float     | 12 |
| Population_20_64                           | Working-age population (ages 20–64)                                                             | float     | 4.2e6 |
| Education_Bachelor                         | Number of people with a Bachelor's degree or higher                                             | float     | 1.1e6 |
| Labour_Force_Participation_Rate            | Percentage of working-age population in the labor force                                         | float     | 63.4 |
| Unemployment_Rate                          | Percentage of the labor force that is unemployed                                                | float     | 4.8 |
| Bachelor_Attainment                        | Percentage of population with a Bachelor's degree or higher                                     | float     | 32.1 |
| Per_Cap_Income                             | Average income per person                                                                       | float     | 52000 |
| affectweather                              | Concern/belief about climate change impacts                                                     | float     | 0.62 |
| devharm                                    | Concern about harm from development                                                             | float     | 0.55 |
| discuss                                    | Frequency of discussing environmental issues                                                    | float     | 0.48 |
| exp                                        | Environmental experience or exposure                                                            | float     | 0.51 |
| localofficials                             | Trust/engagement with local environmental officials                                             | float     | 0.44 |
| personal                                   | Personal responsibility toward the environment                                                  | float     | 39.06 |
| reducetax                                  | Support for reducing taxes related to environmental policy                                      | float     | 62.16 |
| regulate                                   | Support for government environmental regulation                                                 | float     | 69.01 |
| worried                                    | Concern about environmental problems                                                            | float     | 54.95 |
| price_cents_per_kwh                        | Average electricity price per kWh (cents)                                                       | float     | 14.2 |
| gasoline_price_per_gallon                  | Average gasoline price per gallon                                                               | float     | 3.85 |
| Total                                      | Total vehicle count used as the base for Trucks_Share (differs from Total Vehicles in the preview rows) | float     | 5446619.0 |
| Trucks                                     | Number of registered trucks                                                                     | float     | 3397137.0 |
| Trucks_Share                               | Trucks as a percentage of Total                                                                 | float     | 62.37 |
| Party                                      | Predominant political party affiliation                                                         | object    | "Democratic" |
| Unnamed: 0.1                               | Index column (artifact from CSV)                                                                | int       | 0 |
| Unnamed: 0                                 | Index column (artifact from CSV)                                                                | int       | 0 |
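
The two share columns can be checked against the counts they are described as being derived from. The sketch below rebuilds the five rows from the `df.head()` preview by hand and recomputes both percentages. The formulas are assumptions based on the column descriptions, and the 0.01 tolerance is an arbitrary choice to allow for rounding. In these rows `Total` is not equal to `Total Vehicles`, and `Trucks_Share` matches `Trucks / Total`.

```python
import pandas as pd

# First five rows of the df.head() preview (2023 values), typed in by hand
sample = pd.DataFrame({
    "state": ["Alabama", "Alaska", "Arizona", "Arkansas", "California"],
    "EV Registrations": [13000, 2700, 89800, 7100, 1256600],
    "Total Vehicles": [4835900, 559800, 6529000, 2708300, 36850300],
    "EV Share (%)": [0.27, 0.48, 1.38, 0.26, 3.41],
    "Total": [5446619.0, 680974.0, 6447062.0, 3338322.0, 31057329.0],
    "Trucks": [3397137.0, 517525.0, 3868118.0, 2291924.0, 16757880.0],
    "Trucks_Share": [62.37, 76.0, 60.0, 68.65, 53.96],
})

# Assumed formulas: share = part / whole * 100
ev_share = sample["EV Registrations"] / sample["Total Vehicles"] * 100
trucks_share = sample["Trucks"] / sample["Total"] * 100

# Compare with the stored columns, allowing for rounding to two decimals
ev_share_ok = (ev_share - sample["EV Share (%)"]).abs() < 0.01
trucks_share_ok = (trucks_share - sample["Trucks_Share"]).abs() < 0.01

print(pd.DataFrame({
    "state": sample["state"],
    "ev_share_ok": ev_share_ok,
    "trucks_share_ok": trucks_share_ok,
}))
```

Running the same comparison on the full `df` would show whether the relationship holds for all 408 rows; the five preview rows alone do not establish that.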
