# Map(s) of BEV concentration in Washington State

## In this notebook I walk through building a few choropleth maps that show the "population" of battery electric vehicles (BEVs) in the state by county and by city within a few select counties.

### The GeoPandas library is not installed in my Jupyter environment by default, so I install it with !pip install.

In [None]:
#install geopandas 

!pip install geopandas

### We need to import the os module, which provides a way to interact with the operating system, including access to environment variables. In this example we use a shapefile (.shp) to build a map. Below we set SHAPE_RESTORE_SHX to YES, enabling the shapefile reader to restore (rebuild) the .shx index file if it is missing or unreadable.

In [None]:
## import os. SHAPE_RESTORE_SHX is a setting related to shapefile handling. Shapefiles come with an auxiliary file 
## with the extension .shx (shape index file). Setting the variable to YES lets the reader rebuild that index when it is
## missing, so the geometry information can still be read (per ChatGPT)

import os
os.environ["SHAPE_RESTORE_SHX"] = "YES"

### Import the libraries and the files we will be working with. Create a main data frame of electric vehicle registrations, which will later be filtered to battery electric vehicles (BEV). Create a path to the map (shapefile) used to build the map highlighting the BEV populations by county. I found this map (and other maps) of Washington state broken out by counties through ChatGPT, MS Copilot, and Bing. Note: the map comes in a zip file. Download the entire zip file and keep the unzipped files together in one folder. Although only the .shp path is passed to GeoPandas in the notebook, GeoPandas also needs the other files next to it: 
### 1) .shp (Shapefile): This file contains the geometry of the features (e.g., points, lines, polygons).
### 2) .shx (Shape Index file): This file is an index that allows software to quickly find features within the .shp file.
### 3) .dbf (Attribute Table): This file stores the attribute data for each feature in a tabular format.

In [None]:
#Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import geopandas as gpd

#pull in data file and map file
df = pd.read_csv(r'C:\Users\v-joecamp\OneDrive - Microsoft\Desktop\Python Beginner Projects\EV stats\Electric_Vehicle_Population_Data_09292024.csv')
shape_path = r'C:\Users\v-joecamp\OneDrive - Microsoft\Desktop\Python Beginner Projects\EV stats\tl_2016_53_cousub.shp'
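
The note above about keeping the companion files next to the .shp file can be checked before loading anything. This is a minimal sketch using only the standard library; the helper name `missing_sidecars` is my own invention for illustration and is not part of GeoPandas.

```python
from pathlib import Path


def missing_sidecars(shp_path, extensions=(".shx", ".dbf")):
    """Return the companion-file extensions that are NOT found next to the .shp file.

    Hypothetical helper for illustration only; it just checks file existence.
    """
    shp = Path(shp_path)
    # with_suffix swaps ".shp" for each companion extension in the same folder
    return [ext for ext in extensions if not shp.with_suffix(ext).exists()]
```

Calling `missing_sidecars(shape_path)` with the path defined above should return an empty list when the zip file was fully extracted into one folder.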

In [None]:
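## let's take a look at the column names of the data frame and of the map's shapefile.
## shape_path is only a string (a file path), so the shapefile has to be read with GeoPandas before its columns can be listed

print(df.columns)
print(gpd.read_file(shape_path).columns)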

### The current dataframe contains both battery electric vehicles (BEV) and plug-in hybrid cars. I want to make a dataframe of just BEVs, then look at the column information of the new dataframe to understand the naming convention, data type, and non-null count of each column. 

In [None]:
## creating a BEV dataframe

bev_df = df[df['Electric Vehicle Type'] == 'Battery Electric Vehicle (BEV)']
bev_df.info()

### I want to build a BEV dataframe by county - use the groupby function. This dataframe will show the number of BEVs per county which will be reflected in the Choropleth map once built.  

In [None]:
## a list of the counties and COUNTYFP codes in Washington State with the count of BEV cars in each. COUNTYFP is the county 
## Federal Information Processing Standards (FIPS) code, the code the Census Bureau uses for each county in the U.S. I added 
## the COUNTYFP column to the original data set. COUNTYFP will be the "key" between df_bev_county and the shapefile. 

df_bev_county = bev_df.groupby(['County','COUNTYFP']).size().reset_index(name='Count')

df_bev_county.head()
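
To make the groupby step concrete, here is a small sketch on a made-up three-row table. The rows and the `Model` column are placeholders I invented for illustration; only the `County`, `COUNTYFP`, and `Count` column names come from the notebook.

```python
import pandas as pd

# Toy stand-in for bev_df: one row per registered vehicle (values are made up)
toy = pd.DataFrame({
    "County": ["King", "King", "Pierce"],
    "COUNTYFP": [33, 33, 53],
    "Model": ["A", "B", "A"],
})

# size() counts the rows in each (County, COUNTYFP) group;
# reset_index(name="Count") turns the result back into a regular table
counts = toy.groupby(["County", "COUNTYFP"]).size().reset_index(name="Count")
print(counts)
```

Each vehicle row contributes one to its county's total, so the result has one row per county with a `Count` column.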

### Build a shape geo dataframe via geopandas and take a look at the information in the COUNTYFP fields to confirm a match with the same column in df_bev_county

In [None]:
## Setting up the shape geo dataframe  and then printing the COUNTYFP columns to confirm the numbers are the same as the 
## COUNTYFP column in df_bev_county

shape = gpd.read_file(shape_path)
print(shape['COUNTYFP'])

### Look at the data types of the shape df, specifically the data type of the COUNTYFP column. It needs to be the same data type as the COUNTYFP column in df_bev_county. As you can see, the COUNTYFP data type is "object" (text), so it needs to be changed to int. 

In [None]:
## Look at the column names, counts, and data types of each column. Notice that COUNTYFP is an object. As this is a key, 
## its data type needs to be changed to int

shape.info()

### This block of code changes the data type in column COUNTYFP from object to int. Using the .info() function we can confirm the change was made. 

In [None]:
## changing the data type of the COUNTYFP column from an object to int. Now shape geo df and df_bev_county can "talk" to 
##each other. 

shape['COUNTYFP'] = shape['COUNTYFP'].astype(int)
shape.info()
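
The key columns only match when both sides use the same representation. The sketch below shows the two directions on toy data, assuming the shapefile stores the codes as zero-padded three-character text (such as "033") and the BEV table stores them as integers. The notebook converts text to int; padding the integers to text is an alternative I am adding for comparison.

```python
import pandas as pd

# Assumed representations: text codes in the shapefile, integers in the BEV table
shape_codes = pd.Series(["001", "033", "053"], name="COUNTYFP")
bev_codes = pd.Series([1, 33, 53], name="COUNTYFP")

# Direction used in this notebook: text -> int (leading zeros are dropped)
as_int = shape_codes.astype(int)

# Alternative: int -> zero-padded text, which keeps the code in its three-character form
as_text = bev_codes.astype(str).str.zfill(3)

print(as_int.tolist())
print(as_text.tolist())
```

Either direction works for the merge as long as both data frames end up with the same data type.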

### I want to get a final look at the column names for each data frame.  

In [None]:
## Printing the column names for shape geo df and for df_bev_county

print(shape.columns)
print(df_bev_county.columns)

### This block of code builds the data frame that will be used to draw the choropleth map. I am taking the existing shape geo data frame and merging in (adding) the columns from df_bev_county, matched on the key column COUNTYFP. 

In [None]:
## merging the shape geo df and the df_bev_county dataframe. I had to add the COUNTYFP code to the "master dataframe" so the 
## merge function had a "key" to work from. The columns County and Count will be added to the shape geo df. 

shape= pd.merge(
    left=shape,
    right=df_bev_county,
    left_on='COUNTYFP',
    right_on='COUNTYFP',
    how='left'
)

print(shape.columns)

### Using the head() function I want to confirm the two data frames are merged into one. 

In [None]:
## a quick look at the merged file. 

shape.head()
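
Beyond eyeballing head(), a left merge can be checked for keys that found no partner. This is a minimal sketch on toy tables using the `indicator` parameter of pandas' merge; the counts are made up for illustration.

```python
import pandas as pd

# Toy stand-ins: keys from the shapefile and per-county BEV counts (counts are made up)
shape_keys = pd.DataFrame({"COUNTYFP": [1, 3, 5]})
bev_counts = pd.DataFrame({
    "COUNTYFP": [1, 3],
    "County": ["Adams", "Asotin"],
    "Count": [12, 7],
})

# indicator=True adds a "_merge" column saying where each row's key was found
checked = shape_keys.merge(bev_counts, on="COUNTYFP", how="left", indicator=True)

# "left_only" rows exist in the shape data but have no BEV count
unmatched = checked.loc[checked["_merge"] == "left_only", "COUNTYFP"].tolist()
print(unmatched)
```

An empty list means every shape row received a count; anything else points to a key mismatch or a county with no BEVs in the data.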

### Creating the initial map. I don't like that this map is broken out by the cities (subdivisions) within each county. We will clean the map up here, but I also found a cleaner county-level map, which is used further below.  

In [None]:
## plotting the map including a legend. The first line of code (ax = ...) plots the boundaries of the geometries in the 
## shape GeoDataFrame
## The second line of code plots the geometries on the existing axis (ax). The "Count" column is used to color 
## the geometries
## I don't like the initial look of the map as each county is broken down by cities. It is hard to distinguish counties. This
## is resolved with a different map below. 

ax=shape.boundary.plot(edgecolor='black', linewidth=0.4, figsize=(12,6))
shape.plot(ax=ax, column='Count', legend=True, cmap='RdYlBu', legend_kwds={'shrink':1.0})

plt.show()
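
One possible alternative to switching shapefiles, not used in this notebook, is to combine the polygons that share a COUNTYFP into a single county polygon with the GeoPandas `dissolve` method. The sketch below uses three made-up squares instead of the real shapefile; the toy geometries and counts are assumptions for illustration only.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy stand-in for the merged shape GeoDataFrame: county 1 is split into two pieces
subdivisions = gpd.GeoDataFrame(
    {"COUNTYFP": [1, 1, 3], "Count": [10, 10, 4]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

# dissolve unions the geometries in each COUNTYFP group; aggfunc="first" keeps the
# first Count per group, which is safe here because every piece of a county
# carries the same county-level count after the merge
counties = subdivisions.dissolve(by="COUNTYFP", aggfunc="first").reset_index()
print(counties[["COUNTYFP", "Count"]])
```

The dissolved frame has one row per county, so it can be plotted with the same `plot(column='Count', ...)` call without the interior subdivision lines.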

### A "cleaned up" version where a title is given, the black outline (spine) is removed, and the x and y axis are removed. Additionally I changed the color map to see if the visual was better and easier to "read". 

In [None]:
## This is the final map removing the text for the x and y axis, removing the black line (spine) from around the map, and
## adding a title to the map. 

ax=shape.boundary.plot(edgecolor='black', linewidth=0.4, figsize=(12,6))
shape.plot(ax=ax, column='Count', legend=True, cmap='Spectral', legend_kwds={'shrink':1.0})

##this following gets rid of the text for the x and y axis
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

##getting rid of the black "spines" around the map
for edge in ['right', 'left', 'bottom', 'top']:
    ax.spines[edge].set_visible(False)

##setting a title
ax.set_title('Battery Electric Vehicles by County', size=16, weight='bold')

plt.show()
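
As a shorter variant of the cleanup above, Matplotlib's `Axes.set_axis_off()` hides the tick labels and the spines in one call. This is a minimal sketch on a plain line plot rather than the map, so it runs without the shapefile; the plotted line is a placeholder.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot([0, 1], [0, 1])  # placeholder for the map layers

# One call replaces hiding the x axis, the y axis, and the four spines separately
ax.set_axis_off()

# The title is still drawn after the axes are switched off
ax.set_title('Battery Electric Vehicles by County', size=16, weight='bold')
```

The explicit loop over the spines remains useful when only some of the edges should be hidden.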

## As referenced above I don't like the busyness/messiness of the map(s) above. I found a map of Washington state that outlines the counties. Below is how I added this to the notebook. 

### Adding the "new" map to this notebook

In [None]:
shape_path2 = r'C:\Users\v-joecamp\OneDrive - Microsoft\Desktop\Python Beginner Projects\EV stats\WA_County_Boundaries.shp'

### Create a geopandas dataframe and look at the columns and data to understand how, and on which column, to merge the df_bev_county data frame with this one. 

In [None]:
## Used GeoPandas to read the file and create a geo df named shape2. Looking through the column names and corresponding data
## the column JURISDIC_2 listed each county. I will use this column and the County column from df_bev_county to merge the 
## two dataframes. 

shape2 = gpd.read_file(shape_path2)

print(shape2)

### Reviewing the columns and data types of this geo df I see that the JURISDIC_2 column of county names is the same data type (object) as the County column from df_bev_county. This is how I merge the two data frames. 

In [None]:
## looking at the data types for each column of the shape2 geo dataframe. Note that the data type of JURISDIC_2 is object. 
## This is the same data type for the column County from df_bev_county

shape2.info()

### Merging the df_bev_county dataframe into the shape2 geo dataframe. Then look at the columns to confirm the merge succeeded. 

In [None]:
# Merging the shape2 geo data frame and df_bev_county

shape2= pd.merge(
    left=shape2,
    right=df_bev_county,
    left_on='JURISDIC_2',
    right_on='County',
    how='left'
)

print(shape2.columns)

### Take a quick look at the new geo data frame

In [None]:
shape2.head()
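
Merging on county names is more fragile than merging on codes, because stray spaces or spelling differences silently produce missing counts. The sketch below shows one way to check this on toy data; the trailing space in "Pierce " and all counts are made up, and I have not confirmed that the real JURISDIC_2 column has such issues.

```python
import pandas as pd

# Toy stand-ins for the JURISDIC_2 column and df_bev_county (values are made up)
boundary_names = pd.Series(["King", "Pierce ", "Garfield"], name="JURISDIC_2")
bev_counts = pd.DataFrame({"County": ["King", "Pierce"], "Count": [100, 40]})

# Normalize the names before merging, then list the ones with no partner
clean_names = boundary_names.str.strip()
unmatched = sorted(set(clean_names) - set(bev_counts["County"]))

merged = clean_names.to_frame().merge(
    bev_counts, left_on="JURISDIC_2", right_on="County", how="left"
)

# Counties without a match get NaN from the left merge; treat them as zero BEVs
merged["Count"] = merged["Count"].fillna(0).astype(int)
print(unmatched)
print(merged[["JURISDIC_2", "Count"]])
```

Whether to fill missing counts with zero is a judgment call: it is appropriate when a county truly has no BEVs in the data, but it would hide a name mismatch, which is why the `unmatched` list is worth checking first.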

### Build an initial map. This looks much better: uncluttered and easier to read.

In [None]:
##Leveraging the map plotting code from above - plotting the map including a legend. 

ax=shape2.boundary.plot(edgecolor='black', linewidth=0.4, figsize=(12,6))
shape2.plot(ax=ax, column='Count', legend=True, cmap='RdYlBu', legend_kwds={'shrink':1.0})

plt.show()

### Final map - removed the spine, removed the x and y axis, and added a title. 

In [None]:
## Final map

ax=shape2.boundary.plot(edgecolor='black', linewidth=0.4, figsize=(12,6))
shape2.plot(ax=ax, column='Count', legend=True, cmap='RdYlBu', legend_kwds={'shrink':1.0})

##this following gets rid of the text for the x and y axis
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

##getting rid of the black "spines" around the map
for edge in ['right', 'left', 'bottom', 'top']:
    ax.spines[edge].set_visible(False)

##setting a title
ax.set_title('Battery Electric Vehicles by County', size=16, weight='bold')

plt.show()