## Electric Vehicle Charging Station Site Selection Analysis

This notebook demonstrates a workflow for identifying potential areas for new electric vehicle (EV) charging station development using WherobotsDB and WherobotsAI raster inference functionality. The workflow is based on:

* Existing EV charging station infrastructure
* Proximity to retail stores, as a proxy for demand
* Proximity to solar farms

Existing charging station infrastructure and retail store point-of-interest data come from public data sources, while existing solar farm infrastructure is identified using WherobotsAI raster inference. By applying a machine learning model trained on satellite imagery, we can identify solar farms as an input to the analysis.

In [1]:
from sedona.spark import *

config = (
    SedonaContext.builder()
    .config("spark.hadoop.fs.s3a.bucket.wherobots-examples.aws.credentials.provider", "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider")
    .config("spark.driver.maxResultSize", "10g")
    .config("sedona.join.autoBroadcastJoinThreshold", "-1")
    .getOrCreate()
)
sedona = SedonaContext.create(config)



## Identify Area Of Interest

We will use US Census ZIP Code Tabulation Areas (ZCTAs) to identify regions for potential EV charging station development. We will confine our analysis to the state of Arizona.

Note that we use the `ST_Intersects` spatial predicate function rather than `ST_Contains` to find ZCTAs that intersect Arizona. Some ZCTAs extend beyond the Arizona border and lie within multiple states, so `ST_Contains` would exclude them. As a result, our analysis extends slightly beyond the borders of Arizona.
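
To make the difference between the two predicates concrete, here is a minimal pure-Python sketch. It is a toy, not Sedona's implementation: the `Box` class, the `intersects` and `contains` helpers, and all coordinates are hypothetical, and axis-aligned rectangles stand in for real polygons (boundary subtleties of the real `ST_Contains` are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle standing in for a polygon (toy simplification)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def intersects(a: Box, b: Box) -> bool:
    """True when the two boxes share at least one point."""
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)


def contains(outer: Box, inner: Box) -> bool:
    """True when `inner` lies entirely inside `outer`."""
    return (outer.min_x <= inner.min_x and inner.max_x <= outer.max_x
            and outer.min_y <= inner.min_y and inner.max_y <= outer.max_y)


# Hypothetical shapes: one "state" and three "ZCTAs".
state = Box(0, 0, 10, 10)
zctas = {
    "interior": Box(2, 2, 4, 4),      # fully inside the state
    "straddling": Box(9, 4, 12, 6),   # crosses the eastern border
    "outside": Box(20, 20, 22, 22),   # nowhere near the state
}

by_intersects = [name for name, box in zctas.items() if intersects(state, box)]
by_contains = [name for name, box in zctas.items() if contains(state, box)]

print(by_intersects)  # ['interior', 'straddling']
print(by_contains)    # ['interior']
```

The straddling shape is kept by the intersects test and dropped by the contains test, which is the behavior the query relies on to retain border ZCTAs.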

In [2]:
az_zips_df = sedona.sql("""
WITH arizona AS ( 
    SELECT localityArea.geometry AS geometry
    FROM wherobots_open_data.overture_2024_02_15.admins_locality locality 
    JOIN wherobots_open_data.overture_2024_02_15.admins_localityArea localityArea 
    ON locality.id = localityArea.localityId
    WHERE locality.names.primary = "Arizona" AND locality.localityType = "state" 
)

SELECT zta5.geometry AS geometry, ZCTA5CE10 
FROM wherobots_pro_data.us_census.zipcode zta5, arizona
WHERE ST_Intersects(arizona.geometry, zta5.geometry)
""")

In [3]:
az_zips_df.createOrReplaceTempView("az_zta5")

In [4]:
SedonaKepler.create_map(az_zips_df, name="Arizona ZCTAs")

User Guide: https://docs.kepler.gl/docs/keplergl-jupyter


KeplerGl(data={'Arizona ZCTAs': {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 1…

Next, we will identify existing EV charging infrastructure within each ZCTA as an input to our analysis.

## Existing EV Charging Infrastructure

Using data from [Open Charge Map](https://openchargemap.org/site), we calculate the number of EV charging stations in each ZCTA to get a sense of existing EV charging infrastructure.


In [6]:
stations_df = sedona.read.format("geoparquet").load("s3://wherobots-examples/data/examples/openchargemap/world.parquet")

                                                                                

In [9]:
stations_df.createOrReplaceTempView("stations")

In [8]:
SedonaKepler.create_map(stations_df.sample(0.01), name="EV Charging Stations")

KeplerGl(data={'EV Charging Stations': {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17…

Next, we count the existing EV charging stations in each ZCTA.

In [10]:
az_stations_df = sedona.sql("""
SELECT COUNT(*) AS num, any_value(az_zta5.geometry) AS geometry, ZCTA5CE10
FROM stations JOIN az_zta5
WHERE ST_Intersects(az_zta5.geometry, stations.geometry)
GROUP BY ZCTA5CE10 
ORDER BY num DESC
""")
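
The join-then-count pattern in this query can be sketched in plain Python. This is an illustrative toy under stated assumptions, not how Sedona executes the join: the zone names, rectangles, station points, and the `point_in_zone` helper are all hypothetical, and rectangles stand in for ZCTA polygons.

```python
from collections import Counter

# Hypothetical rectangular zones: name -> (min_x, min_y, max_x, max_y).
zones = {
    "zone_a": (0.0, 0.0, 5.0, 5.0),
    "zone_b": (5.0, 0.0, 10.0, 5.0),
    "zone_c": (0.0, 5.0, 10.0, 10.0),
}

# Hypothetical station locations as (x, y) points.
stations = [(1.0, 1.0), (2.0, 3.0), (4.5, 0.5), (7.0, 2.0), (20.0, 20.0)]


def point_in_zone(point, zone):
    """True when the point lies inside or on the edge of the rectangle."""
    x, y = point
    min_x, min_y, max_x, max_y = zone
    return min_x <= x <= max_x and min_y <= y <= max_y


# Spatial join: keep every (zone, station) pair that satisfies the predicate,
# then group by zone and count, mirroring JOIN ... WHERE ... GROUP BY.
counts = Counter(
    name
    for name, zone in zones.items()
    for point in stations
    if point_in_zone(point, zone)
)

# ORDER BY num DESC
ranked = counts.most_common()
print(ranked)  # [('zone_a', 3), ('zone_b', 1)]
```

Note that `zone_c` is missing from the result because no station falls inside it. The same holds for the SQL query: a ZCTA with no charging stations produces no row at all rather than a row with a count of zero.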

In [27]:
az_stations_df.createOrReplaceTempView("az_stations")

In [11]:
az_stations_df.count()

                                                                                

218

In [12]:
az_stations_df.show()



+---+--------------------+---------+
|num|            geometry|ZCTA5CE10|
+---+--------------------+---------+
| 51|POLYGON ((-111.97...|    85281|
| 36|POLYGON ((-111.95...|    85251|
| 33|POLYGON ((-111.92...|    85260|
| 28|POLYGON ((-112.07...|    85004|
| 27|POLYGON ((-112.14...|    85226|
| 26|POLYGON ((-112.07...|    86336|
| 22|POLYGON ((-111.97...|    85054|
| 22|POLYGON ((-112.05...|    85016|
| 20|POLYGON ((-111.89...|    85286|
| 19|POLYGON ((-112.06...|    85034|
| 17|POLYGON ((-111.97...|    85254|
| 17|POLYGON ((-111.69...|    85212|
| 16|POLYGON ((-111.75...|    85206|
| 15|MULTIPOLYGON (((-...|    85282|
| 15|POLYGON ((-112.28...|    85305|
| 15|POLYGON ((-111.92...|    85250|
| 15|POLYGON ((-110.96...|    85719|
| 15|POLYGON ((-113.11...|    84767|
| 15|POLYGON ((-111.95...|    86001|
| 15|POLYGON ((-111.94...|    85248|
+---+--------------------+---------+
only showing top 20 rows



                                                                                

In [13]:
SedonaKepler.create_map(az_stations_df)

KeplerGl(data={'unnamed': {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,…

## Arizona Retail Stores

Next, we'll use the number of retail stores per ZCTA as a proxy for demand, using the Overture Maps Foundation public point-of-interest data set.

In [14]:
sedona.table("wherobots_open_data.overture_2024_02_15.places_place").count()

53622897

In [15]:
az_retail_df = sedona.sql("""
SELECT COUNT(*) AS num, any_value(az_zta5.geometry) AS geometry, ZCTA5CE10
FROM wherobots_open_data.overture_2024_02_15.places_place places 
JOIN az_zta5
WHERE ST_Intersects(az_zta5.geometry, places.geometry)
AND places.categories.main = "retail"
GROUP BY ZCTA5CE10 
ORDER BY num DESC
""")

In [21]:
az_retail_df.createOrReplaceTempView("az_retail")

In [16]:
az_retail_df.cache().show(5)

                                                                                

+---+--------------------+---------+
|num|            geometry|ZCTA5CE10|
+---+--------------------+---------+
| 22|POLYGON ((-112.14...|    85226|
| 20|POLYGON ((-111.92...|    85260|
| 15|POLYGON ((-111.95...|    85251|
| 15|POLYGON ((-111.86...|    85210|
| 15|POLYGON ((-111.97...|    85284|
+---+--------------------+---------+
only showing top 5 rows



In [18]:
SedonaKepler.create_map(az_retail_df)

KeplerGl(data={'unnamed': {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,…

## Combining Retail Stores & Existing EV Chargers

Before we apply WherobotsAI raster inference to identify solar farms in the area, we'll use existing EV chargers and retail stores to identify ZCTAs with high demand and low existing EV charging infrastructure by computing the ratio of EV chargers to retail stores in each ZCTA. A ZCTA that is missing from one of the two tables is treated as having 0 chargers or, to avoid dividing by zero, 1 retail store.


In [37]:
az_ratio = sedona.sql("""
SELECT 
    coalesce(az_stations.num, 0) / coalesce(az_retail.num, 1) AS ratio, 
    coalesce(az_stations.geometry, az_retail.geometry) AS geometry, 
    coalesce(az_stations.ZCTA5CE10, az_retail.ZCTA5CE10) AS ZCTA5CE10
FROM az_retail FULL OUTER JOIN az_stations
ON az_retail.ZCTA5CE10 = az_stations.ZCTA5CE10
ORDER BY ratio ASC
""")
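
As a rough plain-Python sketch of what the `FULL OUTER JOIN` and `coalesce` defaults do, the snippet below reuses the counts shown earlier in this notebook for three ZCTAs that appear in both result tables. The `RETAIL_ONLY` and `STATIONS_ONLY` keys and their counts are hypothetical placeholders added only to exercise the one-sided cases.

```python
# Counts for ZCTAs that appear in both result tables shown in this notebook.
station_counts = {"85226": 27, "85260": 33, "85251": 36}
retail_counts = {"85226": 22, "85260": 20, "85251": 15}

# Hypothetical placeholder rows to exercise the two one-sided cases.
retail_counts["RETAIL_ONLY"] = 8     # retail stores but no charging stations
station_counts["STATIONS_ONLY"] = 4  # charging stations but no retail stores

# FULL OUTER JOIN: every ZCTA present on either side.
all_zctas = sorted(set(station_counts) | set(retail_counts))

# coalesce(stations.num, 0) / coalesce(retail.num, 1)
ratios = {
    zcta: station_counts.get(zcta, 0) / retail_counts.get(zcta, 1)
    for zcta in all_zctas
}

# ORDER BY ratio ASC: the least-served ZCTAs come first.
for zcta, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{zcta}: {ratio:.2f}")
```

A ZCTA with retail stores but no chargers gets a ratio of exactly 0.0, while a ZCTA with chargers but no retail stores falls back to a denominator of 1. A ZCTA that appears in neither table is absent from the result entirely.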

In [38]:
az_ratio.show()

                                                                                

+-----+--------------------+---------+
|ratio|            geometry|ZCTA5CE10|
+-----+--------------------+---------+
|  0.0|POLYGON ((-112.24...|    85033|
|  0.0|POLYGON ((-111.84...|    85249|
|  0.0|POLYGON ((-113.33...|    85354|
|  0.0|MULTIPOLYGON (((-...|    85375|
|  0.0|POLYGON ((-114.81...|    85349|
|  0.0|POLYGON ((-112.64...|    86052|
|  0.0|POLYGON ((-111.54...|    85735|
|  0.0|POLYGON ((-113.81...|    85360|
|  0.0|POLYGON ((-113.33...|    86305|
|  0.0|POLYGON ((-110.75...|    85631|
|  0.0|POLYGON ((-113.16...|    85332|
|  0.0|POLYGON ((-109.53...|    86512|
|  0.0|POLYGON ((-111.05...|    85539|
|  0.0|POLYGON ((-111.68...|    85544|
|  0.0|POLYGON ((-113.33...|    86321|
|  0.0|POLYGON ((-112.79...|    85361|
|  0.0|POLYGON ((-112.54...|    86323|
|  0.0|POLYGON ((-112.20...|    85083|
|  0.0|POLYGON ((-109.10...|    86515|
|  0.0|POLYGON ((-110.19...|    85939|
+-----+--------------------+---------+
only showing top 20 rows



In [39]:
SedonaKepler.create_map(az_ratio)

KeplerGl(data={'unnamed': {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,…

ZCTAs with a low "ratio" are potential candidates for additional EV charging stations. The final input to our analysis is proximity to solar farms, which we will identify using WherobotsAI raster inference.

## WherobotsAI Raster Inference

TODO: identify solar farms within ZCTAs, prioritize low "ratio" ZCTAs
