### Packages

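Use the `findspark` module to locate the PySpark installation.
- Run `pip install findspark` to install `findspark`.
- Import `findspark`, then call `findspark.init()` to add PySpark to `sys.path`. Optionally, call `findspark.find()` to get the path of the Spark installation it located.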
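
When PySpark was installed with `pip install pyspark`, it is usually importable already and `findspark` is not needed. The sketch below checks for that first and falls back to `findspark.init()` only when the import would fail. The helper names `needs_findspark` and `make_pyspark_importable` are hypothetical, and the fallback assumes `findspark` is installed and Spark can be located (for example through the `SPARK_HOME` environment variable).

```python
import importlib.util


def needs_findspark(module_name: str = "pyspark") -> bool:
    """Return True when `module_name` cannot be found on the current sys.path."""
    return importlib.util.find_spec(module_name) is None


def make_pyspark_importable() -> str:
    """Hypothetical helper: report how PySpark became importable."""
    if not needs_findspark():
        return "pyspark was already importable"
    import findspark  # assumes `pip install findspark` has been run

    findspark.init()  # adds the local Spark installation to sys.path
    return f"findspark located Spark at {findspark.find()}"
```

Calling `make_pyspark_importable()` once, before `from pyspark.sql import SparkSession`, would cover both kinds of installation.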

In [66]:
import findspark
# Make the local Spark installation importable before importing pyspark
findspark.init()
from pyspark.sql import SparkSession

### Session settings

In [67]:
spark = SparkSession.builder.appName('Session-001').getOrCreate()

### Create dataframe from CSV

In [68]:
ev = spark.read.csv('data/ev-population.csv', header=True, inferSchema=True)

In [69]:
# Show dataframe
ev.show()

+----------+---------+-----------------+-----+-----------+----------+---------+-------+---------------------+-------------------------------------------------+--------------+---------+--------------------+--------------+--------------------+--------------------+-----------------+
|VIN (1-10)|   County|             City|State|Postal Code|Model Year|     Make|  Model|Electric Vehicle Type|Clean Alternative Fuel Vehicle (CAFV) Eligibility|Electric Range|Base MSRP|Legislative District|DOL Vehicle ID|    Vehicle Location|    Electric Utility|2020 Census Tract|
+----------+---------+-----------------+-----+-----------+----------+---------+-------+---------------------+-------------------------------------------------+--------------+---------+--------------------+--------------+--------------------+--------------------+-----------------+
|5YJXCAE26J|   Yakima|           Yakima|   WA|      98908|      2018|    TESLA|MODEL X| Battery Electric ...|                             Clean Alternative..

#### Create a subset of the dataframe by selecting certain columns

In [70]:
ev_subset = ev.select(['County', 'State', 'Model Year', 'Make', 'Electric Vehicle Type'])

ev_subset.show()

+---------+-----+----------+---------+---------------------+
|   County|State|Model Year|     Make|Electric Vehicle Type|
+---------+-----+----------+---------+---------------------+
|   Yakima|   WA|      2018|    TESLA| Battery Electric ...|
|   Kitsap|   WA|      2021|    HONDA| Plug-in Hybrid El...|
|     King|   WA|      2019|    TESLA| Battery Electric ...|
|     King|   WA|      2013|   NISSAN| Battery Electric ...|
| Thurston|   WA|      2017|    TESLA| Battery Electric ...|
| Thurston|   WA|      2018|    HONDA| Plug-in Hybrid El...|
| Thurston|   WA|      2016|     FORD| Plug-in Hybrid El...|
| Thurston|   WA|      2023|     AUDI| Plug-in Hybrid El...|
| Thurston|   WA|      2014|     FORD| Plug-in Hybrid El...|
|   Kitsap|   WA|      2013|    TESLA| Battery Electric ...|
| Thurston|   WA|      2015|   NISSAN| Battery Electric ...|
|   Yakima|   WA|      2022|    TESLA| Battery Electric ...|
| Thurston|   WA|      2020|    TESLA| Battery Electric ...|
| Thurston|   WA|      2

#### Create new columns

In [71]:
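# Import SQL functions (concat and lit are used below).
# Note: the wildcard import also shadows Python built-ins such as sum, min, and max.
from pyspark.sql.functions import *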

In [72]:
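# withColumns (Spark 3.3+) adds columns, or replaces them when the name already exists.
# Here 'County' is overwritten with a "<County> County, <State>" label.
ev_subset = ev_subset.withColumns({
    'County': concat(ev_subset.County, lit(' County, '), ev_subset.State)
})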

ev_subset.show()

+--------------------+-----+----------+---------+---------------------+
|              County|State|Model Year|     Make|Electric Vehicle Type|
+--------------------+-----+----------+---------+---------------------+
|   Yakima County, WA|   WA|      2018|    TESLA| Battery Electric ...|
|   Kitsap County, WA|   WA|      2021|    HONDA| Plug-in Hybrid El...|
|     King County, WA|   WA|      2019|    TESLA| Battery Electric ...|
|     King County, WA|   WA|      2013|   NISSAN| Battery Electric ...|
| Thurston County, WA|   WA|      2017|    TESLA| Battery Electric ...|
| Thurston County, WA|   WA|      2018|    HONDA| Plug-in Hybrid El...|
| Thurston County, WA|   WA|      2016|     FORD| Plug-in Hybrid El...|
| Thurston County, WA|   WA|      2023|     AUDI| Plug-in Hybrid El...|
| Thurston County, WA|   WA|      2014|     FORD| Plug-in Hybrid El...|
|   Kitsap County, WA|   WA|      2013|    TESLA| Battery Electric ...|
| Thurston County, WA|   WA|      2015|   NISSAN| Battery Electr

#### Rename columns

In [73]:
# withColumnsRenamed (Spark 3.4+) takes a mapping of {existing name: new name}
ev_subset = ev_subset.withColumnsRenamed({
    'Electric Vehicle Type': 'EV Type',
})

ev_subset.show()

+--------------------+-----+----------+---------+--------------------+
|              County|State|Model Year|     Make|             EV Type|
+--------------------+-----+----------+---------+--------------------+
|   Yakima County, WA|   WA|      2018|    TESLA|Battery Electric ...|
|   Kitsap County, WA|   WA|      2021|    HONDA|Plug-in Hybrid El...|
|     King County, WA|   WA|      2019|    TESLA|Battery Electric ...|
|     King County, WA|   WA|      2013|   NISSAN|Battery Electric ...|
| Thurston County, WA|   WA|      2017|    TESLA|Battery Electric ...|
| Thurston County, WA|   WA|      2018|    HONDA|Plug-in Hybrid El...|
| Thurston County, WA|   WA|      2016|     FORD|Plug-in Hybrid El...|
| Thurston County, WA|   WA|      2023|     AUDI|Plug-in Hybrid El...|
| Thurston County, WA|   WA|      2014|     FORD|Plug-in Hybrid El...|
|   Kitsap County, WA|   WA|      2013|    TESLA|Battery Electric ...|
| Thurston County, WA|   WA|      2015|   NISSAN|Battery Electric ...|
|   Ya
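
Because `withColumnsRenamed` takes a plain dictionary, the mapping can also be generated instead of typed by hand. As a minimal sketch, the hypothetical helpers `to_snake_case` and `snake_case_mapping` below build a mapping that converts names such as `Model Year` to `model_year`. They operate on plain strings, so they can be tried without a Spark session.

```python
import re


def to_snake_case(name: str) -> str:
    """Lower-case a column name and join its alphanumeric runs with underscores."""
    words = re.findall(r"[A-Za-z0-9]+", name)
    return "_".join(word.lower() for word in words)


def snake_case_mapping(columns: list[str]) -> dict[str, str]:
    """Build an {existing name: new name} mapping, skipping names that stay the same."""
    return {c: to_snake_case(c) for c in columns if to_snake_case(c) != c}


# Column names as they appear in the renamed dataframe above
mapping = snake_case_mapping(["County", "State", "Model Year", "Make", "EV Type"])
print(mapping)
```

With the dataframe from this notebook, the mapping could be applied as `ev_subset.withColumnsRenamed(snake_case_mapping(ev_subset.columns))`.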

#### Delete columns

In [74]:
columns_to_drop = ['EV Type', 'State']
# drop() takes column names as separate arguments, so unpack the list
ev_subset = ev_subset.drop(*columns_to_drop)

ev_subset.show()

+--------------------+----------+---------+
|              County|Model Year|     Make|
+--------------------+----------+---------+
|   Yakima County, WA|      2018|    TESLA|
|   Kitsap County, WA|      2021|    HONDA|
|     King County, WA|      2019|    TESLA|
|     King County, WA|      2013|   NISSAN|
| Thurston County, WA|      2017|    TESLA|
| Thurston County, WA|      2018|    HONDA|
| Thurston County, WA|      2016|     FORD|
| Thurston County, WA|      2023|     AUDI|
| Thurston County, WA|      2014|     FORD|
|   Kitsap County, WA|      2013|    TESLA|
| Thurston County, WA|      2015|   NISSAN|
|   Yakima County, WA|      2022|    TESLA|
| Thurston County, WA|      2020|    TESLA|
| Thurston County, WA|      2020|    TESLA|
| Thurston County, WA|      2017|     FORD|
|   Kitsap County, WA|      2020|      KIA|
|   Yakima County, WA|      2015|   NISSAN|
| Thurston County, WA|      2014|    TESLA|
|Snohomish County, WA|      2015|CHEVROLET|
|   Chelan County, WA|      2017
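
`DataFrame.drop` is a no-op for column names that are not in the schema, so a typo in `columns_to_drop` goes unnoticed. A minimal sketch of a guard follows, using a hypothetical helper `split_droppable` that works on plain lists of names. The comparison is case-sensitive, whereas Spark resolves column names case-insensitively by default, so a "missing" result is a prompt to double-check rather than a hard error.

```python
def split_droppable(existing: list[str], to_drop: list[str]) -> tuple[list[str], list[str]]:
    """Split `to_drop` into names present in `existing` and names that are missing."""
    present = [name for name in to_drop if name in existing]
    missing = [name for name in to_drop if name not in existing]
    return present, missing


# Column names from the dataframe before the drop; 'Sate' is a deliberate typo.
columns = ["County", "State", "Model Year", "Make", "EV Type"]
present, missing = split_droppable(columns, ["EV Type", "Sate"])
print(present)  # ['EV Type']
print(missing)  # ['Sate']
```

In the notebook this could be used as `present, missing = split_droppable(ev_subset.columns, columns_to_drop)`, followed by `ev_subset.drop(*present)` once `missing` has been reviewed.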