# Capstone Project - Sustainable Transportation in Charlottesville: EV Charging Stations
### The Battle of the Neighborhood Week 1
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results](#results)
* [Discussion](#discussion)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

Transportation emissions account for 29% of the greenhouse gas (GHG) emissions in the U.S., making it the largest contributor to Earth's warming atmosphere and climate change.  Traditional gasoline powered vehicles burn fossil fuels and release carbon dioxide, a powerful greenhouse gas, into the atmosphere [EPA, 2020]  In Virginia, nearly half of all GHGs come from the transportation sector. [EIA, 2019]  Switching from a gasoline powered vehicle to an electric vehicle could reduce GHG emissions by 73%. [DOE, 2020] Incentives for people to choose sustainable transportation methods are needed to help decrease fossil fuel use in the transportation sector.

As more people switch to electric vehicles, they shop, dine, and visit areas that make it easy for them to charge their vehicle.  This project explores the various EV charging stations in Charlottesville, Virginia to determine the top 10 venues that are within a radius of 400 meters (.25 miles) of each EV charging station.  Next, it determines the ten most common venue categories near each station.  The project groups the EV charging stations into clusters using k-means clustering and visualizes EV charging station locations in Charlottesville and their clusters.  

This project is targeted to EV owner stakeholders who would like to charge their vehicle while dining, shopping, or visiting the local area near the EV charging station.  The resulting ten common venue categories near each station will help EV owners choose the best charging station to use for their needs.  This project helps an EV owner answer questions such as "Which EV charging station should I use if I want to get pizza while my car charges?" and "Which EV charging station should I use if I want to do some shopping while my car charges?".

--- [EPA. _Carbon Pollution from Transportation_. Retrieved March 6, 2020.](https://www.epa.gov/transportation-air-pollution-and-climate-change/carbon-pollution-transportation)

--- [EIA. _Energy-Related Carbon Dioxide Emissions by State, 2005-2016_. Retrieved March 6, 2020.](https://www.eia.gov/environment/emissions/state/analysis/pdf/stateanalysis.pdf)

--- [DOE. _Emissions from Hybrid and Plug-In Electric Vehicles_. Retrieved March 6, 2020.](https://afdc.energy.gov/vehicles/electric_emissions.html)


## Data <a name="data"></a>

This project uses the following two data sources:

* [Charlottesville Open Data - Green Infrastructure (Transportation) dataset](https://opendata.arcgis.com/datasets/4c9a19905e3b43bba02b9a540685b3e2_71.csv)
* [Foursquare dataset](https://developer.foursquare.com/)

Specifically the project uses the following data:
1. Locations of EV charging stations in Charlottesville, VA expressed in latitude and longitude.  This data is extracted from the CSV file on the Charlottesville Open Data portal located at https://opendata.arcgis.com/datasets/4c9a19905e3b43bba02b9a540685b3e2_71.csv.  An example of this type of data looks like the following:


| X           | Y           | OBJECTID | Webmap     | Entry                     | Type                | Description    |Address        |
|------------ |:-----------:|:--------:|:----------:|:-------------------------:|:-------------------:|:--------------:|:-------------:|
|-78.49324869 | 38.03193056 | 4        | EV Support | The Flats at West Village | EV Charging Station | Tesla Charger | 852 W Main St |


2. Types of venues near each charging stations.  This data is extracted using the Foursquare API.  The Foursquare RESTful API can be accessed at https://api.foursquare.com/v2.  An example of this type of data looks like the following:

## Methodology <a name="methodology"></a>

This project explores the Foursquare venue data around Charlottesville, Virginia EV charging stations.  The project displays the top 10 nearby venues and clusters and visualizes the 10 most common venue categories within a radius of 400 meters (.25 miles) of each EV charging station

**1. [Import and prepare the data](#prepare):**  The Charlottesville Green Infrastructure dataset contains sustainable transportation infrastructure including public transit, bike and pedestrian infrastructure, ride sharing and alternate fueling locations.  Download and convert the Charlottesville Green Infrastructure CSV file into a Pandas dataframe.  Next, slice the data frame to extract EV Charging Station entries.  

**2. [Visualize the EV Charging Stations](#visualize):**  To visualize the EV charging stations, create a map of Charlottesville and add markers to display the locations of the EV Charging Stations.

**3. [Display nearby venues](#display):**  Create Foursquare credentials and display the top 10 venues that are within a radius of 400 meters (.25 miles) of each EV Charging Station.  

**4. [Analyze the venues](#analyze):**  Create a dataframe with each venues name, latitude, longitude, and category.  Calculate the number of unique venue categories.  Calculate the mean of the frequency of occurrence of each category.

**5. [Display the most common venues](#common):**  Display each EV Charging Station along with the 10 most common venues.

**6. [Cluster and visualize EV Charging Station clusters](#cluster):**  Run k-means to cluster the neighborhood into 4 clusters.  K-means was chosen to group EV charging station areas with similar characteristics together.  K-means minimizes intra-cluster distances and maximizes inter-cluster distances.  Visualize the resulting clusters and add markers to the map.  Examine each cluster to display the 1st-10th most common venues.


## Analysis <a name="analysis"></a>

### Import and prepare the data <a name="prepare"></a>

This section prepares the Python platform and data. Necessary libraries are imported. The Charlottesville Green Infrastructure dataset is imported and converted into a Pandas dataframe.

Let's visualize the data we have so far: city center location and candidate neighborhood centers:

In [1]:
#import libraries
import numpy as np
import pandas as pd
!conda install -c conda-forge folium=0.5.0 --yes
import folium #map rendering library
!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim #convert an address into lat and long values
import json # library to handle JSON files
import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe
from sklearn.cluster import KMeans # import k-means from clustering stage
import matplotlib.cm as cm # Matplotlib and associated plotting modules
import matplotlib.colors as colors

Solving environment: done


  current version: 4.5.11
  latest version: 4.8.2

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs: 
    - folium=0.5.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2019.11.28         |           py36_0         149 KB  conda-forge
    scikit-learn-0.20.1        |   py36h22eb022_0         5.7 MB
    liblapack-3.8.0            |      11_openblas          10 KB  conda-forge
    numpy-1.18.1               |   py36h95a1406_0         5.2 MB  conda-forge
    liblapacke-3.8.0           |      11_openblas          10 KB  conda-forge
    libopenblas-0.3.6          |       h5a2b251_2         7.7 MB
    scipy-1.4.1                |   py36h921218d_0        18.9 MB  conda-forge
    libcblas-3.8.0             |      11_openblas 

In [4]:
#read csv file into dataframe
url = "https://opendata.arcgis.com/datasets/4c9a19905e3b43bba02b9a540685b3e2_71.csv"
df = pd.read_csv(url)

In [5]:
#check the dataframe
df

Unnamed: 0,X,Y,OBJECTID,Webmap,Entry,Type_,Description,Address
0,-78.486762,38.032283,1,Bike Support,Carver Recreation Center,Bike Fix-it Station,A fix-it station is a temporary or permanent f...,233 4th St NW
1,-78.477374,38.029038,2,Bike Support,Downtown Transit Center,Bike Fix-it Station,A fix-it station is a temporary or permanent f...,615 Water St E
2,-78.454631,38.024054,3,Bike Support,Riverview Park,Bike Fix-it Station,A fix-it station is a temporary or permanent f...,end of Chesapeake Street
3,-78.493249,38.031931,4,EV Support,The Flats at West Village,EV Charging Station,Tesla Charger,852 W Main St
4,-78.480905,38.031099,5,EV Support,First and Market Parking Garage,EV Charging Station,"Level 2, DC Fast",104 1st St N
5,-78.481435,38.029704,6,EV Support,Water Street Parking,EV Charging Station,DC Fast,Water Street
6,-78.507632,38.031464,7,EV Support,Oakhurst Inn,EV Charging Station,Level 2,100 Oakhurst Circle
7,-78.469102,38.023136,8,EV Support,Martin Horn,EV Charging Station,Level 2,210 Carlton Rd
8,-78.488893,38.061129,9,EV Support,Homewood Suites,EV Charging Station,DC Fast,2036 India Rod
9,-78.487974,38.035969,10,EV Support,Timbercreek Market,EV Charging Station,Level 2,722 Preston Ave


In [7]:
#slice the dataframe to create a new dataframe with only entries for the EV Charging Stations
df_ev = df[df['Type_'] == 'EV Charging Station'].reset_index(drop=True)
df_ev

Unnamed: 0,X,Y,OBJECTID,Webmap,Entry,Type_,Description,Address
0,-78.493249,38.031931,4,EV Support,The Flats at West Village,EV Charging Station,Tesla Charger,852 W Main St
1,-78.480905,38.031099,5,EV Support,First and Market Parking Garage,EV Charging Station,"Level 2, DC Fast",104 1st St N
2,-78.481435,38.029704,6,EV Support,Water Street Parking,EV Charging Station,DC Fast,Water Street
3,-78.507632,38.031464,7,EV Support,Oakhurst Inn,EV Charging Station,Level 2,100 Oakhurst Circle
4,-78.469102,38.023136,8,EV Support,Martin Horn,EV Charging Station,Level 2,210 Carlton Rd
5,-78.488893,38.061129,9,EV Support,Homewood Suites,EV Charging Station,DC Fast,2036 India Rod
6,-78.487974,38.035969,10,EV Support,Timbercreek Market,EV Charging Station,Level 2,722 Preston Ave


### Visualize the EV Charging Stations<a name="visualize"></a>

In [8]:
#create a map of Charlottesville with EV Charging Stations dislayed
#get lat long for Charlottesville
address = 'Charlottesville, VA'
geolocator = Nominatim(user_agent="cv_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The coordinates of Charlottesville are {}, {}.'.format(latitude, longitude))

The coordinates of Charlottesville are 38.029306, -78.4766781.


In [9]:
#create map of Charlottesville using lat long valies
map_cville = folium.Map(location=[latitude, longitude], zoom_start=13)

#add markers for EV Charging Stations to map. In the df_ev dataframe Y=lat, X=long.
for lat, lng, entry, address, description in zip(df_ev['Y'], df_ev['X'], df_ev['Entry'], df_ev['Address'], df_ev['Description']):
      
    label = '{}, {}, {}'.format(entry, address, description)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_cville)
    
map_cville

### Display nearby venues<a name="display"></a>

In [10]:
#define Foursquare Credentials and Version
CLIENT_ID = 'E0KWLLFTSDLJ11AVSSSZOJXRIJZIA3EMIAFMYDUVQJMHNQYG' # your Foursquare ID
CLIENT_SECRET = '3TBMX1RPU5UA5IK01VCUNAZ45QAH1SU3LJVCGRDDAP5ER4BR' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: E0KWLLFTSDLJ11AVSSSZOJXRIJZIA3EMIAFMYDUVQJMHNQYG
CLIENT_SECRET:3TBMX1RPU5UA5IK01VCUNAZ45QAH1SU3LJVCGRDDAP5ER4BR


In [11]:
#Get and entry name
df_ev.loc[0, 'Entry']

'The Flats at West Village'

In [12]:
#Get the entry's latitude and longitude values.
entry_latitude = df_ev.loc[0, 'Y'] # entry latitude value
entry_longitude = df_ev.loc[0, 'X'] # entry longitude value

entry_name = df_ev.loc[0, 'Entry'] # entry name

print('Latitude and longitude values of {} are {}, {}.'.format(entry_name, entry_latitude, entry_longitude))

Latitude and longitude values of The Flats at West Village are 38.031930555712535, -78.49324869297422.


In [13]:
#get the top 10 venues that are within a radius of 400 meters (.25 miles) of The Flats at West Village

LIMIT = 10 # limit of number of venues returned by Foursquare API

radius = 400 # define radius

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    entry_latitude, 
    entry_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=E0KWLLFTSDLJ11AVSSSZOJXRIJZIA3EMIAFMYDUVQJMHNQYG&client_secret=3TBMX1RPU5UA5IK01VCUNAZ45QAH1SU3LJVCGRDDAP5ER4BR&v=20180605&ll=38.031930555712535,-78.49324869297422&radius=400&limit=10'

In [14]:
#send the GET request and examine the results
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e650d4147b43d0028f58aea'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'},
    {'name': '$-$$$$', 'key': 'price'}]},
  'headerLocation': 'Charlottesville',
  'headerFullLocation': 'Charlottesville',
  'headerLocationGranularity': 'city',
  'totalResults': 25,
  'suggestedBounds': {'ne': {'lat': 38.03553055931254,
    'lng': -78.48868676569252},
   'sw': {'lat': 38.02833055211253, 'lng': -78.49781062025592}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4b81e93ff964a520aec330e3',
       'name': 'Continental Divide',
       'location': {'address': '811 W Main St',
        'lat': 38.03174846372526,
        'lng': -78.49103648727873,
        'labeledLatLngs': [{'label

In [15]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [16]:
#clean the json and structure it into a pandas dataframe
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues

Unnamed: 0,name,categories,lat,lng
0,Continental Divide,Mexican Restaurant,38.031748,-78.491036
1,Sugar Shack Donuts & Coffee,Donut Shop,38.032918,-78.494927
2,Wild Wing Cafe,Wings Joint,38.031532,-78.491995
3,Peloton Station,Sports Bar,38.033243,-78.494078
4,Mel's Cafe,Southern / Soul Food Restaurant,38.031678,-78.490575
5,Hardywood Pilot Brewery & Taproom,Brewery,38.032496,-78.495093
6,Doma Korean Kitchen,Korean Restaurant,38.031693,-78.489938
7,"The Draftsman, Autograph Collection",Hotel,38.032613,-78.496445
8,Potbelly's,Sandwich Place,38.03257,-78.49405
9,Snowing In Space Coffee,Coffee Shop,38.031559,-78.489936


In [17]:
#create a function to repeat the same process to all the EV Charging Station in Charlottesville
def getNearbyVenues(names, latitudes, longitudes, radius=400):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        
        print("------EV Charging Station: "+name+"------")
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
        
        for v in results:
            print(v['venue']['name'])
        print('\n')
    
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Entry', 
                  'Entry Latitude', 
                  'Entry Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [18]:
#run the above function on each entry and create a new dataframe called cville_venues.
cville_venues = getNearbyVenues(names=df_ev['Entry'],
                                   latitudes=df_ev['Y'],
                                   longitudes=df_ev['X']
                                  )

------EV Charging Station: The Flats at West Village------
Continental Divide
Sugar Shack Donuts & Coffee
Wild Wing Cafe
Peloton Station
Mel's Cafe
Hardywood Pilot Brewery & Taproom
Doma Korean Kitchen
The Draftsman, Autograph Collection
Potbelly's
Snowing In Space Coffee


------EV Charging Station: First and Market Parking Garage------
The Jefferson Theater
Charlottesville Historic Downtown Mall
Jack Brown's Beer & Burger Joint
Revolutionary Soup
The Paramount
The Alley Light
Charlottesville City Market
The Pie Chest
Market Street Wineshop Downtown
Splendora's Gelato


------EV Charging Station: Water Street Parking------
The Jefferson Theater
Charlottesville Historic Downtown Mall
Jack Brown's Beer & Burger Joint
Revolutionary Soup
The Paramount
The Alley Light
Charlottesville City Market
Splendora's Gelato
Blue Whale Books
Mudhouse


------EV Charging Station: Oakhurst Inn------
Oakhurst Inn
Starbucks
Einstein Bros. Bagels
West Range Cafe


------EV Charging Station: Martin Horn---

### Analyze the venues<a name="analyze"></a>

In [19]:
#check the dataframe
print(cville_venues.shape)
cville_venues

(63, 7)


Unnamed: 0,Entry,Entry Latitude,Entry Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Flats at West Village,38.031931,-78.493249,Continental Divide,38.031748,-78.491036,Mexican Restaurant
1,The Flats at West Village,38.031931,-78.493249,Sugar Shack Donuts & Coffee,38.032918,-78.494927,Donut Shop
2,The Flats at West Village,38.031931,-78.493249,Wild Wing Cafe,38.031532,-78.491995,Wings Joint
3,The Flats at West Village,38.031931,-78.493249,Peloton Station,38.033243,-78.494078,Sports Bar
4,The Flats at West Village,38.031931,-78.493249,Mel's Cafe,38.031678,-78.490575,Southern / Soul Food Restaurant
...,...,...,...,...,...,...,...
58,Timbercreek Market,38.035969,-78.487974,Random Row Brewing Co.,38.035058,-78.486472,Brewery
59,Timbercreek Market,38.035969,-78.487974,Ace Biscuit & Barbecue,38.038035,-78.484864,BBQ Joint
60,Timbercreek Market,38.035969,-78.487974,Sticks Kebob Shop,38.038308,-78.489803,Middle Eastern Restaurant
61,Timbercreek Market,38.035969,-78.487974,Integral Yoga Natural Foods,38.038881,-78.489677,Health Food Store


In [20]:
#how many venues were returned for each Entry (EV Charging Station)
cville_venues.groupby('Entry').count()

Unnamed: 0_level_0,Entry Latitude,Entry Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Entry,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
First and Market Parking Garage,10,10,10,10,10,10
Homewood Suites,10,10,10,10,10,10
Martin Horn,9,9,9,9,9,9
Oakhurst Inn,4,4,4,4,4,4
The Flats at West Village,10,10,10,10,10,10
Timbercreek Market,10,10,10,10,10,10
Water Street Parking,10,10,10,10,10,10


In [21]:
#how many unique categories can be curated from all the returned venues
print('There are {} uniques categories.'.format(len(cville_venues['Venue Category'].unique())))

There are 43 uniques categories.


In [22]:
#analyze each EV Charging Station area
# one hot encoding
cville_onehot = pd.get_dummies(cville_venues[['Venue Category']], prefix="", prefix_sep="")

# add entry column back to dataframe
cville_onehot['Entry'] = cville_venues['Entry'] 

# move Entry column to the first column
fixed_columns = [cville_onehot.columns[-1]] + list(cville_onehot.columns[:-1])
cville_onehot = cville_onehot[fixed_columns]
cville_onehot.head()

Unnamed: 0,Entry,American Restaurant,Art Gallery,BBQ Joint,Bagel Shop,Bakery,Bar,Beer Garden,Bookstore,Brewery,...,Pub,Sandwich Place,Shopping Mall,Soup Place,Southern / Soul Food Restaurant,Speakeasy,Sports Bar,Thai Restaurant,Wine Shop,Wings Joint
0,The Flats at West Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,The Flats at West Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,The Flats at West Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
3,The Flats at West Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
4,The Flats at West Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,1,0,0,0,0,0


In [23]:
#new dataframe size
cville_onehot.shape

(63, 44)

In [24]:
#group rows by Entry and by taking the mean of the frequency of occurrence of each category
cville_grouped = cville_onehot.groupby('Entry').mean().reset_index()
cville_grouped

Unnamed: 0,Entry,American Restaurant,Art Gallery,BBQ Joint,Bagel Shop,Bakery,Bar,Beer Garden,Bookstore,Brewery,...,Pub,Sandwich Place,Shopping Mall,Soup Place,Southern / Soul Food Restaurant,Speakeasy,Sports Bar,Thai Restaurant,Wine Shop,Wings Joint
0,First and Market Parking Garage,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.1,0.1,0.0,0.1,0.0,0.0,0.1,0.0
1,Homewood Suites,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0
2,Martin Horn,0.0,0.111111,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0
3,Oakhurst Inn,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,The Flats at West Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,...,0.0,0.1,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.1
5,Timbercreek Market,0.0,0.0,0.1,0.1,0.1,0.0,0.1,0.0,0.1,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Water Street Parking,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,...,0.0,0.0,0.1,0.1,0.0,0.1,0.0,0.0,0.0,0.0


In [25]:
#confirm the new size
cville_grouped.shape

(7, 44)

### Display the most common venues<a name="common"></a>

In [27]:
#print each EV Charging Station along with the top 10 most common venues
num_top_venues = 10

for ev in cville_grouped['Entry']:
    print("----"+ev+"----")
    temp = cville_grouped[cville_grouped['Entry'] == ev].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----First and Market Parking Garage----
                 venue  freq
0          Music Venue   0.2
1         Dessert Shop   0.1
2            Wine Shop   0.1
3            Speakeasy   0.1
4           Soup Place   0.1
5        Shopping Mall   0.1
6         Burger Joint   0.1
7       Farmers Market   0.1
8             Pie Shop   0.1
9  American Restaurant   0.0


----Homewood Suites----
                      venue  freq
0             Grocery Store   0.2
1         Indian Restaurant   0.2
2       American Restaurant   0.1
3                       Pub   0.1
4       Dumpling Restaurant   0.1
5  Mediterranean Restaurant   0.1
6                     Hotel   0.1
7                 Wine Shop   0.1
8                Soup Place   0.0
9             Shopping Mall   0.0


----Martin Horn----
                             venue  freq
0               Chinese Restaurant  0.11
1                 Business Service  0.11
2                      Art Gallery  0.11
3                   Farmers Market  0.11
4             

In [29]:
#put the top 10 venues for each EV Charging Station into a dataframe
#function to sort the venues in descending order
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [31]:
#create a new dataframe and display the top 10 venues for each EV Charging Station
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Entry']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
ev_venues_sorted = pd.DataFrame(columns=columns)
ev_venues_sorted['Entry'] = cville_grouped['Entry']

for ind in np.arange(cville_grouped.shape[0]):
    ev_venues_sorted.iloc[ind, 1:] = return_most_common_venues(cville_grouped.iloc[ind, :], num_top_venues)

ev_venues_sorted

Unnamed: 0,Entry,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,First and Market Parking Garage,Music Venue,Shopping Mall,Dessert Shop,Wine Shop,Burger Joint,Pie Shop,Farmers Market,Soup Place,Speakeasy,Café
1,Homewood Suites,Grocery Store,Indian Restaurant,American Restaurant,Wine Shop,Hotel,Pub,Mediterranean Restaurant,Dumpling Restaurant,Café,Dessert Shop
2,Martin Horn,Farmers Market,Thai Restaurant,Art Gallery,Bar,Comfort Food Restaurant,Pizza Place,Laundry Service,Chinese Restaurant,Business Service,Café
3,Oakhurst Inn,Café,Hotel,Bagel Shop,Coffee Shop,Wings Joint,Farmers Market,Dumpling Restaurant,Donut Shop,Dessert Shop,Comfort Food Restaurant
4,The Flats at West Village,Wings Joint,Sandwich Place,Hotel,Coffee Shop,Korean Restaurant,Mexican Restaurant,Brewery,Donut Shop,Sports Bar,Southern / Soul Food Restaurant
5,Timbercreek Market,Health Food Store,BBQ Joint,Bagel Shop,Bakery,Beer Garden,Juice Bar,Brewery,Garden Center,Middle Eastern Restaurant,Chinese Restaurant
6,Water Street Parking,Music Venue,Bookstore,Burger Joint,Dessert Shop,Speakeasy,Soup Place,Shopping Mall,Coffee Shop,Farmers Market,Donut Shop


### Cluster and visualize EV Charging Station clusters<a name="cluster"></a>

In [32]:
#Cluster EV Charging Stations
#Run k-means to cluster the neighborhood into 4 clusters
# set number of clusters
kclusters = 4

cville_grouped_clustering = cville_grouped.drop('Entry', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(cville_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 3, 2, 0, 2, 2, 1], dtype=int32)

In [33]:
#create a new dataframe that includes the cluster as well as the top 10 venues for each EV Charging Station
# add clustering labels
ev_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

cville_merged = df_ev

# merge cville_grouped with df_ev to add latitude/longitude for each entry
cville_merged = cville_merged.join(ev_venues_sorted.set_index('Entry'), on='Entry')

cville_merged # check the last columns!

Unnamed: 0,X,Y,OBJECTID,Webmap,Entry,Type_,Description,Address,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,-78.493249,38.031931,4,EV Support,The Flats at West Village,EV Charging Station,Tesla Charger,852 W Main St,2,Wings Joint,Sandwich Place,Hotel,Coffee Shop,Korean Restaurant,Mexican Restaurant,Brewery,Donut Shop,Sports Bar,Southern / Soul Food Restaurant
1,-78.480905,38.031099,5,EV Support,First and Market Parking Garage,EV Charging Station,"Level 2, DC Fast",104 1st St N,1,Music Venue,Shopping Mall,Dessert Shop,Wine Shop,Burger Joint,Pie Shop,Farmers Market,Soup Place,Speakeasy,Café
2,-78.481435,38.029704,6,EV Support,Water Street Parking,EV Charging Station,DC Fast,Water Street,1,Music Venue,Bookstore,Burger Joint,Dessert Shop,Speakeasy,Soup Place,Shopping Mall,Coffee Shop,Farmers Market,Donut Shop
3,-78.507632,38.031464,7,EV Support,Oakhurst Inn,EV Charging Station,Level 2,100 Oakhurst Circle,0,Café,Hotel,Bagel Shop,Coffee Shop,Wings Joint,Farmers Market,Dumpling Restaurant,Donut Shop,Dessert Shop,Comfort Food Restaurant
4,-78.469102,38.023136,8,EV Support,Martin Horn,EV Charging Station,Level 2,210 Carlton Rd,2,Farmers Market,Thai Restaurant,Art Gallery,Bar,Comfort Food Restaurant,Pizza Place,Laundry Service,Chinese Restaurant,Business Service,Café
5,-78.488893,38.061129,9,EV Support,Homewood Suites,EV Charging Station,DC Fast,2036 India Rod,3,Grocery Store,Indian Restaurant,American Restaurant,Wine Shop,Hotel,Pub,Mediterranean Restaurant,Dumpling Restaurant,Café,Dessert Shop
6,-78.487974,38.035969,10,EV Support,Timbercreek Market,EV Charging Station,Level 2,722 Preston Ave,2,Health Food Store,BBQ Joint,Bagel Shop,Bakery,Beer Garden,Juice Bar,Brewery,Garden Center,Middle Eastern Restaurant,Chinese Restaurant


In [34]:
#visualize the resulting clusters
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=13)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(cville_merged['Y'], cville_merged['X'], cville_merged['Entry'], cville_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [50]:
#examine each cluster
#cluster 1
cville_merged.loc[cville_merged['Cluster Labels'] == 0, cville_merged.columns[[4] + list(range(9, cville_merged.shape[1]))]]

Unnamed: 0,Entry,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,First and Market Parking Garage,Music Venue,Shopping Mall,Farmers Market,Wine Shop,Burger Joint,Pie Shop,Dessert Shop,Soup Place,Speakeasy,Bakery
2,Water Street Parking,Music Venue,Shopping Mall,Dessert Shop,Coffee Shop,Burger Joint,Bookstore,Farmers Market,Soup Place,Speakeasy,Bakery


In [51]:
#cluster 2
cville_merged.loc[cville_merged['Cluster Labels'] == 1, cville_merged.columns[[4] + list(range(9, cville_merged.shape[1]))]]

Unnamed: 0,Entry,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,The Flats at West Village,Wings Joint,Sports Bar,Coffee Shop,Southern / Soul Food Restaurant,Hotel,Sandwich Place,Korean Restaurant,Donut Shop,Brewery,Mexican Restaurant
4,Martin Horn,Business Service,Thai Restaurant,Art Gallery,Comfort Food Restaurant,Bar,Chinese Restaurant,Pizza Place,Farmers Market,Wings Joint,Donut Shop
6,Timbercreek Market,Garden Center,Chinese Restaurant,Health Food Store,BBQ Joint,Bagel Shop,Bakery,Beer Garden,Brewery,Juice Bar,Middle Eastern Restaurant


In [52]:
#cluster 3
cville_merged.loc[cville_merged['Cluster Labels'] == 2, cville_merged.columns[[4] + list(range(9, cville_merged.shape[1]))]]

Unnamed: 0,Entry,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Oakhurst Inn,Amphitheater,Hotel,Bagel Shop,Coffee Shop,Café,Wings Joint,Business Service,Dumpling Restaurant,Donut Shop,Dessert Shop


In [53]:
#cluster 4
cville_merged.loc[cville_merged['Cluster Labels'] == 3, cville_merged.columns[[4] + list(range(9, cville_merged.shape[1]))]]

Unnamed: 0,Entry,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Homewood Suites,Grocery Store,Indian Restaurant,Pub,Dumpling Restaurant,Wine Shop,Hotel,Mediterranean Restaurant,American Restaurant,Sandwich Place,Bookstore


## Results <a name="results"></a>

The results of the top 10 nearby venues for each EV Charging Station are shown in the tables below:

|EV Charging Station: The Flats at West Village|
|:--------------------------------------------:|
|Continental Divide|
|Sugar Shack Donuts & Coffee|
|Wild Wing Cafe|
|Peloton Station|
|Mel's Cafe|
|Hardywood Pilot Brewery & Taproom|
|Doma Korean Kitchen|
|The Draftsman, Autograph Collection|
|Potbelly's|
|Snowing In Space Coffee|
 
  
|EV Charging Station: First and Market Parking Garage|
|:--------------------------------------------------:|
|The Jefferson Theater|
|Charlottesville Historic Downtown Mall|
|Jack Brown's Beer & Burger Joint|
|Revolutionary Soup|
|The Paramount|
|The Alley Light|
|Charlottesville City Market|
|The Pie Chest|
| Market Street Wineshop Downtown|
|Splendora's Gelato|
 
  
|EV Charging Station: Water Street Parking|
|:---------------------------------------:|
|The Jefferson Theater|
|Charlottesville Historic Downtown Mall|
|Jack Brown's Beer & Burger Joint|
|Revolutionary Soupn| 
|The Paramount|
|The Alley Light|
|Charlottesville City Market|
|Splendora's Gelato|
|Blue Whale Books|
|Mudhouse|
 
  
|EV Charging Station: Oakhurst Inn|
|:-------------------------------:|
|Oakhurst Inn|
|Starbucks|
|Einstein Bros. Bagels|
|West Range Cafe|
|McIntire Amphitheatre|
 
   
|EV Charging Station: Martin Horn|
|:------------------------------:| 
|Beer Run| 
|Red Lantern|  
|Belmont Pizza and Pub| 
|Pad Thai| 
|Firefly| 
|Cavalier Produce| 
|The Glass Palette| 
|Frontrunner Sign Studios| 
 
 
|EV Charging Station: Homewood Suites|
|:----------------------------------:|
|Homewood Suites by Hilton|
|Charlottesville Pub|
|Whole Foods|
|Trader Joe's|
|Maharaja|
|Mezeh Mediterranean Grill|
|Marco & Luca Noodle Shop|
|Burtons Grill|
|Milan Indian Cuisine|
|Wine Warehouse|
 
  
|EV Charging Station: Timbercreek Market|
|:-------------------------------------:|
|Kardinal Hall|
|MarieBette|
|The Juice Laundry|
|Bodo's Bagels|
|Fifth Season Gardening Co.|
|Random Row Brewing Co.|
|Ace Biscuit & Barbecue|
|Sticks Kebob Shop|
|Integral Yoga Natural Foods|
|Cafe 88|

The results of the 10 most common venue categories for each EV Charging Station are shown in the tables below:

|First and Market Parking Garage|
|:-----------------------------:|
|Music Venue|
|Wine Shop|
|Speakeasy|
|Farmers Market|
|Soup Place|
|Shopping Mall|
|Burger Joint|
|Dessert Shop|
|Pie Shop|
|American Restaurant|

|Homewood Suites|
|:-------------:|
|Grocery Store|
|Indian Restaurant|
|Dumpling Restaurant|
|Wine Shop|
|Pub|
|Mediterranean Restaurant|
|Hotel|
|American Restaurant|
|Bookstore|
|Sandwich Place|

|Martin Horn|
|:---------:|
|Comfort Food Restaurant|
|Art Gallery|
|Thai Restaurant|
|Bar|
|Farmers Market|
|Business Service|
|Pizza Place|
|Chinese Restaurant|
|American Restaurant|
|Mediterranean Restaurant|

|Oakhurst Inn|
|:----------:|
|Café|
|Bagel Shop|
|Amphitheater|
|Coffee Shop|
|Hotel|
|Sports Bar|
|Pie Shop|
|Juice Bar|
|Wine Shop|
|Korean Restaurant|

|The Flats at West Village|
|:-----------------------:|
|Wings Joint|
|Coffee Shop|
|Korean Restaurant|
|Sports Bar|
|Southern / Soul Food Restaurant|
|Mexican Restaurant|
|Donut Shop|
|Brewery|
|Sandwich Place|
|Hotel|:----------:|

|Timbercreek Market|
|:----------------:|
|Chinese Restaurant|
|Beer Garden|
|Health Food Store|
|Middle Eastern Restaurant|
|Garden Center|
|Brewery|
|Juice Bar|
|Bakery|
|BBQ Joint|
|Bagel Shop|

|Water Street Parking|
|:------------------:|
|Music Venue|
|Speakeasy|
|Farmers Market|
|Soup Place|
|Bookstore|
|Shopping Mall|
|Burger Joint|
|Dessert Shop|
|Coffee Shop|
|American Restaurant|

## Discussion <a name="discussion"></a>

This project shows the variety of different venues and venue categories located withing 400 meters (.25 miles) of each EV Charging Station in Charlottesville, VA.  It enables EV owner stakeholders to decide which EV Charging Station to use based on their needs.  The current EV Charging Stations are located in a way to take advantage of amenities in the areas of downtown, midtown, uptown, the "Corner", and Woolen Mills.

The k-means clustering to visualize clusters with similar venue categories could help city and business planners to determine new locations for EV charging stations that will address any gaps in EV charging station area amenities.  For example, only one EV charging stations was within walking distance to a grocery store.  This could incentivize a grocery store to install and EV charging station to attract a new customer base.  Additionally, those who are interested in starting a new business could view the most common venues for each EV charging station to determine what type of business would be a complementary category for each area and to determine competition.  For example, if a coffee shop is the most common venue for an EV charging station area, it would indicate current, established competition for the business of coffee shops.

## Conclusion <a name="conclusion"></a>

The purpose of this project was to assist EV stakeholder owners with their travels and errands by exploring the various EV charging stations in Charlottesville, Virginia to determine the top 10 venues that are within a radius of 400 meters (.25 miles) of each EV charging station.  Next, it determined the ten most common venue categories near each station.  The project grouped the EV charging stations into clusters using k-means clustering and visualized EV charging station locations in Charlottesville and their clusters. 

The analysis and results in this project not only benefit EV owner stakeholders to choose the best EV charging station for their needs, it also benefits city planners and current and future business owners for planning services and amenities near current and proposed EV charging stations.  Future work for this project could be to add location information for proposed new EV charging stations to view the nearby venues and the most common venues for the area.  The proposed new EV charging stations can also be added to the k-means clustering to determining similarities and differences in the venue categories.

This project helps an EV owner answer questions such as "Which EV charging station should I use if I want to get pizza while my car charges?" and "Which EV charging station should I use if I want to do some shopping while my car charges?".  This project also helps current and new business owners answer questions such as "What competition is located near each EV charging station?" and "What gaps in services are opportunities at each EV charging station?".  The project lends itself well to future research to answer the question "Where should new EV charging stations be installed to fill gaps in services and amenities located nearby?".