# Advanced Geospatial Plotting

## This script contains the following:

### 1. Import libraries and data

### 2. Create value column and aggregated dataframe

### 3. Merge data to create longitude and latitude data for Chicago

### 4. Initialise instance of kepler.gl map

### 5. Customise map output

### 6. Add filter to map

### 7. Create config object

### 1. Import libraries and data

In [1]:
# import libraries

import pandas as pd
import os
from keplergl import KeplerGl
from pyproj import CRS
import numpy as np
from matplotlib import pyplot as plt

In [2]:
import keplergl
pd.__version__

'1.5.1'

In [3]:
df = pd.read_csv('chicago_data.csv', index_col= 0)

### 2. Create value column and aggregated dataframe

In [4]:
# create value column and group by 'start' and 'end' station

df['value'] = 1
df_group = df.groupby(['from_station_name', 'to_station_name'])['value'].count().reset_index()

In [5]:
df_group

,from_station_name,to_station_name,value
0,2112 W Peterson Ave,2112 W Peterson Ave,14
1,2112 W Peterson Ave,Ashland Ave & Belle Plaine Ave,1
2,2112 W Peterson Ave,Avondale Ave & Irving Park Rd,1
3,2112 W Peterson Ave,Benson Ave & Church St,2
4,2112 W Peterson Ave,Broadway & Argyle St,2
...,...,...,...
113674,Yates Blvd & 75th St,South Shore Dr & 74th St,2
113675,Yates Blvd & 75th St,Stony Island Ave & 71st St,2
113676,Yates Blvd & 75th St,Stony Island Ave & 75th St,3
113677,Yates Blvd & 75th St,Woodlawn Ave & 55th St,2


In [6]:
print(df_group['value'].sum())
print(df.shape)

3603082
(3603082, 16)
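
The two numbers printed above match, which is the point of the check: counting rows per station pair and then summing those counts should give back the number of rows in the original dataframe. A minimal sketch of the same check on toy data follows. The station names `A`, `B`, `C` and the variables `toy` and `toy_group` are made up for illustration, and the sketch swaps in `size()`, which counts rows per group without needing the helper `value` column.

```python
import pandas as pd

# Toy trips: one row per ride (station names are hypothetical)
toy = pd.DataFrame({
    "from_station_name": ["A", "A", "A", "B", "C", "C"],
    "to_station_name":   ["B", "B", "C", "A", "A", "A"],
})

# size() counts rows per (start, end) pair; no helper column is needed
toy_group = (
    toy.groupby(["from_station_name", "to_station_name"])
       .size()
       .reset_index(name="trips")
)

# Sanity check: the per-pair counts must add up to the number of rides
assert toy_group["trips"].sum() == len(toy)
print(toy_group)
```

If the two totals ever differ, rows were probably lost to missing values in one of the grouping columns, since `groupby` drops those keys by default.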


In [7]:
""" rename columns to use in merge of dataframes """

df_group.rename(columns = {'from_station_name':'start_station_name','to_station_name' : 'end_station_name',
                          'value': 'trips'}, inplace = True)

### 3. Merge data to create longitude and latitude data for Chicago

In [8]:
# load location data

df_stations = pd.read_csv('Divvy_Bicycle_Stations_20241112.csv', index_col = 0)

In [9]:
df_stations.head()

ID,Station Name,Short Name,Total Docks,Docks in Service,Status,Latitude,Longitude,Location
a3ad5c90-a135-11e9-9cda-0a87ae2ba916,Dorchester Ave & 49th St,KA1503000069,15,15,In Service,41.805772,-87.592464,POINT (-87.592464 41.805772)
1571105068000485406,Narragansett & Irving Park,,9,9,In Service,41.952614,-87.785383,POINT (-87.7853829 41.952614)
a3b2af02-a135-11e9-9cda-0a87ae2ba916,MLK Jr Dr & 83rd St,586,11,11,In Service,41.743116,-87.6148,POINT (-87.6148 41.743116)
1594046405283107528,California & 16th St,,9,9,In Service,41.859228,-87.695562,POINT (-87.695562 41.859228)
a3aa017e-a135-11e9-9cda-0a87ae2ba916,Southport Ave & Clark St,TA1308000047,11,11,In Service,41.957081,-87.664199,POINT (-87.664199 41.957081)


In [10]:
""" add column showing station at which the bike hire starts """

df_stations['start_station_name'] = df_stations['Station Name']

In [11]:
df_stations.head()

ID,Station Name,Short Name,Total Docks,Docks in Service,Status,Latitude,Longitude,Location,start_station_name
a3ad5c90-a135-11e9-9cda-0a87ae2ba916,Dorchester Ave & 49th St,KA1503000069,15,15,In Service,41.805772,-87.592464,POINT (-87.592464 41.805772),Dorchester Ave & 49th St
1571105068000485406,Narragansett & Irving Park,,9,9,In Service,41.952614,-87.785383,POINT (-87.7853829 41.952614),Narragansett & Irving Park
a3b2af02-a135-11e9-9cda-0a87ae2ba916,MLK Jr Dr & 83rd St,586,11,11,In Service,41.743116,-87.6148,POINT (-87.6148 41.743116),MLK Jr Dr & 83rd St
1594046405283107528,California & 16th St,,9,9,In Service,41.859228,-87.695562,POINT (-87.695562 41.859228),California & 16th St
a3aa017e-a135-11e9-9cda-0a87ae2ba916,Southport Ave & Clark St,TA1308000047,11,11,In Service,41.957081,-87.664199,POINT (-87.664199 41.957081),Southport Ave & Clark St


In [12]:
""" rename column to show station at which bike hire ends """

df_stations.rename(columns = {'Station Name' : 'end_station_name'}, inplace = True)

In [13]:
df_stations.reset_index(inplace = True)

In [14]:
""" create dataframe from subset of data with four columns """

df_stations = df_stations[['end_station_name', 'start_station_name', 'Latitude', 'Longitude']]

In [15]:
df_stations.head()

,end_station_name,start_station_name,Latitude,Longitude
0,Dorchester Ave & 49th St,Dorchester Ave & 49th St,41.805772,-87.592464
1,Narragansett & Irving Park,Narragansett & Irving Park,41.952614,-87.785383
2,MLK Jr Dr & 83rd St,MLK Jr Dr & 83rd St,41.743116,-87.6148
3,California & 16th St,California & 16th St,41.859228,-87.695562
4,Southport Ave & Clark St,Southport Ave & Clark St,41.957081,-87.664199


By start station

In [16]:
""" merge df_group with df_stations on 'start_station_name' column """

df_s = df_group.merge(df_stations, how = 'outer', on = 'start_station_name', indicator = 'merge_flag')

In [17]:
df_s['merge_flag'].value_counts(dropna = False)

both          100966
left_only      12713
right_only       486
Name: merge_flag, dtype: int64
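
The counts above come from the `indicator` argument of `merge`, which labels each row by where its key was found: `both` (the station name appears in the trip data and the station file), `left_only` (trip data only) or `right_only` (station file only). A minimal sketch on toy data shows how the three labels arise; the station names and coordinates below are invented for illustration.

```python
import pandas as pd

# Toy aggregated trips (left table); station "C" has no coordinates on file
trips = pd.DataFrame({
    "start_station_name": ["A", "A", "B", "C"],
    "end_station_name":   ["A", "B", "A", "A"],
    "trips": [3, 1, 2, 5],
})

# Toy station list (right table); station "D" never appears as a start
stations = pd.DataFrame({
    "start_station_name": ["A", "B", "D"],
    "Latitude":  [41.90, 41.95, 41.80],
    "Longitude": [-87.60, -87.65, -87.70],
})

merged = trips.merge(
    stations, how="outer", on="start_station_name", indicator="merge_flag"
)
flag_counts = merged["merge_flag"].value_counts().to_dict()
print(flag_counts)

# Keep only rows with a match on both sides, as in the next cell
matched = merged[merged["merge_flag"] == "both"].copy()
```

In the real data, the `left_only` rows are trips whose start station name has no exact match in the station file, so they are dropped along with their trip counts when only `both` is kept.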

In [18]:
# keep matched rows only; .copy() avoids SettingWithCopyWarning on later in-place edits
df_s = df_s[df_s['merge_flag'] == 'both'].copy()

In [19]:
df_s.head()

,start_station_name,end_station_name_x,trips,end_station_name_y,Latitude,Longitude,merge_flag
0,2112 W Peterson Ave,2112 W Peterson Ave,14.0,2112 W Peterson Ave,41.991178,-87.683593,both
1,2112 W Peterson Ave,Ashland Ave & Belle Plaine Ave,1.0,2112 W Peterson Ave,41.991178,-87.683593,both
2,2112 W Peterson Ave,Avondale Ave & Irving Park Rd,1.0,2112 W Peterson Ave,41.991178,-87.683593,both
3,2112 W Peterson Ave,Benson Ave & Church St,2.0,2112 W Peterson Ave,41.991178,-87.683593,both
4,2112 W Peterson Ave,Broadway & Argyle St,2.0,2112 W Peterson Ave,41.991178,-87.683593,both


In [20]:
df_s.shape

(100966, 7)

In [21]:
df_s.drop(columns = ['end_station_name_y'], inplace = True)

In [22]:
df_s.rename(columns = {'end_station_name_x' : 'end_station_name'}, inplace = True)

In [23]:
df_s.head()

,start_station_name,end_station_name,trips,Latitude,Longitude,merge_flag
0,2112 W Peterson Ave,2112 W Peterson Ave,14.0,41.991178,-87.683593,both
1,2112 W Peterson Ave,Ashland Ave & Belle Plaine Ave,1.0,41.991178,-87.683593,both
2,2112 W Peterson Ave,Avondale Ave & Irving Park Rd,1.0,41.991178,-87.683593,both
3,2112 W Peterson Ave,Benson Ave & Church St,2.0,41.991178,-87.683593,both
4,2112 W Peterson Ave,Broadway & Argyle St,2.0,41.991178,-87.683593,both


By end station

In [24]:
""" merge df_s with df_stations on 'end_station_name' column """

df_final = df_s.merge(df_stations, how = 'outer', on = 'end_station_name', indicator = 'merge_flag_2')

In [25]:
df_final.head()

,start_station_name_x,end_station_name,trips,Latitude_x,Longitude_x,merge_flag,start_station_name_y,Latitude_y,Longitude_y,merge_flag_2
0,2112 W Peterson Ave,2112 W Peterson Ave,14.0,41.991178,-87.683593,both,2112 W Peterson Ave,41.991178,-87.683593,both
1,Ashland Ave & Belle Plaine Ave,2112 W Peterson Ave,1.0,41.956057,-87.668835,both,2112 W Peterson Ave,41.991178,-87.683593,both
2,Avondale Ave & Irving Park Rd,2112 W Peterson Ave,1.0,41.953393,-87.732002,both,2112 W Peterson Ave,41.991178,-87.683593,both
3,Broadway & Barry Ave,2112 W Peterson Ave,5.0,41.937582,-87.644098,both,2112 W Peterson Ave,41.991178,-87.683593,both
4,Broadway & Berwyn Ave,2112 W Peterson Ave,6.0,41.978361,-87.659789,both,2112 W Peterson Ave,41.991178,-87.683593,both


In [26]:
# keep matched rows only; .copy() avoids SettingWithCopyWarning on later in-place edits
df_final = df_final[df_final['merge_flag_2'] == 'both'].copy()

In [27]:
df_final.drop(columns = ['start_station_name_y', 'merge_flag', 'merge_flag_2'], inplace = True)

In [28]:
df_final.rename(columns = {'start_station_name_x' : 'start_station_name'}, inplace = True)

In [29]:
df_final.head()

,start_station_name,end_station_name,trips,Latitude_x,Longitude_x,Latitude_y,Longitude_y
0,2112 W Peterson Ave,2112 W Peterson Ave,14.0,41.991178,-87.683593,41.991178,-87.683593
1,Ashland Ave & Belle Plaine Ave,2112 W Peterson Ave,1.0,41.956057,-87.668835,41.991178,-87.683593
2,Avondale Ave & Irving Park Rd,2112 W Peterson Ave,1.0,41.953393,-87.732002,41.991178,-87.683593
3,Broadway & Barry Ave,2112 W Peterson Ave,5.0,41.937582,-87.644098,41.991178,-87.683593
4,Broadway & Berwyn Ave,2112 W Peterson Ave,6.0,41.978361,-87.659789,41.991178,-87.683593


In [37]:
""" rename both sets of longitude and latitude columns for clarity """

df_final.rename(columns = {'Latitude_x' : 'start_latitude', 'Longitude_x' : 'start_longitude', 
       'Latitude_y' : 'end_latitude', 'Longitude_y' : 'end_longitude',}, inplace = True)
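
The cells above attach coordinates in two passes, first for the start station and then for the end station, and tidy up the `_x`/`_y` suffixes afterwards. As a hedged alternative (not the method used in this notebook), the station table can be renamed once per role and joined with inner merges, which produces the final column names directly. The toy data and the names `station_name`, `start_coords`, `end_coords` and `routes` are assumptions for illustration.

```python
import pandas as pd

# Toy aggregated trips; "C" is missing from the station table below
trips = pd.DataFrame({
    "start_station_name": ["A", "A", "B", "C"],
    "end_station_name":   ["A", "B", "A", "A"],
    "trips": [3, 1, 2, 5],
})

# Toy station table with one row per station
stations = pd.DataFrame({
    "station_name": ["A", "B", "D"],
    "Latitude":  [41.90, 41.95, 41.80],
    "Longitude": [-87.60, -87.65, -87.70],
})

# One renamed copy of the station table for each end of the trip
start_coords = stations.rename(columns={
    "station_name": "start_station_name",
    "Latitude": "start_latitude",
    "Longitude": "start_longitude",
})
end_coords = stations.rename(columns={
    "station_name": "end_station_name",
    "Latitude": "end_latitude",
    "Longitude": "end_longitude",
})

# Inner merges keep only trips whose start and end stations both have coordinates
routes = (
    trips.merge(start_coords, on="start_station_name", how="inner")
         .merge(end_coords, on="end_station_name", how="inner")
)
print(routes)
```

The outer merge with an indicator column, as used above, is still the better choice when the number of unmatched stations needs to be inspected before dropping them.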

### 4. Initialise instance of kepler.gl map

In [31]:
df_final.to_csv('final_locations_for_map.csv')

In [32]:
# create instance of kepler.gl map

m = KeplerGl(height = 700, data={'data_1': df_final})
m

User Guide: https://docs.kepler.gl/docs/keplergl-jupyter


KeplerGl(data={'data_1':                          start_station_name     end_station_name  trips  \
0         …

### 5. Customise map output

I chose to continue with the 'autumn' palette that I used for visualisations in the last task. This provides continuity in the theme for what will eventually become my finished dashboard. I marked the start and end points in yellow, with orange for the start of the trip arc and red for the end of the arc. The start and end points did not need separate colours, as the changing colour of the arc shows the direction of travel. I changed these colours using the Layer tab in the settings menu, moving away from the default palette, which made the map look too busy.

### 6. Add filter to map

After I filtered the map to journeys that occurred more than 3000 times, two distinct hubs became clear: one on the lakefront near Navy Pier and another at Millennium Station.

The first (Navy Pier) is the location of some of Chicago's biggest tourist attractions: the Chicago Children's Museum, the pier itself and Ohio Street Beach. The journeys generally finished there, with popular start points including the Theatre on the Lake, Shedd Aquarium and Millennium Park.

The second, at Millennium Station, is just north of Millennium Park, a large, popular park that spans several blocks. Among its many points of interest are the Art Institute of Chicago and the Abraham Lincoln statue. This hub is popular as both a start and an end point, linking with important transport hubs at Ogilvie Transportation Center and Chicago Union Station, as well as the Chicago Opera House.
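
The 3000-trip threshold was presumably applied in the kepler.gl interface. The same cut can be approximated in pandas to list the busiest routes and end points before opening the map. This is a minimal sketch on toy data; the station names `P`, `Q`, `R`, the trip counts and the variable names are invented for illustration.

```python
import pandas as pd

# Toy stand-in for df_final (station names and counts are hypothetical)
routes = pd.DataFrame({
    "start_station_name": ["P", "Q", "R", "P"],
    "end_station_name":   ["Q", "P", "P", "R"],
    "trips": [4200, 3100, 2999, 3000],
})

THRESHOLD = 3000

# Strictly "more than 3000 times": a route with exactly 3000 trips is excluded
busy = routes[routes["trips"] > THRESHOLD]

# Total trips arriving at each end station among the busy routes
end_totals = (
    busy.groupby("end_station_name")["trips"]
        .sum()
        .sort_values(ascending=False)
)
print(end_totals)
```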

### 7. Create config object

In [33]:
config = m.config

In [34]:
config

{}

In [35]:
import json
with open("config.json", "w") as outfile:
    json.dump(config, outfile)

In [36]:
# save map

m.save_to_html(file_name = 'Chicago_bike_trips.html', read_only = False, config = config)

Map saved to Chicago_bike_trips.html!
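
The `{}` printed for `config` above means that the file written to config.json holds an empty dict, so none of the colour or filter choices described earlier are stored in it. One way to catch this is to check that the config is non-empty before saving and to confirm that it survives a JSON round trip. The sketch below does this with a hand-written stand-in config. The top-level `version` and `config` keys and the `mapState` entry follow kepler.gl's config layout as far as I know, but the values are illustrative assumptions and were not taken from the original map.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in config: only the map viewport is set here.
# The values are illustrative, not exported from the original map.
config = {
    "version": "v1",
    "config": {
        "mapState": {"latitude": 41.88, "longitude": -87.63, "zoom": 10},
    },
}

# Guard against saving an empty config by mistake
assert config, "config is empty; adjust the map in the widget before reading m.config"

# Write to a temporary folder so no existing config.json is overwritten
config_path = Path(tempfile.mkdtemp()) / "config.json"
with open(config_path, "w") as outfile:
    json.dump(config, outfile, indent=2)

# Read the file back and confirm nothing was lost
with open(config_path) as infile:
    loaded = json.load(infile)

assert loaded == config
# With keplergl installed, the dict could then be reused, for example:
# KeplerGl(height=700, data={'data_1': df_final}, config=loaded)
```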
