### 2.5 Advanced Geospatial Plotting

#### Import libraries and load data

In [1]:
import pandas as pd
import os
from keplergl import KeplerGl
from pyproj import CRS
import numpy as np
from matplotlib import pyplot as plt

In [2]:
df = pd.read_csv('NY_data_sample.csv', index_col = 0)

In [3]:
df.head()

Unnamed: 0,ride_id,rideable_type,started_at,ended_at,start_station_name,end_station_name,start_lat,start_lng,end_lat,end_lng,member_casual,date,avgTemp,value,bike_rides_daily,trip_duration
0,2993FF066FCB3D4D,electric_bike,2022-09-14 18:12:25.303,2022-09-14 18:18:43.395,W 18 St & 6 Ave,Sullivan St & Washington Sq,40.739713,-73.994564,40.730477,-73.999061,member,2022-09-14,22.9,1,1384,6.301533
1,F29D0D965A2E4F34,classic_bike,2022-09-14 11:01:04.111,2022-09-14 11:04:01.488,21 St & 4 Ave,4 Ave & 17 St,40.662584,-73.995554,40.665507,-73.993037,member,2022-09-14,22.9,1,1384,2.956283
2,414C54C4E7AF5121,electric_bike,2022-09-14 17:12:54.897,2022-09-14 17:21:17.592,W 13 St & 5 Ave,E 11 St & 3 Ave,40.735445,-73.99431,40.73127,-73.98849,casual,2022-09-14,22.9,1,1384,8.37825
3,EFD664E543812DAB,classic_bike,2022-09-14 05:04:20.448,2022-09-14 05:05:36.216,1 Ave & E 30 St,2 Ave & E 29 St,40.741444,-73.975361,40.741724,-73.978093,member,2022-09-14,22.9,1,1384,1.2628
4,02E14999E265B4B2,classic_bike,2022-09-14 23:09:16.578,2022-09-14 23:27:34.445,Henry St & Remsen St,Fulton St & Clermont Ave,40.69401,-73.994651,40.684157,-73.969223,member,2022-09-14,22.9,1,1384,18.297783


#### Create a new column with the value of 1. Then create a new aggregated dataframe that contains 3 columns: starting station, ending station, and the count of trips between those stations.

In [4]:
# Create a value column and group by start and end station
df['value'] = 1
df_group = df.groupby(['start_station_name', 'end_station_name'])['value'].count().reset_index()

In [5]:
df_group

Unnamed: 0,start_station_name,end_station_name,value
0,1 Ave & E 110 St,1 Ave & E 110 St,6
1,1 Ave & E 110 St,1 Ave & E 44 St,1
2,1 Ave & E 110 St,1 Ave & E 78 St,1
3,1 Ave & E 110 St,1 Ave & E 94 St,3
4,1 Ave & E 110 St,2 Ave & E 104 St,5
...,...,...,...
149688,Yankee Ferry Terminal,Pioneer St & Van Brunt St,1
149689,Yankee Ferry Terminal,Soissons Landing,45
149690,Yankee Ferry Terminal,South St & Gouverneur Ln,1
149691,Yankee Ferry Terminal,South St & Whitehall St,2


In [9]:
print(df_group['value'].sum())
print(df.shape)

297668
(298382, 16)
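
The grouped total (297668) is lower than the number of rows in `df` (298382). A likely explanation, not confirmed by the outputs above, is that `groupby` leaves out rows whose key columns contain missing values by default, so trips without a recorded start or end station are not counted. A minimal sketch on made-up data (the station names `A` and `B` are placeholders):

```python
import numpy as np
import pandas as pd

# Toy data: the third trip has no end station recorded
toy = pd.DataFrame({
    "start_station_name": ["A", "A", "B"],
    "end_station_name": ["B", "B", np.nan],
})
toy["value"] = 1

# Default groupby behaviour: rows with a NaN in a key column are excluded
grouped = (
    toy.groupby(["start_station_name", "end_station_name"])["value"]
    .count()
    .reset_index()
)

# Count the rows that have a missing value in either key column
missing = toy[["start_station_name", "end_station_name"]].isna().any(axis=1).sum()

# If missing keys are the only cause, these two numbers match
print(grouped["value"].sum() + missing, len(toy))
```

Running the same `isna()` check on `df` would show whether missing station names account for the gap between the two numbers.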


In [12]:
df_group.rename(columns = {'value': 'trips'}, inplace = True)

In [13]:
df_group.head()

Unnamed: 0,start_station_name,end_station_name,trips
0,1 Ave & E 110 St,1 Ave & E 110 St,6
1,1 Ave & E 110 St,1 Ave & E 44 St,1
2,1 Ave & E 110 St,1 Ave & E 78 St,1
3,1 Ave & E 110 St,1 Ave & E 94 St,3
4,1 Ave & E 110 St,2 Ave & E 104 St,5


##### Create the appropriate dataframe with stations, trips, and longitude and latitude

In [14]:
# Group by station pair and coordinates, then count the trips in each group
df_final = df.groupby(['start_station_name', 'end_station_name', 'start_lat', 'start_lng', 'end_lat', 'end_lng'])['value'].count().reset_index()

In [15]:
df_final.head()

Unnamed: 0,start_station_name,end_station_name,start_lat,start_lng,end_lat,end_lng,value
0,1 Ave & E 110 St,1 Ave & E 110 St,40.79223,-73.9379,40.792327,-73.9383,1
1,1 Ave & E 110 St,1 Ave & E 110 St,40.792327,-73.9383,40.792327,-73.9383,4
2,1 Ave & E 110 St,1 Ave & E 110 St,40.792373,-73.938079,40.792327,-73.9383,1
3,1 Ave & E 110 St,1 Ave & E 44 St,40.792327,-73.9383,40.75002,-73.969053,1
4,1 Ave & E 110 St,1 Ave & E 78 St,40.792327,-73.9383,40.771404,-73.953517,1


In [19]:
df_final.rename(columns = {'value': 'trips'}, inplace = True)

In [21]:
df_final.head()

Unnamed: 0,start_station_name,end_station_name,start_lat,start_lng,end_lat,end_lng,trips
0,1 Ave & E 110 St,1 Ave & E 110 St,40.79223,-73.9379,40.792327,-73.9383,1
1,1 Ave & E 110 St,1 Ave & E 110 St,40.792327,-73.9383,40.792327,-73.9383,4
2,1 Ave & E 110 St,1 Ave & E 110 St,40.792373,-73.938079,40.792327,-73.9383,1
3,1 Ave & E 110 St,1 Ave & E 44 St,40.792327,-73.9383,40.75002,-73.969053,1
4,1 Ave & E 110 St,1 Ave & E 78 St,40.792327,-73.9383,40.771404,-73.953517,1
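
In the output above, the same station pair appears on several rows because the recorded coordinates differ slightly from trip to trip. One possible way to get a single row per pair is to average the coordinates while counting the trips. This is a sketch on made-up data, not the approach used in this notebook; the frame `toy` and the result name `pairs` are placeholders:

```python
import pandas as pd

# Toy data: two A -> B trips with slightly different start coordinates
toy = pd.DataFrame({
    "start_station_name": ["A", "A", "A"],
    "end_station_name": ["B", "B", "C"],
    "start_lat": [40.7922, 40.7924, 40.7923],
    "start_lng": [-73.9379, -73.9383, -73.9381],
    "end_lat": [40.7500, 40.7500, 40.7714],
    "end_lng": [-73.9690, -73.9690, -73.9535],
})

# Group on station names only; average the coordinates and count the rows
pairs = (
    toy.groupby(["start_station_name", "end_station_name"])
    .agg(
        start_lat=("start_lat", "mean"),
        start_lng=("start_lng", "mean"),
        end_lat=("end_lat", "mean"),
        end_lng=("end_lng", "mean"),
        trips=("start_lat", "size"),
    )
    .reset_index()
)

print(pairs)
```

With this layout each arc on the map would correspond to exactly one station pair, and `trips` would hold the full count for that pair.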


##### Export new data frame to csv

In [20]:
df_final.to_csv('df_final_locations_for_map.csv')

#### Initialize an instance of a kepler.gl map.

In [17]:
# Create KeplerGl instance

m = KeplerGl(height = 700, data={"data_1": df_final})
m

User Guide: https://docs.kepler.gl/docs/keplergl-jupyter


KeplerGl(data={'data_1':            start_station_name          end_station_name  start_lat  start_lng  \
0   …

In [7]:
config = m.config

In [8]:
config

{}
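
The config above is empty, which is probably because it was read before any layers were customized in the widget (the cell numbers suggest this, but that is an assumption). After customizing the map, `m.config` can be read again and stored so the same settings can be reused later. A minimal sketch of saving and reloading a config dictionary as JSON; the helper names, the file name `config.json`, and the contents of `example_config` are placeholders, not the settings used for this map:

```python
import json
from pathlib import Path


def save_config(config: dict, path: str) -> None:
    """Write a kepler.gl config dictionary to a JSON file."""
    Path(path).write_text(json.dumps(config, indent=2))


def load_config(path: str) -> dict:
    """Read a kepler.gl config dictionary back from a JSON file."""
    return json.loads(Path(path).read_text())


# Placeholder standing in for m.config after customizing the map;
# the keys and values here are illustrative only.
example_config = {"version": "v1", "config": {"mapState": {"zoom": 11}}}

save_config(example_config, "config.json")
restored = load_config("config.json")
print(restored == example_config)
```

In the notebook, the saved dictionary could then be passed back when rebuilding the map, for example `KeplerGl(height = 700, data={"data_1": df_final}, config=restored)`, and the map exported with `m.save_to_html(file_name='map.html')` (the file name is an assumption).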

### Map Customization Explanation

In this map, I've made the following customizations:

1. Point Layer Customization:

   - Color: Changed the color of the points to a bright color (red) to enhance visibility and distinguish starting and ending stations.
   - Radius: Adjusted the radius to ensure that points are easily identifiable on the map.

2. Arc Layer Customization:
   - Color: Used a gradient color palette with shades of blue to represent the arcs, making it easier to visualize the trips between stations.
   - Thickness: Set the thickness of the arcs to ensure that routes are clear and stand out against the map background.

These settings were chosen to enhance the visual clarity of the map, making it easier to understand the distribution of bike rides and connections between stations.


### Add a filter to your map and use it to see what the most common trips are in New York City. What else makes an impression? For example, are there any zones that seem particularly busy? Using some additional research, write a few sentences to make sense of that output.

Based on the map visualization, certain zones in New York City, particularly midtown and downtown Manhattan, appear to be especially busy. These areas are characterized by a high density of arcs and points, indicating a significant volume of bike trips.

Midtown Manhattan:

Times Square: Known for its bustling atmosphere, Times Square is a major commercial and tourist hub. The high volume of bike trips in this area can be attributed to the numerous attractions, theaters, and shopping destinations.

Penn Station: As one of the busiest transportation hubs in the city, Penn Station sees a large number of commuters and travelers, contributing to the high density of bike trips.

Downtown Manhattan:

Wall Street: The Financial District attracts a significant number of professionals and visitors, leading to a high volume of bike trips.

World Trade Center: The area around the World Trade Center, including the National September 11 Memorial & Museum, is a major tourist destination, contributing to the high density of bike trips.

Central Park is another significant hotspot for bike trips in New York City. Given its large area, scenic beauty, and numerous attractions, it's no surprise that Central Park generates a high volume of bike traffic.
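
The most common trips can also be checked outside the map by sorting the aggregated table. A minimal sketch on made-up counts, assuming a `trips` column like the one in `df_group`; the station names and numbers below are placeholders, not results from the data:

```python
import pandas as pd

# Toy stand-in for df_group with made-up trip counts
toy_group = pd.DataFrame({
    "start_station_name": ["A", "A", "B", "C"],
    "end_station_name": ["B", "C", "A", "C"],
    "trips": [12, 3, 7, 45],
})

# Keep only station pairs above a threshold, similar to a range filter in the map
busy = toy_group[toy_group["trips"] >= 5]

# The n most frequent station pairs, highest count first
top = busy.nlargest(2, "trips").reset_index(drop=True)

print(top)
```

Applied to `df_group`, the threshold and the number of rows to keep would need to be chosen to match the filter range set in the map.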