In [7]:
import pandas as pd
import os
from keplergl import KeplerGl
from pyproj import CRS
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns




In [8]:
# Load the processed trip data
df = pd.read_csv('/Users/runi/Downloads/processed_data.csv', index_col=0)



In [9]:


# Add a 'count' column with a value of 1
df["count"] = 1

# Group by origin and destination (station names plus coordinates) and sum the counts
df_agg = df.groupby([
    "start_station_name", "end_station_name",
    "start_lat", "start_lng", "end_lat", "end_lng"
]).agg({"count": "sum"}).reset_index()

df_agg.head()


Unnamed: 0,start_station_name,end_station_name,start_lat,start_lng,end_lat,end_lng,count
0,11 St & Washington St,11 St & Washington St,40.747251,-74.027879,40.749985,-74.02715,1
1,11 St & Washington St,11 St & Washington St,40.749817,-74.027383,40.749985,-74.02715,1
2,11 St & Washington St,11 St & Washington St,40.749857,-74.02753,40.749985,-74.02715,1
3,11 St & Washington St,11 St & Washington St,40.749882,-74.02738,40.749985,-74.02715,1
4,11 St & Washington St,11 St & Washington St,40.749885,-74.027409,40.749985,-74.02715,1
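In the output above, the same station pair appears on several rows, each with a count of 1, because the start coordinates differ slightly between trips and are part of the group key. One possible alternative, shown here as a minimal sketch rather than the notebook's actual method, is to group by station names only and average the coordinates. The helper name `aggregate_trips` and the sample rows are hypothetical.

```python
import pandas as pd


def aggregate_trips(trips: pd.DataFrame) -> pd.DataFrame:
    """Count trips per station pair and average the reported coordinates."""
    return trips.groupby(
        ["start_station_name", "end_station_name"], as_index=False
    ).agg(
        start_lat=("start_lat", "mean"),
        start_lng=("start_lng", "mean"),
        end_lat=("end_lat", "mean"),
        end_lng=("end_lng", "mean"),
        count=("start_lat", "size"),  # number of rows in each group
    )


# Made-up sample rows, for illustration only
sample = pd.DataFrame({
    "start_station_name": ["A", "A", "A"],
    "end_station_name": ["A", "A", "B"],
    "start_lat": [40.0, 40.1, 40.0],
    "start_lng": [-74.0, -74.1, -74.0],
    "end_lat": [40.2, 40.2, 41.0],
    "end_lng": [-74.2, -74.2, -75.0],
})
sample_agg = aggregate_trips(sample)
print(sample_agg)
```

With this grouping, each station pair yields a single row, so the `count` column reflects route volume rather than distinct coordinate readings.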


In [12]:
# Create the map
map_1 = KeplerGl(height=600)

# Add the data to the map
map_1.add_data(data=df_agg, name="Bike Trips")

map_1

User Guide: https://docs.kepler.gl/docs/keplergl-jupyter


KeplerGl(data={'Bike Trips': {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, …


### Kepler.gl Map Customization
To make the map more informative and visually appealing, several settings were adjusted in Kepler.gl:

- Color settings: The points representing stations are colored with a blue-to-red gradient to reflect starting and ending locations. This palette helps distinguish dense bike hubs from less active stations at a glance.
- Arc layer: An arc layer visualizes the direction and volume of trips between stations. The thickness of each arc represents the number of trips, and a red-yellow gradient emphasizes high-volume connections. This makes heavily used routes (e.g., commuter corridors or popular recreational paths) easier to spot.
- Filters: A filter displays only trips with counts above a threshold of 100, which isolates the major biking flows and hides noise from rarely used paths.
- Tooltips: Tooltips display station names and trip counts when hovering over points or arcs, which gives context for interpreting the connections.

These settings were chosen to support clear visual storytelling: highlighting key bike routes, hotspots, and the relationship between urban infrastructure and rider behavior.
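The trip-count filter was applied in the Kepler.gl interface, but a comparable filter can be applied in pandas before the data is added to the map. The following is a minimal sketch under that assumption; the function name `filter_frequent_routes` and the sample rows are hypothetical, and the default threshold of 100 mirrors the value mentioned in this section.

```python
import pandas as pd


def filter_frequent_routes(trips_agg: pd.DataFrame, threshold: int = 100) -> pd.DataFrame:
    """Keep only routes whose trip count is strictly above the threshold."""
    return trips_agg[trips_agg["count"] > threshold].reset_index(drop=True)


# Made-up aggregated routes, for illustration only
demo_routes = pd.DataFrame({
    "start_station_name": ["A", "A", "B"],
    "end_station_name": ["B", "C", "C"],
    "count": [250, 100, 12],
})
frequent_routes = filter_frequent_routes(demo_routes)
print(frequent_routes)
```

Filtering ahead of time reduces the amount of data sent to the widget, at the cost of no longer being able to lower the threshold interactively.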




### Observations

After applying a filter on trip count in Kepler.gl with an upper threshold of 50, the map clearly highlights the most frequent station-to-station bike trips in New York City. The densest arcs, which represent the busiest routes, are concentrated in Manhattan, especially in Midtown, Flatiron, and the Financial District. The high volume of trips in these zones suggests strong commuter and tourist activity.

Bike traffic is especially intense near transportation hubs such as Grand Central Terminal, Penn Station, and the ferry terminals. In addition, Central Park and the Brooklyn waterfront (particularly near Williamsburg and Dumbo) appear frequently in high-trip routes, which suggests they are popular for recreational cycling.


In [13]:
import json

# Capture the current map configuration (layers, filters, tooltips)
config = map_1.config

# Export the configuration to a JSON file
with open('nyc_bike_trips_map_config.json', 'w') as f:
    json.dump(config, f, indent=2)

# Save the map, with its current data and configuration, as a standalone HTML file
map_1.save_to_html(file_name='nyc_bike_trips_map.html')


Map saved to nyc_bike_trips_map.html!
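The exported JSON can be used to restore the same styling in a later session. The sketch below assumes the config file written above and a DataFrame shaped like `df_agg`; the helper names `load_map_config` and `rebuild_map` are hypothetical. The dataset name passed to the map should match the one used when the config was saved ("Bike Trips" here), since the saved layers refer to the dataset by that name.

```python
import json


def load_map_config(path):
    """Read a Kepler.gl configuration previously written with json.dump."""
    with open(path) as f:
        return json.load(f)


def rebuild_map(trips_agg, config_path, height=600):
    """Create a new map that reuses a saved configuration."""
    # Imported here so the loader above works without keplergl installed
    from keplergl import KeplerGl

    config = load_map_config(config_path)
    return KeplerGl(height=height, data={"Bike Trips": trips_agg}, config=config)


# Usage in a notebook cell (assumes df_agg and the JSON file exist):
# map_2 = rebuild_map(df_agg, 'nyc_bike_trips_map_config.json')
# map_2
```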
