Emily Belote

- Static assignments are not included; everything else is working.

# Citi Bike Data
- https://www.citibikenyc.com/system-data 

This Jupyter notebook does the following (as of Monday, Feb. 17):
1. Imports .json data of Station Info (considered static)
2. Imports .json data of Station Status (may change frequently)
3. Imports .csv of Citi Bike trips


- We'll use the Station Info/Status data to build a VeRoViz "nodes" dataframe.
- We'll use the trips data to build a VeRoViz "assignments" dataframe.

With the nodes and assignments dataframes, we can then generate Leaflet maps (static) and Cesium movies.

---

In [3]:
# We'll need these libraries
import numpy as np
import pandas as pd

In [4]:
# These libraries will help us import JSON data:
import json
import urllib.request

In [5]:
# Go ahead and import VeRoViz
import veroviz as vrv
vrv.checkVersion()

'Your current installed version of veroviz is 0.3.1. You are up-to-date with the latest available version.'

In [6]:
# I like to use "environment" variables to store "private" stuff
# (like API keys, or paths to installed files).
# We'll need the `os` library for that:
import os

# See https://veroviz.org/documentation.html#installation for details
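
As a minimal sketch of the environment-variable approach, the helper below reads a variable such as `CESIUMDIR` (used later in this notebook) and fails with a readable message if it is missing. The name `get_env_path` is hypothetical and is not part of VeRoViz.

```python
import os

def get_env_path(name):
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set.")
    return value

# Example (assumes CESIUMDIR was set before Jupyter was launched):
# cesium_dir = get_env_path('CESIUMDIR')
```

Failing early this way avoids a bare `KeyError` from `os.environ[...]` deep inside a later cell.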

--- 

## 1. Import Station Info (from .json)
- These data are *mostly* static...certainly won't change throughout the course of a day.

In [7]:
# Tim's approach for grabbing JSON data:
with urllib.request.urlopen("https://gbfs.citibikenyc.com/gbfs/en/station_information.json") as url:
    station_info_data = json.loads(url.read().decode())
station_info_data

{'last_updated': 1582826464,
 'ttl': 10,
 'data': {'stations': [{'station_id': '304',
    'external_id': '66db6da2-0aca-11e7-82f6-3863bb44ef7c',
    'name': 'Broadway & Battery Pl',
    'short_name': '4962.01',
    'lat': 40.70463334,
    'lon': -74.01361706,
    'region_id': 71,
    'rental_methods': ['CREDITCARD', 'KEY'],
    'capacity': 33,
    'rental_url': 'http://app.citibikenyc.com/S6Lr/IBV092JufD?station_id=304',
    'electric_bike_surcharge_waiver': False,
    'eightd_has_key_dispenser': True,
    'eightd_station_services': [{'id': 'a58d9e34-2f28-40eb-b4a6-c8c01375657a',
      'service_type': 'ATTENDED_SERVICE',
      'bikes_availability': 'UNLIMITED',
      'docks_availability': 'NONE',
      'name': 'Valet Service',
      'description': 'Citi Bike Valet Attendant Service Available',
      'schedule_description': '',
      'link_for_more_info': 'https://www.citibikenyc.com/valet'}],
    'has_kiosk': True},
   {'station_id': '367',
    'external_id': '66dbcdfc-0aca-11e7-82f6-3

In [8]:
# Convert the JSON data into a Pandas dataframe:
station_info_df = pd.DataFrame(station_info_data['data']['stations'])
station_info_df.head()

Unnamed: 0,station_id,external_id,name,short_name,lat,lon,region_id,rental_methods,capacity,rental_url,electric_bike_surcharge_waiver,eightd_has_key_dispenser,eightd_station_services,has_kiosk
0,304,66db6da2-0aca-11e7-82f6-3863bb44ef7c,Broadway & Battery Pl,4962.01,40.704633,-74.013617,71,"[CREDITCARD, KEY]",33,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,True,[{'id': 'a58d9e34-2f28-40eb-b4a6-c8c01375657a'...,True
1,367,66dbcdfc-0aca-11e7-82f6-3863bb44ef7c,E 53 St & Lexington Ave,6617.09,40.758281,-73.970694,71,"[CREDITCARD, KEY]",34,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,False,[{'id': '2d9a5c9e-50e0-4aed-a63b-91ca81e7d2c0'...,True
2,402,66dbf0d0-0aca-11e7-82f6-3863bb44ef7c,Broadway & E 22 St,6098.07,40.740343,-73.989551,71,"[CREDITCARD, KEY]",39,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,False,[{'id': '37a1ae1b-3dd6-4876-8c57-572aaac97981'...,True
3,3443,66de8a86-0aca-11e7-82f6-3863bb44ef7c,W 52 St & 6 Ave,6740.01,40.76133,-73.97982,71,"[CREDITCARD, KEY]",41,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,False,[{'id': '286d75b2-088f-4a79-bf7d-223928be711c'...,True
4,72,66db237e-0aca-11e7-82f6-3863bb44ef7c,W 52 St & 11 Ave,6926.01,40.767272,-73.993929,71,"[CREDITCARD, KEY]",55,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,False,,True


---

## 2.  Get Station Status Data (from .json)
- These data may change frequently.  The feed's `ttl` field (10 in the output below) suggests that the data are refreshed roughly every 10 seconds.

In [9]:
# Tim's approach for grabbing JSON data:
with urllib.request.urlopen("https://gbfs.citibikenyc.com/gbfs/en/station_status.json") as url:
    station_status_data = json.loads(url.read().decode())
station_status_data

{'last_updated': 1582826474,
 'ttl': 10,
 'data': {'stations': [{'station_id': '304',
    'num_bikes_available': 10,
    'num_ebikes_available': 1,
    'num_bikes_disabled': 1,
    'num_docks_available': 22,
    'num_docks_disabled': 0,
    'is_installed': 1,
    'is_renting': 1,
    'is_returning': 0,
    'last_reported': 1582825054,
    'eightd_has_available_keys': True,
    'eightd_active_station_services': [{'id': 'a58d9e34-2f28-40eb-b4a6-c8c01375657a'}]},
   {'station_id': '367',
    'num_bikes_available': 5,
    'num_ebikes_available': 0,
    'num_bikes_disabled': 0,
    'num_docks_available': 29,
    'num_docks_disabled': 0,
    'is_installed': 1,
    'is_renting': 1,
    'is_returning': 0,
    'last_reported': 1582826089,
    'eightd_has_available_keys': False,
    'eightd_active_station_services': [{'id': '2d9a5c9e-50e0-4aed-a63b-91ca81e7d2c0'}]},
   {'station_id': '402',
    'num_bikes_available': 10,
    'num_ebikes_available': 0,
    'num_bikes_disabled': 0,
    'num_docks_

In [10]:
# Convert the data into a Pandas dataframe:
station_status_df = pd.DataFrame(station_status_data['data']['stations'])
station_status_df.head()

Unnamed: 0,station_id,num_bikes_available,num_ebikes_available,num_bikes_disabled,num_docks_available,num_docks_disabled,is_installed,is_renting,is_returning,last_reported,eightd_has_available_keys,eightd_active_station_services
0,304,10,1,1,22,0,1,1,0,1582825054,True,[{'id': 'a58d9e34-2f28-40eb-b4a6-c8c01375657a'}]
1,367,5,0,0,29,0,1,1,0,1582826089,False,[{'id': '2d9a5c9e-50e0-4aed-a63b-91ca81e7d2c0'}]
2,402,10,0,0,29,0,1,1,0,1582826401,False,[{'id': '37a1ae1b-3dd6-4876-8c57-572aaac97981'}]
3,3443,3,0,1,37,0,1,1,0,1582826174,False,[{'id': '286d75b2-088f-4a79-bf7d-223928be711c'}]
4,72,44,0,0,11,0,1,1,1,1582825850,False,
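
To get a feel for how fresh a feed is, we can read its `last_updated` and `ttl` fields. The sketch below assumes the GBFS convention that `last_updated` is a POSIX timestamp and `ttl` is the number of seconds until the next refresh. The function name `feed_freshness` is hypothetical, and the sample dictionary only copies the two values shown in the output above.

```python
from datetime import datetime, timezone

def feed_freshness(feed):
    """Return (last_updated as a UTC datetime, ttl in seconds) for a GBFS-style dict."""
    updated = datetime.fromtimestamp(feed['last_updated'], tz=timezone.utc)
    return updated, feed['ttl']

# Values copied from the station status output above (station list omitted):
sample_feed = {'last_updated': 1582826474, 'ttl': 10, 'data': {'stations': []}}
updated, ttl = feed_freshness(sample_feed)
print(updated.isoformat(), ttl)
```

The same function can be applied to `station_status_data` directly, since it has the same top-level keys.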


--- 

## 3.  Import Trip Data (from .csv)
- We'll create a pandas dataframe from the data.
- See https://s3.amazonaws.com/tripdata/index.html for available datasets.

In [11]:
# I just randomly grabbed this file:
bike_trips_df = pd.read_csv('202001-citibike-tripdata.csv')

In [12]:
# bike_trips_df.columns

# Using `list()` formats things a little better:
list(bike_trips_df.columns)

['tripduration',
 'starttime',
 'stoptime',
 'start station id',
 'start station name',
 'start station latitude',
 'start station longitude',
 'end station id',
 'end station name',
 'end station latitude',
 'end station longitude',
 'bikeid',
 'usertype',
 'birth year',
 'gender']
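
Several later cells call `pd.to_datetime()` on `starttime` and `stoptime`. One alternative, sketched here on two made-up rows, is to let `read_csv` parse those columns once at import time. The sample text is an assumption that only mimics the layout of the trip file.

```python
import io
import pandas as pd

# Two made-up rows in the same layout as the trip file (most columns omitted).
csv_text = (
    "tripduration,starttime,stoptime,bikeid\n"
    "60,2020-01-01 00:00:55.3900,2020-01-01 00:01:55.3900,1\n"
    "90,2020-01-01 00:01:07.0000,2020-01-01 00:02:37.0000,2\n"
)

# parse_dates converts the named columns to datetime64 while reading.
trips = pd.read_csv(io.StringIO(csv_text), parse_dates=['starttime', 'stoptime'])
print(trips.dtypes)
```

With the real file, the equivalent call would pass the CSV filename instead of the `StringIO` object.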

--- 

## Create a VeRoViz "nodes" Dataframe
- We'll populate this with data from Station Info and Station Status
- We'll also hard-code some columns

In [13]:
nodes = vrv.initDataframe('nodes')

In [14]:
# Here are the columns we'll need to populate:
#list(nodes.columns)

In [15]:
# Here are the columns from our "Station Info":
#list(station_info_df.columns)

In [16]:
# An example to show the syntax for displaying 2 particular columns from a df:
#station_info_df[['lat', 'lon']].head()

---
# Create the nodes to be passed to veroviz and Cesium

## Color-code the nodes based on Station STATUS.  For example:
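 - green  --> bikes and docks are available
 - red    --> no bikes available (station is not renting)
 - orange --> no docks available (station is not accepting returns)
 - black  --> avoid this station completely (neither renting nor accepting returns)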


In [17]:
# Convert the station status data into a Pandas dataframe:
station_status_df = pd.DataFrame(station_status_data['data']['stations'])
#Create a new column for leafletColor and cesiumColor 
    #-automatically populated as green for all stations
    #-later will represent all completely open stations (pick up and drop off)
station_status_df['leafletColor'] = 'green'
station_status_df['cesiumColor'] = 'Cesium.Color.GREEN'
#station_status_df

#Red if no bikes are available
#uses "is_renting"
station_status_df.loc[(station_status_df['is_renting']==0), 'leafletColor'] = 'red'
station_status_df.loc[(station_status_df['is_renting']==0), 'cesiumColor'] = 'Cesium.Color.RED'

#Orange if no docks are available (station is not accepting returns)
#uses "is_returning"
station_status_df.loc[(station_status_df['is_returning']==0), 'leafletColor'] = 'orange'
station_status_df.loc[(station_status_df['is_returning']==0), 'cesiumColor'] = 'Cesium.Color.ORANGE'

#Black if station is not accessible (Avoid this station, we cannot rent or return)
#uses "is_renting" and "is_returning"
station_status_df.loc[(station_status_df['is_returning']==0) & (station_status_df['is_renting']==0), 'leafletColor'] = 'black'
station_status_df.loc[(station_status_df['is_returning']==0) & (station_status_df['is_renting']==0), 'cesiumColor'] = 'Cesium.Color.BLACK'

#Test to see if there are any 'color' stations
#station_status_df.loc[station_status_df.leafletColor=='orange']
#Show the dataframe
#station_status_df
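
The masks above are applied in sequence, so later assignments overwrite earlier ones. As a sketch of the same rule in a single pass, `np.select` checks a list of conditions in order and takes the first match. The sample rows are made up, and building the Cesium color string by upper-casing the Leaflet color is an assumption that happens to work for the four colors used here.

```python
import numpy as np
import pandas as pd

# Small made-up sample with the two status flags used above.
status = pd.DataFrame({
    'station_id':   ['a', 'b', 'c', 'd'],
    'is_renting':   [1, 0, 1, 0],
    'is_returning': [1, 1, 0, 0],
})

renting = (status['is_renting'] == 1).to_numpy()
returning = (status['is_returning'] == 1).to_numpy()

# Conditions are checked in order; the first one that matches wins.
conditions = [~renting & ~returning, ~renting, ~returning]
status['leafletColor'] = np.select(conditions, ['black', 'red', 'orange'], default='green')
status['cesiumColor'] = 'Cesium.Color.' + status['leafletColor'].str.upper()
print(status)
```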

In [18]:
# Let's go ahead and re-initialize an empty dataframe within this cell:
nodes = vrv.initDataframe('nodes')

# Now, copy the relevant columns from our Station Info dataframe:
# NOTE: We were getting some size mis-match errors until we copied 
#       just a single column first.  
nodes['id'] = station_info_df['station_id'].values
nodes[['id', 'lat', 'lon', 'nodeName']] = station_info_df[['station_id', 'lat', 'lon', 'name']].values
nodes[['leafletIconText', 'cesiumIconText']] = station_info_df[['name', 'station_id']].values
nodes[['leafletColor', 'cesiumColor']] = station_status_df[['leafletColor', 'cesiumColor']].values


# Finally, we'll fill in the rest of our nodes dataframe with some hard-coded/constant values:
nodes.loc[:,'altMeters'] = 0
nodes.loc[:,['nodeType', 'leafletIconPrefix', 'leafletIconType']] = [
             'CitiBikeStation',  'fa',                'bicycle']
nodes.loc[:,['cesiumIconType']] = ['pin']

In [19]:
# Show all of the nodes on a Leaflet map:
vrv.createLeaflet(nodes=nodes)

--- 

## Create a VeRoViz "assignments" Dataframe
- We'll populate this with trip data
- We'll also hard-code some columns

In [20]:
# Initialize an empty "assignments" dataframe:
assignments = vrv.initDataframe('assignments')
#assignments.info()

### Here's the plan:
- These columns will come directly from bike trip data:
    - `objectID` (from `bikeid`)
    - `startLat` and `startLon` (from `start station latitude` and `start station longitude`)
    - `endLat` and `endLon` (from `end station latitude` and `end station longitude`)
- These columns will need to be calculated:
    - `startTimeSec` (from `starttime`, but converted to "seconds since the first event")
    - `endTimeSec`   (from `starttime` and `tripduration`, or `starttime` and `stoptime`)
    - We'll create some new columns in `bike_trips_df` to hold our calculations.  Then we'll copy these calculated columns into our assignments dataframe.
- This column will need to be auto generated:
    - `odID` (each origin/destination pair should get a unique integer)
- The remaining columns will be hard-coded (for now)

In [21]:
# What is the first start time in our bike_trips_df?
min(bike_trips_df['starttime'])

'2020-01-01 00:00:55.3900'

In [22]:
# Add a new column to bike_trips_df...

# This next command will produce a timedelta (days HH:MM:SS.ms) 
# showing the time elapsed since the first observed `starttime`:
bike_trips_df['timeAfterStart'] = pd.to_datetime(bike_trips_df['starttime']) - \
                                  pd.to_datetime(min(bike_trips_df['starttime']))

# Now, convert this to a whole number of seconds (fractions are truncated):
bike_trips_df['timeAfterStart'] = bike_trips_df['timeAfterStart'].dt.total_seconds().astype(int)

bike_trips_df['timeAfterStart'].head()

0     0
1    12
2    46
3    50
4    50
Name: timeAfterStart, dtype: int32

In [23]:
# Just for fun, here's the time differences between start/stop times:
pd.to_datetime(bike_trips_df['stoptime']) - pd.to_datetime(bike_trips_df['starttime'])

0         00:13:09.757000
1         00:25:41.076000
2         00:24:24.871000
3         00:09:52.594000
4         00:11:42.452000
                ...      
1240591   00:26:27.607000
1240592   00:03:42.831000
1240593   00:02:43.862000
1240594   00:05:27.148000
1240595   00:08:04.146000
Length: 1240596, dtype: timedelta64[ns]
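
The plan above mentions a second way to get `endTimeSec`: from `stoptime` rather than from `tripduration`. Here is a minimal sketch on two made-up trips, measuring both columns from the earliest start time.

```python
import pandas as pd

# Two made-up trips (timestamps in the same format as the trip file).
trips = pd.DataFrame({
    'starttime': ['2020-01-01 00:00:55.3900', '2020-01-01 00:01:07.5000'],
    'stoptime':  ['2020-01-01 00:14:05.1470', '2020-01-01 00:26:48.5760'],
})
start = pd.to_datetime(trips['starttime'])
stop = pd.to_datetime(trips['stoptime'])
t0 = start.min()  # the first observed start time is our "time zero"

# Seconds since t0, truncated to whole seconds.
trips['startTimeSec'] = (start - t0).dt.total_seconds().astype(int)
trips['endTimeSec'] = (stop - t0).dt.total_seconds().astype(int)
print(trips[['startTimeSec', 'endTimeSec']])
```

Because each column is truncated separately, this can differ by one second from adding `tripduration` to the truncated start time.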

In [24]:
# In one cell, we'll create our assignments dataframe.

# Make sure we're starting with an empty dataframe:
assignments = vrv.initDataframe('assignments')

# Copy over the static values.
# We'll start by copying a single column, to avoid the size mis-match issue:
assignments['objectID'] = bike_trips_df['bikeid']
assignments[['startLat', 'startLon', 'endLat', 'endLon']] = bike_trips_df[['start station latitude', 
                                                                          'start station longitude',
                                                                          'end station latitude',
                                                                          'end station longitude']].values

# Copy our new calculated column:
assignments['startTimeSec'] = bike_trips_df['timeAfterStart'].values

# Use the calculated column and tripduration to get the end time (in seconds):
assignments['endTimeSec'] = (bike_trips_df['timeAfterStart'] + bike_trips_df['tripduration']).values

# Fill in the rest of our assignments df with some hard-coded values:
# (we'll probably want to revisit this later)
assignments.loc[:,['modelFile', 'modelScale', 'modelMinPxSize', 'startAltMeters', 'endAltMeters', 
                   'leafletColor', 'leafletWeight', 'leafletStyle', 'leafletOpacity', 'useArrows',
                   'cesiumColor', 'cesiumWeight', 'cesiumStyle', 'cesiumOpacity']] = \
                  ['veroviz/models/car_blue.gltf', 100, 45, 0, 0, 
                   'blue', 2, 'solid', 0.8, False, 
                   'Cesium.Color.BLUE', 2, 'solid', 0.7]

# Finally (for now), let's generate a unique odID value for each row.
# This will make sense only if we assume that each row corresponds to a specific
# O/D pair.  Conversely, if we have turn-by-turn arcs, we'll need to group
# multiple rows into the same O/D pair.  We'll tackle that case if/when 
# we encounter it.
assignments.loc[:,'odID'] = list(range(0, len(assignments)))

In [25]:
# Display what we've created:
assignments.head()

Unnamed: 0,odID,objectID,modelFile,modelScale,modelMinPxSize,startTimeSec,startLat,startLon,startAltMeters,endTimeSec,...,endAltMeters,leafletColor,leafletWeight,leafletStyle,leafletOpacity,useArrows,cesiumColor,cesiumWeight,cesiumStyle,cesiumOpacity
0,0,30326,veroviz/models/car_blue.gltf,100,45,0,40.732219,-73.981656,0,789,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
1,1,17105,veroviz/models/car_blue.gltf,100,45,12,40.661063,-73.979453,0,1553,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
2,2,40177,veroviz/models/car_blue.gltf,100,45,46,40.743227,-73.974498,0,1510,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
3,3,27690,veroviz/models/car_blue.gltf,100,45,50,40.736529,-74.00618,0,642,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
4,4,32583,veroviz/models/car_blue.gltf,100,45,50,40.694546,-73.958014,0,752,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7


--- 

### Create a Leaflet map 
- We have a lot of bikes...let's just display one.

In [26]:
# FIXME

# Some (most?) of our bikes are being re-positioned by the CitiBike staff.
# We'll want to identify when this happens.

# One way to identify that a re-positioning has occurred is if 
# the start location for the next arc of a given bike does not 
# match the end location of the previous arc for this bike.

# We might also want to know when these activities occur.
# Is it only at night?  Do we have enough data to figure this out?

# OPTION 1: Use a "for" loop
# OPTION 2: Don't use a "for" loop
# Which option is faster?
# --> timeit()
#       import timeit or
#       %timeit   magic commands, only available in ipython
# --> current time - previous time

# Draw the repositioning of bikes on a (separate) Leaflet map and a (separate) Cesium movie.
#   Color the arcs differently on the Leaflet map?
#   A repositioning arc runs from the end time at the previous station
#   to the start time at the next station.

# A bike does not show up if it is static.

# Use vrv.addStaticAssignment() to display stationary bikes. 
# This will show the bike slowly transitioning between stations if the company is moving them.
#  ? Can we just append to the end of the assignments dataframe?
#  ? If not, 

#Use the vrv filter option to create a map that only displays the manual moves
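
To compare the two options, here is a sketch that runs the same repositioning check with and without a `for` loop and times both with `timeit`. The trips are made up, the function names are hypothetical, and the check compares station IDs rather than coordinates, which is a simplification.

```python
import timeit
import pandas as pd

# Made-up trips for one bike, in chronological order.
trips = pd.DataFrame({
    'start station id': [1, 2, 5, 6],
    'end station id':   [2, 3, 6, 7],
})

def with_loop(df):
    """Flag row i when the next trip starts somewhere other than where trip i ended."""
    flags = [0] * len(df)
    for i in range(len(df) - 1):
        if df['end station id'].iloc[i] != df['start station id'].iloc[i + 1]:
            flags[i] = 1
    return flags

def without_loop(df):
    """Same check, using shift(-1) to line up each trip with the next one."""
    next_start = df['start station id'].shift(-1)
    moved = next_start.notna() & (next_start != df['end station id'])
    return moved.astype(int).tolist()

print(with_loop(trips), without_loop(trips))
print('loop:      ', timeit.timeit(lambda: with_loop(trips), number=100))
print('vectorized:', timeit.timeit(lambda: without_loop(trips), number=100))
```

On a dataframe this small the timings say little. The comparison becomes meaningful on the full set of trips for one bike, or for all bikes.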

In [38]:
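# Re-read the trip data and keep only the trips for bike 14530.
# NOTE: Re-reading the file drops the `timeAfterStart` column we calculated earlier,
#       so that column is NaN for the original rows further below.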
bike_trips_df = pd.read_csv('202001-citibike-tripdata.csv')
bike_trips_14530_df = pd.DataFrame(bike_trips_df[bike_trips_df['bikeid'] == 14530]).reset_index(drop = True)

---
# "Not a For loop" Option

In [39]:
#Duplicate the dataframe omitting the first row
B_bike_trips_14530_df = bike_trips_14530_df.loc[1:, :].reset_index(drop = True)
#B_bike_trips_14530_df[['start station latitude', 'start station longitude']]

#Create new columns in the original dataframe and populate them with the next start station to check for repositioning
bike_trips_14530_df['next start lat'] = 0
bike_trips_14530_df['next start lon'] = 0
bike_trips_14530_df['next start time'] = 0
bike_trips_14530_df['repositioning'] = 0
bike_trips_14530_df['leafletColor'] = 'blue'
bike_trips_14530_df['cesiumColor'] = 'Cesium.Color.BLUE'
bike_trips_14530_df['next start lat'] = B_bike_trips_14530_df['start station latitude']
bike_trips_14530_df['next start lon'] = B_bike_trips_14530_df['start station longitude']
bike_trips_14530_df['next start time'] = B_bike_trips_14530_df['starttime'] #for static assignment


In [40]:
#Flag a repositioning when the next start location differs from this trip's end location
#(a difference in either latitude or longitude counts)
bike_trips_14530_df.loc[(bike_trips_14530_df['next start lat'] != bike_trips_14530_df['end station latitude']) | (bike_trips_14530_df['next start lon'] != bike_trips_14530_df['end station longitude']), 'repositioning'] = 1
bike_trips_14530_df.iloc[-1, bike_trips_14530_df.columns.get_loc('repositioning')] = 0 
bike_trips_14530_df.iloc[-1, bike_trips_14530_df.columns.get_loc('next start time')] = bike_trips_14530_df.iloc[-1, bike_trips_14530_df.columns.get_loc('stoptime')]
#bike_trips_14530_df.loc[bike_trips_14530_df['repositioning'] == 1]
bike_trips_14530_df

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,...,bikeid,usertype,birth year,gender,next start lat,next start lon,next start time,repositioning,leafletColor,cesiumColor
0,548,2020-01-02 09:25:11.9390,2020-01-02 09:34:20.3600,261,Johnson St & Gold St,40.694749,-73.983625,2000,Front St & Washington St,40.702551,...,14530,Subscriber,1987,1,40.702551,-73.989402,2020-01-02 18:16:45.1630,0,blue,Cesium.Color.BLUE
1,1061,2020-01-02 18:16:45.1630,2020-01-02 18:34:26.5780,2000,Front St & Washington St,40.702551,-73.989402,3414,Bergen St & Flatbush Ave,40.680945,...,14530,Subscriber,1992,1,40.680945,-73.975673,2020-01-05 14:27:01.9750,0,blue,Cesium.Color.BLUE
2,400,2020-01-05 14:27:01.9750,2020-01-05 14:33:42.1210,3414,Bergen St & Flatbush Ave,40.680945,-73.975673,3486,Schermerhorn St & Bond St,40.688417,...,14530,Customer,1969,0,40.688417,-73.984517,2020-01-06 16:10:01.6600,0,blue,Cesium.Color.BLUE
3,301,2020-01-06 16:10:01.6600,2020-01-06 16:15:03.0080,3486,Schermerhorn St & Bond St,40.688417,-73.984517,241,DeKalb Ave & S Portland Ave,40.689810,...,14530,Subscriber,1985,1,40.689810,-73.974931,2020-01-06 17:15:51.0260,0,blue,Cesium.Color.BLUE
4,167,2020-01-06 17:15:51.0260,2020-01-06 17:18:38.1240,241,DeKalb Ave & S Portland Ave,40.689810,-73.974931,324,DeKalb Ave & Hudson Ave,40.689888,...,14530,Subscriber,1998,1,40.689888,-73.981013,2020-01-06 17:24:58.6320,0,blue,Cesium.Color.BLUE
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
69,1689,2020-01-30 18:30:48.6870,2020-01-30 18:58:58.0690,362,Broadway & W 37 St,40.751726,-73.987535,151,Cleveland Pl & Spring St,40.722104,...,14530,Customer,1995,1,40.722104,-73.997249,2020-01-30 19:30:54.8530,0,blue,Cesium.Color.BLUE
70,388,2020-01-30 19:30:54.8530,2020-01-30 19:37:23.7710,151,Cleveland Pl & Spring St,40.722104,-73.997249,311,Norfolk St & Broome St,40.717227,...,14530,Subscriber,1973,1,40.717227,-73.988021,2020-01-30 21:52:15.6920,0,blue,Cesium.Color.BLUE
71,884,2020-01-30 21:52:15.6920,2020-01-30 22:07:00.2830,311,Norfolk St & Broome St,40.717227,-73.988021,297,E 15 St & 3 Ave,40.734232,...,14530,Subscriber,1983,1,40.734232,-73.986923,2020-01-31 08:27:54.4500,0,blue,Cesium.Color.BLUE
72,515,2020-01-31 08:27:54.4500,2020-01-31 08:36:30.1120,297,E 15 St & 3 Ave,40.734232,-73.986923,379,W 31 St & 7 Ave,40.749156,...,14530,Subscriber,1993,1,40.749156,-73.991600,2020-01-31 08:41:55.1210,0,blue,Cesium.Color.BLUE


In [41]:
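#Append a new (red) row for each detected repositioning, to show the movement of the bike
#NOTE: tripduration is hard-coded to 28800 seconds (8 hours) for these rows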
for i in range(len(bike_trips_14530_df)-1):
    if bike_trips_14530_df['repositioning'][i] == 1:
        data = [{'tripduration':28800 ,
                 'starttime':bike_trips_14530_df['stoptime'][i], 
                 'stoptime':bike_trips_14530_df['starttime'][i+1], 
                 'start station id':bike_trips_14530_df['end station id'][i], 
                 'start station name': None, 
                 'start station latitude':bike_trips_14530_df['end station latitude'][i],
                 'start station longitude':bike_trips_14530_df['end station longitude'][i], 
                 'end station id':bike_trips_14530_df['start station id'][i+1], 
                 'end station name':None, 
                 'end station latitude':bike_trips_14530_df['next start lat'][i], 
                 'end station longitude': bike_trips_14530_df['next start lon'][i], 
                 'bikeid':bike_trips_14530_df['bikeid'][i],  
                 'usertype':None, 
                 'birth year':None, 
                 'gender':None, 
                 'timeAfterStart':bike_trips_14530_df['tripduration'][i+1], 
                 'leafletColor':'red', 
                 'cesiumColor': 'Cesium.Color.RED', 
                 'capacity':None}]
        # DataFrame.append() was removed in pandas 2.0; pd.concat() does the same job
        bike_trips_14530_df = pd.concat([bike_trips_14530_df, pd.DataFrame(data)])
bike_trips_14530_df = bike_trips_14530_df.reset_index(drop = True)
bike_trips_14530_df    

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,...,birth year,gender,next start lat,next start lon,next start time,repositioning,leafletColor,cesiumColor,timeAfterStart,capacity
0,548,2020-01-02 09:25:11.9390,2020-01-02 09:34:20.3600,261,Johnson St & Gold St,40.694749,-73.983625,2000,Front St & Washington St,40.702551,...,1987,1,40.702551,-73.989402,2020-01-02 18:16:45.1630,0.0,blue,Cesium.Color.BLUE,,
1,1061,2020-01-02 18:16:45.1630,2020-01-02 18:34:26.5780,2000,Front St & Washington St,40.702551,-73.989402,3414,Bergen St & Flatbush Ave,40.680945,...,1992,1,40.680945,-73.975673,2020-01-05 14:27:01.9750,0.0,blue,Cesium.Color.BLUE,,
2,400,2020-01-05 14:27:01.9750,2020-01-05 14:33:42.1210,3414,Bergen St & Flatbush Ave,40.680945,-73.975673,3486,Schermerhorn St & Bond St,40.688417,...,1969,0,40.688417,-73.984517,2020-01-06 16:10:01.6600,0.0,blue,Cesium.Color.BLUE,,
3,301,2020-01-06 16:10:01.6600,2020-01-06 16:15:03.0080,3486,Schermerhorn St & Bond St,40.688417,-73.984517,241,DeKalb Ave & S Portland Ave,40.689810,...,1985,1,40.689810,-73.974931,2020-01-06 17:15:51.0260,0.0,blue,Cesium.Color.BLUE,,
4,167,2020-01-06 17:15:51.0260,2020-01-06 17:18:38.1240,241,DeKalb Ave & S Portland Ave,40.689810,-73.974931,324,DeKalb Ave & Hudson Ave,40.689888,...,1998,1,40.689888,-73.981013,2020-01-06 17:24:58.6320,0.0,blue,Cesium.Color.BLUE,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
73,529,2020-01-31 08:41:55.1210,2020-01-31 08:50:44.3750,379,W 31 St & 7 Ave,40.749156,-73.991600,496,E 16 St & 5 Ave,40.737262,...,1981,1,,,2020-01-31 08:50:44.3750,0.0,blue,Cesium.Color.BLUE,,
74,28800,2020-01-06 21:59:37.9540,2020-01-07 13:23:12.3180,3232,,40.689622,-73.983043,3377,,40.678612,...,,,,,,,red,Cesium.Color.RED,370.0,
75,28800,2020-01-16 09:48:57.9800,2020-01-16 12:00:41.2330,280,,40.733320,-73.995101,326,,40.729538,...,,,,,,,red,Cesium.Color.RED,288.0,
76,28800,2020-01-16 12:05:29.6630,2020-01-17 09:15:27.4700,3812,,40.734814,-73.992085,504,,40.732219,...,,,,,,,red,Cesium.Color.RED,377.0,


In [42]:
#bike_trips_14530_df = bike_trips_14530_df.reset_index(drop = True)
#bike_trips_14530_df

In [43]:
# Initialize an empty "assignments" dataframe:
assignments14530 = vrv.initDataframe('assignments')

In [44]:
# In one cell, we'll create our assignments dataframe.

# Make sure we're starting with an empty dataframe:
assignments14530 = vrv.initDataframe('assignments')

# Copy over the static values.
# We'll start by copying a single column, to avoid the size mis-match issue:
assignments14530['objectID'] = bike_trips_14530_df['bikeid']
assignments14530[['leafletColor', 'cesiumColor']] = bike_trips_14530_df[['leafletColor', 'cesiumColor']].values
assignments14530[['startLat', 'startLon', 'endLat', 'endLon']] = bike_trips_14530_df[['start station latitude', 
                                                                          'start station longitude',
                                                                          'end station latitude',
                                                                          'end station longitude']].values

# Copy our new calculated column:
assignments14530['startTimeSec'] = bike_trips_14530_df['timeAfterStart'].values

# Use the calculated column and tripduration to get the end time (in seconds):
assignments14530['endTimeSec'] = (bike_trips_14530_df['timeAfterStart'] + bike_trips_14530_df['tripduration']).values

# Fill in the rest of our assignments df with some hard-coded values:
# (we'll probably want to revisit this later)
assignments14530.loc[:,['modelFile', 'modelScale', 'modelMinPxSize', 'startAltMeters', 'endAltMeters', 
                    'leafletWeight', 'leafletStyle', 'leafletOpacity', 'useArrows',
                    'cesiumWeight', 'cesiumStyle', 'cesiumOpacity']] = \
                  ['veroviz/models/car_blue.gltf', 100, 45, 0, 0, 
                    2, 'solid', 0.8, False, 
                    2, 'solid', 0.7]

# Finally (for now), let's generate a unique odID value for each row.
# This will make sense only if we assume that each row corresponds to a specific
# O/D pair.  Conversely, if we have turn-by-turn arcs, we'll need to group
# multiple rows into the same O/D pair.  We'll tackle that case if/when 
# we encounter it.
assignments14530.loc[:,'odID'] = list(range(0, len(assignments14530)))

In [45]:
vrv.createLeaflet(arcs=assignments14530)

In [37]:
vrv.addStaticAssignment?

In [None]:
'''#Add static assignments so the bike does not disappear when docked in a station
#initialize a static assignments dataframe

static_assignment = vrv.addStaticAssignment(
    initAssignments=None,
    odID=1,
    objectID=None,
    modelFile=None,
    modelScale=100,
    modelMinPxSize=75,
    loc=None,
    startTimeSec=None,
    endTimeSec=None,
)
#vrv.addStaticAssignment''';
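
As a starting point for the static assignments, the sketch below works out when and where a bike sits docked: from the end of one trip until the start of the next. The trips are made up, and the sketch assumes no repositioning happens between trips. The commented-out call at the end is untested and only reuses the parameter names from the cell above.

```python
import pandas as pd

# Made-up trips for one bike, with times already converted to seconds.
trips = pd.DataFrame({
    'bikeid':       [14530, 14530, 14530],
    'startTimeSec': [0, 900, 5000],
    'endTimeSec':   [600, 1500, 5400],
    'endLat':       [40.70, 40.68, 40.69],
    'endLon':       [-73.99, -73.98, -73.97],
})

# A bike is docked from the end of one trip until the start of the next.
docked = pd.DataFrame({
    'objectID':     trips['bikeid'],
    'lat':          trips['endLat'],
    'lon':          trips['endLon'],
    'startTimeSec': trips['endTimeSec'],
    'endTimeSec':   trips['startTimeSec'].shift(-1),
}).dropna(subset=['endTimeSec'])  # the last trip has no "next" start time
docked['endTimeSec'] = docked['endTimeSec'].astype(int)
print(docked)

# Hypothetical use (needs veroviz and an existing assignments dataframe):
# for row in docked.itertuples():
#     assignments = vrv.addStaticAssignment(
#         initAssignments=assignments, objectID=row.objectID,
#         loc=[row.lat, row.lon],
#         startTimeSec=row.startTimeSec, endTimeSec=row.endTimeSec)
```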

--- 

### Create a Cesium movie for one bike

In [42]:
# Use this command to get documentation on the `createCesium()` function:
#vrv.createCesium?

In [48]:
'''
# startDate: Format is "YYYY-MM-DD"
startDate = pd.to_datetime(min(bike_trips_df['starttime'])).strftime('%Y-%m-%d')

# startTime: Format is "HH:MM:SS"
startTime = pd.to_datetime(min(bike_trips_df['starttime'])).strftime('%H:%M:%S')

vrv.createCesium(
    assignments = assignments[assignments['objectID'] == 14530],
    nodes       = nodes,
    startDate   = startDate,
    startTime   = startTime,
    cesiumDir   = os.environ['CESIUMDIR'],
    problemDir  = 'IE_670/citibike_example')
''';

In [None]:
# FIXME

# The Cesium movie is cluttered with all of our station markers.
# It would be better to only include the markers that are actually relevant 
# to our given bike.

# Fortunately, our bike trips df contains the station IDs.
# We just need to get a list of unique IDs, and then 
# pass to createCesium only the subset of nodes corresponding to these IDs.
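
The idea in this FIXME can be sketched as follows, using made-up stand-ins for the nodes and single-bike trips dataframes. It assumes that station IDs are strings in the Station Info feed and integers in the trip file, as the outputs earlier in this notebook suggest, so both sides are compared as strings.

```python
import pandas as pd

# Made-up stand-ins for `nodes` and the single-bike trips dataframe.
nodes = pd.DataFrame({'id': ['261', '2000', '3414', '72'],
                      'nodeName': ['A', 'B', 'C', 'D']})
bike_trips = pd.DataFrame({'start station id': [261, 2000],
                           'end station id':   [2000, 3414]})

# Collect every station the bike started from or ended at.
visited = pd.unique(bike_trips[['start station id', 'end station id']].values.ravel())

# Keep only the nodes whose ID appears in the visited list.
bike_nodes = nodes[nodes['id'].astype(str).isin(visited.astype(str))]
print(bike_nodes)
```

The filtered dataframe could then be passed as `nodes` to `createLeaflet()` or `createCesium()`.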

In [63]:
# Let's go ahead and re-initialize an empty dataframe within this cell:
nodes1 = vrv.initDataframe('nodes')


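# Create one node per trip, placed at the trip's start station and keyed by bike ID,
# so that we can filter the nodes by bike later (end stations are not included):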
nodes1['id'] = bike_trips_df['bikeid'].values
nodes1[['id', 'lat', 'lon', 'nodeName']] = bike_trips_df[['bikeid', 'start station latitude', 'start station longitude', 'start station id']].values
nodes1[['leafletIconText', 'cesiumIconText']] = bike_trips_df[['start station id', 'start station id']].values


# Finally, we'll fill in the rest of our nodes dataframe with some hard-coded/constant values:
nodes1.loc[:,'altMeters'] = 0
nodes1.loc[:,['nodeType', 'leafletIconPrefix', 'leafletIconType']] = [
             'CitiBikeStation',  'fa',                'bicycle']
nodes1.loc[:,['cesiumIconType']] = ['pin']
nodes1.loc[:,['leafletColor', 'cesiumColor']] = ['blue', 'Cesium.Color.BLUE']

In [64]:
bike_trips_df.tail()

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender
1240591,1587,2020-01-31 23:59:26.8820,2020-02-01 00:25:54.4890,3244,University Pl & E 8 St,40.731437,-73.994903,3092,Berry St & N 8 St,40.719009,-73.958525,40662,Subscriber,1990,1
1240592,222,2020-01-31 23:59:32.6410,2020-02-01 00:03:15.4720,383,Greenwich Ave & Charles St,40.735238,-74.000271,383,Greenwich Ave & Charles St,40.735238,-74.000271,28722,Subscriber,1983,1
1240593,163,2020-01-31 23:59:39.1780,2020-02-01 00:02:23.0400,150,E 2 St & Avenue C,40.720874,-73.980858,411,E 6 St & Avenue D,40.722281,-73.976687,32530,Subscriber,1958,2
1240594,327,2020-01-31 23:59:49.2310,2020-02-01 00:05:16.3790,483,E 12 St & 3 Ave,40.732233,-73.9889,3718,E 11 St & Avenue B,40.727464,-73.979504,15314,Customer,1994,1
1240595,484,2020-01-31 23:59:57.0360,2020-02-01 00:08:01.1820,327,Vesey Pl & River Terrace,40.715338,-74.016584,534,Water - Whitehall Plaza,40.702551,-74.012723,30947,Subscriber,1987,1


In [60]:
# Show all of the nodes on a Leaflet map:
vrv.createLeaflet(nodes=nodes1[nodes1['id'] == 14530])

#vrv.createLeaflet(arcs=assignments[assignments['objectID'] == 14530])

In [65]:
# startDate: Format is "YYYY-MM-DD"
startDate = pd.to_datetime(min(bike_trips_14530_df['starttime'])).strftime('%Y-%m-%d')

# startTime: Format is "HH:MM:SS"
startTime = pd.to_datetime(min(bike_trips_14530_df['starttime'])).strftime('%H:%M:%S')

vrv.createCesium(
    assignments = assignments[assignments['objectID'] == 14530],
    nodes       = nodes1[nodes1['id'] == 14530],
    startDate   = startDate,
    startTime   = startTime,
    cesiumDir   = os.environ['CESIUMDIR'],
    problemDir  = 'IE_670/citibike_example')

Message: File selector was written to C:/Cesium/IE_670/citibike_example/;IE_670;citibike_example.vrv ...
Message: Configs were written to C:/Cesium/IE_670/citibike_example/config.js ...
Message: Nodes were written to C:/Cesium/IE_670/citibike_example/displayNodes.js ...
Message: Assignments (.js) were written to C:/Cesium/IE_670/citibike_example/displayPaths.js ...
Message: Assignments (.czml) were written to C:/Cesium/IE_670/citibike_example/routes.czml ...


--- 

#### Playing around with dates/times
- Here's some code related to formatting dates/times.  There might be something useful here in the future...

In [181]:
#pd.to_datetime(bike_trips_df['starttime']).dt.date

0          2020-01-01
1          2020-01-01
2          2020-01-01
3          2020-01-01
4          2020-01-01
              ...    
1240591    2020-01-31
1240592    2020-01-31
1240593    2020-01-31
1240594    2020-01-31
1240595    2020-01-31
Name: starttime, Length: 1240596, dtype: object

In [177]:
#pd.to_datetime(min(bike_trips_df['starttime'])).strftime('%Y-%m-%d')

'2020-01-01'

In [179]:
#pd.to_datetime(min(bike_trips_df['starttime'])).strftime('%H:%M:%S')

'00:00:55'

--- 

### NYC Subway Stations

- A list of subway stations may be found here:
    - http://web.mta.info/developers/data/nyct/subway/Stations.csv 

- Other links:
    - http://web.mta.info/developers/index.html
    - http://datamine.mta.info/list-of-feeds 
    
Ideas:
- For a given location, find the nearest subway station.
- For a given destination, find the nearest **available** CitiBike station.
- For a given O/D pair, determine the best combination of subways/bikes to use.
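
As a first step toward the "nearest station" ideas, here is a minimal sketch that uses the haversine great-circle distance. The function names are hypothetical. The two stations are taken from the Station Info output earlier in this notebook, and the query point is made up.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius, in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_station(lat, lon, stations):
    """Return the station dict closest to (lat, lon), or None if the list is empty."""
    return min(stations,
               key=lambda s: haversine_m(lat, lon, s['lat'], s['lon']),
               default=None)

# Two stations from the Station Info output earlier in this notebook:
stations = [
    {'name': 'Broadway & Battery Pl', 'lat': 40.70463334, 'lon': -74.01361706},
    {'name': 'E 53 St & Lexington Ave', 'lat': 40.758281, 'lon': -73.970694},
]
print(nearest_station(40.7060, -74.0130, stations)['name'])
```

To find the nearest **available** station, the list could first be filtered to stations that are currently renting and have bikes available.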
