# Citi Bike Data
- https://www.citibikenyc.com/system-data 

This Jupyter notebook does the following (as of Monday, Feb. 17):
1. Imports .json data of Station Info (considered static)
2. Imports .json data of Station Status (may change frequently)
3. Imports .csv of Citi Bike trips


- We'll use the Station Info/Status data to build a VeRoViz "nodes" dataframe.
- We'll use the trips data to build a VeRoViz "assignments" dataframe.

With the nodes and assignments dataframes, we can then generate Leaflet maps (static) and Cesium movies.

---

In [817]:
# We'll need these libraries
import numpy as np
import time
import pandas as pd

In [818]:
# These libraries will help us import JSON data:
import json
import urllib.request

In [819]:
# Go ahead and import VeRoViz
import veroviz as vrv
vrv.checkVersion()

'Your current installed version of veroviz is 0.3.1. You are up-to-date with the latest available version.'

In [820]:
# I like to use "environment" variables to store "private" stuff
# (like API keys, or paths to installed files).
# We'll need the `os` library for that:
import os

# See https://veroviz.org/documentation.html#installation for details

--- 

## 1. Import Station Info (from .json)
- These data are *mostly* static...certainly won't change throughout the course of a day.

In [821]:
# Here's one way to import JSON data.
# I'm leaving this here, because it will work with "GET" and "POST" requests,
# which we might use later this semester.
# Tim's approach (below) is a bit cleaner.
'''
import json
import urllib3

urllib3.disable_warnings()

http = urllib3.PoolManager()

response = http.request('GET', "https://gbfs.citibikenyc.com/gbfs/en/station_information.json")
station_info_data = json.loads(response.data.decode('utf-8'))
station_info_data
''';

# The trailing `;` keeps Jupyter from regurgitating our block comment

In [822]:
# Tim's approach for grabbing JSON data:
with urllib.request.urlopen("https://gbfs.citibikenyc.com/gbfs/en/station_information.json") as url:
    station_info_data = json.loads(url.read().decode())
#station_info_data
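Both Citi Bike feeds are fetched the same way, so the download step can be wrapped in a small helper. The following is only a sketch: the function name `fetch_json` and the 10-second `timeout` default are assumptions made for illustration and do not appear in the original notebook.

```python
import json
import urllib.request


def fetch_json(url, timeout=10.0):
    """Download `url` and parse the response body as JSON.

    `fetch_json` and its `timeout` default are illustrative choices,
    not part of the original notebook.
    """
    # A failed request raises an exception (for example urllib.error.URLError).
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires network access):
# station_info_data = fetch_json(
#     "https://gbfs.citibikenyc.com/gbfs/en/station_information.json")
```

The same helper could then be reused for the station status feed by passing the other URL.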

In [823]:
# station_info_data is a dictionary (which contains several sub-dictionaries).
# Get a list of keys within the station_info_data['data'] dictionary:
station_info_data['data'].keys()

dict_keys(['stations'])

In [824]:
# How many stations are there?
len(station_info_data['data']['stations'])

935

In [825]:
# Convert the JSON data into a Pandas dataframe:
station_info_df = pd.DataFrame(station_info_data['data']['stations'])
station_info_df.head()

Unnamed: 0,station_id,external_id,name,short_name,lat,lon,region_id,rental_methods,capacity,rental_url,electric_bike_surcharge_waiver,eightd_has_key_dispenser,eightd_station_services,has_kiosk
0,304,66db6da2-0aca-11e7-82f6-3863bb44ef7c,Broadway & Battery Pl,4962.01,40.704633,-74.013617,71,"[CREDITCARD, KEY]",33,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,True,[{'id': 'a58d9e34-2f28-40eb-b4a6-c8c01375657a'...,True
1,359,66dbc982-0aca-11e7-82f6-3863bb44ef7c,E 47 St & Park Ave,6584.12,40.755103,-73.974987,71,"[CREDITCARD, KEY]",64,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,False,[{'id': '2e104e31-606a-44af-8b25-ceaffc338489'...,True
2,367,66dbcdfc-0aca-11e7-82f6-3863bb44ef7c,E 53 St & Lexington Ave,6617.09,40.758281,-73.970694,71,"[CREDITCARD, KEY]",34,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,False,[{'id': '2d9a5c9e-50e0-4aed-a63b-91ca81e7d2c0'...,True
3,402,66dbf0d0-0aca-11e7-82f6-3863bb44ef7c,Broadway & E 22 St,6098.07,40.740343,-73.989551,71,"[CREDITCARD, KEY]",39,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,False,[{'id': '37a1ae1b-3dd6-4876-8c57-572aaac97981'...,True
4,3443,66de8a86-0aca-11e7-82f6-3863bb44ef7c,W 52 St & 6 Ave,6740.01,40.76133,-73.97982,71,"[CREDITCARD, KEY]",41,http://app.citibikenyc.com/S6Lr/IBV092JufD?sta...,False,False,[{'id': '286d75b2-088f-4a79-bf7d-223928be711c'...,True


In [826]:
station_info_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 935 entries, 0 to 934
Data columns (total 14 columns):
station_id                        935 non-null object
external_id                       935 non-null object
name                              935 non-null object
short_name                        935 non-null object
lat                               935 non-null float64
lon                               935 non-null float64
region_id                         935 non-null int64
rental_methods                    935 non-null object
capacity                          935 non-null int64
rental_url                        935 non-null object
electric_bike_surcharge_waiver    935 non-null bool
eightd_has_key_dispenser          935 non-null bool
eightd_station_services           6 non-null object
has_kiosk                         935 non-null bool
dtypes: bool(3), float64(2), int64(2), object(7)
memory usage: 83.2+ KB


---

## 2.  Get Station Status Data (from .json)
- These data may change frequently.  Under the GBFS specification, the feed's `ttl` field (10 in the output below) is the number of seconds until the data are next refreshed.

In [827]:
# Using Murray's old approach:
'''
response = http.request('GET', "https://gbfs.citibikenyc.com/gbfs/en/station_status.json")
station_status_data = json.loads(response.data.decode('utf-8'))
station_status_data
''';

In [828]:
# Tim's approach for grabbing JSON data:
with urllib.request.urlopen("https://gbfs.citibikenyc.com/gbfs/en/station_status.json") as url:
    station_status_data = json.loads(url.read().decode())
station_status_data

{'last_updated': 1582813822,
 'ttl': 10,
 'data': {'stations': [{'station_id': '304',
    'num_bikes_available': 11,
    'num_ebikes_available': 1,
    'num_bikes_disabled': 0,
    'num_docks_available': 22,
    'num_docks_disabled': 0,
    'is_installed': 1,
    'is_renting': 1,
    'is_returning': 1,
    'last_reported': 1582813735,
    'eightd_has_available_keys': True,
    'eightd_active_station_services': [{'id': 'a58d9e34-2f28-40eb-b4a6-c8c01375657a'}]},
   {'station_id': '359',
    'num_bikes_available': 21,
    'num_ebikes_available': 3,
    'num_bikes_disabled': 2,
    'num_docks_available': 41,
    'num_docks_disabled': 0,
    'is_installed': 1,
    'is_renting': 1,
    'is_returning': 1,
    'last_reported': 1582813593,
    'eightd_has_available_keys': False,
    'eightd_active_station_services': [{'id': '2e104e31-606a-44af-8b25-ceaffc338489'}]},
   {'station_id': '367',
    'num_bikes_available': 5,
    'num_ebikes_available': 0,
    'num_bikes_disabled': 0,
    'num_docks_

In [829]:
# Convert the data into a Pandas dataframe:
station_status_df = pd.DataFrame(station_status_data['data']['stations'])
station_status_df.head()

Unnamed: 0,station_id,num_bikes_available,num_ebikes_available,num_bikes_disabled,num_docks_available,num_docks_disabled,is_installed,is_renting,is_returning,last_reported,eightd_has_available_keys,eightd_active_station_services
0,304,11,1,0,22,0,1,1,1,1582813735,True,[{'id': 'a58d9e34-2f28-40eb-b4a6-c8c01375657a'}]
1,359,21,3,2,41,0,1,1,1,1582813593,False,[{'id': '2e104e31-606a-44af-8b25-ceaffc338489'}]
2,367,5,0,0,29,0,1,1,1,1582813748,False,[{'id': '2d9a5c9e-50e0-4aed-a63b-91ca81e7d2c0'}]
3,402,10,2,0,29,0,1,1,1,1582813797,False,[{'id': '37a1ae1b-3dd6-4876-8c57-572aaac97981'}]
4,3443,6,2,1,34,0,1,1,1,1582813759,False,[{'id': '286d75b2-088f-4a79-bf7d-223928be711c'}]


In [830]:
station_status_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 935 entries, 0 to 934
Data columns (total 12 columns):
station_id                        935 non-null object
num_bikes_available               935 non-null int64
num_ebikes_available              935 non-null int64
num_bikes_disabled                935 non-null int64
num_docks_available               935 non-null int64
num_docks_disabled                935 non-null int64
is_installed                      935 non-null int64
is_renting                        935 non-null int64
is_returning                      935 non-null int64
last_reported                     935 non-null int64
eightd_has_available_keys         935 non-null bool
eightd_active_station_services    6 non-null object
dtypes: bool(1), int64(9), object(2)
memory usage: 81.4+ KB
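The `last_updated` and `last_reported` fields are Unix timestamps (seconds since 1970-01-01 UTC). As a small sketch, the standard library can turn them into readable times and show how old a station's report is. The two values are copied from the output above; the helper name `to_utc` is made up for this example.

```python
from datetime import datetime, timezone


def to_utc(epoch_seconds):
    """Convert a Unix timestamp (in seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


feed_last_updated = 1582813822      # 'last_updated' of the feed
station_last_reported = 1582813735  # 'last_reported' of station 304

print(to_utc(feed_last_updated).isoformat())  # 2020-02-27T14:30:22+00:00

# Age of the station's report, relative to the feed timestamp (in seconds):
report_age_sec = feed_last_updated - station_last_reported
print(report_age_sec)  # 87
```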


--- 

## 3.  Import Trip Data (from .csv)
- We'll create a pandas dataframe from the data.
- See https://s3.amazonaws.com/tripdata/index.html for available datasets.

In [831]:
# I just randomly grabbed this file:
bike_trips_df = pd.read_csv('202001-citibike-tripdata.csv')

In [832]:
# bike_trips_df.columns

# Using `list()` formats things a little better:
list(bike_trips_df.columns)

['tripduration',
 'starttime',
 'stoptime',
 'start station id',
 'start station name',
 'start station latitude',
 'start station longitude',
 'end station id',
 'end station name',
 'end station latitude',
 'end station longitude',
 'bikeid',
 'usertype',
 'birth year',
 'gender']

In [833]:
bike_trips_df.head()

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender
0,789,2020-01-01 00:00:55.3900,2020-01-01 00:14:05.1470,504,1 Ave & E 16 St,40.732219,-73.981656,307,Canal St & Rutgers St,40.714275,-73.9899,30326,Subscriber,1992,1
1,1541,2020-01-01 00:01:08.1020,2020-01-01 00:26:49.1780,3423,West Drive & Prospect Park West,40.661063,-73.979453,3300,Prospect Park West & 8 St,40.665147,-73.976376,17105,Customer,1969,1
2,1464,2020-01-01 00:01:42.1400,2020-01-01 00:26:07.0110,3687,E 33 St & 1 Ave,40.743227,-73.974498,259,South St & Whitehall St,40.701221,-74.012342,40177,Subscriber,1963,1
3,592,2020-01-01 00:01:45.5610,2020-01-01 00:11:38.1550,346,Bank St & Hudson St,40.736529,-74.00618,490,8 Ave & W 33 St,40.751551,-73.993934,27690,Subscriber,1980,1
4,702,2020-01-01 00:01:45.7880,2020-01-01 00:13:28.2400,372,Franklin Ave & Myrtle Ave,40.694546,-73.958014,3637,Fulton St & Waverly Ave,40.683239,-73.965996,32583,Subscriber,1982,1


--- 

## Create a VeRoViz "nodes" Dataframe
- We'll populate this with data from Station Info and Station Status
- We'll also hard-code some columns

In [834]:
nodes = vrv.initDataframe('nodes')

In [835]:
# Here are the columns we'll need to populate:
list(nodes.columns)

['id',
 'lat',
 'lon',
 'altMeters',
 'nodeName',
 'nodeType',
 'leafletIconPrefix',
 'leafletIconType',
 'leafletColor',
 'leafletIconText',
 'cesiumIconType',
 'cesiumColor',
 'cesiumIconText']

In [836]:
# Here are the columns from our "Station Info":
list(station_info_df.columns)

['station_id',
 'external_id',
 'name',
 'short_name',
 'lat',
 'lon',
 'region_id',
 'rental_methods',
 'capacity',
 'rental_url',
 'electric_bike_surcharge_waiver',
 'eightd_has_key_dispenser',
 'eightd_station_services',
 'has_kiosk']

In [837]:
# An example to show the syntax for displaying 2 particular columns from a df:
station_info_df[['lat', 'lon']].head()

Unnamed: 0,lat,lon
0,40.704633,-74.013617
1,40.755103,-73.974987
2,40.758281,-73.970694
3,40.740343,-73.989551
4,40.76133,-73.97982


In [838]:
# Let's go ahead and re-initialize an empty dataframe within this cell:
nodes = vrv.initDataframe('nodes')

# Now, copy the relevant columns from our Station Info dataframe:
# NOTE: We were getting some size mis-match errors until we copied 
#       just a single column first.  
nodes['id'] = station_info_df['station_id'].values
nodes[['id', 'lat', 'lon', 'nodeName']] = station_info_df[['station_id', 'lat', 'lon', 'name']].values
nodes[['leafletIconText', 'cesiumIconText']] = station_info_df[['name', 'station_id']].values

# Finally, we'll fill in the rest of our nodes dataframe with some hard-coded/constant values:
nodes.loc[:,'altMeters'] = 0
nodes.loc[:,['nodeType', 'leafletIconPrefix', 'leafletIconType', 'leafletColor']] = [
             'CitiBikeStation',  'fa',                'bicycle',         'orange']
nodes.loc[:,['cesiumIconType', 'cesiumColor']] = ['pin', 'Cesium.Color.ORANGE']

HW 1) CHANGE THE COLOR OF THE NODES TO REPRESENT THE VOLUME OF THE STATIONS
---

1) Green - There are docks and bikes available

2) LightRed/Yellow - There are only docks available

3) Red - There are NO docks or bikes available
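One way to express a coloring rule like this is a small function that maps a station's bike and dock counts to a pair of colors. The sketch below mirrors the if/elif chain and the threshold of 5 used in this notebook's coloring cell, so it reflects that code rather than the exact wording of the list above. The names `classify_station` and `threshold` are invented for this example.

```python
def classify_station(num_bikes, num_docks, threshold=5):
    """Return (leaflet_color, cesium_color) for one station.

    Mirrors the if/elif chain used in this notebook; `classify_station`
    and `threshold` are names invented for this sketch.
    """
    if num_docks >= threshold and num_bikes >= threshold:
        return ('green', 'Cesium.Color.GREEN')
    if num_docks <= threshold:
        return ('lightred', 'Cesium.Color.YELLOW')
    # Remaining case: more than `threshold` docks, fewer than `threshold` bikes.
    return ('red', 'Cesium.Color.RED')


print(classify_station(num_bikes=11, num_docks=15))  # ('green', 'Cesium.Color.GREEN')
print(classify_station(num_bikes=19, num_docks=0))   # ('lightred', 'Cesium.Color.YELLOW')
print(classify_station(num_bikes=0, num_docks=17))   # ('red', 'Cesium.Color.RED')
```

With pandas, a function like this could be applied row by row, for example with `station_status_df.apply(..., axis=1)`.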

In [839]:
# insert Color here 

# Convert the data into a Pandas dataframe:
station_status_df = pd.DataFrame(station_status_data['data']['stations'])
station_status_df['colors'] = None
station_status_df['CESIUMcolors'] = None
station_status_df['OnRoute'] = None
station_status_df.head()


Unnamed: 0,station_id,num_bikes_available,num_ebikes_available,num_bikes_disabled,num_docks_available,num_docks_disabled,is_installed,is_renting,is_returning,last_reported,eightd_has_available_keys,eightd_active_station_services,colors,CESIUMcolors,OnRoute
0,304,11,1,0,22,0,1,1,1,1582813735,True,[{'id': 'a58d9e34-2f28-40eb-b4a6-c8c01375657a'}],,,
1,359,21,3,2,41,0,1,1,1,1582813593,False,[{'id': '2e104e31-606a-44af-8b25-ceaffc338489'}],,,
2,367,5,0,0,29,0,1,1,1,1582813748,False,[{'id': '2d9a5c9e-50e0-4aed-a63b-91ca81e7d2c0'}],,,
3,402,10,2,0,29,0,1,1,1,1582813797,False,[{'id': '37a1ae1b-3dd6-4876-8c57-572aaac97981'}],,,
4,3443,6,2,1,34,0,1,1,1,1582813759,False,[{'id': '286d75b2-088f-4a79-bf7d-223928be711c'}],,,


In [840]:
# Use .loc[row, column] for each assignment.  Chained indexing such as
# station_status_df['colors'][i] = ... may write to a temporary copy and
# triggers pandas' SettingWithCopyWarning.
for i in station_status_df.index:
    docks = station_status_df.loc[i, 'num_docks_available']
    bikes = station_status_df.loc[i, 'num_bikes_available']
    if docks >= 5 and bikes >= 5:
        station_status_df.loc[i, 'colors'] = 'green'
        station_status_df.loc[i, 'CESIUMcolors'] = 'Cesium.Color.GREEN'
    elif docks <= 5:
        station_status_df.loc[i, 'colors'] = 'lightred'
        station_status_df.loc[i, 'CESIUMcolors'] = 'Cesium.Color.YELLOW'
    elif bikes <= 5:
        station_status_df.loc[i, 'colors'] = 'red'
        station_status_df.loc[i, 'CESIUMcolors'] = 'Cesium.Color.RED'
station_status_df.tail(10)

Unnamed: 0,station_id,num_bikes_available,num_ebikes_available,num_bikes_disabled,num_docks_available,num_docks_disabled,is_installed,is_renting,is_returning,last_reported,eightd_has_available_keys,eightd_active_station_services,colors,CESIUMcolors,OnRoute
925,3905,11,0,1,15,0,1,1,1,1582813699,False,,green,Cesium.Color.GREEN,
926,3907,1,0,0,16,0,1,1,1,1582808341,False,,red,Cesium.Color.RED,
927,3909,3,0,0,17,0,1,1,1,1582797170,False,,red,Cesium.Color.RED,
928,3910,0,0,0,17,0,1,1,1,1582797296,False,,red,Cesium.Color.RED,
929,3911,11,0,0,11,0,1,1,1,1582813761,False,,green,Cesium.Color.GREEN,
930,3913,19,0,0,0,0,1,1,1,1582783965,False,,lightred,Cesium.Color.YELLOW,
931,3914,0,0,1,38,0,1,1,1,1582807501,False,,red,Cesium.Color.RED,
932,3916,28,0,1,16,0,1,1,1,1582813739,True,,green,Cesium.Color.GREEN,
933,3917,5,1,0,26,0,1,1,1,1582813529,False,,green,Cesium.Color.GREEN,
934,3918,2,1,0,28,0,1,1,1,1582813767,False,,red,Cesium.Color.RED,


In [841]:
# Let's go ahead and re-initialize an empty dataframe within this cell:
nodes = vrv.initDataframe('nodes')
# Now, copy the relevant columns from our Station Info dataframe:
# NOTE: We were getting some size mis-match errors until we copied
#       just a single column first.
nodes['id'] = station_info_df['station_id'].values
nodes[[ 'lat', 'lon', 'nodeName']] = station_info_df[[ 'lat', 'lon', 'name']].values
nodes[['leafletIconText', 'cesiumIconText']] = station_info_df[['name', 'station_id']].values
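# Copy the per-station colors computed above.
# NOTE: These assignments align on row index, so they rely on station_status_df
#       and station_info_df listing the stations in the same order.
nodes['leafletColor'] = station_status_df['colors']
nodes['cesiumColor'] = station_status_df['CESIUMcolors']
# Finally, we'll fill in the rest of our nodes dataframe with some hard-coded/constant values: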
nodes.loc[:,'altMeters'] = 0
nodes.loc[:,['nodeType', 'leafletIconPrefix', 'leafletIconType']] = [
             'Citibike Station', 'fa', 'bicycle']
nodes.loc[:,'cesiumIconType'] = 'pin'


HW 2) SET UP A NEW DATAFRAME FOR BIKE 14530 AND ONLY SHOW THOSE STATIONS
---

In [842]:
value_list=[14530]
bike141_info_df = bike_trips_df[bike_trips_df.bikeid.isin(value_list)]
bike141_info_df

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender
26822,548,2020-01-02 09:25:11.9390,2020-01-02 09:34:20.3600,261,Johnson St & Gold St,40.694749,-73.983625,2000,Front St & Washington St,40.702551,-73.989402,14530,Subscriber,1987,1
50614,1061,2020-01-02 18:16:45.1630,2020-01-02 18:34:26.5780,2000,Front St & Washington St,40.702551,-73.989402,3414,Bergen St & Flatbush Ave,40.680945,-73.975673,14530,Subscriber,1992,1
134556,400,2020-01-05 14:27:01.9750,2020-01-05 14:33:42.1210,3414,Bergen St & Flatbush Ave,40.680945,-73.975673,3486,Schermerhorn St & Bond St,40.688417,-73.984517,14530,Customer,1969,0
171702,301,2020-01-06 16:10:01.6600,2020-01-06 16:15:03.0080,3486,Schermerhorn St & Bond St,40.688417,-73.984517,241,DeKalb Ave & S Portland Ave,40.689810,-73.974931,14530,Subscriber,1985,1
175456,167,2020-01-06 17:15:51.0260,2020-01-06 17:18:38.1240,241,DeKalb Ave & S Portland Ave,40.689810,-73.974931,324,DeKalb Ave & Hudson Ave,40.689888,-73.981013,14530,Subscriber,1998,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1183146,1689,2020-01-30 18:30:48.6870,2020-01-30 18:58:58.0690,362,Broadway & W 37 St,40.751726,-73.987535,151,Cleveland Pl & Spring St,40.722104,-73.997249,14530,Customer,1995,1
1186951,388,2020-01-30 19:30:54.8530,2020-01-30 19:37:23.7710,151,Cleveland Pl & Spring St,40.722104,-73.997249,311,Norfolk St & Broome St,40.717227,-73.988021,14530,Subscriber,1973,1
1191584,884,2020-01-30 21:52:15.6920,2020-01-30 22:07:00.2830,311,Norfolk St & Broome St,40.717227,-73.988021,297,E 15 St & 3 Ave,40.734232,-73.986923,14530,Subscriber,1983,1
1200708,515,2020-01-31 08:27:54.4500,2020-01-31 08:36:30.1120,297,E 15 St & 3 Ave,40.734232,-73.986923,379,W 31 St & 7 Ave,40.749156,-73.991600,14530,Subscriber,1993,1


In [843]:
stationList1 = bike141_info_df['end station id'].unique().tolist()
stationList2 = bike141_info_df['start station id'].unique().tolist()
stationList1 = list(map(int, stationList1))

# bike141_info_df was created by filtering bike_trips_df, so take an explicit
# copy before adding columns; this avoids pandas' SettingWithCopyWarning.
bike141_info_df = bike141_info_df.copy()
bike141_info_df['colors'] = None
bike141_info_df['CESIUMcolors'] = None
bike141_info_df['capacity'] = None
bike141_info_df.tail()


Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender,colors,CESIUMcolors,capacity
1183146,1689,2020-01-30 18:30:48.6870,2020-01-30 18:58:58.0690,362,Broadway & W 37 St,40.751726,-73.987535,151,Cleveland Pl & Spring St,40.722104,-73.997249,14530,Customer,1995,1,,,
1186951,388,2020-01-30 19:30:54.8530,2020-01-30 19:37:23.7710,151,Cleveland Pl & Spring St,40.722104,-73.997249,311,Norfolk St & Broome St,40.717227,-73.988021,14530,Subscriber,1973,1,,,
1191584,884,2020-01-30 21:52:15.6920,2020-01-30 22:07:00.2830,311,Norfolk St & Broome St,40.717227,-73.988021,297,E 15 St & 3 Ave,40.734232,-73.986923,14530,Subscriber,1983,1,,,
1200708,515,2020-01-31 08:27:54.4500,2020-01-31 08:36:30.1120,297,E 15 St & 3 Ave,40.734232,-73.986923,379,W 31 St & 7 Ave,40.749156,-73.9916,14530,Subscriber,1993,1,,,
1201970,529,2020-01-31 08:41:55.1210,2020-01-31 08:50:44.3750,379,W 31 St & 7 Ave,40.749156,-73.9916,496,E 16 St & 5 Ave,40.737262,-73.99239,14530,Subscriber,1981,1,,,


In [844]:
# Let's go ahead and initialize an empty dataframe within this cell:
nodes2 = vrv.initDataframe('nodes')
# Now, copy the relevant columns from our bike 14530 trips dataframe
# (one node per trip, located at that trip's start station):
# NOTE: We were getting some size mismatch errors until we copied
#       just a single column first.
nodes2['id'] = bike141_info_df['start station id'].values
nodes2[['lat', 'lon', 'nodeName']] = bike141_info_df[[ 'start station latitude', 'start station longitude', 'start station name']].values
nodes2[['leafletIconText', 'cesiumIconText']] = bike141_info_df[[ 'start station name', 'start station id']].values
# NOTE: These two assignments align on row index (0, 1, 2, ...), not on station ID,
#       so each node receives the color stored in the same row position of
#       station_status_df rather than the color of its own station.
nodes2['leafletColor'] = station_status_df['colors']
nodes2['cesiumColor'] = station_status_df['CESIUMcolors']
# Finally, we'll fill in the rest of our nodes dataframe with some hard-coded/constant values:
nodes2.loc[:,'altMeters'] = 0
nodes2.loc[:,['nodeType', 'leafletIconPrefix', 'leafletIconType']] = [
             'Citibike Station', 'fa', 'bicycle']
nodes2.loc[:,'cesiumIconType'] = 'pin'
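Station IDs appear as numbers in the trip data but as strings in the status feed, so looking up a station's color by ID needs a type conversion. The sketch below shows that lookup with plain dictionaries. The sample records, their colors, and the `'gray'` fallback are invented for illustration and are not taken from the live feed.

```python
# Hypothetical status records, shaped like rows of station_status_df.
status_records = [
    {'station_id': '261', 'colors': 'green'},
    {'station_id': '2000', 'colors': 'red'},
]

# Build a lookup table keyed on the (string) station ID.
color_by_station = {rec['station_id']: rec['colors'] for rec in status_records}

# Start-station IDs as they appear in the trip data (numeric).
trip_station_ids = [261, 2000, 3414]

# Convert each ID to a string before the lookup.  Stations missing from the
# status records get a placeholder color chosen only for this sketch.
node_colors = [color_by_station.get(str(sid), 'gray') for sid in trip_station_ids]
print(node_colors)  # ['green', 'red', 'gray']
```

In pandas, the same idea can be written with `Series.map` and a dictionary keyed on station ID.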

In [845]:
nodes2.head()

Unnamed: 0,id,lat,lon,altMeters,nodeName,nodeType,leafletIconPrefix,leafletIconType,leafletColor,leafletIconText,cesiumIconType,cesiumColor,cesiumIconText
0,261,40.694749,-73.983625,0,Johnson St & Gold St,Citibike Station,fa,bicycle,green,Johnson St & Gold St,pin,Cesium.Color.GREEN,261
1,2000,40.702551,-73.989402,0,Front St & Washington St,Citibike Station,fa,bicycle,green,Front St & Washington St,pin,Cesium.Color.GREEN,2000
2,3414,40.680945,-73.975673,0,Bergen St & Flatbush Ave,Citibike Station,fa,bicycle,green,Bergen St & Flatbush Ave,pin,Cesium.Color.GREEN,3414
3,3486,40.688417,-73.984517,0,Schermerhorn St & Bond St,Citibike Station,fa,bicycle,green,Schermerhorn St & Bond St,pin,Cesium.Color.GREEN,3486
4,241,40.68981,-73.974931,0,DeKalb Ave & S Portland Ave,Citibike Station,fa,bicycle,green,DeKalb Ave & S Portland Ave,pin,Cesium.Color.GREEN,241


In [846]:
# Show all of the nodes on a Leaflet map:
vrv.createLeaflet(nodes=nodes2)

--- 

## Create a VeRoViz "assignments" Dataframe
- We'll populate this with trip data
- We'll also hard-code some columns

In [847]:
# NOTE:  VeRoViz also has an "arcs" dataframe,
#        but it doesn't have time-related columns.
arcs = vrv.initDataframe('arcs')
list(arcs.columns)

# We won't use the "arcs" dataframe

['odID',
 'objectID',
 'startLat',
 'startLon',
 'endLat',
 'endLon',
 'leafletColor',
 'leafletWeight',
 'leafletStyle',
 'leafletOpacity',
 'useArrows',
 'cesiumColor',
 'cesiumWeight',
 'cesiumStyle',
 'cesiumOpacity']

In [848]:
# Initialize an empty "assignments" dataframe:
assignments = vrv.initDataframe('assignments')
assignments.info()

<class 'pandas.core.frame.DataFrame'>
Index: 0 entries
Data columns (total 22 columns):
odID              0 non-null object
objectID          0 non-null object
modelFile         0 non-null object
modelScale        0 non-null object
modelMinPxSize    0 non-null object
startTimeSec      0 non-null object
startLat          0 non-null object
startLon          0 non-null object
startAltMeters    0 non-null object
endTimeSec        0 non-null object
endLat            0 non-null object
endLon            0 non-null object
endAltMeters      0 non-null object
leafletColor      0 non-null object
leafletWeight     0 non-null object
leafletStyle      0 non-null object
leafletOpacity    0 non-null object
useArrows         0 non-null object
cesiumColor       0 non-null object
cesiumWeight      0 non-null object
cesiumStyle       0 non-null object
cesiumOpacity     0 non-null object
dtypes: object(22)
memory usage: 0.0+ bytes


### Here's the plan:
- These columns will come directly from bike trip data:
    - `objectID` (from `bikeid`)
    - `startLat` and `startLon` (from `start station latitude` and `start station longitude`)
    - `endLat` and `endLon` (from `end station latitude` and `end station longitude`)
- These columns will need to be calculated:
    - `startTimeSec` (from `starttime`, but converted to "seconds since the first event")
    - `endTimeSec`   (from `starttime` and `tripduration`, or `starttime` and `stoptime`)
    - We'll create some new columns in `bike_trips_df` to hold our calculations.  Then we'll copy these calculated columns into our assignments dataframe.
- This column will need to be auto generated:
    - `odID` (each origin/destination pair should get a unique integer)
- The remaining columns will be hard-coded (for now)
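The two calculated columns in the plan above can be checked by hand on a few rows. The sketch below uses only the standard library and the first three trips shown earlier; the variable names are illustrative and it is not the pandas code used in the notebook.

```python
from datetime import datetime

TIME_FORMAT = '%Y-%m-%d %H:%M:%S.%f'

# (starttime, tripduration) for the first three rows of the trip data.
trips = [
    ('2020-01-01 00:00:55.3900', 789),
    ('2020-01-01 00:01:08.1020', 1541),
    ('2020-01-01 00:01:42.1400', 1464),
]

start_times = [datetime.strptime(start, TIME_FORMAT) for start, _ in trips]
first_start = min(start_times)

schedule = []
for start_time, (_, duration_sec) in zip(start_times, trips):
    # Whole seconds since the first observed start time.
    start_sec = int((start_time - first_start).total_seconds())
    # End time is the start offset plus the trip duration (in seconds).
    end_sec = start_sec + duration_sec
    schedule.append((start_sec, end_sec))

print(schedule)  # [(0, 789), (12, 1553), (46, 1510)]
```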

In [849]:
# What is the first start time in our bike_trips_df?
min(bike_trips_df['starttime'])

'2020-01-01 00:00:55.3900'

In [850]:
# Add a new column to bike_trips_df...

# This next command will produce a Timedelta (days HH:MM:SS.ms)
# showing the time since the first observed `starttime`:
bike_trips_df['timeAfterStart'] = pd.to_datetime(bike_trips_df['starttime']) - \
                                  pd.to_datetime(min(bike_trips_df['starttime']))

# Now, convert this to a whole number of seconds (fractions are truncated):
bike_trips_df['timeAfterStart'] = bike_trips_df['timeAfterStart'].dt.total_seconds().astype(int)

bike_trips_df['timeAfterStart'].head()

0     0
1    12
2    46
3    50
4    50
Name: timeAfterStart, dtype: int32

In [851]:
# Just for fun, here are the time differences between start/stop times:
pd.to_datetime(bike_trips_df['stoptime']) - pd.to_datetime(bike_trips_df['starttime'])

0         00:13:09.757000
1         00:25:41.076000
2         00:24:24.871000
3         00:09:52.594000
4         00:11:42.452000
                ...      
1240591   00:26:27.607000
1240592   00:03:42.831000
1240593   00:02:43.862000
1240594   00:05:27.148000
1240595   00:08:04.146000
Length: 1240596, dtype: timedelta64[ns]

In [852]:
# In one cell, we'll create our assignments dataframe.

# Make sure we're starting with an empty dataframe:
assignments = vrv.initDataframe('assignments')

# Copy over the static values.
# We'll start by copying a single column, to avoid the size mis-match issue:
assignments['objectID'] = bike_trips_df['bikeid']
assignments[['startLat', 'startLon', 'endLat', 'endLon']] = bike_trips_df[['start station latitude', 
                                                                          'start station longitude',
                                                                          'end station latitude',
                                                                          'end station longitude']].values

# Copy our new calculated column:
assignments['startTimeSec'] = bike_trips_df['timeAfterStart'].values

# Use the calculated column and tripduration to get the end time (in seconds):
assignments['endTimeSec'] = (bike_trips_df['timeAfterStart'] + bike_trips_df['tripduration']).values

# Fill in the rest of our assignments df with some hard-coded values:
# (we'll probably want to revisit this later)
assignments.loc[:,['modelFile', 'modelScale', 'modelMinPxSize', 'startAltMeters', 'endAltMeters', 
                   'leafletColor', 'leafletWeight', 'leafletStyle', 'leafletOpacity', 'useArrows',
                   'cesiumColor', 'cesiumWeight', 'cesiumStyle', 'cesiumOpacity']] = \
                  ['veroviz/models/car_blue.gltf', 100, 45, 0, 0, 
                   'blue', 2, 'solid', 0.8, False, 
                   'Cesium.Color.BLUE', 2, 'solid', 0.7]

# Finally (for now), let's generate a unique odID value for each row.
# This will make sense only if we assume that each row corresponds to a specific
# O/D pair.  Conversely, if we have turn-by-turn arcs, we'll need to group
# multiple rows into the same O/D pair.  We'll tackle that case if/when 
# we encounter it.
assignments.loc[:,'odID'] = list(range(0, len(assignments)))

In [853]:
# Display what we've created:
assignments.head()

Unnamed: 0,odID,objectID,modelFile,modelScale,modelMinPxSize,startTimeSec,startLat,startLon,startAltMeters,endTimeSec,...,endAltMeters,leafletColor,leafletWeight,leafletStyle,leafletOpacity,useArrows,cesiumColor,cesiumWeight,cesiumStyle,cesiumOpacity
0,0,30326,veroviz/models/car_blue.gltf,100,45,0,40.732219,-73.981656,0,789,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
1,1,17105,veroviz/models/car_blue.gltf,100,45,12,40.661063,-73.979453,0,1553,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
2,2,40177,veroviz/models/car_blue.gltf,100,45,46,40.743227,-73.974498,0,1510,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
3,3,27690,veroviz/models/car_blue.gltf,100,45,50,40.736529,-74.00618,0,642,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
4,4,32583,veroviz/models/car_blue.gltf,100,45,50,40.694546,-73.958014,0,752,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7


--- 

### Create a Leaflet map 
- We have a lot of bikes...let's just display one.

In [854]:
# I'll just choose the bike with the smallest ID number:
assignments[assignments['objectID'] == min(assignments['objectID'])]

Unnamed: 0,odID,objectID,modelFile,modelScale,modelMinPxSize,startTimeSec,startLat,startLon,startAltMeters,endTimeSec,...,endAltMeters,leafletColor,leafletWeight,leafletStyle,leafletOpacity,useArrows,cesiumColor,cesiumWeight,cesiumStyle,cesiumOpacity
26822,26822,14530,veroviz/models/car_blue.gltf,100,45,120256,40.694749,-73.983625,0,120804,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
50614,50614,14530,veroviz/models/car_blue.gltf,100,45,152149,40.702551,-73.989402,0,153210,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
134556,134556,14530,veroviz/models/car_blue.gltf,100,45,397566,40.680945,-73.975673,0,397966,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
171702,171702,14530,veroviz/models/car_blue.gltf,100,45,490146,40.688417,-73.984517,0,490447,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
175456,175456,14530,veroviz/models/car_blue.gltf,100,45,494095,40.689810,-73.974931,0,494262,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1183146,1183146,14530,veroviz/models/car_blue.gltf,100,45,2572193,40.751726,-73.987535,0,2573882,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
1186951,1186951,14530,veroviz/models/car_blue.gltf,100,45,2575799,40.722104,-73.997249,0,2576187,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
1191584,1191584,14530,veroviz/models/car_blue.gltf,100,45,2584280,40.717227,-73.988021,0,2585164,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7
1200708,1200708,14530,veroviz/models/car_blue.gltf,100,45,2622419,40.734232,-73.986923,0,2622934,...,0,blue,2,solid,0.8,False,Cesium.Color.BLUE,2,solid,0.7


In [855]:
# Show all of the arcs for this particular bike:
vrv.createLeaflet(arcs=assignments[assignments['objectID'] == 14530])

In [856]:
# Add a new column to bike_trips_df...

# This next command will produce a "timestamp" (days HH:MM:SS.ms) 
# showing the time since the first observed `starttime`:
bike141_info_df['timeAfterStart'] = pd.to_datetime(bike141_info_df['starttime']) - \
                                  pd.to_datetime(min(bike141_info_df['starttime']))

# Now, convert this to a whole number of seconds (any fraction is truncated):
bike141_info_df['timeAfterStart'] = bike141_info_df['timeAfterStart'].dt.total_seconds().astype(int)

bike141_info_df['timeAfterStart'].head()

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  if __name__ == '__main__':


26822          0
50614      31893
134556    277310
171702    369889
175456    373839
Name: timeAfterStart, dtype: int32
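
The elapsed-time calculation above can be checked in isolation. The following is a minimal sketch, assuming only pandas; it reuses the first three `starttime` strings from the table above, and the name `trips` is a stand-in for the real dataframe. Parsing the column once and calling `.min()` on the parsed values avoids taking `min()` over raw strings.

```python
import pandas as pd

# Three start times copied from the trips table for bike 14530.
trips = pd.DataFrame({
    'starttime': ['2020-01-02 09:25:11.9390',
                  '2020-01-02 18:16:45.1630',
                  '2020-01-05 14:27:01.9750'],
})

# Parse the strings once, then subtract the earliest start time.
# The result is a Series of Timedelta values.
start = pd.to_datetime(trips['starttime'])
elapsed = start - start.min()

# total_seconds() returns floats; astype(int) truncates the fraction.
trips['timeAfterStart'] = elapsed.dt.total_seconds().astype(int)
print(trips)
```

Building `trips` as its own dataframe (rather than as a slice of a larger one) is also one way to keep the `SettingWithCopyWarning` from appearing.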

HW 3) ADD IN NEW CITIBIKE MOVES WITH FOR LOOP
---

In [857]:
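# Mark every observed trip as green (for both Leaflet and Cesium).
# Using .loc[row, column] avoids chained indexing:
for i in bike141_info_df.index:
    bike141_info_df.loc[i, 'colors'] = 'green'
    bike141_info_df.loc[i, 'CESIUMcolors'] = 'Cesium.Color.GREEN'
bike141_info_df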

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  exec(code_obj, self.user_global_ns, self.user_ns)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  This is separate from the ipykernel package so we can avoid doing imports until


Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender,colors,CESIUMcolors,capacity,timeAfterStart
26822,548,2020-01-02 09:25:11.9390,2020-01-02 09:34:20.3600,261,Johnson St & Gold St,40.694749,-73.983625,2000,Front St & Washington St,40.702551,-73.989402,14530,Subscriber,1987,1,green,Cesium.Color.GREEN,,0
50614,1061,2020-01-02 18:16:45.1630,2020-01-02 18:34:26.5780,2000,Front St & Washington St,40.702551,-73.989402,3414,Bergen St & Flatbush Ave,40.680945,-73.975673,14530,Subscriber,1992,1,green,Cesium.Color.GREEN,,31893
134556,400,2020-01-05 14:27:01.9750,2020-01-05 14:33:42.1210,3414,Bergen St & Flatbush Ave,40.680945,-73.975673,3486,Schermerhorn St & Bond St,40.688417,-73.984517,14530,Customer,1969,0,green,Cesium.Color.GREEN,,277310
171702,301,2020-01-06 16:10:01.6600,2020-01-06 16:15:03.0080,3486,Schermerhorn St & Bond St,40.688417,-73.984517,241,DeKalb Ave & S Portland Ave,40.689810,-73.974931,14530,Subscriber,1985,1,green,Cesium.Color.GREEN,,369889
175456,167,2020-01-06 17:15:51.0260,2020-01-06 17:18:38.1240,241,DeKalb Ave & S Portland Ave,40.689810,-73.974931,324,DeKalb Ave & Hudson Ave,40.689888,-73.981013,14530,Subscriber,1998,1,green,Cesium.Color.GREEN,,373839
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1183146,1689,2020-01-30 18:30:48.6870,2020-01-30 18:58:58.0690,362,Broadway & W 37 St,40.751726,-73.987535,151,Cleveland Pl & Spring St,40.722104,-73.997249,14530,Customer,1995,1,green,Cesium.Color.GREEN,,2451936
1186951,388,2020-01-30 19:30:54.8530,2020-01-30 19:37:23.7710,151,Cleveland Pl & Spring St,40.722104,-73.997249,311,Norfolk St & Broome St,40.717227,-73.988021,14530,Subscriber,1973,1,green,Cesium.Color.GREEN,,2455542
1191584,884,2020-01-30 21:52:15.6920,2020-01-30 22:07:00.2830,311,Norfolk St & Broome St,40.717227,-73.988021,297,E 15 St & 3 Ave,40.734232,-73.986923,14530,Subscriber,1983,1,green,Cesium.Color.GREEN,,2464023
1200708,515,2020-01-31 08:27:54.4500,2020-01-31 08:36:30.1120,297,E 15 St & 3 Ave,40.734232,-73.986923,379,W 31 St & 7 Ave,40.749156,-73.991600,14530,Subscriber,1993,1,green,Cesium.Color.GREEN,,2502162


In [858]:
%%time
moved = 0
bike141_info_df = bike141_info_df.reset_index(drop=True)
for i in range(len(bike141_info_df) - 1):
    # The bike was repositioned if trip i ended at a different station than
    # the one where trip i+1 started (and trip i+1 did not start earlier):
    if (bike141_info_df['end station id'][i] != bike141_info_df['start station id'][i+1]
            and bike141_info_df['stoptime'][i] <= bike141_info_df['starttime'][i+1]):
        moved += 1
        # Describe the repositioning move as a new (red) row:
        data = [{'tripduration': 28800,  # hard-coded: 8 hours, in seconds
                 'starttime': bike141_info_df['stoptime'][i],
                 'stoptime': bike141_info_df['starttime'][i+1],
                 'start station id': bike141_info_df['end station id'][i],
                 'start station name': None,
                 'start station latitude': bike141_info_df['end station latitude'][i],
                 'start station longitude': bike141_info_df['end station longitude'][i],
                 'end station id': bike141_info_df['start station id'][i+1],
                 'end station name': None,
                 'end station latitude': bike141_info_df['start station latitude'][i+1],
                 'end station longitude': bike141_info_df['start station longitude'][i+1],
                 'bikeid': bike141_info_df['bikeid'][i+1],
                 'usertype': None,
                 'birth year': None,
                 'gender': None,
                 # NOTE: this stores the next trip's duration,
                 # not the number of seconds since the first start time.
                 'timeAfterStart': bike141_info_df['tripduration'][i+1],
                 'colors': 'red',
                 'CESIUMcolors': 'Cesium.Color.RED',
                 'capacity': None}]

        bike141_info_df = bike141_info_df.append(data)
        
print(moved)
#12,000 bikes

4
Wall time: 36.1 ms
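
`DataFrame.append()` was deprecated in pandas 1.4 and removed in pandas 2.0, and calling it inside a loop copies the whole dataframe on every pass. A common alternative is to collect the new rows in a plain list and concatenate once at the end. The following is a minimal sketch of that pattern, not the notebook's own code; the dataframe `trips` and its station ids are made-up toy values, and only two of the real columns are used.

```python
import pandas as pd

# Toy trips for one bike (illustrative station ids only).
# Trip 0 ends at 2000 but trip 1 starts at 3414, so one move is implied.
trips = pd.DataFrame({
    'start station id': [261, 3414, 3486],
    'end station id':   [2000, 3486, 241],
})

new_rows = []  # plain dicts are cheap to collect in a list
for i in range(len(trips) - 1):
    end_here = trips['end station id'].iloc[i]
    next_start = trips['start station id'].iloc[i + 1]
    if end_here != next_start:
        # The bike was moved between trips: record a "red" row.
        new_rows.append({'start station id': end_here,
                         'end station id': next_start,
                         'colors': 'red'})

# Build the extra rows once and concatenate a single time.
trips_with_moves = pd.concat([trips, pd.DataFrame(new_rows)],
                             ignore_index=True)
print(len(new_rows), len(trips_with_moves))
```

Because `ignore_index=True` renumbers the rows, no separate `reset_index()` call is needed afterwards.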


In [859]:
assignments2 = vrv.initDataframe('assignments')

# Copy over the static values.
# We'll start by copying a single column, to avoid the size mis-match issue:
assignments2['objectID'] = bike141_info_df['bikeid']
assignments2[['startLat', 'startLon', 'endLat', 'endLon']] = bike141_info_df[['start station latitude', 
                                                                          'start station longitude',
                                                                          'end station latitude',
                                                                          'end station longitude']].values

# Copy our new calculated column:
assignments2['startTimeSec'] = bike141_info_df['timeAfterStart'].values

# Use the calculated column and tripduration to get the end time (in seconds):
assignments2['endTimeSec'] = (bike141_info_df['timeAfterStart'] + bike141_info_df['tripduration']).values
assignments2['cesiumColor'] = bike141_info_df['CESIUMcolors'].values
assignments2['leafletColor'] = bike141_info_df['colors'].values

# Fill in the rest of our assignments df with some hard-coded values:
# (we'll probably want to revisit this later)
assignments2.loc[:,['modelFile', 'modelScale', 'modelMinPxSize', 'startAltMeters', 'endAltMeters', 
                    'leafletWeight', 'leafletStyle', 'leafletOpacity', 'useArrows',
                    'cesiumWeight', 'cesiumStyle', 'cesiumOpacity']] = \
                  ['veroviz/models/UB_Truck.gltf', 100, 45, 0, 0, 
                    2, 'solid', 0.8, False, 
                    2, 'solid', 0.7]

# Finally (for now), let's generate a unique odID value for each row.
# This will make sense only if we assume that each row corresponds to a specific
# O/D pair.  Conversely, if we have turn-by-turn arcs, we'll need to group
# multiple rows into the same O/D pair.  We'll tackle that case if/when 
# we encounter it.
assignments2.loc[:,'odID'] = list(range(0, len(assignments2)))

In [860]:
vrv.createLeaflet(arcs=assignments2)

In [861]:
vrv.createLeaflet(arcs=assignments2[assignments2['leafletColor'] == 'red'])

In [862]:
# startDate: Format is "YYYY-MM-DD"
startDate = pd.to_datetime(min(bike_trips_df['starttime'])).strftime('%Y-%m-%d')

# startTime: Format is "HH:MM:SS"
startTime = pd.to_datetime(min(bike_trips_df['starttime'])).strftime('%H:%M:%S')

vrv.createCesium(
    assignments = assignments2,
    nodes       = nodes2,
    startDate   = startDate,
    startTime   = startTime,
    cesiumDir   = os.environ['CESIUMDIR'],
    problemDir  = 'IE_670/citibike_example')

Message: File selector was written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/;IE_670;citibike_example.vrv ...
Message: Configs were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/config.js ...
Message: Nodes were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/displayNodes.js ...
Message: Assignments (.js) were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/displayPaths.js ...
Message: Assignments (.czml) were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/routes.czml ...


HW 4) ADD IN NEW CITIBIKE MOVES WITHOUT FOR LOOP
---

In [868]:
%%timeit
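# Select every trip made by one bike (bikeid 14530):
value_list = [14530]
bike141_info_df2 = bike_trips_df[bike_trips_df.bikeid.isin(value_list)]

# Make a second dataframe that is meant to hold "the next trip" in each row.
# Caution: `.loc[1:, :]` slices by index label rather than by position, and the
# new columns below are matched to `bike141_info_df2` by index label as well.
B_bike141_info_df = bike141_info_df2.loc[1:, :].reset_index(drop = True)
B_bike141_info_df[['start station latitude', 'start station longitude']]

# Create new columns in the original dataframe: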
bike141_info_df2['next start lat'] = 0
bike141_info_df2['next start lon'] = 0
bike141_info_df2['repositioning'] = 0
bike141_info_df2['nextStartTime'] = 0
bike141_info_df2['next start lat'] = B_bike141_info_df['start station latitude']
bike141_info_df2['next start lon'] = B_bike141_info_df['start station longitude']
bike141_info_df2['nextStartTime'] = B_bike141_info_df['timeAfterStart']
#Calculate if repositioning happens
bike141_info_df2.loc[(bike141_info_df2['next start lat'] != bike141_info_df2['end station latitude']), 'repositioning'] = 1
bike141_info_df2.iloc[-1, bike141_info_df2.columns.get_loc('repositioning')] = 0
bike141_info_df2.loc[bike141_info_df2['repositioning'] == 1]
bike141_info_df2

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  if __name__ == '__main__':
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  # Remove the CWD from sys.path while we load stuff.
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  # This is added back by InteractiveShellApp.init_path()
A value is trying to be set on a copy of a slice from a DataFrame.
Try us

10.9 ms ± 452 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
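
One index-safe way to compare each trip with the one that follows it is `Series.shift(-1)`, which works by row position and therefore does not depend on the index labels left over from filtering. The following is a minimal sketch under that assumption, not the code timed above; `trips` and its station ids are toy values, and only the station-id comparison is shown.

```python
import pandas as pd

# Toy trips for one bike, with non-consecutive index labels
# (as happens after filtering a larger dataframe).
trips = pd.DataFrame({
    'start station id': [261, 3414, 3486, 324],
    'end station id':   [2000, 3486, 241, 379],
}, index=[26822, 50614, 134556, 171702])

# shift(-1) moves every value up one row, so row i sees the
# start station of trip i+1.  The last row becomes NaN.
next_start = trips['start station id'].shift(-1)

# A repositioning happened if the next trip starts somewhere else.
# notna() excludes the last trip, which has no following trip.
trips['repositioning'] = (
    next_start.notna() & (next_start != trips['end station id'])
).astype(int)
print(trips)
```

The same `shift(-1)` call can be applied to the latitude, longitude, and start-time columns to carry the next trip's values into each row.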


In [876]:
# Keep only the trips that were flagged as being followed by a repositioning move:
repo = bike141_info_df2[bike141_info_df2['repositioning'] == 1]
repo = repo.reset_index(drop=True)

# Append one red row per flagged trip:
for i in range(len(repo)):
    data = [{'tripduration': pd.to_datetime(repo['stoptime'][i]),
             'starttime': repo['stoptime'][i],
             'stoptime': repo['nextStartTime'][i],
             'start station id': repo['end station id'][i],
             'start station name': None,
             'start station latitude': repo['end station latitude'][i],
             'start station longitude': repo['end station longitude'][i],
             'end station id': repo['start station id'][i],
             'end station name': None,
             'end station latitude': repo['start station latitude'][i],
             'end station longitude': repo['start station longitude'][i],
             'bikeid': repo['bikeid'][i],
             'usertype': None,
             'birth year': None,
             'gender': None,
             'timeAfterStart': repo['timeAfterStart'][i] + repo['tripduration'][i],
             'colors': 'red',
             'CESIUMcolors': 'Cesium.Color.RED',
             'capacity': None}]
    bike141_info_df2 = bike141_info_df2.append(data)


Time Comparison - 10.9 ms without vs. 36.1 ms with a for loop
---

--- 

### Create a Cesium movie for one bike

In [870]:
# Use this command to get documentation on the `createCesium()` function:
#vrv.createCesium?
bike141_info_df= bike141_info_df.reset_index(drop=True)
bike141_info_df

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender,colors,CESIUMcolors,capacity,timeAfterStart
0,548,2020-01-02 09:25:11.9390,2020-01-02 09:34:20.3600,261,Johnson St & Gold St,40.694749,-73.983625,2000,Front St & Washington St,40.702551,-73.989402,14530,Subscriber,1987,1,green,Cesium.Color.GREEN,,0
1,1061,2020-01-02 18:16:45.1630,2020-01-02 18:34:26.5780,2000,Front St & Washington St,40.702551,-73.989402,3414,Bergen St & Flatbush Ave,40.680945,-73.975673,14530,Subscriber,1992,1,green,Cesium.Color.GREEN,,31893
2,400,2020-01-05 14:27:01.9750,2020-01-05 14:33:42.1210,3414,Bergen St & Flatbush Ave,40.680945,-73.975673,3486,Schermerhorn St & Bond St,40.688417,-73.984517,14530,Customer,1969,0,green,Cesium.Color.GREEN,,277310
3,301,2020-01-06 16:10:01.6600,2020-01-06 16:15:03.0080,3486,Schermerhorn St & Bond St,40.688417,-73.984517,241,DeKalb Ave & S Portland Ave,40.689810,-73.974931,14530,Subscriber,1985,1,green,Cesium.Color.GREEN,,369889
4,167,2020-01-06 17:15:51.0260,2020-01-06 17:18:38.1240,241,DeKalb Ave & S Portland Ave,40.689810,-73.974931,324,DeKalb Ave & Hudson Ave,40.689888,-73.981013,14530,Subscriber,1998,1,green,Cesium.Color.GREEN,,373839
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
73,529,2020-01-31 08:41:55.1210,2020-01-31 08:50:44.3750,379,W 31 St & 7 Ave,40.749156,-73.991600,496,E 16 St & 5 Ave,40.737262,-73.992390,14530,Subscriber,1981,1,green,Cesium.Color.GREEN,,2503003
74,28800,2020-01-06 21:59:37.9540,2020-01-07 13:23:12.3180,3232,,40.689622,-73.983043,3377,,40.678612,-73.990373,14530,,,,red,Cesium.Color.RED,,370
75,28800,2020-01-16 09:48:57.9800,2020-01-16 12:00:41.2330,280,,40.733320,-73.995101,326,,40.729538,-73.984267,14530,,,,red,Cesium.Color.RED,,288
76,28800,2020-01-16 12:05:29.6630,2020-01-17 09:15:27.4700,3812,,40.734814,-73.992085,504,,40.732219,-73.981656,14530,,,,red,Cesium.Color.RED,,377


HW 5) ADD STATIC ASSIGNMENTS TO THE DATABASE
---

In [906]:
bike141_static_info_df = pd.DataFrame(columns= ['tripduration','starttime', 'stoptime', 'start station id', 'start station name', 'start station latitude','start station longitude', 'end station id', 'end station name', 'end station latitude', 'end station longitude','bikeid','usertype','birth year','gender', 'timeAfterStart','colors','CESIUMcolors','Cesium.Color.RED','capacity'])

# Between consecutive rows the bike is idle: from the stop time of row i
# until the start time of row i+1.  Build one "static" row per gap.
for i in range(len(bike141_info_df.index)-1):
    data = [{'tripduration': 0,  # filled in later from the timestamps
             'starttime': bike141_info_df['stoptime'][i],
             'stoptime': bike141_info_df['starttime'][i+1],
             'start station id': bike141_info_df['end station id'][i],
             'start station name': None,
             'start station latitude': bike141_info_df['end station latitude'][i],
             'start station longitude': bike141_info_df['end station longitude'][i],
             'end station id': bike141_info_df['start station id'][i+1],
             'end station name': None,
             'end station latitude': bike141_info_df['start station latitude'][i+1],
             'end station longitude': bike141_info_df['start station longitude'][i+1],
             'bikeid': bike141_info_df['bikeid'][i+1],
             'usertype': None,
             'birth year': None,
             'gender': None,
             'timeAfterStart': 0,  # filled in later from the timestamps
             'colors': 'red',
             'CESIUMcolors': 'Cesium.Color.RED',
             'capacity': None}]
    bike141_static_info_df = bike141_static_info_df.append(data)
    
bike141_static_info_df= bike141_static_info_df.reset_index(drop=True)
bike141_static_info_df    

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender,timeAfterStart,colors,CESIUMcolors,Cesium.Color.RED,capacity
0,0,2020-01-02 09:34:20.3600,2020-01-02 18:16:45.1630,2000,,40.702551,-73.989402,2000,,40.702551,-73.989402,14530,,,,0,red,Cesium.Color.RED,,
1,0,2020-01-02 18:34:26.5780,2020-01-05 14:27:01.9750,3414,,40.680945,-73.975673,3414,,40.680945,-73.975673,14530,,,,0,red,Cesium.Color.RED,,
2,0,2020-01-05 14:33:42.1210,2020-01-06 16:10:01.6600,3486,,40.688417,-73.984517,3486,,40.688417,-73.984517,14530,,,,0,red,Cesium.Color.RED,,
3,0,2020-01-06 16:15:03.0080,2020-01-06 17:15:51.0260,241,,40.689810,-73.974931,241,,40.689810,-73.974931,14530,,,,0,red,Cesium.Color.RED,,
4,0,2020-01-06 17:18:38.1240,2020-01-06 17:24:58.6320,324,,40.689888,-73.981013,324,,40.689888,-73.981013,14530,,,,0,red,Cesium.Color.RED,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
72,0,2020-01-31 08:36:30.1120,2020-01-31 08:41:55.1210,379,,40.749156,-73.991600,379,,40.749156,-73.991600,14530,,,,0,red,Cesium.Color.RED,,
73,0,2020-01-31 08:50:44.3750,2020-01-06 21:59:37.9540,496,,40.737262,-73.992390,3232,,40.689622,-73.983043,14530,,,,0,red,Cesium.Color.RED,,
74,0,2020-01-07 13:23:12.3180,2020-01-16 09:48:57.9800,3377,,40.678612,-73.990373,280,,40.733320,-73.995101,14530,,,,0,red,Cesium.Color.RED,,
75,0,2020-01-16 12:00:41.2330,2020-01-16 12:05:29.6630,326,,40.729538,-73.984267,3812,,40.734814,-73.992085,14530,,,,0,red,Cesium.Color.RED,,


In [907]:
bike141_static_info_df['timeAfterStart'] = pd.to_datetime(bike141_static_info_df['starttime']) - \
                                  pd.to_datetime(min(bike141_static_info_df['starttime']))

# Now, convert this to a whole number of seconds (any fraction is truncated):
bike141_static_info_df['timeAfterStart'] = bike141_static_info_df['timeAfterStart'].dt.total_seconds().astype(int)

bike141_static_info_df['tripduration'] = pd.to_datetime(bike141_static_info_df['stoptime']) - \
                                  pd.to_datetime(bike141_static_info_df['starttime'])
bike141_static_info_df['tripduration'] = bike141_static_info_df['tripduration'].dt.total_seconds().astype(int)

bike141_static_info_df

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender,timeAfterStart,colors,CESIUMcolors,Cesium.Color.RED,capacity
0,31344,2020-01-02 09:34:20.3600,2020-01-02 18:16:45.1630,2000,,40.702551,-73.989402,2000,,40.702551,-73.989402,14530,,,,0,red,Cesium.Color.RED,,
1,244355,2020-01-02 18:34:26.5780,2020-01-05 14:27:01.9750,3414,,40.680945,-73.975673,3414,,40.680945,-73.975673,14530,,,,32406,red,Cesium.Color.RED,,
2,92179,2020-01-05 14:33:42.1210,2020-01-06 16:10:01.6600,3486,,40.688417,-73.984517,3486,,40.688417,-73.984517,14530,,,,277161,red,Cesium.Color.RED,,
3,3648,2020-01-06 16:15:03.0080,2020-01-06 17:15:51.0260,241,,40.689810,-73.974931,241,,40.689810,-73.974931,14530,,,,369642,red,Cesium.Color.RED,,
4,380,2020-01-06 17:18:38.1240,2020-01-06 17:24:58.6320,324,,40.689888,-73.981013,324,,40.689888,-73.981013,14530,,,,373457,red,Cesium.Color.RED,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
72,325,2020-01-31 08:36:30.1120,2020-01-31 08:41:55.1210,379,,40.749156,-73.991600,379,,40.749156,-73.991600,14530,,,,2502129,red,Cesium.Color.RED,,
73,-2112666,2020-01-31 08:50:44.3750,2020-01-06 21:59:37.9540,496,,40.737262,-73.992390,3232,,40.689622,-73.983043,14530,,,,2502984,red,Cesium.Color.RED,,
74,764745,2020-01-07 13:23:12.3180,2020-01-16 09:48:57.9800,3377,,40.678612,-73.990373,280,,40.733320,-73.995101,14530,,,,445731,red,Cesium.Color.RED,,
75,288,2020-01-16 12:00:41.2330,2020-01-16 12:05:29.6630,326,,40.729538,-73.984267,3812,,40.734814,-73.992085,14530,,,,1218380,red,Cesium.Color.RED,,


In [908]:
bike141_static_info_df=bike141_static_info_df.drop(bike141_static_info_df.index[73])
bike141_static_info_df

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender,timeAfterStart,colors,CESIUMcolors,Cesium.Color.RED,capacity
0,31344,2020-01-02 09:34:20.3600,2020-01-02 18:16:45.1630,2000,,40.702551,-73.989402,2000,,40.702551,-73.989402,14530,,,,0,red,Cesium.Color.RED,,
1,244355,2020-01-02 18:34:26.5780,2020-01-05 14:27:01.9750,3414,,40.680945,-73.975673,3414,,40.680945,-73.975673,14530,,,,32406,red,Cesium.Color.RED,,
2,92179,2020-01-05 14:33:42.1210,2020-01-06 16:10:01.6600,3486,,40.688417,-73.984517,3486,,40.688417,-73.984517,14530,,,,277161,red,Cesium.Color.RED,,
3,3648,2020-01-06 16:15:03.0080,2020-01-06 17:15:51.0260,241,,40.689810,-73.974931,241,,40.689810,-73.974931,14530,,,,369642,red,Cesium.Color.RED,,
4,380,2020-01-06 17:18:38.1240,2020-01-06 17:24:58.6320,324,,40.689888,-73.981013,324,,40.689888,-73.981013,14530,,,,373457,red,Cesium.Color.RED,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
71,37254,2020-01-30 22:07:00.2830,2020-01-31 08:27:54.4500,297,,40.734232,-73.986923,297,,40.734232,-73.986923,14530,,,,2464359,red,Cesium.Color.RED,,
72,325,2020-01-31 08:36:30.1120,2020-01-31 08:41:55.1210,379,,40.749156,-73.991600,379,,40.749156,-73.991600,14530,,,,2502129,red,Cesium.Color.RED,,
74,764745,2020-01-07 13:23:12.3180,2020-01-16 09:48:57.9800,3377,,40.678612,-73.990373,280,,40.733320,-73.995101,14530,,,,445731,red,Cesium.Color.RED,,
75,288,2020-01-16 12:00:41.2330,2020-01-16 12:05:29.6630,326,,40.729538,-73.984267,3812,,40.734814,-73.992085,14530,,,,1218380,red,Cesium.Color.RED,,


In [909]:
bike141_static_info_df= bike141_static_info_df.reset_index(drop=True)
bike141_static_info_df

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender,timeAfterStart,colors,CESIUMcolors,Cesium.Color.RED,capacity
0,31344,2020-01-02 09:34:20.3600,2020-01-02 18:16:45.1630,2000,,40.702551,-73.989402,2000,,40.702551,-73.989402,14530,,,,0,red,Cesium.Color.RED,,
1,244355,2020-01-02 18:34:26.5780,2020-01-05 14:27:01.9750,3414,,40.680945,-73.975673,3414,,40.680945,-73.975673,14530,,,,32406,red,Cesium.Color.RED,,
2,92179,2020-01-05 14:33:42.1210,2020-01-06 16:10:01.6600,3486,,40.688417,-73.984517,3486,,40.688417,-73.984517,14530,,,,277161,red,Cesium.Color.RED,,
3,3648,2020-01-06 16:15:03.0080,2020-01-06 17:15:51.0260,241,,40.689810,-73.974931,241,,40.689810,-73.974931,14530,,,,369642,red,Cesium.Color.RED,,
4,380,2020-01-06 17:18:38.1240,2020-01-06 17:24:58.6320,324,,40.689888,-73.981013,324,,40.689888,-73.981013,14530,,,,373457,red,Cesium.Color.RED,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
71,37254,2020-01-30 22:07:00.2830,2020-01-31 08:27:54.4500,297,,40.734232,-73.986923,297,,40.734232,-73.986923,14530,,,,2464359,red,Cesium.Color.RED,,
72,325,2020-01-31 08:36:30.1120,2020-01-31 08:41:55.1210,379,,40.749156,-73.991600,379,,40.749156,-73.991600,14530,,,,2502129,red,Cesium.Color.RED,,
73,764745,2020-01-07 13:23:12.3180,2020-01-16 09:48:57.9800,3377,,40.678612,-73.990373,280,,40.733320,-73.995101,14530,,,,445731,red,Cesium.Color.RED,,
74,288,2020-01-16 12:00:41.2330,2020-01-16 12:05:29.6630,326,,40.729538,-73.984267,3812,,40.734814,-73.992085,14530,,,,1218380,red,Cesium.Color.RED,,


In [910]:
assignmentsS = vrv.initDataframe('assignments')
for i in range(len(bike141_static_info_df)):
    assignmentsS = vrv.addStaticAssignment(
        initAssignments = assignmentsS, 
        odID            = i+77, 
        objectID        = 'bike',
        modelFile       = 'veroviz/models/UB_Truck.gltf', 
        modelScale      = 100, 
        modelMinPxSize  = 45, 
        loc             = [bike141_static_info_df['start station latitude'][i] , bike141_static_info_df['start station longitude'][i], 0],
        startTimeSec    = bike141_static_info_df['timeAfterStart'][i],
        endTimeSec      = bike141_static_info_df['timeAfterStart'][i] + bike141_static_info_df['tripduration'][i])

In [911]:
assignments2=assignments2.append(assignmentsS)
assignments2

Unnamed: 0,odID,objectID,modelFile,modelScale,modelMinPxSize,startTimeSec,startLat,startLon,startAltMeters,endTimeSec,...,endAltMeters,leafletColor,leafletWeight,leafletStyle,leafletOpacity,useArrows,cesiumColor,cesiumWeight,cesiumStyle,cesiumOpacity
0,0,14530,/////veroviz/models/UB_Truck.gltf,100,45,0,40.694749,-73.983625,0,548,...,0,green,2,solid,0.8,False,Cesium.Color.GREEN,2,solid,0.7
1,1,14530,/veroviz/models/UB_Truck.gltf,100,45,31893,40.702551,-73.989402,0,32954,...,0,green,2,solid,0.8,False,Cesium.Color.GREEN,2,solid,0.7
2,2,14530,/veroviz/models/UB_Truck.gltf,100,45,277310,40.680945,-73.975673,0,277710,...,0,green,2,solid,0.8,False,Cesium.Color.GREEN,2,solid,0.7
3,3,14530,/veroviz/models/UB_Truck.gltf,100,45,369889,40.688417,-73.984517,0,370190,...,0,green,2,solid,0.8,False,Cesium.Color.GREEN,2,solid,0.7
4,4,14530,/veroviz/models/UB_Truck.gltf,100,45,373839,40.689810,-73.974931,0,374006,...,0,green,2,solid,0.8,False,Cesium.Color.GREEN,2,solid,0.7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
71,148,bike,/veroviz/models/UB_Truck.gltf,100,45,2464359,40.734232,-73.986923,0,2501613,...,0,,,,,,,,,
72,149,bike,/veroviz/models/UB_Truck.gltf,100,45,2502129,40.749156,-73.991600,0,2502454,...,0,,,,,,,,,
73,150,bike,/veroviz/models/UB_Truck.gltf,100,45,445731,40.678612,-73.990373,0,1210476,...,0,,,,,,,,,
74,151,bike,/veroviz/models/UB_Truck.gltf,100,45,1218380,40.729538,-73.984267,0,1218668,...,0,,,,,,,,,


In [913]:
# This works, but a handful of data points show two overlapping cars.
# startDate: Format is "YYYY-MM-DD"
startDate = pd.to_datetime(min(bike141_info_df['starttime'])).strftime('%Y-%m-%d')

# startTime: Format is "HH:MM:SS"
startTime = pd.to_datetime(min(bike141_info_df['starttime'])).strftime('%H:%M:%S')

vrv.createCesium(
    assignments = assignments2,
    nodes       = nodes2,
    startDate   = startDate,
    startTime   = startTime,
    cesiumDir   = os.environ['CESIUMDIR'],
    problemDir  = 'IE_670/citibike_example')

Message: File selector was written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/;IE_670;citibike_example.vrv ...
Message: Configs were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/config.js ...
Message: Nodes were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/displayNodes.js ...
Message: Assignments (.js) were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/displayPaths.js ...
Message: Assignments (.czml) were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/routes.czml ...


--- 

#### Playing around with dates/times
- Here's some code related to formatting dates/times.  There might be something useful here in the future...

In [664]:
pd.to_datetime(bike_trips_df['starttime']).dt.date

0          2020-01-01
1          2020-01-01
2          2020-01-01
3          2020-01-01
4          2020-01-01
              ...    
1240591    2020-01-31
1240592    2020-01-31
1240593    2020-01-31
1240594    2020-01-31
1240595    2020-01-31
Name: starttime, Length: 1240596, dtype: object

In [665]:
pd.to_datetime(min(bike_trips_df['starttime'])).strftime('%Y-%m-%d')

'2020-01-01'

In [666]:
pd.to_datetime(min(bike_trips_df['starttime'])).strftime('%H:%M:%S')

'00:00:55'

--- 

### NYC Subway Stations

- A list of subway stations may be found here:
    - http://web.mta.info/developers/data/nyct/subway/Stations.csv 

- Other links:
    - http://web.mta.info/developers/index.html
    - http://datamine.mta.info/list-of-feeds 
    
Ideas:
- For a given location, find the nearest subway station.
- For a given destination, find the nearest **available** CitiBike station.
- For a given O/D pair, determine the best combination of subways/bikes to use.
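
The first idea (nearest subway station) can be sketched with the haversine formula and Python's built-in `min()`. This is a minimal illustration, not a finished implementation: the helper names `haversine_km` and `nearest_station` are hypothetical, the three stations are taken from the MTA `Stations.csv` file linked above, and the query point is a made-up location.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))  # 6371 km = mean Earth radius

def nearest_station(lat, lon, stations):
    """Return the station dict closest to (lat, lon)."""
    return min(stations,
               key=lambda s: haversine_km(lat, lon, s['lat'], s['lon']))

# Three subway stations from the MTA station list.
stations = [
    {'name': 'Astoria - Ditmars Blvd', 'lat': 40.775036, 'lon': -73.912034},
    {'name': 'Astoria Blvd',           'lat': 40.770258, 'lon': -73.917843},
    {'name': '30 Av',                  'lat': 40.766779, 'lon': -73.921479},
]

# Made-up query point, a short walk from the 30 Av station.
result = nearest_station(40.7665, -73.9210, stations)
print(result['name'])
```

For the "nearest **available** CitiBike station" idea, the same `min()` call could be run over a list that has first been filtered using the Station Status data.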


In [667]:
trains_df = pd.read_csv('http://web.mta.info/developers/data/nyct/subway/Stations.csv')
trains_df.head()


Unnamed: 0,Station ID,Complex ID,GTFS Stop ID,Division,Line,Stop Name,Borough,Daytime Routes,Structure,GTFS Latitude,GTFS Longitude,North Direction Label,South Direction Label
0,1,1,R01,BMT,Astoria,Astoria - Ditmars Blvd,Q,N W,Elevated,40.775036,-73.912034,,Manhattan
1,2,2,R03,BMT,Astoria,Astoria Blvd,Q,N W,Elevated,40.770258,-73.917843,Ditmars Blvd,Manhattan
2,3,3,R04,BMT,Astoria,30 Av,Q,N W,Elevated,40.766779,-73.921479,Astoria - Ditmars Blvd,Manhattan
3,4,4,R05,BMT,Astoria,Broadway,Q,N W,Elevated,40.76182,-73.925508,Astoria - Ditmars Blvd,Manhattan
4,5,5,R06,BMT,Astoria,36 Av,Q,N W,Elevated,40.756804,-73.929575,Astoria - Ditmars Blvd,Manhattan


In [668]:
# Let's go ahead and re-initialize an empty dataframe within this cell:
trainNodes = vrv.initDataframe('nodes')

# Now, copy the relevant columns from our subway stations dataframe (trains_df):
# NOTE: We were getting some size mis-match errors until we copied 
#       just a single column first.  
trainNodes['id'] = trains_df['Station ID'].values
trainNodes[['id', 'lat', 'lon', 'nodeName']] = trains_df[['Station ID', 'GTFS Latitude', 'GTFS Longitude', 'Stop Name']].values
trainNodes[['leafletIconText', 'cesiumIconText']] = trains_df[['Stop Name', 'Station ID']].values

# Finally, we'll fill in the rest of our nodes dataframe with some hard-coded/constant values:
trainNodes.loc[:,'altMeters'] = 0
trainNodes.loc[:,['nodeType', 'leafletIconPrefix', 'leafletIconType', 'leafletColor']] = [
             'SubwayStation',    'fa',                'train',         'blue']
trainNodes.loc[:,['cesiumIconType', 'cesiumColor']] = ['pin', 'Cesium.Color.BLUE']

In [44]:
vrv.createLeaflet(nodes=trainNodes)

In [90]:
from math import cos, asin, sqrt
station_info_df['lat'] = station_info_df['lat'].astype(float)
station_info_df['lon'] = station_info_df['lon'].astype(float)
bikes = pd.DataFrame(station_info_df[['lat', 'lon']])
trains=pd.DataFrame(trains_df[['GTFS Latitude', 'GTFS Longitude']])


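def distance(lat1, lon1, lat2, lon2):
    # Haversine distance (in km) between two points given in decimal degrees.
    p = 0.017453292519943295   # pi/180, converts degrees to radians
    a = 0.5 - cos((lat2-lat1)*p)/2 + cos(lat1*p)*cos(lat2*p) * (1-cos((lon2-lon1)*p)) / 2
    return 12742 * asin(sqrt(a))   # 12742 km = 2 * Earth's radius (6371 km)

def closest(data, v):
    # Return the entry of `data` (dicts with 'lat'/'lon' keys) nearest to point `v`.
    return min(data, key=lambda p: distance(v['lat'],v['lon'],p['lat'],p['lon']))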



In [87]:
from math import radians, cos, sin, asin, sqrt

def stationClose(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians 
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r

center_point = [{'lat': -7.7940023, 'lng': 110.3656535}]
test_point = [{'lat': -7.79457, 'lng': 110.36563}]

lat1 = center_point[0]['lat']
lon1 = center_point[0]['lng']
lat2 = test_point[0]['lat']
lon2 = test_point[0]['lng']

radius = 1.00 # in kilometer

a = stationClose(lon1, lat1, lon2, lat2)

#print('Distance (km) : ', a)
#if a <= radius:
#    return 1
#else:
#    return 0

In [103]:
# Count the (bike station, subway station) pairs that are within 1 km of each other:
closeStations = 0
bikes = pd.DataFrame(station_info_df[['lat', 'lon']])
trains = pd.DataFrame(trains_df[['GTFS Latitude', 'GTFS Longitude']])

for i in range(len(bikes)):
    for j in range(len(trains)):
        if stationClose(bikes['lon'].iloc[i], bikes['lat'].iloc[i],
                        trains['GTFS Longitude'].iloc[j], trains['GTFS Latitude'].iloc[j]) <= 1:
            closeStations += 1

print(closeStations)

#457141

6619
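
The nested loop above evaluates the distance one pair at a time. With NumPy broadcasting, the same haversine formula can be applied to every bike/subway pair at once. The following is a minimal sketch, assuming NumPy is available; `haversine_km` is a hypothetical helper, the two bike stations use coordinates from the trips table, and two of the three "subway" points are made-up locations placed next to them for illustration.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs in decimal degrees (arrays allowed)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * np.arcsin(np.sqrt(a))

# Two bike stations (Johnson St & Gold St, Broadway & W 37 St).
bike_lat = np.array([40.694749, 40.751726])
bike_lon = np.array([-73.983625, -73.987535])

# Three "subway" points: a made-up point near the first bike station,
# Astoria - Ditmars Blvd, and a made-up point near the second bike station.
train_lat = np.array([40.6950, 40.775036, 40.7520])
train_lon = np.array([-73.9840, -73.912034, -73.9880])

# Shapes (2, 1) and (1, 3) broadcast to a (2, 3) distance matrix:
# one row per bike station, one column per subway point.
dist_km = haversine_km(bike_lat[:, None], bike_lon[:, None],
                       train_lat[None, :], train_lon[None, :])

close = dist_km <= 1.0          # boolean matrix of pairs within 1 km
closeStations = int(close.sum())
print(closeStations)
```

With the real data, the column values could be passed in via `.values` on the relevant dataframe columns, and `np.argwhere(close)` would list which pairs are within the radius.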


In [898]:
dataSet = pd.DataFrame(columns= ['tripduration','starttime', 'stoptime', 'start station id', 'start station name', 'start station latitude','start station longitude', 'end station id', 'end station name', 'end station latitude', 'end station longitude','bikeid','usertype','birth year','gender', 'timeAfterStart','colors','CESIUMcolors','Cesium.Color.RED','capacity'])
dataSet1 = bike141_info_df.loc[[1,2] , :]
dataSet=dataSet.append(dataSet1)
dataSet2 = bike141_info_df.loc[[3,4] , :]
dataSet = dataSet.append(dataSet2)
dataSet3 = bike141_static_info_df.loc[[1,2] , :]
dataSet = dataSet.append(dataSet3)
dataSet4 = bike141_static_info_df.loc[[3,4] , :]
dataSet = dataSet.append(dataSet4)
# dataSet.append(dataSet4)
dataSet

Unnamed: 0,CESIUMcolors,Cesium.Color.RED,bikeid,birth year,capacity,colors,end station id,end station latitude,end station longitude,end station name,gender,start station id,start station latitude,start station longitude,start station name,starttime,stoptime,timeAfterStart,tripduration,usertype
1,Cesium.Color.GREEN,,14530,1992.0,,green,3414,40.680945,-73.975673,Bergen St & Flatbush Ave,1.0,2000,40.702551,-73.989402,Front St & Washington St,2020-01-02 18:16:45.1630,2020-01-02 18:34:26.5780,31893,1061,Subscriber
2,Cesium.Color.GREEN,,14530,1969.0,,green,3486,40.688417,-73.984517,Schermerhorn St & Bond St,0.0,3414,40.680945,-73.975673,Bergen St & Flatbush Ave,2020-01-05 14:27:01.9750,2020-01-05 14:33:42.1210,277310,400,Customer
3,Cesium.Color.GREEN,,14530,1985.0,,green,241,40.68981,-73.974931,DeKalb Ave & S Portland Ave,1.0,3486,40.688417,-73.984517,Schermerhorn St & Bond St,2020-01-06 16:10:01.6600,2020-01-06 16:15:03.0080,369889,301,Subscriber
4,Cesium.Color.GREEN,,14530,1998.0,,green,324,40.689888,-73.981013,DeKalb Ave & Hudson Ave,1.0,241,40.68981,-73.974931,DeKalb Ave & S Portland Ave,2020-01-06 17:15:51.0260,2020-01-06 17:18:38.1240,373839,167,Subscriber
1,Cesium.Color.RED,,14530,,,red,3414,40.680945,-73.975673,,,3414,40.680945,-73.975673,,2020-01-02 18:34:26.5780,2020-01-05 14:27:01.9750,153211,244355,
2,Cesium.Color.RED,,14530,,,red,3486,40.688417,-73.984517,,,3486,40.688417,-73.984517,,2020-01-05 14:33:42.1210,2020-01-06 16:10:01.6600,397966,92179,
3,Cesium.Color.RED,,14530,,,red,241,40.68981,-73.974931,,,241,40.68981,-73.974931,,2020-01-06 16:15:03.0080,2020-01-06 17:15:51.0260,490447,3648,
4,Cesium.Color.RED,,14530,,,red,324,40.689888,-73.981013,,,324,40.689888,-73.981013,,2020-01-06 17:18:38.1240,2020-01-06 17:24:58.6320,494262,380,
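
A note for anyone re-running this on a newer pandas: `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0. A minimal sketch of the same stacking with `pd.concat` follows. The frames `trips` and `idle` are hypothetical stand-ins for `bike141_info_df` and `bike141_static_info_df`, holding only a few of the columns.

```python
import pandas as pd

# Toy stand-ins (hypothetical) with the same index labels 1..4 used above:
trips = pd.DataFrame({'bikeid': [14530] * 4,
                      'tripduration': [1061, 400, 301, 167],
                      'colors': 'green'}, index=[1, 2, 3, 4])
idle = pd.DataFrame({'bikeid': [14530] * 4,
                     'tripduration': [244355, 92179, 3648, 380],
                     'colors': 'red'}, index=[1, 2, 3, 4])

# Stack the selected rows of both frames; the original index labels are kept,
# so labels 1..4 appear twice.
combined = pd.concat([trips.loc[[1, 2, 3, 4], :], idle.loc[[1, 2, 3, 4], :]])

# ignore_index=True renumbers the rows 0..7 instead.
combined_reindexed = pd.concat([trips, idle], ignore_index=True)

print(combined)
```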


In [902]:
# Let's go ahead and re-initialize an empty dataframe within this cell:
nodesTest = vrv.initDataframe('nodes')
# Now, copy the relevant columns from our Station Info dataframe:
# NOTE: We were getting some size mis-match errors until we copied
#       just a single column first.
nodesTest['id'] = dataSet['start station id'].values
nodesTest[['lat', 'lon', 'nodeName']] = dataSet[[ 'start station latitude', 'start station longitude', 'start station name']].values
nodesTest[['leafletIconText', 'cesiumIconText']] = dataSet[[ 'start station name', 'start station id']].values
# Finally, we'll fill in the rest of our nodes dataframe with some hard-coded/constant values:
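# NOTE: Assigning a Series (without `.values`) aligns on index labels, so these
#       colors come from the rows of station_status_df whose index labels match
#       those of nodesTest; they are not looked up by station id.
nodesTest['leafletColor'] = station_status_df['colors']
nodesTest['cesiumColor'] = station_status_df['CESIUMcolors']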
nodesTest.loc[:,'altMeters'] = 0
nodesTest.loc[:,['nodeType', 'leafletIconPrefix', 'leafletIconType']] = [
             'Citibike Station', 'fa', 'bicycle']
nodesTest.loc[:,'cesiumIconType'] = 'pin'
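
The cell above mixes two kinds of column assignment: some right-hand sides use `.values` and some are plain Series. The difference matters, because pandas aligns a Series on index labels, while a NumPy array is copied by position. A small sketch with toy frames (`status` and `nodes` are hypothetical, not the notebook's dataframes):

```python
import pandas as pd

# Source frame whose index labels (10, 11, 12) do not match the target's (0, 1, 2):
status = pd.DataFrame({'colors': ['red', 'green', 'blue']}, index=[10, 11, 12])
nodes = pd.DataFrame({'id': [2000, 3414, 3486]})

# Series assignment aligns on index labels; no labels match, so every value is NaN.
nodes['byLabel'] = status['colors']

# `.values` drops the index, so the values are copied row by row.
nodes['byPosition'] = status['colors'].values

print(nodes)
```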


In [903]:
assignmentsTest = vrv.initDataframe('assignments')

# Copy over the static values.
# We'll start by copying a single column, to avoid the size mis-match issue:
# (`.values` keeps dataSet's duplicated index labels 1..4 out of assignmentsTest)
assignmentsTest['objectID'] = dataSet['bikeid'].values
assignmentsTest[['startLat', 'startLon', 'endLat', 'endLon']] = dataSet[['start station latitude', 
                                                                          'start station longitude',
                                                                          'end station latitude',
                                                                          'end station longitude']].values

# Copy our new calculated column:
assignmentsTest['startTimeSec'] = dataSet['timeAfterStart'].values

# Use the calculated column and tripduration to get the end time (in seconds):
assignmentsTest['endTimeSec'] = (dataSet['timeAfterStart'] + dataSet['tripduration']).values
assignmentsTest['cesiumColor'] = dataSet['CESIUMcolors'].values
assignmentsTest['leafletColor'] = dataSet['colors'].values

# Fill in the rest of our assignments df with some hard-coded values:
# (we'll probably want to revisit this later)
assignmentsTest.loc[:,['modelFile', 'modelScale', 'modelMinPxSize', 'startAltMeters', 'endAltMeters', 
                    'leafletWeight', 'leafletStyle', 'leafletOpacity', 'useArrows',
                    'cesiumWeight', 'cesiumStyle', 'cesiumOpacity']] = \
                  ['veroviz/models/UB_Truck.gltf', 100, 45, 0, 0, 
                    2, 'solid', 0.8, False, 
                    2, 'solid', 0.7]

# Finally (for now), let's generate a unique odID value for each row.
# This will make sense only if we assume that each row corresponds to a specific
# O/D pair.  Conversely, if we have turn-by-turn arcs, we'll need to group
# multiple rows into the same O/D pair.  We'll tackle that case if/when 
# we encounter it.
assignmentsTest.loc[:,'odID'] = list(range(0, len(assignmentsTest)))
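
For reference, here is one way a seconds-based start/end pair like `startTimeSec`/`endTimeSec` could be derived from timestamp strings. This is a sketch under an assumption: the baseline is taken to be the earliest `starttime` in the toy frame, which is not necessarily the baseline behind the notebook's `timeAfterStart` column. The frame `trips` is hypothetical.

```python
import pandas as pd

# Toy frame (hypothetical) with two trips in the same timestamp format:
trips = pd.DataFrame({
    'starttime': ['2020-01-02 18:16:45.1630', '2020-01-05 14:27:01.9750'],
    'tripduration': [1061, 400],   # seconds
})

start = pd.to_datetime(trips['starttime'])
baseline = start.min()             # assumption: time zero is the earliest start

# Seconds elapsed since the baseline, then add the trip duration for the end time.
trips['startTimeSec'] = (start - baseline).dt.total_seconds()
trips['endTimeSec'] = trips['startTimeSec'] + trips['tripduration']

print(trips[['startTimeSec', 'endTimeSec']])
```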

In [905]:
# startDate: Format is "YYYY-MM-DD"
startDate = pd.to_datetime(min(dataSet['starttime'])).strftime('%Y-%m-%d')

# startTime: Format is "HH:MM:SS"
startTime = pd.to_datetime(min(dataSet['starttime'])).strftime('%H:%M:%S')

vrv.createCesium(
    assignments = assignmentsTest,
    nodes       = nodesTest,
    startDate   = startDate,
    startTime   = startTime,
    cesiumDir   = os.environ['CESIUMDIR'],
    problemDir  = 'IE_670/citibike_example')

Message: File selector was written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/;IE_670;citibike_example.vrv ...
Message: Configs were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/config.js ...
Message: Nodes were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/displayNodes.js ...
Message: Assignments (.js) were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/displayPaths.js ...
Message: Assignments (.czml) were written to C:/Users/kayli/Downloads/Cesium-1.63/IE_670/citibike_example/routes.czml ...
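
The date and time strings passed to `vrv.createCesium` above come from `strftime`. The sketch below shows what the two format strings produce for the first start time in our data. It also shows a gentler way to read the `CESIUMDIR` environment variable, since `os.environ['CESIUMDIR']` raises a `KeyError` when the variable is not set. The fallback message is just an illustration.

```python
import os
import pandas as pd

first_start = pd.to_datetime('2020-01-02 18:16:45.1630')

startDate = first_start.strftime('%Y-%m-%d')   # "YYYY-MM-DD"
startTime = first_start.strftime('%H:%M:%S')   # "HH:MM:SS" (fractional seconds dropped)
print(startDate, startTime)

# os.environ.get returns None instead of raising if the variable is missing.
cesiumDir = os.environ.get('CESIUMDIR')
if cesiumDir is None:
    print('CESIUMDIR is not set; see the VeRoViz installation notes.')
```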
