### 3 Location Data Review

#### 3.0 Data Description

The location data is contained in the **tbl_ttc_location_log** table. This data is obtained using the 'vehicleLocations' command. The *download_data.py* script is set up to download the complete location data every 30 seconds for the duration of the analysis period.



In [1]:
#import helperfunction.py
%run helperfunctions

import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import HTML
import psycopg2
import datetime
db = db_name

#uncomment if you want to analyse a different database from the one in your settings file
#db = 'ttctestii'

#### 3.1 tbl_ttc_location_log

Here *id* is the vehicle's unique identifier, and *dirtag* matches the *directiontag* from the route data reviewed in section 2. *timestamp* is the Unix epoch of the report in milliseconds (epoch seconds x1000), and *secssincereport* is the age of the reading, in seconds, at the reported time. We probably won't look too closely at *speedkmhr*.
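
As a minimal sketch of how these two columns combine, the snippet below subtracts the age of the reading from the report time to estimate when the position was actually recorded. It mirrors the arithmetic used in the SQL in section 3.2.1; the function name `actual_report_time` is only illustrative and is not part of *helperfunctions*.

```python
from datetime import datetime, timezone


def actual_report_time(timestamp_ms, secs_since_report):
    """Estimate when a reading was taken, as a UTC datetime.

    timestamp_ms: report time as a Unix epoch in milliseconds.
    secs_since_report: age of the reading in seconds at report time.
    """
    # Integer-divide to whole seconds, then step back by the reading's age.
    epoch_s = int(timestamp_ms) // 1000 - int(secs_since_report)
    return datetime.fromtimestamp(epoch_s, tz=timezone.utc)


# Values taken from the first row of the table output in this section.
print(actual_report_time(1573866299126, 16))
```

The values arrive as text in the SQL query (hence the `::bigint` casts there), so the sketch also converts its inputs with `int()`.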


In [2]:
tbl = 'tbl_ttc_location_log'
gdf_location_log = gettbl(tbl,db)
gdf_location_log.head()

| | dirtag | heading | id | lat | lon | predictable | routetag | secssincereport | speedkmhr | timestamp |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 89_0_89 | 136 | 1406 | 43.6776227 | -79.4725316 | True | 89 | 16 | 31 | 1573866299126 |
| 1 | 11_0_11A | 165 | 8126 | 43.7049325 | -79.3748807 | True | 11 | 17 | 0 | 1573866299126 |
| 2 | 32_0_32C | 72 | 1010 | 43.6944186 | -79.4558243 | True | 32 | 17 | 33 | 1573866299126 |
| 3 | 41_1_41 | 351 | 9102 | 43.6967391 | -79.4752791 | True | 41 | 16 | 0 | 1573866299126 |
| 4 | 133_1_133 | 69 | 3454 | 43.7745018 | -79.2583465 | True | 133 | 16 | 2 | 1573866299126 |


#### 3.2 Putting the Data Together

Next we want to use PostGIS to look at the data, put it into a more useful format, and combine it with the route information that we put together in section 2.

##### 3.2.1 Plotting the Location Data on the Route Lines

Here we take the position points (lat/lon) from section 3.1 above and plot them against the route line we created in section 2. In the example below we plot bus number 1010 on the 32C westbound route over time so we can observe its westerly progression.

In [2]:
sql = """
select dirtag, id, st_setsrid(st_makepoint(lon::float, lat::float), 4326) geom_position, routetag,
timestamp::bigint/1000 - secssincereport::bigint actual_time_epoch, 
fn_epoch_to_dt(timestamp::bigint/1000 - secssincereport::bigint) actual_time
from tbl_ttc_location_log;
"""
gdf_location = getsql_postgis(sql,'geom_position', db)


In [3]:
sql = 'select * from calc_stop_paths'
gdf_stoppaths = getsql_postgis(sql,'stop_path_geom', db)

%matplotlib agg
fig, ax = plt.subplots()
def animate(ts):
    ax.clear()
    gdf_stoppaths[(gdf_stoppaths['directiontag']=='32_1_32C')].plot(ax=ax,color='blue')
    gdf = gdf_location[(gdf_location['dirtag']=='32_1_32C') & (gdf_location['actual_time']==ts) & (gdf_location['id']=='1010')].sort_values(['actual_time'])
    gdf.plot(ax=ax,color='red')
    ax.set_title('Bus ID 1010 on the 32C West Route')
    ax.text(0.6,0.9,datetime.datetime.strftime(ts, '%b %d %I:%M:%S%p'),transform=ax.transAxes)
    ax.axis('equal')


In [4]:
%matplotlib inline

In [5]:
ts = gdf_location[(gdf_location['dirtag']=='32_1_32C') & (gdf_location['id']=='1010')].actual_time.tolist()
ani = animation.FuncAnimation(fig, animate, frames=ts)

In [8]:
ani.save('fig_01.gif',writer='imagemagick',fps=2)

<Figure size 432x288 with 0 Axes>

![fig_01.gif](fig_01.gif)

In [7]:
HTML(ani.to_jshtml(fps=2))

<Figure size 432x288 with 0 Axes>

##### 3.2.2 Joining Route and Vehicle Location Data

In the above section we showed how to take the vehicle location data and plot it on the same axes as the route data. While this makes for an interesting visual, to analyse the data we will want to join the route and vehicle location data together into one data set. This will allow us to know *where* the vehicle is on the route. If we know where the vehicle is on the route over time, we should be able to analyse things like the time between bus arrivals at stops, bus bunching, and average speed along individual route segments.

The 'calc_location_log_07' table is created in Part 2 of the *db_calculations.sql* script. You should have run that script in section *1 Analysis Setup*; if you haven't, do so before you proceed.

What that section of code does is:
* Standardises timestamps to global 30 second time intervals so that we can analyse all vehicles against the same schedule of data points.
* Projects the vehicle location onto the closest point of the route map to find the vehicle position on the map.
* For each datapoint, creates a line along the route path from the previous location to the current location for that vehicle.
* Interpolates missing data.
* Removes inconsistent data (removed records can be seen in the *calc_location_removes* table):
    * Datapoints more than 20 meters from the route line are removed. Any datapoint removed this way is estimated by interpolating between the preceding and following datapoints, provided they are less than 6 minutes apart. This also removes datapoints recorded during a detour, which is something we will have to address at a later date.
    * If a vehicle has been traveling in one direction for less than 5 minutes before switching to a new direction (direction being defined by the direction tag rather than physical heading), those datapoints are removed. At a later date we might analyse short turns, or cases where it looks like a vehicle has gone out of service mid-route, but for now we only look at data from vehicles that have covered a significant portion of the route.
    * Similarly, data is removed where a vehicle has covered less than 20% of the route before changing direction.
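
The timestamp standardisation and interpolation steps above can be sketched in plain Python as follows. This is only an illustration of the idea, not the code in *db_calculations.sql*: the function names are hypothetical, positions are assumed to be expressed as a fraction of the distance along the route, and snapping to the *nearest* 30 second mark is an assumption (the SQL script may round differently).

```python
INTERVAL_S = 30       # grid spacing described in the text
MAX_GAP_S = 6 * 60    # interpolation limit described in the text


def snap_to_grid(epoch_s, interval_s=INTERVAL_S):
    """Snap an epoch time in seconds to the nearest multiple of interval_s."""
    return (int(epoch_s) + interval_s // 2) // interval_s * interval_s


def interpolate_position(t, before, after, max_gap_s=MAX_GAP_S):
    """Linearly interpolate a route position at time t.

    before and after are (epoch_seconds, route_fraction) pairs for the
    known readings on either side of t. Returns None when t is outside
    the pair or the readings are not less than max_gap_s apart.
    """
    (t0, p0), (t1, p1) = before, after
    gap = t1 - t0
    if gap <= 0 or gap >= max_gap_s or not (t0 <= t <= t1):
        return None
    return p0 + (p1 - p0) * (t - t0) / gap


# A reading taken at 1573866283 lands on the 1573866270 grid point.
print(snap_to_grid(1573866283))

# Fill a missing grid point halfway between two known readings.
print(interpolate_position(1573869750, (1573869720, 0.75), (1573869780, 0.77)))
```

Putting every vehicle on the same 30 second grid is what makes it possible to compare vehicles at identical points in time later on.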


In [8]:
sql = 'select *, fn_epoch_to_dt(analysis_time) analysis_dt from calc_location_log_07'
gdfl = getsql_postgis(sql,'analysis_path',db)
gdfl.head()

| | id | analysis_time | dirtag | routetag | path_geom | lag_dirtag | next_dirtag | lag_analysis_time | st_path_pos | end_path_pos | flag_reversed | analysis_path | analysis_dt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1000 | 1573869840 | 32_0_32A | 32 | 0102000020E6100000B9000000EBFF1CE6CBE553C0F0F9... | 32_0_32A | | 1573869810 | 0.783007 | 0.790276 | 0 | LINESTRING (-79.43734247641889 43.698464591017... | 2019-11-16 02:04:00+00:00 |
| 1 | 1000 | 1573869810 | 32_0_32A | 32 | 0102000020E6100000B9000000EBFF1CE6CBE553C0F0F9... | 32_0_32A | 32_0_32A | 1573869780 | 0.774967 | 0.783007 | 0 | LINESTRING (-79.43894117236982 43.698070293810... | 2019-11-16 02:03:30+00:00 |
| 2 | 1000 | 1573869780 | 32_0_32A | 32 | 0102000020E6100000B9000000EBFF1CE6CBE553C0F0F9... | 32_0_32A | 32_0_32A | 1573869750 | 0.766421 | 0.774967 | 0 | LINESTRING (-79.4406489246731 43.6976883095897... | 2019-11-16 02:03:00+00:00 |
| 3 | 1000 | 1573869750 | 32_0_32A | 32 | 0102000020E6100000B9000000EBFF1CE6CBE553C0F0F9... | 32_0_32A | 32_0_32A | 1573869720 | 0.757388 | 0.766421 | 0 | LINESTRING (-79.44245868842802 43.697304260885... | 2019-11-16 02:02:30+00:00 |
| 4 | 1000 | 1573869720 | 32_0_32A | 32 | 0102000020E6100000B9000000EBFF1CE6CBE553C0F0F9... | 32_0_32A | 32_0_32A | 1573869690 | 0.75191 | 0.757388 | 0 | LINESTRING (-79.44355577516335 43.697070015855... | 2019-11-16 02:02:00+00:00 |


In [9]:
%matplotlib agg
fig, ax = plt.subplots()
def animate(ts):
    ax.clear()
    gdf_stoppaths[(gdf_stoppaths['directiontag']=='32_1_32C')].plot(ax=ax,color='blue')
    gdf = gdfl[(gdfl['dirtag']=='32_1_32C') & (gdfl['id']=='1010') & (gdfl['analysis_dt']==ts)].sort_values(['analysis_dt'],ascending=True)
    gdf.plot(ax=ax,color='red',linewidth=5, markersize=1)
    ax.set_title('Bus ID 1010 on the 32C West Route')
    ax.text(0.6,0.9,datetime.datetime.strftime(ts, '%b %d %I:%M:%S%p'),transform=ax.transAxes)
    ax.axis('equal')
    
ts = gdfl[(gdfl['dirtag']=='32_1_32C') & (gdfl['id']=='1010')].analysis_dt.tolist()
ts.sort()
ani = animation.FuncAnimation(fig, animate, frames=ts)
#ani.save('route32_stop_paths_v3.mp4', fps=10, extra_args=['-vcodec', 'libx264'])


In [10]:
%matplotlib inline

Here we can see the same data plotted as in section 3.2.1. Plotting the line between the current datapoint and the previous datapoint gives a visual sense of how quickly the vehicle is traveling.
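
The same idea can be expressed numerically. The sketch below estimates a speed for one row of *calc_location_log_07*, assuming that *st_path_pos* and *end_path_pos* are fractions of the route length (an interpretation of the table output above, not something confirmed here). The route length of 10 km is a made-up value for illustration, and `segment_speed_kmh` is a hypothetical helper.

```python
def segment_speed_kmh(st_path_pos, end_path_pos, lag_time_s, time_s, route_length_m):
    """Estimate speed over one segment between two consecutive datapoints.

    st_path_pos, end_path_pos: assumed fractions (0 to 1) along the route.
    lag_time_s, time_s: epoch seconds of the previous and current datapoint.
    route_length_m: length of the route line in meters.
    """
    elapsed_s = time_s - lag_time_s
    if elapsed_s <= 0:
        raise ValueError("current time must be later than the previous time")
    distance_m = abs(end_path_pos - st_path_pos) * route_length_m
    return distance_m / elapsed_s * 3.6  # m/s to km/h


ROUTE_LENGTH_M = 10_000.0  # hypothetical route length, for illustration only

# Positions and times from the first row of the table output above.
speed = segment_speed_kmh(0.783007, 0.790276, 1573869810, 1573869840, ROUTE_LENGTH_M)
print(f"{speed:.1f} km/h")
```

In practice the route length would come from the route geometry itself rather than a constant.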

In [11]:
HTML(ani.to_jshtml(fps=2))

<Figure size 432x288 with 0 Axes>

In the next section we will put this data to use and analyse some routes.
