Bus2train: Analyse bus arrivals to train stations (data analysis) #13

Closed
daphshez opened this issue Sep 22, 2016 · 6 comments

Comments

@daphshez
Collaborator

Some time ago, Gil from 15 minutes asked us to analyse the transfers between buses and trains at train stations. His request was to find a metric for how well the buses are coordinated with the trains at different stations and at different times of day. This can have value both for PR and for prioritising the work with the Ministry of Transport.

Here's a set of files containing the arrivals of buses and trains at train stations on Thursday, 2016-09-01. They were created from GTFS data using the calling_at_station module.

This task is rather open-ended: can anyone look at these files and think of ways to analyse the data and create useful metrics (or even visualisations) that could provide insights on where and when the coordination of trains and buses is especially problematic?

@daphshez
Collaborator Author

Some thoughts about this task (warning: long-winded post ahead).

I am not sure all these details should be implemented in the first version. But these are things that we discussed so I thought they should be documented.

Initial metrics that we could use are:

  1. If you arrive by train, the average wait until the first bus of any line arrives.
  2. If you arrive by train, the average wait until the next bus of each line, averaged over all bus lines.

For example, say there are two bus lines serving a station, and the schedule is:
Train 1 9:00
Line 1 9:04
Line 2 9:08
Train 2 9:20
Line 2 9:25
Line 2 9:30
Line 1 9:35

With metric 1, the result will be 4.5m (average of 4 minutes for train 1, and 5 minutes for train 2).
With metric 2, the result will be 8m (average of 6 minutes for train 1, and 10 minutes for train 2)

Neither metric is perfect, but they may be a good starting point for seeing what the data looks like.
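
As a rough sketch, the two metrics could be computed for the example schedule like this. It is only an illustration, not the project's code: the function names and the (line, time) data layout are made up here, and times are handled as minutes since midnight.

```python
from statistics import mean


def minutes(hhmm):
    """Convert an 'H:MM' string to minutes since midnight."""
    h, m = hhmm.split(":")
    return int(h) * 60 + int(m)


# The example schedule from the comment above.
trains = [minutes("9:00"), minutes("9:20")]
buses = [
    ("Line 1", minutes("9:04")),
    ("Line 2", minutes("9:08")),
    ("Line 2", minutes("9:25")),
    ("Line 2", minutes("9:30")),
    ("Line 1", minutes("9:35")),
]


def waits_per_line(train_time, buses):
    """Wait (in minutes) from a train arrival to the next bus of each line."""
    waits = {}
    for line, bus_time in buses:
        if bus_time >= train_time:
            wait = bus_time - train_time
            waits[line] = min(wait, waits.get(line, wait))
    return waits


def metric_1(trains, buses):
    """Average over trains of the wait until the first bus of any line."""
    return mean(min(waits_per_line(t, buses).values()) for t in trains)


def metric_2(trains, buses):
    """Average over trains of the mean next-bus wait across all lines."""
    return mean(mean(waits_per_line(t, buses).values()) for t in trains)


print(metric_1(trains, buses))  # 4.5
print(metric_2(trains, buses))  # 8
```

Note that this sketch assumes every train is followed by at least one bus; a train with no later bus would need special handling.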

Dividing the day into windows
We should divide the day into windows (e.g. 1 hour) and calculate the metrics separately for each hour. That would be more informative than having a single daily average.
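
A minimal sketch of the windowing, assuming we already have one (train arrival time, wait) pair per train, with times in minutes since midnight. The function name and the 8:xx sample values are invented for illustration; only the 9:00 and 9:20 pairs come from the example above.

```python
from collections import defaultdict
from statistics import mean


def average_per_window(samples, window_minutes=60):
    """Group (train_time, wait) pairs by time window and average the waits.

    Returns a dict mapping the window index (the hour of day when
    window_minutes is 60) to the average wait in that window.
    """
    buckets = defaultdict(list)
    for train_time, wait in samples:
        buckets[train_time // window_minutes].append(wait)
    return {window: mean(waits) for window, waits in sorted(buckets.items())}


samples = [
    (8 * 60 + 10, 4),  # made-up train at 8:10, 4 minute wait
    (8 * 60 + 40, 6),  # made-up train at 8:40, 6 minute wait
    (9 * 60, 4),       # train 1 from the example
    (9 * 60 + 20, 5),  # train 2 from the example
]
print(average_per_window(samples))  # {8: 5, 9: 4.5}
```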

Transfer time
If a train arrives at 8:00 and the bus departs at 8:01, it's very probable that people would miss the bus, because it takes time to get out of the station. So we need a transfer constant TC. If a train arrives at time t1 and the bus leaves at time t2, the actual wait time is t2 - t1 - TC.
We can start with TC=5m.
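
A possible sketch of the adjusted wait. It assumes that a bus leaving less than TC minutes after the train counts as missed, which is one reading of the paragraph above; the function name is hypothetical and times are minutes since midnight.

```python
TC = 5  # transfer constant in minutes (the suggested starting value)


def effective_wait(train_time, bus_times, tc=TC):
    """Return t2 - t1 - TC for the first bus the passenger can still catch.

    Buses leaving less than tc minutes after the train are treated as
    missed (an assumption). Returns None if no bus can be caught.
    """
    catchable = [t2 for t2 in bus_times if t2 - train_time >= tc]
    if not catchable:
        return None
    return min(catchable) - train_time - tc


# Train at 8:00, buses at 8:01 and 8:12: the 8:01 bus is missed,
# so the wait is 12 - 5 = 7 minutes.
print(effective_wait(8 * 60, [8 * 60 + 1, 8 * 60 + 12]))  # 7
```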

Maximum wait time
Say there is a specific bus line that operates only in the afternoons. If a train arrives at 08:00, no one is going to wait 8 hours for a bus. So we need a constant MAX_WAIT: if the wait exceeds MAX_WAIT, we can ignore this bus.
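
A sketch of how the cut-off might be applied when collecting the per-line waits. The value of 60 minutes is only a placeholder, since no value for MAX_WAIT was agreed in this thread, and the function name and data layout are made up.

```python
MAX_WAIT = 60  # minutes; placeholder value, not decided in this thread


def waits_within_limit(train_time, buses, max_wait=MAX_WAIT):
    """Next-bus wait per line, ignoring lines whose wait exceeds max_wait.

    buses is a list of (line, time) pairs, times in minutes since midnight.
    """
    waits = {}
    for line, bus_time in buses:
        wait = bus_time - train_time
        if 0 <= wait <= max_wait:
            waits[line] = min(wait, waits.get(line, wait))
    return waits


# Train at 08:00; Line 3 only starts running at 16:00, so it is ignored.
buses = [("Line 1", 8 * 60 + 10), ("Line 3", 16 * 60)]
print(waits_within_limit(8 * 60, buses))  # {'Line 1': 10}
```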

Separate first stop from rest of stops
We know that the regulator (the Ministry of Transport) only plans bus departure times. The data in the GTFS for the second, third etc. stop is not precise. So we should calculate the metrics twice: once for all buses, and once only for buses that start at the station.
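
One way the split could look, assuming the bus calls are available as rows with the GTFS stop_times fields trip_id, stop_id and stop_sequence. The function name, the sample rows and the stop ids are invented; a trip's first stop is taken to be the row with its smallest stop_sequence.

```python
def split_by_first_stop(stop_times, station_stop_ids):
    """Return (calls where the trip starts at the station, all calls at the station)."""
    # Find the smallest stop_sequence of every trip.
    first_seq = {}
    for row in stop_times:
        trip, seq = row["trip_id"], row["stop_sequence"]
        if trip not in first_seq or seq < first_seq[trip]:
            first_seq[trip] = seq

    at_station = [r for r in stop_times if r["stop_id"] in station_stop_ids]
    starting = [
        r for r in at_station if r["stop_sequence"] == first_seq[r["trip_id"]]
    ]
    return starting, at_station


# Made-up rows: trip A starts at the station stop S1, trip B only passes it.
rows = [
    {"trip_id": "A", "stop_id": "S1", "stop_sequence": 1},
    {"trip_id": "A", "stop_id": "X", "stop_sequence": 2},
    {"trip_id": "B", "stop_id": "Y", "stop_sequence": 1},
    {"trip_id": "B", "stop_id": "S1", "stop_sequence": 2},
]
starting, all_calls = split_by_first_stop(rows, {"S1"})
print(len(starting), len(all_calls))  # 1 2
```

The metrics would then be computed once on all calls at the station and once on the subset of trips that start there.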

From bus to train
I described metrics for transfers from train to bus. Similar metrics could be calculated for transfers from bus to train (arriving at the station by bus and then boarding a train).
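
A sketch of the reverse direction, under the assumption that the train times can be treated as departure times and that the same transfer constant applies. The function name is hypothetical; times are minutes since midnight.

```python
from bisect import bisect_left
from statistics import mean


def average_wait_for_train(bus_arrivals, train_departures, tc=5):
    """Average wait from a bus arrival to the next catchable train.

    A train is catchable if it leaves at least tc minutes after the bus
    arrives. Buses with no later train are skipped. Returns None if no
    bus has a catchable train.
    """
    trains = sorted(train_departures)
    waits = []
    for bus_time in bus_arrivals:
        i = bisect_left(trains, bus_time + tc)
        if i < len(trains):
            waits.append(trains[i] - bus_time - tc)
    return mean(waits) if waits else None


# Buses at 9:04 and 9:08, trains at 9:00 and 9:20, TC = 5 minutes:
# waits are 11 and 7 minutes, so the average is 9.
print(average_wait_for_train([544, 548], [540, 560]))  # 9
```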

@daphshez
Collaborator Author

And BTW, @MYank0 and Uriya P are currently looking at this task.

I am going to try to load the data into the DB and make it available through re:dash.

@pankon

pankon commented Oct 19, 2016

Hi,
I was thinking about using a simplified frequency metric with peaks for hub stations, although the average-per-hour method would work pretty well. It would also be helpful to look at affected areas, or maybe to pull from Waze data.
Chag Sameach
Nathan


@daphshez
Collaborator Author

I am happy to say GTFS data (loaded from the file of October 16th) is now available in the database.

It can be accessed through re:dash.

I also added to the database the data about which bus stops are within walking distance from train stations (station_walking_distance table).

This means that the data in the files I originally posted can be retrieved from the database using SQL. This file has examples of the queries.

@daphshez
Collaborator Author

@MYank0 - what's our status re bus2train? Can we close this issue?

@MYank0
Collaborator

MYank0 commented May 28, 2017

The final report is here. I think we can close the issue for now. Maybe when we have real-time data we will have more meaningful insights.
