Trips at edges of time window can cause unrealistic accessibility results #2148

Closed
mattwigway opened this issue Sep 23, 2015 · 6 comments

Comments

@mattwigway
Member

In analyst, we calculate average accessibility from the average travel time to reach each destination. This seems fine, but it breaks down if the destination is not accessible for the whole time window. We exclude the minutes when the destination is inaccessible from this average because you can't take the average of infinity.

However, consider the following case. You have set your time window to 7 AM to 9 AM. There is an express bus at 7:08 AM, after which there is no more transit service. So from 7 to 7:04 or so (once you take into account walk time, board slack, etc.) you have fantastic accessibility. After that you have no accessibility, but those minutes are excluded from the averages.
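A minimal sketch of the express bus case, to make the bias concrete. The travel times, the `UNREACHED` sentinel, and the `naive_average` helper are all hypothetical and chosen for illustration; they are not taken from the analyst code.

```python
import math

# Hypothetical sentinel for "no transit option at this departure minute".
UNREACHED = math.inf


def naive_average(travel_times):
    """Average travel time over only the minutes at which the destination is reachable."""
    reachable = [t for t in travel_times if t != UNREACHED]
    return sum(reachable) / len(reachable) if reachable else UNREACHED


# One entry per departure minute from 7:00 to 8:59 (120 minutes).
# Illustrative values: only the first four departures can catch the 7:08 express.
window = [20, 19, 18, 17] + [UNREACHED] * 116

reachable_fraction = sum(t != UNREACHED for t in window) / len(window)

print(naive_average(window))         # 18.5
print(round(reachable_fraction, 3))  # 0.033
```

The reported average of 18.5 minutes describes about 3% of the window; the other 97% silently drops out of the calculation.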

This is not a hypothetical example; note the following images:

image

image

Both represent the same time window. The lower image has the pin moved 100m west so that an infrequent all-day service is also available at the end of the walk threshold. Thus all minutes are considered and the accessibility graph looks as expected. In the first graph, we exclude most of the time window and only show results for the few minutes when accessibility happens to be fantastic.

This is also a problem with trips that take about the same amount of time as the cutoff threshold. We cut trips off at 120 minutes currently. Suppose a particular destination takes between 117 and 155 minutes to reach. We would exclude the trips that take more than 120 minutes from the average, pulling the average down.
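The cutoff effect can be sketched the same way. This assumes, purely for illustration, that the travel time takes each whole-minute value from 117 to 155 equally often and that a trip of exactly 120 minutes is still kept; `censored_average` is a hypothetical helper.

```python
CUTOFF_MINUTES = 120  # the current maximum trip duration mentioned above


def censored_average(travel_times, cutoff=CUTOFF_MINUTES):
    """Average only the trips at or under the cutoff; None if none qualify."""
    kept = [t for t in travel_times if t <= cutoff]
    return sum(kept) / len(kept) if kept else None


# Assumed distribution: one departure for each travel time from 117 to 155 minutes.
times = list(range(117, 156))

print(censored_average(times))   # 118.5
print(sum(times) / len(times))   # 136.0
```

Only 4 of the 39 departures survive the cutoff, so the reported average is 118.5 minutes against an uncensored mean of 136.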

The solution is to weight destinations by the percentage of the time they are accessible in average accessibility calculations, and only include destinations that are always accessible in worst case calculations.
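One possible reading of that proposal, as a rough sketch: each destination contributes its opportunity count scaled by the fraction of departure minutes at which it is reachable, and the worst case counts only destinations reachable at every minute. The data layout (a dict of per-minute travel times and a dict of opportunity counts) and both function names are assumptions made for this example.

```python
import math


def weighted_average_accessibility(times_by_destination, opportunities, cutoff):
    """Sum opportunities, weighting each destination by its reachable fraction."""
    total = 0.0
    for dest, times in times_by_destination.items():
        reachable = [t for t in times if math.isfinite(t)]
        if not reachable:
            continue
        mean_time = sum(reachable) / len(reachable)
        if mean_time <= cutoff:
            total += opportunities[dest] * len(reachable) / len(times)
    return total


def worst_case_accessibility(times_by_destination, opportunities, cutoff):
    """Count only destinations within the cutoff at every departure minute."""
    return sum(
        opportunities[dest]
        for dest, times in times_by_destination.items()
        if max(times) <= cutoff  # an unreachable minute is inf, so it fails here
    )


# Illustrative data: ten departure minutes, cutoff of 60 minutes.
times_by_destination = {
    "express_only": [15] + [math.inf] * 9,  # reachable 10% of the time
    "all_day": [40] * 10,                   # always reachable
}
opportunities = {"express_only": 1000, "all_day": 200}

print(weighted_average_accessibility(times_by_destination, opportunities, 60))  # 300.0
print(worst_case_accessibility(times_by_destination, opportunities, 60))        # 200
```

Without the weighting, the express-only destination would contribute all 1000 opportunities to the average result despite being reachable for one minute in ten.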

@mattwigway
Member Author

Here's another location that may be related:

image

At first glance, it looks like the stop at 75th Street (the conspicuous hole in the center) is not being linked properly, but that does not appear to be the case. What could be happening is that there is only one way to reach the accessible areas, so we only average over the minutes when they are accessible, while there are multiple ways to reach the inaccessible area. This is in fact true: the northbound bus stops running at about 7:30, but there is infrequent all-day service on an east-west line through that area that can be reached by first traveling southbound on Antioch. Further evidence that this is what's going on is that the best case looks fine, which is what you would expect if this is the problem:

image

@mattwigway
Member Author

One solution is to calculate accessibility at every minute, rather than travel time, and then average that. Of course that doesn't help with isochrones. This is what the Accessibility Observatory did: http://www.its.umn.edu/Publications/ResearchReports/pdfdownloadl.pl?id=2504 page 11f.
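A rough sketch of that alternative: count the reachable opportunities separately at each departure minute, then average the counts. Unreachable minutes then contribute zero instead of being dropped. The data layout and function names are assumptions for this example, not the Accessibility Observatory's or analyst's actual code.

```python
import math


def accessibility_at_minute(times_by_destination, opportunities, minute, cutoff):
    """Opportunities reachable within the cutoff when departing at one minute."""
    return sum(
        opportunities[dest]
        for dest, times in times_by_destination.items()
        if times[minute] <= cutoff  # math.inf (unreachable) never passes
    )


def mean_accessibility(times_by_destination, opportunities, cutoff):
    """Average the per-minute accessibility over the whole time window."""
    n_minutes = len(next(iter(times_by_destination.values())))
    per_minute = [
        accessibility_at_minute(times_by_destination, opportunities, m, cutoff)
        for m in range(n_minutes)
    ]
    return sum(per_minute) / n_minutes


# Illustrative data: four departure minutes, cutoff of 45 minutes.
times_by_destination = {
    "jobs_a": [10, 10, math.inf, math.inf],  # reachable only early in the window
    "jobs_b": [30, 30, 30, 30],              # reachable throughout
}
opportunities = {"jobs_a": 100, "jobs_b": 50}

print(mean_accessibility(times_by_destination, opportunities, 45))  # 100.0
```

The per-minute counts here are 150, 150, 50, and 50, so the minutes without service pull the result down as they should.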

@mattwigway
Member Author

After discussion with @abyrd, we decided the best way to address this problem is to have a simple cutoff; if a location is accessible less than, say, half the time, we simply exclude it. Results look a lot healthier:

image

There are extensive comments in the source code about the implications of this approach, which are duplicated below for posterity:

If the destination is reachable less than half the time, consider it unreachable.
This avoids issues where destinations are reachable for some very small percentage of the time, either because there is a single departure near the start of the time window, or because they take approximately 2 hours (the default maximum cutoff) to reach.

Consider a search run with time window 7AM to 9AM, and an origin and destination connected by an express bus that runs once at 7:05. For the first five minutes of the time window, accessibility is very good. For the rest, there is no accessibility; if we didn't have this rule in place, the average would be taken only over the time the destination is reachable, and the time it is unreachable would be excluded from the calculation (see issue 2148).

There is another issue that this rule does not completely address. Consider a trip that takes 1:45 exclusive of wait time and runs every half-hour. Half the time it takes less than two hours and is considered, and half the time it takes more than two hours and is excluded, so the average is biased low on very long trips. This rule catches the most egregious cases (say, where we average only the best four minutes out of a two-hour span) but does not completely address the issue. However, if you're looking at a time cutoff significantly less than two hours, it's not a big problem. "Significantly less" means less by at least half the headway of your least-frequent service. If a trip on your least-frequent service takes on average the time cutoff plus one minute, its average will be unbiased, and it will be considered unreachable, iff the longest trip is less than two hours. That has to be the case if the time cutoff plus half the headway is less than two hours, assuming a symmetric travel time distribution.

TODO: due to multiple paths to a target the distribution is not symmetrical though - evaluate the effect of this. Also, transfers muddy the concept of "worst frequency" since there is variation in mid-trip wait times as well.
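The half-hourly example from the comments above can be worked through numerically. This sketch assumes the wait is a whole number of minutes from 0 to 29, each equally likely, and that a trip of exactly 120 minutes is still kept; the variable names are illustrative.

```python
CUTOFF = 120      # two-hour maximum, in minutes
IN_VEHICLE = 105  # 1:45 exclusive of wait time
HEADWAY = 30      # one departure every half hour

# Total travel time for each possible wait (assumed uniform over 0..29 minutes).
totals = [IN_VEHICLE + wait for wait in range(HEADWAY)]
kept = [t for t in totals if t <= CUTOFF]

reachable_fraction = len(kept) / len(totals)
censored_mean = sum(kept) / len(kept)
true_mean = sum(totals) / len(totals)

print(round(reachable_fraction, 3))  # 0.533
print(censored_mean)                 # 112.5
print(true_mean)                     # 119.5
```

The destination is reachable just over half the time, so the 50% rule keeps it, yet the reported average is 7 minutes lower than the mean over all departures. This is the residual bias the comment describes.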

@mattwigway
Member Author

So, to summarize what I did to get the most reasonable results:

  1. Only populate average if destination is accessible for at least 50% of the time window, otherwise leave unreachable.
  2. Only populate worst case if destination is always accessible, otherwise leave unreachable.
  3. Populate best case if destination is ever reachable.
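The three rules above can be sketched as a single function. This is an illustration under assumptions, not the committed code: `summarize`, the `math.inf` sentinel for unreachable minutes, and the `min_reachable_fraction` parameter are all hypothetical names.

```python
import math


def summarize(travel_times, min_reachable_fraction=0.5):
    """Return (best, average, worst) travel time; math.inf means unreachable."""
    n = len(travel_times)
    reachable = [t for t in travel_times if math.isfinite(t)]

    # Rule 3: best case is populated if the destination is ever reachable.
    best = min(reachable) if reachable else math.inf

    # Rule 1: average only if reachable for at least half the time window.
    if n > 0 and len(reachable) / n >= min_reachable_fraction:
        average = sum(reachable) / len(reachable)
    else:
        average = math.inf

    # Rule 2: worst case only if reachable at every minute of the window.
    worst = max(reachable) if n > 0 and len(reachable) == n else math.inf

    return best, average, worst


print(summarize([18] * 4 + [math.inf] * 116))  # (18, inf, inf)
print(summarize([30, 40, math.inf, 50]))       # (30, 40.0, inf)
print(summarize([30, 40, 50]))                 # (30, 40.0, 50)
```

The first call is the express bus case: the destination still shows up in the best case but no longer produces an average or worst-case value.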

@mattwigway
Member Author

@abyrd can you do code review on that when you get a chance?

mattwigway added a commit that referenced this issue Oct 19, 2015
…ts by only including the trips with least waiting)
flibbertigibbet pushed a commit to flibbertigibbet/OpenTripPlanner that referenced this issue Jan 25, 2016
flibbertigibbet pushed a commit to flibbertigibbet/OpenTripPlanner that referenced this issue Jan 25, 2016
flibbertigibbet pushed a commit to flibbertigibbet/OpenTripPlanner that referenced this issue Jan 25, 2016
flibbertigibbet pushed a commit to flibbertigibbet/OpenTripPlanner that referenced this issue Jan 25, 2016
…(biasing results by only including the trips with least waiting)
flibbertigibbet pushed a commit to flibbertigibbet/OpenTripPlanner that referenced this issue Jan 25, 2016
@abyrd
Member

abyrd commented Jul 4, 2019

This issue has been eliminated with our own definition of accessibility, using a specified percentile of the travel times to each destination. Closing.
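To illustrate why a percentile sidesteps the problem, here is a minimal sketch. The comment above does not say how the percentile is computed, so the nearest-rank method, the `math.inf` sentinel for unreachable minutes, and the function name are all assumptions.

```python
import math


def percentile_travel_time(travel_times, percentile):
    """Nearest-rank percentile; unreachable minutes (math.inf) sort last."""
    if not travel_times:
        raise ValueError("travel_times must not be empty")
    ordered = sorted(travel_times)
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    return ordered[rank - 1]


# Illustrative express bus window: reachable at 4 of 120 departure minutes.
window = [20, 19, 18, 17] + [math.inf] * 116

print(percentile_travel_time(window, 50))  # inf
print(percentile_travel_time(window, 3))   # 20
```

Because unreachable minutes stay in the sorted list as infinite travel times, nothing is excluded: the median for this destination is unreachable, and only a very low percentile reports the express trip.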
