Make warning-level notice when an equal shape_dist_traveled value is encountered and the previous row in the sequence had the same GPS coordinate #1070

Closed
evansiroky opened this issue Nov 18, 2021 · 7 comments · Fixed by #1083
Labels: bug, GTFS Reference

Comments

@evansiroky

What problem in GTFS datasets does this new rule address? Please describe.

In some feeds, I have seen records in shapes.txt where two consecutive rows have exactly the same GPS coordinates and the same shape_dist_traveled value. The validator currently treats this as a severe error. Although the second record is redundant, I don't think it qualifies as a severe error, and it probably has no noticeable effect on an end user viewing the shape in a trip planner UI.

Describe the new validation rule

The code as follows should have more nuance:

The code should also check whether the GPS coordinates of the previous record are the same as those of the current record. If they are, and the shape_dist_traveled value is also equal, a notice should still be generated, but only with warning severity.

Error vs warning

There should still be an error if the shape_dist_traveled is equal but the GPS coordinates change. But if the coordinates are the same and the shape_dist_traveled is the same, only a warning should be generated.
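To make the proposed rule concrete, here is a minimal sketch in Python. It is not the validator's actual code: `ShapePoint` and `classify_pair` are hypothetical names invented for illustration, and only the field names come from shapes.txt. It assumes a decreasing distance remains an error, as it is today.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShapePoint:
    # Hypothetical container; field names mirror the shapes.txt columns.
    shape_pt_lat: float
    shape_pt_lon: float
    shape_dist_traveled: float


def classify_pair(prev: ShapePoint, curr: ShapePoint) -> Optional[str]:
    """Return "ERROR", "WARNING", or None for two consecutive shape points."""
    if curr.shape_dist_traveled > prev.shape_dist_traveled:
        return None  # distance increases: nothing to report
    if curr.shape_dist_traveled < prev.shape_dist_traveled:
        return "ERROR"  # distance decreases: assumed to stay an error
    # Distances are equal: severity depends on whether the point moved.
    same_coords = (prev.shape_pt_lat, prev.shape_pt_lon) == (
        curr.shape_pt_lat,
        curr.shape_pt_lon,
    )
    return "WARNING" if same_coords else "ERROR"
```

With this sketch, an exact duplicate row yields a warning, while an equal distance on a row whose coordinates changed still yields an error.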

@isabelle-dr isabelle-dr added the GTFS Reference Used for Adding or changing rules that belong in the GTFS reference label Nov 19, 2021
@lionel-nj lionel-nj added the bug Something isn't working (crash, a rule has a problem) label Nov 29, 2021
@github-actions
Contributor

Thank you for reporting a bug. The issue has been placed in triage, and the MobilityData team will follow up on it.

@lionel-nj
Contributor

Hi @evansiroky, thanks for flagging that. We will investigate the logic of this rule in one of the upcoming sprints.

@evansiroky
Author

Thanks, @lionel-nj! I had one more thought to add here. I recently saw another feed where the validator was flagging errors for same-distance values, but where the actual distance between the coordinates was less than 1 meter. Perhaps the distance values in the feed were rounded to a certain number of decimal places and lost extra precision that practically wasn't needed anyway. So I'd like to add that it'd be nice if the eventual logic used in this rule wouldn't flag same-distance values as an error if the coordinates are acceptably physically close enough (like <1 meter or something).
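One way such a proximity check could look, as a rough Python sketch. The haversine formula and the mean Earth radius are standard. The function names are hypothetical, and the 1-meter default is just the figure floated above, not an agreed threshold.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (approximation)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def close_enough(lat1, lon1, lat2, lon2, tolerance_m: float = 1.0) -> bool:
    """True if the two points are less than tolerance_m apart (assumed threshold)."""
    return haversine_m(lat1, lon1, lat2, lon2) < tolerance_m
```

Two points that differ by 0.000001 degrees of latitude are roughly 0.1 m apart, so `close_enough` would treat them as the same location under the 1-meter assumption.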

@lionel-nj
Contributor

Interesting! It would be great if you could share the URL of the related dataset so that we can troubleshoot with existing data.

> So I'd like to add that it'd be nice if the eventual logic used in this rule wouldn't flag same-distance values as an error if the coordinates are acceptably physically close enough (like <1 meter or something).

We'll investigate how to best cover this case in our unit tests.

@barbeau
Member

barbeau commented Dec 1, 2021

One option here is to split DecreasingOrEqualShapeDistanceNotice into two different notices:

  1. DecreasingShapeDistanceNotice
  2. EqualShapeDistanceNotice

We've had a similar request in the gtfs-realtime-validator (CUTR-at-USF/gtfs-realtime-validator#365) for same or decreasing prediction times, which are also currently bundled under one notice (E022).

The reasoning is that depending on precision of systems involved, sometimes adjacent rows may have the same values after export to GTFS. So equal (or near equal, in the above case) values (2) may sometimes be valid.

However, in all cases backwards travel (1) is not valid and should be flagged with a higher priority.

So splitting the notice allows agencies to always fix (1), and consider (2) with lesser priority if needed.
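A rough Python sketch of what the split could look like. Only the two notice names come from the proposal above. The `Notice` class, the `check_shape` function, and the ERROR/WARNING severities are assumptions made for illustration, not the validator's implementation.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence, Tuple


@dataclass(frozen=True)
class Notice:
    # Hypothetical notice record, not the validator's real class.
    code: str
    severity: str
    shape_pt_sequence: int


def check_shape(points: Sequence[Tuple[int, float]]) -> Iterator[Notice]:
    """Yield one notice per offending row.

    points: (shape_pt_sequence, shape_dist_traveled) pairs for a single
    shape, already sorted by shape_pt_sequence.
    """
    for (_, prev_dist), (seq, dist) in zip(points, points[1:]):
        if dist < prev_dist:
            # (1) backwards travel: always invalid, assumed severity ERROR
            yield Notice("DecreasingShapeDistanceNotice", "ERROR", seq)
        elif dist == prev_dist:
            # (2) equal values: may be valid, assumed severity WARNING
            yield Notice("EqualShapeDistanceNotice", "WARNING", seq)
```

Keeping the two cases as separate notice codes means a consumer can filter or prioritize them independently, which is the point of the split.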

@botanize

botanize commented Dec 2, 2021

Our scheduling software produces duplicate path points at stop locations. For example, the last point on the stop-to-stop segment of Hennepin Ave S for the stop at 22nd St is exactly the same as the first point on the next stop segment to Franklin Ave and Dupont Ave S. You can see this in our GTFS https://svc.metrotransit.org/mtgtfs/archive/gtfs20211009.zip, shape_id = '20002' AND shape_pt_sequence < 20002. So we get duplicate shape distances at every stop along the shape!
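For anyone who wants to look for this pattern in a feed, here is a small Python sketch that lists consecutive duplicate points for one shape. The sample rows are made up for illustration and are not taken from the feed linked above. `duplicate_points` is a hypothetical helper, and it compares the raw text values, so it only catches exact duplicates.

```python
import csv
import io

# Made-up sample rows, for illustration only.
SAMPLE = """shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence,shape_dist_traveled
A,44.9600,-93.2900,1,0.0
A,44.9610,-93.2900,2,111.2
A,44.9610,-93.2900,3,111.2
A,44.9620,-93.2900,4,222.4
"""


def duplicate_points(shapes_txt, shape_id):
    """Return shape_pt_sequence values whose row repeats the previous point."""
    rows = [r for r in csv.DictReader(shapes_txt) if r["shape_id"] == shape_id]
    rows.sort(key=lambda r: int(r["shape_pt_sequence"]))

    def key(row):
        # Exact textual match on coordinates and distance.
        return (row["shape_pt_lat"], row["shape_pt_lon"], row["shape_dist_traveled"])

    return [
        int(curr["shape_pt_sequence"])
        for prev, curr in zip(rows, rows[1:])
        if key(prev) == key(curr)
    ]


print(duplicate_points(io.StringIO(SAMPLE), "A"))  # prints [3]
```

To run it against a real feed, pass an open shapes.txt file handle instead of the in-memory sample.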

We would support Sean's option of splitting the current error into two notices of appropriate severity.

@evansiroky
Author

I'd support splitting into two notices as long as the decreasing notice is an error and the equal notice is a warning. Although that alone would be an improvement, I still think it'd be useful (perhaps in another work effort altogether) to examine the actual distance between GPS coordinates, so that the equal notice becomes an error if the coordinates are far enough apart that their shape_dist_traveled values should not be equal.


5 participants