Bug: duplicate/non-existent ITP IDs in GTFS Schedule Feeds Latest views #988

Closed
edasmalchi opened this issue Jan 25, 2022 · 3 comments
Labels
bug (Something isn't working), project-gtfs-schedule (For issues related to gtfs-schedule project)

Comments

@edasmalchi
Member

edasmalchi commented Jan 25, 2022

Describe the bug
In the GTFS Schedule Feeds Latest views, there appears to be data for itp_ids that do not exist in agencies.yml. In some cases, these strange itp_ids also contain data from multiple feeds that have their own unique itp_ids as well.

Strange itp_ids I've found so far: 1, 8, 2, 3 (there may be more?)

To Reproduce
Steps to reproduce the behavior:

  1. In one of the GTFS Schedule Feeds Latest views (feed info or agency are good places to start), filter by itp_id == 1 or itp_id == 8 or itp_id == 2 or itp_id == 3 ...
  2. Observe that data from what appear to be multiple feeds shows up, even though these itp_ids are not present in agencies.yml

Expected behavior
itp_ids in these views should not span multiple unrelated feeds, and should be present in agencies.yml

Additional context
For example, filtering by itp_id == 1 gives data for Long Beach Transit alongside Eastern Sierra Transit Authority and Fairfield and Suisun Transit, despite these feeds not being related and having their own itp_ids (170, 99, and 110)...
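
The check described above can be sketched in a few lines of Python. This is only an illustration, not the project's tooling: the function name `audit_itp_ids`, the row keys `itp_id` and `feed_name`, and the sample rows are all hypothetical, and the real views may use different column names. It assumes the rows of a "latest" view and the set of itp_ids from agencies.yml have already been loaded.

```python
from collections import defaultdict


def audit_itp_ids(rows, known_itp_ids):
    """Return (unknown_ids, shared_ids) for rows of a 'latest' view.

    rows: iterable of dicts with hypothetical keys 'itp_id' and 'feed_name'.
    known_itp_ids: set of itp_ids listed in agencies.yml.
    """
    # Collect the distinct feed names seen under each itp_id.
    feeds_by_id = defaultdict(set)
    for row in rows:
        feeds_by_id[row["itp_id"]].add(row["feed_name"])

    # itp_ids that appear in the view but not in agencies.yml.
    unknown_ids = sorted(i for i in feeds_by_id if i not in known_itp_ids)
    # itp_ids that carry data from more than one feed.
    shared_ids = sorted(i for i, feeds in feeds_by_id.items() if len(feeds) > 1)
    return unknown_ids, shared_ids


# Illustrative rows modeled on the example in this issue, not real query output.
# The last row assumes the ids (170, 99, 110) are listed in the same order
# as the feed names.
rows = [
    {"itp_id": 1, "feed_name": "Long Beach Transit"},
    {"itp_id": 1, "feed_name": "Eastern Sierra Transit Authority"},
    {"itp_id": 1, "feed_name": "Fairfield and Suisun Transit"},
    {"itp_id": 170, "feed_name": "Long Beach Transit"},
]
known = {170, 99, 110}

unknown_ids, shared_ids = audit_itp_ids(rows, known)
print(unknown_ids)
print(shared_ids)
```

Both lists should be empty if the expected behavior holds; any itp_id that shows up in either one is a candidate for the problem reported here.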

@holly-g added the bug label on Jan 31, 2022
@lauriemerrell
Contributor

Wondering if this is related to #521

@lauriemerrell
Contributor

Some of this is caused by #235 (duplicate), but I think at least part of it is a separate issue akin to #1184. There are two things here:

  1. The logic in these gtfs_schedule tables should have filtered out anything that's not marked as current.
  2. Separately, the things that aren't current should have been marked as deleted in some more obvious way.
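
To make the two points concrete, here is a rough Python sketch. The `is_current` key and the function name `split_by_currency` are hypothetical, not the actual warehouse schema: the first returned list corresponds to what the latest views should contain, and the second to the rows that would need a clearer "deleted" marking.

```python
def split_by_currency(rows):
    """Split rows into (current, stale) using a hypothetical 'is_current' flag.

    Rows that lack the flag are treated as stale, so nothing unmarked
    can leak into the 'latest' result.
    """
    current = [row for row in rows if row.get("is_current")]
    stale = [row for row in rows if not row.get("is_current")]
    return current, stale


# Illustrative rows only; the real tables have many more columns.
rows = [
    {"itp_id": 170, "is_current": True},
    {"itp_id": 1, "is_current": False},
    {"itp_id": 8},  # no flag at all, treated as stale
]

current, stale = split_by_currency(rows)
print([row["itp_id"] for row in current])
print([row["itp_id"] for row in stale])
```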

@lauriemerrell
Contributor

@edasmalchi I am closing this as fixed by #1320 since GTFS Schedule Feeds Latest itself is ok now. But I am moving the second issue (the fact that the mislabeled data exists at all) to #1353 for subsequent remediation.
