Bug: realtime parser returns duplicate vehicles #8

Closed
jamespfennell opened this issue Aug 5, 2023 · 3 comments

@jamespfennell
Owner

When I parse the most recent GTFS realtime feed for the NYC L train I see many duplicate vehicles in the result. One vehicle has IsEntityInMessage: false and the other has IsEntityInMessage: true. Presumably the parser sees the first vehicle attached to a TripUpdate and the second attached to a VehiclePosition. It should de-duplicate these - in fact, the IsEntityInMessage: false vehicle should just be dropped in this case.
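To make the expected behavior concrete, here is a minimal sketch of the de-duplication rule described above. The `Vehicle` type and the `dedupe` function are hypothetical, simplified stand-ins and not the parser's actual types; the sketch assumes vehicles can be matched on a plain string ID.

```go
package main

import "fmt"

// Vehicle is a hypothetical, simplified stand-in for the parser's vehicle type.
type Vehicle struct {
	ID                string
	IsEntityInMessage bool
}

// dedupe keeps one vehicle per ID, preserving the order of first appearance.
// When two vehicles share an ID, the one with IsEntityInMessage == true wins.
func dedupe(in []Vehicle) []Vehicle {
	index := map[string]int{} // vehicle ID -> position in out
	var out []Vehicle
	for _, v := range in {
		i, seen := index[v.ID]
		if !seen {
			index[v.ID] = len(out)
			out = append(out, v)
			continue
		}
		// Replace a previously stored IsEntityInMessage: false vehicle.
		if v.IsEntityInMessage && !out[i].IsEntityInMessage {
			out[i] = v
		}
	}
	return out
}

func main() {
	// Hypothetical vehicle IDs, for illustration only.
	vehicles := []Vehicle{
		{ID: "L-123", IsEntityInMessage: false}, // seen via a TripUpdate
		{ID: "L-123", IsEntityInMessage: true},  // seen via a VehiclePosition
		{ID: "L-456", IsEntityInMessage: false},
	}
	fmt.Println(dedupe(vehicles))
}
```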

Observed after jamespfennell/transiter#110 was submitted to Transiter mainline. In Transiter I see lots of debug logging warning of duplicate vehicles in the NYC subway feed, and I think the root cause is here.

CC @cedarbaum.

@jamespfennell
Owner Author

Started working on this in https://github.com/jamespfennell/gtfs/tree/duplicate-vehicles

@jamespfennell
Owner Author

Ahhhhh the root cause is pretty subtle. When we're parsing entities we de-duplicate vehicles using a map keyed on VehicleID whose current definition is:

type VehicleID struct {
	ID           *string
	Label        *string
	LicencePlate *string
}

The problem is that Go compares struct map keys field by field, and pointer fields are compared by address, not by the strings they point to. So even if the values of the ID fields match after following the pointers, Go considers the two keys distinct because the pointers themselves are different. I think the simplest solution would be to remove the pointers:

type VehicleID struct {
	ID           string
	Label        string
	LicencePlate string
}

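and interpret a field being "" as equivalent to the field not being provided in the GTFS realtime feed. In theory the feed could explicitly set a field such as Label to the empty string, which would then be indistinguishable from a missing field, but I think this edge case is not a big deal.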
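The difference between the two key types can be reproduced in isolation. The following is a minimal sketch, not code from the library: the names `PointerVehicleID`, `ValueVehicleID`, `countPointerKeys` and `countValueKeys` are made up for this example, and the vehicle ID is a placeholder.

```go
package main

import "fmt"

// PointerVehicleID mirrors the original key type with pointer fields.
type PointerVehicleID struct {
	ID           *string
	Label        *string
	LicencePlate *string
}

// ValueVehicleID mirrors the proposed key type; "" means "not provided".
type ValueVehicleID struct {
	ID           string
	Label        string
	LicencePlate string
}

// countPointerKeys inserts the same vehicle ID twice, each time through a
// pointer to a different variable, and returns the number of map entries.
func countPointerKeys() int {
	a, b := "L-123", "L-123" // same contents, different addresses
	seen := map[PointerVehicleID]bool{}
	seen[PointerVehicleID{ID: &a}] = true
	seen[PointerVehicleID{ID: &b}] = true
	return len(seen)
}

// countValueKeys does the same with plain string fields.
func countValueKeys() int {
	a, b := "L-123", "L-123"
	seen := map[ValueVehicleID]bool{}
	seen[ValueVehicleID{ID: a}] = true
	seen[ValueVehicleID{ID: b}] = true
	return len(seen)
}

func main() {
	fmt.Println("pointer keys:", countPointerKeys()) // prints "pointer keys: 2"
	fmt.Println("value keys:", countValueKeys())     // prints "value keys: 1"
}
```

With pointer fields the map ends up with two entries for what is logically one vehicle; with value fields the second insert overwrites the first.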

jamespfennell added a commit to jamespfennell/transiter that referenced this issue Aug 5, 2023
@cedarbaum
Collaborator

Great catch, thanks so much for fixing this!
