Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a /vehicles endpoint to show the current status of the fleet #376

Merged
merged 15 commits into from
Feb 28, 2020

Conversation

bhandzo
Copy link
Contributor

@bhandzo bhandzo commented Oct 11, 2019

Explain pull request

There have been several long running issues & conversations about how to provide real time vs. historical data on vehicles in the PROW. @johnpena specifically requested something like this in #310.

This goes well with the move to static pagination for historical status changes in #357 , and allows consumers to get the full state of the fleet right now rather than having to reconstruct it from a stream of status changes. It's likely this would obviate the need to include GBFS as part of MDS.

This makes MDS more accessible by allowing for a simpler fleet counting method (count the number of vehicles returned in this response). These counts should line up with counts generated from status changes, which would give agencies the ability to fact check real time responses to feel confident they fully reflect what's on the PROW.

Is this a breaking change

A breaking change would require consumers or implementors of the API to modify their code for it to continue to function (ex: renaming of a required field or the change in data type of an existing field). A non-breaking change would allow existing code to continue to function (ex: addition of an optional field or the creation of a new optional endpoint).

  • No, not breaking (This adds a new endpoint that consumers could choose to.. consume)

Provider or agency

Which API(s) will this pull request impact:

  • provider

Additional context

@johnpena @hunterowens @thekaveman @ascherkus @babldev @hyperknot @rf- @fscottfoti Would love thoughts.

Closes #310.

@marie-x
Copy link
Collaborator

marie-x commented Oct 12, 2019

Did you intentionally drop provider_id?

I think this endpoint might also want pagination. A single provider might have upward of 10,000 vehicles in large cities.

@bhandzo
Copy link
Contributor Author

bhandzo commented Oct 14, 2019

@karcass I think it would make most sense for this to work similarly to GBFS with each provider having a unique URL rather than using provider ID, but i'm v open to feedback.

Pagination makes a ton of sense to me, I think in this case it would be great to define how it should work for the sake of clarity.

@johnpena
Copy link

Agreed that this endpoint should include pagination. Otherwise LGTM, I think this is really valuable and I'd love to see this included in provider.

Copy link
Collaborator

@thekaveman thekaveman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intentionally drop provider_id?

I think it would make most sense for this to work similarly to GBFS with each provider having a unique URL rather than using provider ID

@bhandzo I think @karcass was referring to the data coming back from the endpoint (vs. a query parameter/URL token); each vehicle should be identified with the provider_id that vehicle belongs to, and similarly for the provider_name.

These fields are necessary for consumers that may hit /vehicles endpoints from multiple providers, and align with the vehicle information currently coming out of /status_changes and /trips.

@thekaveman thekaveman added the Provider Specific to the Provider API label Oct 14, 2019
@bhandzo
Copy link
Contributor Author

bhandzo commented Oct 14, 2019

@karcass @thekaveman @johnpena Added provider_id and provider_name, as well as pagination (copied down briefly from further up).

@thekaveman
Copy link
Collaborator

@bhandzo thanks for adding those fields.

If this endpoint was specified as having the same payload wrapper as the others, we could avoid duplicating the pagination docs from earlier (and maintain consistency with other parts of the spec). I might re-word like:

Vehicles

In addition to the standard Provider payload wrapper, responses from this endpoint should contain the last update timestamp and amount of time until the next update:

{
    "version": "x.y.z",
    "data": {
        "vehicles": []
    },
    "last_updated": "12345",
    "ttl": "12345"
}

Where last_updated and ttl are defined as follows:

Field Name Required Defines
last_updated Yes Timestamp indicating the last time the data in this feed was updated
ttl Yes Integer representing the number of milliseconds before the data in this feed will be updated again (0 if the data should always be refreshed)

Endpoint: /vehicles
Method: GET
data Payload: { "vehicles": [] }, an array of objects with the following structure
...

I also suggest changing the definition of ttl to be the integer number of milliseconds, for consistency with other representations of time in this spec.

Finally, we'll want a JSON schema for this endpoint as well - and assuming the above suggestions are agreeable, the common schema will need some edits to handle the additional payload fields.

provider/README.md Outdated Show resolved Hide resolved
Copy link

@ascherkus ascherkus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo nits... but overall looks great. Agree with @rf- and @bhandzo re: desire to have more explicit way to reconcile this feed with status changes (e.g., lost event?), otherwise reconciling relies on some sort implicit / underspecified logic.

I feel it'd be good to avoid a situation where each consumer/municipality uses different implicit logic for status change reconciliation if we can manage it (i.e., a bit of explicitness now would avoid increasing hidden system complexity down the road).

| `last_event_time` | [timestamp][ts] | Required | Date/time when last status change occurred. See [Event Times](#event-times) |
| `last_event_type` | Enum | Required | Event type of most recent status change. See [event types](#event-types) table |
| `last_event_type_reason` | Enum | Required | Event type reason of most recent status change, allowable values determined by [`event type`](#event-types) |
| `last_event_location` | GeoJSON [Point Feature][geo]| Required | Location of veicle's last event |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: veicle -> vehicle

| `last_event_type` | Enum | Required | Event type of most recent status change. See [event types](#event-types) table |
| `last_event_type_reason` | Enum | Required | Event type reason of most recent status change, allowable values determined by [`event type`](#event-types) |
| `last_event_location` | GeoJSON [Point Feature][geo]| Required | Location of veicle's last event |
| `current_location` | GeoJSON [Point Feature][geo] | Optional | Current loaction of vehicle if different from last event |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: loaction -> location

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see this field being used in a few different ways. It could indicate a vehicle is that is moved in a way that's not covered by a status event (eg a passerby moves a vehicle that is blocking the sidewalk). That seems useful. I could also see it being used to update the position of a vehicle that is in a reserved state. That would be very different than what we have today, since it would report realtime events of a vehicle when it is associated to an individual (vs a vehicle unoccupied on the PROW). Are there privacy implications that need to be addressed? An example, perhaps a edge case, of an abuse, would be if person1, with access to this feed, observes person2 in real life starting a trip, then person1 could then track in realtime person2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a super good point. I think the updating of current location for vehicles that have moved without emitting events seems quite useful, and i agree that updating location during a ride has some privacy implications. I'm going to tweak the language to take a crack at clarifying.

@hyperknot
Copy link
Contributor

Thanks for this great PR. This looks like the perfect direction forward! I have a feeling that this + the new static endpoints could even replicate the current provider MDS + GBFS endpoints one day.

I'd like to raise two important points:
1.

| `current_location` | GeoJSON [Point Feature][geo] | Optional | Current loaction of vehicle if different from last event |

This should be required. The reason is that if we only supply anything "if different from last event" then this endpoint cannot be used for a "current state" view, as it needs to have all previous "diff" snapshots to reconstruct the current state.

  1. What vehicles to include. This is the most important question.
  • Currently it states: Only vehicles that are currently in an available, unavailable, or reserved states should be returned in this payload.

There are 3 options here:

  1. Include only available vehicles, like GBFS. This is not adequate for many known reasons.
  2. Include available "available, unavailable, or reserved" like proposed here. The problem with this is that this won't include out-of-service vehicles, what is probably the most important when a city is trying to figure out what vehicle is left over on a sidewalk in a broken state.

=> I'd propose a simplified approach, which solves all these problems:
3. Include all vehicles seen in the last n days (say 3 or 7), showing their last known state.

I believe this approach would solve all the mentioned problems, including lost vehicles. When a vehicle is registered lost, it should simply keep being visible as lost, so whenever a city finds an abandoned vehicle, they'd immediately know if it's in service, out of service, or possibly lost.

@billdirks
Copy link
Contributor

@hyperknot Just wanted to make a comment about your point (2). I believe known broken vehicles on the PROW should have a status event "unavailable" with reason "maintenance".

@hunterowens hunterowens added this to the 0.4.0 milestone Oct 16, 2019
@marie-x
Copy link
Collaborator

marie-x commented Oct 17, 2019

=> I'd propose a simplified approach, which solves all these problems:
3. Include all vehicles seen in the last n days (say 3 or 7), showing their last known state.

I believe this approach would solve all the mentioned problems, including lost vehicles. When a vehicle is registered lost, it should simply keep being visible as lost, so whenever a city finds an abandoned vehicle, they'd immediately know if it's in service, out of service, or possibly lost.

+1 for this, and I'd like to suggest that in addition to removed, Provider adds inactive to indicate that a device is lost/stolen or permanently removed from the fleet. The event_type_reason would be missing for short-term, and decommissioned for permanent.

Copy link
Contributor

@whereissean whereissean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good PR. I'd like to continue to build towards consistency with /vehicles in agency specs (perhaps via changes to that spec).

provider/README.md Show resolved Hide resolved
provider/README.md Show resolved Hide resolved
@hunterowens
Copy link
Collaborator

hey y'all:

I think this is really great. given that I'm already super behind on getting 0.4.0 out the door, I think we are gonna fast follow 0.4.0 (within 6 weeks) with an 0.4.1 that absolutely will have this in in it.

@hunterowens hunterowens modified the milestones: 0.4.0 , 0.4.1 Oct 25, 2019
@thekaveman
Copy link
Collaborator

Another option:

Since #357 introduces the near-ish real-time /events endpoint with explicit wording that:

[/events] functions similarly to status changes, but shall not include data older than 2 weeks

we could take the main ideas from this /vehicles proposal and fold into /events and #357:

  • /vehicles renames the event fields to last_event_*, whereas the schema of /events is the same as the current /status_changes - keeping a consistent schema for the "real-time" endpoint could make implementation on the provider and consumer side much easier.

  • from discussion here, it sounds like the concept of current_location would be useful; this field could be added to the /events endpoint as a required field with the same description as here ("...if different from last event, and the vehicle is not currently on a trip")

  • the explicit ttl and last_updated fields in the payload wrapper of /vehicles, as opposed to query params for the time range in /events mirroring the current (0.3.x) /status_changes. We could drop these payload fields in favor of the current wording in Allow static file data storage for Provider /status_changes and /trips #357 that I mentioned above ("...shall not include data older than 2 weeks") or be explicit and add the payload fields for /events. I haven't thought this all the way through yet... comments welcome.

It feels confusing for all to have both /events and /vehicles co-exist, can we solve the use-cases here with /events (potentially including some minor tweaks)?

@hyperknot
Copy link
Contributor

There are core differences between /events and /vehicles endpoints, even though at the end the reflect the near real-time-ish state.

/events is a diff-ing of state, similar to a git repository. To have the latest state, you'd need to start from a known reference state and "built up" all diff-s to get the current state. Downloading only the latest event means nothing in terms of reconstructing the state.

/vehicles is a "snapshot" of the state, similar to a "download ZIP" in GitHub for example. It contains everything needed to reconstruct the current state, but it's inefficient for downloading the diffs.

To have a live map based on MDS, we'd need to either redownload /vehicles snapshot every minute, or download vehicles once and then get the /events every minute.

So in conclusion I believe for a live map like functionality both of them makes sense. The big question is how well each endpoint could be supported by providers, and how much processing latency would they need to add for them.

@jfh01
Copy link
Contributor

jfh01 commented Nov 5, 2019

Great work on this, everyone.

I wanted to offer a general +1 on this PR. I've heard from a few folks that cities without a lot of technical capacity want an easier way to track current vehicle status and location. This approach is more functional than a GBFS feed and doesn't require the full technical capability need to ingest a full MDS provider feed.

Also, at their last developer workshop, the Mobility Data folks talked about pivoting GBFS towards serving only consumer-facing applications (rather than regulatory ones), and so this offers clear replacement functionality for cities that had been using GBFS as part of their regulatory approach.

Looking forward to seeing this get buttoned up, hopefully for our next release.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@sarob sarob added the enhancement New feature or request label Dec 19, 2019
provider/README.md Outdated Show resolved Hide resolved
@jfh01
Copy link
Contributor

jfh01 commented Jan 9, 2020

Documenting a few notes from the discussion on 12/19. Corrections/edits welcome.

On-trip vehicles:
Agreed that they would be marked as on-trip in the end-point. Location would be the start location of the trip, not the current, actual location.

Jurisdiction:
When a vehicle is moving across jurisdictional boundaries, it was agreed that it would reflect in the feed of the jurisdiction in which the last status change event occurred. So a vehicle that starts in a jurisdiction would stay in that jurisdiction's feed until such time as a new status change occurred in a new jurisdiction.

Missing/lost vehicles:
Was agreed that this concept needs to be formalized in /vehicles and in other places in the spec.

Querying:
Request that /vehicles support REST-style querying by vehicle ID.

Data update speed / latency:
Question was raised as to whether the spec should provide guidance as to how up-to-date the /vehicles endpoint should be. My notes do not indicate whether consensus was reached, though it was noted that GBFS does not define a maximum (or minimum) latency in the spec.

@jfh01
Copy link
Contributor

jfh01 commented Jan 16, 2020

Need to ask a missing/lost status. Some notes here: #67

@hunterowens
Copy link
Collaborator

For now, no need to support REST style querying.

@thekaveman thekaveman added the Schema Implications for JSON Schema or OpenAPI label Feb 13, 2020
@thekaveman
Copy link
Collaborator

@hyperknot brought up an important question earlier in the comment stream that didn't come up on the 2/13 call discussion of this proposal:

What vehicles to include. This is the most important question.

  • Currently it states: Only vehicles that are currently in an available, unavailable, or reserved states should be returned in this payload.

...
=> I'd propose a simplified approach, which solves all these problems:
Include all vehicles seen in the last n days (say 3 or 7), showing their last known state.

@karcass showed support for this simplified approach, and I also think this makes the most sense.

Let's try a vote: 👍 this comment to support the notion of including all vehicles with an event in the last N days (let's say 3 days for starters). 👎 this comment to support the notion of including all vehicles in specific event_type states (the current wording of the proposal).

@billdirks
Copy link
Contributor

I would be great if some operators would vote on this since it was proposed by an operator to help solve a pain point for them. Right now I'm giving a tenative 👎 until I understand their needs better.

I believe the purpose of this PR is to include a real-time snapshot of the vehicles on the PROW. This allows one to count the number of vehicles in real-time in a way that is simpler than playing back status changes. Including vehicles that are known to be no longer on the PROW (ie removed, lost, and stolen vehicles) seems to be at odds with the stated objective since a consumer will now have to process the returned payload before they can understand what is on the PROW. The problem this suggestion asserts to solve is the non-inclusion of out-of-service vehicles. However, I believe this use case is already addressed since vehicles marked "unavailable" with reason "maintenance" are these out-of-service vehicles.

I think this could also complicate the implementation. The implementer needs to know about vehicles which were removed days earlier in addition to the current world view.

@bhandzo bhandzo requested a review from a team as a code owner February 14, 2020 00:38
@bhandzo bhandzo requested a review from a team February 14, 2020 00:38
provider/README.md Outdated Show resolved Hide resolved
@thekaveman
Copy link
Collaborator

thekaveman commented Feb 21, 2020

@bhandzo I'm doing some work on the JSON Schema and notice that your branch is a ways out of date with dev here... mind if I rebase to clean up future merge conflicts?

@thekaveman
Copy link
Collaborator

@bhandzo please sign the CLA as soon as possible.

* current location is different from last event; AND
* vehicle is not currently onl a trip
@bhandzo bhandzo requested a review from a team February 27, 2020 18:22
@thekaveman
Copy link
Collaborator

thekaveman commented Feb 27, 2020

...
Let's try a vote: 👍 this comment to support the notion of including all vehicles with an event in the last N days (let's say 3 days for starters). 👎 this comment to support the notion of including all vehicles in specific event_type states (the current wording of the proposal).

Decided on the 2020/02/27 call to leave language as-is (vehicles should show up on this endpoint if they are in one of the given states). Updating the schema to reflect this.

Copy link
Collaborator

@thekaveman thekaveman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll plan to merge this later today (2020/02/27) unless I hear back from anyone. Please take a look if you have time, particularly at the JSON Schema.

@bhandzo @johnpena @hyperknot @ascherkus @thekidder @karcass @billdirks @rf- @jrheard @whereissean

@quicklywilliam
Copy link
Contributor

Just making note that there is a request for an "authenticated GBFS feed" with similar functionality in GBFS now that stable vehicle IDs have been removed. I believe several cities already require something like this, and several operators offer it.

I'd love to see a single standard get adopted here.
MobilityData/gbfs#237

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Provider Specific to the Provider API Schema Implications for JSON Schema or OpenAPI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add an available vehicles endpoint to provider