Define semantic cardinality for GTFS-realtime fields #64

Merged
merged 5 commits into from Aug 9, 2017

5 participants
@barbeau
Collaborator

barbeau commented Jun 28, 2017

  • As originally discussed in #19 and https://groups.google.com/forum/#!msg/gtfs-realtime/wm3W7QIEZ9Y/DLyWKkknJyoJ, the current GTFS-realtime spec documentation describes field Protocol Buffer cardinality, not semantic cardinality. This has created confusion for consumers and producers where fields have been omitted based on them being labeled as "optional" in the GTFS-realtime spec, even if they were required under certain logical transit conditions (e.g., stop_sequence must be provided if a trip contains a loop).
  • This patch changes the "Cardinality" documentation to define the semantic cardinality for each data element as "required", "conditionally required", and "optional", with an explanation for when "conditionally required" fields are required in the "Description" section of that field. EDIT - Field requirements are now being defined in a new field "Required" - see #64 (comment) and discussion that follows. It also bumps the gtfs_realtime_version in the .proto file to 2.0 so validators can strictly enforce semantic cardinality based on the gtfs_realtime_version.
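
As an illustration of the version-gated enforcement described above, here is a minimal sketch of how a validator might check the `stop_sequence` condition only for feeds that declare `gtfs_realtime_version` 2.0 or later. The dataclasses, `parse_version`, `trip_has_loop`, and `check_stop_sequence` are hypothetical names invented for this example (they are not part of the spec or of any GTFS-realtime library), and "loop" is assumed to mean that the scheduled trip visits the same `stop_id` more than once.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class StopTimeUpdate:
    stop_id: str
    stop_sequence: Optional[int] = None


@dataclass
class TripUpdate:
    trip_id: str
    stop_time_update: List[StopTimeUpdate] = field(default_factory=list)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a version string such as "2.0" into a comparable tuple (2, 0)."""
    return tuple(int(part) for part in version.split("."))


def trip_has_loop(scheduled_stop_ids: List[str]) -> bool:
    """Assumption: a trip contains a loop if any stop_id is visited twice."""
    return len(set(scheduled_stop_ids)) < len(scheduled_stop_ids)


def check_stop_sequence(update: TripUpdate,
                        scheduled_stop_ids: List[str],
                        gtfs_realtime_version: str) -> List[str]:
    """Return one problem string per StopTimeUpdate that lacks stop_sequence.

    The check is skipped for feeds older than 2.0, and for trips without
    a loop, where stop_sequence remains optional.
    """
    if parse_version(gtfs_realtime_version) < (2, 0):
        return []
    if not trip_has_loop(scheduled_stop_ids):
        return []
    return [
        f"trip {update.trip_id}: stop_sequence missing for stop {stu.stop_id}"
        for stu in update.stop_time_update
        if stu.stop_sequence is None
    ]
```

Gating on the declared version keeps existing 1.0 feeds from suddenly failing validation, while 2.0 feeds opt in to the stricter semantic rules.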

Announced on GTFS-realtime Google Group at:
https://groups.google.com/forum/#!topic/gtfs-realtime/J-csZSxeWjs

Define semantic cardinality for GTFS-realtime fields
* As originally discussed in #19 and https://groups.google.com/forum/#!msg/gtfs-realtime/wm3W7QIEZ9Y/DLyWKkknJyoJ, the current GTFS-realtime spec documentation describes field Protocol Buffer cardinality, not semantic cardinality.  This has created confusion for consumers and producers where fields have been omitted based on them being labeled as "optional" in the GTFS-realtime spec, even if they were required under certain logical transit conditions (e.g., stop_sequence must be provided if a trip contains a loop).
* This patch changes the "Cardinality" documentation to define the semantic cardinality for each data element as "required", "conditionally required", and "optional".  It also bumps the gtfs_realtime_version in the .proto file to 2.0 so validators can strictly enforce semantic cardinality based on the gtfs_realtime_version.

@googlebot googlebot added the cla: yes label Jun 28, 2017

@barbeau

Collaborator

barbeau commented Jun 28, 2017

I should also mention that with the exception of making the header timestamp required, I believe all of the other field cardinalities reflect existing GTFS-rt documentation. So, this isn't so much proposing something new as it is trying to capture existing logic spread around the docs into a concise description in the Cardinality and Description fields.

### Term Definitions
* **required**: Exactly one
* **repeated**: Zero or more

@RachM

RachM Jun 29, 2017

Contributor

I'm concerned that repeated has been removed; I think it should be included so that people know they can supply 0 or more.

@barbeau

barbeau Jun 29, 2017

Collaborator

@RachM I had concerns about removing repeated too, but wasn't sure how to represent this in the table alongside the new info. Currently I added text to the "Description" field, but I do like the ability to scan the table and easily see which fields are 0+ elements.

Questions for this:

  • Should we add a new column for one/many relationships? If so, what should this be called?
  • Do we leave this as the "Cardinality" column and change the name of the column for the new semantic fields to "Requirement", or something similar?

Also, I've never liked the term repeated for this concept because to my knowledge it's very protocol buffer (implementation)-specific. I'd prefer to find another term - suggestions?

@RachM

RachM Jun 30, 2017

Contributor

Do we leave this as the "Cardinality" column and change the name of the column for the new semantic fields to "Requirement", or something similar?

I think that's the best approach. My reasoning:

  • Cardinality best describes the number of entities allowed in that field, so let's keep it purely to numbers. Of course the savvy user may determine the "requirement" from the number, but I think it's useful to explicitly state the "requirement" in a separate column.
  • A new column, Required or something similar makes it very clear that the user should check that column to see if the field is needed.

I think it's good to separate the two concepts: a) is it needed (check the Required column) and b) how many can I provide (check the Cardinality column).

I agree about repeated; it's not user friendly. If you have the Cardinality column, then the Required column entries become: Yes (for required) and No (for optional and repeated).

@barbeau

barbeau Jun 30, 2017

Collaborator

Sounds good on adding the new column.

So here's what I'm working on now to update this proposal, in summary:

  • Required column will address semantic requirements
  • Cardinality column will include information about how many elements should be provided for a particular field

The goal of this proposal is to represent the semantic requirements of fields, based on transit use cases and domain logic. The main challenge here is that there isn't a direct mapping between semantic requirements in Required and protobuf cardinality. As a result I think it will be confusing to show semantic requirements in Required and protobuf cardinality in Cardinality.

I'd prefer to keep the spec documentation independent of the implementation - in other words, the docs should be purely semantic, with the .proto file being the reference for the protobuf cardinality.

With this in mind, I propose that we define semantic cardinality something like the following:

Cardinality represents the number of elements that may be provided for a particular field:

Always reference the Required and Description fields to see when a field is required, conditionally required, or optional. Please reference gtfs-realtime.proto for Protocol Buffer cardinality.

Using these definitions, FeedMessage and TripUpdate would look like the following:

FeedMessage

| Field Name | Type | Required | Cardinality | Description |
|---|---|---|---|---|
| header | FeedHeader | Required | One | Metadata about this feed and feed message. |
| entity | FeedEntity | Required | Many | Contents of the feed. |

TripUpdate

| Field Name | Type | Required | Cardinality | Description |
|---|---|---|---|---|
| trip | TripDescriptor | Required | One | XXXX |
| vehicle | VehicleDescriptor | Optional | One | XXXX |
| stop_time_update | StopTimeUpdate | Conditionally required | Many | XXXX |
| timestamp | uint64 | Optional | One | XXXX |
| delay | int32 | Optional | One | XXXX |
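
To make the split between the two columns concrete, the TripUpdate table above could be encoded as data along the following lines. This is only a sketch: `Required`, `Cardinality`, `FieldSpec`, `TRIP_UPDATE_SPEC`, and `check_message` are hypothetical names made up for this example, and messages are modeled as plain dicts rather than real protobuf objects. "Conditionally required" fields are deliberately not enforced here, because their conditions live in each field's Description.

```python
from enum import Enum
from typing import Dict, List, NamedTuple


class Required(Enum):
    REQUIRED = "Required"
    CONDITIONALLY_REQUIRED = "Conditionally required"
    OPTIONAL = "Optional"


class Cardinality(Enum):
    ONE = "One"
    MANY = "Many"


class FieldSpec(NamedTuple):
    type_name: str
    required: Required
    cardinality: Cardinality


# Transcribed from the proposed TripUpdate table.
TRIP_UPDATE_SPEC: Dict[str, FieldSpec] = {
    "trip": FieldSpec("TripDescriptor", Required.REQUIRED, Cardinality.ONE),
    "vehicle": FieldSpec("VehicleDescriptor", Required.OPTIONAL, Cardinality.ONE),
    "stop_time_update": FieldSpec(
        "StopTimeUpdate", Required.CONDITIONALLY_REQUIRED, Cardinality.MANY
    ),
    "timestamp": FieldSpec("uint64", Required.OPTIONAL, Cardinality.ONE),
    "delay": FieldSpec("int32", Required.OPTIONAL, Cardinality.ONE),
}


def check_message(message: dict, spec: Dict[str, FieldSpec]) -> List[str]:
    """Check presence (Required column) and shape (Cardinality column)."""
    problems = []
    for name, field_spec in spec.items():
        value = message.get(name)
        if value is None:
            # Only unconditional requirements can be checked without context.
            if field_spec.required is Required.REQUIRED:
                problems.append(f"{name}: required field is missing")
            continue
        is_list = isinstance(value, list)
        if field_spec.cardinality is Cardinality.ONE and is_list:
            problems.append(f"{name}: expected one element, got a list")
        if field_spec.cardinality is Cardinality.MANY and not is_list:
            problems.append(f"{name}: expected a list of elements")
    return problems
```

Keeping the two attributes independent mirrors the proposal: a field such as `stop_time_update` can be "Many" without that saying anything about whether it must be present.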
@barbeau

Collaborator

barbeau commented Jun 29, 2017

@jxeeno Good catch on the schedule_relationship conditional states - I've pushed commits to fix both of these.

Separate "Required" and "Cardinality" into different columns
* A new "Required" field holds the information as to whether a particular field is Required, Optional, or Conditionally required
* The existing "Cardinality" field is changed to a semantic version of cardinality instead of using Protobuf cardinality
@barbeau

Collaborator

barbeau commented Jun 30, 2017

I've pushed a new commit 040f49c into this PR that creates a new column for "Required", and uses semantic cardinality definitions (One and Many) in "Cardinality" instead of protobuf cardinality (as proposed in #64 (comment)).

Feedback welcome! I'm definitely open to comments on how to improve this.

@barbeau

Collaborator

barbeau commented Jul 12, 2017

Any additional comments on this proposal? I'd like to make any further changes based on feedback prior to calling a vote.

@RachM

Contributor

RachM commented Jul 13, 2017

LGTM - I'm happy for voting to commence.

@barbeau

Collaborator

barbeau commented Jul 13, 2017

Great! I'd like to call for a vote on this proposal then - voting will end Thursday July 20th at 23:59:59 UTC.

@RachM

Contributor

RachM commented Jul 13, 2017

I vote yes (on behalf of Google).

@jxeeno

jxeeno commented Jul 14, 2017

+1

Change FeedMessage.entity to "Conditionally required"
* This field can't be "Required", because there are legitimate scenarios where a feed may not have any real-time information about the system (e.g., if it's 3am, and the agency doesn't offer any service at 3am).  So, change to "Conditionally required" with description.
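
The difference this commit makes can be sketched as follows. The function `check_feed_message` and its `has_realtime_info` flag are hypothetical, introduced only for illustration; in particular, the exact wording of the condition under which `entity` becomes required is an assumption here and should be taken from the field's Description in the spec. Messages are again modeled as plain dicts.

```python
from typing import List


def check_feed_message(feed: dict, has_realtime_info: bool) -> List[str]:
    """Validate the two top-level FeedMessage fields.

    header is always required. entity is only required when the producer
    actually has real-time information to report (assumed condition), so an
    empty feed at 3am with no service running is not an error.
    """
    problems = []
    if feed.get("header") is None:
        problems.append("header: required field is missing")
    entities = feed.get("entity") or []
    if has_realtime_info and not entities:
        problems.append("entity: required when real-time information is available")
    return problems
```

A validator that treated `entity` as plain "Required" would flag every overnight empty feed, which is the false positive this change avoids.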
@barbeau

Collaborator

barbeau commented Jul 31, 2017

Sorry for the delay updating this proposal - I was out on vacation last week.

Unfortunately we currently don't have the 3 needed votes for this to pass (we're at 2), and the voting period ended July 20th.

This ends up being a good thing, as I realized there was one issue in this proposal (FeedMessage.entity should be "Conditionally required" - I just fixed this in 3c24ff7, see commit description for more info).

So, I'd like to call for another vote on this PR - voting will end Tuesday Aug 8th at 23:59:59 UTC.

@RachM

Contributor

RachM commented Jul 31, 2017

I vote yes.

@jxeeno

jxeeno commented Aug 1, 2017

+1 I vote yes.

@qwasar

qwasar commented Aug 4, 2017

@barbeau

Collaborator

barbeau commented Aug 9, 2017

Alright, looks like this passed the vote - 3 yes votes and 0 nos! Congrats everyone, we have GTFS-realtime v2.0! 🎊 🎊 🎊

This includes better documentation for semantic cardinality and requirements - which fields are required and under what conditions.

@RachM Would you be able to squash and merge this using the GitHub web UI, so one commit ends up in the master branch?

Alternatively, I could force-push a new single commit to replace the iterative commits currently in the PR, although that would erase the history of commits made in response to feedback on the proposal.

@RachM RachM merged commit eb4b243 into google:master Aug 9, 2017

1 check passed

cla/google All necessary CLAs are signed

@barbeau barbeau deleted the barbeau:semantic-cardinality branch Aug 10, 2017

barbeau added a commit to barbeau/transit that referenced this pull request Aug 31, 2017

RachM added a commit that referenced this pull request Sep 3, 2017

@barbeau barbeau referenced this pull request Feb 8, 2018

Merged

Update to README #85
