Add Provider VOI #338

Merged
merged 1 commit into from Oct 11, 2019

jeespers

Explain pull request

Please provide a clear and concise reason for this pull request and the impact of the change

Is this a breaking change

A breaking change would require consumers or implementors of the API to modify their code for it to continue to function (ex: renaming of a required field or the change in data type of an existing field). A non-breaking change would allow existing code to continue to function (ex: addition of an optional field or the creation of a new optional endpoint).

  • Yes, breaking
  • No, not breaking
  • I'm not sure

Provider or agency

Which API(s) will this pull request impact:

  • Provider
  • Agency
  • Both

@jeespers jeespers requested review from hunterowens, thekaveman and a team as code owners July 11, 2019 08:41
@hunterowens
Collaborator

@jeespers I'm getting a 404 on the GBFS endpoint. Has VOI launched yet?

@Zatte
Contributor

Zatte commented Jul 16, 2019

@hunterowens: Hi, @jeespers just went on vacation, so let's see if I can help out.

I guess you are referring to curl https://mds.voiapp.io/v1/gbfs/ => 404

however if you try

curl https://mds.voiapp.io/v1/gbfs/gbfs.json
{"code":"401.2","message":"Unauthorized, Token Invalid"}

We could index gbfs.json under / but this doesn't seem to be a GBFS requirement?

I can provide you with sample API credentials if you want to try / verify the API.
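
A minimal Python sketch of the same check, for anyone who wants to script it rather than run curl by hand. The helper names `classify` and `probe` are hypothetical, and the mapping from status code to label is an assumption based only on the two responses quoted above (404 on the directory, 401 on gbfs.json).

```python
import urllib.error
import urllib.request
from typing import Tuple


def classify(status: int) -> str:
    """Map an HTTP status code to a rough description of the feed's availability."""
    if status == 200:
        return "public"
    if status in (401, 403):
        return "requires credentials"
    if status == 404:
        return "not found"
    return "other"


def probe(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """GET the URL without credentials and return (status code, classification).

    urllib raises HTTPError for 4xx/5xx responses, so the status code is read
    from the exception in that case. Network failures (URLError) are not
    handled here and will propagate to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code
    return status, classify(status)
```

Calling `probe` on https://mds.voiapp.io/v1/gbfs/ and on https://mds.voiapp.io/v1/gbfs/gbfs.json should make the difference between a missing index and a token-protected feed visible at a glance.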

@thekaveman
Collaborator

Hi @Zatte @jeespers, since GBFS is meant to be public there shouldn't be any credentials requirement for these endpoints.

@marie-x
Collaborator

marie-x commented Aug 2, 2019

Hm, I'm not getting 401, but I'm not getting any content either.

max$ curl https://mds.voiapp.io/v1/gbfs/
max$ 

Edit: I'm dumb, I forgot the -I when invoking curl

@drPytho

drPytho commented Aug 12, 2019

@karcass as noted by @Zatte

I guess you are referring to curl https://mds.voiapp.io/v1/gbfs/ => 404
however if you try

curl https://mds.voiapp.io/v1/gbfs/gbfs.json
{"code":"401.2","message":"Unauthorized, Token Invalid"}

@drPytho

drPytho commented Aug 12, 2019

@thekaveman due to GDPR regulations we need to be able to control that the consumers of the data are compliant.

To my understanding, a lot of the existing providers also require authentication.

@stkdiretto

@drPytho could you please outline your legal reasoning why GBFS data (i.e. where your scooters are ready for renting) would even begin to touch GDPR regulation? Have you, unbeknownst to the public, deployed sentient scooters that would make them identified or identifiable individuals as subjects of the GDPR? And, if yes, how? And, more importantly, why?

@drPytho

drPytho commented Aug 12, 2019

deployed sentient scooters that would make them identified or identifiable individuals as subjects of the GDPR? And, if yes, how? And, more importantly, why?

For fun and for profit 😉

No, but location data for how scooters move through the cities is to be considered personal data if it can be connected to an individual, which location data definitely can be.

Although sentient scooters would be amazing, and probably a fruitless side project from now on, that is not our main objection to exposing this data publicly :)

@kyle-nycdot

GBFS for dockless micromobility services is adequately anonymized (for public exposure) by excluding vehicle ID. This prevents individual trip information from being inferred by monitoring the endpoint. See such trip inference in action, for example, in these two papers from Grant McKenzie (@grantdmckenzie) at McGill U: 10.1016/j.jtrangeo.2019.05.007 and 10.4230/LIPIcs.GIScience.2018.46

Even though GBFS only tracks hardware, the movement of that hardware can expose what could, and probably should, be considered PII. This isn't true for docked systems, because stationary docking locations spatially bin the intended trip destination, and because station_status.json tracks stations, not bikes. Because GBFS was adapted for dockless micromobility services with free_bike_status.json, the exact location of a trip end is easily determined. Since a user is likely to end their trip as close as possible to their final destination, this gets very murky once that location can be identified as one half of an origin-destination (O-D) pair.

That being said, a vehicle ID should still be included for use by city agencies, as it is too important from a regulatory standpoint. Any GBFS endpoint for use by an agency should be authenticated with an access token or otherwise.
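
To make the inference concrete, here is a minimal Python sketch of how two polls of a free_bike_status.json feed with persistent vehicle IDs could be diffed into trip endpoints. The snapshots, IDs, and coordinates are made up for illustration, `infer_trips` is a hypothetical helper, and only the `bike_id`, `lat`, and `lon` fields of the feed are assumed.

```python
from typing import Dict, List, Tuple

Position = Tuple[float, float]


def positions_by_id(snapshot: List[dict]) -> Dict[str, Position]:
    """Index one feed snapshot by vehicle ID."""
    return {bike["bike_id"]: (bike["lat"], bike["lon"]) for bike in snapshot}


def infer_trips(before: List[dict], after: List[dict]) -> List[Tuple[str, Position, Position]]:
    """Return (bike_id, origin, destination) for every vehicle that moved between polls.

    This only works because the same bike_id appears in both snapshots.
    """
    start = positions_by_id(before)
    end = positions_by_id(after)
    return [
        (bike_id, start[bike_id], end[bike_id])
        for bike_id in sorted(start.keys() & end.keys())
        if start[bike_id] != end[bike_id]
    ]


# Hypothetical snapshots taken some time apart.
before = [
    {"bike_id": "abc", "lat": 59.3293, "lon": 18.0686},
    {"bike_id": "def", "lat": 59.3326, "lon": 18.0649},
]
after = [
    {"bike_id": "abc", "lat": 59.3420, "lon": 18.0500},
    {"bike_id": "def", "lat": 59.3326, "lon": 18.0649},
]

trips = infer_trips(before, after)
print(trips)
```

Without a stable `bike_id` in the public feed, the join in `infer_trips` has nothing to key on, which is the sense in which excluding the ID blocks this simple form of monitoring.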

@mobilitygirl

GBFS, housed by NABSA, is currently undergoing an update led by a consulting firm, supported by a broader consulting bench that includes Populus, Transit, and others.

I encourage folks who are interested in following this conversation to register their interest here.

In addition to evaluating and updating GBFS as a standard, the team will also develop best practices to protect traveler privacy. My $0.02 is that cities need access to vehicle IDs for regulatory and transportation planning purposes, but that it is also reasonable to have security policies associated with potentially more sensitive data feeds that can be used to re-identify individuals.

@stkdiretto

No, but location data for how scooters move through the cities is to be considered personal data if it can be connected to an individual, which location data definitely can be.

Is this based on any real-world legal counsel you have received? Even taking into account the possibility of rotating or omitting vehicle IDs? I'm kind of at a loss here. Are you even considering providing GBFS to the general public, and would any input on how to make GBFS GDPR-compliant be welcome?

@drPytho

drPytho commented Aug 13, 2019

We want to be able to provide it to the public and to other parties who want to use the data, but we want/need control so we can validate that they are compliant and are not misusing the data. We also want to be able to rate-limit access and understand access patterns.

Rotating and/or omitting vehicle IDs would go a long way, however, the cities will not like that and we don't have the time to implement an anonymized version to have it be completely open.

I hope this PR can still be approved but for now, the API will be locked down, and anyone can request API keys to get access to the data.
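
For reference, a rough Python sketch of what rotating or omitting vehicle IDs in a public copy of the feed could look like. This is an illustration under stated assumptions, not a description of any provider's implementation: `public_view` is a hypothetical helper, and it hands out a fresh random ID on every publish, which is the simplest variant (rotating only after each trip would need per-vehicle state).

```python
import secrets
from typing import List


def public_view(snapshot: List[dict], omit_id: bool = False) -> List[dict]:
    """Return a copy of the snapshot that is safe to diff across polls.

    Each record keeps its other fields (for example lat/lon) but either drops
    bike_id entirely or replaces it with a fresh random token, so an ID seen
    in one published snapshot cannot be matched against the next one.
    """
    public = []
    for bike in snapshot:
        record = {key: value for key, value in bike.items() if key != "bike_id"}
        if not omit_id:
            record["bike_id"] = secrets.token_hex(8)  # 16 random hex characters
        public.append(record)
    return public


# Hypothetical internal record with a persistent ID.
internal = [{"bike_id": "abc", "lat": 59.3293, "lon": 18.0686}]
print(public_view(internal))
print(public_view(internal, omit_id=True))
```

The internal snapshot is left untouched, so an authenticated feed for city agencies could still serve the persistent IDs.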

@stkdiretto

Again, could you please provide a concrete legal reasoning how this constitutes a real-life problem within the GDPR's jurisdiction that goes beyond “I wave my hands and invoke the magical GDPR incantation that makes unwanted eyes go away”?

If what you're saying were true, how would your exposing a (non-rotating) vehicle ID and location through your own application or, say, endpoints of your API (that one might find documented somewhere™ on the net) which supply data in other formats than GBFS be substantially different from giving access to your GBFS API?

@drPytho

drPytho commented Aug 15, 2019

I really don't have time in my day to argue this issue with concrete legal reasoning. I am very enthusiastic about both open source and open data and wish to be able to provide to this market. However, my first obligation is to protect our users' data.

We have therefore made the final decision to be able to control the access to make sure that the users are in good hands. As well as being able to control the integrity of our systems and to rate-limit requests.

To make this clear, we will not provide an unauthenticated endpoint for this data but anyone can always contact us to request access, and we encourage this!

I hope this is good enough for @hunterowens and @thekaveman.
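
As an illustration of the rate-limiting point, here is a minimal per-API-key token bucket in Python. It is a sketch under assumptions, not the provider's actual mechanism: the class name `TokenBucketLimiter`, its parameters, and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Allow up to `burst` requests at once per key, refilled at `rate_per_sec`."""

    def __init__(
        self,
        rate_per_sec: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.burst = float(burst)
        self.clock = clock
        # api_key -> (tokens currently available, time of last update)
        self._state: Dict[str, Tuple[float, float]] = {}

    def allow(self, api_key: str) -> bool:
        """Return True and consume one token if the key is within its limit."""
        now = self.clock()
        tokens, last = self._state.get(api_key, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._state[api_key] = (tokens - 1.0, now)
            return True
        self._state[api_key] = (tokens, now)
        return False
```

Keying the bucket on the API key is what ties rate limiting to authentication here; an unauthenticated feed would have to fall back to something coarser, such as the client IP address.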

@stkdiretto

I really don't have time in my day to argue this issue with concrete legal reasoning. I am very enthusiastic about both open source and open data and wish to be able to provide to this market. However, my first obligation is to protect our users' data.

And I call bullshit on that. You have been and are continuing to expose exactly the kind of data that would be expected from you through your existing APIs. No signup, no access tokens, nothing. If your publishing open access GBFS endpoints would be in breach of GDPR regulation, you would have been in breach of exactly this regulation for months now.

As well as being able to control the integrity of our systems and to rate-limit requests.

Well, then at least have the courtesy to call it as it is: You, as a corporation, have made the decision not to provide a second, best-effort GBFS endpoint without authentication. You have been shown mitigation measures and just don't find the time to implement them. There is absolutely no need to hide behind legal fiction in order to take this stance.

@drPytho

drPytho commented Aug 20, 2019

And I call bullshit on that.

Cool man... 👏 👏 👏

You have been and are continuing to expose exactly the kind of data that would be expected from you through your existing APIs.

Unfortunately correct, this is something we are working on closing down and will be locked down as soon as possible.

If your publishing open access GBFS endpoints would be in breach of GDPR regulation, you would have been in breach of exactly this regulation for months now.

Sort of correct. We want to be on the safer side: there is a gray area in the regulations about when something counts as personal data and what level of protection it needs. (Call bullshit on this too if you want.) We won't improve by opening up more data.

Well, then at least have the courtesy to call it as it is: You, as a corporation, have made the decision not to provide a second, best-effort GBFS endpoint without authentication. You have been shown mitigation measures and just don't find the time to implement them. There is absolutely no need to hide behind legal fiction in order to take this stance.

Yes, we have decided not to put work into randomizing the data, for multiple reasons. First and foremost, anyone who wants access can get it by requesting an API key from us, which also gives them higher-quality data. Even if there were no privacy issue, we would still want to be able to control request flow and rate-limit access, which anyone building a system would be dumb not to have.

Hope this makes you satisfied @stkdiretto.

@hunterowens
Collaborator

Hi all-

I'm merging this, but I've asked the folks at NABSA to clarify. Can you all move discussion over there? cc MobilityData/gbfs#184

@hunterowens hunterowens merged commit 31e8f3d into openmobilityfoundation:dev Oct 11, 2019

9 participants