docs/guidance/publication.md: Add guidance on location obfuscation #263

Merged
duncandewhurst merged 5 commits into 0.3-dev from 133-location-obfuscation on May 17, 2023


duncandewhurst
Collaborator

@duncandewhurst duncandewhurst commented May 3, 2023

Related issues

Merge checklist

  • Update the changelog (style guide)
  • Run ./manage.py pre-commit to update derivative schema files, reference documentation and examples

If there are changes to network-schema.json, network-package-schema.json, reference/publication_formats/json.md, reference/publication_formats/geojson.md or guidance/publication.md#how-to-publish-large-networks, update the relevant manually authored examples:

  • examples/json/:
    • network-package.json
    • api-response.json
    • multiple-networks.json
    • network-embedded.json
    • network-separate-endpoints.json
    • network-separate-files.json
    • nodes-endpoint.json
    • spans-endpoint.json
  • examples/geojson/:
    • api-response.geojson
    • multiple-networks.geojson

If you used a validation keyword, type or format that is not already used in the schema:

If you added a normative rule that is not encoded in JSON Schema:

If there are changes to examples/geojson/nodes.geojson or examples/geojson/spans.geojson, check and update the data use examples:

  • examples/leaflet/leaflet.ipynb
  • examples/qgis/geojson.qgs

@duncandewhurst duncandewhurst marked this pull request as ready for review May 3, 2023 22:23
@duncandewhurst
Collaborator Author

@stevesong please could you check that you're happy with the guidance added in this PR? It is linked from the decide what data to publish section of the publication process guidance.

@stevesong
Contributor

Do we want to make more explicit the distinction between what is captured in the standard and what is exported? Operators may choose to capture data in the standard at a high level of detail but export it for club or public use to different levels of precision. I feel like we should get that across in some form, lest operators default to less precision.

@duncandewhurst
Collaborator Author

duncandewhurst commented May 8, 2023

Since that idea of levels of sharing can apply to any field, I suggest that we add a paragraph to decide what data to publish (new content in italics). At the same time, we can fix the extra words in the final sentence:

Decide what data to publish

Bearing in mind your priority use cases, you ought to review the OFDS schema and decide which fields you want to publish.

OFDS is designed for the public disclosure of open data. However, you can also use it to structure data that you want to share only with specific partners and data that you want to keep within your own organisation. As such, this step can involve deciding which fields to make public, which to share with partners and which to keep private.

Most fields in the OFDS schema are optional. However, the more fields you publish, the more useful your data will be.

If you are concerned about disclosing sensitive location data, see how to obfuscate location data.

We can then update the text under how to obfuscate location data as follows (new content in italics). At the same time we can correct the erroneous use of a normative keyword (should) on a non-normative documentation page that I introduced in this PR:

How to obfuscate location data

If you’re concerned about disclosing the exact location of fibre infrastructure, you can truncate the coordinates of node locations and span routes in your public or shared data to obfuscate their exact locations, whilst retaining the precise coordinates for use within your own organisation. Before truncating coordinates, you ought to consider what level of accuracy is required to satisfy your priority use cases. You can use the following table as a guide to the relationship between coordinate precision and accuracy:
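
To illustrate the truncation step, here is a minimal Python sketch. It assumes node locations and span routes are held as GeoJSON-style geometry objects with a `coordinates` member; the helper names `truncate_coordinate` and `truncate_geometry` and the sample coordinates are hypothetical and are not part of OFDS or this PR. The number of decimal places to keep is for the publisher to decide based on their priority use cases.

```python
from decimal import Decimal, ROUND_DOWN


def truncate_coordinate(value, places):
    """Truncate (not round) a coordinate to the given number of decimal places.

    Decimal is used so that the result does not depend on binary
    floating-point representation of the input.
    """
    quantum = Decimal(10) ** -places  # e.g. places=3 gives Decimal('0.001')
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_DOWN))


def truncate_geometry(geometry, places):
    """Return a copy of a GeoJSON-style geometry with truncated coordinates.

    Handles any nesting depth, so it works for Point and LineString alike.
    The original geometry is left untouched for internal use.
    """
    def walk(coords):
        if isinstance(coords, (int, float)):
            return truncate_coordinate(coords, places)
        return [walk(c) for c in coords]

    return {**geometry, "coordinates": walk(geometry["coordinates"])}


# Hypothetical node location, truncated to 3 decimal places for publication.
node_location = {"type": "Point", "coordinates": [-0.18696, 5.6037]}
print(truncate_geometry(node_location, 3))
# {'type': 'Point', 'coordinates': [-0.186, 5.603]}
```

Because the function returns a new object, the precise geometry remains available for use within the organisation while the truncated copy goes into the public or shared data.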

Does that sound good?

@stevesong
Contributor

That sounds reasonable @duncandewhurst

@duncandewhurst duncandewhurst requested a review from rhiaro May 8, 2023 22:07
@duncandewhurst duncandewhurst merged commit bfec53a into 0.3-dev May 17, 2023
8 of 10 checks passed
@duncandewhurst duncandewhurst deleted the 133-location-obfuscation branch May 17, 2023 23:09
Development

Successfully merging this pull request may close these issues.

Add guidance on location obfuscation to address concerns about disclosing detailed location data
3 participants