[DATA] Access to global datasets from 3rd parties #31

Closed
matamadio opened this issue May 19, 2021 · 2 comments
Labels
dataset Data ingestion into RDL exposure Issues related to Exposure data hazard Issues related to Hazard data

Comments

@matamadio
Contributor

matamadio commented May 19, 2021

Overview

Global datasets are often the only source of data for country-scale risk analysis in developing countries.
GFDRR gets a lot of requests for this kind of data to be used in projects.
We need to decide whether and how these datasets will be accessed via RDL and its upcoming workflows.

Example data

Exposure:

Hazard (see also sheet)

Use cases

  • Download of data for use in GIS analytics
  • Visualisation of maps in GIS or WebGIS
  • Processing in online analytical tools (e.g. clip to country extent, overlay with other data)
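
To make the "clip to country extent" use case concrete, here is a minimal sketch. It assumes point features given as (lon, lat) pairs and approximates the country extent with a bounding box; a real tool would more likely clip to the country polygon with a GIS library. The function name, the coordinates, and the bounding box are hypothetical.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]                # (lon, lat)
BBox = Tuple[float, float, float, float]   # (min_lon, min_lat, max_lon, max_lat)


def clip_points_to_bbox(points: Iterable[Point], bbox: BBox) -> List[Point]:
    """Keep only the points that fall inside the bounding box (edges included)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        (lon, lat)
        for lon, lat in points
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]


# Hypothetical extent and sample points, for illustration only.
country_bbox: BBox = (30.0, -5.0, 35.0, 0.0)
global_points: List[Point] = [(31.0, -2.0), (40.0, -2.0), (33.5, 1.0)]

clipped = clip_points_to_bbox(global_points, country_bbox)
print(clipped)
```

Only the first sample point lies inside the hypothetical extent, so it is the only one kept.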

Options

There are two options for including the data, each with pros and cons.

1. Include in the catalogue as metadata; the download link points to the original source (download page), with links to the API (data access) and WMS (data view) whenever the source provides them (e.g. OSM)
(+) No storage used
(+) Always up to date (new versions, etc.)
(-) Tied to the original data format, which may not be optimal (in particular for any pre-set analytical tool)
(-) Becomes inaccessible if the source is discontinued

2. Include in the catalogue pointing to a copy of the data in the RDL storage
(+) Reformat as optimal for workflow and alignment to schema
(+) Access granted independently of source changes
(-) Storage used
(-) Needs manual updating of versions
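
As a rough illustration of how the two options could look as catalogue records, here is a minimal sketch. The class and field names are hypothetical and are not taken from the RDL schema, and the URLs are placeholders. An entry always carries the link to the original source; an RDL-hosted copy is optional and, when present, is preferred for download.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record; field names are illustrative, not the RDL schema."""
    title: str
    source_url: str                      # download page at the original provider
    api_url: Optional[str] = None        # data access endpoint, if the source offers one
    wms_url: Optional[str] = None        # data view endpoint, if the source offers one
    rdl_copy_url: Optional[str] = None   # set only when a copy is kept in RDL storage (option 2)

    def download_url(self) -> str:
        """Prefer the RDL copy when one exists, otherwise point to the source."""
        return self.rdl_copy_url or self.source_url

    def hosting(self) -> str:
        """Report which of the two options this entry follows."""
        return "rdl-copy" if self.rdl_copy_url else "source-link"


# Option 1: metadata only, download goes to the original source (placeholder URLs).
linked = CatalogueEntry(
    title="Example global hazard layer",
    source_url="https://example.org/hazard/download",
    wms_url="https://example.org/hazard/wms",
)

# Option 2: same kind of record, plus a copy held in RDL storage (placeholder URLs).
copied = CatalogueEntry(
    title="Example global exposure layer",
    source_url="https://example.org/exposure/download",
    rdl_copy_url="https://example.org/rdl-storage/exposure.tif",
)

print(linked.hosting(), linked.download_url())
print(copied.hosting(), copied.download_url())
```

Keeping `source_url` on every entry means an option 2 record still credits and links the original producer, and an option 1 record can be upgraded later by filling in `rdl_copy_url`.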

@matamadio matamadio pinned this issue May 19, 2021
@matamadio matamadio changed the title Provision of access to global data from third parties Access to global datasets from 3rd parties May 25, 2021
@matamadio matamadio transferred this issue from another repository Mar 6, 2023
@matamadio matamadio changed the title Access to global datasets from 3rd parties [DATA] Access to global datasets from 3rd parties Mar 6, 2023
@matamadio matamadio added hazard Issues related to Hazard data exposure Issues related to Exposure data dataset Data ingestion into RDL labels Mar 6, 2023
@stufraser1
Member

stufraser1 commented Mar 8, 2023 via email

@pzwsk
Contributor

pzwsk commented Jul 12, 2023

Agree that option 1, i.e. referencing and pointing to the original source through the Risk Data Library catalog would already be a nice added value. I would imagine we could communicate on the fact the World Bank has improved its curation of global hazard layers thanks to the Risk Data Library Standard. Let's add this to FY24 workplan.

For option 2, I would need a better understanding of the efforts needed and the potential workflow but I don't think our role would be to become a data warehouse whereby we transform and maintain a copy of the datasets. Rather, we should seek collaboration with global data producers for them to provide the data according to our standards and/or transformation tools.

3 participants