Use cases

David Megginson edited this page Nov 1, 2016 · 15 revisions

This page introduces sample use cases for different filters and other features of the HXL Proxy.

Privacy and protection

| Goal | Suggestions |
| --- | --- |
| Convert individual records to aggregated totals. | Use the Count rows filter to aggregate by #adm1 (etc.). |
| Remove personal phone numbers from a dataset. | Use the Cut columns filter to remove any columns tagged #contact+phone. |
| Redact small numbers of affected people that could identify individuals. | Use the Replace data filter with a regular expression to replace small numbers in columns tagged #affected (optionally after normalising numbers using the Clean data filter first). |
| Remove data from sensitive sectors (e.g. gender-based violence) before sharing publicly. | Use the Select rows filter to remove rows based on the value of columns tagged #sector. |
| Remove records from an organisation that would prefer not to have its data shared. | Use the Select rows filter to remove rows based on the value of columns tagged #org. |
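
The redaction suggestion above can be sketched in plain Python. This is only an illustration of the idea, not the HXL Proxy's implementation: the sample rows, the single-digit threshold, the `<10` replacement text, and the function name `redact_small_numbers` are all assumptions made for this example.

```python
import re

# Hypothetical sample data: a header row, an HXL hashtag row, then data rows.
rows = [
    ["District", "People affected"],
    ["#adm1", "#affected"],
    ["Coast", "1200"],
    ["Plains", "3"],
    ["Hills", "45"],
    ["Delta", "9"],
]

# Matches a value that is exactly one digit (0 to 9).
# The threshold is an assumption; choose one that suits your data.
SMALL_NUMBER = re.compile(r"^[0-9]$")


def redact_small_numbers(rows, tag="#affected", replacement="<10"):
    """Replace small whole-cell numbers in every column tagged `tag`."""
    header, hashtags, *data = rows
    # Columns whose base hashtag is `tag`, with or without attributes.
    targets = [i for i, h in enumerate(hashtags) if h.split("+")[0] == tag]
    redacted = []
    for row in data:
        new_row = list(row)
        for i in targets:
            # Strip whitespace first, a rough stand-in for a cleaning step.
            new_row[i] = SMALL_NUMBER.sub(replacement, new_row[i].strip())
        redacted.append(new_row)
    return [header, hashtags, *redacted]


result = redact_small_numbers(rows)
```

Anchoring the pattern with `^` and `$` matters here: without the anchors, the digits inside a larger value such as `1200` would also be replaced.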

Reporting

| Goal | Suggestions |
| --- | --- |
| Extract all activities for a specific organisation. | Use the Select rows filter to select rows based on values in columns tagged #org. |
| Find the total number of organisations active in each district. | Use the Deduplicate rows filter to select all unique combinations of #adm1 and #org, then use the Count rows filter to count the number of rows for each #adm1. |
| Sum the total number of people in need in each country. | Use the Count rows filter with #country as the count tag, and #inneed as the aggregation tag. |
| Show only entries from 2014 or later. | Use the Select rows filter with #date>=2014 as the row query. |
| Group data by cluster. | Use the Sort rows filter with #sector+cluster as the sort key. |
| Remove columns that a recipient does not want to see. | Use the Cut columns filter to remove the columns by hashtag, e.g. #meta+id,#description+internal. |
| Add extra information to a dataset, like the total population of each district. | Use the Merge columns filter to pull the #population column from a second dataset, using (e.g.) #adm1+code as the shared key. |
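
To make the two-step "organisations per district" suggestion concrete, here is a minimal Python sketch of the same logic: deduplicate on (#adm1, #org), then count rows per #adm1. The sample rows and the function name `count_orgs_per_adm1` are hypothetical, and the code only mirrors the idea behind the filters rather than how the HXL Proxy implements them.

```python
from collections import Counter

# Hypothetical activity records keyed by HXL hashtag.
rows = [
    {"#adm1": "Coast", "#org": "Org A", "#sector": "WASH"},
    {"#adm1": "Coast", "#org": "Org A", "#sector": "Health"},
    {"#adm1": "Coast", "#org": "Org B", "#sector": "WASH"},
    {"#adm1": "Hills", "#org": "Org A", "#sector": "Education"},
]


def count_orgs_per_adm1(rows):
    """Count distinct organisations in each #adm1 area."""
    # Step 1 (deduplicate): keep each (#adm1, #org) combination once.
    unique_pairs = {(row["#adm1"], row["#org"]) for row in rows}
    # Step 2 (count): tally the remaining pairs by #adm1.
    return Counter(adm1 for adm1, _org in unique_pairs)


org_counts = count_orgs_per_adm1(rows)
```

Skipping the deduplication step would count activities rather than organisations: "Coast" would be tallied three times instead of twice, because Org A appears there in two sectors.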

Collaboration

| Goal | Suggestions |
| --- | --- |
| Combine input data from multiple reporting organisations. | Create a master template, then use the Append datasets filter to pull in each organisation's dataset from a location on the web (such as HDX or a Dropbox folder). |
| Give partner organisations feedback on their datasets. | Use the Validation page together with a custom HXL schema. |
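
As a rough illustration of combining datasets under a master template, the sketch below stacks rows from several sources and aligns their columns by hashtag. It assumes that columns are matched by exact hashtag and that columns missing from a source are left blank; the real Append datasets filter may behave differently. The function name `append_datasets` and the sample data are hypothetical.

```python
def append_datasets(template_tags, datasets):
    """Stack rows from several datasets under the template's hashtags.

    Each dataset is a (hashtags, data_rows) pair. Columns are matched
    by exact hashtag (an assumption made for this sketch).
    """
    combined = []
    for hashtags, data in datasets:
        # Map each hashtag in this dataset to its column position.
        position = {tag: i for i, tag in enumerate(hashtags)}
        for row in data:
            combined.append([
                row[position[tag]] if tag in position else ""
                for tag in template_tags
            ])
    return combined


# The master template defines the column order for the combined data.
template_tags = ["#org", "#sector", "#adm1"]

# Two hypothetical partner datasets; the second uses a different
# column order and has no #sector column.
org_a = (["#org", "#sector", "#adm1"], [["Org A", "WASH", "Coast"]])
org_b = (["#adm1", "#org"], [["Hills", "Org B"]])

combined = append_datasets(template_tags, [org_a, org_b])
```

Matching on hashtags rather than column position is what lets each organisation keep its own column order and header text while still feeding a single combined dataset.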

Data cleaning

TODO

Interoperability

TODO