-
Notifications
You must be signed in to change notification settings - Fork 4
Use cases
David Megginson edited this page Nov 3, 2016
·
15 revisions
This page introduces sample use cases for different filters and other features of the HXL Proxy. Coders may choose to use JSON recipes instead of the interactive forms.
| Goal | Suggestions |
|---|---|
| Convert individual records to aggregated totals. | Use the Count rows filter to aggregate by #adm1 (etc.). |
| Remove personal phone numbers from a dataset. | Use the Cut columns filter to remove any columns tagged #contact+phone |
| Redact small numbers of affected people that could identify individuals. | Use the Replace data filter with a regular expression to replace small numbers in columns tagged #affected (optionally after normalising numbers using the Clean data filter first). |
| Remove data from sensitive sectors (e.g. gender-based violence) before sharing publicly. | Use the Select rows filter to remove rows based on the value of columns tagged #sector. |
| Remove records from an organisation that would prefer not to have its data shared. | Use the Select rows filter to remove rows based on the value of columns tagged #org. |
| Goal | Suggestions |
|---|---|
| Extract all activities for a specific organisation. | Use the Select rows filter to select rows based on values in columns tagged #org. |
| Find the total number of organisations active in each district. | Use the Deduplicate rows filter to select all unique combinations of #adm1 and #org, then use the Count rows filter to count the number of rows for each #adm1. |
| Sum the total number of people in need in each country. | Use the Count rows filter with #country as the count tag, and #inneed as the aggregation tag. |
| Show only entries from 2014 or later. | Use the Select rows filter with #date>=2014 as the [[row query |
| Group data by cluster. | Use the Sort rows filter with #sector+cluster as the sort key. |
| Remove columns that a recipient does not want to see. | Use the Cut columns filter to remove the columns by hashtag, e.g. #meta+id,#description+internal |
| Goal | Suggestions |
|---|---|
| Combine input data from multiple reporting organisations. | Create a master template, then use the Append datasets filter to pull in each organisation's dataset from a location on the web (such as HDX or a Dropbox folder). |
| Give partner organisations feedback on their datasets. | Use the Validation page together with a custom [[HXL schema |
| Publish a link that always has your latest data. | Use the Save page to create a short link. You can edit the [[recipe |
| Goal | Suggestions |
|---|---|
| Remove duplicate rows | Use the Deduplicate rows filter to keep only the first copy of each row. If you have a combination of columns that form a unique identifier (e.g. #meta+id or #org+code), use those as tags to look at. |
| Normalise values for sorting and charting | Use the Clean data filter to standardise the representation of dates and numbers, and to remove extra whitespace. |
| Find outlier values (potential typos) | Use the Count rows filter to display the number of times each value appears in a column, then optionally use the Sort rows filter to bring the rarest occurrences to the top. |
| Fix an error or variant that keeps appearing in input data | Start a [[replacement map |
| Fix spelling variations in district names | Use the Merge columns filter to pull standardised spellings from another dataset, with #adm1+code as the merge key, and #adm1+name as the column to copy. Select the "Use existing columns" and "Overwrite existing values" options to replace the current spellings in your source dataset. |
| Goal | Suggestions |
|---|---|
| Compare data from different countries, where the country is not specified | For each input dataset, use the Add column filter to add #country+name and #country+code as fixed values to every row before [[merging |
| Pull population data for districts from a different dataset | Ensure that the sources have a shared key, such as a P-code in the #adm1+code column, then use the Merge columns filter to pull data from other datasets into your source dataset. |
| Check that the organisation identifiers in a dataset are correct | Use a [[HXL schema |
Learn more about the HXL standard at http://hxlstandard.org