Skip to content

Count rows filter

David Megginson edited this page Nov 16, 2018 · 9 revisions
Count rows filter form.

The Count rows filter aggregates data using one or more columns, counting the number of unique values across the columns, and optionally providing a sum, average, minimum, and maximum value for a related numeric column.

This filter performs a similar function to a pivot table in a spreadsheet. It is one of the most commonly-used filters in the HXL Proxy.

Options

Count unique combinations of these hashtags: a comma-separated list of Tag patterns to count (leading "#" is optional). For example, "org+name" will count the number of times each organisation name appears, while "org,sector" will count the unique combinations of organisations and sectors/clusters. If you leave this field empty, the filter will calculate totals for the entire dataset.

Count only rows that match this query: here you may enter a Row query to restrict the rows counted. For example, if you want to count rows only in the WASH sector, you can use "sector+name=wash".

Options for each aggregator

Each aggregator produces one column of output. You may add up to 20 aggregators to the filter.

Aggregate type: (required) You may choose one of the five aggregate types:

  1. Count counts the number of unique combinations of the hashtag patterns specified under "Name and hashtag for count column". This is useful for operations like counting the number of activities for an organisation, sector, or district.
  2. Sum computes the total of the numeric values for a specific column, for each unique hashtag combination. You could use this aggregate, for example, to total the number of people in need for each district. It will ignore any non-numeric values.
  3. Average computes the mean value for a specific column, for each unique hashtag combination. You could use this aggregate, for example, to calculate the average number of beneficiaries per activity per district. It will ignore any non-numeric values.
  4. Minimum computes the smallest value for a specific column, for each unique hashtag combination. It works with numbers, dates, or strings (case-insensitive).
  5. Maximum computes the largest value for a specific column, for each unique hashtag combination. It works with numbers, dates, or strings (case-insensitive).
  6. Concatenate joins all unique values for a specific column, separated by "|". It is case-sensitive, but normalises whitespace.

Source column: (required for all aggregates except Count) A tag pattern for a column containing numeric values, e.g. "#affected".

Output column header: The text header to include at the top of the column containing the aggregate, e.g. "Total affected".

Output column hashtag: (required) The HXL hashtag and attributes to include at the top of the column containing the aggregate, e.g. "#affected+total".

Typical use cases

You can use this filter to create counts for mapping or graphing (e.g. counting the number of activities in a 3W report).

You can use this filter to anonymise data by aggregating it (e.g. starting with a survey of individuals, and aggregating by district or prefecture).

You can use this filter for quality control (e.g. counting the number of times each organisation name appears, and looking for obvious misspellings).

Examples

Input data

(Usually the input dataset would be much larger.)

#org #sector #adm1+name #adm1+code #reached
UNICEF Education Coast X001 5000
Save the Children Education Plains X002 300
IOM CCCM Coast X001 1500
UNICEF Protection Plains X002 8000

Aggregated data

Counting activities by organisation

Unique hashtags: #org
Aggregator: Count
Output column hashtag: #output+activities

#org #output+activities
IOM 1
Save the Children 1
UNICEF 2

Counting activities by organisation and sector

Unique hashtags: #org,#sector
Aggregator: Count
Output column hashtag: #output+activities

#org #sector #output+activities
IOM CCCM 1
Save the Children Education 1
UNICEF Education 1
UNICEF Protection 1

Calculating people reached by district

Unique hashtags: #adm1+name
Aggregator: Sum
Source column: #reached
Output column hashtag: #reached+total

#adm1+name #reached+total
Coast 6500
Plains 8300
Clone this wiki locally