David Megginson edited this page Jun 12, 2020 · 4 revisions

After you have installed the libhxl package, you can load and parse a HXL dataset like this:

import hxl
source = hxl.data('http://example.org/hxl-data.csv')

The hxl.data function can work with CSV, XLS, XLSX, or JSON files, and knows how to extract CSV from a top-level Google Sheet URL or download data from a top-level HDX dataset.

To load from a local file rather than a URL, pass allow_local=True as a second argument to hxl.data. This is a safeguard: it prevents a web application from accidentally letting the library read sensitive local files.

import hxl
source = hxl.data('local-data.csv', allow_local=True)

(For additional loading options, see Data loading.)

You can iterate over the dataset like this:

import hxl
for row in hxl.data('http://example.org/hxl-data.csv'):
    print("The organisation is {}\n".format(row.get('#org')))
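For readers new to the format, the sketch below uses only the Python standard library (not libhxl) to show roughly what the loop above is working with: a HXL CSV carries a row of hashtags between the human-readable headers and the data, and a lookup such as row.get('#org') picks the value from the column tagged #org. The sample data and the helper name get_by_tag are made up for illustration; libhxl's own parser handles far more cases than this.

```python
import csv
import io

# Made-up sample: header row, hashtag row, then data rows.
SAMPLE = """Organisation,Sector,Province
#org,#sector,#adm1
UNICEF,Education,Coast
WFP,Nutrition,Coast
UNICEF,WASH,Plains
"""

def get_by_tag(tags, row, tag):
    """Return the value in the first column tagged with `tag`, or None.

    Hypothetical helper for illustration only; it is not part of libhxl.
    """
    for column_tag, value in zip(tags, row):
        if column_tag == tag:
            return value
    return None

reader = csv.reader(io.StringIO(SAMPLE))
headers = next(reader)  # human-readable headers
tags = next(reader)     # HXL hashtag row
rows = list(reader)     # data rows

orgs = [get_by_tag(tags, row, '#org') for row in rows]
for org in orgs:
    print("The organisation is {}".format(org))
```

With libhxl you never need to do this by hand; the sketch is only meant to show why rows can be addressed by hashtag rather than by column position.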

You can write HXL data back out as CSV:

import hxl
source = hxl.data('http://example.org/hxl-data.csv')
with open('new-hxl-data.csv', 'w') as output:
    hxl.write_hxl(output, source)

The most interesting thing to do with HXL data, however, is to transform it. This example produces a new HXL dataset including only the rows from the original where #org is "UNICEF" (see Queries):

import hxl
url = 'http://example.org/hxl-data.csv'
source = hxl.data(url).with_rows('org=UNICEF')

This example produces a new dataset counting the number of rows for each unique combination of #sector and #adm1:

import hxl
url = 'http://example.org/hxl-data.csv'
source = hxl.data(url).count(['sector', 'adm1'])
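To make "counting the number of rows for each unique combination" concrete, here is a plain-Python sketch of the same idea using collections.Counter. It does not use libhxl; the sample rows and the helper name count_rows are assumptions for illustration, and the real filter returns a HXL dataset rather than a Counter.

```python
from collections import Counter

# Made-up rows, already keyed by hashtag for brevity.
rows = [
    {'#org': 'UNICEF', '#sector': 'Education', '#adm1': 'Coast'},
    {'#org': 'WFP', '#sector': 'Education', '#adm1': 'Coast'},
    {'#org': 'UNICEF', '#sector': 'WASH', '#adm1': 'Plains'},
    {'#org': 'WFP', '#sector': 'WASH', '#adm1': 'Plains'},
    {'#org': 'UNICEF', '#sector': 'WASH', '#adm1': 'Coast'},
]

def count_rows(rows, tags):
    """Count rows for each unique combination of values under `tags`.

    Hypothetical helper for illustration only; it is not part of libhxl.
    """
    return Counter(tuple(row[tag] for tag in tags) for row in rows)

counts = count_rows(rows, ['#sector', '#adm1'])
for (sector, adm1), n in sorted(counts.items()):
    print(sector, adm1, n)
```

Each distinct (#sector, #adm1) pair becomes one output row with its count, so five input rows collapse to three here.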

You can string multiple filters together to create complex transformations:

import hxl
url = 'http://example.org/hxl-data.csv'
source = hxl.data(url).with_rows('org=UNICEF').count('adm1').sort()
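The chain above reads left to right: keep matching rows, count them by #adm1, then sort the result. The plain-Python sketch below mirrors those three steps without libhxl. The helper names keep_rows, count_by, and sort_rows are hypothetical, the sample data is made up, and the sketch assumes exact string matching and a left-to-right sort; libhxl's own matching and sorting rules may differ.

```python
from collections import Counter

# Made-up rows, keyed by hashtag.
rows = [
    {'#org': 'UNICEF', '#adm1': 'Plains'},
    {'#org': 'WFP', '#adm1': 'Coast'},
    {'#org': 'UNICEF', '#adm1': 'Coast'},
    {'#org': 'UNICEF', '#adm1': 'Coast'},
]

def keep_rows(rows, tag, value):
    """Step 1: keep only rows whose `tag` column equals `value` exactly."""
    return [row for row in rows if row[tag] == value]

def count_by(rows, tag):
    """Step 2: produce (value, count) pairs for each unique value of `tag`."""
    return list(Counter(row[tag] for row in rows).items())

def sort_rows(pairs):
    """Step 3: sort the resulting rows by their values, left to right."""
    return sorted(pairs)

# Each step consumes the output of the previous one, like the chained filters.
result = sort_rows(count_by(keep_rows(rows, '#org', 'UNICEF'), '#adm1'))
for adm1, n in result:
    print(adm1, n)
```

Because each step takes a dataset and returns a dataset, the order matters: filtering before counting means only the UNICEF rows contribute to the totals.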

Dive into the API docs for more details, or take a look at the list of filters available.