---
title: Full workflow for importing WorldPop population data into DHIS2
short_title: Import Population data
---

This notebook walks through a complete workflow for importing [yearly population count data from WorldPop v2 (2015-2030)](https://hub.worldpop.org/geodata/listing?id=135) into DHIS2. Because the WorldPop population data is static, this is a one-time import and does not need to run continuously.

-------------
## Library imports

Start by importing the libraries that we need:

In [1]:
from datetime import date
import json

import geopandas as gpd
import xarray as xr
from earthkit import transforms

from dhis2_client import DHIS2Client
from dhis2_client.settings import ClientSettings

from dhis2eo.data.worldpop import pop_total
from dhis2eo.integrations.pandas import dataframe_to_dhis2_json

## Input parameters

Let's first define all the input parameters so they are clearly stated at the top of the notebook. 

For this example we will connect to a public DHIS2 instance, and set the `DHIS2_DATA_ELEMENT_ID` below to import directly into the existing "Total Population" Data Element. 

Note that we recommend setting `DHIS2_ORG_UNIT_LEVEL` to the most detailed level available (the highest level number) to get the most detailed population data possible; DHIS2 aggregates the values up to the coarser levels when needed.

Unlike other import workflows, WorldPop data cannot be requested by organisation unit bounding box. Instead, we have to specify the country we want data for, which we do with the `IMPORT_COUNTRY_CODE` parameter.

In [None]:
# DHIS2 connection
DHIS2_BASE_URL = "https://climate.im.dhis2.org/climate-tools-42"
DHIS2_USERNAME = "admin"
DHIS2_PASSWORD = "district"

# DHIS2 import settings
DHIS2_DATA_ELEMENT_ID = 'WUg3MYWQ7pt'
DHIS2_ORG_UNIT_LEVEL = 2
DHIS2_DRY_RUN = True                            # default to safe dry-run mode; set to False for actual import

# Population import configuration
IMPORT_VALUE_COL = "pop_total"                  # variable name in the downloaded xarray dataset
IMPORT_START_DATE = "2015"                      # first year of the dataset
IMPORT_END_DATE = "2030"                        # last year of the dataset
IMPORT_COUNTRY_CODE = "SLE"                     # 3-letter ISO country code

# Download settings
DOWNLOAD_FOLDER = "../../guides/data/local"
DOWNLOAD_PREFIX = "worldpop-pop"    # prefix for caching downloads; existing files are reused

# Aggregation settings
SPATIAL_AGGREGATION = "sum"
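
Because a mistyped parameter only surfaces later as a failed download or a rejected import, it can be worth checking the values up front. The following is a minimal sketch under a few assumptions: the `validate_import_params` helper is hypothetical and not part of `dhis2eo`, the country code is assumed to be written in upper case as in the cell above, and the 2015 to 2030 bounds are taken from the dataset description at the top of this notebook.

```python
def validate_import_params(country_code, start, end, first_year=2015, last_year=2030):
    """Raise ValueError if the import parameters look wrong (hypothetical helper)."""
    # Assumption: the country is given as a 3-letter upper-case ISO code, e.g. "SLE"
    if not (len(country_code) == 3 and country_code.isalpha() and country_code.isupper()):
        raise ValueError(f"Expected a 3-letter ISO country code, got {country_code!r}")

    # The start and end parameters are year strings such as "2015"
    start_year, end_year = int(start), int(end)
    if start_year > end_year:
        raise ValueError(f"Start year {start_year} is after end year {end_year}")
    if start_year < first_year or end_year > last_year:
        raise ValueError(f"Years must fall within {first_year}-{last_year}")


# Passes silently for the values used in this notebook
validate_import_params("SLE", "2015", "2030")
```

In the notebook this check would be called with `IMPORT_COUNTRY_CODE`, `IMPORT_START_DATE`, and `IMPORT_END_DATE` right after the parameter cell.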

## Connect to DHIS2

First, we connect the python-client to the DHIS2 instance we want to import into. You can point this to your own instance, but for the purposes of this example we will use one of the public access DHIS2 instances, since these are continuously reset:

In [6]:
# Client configuration
cfg = ClientSettings(
  base_url=DHIS2_BASE_URL,
  username=DHIS2_USERNAME,
  password=DHIS2_PASSWORD
)

client = DHIS2Client(settings=cfg)
info = client.get_system_info()

# Check if everything is working.
# You should see your current DHIS2 version info.
print("Current DHIS2 version:", info["version"])

Current DHIS2 version: 2.42.3.1


## Get the DHIS2 organisation units

In order to download and aggregate the data to our DHIS2 organisation units, we also use the python-client to get the requested organisation units from our DHIS2 instance:

In [17]:
# Get org units GeoJSON from DHIS2
org_units_geojson = client.get_org_units_geojson(level=DHIS2_ORG_UNIT_LEVEL)

# Convert GeoJSON to geopandas
org_units = gpd.read_file(json.dumps(org_units_geojson))
org_units

Skipping field groups: unsupported OGR type: 5


| | id | code | name | level | parent | parentGraph | geometry |
|---|---|---|---|---|---|---|---|
| 0 | O6uvpzGd5pu | OU_264 | Bo | 2 | ImspTQPwCqd | ImspTQPwCqd | POLYGON ((-11.5914 8.4875, -11.5906 8.4769, -1... |
| 1 | fdc6uOvgoji | OU_193190 | Bombali | 2 | ImspTQPwCqd | ImspTQPwCqd | POLYGON ((-11.8091 9.2032, -11.8102 9.1944, -1... |
| 2 | lc3eMKXaEfw | OU_197385 | Bonthe | 2 | ImspTQPwCqd | ImspTQPwCqd | MULTIPOLYGON (((-12.5568 7.3832, -12.5574 7.38... |
| 3 | jUb8gELQApl | OU_204856 | Kailahun | 2 | ImspTQPwCqd | ImspTQPwCqd | POLYGON ((-10.7972 7.5866, -10.8002 7.5878, -1... |
| 4 | PMa2VCrupOd | OU_211212 | Kambia | 2 | ImspTQPwCqd | ImspTQPwCqd | MULTIPOLYGON (((-13.1349 8.8471, -13.1343 8.84... |
| 5 | kJq2mPyFEHo | OU_222616 | Kenema | 2 | ImspTQPwCqd | ImspTQPwCqd | POLYGON ((-11.3596 8.5317, -11.3513 8.5234, -1... |
| 6 | qhqAxPSTUXp | OU_226213 | Koinadugu | 2 | ImspTQPwCqd | ImspTQPwCqd | POLYGON ((-10.585 9.0434, -10.5877 9.0432, -10... |
| 7 | Vth0fbpFcsO | OU_233310 | Kono | 2 | ImspTQPwCqd | ImspTQPwCqd | POLYGON ((-10.585 9.0434, -10.5848 9.0432, -10... |
| 8 | jmIPBj66vD6 | OU_246990 | Moyamba | 2 | ImspTQPwCqd | ImspTQPwCqd | MULTIPOLYGON (((-12.6351 7.6613, -12.6346 7.66... |
| 9 | TEQlaapDQoK | OU_254945 | Port Loko | 2 | ImspTQPwCqd | ImspTQPwCqd | MULTIPOLYGON (((-13.119 8.4718, -13.1174 8.470... |
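
An organisation unit without a geometry would have no grid cells to aggregate, so it can be useful to inspect the GeoJSON before converting it. The sketch below is not part of the python-client; `summarise_features` is a hypothetical helper, the sample FeatureCollection is made up, and we assume that each feature carries a top-level `id` as in the table above.

```python
from collections import Counter


def summarise_features(feature_collection):
    """Count geometry types and collect the ids of features without a geometry."""
    geometry_types = Counter()
    missing_ids = []
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry")
        if geometry is None:
            # Assumption: the org unit id is stored as the feature's top-level "id"
            missing_ids.append(feature.get("id"))
        else:
            geometry_types[geometry["type"]] += 1
    return geometry_types, missing_ids


# Tiny made-up FeatureCollection, used only for illustration
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "id": "unit_a",
            "properties": {"name": "Unit A"},
            "geometry": {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]},
        },
        {"type": "Feature", "id": "unit_b", "properties": {"name": "Unit B"}, "geometry": None},
    ],
}

geometry_types, missing_ids = summarise_features(sample)
print(dict(geometry_types), missing_ids)  # {'Polygon': 1} ['unit_b']
```

In the notebook, the same function could be applied to `org_units_geojson` before passing it to geopandas.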


## Download the necessary data

In the next step we download all the requested data to the local file system, using convenience functionality from the `dhis2eo.data.worldpop.pop_total` module. 

Note that after the initial data download, subsequent runs of this notebook will reuse the previously downloaded files to avoid repeated downloads of the same data.

For more details on this step, see our guide for [Downloading WorldPop population data](../../guides/getting-data/worldpop/worldpop-total-download.ipynb). 

In [15]:
files = pop_total.yearly.download(
    start=IMPORT_START_DATE, 
    end=IMPORT_END_DATE, 
    country_code=IMPORT_COUNTRY_CODE, 
    dirname=DOWNLOAD_FOLDER, 
    prefix=DOWNLOAD_PREFIX, 
)
files

INFO - 2026-01-17 22:20:34,066 - dhis2eo.data.worldpop.pop_total.yearly - Year 2015
INFO - 2026-01-17 22:20:34,070 - dhis2eo.data.worldpop.pop_total.yearly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\worldpop-pop_2015.nc
INFO - 2026-01-17 22:20:34,072 - dhis2eo.data.worldpop.pop_total.yearly - Year 2016
INFO - 2026-01-17 22:20:34,077 - dhis2eo.data.worldpop.pop_total.yearly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\worldpop-pop_2016.nc
INFO - 2026-01-17 22:20:34,080 - dhis2eo.data.worldpop.pop_total.yearly - Year 2017
INFO - 2026-01-17 22:20:34,084 - dhis2eo.data.worldpop.pop_total.yearly - File already downloaded: C:\Users\karimba\Documents\Github\climate-tools\docs\guides\data\local\worldpop-pop_2017.nc
INFO - 2026-01-17 22:20:34,088 - dhis2eo.data.worldpop.pop_total.yearly - Year 2018
INFO - 2026-01-17 22:20:34,094 - dhis2eo.data.worldpop.pop_total.yearly - File already down

[WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/worldpop-pop_2015.nc'),
 WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/worldpop-pop_2016.nc'),
 WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/worldpop-pop_2017.nc'),
 WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/worldpop-pop_2018.nc'),
 WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/worldpop-pop_2019.nc'),
 WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/worldpop-pop_2020.nc'),
 WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/worldpop-pop_2021.nc'),
 WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/worldpop-pop_2022.nc'),
 WindowsPath('C:/Users/karimba/Documents/Github/climate-tools/docs/guides/data/local/worldpop-pop_2023.nc'),
 WindowsPath('C:/Us
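
The "File already downloaded" messages above reflect a download-if-missing pattern. The following sketch illustrates the general idea only and is not the `dhis2eo` implementation: `fetch_if_missing` and `fake_fetch` are hypothetical names, and the file name simply mirrors the `<prefix>_<year>.nc` pattern visible in the log output.

```python
import tempfile
from pathlib import Path


def fetch_if_missing(path, fetch):
    """Call fetch(path) only when the file does not exist; return True if it was fetched."""
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    fetch(path)
    return True


def fake_fetch(path):
    # Stand-in for a real download; writes placeholder bytes instead of NetCDF data
    path.write_bytes(b"placeholder")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "worldpop-pop_2015.nc"
    first_run = fetch_if_missing(target, fake_fetch)   # file is created
    second_run = fetch_if_missing(target, fake_fetch)  # existing file is reused

print(first_run, second_run)  # True False
```

One consequence of this pattern is that a file left incomplete by an interrupted download would also be reused, so deleting the cached file is the simplest way to force a fresh download.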

## Open the downloaded data

Once the data has been downloaded, we can then pass the list of files to `xr.open_mfdataset()`. This allows us to open and work with the data as if it were a single xarray dataset: 

In [18]:
ds_yearly = xr.open_mfdataset(files)
ds_yearly



| | Array | Chunk |
|---|---|---|
| Bytes | 1.60 GiB | 102.47 MiB |
| Shape | (16, 3695, 3635) | (1, 3695, 3635) |
| Dask graph | 16 chunks in 33 graph layers | 16 chunks in 33 graph layers |
| Data type | float64 numpy.ndarray | float64 numpy.ndarray |
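
The sizes reported above follow directly from the array shape: each file contributes one chunk of 3695 × 3635 `float64` values, and there are 16 yearly files. As a quick check of that arithmetic (the variable names are ours, and we assume the first dimension is the yearly time axis):

```python
# Shape reported by xarray above; the first dimension is assumed to be time
n_years, n_rows, n_cols = 16, 3695, 3635
bytes_per_value = 8  # float64

chunk_bytes = n_rows * n_cols * bytes_per_value  # one yearly grid
total_bytes = chunk_bytes * n_years              # all 16 years

print(f"{chunk_bytes / 2**20:.2f} MiB per chunk")  # 102.47 MiB per chunk
print(f"{total_bytes / 2**30:.2f} GiB in total")   # 1.60 GiB in total
```

Because the dataset is opened lazily with one chunk per year, only the chunks needed for a given computation have to be held in memory at once.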


## Aggregate to organisation units

Since the data is already at the correct yearly level, we can proceed directly to aggregate the gridded data to the organisation units from your DHIS2 instance:

In [19]:
print("Aggregating to organisation units...")
ds_org_units = transforms.spatial.reduce(
    ds_yearly[IMPORT_VALUE_COL],
    org_units,
    mask_dim="id",
    how=SPATIAL_AGGREGATION,
)
ds_org_units

Aggregating to organisation units...
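
We use `sum` rather than `mean` here because each grid cell holds a population count, so the population of an organisation unit is the total of the cells that fall inside it. The toy example below illustrates that idea with plain Python lists and hand-written boolean masks. It is only a sketch of masked summation, not how `earthkit` builds its masks or treats cells on a boundary, and all names and numbers are made up.

```python
# A made-up 2 x 3 grid of population counts
grid = [
    [10.0, 20.0, 30.0],
    [40.0, 50.0, 60.0],
]

# Hand-written masks: True where a grid cell belongs to the organisation unit
masks = {
    "unit_a": [[True, True, False], [True, False, False]],
    "unit_b": [[False, False, True], [False, True, True]],
}


def masked_sum(values, mask):
    """Sum the grid values whose mask cell is True."""
    return sum(
        value
        for value_row, mask_row in zip(values, mask)
        for value, inside in zip(value_row, mask_row)
        if inside
    )


totals = {unit: masked_sum(grid, mask) for unit, mask in masks.items()}
print(totals)  # {'unit_a': 70.0, 'unit_b': 140.0}
```

Since the two masks cover every cell exactly once, the unit totals add up to the grid total of 210, which is the property that makes summed counts safe to aggregate further up the hierarchy.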


## Post-processing

After aggregating the population data to the desired organisation units, we convert the aggregated xarray result to a pandas DataFrame. This makes it easier to inspect the data and prepare it for subsequent post-processing:

In [20]:
dataframe = ds_org_units.to_dataframe().reset_index()
dataframe

| | id | time | pop_total |
|---|---|---|---|
| 0 | O6uvpzGd5pu | 2015-01-01 | 7.010201e+05 |
| 1 | O6uvpzGd5pu | 2016-01-01 | 7.218049e+05 |
| 2 | O6uvpzGd5pu | 2017-01-01 | 7.441263e+05 |
| 3 | O6uvpzGd5pu | 2018-01-01 | 7.666244e+05 |
| 4 | O6uvpzGd5pu | 2019-01-01 | 7.893836e+05 |
| ... | ... | ... | ... |
| 203 | at6UHUQatSo | 2026-01-01 | 1.325815e+06 |
| 204 | at6UHUQatSo | 2027-01-01 | 1.341199e+06 |
| 205 | at6UHUQatSo | 2028-01-01 | 1.356822e+06 |
| 206 | at6UHUQatSo | 2029-01-01 | 1.371576e+06 |


The only change needed before importing into DHIS2 is to reduce the `time` column to the year alone, so that DHIS2 interprets each value as a yearly period:

In [24]:
dataframe['time'] = dataframe["time"].dt.year.astype(str)

This should now display correctly:

In [25]:
dataframe

| | id | time | pop_total |
|---|---|---|---|
| 0 | O6uvpzGd5pu | 2015 | 7.010201e+05 |
| 1 | O6uvpzGd5pu | 2016 | 7.218049e+05 |
| 2 | O6uvpzGd5pu | 2017 | 7.441263e+05 |
| 3 | O6uvpzGd5pu | 2018 | 7.666244e+05 |
| 4 | O6uvpzGd5pu | 2019 | 7.893836e+05 |
| ... | ... | ... | ... |
| 203 | at6UHUQatSo | 2026 | 1.325815e+06 |
| 204 | at6UHUQatSo | 2027 | 1.341199e+06 |
| 205 | at6UHUQatSo | 2028 | 1.356822e+06 |
| 206 | at6UHUQatSo | 2029 | 1.371576e+06 |
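
DHIS2 infers the period type from the format of the period string, which is why a bare year such as `2015` is read as a yearly period. The hypothetical helper below shows the same conversion with the standard library for three common period types. It is a sketch for illustration and not part of `dhis2eo`.

```python
from datetime import date

# strftime patterns for three common DHIS2 period identifiers
PERIOD_FORMATS = {
    "yearly": "%Y",     # e.g. 2015
    "monthly": "%Y%m",  # e.g. 201501
    "daily": "%Y%m%d",  # e.g. 20150101
}


def to_dhis2_period(day, period_type="yearly"):
    """Format a date as a DHIS2 period identifier (hypothetical helper)."""
    return day.strftime(PERIOD_FORMATS[period_type])


print(to_dhis2_period(date(2015, 1, 1)))             # 2015
print(to_dhis2_period(date(2015, 1, 1), "monthly"))  # 201501
```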


## Create DHIS2 payload

At this point we have the final data that we want to import into DHIS2. To submit it, we first have to convert the data to the standardized JSON format that DHIS2 expects, which can be done with the help of the `dhis2eo` library:

In [26]:
print(f"Creating payload with {len(dataframe)} values...")
payload = dataframe_to_dhis2_json(
    df=dataframe,
    org_unit_col="id",
    period_col="time",
    value_col=IMPORT_VALUE_COL,
    data_element_id=DHIS2_DATA_ELEMENT_ID,
)
payload['dataValues'][:3]

Creating payload with 208 values...


[{'orgUnit': 'O6uvpzGd5pu',
  'period': '2015',
  'value': '701020.0837291926',
  'dataElement': 'WUg3MYWQ7pt'},
 {'orgUnit': 'O6uvpzGd5pu',
  'period': '2016',
  'value': '721804.8800076675',
  'dataElement': 'WUg3MYWQ7pt'},
 {'orgUnit': 'O6uvpzGd5pu',
  'period': '2017',
  'value': '744126.2547043348',
  'dataElement': 'WUg3MYWQ7pt'}]
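
To make the structure of the payload explicit, the sketch below builds the same kind of dictionary by hand from two of the rows shown above. This is only an illustration of the format: `rows_to_payload` is a hypothetical function and not how `dataframe_to_dhis2_json` is implemented.

```python
import json


def rows_to_payload(rows, data_element_id):
    """Build a data value set dictionary from row dictionaries (illustrative only)."""
    return {
        "dataValues": [
            {
                "orgUnit": row["id"],
                "period": row["time"],
                "value": str(row["pop_total"]),  # values are sent as strings
                "dataElement": data_element_id,
            }
            for row in rows
        ]
    }


# Two rows copied from the DataFrame shown earlier
rows = [
    {"id": "O6uvpzGd5pu", "time": "2015", "pop_total": 701020.0837291926},
    {"id": "O6uvpzGd5pu", "time": "2016", "pop_total": 721804.8800076675},
]

sketch_payload = rows_to_payload(rows, "WUg3MYWQ7pt")
print(json.dumps(sketch_payload, indent=2))
```

Each data value combines an organisation unit, a period, and a data element, so the 13 organisation units and 16 years in this notebook produce the 208 values reported above.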

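## Import to DHIS2

Finally, we send the payload to the DHIS2 `dataValueSets` endpoint. As long as `DHIS2_DRY_RUN` is `True`, DHIS2 only validates the payload and reports what would be imported, without saving any values:
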
In [22]:
print(f"Importing payload into DHIS2 (dryrun={DHIS2_DRY_RUN})...")
res = client.post("/api/dataValueSets", json=payload, params={"dryRun": str(DHIS2_DRY_RUN).lower()})
print(f'Result: {res["response"]["importCount"]}')

Importing payload into DHIS2 (dryrun=True)...
Result: {'imported': 208, 'updated': 0, 'ignored': 0, 'deleted': 0}
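
Before switching off dry-run mode, it is worth confirming that no values were ignored and that the counts match the size of the payload. A minimal sketch of such a check follows, using the import count printed above. The `check_import_count` helper is hypothetical and not part of the python-client.

```python
def check_import_count(import_count, expected):
    """Raise RuntimeError if values were ignored or the totals do not match (hypothetical helper)."""
    if import_count["ignored"] > 0:
        raise RuntimeError(f"{import_count['ignored']} values were ignored by DHIS2")

    # New values count as imported, values that already existed count as updated
    processed = import_count["imported"] + import_count["updated"]
    if processed != expected:
        raise RuntimeError(f"Expected {expected} values, but DHIS2 processed {processed}")


# Import count copied from the dry-run output above
import_count = {"imported": 208, "updated": 0, "ignored": 0, "deleted": 0}
check_import_count(import_count, expected=208)  # passes silently
```

In the notebook this would be called with `res["response"]["importCount"]` and `len(payload["dataValues"])`.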


We have now successfully completed a full workflow for downloading, aggregating, post-processing, and importing yearly WorldPop population data into DHIS2.