This repository has been archived by the owner on Apr 24, 2024. It is now read-only.

paging for /api/analytics/events/query fails with a KeyError #21

Open
davidhuser opened this issue Oct 22, 2020 · 6 comments
Labels
bug Something isn't working

Comments

@davidhuser
Owner

davidhuser commented Oct 22, 2020

dhis2.py 2.0.2 -- 2.1.2

from dhis2 import Api

PROGRAM_UID = 'IpHINAT79UW'

def get_data_no_paging(api, params):
  result = api.get(f'analytics/events/query/{PROGRAM_UID}', params=params).json()
  print(result['rows'][0])


def get_data_paging(api, params):
  for page in api.get_paged(f'analytics/events/query/{PROGRAM_UID}', params=params, page_size=100):
    print(page['rows'][0])


def main():
  api = Api('play.dhis2.org/demo', 'admin', 'district')
  
  # doesn't work with this type of params
  #params = [
  #  ('dimension', 'pe:LAST_12_MONTHS'),
  #  ('dimension', 'ou:ImspTQPwCqd'),
  #  ('dimension', 'A03MvHHogjR.a3kGcGDCuk6'),
  #  ('stage', 'A03MvHHogjR'),
  #  ('outputType', 'EVENT')
  #]

  params = {
    'dimension': [
      'pe:LAST_12_MONTHS', 
      'ou:ImspTQPwCqd', 
      'A03MvHHogjR.a3kGcGDCuk6'
      ],
    'stage': 'A03MvHHogjR',
    'outputType': 'EVENT'
  }

  get_data_no_paging(api, params)
  get_data_paging(api, params)

if __name__ == '__main__':
  main()

results in:

['q0Hcikut16c', 'A03MvHHogjR', '2019-11-12 00:00:00.0', '2020-11-12 01:00:00.0', '2020-11-12 01:00:00.0', '', '0.0', '0.0', 'Ngelehun CHC', 'OU_559', 'DiszpKrYNg8', '5.0']

Traceback (most recent call last):
  File "main.py", line 41, in <module>
    main()
  File "main.py", line 38, in main
    get_data_paging(api, params)
  File "main.py", line 11, in get_data_paging
    for page in api.get_paged(f'analytics/events/query/{PROGRAM_UID}', params=params, page_size=100):
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/dhis2/api.py", line 409, in page_generator
    page_count = page["pager"]["pageCount"]
KeyError: 'pager'
@davidhuser davidhuser added the bug Something isn't working label Oct 22, 2020
@pvanliefland

Hi @davidhuser, I'm encountering the same issue. I could try to fix it and open a PR. Any pointers?

Thanks!

@davidhuser
Owner Author

Hi @pvanliefland, thanks for reporting. For this endpoint the paging control properties are located in a different place: if you inspect an example JSON response, they are in data['metaData']['pager'] and not in data['pager'] as usual.

I'd be happy to review a PR! The idea would be to check whether the endpoint is analytics/events/query and switch the way paging is controlled inside the page_generator() function, e.g.

# sketch (prefix check, because the real endpoint also carries the program UID):
if endpoint.startswith('analytics/events/query'):
    page_count = page['metaData']['pager']['pageCount']

Accessing the rows with the paged data would then be delegated to the calling function (of get_paged) I think.
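A minimal sketch of that lookup, written as a standalone function so it can be tried without a server. `find_pager` is a hypothetical helper name, not part of dhis2.py, and the two sample dicts are trimmed stand-ins for real responses.

```python
def find_pager(page):
    """Return the pager dict of a response page, or None if there is none.

    Hypothetical helper for illustration; not part of dhis2.py.
    """
    # Most endpoints put the paging info at the top level.
    if "pager" in page:
        return page["pager"]
    # analytics/events/query nests it under metaData instead.
    return page.get("metaData", {}).get("pager")


# Trimmed stand-ins for the two response shapes.
regular = {"pager": {"page": 1, "pageCount": 3}, "dataElements": []}
analytics = {"metaData": {"pager": {"page": 2, "pageCount": 41}}, "rows": []}

print(find_pager(regular)["pageCount"])
print(find_pager(analytics)["pageCount"])
```

Returning None instead of raising lets the caller decide what a response without any pager means, rather than failing with the KeyError from the traceback above.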

@pvanliefland

Hey @davidhuser, I'm working on a fix for this. However, paging is not the only issue - the response format is also quite different. This could be a problem when using merge=True. I've attached a sample response below.

What are your thoughts on this? I'm leaning towards having collections as a list rather than a string, and iterating over the collections as well when appending data.

{
  "headers": [
    {
      "name": "pi",
      "column": "Enrollment",
      "valueType": "TEXT",
      "type": "java.lang.String",
      "hidden": false,
      "meta": true
    },
    {
      "name": "tei",
      "column": "Tracked entity instance",
      "valueType": "TEXT",
      "type": "java.lang.String",
      "hidden": false,
      "meta": true
    },
    {
      "name": "enrollmentdate",
      "column": "Enrollment date",
      "valueType": "DATE",
      "type": "java.util.Date",
      "hidden": false,
      "meta": true
    },
    {
      "name": "incidentdate",
      "column": "Incident date",
      "valueType": "DATE",
      "type": "java.util.Date",
      "hidden": false,
      "meta": true
    },
    {
      "name": "geometry",
      "column": "Geometry",
      "valueType": "TEXT",
      "type": "java.lang.String",
      "hidden": false,
      "meta": true
    },
    {
      "name": "longitude",
      "column": "Longitude",
      "valueType": "NUMBER",
      "type": "java.lang.Double",
      "hidden": false,
      "meta": true
    },
    {
      "name": "latitude",
      "column": "Latitude",
      "valueType": "NUMBER",
      "type": "java.lang.Double",
      "hidden": false,
      "meta": true
    },
    {
      "name": "ouname",
      "column": "Organisation unit name",
      "valueType": "TEXT",
      "type": "java.lang.String",
      "hidden": false,
      "meta": true
    },
    {
      "name": "oucode",
      "column": "Organisation unit code",
      "valueType": "TEXT",
      "type": "java.lang.String",
      "hidden": false,
      "meta": true
    },
    {
      "name": "ou",
      "column": "Organisation unit",
      "valueType": "TEXT",
      "type": "java.lang.String",
      "hidden": false,
      "meta": true
    },
    {
      "name": "de0FEHSIoxh",
      "column": "WHOMCH Chronic conditions",
      "valueType": "BOOLEAN",
      "type": "java.lang.Boolean",
      "hidden": false,
      "meta": true
    },
    {
      "name": "sWoqcoByYmD",
      "column": "WHOMCH Smoking",
      "valueType": "BOOLEAN",
      "type": "java.lang.Boolean",
      "hidden": false,
      "meta": true
    }
  ],
  "metaData": {
    "pager": {
      "page": 2,
      "total": 163,
      "pageSize": 4,
      "pageCount": 41
    },
    "items": {
      "ImspTQPwCqd": {
        "name": "Sierra Leone"
      },
      "PFDfvmGpsR3": {
        "name": "Care at birth"
      },
      "bbKtnxRZKEP": {
        "name": "Postpartum care visit"
      },
      "ou": {
        "name": "Organisation unit"
      },
      "PUZaKR0Jh2k": {
        "name": "Previous deliveries"
      },
      "edqlbukwRfQ": {
        "name": "Antenatal care visit"
      },
      "WZbXY0S00lP": {
        "name": "First antenatal care visit"
      },
      "sWoqcoByYmD": {
        "name": "WHOMCH Smoking"
      },
      "WSGAb5XwJ3Y": {
        "name": "WHO RMNCH Tracker"
      },
      "de0FEHSIoxh": {
        "name": "WHOMCH Chronic conditions"
      }
    },
    "dimensions": {
      "pe": [],
      "ou": [
        "ImspTQPwCqd"
      ],
      "sWoqcoByYmD": [],
      "de0FEHSIoxh": []
    }
  },
  "width": 12,
  "rows": [
    [
      "A0cP533hIQv",
      "to8G9jAprnx",
      "2019-02-02 12:05:00.0",
      "2019-02-02 12:05:00.0",
      "",
      "0.0",
      "0.0",
      "Tonkomba MCHP",
      "OU_193264",
      "xIMxph4NMP1",
      "0",
      "1"
    ],
    [
      "ZqiUn2uXmBi",
      "SJtv0WzoYki",
      "2019-02-02 12:05:00.0",
      "2019-02-02 12:05:00.0",
      "",
      "0.0",
      "0.0",
      "Mawoma MCHP",
      "OU_254973",
      "Srnpwq8jKbp",
      "0",
      "0"
    ],
    [
      "lE747mUAtbz",
      "PGzTv2A1xzn",
      "2019-02-02 12:05:00.0",
      "2019-02-02 12:05:00.0",
      "",
      "0.0",
      "0.0",
      "Kunsho CHP",
      "OU_193254",
      "tdhB1JXYBx2",
      "",
      "0"
    ],
    [
      "nmcqu9QF8ow",
      "pav3tGLjYuq",
      "2019-02-03 12:05:00.0",
      "2019-02-03 12:05:00.0",
      "",
      "0.0",
      "0.0",
      "Korbu MCHP",
      "OU_678893",
      "m73lWmo5BDG",
      "",
      "1"
    ]
  ],
  "height": 4
}
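To make the shape of that response concrete: each entry in rows is a plain list whose positions line up with the entries in headers. A small sketch of pairing them up, using a trimmed copy of the sample above (three of the twelve columns, two of the four rows). `rows_as_dicts` is a hypothetical name used only for this illustration.

```python
def rows_as_dicts(response):
    """Pair every row with the header names, position by position.

    Hypothetical helper for illustration; not part of dhis2.py.
    """
    names = [header["name"] for header in response["headers"]]
    return [dict(zip(names, row)) for row in response["rows"]]


# Trimmed copy of the sample response (3 columns, 2 rows).
sample = {
    "headers": [
        {"name": "ouname", "column": "Organisation unit name"},
        {"name": "oucode", "column": "Organisation unit code"},
        {"name": "sWoqcoByYmD", "column": "WHOMCH Smoking"},
    ],
    "rows": [
        ["Tonkomba MCHP", "OU_193264", "1"],
        ["Mawoma MCHP", "OU_254973", "0"],
    ],
}

for record in rows_as_dicts(sample):
    print(record)
```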

@davidhuser
Owner Author

I see, @pvanliefland. Yes, since the response here is a list of lists (in rows), the merging needs to be handled while still preserving the rest of the response (metaData and headers). I don't understand your suggestion yet: are you referring to the use of collection as in this line?
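One possible shape for such a merge, as a sketch only: take headers and metaData from the first page, concatenate rows from all pages, and recompute height. `merge_analytics_pages` is a hypothetical name, and dropping the per-page pager from the merged result is an assumption of this sketch, not something decided in this thread.

```python
import copy


def merge_analytics_pages(pages):
    """Merge analytics pages into a single response-shaped dict.

    Hypothetical helper for illustration; not part of dhis2.py.
    """
    merged = None
    for page in pages:
        if merged is None:
            # Copy the first page so the caller's data is left untouched.
            merged = copy.deepcopy(page)
            # Assumption: a per-page pager is meaningless once pages are merged.
            merged.get("metaData", {}).pop("pager", None)
        else:
            merged["rows"].extend(page["rows"])
    if merged is None:
        return {}
    # height mirrors the number of rows in the sample response.
    merged["height"] = len(merged["rows"])
    return merged


page_1 = {
    "headers": [{"name": "ouname"}],
    "metaData": {"pager": {"page": 1, "pageCount": 2}, "items": {}},
    "rows": [["Tonkomba MCHP"], ["Mawoma MCHP"]],
    "height": 2,
}
page_2 = {
    "headers": [{"name": "ouname"}],
    "metaData": {"pager": {"page": 2, "pageCount": 2}, "items": {}},
    "rows": [["Kunsho CHP"]],
    "height": 1,
}

result = merge_analytics_pages([page_1, page_2])
print(result["height"], result["rows"])
```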

Given the already rather high complexity of the get_paged function, it might help maintainability to refactor the handling of the different response types into new "private" methods instead of trying to handle everything inside this function. This could later be extended to handle other types of responses as well. get_paged would then just be the entry point that dispatches to whichever handler the response type requires.

It is important to maintain the public API of that method, which should detect the kind of response (as mentioned, something like if endpoint.startswith('analytics/events/query'): do_this()).

Sorry, these are general thoughts rather than fully practical advice, but I hope they make it a bit easier to implement.
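A rough sketch of that entry-point idea, under some loud assumptions: `fetch` is a hypothetical stand-in for the HTTP call (the real method lives on the Api class and takes no such argument), the private helper names are invented, and params is taken to be a dict with page/pageSize added to it.

```python
def _default_pager(page):
    # Usual location of the paging info.
    return page["pager"]


def _analytics_pager(page):
    # analytics/events/query nests the paging info under metaData.
    return page["metaData"]["pager"]


def _iter_pages(fetch, endpoint, query, pager_of):
    """Yield pages until the page count reported by the first page is reached."""
    page = fetch(endpoint, dict(query))
    page_count = pager_of(page)["pageCount"]
    yield page
    while query["page"] < page_count:
        query["page"] += 1
        yield fetch(endpoint, dict(query))


def get_paged(fetch, endpoint, params=None, page_size=50):
    """Entry point: pick the pager lookup that matches the endpoint."""
    if endpoint.startswith("analytics/events/query"):
        pager_of = _analytics_pager
    else:
        pager_of = _default_pager
    query = dict(params or {}, pageSize=page_size, page=1)
    return _iter_pages(fetch, endpoint, query, pager_of)


def fake_fetch(endpoint, params):
    """Fake server with three pages, in the analytics response shape."""
    page = params["page"]
    return {
        "metaData": {"pager": {"page": page, "pageCount": 3}},
        "rows": [[f"row-{page}"]],
    }


for page in get_paged(fake_fetch, "analytics/events/query/IpHINAT79UW", page_size=100):
    print(page["rows"][0])
```

Callers keep using a single function and still get a generator of pages back; only the choice of pager lookup differs per endpoint.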

@pvanliefland

pvanliefland commented May 8, 2021

@davidhuser your suggestions make sense; I'm giving it a try right now. Forget about my remark regarding collection - it was unclear. I just need to take into account that for analytics, the key in the response data will always be rows, not dataElements or indicators.

pvanliefland added a commit to pvanliefland/dhis2.py that referenced this issue May 9, 2021
@pvanliefland

Hey @davidhuser! I've started a draft PR for this. I intend to test it in a real use case in the coming days.

What do you think about the proposed implementation, with paging classes?

In any case, I think that there are still a few things to clarify - I'm not sure that merging the rows is the only thing to do (thinking about metadata among other things).


2 participants