
As user, I would like to optimise the data collection process #27

Open
6 tasks
Edouard-Legoupil opened this issue Sep 27, 2023 · 1 comment
Labels
enhancement New feature or request

Comments


Edouard-Legoupil commented Sep 27, 2023

The goal of this ticket is to develop the back-office function and logic behind the data collection step.

The module script is here: https://github.com/unhcr-americas/surveyDesigner/blob/main/R/mod_collection.R

From the previous stage, we have used filters based on indicator selection and language (the label language used for the country) to subset a list of questions and their potential answers (for select_one or select_multiple questions).
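As an illustration of that subsetting step (a Python sketch, not the actual R module; the field names `indicator` and `label::en` are hypothetical stand-ins for the XLSForm-style columns):

```python
# Hypothetical question rows, mimicking an XLSForm survey sheet with an
# added indicator-mapping column. Field names are illustrative assumptions.
questions = [
    {"name": "age", "type": "integer", "indicator": "demographics", "label::en": "Age?"},
    {"name": "income", "type": "integer", "indicator": "economy", "label::en": "Monthly income?"},
    {"name": "hh_size", "type": "integer", "indicator": "demographics", "label::en": "Household size?"},
]

def subset_questions(questions, selected_indicators, language="en"):
    """Keep only questions mapped to the selected indicators,
    carrying the label for the requested language."""
    label_col = f"label::{language}"
    return [
        {"name": q["name"], "type": q["type"], "label": q[label_col]}
        for q in questions
        if q["indicator"] in selected_indicators
    ]

print(subset_questions(questions, {"demographics"}))
```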

In the collection stage, we need to assess whether the full questionnaire should be split into different parts, aka data collection waves, using:

  • interview duration (each question within the form can be assessed with the interview-duration function),

  • question groups (aka a module of questions grouped between 'begin_group' and 'end_group'),

  • indicator requirements (aka, based on the mapping, multiple questions potentially spread over several modules, together with the linked questions required for indicator disaggregation),

  • data collection mode, as it impacts the sequence of the questions (in CAPI, sensitive questions are typically kept towards the end, while the opposite holds for CATI), and

  • an estimation of the response rate based on average interview duration. The longer a survey is, the higher the risk of dropout; the impact of designing a long survey can therefore be estimated through the cost of reaching people whose information will not be recorded (essentially, the total cost per interview becomes a function of the response rate). See this publication: Optimizing Data Collection Interventions to Balance Cost and Quality in a Sequential Multimode Survey.
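The splitting logic above could be sketched roughly as follows (Python, purely illustrative: a greedy pass that keeps begin_group/end_group blocks whole and closes a wave once the estimated duration would exceed a budget; per-group durations are assumed inputs from the interview-duration function):

```python
def split_into_waves(groups, max_minutes):
    """groups: list of (group_name, estimated_minutes) tuples.
    Returns a list of waves, each a list of group names.
    Groups are never split across waves."""
    waves, current, current_minutes = [], [], 0.0
    for name, minutes in groups:
        # Close the current wave if adding this group would blow the budget.
        if current and current_minutes + minutes > max_minutes:
            waves.append(current)
            current, current_minutes = [], 0.0
        current.append(name)
        current_minutes += minutes
    if current:
        waves.append(current)
    return waves

# Hypothetical modules with estimated durations in minutes.
groups = [("consent", 2), ("demographics", 8), ("livelihoods", 12),
          ("protection", 10), ("wash", 6)]
print(split_into_waves(groups, max_minutes=20))
# → [['consent', 'demographics'], ['livelihoods'], ['protection', 'wash']]
```

A real implementation would additionally need to respect the indicator-requirement constraint (questions feeding one indicator kept in the same wave), which turns this into a constrained partitioning problem rather than simple greedy packing.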

We shall also estimate an operational budget, based on costing inputs (aka enumeration capacity and total cost per interview) and various respondent sample size thresholds (500, 1,000, 5,000).

This should be done by assessing the current decision inputs and simulating the results of alternative choices.
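A minimal sketch of such a budget simulation (Python, illustrative only; the linear attrition model below is a placeholder assumption, not the model from the publication cited above):

```python
def estimate_budget(target_sample, cost_per_interview, duration_minutes,
                    base_response_rate=0.9, dropout_per_minute=0.01):
    """Effective cost of obtaining `target_sample` completed interviews when
    longer interviews raise dropout: attempts = target / response_rate.
    The attrition parameters are illustrative placeholders."""
    response_rate = max(0.1, base_response_rate - dropout_per_minute * duration_minutes)
    attempts_needed = target_sample / response_rate
    return round(attempts_needed * cost_per_interview, 2)

# Sample size thresholds from the ticket, with assumed unit cost and duration.
for n in (500, 1000, 5000):
    print(n, estimate_budget(n, cost_per_interview=5.0, duration_minutes=30))
```

Simulating alternatives then amounts to re-running this with different durations (i.e. different wave splits) and comparing the projected budgets.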

Client - Validation

  • Specification of the input data required to smartly split a too-lengthy questionnaire
  • Output one or several split questionnaires (one per wave and data collection mode)
  • Output a summary of what has been done, and of what else could be done and with what advantage


Dev - Tech

  • Might need to rework the current input data in order to build the use case
  • The output should suggest adjustment parameters (e.g. increasing the number of data collection waves) together with recommendations and projected budgets per scenario
  • Technical validation (tests, checks, etc.)

Edouard-Legoupil commented Sep 27, 2023

Based on the discussion this AM, I have revised the logic in the interface.

The need is now scoped across two distinct functions:

  1. A function to optimize the generation of the n surveys (aka waves) based on a list of questions

  2. A simplex function to optimize the trade-off between the number of data collection waves and cost, based on:

    • the indicators that are required,
    • the data collection mode,
    • attrition (attempts, and the correlation between duration and drop-off), and
    • capacity (total cost per interview, number of enumerators).
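The second function could look roughly like this (Python sketch; a real implementation would likely use a linear programming solver such as lpSolve in R for the simplex step, while here a brute-force scan over candidate wave counts stands in for it; all cost and attrition parameters are illustrative assumptions):

```python
def total_cost(n_waves, total_minutes, target_sample, cost_per_interview,
               fixed_cost_per_wave=2000.0, base_rate=0.9, dropout_per_minute=0.01):
    """More waves => shorter interviews => lower dropout, but each wave adds a
    fixed re-contact/setup cost. All parameters are illustrative placeholders."""
    minutes_per_wave = total_minutes / n_waves
    rate = max(0.1, base_rate - dropout_per_minute * minutes_per_wave)
    attempts = n_waves * target_sample / rate  # each wave re-contacts the sample
    return attempts * cost_per_interview + n_waves * fixed_cost_per_wave

def best_wave_count(max_waves, **kwargs):
    """Pick the wave count with the lowest projected total cost."""
    costs = {n: total_cost(n, **kwargs) for n in range(1, max_waves + 1)}
    return min(costs, key=costs.get)

print(best_wave_count(4, total_minutes=80, target_sample=1000, cost_per_interview=5.0))
# → 2  (splitting an 80-minute form once beats one long survey or three short ones)
```

Reporting the cost per candidate wave count, rather than only the optimum, would directly feed the "recommendations with projected budgets per scenario" output requested above.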
