
Background

iabaako edited this page Jun 21, 2022 · 3 revisions

The Data Management System is a package of Stata commands for performing high-frequency checks (HFCs). An HFC is a check of some element of the data collection process, run on a regular basis as new data comes in. At IPA/J-PAL, high-frequency checks are typically implemented in Stata, after the data flow is complete. High-frequency checks can provide information about any of the following elements of data collection:

  • The quality of the data
  • Enumerator performance
  • Errors in the electronic survey program
  • Other systematic flaws in the data flow

Because they reveal so much about the quality of data collection, high-frequency checks are one of the major benefits of computer-assisted interviewing (CAI). It’s hard to overstate how important these checks are. High-frequency checks differ from CAI logic checks, which are programmed into the CAI survey program rather than in Stata. Ideally, high-frequency checks complement the checks built into the survey program and cover what cannot be implemented effectively in a CAI program. For instance, CAI logic checks restrict a single field or the relationship between fields within one survey, whereas high-frequency checks often examine trends across surveys.
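
To illustrate a check on a trend across surveys, the sketch below uses only built-in Stata commands on toy data. The variable names (`day_str`, `duration`) and the 60% threshold are assumptions made for illustration; they are not part of the package.

```stata
* Toy data: one row per submitted survey (hypothetical names and values)
clear
input str10 day_str int duration
"2022-06-01" 40
"2022-06-01" 44
"2022-06-02" 38
"2022-06-02" 41
"2022-06-03" 15
"2022-06-03" 18
end

* Convert the submission day to a Stata date
generate subdate = date(day_str, "YMD")
format subdate %td

* One row per day: average interview duration and number of surveys
collapse (mean) avg_duration = duration (count) n_surveys = duration, by(subdate)

* Flag days whose average is below 60% of the mean daily average
* (an arbitrary threshold chosen for this example)
quietly summarize avg_duration
generate byte flag_short = avg_duration < 0.6 * r(mean)
list
```

A CAI constraint could reject an implausible value in a single interview, but only a comparison across days like this one shows that interviews on the last day were unusually short.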

Daily logic checks

  1. Check the survey form version
  2. Check that there are no duplicate observations
  3. Check that there are no duplicates in other variables expected to be unique, e.g., GPS coordinates or phone numbers
  4. Check that certain critical variables have no missing values
  5. Check that no variable has all missing values
  6. Check for "specify other" values in the survey
  7. Check for outliers in numeric variables
  8. Check for field comments
  9. Track survey progress
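
A few of the checks above (duplicates, missing values, outliers) can be sketched with built-in Stata commands. This is a minimal illustration on toy data, not the package's own implementation. The variable names (`hhid`, `phone`, `age`, `income`) and the 1.5 x IQR outlier rule are assumptions.

```stata
* Toy data standing in for a day's submissions (hypothetical names and values)
clear
input int hhid str10 phone int age float income
101 "0201110001" 34 250
102 "0201110002" 41 300
103 "0201110003" 29 275
103 "0201110004" 52 310
104 "0201110002" 38 .
105 "0201110005" 45 9000
end

* Checks 2 and 3: duplicates in the ID and in another field expected to be unique
duplicates tag hhid, generate(dup_id)
duplicates tag phone, generate(dup_phone)
list hhid phone if dup_id > 0 | dup_phone > 0

* Check 4: missing values in a critical variable
quietly count if missing(income)
scalar n_missing_income = r(N)
display "Surveys with missing income: " n_missing_income

* Check 5: variables with all values missing
foreach var of varlist _all {
    quietly count if !missing(`var')
    if r(N) == 0 {
        display "`var' is missing for every observation"
    }
}

* Check 7: outliers, flagged here with a 1.5 x IQR rule (one of many options)
quietly summarize income, detail
scalar fence_lo = r(p25) - 1.5 * (r(p75) - r(p25))
scalar fence_hi = r(p75) + 1.5 * (r(p75) - r(p25))
generate byte outlier = (income < fence_lo | income > fence_hi) & !missing(income)
list hhid income if outlier
```

The `!missing(income)` condition matters because Stata treats missing numeric values as larger than any number, so they would otherwise be flagged as high outliers.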

Enumerator checks (dashboard)

  1. Check the percentage of “don’t know” and “refuse to answer”
  2. Check the "yes" percentage for filter questions
  3. Check for enumerator productivity
  4. Check average interview duration
  5. Check active hours
  6. Check statistics for numeric variables
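
As a rough sketch of how such enumerator-level statistics might be tabulated, the example below collapses toy survey data to one row per enumerator using built-in Stata commands. The variable names and the codes -999 ("don't know") and -888 ("refuse to answer") are assumptions for illustration. Real surveys define their own codes.

```stata
* Toy data: one row per survey (hypothetical names, values, and codes)
clear
input byte enum_id int duration byte has_job int hours_worked
1 35 1 40
1 42 0 .
1 38 1 -999
2 12 0 .
2 15 0 .
2 11 1 -888
end

* Indicator for a "don't know" (-999) or "refuse to answer" (-888) response
generate byte dk_ref = inlist(hours_worked, -999, -888)

* One row per enumerator: surveys completed, average duration,
* share answering "yes" to the filter question, share of don't know/refuse
collapse (count) n_surveys = duration ///
         (mean) avg_duration = duration pct_yes = has_job pct_dk_ref = dk_ref, ///
         by(enum_id)

* Express the shares as percentages
replace pct_yes = 100 * pct_yes
replace pct_dk_ref = 100 * pct_dk_ref
list
```

In this toy data, enumerator 2 records much shorter interviews and fewer "yes" answers on the filter question. A pattern like this would merit follow-up, since a "no" on a filter question skips the follow-up questions and shortens the interview.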

Survey checks (dashboard)

  1. Check survey consent rate
  2. Check the percentage of survey values missing
  3. Check the percentage of "don't know" and "refuse to answer" responses
  4. Check the number and percentage of "specify other" values
  5. Check the number of variables with all values missing and with at least one value missing
  6. Check survey productivity
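
Some of these survey-level statistics can be approximated with built-in Stata commands. The sketch below assumes toy data with hypothetical variable names (`consent`, `age`, `income`, `unused`) and is not the package's own dashboard code.

```stata
* Toy data: one row per attempted survey (hypothetical names and values)
clear
input byte consent int age float income byte unused
1 34 250 .
1 41 .   .
0 .  .   .
1 29 275 .
end

* Check 1: consent rate
quietly summarize consent
scalar consent_rate = 100 * r(mean)
display "Consent rate (%): " %5.1f consent_rate

* Remaining checks use consenting respondents only
keep if consent == 1

* Check 2: percentage of all survey values that are missing
egen n_miss = rowmiss(age income unused)
quietly summarize n_miss
scalar pct_missing = 100 * r(sum) / (_N * 3)
display "Values missing (%): " %5.1f pct_missing

* Check 5: variables with all values missing, and with at least one missing
scalar n_all_missing = 0
scalar n_any_missing = 0
foreach var of varlist age income unused {
    quietly count if missing(`var')
    if r(N) == _N {
        scalar n_all_missing = n_all_missing + 1
    }
    if r(N) > 0 {
        scalar n_any_missing = n_any_missing + 1
    }
}
display "Variables with all values missing: " n_all_missing
display "Variables with at least one value missing: " n_any_missing
```

Note that this sketch counts an entirely missing variable in both totals. A dashboard could equally report the two groups separately.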