DQA_Results

kapsner edited this page Jun 11, 2022 · 3 revisions

DQA Results

The DQA output is organized into several sections, which are introduced below.

Descriptive Results

The descriptive results give, for each dataelement, an overview of metadata, completeness counts, frequency counts / statistics, and conformance checks. The results from the two databases are displayed side by side for easier comparison.

Description

For each dataelement, a description is provided if defined in the MDR.

Metadata

The metadata section summarizes some informative metadata from the MDR, such as the dataelement's variable name and table name in the database, the variable type, and more.

Completeness Overview

The completeness overview provides four numbers for a direct comparison between the source database and the target database. The same numbers are also compared automatically between the two databases, and the outcome is displayed at the beginning of the report under 'ETL Checks (Validation)'.

  • n: the total number of available rows in the database for the selected dataelement
  • valid values: the number of rows in which the dataelement has a value, i.e. is not missing (NULL, NA, etc.)
  • missing values: the number of rows with missing values ($n = \text{missing values} + \text{valid values}$)
  • distinct values: the number of distinct / unique values of the dataelement
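The four numbers above can be illustrated with a minimal Python sketch. It is not the tool's own implementation: the function names, the set of markers treated as missing, and the choice to count distinct values over non-missing rows only are all assumptions made for this example. The final comparison only hints at how the same numbers could be checked between the two databases.

```python
import math


def is_missing(value):
    """Assumption: None, float NaN, and the strings '', 'NA', 'NULL' count as missing."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value.strip() in {"", "NA", "NULL"}


def completeness_overview(values):
    """Return the four completeness numbers for one dataelement."""
    n = len(values)
    valid = [v for v in values if not is_missing(v)]
    return {
        "n": n,
        "valid_values": len(valid),
        "missing_values": n - len(valid),  # n = missing values + valid values
        "distinct_values": len(set(valid)),  # assumption: missing rows are ignored
    }


# Hypothetical column extracts from a source and a target database.
source = ["A", "B", None, "A", "NA"]
target = ["A", "B", "A"]

src = completeness_overview(source)
tgt = completeness_overview(target)

# Compare each number between the two databases.
comparison = {key: src[key] == tgt[key] for key in src}
print(src)
print(tgt)
print(comparison)
```

In this example the valid and distinct counts agree, while n and the missing count differ because the target extract contains no rows with missing values.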

Results (Frequency Counts / Statistics)

The formatting of the results section depends on the variable type:

  • enumerated / string: by default, at most 25 categories are displayed along with their frequency counts.

  • float / integer: statistical dispersion parameters are displayed (min, median, mean, max, SD, and more ...)

  • datetime: simple summary statistics are displayed (min, q25, median, mean, q75, max)
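The type-dependent formatting can be sketched as a small dispatch function. This is an illustration under stated assumptions, not the report's actual code: `summarize`, `numeric_summary`, and the `max_categories` parameter are hypothetical names, the exact set of numeric statistics is illustrative, and datetimes are summarized via their offsets in seconds from the earliest value.

```python
from collections import Counter
from datetime import datetime, timedelta
import statistics


def numeric_summary(values):
    """Min, quartiles, median, mean, max, and SD for a list of numbers."""
    q25, _, q75 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q25": q25,
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "q75": q75,
        "max": max(values),
        "sd": statistics.stdev(values),
    }


def summarize(values, variable_type, max_categories=25):
    """Pick the summary that matches the variable type."""
    if variable_type in ("enumerated", "string"):
        # Frequency counts, limited to the most frequent categories.
        return Counter(values).most_common(max_categories)
    if variable_type in ("float", "integer"):
        return numeric_summary(values)
    if variable_type == "datetime":
        # Summarize offsets from the earliest value, then map back to datetimes.
        origin = min(values)
        offsets = [(v - origin).total_seconds() for v in values]
        stats = numeric_summary(offsets)
        stats.pop("sd")  # the datetime summary lists no SD
        return {name: origin + timedelta(seconds=s) for name, s in stats.items()}
    raise ValueError(f"unknown variable type: {variable_type}")


print(summarize(["f", "m", "f"], "enumerated"))
print(summarize([1, 2, 3, 4], "integer"))
print(summarize([datetime(2020, 1, 1), datetime(2020, 1, 3), datetime(2020, 1, 5)], "datetime"))
```

Note that `statistics.quantiles` and `statistics.stdev` need at least two values, so a real report would have to handle shorter columns separately.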

Value Conformance Checks

If value conformance checks are defined in the MDR for the respective dataelement, their results are displayed along with the constraining values / rules. The report clearly indicates whether the checks passed or failed against these constraints. If the status is failed, the values that do not conform to the rules are also displayed. As with the completeness checks, these results are also compared automatically between both databases and displayed at the beginning of the report under 'Value Conformance Checks (Verification)'.
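A value conformance check of this kind might look like the following sketch. The constraint layout (a dictionary with either a `value_set` or a `range` entry) and the function name are assumptions chosen for illustration and are not taken from the MDR definition; the values are assumed to contain no missing entries.

```python
def check_value_conformance(values, constraint):
    """Check values against a hypothetical constraint and report violations."""
    if "value_set" in constraint:
        allowed = set(constraint["value_set"])
        violations = [v for v in values if v not in allowed]
    elif "range" in constraint:
        lower = constraint["range"]["min"]
        upper = constraint["range"]["max"]
        violations = [v for v in values if not lower <= v <= upper]
    else:
        raise ValueError("unsupported constraint")
    return {
        "status": "failed" if violations else "passed",
        # Each non-conforming value is listed once.
        "violations": sorted(set(violations)),
    }


sex_result = check_value_conformance(["f", "m", "x", "f", "x"], {"value_set": ["f", "m"]})
age_result = check_value_conformance([34, 51, 47], {"range": {"min": 0, "max": 120}})
print(sex_result)
print(age_result)
```

The first check fails and lists the non-conforming value `x`; the second passes with an empty list of violations.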

💡 If the analyzed database is an SQL database, the SQL statement used to retrieve the data for the respective dataelement can be viewed by clicking a button on the descriptive results page.

Plausibility Results

The plausibility results are organized in the same manner as the descriptive results, but cover only the plausibility statements (if any are defined in the MDR).

Missings

The missings section is a table that lists the absolute and relative number of missing values for each dataelement, again with the results of the source and target databases presented side by side for easier comparison.
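A minimal sketch of such a table follows. It assumes that each database extract is available as a dictionary mapping dataelement names to lists of values and that `None` marks a missing value; both the data layout and the function names are hypothetical.

```python
def missings(values):
    """Return (absolute, relative) missing counts; None marks a missing value here."""
    n = len(values)
    absolute = sum(1 for v in values if v is None)
    relative = absolute / n if n else 0.0
    return absolute, relative


def missings_table(source, target):
    """Build one row per dataelement with source and target results side by side."""
    rows = []
    for name in sorted(source.keys() & target.keys()):
        src_abs, src_rel = missings(source[name])
        tgt_abs, tgt_rel = missings(target[name])
        rows.append((name, src_abs, src_rel, tgt_abs, tgt_rel))
    return rows


# Hypothetical extracts from the two databases.
source = {"age": [34, None, 51, None], "sex": ["f", "m", "f", "m"]}
target = {"age": [34, None, 51, 47], "sex": ["f", "m", "f", "m"]}

for name, src_abs, src_rel, tgt_abs, tgt_rel in missings_table(source, target):
    print(f"{name:<6} source: {src_abs} ({src_rel:.1%})  target: {tgt_abs} ({tgt_rel:.1%})")
```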