Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cdmDatatype check failing #425

Closed
tlecarrour-ee opened this issue Mar 9, 2023 · 2 comments · Fixed by #429
Closed

cdmDatatype check failing #425

tlecarrour-ee opened this issue Mar 9, 2023 · 2 comments · Fixed by #429
Assignees
Labels
bug Something isn't working

Comments

@tlecarrour-ee
Copy link

Hi,

I recently updated DQD from 2.0.0 to 2.1.1 and I now have an error that might not be related to my data, but to the SQL used by the check.

Check id is field_cdmdatatype_measurement_operator_concept_id. It reads:

SELECT 
  num_violated_rows, 
	CASE 
		WHEN denominator.num_rows = 0 THEN 0 
		ELSE 1.0*num_violated_rows/denominator.num_rows 
	END AS pct_violated_rows, 
	denominator.num_rows AS num_denominator_rows
FROM
(
	SELECT 
	  COUNT(violated_rows.violating_field) AS num_violated_rows
	FROM
	(
		/*violatedRowsBegin*/
		SELECT 
		  'MEASUREMENT.OPERATOR_CONCEPT_ID' AS violating_field, 
		  cdmTable.* 
		FROM omop_cdm.MEASUREMENT cdmTable
		WHERE 
		  CASE WHEN (CAST(cdmTable.OPERATOR_CONCEPT_ID AS VARCHAR) ~ '^([0-9]+\.?[0-9]*|\.[0-9]+)$') THEN 1 ELSE 0 END = 0 
		  OR (
		    CASE WHEN (CAST(cdmTable.OPERATOR_CONCEPT_ID AS VARCHAR) ~ '^([0-9]+\.?[0-9]*|\.[0-9]+)$') THEN 1 ELSE 0 END = 1 
		    AND STRPOS(CAST(ABS(cdmTable.OPERATOR_CONCEPT_ID) AS varchar),'.') != 0
		  )
      AND cdmTable.OPERATOR_CONCEPT_ID IS NOT NULL
		/*violatedRowsEnd*/
	) violated_rows
) violated_row_count,
( 
	SELECT 
	  COUNT(*) AS num_rows
	FROM omop_cdm.MEASUREMENT
) denominator;

When I execute the following sub-query:

SELECT 
		  'MEASUREMENT.OPERATOR_CONCEPT_ID' AS violating_field, 
		  cdmTable.* 
FROM
                  omop_cdm.MEASUREMENT cdmTable
WHERE 
		  CASE WHEN (CAST(cdmTable.OPERATOR_CONCEPT_ID AS VARCHAR) ~ '^([0-9]+\.?[0-9]*|\.[0-9]+)$') THEN 1 ELSE 0 END = 0 
		  OR (
		    CASE WHEN (CAST(cdmTable.OPERATOR_CONCEPT_ID AS VARCHAR) ~ '^([0-9]+\.?[0-9]*|\.[0-9]+)$') THEN 1 ELSE 0 END = 1 
		    AND STRPOS(CAST(ABS(cdmTable.OPERATOR_CONCEPT_ID) AS varchar),'.') != 0
		  )
                  AND cdmTable.OPERATOR_CONCEPT_ID IS NOT NULL

I get a list of records with null OPERATOR_CONCEPT_ID. I was expecting those records to be excluded.

If I add a set of parenthesis, I get the expected result:

SELECT 
		  'MEASUREMENT.OPERATOR_CONCEPT_ID' AS violating_field, 
		  cdmTable.* 
FROM
                  omop_cdm.MEASUREMENT cdmTable
WHERE
		  ( -- OPEN
                          CASE WHEN (CAST(cdmTable.OPERATOR_CONCEPT_ID AS VARCHAR) ~ '^([0-9]+\.?[0-9]*|\.[0-9]+)$') THEN 1 ELSE 0 END = 0 
		          OR (
		                  CASE WHEN (CAST(cdmTable.OPERATOR_CONCEPT_ID AS VARCHAR) ~ '^([0-9]+\.?[0-9]*|\.[0-9]+)$') THEN 1 ELSE 0 END = 1 
		                  AND STRPOS(CAST(ABS(cdmTable.OPERATOR_CONCEPT_ID) AS varchar),'.') != 0
		           )
                  ) -- CLOSE
                  AND cdmTable.OPERATOR_CONCEPT_ID IS NOT NULL

Or is my understanding of this check somehow incorrect?

@katy-sadowski
Copy link
Collaborator

Oh, this is a very good catch, thank you so much for reporting it! We do want & need those parentheses around the numeric/integer checks. I will ship a fix for this as soon as possible. Reverting to v2.0.0 should resolve this issue, as we refined the logic in this check in v2.1.0.

@katy-sadowski katy-sadowski self-assigned this Mar 10, 2023
@katy-sadowski katy-sadowski added the bug Something isn't working label Mar 10, 2023
@tlecarrour-ee
Copy link
Author

Thanks for your quick reply and for making DQD better everyday!
I'm looking forward for the fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants