Generate Heel error when non-sensical units are used for a given measurement/observation #4

vojtechhuser · 2018-04-12T16:21:00Z

FORUM: http://forums.ohdsi.org/t/improved-data-quality-checking-for-measurement/5421
SOLUTION:

For a subset of measurements, only units that are valid for the given measurement should be used in the data. For example, using % as the unit for weight is considered incorrect. (This triggers a Heel data quality warning; the subset is determined by the annual OHDSI DataQuality network study.) A completely missing unit also triggers such a warning.
To facilitate machine-assisted and human analysis, for a subset of measurements a single unit is recommended (data-driven network consensus). The ETL process should convert data to the specified target unit. For example, weight in lb is converted to kg so that all weight data are recorded in a single target unit.
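
As a rough illustration of the first point (flagging invalid or missing units), here is a minimal Python sketch. The `VALID_UNITS` table and the `check_unit` function are hypothetical names invented for this example, and the listed units are placeholders rather than results of the network study; the actual Heel rule might look quite different.

```python
from typing import Optional

# Hypothetical allow-list: measurement name -> units considered valid.
# The entries are illustrative placeholders, not the network study output.
VALID_UNITS = {
    "body weight": {"kg", "lb"},
    "body height": {"cm", "inches"},
}


def check_unit(measurement: str, unit: Optional[str]) -> Optional[str]:
    """Return a warning message, or None when there is nothing to report."""
    allowed = VALID_UNITS.get(measurement)
    if allowed is None:
        # The measurement is outside the checked subset, so no rule applies.
        return None
    if unit is None or unit.strip() == "":
        return f"WARNING: {measurement} has a missing unit"
    if unit not in allowed:
        return f"WARNING: unit '{unit}' is not valid for {measurement}"
    return None
```

Measurements that are not in the table are deliberately ignored, which mirrors the idea that only a subset of measurements is checked.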

NEXT STEPS: modify conventions for MEASUREMENT table

Observed units per lab test: https://github.com/OHDSI/StudyProtocolSandbox/blob/master/themis/extras/results2019/S3-units-with-tests.csv
(for example see data for LOINC,8302-2 - height - cm (6 datasets), inches (2 datasets))

Data-driven consensus KB for the second point:
https://github.com/OHDSI/StudyProtocolSandbox/blob/master/themis/extras/results2019/S7-preferred_units-ABC.csv

This issue will document the progress on it and is a place for people to make comments.

ThemisUnits knowledge base will be used.

Proposal is to implement it as R logic (not in SQL). (VH volunteers to do that).

If SQL has to be used, I am making a call for a volunteer SQL developer willing to help.

vojtechhuser · 2018-11-28T21:01:27Z

Currently, per convention 10, the unit is optional. This proposal still allows the unit to be missing, but for advanced sites it provides Heel notifications and warnings.

https://github.com/OHDSI/CommonDataModel/wiki/MEASUREMENT#conventions
Convention MEASUREMENT-10:

dimshitc · 2018-11-28T22:25:33Z

Well, an empty unit is totally correct for some tests, for example: haematocrit and other ratios.

aostropolets · 2018-11-28T22:36:17Z

I believe for those the unit ‘ratio’ should be used. There is nothing wrong with leaving it null, though.

vojtechhuser · 2018-12-27T15:50:38Z

This overlaps with issue #28 (which was closed and subsumed into this one).

ThemisMeasurements study was extended to include coded values and was renamed ThemisConcepts.
The repo was also moved out of the sandbox and is now in https://github.com/OHDSI/OhdsiStudies (folder ThemisConcepts).

vojtechhuser · 2019-01-04T16:46:39Z

For the hematocrit question, let's use the power of real-world data and see what sites are using.
Indeed, 4 sites are using % as the unit :-)

So teams at 13 sites thought hard about how to standardize their lab data. Without those 13 teams ever meeting and debating for a very long time, we can look at an emerging group consensus (assuming each site is diligent about keeping its data analysis-ready and neat).

see here - line 39
https://github.com/OHDSI/StudyProtocolSandbox/blob/master/themis/extras/results2019/S3-units-with-tests.csv#L39

Let's not forget that we are trying to eliminate ambiguity for a machine or an analyst: 0.47 and 47 (as %) differ by a factor of 100.
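
To make the 100-fold ambiguity concrete, here is a small Python sketch. The function name and the unit labels `"%"` and `"ratio"` are assumptions chosen for illustration, not actual CDM concepts. The point is that the value can only be interpreted when a unit is recorded.

```python
from typing import Optional


def hematocrit_as_percent(value: float, unit: Optional[str]) -> float:
    """Express a hematocrit value as a percentage, based on its recorded unit."""
    if unit == "%":
        return value
    if unit == "ratio":
        # A fraction such as 0.47 corresponds to 47 %.
        return value * 100.0
    # Without a unit there is no safe way to tell 0.47 from 47.
    raise ValueError(f"cannot interpret hematocrit value {value} with unit {unit!r}")
```

A null unit raises an error here instead of guessing, because a guess in either direction could be wrong by a factor of 100.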

dimshitc · 2019-01-08T11:32:47Z

So, basically we need to create relationships from each Measurement to its preferred unit in the concept_relationship table, right?
And then, during ETL, people use them to convert whatever the source gives.
For example, if the source gives body weight in pounds, it will be converted to kg in the CDM.
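
A minimal sketch of how such an ETL step could look, written in Python for illustration only. The `PREFERRED_UNIT` dictionary stands in for the proposed measurement-to-unit relationships, and `FACTORS` and `to_preferred` are hypothetical names; none of this reflects an existing vocabulary table or OHDSI tool.

```python
from typing import Tuple

# Stand-in for the proposed "measurement -> preferred unit" relationships.
PREFERRED_UNIT = {"body weight": "kg"}

# Multiplicative conversion factors, keyed by (source unit, target unit).
FACTORS = {("lb", "kg"): 0.45359237}


def to_preferred(measurement: str, value: float, unit: str) -> Tuple[float, str]:
    """Convert a source value to the preferred unit of its measurement."""
    target = PREFERRED_UNIT.get(measurement)
    if target is None or unit == target:
        # No preferred unit is defined, or the value is already in it.
        return value, unit
    factor = FACTORS.get((unit, target))
    if factor is None:
        raise ValueError(f"no conversion from {unit} to {target}")
    return value * factor, target


# Usage: a body weight recorded in pounds is stored in kg.
converted_value, converted_unit = to_preferred("body weight", 150.0, "lb")
```

Raising an error for an unknown conversion, instead of passing the value through, keeps unconverted data from silently mixing with converted data.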

cgreich · 2019-04-26T14:01:06Z

> I believe for those unit ‘ratio’ should be used. There is nothing wrong with leaving it null though

I like the idea to mandate the unit "1" or "{ratio}" or something like that.

> For example, if source gives the body weight in pounds, in the CDM it will be converted to KG

I think that is for the future. Right now, we want to create a mandatory (preferred, as you call it) list of units.

MelaniePhilofsky · 2023-07-06T13:55:38Z

Current: Measurements should have a concept set for acceptable units. Once instantiated, then DQD rules should be built.
Example: Weight measurements can only have kg, lb, ounces units. All other units will generate a DQD error
Who's responsible: Start with Vojtech's study, Anna's study?, put out to the community, Find a group to review final recommendations.

Future: We harmonize all value and units for a particular measurement to one, OHDSI blessed unit. This will be a recommendation.

dimshitc · 2023-07-10T16:00:11Z

Working on DQD, we already created sets of permissible units per measurement concept.
It is relatively broad, for example Weight measurements will have all weight units allowed: from tonne to picogram.
The idea behind that is that:

- LOINC has a table with example units
- UCUM has crosswalks between units of the same dimension
- Data might have different variations of the same value/unit: 1000 mg or 1 g, for example.

So we allowed all units of the same dimension.

It might seem too broad, but this is how we get usable values for a lot of measurements.
It is also not that straightforward when it comes to
cells/ml, units/ml, erythrocytes/ml, /ml, {copies}/ml, etc., which LOINC puts in different groups even though they express the same value,
and to abstract units such as index, ratio, "Generic unit for indivisible thing", score, etc.
So we need to extend these allowed unit groups manually.
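
The approach described above could be sketched roughly as follows. This is a Python illustration with hypothetical names (`UNITS`, `MANUAL_GROUPS`, `same_group`, `convert`) and a tiny invented subset of units; it is not the actual DQD script.

```python
# unit -> (dimension, factor to the dimension's base unit); illustrative subset
UNITS = {
    "kg": ("mass", 1e3),
    "g": ("mass", 1.0),
    "mg": ("mass", 1e-3),
}

# Manually curated groups of units treated as interchangeable.
MANUAL_GROUPS = [
    {"cells/ml", "units/ml", "erythrocytes/ml", "/ml", "{copies}/ml"},
]


def same_group(a: str, b: str) -> bool:
    """True if both units share a dimension or a manually defined group."""
    if a in UNITS and b in UNITS:
        return UNITS[a][0] == UNITS[b][0]
    return any(a in group and b in group for group in MANUAL_GROUPS)


def convert(value: float, src: str, dst: str) -> float:
    """Convert between two units of the same dimension."""
    if src not in UNITS or dst not in UNITS or UNITS[src][0] != UNITS[dst][0]:
        raise ValueError(f"cannot convert {src} to {dst}")
    return value * UNITS[src][1] / UNITS[dst][1]
```

With these tables, `same_group("mg", "g")` holds through the shared dimension, while `same_group("cells/ml", "/ml")` holds only because of the manual group.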

What is your suggested approach?
Do you want to work on a small number of measurements where you can pick units manually and carefully, OR improve the existing approach by extending the unit groups? If the latter, I can share the scripts that produce the pairs we currently use in DQD.

MelaniePhilofsky · 2023-07-18T20:13:40Z

@dimshitc

You should discuss this issue with @vojtechhuser since he is the creator of the original issue. I think you or Vojtech should be the owner of this issue. Let me know who it will be, so I can assign this issue appropriately.

We need a clear description of the problem, the current use case, and the approach to remedy the issue. We will also need guidance for the ETLer and for the end users of the data. I am not sure the latter is necessary, but the former definitely is, since the DQD will generate an error for nonsensical units.

MelaniePhilofsky · 2025-03-07T18:00:27Z

If this issue is still a concern, please bring it to the DQD WG.

ericaVoss added the NEW label Jun 25, 2018

ericaVoss removed the NEW label Nov 13, 2018

ericaVoss added this to the Next Release milestone Nov 13, 2018

ericaVoss added the PRIORITY HIGH label Nov 27, 2018

MeghanPettine mentioned this issue Nov 28, 2018

Standardization of Measurements and Units #28

Closed

mvanzandt added the Under 60 Day Review label Apr 16, 2019

vojtechhuser mentioned this issue May 6, 2019

Implement Themis WG1 recommendations (units for labs) OHDSI/Achilles#250

Closed

vojtechhuser mentioned this issue Sep 9, 2019

notification about non-preferred units OHDSI/DataQualityDashboard#44

Closed

MelaniePhilofsky removed the PRIORITY HIGH label Jul 5, 2023

MelaniePhilofsky added the PRIORITY MEDIUM label Jul 6, 2023

clairblacketer added this to OMOP CDM & THEMIS Conventions Dec 8, 2023

clairblacketer removed this from OMOP CDM & THEMIS Conventions Feb 7, 2024

MNairn mentioned this issue Apr 9, 2024

How to populate observation_period from EHR Data #103

Closed

MelaniePhilofsky removed the PRIORITY MEDIUM label Dec 3, 2024

MelaniePhilofsky closed this as completed Mar 7, 2025
