Survey Data in CDM #90
Adding PROM data to CDM
ICON plc is currently engaged in a project with [[http://www.ichom.org/|ICHOM (International Consortium for Health Outcomes Measurement)]].
ICHOM brings together patient representatives, clinician leaders, and registry leaders from all over the world to develop Standard Sets, comprehensive yet parsimonious sets of outcomes and case-mix variables for specific medical conditions that ICHOM recommends all providers track.
Each Standard Set focuses on patient-centered results, and provides an internationally agreed-upon method for measuring each of these outcomes. ICHOM believes that standardized outcomes measurement will open up new possibilities to compare performance globally, allow clinicians to learn from each other, and rapidly improve the care provided to patients.
ICHOM Standard Sets include baseline conditions and risk factors to enable meaningful case-mix adjustment globally, ensuring that comparisons of outcomes will take into account the differences in patient populations across not just providers, but also countries and regions. They also include high-level treatment variables to allow stratification of outcomes by major treatment types. A comprehensive data dictionary, as well as scoring guides for patient-reported outcomes, is provided for each Standard Set.
ICON plc is developing a platform to ingest, store and analyse the outcome measures and is using the OMOP Common Data Model to store the data. The current CDM satisfies many of the requirements, but there are some gaps, specifically:
(other potential table name options: PRO, PATIENT_REPORTED_OUTCOME, PROM, others?)
Patient responses to survey questions are stored in the RESPONSE table. Each record in the RESPONSE table represents a single question/response pair and is linked to a specific survey (questionnaire) occurrence through SURVEY_OCCURRENCE_ID. Each response record answers a specific question, identified by QUESTION_CONCEPT_ID. This concept ID refers to a unique question in the CONCEPT table, identified through the concept DOMAIN (see example below for more details). An individual survey question can have multiple responses (e.g. "which of these items relate to you: a, b, c, …?"). Each response is stored as a separate record in the RESPONSE table.
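As a rough illustration of the one-row-per-response idea, here is a minimal Python sketch. Only SURVEY_OCCURRENCE_ID and QUESTION_CONCEPT_ID come from the description above; `response_id`, `response_concept_id`, the helper name and all ID values are hypothetical and made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """One question/response pair, i.e. one row of the proposed RESPONSE table."""
    response_id: int            # hypothetical surrogate key
    survey_occurrence_id: int   # links the row to one completed questionnaire
    question_concept_id: int    # concept identifying the question
    response_concept_id: int    # hypothetical: concept identifying the answer


def group_by_question(responses):
    """Map (survey_occurrence_id, question_concept_id) to the list of answers."""
    grouped = defaultdict(list)
    for r in responses:
        key = (r.survey_occurrence_id, r.question_concept_id)
        grouped[key].append(r.response_concept_id)
    return dict(grouped)


# Made-up IDs: one questionnaire (100) with a multi-select question (9001)
# answered three times and a single-answer question (9002).
rows = [
    Response(1, 100, 9001, 9101),
    Response(2, 100, 9001, 9102),
    Response(3, 100, 9001, 9103),
    Response(4, 100, 9002, 9201),
]
answers = group_by_question(rows)
```

With this layout, a multi-select question simply yields several rows that share the same SURVEY_OCCURRENCE_ID and QUESTION_CONCEPT_ID.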
CONCEPT table - example
The patient response is captured as a code 2 (in this instance) in the questionnaire. The CONCEPT_ID is determined by finding the row in the CONCEPT table whose DOMAIN_ID matches the specific question (identified by HPS1) and whose CONCEPT_CODE matches the response value (2).
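The lookup described here could be sketched in Python roughly as follows. The column names DOMAIN_ID and CONCEPT_CODE and the values HPS1 and 2 come from the text; the concept rows, the `concept_id` values and the function name are invented for illustration and are not taken from the actual vocabulary.

```python
# Made-up CONCEPT rows; only the column names come from the description above.
CONCEPT_ROWS = [
    {"concept_id": 5001, "domain_id": "HPS1", "concept_code": "1"},
    {"concept_id": 5002, "domain_id": "HPS1", "concept_code": "2"},
    {"concept_id": 5003, "domain_id": "HPS2", "concept_code": "2"},
]


def find_response_concept_id(concept_rows, question_code, response_code):
    """Return the concept_id whose DOMAIN_ID and CONCEPT_CODE both match.

    Raises LookupError when there is no match or more than one match.
    """
    matches = [
        row["concept_id"]
        for row in concept_rows
        if row["domain_id"] == question_code
        and row["concept_code"] == str(response_code)
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one concept for {question_code}/{response_code}, "
            f"found {len(matches)}"
        )
    return matches[0]


# Question HPS1 answered with code 2, as in the example above.
concept_id = find_response_concept_id(CONCEPT_ROWS, "HPS1", 2)
```

Failing loudly on zero or multiple matches is a design choice of this sketch, so that an ambiguous or unmapped response code is not silently stored.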
SURVEY table - example
RESPONSE table - example
Survey data is very important, but I think we could potentially use a combination of visit_occurrence, the new visit_detail and the observation table, with minor modifications, to achieve the same end result. Visit_detail may be able to represent much of the proposed survey table, and the observation table may be able to represent much of the response table.
I actually think this fits one of the questions that @alondhe, @clairblacketer and I put out on the forum: where do you put patient-reported information like HRA or patient-reported medications? I think the suggestion that gets adopted here could be used to answer this question.
Is it at all possible to consider the OBSERVATION table instead of the RESPONSE table (like @gowthamrao is suggesting above)? I feel like we can still make it fit. Instead of adding a whole new table we could add some columns to an existing table.
Erica, yes, for some of this data I agree, it is the same class of problem. However, some of the data, as you have already pointed out, belongs in the already defined domains such as drug exposure. The method of collection may be somewhat irrelevant, or at least secondary. For the collection of validated PRO survey data, the collection mechanism is extremely important, hence the need to track each instance of a questionnaire being completed. It is possible to store the answers to the questionnaires in the OBSERVATION table with a number of key modifications, and that is what I am currently doing in the absence of a RESPONSE table. I have included a RESPONSE table in my proposal because when I ask myself the question "is this data (observations that are collected today AND responses to a patient questionnaire) combined for any analysis purposes?", the answer is no. When I am interrogating patient responses to validated (or non-validated) patient questionnaires, I am scoring patients on a very specific measure or reported outcome. I never add, subtract or count this data together with general unvalidated observations. I don't see any value in merging this data that would offset the impact of changing the structure of the observation table to accommodate questionnaire responses.
Hope that adds some clarity to my reasoning.
Thanks for pushing this forward with a vengeance, this is really important work. I've got a whole lot of comments I would like to discuss with you guys. However, I feel this towel-like long comment exchange is not a good way of doing it: it is very hard to refer to anything and have an in-depth, meaningful discussion. You keep scrolling up and down till the mouse gets sore. Should we figure out something more streamlined, until the proposal has matured a little more? Other working groups have used:
Let me know if you need help with setting things up, we'll gladly put some effort in.
My worries are mostly around the following issues:
BTW: There is no "negative address space". All concept_ids are positive integers. There is a convention that if you have local codes that are completely useless to the outside world, but you want to create private concepts, you use the ID space above 2 billion. But that convention doesn't seem to apply to the problem at hand: we are talking about standardized Survey Concepts.
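To make the convention concrete, a small Python sketch follows. The 2 billion threshold and the "no negative IDs" rule are from the comment above; the constant name, the function name, the labels "local" and "shared", and treating the threshold as exclusive are assumptions of this example.

```python
# Assumed reading of "above 2 billion": strictly greater than this value.
LOCAL_CONCEPT_ID_FLOOR = 2_000_000_000


def concept_id_space(concept_id: int) -> str:
    """Classify a concept_id as site-local ("local") or "shared"."""
    if concept_id < 0:
        # There is no negative address space for concept_ids.
        raise ValueError("concept_ids are never negative")
    return "local" if concept_id > LOCAL_CONCEPT_ID_FLOOR else "shared"
```

Under this reading, standardized survey concepts would fall in the "shared" range, and only site-specific private concepts would be classified as "local".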
Chris, I agree with the comment on collaboration, so there is a meeting / workshop set up for next week to go through this. In the meantime, just to comment on your feedback above (which is great):
Happy to set this up as a Google doc for the workshop if you think that will help. Christian, do you want to plan this offline prior to the meeting? Happy to work with you on how best to conduct the workshop.
I'm sorry I couldn't make the call last week, but I'm glad there's discussion about structuring PRO data in the CDM. The PEPR Consortium (a group of sites doing validation of pediatric PROMIS measures) has been hammering out a structure for this purpose as well, which I've pasted below in the hope that it'll be useful for the discussion. Some top-level points:
PEPR pro_occurrence Table
@baileych, thanks for that input, and in the context of PROMIS it makes perfect sense. Some of your attributes have less ambiguous definitions than those contained in my earlier proposal. From a general PRO perspective, and to address some additional requirements (e.g. tracking progression of a patient's condition over time), I have normalized your PRO_occurrence into two separate tables. There are a number of attributes in the SURVEY table that are key for DQ analysis and validity of the data.
Adding the score into this PRO_occurrence table is a great idea. It is not something I currently store, but it makes sense to do it here. I am going to merge some of the concepts here into my earlier proposal and walk through them during the workshop scheduled for next Tuesday (19th Sept).