I found an issue in the getdPatientCohort function. The current versi… #19
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I found two issues in the code, and the following is the explanation of the problem.
Issue 1:
Following is a section of the code used in the current version.
WHERE observation_concept_id IN (", paste(includeConceptlist,collapse=","), ") AND observation_concept_id NOT IN (", paste(excludeConceptlist,collapse=",")
The expected outcome is to find patient_ids who have at least one observation_concept_id in the includeConceptlist but have not in the excludeConceptlist. However, this part does not give the expected outcome. The current code corrected this issue using a join.
Issue 2:
A given patient might have an observation_concept_id in the include concept list. However, the same person might have an exclusion concept in the condition_concept_id. Therefore, one has to combine both observation and condition_occurrence tables. The solution is to combine the observation and condition_occurrence tables using union all. Also, note that if someone is including measurement_concept_id, drug_concept_id, procedure_concept_id, and note_nlp_concept_ids in the inclusion/exclusion lists, we have to combine all tables using union all before checking for inclusion and exclusion criteria.
Solution.
We need to fix three places.
Place 1:
The section from line 193 to 196. This section has both issues 1 and 2.
One has to combine the observation and condition_occurrence tables. However, this requires changing line 233. I am afraid that could cause confusions. Therefore, in the current version, I only fix issue 1. If you agree with my logic. I will fix the issue 2 too.
Place 2:
The section from line 198 to 230. I am not touching this section now. If you agree with my solution we could revisit this section.
Place 3:
Line 240. This line also suffers from both issues 1 and 2. I fixed both problems in the new version. Note that I use a CTE to increase the clarity of the coding. This helps me to keep the code short and easy to read. However, CTEs are sometimes known to slow down the query. If CTEs are slowing you too much, we could modify this part easily. Also, note that I union six tables. I have two dashes in front of every union. During testing, one can comment on sections as needed.
Signed-off-by: udaraabeysekara udaraabeysekara@yahoo.com