Refactoring column output and export routines #3
@lhalvors @PRijnbeek I reviewed these changes and ran the code against our internal SQL PDW environment, where I hit some issues. The first problem was that SQL PDW doesn't support INSERT statements that include common table expressions. As I got deeper into the review, I noted that you are storing data as:
Achilles_results.stratum_1 ==> concept_id
Achilles_results.stratum_2 ==> % people
Achilles_results.count_value ==> the frequency # for the concept_id in a person's history
As a result, you also had to modify the delete statement that removes any records based on the @smallcellcount setting:
In speaking with @chrisknoll, we noted that Achilles_results.count_value should hold the count of persons, as this is the convention used in the other analyses. As a result, I've refactored the frequency analyses to store the data as follows:
Achilles_results.stratum_1 ==> concept_id
Achilles_results.stratum_2 ==> the frequency # for the concept_id in a person's history
Achilles_results.count_value ==> the count of persons
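To make the new layout concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL PDW. The `condition_occurrence` toy data and the analysis id 9999 are assumptions made up for illustration, not the actual Achilles analysis. It uses a derived table instead of a common table expression, in keeping with the PDW limitation noted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (person_id INTEGER, condition_concept_id INTEGER);
-- Toy data (hypothetical): persons 1 and 2 have concept 100 twice,
-- person 3 has it once, and person 1 also has concept 200 once.
INSERT INTO condition_occurrence VALUES
  (1, 100), (1, 100),
  (2, 100), (2, 100),
  (3, 100),
  (1, 200);
CREATE TABLE achilles_results (
  analysis_id INTEGER, stratum_1 TEXT, stratum_2 TEXT, count_value INTEGER);
""")

HYPOTHETICAL_ANALYSIS_ID = 9999  # placeholder id, not a real Achilles analysis

# Inner query: how often each concept occurs per person.
# Outer query: how many persons share each (concept, frequency) pair.
conn.execute("""
INSERT INTO achilles_results (analysis_id, stratum_1, stratum_2, count_value)
SELECT ?, per_person.concept_id, per_person.freq, COUNT(*)
FROM (
  SELECT person_id, condition_concept_id AS concept_id, COUNT(*) AS freq
  FROM condition_occurrence
  GROUP BY person_id, condition_concept_id
) per_person
GROUP BY per_person.concept_id, per_person.freq
""", (HYPOTHETICAL_ANALYSIS_ID,))

rows = conn.execute("""
SELECT stratum_1, stratum_2, count_value
FROM achilles_results
ORDER BY stratum_1, stratum_2
""").fetchall()
print(rows)
```

With this toy data, concept 100 at frequency 2 gets a count_value of 2 because two persons each have it twice.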
I then revised the delete statements referenced above to remove your analyses from the list since we want to ensure that any analyses that have a person count <= @smallcellcount are censored properly.
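Because count_value now holds a person count, the censoring rule applies to these rows like any others. The following is only a sketch of that rule in sqlite3. The rows, the analysis id 9999, and the threshold of 5 are made-up assumptions, and the real delete statement in Achilles may carry additional conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE achilles_results (
  analysis_id INTEGER, stratum_1 TEXT, stratum_2 TEXT, count_value INTEGER)
""")
# Hypothetical rows: (analysis_id, concept_id, frequency, count of persons)
conn.executemany("INSERT INTO achilles_results VALUES (?, ?, ?, ?)", [
    (9999, "100", "1", 3),
    (9999, "100", "2", 5),
    (9999, "200", "1", 12),
])

small_cell_count = 5  # stands in for the @smallcellcount setting

# Censor every row whose person count is at or below the threshold.
conn.execute(
    "DELETE FROM achilles_results WHERE count_value <= :smallcellcount",
    {"smallcellcount": small_cell_count},
)

remaining = conn.execute(
    "SELECT stratum_1, stratum_2, count_value FROM achilles_results"
).fetchall()
print(remaining)
```

Only the row with 12 persons survives. The rows with 3 and 5 persons are removed because the comparison is `<=`.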
Furthermore, I refactored the exportJSON SQL calls so that they calculate the % of persons while preserving the columns that you originally had in your query. The notable difference is that I've included a reference to another Achilles analysis to find the denominator when calculating the %, which is consistent with the other export SQL scripts. Specifically, you will see this in the queries:
(select count_value from @results_database_schema.ACHILLES_results where analysis_id = 1) denom
This references analysis 1, which computes the total # of people in the CDM.
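As a rough illustration of how the denominator subquery feeds the percentage, here is a sketch in sqlite3. The output column names, the analysis id 9999, and all the numbers are assumptions for illustration. The actual export queries may select different columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE achilles_results (
  analysis_id INTEGER, stratum_1 TEXT, stratum_2 TEXT, count_value INTEGER)
""")
conn.executemany("INSERT INTO achilles_results VALUES (?, ?, ?, ?)", [
    (1, None, None, 200),    # analysis 1: total persons (made-up value)
    (9999, "100", "1", 50),  # hypothetical frequency analysis rows
    (9999, "100", "2", 30),
])

# The single-row denom subquery is cross joined to every result row,
# so each person count can be divided by the total number of persons.
percent_rows = conn.execute("""
SELECT r.stratum_1   AS concept_id,
       r.stratum_2   AS frequency,
       r.count_value AS num_persons,
       100.0 * r.count_value / denom.count_value AS percent_persons
FROM achilles_results r
CROSS JOIN (SELECT count_value FROM achilles_results WHERE analysis_id = 1) denom
WHERE r.analysis_id = ?
ORDER BY r.stratum_1, r.stratum_2
""", (9999,)).fetchall()
print(percent_rows)
```

Multiplying by `100.0` forces floating-point division. With integer operands alone, the division would truncate.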
I'd kindly ask you to review and test these changes. I know they will have a ripple effect on your changes for WebAPI/Atlas, so I'm happy to help change that code as well. Please reach out with any questions; I'm happy to discuss as needed. If and when you are comfortable with these changes, we can push this update to the open pull request on the OHDSI repository.