Refactoring column output and export routines #3

anthonysena · 2017-08-18T02:50:32Z

@lhalvors @PRijnbeek I reviewed these changes and ran the code against our internal SQL PDW environment and ran into some issues. A first problem was that SQL PDW doesn't support INSERT statements that include common table expressions. As I got deeper into the review, I noted that you are storing data as:

Achilles_results.stratum_1 ==> concept_id
Achilles_results.stratum_2 ==> % people
Achilles_results.count_value ==> the frequency # for the concept_id in a person's history

As a result, you also had to modify the delete statement that removes any records based on the @smallcellcount setting:

delete from @results_database_schema.ACHILLES_results 
 where analysis_id <> 691 
 	and analysis_id <> 791 
 	and analysis_id <> 891 
 	and analysis_id <> 1891 
 	and count_value <= @smallcellcount;
delete from @results_database_schema.ACHILLES_results_dist 
 where analysis_id <> 691 
 	and analysis_id <> 791
 	and analysis_id <> 891
 	and analysis_id <> 1891
 	and count_value <= @smallcellcount;

In speaking with @chrisknoll, we noted that the Achilles_results.count_value should hold the count of persons as this is the convention that is used in the other analyses. As a result, I've refactored the frequency analyses to store the data as follows:

Achilles_results.stratum_1 ==> concept_id
Achilles_results.stratum_2 ==> the frequency # for the concept_id in a person's history
Achilles_results.count_value ==> the count of persons

I then revised the delete statements referenced above to remove your analyses from the list since we want to ensure that any analyses that have a person count <= @smallcellcount are censored properly.

Furthermore, I refactored the exportJSON SQL calls so that we calculate the % of persons and preserved the columns that you originally had in your query. The notable difference is that I've included a reference to another Achilles analysis to find the denominator when calculating the % which is consistent with the other export SQL scripts. Specifically, you will see this in the queries:

(select count_value from @results_database_schema.ACHILLES_results where analysis_id = 1) denom

Which references an analysis that computes the total # of people in the CDM.

I'd kindly ask you to review and test these changes. I know they will have a ripple effect to your changes for WebAPI/Atlas so I'm happy to help change that code as well. Please reach out with any questions and happy to discuss as needed. If and when you are comfortable with these changes, we can then push this update to the open pull request on the OHDSI repository.

lhalvors

The Achilles changes are OK.

In the exportToJson changes, the Y_NUM_PERSONS values currently come out as 1/100'th of the original.

Changing the following statement:
"1.0*num.count_value / denom.count_value as Y_NUM_PERSONS, "

to:
"cast(round((100.0*num.count_value / denom.count_value), 0) as int) as Y_NUM_PERSONS, "

should produce the same output values as before.

anthonysena · 2017-08-21T15:20:35Z

Thanks Lars for the feedback. I've pushed a change to change the calculation of the Y_NUM_PERSONS column and I believe that should address the files for export. Please let me know if you find any other issues. Thanks!

* Frequency distribution of total persons that have at least x drug_exposure/observation/measurement records, respectively * Refactoring column output and export routines (#3) * Refactoring column output and export routines * Fixing Y_NUM_PERSONS for export

* Frequency distribution of total persons that have at least x drug_exposure/observation/measurement records, respectively * Refactoring column output and export routines (#3) * Refactoring column output and export routines * Fixing Y_NUM_PERSONS for export * Correct sub-query syntax for frequency distribution queries

Refactoring column output and export routines

31a2961

lhalvors requested changes Aug 21, 2017

View reviewed changes

Fixing Y_NUM_PERSONS for export

239d6ba

lhalvors approved these changes Aug 22, 2017

View reviewed changes

lhalvors merged commit 18bbb4f into mi-erasmusmc:freq_distribution Aug 22, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactoring column output and export routines #3

Refactoring column output and export routines #3

anthonysena commented Aug 18, 2017

lhalvors left a comment

anthonysena commented Aug 21, 2017

Refactoring column output and export routines #3

Refactoring column output and export routines #3

Conversation

anthonysena commented Aug 18, 2017

lhalvors left a comment

Choose a reason for hiding this comment

anthonysena commented Aug 21, 2017