Clinical Feature Extraction/Engineering Abstraction to Reusable SQL Code #392

jonc101 · 2023-01-12T00:27:25Z

Similar to Issue #391 on cohort query abstraction, do the same for the common task of clinical feature extraction from the Stanford clinical EMR databases. Write it in SQL so that we can reuse the BigQuery infrastructure directly, without depending on Python, R, or other layers. (Application logic layers can still be used for more advanced feature engineering and manipulation.)

Do we have SQL code for common clinical feature extraction / engineering / a feature matrix factory? I know we have several versions of people's Python code for common feature engineering; @ccorbin's is probably the most robust at this point. But just as we can consolidate cohort construction in SQL alone (given how bizarrely efficient BigQuery is), it would be worth doing the same for feature extraction. We can still use Python or other downstream code for further feature engineering or manipulation. I think @Grace K might have had some examples of constructing common features using just SQL queries that add feature columns to a cohort table/query?
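To make the "add feature columns to a cohort table/query" idea concrete, here is a minimal sketch of the pattern, not anyone's existing code. Everything named in it is an assumption for illustration: the `cohort` and `lab_result` tables, their columns, the 365-day lookback, and the creatinine feature are all hypothetical. It runs against an in-memory SQLite database purely so the query can be checked on toy data; the Python wrapper is only a test harness, and the SQL is the part that would be reused.

```python
import sqlite3

# Toy in-memory stand-ins for a cohort table and a lab results table.
# All table names, column names, and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cohort (patient_id INTEGER, index_date TEXT);
CREATE TABLE lab_result (
  patient_id INTEGER, lab_name TEXT, result_date TEXT, value REAL
);
INSERT INTO cohort VALUES
  (1, '2022-06-01'), (2, '2022-06-01'), (3, '2022-06-01');
INSERT INTO lab_result VALUES
  (1, 'creatinine', '2022-05-01', 1.0),
  (1, 'creatinine', '2022-05-20', 1.4),
  (1, 'creatinine', '2022-06-10', 9.9),
  (2, 'creatinine', '2020-01-01', 2.0),
  (2, 'sodium',     '2022-05-15', 140.0);
""")

# One row per cohort member, with summary features computed only from
# results strictly before the index date and within a 365-day lookback.
# The LEFT JOIN keeps patients with no qualifying results (NULL features).
# The lookback uses SQLite's date(); in BigQuery the equivalent bound
# would be written with DATE_SUB(index_date, INTERVAL 365 DAY).
FEATURE_QUERY = """
SELECT
  c.patient_id,
  c.index_date,
  COUNT(l.value) AS creatinine_count,
  MAX(l.value)   AS creatinine_max,
  AVG(l.value)   AS creatinine_mean
FROM cohort AS c
LEFT JOIN lab_result AS l
  ON  l.patient_id = c.patient_id
  AND l.lab_name = 'creatinine'
  AND l.result_date <  c.index_date
  AND l.result_date >= date(c.index_date, '-365 days')
GROUP BY c.patient_id, c.index_date
ORDER BY c.patient_id
"""

feature_rows = conn.execute(FEATURE_QUERY).fetchall()
for row in feature_rows:
    print(row)
```

The filter conditions sit in the `ON` clause rather than in `WHERE` so that cohort members without any qualifying lab results still appear in the output with a count of 0 and NULL summary values, which keeps the feature matrix aligned with the cohort. Restricting to results before the index date is what prevents leaking post-index information into the features.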

jonc101 created this issue from a note in DevOps (To Do) Jan 12, 2023
