Build your own universe
Scale high-quality research data provisioning with R packages
Travis Gerke & Garrick Aden-Buie
This is a 20-minute talk most recently given at R/Medicine 2020.
Institutional honest brokers consolidate patient, clinical, and lab data from a variety of sources to provide investigators with research-ready data sets. High-quality research data provisioning requires skilled navigation of heterogeneous software systems and a detailed understanding of the data structure standards within each source. In this talk, we discuss how we, as honest brokers at a large cancer center, have created a universe of internal R packages that simplify data access, store and present metadata, standardize best practices, support reproducibility and repeatability, apply branding styles to reports and visualizations, and facilitate communication with research data end users. Our package ecosystem streamlines the honest broker workflow, allowing the curation and delivery of high-quality research data to scale.