This repository contains files for a two-day course on Visualization for Data Science in R, offered during Data Matters 2018. The course description and activities are listed below.
This course is designed for two audiences: experienced visualization designers looking to apply open data science techniques to their work, and data science professionals who have limited experience with visualization. Participants will develop skills in visualization design using R, a tool commonly used for data science. Basic familiarity with R is required.
Data science skills are increasingly important for research and industry projects. With complex data science projects, however, come complex needs for understanding and communicating analysis processes and results. The rise of data science has been accompanied by a comparable rise in business intelligence and in the demand for visualizations and dashboards that can explain models, summarize results, assist with decision making, and even predict outcomes. Ultimately, an analyst's data science toolbox is incomplete without visualization skills.
The course takes a project-based approach to learning best practices in visualization for data science. Participants will be guided through two or three sample analysis and visualization projects that highlight different types of visualization, different features of R and its visualization libraries, and different challenges that arise when applying an open data science philosophy to visualization.
- Introduction to visualization in R
- Visualization for data exploration
- Visualization for communication
- Interactive visualizations with Shiny
This course assumes basic familiarity with R, including its syntax, data structures, and development environments. Most visualizations will be created with ggplot2 and other tidyverse libraries, but prior experience with those libraries is not required. To participate in class exercises, participants should bring a laptop with current versions of R and RStudio installed and with sufficient privileges to install new R packages on demand.
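One way to check a laptop before class is sketched below. This is a minimal sketch, not an official setup script for the course: the package list (`tidyverse`, which includes ggplot2, and `shiny`) is an assumption based on the libraries named above, and `missing_packages` is a hypothetical helper defined only for this example.

```r
# Assumed package list, inferred from the libraries this README mentions.
required <- c("tidyverse", "shiny")

# Hypothetical helper: return the packages in `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

to_install <- missing_packages(required)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment to install. This needs network access and write privileges
  # on the R library directory.
  # install.packages(to_install)
} else {
  message("All assumed course packages are available.")
}
```

If the commented `install.packages()` call fails with a permissions error, the laptop probably does not meet the "sufficient privileges" requirement described above.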
- ggplot2 Resources
- Shiny Resources
- Example Shiny Apps