R Programming for Research Workshop
Nick Michalak and Iris Wang
University of Michigan LSA Department of Psychology
Required Texts
- Wickham, H., & Grolemund, G. (2017). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol, CA: O'Reilly Media, Inc.
- The tidyverse style guide by Hadley Wickham
Philosophy
- ReedCollegePDX (2015, October 19). Hadley Wickham "Data Science with R". Retrieved from https://youtu.be/K-ss_ag2k9E?list=PLNtpLD4WiWbw9Cgcg6IU75u-44TrrN3A4
- Robinson, D. (2017, July 05). Teach the tidyverse to beginners. Variance Explained. Retrieved from http://varianceexplained.org/r/teach-tidyverse/
- Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1-23.
Day 1. Installation and Introduction
Before Workshop
- Skim introduction (Wickham & Grolemund)
- Browse tidyverse.org
- Skim Hadley Wickham "Data Science with R" (ReedCollegePDX, 2015)
- Find one or two datasets you know well and are OK with others seeing.
- Preferably, find the raw (hasn't been "cleaned") data.
- Make a new folder. Give it a good name. Repeat with subfolders. (Hint: Skim some data management best practices from the Stanford Library or the Michigan Library guide)
- Put your raw data in there, somewhere.
During Workshop
- Introduction / philosophy
- Installing (and uninstalling) R and RStudio
- Installing R (Macintosh / Windows)
- Uninstalling R (Macintosh / Windows)
- Installing RStudio
- Uninstalling RStudio
- R environment
- Running R code
- Demonstrations
- tidyverse
- Exercises
- Resources
- Cheat Sheets
Day 2. Visualization
Before Workshop
- Skim Data visualization and Data import (Wickham & Grolemund)
- Skim magrittr and ggplot2
- Skim Anscombe's quartet
- Skim Matejka, J., & Fitzmaurice, G. (2017, May). Same stats, different graphs: Generating datasets with varied appearance and identical statistics through simulated annealing. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 1290-1294). ACM.
- Skim Weissgerber, T. L., Milic, N. M., Winham, S. J., & Garovic, V. D. (2015). Beyond bar and line graphs: time for a new data presentation paradigm. PLoS biology, 13(4), e1002128.
- Skim McCabe, C. J., Kim, D. S., & King, K. M. (2018). Improving Present Practices in the Visual Display of Interactions. Advances in Methods and Practices in Psychological Science, 2515245917746792.
- Play with their R Shiny web application that accompanies the paper: interActive: A tool for the visual display of interactions
During Workshop
- Introduction and Demonstration
- Anscombe's quartet
- Matejka, J., & Fitzmaurice, G. (2017, May). Same stats, different graphs: Generating datasets with varied appearance and identical statistics through simulated annealing. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 1290-1294). ACM.
- Weissgerber, T. L., Milic, N. M., Winham, S. J., & Garovic, V. D. (2015). Beyond bar and line graphs: time for a new data presentation paradigm. PLoS biology, 13(4), e1002128.
- ggplot2 and the grammar of graphics
- Demonstrations
- Exercises
- Cheat Sheets
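A minimal sketch of the Anscombe's quartet demonstration, assuming ggplot2 is installed. The `anscombe` data frame ships with base R; the object names (`quartet_stats`, `quartet_long`, `quartet_plot`) are made up for illustration and are not the workshop's own materials.

```r
# Anscombe's quartet ships with base R as the `anscombe` data frame
# (columns x1-x4 and y1-y4).
library(ggplot2)

# Means and correlations are nearly identical across the four x-y pairs
quartet_stats <- data.frame(
  set    = 1:4,
  mean_x = sapply(1:4, function(i) mean(anscombe[[paste0("x", i)]])),
  mean_y = sapply(1:4, function(i) mean(anscombe[[paste0("y", i)]])),
  r      = sapply(1:4, function(i) {
    cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
  })
)
print(round(quartet_stats, 2))

# Stack the four pairs into long form so each set can be its own facet
quartet_long <- do.call(rbind, lapply(1:4, function(i) {
  data.frame(
    set = paste("Set", i),
    x   = anscombe[[paste0("x", i)]],
    y   = anscombe[[paste0("y", i)]]
  )
}))

# Same statistics, very different pictures
quartet_plot <- ggplot(quartet_long, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ set)
print(quartet_plot)
```

The point of the plot is that the fitted lines look alike in every panel while the raw points do not, which is why the workshop asks you to plot before you model.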
Day 3. Workflow and Data Transformation
Before Workshop
- Skim Workflow: basics, Data transformation, and Tidy data (Wickham & Grolemund)
- Skim Files and Syntax from the tidyverse style guide (Wickham)
During Workshop
- Coding Basics
- Naming
- Calling Functions
- rep()
- filter()
- arrange()
- select()
- mutate()
- summarise()
- gather()
- spread()
- full_join(), left_join(), right_join(), inner_join()
- ifelse()
- Exercises in wrangling your own data
- Cheat Sheets
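A rough sketch of how the Day 3 functions fit together, assuming dplyr and tidyr are installed. The `scores` and `demographics` data frames are hypothetical toy data invented for this example; `group_by()` is not on the list above but is included because `summarise()` is usually paired with it.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per participant
scores <- data.frame(
  id        = 1:4,
  condition = rep(c("control", "treatment"), each = 2),
  pre       = c(10, 12, 11, 9),
  post      = c(11, 13, 15, 14)
)
demographics <- data.frame(id = c(1L, 2L, 3L, 5L), age = c(19, 22, 20, 21))

# filter rows, create columns, sort, and keep only the columns of interest
changes <- scores %>%
  filter(post > 11) %>%
  mutate(change   = post - pre,
         improved = ifelse(change >= 3, "yes", "no")) %>%
  arrange(desc(change)) %>%
  select(id, condition, change, improved)

# summarise within groups
condition_means <- scores %>%
  group_by(condition) %>%
  summarise(mean_post = mean(post))

# gather: wide to long; spread: long back to wide
scores_long <- scores %>% gather(key = "time", value = "score", pre, post)
scores_wide <- scores_long %>% spread(key = time, value = score)

# joins differ only in which unmatched rows they keep
joined_left  <- left_join(scores, demographics, by = "id")   # all rows of scores
joined_inner <- inner_join(scores, demographics, by = "id")  # matched ids only
joined_full  <- full_join(scores, demographics, by = "id")   # all ids from both
```

Note that `gather()` and `spread()` still work but have been superseded in tidyr by `pivot_longer()` and `pivot_wider()`.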
Day 4. Summarizing and Modeling
Before Workshop
- Skim your favorite regression or ANOVA text, or any tutorials at https://designingexperiments.com/supplements/
- Skim help("lm"), help("car"), and help("afex")
- Skim An introduction to the psych package: Part I: data entry and data description
- Skim An introduction to the psych package: Part II Scale construction and psychometrics
- Skim lavaan: tutorial
- Skim Judd, C. M., Westfall, J., & Kenny, D. A. (2017). Experiments with more than one random factor: Designs, analytic models, and statistical power. Annual Review of Psychology, 68, 601-625.
During Workshop
- describe() and describeBy()
- t.test()
- lm() and Anova()
- corr.test()
- pairs.panels() and cor.plot()
- lmer()
- sem()
- fa.parallel() and fa()
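A small sketch of the first few Day 4 functions, using the `mtcars` data that ships with base R rather than workshop data. `t.test()` and `lm()` are base R; the `describe()` and `Anova()` calls assume the psych and car packages are installed, so they are guarded and skipped otherwise. The object names (`welch_test`, `fit`) are made up for illustration.

```r
# mtcars ships with base R; am codes transmission (0 = automatic, 1 = manual)

# Descriptive statistics (runs only if the psych package is installed)
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::describe(mtcars[, c("mpg", "wt")]))
}

# Welch two-sample t-test: does mpg differ by transmission type?
welch_test <- t.test(mpg ~ am, data = mtcars)
print(welch_test)

# Linear model: mpg predicted from weight and transmission type
fit <- lm(mpg ~ wt + am, data = mtcars)
print(summary(fit))

# Type III tests for the same model (runs only if the car package is installed)
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(fit, type = 3))
}
```

The remaining functions (`corr.test()`, `lmer()`, `sem()`, `fa()`) follow the same fit-then-summarize pattern but need the psych, lme4, and lavaan packages.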
Day 5. Your Data
Before the Workshop
- Browse RMarkdown from RStudio
- Skim Workflow: projects (Wickham & Grolemund)
- Find one or two datasets you know well and are OK with others seeing.
- Preferably, find the raw (hasn't been "cleaned") data.
- Make a new folder. Give it a good name. Repeat with subfolders. (Hint: Skim some data management best practices from the Stanford Library or the Michigan Library guide)
- Put your raw data in there, somewhere.
- Skim The tidyverse style guide by Hadley Wickham
During the Workshop
- R Projects
- R Markdown
- Importing data
- read_csv(), read_spss(), and read_stata()
- Writing code you and others can read
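A minimal import sketch, assuming readr is installed (and, for the SPSS and Stata readers, haven, which is where `read_spss()` and `read_stata()` live). The file is a throwaway written to a temporary path so the example runs anywhere; with your own data you would point at a file inside your project folder instead.

```r
library(readr)

# Write a tiny hypothetical dataset to a temporary file, then read it back
csv_path <- tempfile(fileext = ".csv")
write_csv(data.frame(id = 1:3, score = c(4.5, 3.2, 5.0)), csv_path)

# Declaring column types up front makes the import explicit and repeatable
raw_data <- read_csv(
  csv_path,
  col_types = cols(id = col_integer(), score = col_double())
)
print(raw_data)

# SPSS and Stata files work the same way (runs only if haven is installed)
if (requireNamespace("haven", quietly = TRUE)) {
  sav_path <- tempfile(fileext = ".sav")
  dta_path <- tempfile(fileext = ".dta")
  haven::write_sav(raw_data, sav_path)
  haven::write_dta(raw_data, dta_path)
  spss_data  <- haven::read_spss(sav_path)
  stata_data <- haven::read_stata(dta_path)
}
```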
R Resources
Websites
- Quick-R. A roadmap to the language and the code necessary to get started quickly (i.e., tutorials).
- RStudio Cheat Sheets. Just like it reads: cheat sheets for "favorite" R packages and more (e.g., dplyr, ggplot2, base, R Markdown, regular expressions).
- UCLA Institute for Digital Research and Education: R statistics and programming tutorials for R, among other helpful related resources
- The Personality Project: Using R for psychological research. Seemingly endless tutorials and explainers about R programming for (personality-themed) psychology research; some tutorials also cover the psych package, which was written by Michigan Psychology alumnus William Revelle (1973).
- Richard Gonzalez's Advanced Statistical Methods Course Notes. Nick's regression bible, complete with SPSS and R code for common procedures plus detailed notes.
- Doug Bonett's Quantitative Data Analysis Course R Functions. Includes functions for testing linear contrasts (standardized and unstandardized) that don't assume equal variances.
- tidyverse: ggplot2. The ggplot2 bible (also check out the rest of the tidyverse website).
- lavaan: latent variable analysis. Overview and tutorials for the best SEM package (IMO) in R (disclaimer: no support for discrete latent variables, aka mixture modeling or latent class analysis).
- RExRepos: R code examples for a number of common data analysis tasks. Just like it reads: a how-to guide for common procedures.
- R Base Graphics: An Idiot's Guide. If you want to plot with base graphics like an R hipster (a hipstR, if you will), here's a jumping-off point.
- { swirl }: Learn R, in R. "swirl teaches you R programming and data science interactively, at your own pace, and right in the R console!"
- A language, not a letter: Learning statistics in R. "This online collection of tutorials was created by graduate students in psychology as a resource for other experimental psychologists interested in using R for statistical analyses and graphics. Each chapter was created to provide an overview of how to code a particular topic in the R language."
- STAT 545 @ UBC: Data wrangling, exploration, and analysis with R. "Learn how to explore, groom, visualize, and analyze data and make all of that reproducible, reusable, and shareable using R"
- designingexperiments.com. This site accompanies Designing Experiments and Analyzing Data: A Model Comparison Perspective (3rd edition; Maxwell, Delaney, & Kelley, 2018). It's full of modeling examples for R, and it also includes some extremely useful web applications for power analyses for all sorts of common designs.
Texts
- Beaujean, A. A. (2014). Latent variable modeling using R: A step-by-step guide. New York, NY: Routledge.
- Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. London: SAGE Publications.
- Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. New York, NY: Cambridge University Press.
- Ismay, C. & Kim, A.Y. (2017). ModernDive: An Introduction to Statistical and Data Sciences via R.
- Navarro, D. (2015). Learning Statistics with R. Raleigh, North Carolina: Lulu Press, Inc.
- Maxwell, S. E., Delaney, H. D., & Kelley, K. (2018). Designing experiments and analyzing data: A model comparison perspective (3rd ed.). New York, NY: Routledge.
- Wickham, H. (2015). Advanced R. Boca Raton, FL: CRC Press.
- Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. New York, NY: Springer.
- Zuur, A. F., Ieno, E. N., Walker, N. J., Saveliev, A. A., & Smith, G. M. (2009). Mixed effects models and extensions in ecology with R. New York, NY: Springer.
Acknowledgements
- Iris and I couldn't have done this alone. We thank all of these thoughtful and helpful people:
Josh Wondra (he started this workshop in the Psychology Department last summer and helped us take it over this summer); Brian Wallace and everyone at Psychology Student Academic Affairs (they approved us!); Rich Gonzalez (especially his Psychology 613/614 course); Adrienne Beltz and Pam Davis-Kean and everyone who's a part of the Psychology Methods Hour; Instructional Support Services and Blue Corps at the University of Michigan; and, of course, the R community, especially Hadley Wickham and Garrett Grolemund (they wrote the book!).