Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
1113 lines (783 sloc) 28.6 KB
<!DOCTYPE html>
<html>
<head>
<title>Won’t you be my neighboR?</title>
<meta charset="utf-8">
<meta name="author" content="Joshua Rosenberg and Emily Bovee" />
<meta name="date" content="2018-05-16" />
<link href="libs/remark-css/default.css" rel="stylesheet" />
<link href="libs/remark-css/metropolis.css" rel="stylesheet" />
<link href="libs/remark-css/metropolis-fonts.css" rel="stylesheet" />
<link rel="stylesheet" href="custom.css" type="text/css" />
</head>
<body>
<textarea id="source">
class: center, middle, inverse, title-slide
# Won’t you be my neighboR?
## An introduction to R for data science in education
### Joshua Rosenberg and Emily Bovee
### 2018-05-16
---
# #whoami1
.pull-left[
* Dr. Joshua Rosenberg
* Assistant Professor, STEM Education, University of Tennessee, Knoxville
* Dad (one-year-old toddler!)
* Primary areas of interest
* Data science in education
* Data science education
]
&lt;img src="img/kr-joro.jpg" width="350px" style="display: block; margin: auto;" /&gt;
---
# #whoami2
.pull-left[
* Dr. Emily Bovee
* Director of Educational Development and Assessment: Marquette University School of Dentistry
* Cat mom
* Primary areas of interest
* Understanding student motivation in higher education and professional education
* Data science education
]
&lt;img src="img/ebv.jpg" width="350px" style="display: block; margin: auto;" /&gt;
---
class: inverse, center, middle
# Agenda
---
# Agenda
1. Why learn R?
1. Getting started with R Studio
1. Computer science foundations
1. Processing data
1. Loading data
1. Visualizing data
1. Modeling data
1. Learning and doing more with R
1. Questions
---
# Why learn R?
* It is capable of carrying out basic and complex statistical analyses
* It is able to work with data small (*n* = 30!) and large (*n* = 100,000+) efficiently
* It is a programming language and so is quite flexible
* There is a great, inclusive community of users and developers (and teachers)
* It is cross-platform, open-source, and freely-available
---
class: inverse, center, middle
# Getting started with R Studio
---
# Installations
To download R:
- Visit this page to download R: https://cran.r-project.org/
- Find your operating system (Mac, Windows, or Linux)
- Download the 'latest release' on the page for your operating system and download and install the application
To download R Studio:
- Visit this page to download R studio: https://www.rstudio.com/products/rstudio/download/
- Find your operating system (Mac, Windows, or Linux)
- Download the 'latest release' on the page for your operating system and download and install the application
## Check that it worked
Open R Studio. Find the console window and type in `2 + 2`. If what you can guess is returned (hint: it's what you expect!), then R Studio *and* R both work.
---
# Creating Projects
Before proceeding, we're going to take a few steps to set ourselves to make the analysis easier; namely, through the use of Projects, an R Studio-specific organizational tool.
To create a project, in R Studio, navigate to "File" and then "New Directory". Then, click "New Project".
Names should be short and easy-to-remember but also informative.
---
# Packages
"Packages" are shareable collections of R code that provide functions (i.e., a command to perform a specific task), data, and documentation. Packages increase the functionality of R by improving and expanding on base R (basic R functions).
### Installing and Loading Packages
To download a package, you must call `install.packages()`:
```r
install.packages("dplyr")
```
You can also navigate to the Packages pane, and then click "Install", which will work the same as the line of code above. This is a way to install a package using code or part of the R Studio interface.
Usually, writing code is a bit quicker, but using the interface can be very useful and complimentary to use of code.
---
# Using packages
*After* the package is installed, it must be loaded into your R Studio session using `library()`:
```r
library(dplyr)
```
We only have to install a package once, but to use it, we have to load it each time we start a new R session.
&gt; A package is a like a book, a library is like a library; you use library() to check a package out of the library
&gt; - Hadley Wickham, Chief Scientist, R Studio
---
# Vignettes
Vignettes are long-form documentation (and tutorials) that can provide a helpful introduction to a package.
Run `vignette()` in order to view *all* of the vignettes available for a package:
```r
vignette(package = "dplyr")
```
Then, you can load a specific vignette.
```r
vignette("dplyr", package = "dplyr")
```
These are also available through CRAN (i.e., https://cran.r-project.org/web/packages/dplyr/index.html)
---
# Using a specific function
If you know the specific function that you want to look up, you can run this in the R Studio console:
```r
?dplyr::filter
```
Once you know what you want to do with the function, you can run it in your code:
```r
dat &lt;- # example data frame
tibble(letter = c("A", "A", "A", "B", "B"),
number = c(1L, 2L, 3L, 4L, 5L))
filter(dat, letter == "A")
```
```
## # A tibble: 3 x 2
## letter number
## &lt;chr&gt; &lt;int&gt;
## 1 A 1
## 2 A 2
## 3 A 3
```
---
# Configuring R Studio
By default, R Studio saves all of the objects in your environment. In general, this is not ideal, because it means that you may have taken steps interactively that are not documented your code.
&lt;img src="img/save-workspace-reminder.jpg" width="425px" style="display: block; margin: auto;" /&gt;
---
# The tidyverse
The tidyverse is a set of packages for data manipulation, exploration, and visualization using the design philosophy of 'tidy' data. Tidy data has a specific structure: each variable is a column, each observation is a row, and each type of observational unit is a table.
The packages contained in the Tidyverse provide useful functions that augment base R functionality.
You can installing and load the complete tidyverse with:
```r
install.packages("tidyverse")
```
```r
library(tidyverse)
```
---
class: inverse, center, middle
# Computer science foundations
---
# Data structures
- Data Frame
```r
data.frame(var1 = c("a", "b", "c"),
var2 = c(4, 2, 5))
```
```
## var1 var2
## 1 a 4
## 2 b 2
## 3 c 5
```
- Tibble
```r
library(tidyverse)
tibble(var1 = c("a", "b", "c"),
var2 = c(4, 2, 5))
```
```
## # A tibble: 3 x 2
## var1 var2
## &lt;chr&gt; &lt;dbl&gt;
## 1 a 4
## 2 b 2
## 3 c 5
```
---
# Data structures
```r
vec1 = c("a", "b", "c")
vec1
```
```
## [1] "a" "b" "c"
```
```r
vec2 = c(4, 2, 5)
vec2
```
```
## [1] 4 2 5
```
```r
tibble(var1 = vec1,
var2 = vec2)
```
```
## # A tibble: 3 x 2
## var1 var2
## &lt;chr&gt; &lt;dbl&gt;
## 1 a 4
## 2 b 2
## 3 c 5
```
---
# Conditional logic
- `ifelse()`
```r
vec1 &lt;- runif(10) # generates random number between 0 and 1
d1 &lt;- tibble(var1 = vec1)
d1 %&gt;%
mutate(var2 = ifelse(var1 &gt;= .5, "greater", "lesser"))
```
```
## # A tibble: 10 x 2
## var1 var2
## &lt;dbl&gt; &lt;chr&gt;
## 1 0.822 greater
## 2 0.201 lesser
## 3 0.951 greater
## 4 0.596 greater
## 5 0.925 greater
## 6 0.666 greater
## 7 0.734 greater
## 8 0.484 lesser
## 9 0.209 lesser
## 10 0.494 lesser
```
---
# Conditional logic
- `dplyr::case_when()`
```r
d1 %&gt;%
mutate(var2 = case_when(
var1 &gt;= .75 ~ "a lot",
var1 &gt;= .5 &amp; var1 &lt; .75 ~ "greater",
var1 &gt;= .25 &amp; var1 &lt; .5 ~ "lesser",
var1 &lt; .25 ~ "very little"
))
```
```
## # A tibble: 10 x 2
## var1 var2
## &lt;dbl&gt; &lt;chr&gt;
## 1 0.822 a lot
## 2 0.201 very little
## 3 0.951 a lot
## 4 0.596 greater
## 5 0.925 a lot
## 6 0.666 greater
## 7 0.734 greater
## 8 0.484 lesser
## 9 0.209 very little
## 10 0.494 lesser
```
---
# Objects, functions, directories, and data structures
Here's a short URL: `https://goo.gl/bUeMhV`
This is what it resolves to (it's a CSV file): `https://raw.githubusercontent.com/data-edu/data-science-in-education/master/data/pisaUSA15/stu-quest.csv`
This chunk of code will save that file to a data directory in our working environment.
```r
student_responses_url &lt;-
"https://goo.gl/bUeMhV"
student_responses_file_name &lt;-
paste0(getwd(), "/data/student-responses-data.csv")
download.file(
url = student_responses_url,
destfile = student_responses_file_name)
```
---
# Step 1
1. The *character string* `"https://goo.gl/wPmujv"` is being saved to an *object* called `student_responses_url`.
```r
student_responses_url &lt;-
"https://goo.gl/bUeMhV"
```
---
# Step 2
2. We concatenate the working directory file path to the desired file name for the CSV using a *function* called `str_c`. This is stored in another *object* called `student_reponses_file_name`. This creates a file name with a *file path* in your working directory and it saves the file in the folder that you are working in.
```r
student_responses_file_name &lt;-
str_c(getwd(), "./data/student-responses-data.csv")
```
---
# Step 3
3. The `student_responses_url` *object* is passed to the `url` argument of the *function* called `download.file()` along with `student_responses_file_name`, which is passed to the `destfile` argument.
In short, the `download.file()` function needs to know
- where the file is coming from (which you tell it through the `url`) argument and
- where the file will be saved (which you tell it through the `destfile` argument).
```r
download.file(
url = student_responses_url,
destfile = student_responses_file_name)
```
---
class: inverse, center, middle
# Processing Data
---
# The pipe: %&gt;%
First, let's load the data we downloaded in the last step.
```r
student_responses &lt;- read_csv("data/student-responses-data.csv")
```
We're also going to introduce a powerful, unusual *operator* in R, the pipe. The pipe is this symbol: `%&gt;%`. It lets you *compose* functions. It does this by passing the output of one function to the next.
Select reduces the number of *columns* of a dataset.
```r
student_mot_vars &lt;- # save object student_mot_vars by...
student_responses %&gt;% # using dataframe student_responses
select(SCIEEFF, JOYSCIE, INTBRSCI, EPIST, INSTSCIE) # and selecting only these five variables
student_mot_vars
```
Also, check out the helper functions: `contains()`, `starts_with()`, and `ends_with()`, i.e.:
```r
student_responses %&gt;%
select(contains("sci"))
```
---
# Saving the results
Note that we saved the output from the `select()` function to `student_mot_vars` but we could also save it back to `student_responses`, which would simply overwrite the original data frame (the following code is not run here):
```r
student_responses &lt;- # save object student_responses by...
student_responses %&gt;% # using dataframe student_responses
select(student_responses, SCIEEFF, JOYSCIE, INTBRSCI, EPIST, INSTSCIE) # and selecting only these five variables
```
---
# Renaming
We can also rename the variables at the same time we select them. Let's save the results to `student_mot_vars`.
```r
student_mot_vars &lt;- student_responses %&gt;%
select(student_efficacy = SCIEEFF,
student_joy = JOYSCIE,
student_broad_interest = INTBRSCI,
student_epistemic_beliefs = EPIST,
student_instrumental_motivation = INSTSCIE,
)
student_mot_vars %&gt;%
head(3)
```
```
## # A tibble: 3 x 5
## student_efficacy student_joy student_broad_i… student_epistem…
## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 3.28 2.16 1.73 2.16
## 2 -0.668 0.509 0.491 -0.501
## 3 -1.62 0.0992 -0.638 -0.193
## # … with 1 more variable: student_instrumental_motivation &lt;dbl&gt;
```
---
# Creating a new variable
`mutate()` takes instructions to create a new variable.
What goes *before* the equals sign is the name of the new variable.
What goes after are the instructions.
```r
student_responses %&gt;%
select(student_efficacy = SCIEEFF,
student_joy = JOYSCIE,
student_broad_interest = INTBRSCI,
student_epistemic_beliefs = EPIST,
student_instrumental_motivation = INSTSCIE) %&gt;%
mutate(student_joy_interest = (student_joy + student_broad_interest) / 2) %&gt;%
head(3)
```
```
## # A tibble: 3 x 6
## student_efficacy student_joy student_broad_i… student_epistem…
## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 3.28 2.16 1.73 2.16
## 2 -0.668 0.509 0.491 -0.501
## 3 -1.62 0.0992 -0.638 -0.193
## # … with 2 more variables: student_instrumental_motivation &lt;dbl&gt;,
## # student_joy_interest &lt;dbl&gt;
```
---
# Filtering the data set
Filter takes *logical statements* (statements that can evalute to true or false) to select a number of *rows* from a dataset.
```r
student_responses %&gt;%
select(student_efficacy = SCIEEFF,
student_joy = JOYSCIE,
student_broad_interest = INTBRSCI,
student_epistemic_beliefs = EPIST,
student_instrumental_motivation = INSTSCIE) %&gt;%
mutate(student_joy_interest = (student_joy + student_broad_interest) / 2) %&gt;%
filter(student_joy_interest &gt; mean(student_joy_interest, na.rm = TRUE)) %&gt;%
head(3)
```
```
## # A tibble: 3 x 6
## student_efficacy student_joy student_broad_i… student_epistem…
## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 3.28 2.16 1.73 2.16
## 2 -0.668 0.509 0.491 -0.501
## 3 -0.0378 1.67 0.332 2.16
## # … with 2 more variables: student_instrumental_motivation &lt;dbl&gt;,
## # student_joy_interest &lt;dbl&gt;
```
---
# Grouping and summarizing
It is often useful to aggregate a data set.
The `group_by()` and `summarize()` functions are especially helpful for this purpose.
Let's find the mean and standard deviation (and counts) of student broad interest by teacher.
```r
student_responses %&gt;%
select(student_broad_interest = INTBRSCI, teacher_id = SCHID) %&gt;%
group_by(teacher_id) %&gt;%
summarize(student_broad_interest_mean = mean(student_broad_interest, na.rm = TRUE),
student_broad_interest_sd = sd(student_broad_interest, na.rm = TRUE),
n = n()) %&gt;%
head(3)
```
```
## # A tibble: 3 x 4
## teacher_id student_broad_interest_mean student_broad_interest_sd n
## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;int&gt;
## 1 00001 -0.157 0.846 36
## 2 00003 -0.136 1.08 31
## 3 00004 -0.0430 1.26 39
```
---
# Arranging
Finally, let's arrange the table by the number of students in each class.
`arrange()` sorts in order from the smallest to largest values.
`desc()` tells `arrange()` to sort in descending order.
```r
student_responses %&gt;%
select(student_broad_interest = INTBRSCI, teacher_id = SCHID) %&gt;%
group_by(teacher_id) %&gt;%
summarize(student_broad_interest_mean = mean(student_broad_interest, na.rm = TRUE),
student_broad_interest_sd = sd(student_broad_interest, na.rm = TRUE),
n = n()) %&gt;%
arrange(desc(n)) %&gt;%
head(3)
```
```
## # A tibble: 3 x 4
## teacher_id student_broad_interest_mean student_broad_interest_sd n
## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;int&gt;
## 1 00107 0.204 0.786 42
## 2 00117 0.204 0.893 42
## 3 00131 0.0349 0.870 41
```
---
# Counting
Sometimes, we simply want to count the number of observations associated with each group.
```r
student_responses %&gt;%
select(teacher_id = SCHID) %&gt;%
count(teacher_id)
```
We could then use this as the basis of *other* summary statistics.
```r
student_responses %&gt;%
select(teacher_id = SCHID) %&gt;%
count(teacher_id) %&gt;%
summarize(n_mean = mean(n),
n_sd = sd(n))
```
```
## # A tibble: 1 x 2
## n_mean n_sd
## &lt;dbl&gt; &lt;dbl&gt;
## 1 32.3 8.61
```
On average, there are around 32 (*SD* = 8.61) students in each teachers' class.
---
class: inverse, center, middle
# Demo doc (part 1): Try it out!
*Note. Check out [R Studio Cloud](https://rstudio.cloud/) if you're having any installation trouble!*
*The doc is [here](https://github.com/jrosen48/MSU-workshop-2019/blob/master/demo-doc.Rmd) in the repository*
---
class: inverse, center, middle
# Loading data
---
# Loading a CSV File
Okay, we're ready to go. The easiest way to read a CSV file is with the function `read_csv()` from the package `readr`, which is contained within the Tidyverse.
Again, we'll load (or have loaded, already) the tidyverse library.
```r
library(tidyverse) # so tidyverse packages can be used for analysis
```
Just as a recap of a line that we ran earlier.
```r
student_responses &lt;- read_csv("data/student-responses-data.csv")
```
Since we loaded the data, we now want to look at it. We can type its name in the function `glimpse()` to print some information on the dataset (this code is not run here).
```r
glimpse(student_responses)
```
You can also print the name of the object.
---
# Loading Excel and SAV files
## Loading Excel files
```r
install.packages("readxl")
```
```r
library(readxl)
```
```r
my_data &lt;-
read_excel("path/to/file.xlsx")
```
## Loading SAV files
```r
install.packages("haven")
```
```r
library(haven)
my_data &lt;-
read_sav("path/to/file.xlsx")
```
---
# Saving (Writing) Files
Using our data frame `student_responses`, we can save it as a CSV (for example) with the following function. The first argument, `student_reponses`, is the name of the object that you want to save. The second argument, `student-responses.csv`, what you want to call the saved dataset.
```r
write_csv(student_responses, "student-responses.csv")
```
That will save a CSV file entitled `student-responses.csv` in the working directory. If you want to save it to another directory, simply add the file path to the file, i.e. `path/to/student-responses.csv`. To save a file for SPSS, load the haven package and use `write_sav()`. There is not a function to save an Excel file, but you can save as a CSV and directly load it in Excel.
---
class: inverse, center, middle
# Visualizing data
---
# The grammar of graphics
*ggplot2* is a tidyverse package for creating visualizations (or figures).
Let's work with a smaller data set, for now.
```r
student_mot_vars &lt;- # save object student_mot_vars by...
student_responses %&gt;% # using dataframe student_responses
select(student_efficacy = SCIEEFF, # selecting variable SCIEEFF and renaming to student_efficiency
student_joy = JOYSCIE, # selecting variable JOYSCIE and renaming to student_joy
student_broad_interest = INTBRSCI, # selecting variable INTBRSCI and renaming to student_broad_interest
student_epistemic_beliefs = EPIST, # selecting variable EPIST and renaming to student_epistemic_beliefs
student_instrumental_motivation = INSTSCIE, # selecting variable INSTSCIE and renaming to student_instrumental_motivation
teacher_id = SCHID
)
```
---
# Scatter plot
Notice the three parts of all `ggplot2` graphs: a) the data (`student_mot_vars`), b) an `aes()`thetic mapping, and c) a `geom`:
&lt;img src="msu-workshop-2019_files/figure-html/unnamed-chunk-44-1.png" width="450px" style="display: block; margin: auto;" /&gt;
---
# Scatter plot with a regression line
```r
ggplot(student_mot_vars,
aes(x = student_efficacy, y = student_broad_interest)) +
geom_point() +
geom_smooth(method = "lm")
```
&lt;img src="msu-workshop-2019_files/figure-html/unnamed-chunk-45-1.png" width="350px" style="display: block; margin: auto;" /&gt;
---
# Cleaning up
```r
ggplot(student_mot_vars,
aes(x = student_efficacy, y = student_broad_interest)) +
geom_point() +
geom_smooth(method = "lm") +
theme_bw() +
xlab("Student Efficacy") +
ylab("Student Broad Interest")
```
&lt;img src="msu-workshop-2019_files/figure-html/unnamed-chunk-46-1.png" width="350px" style="display: block; margin: auto;" /&gt;
---
class: inverse, center, middle
# Demo doc (part 2)
*The doc is [here](https://github.com/jrosen48/MSU-workshop-2019/blob/master/demo-doc.Rmd) in the repository*
---
class: inverse, center, middle
# Modeling data
---
# Linear models
The `lm()` function is a very helpful general purpose function for linear models. We'll use the `student_mot_vars` data frame we created earlier.
```r
lm(student_epistemic_beliefs ~ student_broad_interest, data = student_mot_vars)
```
```
##
## Call:
## lm(formula = student_epistemic_beliefs ~ student_broad_interest,
## data = student_mot_vars)
##
## Coefficients:
## (Intercept) student_broad_interest
## 0.2407 0.2401
```
---
# Linear models
Let's save the results back to an object, `m1`.
```r
m1 &lt;- lm(student_epistemic_beliefs ~ student_broad_interest, data = student_mot_vars)
```
---
# Linear models
We can then run a summary on the output.
```r
summary(m1)
```
```
##
## Call:
## lm(formula = student_epistemic_beliefs ~ student_broad_interest,
## data = student_mot_vars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.6197 -0.5614 -0.1755 0.5917 2.5144
##
## Coefficients:
## Estimate Std. Error t value Pr(&gt;|t|)
## (Intercept) 0.24066 0.01390 17.31 &lt;2e-16 ***
## student_broad_interest 0.24015 0.01423 16.88 &lt;2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.013 on 5323 degrees of freedom
## (387 observations deleted due to missingness)
## Multiple R-squared: 0.0508, Adjusted R-squared: 0.05062
## F-statistic: 284.9 on 1 and 5323 DF, p-value: &lt; 2.2e-16
```
---
# Linear models
You can then build up a more complex model, i.e.:
```r
m2 &lt;- lm(student_epistemic_beliefs ~ student_broad_interest + student_joy + student_broad_interest:student_joy, data = student_mot_vars)
```
You can then run `summary()` to view the results.
---
# Multi-level models
```r
library(lme4)
```
```
## Loading required package: Matrix
```
```
##
## Attaching package: 'Matrix'
```
```
## The following object is masked from 'package:tidyr':
##
## expand
```
```r
m2_mlm &lt;- lmer(student_epistemic_beliefs ~ student_broad_interest + student_joy + student_broad_interest:student_joy + (1|teacher_id), data = student_mot_vars)
m2_mlm
```
```
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## student_epistemic_beliefs ~ student_broad_interest + student_joy +
## student_broad_interest:student_joy + (1 | teacher_id)
## Data: student_mot_vars
## REML criterion at convergence: 14853.02
## Random effects:
## Groups Name Std.Dev.
## teacher_id (Intercept) 0.1839
## Residual 0.9648
## Number of obs: 5315, groups: teacher_id, 176
## Fixed Effects:
## (Intercept) student_broad_interest
## 0.16608 0.07491
## student_joy student_broad_interest:student_joy
## 0.27409 0.01991
```
---
class: inverse, center, middle
# Demo doc (part 3)
*The doc is [here](https://github.com/jrosen48/MSU-workshop-2019/blob/master/demo-doc.Rmd) in the repository*
---
class: inverse, center, middle
# Learning and doing more with R
---
# Resources
- [Doing data science with R](https://r4ds.had.co.nz/) by Wickham and
Grolemund (2017)
- [Big magic with R: Creating learning beyond
fear](https://speakerdeck.com/apreshill/big-magic-with-r-creative-learning-beyond-fear)
by Hill (2017)
- [Data science in education](https://github.com/data-edu/data-science-in-education)
- [R Studio Community](https://community.rstudio.com/)
---
# Courses
- [\#r4ds](https://medium.com/@kierisi/r4ds-the-next-iteration-d51e0a1b0b82)
(see a talk at rstudio::conf()
[here](https://resources.rstudio.com/rstudio-conf-2019/r4ds-online-learning-community-improvements-to-self-taught-data-science-and-the-critical-need-for-diversity-equity-and-inclusion-in-data-science-education)
by Maegan (2019))
- [Data science for social scientists](http://datascience.tntlab.org/) by
Landers (2019)
- [University of Oregon Data Science Specialization for the College of
Education](https://github.com/uo-datasci-specialization) by Anderson (2019)
---
# Useful packages
With one exception, can install via `install.packages(&lt;name-of-package)`, i.e., `install.packages("tidylog").
- tidyverse: tools for data manipulation and processing
- tidylog: companion to the tidyverse; gives you feedback
- haven: import and export files for other statistical software
- apaTables: create APA-style tables in MS Word format
- devtools: to download packages only available on GitHub
- psych: general psychological science-related analysis tools
- skimr: quickly calculate summary statistics for a data frame
- lavaan: structural equation modeling
- tipyLPA: latent profile analysis
- lme4: multi-level modeling (HLM)
- sjstats: convenient statistical functions
- poLCA: latent class analysis
- rmarkdown: reports (PDF, Word, HTML, etc.)
- papaja: APA-style paper generation (must install from [GitHub](https://github.com/crsh/papaja))
- blogdown create a website/blog
- xaringan: create presentations
- MplusAutomation: Run Mplus via R
---
# People (and a few hashtags) to follow (on Twitter)
- [Jesse Mostipak](https://twitter.com/kierisi)
- [Alison Hill](https://twitter.com/apreshill)
- [Hadley Wickham](https://twitter.com/hadleywickham)
- [Daniel Anderson](https://twitter.com/datalorax_)
- [Mara Averick](https://twitter.com/dataandme)
- [Thomas Mock](https://twitter.com/thomas_mock)
- [Angela Li](https://twitter.com/CivicAngela)
- [R4DS community](https://twitter.com/R4DScommunity)
- [#tidytuesday](https://twitter.com/hashtag/TidyTuesday?src=hash)
- [#rstats](https://twitter.com/hashtag/rstats?src=hash)
- [David Robinson](https://twitter.com/drob)
- [Julia Silge](https://twitter.com/juliasilge)
- [Gabriela de Quieroz](https://twitter.com/gdequeiroz)
- [Joshua Rosenberg](https://twitter.com/jrosenberg6432) and [Emily Bovee](https://twitter.com/ebovee09) (of course! :)
---
class: inverse, center, middle
# Questions?
The repository for this workshop is [here](https://github.com/jrosen48/MSU-workshop-2019).
---
# Contact
- [Joshua Rosenberg](mailto:jmrosenberg@utk.edu)
- [Emily Bovee](mailto:emily.a.bovee@gmail.com)
</textarea>
<script src="https://remarkjs.com/downloads/remark-latest.min.js"></script>
<script>var slideshow = remark.create({
"highlightStyle": "github",
"highlightLines": true,
"countIncrementalSlides": false
});
if (window.HTMLWidgets) slideshow.on('afterShowSlide', function (slide) {
window.dispatchEvent(new Event('resize'));
});
(function() {
var d = document, s = d.createElement("style"), r = d.querySelector(".remark-slide-scaler");
if (!r) return;
s.type = "text/css"; s.innerHTML = "@page {size: " + r.style.width + " " + r.style.height +"; }";
d.head.appendChild(s);
})();</script>
<script>
(function() {
var i, text, code, codes = document.getElementsByTagName('code');
for (i = 0; i < codes.length;) {
code = codes[i];
if (code.parentNode.tagName !== 'PRE' && code.childElementCount === 0) {
text = code.textContent;
if (/^\\\((.|\s)+\\\)$/.test(text) || /^\\\[(.|\s)+\\\]$/.test(text) ||
/^\$\$(.|\s)+\$\$$/.test(text) ||
/^\\begin\{([^}]+)\}(.|\s)+\\end\{[^}]+\}$/.test(text)) {
code.outerHTML = code.innerHTML; // remove <code></code>
continue;
}
}
i++;
}
})();
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement('script');
script.type = 'text/javascript';
script.src = 'https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-MML-AM_CHTML';
if (location.protocol !== 'file:' && /^https?:/.test(script.src))
script.src = script.src.replace(/^https?:/, '');
document.getElementsByTagName('head')[0].appendChild(script);
})();
</script>
</body>
</html>
You can’t perform that action at this time.