Calculate Row Values
In education, we often create sum scores or mean scores for different measures. When your data is structured in a wide format (one row per participant), you will want those scores to be calculated row-wise for every individual in your data.
stu_id | item1 | item2 | item3 | sum_score |
---|---|---|---|---|
234 | 4 | 3 | 2 | 9 |
255 | 3 | 5 | 2 | 10 |
276 | 1 | 2 | 4 | 7 |
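As a minimal sketch of how the sum_score column above could be produced, the following uses base R only. The data frame name `scores`, the helper vector `item_cols`, and the extra `mean_score` column are assumptions added for illustration.

```r
# Example data copied from the table above
scores <- data.frame(
  stu_id = c(234, 255, 276),
  item1 = c(4, 3, 1),
  item2 = c(3, 5, 2),
  item3 = c(2, 2, 4)
)

# Hypothetical helper: the columns that make up the measure
item_cols <- c("item1", "item2", "item3")

# Row-wise sum and mean across the item columns only (stu_id is excluded)
scores$sum_score <- rowSums(scores[, item_cols])
scores$mean_score <- round(rowMeans(scores[, item_cols]), 2)

scores
```

Selecting the item columns explicitly keeps the identifier column out of the calculation, which is an easy mistake to make when summing across a whole data frame.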
You may also want to calculate a row-wise total for something like total number of students in a school when your data is disaggregated by a category such as grade level per school.
sch_id | grade_6 | grade_7 | grade_8 | total |
---|---|---|---|---|
4578 | 120 | 113 | 142 | 375 |
5900 | 55 | 48 | 61 | 164 |
5787 | 180 | 175 | 154 | 509 |
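The school totals above could be computed along these lines. This is a base R sketch, assuming the grade columns share the prefix `grade_`; the names `schools` and `grade_cols` are illustrative.

```r
# Example data copied from the table above
schools <- data.frame(
  sch_id = c(4578, 5900, 5787),
  grade_6 = c(120, 55, 180),
  grade_7 = c(113, 48, 175),
  grade_8 = c(142, 61, 154)
)

# Pick up every column whose name starts with "grade_"
grade_cols <- grep("^grade_", names(schools), value = TRUE)

# Row-wise total enrollment per school
schools$total <- rowSums(schools[, grade_cols])

schools
```

Selecting by prefix means the total keeps working if another grade column is added later, provided it follows the same naming pattern.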
There are several different ways to achieve this.
In the tidyverse, the two most common methods of calculating row sums, means, etc. are to use base::rowSums() or base::rowMeans(), or to use dplyr::rowwise().
There has been some debate around which approach is more efficient. Some people have said that rowwise() is less efficient and will slow you down because it is essentially a dplyr::group_by() for every row in your data. However, others have said that rowwise() has been reinvigorated and is now the preferred method. Both work well, so I say take your pick.
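To make the two approaches concrete, here is a small sketch that computes the same sum score both ways with dplyr. The data and object names (`scores`, `with_rowsums`, `with_rowwise`) are illustrative, and the example makes no claim about which option is faster.

```r
library(dplyr)

scores <- tibble(
  stu_id = c(234, 255, 276),
  item1 = c(4, 3, 1),
  item2 = c(3, 5, 2),
  item3 = c(2, 2, 4)
)

# Option 1: rowSums() inside mutate(), with across() selecting the item columns
with_rowsums <- scores %>%
  mutate(sum_score = rowSums(across(starts_with("item"))))

# Option 2: rowwise() treats each row as its own group, so sum() runs once per row
with_rowwise <- scores %>%
  rowwise() %>%
  mutate(sum_score = sum(c_across(starts_with("item")))) %>%
  ungroup() # drop the row-wise grouping so later steps behave normally

# Both options should give the same scores
all(with_rowsums$sum_score == with_rowwise$sum_score)
```

Because rowSums() is vectorized over the whole data frame while rowwise() evaluates the expression once per row, any speed difference would be expected to show up mainly on large data. Timing both on your own data is the safest way to decide.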
I also want to note that, as someone who works with a lot of .sav files and with people who use SPSS, I have included an example of how to calculate row values when your variables contain labelled NA values (as often used in SPSS). R does not treat a labelled NA (such as -999 set as a user-defined missing value) as an NA value in rowwise calculations. Therefore, in Calculate row sums or means, I have included an example of how you may want to work with these types of variables.
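The problem can be sketched in base R as follows. This is not the labelled-based approach used in the linked example; it assumes the user-defined missing code is -999 and that it has been read in as an ordinary number. The data and the names `survey` and `items_clean` are hypothetical.

```r
# Hypothetical survey data where -999 is a user-defined missing code
survey <- data.frame(
  stu_id = c(234, 255, 276),
  item1 = c(4, -999, 1),
  item2 = c(3, 5, 2),
  item3 = c(2, 2, -999)
)

item_cols <- c("item1", "item2", "item3")

# Naive sum: -999 is treated as a real value and distorts the score
survey$naive_sum <- rowSums(survey[, item_cols])

# Recode the missing code to NA, then sum the remaining items
items_clean <- survey[, item_cols]
items_clean[items_clean == -999] <- NA
survey$sum_score <- rowSums(items_clean, na.rm = TRUE)

survey
```

Note that na.rm = TRUE sums only the items that were answered, so rows with missing items get a lower total. Whether that is appropriate, or whether such rows should receive NA instead, depends on the scoring rules for the measure.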
- Calculate row sums or means
- Sum row occurrences of strings
- Calculate row sums of NA values
- Calculate percent agreement per pair
Main functions used in examples
Package | Functions |
---|---|
base | rowSums(); rowMeans() |
dplyr | rowwise() |
janitor | adorn_totals() |
Other functions used in examples
Package | Functions |
---|---|
dplyr | mutate(); across(); c_across(); select(); case_when(); summarise(); group_by() |
tidyselect | starts_with(); contains() |
base | sum(); mean(); round(); ifelse(); diff() |
labelled | na_values() |
stringr | str_detect() |
Resources