# Differences in Generational Perceptions of Organizational Justice: A Scale Analysis Project

This notebook outlines the code used to analyze a variety of organizational psychology scales. The code can be copied and pasted and used at one's discretion; there is also detailed commenting used throughout to help enhance readability and interpretability. Jump right in when ready!

### Introduction

R, like many programming languages, has a copious selection of packages from which to choose. Packages are essentially bundles of pre-written code that accomplish a task. For instance, the ```readr``` package is a collection of functions for importing delimited text files (e.g., .csv, .tsv), including compressed ones (e.g., .zip). We will begin by loading some useful packages. No worries: one can also load packages as needed instead of all at once. Some packages share function names with other packages, and R will notify you of this by printing a message listing what is being masked.

The first line in the code block begins with a `#` symbol, signaling to R that the line is a comment and should be ignored. To uncomment the line and run the code, simply erase the symbol. **NOTE**: ```install.packages(...)``` only needs to be run once on your local machine because the package is saved to disk.
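As a minimal sketch of the "install once" idea, the check below lists which of the packages loaded in this notebook are not yet installed. The helper name `missing_pkgs` is hypothetical (it is not part of any package) and only base R functions are used.

```r
# Packages this notebook loads (taken from the library() calls below)
pkgs <- c("readr", "dplyr", "stringr", "corrplot", "ggplot2", "psych")

# Hypothetical helper: return the packages that are not yet installed
missing_pkgs <- function(pkgs) {
  installed <- rownames(installed.packages())
  pkgs[!pkgs %in% installed]
}

to_install <- missing_pkgs(pkgs)
to_install

# Uncomment to install only what is missing
# if (length(to_install) > 0) install.packages(to_install)
```

If `to_install` is empty, every package is already on the machine and the install step can be skipped.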

In [1]:
#install.packages(c("dplyr", "readr", "stringr", "ggplot2", "corrplot", "psych"))
library(readr)
library(dplyr)
library(stringr)
library(corrplot)
library(ggplot2)
library(psych)


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


corrplot 0.84 loaded


Attaching package: ‘psych’


The following objects are masked from ‘package:ggplot2’:

    %+%, alpha




Next, set the working directory. The working directory is the main folder that holds the files used by our script. In this case, that includes our R file as well as the data set from Qualtrics.

For this example, make sure that the data file (rawdata.csv) **and** the R file are saved in the same folder.

Set the working directory using the keyboard shortcut (in RStudio):
- Mac: Ctrl + Shift + H
- Windows: Ctrl + Shift + H
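The working directory can also be inspected and set from code with base R's `getwd()` and `setwd()`. The sketch below uses `tempdir()` as a stand-in for your own project folder; `project_dir` is a hypothetical name.

```r
# Remember the current working directory so it can be restored
old_wd <- getwd()

# Hypothetical project folder; tempdir() stands in for your own path
project_dir <- tempdir()
setwd(project_dir)

# Files in the working directory can now be referenced by name alone,
# e.g. "rawdata.csv" instead of a full path
wd_now <- getwd()
list.files()

# Restore the original working directory
setwd(old_wd)
```

Setting the directory in code keeps the script reproducible, since the path is recorded alongside the analysis.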

### Data Import & Wrangling

Now that the R system is mostly set up, let's move to importing the data set. R is an object-oriented statistical programming language. The key phrase here is *object-oriented*: we can assign a name to something in R and later manipulate, transform, or slice it, among a number of other things.

This is one of the main benefits of R, as it grants the software extreme levels of flexibility, especially compared to programs such as Excel or SPSS. One can do both data wrangling and statistical analyses from the same platform.

In [2]:
raw = read_csv(file = "rawdata.csv", 
               col_names = TRUE)

Parsed with column specification:
cols(
  .default = col_character()
)

See spec(...) for full column specifications.



In [3]:
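# Apply a different renaming function to each set of columns:
# .vars and .funs are parallel lists, paired element by element
rename_at2 = function(data, .vars, .funs) {
    stopifnot(length(.vars) == length(.funs))
    
    for (i in seq_along(.vars)) {
        data = rename_at(data, .vars[[i]], .funs[[i]])
    }
    data
}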

In [4]:
dat = raw %>% 
    select(Q1_1:D9_2) %>% 
    slice(-c(1:4)) %>% 
    rename_at2(
        list(vars(starts_with("Q1")), 
             vars(starts_with("Q2")), 
             vars(starts_with("Q3")), 
             vars(starts_with("Q4")), 
             vars(starts_with("Q5")),
             vars(starts_with("D"))),
        list(~ str_replace(., "Q1_", "wd"), 
             ~ str_replace(., "Q2_", "open"),
             ~ str_replace(., "Q3_", "org_eff"), 
             ~ str_replace(., "Q4_", "job_sat"), 
             ~ str_replace(., "Q5-", "cmfq"), 
             ~ str_replace(., "D", "dem"))
        ) %>% 
    rename_at2(
        list(vars(matches("wd4|wd7|wd8")), 
             vars(matches("open7|open9")),
             vars(matches("org_eff1")),
             vars(matches("sat2|sat4|sat6|sat10|sat11|sat12")), 
             vars(matches("cmfq2_2|cmfq2_8|cmfq2_11"))),
        list(~paste0(., "_R"), 
             ~paste0(., "_R"),
             ~paste0(., "_R"),
             ~paste0(., "_R"), 
             ~paste0(., "_R"))
        )
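To make the two renaming passes in the cell above concrete, here is a minimal base R sketch on a toy data frame. The column names are hypothetical, and `sub()` with exact name matching stands in for the `str_replace()` and `matches()` calls used in the notebook.

```r
# Toy column names in the style of a survey export (hypothetical)
toy <- data.frame(Q1_1 = 1, Q1_4 = 2, Q2_7 = 3, D1 = 4)

# Step 1: swap the question prefix for a scale name
names(toy) <- sub("^Q1_", "wd", names(toy))
names(toy) <- sub("^Q2_", "open", names(toy))
names(toy) <- sub("^D", "dem", names(toy))

# Step 2: flag reverse-keyed items with an "_R" suffix
reverse_items <- c("wd4", "open7")
is_reverse <- names(toy) %in% reverse_items
names(toy)[is_reverse] <- paste0(names(toy)[is_reverse], "_R")

names(toy)  # "wd1" "wd4_R" "open7_R" "dem1"
```

The `_R` suffix does not change any values; it only marks which items need reverse scoring later.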

In [5]:
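# Map response labels from the different scales to numeric codes;
# labels that match none of the cases become NA
unfactorise = function(x) {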
     case_when(
          x %in% c("Strongly disagree", 
                   "Disagree strongly", 
                   "Never", 
                   "Disagree very much", 
                   "1\r\nNot much like me") ~ 1, 
          x %in% c("Disagree", 
                   "Disagree a little", 
                   "Rarely", 
                   "Disagree moderately", 
                   "2\r\n") ~ 2,
          x %in% c("Agree", 
                   "Neither agree nor disagree", 
                   "Sometimes", 
                   "Disagree slightly", 
                   "3\r\n") ~ 3,
          x %in% c("Strongly agree", 
                   "Agree a little", 
                   "Frequently", 
                   "Agree slightly", 
                   "4\r\n") ~ 4,
          x %in% c("Agree strongly", 
                   "Agree moderately", 
                   "5\r\nVery much like me") ~ 5,
          x %in% c("Agree very much") ~ 6
          )
    }
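The same label-to-number recoding can be sketched with a named lookup vector in base R. This is an alternative technique, not the `case_when()` approach used above. The labels are taken from `unfactorise`, but grouping them as one five-point scale is an assumption.

```r
# Response labels and their numeric codes (assumed to form one 5-point scale)
agree5 <- c("Disagree strongly" = 1,
            "Disagree a little" = 2,
            "Neither agree nor disagree" = 3,
            "Agree a little" = 4,
            "Agree strongly" = 5)

responses <- c("Agree a little", "Disagree strongly", "Not a label", NA)

# Indexing by name recodes each label; unmatched labels and NA become NA
scores <- unname(agree5[responses])
scores  # 4 1 NA NA
```

Unmatched labels come back as `NA` in both approaches, so a quick count of `NA` values after recoding is a useful check for misspelled labels.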

In [6]:
vars = data.frame(sapply(subset(dat, select = wd1:cmfq2_11_R), unfactorise))
head(vars)

,wd1,wd2,wd3,wd4_R,wd5,wd6,wd7_R,wd8_R,wd9,open1,⋯,cmfq2_2_R,cmfq2_3,cmfq2_4,cmfq2_5,cmfq2_6,cmfq2_7,cmfq2_8_R,cmfq2_9,cmfq2_10,cmfq2_11_R
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,⋯,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
1,3,3,2,1,3,4,3,2,3,4,⋯,4,3,4,5,4,5,3,3,2,3
2,4,2,3,4,1,4,3,1,2,4,⋯,4,4,2,4,4,4,2,4,4,1
3,3,2,2,2,2,3,2,2,2,4,⋯,2,4,4,4,4,4,2,4,4,2
4,4,2,2,3,3,3,2,2,2,4,⋯,4,5,5,4,5,4,3,4,4,3
5,3,3,2,2,2,4,2,4,4,5,⋯,3,5,5,3,4,1,1,5,4,4
6,3,3,2,3,2,4,2,2,2,4,⋯,3,4,4,4,4,4,2,4,4,2


In [7]:
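# Apply a different transformation to each set of columns:
# .vars and .funs are parallel lists, paired element by element
mutate_at2 <- function(data, .vars, .funs) {
    stopifnot(length(.vars) == length(.funs))
    
    for (i in seq_along(.vars)) {
        data <- mutate_at(data, .vars[[i]], .funs[[i]])
    }
    data
}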

In [8]:
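# Reverse-score the "_R" items; each constant is (highest response code + 1)
vars_final = vars %>% 
    mutate_at2(
        list(c("wd4_R", "wd7_R", "wd8_R"), 
             c("open7_R", "open9_R"), 
             c("org_eff1_R"),
             c("job_sat2_R", "job_sat4_R", "job_sat6_R", "job_sat10_R", 
               "job_sat11_R", "job_sat12_R"), 
             c("cmfq2_2_R", "cmfq2_8_R", "cmfq2_11_R")), 
        list(~ 5 - .,   # wd: responses coded 1-4
             ~ 6 - .,   # open: responses coded 1-5
             ~ 5 - .,   # org_eff: responses coded 1-4
             ~ 7 - .,   # job_sat: responses coded 1-6
             ~ 6 - .)   # cmfq: responses coded 1-5
        )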
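The constants 5, 6, and 7 in the reverse-scoring cell follow one rule: for a scale coded 1 to k, the reversed score is (k + 1) minus the original score. A minimal sketch, where `reverse_score` is a hypothetical helper rather than a function from any package:

```r
# Hypothetical helper: reverse-score x on a scale running from 1 to k
reverse_score <- function(x, k) (k + 1) - x

reverse_score(1:4, k = 4)  # 4 3 2 1
reverse_score(1:5, k = 5)  # 5 4 3 2 1
reverse_score(1:6, k = 6)  # 6 5 4 3 2 1
```

Applying the function twice returns the original scores, which is a quick way to confirm that the right constant was used for each scale.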

In [9]:
glimpse(select_at(vars, vars(ends_with("_R"))))
glimpse(select_at(vars_final, vars(ends_with("_R"))))

Rows: 58
Columns: 15
$ wd4_R       <dbl> 1, 4, 2, 3, 2, 3, 2, 1, 3, 1, 2, 4, 3, 2, 2, 2, 3, 2, 1, …
$ wd7_R       <dbl> 3, 3, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 3, 3, 2, 2, 2, 4, 4, …
$ wd8_R       <dbl> 2, 1, 2, 2, 4, 2, 2, 3, 3, 3, 3, 2, 3, 2, 1, 4, 3, 2, 1, …
$ open7_R     <dbl> 4, 4, 3, 2, 3, 3, 1, 5, 3, 2, 2, 4, 4, 4, 4, 1, 4, 1, 5, …
$ open9_R     <dbl> 4, 4, 3, 2, 4, 2, 2, 1, 2, 3, 2, 4, 2, 4, 4, 2, 1, 1, 1, …
$ org_eff1_R  <dbl> 3, 4, 2, 2, 2, 4, 2, 2, 3, 3, 3, 3, 3, 2, 3, 2, 3, 3, 1, …
$ job_sat2_R  <dbl> 3, 2, 2, 4, 1, 2, 2, 3, 2, 3, 2, 4, 6, 2, 1, 2, 2, 2, 1, …
$ job_sat4_R  <dbl> 6, 4, 4, 5, 3, 4, 6, 5, 4, 4, 3, 2, 5, 3, 4, 5, 2, 4, 3, …
$ job_sat6_R  <dbl> 5, 2, 3, 5, 1, 5, 3, 4, 4, 3, 2, 2, 5, 2, 4, 3, 2, 4, 3, …
$ job_sat10_R <dbl> …

In [10]:
varsList = list(
    wd = select(vars_final, starts_with("wd")), 
    open = select(vars_final, starts_with("open")), 
    org_eff = select(vars_final, starts_with("org_eff")), 
    job_sat = select(vars_final, starts_with("job_sat")), 
    cmfq = select(vars_final, starts_with("cmfq")), 
    dem = select(dat, starts_with("dem"))
    )

In [11]:
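# composite score per scale: the row mean of its items
# (element 6, the demographics, is excluded)
varsList$comps <- as.data.frame(
  do.call(cbind, lapply(varsList[-6], 
                        function(x) rowMeans(x, na.rm = TRUE))
          ))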
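A small worked example shows what `rowMeans(x, na.rm = TRUE)` does when a respondent skips items. The toy data frame below is hypothetical.

```r
# Three respondents, three items; NA marks a skipped item
toy_scale <- data.frame(item1 = c(4, 2, NA),
                        item2 = c(3, 2, 5),
                        item3 = c(5, NA, NA))

# Each composite is the mean of the items that respondent answered
composite <- rowMeans(toy_scale, na.rm = TRUE)
composite  # 4 2 5
```

Note that the third respondent's composite rests on a single item. With `na.rm = TRUE`, a composite is returned as long as at least one item was answered, so it can be worth checking how many items each score is based on.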

In [12]:
#compute correlations for each subscale
corrList = lapply(
    list(wd = varsList$wd,
         open = varsList$open, 
         org_eff = varsList$org_eff, 
         job_sat = varsList$job_sat, 
         cmfq = varsList$cmfq, 
         comps = varsList$comps),  
    #run Pearson correlations on each subscale
    function(x) psych::corr.test(x, use = "pairwise", 
                                 method = "pearson")
    )
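The `use = "pairwise"` argument above asks for each correlation to be computed from the rows where both variables are present. As a rough illustration of that idea, the sketch below uses base R's `cor()` with `use = "pairwise.complete.obs"` on hypothetical toy data. It is not a reproduction of `psych::corr.test`, which also reports sample sizes and p-values.

```r
# Toy data with scattered missing values (hypothetical)
toy <- data.frame(a = c(1, 2, 3, 4, NA),
                  b = c(2, 4, 6, 8, 10),
                  c = c(5, NA, 3, 2, 1))

# Each pair of columns uses every row where both values are present
r_pairwise <- cor(toy, use = "pairwise.complete.obs", method = "pearson")
round(r_pairwise, 2)
```

Here the a-b correlation uses rows 1 to 4, while the b-c correlation uses rows 1, 3, 4, and 5, so different cells of the matrix can rest on different respondents.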

In [16]:
corrList$org_eff$r

,org_eff1_R,org_eff2,org_eff3,org_eff4,org_eff5,org_eff6
org_eff1_R,1.0,0.24540971,0.2269334,0.25841727,-0.1082745,-0.03328315
org_eff2,0.24540971,1.0,0.1167933,-0.09707394,-0.1687429,-0.18452051
org_eff3,0.22693338,0.11679332,1.0,0.67501749,0.2033596,0.31154009
org_eff4,0.25841727,-0.09707394,0.6750175,1.0,0.2957266,0.405018
org_eff5,-0.1082745,-0.16874293,0.2033596,0.29572658,1.0,0.27667244
org_eff6,-0.03328315,-0.18452051,0.3115401,0.405018,0.2766724,1.0


In [None]:
corrPlots = lapply(corrList, function(x) corrplot(x$r, 
                                                  method = "color", 
                                                  is.corr = TRUE,
                                                  type = "lower", 
                                                  #addCoef.col = "black",
                                                  number.cex = .4,
                                                  tl.col = "black", 
                                                  tl.cex = .60, 
                                                  tl.srt = 45))

In [None]:
#parallel analysis of the CMFQ items
fa.parallel(varsList$cmfq, 
            main = "Scree Plot of CMFQ Items", 
            ylabel = "Eigenvalues of Factors")
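For intuition about what a parallel analysis compares, here is a minimal base R sketch of the core idea: eigenvalues from the observed correlation matrix are set against the average eigenvalues from random data of the same size. The simulated data, loadings, and variable names are all assumptions for illustration. `psych::fa.parallel` does considerably more (for example, factor-based eigenvalues and resampling), so this is not its implementation.

```r
set.seed(123)

# Simulated data (hypothetical): 200 respondents, 6 items driven by one factor
n <- 200; p <- 6
factor_score <- rnorm(n)
items <- sapply(seq_len(p), function(j) 0.8 * factor_score + 0.6 * rnorm(n))

# Eigenvalues of the observed correlation matrix
obs_eig <- eigen(cor(items), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues from random normal data of the same size, averaged over replications
n_iter <- 100
rand_eig <- replicate(n_iter, {
  noise <- matrix(rnorm(n * p), nrow = n, ncol = p)
  eigen(cor(noise), symmetric = TRUE, only.values = TRUE)$values
})
rand_mean <- rowMeans(rand_eig)

# Retain leading components whose eigenvalue exceeds the random benchmark
n_retain <- sum(cumprod(obs_eig > rand_mean))
n_retain
```

Because the toy items share a single strong factor, only the first observed eigenvalue stands well above the random benchmark. On real scale data the pattern is usually less clear-cut, which is why the scree plot is inspected alongside the count.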