In [None]:
# Install packages (may take a few minutes)

install.packages("psych")
install.packages("GPArotation")
install.packages("fastDummies")
install.packages("factoextra")
install.packages("dplyr")
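
Reinstalling every package on each run is slow. A minimal sketch, assuming you only want to install what is missing; `missing_packages` is a hypothetical helper name, not part of any package:

```r
# Hypothetical helper: return the entries of `pkgs` that are not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

needed <- c("psych", "GPArotation", "fastDummies", "factoextra", "dplyr")
to_install <- missing_packages(needed)
print(to_install)

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```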

In [None]:
# Load libraries

library(factoextra)
library(fastDummies)
library(psych)
library(GPArotation)
library(datasets)
library(readr)
library(data.table)
library(dplyr)

In [None]:
# Read the file (TYPE IN THE FILE PATH)

data <- read.csv('../input/titanic/train.csv')

head(data)

In [None]:
# Drop columns (TYPE IN THE COL NO.)

data <- data[-c(1,4,6,9,11)]

head(data)

summary(data)
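
Column positions change if the file layout changes, so dropping by name can be safer. A small sketch on a toy data frame (the column names here are made up for illustration, not taken from the input file):

```r
# Toy data frame standing in for the real file.
toy <- data.frame(
  id    = 1:4,
  name  = c("a", "b", "c", "d"),
  score = c(2.5, 3.0, 4.1, 1.8),
  group = c("x", "y", "x", "y")
)

# Drop columns by name rather than by position.
drop_cols <- c("id", "name")
toy_kept <- toy[, !(names(toy) %in% drop_cols), drop = FALSE]

print(names(toy_kept))
```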

In [None]:
# Omit rows with any missing values (if needed)

data <- na.omit(data)

head(data)
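
Note that `read.csv` keeps empty text fields as `""` rather than `NA` by default, so `na.omit` will not drop them. A sketch on toy data (values are made up) that recodes empty strings first; passing `na.strings = c("NA", "")` to `read.csv` is an alternative:

```r
toy <- data.frame(
  fare = c(7.25, 71.28, NA, 8.05),
  port = c("S", "", "Q", "S"),
  stringsAsFactors = FALSE
)

# na.omit() alone removes only the row with the NA fare.
print(nrow(na.omit(toy)))

# Recode empty strings as NA in character columns, then drop incomplete rows.
is_chr <- vapply(toy, is.character, logical(1))
toy[is_chr] <- lapply(toy[is_chr], function(col) {
  col[!is.na(col) & col == ""] <- NA
  col
})
toy_clean <- na.omit(toy)
print(nrow(toy_clean))
```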

In [None]:
# Encode categorical variables (if needed)

data <- dummy_cols(data, remove_most_frequent_dummy = TRUE, remove_selected_columns = TRUE)

head(data)
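
Dropping one dummy per variable avoids perfectly collinear columns, which would make the correlation matrix singular. The idea behind `remove_most_frequent_dummy` can be sketched in base R as follows. This illustrates the intent on a made-up vector and is not the fastDummies implementation:

```r
group <- c("a", "b", "a", "c", "a", "b")

# The most frequent level becomes the reference and gets no column.
counts <- table(group)
reference <- names(counts)[which.max(counts)]
other_levels <- setdiff(names(counts), reference)

# One 0/1 indicator column per remaining level.
dummies <- sapply(other_levels, function(lv) as.integer(group == lv))
colnames(dummies) <- paste0("group_", other_levels)
print(dummies)
```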

In [None]:
# Scale the data

data <- scale(data)

head(data)
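
For reference, `scale` subtracts each column's mean and divides by its standard deviation. The sketch below checks that on random toy data and shows one pitfall: a constant column has zero standard deviation and turns into `NaN`, which would break the later steps.

```r
set.seed(1)
m <- cbind(a = rnorm(10, mean = 50, sd = 10), b = runif(10))

# Center each column, then divide by its standard deviation.
manual <- sweep(sweep(m, 2, colMeans(m)), 2, apply(m, 2, sd), "/")
print(all.equal(as.vector(scale(m)), as.vector(manual)))

# A constant column cannot be scaled: 0 / 0 gives NaN.
const <- cbind(m, k = rep(1, 10))
print(any(is.nan(scale(const))))
```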

# Data adequacy tests

In [None]:
# Kaiser-Meyer-Olkin Measure of Sampling Adequacy

KMO(data)

# Bartlett’s Test of Sphericity

cortest.bartlett(data, n=NULL, diag=TRUE)
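
To see what Bartlett's test measures, here is a sketch of the textbook statistic computed by hand on simulated toy data. It tests whether the correlation matrix differs from the identity matrix. This is meant as an illustration of the formula, not a replacement for `cortest.bartlett`:

```r
set.seed(42)
n <- 200
f <- rnorm(n)
# Three variables share a common source; the fourth is pure noise.
toy <- cbind(v1 = f + rnorm(n), v2 = f + rnorm(n), v3 = f + rnorm(n), v4 = rnorm(n))

R <- cor(toy)
p <- ncol(R)

# Chi-square statistic based on the determinant of the correlation matrix.
chisq <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
df <- p * (p - 1) / 2
p_value <- pchisq(chisq, df, lower.tail = FALSE)

print(c(chisq = chisq, df = df, p_value = p_value))
```

A small p-value suggests the variables are correlated enough for factoring to be worthwhile.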

# Parallel Analysis

In [None]:
# Parallel Analysis (TYPE IN: factoring method)

x <- fa.parallel(data, fm="pa", fa="both", n.iter=1)
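
With `n.iter=1` the comparison rests on a single simulated data set; the psych default is 20 iterations. The sketch below shows the core idea of parallel analysis for components in base R on simulated toy data: keep the leading components whose eigenvalues exceed the average eigenvalues of random data of the same size. `fa.parallel` does more than this (for example resampling and factor eigenvalues), so treat it as an illustration only.

```r
set.seed(123)
n <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
# Six variables, three driven by each of two independent sources.
toy <- cbind(f1 + rnorm(n), f1 + rnorm(n), f1 + rnorm(n),
             f2 + rnorm(n), f2 + rnorm(n), f2 + rnorm(n))

observed <- eigen(cor(toy), only.values = TRUE)$values

# Eigenvalues from uncorrelated random data of the same shape.
n_iter <- 50
random <- replicate(n_iter, {
  sim <- matrix(rnorm(n * ncol(toy)), nrow = n)
  eigen(cor(sim), only.values = TRUE)$values
})
threshold <- rowMeans(random)

# Count the leading components that beat the random benchmark.
n_comp <- sum(cumprod(observed > threshold))
print(n_comp)
```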

# PCA and factor analysis

In [None]:
# Principal Components Analysis (TYPE IN: rotation method)

fit <- principal(data, x$ncomp, rotate="varimax") 

print(fit$loadings, cutoff=.3)
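
As a rough picture of what happens here, component loadings can be built from the eigen decomposition of the correlation matrix and then rotated. The sketch uses base R's `varimax` on simulated toy data; `principal` may differ in sign, ordering and other details.

```r
set.seed(7)
n <- 200
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- cbind(a = f1 + rnorm(n, sd = 0.5), b = f1 + rnorm(n, sd = 0.5),
             c = f2 + rnorm(n, sd = 0.5), d = f2 + rnorm(n, sd = 0.5))

e <- eigen(cor(toy))
k <- 2

# Unrotated loadings: eigenvectors scaled by the square root of their eigenvalues.
unrotated <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
dimnames(unrotated) <- list(colnames(toy), c("C1", "C2"))

# Varimax rotation from the stats package.
rotated <- varimax(unrotated)$loadings
print(rotated, cutoff = 0.3)

# An orthogonal rotation leaves each variable's communality unchanged.
print(all.equal(rowSums(unrotated^2), rowSums(unclass(rotated)^2)))
```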

In [None]:
# To plot PCA

factor.plot(fit)

cor.plot(fit)

fa.diagram(fit)

In [None]:
# Factor Analysis (TYPE IN: rotation method, factoring method)

fit <- fa(data, x$nfact, rotate="promax", fm="pa")

print(fit$loadings, cutoff=.3)

In [None]:
# To plot factor analysis

factor.plot(fit)

cor.plot(fit)

fa.diagram(fit)

In [None]:
# Other helpful values

# print(fit)

# print(fit$values) # Eigenvalues

# print(fit$communality) # Sum of squared factor loadings

# print(fit$complexity) # Hofmann’s index of complexity

# print(fit$Structure, cutoff=.3) # Structure matrix

# print(fit$Phi) # Interfactor correlation matrix

# print(fit$residual) # Residual matrix

# print(fit$scores) # Factor scores

# print(fit$weights) # Beta weights

# print(fit$rot.mat) # Rotation matrix
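
For an oblique solution, the structure matrix is the pattern (loading) matrix multiplied by the interfactor correlation matrix, and the communality is no longer a plain sum of squared loadings. A sketch with made-up numbers:

```r
# Made-up pattern loadings for four items on two factors.
pattern <- matrix(c(0.8, 0.7, 0.1, 0.0,
                    0.0, 0.1, 0.7, 0.8),
                  ncol = 2,
                  dimnames = list(paste0("item", 1:4), c("F1", "F2")))

# Made-up interfactor correlation matrix.
phi <- matrix(c(1.0, 0.3,
                0.3, 1.0),
              ncol = 2,
              dimnames = list(c("F1", "F2"), c("F1", "F2")))

# Structure matrix: correlations between items and factors.
structure <- pattern %*% phi

# Communality with correlated factors: diagonal of pattern %*% phi %*% t(pattern).
communality <- rowSums(structure * pattern)

print(structure)
print(communality)
```

When `phi` is the identity matrix (orthogonal rotation), the structure and pattern matrices coincide and the communality reduces to the sum of squared loadings.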

* **Rotation options:**

rotate="none", "varimax", "quartimax", "bentlerT", "equamax", "varimin", "geominT" and "bifactor" are orthogonal rotations. 

rotate="Promax", "promax", "oblimin", "simplimax", "bentlerQ", "geominQ", "biquartimin" and "cluster" are possible oblique transformations of the solution. 

The default is an oblimin transformation, although versions prior to 2009 defaulted to varimax. 

SPSS seems to apply a Kaiser normalization before Promax. Here, rotate="promax" does that normalization before calling Promax in GPArotation.
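
The practical difference between the two families is whether factors may correlate. The sketch below contrasts an orthogonal and an oblique rotation on simulated toy data. It deliberately uses base R (`factanal`, which is maximum likelihood, plus `varimax` and `promax` from stats) rather than psych or GPArotation, so the numbers will not match `fa` exactly.

```r
set.seed(11)
n <- 300
g <- rnorm(n)          # shared source that makes the two factors correlate
f1 <- g + rnorm(n)
f2 <- g + rnorm(n)
toy <- cbind(a = f1 + rnorm(n), b = f1 + rnorm(n), c = f1 + rnorm(n),
             d = f2 + rnorm(n), e = f2 + rnorm(n), f = f2 + rnorm(n))

# Unrotated two-factor solution.
L <- loadings(factanal(toy, factors = 2, rotation = "none"))

orth <- varimax(L)      # orthogonal rotation
obl <- promax(L, m = 4) # oblique transformation

# Orthogonal: the rotation matrix satisfies T'T = I, so factors stay uncorrelated.
print(round(crossprod(orth$rotmat), 3))

# Oblique: factor correlations implied by the transformation matrix.
phi <- solve(crossprod(obl$rotmat))
print(round(phi, 3))
```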

* **Factoring methods:**

fm="minres" will do a minimum residual solution, as will fm="uls". Both of these use a first derivative. 

fm="ols" differs very slightly from "minres" in that it minimizes the entire residual matrix using an OLS procedure but uses the empirical first derivative. This will be slower. 

fm="wls" will do a weighted least squares (WLS) solution, fm="gls" does a generalized weighted least squares (GLS), fm="pa" will do the principal factor solution, fm="ml" will do a maximum likelihood factor analysis. 

fm="minchi" will minimize the sample-size-weighted chi square when treating pairwise correlations with different numbers of subjects per pair. 

fm="minrank" will do a minimum rank factor analysis. 

fm="old.min" will do minimal residual the way it was done prior to April 2017 (see the psych documentation for a discussion). 

fm="alpha" will do alpha factor analysis as described in Kaiser and Coffey (1965). 
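
To give a feel for what a principal factor ("pa") solution involves, here is a sketch of iterated principal axis factoring in base R on simulated toy data. It follows the usual textbook recipe (replace the diagonal of the correlation matrix with communality estimates, take the leading eigenvectors, repeat) and is an assumption about the general approach, not a copy of the psych code.

```r
set.seed(5)
n <- 400
f <- rnorm(n)
toy <- cbind(a = 0.8 * f + rnorm(n, sd = 0.6), b = 0.7 * f + rnorm(n, sd = 0.7),
             c = 0.6 * f + rnorm(n, sd = 0.8), d = 0.5 * f + rnorm(n, sd = 0.87))
R <- cor(toy)
k <- 1  # number of factors to extract

# Starting communalities: squared multiple correlations.
h2 <- 1 - 1 / diag(solve(R))

for (i in 1:200) {
  reduced <- R
  diag(reduced) <- h2                       # reduced correlation matrix
  e <- eigen(reduced, symmetric = TRUE)
  load <- e$vectors[, 1:k, drop = FALSE] %*%
    diag(sqrt(pmax(e$values[1:k], 0)), k)   # loadings from the leading eigenpairs
  h2_new <- rowSums(load^2)
  converged <- max(abs(h2_new - h2)) < 1e-6
  h2 <- h2_new
  if (converged) break
}

rownames(load) <- colnames(toy)
print(round(load, 2))
```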

For more about rotation and factoring methods in the psych package, see: https://cran.r-project.org/web/packages/psych/psych.pdf

# Begin cluster analysis (Extra)

In [None]:
# Cluster Analysis (TYPE IN: clustering method, no. of clusters)

fviz_nbclust(data, kmeans, method = "wss")

cluster <- kmeans(data, 3, nstart = 24)

fviz_cluster(cluster, data = data, ellipse.type = "euclid", star.plot = TRUE, repel = TRUE, ggtheme = theme_minimal())
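
The "wss" plot shows the total within-cluster sum of squares for each candidate number of clusters; the "elbow" is where adding clusters stops helping much. A sketch of the same quantity computed with base R's `kmeans` on simulated toy data with three well-separated groups:

```r
set.seed(2024)
toy <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
             matrix(rnorm(60, mean = 5), ncol = 2),
             matrix(rnorm(60, mean = 10), ncol = 2))

# Total within-cluster sum of squares for k = 1..6.
wss <- sapply(1:6, function(k) kmeans(toy, centers = k, nstart = 25)$tot.withinss)

# Improvement gained by each additional cluster; it flattens after the elbow.
gain <- -diff(wss)

print(round(wss, 1))
print(round(gain, 1))
```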