# Factor Analysis Basic Example - MA2003B Multivariate Statistics Course

This notebook demonstrates the fundamental concepts of Factor Analysis (FA) using a simple 3-variable correlation matrix. Factor Analysis models observed variables as linear combinations of underlying latent factors plus unique error terms.
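
In matrix form, the model described above is usually written as follows. The notation is conventional and assumed here rather than taken from the notebook: $\Lambda$ is the loading matrix, $f$ the latent factors, $\varepsilon$ the unique errors, and $\Psi$ the diagonal matrix of unique variances. The variables are assumed standardized, and the factors uncorrelated with each other and with the errors.

$$
\begin{aligned}
x &= \Lambda f + \varepsilon, \qquad \operatorname{Cov}(f) = I, \qquad \operatorname{Cov}(\varepsilon) = \Psi \\
R &= \Lambda \Lambda^{\top} + \Psi \\
h_i^2 &= \sum_j \lambda_{ij}^2, \qquad \psi_i = 1 - h_i^2
\end{aligned}
$$

Here $h_i^2$ is the communality of variable $i$ and $\psi_i$ its uniqueness, the two quantities printed later in the notebook.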

## Learning Objectives:
- Understand the difference between PCA and Factor Analysis
- Interpret factor loadings as correlations between variables and factors
- Distinguish communalities (common variance) from uniqueness (unique variance)
- See how FA focuses on shared variance rather than total variance

**Data**: Hypothetical 3-variable correlation matrix with moderate intercorrelations (0.48 to 0.72)

**Expected Output**:
- Single factor loading for each variable
- Communalities showing proportion of variance explained by the factor
- Uniqueness showing variable-specific variance

In [10]:
# Import Required Libraries
import numpy as np
from factor_analyzer import FactorAnalyzer

In [11]:
# Create Correlation Matrix
# Example correlation matrix representing 3 moderately correlated variables
# This could represent psychological test scores or survey items measuring similar constructs
R = np.array(
    [
        [1.00, 0.60, 0.48],  # Variable 1 correlations
        [0.60, 1.00, 0.72],  # Variable 2 correlations
        [0.48, 0.72, 1.00],  # Variable 3 correlations
    ]
)
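
Before fitting anything, it can help to confirm that a hand-typed matrix is a valid correlation matrix. The check below is a minimal sketch using only NumPy; the variable names are illustrative and not part of the notebook.

```python
import numpy as np

R = np.array(
    [
        [1.00, 0.60, 0.48],
        [0.60, 1.00, 0.72],
        [0.48, 0.72, 1.00],
    ]
)

# A correlation matrix must be symmetric with ones on the diagonal
is_symmetric = np.allclose(R, R.T)
has_unit_diagonal = np.allclose(np.diag(R), 1.0)

# It must also be positive definite: all eigenvalues strictly positive
eigenvalues = np.linalg.eigvalsh(R)
is_positive_definite = bool(np.all(eigenvalues > 0))

print("symmetric:", is_symmetric)
print("unit diagonal:", has_unit_diagonal)
print("positive definite:", is_positive_definite)
```

A matrix that fails the last check cannot have come from real data, and factor extraction on it may behave unpredictably.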

In [12]:
# Display Correlation Matrix
print("Factor Analysis: Basic Single-Factor Model")
print("=" * 50)
print("Input Correlation Matrix:")
print("Variables show moderate intercorrelations (0.48-0.72)")
print(R)

Factor Analysis: Basic Single-Factor Model
==================================================
Input Correlation Matrix:
Variables show moderate intercorrelations (0.48-0.72)
[[1.   0.6  0.48]
 [0.6  1.   0.72]
 [0.48 0.72 1.  ]]


In [13]:
# Alternative approach for raw data:
# For real datasets, start with raw observations and compute correlation matrix
# X = your_data  # shape (n_samples, n_features)
# R = np.corrcoef(X.T)  # correlation matrix from raw data
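
As a sketch of the raw-data route, the example below simulates observations from a one-factor model and computes their correlation matrix. The loadings, sample size, and seed are hypothetical values chosen for illustration; they are not the source of the matrix `R` used in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n_samples = 500
true_loadings = np.array([0.6, 0.9, 0.8])  # hypothetical loadings

# One latent factor plus variable-specific noise, scaled so each
# variable has unit variance in the population
factor = rng.standard_normal(n_samples)
noise = rng.standard_normal((n_samples, 3)) * np.sqrt(1 - true_loadings**2)
X = factor[:, None] * true_loadings + noise  # shape (n_samples, n_features)

# rowvar=False tells NumPy that columns are variables (same as np.corrcoef(X.T))
R_from_data = np.corrcoef(X, rowvar=False)
print(R_from_data.round(2))
```

With raw data in hand, `X` can also be passed to `FactorAnalyzer.fit` directly, since the estimator expects raw observations unless told otherwise.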

In [14]:
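# Initialize Factor Analysis
# n_factors=1 requests a single-factor model
# method="principal" selects the package's principal-factor extraction
# rotation is left at its default ("promax"); with one factor there is
# nothing to rotate, so the loadings are the unrotated solution
# is_corr_matrix is left at its default (False), so fit() expects raw observations
fa = FactorAnalyzer(n_factors=1, method="principal")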

In [15]:
# Fit Factor Analysis
# Note: R is passed as-is. Because the estimator was created with
# is_corr_matrix=False, fit() treats the three rows of R as raw
# observations rather than as correlations.
fa.fit(R)



| Parameter | Value |
|---|---|
| n_factors | 1 |
| rotation | 'promax' |
| method | 'principal' |
| use_smc | True |
| is_corr_matrix | False |
| bounds | (0.005, ...) |
| impute | 'median' |
| svd_method | 'randomized' |
| rotation_kwargs | {} |


In [16]:
# Extract Key Results
loadings = fa.loadings_  # Correlations between variables and factor
communalities = fa.get_communalities()  # Variance explained by common factor(s)
uniqueness = fa.get_uniquenesses()  # Variable-specific variance (1 - communality)

In [17]:
# Display Factor Loadings
print("Factor Analysis Results:")
print("-" * 30)
print("Factor Loadings (correlations with latent factor):")
print("Higher absolute values indicate stronger relationships")
for i, loading in enumerate(loadings.flatten(), 1):
    print(f"Variable {i}: {loading:.3f}")

Factor Analysis Results:
------------------------------
Factor Loadings (correlations with latent factor):
Higher absolute values indicate stronger relationships
Variable 1: 0.995
Variable 2: -0.645
Variable 3: -0.901


In [18]:
# Display Communalities
print("\nCommunalities (h²):")
print("Proportion of each variable's variance explained by the common factor")
print("Range: 0 (no common variance) to 1.0 (all variance is common)")
for i, comm in enumerate(communalities, 1):
    print(f"Variable {i}: {comm:.3f}")


Communalities (h²):
Proportion of each variable's variance explained by the common factor
Range: 0 (no common variance) to 1.0 (all variance is common)
Variable 1: 0.989
Variable 2: 0.416
Variable 3: 0.812


In [19]:
# Display Uniqueness
print("\nUniqueness (ψ):")
print("Variable-specific variance not explained by the common factor")
print("Includes measurement error and truly unique variance")
for i, uniq in enumerate(uniqueness, 1):
    print(f"Variable {i}: {uniq:.3f}")


Uniqueness (ψ):
Variable-specific variance not explained by the common factor
Includes measurement error and truly unique variance
Variable 1: 0.011
Variable 2: 0.584
Variable 3: 0.188
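
The three printed quantities are tied together by simple arithmetic: with one factor, each communality is the squared loading, and each uniqueness is one minus the communality. The sketch below rechecks this with NumPy, using the loadings copied from the printed output above (rounded to three decimals, so the last digit can differ by one).

```python
import numpy as np

# Loadings copied from the notebook's printed output (one column per factor)
loadings = np.array([[0.995], [-0.645], [-0.901]])

# Communality: sum of squared loadings across factors (here a single factor)
communalities = (loadings**2).sum(axis=1)

# Uniqueness: the remaining share of each standardized variable's variance
uniqueness = 1.0 - communalities

for i, (h2, psi) in enumerate(zip(communalities, uniqueness), 1):
    print(f"Variable {i}: h2 = {h2:.3f}, psi = {psi:.3f}")
```

The sign of a loading does not affect the communality, because the loading is squared.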


## Interpretation

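- **Loadings have mixed signs**: Variable 1 loads at 0.995, while Variables 2 and 3 load at -0.645 and -0.901. Every correlation in `R` is positive, so a single common factor extracted from `R` itself would load with the same sign on all three variables. The mixed signs appear because the estimator was fitted with `is_corr_matrix=False`, which treats the three rows of `R` as raw observations.
- **Communalities range from 0.416 to 0.989**: with a single factor, each communality is the squared loading, so the factor accounts for about 42% of Variable 2's variance and about 99% of Variable 1's.
- **Uniqueness is the complement of communality** (1 - h²): 0.011, 0.584, and 0.188, covering measurement error plus variable-specific variance.
- **The factor here summarizes the three rows of `R`**, not the latent construct behind the three variables. To estimate that construct, fit on raw observations, or declare the input as a correlation matrix with `is_corr_matrix=True` (an option the package's `minres` and `ml` methods support).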
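
For comparison, a one-factor model on three variables is exactly identified: the three off-diagonal correlations determine the three loadings through $r_{ij} = \lambda_i \lambda_j$. The sketch below solves these equations in closed form with NumPy. It is an algebraic check under the one-factor assumption, not the computation `factor_analyzer` performs, and the positive sign of the loadings is a convention.

```python
import numpy as np

R = np.array(
    [
        [1.00, 0.60, 0.48],
        [0.60, 1.00, 0.72],
        [0.48, 0.72, 1.00],
    ]
)
r12, r13, r23 = R[0, 1], R[0, 2], R[1, 2]

# From r_ij = l_i * l_j: l_1^2 = r12 * r13 / r23, and similarly for the others
loadings = np.sqrt(
    np.array([r12 * r13 / r23, r12 * r23 / r13, r13 * r23 / r12])
)
communalities = loadings**2
uniqueness = 1.0 - communalities

# The implied correlations should reproduce the off-diagonal entries of R
implied = np.outer(loadings, loadings)
off_diagonal = ~np.eye(3, dtype=bool)

print("loadings:     ", loadings.round(3))
print("communalities:", communalities.round(3))
print("uniqueness:   ", uniqueness.round(3))
```

Under these assumptions all three loadings share one sign, and the communalities work out to 0.4, 0.9, and 0.576.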

## Key Differences from PCA

- **PCA**: Components maximize explained total variance, which includes both common and unique variance
- **Factor Analysis**: Factors account for common variance only; unique variance is modeled separately
- **PCA**: Components are exact linear combinations of the observed variables, used for dimensionality reduction
- **Factor Analysis**: Factors are latent constructs assumed to generate the observed variables, used for exploring or testing theory
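
The contrast can be made concrete on the notebook's matrix `R`. The sketch below computes first-component PCA loadings from `R` with ones on the diagonal (total variance), then runs a simple iterated principal axis factoring loop that replaces the diagonal with communality estimates (common variance only). This is a hand-rolled NumPy illustration, not `factor_analyzer`'s implementation; the helper name `first_loading`, the starting values, and the stopping rule are assumptions made for this example.

```python
import numpy as np

R = np.array(
    [
        [1.00, 0.60, 0.48],
        [0.60, 1.00, 0.72],
        [0.48, 0.72, 1.00],
    ]
)


def first_loading(matrix):
    """Loadings on the leading eigenpair: eigenvector * sqrt(eigenvalue)."""
    values, vectors = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    vector = vectors[:, -1]
    if vector.sum() < 0:  # fix the arbitrary sign so loadings are positive
        vector = -vector
    return vector * np.sqrt(values[-1])


# PCA: decompose R as-is, with 1.0 (total variance) on the diagonal
pca_loadings = first_loading(R)

# Principal axis factoring: put communalities on the diagonal and iterate.
# Start from squared multiple correlations, a common initial estimate.
h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
for _ in range(1000):
    reduced = R.copy()
    np.fill_diagonal(reduced, h2)
    fa_loadings = first_loading(reduced)
    new_h2 = fa_loadings**2
    if np.max(np.abs(new_h2 - h2)) < 1e-12:
        break
    h2 = new_h2

print("PCA loadings:", pca_loadings.round(3))
print("FA loadings: ", fa_loadings.round(3))
```

The PCA loadings have to reproduce the unit diagonal as well as the correlations, while the factor loadings only have to reproduce the correlations, which is why the two sets of numbers differ.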

## Installation Note

To run this notebook, install the factor_analyzer package:

```bash
pip install factor_analyzer
```