# Factor Analysis
Factor analysis models each observed variable as a linear combination of a small number of common latent factors plus a unique error term.

$X_i = \lambda_{i1}F_1 + \lambda_{i2}F_2 + \dots + \lambda_{im}F_m + \epsilon_i$

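- $X_i$: The $i$-th observed variable
- $F_j$: The $j$-th latent (unobserved) factor
- $\lambda_{ij}$: Loading of variable $i$ on factor $j$ (strength of the relationship)
- $\epsilon_i$: Unique error term for variable $i$; its variance is the unique variance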
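
To make the model concrete, here is a minimal sketch that simulates data from it and checks one consequence. It assumes the usual conditions (factors uncorrelated with unit variance, errors uncorrelated with the factors and with each other), under which the covariance of the observed variables is $\Lambda\Lambda^T + \Psi$, where $\Lambda$ is the loading matrix and $\Psi$ is the diagonal matrix of unique variances. The loadings, unique variances, and sample size below are hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical loadings: 4 observed variables (rows) on 2 factors (columns)
Lambda = np.array([
    [0.9, 0.0],
    [0.8, 0.1],
    [0.1, 0.7],
    [0.0, 0.9],
])
# Hypothetical unique variances, one per observed variable
psi = np.array([0.2, 0.3, 0.4, 0.2])

n = 50_000
F = rng.standard_normal((n, 2))                    # uncorrelated, unit-variance factors
eps = rng.standard_normal((n, 4)) * np.sqrt(psi)   # unique errors with variances psi
X = F @ Lambda.T + eps                             # X_i = sum_j lambda_ij * F_j + eps_i

# Covariance implied by the model vs. covariance estimated from the sample
implied_cov = Lambda @ Lambda.T + np.diag(psi)
sample_cov = np.cov(X, rowvar=False)
max_abs_diff = np.abs(sample_cov - implied_cov).max()

print("Implied covariance:\n", implied_cov.round(2))
print("Largest gap between sample and implied covariance:", max_abs_diff)
```

With a large sample the gap should be small, which is the sense in which a few factors "explain" the correlations among many observed variables.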

## Scenario

Suppose we survey people on 6 job-related questions. We suspect these questions reflect two underlying factors:
- Factor 1: Satisfaction with work
- Factor 2: Satisfaction with management

In [2]:
import pandas as pd
import numpy as np
from sklearn.decomposition import FactorAnalysis
import matplotlib.pyplot as plt

# Simulate survey data (100 respondents)
np.random.seed(42)
n_samples = 100

# Simulate 2 latent factors
factor_1 = np.random.normal(loc=0, scale=1, size=n_samples)  # work satisfaction
factor_2 = np.random.normal(loc=0, scale=1, size=n_samples)  # management satisfaction

print(f"factor_1: {factor_1}")
print(f"factor_2: {factor_2}")

# Generate 6 observed variables from these latent factors
# Each question is a linear combo of latent factors + noise
Q1 = factor_1 + np.random.normal(0, 0.5, n_samples)
Q2 = factor_1 + np.random.normal(0, 0.5, n_samples)
Q3 = factor_1 + np.random.normal(0, 0.5, n_samples)
Q4 = factor_2 + np.random.normal(0, 0.5, n_samples)
Q5 = factor_2 + np.random.normal(0, 0.5, n_samples)
Q6 = factor_2 + np.random.normal(0, 0.5, n_samples)

# Create a DataFrame
survey_df = pd.DataFrame({
    "Q1_WorkEnjoyment": Q1,
    "Q2_Motivation": Q2,
    "Q3_Interest": Q3,
    "Q4_TrustInManager": Q4,
    "Q5_ManagerSupport": Q5,
    "Q6_ClearExpectations": Q6
})

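# Run Factor Analysis (ask for 2 factors)
# Note: no rotation is applied, so the loadings are identified only up to a rotation
# and may mix the two factors. Passing rotation="varimax" (scikit-learn >= 0.24)
# typically gives a cleaner pattern with each question loading on one factor.
fa = FactorAnalysis(n_components=2)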
fa.fit(survey_df)

# Get loadings (how strongly each question loads on each factor).
# components_ has shape (n_factors, n_questions), so transpose it to get one row per question.
loadings = pd.DataFrame(
    fa.components_.T,
    index=survey_df.columns,
    columns=["Factor_1", "Factor_2"]
)

print("\n📊 Factor Loadings:\n")
print(loadings.round(2))

factor_1: [ 0.49671415 -0.1382643   0.64768854  1.52302986 -0.23415337 -0.23413696
  1.57921282  0.76743473 -0.46947439  0.54256004 -0.46341769 -0.46572975
  0.24196227 -1.91328024 -1.72491783 -0.56228753 -1.01283112  0.31424733
 -0.90802408 -1.4123037   1.46564877 -0.2257763   0.0675282  -1.42474819
 -0.54438272  0.11092259 -1.15099358  0.37569802 -0.60063869 -0.29169375
 -0.60170661  1.85227818 -0.01349722 -1.05771093  0.82254491 -1.22084365
  0.2088636  -1.95967012 -1.32818605  0.19686124  0.73846658  0.17136828
 -0.11564828 -0.3011037  -1.47852199 -0.71984421 -0.46063877  1.05712223
  0.34361829 -1.76304016  0.32408397 -0.38508228 -0.676922    0.61167629
  1.03099952  0.93128012 -0.83921752 -0.30921238  0.33126343  0.97554513
 -0.47917424 -0.18565898 -1.10633497 -1.19620662  0.81252582  1.35624003
 -0.07201012  1.0035329   0.36163603 -0.64511975  0.36139561  1.53803657
 -0.03582604  1.56464366 -2.6197451   0.8219025   0.08704707 -0.29900735
  0.09176078 -1.98756891 -0.21967189  0.3