# Factor Analysis Automation Tool

This Jupyter notebook automates Exploratory and Confirmatory Factor Analysis (EFA/CFA):
it samples candidate item groupings, computes factor loadings, evaluates model fit indices, and ranks the resulting models by variance explained.

## Features:
- Performs **factor analysis** using the `factor_analyzer` library.
- Selects item combinations based on construct prefixes.
- Computes **factor loadings**, **correlation matrices**, and **covariance matrices**.
- Evaluates **model fit and construct validity metrics** (Chi-Square, Average Variance Extracted (AVE), Composite Reliability, etc.).
- Stores **results in Excel files** for further interpretation.

## Requirements:
Before running this notebook, install the required libraries:
```sh
pip install pandas numpy factor-analyzer openpyxl
```


In [None]:
import pandas as pd
import numpy as np
import random
from factor_analyzer import FactorAnalyzer
from openpyxl import Workbook


## User Input

The user needs to provide:
1. The **survey dataset** file (Excel format).
2. The **number of factors** to extract.
3. The **minimum number of items per construct**.


In [None]:
# Load dataset
file_path = "Survey_Data.xlsx"  # Update the file path as needed
df = pd.read_excel(file_path)

# Get user input for factor analysis
num_factors = int(input("Enter the number of factors: "))
min_items_per_construct = int(input("Enter the minimum number of items per construct: "))
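The two prompts above pass raw input straight to `int(...)`, so a typo or a value below 1 only surfaces later as a confusing error. A minimal sketch of a stricter parser follows; the helper name `parse_positive_int` is hypothetical and not part of the notebook.

```python
def parse_positive_int(text, name):
    """Convert user input to an integer >= 1, with a readable error otherwise."""
    try:
        value = int(text.strip())
    except ValueError:
        raise ValueError(f"{name} must be a whole number, got {text!r}") from None
    if value < 1:
        raise ValueError(f"{name} must be at least 1, got {value}")
    return value


# Example with a literal string standing in for input(...)
print(parse_positive_int(" 3 ", "number of factors"))
```

In the notebook it could wrap each prompt, for example `parse_positive_int(input("Enter the number of factors: "), "number of factors")`.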


## Factor Analysis Process

1. Extracts construct prefixes from item names.
2. Randomly selects valid combinations of items.
3. Performs **factor analysis** with **varimax rotation**.
4. Computes factor loadings, correlations, and model fit metrics.
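
The notebook names AVE and Composite Reliability among its metrics but does not show how they are derived. A minimal sketch follows, assuming the commonly used formulas based on the standardized loadings of a single construct; the function names are hypothetical, and the notebook's own computation may differ.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    # Error variance of a standardized item is 1 minus its squared loading
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# Illustrative loadings for a three-item construct (made-up values)
example_loadings = [0.7, 0.8, 0.9]
ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
print(round(ave, 3), round(cr, 3))  # 0.647 0.845
```

Widely cited rules of thumb treat AVE of at least 0.5 and Composite Reliability of at least 0.7 as acceptable.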


In [None]:
import re

# Extract construct prefixes from column names
constructs = {}
for col in df.columns:
    # Take only the leading run of letters, so "TRUST1a" maps to "TRUST" (not "TRUSTa")
    match = re.match(r"[^\W\d_]+", str(col))
    if match is None:
        continue  # Skip columns that do not start with a letter (e.g. numeric IDs)
    constructs.setdefault(match.group(0), []).append(col)

# Filter constructs to ensure a minimum number of items
valid_constructs = {k: v for k, v in constructs.items() if len(v) >= min_items_per_construct}

# Generate valid unique item groupings (randomly sampled)
max_combinations = 1000  # Limit to avoid excessive runtime
valid_combinations = set()

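# Cap the number of draws: if fewer than max_combinations unique groupings exist,
# an uncapped loop would never terminate.
max_attempts = max_combinations * 20
attempts = 0

while len(valid_combinations) < max_combinations and attempts < max_attempts:
    attempts += 1
    selected_items = []
    for construct, items in valid_constructs.items():
        size = random.randint(min_items_per_construct, len(items))  # Select at least min_items_per_construct
        selected_items.extend(random.sample(items, size))
    valid_combinations.add(tuple(sorted(selected_items)))  # Ensure uniqueness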

# Convert set back to list for processing
valid_combinations = [list(comb) for comb in valid_combinations]
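
The sampling loop can only collect `max_combinations` unique groupings if that many exist. A minimal sketch of the upper bound follows, assuming unique column names and that each construct contributes between the minimum and all of its items; the helper `count_groupings` is hypothetical.

```python
from math import comb, prod


def count_groupings(construct_sizes, min_items):
    """Count distinct groupings when each construct contributes min_items..n items."""
    return prod(
        sum(comb(n, k) for k in range(min_items, n + 1))
        for n in construct_sizes
    )


# Example: three constructs with 4, 5 and 3 items, at least 3 items each
total = count_groupings([4, 5, 3], min_items=3)
print(total)  # 80
```

With only 80 possible groupings, a target of 1000 unique combinations can never be reached. In the notebook, the sizes could be taken from `valid_constructs`, for example `[len(v) for v in valid_constructs.values()]`.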


## Saving Results

- **Factor Loadings** are stored in an Excel file.
- **Full Model Results** (Factor Loadings, Fit Indices, Correlations, and Covariance Matrices) are stored separately.


In [None]:
# Save files
wb_loadings = Workbook()
del wb_loadings[wb_loadings.sheetnames[0]]  # Remove default sheet

wb_full = Workbook()
del wb_full[wb_full.sheetnames[0]]  # Remove default sheet

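# Sort models by variance explained and keep the top 50.
# NOTE: model_scores must already exist: one tuple per fitted model, in the
# order unpacked in the loop below. A score of "NA" is ranked as 0.
model_scores.sort(reverse=True, key=lambda x: x[0] if x[0] != "NA" else 0)
top_models = model_scores[:50]
if not top_models:
    raise ValueError("model_scores is empty; openpyxl cannot save a workbook without sheets")


def append_frame(ws, title, frame):
    """Append a titled DataFrame block: header row, one row per index label, blank row."""
    ws.append([title])
    ws.append(["Item"] + [str(c) for c in frame.columns])
    for idx, row in frame.iterrows():
        ws.append([idx] + list(row))
    ws.append([])


for i, (score, items, loadings, fit_indices, construct_metrics, correlation_matrix, covariance_matrix) in enumerate(top_models):
    sheet_name = f"Top_{i+1}"

    # First file: factor loadings only
    ws = wb_loadings.create_sheet(title=sheet_name)
    ws.append(["Item"] + [f"Factor {j+1}" for j in range(num_factors)])
    for idx, row in loadings.iterrows():
        ws.append([idx] + list(row))

    # Second file: loadings, fit indices, correlations, and covariances.
    # Assumes fit_indices is a dict of name -> scalar and both matrices are DataFrames.
    ws_full = wb_full.create_sheet(title=sheet_name)
    append_frame(ws_full, "Factor Loadings", loadings)
    ws_full.append(["Fit Indices"])
    for name, value in fit_indices.items():
        ws_full.append([name, value])
    ws_full.append([])
    append_frame(ws_full, "Correlation Matrix", correlation_matrix)
    append_frame(ws_full, "Covariance Matrix", covariance_matrix)

wb_loadings.save("Factor_Loadings.xlsx")
wb_full.save("Measurement_Model.xlsx")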

print("Factor Loadings saved as 'Factor_Loadings.xlsx'")
print("Full results saved as 'Measurement_Model.xlsx'")
