Response-free semantic analysis of psychometric scales.
semanticfa reads the meaning of a scale's item wording with a language model
and recovers, interprets, and refines the scale's latent structure — entirely
from the items, with no human response data. Factor analysis on the item
embeddings is the centerpiece, but the package is a full toolkit for working
with a scale before (or without) collecting data: building semantic similarity
matrices, deciding how many factors to keep, reading a semantic "loadings"
table, comparing the recovered structure to theory, flagging redundant items,
building short forms, vetting brand-new candidate items, detecting
jingle/jangle fallacies across scales, and visualizing the item space.
# from CRAN (once available):
install.packages("semanticfa")
# development version:
# install.packages("remotes")
remotes::install_github("devon7y/semanticfa")The core of the package is pure R. Turning item text into embeddings on your
machine uses Python via reticulate — needed only if you want the package to
embed text for you (you can always bring your own embeddings):
sfa_install_python() # one-time: provisions sentence-transformersTwo-dimensional item maps work out of the box (Rtsne and uwot are bundled).
One optional package, EGAnet, powers EGA-based factor retention /
dimension selection and the faithful UVA redundancy method — install it only if
you use those parts.
library(semanticfa)
data(big5) # 50 IPIP Big-Five items + precomputed sentence-BERT embeddings
# one call: embed -> similarity -> retain -> extract -> diagnose
fit <- sfa(
data.frame(code = big5$codes, item = big5$items,
factor = big5$factors, scoring = big5$scoring),
embeddings = big5$embeddings, nfactors = 5)
fit
# interpret and refine, all from the same fit
plot(fit, type = "scree") # scree with parallel-analysis overlay
sfa_corplot(fit) # item-by-item similarity heatmap, grouped by factor
sfa_anchor(fit) # item-by-construct "belonging" (a semantic loadings table)
sfa_congruence(fit, target = big5$factors, # agreement with theory (partition metrics)
metrics = c("nmi", "ari"))
sfa_redundancy(fit) # near-duplicate itemsNo respondents are involved at any step.
| Function | Purpose |
|---|---|
sfa_embed() |
Embed item text — on-device sentence-BERT (Qwen, default), the OpenAI API, or any custom function. Results are cached. |
sfa_load_npz() |
Load pre-generated embeddings (e.g. a GPU job) from a NumPy .npz, no Python needed. |
sfa_similarity() |
Item-by-item similarity matrix with a choice of four encodings (below). |
sfa_nli_matrix() |
Signed, valence-aware similarity from natural-language inference (entailment − contradiction), so reverse-keyed items are handled directly. |
sfa_install_python(), sfa_clear_cache() |
Provision the embedding environment / clear the cache. |
| Function | Purpose |
|---|---|
sfa() |
The end-to-end pipeline: embed → similarity → retain → extract → diagnose. Accepts raw text, precomputed embeddings, an sfa_embeddings object, or a precomputed similarity matrix. |
sfa_nfactors() |
How many factors to keep — parallel analysis, Kaiser, TEFI, and EGA in one call. |
sfa_parallel() |
Embedding-adapted parallel analysis (random-unit-vector null; no sample size needed). |
sfa_dimselect() |
Select the informative leading embedding coordinates ("depth") by EGA depth optimization. |
as_psych() |
Hand the solution to psych (factor.congruence(), fa.sort(), …) as a standard fa object. |
| Function | Purpose |
|---|---|
sfa_anchor() |
An item-by-construct belonging matrix — a semantic loadings table — built from construct centroids and/or construct-name embeddings. |
sfa_project() |
Place items on interpretable bipolar axes (e.g. mild ↔ severe, passive ↔ active). |
sfa_congruence() |
Compare the recovered structure to an empirical or theoretical one: Tucker φ, NMI, ARI, Frobenius, and disattenuated correlation. |
sfa_jinglejangle() |
Flag jingle (same name, different content) and jangle (different name, same content) fallacies across multiple scales. |
| Function | Purpose |
|---|---|
sfa_redundancy() |
Detect near-duplicate items via faithful Unique Variable Analysis (absolute wTO on an EBICglasso network) or a direct cosine criterion. |
sfa_simplify() |
Build response-free short forms by selecting the most representative items per factor. |
sfa_item_fit() |
Vet a brand-new candidate item: how well does it match the construct name and the other items, and is it redundant with any of them? |
| Function | Purpose |
|---|---|
sfa_corplot() |
Heatmap of the item-by-item similarity matrix, grouped/ordered by factor (order accepts factor-name abbreviations, e.g. c("D","A","S")). |
sfa_itemplot() |
2-D item map via t-SNE, UMAP, PCA, or MDS (sfa_tsneplot() is a deprecated alias). |
plot(fit, "scree") |
Scree plot with the parallel-analysis overlay. |
Every sfa() fit reports KMO, a real partition-based TEFI (negative;
lower is better), RMSR, CAF, McDonald's ω, and — when theoretical
factors are supplied — a factor-to-theory alignment matrix (DAAL).
summary(fit) adds the full breakdown, and calibrate = TRUE adds a Monte
Carlo null reference for the diagnostics.
| Method | Description | Keying |
|---|---|---|
"atomic" (default) |
L2-normalize, cosine similarity | uses scoring sign-flip |
"atomic_reversed" |
Sign-flip reverse-keyed items, L2-normalize, cosine | uses scoring sign-flip |
"squid" |
Subtract the questionnaire-mean embedding, then cosine | keying-free |
"mean_centered_pearson" |
Mean-center → cosine = Pearson correlation | keying-free |
data(big5) — the 50-item IPIP Big-Five markers (public domain) with precomputed
sentence-transformers/all-MiniLM-L6-v2 embeddings, so every example runs
without Python or network access.
A full guided tour — every function, worked end-to-end on the Depression Anxiety
Stress Scales — is in the package vignette (vignette("introduction", package = "semanticfa")).
- Milano, N., Luongo, M., Ponticorvo, M., & Marocco, D. (2025). Semantic analysis of test items through large language model embeddings predicts a-priori factorial structure of personality tests. Current Research in Behavioral Sciences, 8, 100168. doi:10.1016/j.crbeha.2025.100168
- Casella, M., Luongo, M., Marocco, D., Milano, N., & Ponticorvo, M. (2024). LLM embeddings on test items predict post hoc loadings in personality tests. Ital-IA 2024, CEUR Workshop Proceedings.
- Guenole, N., D'Urso, E. D., Samo, A., Sun, T., & Haslbeck, J. M. B. (preprint). Enhancing Scale Development: Pseudo Factor Analysis of Language Embedding Similarity Matrices. OSF: https://osf.io/3mpzb/
- Pellert, M., Lechner, C. M., Sen, I., & Strohmaier, M. (2026). Neural network embeddings recover value dimensions from psychometric survey items on par with human data (SQuID). Findings of the ACL: EACL 2026, 5738–5752.
- Pokropek, A. (2026). From keyword-based text measures to latent variables: Confirmatory factor analysis with word embeddings. EPJ Data Science. doi:10.1140/epjds/s13688-026-00654-1
- Kmetty, Z., Koltai, J., & Rudas, T. (2021). The presence of occupational structure in online texts based on word embedding NLP models. EPJ Data Science, 10, 55. doi:10.1140/epjds/s13688-021-00311-9
- Christensen, A. P., Garrido, L. E., & Golino, H. (2023). Unique Variable Analysis: A network psychometrics method to detect local dependence. Multivariate Behavioral Research, 58(6), 1165–1182. doi:10.1080/00273171.2023.2194606
- Golino, H. (2026). Optimizing the landscape of LLM embeddings with Dynamic Exploratory Graph Analysis for generative psychometrics. arXiv:2601.17010.
- Grand, G., Blank, I. A., Pereira, F., & Fedorenko, E. (2022). Semantic projection recovers rich human knowledge of multiple object features from word embeddings. Nature Human Behaviour, 6(7), 975–987. doi:10.1038/s41562-022-01316-8
- Wulff, D. U., & Mata, R. (2025). Semantic embeddings reveal and address taxonomic incommensurability in psychological measurement. Nature Human Behaviour, 9(5), 944–954. doi:10.1038/s41562-024-02089-y
- Wulff, D. U., & Mata, R. (2026). Escaping the jingle-jangle jungle: Increasing conceptual clarity in psychology using large language models. Current Directions in Psychological Science, 35(2), 59–65. doi:10.1177/09637214251382083
- Hommel, B. E., & Arslan, R. C. (2025). Language models accurately infer correlations between psychological items and scales from text alone. Advances in Methods and Practices in Psychological Science, 8(4). doi:10.1177/25152459251377093
- Jung, S.-J., & Seo, J.-W. (2025). A transformer-based embedding approach to developing short-form psychological measures. Frontiers in Psychology, 16, 1640864. doi:10.3389/fpsyg.2025.1640864
- Wang, B., Zhang, Y., Hu, Y., Hou, H., Peng, K., & Ni, S. (2026). Discovering semantic latent structures in psychological scales: A response-free pathway to efficient simplification. arXiv:2602.12575.
GPL (>= 3)