AssayExtract is a Python package for extracting and standardizing biomedical outcome measures (assays) from research text.
It identifies assays mentioned in an input text and maps them to a curated vocabulary of canonical names, outcome domains, and synonyms.
Reliable extraction of outcome measures from biomedical literature is challenging due to inconsistent reporting and high variability in terminology. Prior work (PreClinIE, ACL BioNLP 2025) found that manual annotations of outcome measures showed low inter-annotator agreement, making them unsuitable for training robust machine learning models.
To address this, AssayExtract adopts a rule-based approach grounded in a curated assay vocabulary.
AssayExtract is built on a harmonized vocabulary of outcome assessment techniques, developed through manual curation of the biomedical literature.
- A core set of commonly used assays was identified from representative studies
- Each assay was assigned a canonical name and mapped to one of five outcome domains
- Synonyms and lexical variants were expanded using a large language model and manually reviewed
- A domain-specific synonym dictionary enables robust matching via pattern-based extraction
Extracted mentions are normalized to canonical names and linked to structured metadata, including domain and subdomain.
The raw source file can be accessed here: ./data/assay_final_harmonized_with_enriched_synonyms_unique.csv.
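
The matching and normalization steps described above can be illustrated with a minimal sketch. This is not the package's actual implementation: the `Assay` dataclass, the `VOCAB` list, the `build_pattern` and `extract` helpers, and the synonyms shown are all hypothetical and exist only to show how a synonym dictionary can drive pattern-based extraction.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Assay:
    """One vocabulary entry (hypothetical structure, for illustration only)."""
    canonical_name: str
    outcome_domain: str
    subdomain: str
    synonyms: tuple[str, ...]


# Toy vocabulary. The synonyms are illustrative assumptions, not the curated list.
VOCAB = [
    Assay("elevated plus maze", "Behavioral", "Anxiety",
          ("elevated plus maze", "elevated plus-maze", "EPM")),
    Assay("morris water maze", "Behavioral", "Cognition & learning",
          ("morris water maze", "MWM")),
    Assay("accelerating rotarod", "Behavioral", "Motor coordination",
          ("accelerating rotarod", "accelerating rota-rod")),
]


def build_pattern(vocab):
    """Compile one case-insensitive regex covering every synonym."""
    lookup = {syn.lower(): assay for assay in vocab for syn in assay.synonyms}
    # Longest synonyms first, so a longer variant is preferred over a
    # shorter one that starts at the same position.
    alternatives = sorted(lookup, key=len, reverse=True)
    body = "|".join(re.escape(s) for s in alternatives)
    return re.compile(rf"\b(?:{body})\b", re.IGNORECASE), lookup


def extract(text, vocab=VOCAB):
    """Return each matched assay once, in order of first mention."""
    pattern, lookup = build_pattern(vocab)
    normalized = " ".join(text.split())  # collapse line breaks and repeated spaces
    found = []
    for match in pattern.finditer(normalized):
        assay = lookup[match.group(0).lower()]
        if assay not in found:
            found.append(assay)
    return found


for assay in extract("We used the Elevated Plus-Maze (EPM) and an accelerating rota-rod."):
    print(assay.canonical_name, "|", assay.outcome_domain, "|", assay.subdomain)
```

Each surface form maps back to a single entry, so a mention such as "EPM" is reported under the canonical name together with its domain and subdomain.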
Typical applications include:
- Literature mining and systematic reviews
- Analysis of outcome measures across studies
- Construction of structured datasets for biomedical NLP and LLMs

Install with pip:

    pip install assay-extract

Or from source:

    git clone https://github.com/Ineichen-Group/AssayExtract.git
    cd AssayExtract
    pip install -e .

Example usage:

    from assay_extract import AssayClassifier

    classifier = AssayClassifier()
    methods = """
    We assessed anxiety using the elevated plus maze and social behavior
    with the three-chamber test. Learning was measured on the morris water maze.
    Motor coordination was tested on the accelerating rotarod.
    """
    results = classifier.extract_measures(methods)
    for result in results:
        print(f"{result.canonical_name}")
        print(f" Domain: {result.outcome_domain}")
        print(f" Subdomain: {result.subdomain}")
        print()

Output:

    elevated plus maze
     Domain: Behavioral
     Subdomain: Anxiety

    three-chamber social approach test
     Domain: Behavioral
     Subdomain: Sociability

    morris water maze
     Domain: Behavioral
     Subdomain: Cognition & learning

    accelerating rotarod
     Domain: Behavioral
     Subdomain: Motor coordination

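
For cross-study analysis or dataset construction, extracted results can be flattened into rows and tallied. The sketch below is one possible approach, not part of the package: `to_rows`, `count_by_subdomain`, and the `study_id` field are hypothetical names, and the `SimpleNamespace` objects are stand-ins that only mimic the `canonical_name`, `outcome_domain`, and `subdomain` attributes shown in the example above.

```python
from collections import Counter
from types import SimpleNamespace


def to_rows(study_id, results):
    """Flatten result objects into plain dicts, one row per extracted assay."""
    return [
        {
            "study_id": study_id,
            "assay": r.canonical_name,
            "domain": r.outcome_domain,
            "subdomain": r.subdomain,
        }
        for r in results
    ]


def count_by_subdomain(rows):
    """Count rows per (domain, subdomain) pair."""
    return Counter((row["domain"], row["subdomain"]) for row in rows)


# Stand-in results; in real use this list would come from
# classifier.extract_measures(text).
demo_results = [
    SimpleNamespace(canonical_name="elevated plus maze",
                    outcome_domain="Behavioral", subdomain="Anxiety"),
    SimpleNamespace(canonical_name="morris water maze",
                    outcome_domain="Behavioral", subdomain="Cognition & learning"),
]

rows = to_rows("study-001", demo_results)
counts = count_by_subdomain(rows)
for (domain, subdomain), n in counts.most_common():
    print(f"{domain} / {subdomain}: {n}")
```

Rows in this shape can be written out with `csv.DictWriter` or loaded into a data frame, and rows from several studies can be concatenated before counting.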

Run the test suite with:

    python -m pytest assay_extract/tests/ -v

All 9 tests pass.

MIT License

    @software{assayextract2025,
      title={AssayExtract: Extraction of Biomedical Outcome Measures from Text},
      author={Simona Emilova Doneva},
      year={2025},
      url={https://github.com/Ineichen-Group/AssayExtract}
    }

Acikgoz, B., Dalkiran, B., & Dayi, A. (2022). An overview of the currency and usefulness of behavioral tests used from past to present to assess anxiety, social behavior and depression in rats and mice. Behavioural Processes, 200, 104670. https://doi.org/10.1016/j.beproc.2022.104670
Adil, A., Kumar, V., Jan, A. T., & Asger, M. (2021). Single-Cell Transcriptomics: Current Methods and Challenges in Data Acquisition and Analysis. Frontiers in Neuroscience, 15. https://doi.org/10.3389/fnins.2021.591122
Alemán, C. L., Noa, M., Más, R., Rodeiro, I., Mesa, R., Menéndez, R., Gámez, R., & Hernández, C. (2000). Reference data for the principal physiological indicators in three species of laboratory animals. Laboratory Animals, 34(4), 379–385. https://doi.org/10.1258/002367700780387741
Alturkistani, H. A., Tashkandi, F. M., & Mohammedsaleh, Z. M. (2016). Histological Stains: A Literature Review and Case Study. Global Journal of Health Science, 8(3), 72–79. https://doi.org/10.5539/gjhs.v8n3p72
Burrows, D. J., McGown, A., Jain, S. A., De Felice, M., Ramesh, T. M., Sharrack, B., & Majid, A. (2019). Animal Models of Multiple Sclerosis: From Rodents to Zebrafish. Multiple Sclerosis Journal, 25(3), 306–324. https://doi.org/10.1177/1352458518805246
Chen, C., Wang, J., Pan, D., Wang, X., Xu, Y., Yan, J., Wang, L., Yang, X., Yang, M., & Liu, G.-P. (2023). Applications of multi-omics analysis in human diseases. MedComm, 4(4), e315. https://doi.org/10.1002/mco2.315
Choi, J. D., & Kumar, V. (2024). A new era in quantification of animal social behaviors. Neuroscience & Biobehavioral Reviews, 157, 105528. https://doi.org/10.1016/j.neubiorev.2023.105528
Dufva, M. (2009). Introduction to Microarray Technology. In M. Dufva (Ed.), DNA Microarrays for Biomedical Research: Methods and Protocols (pp. 1–22). Humana Press. https://doi.org/10.1007/978-1-59745-538-1_1
Gold, E. M., Su, D., López-Velázquez, L., Haus, D. L., Perez, H., Lacuesta, G. A., Anderson, A. J., & Cummings, B. J. (2013). Functional Assessment of Long-Term Deficits in Rodent Models of Traumatic Brain Injury. Regenerative Medicine, 8(4), 483–516. https://doi.org/10.2217/rme.13.41
Gregory, N. S., Harris, A. L., Robinson, C. R., Dougherty, P. M., Fuchs, P. N., & Sluka, K. A. (2013). An Overview of Animal Models of Pain: Disease Models and Outcome Measures. The Journal of Pain, 14(11), 1255–1269. https://doi.org/10.1016/j.jpain.2013.06.008
Guevara, R. D., Pastor, J. J., Manteca, X., Tedo, G., & Llonch, P. (2022). Systematic review of animal-based indicators to measure thermal, social, and immune-related stress in pigs. PLOS ONE, 17(5), e0266524. https://doi.org/10.1371/journal.pone.0266524
Gurina, T. S., & Simms, L. (2025). Histology, Staining. In StatPearls. StatPearls Publishing.
Harrison, D. J., Creeth, H. D. J., Tyson, H. R., Boque-Sastre, R., Isles, A. R., Palme, R., Touma, C., & John, R. M. (2020). Unified Behavioral Scoring for Preclinical Models. Frontiers in Neuroscience, 14. https://doi.org/10.3389/fnins.2020.00313
Javaeed, A., Qamar, S., Ali, S., Mustafa, M. A. T., Nusrat, A., & Ghauri, S. K. (2021). Histological Stains in the Past, Present, and Future. Cureus. https://doi.org/10.7759/cureus.18486
Jin, S., & Kennedy, R. T. (2015). New developments in Western blot technology. Chinese Chemical Letters, 26(4), 416–418. https://doi.org/10.1016/j.cclet.2015.01.021
Jones, L. A. T., Field-Fote, E. C., Magnuson, D., Tom, V., Basso, D. M., Fouad, K., & Mulcahey, M. J. (2025). Outcome measures in rodent models for spinal cord injury and their human correlates. Experimental Neurology, 386, 115169. https://doi.org/10.1016/j.expneurol.2025.115169
Just, N. (2021). Proton functional magnetic resonance spectroscopy in rodents. NMR in Biomedicine, 34(5), e4254. https://doi.org/10.1002/nbm.4254
Mark, M., Teletin, M., Antal, C., Wendling, O., Auwerx, J., Heikkinen, S., Khetchoumian, K., Argmann, C. A., & Dgheem, M. (2007). Histopathology in Mouse Metabolic Investigations. Current Protocols in Molecular Biology, 78(1), 29B.4.1–29B.4.32. https://doi.org/10.1002/0471142727.mb29b04s78
Markicevic, M., Savvateev, I., Grimm, C., & Zerbi, V. (2021). Emerging imaging methods to study whole-brain function in rodent models. Translational Psychiatry, 11(1), 457. https://doi.org/10.1038/s41398-021-01575-5
Meredith, G. E., & Kang, U. J. (2006). Behavioral models of Parkinson’s disease in rodents: A new look at an old problem. Movement Disorders, 21(10), 1595–1606. https://doi.org/10.1002/mds.21010
Musumeci, G. (2014). Past, present and future: Overview on histology and histopathology. Journal of Histology and Histopathology, 1(1), 5. https://doi.org/10.7243/2055-091X-1-5
Osier, N. D., Carlson, S. W., DeSana, A., & Dixon, C. E. (2015). Chronic Histopathological and Behavioral Outcomes of Experimental Traumatic Brain Injury in Adult Male Animals. Journal of Neurotrauma, 32(23), 1861–1882. https://doi.org/10.1089/neu.2014.3680
Pai, J. A., & Satpathy, A. T. (2021). High-throughput and single-cell T cell receptor sequencing technologies. Nature Methods, 18(8), 881–892. https://doi.org/10.1038/s41592-021-01201-8
Pinkernell, S., Becker, K., & Lindauer, U. (2016). Severity assessment and scoring for neurosurgical models in rodents. Laboratory Animals, 50(6), 442–452. https://doi.org/10.1177/0023677216675010
Sadler, K. E., Mogil, J. S., & Stucky, C. L. (2022). Innovations and advances in modelling and measuring pain in animals. Nature Reviews Neuroscience, 23(2), 70–85. https://doi.org/10.1038/s41583-021-00536-7
Shepherd, A., Tyebji, S., Hannan, A. J., & Burrows, E. L. (2016). Translational Assays for Assessment of Cognition in Rodent Models of Alzheimer’s Disease and Dementia. Journal of Molecular Neuroscience, 60(3), 371–382. https://doi.org/10.1007/s12031-016-0837-1
Tremoleda, J. L., & Sosabowski, J. (2015). Imaging Technologies and Basic Considerations for Welfare of Laboratory Rodents. Lab Animal, 44(3), 97–105. https://doi.org/10.1038/laban.665
Verma, V. V., Vimal, S., Mishra, M. K., & Sharma, V. K. (2025). A Comprehensive Review on Structural Insights through Molecular Visualization: Tools, Applications, and Limitations. Journal of Molecular Modeling, 31(6), 173. https://doi.org/10.1007/s00894-025-06402-y
Waerzeggers, Y., Monfared, P., Viel, T., Winkeler, A., & Jacobs, A. H. (2010). Mouse models in neurological disorders: Applications of non-invasive imaging. Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease, 1802(10), 819–839. https://doi.org/10.1016/j.bbadis.2010.04.009
Wahl, D., Coogan, S. C. P., Solon-Biet, S. M., de Cabo, R., Haran, J. B., Raubenheimer, D., Cogger, V. C., Mattson, M. P., Simpson, S. J., & Le Couteur, D. G. (2017). Cognitive and behavioral evaluation of nutritional interventions in rodent models of brain aging and dementia. Clinical Interventions in Aging, 12, 1419–1428. https://doi.org/10.2147/CIA.S145247
Webster, S. J., Bachstetter, A. D., Nelson, P. T., Schmitt, F. A., & Van Eldik, L. J. (2014). Using Mice to Model Alzheimer’s Dementia: An Overview of the Clinical Disease and the Preclinical Behavioral Changes in 10 Mouse Models. Frontiers in Genetics, 5. https://doi.org/10.3389/fgene.2014.00088
Wickenden, A. D. (2000). Overview of Electrophysiological Techniques. Current Protocols in Pharmacology, 11(1), 11.1.1–11.1.17. https://doi.org/10.1002/0471141755.ph1101s64
Xiong, Y., Mahmood, A., & Chopp, M. (2013). Animal Models of Traumatic Brain Injury. Nature Reviews Neuroscience, 14(2), 128–142. https://doi.org/10.1038/nrn3407
Zarruk, J. G., García-Yébenes, I., Romera, V. G., Ballesteros, I., Moraga, A., Cuartero, M. I., Hurtado, O., Sobrado, M., Pradillo, J. M., Fernández-López, D., Serena, J., Castillo-Meléndez, M., Moro, M. A., & Lizasoain, I. (2011). Neurological tests for functional outcome assessment in rodent models of ischaemic stroke. Revista de Neurología, 53.