amazon-science/TN-Eval

TN-Eval

This repository contains the code for our ACL 2025 paper: TN-Eval: Rubric and Evaluation Protocols for Measuring the Quality of Behavioral Therapy Notes.

Authors: Raj Sanjay Shah, Lei Xu, Qianchu Liu, Jon Burnsky, Drew Bertagnolli, Chaitanya Shivade

Introduction

TN-Eval provides tools for generating behavioral therapy notes using large language models (LLMs) and evaluating them via automatic, rubric-based protocols.

Quick Start

Download Data

Download AnnoMI data from https://github.com/uccollab/AnnoMI/raw/refs/heads/main/AnnoMI-full.csv and save it as data/AnnoMI-full.csv.
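If you prefer to script the download step, it could look roughly like the sketch below. The helper `download_annomi` is a hypothetical name and is not part of this repository; only the URL and the destination path come from the instructions above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL and destination path are taken from the download instructions above.
ANNOMI_URL = "https://github.com/uccollab/AnnoMI/raw/refs/heads/main/AnnoMI-full.csv"
ANNOMI_PATH = Path("data/AnnoMI-full.csv")


def download_annomi(url=ANNOMI_URL, dest=ANNOMI_PATH):
    """Hypothetical helper: fetch the CSV to `dest` unless it already exists."""
    dest = Path(dest)
    if dest.exists():
        # Keep an existing copy rather than overwriting it.
        return dest
    # Create data/ (or any other parent directory) if it is missing.
    dest.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(dest))
    return dest
```

Calling `download_annomi()` from the repository root should leave the file at `data/AnnoMI-full.csv`, which is the path the generation command expects.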

Generate Notes

python3 src/generate_soap_note.py --input data/AnnoMI-full.csv --output data/llm_notes/

Run Automatic Evaluations

python3 src/run_metrics_reference_free.py \
    --note data/llm_notes/outputs_annomi_llama31_70B_high.json \
    --output data/llm_notes/outputs_annomi_llama31_70B_high_with_eval.json
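To evaluate several note files in one pass, a minimal sketch along these lines may help. It assumes the generation step writes one JSON file per model configuration into `data/llm_notes/`, which this README does not state explicitly. `build_eval_commands` is a hypothetical helper, and the `_with_eval` suffix simply mirrors the naming used in the command above.

```python
from pathlib import Path


def build_eval_commands(notes_dir):
    """Hypothetical helper: return one argv list per note file in `notes_dir`."""
    commands = []
    for note in sorted(Path(notes_dir).glob("*.json")):
        if note.stem.endswith("_with_eval"):
            continue  # skip files that are already evaluation outputs
        # Assumed naming: <note name>_with_eval.json next to the input file.
        output = note.with_name(f"{note.stem}_with_eval.json")
        commands.append([
            "python3", "src/run_metrics_reference_free.py",
            "--note", str(note),
            "--output", str(output),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. The helper only builds the commands, so you can print and inspect them before running anything.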

Human Notes and Evaluations

You can find all data artifacts in our companion repository: TN-Eval-Data.

This includes:

  • Human-written therapy notes
  • Human evaluations of human notes and LLM-generated notes
  • Automatic evaluations using LLaMA and Mistral models

Citation

If you use our data, please cite:

@inproceedings{shah2025tneval,
  title={TN-Eval: Rubric and Evaluation Protocols for Measuring the Quality of Behavioral Therapy Notes},
  author={Shah, Raj Sanjay and Xu, Lei and Liu, Qianchu and Burnsky, Jon and Bertagnolli, Drew and Shivade, Chaitanya},
  booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics: Industry Track},
  year={2025}
}

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
