
zhangchaodesign/evaluaid


EvaluAId: Human-AI Collaborative Evaluation of Open-Ended Student Essays

Link to CHI 2026 Paper

Open-ended writing assignments are central to higher education, yet heterogeneous submissions and scale make evaluation difficult. Automated writing evaluation (AWE) promises speed but often trades away transparency and sidelines human judgment. This paper repositions the AI as an on-demand collaborator that can provide specific, targeted support. In a formative study, we expose leverage points in three cognitive dimensions: evidence identification, comparative judgment, and feedback composition. Guided by these insights, we build EvaluAId, which supports interactive rubric-content mapping, adaptive benchmarking and self-calibration, and personalized, rubric-aligned feedback synthesis. Through a within-subjects study with 12 TAs, we evaluate how this approach supports grading compared with a rubric+LLM chatbot and an LLM-based AWE; EvaluAId improved alignment with expert ratings and increased graders’ satisfaction. Finally, interviews with TAs, instructors, and students underscored the value of thoughtfulness supported by EvaluAId while surfacing practical considerations for integration into the classroom. Together, our results argue for deliberate, evidence-first, human-in-the-loop evaluation.

Setup
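If the project's dependencies are not yet installed, install them first. This is a minimal sketch assuming a standard Node.js/Next.js layout with a package.json at the repository root (not stated explicitly in this README):

```shell
# Install project dependencies (run once, from the repository root)
npm install
# or
yarn install
# or
pnpm install
```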

First, run the development server:

npm run dev
# or
yarn dev
# or
pnpm dev

Open http://localhost:3000 with your browser to see the result.

CHI 2026 Paper

EvaluAId: Human-AI Collaborative Evaluation of Open-Ended Student Essays
Chao Zhang, Kexin Ju, Xinyi Lu, Yu-Chun Grace Yen, and Jeffrey M. Rzeszotarski

Please cite this paper if you use the code or prompts in this repository.

Chao Zhang, Kexin Ju, Xinyi Lu, Yu-Chun Grace Yen, and Jeffrey M. Rzeszotarski. 2026. EvaluAId: Human-AI Collaborative Evaluation of Open-Ended Student Essays. In CHI Conference on Human Factors in Computing Systems (CHI '26), April 26-May 1, 2026, Yokohama, Japan. ACM, New York, NY, USA, 28 pages. https://doi.org/10.1145/3772318.3790814

@inproceedings{10.1145/3772318.3790814,
  author = {Zhang, Chao and Ju, Kexin Phyllis and Lu, Xinyi and Yen, Yu-Chun Grace and Rzeszotarski, Jeffrey M.},
  title = {EvaluAId: Human-AI Collaborative Evaluation of Open-Ended Student Essays},
  year = {2026},
  isbn = {9798400722783},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3772318.3790814},
  doi = {10.1145/3772318.3790814},
  abstract = {Open-ended writing assignments are central to higher education, yet heterogeneous submissions and scale make evaluation difficult. Automated writing evaluation (AWE) promises speed but often trades away transparency and sidelines human judgment. This paper repositions the AI as an on-demand collaborator that can provide specific, targeted support. In a formative study, we expose leverage points in three cognitive dimensions: evidence identification, comparative judgment, and feedback composition. Guided by these insights, we build EvaluAId, which supports interactive rubric-content mapping, adaptive benchmarking and self-calibration, and personalized, rubric-aligned feedback synthesis. Through a within-subjects study with 12 TAs, we evaluate how this approach supports grading compared with a rubric+LLM chatbot and an LLM-based AWE; EvaluAId improved alignment with expert ratings and increased graders’ satisfaction. Finally, interviews with TAs, instructors, and students underscored the value of thoughtfulness supported by EvaluAId while surfacing practical considerations for integration into the classroom. Together, our results argue for deliberate, evidence-first, human-in-the-loop evaluation.},
  booktitle = {Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems},
  articleno = {867},
  numpages = {20},
  keywords = {Writing evaluation, student essays, human-AI collaboration},
  location = {Yokohama, Japan},
  series = {CHI '26}
}

Acknowledgements

We sincerely thank all TAs, instructors, and students who participated in our study for generously sharing their insights. We also thank the reviewers for their valuable comments and suggestions.

About

This repo open-sources the prototype from the CHI '26 paper "EvaluAId: Human-AI Collaborative Evaluation of Open-Ended Student Essays."
