Analyzing human reaction time for talker change detection

The ability to detect a change in the input is an essential aspect of perception. In speech communication, we use this ability to identify "talker changes" when listening to conversational speech (such as audio podcasts). In this paper, we design a novel experimental paradigm to improve our understanding of how fast listeners detect a change in talker, and of the acoustic features tracked to identify a voice. A listening experiment is designed in which listeners indicate the moment of perceived talker change in multi-talker speech utterances. We examine talker change detection (TCD) performance by probing the human reaction time (RT). A random forest regression is used to model the relationship between RTs and acoustic features. The findings suggest that: (i) RT is less than a second; (ii) RT can be predicted from the difference in acoustic features of the segments before and after the change; and (iii) there exists a significant dependence of RT on MFCC-D1 (delta MFCC) features between the segments of speech before and after the change instant. Further, a machine system designed for the same TCD task using speaker diarization principles performed poorly relative to the humans.
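The regression setup described above can be sketched as follows. This is an illustrative example, not the authors' code: the data is synthetic, and the feature dimensions and coefficients are hypothetical placeholders. It shows the general shape of the analysis, where the per-trial predictor is the difference in (delta-MFCC) features between the segments before and after the change instant, and the target is the trial's RT.

```python
# Hedged sketch: random forest regression of reaction time (RT) on
# before/after-change acoustic feature differences. Synthetic data only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

n_trials = 200  # number of talker-change trials (synthetic)
n_feats = 13    # e.g. 13 delta-MFCC coefficients per segment (assumed)

# Mean delta-MFCC features of the segments before and after the change
feats_before = rng.normal(size=(n_trials, n_feats))
feats_after = rng.normal(size=(n_trials, n_feats))

# Predictor: absolute feature difference across the change instant
X = np.abs(feats_after - feats_before)

# Synthetic RTs (seconds): a larger acoustic change yields faster
# detection, plus noise, mimicking the reported sub-second RTs
y = 1.0 - 0.05 * X.mean(axis=1) + rng.normal(scale=0.05, size=n_trials)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)
pred = model.predict(X)  # one predicted RT per trial
```

A benefit of the random forest here is that `model.feature_importances_` indicates which feature differences (e.g. which delta-MFCC coefficients) carry the most predictive weight for RT.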

This repository contains the data and code used in the study.

Publication link:

To be presented at ICASSP 2019 in Brighton, UK.

See you there!

Prior work:

The Journal of the Acoustical Society of America 145, 131 (2019)


Neeraj Kumar Sharma, Shobhana Ganesh, Sriram Ganapathy, Lori L. Holt

Contributors are associated with Carnegie Mellon University, Pittsburgh, and the Indian Institute of Science, Bangalore.

The manuscript is shared here for personal use only. Any other use requires prior permission of the authors.







