This data set was collected as part of a research project at LMU Munich (Germany), the University of Bayreuth (Germany), and University College Dublin (Ireland). The research project will be published in the proceedings of the ACM CHI Conference on Human Factors in Computing Systems in 2021 (CHI '21). You can find the project website here: http://www.medien.ifi.lmu.de/envisioned-va-dialogues/
The dataset consists of 1,854 written dialogues between a user and a voice assistant, which were envisioned by 205 people in an online survey. In particular, we asked participants to envision and write down dialogues with a perfect voice assistant without any technical limitations for nine scenarios (cf. below). In the survey instructions, we highlighted that the conversation could be initiated by either party, and also provided an example scenario with two example dialogues (one initiated by the user, the other by the voice assistant). Participants were then presented with the eight different scenarios in random order before concluding with an open scenario, where they were given the opportunity to think of another situation in which they would like to use the perfect voice assistant. For each scenario, participants were asked to first select who is speaking from a dropdown menu ("You" or "Voice assistant") and then write down what the selected speaker is saying.
We prompted participants with eight scenarios in a random order. These scenarios were based on the most popular use cases for Google Home and Amazon Alexa / Echo as identified by Ammari et al. (2019) from 250,000 command logs of users interacting with smart speakers. In addition, we included an open scenario where participants could describe a situation in which they would like to use a voice assistant. Each scenario contained a descriptive part and a specific issue that participants were asked to address and solve in their envisioned dialogue between a user and a perfect voice assistant.
These are the scenarios:
Name | Description & Issue |
---|---|
Search | You want to go to the cinema to see a film, but you do not know the film times for your local cinema. |
Music | You are cooking dinner. You are on your own and you like to listen to some music while cooking. |
Internet of Things | You are going to bed. You like to read a book before going to sleep. You often fall asleep with the lights on. |
Volume | You are listening to loud music, but your neighbours are sensitive to noise. |
Weather | You are planning a trip to Italy in two days but do not know what kind of clothing to pack. You like to be prepared for the weather. |
Joke | You and your friends are hanging out. You like to entertain your friends, but the group seems to have run out of funny stories. |
Conversational | You are going to bed, but you are having trouble falling asleep. |
Alarm | You are going to bed. You have an important meeting early next morning, and you tend to oversleep. |
Open Scenario | Participants were asked to think about another situation in which they would like to use the perfect voice assistant. |
In addition to writing down dialogues between a user and a voice assistant, we also asked participants about their experience with existing voice assistants and for demographic information. Furthermore, participants filled out the 60-item Big Five Inventory-2 (BFI-2) personality questionnaire by Soto and John (2017).
We recruited participants using the crowdsourcing web platform Prolific. After excluding three participants due to incomplete answers, our sample consisted of 205 participants from the UK (48.8% male, 50.7% female, 0.5% non-binary, mean age 36.2 years, range: 18--80 years).
We provide three versions of our data set:

- (1) Raw Data: Original data, cleaned up with labelled columns (`perfect-va_raw-data_20-09-09.csv`)
- (2) Preprocessed Data: Original data including the calculated personality scores based on the BFI-2 questionnaire (`perfect-va_preprocessed-data_20-09-09.csv`)
- (3) Readable Dialogues: For reading the dialogues rather than for data processing, we provide participants' IDs and the corresponding dialogues in a readable format (`perfect-va_readable-dialogues_20-09-09.csv`)
In addition, we provide detailed explanations of each item and its answer options in the file `perfect-va_codebook.csv`.
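The following sketch shows one possible way to load the three files in Python with pandas. It is not part of the data set: the file names are taken from this README, while the column contents are not specified here and should be checked against the codebook.

```python
# Minimal loading sketch (not part of the data set). File names are taken from
# this README; column contents should be checked against perfect-va_codebook.csv.
import pandas as pd

raw = pd.read_csv("perfect-va_raw-data_20-09-09.csv")
preprocessed = pd.read_csv("perfect-va_preprocessed-data_20-09-09.csv")
dialogues = pd.read_csv("perfect-va_readable-dialogues_20-09-09.csv")

print(raw.shape, preprocessed.shape, dialogues.shape)
print(dialogues.head())
```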
Please note that there are missing entries in the data set. Some participants forgot to indicate the speaker for a turn in the dialogue. Usually, the context makes clear who the speaker is, but this should be considered for automatic data processing. Dialogues are provided in their original spelling and may thus include spelling or grammar mistakes.
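For automatic processing, such turns can be flagged before analysis. The sketch below assumes a hypothetical column name `speaker` in the readable dialogues file; the actual column names are documented in `perfect-va_codebook.csv`.

```python
# Minimal sketch for flagging turns without an explicit speaker label.
# The column name "speaker" is an assumption; see perfect-va_codebook.csv
# for the actual column names.
import pandas as pd

dialogues = pd.read_csv("perfect-va_readable-dialogues_20-09-09.csv")

# Rows where the participant did not select a speaker from the dropdown menu.
missing_speaker = dialogues[dialogues["speaker"].isna()]
print(f"{len(missing_speaker)} turns without a speaker label")

# Inspect these turns manually; the surrounding dialogue context usually
# makes clear whether the user or the voice assistant is speaking.
print(missing_speaker.head())
```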
A full description of our research design, data collection, and analyses can be found in the paper below. Please cite this paper in your work where relevant.
@inproceedings{voelkel2021,
author = {V\"{o}lkel, Sarah Theres and Buschek, Daniel and Eiband, Malin and Cowan, Benjamin R. and Hussmann, Heinrich},
title = {Eliciting and Analysing Users' Envisioned Dialogues with Perfect Voice Assistants},
year = {2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
booktitle = {Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems},
location = {Yokohama, Japan},
series = {CHI '21}
}
This is the work of Sarah Theres Völkel, Daniel Buschek, Malin Eiband, Benjamin R. Cowan, and Heinrich Hussmann from LMU Munich, University of Bayreuth, and University College Dublin, made available under the Creative Commons Attribution 4.0 License.
In case you have any comments or questions, please contact sarah.voelkel@ifi.lmu.de