This is the public repository for the EVOKE (Emotion Vocabulary Of Korean and English) dataset.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The dataset offers comprehensive coverage of emotion words in each language, many-to-many translations between words in the two languages, and identification of language-specific emotion words (or cross-lingual lexical gaps). It contains 1,426 Korean words and 1,397 English words, and we systematically annotate 819 Korean and 924 English adjectives and verbs.
The dataset consists of three separate components (see Figure 1):
Each dataset component includes its own codebook describing file contents, variable definitions, and annotation labels. Please refer to the README.md file inside each dataset folder for documentation of each dataset component.
Please refer to the paper for annotation criteria, construction of translation mappings, and descriptive statistics of the dataset.
Jung, Y., Shin, H., & Bergen, B. K. (2026). EVOKE: Emotion Vocabulary Of Korean and English. arXiv [cs.CL]. Retrieved from http://arxiv.org/abs/2602.10414

