
Emotion Dataset

This dataset is intended for emotion classification. It has already been preprocessed following the approach described in our paper, and it is stored as a pandas DataFrame, ready to be used in an NLP pipeline.

Note that the version provided here is a six-emotion variant intended for educational and research purposes.


Hugging Face:

Download link:

Papers with Code Public Leaderboard:

Load the Dataset Using Pandas

import pandas as pd

# Load the pickled DataFrame; merged_training.pkl must be in the working directory
df = pd.read_pickle("merged_training.pkl")
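
After loading, a quick look at the label distribution can be a useful sanity check. The following is a minimal sketch, not part of the original repository: the helper `summarize_labels`, the column names `text` and `emotions`, and the tiny stand-in frame are all assumptions for illustration. Check `df.columns` on the real file and substitute the actual names.

```python
import pandas as pd


def summarize_labels(df: pd.DataFrame, label_col: str) -> pd.DataFrame:
    """Return per-label counts and shares, most frequent label first."""
    if label_col not in df.columns:
        raise KeyError(f"column {label_col!r} not found; available: {list(df.columns)}")
    counts = df[label_col].value_counts()  # sorted by count, descending
    return pd.DataFrame({"count": counts, "share": counts / len(df)})


# Hypothetical stand-in for the real data; the column names are assumptions.
sample = pd.DataFrame(
    {
        "text": [
            "i feel great today",
            "i am so scared",
            "i feel wonderful",
            "this makes me furious",
        ],
        "emotions": ["joy", "fear", "joy", "anger"],
    }
)

summary = summarize_labels(sample, "emotions")
print(summary)
```

With the real data, the same call would be `summarize_labels(df, <label column>)`, using whichever column holds the emotion labels.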


Here is a notebook showing how to use it for fine-tuning a pretrained language model for the task of emotion classification.
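
Before fine-tuning a classifier, the string labels usually need to be mapped to integer ids and a validation set held out. The sketch below shows one way to do this with pandas alone. It is not taken from the notebook: the helpers `encode_labels` and `stratified_split`, the column names, and the stand-in frame are hypothetical.

```python
import pandas as pd


def encode_labels(df: pd.DataFrame, label_col: str):
    """Add an integer label_id column; also return the id-to-label mapping."""
    labels = sorted(df[label_col].unique())
    label2id = {label: i for i, label in enumerate(labels)}
    encoded = df.assign(label_id=df[label_col].map(label2id))
    id2label = {i: label for label, i in label2id.items()}
    return encoded, id2label


def stratified_split(df: pd.DataFrame, label_col: str, val_fraction: float = 0.2, seed: int = 0):
    """Hold out val_fraction of each label's rows; assumes a unique index."""
    val = df.groupby(label_col).sample(frac=val_fraction, random_state=seed)
    train = df.drop(val.index)
    return train, val


# Hypothetical stand-in for the real data; the column names are assumptions.
sample = pd.DataFrame(
    {
        "text": ["happy", "glad", "afraid", "scared", "mad", "furious"],
        "emotions": ["joy", "joy", "fear", "fear", "anger", "anger"],
    }
)

encoded, id2label = encode_labels(sample, "emotions")
train, val = stratified_split(encoded, "emotions", val_fraction=0.5)
print(id2label)
print(len(train), len(val))
```

Sampling within each label group keeps the label proportions of the validation set close to those of the full data, which matters when some emotions are much rarer than others.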

Here is another notebook showing how to fine-tune a T5 model for emotion classification along with other tasks.

A fine-tuned model is also hosted on Hugging Face, which you can use directly for inference in your NLP pipeline.

Feel free to reach out to me on Twitter with any questions about the dataset.
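
T5 treats every task as text-to-text, so a classification example becomes an input string paired with the label word as the target string. The sketch below illustrates that conversion. It is an assumption-laden example rather than the notebook's code: the helper `to_text_to_text`, the `"emotion: "` prefix, and the column names are hypothetical.

```python
import pandas as pd


def to_text_to_text(
    df: pd.DataFrame, text_col: str, label_col: str, prefix: str = "emotion: "
) -> pd.DataFrame:
    """Build (input_text, target_text) pairs for a text-to-text model."""
    return pd.DataFrame(
        {
            # The task prefix is a hypothetical choice, not a fixed convention.
            "input_text": prefix + df[text_col].astype(str),
            "target_text": df[label_col].astype(str),
        }
    )


# Hypothetical stand-in for the real data; the column names are assumptions.
sample = pd.DataFrame(
    {
        "text": ["i feel great today", "i am so scared"],
        "emotions": ["joy", "fear"],
    }
)

pairs = to_text_to_text(sample, "text", "emotions")
print(pairs)
```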


The dataset should be used for educational and research purposes only. If you use it, please cite:

@inproceedings{saravia-etal-2018-carer,
    title = "{CARER}: Contextualized Affect Representations for Emotion Recognition",
    author = "Saravia, Elvis  and
      Liu, Hsien-Chi Toby  and
      Huang, Yen-Hao  and
      Wu, Junlin  and
      Chen, Yi-Shin",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "",
    doi = "10.18653/v1/D18-1404",
    pages = "3687--3697",
    abstract = "Emotions are expressed in nuanced ways, which varies by collective or individual experiences, knowledge, and beliefs. Therefore, to understand emotion, as conveyed through text, a robust mechanism capable of capturing and modeling different linguistic nuances and phenomena is needed. We propose a semi-supervised, graph-based algorithm to produce rich structural descriptors which serve as the building blocks for constructing contextualized affect representations from text. The pattern-based representations are further enriched with word embeddings and evaluated through several emotion recognition tasks. Our experimental results demonstrate that the proposed method outperforms state-of-the-art techniques on emotion recognition tasks.",
}

