Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding

This repository contains the code and data for our EMNLP paper, Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding.

Requirements

GCC compiler (used to compile the C source file): See the guide for installing GCC.

Datasets

We collect an in-domain corpus for embedding training. For evaluation, we use the Restaurant and Laptop datasets from SemEval 2015 and SemEval 2016. Preprocessed versions of these datasets are provided in this repository.

Run the Code

Using the same datasets as ours

bash run_jasen.sh

This step runs the whole pipeline, from embedding training through neural network distillation to model evaluation. The --dataset option in the script specifies which prepared dataset (restaurant or laptop) to use. The generated embedding file is stored under ${dataset}. Prediction results for each dataset are written to /datasets/${dataset}/prediction.txt.

Preparing your own dataset

Create a new folder under /datasets for your new dataset. The in-domain unlabeled training corpus train.txt, used for joint topic embedding training, contains one document per line. The test set test.txt, used for evaluation, is in the following format:

line_id	aspect_label_id	sentiment_label_id	text
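
Before running the pipeline on a new dataset, it can help to confirm that every line of test.txt matches this layout. The following is a minimal sketch, not part of the repository: it assumes the four fields are separated by tabs and that both label ids are integers. The names LabeledExample, parse_test_line, and load_test_file are hypothetical.

```python
from typing import NamedTuple


class LabeledExample(NamedTuple):
    """One row of test.txt (hypothetical container, not from the repository)."""
    line_id: str
    aspect_label_id: int
    sentiment_label_id: int
    text: str


def parse_test_line(line: str) -> LabeledExample:
    """Split one test.txt line into its four fields.

    Assumption: fields are tab-separated and both label ids are integers.
    """
    # maxsplit=3 keeps any tab characters inside the text field intact.
    parts = line.rstrip("\n").split("\t", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 tab-separated fields, got {len(parts)}")
    line_id, aspect_id, sentiment_id, text = parts
    return LabeledExample(line_id, int(aspect_id), int(sentiment_id), text)


def load_test_file(path: str) -> list[LabeledExample]:
    """Parse every non-blank line of a test file, failing on the first bad row."""
    with open(path, encoding="utf-8") as f:
        return [parse_test_line(line) for line in f if line.strip()]
```

Calling load_test_file on the new test.txt raises a ValueError on the first malformed row, which makes formatting problems visible before any training starts.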

The keywords for each aspect and sentiment should be listed in aspect_w_kw.txt and senti_w_kw.txt, respectively. Each line corresponds to one aspect or sentiment category, and the line order must match the order of the aspect and sentiment label ids. Examples can be found in the prepared dataset folders.
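
Because the label ids in test.txt must line up with the line order of the keyword files, a quick consistency check can catch mismatches early. The sketch below is illustrative only and rests on two assumptions that should be verified against the prepared dataset folders: each keyword line is a whitespace-separated list of tokens, and label ids are zero-based line indices. The function names are hypothetical.

```python
from typing import Iterable


def parse_keyword_lines(lines: Iterable[str]) -> list[list[str]]:
    """Return one token list per category, in file order.

    Assumption: tokens on a line are separated by whitespace.
    """
    categories = []
    for line in lines:
        tokens = line.split()
        if tokens:  # skip blank lines
            categories.append(tokens)
    return categories


def out_of_range_ids(label_ids: Iterable[int], num_categories: int) -> list[int]:
    """List the distinct label ids that have no matching keyword line.

    Assumption: ids are zero-based, so valid ids are 0 .. num_categories - 1.
    """
    return sorted({i for i in label_ids if not 0 <= i < num_categories})
```

To use it, pass an open file handle for aspect_w_kw.txt (or senti_w_kw.txt) to parse_keyword_lines, then pass the corresponding label ids from test.txt and the number of categories to out_of_range_ids. An empty result means every label id has a keyword line.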
