Code for ACL 2018 paper "Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction"


Code for our ACL 2018 paper "Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction"

Problem to Solve

Label "The retina display is great ." as "O B I O O O" so as to extract "retina display" as an aspect. Check this article for aspect-based sentiment analysis or this for domain representation learning.
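To make the labeling scheme concrete, here is a minimal sketch of how a B/I/O tag sequence can be decoded back into aspect phrases. The function name `extract_aspects` is hypothetical and is not part of this repository; it only illustrates the tagging convention described above.

```python
def extract_aspects(tokens, tags):
    """Collect token spans that start with B and continue with I."""
    aspects, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new aspect starts; close any span that is still open.
            if current:
                aspects.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open span: close the current span.
            if current:
                aspects.append(" ".join(current))
            current = []
    if current:
        aspects.append(" ".join(current))
    return aspects


tokens = "The retina display is great .".split()
tags = "O B I O O O".split()
print(extract_aspects(tokens, tags))  # ['retina display']
```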


All code is tested under Python 3.6.2 + PyTorch 0.2.0_4.

Steps to Run Code

Step 1: Download general embeddings (GloVe) and save them in data/embedding/gen.vec

Step 2: Download Domain Embeddings (You can find the link under this paper's title in ), save them in data/embedding

Step 3: Download and install fastText to fastText/

Step 4: Download official datasets to data/official_data/

Download official evaluation scripts to script/

We assume the following file names.

SemEval 2014 Laptop (




SemEval 2016 Restaurant (




Step 5: Run the following to build NumPy files for the general embeddings and domain embeddings.

python script/
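As a rough illustration of what building a NumPy embedding file involves, the sketch below reads text vectors in the common "word v1 v2 ..." layout and fills one matrix row per vocabulary word. It is not the repository's script: `build_embedding_matrix`, `word2idx`, and the toy 3-dimensional vectors are assumptions made for the example.

```python
import io

import numpy as np


def build_embedding_matrix(lines, word2idx, dim):
    """Fill one row per vocabulary word from 'word v1 ... v_dim' text lines."""
    matrix = np.zeros((len(word2idx), dim), dtype=np.float32)
    found = set()
    for line in lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        # Lines with the wrong width (e.g. a "count dim" header) are skipped.
        if word in word2idx and len(values) == dim:
            matrix[word2idx[word]] = [float(v) for v in values]
            found.add(word)
    # Words without a pretrained vector keep a zero row and are reported.
    oov = [w for w in word2idx if w not in found]
    return matrix, oov


# Toy stand-in for an embedding text file (hypothetical contents).
demo = io.StringIO("retina 0.1 0.2 0.3\ndisplay 0.4 0.5 0.6\n")
word2idx = {"<pad>": 0, "retina": 1, "display": 2, "touchpad": 3}
matrix, oov = build_embedding_matrix(demo, word2idx, dim=3)
print(matrix.shape, oov)
```

The resulting array could then be written with `np.save`, and the `oov` list is the kind of word list that the next step feeds to fastText.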

Step 6: Fill in out-of-vocabulary (OOV) embeddings

./fastText/fasttext print-word-vectors data/embedding/laptop_emb.vec.bin < data/prep_data/laptop_emb.vec.oov.txt > data/prep_data/laptop_oov.vec

./fastText/fasttext print-word-vectors data/embedding/restaurant_emb.vec.bin < data/prep_data/restaurant_emb.vec.oov.txt > data/prep_data/restaurant_oov.vec

python script/
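The `print-word-vectors` commands above write one line per input word, with the word followed by its vector components. A minimal sketch of merging such output into an existing embedding matrix might look like the following; `fill_oov`, `word2idx`, and the toy values are hypothetical and do not come from the repository's script.

```python
import numpy as np


def fill_oov(matrix, word2idx, oov_lines):
    """Overwrite rows of `matrix` with vectors from 'word v1 ... vn' lines."""
    filled = 0
    for line in oov_lines:
        parts = line.split()
        if len(parts) != matrix.shape[1] + 1:
            continue  # skip blank or malformed lines
        word = parts[0]
        if word in word2idx:
            matrix[word2idx[word]] = [float(v) for v in parts[1:]]
            filled += 1
    return filled


# Toy 2-dimensional example; real domain embeddings are much wider.
matrix = np.zeros((3, 2), dtype=np.float32)
word2idx = {"retina": 0, "touchpad": 1, "ssd": 2}
oov_lines = ["touchpad 0.5 -0.25", "ssd 1.0 2.0", ""]
print(fill_oov(matrix, word2idx, oov_lines))  # 2
```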

Step 7: Train the laptop model

python script/

Train the restaurant model

python script/ --domain restaurant 
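For readers wondering what "double embeddings" means at the input of the model, the sketch below assumes that each token's general and domain vectors are looked up separately and concatenated along the feature dimension before the CNN layers. The sizes, the random matrices, and the name `double_embed` are illustrative assumptions, not the repository's training code.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, gen_dim, domain_dim = 5, 4, 2  # toy sizes chosen for illustration
gen_emb = rng.normal(size=(vocab_size, gen_dim))        # stand-in for general embeddings
domain_emb = rng.normal(size=(vocab_size, domain_dim))  # stand-in for domain embeddings


def double_embed(token_ids, gen_emb, domain_emb):
    """Look up both embeddings per token and concatenate them feature-wise."""
    ids = np.asarray(token_ids)
    return np.concatenate([gen_emb[ids], domain_emb[ids]], axis=-1)


features = double_embed([1, 3, 3, 0], gen_emb, domain_emb)
print(features.shape)  # (4, 6)
```

Each row of `features` is what a sequence-labeling network would consume for one token under this assumption.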

Step 8: Evaluate Laptop dataset

python script/

Evaluate Restaurant dataset

python script/ --domain restaurant 
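The official evaluation scripts remain the reference for reported numbers. As a quick sanity check only, the sketch below computes exact-match precision, recall, and F1 over aspect spans for a single sentence; `spans` and `span_f1` are hypothetical helpers, and a corpus-level score would sum the span counts over all sentences before dividing.

```python
def spans(tags):
    """Return the set of (start, end) index pairs for B/I runs, end exclusive."""
    found, start = set(), None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        if start is not None and tag != "I":
            found.add((start, i))
            start = None
        if tag == "B":
            start = i
    return found


def span_f1(gold_tags, pred_tags):
    """Exact-match precision, recall, and F1 over aspect spans."""
    gold, pred = spans(gold_tags), spans(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


gold = "O B I O O O".split()
pred = "O B I O B O".split()  # one correct span plus one spurious span
print(span_f1(gold, pred))
```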


If you find our code useful, please cite our paper.

  author    = {Xu, Hu and Liu, Bing and Shu, Lei and Yu, Philip S.},
  title     = {Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction},
  booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics},
  publisher = {Association for Computational Linguistics},
  year      = {2018}