Code and data for EMNLP 2018 paper "Cross-lingual Lexical Sememe Prediction"

Cross-lingual Lexical Sememe Prediction

This is the open-source code of the EMNLP 2018 paper Cross-lingual Lexical Sememe Prediction [pdf].

Introduction

Sememes are defined as the minimum semantic units of human languages. As important knowledge sources, sememe-based linguistic knowledge bases have been widely used in many NLP tasks. However, most languages still do not have sememe-based linguistic knowledge bases. Thus we present a task of cross-lingual lexical sememe prediction (CLSP), aiming to automatically predict sememes for words in other languages. We propose a novel framework to model correlations between sememes and multi-lingual words in low-dimensional semantic space for sememe prediction. Experimental results on real-world datasets show that our proposed model achieves consistent and significant improvements as compared to baseline methods in cross-lingual sememe prediction.

Usage

bash run.sh

To change the training corpus, edit the -mono-train1 and -mono-train2 parameters in run.sh. Note that lang1 refers to the source language and lang2 refers to the target language.
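
As a rough illustration of the parameter change described above, the sketch below rewrites the values of the two corpus flags inside a script file. Only the flag names `-mono-train1` and `-mono-train2` come from this README; the `set_corpora` helper, the `./bin/train` command line, and all corpus paths are hypothetical placeholders, and the sketch assumes each flag is followed by a single path without spaces or `|` characters.

```shell
#!/usr/bin/env bash
set -eu

# set_corpora SCRIPT SRC_CORPUS TGT_CORPUS   (hypothetical helper)
# Rewrites the value after -mono-train1 (source language, lang1) and
# -mono-train2 (target language, lang2) in SCRIPT.
set_corpora() {
  script=$1
  src=$2
  tgt=$3
  tmp=$(mktemp)
  sed -E \
    -e "s|(-mono-train1[[:space:]]+)[^[:space:]]+|\1${src}|" \
    -e "s|(-mono-train2[[:space:]]+)[^[:space:]]+|\1${tgt}|" \
    "$script" > "$tmp"
  # Write back through the original file so its permissions are kept.
  cat "$tmp" > "$script"
  rm -f "$tmp"
}

# Demo on a throwaway file; the command line below is made up.
demo=$(mktemp)
echo './bin/train -mono-train1 data/old_zh.txt -mono-train2 data/old_en.txt' > "$demo"
set_corpora "$demo" data/new_zh.txt data/new_en.txt
result=$(cat "$demo")
echo "$result"
rm -f "$demo"
```

Before pointing such a helper at the repository's launch script, it is worth checking how the two flags actually appear there, since the real command line may be formatted differently from the demo line used here.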

Datasets

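| Process  | Type                        | Source                                              | Target      |
|----------|-----------------------------|-----------------------------------------------------|-------------|
| Training | Corpus                      | Sogou-T                                             | Wikipedia   |
|          | Seed Lexicon                | Google Translate API                                |             |
|          | Sememe-based KB             | HowNet_zh                                           | -           |
| Testing  | Sememe Prediction           | -                                                   | HowNet_en   |
|          | Bilingual Lexicon Induction | Chinese-English Translation Lexicon 3.0 Version     |             |
|          | Word Similarity Computation | Wordsim-240                                         | WordSim-353 |
|          |                             | WordSim-297                                         | SimLex-999  |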

Cite

If the code or datasets help you, please cite the following paper:

@InProceedings{qi2018cross,
  Title      = {Cross-lingual lexical sememe prediction},
  Author     = {Qi, Fanchao and Lin, Yankai and Sun, Maosong and Zhu, Hao and Xie, Ruobing and Liu, Zhiyuan},
  Booktitle  = {Proceedings of EMNLP},
  Year       = {2018},
}