A Word Aligner for English

This is a word aligner for English: given two English sentences, it aligns related words in the two sentences. It exploits the semantic and contextual similarities of the words to make alignment decisions.

Acknowledgement

This project started as a fork of ma-sultan/monolingual-word-aligner, the aligner presented in Sultan et al. (2015), which has been very successful in the SemEval Semantic Textual Similarity (STS) task in recent years.

Install

# download the repo
git clone https://github.com/rgtjf/monolingual-word-aligner.git

# requires the stopwords corpus from NLTK
python -m nltk.downloader stopwords

# requires Stanford CoreNLP
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2015-12-09.zip
unzip stanford-corenlp-full-2015-12-09.zip

# launch the Stanford CoreNLP server
cd stanford-corenlp-full-2015-12-09/
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer
# once it has started, the Stanford CoreNLP server is available at http://localhost:9000/

python test_align.py
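
If test_align.py fails to connect, it may help to confirm that the CoreNLP server is actually answering before running it. The snippet below is a minimal sketch, not part of this repo: the helper name `corenlp_is_up` is hypothetical, and it only assumes the server URL given above.

```python
import urllib.request


def corenlp_is_up(url="http://localhost:9000/", timeout=3.0):
    """Return True if an HTTP server answers at `url` with status 200.

    Hypothetical helper for illustration; it only checks reachability,
    not that the server is a correctly configured CoreNLP instance.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, timeouts and refused connections
        # all derive from OSError.
        return False


if __name__ == "__main__":
    print("CoreNLP server reachable:", corenlp_is_up())
```

A result of False usually means the java command above is not running or is listening on a different port.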

Evaluate on STSBenchmark

sh download.sh
python run_stsbenchmark.py
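
STS systems are conventionally scored by the Pearson correlation between predicted and gold similarity scores. The sketch below shows that computation in plain Python; it is an illustration under that assumption, not the code in run_stsbenchmark.py, and the function name `pearson` is chosen here for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation and per-variable variation, both unnormalised;
    # the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Calling `pearson(predicted, gold)` on system outputs and gold labels yields a value in [-1, 1], where higher means closer agreement.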

Results

Methods (eval on STSbenchmark)	Dev	Test
aligner	0.6991	0.6379
idf_aligner	0.7969	0.7622
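
To illustrate the difference between the two rows, here is a minimal sketch of how a similarity score can be derived from an alignment, following the general idea in Sultan et al. (2015) of measuring the proportion of aligned words. Everything in it is an assumption made for illustration: the function name `alignment_score`, the input format (token lists plus index pairs), and the default weight of 1.0 for words missing from the IDF table. It may differ from what word_aligner actually does.

```python
def alignment_score(tokens1, tokens2, alignments, idf=None):
    """Share of (optionally IDF-weighted) words that take part in an alignment.

    tokens1, tokens2: lists of word strings.
    alignments: iterable of (i, j) index pairs, token i of sentence 1
                aligned with token j of sentence 2 (hypothetical format).
    idf: optional dict mapping lowercased words to weights; words that
         are missing get weight 1.0 (an assumption of this sketch).
    """
    def weight(word):
        return idf.get(word.lower(), 1.0) if idf else 1.0

    aligned1 = {i for i, _ in alignments}
    aligned2 = {j for _, j in alignments}
    total = sum(weight(w) for w in tokens1) + sum(weight(w) for w in tokens2)
    if total == 0:
        return 0.0
    matched = (sum(weight(tokens1[i]) for i in aligned1)
               + sum(weight(tokens2[j]) for j in aligned2))
    return matched / total


s1 = ["a", "dog", "runs"]
s2 = ["the", "dog", "sprints"]
pairs = [(1, 1), (2, 2)]  # dog-dog, runs-sprints
toy_idf = {"a": 0.1, "the": 0.1, "dog": 2.0, "runs": 3.0, "sprints": 3.0}

plain = alignment_score(s1, s2, pairs)              # 4 of 6 words aligned
weighted = alignment_score(s1, s2, pairs, toy_idf)  # unaligned words carry little weight
```

With IDF weighting, unaligned function words such as "a" and "the" contribute little to the denominator, so the weighted score for this toy pair is higher than the unweighted one.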

Reference

STSBenchmark board
