Leveraging the Wikipedia Graph for Evaluating Word Embeddings

Requirements

pip install -r requirements.txt

Calculating WALES

A minimal code example for using the WALES metric can be found in main.py:

import utils
import wales

1. Loading the Wikipedia subgraph

graph, idx2word, word2idx = utils.load_graph()

graph is a networkx graph; idx2word and word2idx are dictionaries that translate between node indices and article names.
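A minimal sketch of translating between article names and node indices (the article name 'Dog' is only an illustration and may not be a node in the provided subgraph):

idx = word2idx['Dog']                          # article name -> node index
neighbor_ids = list(graph.neighbors(idx))      # adjacent articles in the Wikipedia subgraph
print([idx2word[i] for i in neighbor_ids])     # node indices -> article names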

2. Loading an embedding

emb_dict = utils.load_embedding('glove_50.p')

emb_dict has to be a dictionary (key=word, value=vector).

Note that, for memory reasons, we provide only the pretrained GloVe (d=50) embedding used in the paper. To reproduce all results from the paper, download the remaining embeddings and convert them into dictionaries for the nodes in the graph.
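As a rough sketch of such a conversion, the snippet below reads a downloaded GloVe text file and keeps only the words that are nodes of the graph. The input and output file names are examples, not files shipped with this repository, and the pickle format is assumed to match what utils.load_embedding expects:

import pickle
import numpy as np

emb_dict = {}
with open('glove.6B.300d.txt', encoding='utf-8') as f:
    for line in f:
        parts = line.rstrip().split(' ')
        if parts[0] in word2idx:                              # keep only words that are graph nodes
            emb_dict[parts[0]] = np.array(parts[1:], dtype=np.float32)

with open('glove_300.p', 'wb') as f:                          # assumed pickle format, mirroring 'glove_50.p'
    pickle.dump(emb_dict, f)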

3. Loading challenges

We evaluate WALES with a set of challenges. To load the human benchmark dataset, use

challenges = utils.load_challenges()

challenges has to be a list of word tuples whose words occur in word2idx.
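Custom challenge sets can be built the same way; a small hypothetical example (the word pairs below are made up and should be filtered against word2idx):

custom_challenges = [('dog', 'cat'), ('car', 'bicycle')]          # hypothetical pairs
custom_challenges = [pair for pair in custom_challenges
                     if all(word in word2idx for word in pair)]    # keep only pairs covered by the graph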

4. WALES

Simply instantiate the WALES class and call its .evaluate method.

metric = wales.WALES(graph, word2idx, idx2word, emb_dict, gamma=1.0)
score = metric.evaluate(challenges, verbose=True)
print(score)

In this example, the embedding achieves a WALES score of about 0.54.

>>> 0.5437351135473592
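The same pattern can be used to compare several embeddings against each other; a sketch in which only 'glove_50.p' is actually shipped with the repository and any additional file names are placeholders:

for emb_file in ['glove_50.p']:                                   # add further embedding pickles here
    emb_dict = utils.load_embedding(emb_file)
    metric = wales.WALES(graph, word2idx, idx2word, emb_dict, gamma=1.0)
    print(emb_file, metric.evaluate(challenges, verbose=False))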
