# Evaluation of Poincare Embeddings

This notebook demonstrates how well poincare embeddings trained using this [implementation](https://github.com/TatsuyaShirakawa/poincare-embedding) perform on the tasks detailed in the [original paper](https://arxiv.org/pdf/1705.08039.pdf).

This is the list of tasks - 
1. WordNet reconstruction
2. WordNet link prediction
3. Link prediction in collaboration networks
4. Lexical entailment on HyperLex

A more detailed explanation of the tasks and the evaluation methodology is present in the individual evaluation subsections.

## Tables

## 1. Setup

Clone the [poincare-embedding](https://github.com/TatsuyaShirakawa/poincare-embedding) repository and follow the README to compile the sources into a binary. Set the variable below to the directory containing the `poincare-embedding` directory.

In [5]:
# Set this to the path of the directory containing the poincare-embedding directory
parent_directory = '/home/jayant/projects/'

## 2. Training

### 2.1 Create the data

In [6]:
import os

# These directories are auto created in the current directory for storing poincare datasets and models
data_directory = 'poincare_data'
models_directory = os.path.join(data_directory, 'models')

# Create directories
! mkdir {data_directory}
! mkdir {models_directory}

In [7]:
# Create the WordNet data
wordnet_file = os.path.join(data_directory, 'wordnet_noun_hypernyms.tsv')
! python {parent_directory}/poincare-embedding/scripts/create_wordnet_noun_hierarchy.py {wordnet_data_file}

In [8]:
hyperlex_url = "http://people.ds.cam.ac.uk/iv250/paper/hyperlex/hyperlex-data.zip"
! wget {hyperlex_url} -P {data_directory}
! unzip {data_directory}/hyperlex-data.zip -d {data_directory}
hyperlex_file = os.path.join(data_directory, 'nouns-verbs', 'hyperlex-nouns.txt')

### 2.2 Traing C++ embeddings

In [9]:
from gensim.utils import check_output

def train_cpp_model(binary_path, data_file, output_file, dim, epochs, neg, num_threads, seed=0):
    """Train a poincare embedding using the c++ implementation
    
    Args:
        binary_path (str): Path to the compiled c++ implementation binary
        
    """
    args = {
        'dim': dim,
        'max_epoch': epochs,
        'neg_size': neg,
        'num_thread': num_threads,
        'learning_rate_init': 0.1,
        'learning_rate_final': 0.0001,
    }
    cmd = [binary_path, data_file, output_file]
    for option, value in args.items():
        cmd.append("--%s" % option)
        cmd.append(str(value))

    return check_output(args=cmd)

In [10]:
cpp_binary_path = os.path.join(parent_directory, 'poincare-embedding', 'work', 'poincare_embedding')

In [11]:
model_sizes = [5, 10, 20, 50, 100]
neg_sizes = [10, 20]
epochs = [50, 100]
threads = [8, 1]

In [12]:
model_files = {}

In [13]:
# Possibly re-write with permutations instead of nested loops?
for epochs_ in epochs:
    for threads_ in threads:
        for neg_size in neg_sizes:
            model_name = 'cpp_epochs_%d_threads_%d_neg_%d' % (epochs_, threads_, neg_size)
            model_files[model_name] = {}
            for model_size in model_sizes:
                output_file_name = '%s_dim_%d' % (model_name, model_size)
                output_file = os.path.join(models_directory, output_file_name)
                print('Training model with size %d neg %d threads %d epochs %d, saving to %s' %
                     (model_size, neg_size, threads_, epochs_, output_file))
                out = train_cpp_model(
                    cpp_binary_path, wordnet_data_file, output_file,
                    model_size, epochs_, neg_size, threads_, seed=0)
                model_files[model_name][model_size] = output_file

Training model with size 5 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_5
Training model with size 10 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_10
Training model with size 20 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_20
Training model with size 50 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_50
Training model with size 100 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_100
Training model with size 5 neg 20 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_20_dim_5
Training model with size 10 neg 20 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_20_dim_10
Training model with size 20 neg 20 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_20_dim_20
Tr

## 3. Loading the embeddings

In [14]:
embeddings = {}

### 3.1 C++ embeddings

In [16]:
import os
import pickle
import re

from gensim.models.keyedvectors import KeyedVectors
import numpy as np
from pygtrie import Trie
from scipy.spatial.distance import euclidean, pdist
from smart_open import smart_open

def transform_cpp_embedding_to_kv(input_file, output_file, encoding='utf8'):
    """Given a C++ embedding tsv filepath, converts it to a KeyedVector-supported file"""
    with smart_open(input_file, 'rb') as f:
        lines = [line.decode(encoding) for line in f]
    if not len(lines):
         raise ValueError("file is empty")
    first_line = lines[0]
    parts = first_line.rstrip().split("\t")
    model_size = len(parts) - 1
    vocab_size = len(lines)
    with open(output_file, 'w') as f:
        f.write('%d %d\n' % (vocab_size, model_size))
        for line in lines:
            f.write(line.replace('\t', ' '))

        
class PoincareEmbedding(object):
    """Load and perform distance operations on poincare embedding"""

    def __init__(self, keyed_vectors):
        """Initialize PoincareEmbedding via a KeyedVectors instance"""
        self.kv = keyed_vectors
        self.init_key_trie()
        
    def init_key_trie(self):
        """Setup trie containing vocab keys for quick prefix lookups"""
        self.key_trie = Trie()
        for key in self.kv.vocab:
            self.key_trie[key] = True
    
    @staticmethod
    def poincare_dist(vector_1, vector_2):
        """Return poincare distance between two vectors"""
        norm_1 = np.linalg.norm(vector_1)
        norm_2 = np.linalg.norm(vector_2)
        euclidean_dist = euclidean(vector_1, vector_2)
        return np.arccosh(
            1 + 2 * (
                (euclidean_dist ** 2) / ((1 - norm_1 ** 2) * (1 - norm_2 ** 2))
            )
        )
        
    @classmethod
    def load_poincare_cpp(cls, input_filename):
        """Load embedding trained via C++ Poincare model

        Args:
            filepath (str): Path to tsv file containing embedding

        Returns:
            PoincareEmbedding instance

        """
        keyed_vectors_filename = input_filename + '.kv'
        transform_cpp_embedding_to_kv(input_filename, keyed_vectors_filename)
        keyed_vectors = KeyedVectors.load_word2vec_format(keyed_vectors_filename)
        os.unlink(keyed_vectors_filename)
        return cls(keyed_vectors)

    @classmethod
    def load_poincare_numpy(cls, input_filename):
        """Load embedding trained via Python numpy Poincare model

        Args:
            filepath (str): Path to pkl file containing embedding

        Returns:
            PoincareEmbedding instance

        """
        keyed_vectors_filename = input_filename + '.kv'
        transform_numpy_embedding_to_kv(input_filename, keyed_vectors_filename)
        keyed_vectors = KeyedVectors.load_word2vec_format(keyed_vectors_filename)
        os.unlink(keyed_vectors_filename)
        return cls(keyed_vectors)
    
    def find_matching_keys(self, word):
        """Find all senses of given word in embedding vocabulary"""
        matches = self.key_trie.items('%s.' % word)
        matching_keys = [''.join(key_chars) for key_chars, value in matches]
        return matching_keys

    def get_vector(self, term):
        """Return vector for given term"""
        return self.kv.word_vec(term)
        
    def get_all_distances(self, term):
        """Return distances to all terms for given term, including itself"""
        term_vector = self.kv.word_vec(term)
        all_vectors = self.kv.syn0
        
        euclidean_dists = np.linalg.norm(term_vector - all_vectors, axis=1)
        norm = np.linalg.norm(term_vector)
        all_norms = np.linalg.norm(all_vectors, axis=1)
        return np.arccosh(
            1 + 2 * (
                (euclidean_dists ** 2) / ((1 - norm ** 2) * (1 - all_norms ** 2))
            )
        )
        
    def get_distance(self, term_1, term_2):
        """Returns distance between vectors for input terms

        Args:
            term_1 (str)
            term_2 (str)

        Returns:
            Poincare distance between the two terms (float)
        
        Note:
            Raises KeyError if either term_1 or term_2 is absent from vocabulary

        """
        vector_1, vector_2 = self.kv[term_1], self.kv[term_2]
        return self.poincare_dist(vector_1, vector_2)

In [35]:
for model_name, models in model_files.items():
    embeddings[model_name] = {}
    for model_size, model_file in models.items():
        embeddings[model_name][model_size] = PoincareEmbedding.load_poincare_cpp(model_file)

### 3.2 Numpy embeddings
TODO

## 4. Evaluation

In [95]:
from prettytable import PrettyTable

def display_results(task_name, results):
    """Display evaluation results of multiple embeddings on a single task in a tabular format
    
    Args:
        task_name (str): name the task being evaluated
        results (dict): mapping between embeddings and corresponding results
    
    """
    header = PrettyTable()
    # TODO: infer widths from table rather than hard-coding
    header.field_names = [" " * 42, " " * 7 + "Model Dimensions" + " " * 8]

    data = PrettyTable()
    data.field_names = ["Model Description", "Metric"] + [str(dim) for dim in sorted(model_sizes)]
    for model_name, model_results in results.items():
        metrics = [metric for metric in model_results.keys()]
        dims = sorted([dim for dim in model_results[metrics[0]].keys()])
        row = [model_name, '\n'.join(metrics)]
        for dim in dims:
            scores = ['%.2f' % model_results[metric][dim] for metric in metrics]
            row.append('\n'.join(scores))
        data.add_row(row)
    
    header_lines = header.get_string(start=0, end=0).split("\n")[:2]
    print('Results for %s task' % task_name)
    print("\n".join(header_lines))
    print(data)        

### 4.1 WordNet reconstruction

In [79]:
import csv
from collections import defaultdict
import itertools


class ReconstructionEvaluation(object):
    """Evaluating reconstruction on given network for given embedding"""
    def __init__(self, filepath, embedding):
        """Initialize evaluation instance with tsv file containing relation pairs and embedding to be evaluated
        
        Args:
            filepath (str): path to tsv file containing relation pairs
            embedding (PoincareEmbedding instance): embedding to be evaluated
        
        Returns
            ReconstructionEvaluation instance

        """
        items = set()
        embedding_vocab = embedding.kv.vocab
        relations = defaultdict(set)
        with smart_open(filepath, 'r') as f:
            reader = csv.reader(f, delimiter='\t')
            for row in reader:
                assert len(row) == 2, 'Hypernym pair has more than two items'
                item_1_index = embedding_vocab[row[0]].index
                item_2_index = embedding_vocab[row[1]].index
                relations[item_1_index].add(item_2_index)
                items.update([item_1_index, item_2_index])
        self.items = items
        self.relations = relations
        self.embedding = embedding
    
    
    @staticmethod
    def get_positive_relation_ranks(distances, positive_relations):
        """
        Given a numpy array of all distances from an item and indices of its positive relations,
        compute ranks of positive relations
        
        Args:
            distances (numpy float array): np array of all distances for a specific item
            positive_relations (list): list of indices of positive relations for the item
        
        Returns:
            list of ranks of positive items in the same order as `positive_indices`
        """
        positive_relation_distances = distances[positive_relations]
        negative_relation_distances = np.ma.array(distances, mask=False)
        negative_relation_distances.mask[positive_relations] = True
        # Compute how many negative relation distances are less than each positive relation distance, plus 1 for rank
        ranks = (negative_relation_distances < positive_relation_distances[:, np.newaxis]).sum(axis=1) + 1
        return list(ranks) 
    
    def evaluate_metric(self, metric, max_n=None):
        """Evaluate given metric for the reconstruction task
            
        Args:
            metric (str): accepted values are 'mean_rank' and 'MAP'
            max_n (int or None): Maximum number of positive relations to evaluate, all if max_n is None
        
        Returns:
            Computed value of given metric (float)

        """
        if metric == 'mean_rank':
            return self.evaluate_mean_rank(max_n)
        elif metric == 'MAP':
            return self.evaluate_map(max_n)
        else:
            raise ValueError('Invalid value for metric')

    def evaluate_mean_rank(self, max_n=None):
        """Evaluate mean rank and MAP for reconstruction
            
        Args:
            max_n (int or None): Maximum number of positive relations to evaluate, all if max_n is None
        
        Returns:
            Computed value of mean rank (float)

        """
        ranks = []
        for i, item in enumerate(self.items, start=1):
            if item not in self.relations:
                continue
            item_relations = list(self.relations[item])
            item_term = self.embedding.kv.index2word[item]
            item_distances = self.embedding.get_all_distances(item_term)
            positive_relation_ranks = self.get_positive_relation_ranks(item_distances, item_relations)
            ranks += positive_relation_ranks
            if max_n is not None and i > max_n:
                break
        return np.mean(ranks)
    
    def evaluate_map(self, max_n=None):
        """Evaluate MAP (Mean Average Precision) for reconstruction
            
        Args:
            max_n (int or None): Maximum number of positive relations to evaluate, all if max_n is None
        
        Returns:
            Computed value of MAP (float)

        """
        raise NotImplementedError

In [45]:
reconstruction_results = {}
metrics = ['mean_rank']

In [49]:
for model_name, models in embeddings.items():
    reconstruction_results[model_name] = {}
    for metric in metrics:
        reconstruction_results[model_name][metric] = {}
    for model_size, embedding in models.items():
        print('Evaluating model %s of size %d' % (model_name, model_size))
        eval_instance = ReconstructionEvaluation(wordnet_file, embedding)
        for metric in metrics:
            reconstruction_results[model_name][metric][model_size] = eval_instance.evaluate_metric(metric, max_n=1000)

Evaluating model cpp_epochs_50_threads_8_neg_10 of size 20
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 10
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 100
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 50
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 5
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 20
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 10
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 100
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 50
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 5
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 20
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 10
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 100
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 50
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 5


In [66]:
display_results('WordNet Reconstruction', reconstruction_results)

Results for WordNet Reconstruction task
+--------------------------------------------+---------------------------------+
|                                            |        Model Dimensions         |
+--------------------------------+-----------+--------+--------+-------+-------+-------+
|       Model Description        |   Metric  |   5    |   10   |   20  |   50  |  100  |
+--------------------------------+-----------+--------+--------+-------+-------+-------+
| cpp_epochs_50_threads_8_neg_10 | mean_rank | 268.44 | 129.66 | 86.78 | 76.58 | 71.15 |
| cpp_epochs_50_threads_8_neg_20 | mean_rank | 251.71 | 145.56 | 96.18 | 72.44 | 57.26 |
| cpp_epochs_50_threads_1_neg_10 | mean_rank | 325.21 | 107.59 | 71.23 | 61.48 | 60.01 |
+--------------------------------+-----------+--------+--------+-------+-------+-------+


### 4.2 WordNet link prediction

#### 4.2.1 Preparing data

In [68]:
import random

def train_test_split(data_file, test_ratio=0.1):
    """Creates train and test files from given data file, returns train/test file names
    
    Args:
        data_file (str): path to data file for which train/test split is to be created
        test_ratio (float): fraction of lines to be used for test data
    
    Returns
        (train_file, test_file): tuple of strings with train file and test file paths
    """
    root_nodes, leaf_nodes = get_root_and_leaf_nodes(data_file)
    test_line_candidates = []
    line_count = 0
    all_nodes = set()
    with open(data_file, 'rb') as f:
        for i, line in enumerate(f):
            node_1, node_2 = line.split()
            all_nodes.update([node_1, node_2])
            if (
                    node_1 not in leaf_nodes
                    and node_2 not in leaf_nodes
                    and node_1 not in root_nodes
                    and node_2 not in root_nodes
                ):
                test_line_candidates.append(i)
            line_count += 1

    num_test_lines = int(test_ratio * line_count)
    if num_test_lines > len(test_line_candidates):
        raise ValueError('Not enough candidate relations for test set')
    print('Choosing %d test lines from %d candidates' % (num_test_lines, len(test_line_candidates)))
    test_line_indices = set(random.sample(test_line_candidates, num_test_lines))
    train_line_indices = set(l for l in range(line_count) if l not in test_line_indices)
    
    train_filename = data_file + '.train'
    test_filename = data_file + '.test'
    train_set_nodes = set()
    with open(data_file, 'rb') as f:
        train_file = open(train_filename, 'wb')
        test_file = open(test_filename, 'wb')
        for i, line in enumerate(f):
            if i in train_line_indices:
                train_set_nodes.update(line.split())
                train_file.write(line)
            elif i in test_line_indices:
                test_file.write(line)
            else:
                raise AssertionError('Line %d not present in either train or test line indices' % i)
        train_file.close()
        test_file.close()
    assert len(train_set_nodes) == len(all_nodes), 'Not all nodes from dataset present in train set relations'
    return (train_filename, test_filename)

In [69]:
def get_root_and_leaf_nodes(data_file):
    """Return keys of root and leaf nodes from a file with transitive closure relations
    
    Args:
        data_file(str): file path containing transitive closure relations
    
    Returns:
        (root_nodes, leaf_nodes) - tuple containing keys of root and leaf nodes
    """
    root_candidates = set()
    leaf_candidates = set()
    with open(data_file, 'rb') as f:
        for line in f:
            nodes = line.split()
            root_candidates.update(nodes)
            leaf_candidates.update(nodes)
    
    with open(data_file, 'rb') as f:
        for line in f:
            node_1, node_2 = line.split()
            if node_1 == node_2:
                continue
            leaf_candidates.discard(node_1)
            root_candidates.discard(node_2)
    
    return (leaf_candidates, root_candidates)

In [70]:
wordnet_train_file, wordnet_test_file = train_test_split(wordnet_file)

Choosing 74324 test lines from 126730 candidates


#### 4.2.2 Training and loading models

In [87]:
# Training models for link prediction
lp_model_files = {}

In [88]:
# Possibly re-write with permutations instead of nested loops?
for epochs_ in epochs:
    for threads_ in threads:
        for neg_size in neg_sizes:
            model_name = 'cpp_epochs_%d_threads_%d_neg_%d' % (epochs_, threads_, neg_size)
            lp_model_files[model_name] = {}
            for model_size in model_sizes:
                output_file_name = '%s_dim_%d' % (model_name, model_size)
                output_file = os.path.join(models_directory, output_file_name)
                print('Training model with size %d neg %d threads %d epochs %d, saving to %s' %
                     (model_size, neg_size, threads_, epochs_, output_file))
                out = train_cpp_model(
                    cpp_binary_path, wordnet_train_file, output_file,
                    model_size, epochs_, neg_size, threads_, seed=0)
                lp_model_files[model_name][model_size] = output_file

Training model with size 5 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_5
Training model with size 10 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_10
Training model with size 20 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_20
Training model with size 50 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_50
Training model with size 100 neg 10 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_10_dim_100
Training model with size 5 neg 20 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_20_dim_5
Training model with size 10 neg 20 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_20_dim_10
Training model with size 20 neg 20 threads 8 epochs 50, saving to poincare_data/models/cpp_epochs_50_threads_8_neg_20_dim_20
Tr

In [89]:
lp_embeddings = {}

In [90]:
for model_name, models in lp_model_files.items():
    lp_embeddings[model_name] = {}
    for model_size, model_file in models.items():
        lp_embeddings[model_name][model_size] = PoincareEmbedding.load_poincare_cpp(model_file)

#### 4.2.3 Evaluating models

In [91]:
class LinkPredictionEvaluation(object):
    """Evaluating reconstruction on given network for given embedding"""
    def __init__(self, train_path, test_path, embedding):
        """Initialize evaluation instance with tsv file containing relation pairs and embedding to be evaluated
        
        Args:
            train_path (str): path to tsv file containing relation pairs used for training
            test_path (str): path to tsv file containing relation pairs to evaluate
            embedding (PoincareEmbedding instance): embedding to be evaluated
        
        Returns
            ReconstructionEvaluation instance

        """
        items = set()
        embedding_vocab = embedding.kv.vocab
        relations = {'known': defaultdict(set), 'unknown': defaultdict(set)}
        data_files = {'known': train_path, 'unknown': test_path}
        for relation_type, data_file in data_files.items():
            with smart_open(data_file, 'r') as f:
                reader = csv.reader(f, delimiter='\t')
                for row in reader:
                    assert len(row) == 2, 'Hypernym pair has more than two items'
                    item_1_index = embedding_vocab[row[0]].index
                    item_2_index = embedding_vocab[row[1]].index
                    relations[relation_type][item_1_index].add(item_2_index)
                    items.update([item_1_index, item_2_index])
        self.items = items
        self.relations = relations
        self.embedding = embedding
    
    
    @staticmethod
    def get_unknown_relation_ranks(distances, unknown_relations, known_relations):
        """
        Given a numpy array of distances and indices of known and unknown positive relations,
        compute ranks of unknown positive relations
        
        Args:
            distances (numpy float array): np array of all distances for a specific item
            unknown_relations (list): list of indices of unknown positive relations
            known_relations (list): list of indices of known positive relations
            
        Returns:
            list of ranks of unknown relations in the same order as `unknown_relations`
        """
        unknown_relation_distances = distances[unknown_relations]
        negative_relation_distances = np.ma.array(distances, mask=False)
        negative_relation_distances.mask[unknown_relations] = True
        negative_relation_distances.mask[known_relations] = True
        # Compute how many negative relation distances are less than each unknown relation distance, plus 1 for rank
        ranks = (negative_relation_distances < unknown_relation_distances[:, np.newaxis]).sum(axis=1) + 1
        return list(ranks) 
    
    def evaluate_metric(self, metric, max_n=None):
        """Evaluate given metric for the reconstruction task
            
        Args:
            metric (str): accepted values are 'mean_rank' and 'MAP'
            max_n (int or None): Maximum number of positive relations to evaluate, all if max_n is None
        
        Returns:
            Computed value of given metric (float)

        """
        if metric == 'mean_rank':
            return self.evaluate_mean_rank(max_n)
        elif metric == 'MAP':
            return self.evaluate_map(max_n)
        else:
            raise ValueError('Invalid value for metric')

    def evaluate_mean_rank(self, max_n=None):
        """Evaluate mean rank and MAP for reconstruction
            
        Args:
            max_n (int or None): Maximum number of positive relations to evaluate, all if max_n is None
        
        Returns:
            Computed value of mean rank (float)

        """
        ranks = []
        for i, item in enumerate(self.items, start=1):
            if item not in self.relations['unknown']:  # No positive relations to predict for this node
                continue
            unknown_relations = list(self.relations['unknown'][item])
            known_relations = list(self.relations['known'][item])
            item_term = self.embedding.kv.index2word[item]
            item_distances = self.embedding.get_all_distances(item_term)
            unknown_relation_ranks = self.get_unknown_relation_ranks(item_distances, unknown_relations, known_relations)
            ranks += unknown_relation_ranks
            if max_n is not None and i > max_n:
                break
        return np.mean(ranks)
    
    def evaluate_map(self, max_n=None):
        """Evaluate MAP (Mean Average Precision) for reconstruction
            
        Args:
            max_n (int or None): Maximum number of positive relations to evaluate, all if max_n is None
        
        Returns:
            Computed value of MAP (float)

        """
        raise NotImplementedError

In [92]:
lp_results = {}
metrics = ['mean_rank']

In [93]:
for model_name, models in lp_embeddings.items():
    lp_results[model_name] = {}
    for metric in metrics:
        lp_results[model_name][metric] = {}
    for model_size, embedding in models.items():
        print('Evaluating model %s of size %d' % (model_name, model_size))
        eval_instance = LinkPredictionEvaluation(wordnet_train_file, wordnet_test_file, embedding)
        for metric in metrics:
            lp_results[model_name][metric][model_size] = eval_instance.evaluate_metric(metric, max_n=1000)

Evaluating model cpp_epochs_50_threads_8_neg_20 of size 20
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 10
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 100
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 50
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 5
Evaluating model cpp_epochs_50_threads_1_neg_20 of size 20
Evaluating model cpp_epochs_50_threads_1_neg_20 of size 10
Evaluating model cpp_epochs_50_threads_1_neg_20 of size 100
Evaluating model cpp_epochs_50_threads_1_neg_20 of size 50
Evaluating model cpp_epochs_50_threads_1_neg_20 of size 5
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 20
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 10
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 100
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 50
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 5
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 20
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 

In [96]:
display_results('WordNet Link Prediction', lp_results)

Results for WordNet Link Prediction task
+--------------------------------------------+---------------------------------+
|                                            |        Model Dimensions         |
+---------------------------------+-----------+--------+--------+-------+-------+-------+
|        Model Description        |   Metric  |   5    |   10   |   20  |   50  |  100  |
+---------------------------------+-----------+--------+--------+-------+-------+-------+
|  cpp_epochs_50_threads_8_neg_20 | mean_rank | 197.63 | 85.96  | 63.21 | 51.94 | 58.72 |
|  cpp_epochs_50_threads_1_neg_20 | mean_rank | 191.36 | 92.09  | 53.68 | 47.57 | 46.26 |
|  cpp_epochs_50_threads_1_neg_10 | mean_rank | 232.96 | 87.99  | 62.41 | 46.84 | 47.23 |
|  cpp_epochs_50_threads_8_neg_10 | mean_rank | 208.80 | 104.13 | 77.91 | 63.97 | 75.42 |
| cpp_epochs_100_threads_8_neg_10 | mean_rank | 202.66 | 99.25  | 77.94 | 67.56 | 66.49 |
| cpp_epochs_100_threads_1_neg_20 | mean_rank | 147.81 | 79.82  | 48.12 | 42.

### 4.3 HyperLex Lexical Entailment

In [60]:
from scipy.stats import spearmanr

class LexicalEntailmentEvaluation(object):
    """Evaluating reconstruction on given network for any embedding"""
    def __init__(self, filepath):
        """Initialize evaluation instance with HyperLex text file containing relation pairs
        
        Args:
            filepath (str): path to HyperLex text file
        
        Returns
            LexicalEntailmentEvaluation instance

        """
        expected_scores = {}
        with smart_open(filepath, 'r') as f:
            reader = csv.DictReader(f, delimiter=' ')
            for row in reader:
                word_1, word_2 = row['WORD1'], row['WORD2']
                expected_scores[(word_1, word_2)] = float(row['AVG_SCORE'])
        self.scores = expected_scores
        self.alpha = 1000
    
    def score_function(self, embedding, word_1, word_2):
        """Given an embedding and two terms, return the predicted score for them (extent to which term_1 is a type of term_2)"""
        try:
            word_1_terms = embedding.find_matching_keys(word_1)
            word_2_terms = embedding.find_matching_keys(word_2)
        except KeyError:
            raise ValueError("No matching terms found for either %s or %s" % (word_1, word_2))
        min_distance = np.inf
        min_term_1, min_term_2 = None, None
        for term_1 in word_1_terms:
            for term_2 in word_2_terms:
                distance = embedding.get_distance(term_1, term_2)
                if distance < min_distance:
                    min_term_1, min_term_2 = term_1, term_2
                    min_distance = distance
        assert min_term_1 is not None and min_term_2 is not None
        vector_1, vector_2 = embedding.get_vector(min_term_1), embedding.get_vector(min_term_2)
        norm_1, norm_2 = np.linalg.norm(vector_1), np.linalg.norm(vector_2)
        return -1 * (1 + self.alpha * (norm_2 - norm_1)) * distance
        
    def evaluate_spearman(self, embedding):
        """Evaluate spearman scores for lexical entailment for given embedding
            
        Args:
            embedding (PoincareEmbedding instance): embedding for which evaluation is to be done
        
        Returns:
            spearman correlation score (float)

        """
        predicted_scores = []
        expected_scores = []
        skipped = 0
        count = 0
        for (word_1, word_2), expected_score in self.scores.items():
            try:
                predicted_score = self.score_function(embedding, word_1, word_2)
            except ValueError:
                skipped += 1
                continue
            count += 1
            predicted_scores.append(predicted_score)
            expected_scores.append(expected_score)
        print('Skipped pairs: %d out of %d' % (skipped, len(self.scores)))
        spearman = spearmanr(expected_scores, predicted_scores)
        return spearman.correlation


In [61]:
entailment_results = {}
eval_instance = LexicalEntailmentEvaluation(hyperlex_file)

In [62]:
for model_name, models in embeddings.items():
    entailment_results[model_name] = {}
    entailment_results[model_name]['spearman'] = {}
    for model_size, embedding in models.items():
        print('Evaluating model %s of size %d' % (model_name, model_size))
        entailment_results[model_name]['spearman'][model_size] = eval_instance.evaluate_spearman(embedding)

Evaluating model cpp_epochs_50_threads_8_neg_10 of size 20
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 10
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 100
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 50
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_8_neg_10 of size 5
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 20
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 10
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 100
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 50
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_8_neg_20 of size 5
Skipped pairs: 182 out of 2163
Evaluating model cpp_epochs_50_threads_1_neg_10 of size 20
Skipped pairs: 182 out of 2163
Evaluating

In [65]:
display_results('Lexical Entailment (HyperLex)', entailment_results)

Results for Lexical Entailment (HyperLex) task
+--------------------------------------------+---------------------------------+
|                                            |        Model Dimensions         |
+--------------------------------+----------+------+------+------+------+------+
|       Model Description        |  Metric  |  5   |  10  |  20  |  50  | 100  |
+--------------------------------+----------+------+------+------+------+------+
| cpp_epochs_50_threads_8_neg_10 | spearman | 0.44 | 0.43 | 0.44 | 0.44 | 0.44 |
| cpp_epochs_50_threads_8_neg_20 | spearman | 0.47 | 0.45 | 0.46 | 0.46 | 0.46 |
| cpp_epochs_50_threads_1_neg_10 | spearman | 0.45 | 0.48 | 0.47 | 0.47 | 0.47 |
+--------------------------------+----------+------+------+------+------+------+


### 4.4 Link Prediction for collaboration networks


In [68]:
# TODO - quite tricky, since the loss function used for training the model on this network is different
# Will require changes to how gradients are calculated in C++ code