<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Similarity-implementations" data-toc-modified-id="Similarity-implementations-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Similarity implementations</a></span><ul class="toc-item"><li><span><a href="#Subject-Similarity-" data-toc-modified-id="Subject-Similarity--1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Subject Similarity <a id="sub"></a></a></span></li><li><span><a href="#OpenKE-based-similarities---" data-toc-modified-id="OpenKE-based-similarities----1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>OpenKE based similarities  <a id="OpenKE"> </a></a></span><ul class="toc-item"><li><span><a href="#Cosine--" data-toc-modified-id="Cosine---1.2.1"><span class="toc-item-num">1.2.1&nbsp;&nbsp;</span>Cosine  <a id="cos"></a></a></span><ul class="toc-item"><li><span><a href="#gen_all_cos()" data-toc-modified-id="gen_all_cos()-1.2.1.1"><span class="toc-item-num">1.2.1.1&nbsp;&nbsp;</span>gen_all_cos()</a></span></li></ul></li><li><span><a href="#DistMult-Avg--" data-toc-modified-id="DistMult-Avg---1.2.2"><span class="toc-item-num">1.2.2&nbsp;&nbsp;</span>DistMult Avg  <a id="avg"></a></a></span></li></ul></li></ul></li><li><span><a href="#Append-module" data-toc-modified-id="Append-module-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Append module</a></span></li><li><span><a href="#AMIE-and-Evaluations" data-toc-modified-id="AMIE-and-Evaluations-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>AMIE and Evaluations</a></span><ul class="toc-item"><li><span><a href="#Baseline-evaluation" data-toc-modified-id="Baseline-evaluation-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Baseline evaluation</a></span></li><li><span><a href="#Enriched-KB-Evaluation" data-toc-modified-id="Enriched-KB-Evaluation-3.2"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Enriched KB Evaluation</a></span></li></ul></li></ul></div>

# Similarity implementations

Important files:
- `./OpenKE/benchmarks/FB15K/train2id.txt` : \[int int int\]. Used by OpenKE based models (cosine, DistMult Avg) to find embeddings and Subject Similarity to find similarities.
- `./OpenKE/benchmarks/FB15K/entity2id.txt` : \[/mid int\].  
    - Used for translating similarity df from previous step to a df with this structure: \[ /mid /similar_to /mid \] .
    - Also used by Word2vec notebook.
- `./FB15K/mid2name.tsv` : \[ /mid word \]. Used by word2vec notebook.
- `./FB15K/train.txt` : \[/mid /mid /mid\]. Used by AMIE, so enrichment must happen on this file.

-----

In [1]:
import itertools
from itertools import combinations
import pandas as pd
import random

from OpenKE import models,config
import multiprocessing
import tensorflow as tf
import numpy as np

import os



SUBJ = True
COS = True
DIST_AVG = True


SUBJ_SCORE = 0
COS_THR = .8
DIST_AVG_THR = -3



## Subject Similarity <a id="sub"></a>

The definition of similarity in this module is as follows:  

Let $e_i$ be an entity in the KB. Associated to $e_i$ there's a set $S_{e_i}$ defined by:


$$
S_{e_i} = \{ (r,e) | (e_i ,r,e) \in KB \}.
$$

It's the set of tuples $(r,e)$ such that $(e_i,r,e)$ is a triplet in the KB.


We say $e_i$ is similar to $e_j$ if $S_{e_i} \cap S_{e_j} \neq \emptyset$. Moreover, we define the similarity score between $e_i$ and $e_j$ by $sc(e_i,e_j) = | S_{e_i} \cap S_{e_j} |$.
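As a small illustration of the definition above, the sketch below computes $sc(e_i,e_j)$ on a toy set of integer triples. It uses an inverted index from $(r,e)$ to heads instead of comparing every pair of entities, which is a different technique from the brute-force loop in this notebook. The function name `subject_similarity` and the toy triples are assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations

def subject_similarity(triples):
    """Return {(e_i, e_j): score} for every pair e_i < e_j with a non-empty overlap.

    `triples` is an iterable of (head, relation, tail) integer ids.
    """
    heads_by_pair = defaultdict(set)          # (r, e) -> heads that have this pair
    for h, r, t in triples:
        heads_by_pair[(r, t)].add(h)

    scores = Counter()
    for heads in heads_by_pair.values():
        # every two heads sharing this (r, e) gain one point
        for i, j in combinations(sorted(heads), 2):
            scores[(i, j)] += 1
    return dict(scores)

# Hypothetical toy KB: entities 0 and 1 share two (r, e) pairs, entity 2 shares one.
toy = [(0, 10, 5), (1, 10, 5), (0, 11, 6), (1, 11, 6), (2, 10, 5), (2, 12, 7)]
scores = subject_similarity(toy)
print(scores)
```

Only pairs that share at least one $(r,e)$ are ever visited, so the cost depends on how many heads share each pair rather than on the total number of entity pairs.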

In [147]:
# An implementation of the above:
#
# Read every triple in train2id.txt and collect the sets S_{e_i} in a list.
#
if SUBJ:
    ent_total = 14951    # From the first line of entity2id.txt
    file = "./OpenKE/benchmarks/FB15K/train2id.txt"
    s = [set() for _ in range(ent_total)]  # s[i] holds the set S_{e_i}

    with open(file, 'r') as f:
        num_lines = f.readline()    # First line of train2id.txt is the number of triplets
        for i in range(int(num_lines)):
            l = f.readline().split()
            try:
                s[int(l[0])].add((int(l[1]), int(l[2])))
            except (IndexError, ValueError):
                print("something went wrong at triple " + str(i))
                raise
    ####################################
    # ent_total = ent_total // 50     # for speed boost in tests
    tot = ent_total *(ent_total -1)/2
    c = 0 # counter 

    h = []
    t = []
    sc = []
    try:
        for i,j in combinations(range(ent_total),2):
            c +=1
            score = len( set(s[i]) & set(s[j]) )
            if score > SUBJ_SCORE:
                h.append(i)
                t.append(j)
                sc.append(score)
    except KeyboardInterrupt:
        print(str(100*c/ tot) + "%")

1.2668290551811503%


Once these lists are collected into a data frame, many variations are possible, e.g. filtering out low scores. In the end, though, the pairs must be appended to the training file in mid format: each integer `id` must be translated to its `/mid` and then appended to `train.txt` in the form `/mid /similar_to /mid`.

For the appending operation we have defined another module that takes the filtered data frame as input. The filtered data frame must have only two columns named 'head' and 'tail'.

In [149]:
if SUBJ:
    d = {'head':h , 'tail':t, 'score': sc}
    df = pd.DataFrame(data=d)

    # df.loc[df['score'] == 2] # Nullius in verba
    filt_sub_df = df.copy()
    #filt_sub_df = df.loc[df['score'] > 2].copy() # More filtering if needed
    filt_sub_df.drop(columns='score',inplace=True)
    filt_sub_df # Nullius in verba again

The function call for appending is left as a comment in the next cell. Whenever needed, uncomment it to create the enriched training file in the `./FB15K` folder. That file is later used by AMIE to mine rules.

In [None]:
# filt_sub_df #feed it to append module 

----

## OpenKE based similarities  <a id="OpenKE"> </a>

We have two similarity routines based on DistMult in OpenKE: `cos_sim` and `avg_dist_mult`. Before they can run and generate the similarity data frame, the OpenKE model that learns the embeddings must be trained. That is done in the cell below.

In [2]:
if COS or DIST_AVG:
    con = config.Config()
    con.set_in_path('./OpenKE/benchmarks/FB15K/')

    con.set_test_link_prediction(True)
    con.set_test_triple_classification(True)

    con.set_work_threads(multiprocessing.cpu_count())

    con.set_train_times(5)  # To set the data traversing rounds
    con.set_nbatches(100)     # To split the training triples into several batches
    con.set_alpha(0.1)        # To set the learning rate
    con.set_dimension(100)    # To set the dimensions of the entities and relations at the same time
    # con.set_margin(1)         # To set the margin for the loss function

    con.set_bern(0)            # To set negative sampling algorithms, unif (bern = 0) or bern (bern = 1)
    con.set_ent_neg_rate(1)   # For each positive triple, construct this many negatives by corrupting an entity
    con.set_rel_neg_rate(0)   # Same, but by corrupting the relation

    con.set_opt_method("Adagrad") 

    con.set_export_files("./OpenKE/res/model.vec.tf", 0)  # To set the export file of model parameters, every few rounds
    con.set_out_files("./OpenKE/res/embedding.vec.json") 

    con.init()

    #Set the knowledge embedding model
    con.set_model(models.DistMult)

    con.run()


    #con.test()


    


Epoch: 0, loss: 107.23864191770554, time: 2.6291282176971436
Epoch: 1, loss: 80.29576188325882, time: 1.8825621604919434
Epoch: 2, loss: 58.39328280091286, time: 1.8455169200897217
Epoch: 3, loss: 38.54874390363693, time: 1.850348949432373
Epoch: 4, loss: 28.009576082229614, time: 1.8214657306671143


In [3]:
embeddings = con.get_parameters('numpy')
ents = embeddings['ent_embeddings'] # Table of all entity vectors
ent_total = len(ents)

The next two sections ("Cosine" and "DistMult Avg") use the embeddings in `ents` and `rels`; "Cosine" only uses `ents`.

-----

### Cosine  <a id="cos"></a>

Vinay has written a few functions that I copy here. We only use the first three.

In [153]:
def dot(x,y):
    return np.sum(x * y) 

# Vector Magnitude
def mag(x):
    return np.sqrt(np.sum(x * x))

# Cosine Similarity
def cosine_similar_to(h,t,ent_embeddings = ents):
    ent_h = ent_embeddings[h]
    ent_t = ent_embeddings[t]
    cos_sim = np.absolute(dot(ent_h,ent_t)) / (mag(ent_h) * mag(ent_t))
    return(cos_sim)


In [154]:
def rand_comb(n): 
    """Return a generator that yields all n-choose-2 pairs in random order.
    
    In other words, randomly select an off-diagonal entry of a symmetric
    matrix and return it, never returning the same unordered pair twice.
    You can check the histogram to see that the choices are not biased
    toward the upper/lower half.
    """
    mem_set = set()   # unordered pairs already yielded
    c = 0
    while c < n*(n-1)/2:
        while True:
            i = random.randint(0,n-1)
            j = random.randint(0,n-1)
            while j==i:
                j = random.randint(0,n-1)
            if frozenset([i,j]) not in mem_set:
                break     
        mem_set.add(frozenset([i,j]))
        c+=1
        yield (i,j)

If you want to generate the data frame only partially, you may want to randomize the process.
`gen_all_cos()` supports both orders, and you can stop it whenever you want and still get partial results.

#### gen_all_cos()

In [155]:
def gen_all_cos(rand=False):
    """Return a data frame containing head, tail, and score for similar tuples.
    
    rand -- (default=False): choose tuples randomly.
    Also prints the percentage of checked tuples. If interrupted (KeyboardInterrupt)
    it still returns the (incomplete) data frame.
    """
    # Pros:
    # 1) you can stop it anytime and get partial results
    # 2) It attempts to exhaust all combinations in a random way. So the histogram of 
    # heads/tails is pretty uniform whenever you stop.
    # Cons:
    # 1) A little bit slower than the next method.
    # 2) Probably even slower when randomizing and near the end of the process.
    tot = ent_total*(ent_total-1)/2
    c=0
    try:
        head = []
        tail = []
        sc = []
        if rand:
            for i,j in rand_comb(ent_total):
                if not(c%100000):
                    print(100*c/tot) # print progress % every 100k iteration
                c+=1
                score = cosine_similar_to(i,j)
                if score > COS_THR:
                    head.append(i)
                    tail.append(j)
                    sc.append(score)
        else:
            for i,j in combinations(range(ent_total),2):
                if not(c%100000):
                    print(100*c/tot) # print progress % every 100k iteration
                c+=1
                score = cosine_similar_to(i,j)
                if score > COS_THR:
                    head.append(i)
                    tail.append(j)
                    sc.append(score)
    except KeyboardInterrupt:
        print('KeyboardInterrupt at ' + f'{100*c/tot:.3f} %')
        
        if len(head) > len(tail):
            print("head popped!")
            head.pop()
        elif len(head) < len(tail):
            print("tail popped!")
            tail.pop()
        
        if len(sc) < len(head):
            print("pop pop!")
            head.pop()
            tail.pop()
        
        if not(len(sc) == len(head) == len(tail)):
            print("len s,h,t still don't match. This should not happen, "
                  "but if it does, the last row contains NaNs.")
            h = pd.DataFrame({'head':head})
            t = pd.DataFrame({'tail':tail})
            s = pd.DataFrame({'sc':sc})
            new = pd.concat([h, t,s], axis=1)
            return new
        
        print("head: " + str(len(head)) + " tail: " + str(len(tail)) + " sc: " + str(len(sc)) )
    
    d = {'head':head , 'tail':tail, 'score': sc}
    df = pd.DataFrame(data=d)
    return df 

In [156]:
if COS:
    df_cos_sim = gen_all_cos()
    #filt_cos_df = df_cos_sim.loc[df_cos_sim['score'] > .85].copy() # more filtering if necc.
    filt_cos_df = df_cos_sim.copy()
    filt_cos_df.drop(columns='score',inplace=True)
    print(filt_cos_df) # feed it to append module

0.0
0.08947847248615265
0.1789569449723053
0.26843541745845795
0.3579138899446106
0.44739236243076325
0.5368708349169159
0.6263493074030685
0.7158277798892212
KeyboardInterrupt at 0.736 %
head: 70 tail: 70 sc: 70
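The pairwise Python loop in `gen_all_cos()` covered under 1% of the pairs before it was interrupted. A vectorized variant may help: the sketch below normalizes the embedding matrix once and compares one block of rows against all entities with a matrix product. It keeps the absolute value used in `cosine_similar_to`. The function name `cosine_pairs_above`, the `block` parameter and the toy matrix are assumptions made for the example, and it has not been timed on FB15K.

```python
import numpy as np

def cosine_pairs_above(emb, thr, block=1024):
    """Return [(i, j, score)] for i < j with |cosine(emb[i], emb[j])| > thr."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.maximum(norms, 1e-12)      # guard against zero vectors
    found = []
    for start in range(0, unit.shape[0], block):
        # |cosine| between one block of rows and every entity
        sims = np.abs(unit[start:start + block] @ unit.T)
        rows, cols = np.nonzero(sims > thr)
        keep = (rows + start) < cols           # keep each unordered pair once
        for r, c in zip(rows[keep], cols[keep]):
            found.append((int(r) + start, int(c), float(sims[r, c])))
    return found

# Hypothetical 2-d embeddings: rows 0, 1 and 3 are collinear, row 2 is orthogonal.
toy = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
pairs = cosine_pairs_above(toy, thr=0.8)
print(pairs)
```

With the notebook's variables this would be called as `cosine_pairs_above(ents, COS_THR)`; memory per block is roughly `block * ent_total` floats.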


In [None]:
# pass to append module

### DistMult Avg  <a id="avg"></a>

In this method, for each pair $(e_i,e_j)$ of entities we calculate the DistMult loss $l_k$ of $(e_i, r_k ,e_j)$ for all $k$. Since $l_k$ is actually a loss value, we keep only the negative ones and average them to get the score $Sc(e_i,e_j)$. This score is then used to decide whether $e_i$ and $e_j$ are similar: a pair is kept when its score is below `DIST_AVG_THR`.
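To make the averaging step concrete, here is a minimal NumPy sketch for a single pair. It assumes the per-relation value is the DistMult trilinear product $\sum_d h_d\, r_{k,d}\, t_d$; the exact value and sign convention returned by `con.test_step` are not shown in this notebook, so treat the product as a stand-in. The function name `distmult_neg_avg` and the toy vectors are hypothetical.

```python
import numpy as np

def distmult_neg_avg(h_vec, t_vec, rel_emb):
    """Average the negative per-relation values for one (head, tail) pair.

    h_vec, t_vec: 1-d entity vectors; rel_emb: (rel_total, dim) relation matrix.
    """
    values = (h_vec * rel_emb * t_vec).sum(axis=1)   # one value per relation
    neg = values < 0
    # the small constant avoids division by zero when no value is negative
    return values[neg].sum() / (neg.sum() + 1e-9)

h_vec = np.array([1.0, 1.0])
t_vec = np.array([1.0, 1.0])
rel_emb = np.array([[-1.0, -1.0], [-2.0, -2.0], [3.0, 3.0]])   # values: -2, -4, 6
score = distmult_neg_avg(h_vec, t_vec, rel_emb)
print(score)
```

The same mask-and-divide pattern appears in the batched cell further down, applied to all pairs of a chunk at once.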

In [4]:
rels = embeddings['rel_embeddings'] # Get the relation embeddings from OpenKE
rel_total = len(rels)

First we calculate the score for all pairs.

In [5]:
# To optimize the calculations and use the GPU, we do it in chunks (batches).
# Here you can choose a chunk size and then run the next cell.

chunk_size =  ent_total//3  # choose a chunk size. ent_total//3 for instance worked on server

tot = ent_total * (ent_total - 1)/2
rem = tot % chunk_size
if rem == 0:
    step_total = tot // chunk_size
else:
    print('rem is ' + str(rem))
    step_total = (tot // chunk_size) + 1

print( 'Total steps that the next blocks gonna take: ' + str(step_total))


rem is 1.0
Total steps that the next blocks gonna take: 22429.0
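The step count above is a ceiling division of the number of pairs by the chunk size, and the last chunk can be shorter than the others. A small sketch of the same chunking with integer arithmetic and `itertools.islice` (the helper name `chunked_pairs` and the toy sizes are assumptions made for the example):

```python
from itertools import combinations, islice

def chunked_pairs(n, chunk_size):
    """Yield lists of at most chunk_size pairs (i, j), i < j, covering all n-choose-2 pairs."""
    pairs = combinations(range(n), 2)
    while True:
        chunk = list(islice(pairs, chunk_size))
        if not chunk:
            return
        yield chunk

n, chunk_size = 7, 5
total = n * (n - 1) // 2               # 21 pairs
steps = -(-total // chunk_size)        # ceiling division without floats
sizes = [len(chunk) for chunk in chunked_pairs(n, chunk_size)]
print(steps, sizes)

# The same arithmetic with the sizes used in this notebook
fb_total = 14951 * 14950 // 2
fb_steps = -(-fb_total // (14951 // 3))
print(fb_steps)
```

Keeping `total` and `steps` as integers avoids the `int(...)` conversions needed later when they are used as array shapes.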


In [6]:
import time

In [7]:
r = np.array(range(rel_total))
h_t_pairs = combinations(range(ent_total),2)
#all_score = np.array([])
all_score = np.zeros([int(step_total),int(chunk_size)])

c = 0
while c < step_total:
    start = time.time()
    p = itertools.islice(h_t_pairs,0,chunk_size)
    h,t = zip(*list(p))
    h = np.array(h)
    t= np.array(t)
    
    r_t = np.tile(r,h.size)
    #h = h.repeat(rel_total)
    #t = t.repeat(rel_total)
    
    #res = con.test_step(h.repeat(rel_total), t.repeat(rel_total), r_t).reshape(-1,rel_total)
    res = con.test_step(h.repeat(rel_total), t.repeat(rel_total), r_t)
    ind = res < 0
    dis_avg = np.divide(np.sum(np.multiply(res,ind).reshape(-1,rel_total),1), np.sum(ind.reshape(-1,rel_total),1)+ 1e-9 ) 
    #all_score = np.append(all_score,dis_avg)
    all_score[c, :dis_avg.size] = dis_avg  # the last chunk may be shorter than chunk_size
    c += 1
    stop = time.time()
    print(c , stop - start)

1 45.40415287017822
2 46.16051506996155
3 43.35315203666687
4 52.59141182899475
5 43.17501401901245
6 53.83556580543518


KeyboardInterrupt: 

`all_score` contains the score for all pairs. Now we go through them all to filter.

In [17]:
all_score = all_score.reshape(-1)[:int(tot)]

In [163]:
h = []
t = []
for a in zip(combinations(range(ent_total),2), all_score ):
    if a[1] < DIST_AVG_THR:
        h.append(a[0][0])
        t.append(a[0][1])

In [164]:
d = {'head':h , 'tail': t}
filt_dist_df = pd.DataFrame(data=d)
filt_dist_df

| | head | tail |
|---|---|---|
| 0 | 0 | 2358 |


Pass `filt_dist_df` to the append module.

---

# Append module 

Here we provide the append function. The result of each of the previous sections is a data frame with two columns, _head_ and _tail_. That data frame must be appended to `train.txt` in the format
```
head_mid /similar_to tail_mid
```
The following function is written to do exactly that.
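As a quick illustration of the target format, the sketch below turns integer id pairs into tab-separated `/similar_to` lines using an in-memory lookup. The helper name `to_similarity_lines` and the mids in `id_to_mid` are made up for the example; the real lookup comes from `entity2id.txt`.

```python
def to_similarity_lines(heads, tails, id_to_mid, relation="/similar_to"):
    """Return one 'head_mid<TAB>relation<TAB>tail_mid' line per (head, tail) id pair."""
    return [
        f"{id_to_mid[h]}\t{relation}\t{id_to_mid[t]}"
        for h, t in zip(heads, tails)
    ]

# Hypothetical lookup: position in the list is the integer id.
id_to_mid = ["/m/aaa", "/m/bbb", "/m/ccc"]
lines = to_similarity_lines([0, 1], [2, 0], id_to_mid)
print("\n".join(lines))
```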

In [173]:
def append_train(input_df,new_name):
    """ Appends the input data frame to a copy of train.txt.
    
    input_df: --pd.DataFrame: has two columns 'head', and 'tail' containing
    the integer ids for heads and tails of similar tuples.
    new_name: --str: name of the new file will be train_{new_name}.txt 
    """
    import datetime
    new_name = new_name + str(datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S"))
    dest = './FB15K/train_'+ new_name + '.txt'
    while os.path.isfile(dest):
        new_name = input("File already exists. Give another name: ")
        dest = './FB15K/train_'+ new_name + '.txt'
    
    heads = list(input_df['head']) 
    tails = list(input_df['tail']) 

    # Translate int ids to /mid
    ents = pd.read_csv("./OpenKE/benchmarks/FB15K/entity2id.txt",sep = '\t',header=None, skiprows=[0],usecols=[0]) # first row is lineTot
    heads_mid = list(ents.iloc[heads,0]) 
    rels_mid = ['/similar_to']*len(heads)
    tails_mid = list(ents.iloc[tails,0]) 

    d = {'head': heads_mid , 'relation': rels_mid, 'tail':tails_mid}
    df = pd.DataFrame(data=d)

    from shutil import copyfile
    copyfile('./FB15K/train.txt', dest)
    df.to_csv(dest, mode='a', header=False,index=False, sep='\t')
    return dest

In [176]:
# Create the enriched training files.
# Each data frame is appended to a copy of train.txt and saved under a unique name.
# The names of all these new files are collected in a list to be used later by AMIE.

train_file_name = [] # list of new enriched training files.
if SUBJ:
    train_sub = append_train(filt_sub_df, 'subj')
    train_file_name.append(train_sub)
if COS:    
    train_cos = append_train(filt_cos_df, 'cos')
    train_file_name.append(train_cos)
if DIST_AVG:
    train_dist = append_train(filt_dist_df, 'dist')
    train_file_name.append(train_dist)

    
#train_file_name = [train_sub,train_cos,train_dist]

In [177]:
train_file_name

['./FB15k/train_subj2020-05-12_22-44-36.txt',
 './FB15k/train_cos2020-05-12_22-44-37.txt',
 './FB15k/train_dist2020-05-12_22-44-37.txt']

# AMIE and Evaluations 

Running AMIE on each training file produces two outputs: first the rules, in `./rules/{KB_name}_rules.txt`, and then the evaluation of those rules, in `./evaluation/{KB_name}_rules_eval.txt`. Calling `eval_frame("./evaluation/{KB_name}_rules_eval.txt", test_len)` then measures the accuracy of the rules by Hits@10.

After enriching the KB with similarity links, we run the above procedure twice: once on `train.txt` to get the __Baseline evaluations__, and again on each enriched file `train_{new_name}.txt`. We then compare the outputs of `eval_frame()`.
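For reference, the conventional Hits@k metric counts a test case as a hit when the target appears among the first k ranked predictions. The sketch below shows that textbook definition on made-up data; it is not the criterion coded in `eval_frame()`, which applies its own test to the head and tail candidate lists. The function name `hits_at_k` and the sample mids are hypothetical.

```python
def hits_at_k(examples, k=10):
    """Fraction of (target, ranked_predictions) examples whose target is in the top k."""
    if not examples:
        return 0.0
    hits = sum(1 for target, ranked in examples if target in ranked[:k])
    return hits / len(examples)

# Made-up examples: the target is ranked 2nd, missing, and ranked 3rd respectively.
examples = [
    ("/m/a", ["/m/x", "/m/a", "/m/y"]),
    ("/m/b", ["/m/x", "/m/y"]),
    ("/m/c", ["/m/x", "/m/y", "/m/c"]),
]
print(hits_at_k(examples, k=10))
print(hits_at_k(examples, k=2))
```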

In [178]:
def eval_frame(file, test_len):
    """Return the fraction of the test_len facts in `file` that count as hits."""
    # Hits counter
    hits = 0

    with open(file) as f:
        # Loop through all facts in the KB
        for x in range(test_len):

            # Read line
            fact = f.readline()
            fact = fact.split(' ')
            if fact != ['']:
                # Get target head and tail
                head_target = fact[0]
                tail_target = fact[2][:-1]

                # Get head predictions
                headpreds = f.readline()
                headpreds = headpreds.split(' ')
                headpreds = headpreds[1].split('\t')
                headpreds.pop()

                # Get tail predictions
                tailpreds = f.readline()
                tailpreds = tailpreds.split(' ')
                tailpreds = tailpreds[1].split('\t')
                tailpreds.pop()

                # A hit requires both targets to be predicted, with fewer
                # than 10 candidates on each side
                if (head_target in headpreds) and (tail_target in tailpreds):
                    if (len(headpreds) < 10) and (len(tailpreds) < 10):
                        hits+=1
            else:
                print('miss')

    return hits/(test_len)

## Baseline evaluation 

Just type in the right files in the next cell and continue.

In [179]:
train_add = "./FB15K/train.txt"
test_add = "FB15K/test.txt"
valid_add = "FB15K/valid.txt"

rules_add = "rules/baseline_rules.txt"
eval_add = "evaluation/baseline_rules_eval.txt"

In [180]:
# The text of the commands for running AMIE

AMIE_plus = ("java -XX:-UseGCOverheadLimit -Xmx64g -jar AMIE/amie_plus.jar "
"-minhc 0.25 -mins 50 -minis 0 " 
f"{train_add} > {rules_add}")

Apply_AMIE_RULES = (f'java -jar AMIE/ApplyAMIERules.jar {rules_add}' 
                    f' {train_add} {test_add} {valid_add}'
                    f' {eval_add}')

if not os.path.exists('./rules'):
    os.mkdir('./rules')

In [181]:
# AMIE_plus
Apply_AMIE_RULES

'java -jar AMIE/ApplyAMIERules.jar rules/baseline_rules.txt ./FB15k/train.txt FB15k/test.txt FB15k/valid.txt evaluation/baseline_rules_eval.txt'

The next cell will generate the rules and save them in `rules_add`.

In [None]:
os.system(AMIE_plus)

Clean the output of the previous cell (i.e. `rules_add`) before applying the AMIE rules. The header and footer of the file `rules_add` must be deleted so that it contains only rules.

In [182]:
def clean_amie_output(path):
    """
    Warning: this function overwrites the file in path
    """
    with open(path, 'r') as f:
        f_contents = f.readlines()
        
    f_contents = f_contents[13:-3]  # drop the first 13 lines (header) and the last 3 (footer)

    with open(path, 'w') as f:
        f.writelines(f_contents)
        
    print('Rules at %s file cleaned.' % path)

clean_amie_output(rules_add)

FileNotFoundError: [Errno 2] No such file or directory: 'rules/baseline_rules.txt'

In [183]:
if not os.path.exists('./evaluation'):
    os.mkdir('./evaluation')

os.system(Apply_AMIE_RULES)

33280
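On POSIX systems, `os.system` returns a wait status rather than the exit code itself; for a process that exited normally, the exit code sits in the high byte. A minimal sketch of decoding values such as the `33280` above or the `256` mentioned later in this notebook (the helper name `exit_code` is introduced here for illustration and assumes the process was not killed by a signal):

```python
def exit_code(wait_status):
    """Extract the exit code from a POSIX wait status as returned by os.system."""
    return wait_status >> 8

print(exit_code(33280))   # 130
print(exit_code(256))     # 1
print(exit_code(0))       # 0
```

A non-zero exit code usually means the command did not finish successfully, so it is worth checking before moving on. `subprocess.run(...).returncode` reports the exit code directly, without this decoding.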

In [184]:
# Get the length of the test file. It is fed to eval_frame()
import subprocess
test_len = subprocess.run(['wc', '-l', test_add], stdout=subprocess.PIPE).stdout.decode('utf-8')
test_len = int(test_len.split()[0])
test_len

59071
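The `wc -l` call above depends on a Unix-like shell environment. A portable alternative, sketched below, counts newline characters directly in Python, which is what `wc -l` reports. The helper name `count_newlines` and the temporary demo file are assumptions made for the example.

```python
import os
import tempfile

def count_newlines(path):
    """Count newline characters in a file, which is what `wc -l` reports."""
    with open(path, "rb") as f:
        return sum(line.endswith(b"\n") for line in f)

# Demo on a throwaway two-line file
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a\tr\tb\nc\tr\td\n")
    demo_path = tmp.name
demo_len = count_newlines(demo_path)
os.remove(demo_path)
print(demo_len)
```

In this notebook the call would be `test_len = count_newlines(test_add)`.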

In [None]:
print(eval_add)
print('Hits@10: ' + str(eval_frame(eval_add, test_len)))

---

## Enriched KB Evaluation 

Basically repeat everything from Baseline Evaluation for the enriched training file.

In [185]:
# Enriched training file names:
for name in train_file_name:
    print(name[13:])

_subj2020-05-12_22-44-36.txt
_cos2020-05-12_22-44-37.txt
_dist2020-05-12_22-44-37.txt


In [188]:
# For each enriched training file, apply AMIE and show its Hits@10 performance
for name in train_file_name:
    train_add = "FB15K/train" +  name[13:] # From append module
    rules_add = "rules/Enriched_rules" + name[13:] # modify this name if you like
    eval_add = "evaluation/Enriched_eval" + name[13:] # same here

    test_add = "FB15K/test.txt"
    valid_add = "FB15K/valid.txt"

    print("The enriched tr file: " + train_add)
    print("Rules will be saved at: "+ rules_add)
    print("And rule evaluations at: " + eval_add)
    
    # The texts of the commands for running AMIE
    AMIE_plus = ("java -XX:-UseGCOverheadLimit -Xmx4g -jar AMIE/amie_plus.jar "
    "-minhc 0.0 -mins 0 -minis 0 " 
    f"{train_add} > {rules_add}")

    Apply_AMIE_RULES = (f'java -jar AMIE/ApplyAMIERules.jar {rules_add}' 
                        f' {train_add} {test_add} {valid_add}'
                        f' {eval_add}')

    x = os.system(AMIE_plus)
    print("\n AMIE_plus output: " + str(x))
    
    # trim `Enriched_rules{}.txt` again
    clean_amie_output(rules_add)

    y = os.system(Apply_AMIE_RULES) # if output is 256 then you forgot to trim
    print("\n Apply_AMIE_Rules output: " + str(y))
    
    print('\n Hits@10: ' + str(eval_frame(eval_add, test_len)))
    print("\n")

The enriched tr file: FB15k/train_subj2020-05-12_22-44-36.txt
Rules will be saved at: rules/Enriched_rules_subj2020-05-12_22-44-36.txt
And rule evaluations at: evaluation/Enriched_eval_subj2020-05-12_22-44-36.txt

 AMIE_plus output: 33280
Rules at rules/Enriched_rules_subj2020-05-12_22-44-36.txt file cleaned.

 Apply_AMIE_Rules output: 0

 Hits@10: 0.03925784225762218


The enriched tr file: FB15k/train_cos2020-05-12_22-44-37.txt
Rules will be saved at: rules/Enriched_rules_cos2020-05-12_22-44-37.txt
And rule evaluations at: evaluation/Enriched_eval_cos2020-05-12_22-44-37.txt

 AMIE_plus output: 33280
Rules at rules/Enriched_rules_cos2020-05-12_22-44-37.txt file cleaned.

 Apply_AMIE_Rules output: 0

 Hits@10: 0.0011172995209155084


The enriched tr file: FB15k/train_dist2020-05-12_22-44-37.txt
Rules will be saved at: rules/Enriched_rules_dist2020-05-12_22-44-37.txt
And rule evaluations at: evaluation/Enriched_eval_dist2020-05-12_22-44-37.txt

 AMIE_plus output: 33280
Rules at rules/En