# Learning linkset correctness

**Research Questions**: 
- can we train a neural network distingushing between correct and uncorrect exactMatch?



## Preparing the training sets
The training set is based on a set of linksets that have been generated building [Linked Thesaurus fRamework for Environment (LusTRE)](http://linkeddata.ge.imati.cnr.it/) as part of the research activity carried out during two EU funded projects: NatureSDIPlus and eENVplus. 


The procedure adopted to prepare a view of the linksets with the label, BT,NT,RT are described in 
* [Preparing Linkset involving local dumps](http://localhost:8888/notebooks/ai-related/LinkCorrectess/PreparingLinksetWithLocalDumps.ipynb)
, which include all the linksets not involving DBPEDIA
* [Preparing Linkset involving Dbpedia](http://localhost:8888/notebooks/ai-related/LinkCorrectess/PreparingLinksetWithDBPEDIA.ipynb)

### Useful tutorial 
 -  A  useful tutorial about pandas's dataframe is available at https://data36.com/pandas-tutorial-1-basics-reading-data-files-dataframes-data-selection/
 - creating and editing https://www.shanelynn.ie/using-pandas-dataframe-creating-editing-viewing-data-in-python/
 - [Advanced Jupyter Notebook Tricks — Part I](https://blog.dominodatalab.com/lesser-known-ways-of-using-notebooks/)

### Global variable 

In [25]:
### Global variables 
path="data/" # path where to find data
#namesa=['sBT','sprefLabel','sURI','oURI','oprefLabel','oBT', 'KindOfLink'] #column names for training data frame

### Generating the List of validated linksets

In [79]:
# https://dzone.com/articles/listing-a-directory-with-python
import os

TrainingFiles= filter(lambda x: x.endswith('EnrichedLinkeset.csv'),  os.listdir(path)) 
 
for file in TrainingFiles : 
    print(file)
       
    

Thist2AGROVOCEnrichedLinkeset.csv
Thist2EUROVOCEnrichedLinkeset.csv
ThIST2BpediaEnrichedLinkeset.csv
Thist2GEMETEnrichedLinkeset.csv


## What features are we going to consider to characterize a link?


Text and Conceptual similarity among prefered labels  and  broader terms are considered as significant features on which classify a link.

Different approaches are available in order to work out the text similarity

### word2Vec
- A pretrained model for text similarity http://mccormickml.com/2016/04/12/googles-pretrained-word2vec-model-in-python/ (**pretrained model in Users/bubu/model/ but instructions outdated** )
- example of usage in  https://radimrehurek.com/gensim/models/keyedvectors.html
Others  resources 
- https://medium.freecodecamp.org/how-to-get-started-with-word2vec-and-then-how-to-make-it-work-d0a2fca9dad3
- https://www.slideshare.net/lechatpito
- https://code.google.com/archive/p/word2vec/
- [Vector Representations of Words IN TF](https://www.tensorflow.org/tutorials/representation/word2vec)
- [Stanford courser - Word Vector Representations: word2vec](https://www.youtube.com/watch?v=ERibwqs9p38)
- using word2Vec in rapidMiner https://community.rapidminer.com/discussion/43860/synonym-detection-with-word2vec
 -https://www.neuralmarkettrends.com/word2vec-example-process-rapidminer

### Glo Ve

- https://medium.com/@japneet121/word-vectorization-using-glove-76919685ee0b

### Text Similarity
- Basic text similarities https://pypi.org/project/textdistance/



# A - Attempt 1: Let's initialize the Word2Vec with a pre-existing model

## Design choices
- **design choice 1**: We use the Google’s pre-trained model see [here](http://mccormickml.com/2016/04/12/googles-pretrained-word2vec-model-in-python/). It’s 1.5GB! It includes word vectors for a vocabulary of 3 million words and phrases that they trained on roughly 100 billion words from a Google News dataset. The vector length is 300 features.
- **design choice 2**: Similarity(s1,s2) implements a first attempt to work out the similarity between two set of words. It works out the max of sim on the pairs taken in the cardinal product of the sets, not considering the stoplist. 





In [29]:
#It takes very long to be executed
# https://radimrehurek.com/gensim/models/keyedvectors.html
from gensim.models import Word2Vec
from gensim.models import KeyedVectors

#model = Word2Vec(common_texts, size=100, window=5, min_count=1, workers=4)
#word_vectors = model.wv
word_vectors = KeyedVectors.load_word2vec_format("/Users/bubu/model/GoogleNews-vectors-negative300.bin", binary=True)  # C bin format
 

## A1 - How to call the similarity between vectors

In [30]:
#similarity = word_vectors.similarity('africa'.lower(), 'Countries in Africa'.lower().split())
#print(similarity)
docdistance=word_vectors.wmdistance('africa', 'Africa')
print(docdistance)

0.675937254097414


In [31]:
#v= ['woman','man', 'house', 'pippo']
def printifvector(v):
    for e in v:
        print(e +":")
        try:
        #vector = word_vectors.wv.word_vec( word_vectors.doesnt_match(e), use_norm=True)
            print(word_vectors.get_vector(e)) 
        except KeyError as ex:
            print('exception for ' +e)


## A2 - Procedure  work to out similarity on BT according to the first attempt 


In [32]:
# Attempt to work out similarity between two set of words
#lw1 and lw2 are two documents containing set of words
def similarityBetweenSetsSplitingInWords(s1,s2):
    ## remove common words and not indexed words and tokenize
    stoplist = set('for a of the and to in'.split())
    if (type(s1) is not str) or (type(s2) is not str):
        return 0.0
    # tokenize and removing |
    remove= lambda x :x.replace('|',"")
    lw1=list(map(remove, s1.lower().split()))
    lw2=list(map(remove,s2.lower().split()))
    
    strip= lambda x :x.strip()
    lw1=list(map(strip, lw1))
    lw2=list(map(strip, lw2))
    
    
    
    ## what words are indexed?
    lw1 =  [word for word in lw1 if word not in stoplist]
    lw2 =  [word for word in lw2 if word not in stoplist]
    
    print(lw1)
    print(lw2)
    amax=-1.0
    
    # if one of the sets is empty it returns 0
    if not ((len(lw1) == 0) or (len(lw2) == 0)): 
        for i in lw1 :  
            lmax=-1.0
            try:
                #test if i is indexed otherwise exception
                word_vectors.get_vector(i) 
            except KeyError as ex:
                print('exception for ' +i)
                continue
            for ii in lw2 :
                try:
                     #test if i is indexed otherwise exception
                    word_vectors.get_vector(ii)
                except KeyError as ex:
                    print('exception for ' +ii)
                    continue
                sim=word_vectors.similarity(i,ii)
                #print('sim(%s, %s) = %f' %(i,ii,sim) )   
                lmax = max(sim , lmax)
                amax+=lmax
                       
    return amax



In [33]:
print(similarityBetweenSetsSplitingInWords("africa", 'africa'))


['africa']
['africa']
0.0


## maxInSplitWords (M)
Given 
*  two sets of words $S_1$ and $S_2$
*  a similarity functions
*  $(x_i,y_j) \in S_1 \times S_2$
  
maxInSplitWords implements the following mathematical function

$\text{maxInSplitWords(x,y,sim)}=\text{MAX}_{i,j}(sim(x_i,y_j))$

In [34]:
# Attempt to work out similarity between two set of words
# lw1 and lw2 are two documents containing set of words
# function is the similarity function to apply
# it returns the maximun similarity comaring the set product
def maxInSplitWords(s1,s2, function):
    ## remove common words and not indexed words and tokenize
    stoplist = set('for a of the and to in'.split())
    if (type(s1) is not str) or (type(s2) is not str):
        return 0.0
    # tokenize and removing |
    remove= lambda x :x.replace('|',"")
    lw1=list(map(remove, s1.lower().split()))
    lw2=list(map(remove,s2.lower().split()))
    
    strip= lambda x :x.strip()
    lw1=list(map(strip, lw1))
    lw2=list(map(strip, lw2))
    
    
    ## what words are indexed?
    lw1 =  [word for word in lw1 if word not in stoplist]
    lw2 =  [word for word in lw2 if word not in stoplist]
    
    print(lw1)
    print(lw2)
    lmax=float('nan')
    firstTime = True
    # if one of the sets is empty it returns 0
    if not ((len(lw1) == 0) or (len(lw2) == 0)): 
        for i in lw1 :  
            for ii in lw2 :
                sim=function(i,ii)
                if ( math.isnan(lmax)) :
                    lmax=sim
                else:
                    lmax = max(sim , lmax)                     
    return lmax



## SummingMax (SM)

Given 
*  two sets of words $S_1$ and $S_2$
*  a similarity functions
*  $(x_i,y_j) \in S_1 \times S_2$
  
summingMax implements the following mathematical function


$\text{summingMax(x,y,sim)}=\sum_i{\text{MAX}_{i,j}(sim(x_i,y_j))}$

In [35]:
# Attempt to work out similarity between two set of words
# lw1 and lw2 are two documents containing set of words
# function is the similarity function to apply
import math
def summingMax(s1, s2, function):
    ## remove common words and not indexed words and tokenize
    stoplist = set('for a of the and to in'.split())
    if (type(s1) is not str) or (type(s2) is not str):
        return 0.0
    # tokenize and removing |
    remove= lambda x :x.replace('|',"")
    lw1=list(map(remove, s1.lower().split()))
    lw2=list(map(remove,s2.lower().split()))
    
    strip= lambda x :x.strip()
    lw1=list(map(strip, lw1))
    lw2=list(map(strip, lw2))
    
    
    
    ## what words are indexed?
    lw1 =  [word for word in lw1 if word not in stoplist]
    lw2 =  [word for word in lw2 if word not in stoplist]
    
    print(lw1)
    print(lw2)
   
    amax=float('nan')
    #firstTime = True
    # if one of the sets is empty it returns 0
    if not ((len(lw1) == 0) or (len(lw2) == 0)): 
        for i in lw1 :  
            lmax=float('nan')
            for ii in lw2 :
                sim=function(i,ii)
                if (math.isnan(lmax)) :
                    lmax=sim
                    #firstTime =False
                else:
                    lmax = max(sim , lmax)
            if (math.isnan(amax)):
                amax=lmax;
            else: 
                amax+=lmax;
    return amax



In [36]:
print(summingMax('', '', textdistance.hamming.normalized_similarity))


[]
[]
nan


In [37]:
a = 'africa'.lower().split()
b = 'countries in africa'.lower().split()
c ='countries in Europe'.lower().split()
d ='Italy'
similarity = word_vectors.wmdistance(a, b)
print("{:.4f}".format(similarity))
similarity = word_vectors.wmdistance(a, c)
print("{:.4f}".format(similarity))
similarity = word_vectors.wmdistance(b, c)
print("{:.4f}".format(similarity))
similarity = word_vectors.wmdistance(a, d)
print("{:.4f}".format(similarity))

#a='great africa'.lower().strip().split()
#b='Countries in Africa'.lower().strip().split()
#print(a)
#print(b)
#similarity = word_vectors.distances(a,b )
#print(similarity)
#docdistance=word_vectors.wmdistance('woman', 'man')
#print(docdistance)

2.6975
3.7890
1.0915
4.1048


In [73]:
import textdistance
#df  data frame
df=THIST2DBPEDIA_df.drop_duplicates()
##TEST
#df=THIST2DBPEDIA_df[THIST2DBPEDIA_df['sBT']=='']

df['BT_similaritySInW']=0.0 
df['BT_wmdistance']=df['BT_Mwmdistance']= df['BT_SMwmdistance']= 0.0
df['BT_nhammingSim']= df['BT_MnhammingSim']= df['BT_SMnhammingSim']= 0.0



l = range(1, len(df))
for i in l:
    if (df.sBT.iloc[i]!='') and (df.oBT.iloc[i]!=''):
        df['BT_similaritySInW'][i] =similarityBetweenSetsSplitingInWords(df.sBT[i], df.oBT[i])
        df['BT_wmdistance'][i]=word_vectors.wmdistance(df.sBT[i], df.oBT[i])
        df['BT_Mwmdistance'][i]= maxInSplitWords(df.sBT[i], df.oBT[i], word_vectors.wmdistance)
        df['BT_SMwmdistance'][i]= summingMax(df.sBT[i], df.oBT[i], word_vectors.wmdistance)
        df['BT_nhammingSim'][i]=textdistance.hamming.normalized_similarity(df.sBT[i], df.oBT[i])
        df['BT_MnhammingSim'][i]=maxInSplitWords(df.sBT[i], df.oBT[i],textdistance.hamming.normalized_similarity)
        df['BT_SMnhammingSim'][i] =summingMax(df.sBT[i], df.oBT[i],textdistance.hamming.normalized_similarity)


['africa']
['countries', 'africa', 'landlocked', 'countries', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']
exception for 1960


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  app.launch_new_instance()
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


['africa']
['countries', 'africa', 'landlocked', 'countries', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


['africa']
['countries', 'africa', 'landlocked', 'countries', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']
['africa']
['countries', 'africa', 'landlocked', 'countries', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


['africa']
['countries', 'africa', 'landlocked', 'countries', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']
['africa']
['regions', 'africa', 'foreign', 'contacts', 'ancient', 'egypt', 'geography', 'ancient', 'egypt', 'history', 'sudan']
['africa']
['regions', 'africa', 'foreign', 'contacts', 'ancient', 'egypt', 'geography', 'ancient', 'egypt', 'history', 'sudan']
['africa']
['regions', 'africa', 'foreign', 'contacts', 'ancient', 'egypt', 'geography', 'ancient', 'egypt', 'history', 'sudan']
['africa']
['regions', 'africa', 'foreign', 'contacts', 'ancient', 'egypt', 'geography', 'ancient', 'egypt', 'history', 'sudan']
['africa']
['regions', 'africa', 'foreign', 'contacts', 'ancient', 'egypt', 'geography', 'ancient', 'egypt', 'history', 'sudan']
['africa']
['deserts', 'africa', 'geography', 'north', 'africa', 'wikipedia', 'categories', 'named', 'after', 'deserts']
['africa']
['deserts'

['africa', 'central', 'africa']
['member', 'states', 'african', 'union', 'central', 'african', 'countries', 'member', 'states', 'organisation', 'internationale', 'de', 'la', 'francophonie', 'southeast', 'african', 'countries', 'swahili-speaking', 'countries', 'territories', 'east', 'african', 'countries', 'countries', 'africa', 'landlocked', 'countries', 'commonwealth', 'republics', 'bantu', 'countries', 'territories', 'member', 'states', 'commonwealth', 'nations', 'states', 'territories', 'established', '1962', 'french-speaking', 'countries', 'territories', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['africa', 'central', 'africa']
['member', 'states', 'african', 'union', 'central', 'african', 'countries', 'member', 'states', 'organisation', 'internationale', 'de', 'la', 'francophonie', 'southeast', 'african', 'countries', 'swahili-speaking', 'countries', 'territories', 'east', 'african', 'countries', 'countrie

['africa', 'east', 'africa']
['southeast', 'african', 'countries', 'east', 'african', 'countries', 'countries', 'africa', 'bantu', 'countries', 'territories', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', 'states', 'territories', 'established', '1963']
['africa', 'east', 'africa']
['southeast', 'african', 'countries', 'east', 'african', 'countries', 'countries', 'africa', 'bantu', 'countries', 'territories', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', 'states', 'territories', 'established', '1963']
['africa', 'east', 'africa']
['southeast', 'african', 'countries', 'east', 'african', 'countries', 'countries', 'africa', 'bantu', 'countries', 'territories', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', 'states', 'territories', 'established', '1963']
['africa', 'east', 'af

['africa', 'east', 'africa']
['member', 'states', 'arab', 'league', 'east', 'african', 'countries', 'countries', 'africa', 'arabic-speaking', 'countries', 'territories', 'federal', 'republics', 'horn', 'africa', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960']
['africa', 'east', 'africa']
['member', 'states', 'arab', 'league', 'east', 'african', 'countries', 'countries', 'africa', 'arabic-speaking', 'countries', 'territories', 'federal', 'republics', 'horn', 'africa', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960']
['africa', 'east', 'africa']
['member', 'states', 'arab', 'league', 'east', 'african', 'countries', 'countries', 'africa', 'arabic-speaking', 'countries', 'territories', 'federal', 'republics', 'horn', 'africa', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960']
['africa', 'east', 'africa']
['states', 'terri

['africa', 'north', 'africa']
['member', 'states', 'arab', 'league', 'member', 'states', 'organisation', 'internationale', 'de', 'la', 'francophonie', 'north', 'african', 'countries', 'maghrebi', 'countries', 'countries', 'africa', 'arabic-speaking', 'countries', 'territories', 'kingdoms', 'wikipedia', 'categories', 'named', 'after', 'countries']
exception for organisation
exception for maghrebi
exception for arabic-speaking
exception for organisation
exception for maghrebi
exception for arabic-speaking
exception for organisation
exception for maghrebi
exception for arabic-speaking
['africa', 'north', 'africa']
['member', 'states', 'arab', 'league', 'member', 'states', 'organisation', 'internationale', 'de', 'la', 'francophonie', 'north', 'african', 'countries', 'maghrebi', 'countries', 'countries', 'africa', 'arabic-speaking', 'countries', 'territories', 'kingdoms', 'wikipedia', 'categories', 'named', 'after', 'countries']
['africa', 'north', 'africa']
['member', 'states', 'arab', 'le

['africa', 'north', 'africa', 'tunis', 'tunisia', 'tunisia']
['member', 'states', 'arab', 'league', 'north', 'african', 'countries', 'maghrebi', 'countries', 'countries', 'africa', 'arabic-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
exception for maghrebi
exception for arabic-speaking
exception for maghrebi
exception for arabic-speaking
exception for maghrebi
exception for arabic-speaking
exception for tunis
exception for tunisia
exception for tunisia
['africa', 'north', 'africa', 'tunis', 'tunisia', 'tunisia']
['member', 'states', 'arab', 'league', 'north', 'african', 'countries', 'maghrebi', 'countries', 'countries', 'africa', 'arabic-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['africa', 'north', 'africa', 'tunis', 'tunisia', 'tunisia']
['member', 'states', 'arab', 'league', 'north', 'african', 'countries', 'maghrebi', 'countries', 'countries', 'africa'

['africa', 'southern', 'africa', 'france', 'reunion']
['wikipedia', 'categories', 'named', 'after', 'continents', 'afro-eurasia', 'eurasia', 'continents']
exception for afro-eurasia
exception for eurasia
exception for afro-eurasia
exception for eurasia
exception for afro-eurasia
exception for eurasia
exception for afro-eurasia
exception for eurasia
exception for afro-eurasia
exception for eurasia
['africa', 'southern', 'africa', 'france', 'reunion']
['wikipedia', 'categories', 'named', 'after', 'continents', 'afro-eurasia', 'eurasia', 'continents']
['africa', 'southern', 'africa', 'france', 'reunion']
['wikipedia', 'categories', 'named', 'after', 'continents', 'afro-eurasia', 'eurasia', 'continents']
['africa', 'southern', 'africa', 'france', 'reunion']
['wikipedia', 'categories', 'named', 'after', 'continents', 'afro-eurasia', 'eurasia', 'continents']
['africa', 'southern', 'africa', 'france', 'reunion']
['wikipedia', 'categories', 'named', 'after', 'continents', 'afro-eurasia', 'eura

['africa', 'west', 'africa']
['countries', 'africa', 'landlocked', 'countries', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']
['africa', 'west', 'africa']
['countries', 'africa', 'landlocked', 'countries', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']
['africa', 'west', 'africa']
['countries', 'africa', 'landlocked', 'countries', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']
['africa', 'west', 'africa']
['countries', 'africa', 'landlocked', 'countries', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']
['africa', 'west', 'africa']
['member', 'states', 'african', 'union', 'membe

['africa', 'west', 'africa']
['countries', 'africa', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960']
exception for 1960
exception for 1960
exception for 1960
['africa', 'west', 'africa']
['countries', 'africa', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960']
['africa', 'west', 'africa']
['countries', 'africa', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960']
['africa', 'west', 'africa']
['countries', 'africa', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960']
['africa', 'west', 'africa']
['countries', 'africa', 'west', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'establishe

['asia']
['mountain', 'ranges', 'asia', 'mountain', 'ranges', 'bhutan', 'mountain', 'ranges', 'india', 'mountain', 'ranges', 'nepal', 'wikipedia', 'categories', 'named', 'after', 'mountain', 'ranges', 'mountain', 'ranges', 'tibet', 'autonomous', 'region', 'mountain', 'ranges', 'tibet', 'mountain', 'ranges', 'pakistan']
['asia']
['mountain', 'ranges', 'asia', 'mountain', 'ranges', 'bhutan', 'mountain', 'ranges', 'india', 'mountain', 'ranges', 'nepal', 'wikipedia', 'categories', 'named', 'after', 'mountain', 'ranges', 'mountain', 'ranges', 'tibet', 'autonomous', 'region', 'mountain', 'ranges', 'tibet', 'mountain', 'ranges', 'pakistan']
['asia']
['wikipedia', 'categories', 'named', 'after', 'mountain', 'ranges', 'himalayas', 'mountain', 'ranges', 'gilgit-baltistan', 'mountain', 'ranges', 'xinjiang']
exception for himalayas
exception for gilgit-baltistan
exception for xinjiang
['asia']
['wikipedia', 'categories', 'named', 'after', 'mountain', 'ranges', 'himalayas', 'mountain', 'ranges', 'g

['middle', 'east', 'arabian', 'peninsula', 'asia']
['middle', 'eastern', 'countries', 'western', 'asian', 'countries', 'member', 'states', 'arab', 'league', 'arabian', 'peninsula', 'arabic-speaking', 'countries', 'territories', 'countries', 'asia', 'sultanates', 'wikipedia', 'categories', 'named', 'after', 'countries']
['middle', 'east', 'arabian', 'peninsula', 'asia']
['middle', 'eastern', 'countries', 'western', 'asian', 'countries', 'member', 'states', 'arab', 'league', 'arabian', 'peninsula', 'arabic-speaking', 'countries', 'territories', 'countries', 'asia', 'sultanates', 'wikipedia', 'categories', 'named', 'after', 'countries']
['middle', 'east', 'arabian', 'peninsula', 'asia']
['middle', 'eastern', 'countries', 'western', 'asian', 'countries', 'member', 'states', 'arab', 'league', 'arabian', 'peninsula', 'arabic-speaking', 'countries', 'territories', 'countries', 'asia', 'sultanates', 'wikipedia', 'categories', 'named', 'after', 'countries']
['middle', 'east', 'arabian', 'penins

['commonwealth', 'independent', 'states', 'asia']
['russian-speaking', 'countries', 'territories', 'central', 'asian', 'countries', 'landlocked', 'countries', 'countries', 'asia', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['commonwealth', 'independent', 'states', 'asia']
['russian-speaking', 'countries', 'territories', 'central', 'asian', 'countries', 'landlocked', 'countries', 'countries', 'asia', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['commonwealth', 'independent', 'states', 'asia']
['russian-speaking', 'countries', 'territories', 'central', 'asian', 'countries', 'landlocked', 'countries', 'countries', 'asia', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['commonwealth', 'independent', 'states', 'asia']
['iranian-speaking', 'countries', 'territories', 'russian-speaking', 'countries', 'territories', 'central', 'asian', 'countries', 'landlocked', 'countries', 'countries', 'asia', 'wikipedia', 'categ

['commonwealth', 'independent', 'states', 'asia', 'eurasia']
['north', 'asian', 'countries', 'northeast', 'asian', 'countries', 'russian-speaking', 'countries', 'territories', 'member', 'states', 'council', 'europe', 'central', 'asian', 'countries', 'slavic', 'countries', 'territories', 'eastern', 'european', 'countries', 'countries', 'asia', 'federal', 'republics', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries']
['commonwealth', 'independent', 'states', 'asia', 'eurasia']
['north', 'asian', 'countries', 'northeast', 'asian', 'countries', 'russian-speaking', 'countries', 'territories', 'member', 'states', 'council', 'europe', 'central', 'asian', 'countries', 'slavic', 'countries', 'territories', 'eastern', 'european', 'countries', 'countries', 'asia', 'federal', 'republics', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries']
['far', 'east', 'asia']
['greater', 'sunda', 'islands', 'wikipedia', 'categories', 'named', 'after',

['far', 'east', 'asia']
['island', 'countries', 'malay-speaking', 'countries', 'territories', 'archipelagoes', 'pacific', 'ocean', 'countries', 'oceania', 'countries', 'asia', 'countries', 'melanesia', 'southeast', 'asian', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', 'volcanic', 'arc', 'islands']
['far', 'east', 'asia']
['island', 'countries', 'malay-speaking', 'countries', 'territories', 'archipelagoes', 'pacific', 'ocean', 'countries', 'oceania', 'countries', 'asia', 'countries', 'melanesia', 'southeast', 'asian', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', 'volcanic', 'arc', 'islands']
['far', 'east', 'asia']
['island', 'countries', 'malay-speaking', 'countries', 'territories', 'archipelagoes', 'pacific', 'ocean', 'countries', 'oceania', 'countries', 'asia', 'countries', 'melanesia', 'southeast', 'asian', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', 'volcanic', 'arc

['far', 'east', 'malaysian', 'peninsula', 'asia', 'indochina', 'singapore', 'singapore', 'singapore']
['island', 'countries', 'strait', 'malacca', 'non-aligned', 'movement', 'malay-speaking', 'countries', 'territories', 'tamil-speaking', 'countries', 'territories', 'city-states', 'countries', 'asia', 'chinese-speaking', 'countries', 'territories', 'member', 'states', 'commonwealth', 'nations', 'southeast', 'asian', 'countries', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'capitals', 'asia', 'wikipedia', 'categories', 'named', 'after', 'capitals', 'republics']
['far', 'east', 'malaysian', 'peninsula', 'asia', 'indochina', 'singapore', 'singapore', 'singapore']
['island', 'countries', 'strait', 'malacca', 'non-aligned', 'movement', 'malay-speaking', 'countries', 'territories', 'tamil-speaking', 'countries', 'territories', 'city-states', 'countries', 'asia', 'chinese-speaking', 'countries', 'territories', 'member', 'states', 'c

['far', 'east', 'asia', 'indonesia']
['mountains', 'indonesia', 'wikipedia', 'categories', 'named', 'after', 'volcanoes', 'uninhabited', 'islands', 'indonesia', 'active', 'volcanoes', 'indonesia', 'wikipedia', 'categories', 'named', 'after', 'individual', 'mountains', 'islands', 'sunda', 'strait', 'volcanic', 'calderas', 'indonesia']
['far', 'east', 'asia', 'indonesia']
['mountains', 'indonesia', 'wikipedia', 'categories', 'named', 'after', 'volcanoes', 'uninhabited', 'islands', 'indonesia', 'active', 'volcanoes', 'indonesia', 'wikipedia', 'categories', 'named', 'after', 'individual', 'mountains', 'islands', 'sunda', 'strait', 'volcanic', 'calderas', 'indonesia']
['far', 'east', 'malaysian', 'peninsula', 'asia']
['strait', 'malacca', 'malay-speaking', 'countries', 'territories', 'federal', 'monarchies', 'countries', 'asia', 'southeast', 'asian', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries']
exception for malacca
exception for malay-speaking
exception for malacc

['indian', 'peninsula', 'asia']
['kashmiri-speaking', 'countries', 'territories', 'divided', 'regions']
['indian', 'peninsula', 'asia']
['kashmiri-speaking', 'countries', 'territories', 'divided', 'regions']
['indian', 'peninsula', 'asia']
['kashmiri-speaking', 'countries', 'territories', 'divided', 'regions']
['indian', 'peninsula', 'asia']
['kashmiri-speaking', 'countries', 'territories', 'divided', 'regions']
['indian', 'peninsula', 'asia']
['south', 'asian', 'countries', 'socialist', 'states', 'landlocked', 'countries', 'countries', 'asia', 'wikipedia', 'categories', 'named', 'after', 'countries']
['indian', 'peninsula', 'asia']
['south', 'asian', 'countries', 'socialist', 'states', 'landlocked', 'countries', 'countries', 'asia', 'wikipedia', 'categories', 'named', 'after', 'countries']
['indian', 'peninsula', 'asia']
['south', 'asian', 'countries', 'socialist', 'states', 'landlocked', 'countries', 'countries', 'asia', 'wikipedia', 'categories', 'named', 'after', 'countries']
['ind

['middle', 'east', 'asia']
['middle', 'eastern', 'countries', 'island', 'countries', 'western', 'asian', 'countries', 'mediterranean', 'islands', 'nuts', '1', 'statistical', 'regions', 'european', 'union', 'wikipedia', 'categories', 'named', 'after', 'islands', 'southeastern', 'european', 'countries', 'divided', 'regions', 'countries', 'asia', 'member', 'states', 'european', 'union', 'member', 'states', 'commonwealth', 'nations', 'countries', 'europe', 'southern', 'european', 'countries', 'turkish-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'states', 'territories', 'established', '1960', 'republics']
['middle', 'east', 'asia']
['middle', 'eastern', 'countries', 'island', 'countries', 'western', 'asian', 'countries', 'mediterranean', 'islands', 'nuts', '1', 'statistical', 'regions', 'european', 'union', 'wikipedia', 'categories', 'named', 'after', 'islands', 'southeastern', 'european', 'countries', 'divided', 'regions', 'countries', '

['middle', 'east', 'asia']
['iranian-speaking', 'countries', 'territories', 'middle', 'eastern', 'countries', 'western', 'asian', 'countries', 'member', 'states', 'council', 'europe', 'balkan', 'countries', 'azerbaijani-speaking', 'countries', 'territories', 'southeastern', 'european', 'countries', 'countries', 'asia', 'kurdish-speaking', 'countries', 'territories', 'countries', 'europe', 'southern', 'european', 'countries', 'turkish-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['middle', 'east', 'asia']
['iranian-speaking', 'countries', 'territories', 'middle', 'eastern', 'countries', 'western', 'asian', 'countries', 'member', 'states', 'council', 'europe', 'balkan', 'countries', 'azerbaijani-speaking', 'countries', 'territories', 'southeastern', 'european', 'countries', 'countries', 'asia', 'kurdish-speaking', 'countries', 'territories', 'countries', 'europe', 'southern', 'european', 'countries', 'turkish-speaking', 'co

['middle', 'east', 'near', 'east', 'asia']
['states', 'territories', 'established', '1961', 'middle', 'eastern', 'countries', 'western', 'asian', 'countries', 'member', 'states', 'arab', 'league', 'arabian', 'peninsula', 'arabic-speaking', 'countries', 'territories', 'mashriq', 'countries', 'asia', 'wikipedia', 'categories', 'named', 'after', 'countries']
['middle', 'east', 'near', 'east', 'asia']
['middle', 'eastern', 'countries', 'western', 'asian', 'countries', 'member', 'states', 'arab', 'league', 'arabic-speaking', 'countries', 'territories', 'mashriq', 'countries', 'asia', 'levant', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
exception for arabic-speaking
exception for mashriq
exception for levant
exception for arabic-speaking
exception for mashriq
exception for levant
exception for arabic-speaking
exception for mashriq
exception for levant
exception for arabic-speaking
exception for mashriq
exception for levant
exception for arabic-speaking
exception f

['benevento', 'italy']
['transport', 'buildings', 'structures', 'spans', '(architecture)', 'structural', 'engineering', 'crossings']
['benevento', 'italy']
['transport', 'buildings', 'structures', 'spans', '(architecture)', 'structural', 'engineering', 'crossings']
['benevento', 'italy']
['transport', 'buildings', 'structures', 'spans', '(architecture)', 'structural', 'engineering', 'crossings']
['carolina', 'bermuda', 'united', 'kingdom']
['island', 'countries', 'special', 'territories', 'european', 'union', 'british', 'overseas', 'territories', 'archipelagoes', 'atlantic', 'ocean', 'wikipedia', 'categories', 'named', 'after', 'dependent', 'territories', 'british', 'north', 'america', 'wikipedia', 'categories', 'named', 'after', 'islands', 'english', 'colonization', 'americas', 'volcanic', 'islands', 'former', 'british', 'colonies', 'protectorates', 'americas', 'dependent', 'territories', 'north', 'america']
['carolina', 'bermuda', 'united', 'kingdom']
['island', 'countries', 'special

['poland', 'czechoslovakia', 'europe', 'europa', 'island', 'ussr', 'central', 'europe', 'bulgaria', 'slovakia', 'romania']
['geography', 'central', 'europe', 'wikipedia', 'categories', 'named', 'after', 'mountain', 'ranges', 'geography', 'eastern', 'europe', 'geography', 'southeastern', 'europe', 'mountain', 'ranges', 'europe']
exception for czechoslovakia
exception for bulgaria
exception for slovakia
['poland', 'czechoslovakia', 'europe', 'europa', 'island', 'ussr', 'central', 'europe', 'bulgaria', 'slovakia', 'romania']
['geography', 'central', 'europe', 'wikipedia', 'categories', 'named', 'after', 'mountain', 'ranges', 'geography', 'eastern', 'europe', 'geography', 'southeastern', 'europe', 'mountain', 'ranges', 'europe']
['poland', 'czechoslovakia', 'europe', 'europa', 'island', 'ussr', 'central', 'europe', 'bulgaria', 'slovakia', 'romania']
['geography', 'central', 'europe', 'wikipedia', 'categories', 'named', 'after', 'mountain', 'ranges', 'geography', 'eastern', 'europe', 'geogr

['caribbean', 'region', 'greater', 'antilles', 'central', 'america', 'antilles']
['spanish', 'west', 'indies', 'islands', 'haiti', 'wikipedia', 'categories', 'named', 'after', 'islands', 'divided', 'regions', 'islands', 'dominican', 'republic', 'greater', 'antilles']
['caribbean', 'region', 'lesser', 'antilles', 'united', 'kingdom', 'central', 'america', 'antilles']
['island', 'countries', 'british', 'overseas', 'territories', 'wikipedia', 'categories', 'named', 'after', 'dependent', 'territories', 'caribbean', 'territories', 'or', 'dependencies', 'wikipedia', 'categories', 'named', 'after', 'islands', 'leeward', 'islands', '(caribbean)']
exception for (caribbean)
exception for (caribbean)
exception for (caribbean)
exception for antilles
exception for (caribbean)
exception for (caribbean)
exception for (caribbean)
exception for (caribbean)
exception for antilles
['caribbean', 'region', 'lesser', 'antilles', 'united', 'kingdom', 'central', 'america', 'antilles']
['island', 'countries', 

['tertiary', 'cenozoic']
['geological', 'periods', 'cenozoic']
exception for cenozoic
exception for cenozoic
['tertiary', 'cenozoic']
['geological', 'periods', 'cenozoic']
['tertiary', 'cenozoic']
['geological', 'periods', 'cenozoic']
['tertiary', 'cenozoic']
['geological', 'periods', 'cenozoic']
['tertiary', 'cenozoic']
['geological', 'periods', 'cenozoic']
['central', 'africa']
['central', 'african', 'countries', 'member', 'states', 'community', 'portuguese', 'language', 'countries', 'countries', 'africa', 'bantu', 'countries', 'territories', 'southern', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['central', 'africa']
['central', 'african', 'countries', 'member', 'states', 'community', 'portuguese', 'language', 'countries', 'countries', 'africa', 'bantu', 'countries', 'territories', 'southern', 'african', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['central', 'africa']
['central', 'african', 

['central', 'europe']
['central', 'european', 'countries', 'germanic', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'member', 'states', 'european', 'union', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries']
['central', 'europe']
['central', 'european', 'countries', 'germanic', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'member', 'states', 'european', 'union', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries']
['central', 'europe']
['central', 'european', 'countries', 'germanic', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'member', 'states', 'european', 'union', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries']
['central', 'europe']
['central', 'european', 'countries', 'germanic', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'member', 'states', 'european', 'union', 'countrie

['tetrapoda', 'lepidosauria', 'diapsida', 'chordata', 'vertebrata', 'reptilia']
['lepidosaurs']
['tetrapoda', 'lepidosauria', 'diapsida', 'chordata', 'vertebrata', 'reptilia']
['lepidosaurs']
['tetrapoda', 'lepidosauria', 'diapsida', 'chordata', 'vertebrata', 'reptilia']
['lepidosaurs']
['tetrapoda', 'lepidosauria', 'diapsida', 'chordata', 'vertebrata', 'reptilia']
['lepidosaurs']
['tetrapoda', 'diapsida', 'chordata', 'vertebrata', 'reptilia']
['mesozoic', 'reptiles', 'prehistoric', 'marine', 'reptiles', 'diapsids']
exception for tetrapoda
exception for diapsida
exception for chordata
exception for vertebrata
exception for reptilia
['tetrapoda', 'diapsida', 'chordata', 'vertebrata', 'reptilia']
['mesozoic', 'reptiles', 'prehistoric', 'marine', 'reptiles', 'diapsids']
['tetrapoda', 'diapsida', 'chordata', 'vertebrata', 'reptilia']
['mesozoic', 'reptiles', 'prehistoric', 'marine', 'reptiles', 'diapsids']
['tetrapoda', 'diapsida', 'chordata', 'vertebrata', 'reptilia']
['mesozoic', 'reptil

['commonwealth', 'independent', 'states', 'eastern', 'europe']
['russian-speaking', 'countries', 'territories', 'ukrainian-speaking', 'countries', 'territories', 'landlocked', 'countries', 'southeastern', 'european', 'countries', 'romanian-speaking', 'countries', 'territories', 'eastern', 'european', 'countries', 'romance', 'countries', 'territories', 'countries', 'europe', 'southern', 'european', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['commonwealth', 'independent', 'states', 'eastern', 'europe']
['russian-speaking', 'countries', 'territories', 'ukrainian-speaking', 'countries', 'territories', 'landlocked', 'countries', 'southeastern', 'european', 'countries', 'romanian-speaking', 'countries', 'territories', 'eastern', 'european', 'countries', 'romance', 'countries', 'territories', 'countries', 'europe', 'southern', 'european', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['commonwealth', 'independent'

['czechoslovakia', 'europe', 'europa', 'island', 'central', 'europe', 'czech', 'republic']
['geography', 'central', 'europe', 'historical', 'regions', 'czech', 'republic', 'wikipedia', 'categories', 'named', 'after', 'former', 'countries', 'geography', 'czech', 'republic', 'kingdoms', 'countries', 'austria-hungary']
['czechoslovakia', 'europe', 'europa', 'island', 'central', 'europe', 'czech', 'republic']
['geography', 'central', 'europe', 'historical', 'regions', 'czech', 'republic', 'wikipedia', 'categories', 'named', 'after', 'former', 'countries', 'geography', 'czech', 'republic', 'kingdoms', 'countries', 'austria-hungary']
['czechoslovakia', 'europe', 'europa', 'island', 'central', 'europe', 'czech', 'republic']
['geography', 'central', 'europe', 'historical', 'regions', 'czech', 'republic', 'wikipedia', 'categories', 'named', 'after', 'former', 'countries', 'geography', 'czech', 'republic', 'kingdoms', 'countries', 'austria-hungary']
['czechoslovakia', 'europe', 'europa', 'island

['europe', 'europa', 'island']
['finno-ugric', 'countries', 'territories', 'northern', 'european', 'countries', 'russian-speaking', 'countries', 'territories', 'post–russian', 'empire', 'states', 'nuts', '1', 'statistical', 'regions', 'european', 'union', 'member', 'states', 'european', 'union', 'states', 'territories', 'established', '1918', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries', 'baltic', 'states', 'republics']
['europe', 'europa', 'island']
['finno-ugric', 'countries', 'territories', 'northern', 'european', 'countries', 'russian-speaking', 'countries', 'territories', 'post–russian', 'empire', 'states', 'nuts', '1', 'statistical', 'regions', 'european', 'union', 'member', 'states', 'european', 'union', 'states', 'territories', 'established', '1918', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries', 'baltic', 'states', 'republics']
['europe', 'europa', 'island']
['finno-ugric', 'countries', 'territories', 'northe

['europe', 'europa', 'island', 'central', 'europe']
['principalities', 'central', 'european', 'countries', 'germanic', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'landlocked', 'countries', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries']
['europe', 'europa', 'island', 'central', 'europe']
['principalities', 'central', 'european', 'countries', 'germanic', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'landlocked', 'countries', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries']
['europe', 'europa', 'island', 'central', 'europe']
['principalities', 'central', 'european', 'countries', 'germanic', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'landlocked', 'countries', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries']
['europe', 'europa', 'island', 'central', 'europe']
['principalities', 'central', 'european', 

['europe', 'europa', 'island', 'central', 'europe']
['central', 'european', 'countries', 'post–russian', 'empire', 'states', 'ukrainian-speaking', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'slavic', 'countries', 'territories', 'member', 'states', 'european', 'union', 'states', 'territories', 'established', '1918', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries', 'polish-speaking', 'countries', 'territories', 'republics']
['europe', 'europa', 'island', 'central', 'europe']
['central', 'european', 'countries', 'nuts', '1', 'statistical', 'regions', 'european', 'union', 'landlocked', 'countries', 'slavic', 'countries', 'territories', 'member', 'states', 'european', 'union', 'hungarian-speaking', 'countries', 'territories', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
exception for hungarian-speaking
exception for hungarian-speaking
exception for hungarian-speaking
exception fo

['europe', 'europa', 'island', 'central', 'europe', 'germany']
['regions', 'thuringia', 'divided', 'regions', 'wikipedia', 'categories', 'named', 'after', 'former', 'countries', 'regions', 'saxony']
['europe', 'europa', 'island', 'central', 'europe', 'germany']
['regions', 'thuringia', 'divided', 'regions', 'wikipedia', 'categories', 'named', 'after', 'former', 'countries', 'regions', 'saxony']
['europe', 'europa', 'island', 'central', 'europe', 'germany']
['regions', 'thuringia', 'divided', 'regions', 'wikipedia', 'categories', 'named', 'after', 'former', 'countries', 'regions', 'saxony']
['europe', 'europa', 'island', 'central', 'europe', 'germany']
['regions', 'north', 'rhine-westphalia']
exception for rhine-westphalia
exception for rhine-westphalia
exception for rhine-westphalia
exception for rhine-westphalia
exception for rhine-westphalia
exception for rhine-westphalia
['europe', 'europa', 'island', 'central', 'europe', 'germany']
['regions', 'north', 'rhine-westphalia']
['europe'

['europe', 'europa', 'island', 'southern', 'europe']
['member', 'states', 'organisation', 'internationale', 'de', 'la', 'francophonie', 'member', 'states', 'council', 'europe', 'balkan', 'countries', 'bulgarian-speaking', 'countries', 'territories', 'southeastern', 'european', 'countries', 'slavic', 'countries', 'territories', 'member', 'states', 'european', 'union', 'countries', 'europe', 'southern', 'european', 'countries', 'turkish-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['europe', 'europa', 'island', 'southern', 'europe']
['regions', 'croatia', 'historical', 'regions', 'croatia']
['europe', 'europa', 'island', 'southern', 'europe']
['regions', 'croatia', 'historical', 'regions', 'croatia']
['europe', 'europa', 'island', 'southern', 'europe']
['regions', 'croatia', 'historical', 'regions', 'croatia']
['europe', 'europa', 'island', 'southern', 'europe']
['regions', 'croatia', 'historical', 'regions', 'croatia']
['e

['europe', 'europa', 'island', 'southern', 'europe']
['historical', 'regions', 'slovenia', 'regions', 'croatia', 'divided', 'regions', 'historical', 'regions', 'croatia']
['europe', 'europa', 'island', 'southern', 'europe']
['historical', 'regions', 'slovenia', 'regions', 'croatia', 'divided', 'regions', 'historical', 'regions', 'croatia']
['europe', 'europa', 'island', 'southern', 'europe']
['historical', 'regions', 'slovenia', 'regions', 'croatia', 'divided', 'regions', 'historical', 'regions', 'croatia']
['europe', 'europa', 'island', 'southern', 'europe']
['historical', 'regions', 'slovenia', 'regions', 'croatia', 'divided', 'regions', 'historical', 'regions', 'croatia']
['europe', 'europa', 'island', 'southern', 'europe']
['former', 'countries', 'europe', 'modern', 'history', 'balkans', 'wikipedia', 'categories', 'named', 'after', 'former', 'countries', 'former', 'slavic', 'countries', 'former', 'countries', 'balkans', '1992', 'disestablishments', 'yugoslavia']
exception for balka

['europe', 'europa', 'island', 'southern', 'europe']
['balkan', 'countries', 'southeastern', 'european', 'countries', 'romanian-speaking', 'countries', 'territories', 'eastern', 'european', 'countries', 'romance', 'countries', 'territories', 'member', 'states', 'european', 'union', 'hungarian-speaking', 'countries', 'territories', 'countries', 'europe', 'southern', 'european', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['europe', 'europa', 'island', 'southern', 'europe']
['balkan', 'countries', 'southeastern', 'european', 'countries', 'romanian-speaking', 'countries', 'territories', 'eastern', 'european', 'countries', 'romance', 'countries', 'territories', 'member', 'states', 'european', 'union', 'hungarian-speaking', 'countries', 'territories', 'countries', 'europe', 'southern', 'european', 'countries', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['europe', 'europa', 'island', 'southern', 'europe']
['balkan', 'countri

['europe', 'europa', 'island', 'western', 'europe']
['principalities', 'prince-bishoprics', 'spanish-speaking', 'countries', 'territories', 'member', 'states', 'organisation', 'internationale', 'de', 'la', 'francophonie', 'member', 'states', 'council', 'europe', 'països', 'catalans', 'landlocked', 'countries', 'romance', 'countries', 'territories', 'iberian', 'peninsula', 'countries', 'europe', 'pyrenees', 'french-speaking', 'countries', 'territories', 'southern', 'european', 'countries', 'monarchies', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries', 'diarchies', 'southwestern', 'european', 'countries', 'states', 'territories', 'established', '1278']
['europe', 'europa', 'island', 'western', 'europe']
['principalities', 'prince-bishoprics', 'spanish-speaking', 'countries', 'territories', 'member', 'states', 'organisation', 'internationale', 'de', 'la', 'francophonie', 'member', 'states', 'council', 'europe', 'països', 'catalans', 'landlocked', 'countries', 'romance',

['europe', 'europa', 'island', 'western', 'europe']
['geography', 'southwestern', 'europe', 'mountain', 'ranges', 'iberian', 'peninsula', 'mountain', 'ranges', 'aquitaine', 'wikipedia', 'categories', 'named', 'after', 'mountain', 'ranges', 'mountain', 'ranges', 'midi-pyrénées', 'mountain', 'ranges', 'aragon', 'mountain', 'ranges', 'catalonia', 'mountain', 'ranges', 'basque', 'country', 'mountain', 'ranges', 'europe']
exception for iberian
exception for aquitaine
exception for midi-pyrénées
exception for aragon
exception for catalonia
exception for iberian
exception for aquitaine
exception for midi-pyrénées
exception for aragon
exception for catalonia
exception for iberian
exception for aquitaine
exception for midi-pyrénées
exception for aragon
exception for catalonia
exception for iberian
exception for aquitaine
exception for midi-pyrénées
exception for aragon
exception for catalonia
exception for iberian
exception for aquitaine
exception for midi-pyrénées
exception for aragon
exceptio

['europe', 'europa', 'island', 'france', 'western', 'europe']
['former', 'regions', 'france', 'geographical,', 'historical', 'cultural', 'regions', 'france', 'nouvelle-aquitaine']
['europe', 'europa', 'island', 'france', 'western', 'europe']
['former', 'regions', 'france', 'geographical,', 'historical', 'cultural', 'regions', 'france', 'nouvelle-aquitaine']
['europe', 'europa', 'island', 'france', 'western', 'europe']
['former', 'regions', 'france', 'geographical,', 'historical', 'cultural', 'regions', 'france', 'nouvelle-aquitaine']
['europe', 'europa', 'island', 'france', 'western', 'europe']
['former', 'regions', 'france', 'geographical,', 'historical', 'cultural', 'regions', 'france', 'nouvelle-aquitaine']
['europe', 'europa', 'island', 'france', 'western', 'europe']
['regions', 'france', 'peninsulas', 'france', 'geographical,', 'historical', 'cultural', 'regions', 'france', 'former', 'provinces', 'france']
exception for geographical,
exception for geographical,
exception for geogr

['europe', 'europa', 'island', 'western', 'europe', 'luxembourg', 'luxembourg', 'luxembourg']
['member', 'states', 'organisation', 'internationale', 'de', 'la', 'francophonie', 'nuts', '1', 'statistical', 'regions', 'european', 'union', 'germanic', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'landlocked', 'countries', 'benelux', 'member', 'states', 'european', 'union', 'countries', 'europe', 'french-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'western', 'european', 'countries']
['europe', 'europa', 'island', 'western', 'europe', 'luxembourg', 'luxembourg', 'luxembourg']
['member', 'states', 'organisation', 'internationale', 'de', 'la', 'francophonie', 'nuts', '1', 'statistical', 'regions', 'european', 'union', 'germanic', 'countries', 'territories', 'german-speaking', 'countries', 'territories', 'landlocked', 'countries', 'benelux', 'member', 'states', 'european', 'union', 'countries', 'europe', 'french

['europe', 'europa', 'island', 'scandinavia', 'western', 'europe']
['northern', 'european', 'countries', 'nordic', 'countries', 'germanic', 'countries', 'territories', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries', 'scandinavian', 'countries']
['europe', 'europa', 'island', 'scandinavia', 'western', 'europe']
['northern', 'european', 'countries', 'nordic', 'countries', 'germanic', 'countries', 'territories', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries', 'scandinavian', 'countries']
['europe', 'europa', 'island', 'scandinavia', 'western', 'europe']
['northern', 'european', 'countries', 'nordic', 'countries', 'germanic', 'countries', 'territories', 'countries', 'europe', 'wikipedia', 'categories', 'named', 'after', 'countries', 'scandinavian', 'countries']
['europe', 'europa', 'island', 'scandinavia', 'western', 'europe']
['northern', 'european', 'countries', 'nordic', 'countries', 'germanic', 'countries', 'territories'

['europe', 'europa', 'island', 'united', 'kingdom', 'western', 'europe']
['island', 'countries', 'united', 'kingdom', 'by', 'country', 'germanic', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'great', 'britain']
['europe', 'europa', 'island', 'united', 'kingdom', 'western', 'europe']
['island', 'countries', 'united', 'kingdom', 'by', 'country', 'germanic', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'great', 'britain']
['europe', 'europa', 'island', 'united', 'kingdom', 'western', 'europe']
['island', 'countries', 'united', 'kingdom', 'by', 'country', 'germanic', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'great', 'britain']
['europe', 'europa', 'island', 'united', 'kingdom', 'western', 'europe']
['island', 'countries', 'united', 'kingdom', 'by', 'country', 'germanic', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'great',

['halides']
['halides', 'salts', 'bromine', 'compounds']
['halides']
['halides', 'salts', 'bromine', 'compounds']
['halides']
['halides', 'salts', 'bromine', 'compounds']
['halides']
['halides', 'salts', 'bromine', 'compounds']
['halides']
['halides', 'salts', 'bromine', 'compounds']
['halides']
['halides', 'salts', 'fluorine', 'compounds']
['halides']
['halides', 'salts', 'fluorine', 'compounds']
['halides']
['halides', 'salts', 'fluorine', 'compounds']
['halides']
['halides', 'salts', 'fluorine', 'compounds']
['halides']
['halides', 'salts', 'fluorine', 'compounds']
['halides']
['iodine', 'compounds', 'halides']
['halides']
['iodine', 'compounds', 'halides']
['halides']
['iodine', 'compounds', 'halides']
['halides']
['iodine', 'compounds', 'halides']
['halides']
['iodine', 'compounds', 'halides']
['halides', 'fluorides']
['gemstones', 'minerals']
['halides', 'fluorides']
['gemstones', 'minerals']
['halides', 'fluorides']
['gemstones', 'minerals']
['halides', 'fluorides']
['gemstones'

['arthropoda', 'invertebrata', 'mandibulata', 'insecta', 'hymenopteroida']
['insects', 'by', 'classification']
['arthropoda', 'invertebrata', 'mandibulata', 'insecta', 'hymenopteroida']
['insects', 'by', 'classification']
['arthropoda', 'invertebrata', 'mandibulata', 'insecta', 'hymenopteroida']
['insects', 'by', 'classification']
['arthropoda', 'invertebrata', 'mandibulata', 'insecta', 'hymenopteroida']
['insects', 'by', 'classification']
['lepidopteroida', 'arthropoda', 'invertebrata', 'mandibulata', 'insecta']
['insects', 'by', 'classification']
exception for lepidopteroida
exception for arthropoda
exception for invertebrata
exception for mandibulata
exception for insecta
['lepidopteroida', 'arthropoda', 'invertebrata', 'mandibulata', 'insecta']
['insects', 'by', 'classification']
['lepidopteroida', 'arthropoda', 'invertebrata', 'mandibulata', 'insecta']
['insects', 'by', 'classification']
['lepidopteroida', 'arthropoda', 'invertebrata', 'mandibulata', 'insecta']
['insects', 'by', '

['jordan']
['decapolis', 'populated', 'places', 'amman', 'governorate', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'jordan', 'capitals', 'asia', 'wikipedia', 'categories', 'named', 'after', 'capitals']
['jordan']
['decapolis', 'populated', 'places', 'amman', 'governorate', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'jordan', 'capitals', 'asia', 'wikipedia', 'categories', 'named', 'after', 'capitals']
['jordan']
['decapolis', 'populated', 'places', 'amman', 'governorate', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'jordan', 'capitals', 'asia', 'wikipedia', 'categories', 'named', 'after', 'capitals']
['jordan']
['decapolis', 'populated', 'places', 'amman', 'governorate', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'jordan', 'capitals', 'asia', 'wikipedia', 'categories', 'named', 'after', 'capitals']
['kazakhstan']
['cities', 'towns', 'kazakhstan', 'populated', 'places', 'kazakhstan', 'regio

['latin', 'america', 'panama', 'panama', 'city', 'panama', 'central', 'america']
['spanish-speaking', 'countries', 'territories', 'countries', 'north', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', 'countries', 'central', 'america']
['latin', 'america', 'panama', 'panama', 'city', 'panama', 'central', 'america']
['spanish-speaking', 'countries', 'territories', 'countries', 'north', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', 'countries', 'central', 'america']
['latin', 'america', 'panama', 'panama', 'city', 'panama', 'central', 'america']
['spanish-speaking', 'countries', 'territories', 'countries', 'north', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', 'countries', 'central', 'america']
exception for spanish-speaking
exception for spanish-speaking
exception for spanish-speaking
exception for spanish-speaking
exception for spanish-speaking
exception for spanish-speaking
excepti

exception for spanish-speaking
['south', 'america', 'latin', 'america']
['spanish-speaking', 'countries', 'territories', 'countries', 'south', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries']
['south', 'america', 'latin', 'america']
['spanish-speaking', 'countries', 'territories', 'countries', 'south', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries']
['south', 'america', 'latin', 'america']
['spanish-speaking', 'countries', 'territories', 'countries', 'south', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries']
['south', 'america', 'latin', 'america']
['spanish-speaking', 'countries', 'territories', 'countries', 'south', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries']
['lithuania']
['capitals', 'lithuanian', 'counties', 'cities', 'lithuania', 'cities', 'vilnius', 'county', 'capitals', 'europe', 'vilnius', 'city', 'municipality', 'wikipedia', 'categories', 'named', 'after', 'capitals']
exception for lithuania


['caribbean', 'region', 'lesser', 'antilles', 'france', 'central', 'america', 'antilles', 'west', 'indies']
['outermost', 'regions', 'european', 'union', 'departments', 'france', 'island', 'countries', 'regions', 'france', 'caribbean', 'territories', 'or', 'dependencies', 'wikipedia', 'categories', 'named', 'after', 'islands', 'windward', 'islands', 'wikipedia', 'categories', 'named', 'after', 'former', 'countries', 'islands', 'france', 'french', 'north', 'america', 'overseas', 'departments', 'france']
['caribbean', 'region', 'lesser', 'antilles', 'france', 'central', 'america', 'antilles', 'west', 'indies']
['outermost', 'regions', 'european', 'union', 'departments', 'france', 'island', 'countries', 'regions', 'france', 'caribbean', 'territories', 'or', 'dependencies', 'wikipedia', 'categories', 'named', 'after', 'islands', 'windward', 'islands', 'wikipedia', 'categories', 'named', 'after', 'former', 'countries', 'islands', 'france', 'french', 'north', 'america', 'overseas', 'departme

exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
exception for aegean
['mediterranean', 'region', 'europe', 'europa', 'island', 'southern', 'europe', 'greece', 'greek', 'aegean', 'islands']
['landforms', 'south', 'aegean', 'wikipedia', 'categories', 'named', 'after', 'islands', 'prefectures', 'greece', 'archipelagoes', 'mediterranean', 'sea', 'archipelagoes', 'greece', 'traditional', 'geographic', 'divisions', 'greece', 'aegean', 'islands']
['mediterranean', 'region', 'europe', 'europa', 'island', 'southern', 'europe', 'greece', 'greek', 'aegean', 'islands']
['landforms', 'south', 'aegean', 'wikipedia', 'categories', 'nam

['metals']
['metals', 'transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['metals', 'transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['metalloids', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
exception for metalloids
['metals']
['metalloids', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['metalloids', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['metalloids', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['metalloids', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['post-transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', '

['metals']
['transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals']
['transition', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', '

['metals', 'actinides']
['actinides', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'actinides']
['actinides', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'actinides']
['actinides', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'actinides']
['actinides', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'actinides']
['actinides', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'actinides']
['actinides', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'actinides']
['actinides', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'actinides']
['actinides', 'chemical', 'elements', 'wikipedia', 'categor

['metals', 'alkali', 'metals']
['alkali', 'metals', 'coolants', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'alkali', 'metals']
['alkali', 'metals', 'coolants', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'alkali', 'metals']
['alkali', 'metals', 'coolants', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'alkali', 'metals']
['alkali', 'metals', 'coolants', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'alkaline', 'earth', 'metals']
['alkaline', 'earth', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'alkaline', 'earth', 'metals']
['alkaline', 'earth', 'metals', 'chemical', 'elements', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'alkaline', 'earth', 'metal

['metals', 'platinum', 'group']
['precious', 'metals', 'transition', 'metals', 'chemical', 'elements', 'platinum-group', 'metals', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'platinum', 'group']
['precious', 'metals', 'transition', 'metals', 'chemical', 'elements', 'platinum-group', 'metals', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'platinum', 'group']
['precious', 'metals', 'transition', 'metals', 'chemical', 'elements', 'platinum-group', 'metals', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
exception for platinum-group
exception for platinum-group
exception for platinum-group
['metals', 'platinum', 'group']
['precious', 'metals', 'transition', 'metals', 'chemical', 'elements', 'platinum-group', 'metals', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'platinum', 'group']
['precious', 'metals', 'transition', 'metals', 'chemical', 'elements', 'platinum-

['metals', 'rare', 'earths']
['chemical', 'elements', 'lanthanides', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'rare', 'earths']
['chemical', 'elements', 'lanthanides', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'rare', 'earths']
['chemical', 'elements', 'lanthanides', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'rare', 'earths']
['chemical', 'elements', 'lanthanides', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'rare', 'earths']
['chemical', 'elements', 'lanthanides', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'rare', 'earths']
['chemical', 'elements', 'lanthanides', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'rare', 'earths']
['chemical', 'elements', 'lanthanides', 'wikipedia', 'categories', 'named', 'after', 'chemical', 'elements']
['metals', 'rare', 'earths']
['che

['minerals']
['halogen', 'compounds', 'chemical', 'compounds']
['minerals']
['halogen', 'compounds', 'chemical', 'compounds']
['minerals']
['halogen', 'compounds', 'chemical', 'compounds']
['minerals']
['halogen', 'compounds', 'chemical', 'compounds']
['minerals']
['antimony', 'compounds']
['minerals']
['antimony', 'compounds']
['minerals']
['antimony', 'compounds']
['minerals']
['antimony', 'compounds']
['minerals']
['antimony', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['arsenic', 'compounds']
['minerals']
['boron', 'compounds']
['minerals']
['boron', 'compounds']
['minerals']
['boron', 'compounds']
['minerals']
['boron', 'compounds']
['minerals']
['boron', 'compo

['algeria', 'algeri', 'algeria', 'north', 'africa']
['member', 'states', 'african', 'union', 'member', 'states', 'arab', 'league', 'north', 'african', 'countries', 'maghrebi', 'countries', 'countries', 'africa', 'arabic-speaking', 'countries', 'territories', 'states', 'territories', 'established', '1962', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', '1962', 'establishments', 'africa']
['algeria', 'algeri', 'algeria', 'north', 'africa']
['member', 'states', 'african', 'union', 'member', 'states', 'arab', 'league', 'north', 'african', 'countries', 'maghrebi', 'countries', 'countries', 'africa', 'arabic-speaking', 'countries', 'territories', 'states', 'territories', 'established', '1962', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics', '1962', 'establishments', 'africa']
['algeria', 'algeri', 'algeria', 'north', 'africa']
['member', 'states', 'african', 'union', 'member', 'states', 'arab', 'league', 'north', 'african', 'countries', 'maghreb

['oceania']
['regions', 'oceania']
exception for oceania
['oceania']
['regions', 'oceania']
['oceania']
['regions', 'oceania']
['oceania']
['regions', 'oceania']
['oceania']
['regions', 'oceania']
['oceania']
['regions', 'oceania']
exception for oceania
['oceania']
['regions', 'oceania']
['oceania']
['regions', 'oceania']
['oceania']
['regions', 'oceania']
['oceania']
['regions', 'oceania']
['oceania']
['island', 'countries', 'wikipedia', 'categories', 'named', 'after', 'islands', 'countries', 'micronesia', 'atolls', 'pacific', 'ocean', 'british', 'western', 'pacific', 'territories', 'countries', 'oceania', 'member', 'states', 'commonwealth', 'nations', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
exception for oceania
['oceania']
['island', 'countries', 'wikipedia', 'categories', 'named', 'after', 'islands', 'countries', 'micronesia', 'atolls', 'pacific', 'ocean', 'british', 'western', 'pacific', 'territories', 

['polynesia', 'oceania']
['realm', 'new', 'zealand', 'island', 'countries', 'wikipedia', 'categories', 'named', 'after', 'associated', 'states', 'wikipedia', 'categories', 'named', 'after', 'dependent', 'territories', 'associated', 'states', 'new', 'zealand', 'wikipedia', 'categories', 'named', 'after', 'islands', 'autonomous', 'regions', 'british', 'western', 'pacific', 'territories', 'countries', 'oceania', 'countries', 'polynesia', 'wikipedia', 'categories', 'named', 'after', 'countries', 'new', 'zealand–pacific', 'relations']
['polynesia', 'oceania']
['realm', 'new', 'zealand', 'island', 'countries', 'wikipedia', 'categories', 'named', 'after', 'associated', 'states', 'wikipedia', 'categories', 'named', 'after', 'dependent', 'territories', 'associated', 'states', 'new', 'zealand', 'wikipedia', 'categories', 'named', 'after', 'islands', 'autonomous', 'regions', 'british', 'western', 'pacific', 'territories', 'countries', 'oceania', 'countries', 'polynesia', 'wikipedia', 'categories'

['aliphatic', 'hydrocarbons', 'organic', 'materials', 'hydrocarbons']
['hydrocarbons', 'organic', 'compounds']
['aliphatic', 'hydrocarbons', 'organic', 'materials', 'hydrocarbons']
['hydrocarbons', 'organic', 'compounds']
['aliphatic', 'hydrocarbons', 'organic', 'materials', 'hydrocarbons']
['hydrocarbons', 'organic', 'compounds']
['aliphatic', 'hydrocarbons', 'organic', 'materials', 'hydrocarbons']
['hydrocarbons', 'organic', 'compounds']
['aliphatic', 'hydrocarbons', 'organic', 'materials', 'hydrocarbons']
['hydrocarbons', 'organic', 'compounds']
['aliphatic', 'hydrocarbons', 'organic', 'materials', 'hydrocarbons']
['hydrocarbons', 'organic', 'compounds']
['aliphatic', 'hydrocarbons', 'organic', 'materials', 'hydrocarbons']
['hydrocarbons', 'organic', 'compounds']
['aliphatic', 'hydrocarbons', 'organic', 'materials', 'hydrocarbons']
['hydrocarbons', 'organic', 'compounds']
['aliphatic', 'hydrocarbons', 'organic', 'materials', 'hydrocarbons']
['hydrocarbons', 'organic', 'compounds']
[

['oxides', 'selenides', 'tungstates', 'tellurides', 'silicates', 'vanadates', 'tellurates', 'selenites', 'organic', 'minerals', 'sulfides', 'molybdates', 'nitrates', 'sulfosalts', 'selenates', 'sulfates']
['crystalline', 'solids', 'natural', 'materials', 'natural', 'resources', 'mineralogy', 'chemical', 'compounds']
['oxides', 'selenides', 'tungstates', 'tellurides', 'silicates', 'vanadates', 'tellurates', 'selenites', 'organic', 'minerals', 'sulfides', 'molybdates', 'nitrates', 'sulfosalts', 'selenates', 'sulfates']
['crystalline', 'solids', 'natural', 'materials', 'natural', 'resources', 'mineralogy', 'chemical', 'compounds']
['oxides', 'selenides', 'tungstates', 'tellurides', 'silicates', 'vanadates', 'tellurates', 'selenites', 'organic', 'minerals', 'sulfides', 'molybdates', 'nitrates', 'sulfosalts', 'selenates', 'sulfates']
['crystalline', 'solids', 'natural', 'materials', 'natural', 'resources', 'mineralogy', 'chemical', 'compounds']
['oxides', 'selenides', 'tungstates', 'telluri

['planets', 'outer', 'planets']
['multiple', 'trans-neptunian', 'objects', 'plutinos', 'plutoids', 'wikipedia', 'categories', 'named', 'after', 'solar', 'system', 'objects']
['planets', 'outer', 'planets']
['multiple', 'trans-neptunian', 'objects', 'plutinos', 'plutoids', 'wikipedia', 'categories', 'named', 'after', 'solar', 'system', 'objects']
['planets', 'outer', 'planets']
['multiple', 'trans-neptunian', 'objects', 'plutinos', 'plutoids', 'wikipedia', 'categories', 'named', 'after', 'solar', 'system', 'objects']
['planets', 'outer', 'planets']
['multiple', 'trans-neptunian', 'objects', 'plutinos', 'plutoids', 'wikipedia', 'categories', 'named', 'after', 'solar', 'system', 'objects']
['planets', 'outer', 'planets']
['wikipedia', 'categories', 'named', 'after', 'planets', 'outer', 'planets', 'planets', 'solar', 'system', 'wikipedia', 'categories', 'named', 'after', 'solar', 'system', 'objects', 'gas', 'giants']
['planets', 'outer', 'planets']
['wikipedia', 'categories', 'named', 'aft

['terrestrial', 'planets', 'planets']
['wikipedia', 'categories', 'named', 'after', 'planets', 'terrestrial', 'planets', 'nature', 'planets', 'solar', 'system', 'habitable', 'zone', 'planets', 'wikipedia', 'categories', 'named', 'after', 'solar', 'system', 'objects', 'places']
['terrestrial', 'planets', 'planets']
['wikipedia', 'categories', 'named', 'after', 'planets', 'planets', 'solar', 'system', 'wikipedia', 'categories', 'named', 'after', 'solar', 'system', 'objects']
['terrestrial', 'planets', 'planets']
['wikipedia', 'categories', 'named', 'after', 'planets', 'planets', 'solar', 'system', 'wikipedia', 'categories', 'named', 'after', 'solar', 'system', 'objects']
['terrestrial', 'planets', 'planets']
['wikipedia', 'categories', 'named', 'after', 'planets', 'planets', 'solar', 'system', 'wikipedia', 'categories', 'named', 'after', 'solar', 'system', 'objects']
['terrestrial', 'planets', 'planets']
['wikipedia', 'categories', 'named', 'after', 'planets', 'planets', 'solar', 'system

['precambrian']
['geological', 'eons', 'precambrian']
exception for precambrian
['precambrian']
['geological', 'eons', 'precambrian']
['precambrian']
['geological', 'eons', 'precambrian']
['precambrian']
['geological', 'eons', 'precambrian']
['precambrian']
['geological', 'eons', 'precambrian']
['upper', 'precambrian', 'precambrian']
['geological', 'eons', 'precambrian']
exception for precambrian
exception for precambrian
exception for precambrian
['upper', 'precambrian', 'precambrian']
['geological', 'eons', 'precambrian']
['upper', 'precambrian', 'precambrian']
['geological', 'eons', 'precambrian']
['upper', 'precambrian', 'precambrian']
['geological', 'eons', 'precambrian']
['upper', 'precambrian', 'precambrian']
['geological', 'eons', 'precambrian']
['railroads', 'materials']
['road', 'infrastructure', 'pedestrian', 'infrastructure', 'floors', 'road', 'construction', 'building', 'materials', 'pavement', 'engineering']
['railroads', 'materials']
['road', 'infrastructure', 'pedestria

['seismology']
['length', 'physical', 'quantities', 'dynamics', '(mechanics)', 'temporal', 'rates', 'kinematics', 'velocity']
['seismology']
['length', 'physical', 'quantities', 'dynamics', '(mechanics)', 'temporal', 'rates', 'kinematics', 'velocity']
['seismology']
['length', 'physical', 'quantities', 'dynamics', '(mechanics)', 'temporal', 'rates', 'kinematics', 'velocity']
['shore', 'features']
['bodies', 'water', 'coastal', 'oceanic', 'landforms']
['shore', 'features']
['bodies', 'water', 'coastal', 'oceanic', 'landforms']
['shore', 'features']
['bodies', 'water', 'coastal', 'oceanic', 'landforms']
['shore', 'features']
['bodies', 'water', 'coastal', 'oceanic', 'landforms']
['shore', 'features']
['bodies', 'water', 'coastal', 'oceanic', 'landforms']
['shore', 'features']
['bodies', 'water', 'coastal', 'oceanic', 'landforms']
['shore', 'features']
['bodies', 'water', 'coastal', 'oceanic', 'landforms']
['shore', 'features']
['bodies', 'water', 'coastal', 'oceanic', 'landforms']
['shor

['south', 'america']
['member', 'states', 'community', 'portuguese', 'language', 'countries', 'romance', 'countries', 'territories', 'federal', 'republics', 'countries', 'south', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries', 'member', 'states', 'union', 'south', 'american', 'nations']
['south', 'america']
['member', 'states', 'community', 'portuguese', 'language', 'countries', 'romance', 'countries', 'territories', 'federal', 'republics', 'countries', 'south', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries', 'member', 'states', 'union', 'south', 'american', 'nations']
['south', 'america']
['member', 'states', 'community', 'portuguese', 'language', 'countries', 'romance', 'countries', 'territories', 'federal', 'republics', 'countries', 'south', 'america', 'wikipedia', 'categories', 'named', 'after', 'countries', 'member', 'states', 'union', 'south', 'american', 'nations']
['south', 'america']
['member', 'states', 'community', 'portuguese', 'lan

['southern', 'europe']
['island', 'countries', 'states', 'territories', 'established', '1964', 'nuts', '1', 'statistical', 'regions', 'european', 'union', 'wikipedia', 'categories', 'named', 'after', 'islands', 'archipelagoes', 'mediterranean', 'sea', 'member', 'states', 'european', 'union', 'countries', 'europe', 'southern', 'european', 'countries', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['southern', 'europe']
['island', 'countries', 'states', 'territories', 'established', '1964', 'nuts', '1', 'statistical', 'regions', 'european', 'union', 'wikipedia', 'categories', 'named', 'after', 'islands', 'archipelagoes', 'mediterranean', 'sea', 'member', 'states', 'european', 'union', 'countries', 'europe', 'southern', 'european', 'countries', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries', 'republics']
['southern', 'europe']
['island', 'countries', 'states',

['trinidad', 'tobago']
['islands', 'trinidad', 'tobago', 'wikipedia', 'categories', 'named', 'after', 'islands']
['trinidad', 'tobago']
['islands', 'trinidad', 'tobago', 'wikipedia', 'categories', 'named', 'after', 'islands']
['tuscany', 'islands', 'europe', 'europa', 'island']
['wikipedia', 'categories', 'named', 'after', 'islands', 'province', 'livorno', 'islands', 'tuscany']
exception for tuscany
exception for livorno
exception for tuscany
exception for livorno
exception for tuscany
exception for livorno
exception for tuscany
exception for livorno
exception for tuscany
['tuscany', 'islands', 'europe', 'europa', 'island']
['wikipedia', 'categories', 'named', 'after', 'islands', 'province', 'livorno', 'islands', 'tuscany']
['tuscany', 'islands', 'europe', 'europa', 'island']
['wikipedia', 'categories', 'named', 'after', 'islands', 'province', 'livorno', 'islands', 'tuscany']
['tuscany', 'islands', 'europe', 'europa', 'island']
['wikipedia', 'categories', 'named', 'after', 'islands', '

['uzbekistan']
['cities', 'uzbekistan', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'uzbekistan', 'populated', 'places', 'tashkent', 'region', 'capitals', 'asia', 'wikipedia', 'categories', 'named', 'after', 'capitals']
['uzbekistan']
['cities', 'uzbekistan', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'uzbekistan', 'populated', 'places', 'tashkent', 'region', 'capitals', 'asia', 'wikipedia', 'categories', 'named', 'after', 'capitals']
['uzbekistan']
['cities', 'uzbekistan', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'uzbekistan', 'populated', 'places', 'tashkent', 'region', 'capitals', 'asia', 'wikipedia', 'categories', 'named', 'after', 'capitals']
['uzbekistan']
['cities', 'uzbekistan', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'uzbekistan', 'populated', 'places', 'tashkent', 'region', 'capitals', 'asia', 'wikipedia', 'categories', 'named', 'after', 'capitals']
['venezuela']
['wikipedi

['caribbean', 'region', 'greater', 'antilles', 'central', 'america', 'antilles', 'west', 'indies']
['island', 'countries', 'wikipedia', 'categories', 'named', 'after', 'islands', 'countries', 'caribbean', 'countries', 'north', 'america', 'member', 'states', 'commonwealth', 'nations', 'states', 'territories', 'established', '1962', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries']
['caribbean', 'region', 'greater', 'antilles', 'central', 'america', 'antilles', 'west', 'indies']
['island', 'countries', 'wikipedia', 'categories', 'named', 'after', 'islands', 'countries', 'caribbean', 'countries', 'north', 'america', 'member', 'states', 'commonwealth', 'nations', 'states', 'territories', 'established', '1962', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries']
['caribbean', 'region', 'greater', 'antilles', 'central', 'america', 'antilles', 'west', 'indies']
['island', 'countries',

['caribbean', 'region', 'lesser', 'antilles', 'central', 'america', 'antilles', 'west', 'indies']
['island', 'countries', 'countries', 'caribbean', 'countries', 'north', 'america', 'member', 'states', 'commonwealth', 'nations', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries']
['caribbean', 'region', 'lesser', 'antilles', 'central', 'america', 'antilles', 'west', 'indies']
['island', 'countries', 'countries', 'caribbean', 'countries', 'north', 'america', 'member', 'states', 'commonwealth', 'nations', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries']
['caribbean', 'region', 'lesser', 'antilles', 'central', 'america', 'antilles', 'west', 'indies']
['island', 'countries', 'countries', 'caribbean', 'countries', 'north', 'america', 'member', 'states', 'commonwealth', 'nations', 'english-speaking', 'countries', 'territories', 'wikipedia', 'categories', 'named', 'after', 'countries'

['greece']
['wikipedia', 'categories', 'named', 'after', 'islands', 'tourism', 'greece', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'greece', 'municipalities', 'ionian', 'islands', '(region)']
['greece']
['wikipedia', 'categories', 'named', 'after', 'islands', 'tourism', 'greece', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'greece', 'municipalities', 'ionian', 'islands', '(region)']
['greece']
['wikipedia', 'categories', 'named', 'after', 'islands', 'tourism', 'greece', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'greece', 'municipalities', 'ionian', 'islands', '(region)']
['greece']
['wikipedia', 'categories', 'named', 'after', 'islands', 'tourism', 'greece', 'wikipedia', 'categories', 'named', 'after', 'populated', 'places', 'greece', 'municipalities', 'ionian', 'islands', '(region)']


#### Printing the similaritities according to the Similarity 

In [71]:
#df['KindOfLink']=df['KindOfLink'].fillna('W')
df.head()


Unnamed: 0.1,Unnamed: 0,s,o,KindOfLink,sBT,sNT,sRT,sPrefLabel,sAltLabels,oPrefLabel,...,oBT,oNT,oRT,similaritySInW,wmdistance,Mwmdistance,SMwmdistance,nhammingSim,MnhammingSim,SMnhammingSim
559,559,http://linkeddata.ge.imati.cnr.it/resource/ThI...,http://dbpedia.org/resource/Category:Seams,C,,,mineral deposits| mining| stratiform deposits|...,seams,,Seams,...,Sewing,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0
560,560,http://linkeddata.ge.imati.cnr.it/resource/ThI...,http://dbpedia.org/resource/Category:Tides,E,,,ocean circulation| tidal currents| littoral er...,tides,,Tides,...,Geophysics| Moon| Physical oceanography| Tidal...,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0
561,561,http://linkeddata.ge.imati.cnr.it/resource/ThI...,http://dbpedia.org/resource/Category:Tides,C,,,ocean circulation| tidal currents| littoral er...,tides,,Tides,...,Geophysics| Moon| Physical oceanography| Tidal...,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0
562,562,http://linkeddata.ge.imati.cnr.it/resource/ThI...,http://dbpedia.org/resource/Category:Transport...,E,,,subways| pipelines| sediment transport,transportation,,Transportation,...,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0
563,563,http://linkeddata.ge.imati.cnr.it/resource/ThI...,http://dbpedia.org/resource/Category:Ornamenta...,W,,,paleontology| shells,ornamentation,,Ornamentation,...,Melody| Musical terminology| Musical technique...,,Ornament,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [75]:

#saving the results in Thist2DBPEDIA_ok_res.csv
df.to_csv(path+'Thist2DBPEDIA_ok_res.csv', sep=';') 

In [76]:
df['RT_similaritySInW']=0.0 
df['RT_wmdistance']=df['RT_Mwmdistance']= df['RT_SMwmdistance']= 0.0
df['RT_nhammingSim']= df['RT_MnhammingSim']= df['RT_SMnhammingSim']= 0.0

l = range(1, len(df))
for i in l:
    if (df.sRT.iloc[i]!='') and (df.oRT.iloc[i]!=''):
         df['RT_similaritySInW'][i] =similarityBetweenSetsSplitingInWords(df.sRT[i], df.oRT[i])
         df['RT_wmdistance'][i]=word_vectors.wmdistance(df.sRT[i], df.oRT[i])
         df['RT_Mwmdistance'][i]= maxInSplitWords(df.sRT[i], df.oRT[i], word_vectors.wmdistance)
         df['RT_SMwmdistance'][i]= summingMax(df.sRT[i], df.oRT[i], word_vectors.wmdistance)
         df['RT_nhammingSim'][i]=textdistance.hamming.normalized_similarity(df.sRT[i], df.oRT[i])
         df['RT_MnhammingSim'][i]=maxInSplitWords(df.sRT[i], df.oRT[i],textdistance.hamming.normalized_similarity)
         df['RT_SMnhammingSim'][i] =summingMax(df.sRT[i], df.oRT[i],textdistance.hamming.normalized_similarity)

['gulf', 'guinea', 'volta', 'river', 'volta', 'basin']
['guinea-bissau', 'equatorial', 'guinea', 'french', 'guiana', 'guyana', 'papua', 'new', 'guinea', 'new', 'guinea', 'guinea']
exception for guinea-bissau
exception for guiana
exception for guyana
exception for papua
exception for guinea-bissau
exception for guiana
exception for guyana
exception for papua
exception for guinea-bissau
exception for guiana
exception for guyana
exception for papua
exception for guinea-bissau
exception for guiana
exception for guyana
exception for papua
exception for guinea-bissau
exception for guiana
exception for guyana
exception for papua
exception for guinea-bissau
exception for guiana
exception for guyana
exception for papua


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if __name__ == '__main__':


['gulf', 'guinea', 'volta', 'river', 'volta', 'basin']
['guinea-bissau', 'equatorial', 'guinea', 'french', 'guiana', 'guyana', 'papua', 'new', 'guinea', 'new', 'guinea', 'guinea']


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  # Remove the CWD from sys.path while we load stuff.
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  # This is added back by InteractiveShellApp.init_path()
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if sys.path[0] == '':
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  del sys.path[0]
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the 

['gulf', 'guinea', 'volta', 'river', 'volta', 'basin']
['guinea-bissau', 'equatorial', 'guinea', 'french', 'guiana', 'guyana', 'papua', 'new', 'guinea', 'new', 'guinea', 'guinea']
['gulf', 'guinea', 'volta', 'river', 'volta', 'basin']
['guinea-bissau', 'equatorial', 'guinea', 'french', 'guiana', 'guyana', 'papua', 'new', 'guinea', 'new', 'guinea', 'guinea']
['gulf', 'guinea', 'volta', 'river', 'volta', 'basin']
['guinea-bissau', 'equatorial', 'guinea', 'french', 'guiana', 'guyana', 'papua', 'new', 'guinea', 'new', 'guinea', 'guinea']
['caucasus', 'azov', 'sea', 'ussr', 'urals']
['soviet', 'union']
exception for caucasus
exception for azov
exception for urals
['caucasus', 'azov', 'sea', 'ussr', 'urals']
['soviet', 'union']
['caucasus', 'azov', 'sea', 'ussr', 'urals']
['soviet', 'union']
['caucasus', 'azov', 'sea', 'ussr', 'urals']
['soviet', 'union']
['caucasus', 'azov', 'sea', 'ussr', 'urals']
['soviet', 'union']
['malaysia', 'brunei', 'indonesia', 'pacific', 'ocean']
['kalimantan']
ex

['seismic', 'energy', 'richter', 'scale', 'isoseismic', 'maps', 'seismic', 'zoning', 'seismograms', 'magnitude', 'intraplate', 'processes', 'faults', 'stick-slip', 'moonquakes', 'detection', 'focal', 'mechanism', 'tiltmeters', 'vibration', 'damage', 'reservoirs', 'cratering', 'main', 'shocks', 'surface', 'waves', 'applied', 'geology', 'focus', 'clinometers', 'foundations', 'ground', 'motion', 'arrival', 'time', 'aftershocks', 'seismic', 'intensity', 'mantle', 'explosions', 'paleoseismicity', 'epicenters', 'plate', 'tectonics', 'seismic', 'gaps', 'elastic', 'waves', 'geologic', 'hazards', 'radon', 'emanometry', 'b-values', 'swarms', 'earthquake', 'prediction', 'q', 'rock', 'mechanics', 'dilatancy', 'fluid', 'injection', 'mohorovicic', 'discontinuity', 'icequakes', 'acoustical', 'emissions', 'quiescence', 'microseisms', 'teleseismic', 'signals', 'geos', 'soil', 'mechanics', 'seismic', 'moment', 'seismic', 'response', 'seismic', 'sources', 'slope', 'stability', 'seismographs', 'stress', '

['polynesia', 'islands', 'pacific', 'ocean']
['american', 'samoa']
exception for polynesia
exception for samoa
exception for samoa
exception for samoa
['polynesia', 'islands', 'pacific', 'ocean']
['american', 'samoa']
['polynesia', 'islands', 'pacific', 'ocean']
['american', 'samoa']
['polynesia', 'islands', 'pacific', 'ocean']
['american', 'samoa']
['polynesia', 'islands', 'pacific', 'ocean']
['american', 'samoa']
['phenols']
['alcohol', 'alcoholic', 'drinks']
['phenols']
['alcohol', 'alcoholic', 'drinks']
['phenols']
['alcohol', 'alcoholic', 'drinks']
['phenols']
['alcohol', 'alcoholic', 'drinks']
['phenols']
['alcohol', 'alcoholic', 'drinks']
['activation', 'energy']
['isomerases', 'ligases', 'hydrolases', 'oxidoreductases', 'enzymes', 'by', 'function', 'lyases', 'transferases']
exception for isomerases
exception for oxidoreductases
exception for lyases
exception for isomerases
exception for oxidoreductases
exception for lyases
['activation', 'energy']
['isomerases', 'ligases', 'hyd

['steel', 'industry', 'metallurgy', 'alloys', 'iron']
['steels']
['steel', 'industry', 'metallurgy', 'alloys', 'iron']
['steels']
['steel', 'industry', 'metallurgy', 'alloys', 'iron']
['steels']
['steel', 'industry', 'metallurgy', 'alloys', 'iron']
['steels']
['steel', 'industry', 'metallurgy', 'alloys', 'iron']
['steels']
['steel', 'industry', 'metallurgy', 'alloys', 'iron']
['steels']
['steel', 'industry', 'metallurgy', 'alloys', 'iron']
['steels']
['steel', 'industry', 'metallurgy', 'alloys', 'iron']
['steels']
['steel', 'industry', 'metallurgy', 'alloys', 'iron']
['steels']
['irrigation', 'streams', 'channelization', 'waterways', 'water', 'supply', 'channels']
['canals', 'by', 'country']
['irrigation', 'streams', 'channelization', 'waterways', 'water', 'supply', 'channels']
['canals', 'by', 'country']
['irrigation', 'streams', 'channelization', 'waterways', 'water', 'supply', 'channels']
['canals', 'by', 'country']
['irrigation', 'streams', 'channelization', 'waterways', 'water', '

['monographs', 'publications', 'lexicons', 'glossaries']
['encyclopedias']
['monographs', 'publications', 'lexicons', 'glossaries']
['encyclopedias']
['monographs', 'publications', 'lexicons', 'glossaries']
['encyclopedias']
['monographs', 'publications', 'lexicons', 'glossaries']
['encyclopedias']
['monographs', 'publications', 'lexicons', 'glossaries']
['encyclopedias']
['photography', 'photogeology', 'aerial', 'photography', 'imagery', 'multispectral', 'scanner', 'multispectral', 'analysis', 'space', 'photography', 'photogeologic', 'maps', 'photogrammetry', 'remote', 'sensing']
['mosaic']
exception for photogeology
exception for photogeologic
['photography', 'photogeology', 'aerial', 'photography', 'imagery', 'multispectral', 'scanner', 'multispectral', 'analysis', 'space', 'photography', 'photogeologic', 'maps', 'photogrammetry', 'remote', 'sensing']
['mosaic']
['photography', 'photogeology', 'aerial', 'photography', 'imagery', 'multispectral', 'scanner', 'multispectral', 'analysis

['stress', 'fields', 'finite', 'strain', 'analysis', 'elasticity', 'extensometers', "poisson's", 'ratio', 'viscoelasticity', 'bearing', 'capacity', 'stressmeters', 'release', 'fractures', 'tension', 'shear', 'rigidity', 'strength', 'pore', 'pressure', 'brittleness', 'creep', 'rock', 'mechanics', 'failures', 'structural', 'analysis', "hooke's", 'law', 'torsion', 'rupture', 'shear', 'stress', 'hysteresis', 'deformation', 'yield', 'strength', 'strain', 'stress', 'drops', 'pressure', 'seismology', 'shear', 'strength']
['mood', 'disorders', 'positive', 'psychology', 'happiness', 'positive', 'mental', 'attitude', 'motivation']
['stress', 'fields', 'finite', 'strain', 'analysis', 'elasticity', 'extensometers', "poisson's", 'ratio', 'viscoelasticity', 'bearing', 'capacity', 'stressmeters', 'release', 'fractures', 'tension', 'shear', 'rigidity', 'strength', 'pore', 'pressure', 'brittleness', 'creep', 'rock', 'mechanics', 'failures', 'structural', 'analysis', "hooke's", 'law', 'torsion', 'ruptur

['taiga', 'environment', 'cyclones', 'climatic', 'controls', 'semi-arid', 'environment', 'atmospheric', 'precipitation', 'meteorology', 'temperate', 'environment', 'global', 'soils', 'drought', 'arid', 'environment', 'humidity', 'equatorial', 'region', 'paleoclimatology', 'soil-water', 'balance', 'length', 'day', 'greenhouse', 'effect', 'climatologic', 'maps', 'arctic', 'environment', 'subtropical', 'environment', 'climate-induced', 'circulation', 'intermittent', 'stream', 'sedimentation', 'pluviometric', 'station', 'latitude', 'winds', 'temperature', 'alpine', 'environment', 'climate', 'effects', 'desertification', 'storms', 'obliquity', 'ecliptic', 'humid', 'environment', 'atmosphere', 'factors', 'climate', 'change', 'erosion', 'cycle', 'milankovitch', 'theory', 'foehn', 'monsoons', 'paleoatmosphere', 'pedogenesis', 'tropical', 'environment', 'global', 'warming', 'global', 'change', 'el', 'nino']
['meteorology']
['taiga', 'environment', 'cyclones', 'climatic', 'controls', 'semi-arid'

['passband', 'filters', 'wiener', 'filters', 'continuous', 'filters', 'geotextiles', 'recursive', 'filters', 'seismic', 'methods', 'dispersive', 'filters', 'multichannel', 'methods', 'adaptive', 'filters', 'numerical', 'filters', 'multichannel', 'filters', 'kalman', 'filters', 'elastic', 'waves', 'spatial', 'frequency', 'filters', 'analog', 'filters', 'discrete', 'filters', 'optimal', 'filters', 'deconvolution', 'signals', 'noise', 'distortion', 'seismology']
['signal', 'processing', 'filter', 'spam', 'filtering', 'anti-spam', 'optical', 'filters']
['passband', 'filters', 'wiener', 'filters', 'continuous', 'filters', 'geotextiles', 'recursive', 'filters', 'seismic', 'methods', 'dispersive', 'filters', 'multichannel', 'methods', 'adaptive', 'filters', 'numerical', 'filters', 'multichannel', 'filters', 'kalman', 'filters', 'elastic', 'waves', 'spatial', 'frequency', 'filters', 'analog', 'filters', 'discrete', 'filters', 'optimal', 'filters', 'deconvolution', 'signals', 'noise', 'distorti

['sewage', 'sludge', 'thermal', 'pollution', 'leaking', 'underground', 'storage', 'tanks', 'sulfuric', 'acid', 'toxic', 'materials', 'urban', 'geology', 'cyanides', 'water', 'treatment', 'metals', 'regulations', 'surface', 'water', 'waste', 'disposal', 'potability', 'hydrology', 'heavy', 'metals', 'urban', 'environment', 'pollutants', 'polluted', 'water', 'soil', 'gases', 'impurities', 'geochemistry', 'background', 'level', 'remediation', 'purification', 'environmental', 'monitoring', 'radioactive', 'waste', 'point', 'sources', 'industrial', 'waste', 'human', 'ecology', 'hydrocarbons', 'acid', 'mine', 'drainage', 'environmental', 'geology', 'medical', 'geology', 'waste', 'disposal', 'sites', 'sewers', 'geochemical', 'anomalies', 'conservation', 'oil', 'spills', 'geologic', 'hazards', 'dilution', 'decontamination', 'water', 'management', 'environmental', 'impact', 'evaluation', 'land', 'use', 'waste', 'water', 'reclamation', 'near-field', 'radioactivity', 'bioremediation', 'protected', 

['popular', 'geology', 'current', 'research']
['years', 'births', 'by', 'year', 'deaths', 'by', 'year', 'years', 'music', 'films', 'by', 'year', 'history', 'years', 'film']
['popular', 'geology', 'current', 'research']
['years', 'births', 'by', 'year', 'deaths', 'by', 'year', 'years', 'music', 'films', 'by', 'year', 'history', 'years', 'film']
['popular', 'geology', 'current', 'research']
['years', 'births', 'by', 'year', 'deaths', 'by', 'year', 'years', 'music', 'films', 'by', 'year', 'history', 'years', 'film']
['popular', 'geology', 'current', 'research']
['years', 'births', 'by', 'year', 'deaths', 'by', 'year', 'years', 'music', 'films', 'by', 'year', 'history', 'years', 'film']
['popular', 'geology', 'current', 'research']
['years', 'births', 'by', 'year', 'deaths', 'by', 'year', 'years', 'music', 'films', 'by', 'year', 'history', 'years', 'film']
['paleontology', 'predation', 'nutrients', 'trophic', 'analysis', 'diet', 'metabolism']
['food', 'science', 'digestive', 'system']
['pa

['crystal', 'structure']
['quasiregular', 'polyhedra', 'platonic', 'solids', 'archimedean', 'solids', 'images', 'polyhedra', 'tessellation', 'kepler–poinsot', 'polyhedra', 'prismatoid', 'polyhedra', 'uniform', 'polyhedra', 'catalan', 'solids', 'johnson', 'solids', 'pyramids', 'bipyramids']
['crystal', 'structure']
['quasiregular', 'polyhedra', 'platonic', 'solids', 'archimedean', 'solids', 'images', 'polyhedra', 'tessellation', 'kepler–poinsot', 'polyhedra', 'prismatoid', 'polyhedra', 'uniform', 'polyhedra', 'catalan', 'solids', 'johnson', 'solids', 'pyramids', 'bipyramids']
['runoff', 'atmospheric', 'precipitation', 'hydrology', 'crystallization', 'geochemistry', 'sedimentation', 'salt', 'water', 'watersheds', 'storms', 'water']
['clouds']
['runoff', 'atmospheric', 'precipitation', 'hydrology', 'crystallization', 'geochemistry', 'sedimentation', 'salt', 'water', 'watersheds', 'storms', 'water']
['clouds']
['runoff', 'atmospheric', 'precipitation', 'hydrology', 'crystallization', 'geoc

['bogs', 'marshes', 'lacustrine', 'features', 'shore', 'features', 'spartina', 'alterniflora', 'conservation', 'geomorphology', 'vegetation', 'wilderness', 'areas', 'grasslands', 'swamps', 'ecology', 'fluvial', 'features']
['landforms', 'bodies', 'water']
['bogs', 'marshes', 'lacustrine', 'features', 'shore', 'features', 'spartina', 'alterniflora', 'conservation', 'geomorphology', 'vegetation', 'wilderness', 'areas', 'grasslands', 'swamps', 'ecology', 'fluvial', 'features']
['landforms', 'bodies', 'water']
['autochthons', 'allochthons', 'structural', 'geology', 'tectonics', 'erosion', 'features', 'overthrust', 'faults']
['microsoft', 'windows']
exception for autochthons
exception for allochthons
['autochthons', 'allochthons', 'structural', 'geology', 'tectonics', 'erosion', 'features', 'overthrust', 'faults']
['microsoft', 'windows']
['autochthons', 'allochthons', 'structural', 'geology', 'tectonics', 'erosion', 'features', 'overthrust', 'faults']
['microsoft', 'windows']
['autochthons

In [77]:

#saving the results in Thist2DBPEDIA_ok_res.csv
df.to_csv(path+'Thist2DBPEDIA_ok_resWithRT.csv', sep=';') 

# Learning the Model


https://academy.rapidminer.com/courses/creating-a-decision-tree-model