# Re-implementing the procedure outlined in "Entity Profiling in Knowledge Graphs" (Zhang et al.)
# This notebook implements the candidate label creation step
Using a subset of Wikidata related to Q44 ("beer")

### Pre-requisite steps to run this notebook
1. If you do not have kgtk installed, or do not have the kgtk query command, first install it with `pip install -e <path to local kgtk repo>`.
2. You'll need a subset of Wikidata partitioned into different files on your machine. You can create this yourself, or, if you have access to the Table_Linker Google Drive, download the Q44 example data here: https://drive.google.com/drive/folders/1U3Tc25rRwu6xy74mPDOG5LIjhUXpbD9A?usp=sharing

In [12]:
import os
import pandas as pd
from utility import run_command
from utility import rename_cols_and_overwrite_id
from label_discretization import discretize_labels

### Parameters
**Required**  
*item_file*: file path for the file that contains entity to entity relationships (e.g. wikibase-item)  
*time_file*: file path for the file that contains entity to time-type values  
*quantity_file*: file path for the file that contains entity to quantity-type values  
*label_file*: file path for the file that contains wikidata labels  
*work_dir*: path to folder where files created by this notebook should be stored  
*store_dir*: path to folder containing the sqlite3.db file that we will use for our queries. We will reuse an existing file if there is one in this folder. Otherwise we will create a new one.

**Optional**    
*string_file*: file path for the file that contains entity to string-type values  

In [75]:
data_dir = "../../Q44/data" # my data files are all in the same directory, so I'll reuse this path prefix

# **REQUIRED**
item_file = "{}/Q44.part.wikibase-item.tsv".format(data_dir)
time_file = "{}/Q44.part.time.tsv".format(data_dir)
quantity_file = "{}/Q44.part.quantity.tsv".format(data_dir)
label_file = "{}/Q44.label.en.tsv".format(data_dir)
work_dir = "../../Q44/profiler_work_string_and_untrimmed_quantity"
store_dir = "../../Q44"

# **optional**
string_file = "{}/Q44.part.string.tsv".format(data_dir)

### Process parameters and set up variables / file names

In [76]:
# Ensure paths are absolute
item_file = os.path.abspath(item_file)
time_file = os.path.abspath(time_file)
quantity_file = os.path.abspath(quantity_file)
label_file = os.path.abspath(label_file)
work_dir = os.path.abspath(work_dir)
store_dir = os.path.abspath(store_dir)
if string_file:
    string_file = os.path.abspath(string_file)
    
# Create directories
if not os.path.exists(work_dir):
    os.makedirs(work_dir)
output_dir = "{}/label_creation".format(work_dir)
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

# adding some environment variables we'll be using frequently
os.environ['ITEM_FILE'] = item_file
os.environ['TIME_FILE'] = time_file
os.environ['QUANTITY_FILE'] = quantity_file
os.environ['LABEL_FILE'] = label_file
os.environ['STORE'] = "{}/wikidata.sqlite3.db".format(store_dir)
os.environ['OUT'] = output_dir
os.environ['kgtk'] = "kgtk" # Need to do this for kgtk to be recognized as a command when passing it through a subprocess call
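The `run_command` helper used later is imported from `utility`, and its source is not shown in this notebook. As a rough sketch of how such a helper might work, assuming it substitutes placeholder tokens and then hands the string to a shell (so that `$OUT`, `$STORE`, and `$kgtk` are expanded from the environment variables set above), something like the following would behave similarly. The name `run_command_sketch`, its signature, and the `DEMO_OUT` variable are hypothetical.

```python
import os
import subprocess


def run_command_sketch(command, substitutions=None):
    """Hypothetical stand-in for utility.run_command (real implementation not shown)."""
    # Replace bare placeholder tokens such as STRING_FILE with concrete values.
    for placeholder, value in (substitutions or {}).items():
        command = command.replace(placeholder, value)
    # shell=True lets the shell expand $VARS; the child inherits os.environ.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(result.stderr)
    return result.stdout


# Demo with a throwaway variable instead of the notebook's real paths.
os.environ["DEMO_OUT"] = "/tmp/demo"
print(run_command_sketch("echo $DEMO_OUT/NAME", {"NAME": "labels.tsv"}))
```

Plain string replacement is fragile if a placeholder also appears inside another token, which is one reason to keep placeholder names distinctive.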

# Outline of procedure:
**Goal**:<br>
We want to create candidate label sets including
- Attribute value labels (type, property, *attribute*)
- Relational entity labels (type, property, *entity*)
- Attribute interval labels (type, property, *range of attribute values*)
- Relational attribute labels (type, property, *attribute or attribute range of another entity*)

To enable subsequent filtering of these labels, we also want to count:
- The number of entities of each type
- The number of entities that match each label (call these "positives")
    
**Steps**:

0. Create type-mapping
1. Count the number of entities of each type
    - *optional future step*: define type with P279 transitive closure in addition to P31. 
2. Create AVLs trivially from attribute files along with counts of the positive entities for each label
    - At this step, we should also contribute to a mapping of entities --> matching attribute labels to facilitate creating RALs in a later step  
3. Create RELs trivially from entity relation files along with counts of positive entities for each label
4. Create AILs by discretizing the AVLs we found, along with counts of positive entities for each label
    - See label_discretization notebook for some exploration of discretization approach that led to the method that is implemented in this notebook
    - At this step, we should also contribute to a mapping of entities --> matching attribute labels to facilitate creating RALs in a later step
5. Create RALs by using the entities --> attribute labels table that we built in steps 2 and 4. Also keep track of counts of positive entities for each label
    
*Misc issues encountered*
- kgtk rename-columns doesn't always work when input file == output file. Getting around this right now by creating temp files... 
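The temp-file workaround can be sketched in plain Python. This is not the `rename_cols_and_overwrite_id` helper from `utility` (its source is not shown here); `rename_and_reindex_sketch` is a hypothetical illustration that assumes a plain tab-separated file with an `id` column, renames the requested columns, renumbers `id` as E1, E2, ..., and only then swaps the result over the input.

```python
import os
import tempfile


def rename_and_reindex_sketch(path, old_cols, new_cols):
    """Hypothetical helper: rename columns and renumber `id`, in place via a temp file."""
    mapping = dict(zip(old_cols.split(), new_cols.split()))
    with open(path) as src:
        table = [line.rstrip("\n").split("\t") for line in src]
    header = [mapping.get(col, col) for col in table[0]]
    id_index = header.index("id")
    rows = table[1:]
    for number, row in enumerate(rows, start=1):
        row[id_index] = "E{}".format(number)
    # Write to a temp file in the same directory, then swap it in, so the
    # input file is never read and written at the same time.
    fd, tmp_path = tempfile.mkstemp(suffix=".tsv", dir=os.path.dirname(os.path.abspath(path)))
    with os.fdopen(fd, "w") as dst:
        for fields in [header] + rows:
            dst.write("\t".join(fields) + "\n")
    os.replace(tmp_path, path)
    return header, rows


# Demo on a throwaway file with one row.
demo_path = os.path.join(tempfile.mkdtemp(), "entity_counts_per_type.tsv")
with open(demo_path, "w") as f:
    f.write("type\ttype_label\tcount\tid\n")
    f.write("Q131734\t'brewery'@en\t87\t_\n")
header, rows = rename_and_reindex_sketch(demo_path, "type type_label count", "node1 label node2")
print(header, rows)
```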

## 0. Create type mapping
Mapping is from entity (Q node) to the entity's type (another Q node). Using P31 only for now, but can add P279* as well later

In [434]:
!kgtk filter -p ' ; P31 ; ' -i $ITEM_FILE -o $OUT/type_mapping.tsv

In [435]:
!head -5 $OUT/type_mapping.tsv | column -t -s $'\t'

id              node1     label  node2
Q1000597-P31-1  Q1000597  P31    Q3957
Q1011-P31-2     Q1011     P31    Q112099
Q1011-P31-1     Q1011     P31    Q3624078
Q1019-P31-2     Q1019     P31    Q112099
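For readers unfamiliar with the filter pattern, the following is a simplified sketch of what a `node1 ; label ; node2` pattern selects, where an empty field matches anything. It handles only one value per field, `filter_edges` is a hypothetical name, and the second edge is a toy row added for contrast.

```python
def filter_edges(edges, pattern):
    """Keep edges matching a 'node1 ; label ; node2' pattern (empty field = wildcard)."""
    wanted = [field.strip() for field in pattern.split(";")]
    keys = ("node1", "label", "node2")
    return [
        edge for edge in edges
        if all(not want or want == edge[key] for want, key in zip(wanted, keys))
    ]


edges = [
    {"id": "Q1011-P31-1", "node1": "Q1011", "label": "P31", "node2": "Q3624078"},
    {"id": "toy-edge", "node1": "Q1011", "label": "P463", "node2": "Q1065"},
]
type_mapping = filter_edges(edges, " ; P31 ; ")
print(type_mapping)
```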


## 1. Count number of entities of each type:
Use the entity --> type mapping we created in step 0 to do this

In [453]:
!kgtk query -i $OUT/type_mapping.tsv -i $LABEL_FILE \
-o $OUT/entity_counts_per_type.tsv --graph-cache $STORE \
--match 'type: (n1)-[]->(type), `'"$LABEL_FILE"'`: (type)-[:label]->(lab)' \
--return 'distinct type as type, lab as type_label, count(distinct n1) as count, "_" as id' \
--where 'lab.kgtk_lqstring_lang_suffix = "en"' \
--order-by 'count(distinct n1) desc'

In [454]:
rename_cols_and_overwrite_id("$OUT/entity_counts_per_type", ".tsv", "type type_label count", "node1 label node2")

In [455]:
!head -10 $OUT/entity_counts_per_type.tsv | column -t -s $'\t'

node1     label                    node2  id
Q131734   'brewery'@en             87     E1
Q3624078  'sovereign state'@en     69     E2
Q4830453  'business'@en            50     E3
Q6256     'country'@en             26     E4
Q6881511  'enterprise'@en          23     E5
Q7270     'republic'@en            18     E6
Q179164   'unitary state'@en       16     E7
Q1998962  'beer style'@en          16     E8
Q123480   'landlocked country'@en  15     E9
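The counting logic of this query can be sketched without kgtk: group the entity --> type edges by type and count distinct entities. The sketch below uses the four type-mapping rows shown in step 0; `count_entities_per_type` is a hypothetical name, and label lookup is omitted.

```python
from collections import defaultdict


def count_entities_per_type(type_edges):
    """Return (type, distinct entity count) pairs, largest count first."""
    entities_by_type = defaultdict(set)
    for entity, entity_type in type_edges:
        entities_by_type[entity_type].add(entity)
    return sorted(
        ((entity_type, len(entities)) for entity_type, entities in entities_by_type.items()),
        key=lambda item: (-item[1], item[0]),
    )


type_edges = [
    ("Q1000597", "Q3957"),
    ("Q1011", "Q112099"),
    ("Q1011", "Q3624078"),
    ("Q1019", "Q112099"),
]
counts = count_entities_per_type(type_edges)
print(counts)
```

Using a set per type is what makes the count "distinct": an entity that appears in several edges with the same type is still counted once.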


## 2. Create AVLs with counts of positive entities
At this step we also want to keep track of entities --> matching attribute labels for future use. This will help when we are creating RALs (step 5)

To accomplish these goals, we will do the following:

For each attribute type (string, time, quantity), we will:
1. Use the entity --> type mapping along with the attribute data file to create an entity_attribute_labels file that maps each entity to the labels applicable to it.
2. Use the entity_attribute_labels file to aggregate labels with counts of matching entities, which we will save in a candidate_labels file.
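The two-part flow described here (join attributes with types, then aggregate) can be sketched in pandas as a rough analogue of the kgtk queries in this step. All identifiers in the toy data (`Q_a`, `T_state`, and so on) are made up for illustration, and the column names are assumptions.

```python
import pandas as pd

# Toy entity --> type mapping and entity --> attribute value edges.
type_mapping = pd.DataFrame({
    "entity": ["Q_a", "Q_a", "Q_b", "Q_c"],
    "type": ["T_state", "T_island", "T_state", "T_state"],
})
attributes = pd.DataFrame({
    "entity": ["Q_a", "Q_a", "Q_b", "Q_c"],
    "prop": ["P571", "P571", "P571", "P571"],
    "value": [1991, 1991, 1991, 1918],  # Q_a is deliberately duplicated
})

# Part 1: entity --> applicable labels (type, prop, value), without duplicates.
entity_labels = attributes.merge(type_mapping, on="entity").drop_duplicates()

# Part 2: aggregate labels with counts of distinct matching entities.
candidate_labels = (
    entity_labels.groupby(["type", "prop", "value"])["entity"]
    .nunique()
    .reset_index(name="positives")
    .sort_values("positives", ascending=False)
)
print(candidate_labels)
```

Note that an entity with several types contributes to one label per type, which is why `Q_a` appears under both toy types.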

### 2.1 Strings
Creating mapping of entity --> string attribute labels

In [28]:
if not string_file:
    print("No string attribute file was provided in the parameters section, skipping this step.")
else:
    # perform query
    command = "$kgtk query -i $OUT/type_mapping.tsv -i STRING_FILE -i LABEL_FILE \
               -o $OUT/entity_attribute_labels_string.tsv --graph-cache $STORE \
               --match '`STRING_FILE`: (n1)-[l {label:p}]->(n2), type: (n1)-[]->(type), `LABEL_FILE`: (p)-[:label]->(lab)' \
               --return 'distinct n1 as entity, type as type, p as prop, n2 as value, lab as property_label, \"_\" as id' \
               --where 'lab.kgtk_lqstring_lang_suffix = \"en\"' \
               --order-by 'n1'"
    run_command(command, {"STRING_FILE" : string_file, "LABEL_FILE" : label_file})
    # reformat columns to be in KGTK format
    rename_cols_and_overwrite_id("$OUT/entity_attribute_labels_string", ".tsv", "type prop value", "node1 label node2")
    # view header of result
    run_command("head -5 $OUT/entity_attribute_labels_string.tsv | column -t -s $'\t'")

entity    node1  label  node2                property_label           id
Q1000597  Q3957  P281   "DE14"               'postal code'@en         E1
Q1000597  Q3957  P373   "Burton upon Trent"  'Commons category'@en    E2
Q1000597  Q3957  P473   "01283"              'local dialing code'@en  E3
Q1000597  Q3957  P613   "SK245225"           'OS grid reference'@en   E4



Aggregating distinct labels w/ positive entity counts

In [33]:
if not string_file:
    print("No string attribute file was provided in the parameters section, skipping this step.")
else:
    # perform query
    command = "$kgtk query -i $OUT/entity_attribute_labels_string.tsv \
               -o $OUT/candidate_labels_avl_string.tsv --graph-cache $STORE \
               --match 'labels: (type)-[l {label:prop, property_label:lab, entity:e}]->(val)' \
               --return 'distinct type as type, prop as prop, val as value, count(distinct e) as positives, lab as property_label, \"_\" as id' \
               --order-by 'count(distinct e) desc'"
    run_command(command)
    # reformat columns to be in KGTK format
    rename_cols_and_overwrite_id("$OUT/candidate_labels_avl_string", ".tsv", "type prop value", "node1 label node2")
    # view header of result
    run_command("head -5 $OUT/candidate_labels_avl_string.tsv | column -t -s $'\t'")

node1     label  node2    positives  property_label     id
Q3624078  P3238  "0"      34         'trunk prefix'@en  E1
Q3624078  P3238  novalue  14         'trunk prefix'@en  E2
Q6256     P3238  "0"      12         'trunk prefix'@en  E3
Q179164   P3238  "0"      9          'trunk prefix'@en  E4



### 2.2 Times

Looking at what precisions we need to deal with...

In [39]:
!kgtk query -i $TIME_FILE -i $LABEL_FILE \
--graph-cache $STORE \
--match '`'"$TIME_FILE"'`: (n1)-[l {label:p}]->(n2), `'"$LABEL_FILE"'`: (p)-[:label]->(lab)' \
--return 'distinct kgtk_date_precision(n2) as precisions, count(n1) as count' \
--limit 10 \
| column -t -s $'\t'

precisions  count
6           12
7           40
8           12
9           697
10          48
11          469


The output above includes several precisions coarser than year (9): millennium (6), century (7), and decade (8). We don't have kgtk type interpretation functions for these granularities, so for now we'll interpret them all as years. We will also interpret the finer precisions, month (10) and day (11), at the year granularity for now.

Additional work can be done later to create labels with finer time granularity if desired.
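To make the "interpret everything as a year" choice concrete, here is a small sketch that pulls the year and precision out of a date literal. It assumes the literal looks like `^1975-07-05T00:00:00Z/11` (a leading `^`, an ISO-like timestamp, and an optional `/precision` suffix); the regular expression, the function name, and the example values are illustrative and not taken from kgtk's own code.

```python
import re

# Wikidata time precision codes seen in the output above.
PRECISION_NAMES = {6: "millennium", 7: "century", 8: "decade", 9: "year", 10: "month", 11: "day"}

DATE_LITERAL = re.compile(
    r"^\^(?P<year>[+-]?\d+)-(?P<month>\d{2})-(?P<day>\d{2})T[^/]*(?:/(?P<precision>\d+))?$"
)


def year_and_precision(literal):
    """Return (year, precision) for a date literal, or None if it does not parse."""
    match = DATE_LITERAL.match(literal)
    if match is None:
        return None
    precision = match.group("precision")
    return int(match.group("year")), (int(precision) if precision else None)


for literal in ["^1975-07-05T00:00:00Z/11", "^-3500-01-01T00:00:00Z/9"]:
    year, precision = year_and_precision(literal)
    print(year, PRECISION_NAMES.get(precision, "unknown"))
```

Collapsing to the year loses information for coarse precisions: a century-precision value becomes a single year, so labels built from such values should be read with that in mind.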

Creating mapping of entity --> year attribute labels

In [40]:
!kgtk query -i $OUT/type_mapping.tsv -i $TIME_FILE -i $LABEL_FILE \
-o $OUT/entity_attribute_labels_time.year.tsv --graph-cache $STORE \
--match '`'"$TIME_FILE"'`: (n1)-[l {label:p}]->(n2), type: (n1)-[]->(type), `'"$LABEL_FILE"'`: (p)-[:label]->(p_lab), `'"$LABEL_FILE"'`: (type)-[:label]->(t_lab)' \
--return 'distinct n1 as entity, type as type, p as prop, kgtk_date_year(n2) as value, t_lab as type_label, p_lab as property_label, "_" as id' \
--where 't_lab.kgtk_lqstring_lang_suffix = "en" AND p_lab.kgtk_lqstring_lang_suffix = "en"' \
--order-by 'n1'

In [41]:
rename_cols_and_overwrite_id("$OUT/entity_attribute_labels_time.year", ".tsv", "type prop value", "node1 label node2")

In [42]:
!head -5 $OUT/entity_attribute_labels_time.year.tsv | column -t -s $'\t'

entity  node1     label  node2  type_label            property_label  id
Q1011   Q112099   P571   1975   'island nation'@en    'inception'@en  E1
Q1011   Q3624078  P571   1975   'sovereign state'@en  'inception'@en  E2
Q1019   Q112099   P571   1960   'island nation'@en    'inception'@en  E3
Q1019   Q3624078  P571   1960   'sovereign state'@en  'inception'@en  E4


Aggregating distinct labels w/ positive entity counts

In [43]:
!kgtk query -i $OUT/entity_attribute_labels_time.year.tsv \
-o $OUT/candidate_labels_avl_time.year.tsv --graph-cache $STORE \
--match 'labels: (n1)-[l {label:p, property_label:lab, entity:e}]->(val)' \
--return 'distinct n1 as type, p as prop, val as value, count(distinct e) as positives, lab as property_label, "_" as id' \
--order-by 'count(distinct e) desc'

In [44]:
rename_cols_and_overwrite_id("$OUT/candidate_labels_avl_time.year", ".tsv", "type prop value", "node1 label node2")

In [45]:
!head -5 $OUT/candidate_labels_avl_time.year.tsv | column -t -s $'\t'

node1     label  node2  positives  property_label  id
Q3624078  P571   1991   8          'inception'@en  E1
Q3624078  P571   1918   7          'inception'@en  E2
Q6256     P571   1918   5          'inception'@en  E3
Q6256     P571   1991   5          'inception'@en  E4


### 2.3 Quantities
Creating mapping of entity --> quantity attribute labels

Note: quantities may have units. We will split the quantity value and its units (SI units and Wikidata units) into separate columns.
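As an illustration of what splitting value and units means, the sketch below parses a quantity literal into a number, an SI unit string, and a Wikidata unit Q-node. The assumed literal layout (a signed number, an optional `[low,high]` tolerance, then either an SI unit string or a Q-node) and the name `split_quantity` are assumptions for this example; the notebook itself relies on the `kgtk_quantity_*` functions in the query.

```python
import re

QUANTITY_LITERAL = re.compile(
    r"^(?P<number>[+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"  # signed number
    r"(?:\[(?P<low>[^,\]]+),(?P<high>[^\]]+)\])?"        # optional tolerance
    r"(?P<units>.*)$"                                    # optional units
)


def split_quantity(literal):
    """Return (number, si_units, wd_units), or None if the literal does not parse."""
    match = QUANTITY_LITERAL.match(literal)
    if match is None:
        return None
    units = match.group("units")
    wd_units = units if re.fullmatch(r"Q\d+", units) else ""
    si_units = units if units and not wd_units else ""
    return float(match.group("number")), si_units, wd_units


for literal in ["+230Q25250", "+75074", "+0.572"]:
    print(split_quantity(literal))
```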

In [46]:
!kgtk query -i $OUT/type_mapping.tsv -i $QUANTITY_FILE -i $LABEL_FILE \
-o $OUT/entity_attribute_labels_quantity.tsv --graph-cache $STORE \
--match '`'"$QUANTITY_FILE"'`: (n1)-[l {label:p}]->(n2), type: (n1)-[]->(type), `'"$LABEL_FILE"'`: (p)-[:label]->(p_lab), `'"$LABEL_FILE"'`: (type)-[:label]->(t_lab)' \
--return 'distinct n1 as entity, type as type, p as prop, kgtk_quantity_number(n2) as value, kgtk_quantity_si_units(n2) as si_units, kgtk_quantity_wd_units(n2) as wd_units, t_lab as type_label, p_lab as property_label, "_" as id' \
--where 't_lab.kgtk_lqstring_lang_suffix = "en" AND p_lab.kgtk_lqstring_lang_suffix = "en"' \
--order-by 'n1'

In [47]:
rename_cols_and_overwrite_id("$OUT/entity_attribute_labels_quantity", ".tsv", "type prop value", "node1 label node2")

In [48]:
display(pd.read_csv("{}/entity_attribute_labels_quantity.tsv".format(os.environ["OUT"]), delimiter = '\t', nrows=5).fillna(""))


Unnamed: 0,entity,node1,label,node2,si_units,wd_units,type_label,property_label,id
0,Q1000597,Q3957,P1082,75074.0,,,'town'@en,'population'@en,E1
1,Q1011,Q112099,P1081,0.57,,,'island nation'@en,'Human Development Index'@en,E2
2,Q1011,Q3624078,P1081,0.57,,,'sovereign state'@en,'Human Development Index'@en,E3
3,Q1011,Q112099,P1081,0.572,,,'island nation'@en,'Human Development Index'@en,E4
4,Q1011,Q3624078,P1081,0.572,,,'sovereign state'@en,'Human Development Index'@en,E5


Aggregating distinct labels w/ positive entity counts

In [49]:
!kgtk query -i $OUT/entity_attribute_labels_quantity.tsv \
-o $OUT/candidate_labels_avl_quantity.tsv --graph-cache $STORE \
--match 'labels: (n1)-[l {label:p, property_label:lab, entity:e, si_units:si, wd_units:wd}]->(val)' \
--return 'distinct n1 as type, p as prop, val as value, count(distinct e) as positives, lab as property_label, "_" as id, si as si_units, wd as wd_units' \
--order-by 'count(distinct e) desc'

In [50]:
rename_cols_and_overwrite_id("$OUT/candidate_labels_avl_quantity", ".tsv", "type prop value", "node1 label node2")

In [51]:
display(pd.read_csv("{}/candidate_labels_avl_quantity.tsv".format(os.environ["OUT"]), delimiter = '\t', nrows=5).fillna(""))


Unnamed: 0,node1,label,node2,positives,property_label,id,si_units,wd_units
0,Q3624078,P3000,18.0,54,'marriageable age'@en,E1,,Q24564698
1,Q3624078,P2997,18.0,52,'age of majority'@en,E2,,Q24564698
2,Q3624078,P2884,230.0,40,'mains voltage'@en,E3,,Q25250
3,Q3624078,P1279,1.7,25,'inflation rate'@en,E4,,Q11229
4,Q3624078,P1279,2.8,25,'inflation rate'@en,E5,,Q11229


### 2.4 Combining entity --> attribute label mappings to single table

In [53]:
command = "$kgtk cat \
           -i $OUT/entity_attribute_labels_time.year.tsv \
           -i $OUT/entity_attribute_labels_quantity.tsv \
           -o $OUT/entity_AVLs_all.tsv"
if string_file:
    command += " -i $OUT/entity_attribute_labels_string.tsv"

run_command(command)

## 3. Create RELs with counts of positive entities
We do this the same way we created AVLs, with two differences: we use the entity-to-entity relation data file, and we do not need to save the intermediate entity --> labels file, since these labels won't contribute to RALs later.

In [54]:
!kgtk query -i $ITEM_FILE -i $OUT/type_mapping.tsv -i $LABEL_FILE \
-o $OUT/candidate_labels_rel_item.tsv --graph-cache $STORE \
--match '`'"$ITEM_FILE"'`: (n1)-[l {label:p}]->(n2), type: (n1)-[]->(type), `'"$LABEL_FILE"'`: (p)-[:label]->(lab)' \
--return 'distinct type as type, p as prop, n2 as value, count(distinct n1) as positives, lab as property_label, "_" as id' \
--where 'lab.kgtk_lqstring_lang_suffix = "en"' \
--order-by 'count(distinct n1) desc'

In [55]:
rename_cols_and_overwrite_id("$OUT/candidate_labels_rel_item", ".tsv", "type prop value", "node1 label node2")

In [56]:
!head -10 $OUT/candidate_labels_rel_item.tsv | column -t -s $'\t'

node1     label  node2    positives  property_label            id
Q131734   P452   Q869095  77         'industry'@en             E1
Q3624078  P463   Q1065    68         'member of'@en            E2
Q3624078  P463   Q7817    67         'member of'@en            E3
Q3624078  P530   Q183     67         'diplomatic relation'@en  E4
Q3624078  P463   Q376150  66         'member of'@en            E5
Q3624078  P530   Q865     66         'diplomatic relation'@en  E6
Q3624078  P463   Q17495   65         'member of'@en            E7
Q3624078  P463   Q191384  65         'member of'@en            E8
Q3624078  P463   Q656801  65         'member of'@en            E9


## 4. Create AILs with counts of positive entities
Similar to what we did for AVLs, we also want to keep track of entities --> matching attribute labels for future use in RAL creation (step 5)

We will create attribute *interval* labels from our attribute *value* labels that we previously created. The code that does this is explored in the explore_label_discretization notebook, and implemented in label_discretization.py.

For each entity --> labels file that has a numeric value type (year or quantity) we will:
1. Create a corresponding entity --> bucketed labels file. For example, a label in the input that looks like <country, population, 1,000,000> might get summarized (bucketed) in the output to look like <country, population, (500,000, 2,000,000)>.
2. Use the resulting bucketed entity_attribute_labels file to once again aggregate labels with counts of matching entities. This will give us a candidate_labels_ail file.

Note: the code prints diagnostic output about labels for which it may not be creating good buckets, for example labels that have only one sample or for which no knee is found.
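The actual bucketing is done by `discretize_labels` in label_discretization.py, which is not reproduced here. Purely to illustrate the idea of turning attribute values into intervals, the sketch below uses a much simpler stand-in: sort the values, cut wherever two neighbours are further apart than a threshold, and place each boundary at the midpoint of the gap. The names `bucket_values`, `assign_bucket`, and `gap_threshold`, as well as the threshold of 100, are hypothetical, and this is not the method the notebook uses.

```python
def bucket_values(values, gap_threshold):
    """Split 1-D values into (lower_bound, upper_bound) buckets at large gaps.

    None marks an open-ended bound on the outermost buckets.
    """
    ordered = sorted(set(values))
    cuts = [
        (left + right) / 2
        for left, right in zip(ordered, ordered[1:])
        if right - left > gap_threshold
    ]
    bounds = [None] + cuts + [None]
    return list(zip(bounds[:-1], bounds[1:]))


def assign_bucket(value, buckets):
    """Return the bucket whose half-open interval [lower, upper) contains value."""
    for lower, upper in buckets:
        if (lower is None or value >= lower) and (upper is None or value < upper):
            return lower, upper
    return None


years = [1158, 1278, 1918, 1922, 1960, 1968, 1975, 1991]
buckets = bucket_values(years, gap_threshold=100)
print(buckets)
print(assign_bucket(1975, buckets))
```

A fixed threshold is the weak point of this simplification: a value that suits years will not suit populations, so a real implementation would need some per-label way of choosing it.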

### 4.1 Years
Create entity --> bucketed labels file

In [57]:
avl_file_in = "{}/entity_attribute_labels_time.year.tsv".format(os.environ["OUT"])
ail_file_out = "{}/entity_attribute_labels_time.year_bucketed.tsv".format(os.environ["OUT"])
discretize_labels(avl_file_in, ail_file_out)

only one sample for label:
     entity  node1 label  node2 type_label  property_label  id lower_bound  \
4  Q1020773  Q3957  P571   1892  'town'@en  'inception'@en  E5               

  upper_bound  
4              

only one sample for label:
     entity    node1 label  node2        type_label  property_label  id  \
5  Q1020773  Q902814  P571   1892  'border town'@en  'inception'@en  E6   

  lower_bound upper_bound  
5                          

only one sample for label:
  entity     node1 label  node2                type_label  property_label  id  \
7  Q1027  Q2221906  P571   1968  'geographic location'@en  'inception'@en  E8   

  lower_bound upper_bound  
7                          

only one sample for label:
  entity     node1 label  node2                   type_label  property_label  \
9  Q1027  Q4198907  P571   1968  'parliamentary republic'@en  'inception'@en   

    id lower_bound upper_bound  
9  E10                          

no knee found for these labels:
    entity    

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')

only one sample for label:
    entity     node1 label  node2                     type_label  \
73  Q15180  Q1323642  P576   1991  'transcontinental country'@en   

                             property_label   id lower_bound upper_bound  
73  'dissolved, abolished or demolished'@en  E74                          

only one sample for label:
    entity     node1 label  node2                type_label  property_label  \
74  Q15180  Q1335818  P571   1922  'supranational union'@en  'inception'@en   

     id lower_bound upper_bound  
74  E75                          

only one sample for label:
    entity     node1 label  node2                type_label  \
75  Q15180  Q1335818  P576   1991  'supranational union'@en   

                             property_label   id lower_bound upper_bound  
75  'dissolved, abolished or demolished'@en  E76                          

no knee found for these labels:
      entity     node1 label  node2               type_label  \
77    Q15180  Q3024240  P576 

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument

only one sample for label:
    entity     node1  label  node2             type_label  \
147  Q1726  Q1066984  P1249   1158  'financial centre'@en   

                           property_label    id lower_bound upper_bound  
147  'time of earliest written record'@en  E148                          

only one sample for label:
    entity     node1 label  node2             type_label  property_label  \
148  Q1726  Q1066984  P571   1158  'financial centre'@en  'inception'@en   

       id lower_bound upper_bound  
148  E149                          

only one sample for label:
    entity     node1  label  node2     type_label  \
149  Q1726  Q1180262  P1249   1158  'residenz'@en   

                           property_label    id lower_bound upper_bound  
149  'time of earliest written record'@en  E150                          

only one sample for label:
    entity     node1 label  node2     type_label  property_label    id  \
150  Q1726  Q1180262  P571   1158  'residenz'@en  'inception'@en

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument

only one sample for label:
    entity    node1 label  node2         type_label  property_label    id  \
262   Q228  Q208500  P571   1278  'principality'@en  'inception'@en  E263   

    lower_bound upper_bound  
262                          

only one sample for label:
    entity      node1 label  node2            type_label  property_label  \
266   Q229  Q11396118  P571   1960  'divided country'@en  'inception'@en   

       id lower_bound upper_bound  
266  E267                          

no knee found for these labels:
    entity  node1 label  node2  type_label  property_label    id lower_bound  \
269   Q229  Q7275  P571   1960  'state'@en  'inception'@en  E270               
390    Q35  Q7275  P571    800  'state'@en  'inception'@en  E391               
436    Q40  Q7275  P571   1918  'state'@en  'inception'@en  E437               

    upper_bound  
269              
390              
436              
using median distance to k nearest neighbor instead (42.0)

only one sample for

only one sample for label:
    entity    node1 label  node2          type_label  \
515   Q663  Q214609  P575   1825  'base material'@en   

                          property_label    id lower_bound upper_bound  
515  'time of discovery or invention'@en  E516                          

only one sample for label:
       entity      node1 label  node2               type_label  \
517  Q6923815  Q91325574  P571   1835  'Trappist monastery'@en   

     property_label    id lower_bound upper_bound  
517  'inception'@en  E518                          

only one sample for label:
    entity   node1 label  node2              type_label  property_label    id  \
541   Q837  Q82794  P571   1768  'geographic region'@en  'inception'@en  E542   

    lower_bound upper_bound  
541                          

only one sample for label:
      entity  node1 label  node2     type_label   property_label    id  \
548  Q869095  Q8148  P580  -3500  'industry'@en  'start time'@en  E549   

    lower_bound upper

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument


In [58]:
display(pd.read_csv("{}/entity_attribute_labels_time.year_bucketed.tsv".format(os.environ["OUT"]), delimiter = '\t', nrows=11).fillna(""))


Unnamed: 0,entity,node1,label,node2,type_label,property_label,id,lower_bound,upper_bound
0,Q1011,Q112099,P571,1975,'island nation'@en,'inception'@en,E1,619.0,
1,Q1011,Q3624078,P571,1975,'sovereign state'@en,'inception'@en,E2,1615.5,
2,Q1019,Q112099,P571,1960,'island nation'@en,'inception'@en,E3,619.0,
3,Q1019,Q3624078,P571,1960,'sovereign state'@en,'inception'@en,E4,1615.5,
4,Q1020773,Q3957,P571,1892,'town'@en,'inception'@en,E5,,
5,Q1020773,Q902814,P571,1892,'border town'@en,'inception'@en,E6,,
6,Q1027,Q112099,P571,1968,'island nation'@en,'inception'@en,E7,619.0,
7,Q1027,Q2221906,P571,1968,'geographic location'@en,'inception'@en,E8,,
8,Q1027,Q3624078,P571,1968,'sovereign state'@en,'inception'@en,E9,1615.5,
9,Q1027,Q4198907,P571,1968,'parliamentary republic'@en,'inception'@en,E10,,


Aggregating distinct interval labels with positive entity counts

In [59]:
!kgtk query -i $OUT/entity_attribute_labels_time.year_bucketed.tsv \
-o $OUT/candidate_labels_ail_time.year.tsv \
--graph-cache $STORE \
--match 'labels: (type)-[l {label:prop, property_label:lab, entity:e, lower_bound:lb, upper_bound:ub}]->(val)' \
--return 'type as type, prop as prop, lb as lower_bound, ub as upper_bound, count(e) as positives, lab as property_label, "_" as id' \
--order-by 'count(e) desc'

In [60]:
rename_cols_and_overwrite_id("$OUT/candidate_labels_ail_time.year", ".tsv", "type prop lower_bound", "node1 label node2")

In [61]:
display(pd.read_csv("{}/candidate_labels_ail_time.year.tsv".format(os.environ["OUT"]), delimiter = '\t', nrows=15).fillna(""))

Unnamed: 0,node1,label,node2,upper_bound,positives,property_label,id
0,Q3624078,P571,1615.5,,71,'inception'@en,E1
1,Q4830453,P571,1733.5,,46,'inception'@en,E2
2,Q131734,P571,1922.5,,20,'inception'@en,E3
3,Q51576574,P571,956.5,,20,'inception'@en,E4
4,Q7270,P571,1635.5,,19,'inception'@en,E5
5,Q131734,P571,1797.0,1922.5,16,'inception'@en,E6
6,Q179164,P571,1859.5,,14,'inception'@en,E7
7,Q123480,P571,1529.5,,13,'inception'@en,E8
8,Q3624078,P571,581.5,1370.5,12,'inception'@en,E9
9,Q112099,P571,619.0,,11,'inception'@en,E10
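A pandas analogue of this aggregation needs some care, because open-ended intervals have an empty bound. The sketch below, on made-up toy rows, groups on both bounds with `dropna=False` so that rows with a missing `upper_bound` are kept, and counts distinct entities so that an entity with two values in the same bucket is counted once. Counting distinct entities rather than rows also keeps a label's positives from exceeding the number of entities of its type.

```python
import pandas as pd

# Toy bucketed rows; Q_a has two values that fall in the same open-ended bucket.
bucketed = pd.DataFrame({
    "entity": ["Q_a", "Q_a", "Q_b", "Q_c"],
    "type": ["T_state"] * 4,
    "prop": ["P571"] * 4,
    "lower_bound": [1615.5, 1615.5, 1615.5, 581.5],
    "upper_bound": [None, None, None, 1370.5],
})

candidate_ails = (
    # dropna=False keeps groups whose key contains a missing bound.
    bucketed.groupby(["type", "prop", "lower_bound", "upper_bound"], dropna=False)["entity"]
    .nunique()
    .reset_index(name="positives")
    .sort_values("positives", ascending=False)
)
print(candidate_ails)
```

Without `dropna=False`, pandas would silently drop every open-ended interval from the result.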


### 4.2 Quantities
Create entity --> bucketed labels file

In [62]:
avl_file_in = "{}/entity_attribute_labels_quantity.tsv".format(os.environ["OUT"])
ail_file_out = "{}/entity_attribute_labels_quantity_bucketed.tsv".format(os.environ["OUT"])
discretize_labels(avl_file_in, ail_file_out)

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


no knee found for these labels:
        entity  node1  label  node2 si_units wd_units type_label  \
0     Q1000597  Q3957  P1082  75074                    'town'@en   
1451  Q1020773  Q3957  P1082  64764                    'town'@en   

       property_label     id lower_bound upper_bound  
0     'population'@en     E1                          
1451  'population'@en  E1452                          
using median distance to k nearest neighbor instead (10310.0)

only one sample for label:
    entity    node1  label node2 si_units wd_units          type_label  \
673  Q1011  Q112099  P6897    87            Q11229  'island nation'@en   

         property_label    id lower_bound upper_bound  
673  'literacy rate'@en  E674                          



  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
        entity  node1  label node2 si_units wd_units type_label  \
1453  Q1020773  Q3957  P2044   525            Q11573  'town'@en   

                      property_label     id lower_bound upper_bound  
1453  'elevation above sea level'@en  E1454                          

only one sample for label:
        entity    node1  label node2 si_units wd_units        type_label  \
1454  Q1020773  Q902814  P2044   525            Q11573  'border town'@en   

                      property_label     id lower_bound upper_bound  
1454  'elevation above sea level'@en  E1455                          

only one sample for label:
     entity     node1  label node2 si_units wd_units  \
2002  Q1027  Q2221906  P1198     8            Q11229   

                    type_label          property_label     id lower_bound  \
2002  'geographic location'@en  'unemployment rate'@en  E2003               

     upper_bound  
2002              

only one sample for label:
     entity    

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
     entity     node1  label node2 si_units wd_units  \
3298  Q1027  Q2221906  P2855    15            Q11229   

                    type_label property_label     id lower_bound upper_bound  
3298  'geographic location'@en  'VAT-rate'@en  E3299                          

only one sample for label:
     entity     node1  label node2 si_units wd_units  \
3300  Q1027  Q4198907  P2855    15            Q11229   

                       type_label property_label     id lower_bound  \
3300  'parliamentary republic'@en  'VAT-rate'@en  E3301               

     upper_bound  
3300              

only one sample for label:
     entity     node1  label node2 si_units wd_units  \
3304  Q1027  Q2221906  P2884   230            Q25250   

                    type_label      property_label     id lower_bound  \
3304  'geographic location'@en  'mains voltage'@en  E3305               

     upper_bound  
3304              

only one sample for label:
     entity     node1  lab

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument


only one sample for label:
     entity    node1  label node2 si_units wd_units             type_label  \
3824  Q1033  Q512187  P1198     8            Q11229  'federal republic'@en   

              property_label     id lower_bound upper_bound  
3824  'unemployment rate'@en  E3825                          

no knee found for these labels:
       entity    node1  label        node2 si_units wd_units  \
3866    Q1033  Q512187  P2046       923768           Q712226   
11669  Q15180  Q512187  P2046  2.24022e+07           Q712226   

                  type_label property_label      id lower_bound upper_bound  
3866   'federal republic'@en      'area'@en   E3867                          
11669  'federal republic'@en      'area'@en  E11670                          
using median distance to k nearest neighbor instead (21478432.0)

only one sample for label:
     entity    node1  label node2 si_units wd_units             type_label  \
4208  Q1033  Q512187  P2219  -1.5            Q11229  'federal

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
     entity      node1  label        node2 si_units wd_units  \
6047  Q1246  Q15634554  P1082  1.88302e+06                     

                               type_label   property_label     id lower_bound  \
6047  'state with limited recognition'@en  'population'@en  E6048               

     upper_bound  
6047              

only one sample for label:
     entity      node1  label  node2 si_units wd_units  \
6099  Q1246  Q15634554  P2046  10909           Q712226   

                               type_label property_label     id lower_bound  \
6099  'state with limited recognition'@en      'area'@en  E6100               

     upper_bound  
6099              

only one sample for label:
     entity      node1  label node2 si_units wd_units  \
6287  Q1246  Q15634554  P2219   3.6            Q11229   

                               type_label  \
6287  'state with limited recognition'@en   

                                    property_label     id lower_bou

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
         entity     node1  label  node2 si_units wd_units  \
6491  Q12875697  Q1349648  P1082  66919                     

                       type_label   property_label     id lower_bound  \
6491  'municipality of Greece'@en  'population'@en  E6492               

     upper_bound  
6491              

only one sample for label:
         entity    node1  label        node2 si_units wd_units    type_label  \
6492  Q12877510  Q131734  P2226  3.89603e+06                    'brewery'@en   

                  property_label     id lower_bound upper_bound  
6492  'market capitalization'@en  E6493                          

only one sample for label:
         entity      node1  label        node2 si_units wd_units  \
6493  Q12877510  Q15075508  P2226  3.89603e+06                     

           type_label              property_label     id lower_bound  \
6493  'beer brand'@en  'market capitalization'@en  E6494               

     upper_bound  
6493           

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
     entity      node1  label   node2 si_units wd_units           type_label  \
6884   Q142  Q20181813  P1120  593865                    'colonial power'@en   

             property_label     id lower_bound upper_bound  
6884  'number of deaths'@en  E6885                          

no knee found for these labels:
      entity     node1  label   node2 si_units wd_units            type_label  \
6885    Q142  Q3624078  P1120  593865                    'sovereign state'@en   
58692    Q38  Q3624078  P1120  633133                    'sovereign state'@en   

              property_label      id lower_bound upper_bound  
6885   'number of deaths'@en   E6886                          
58692  'number of deaths'@en  E58693                          
using median distance to k nearest neighbor instead (39268.0)

no knee found for these labels:
      entity      node1  label   node2 si_units wd_units  \
6886    Q142  Q51576574  P1120  593865                     
58693    

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')

no knee found for these labels:
      entity      node1  label node2 si_units wd_units           type_label  \
7996    Q142  Q20181813  P2927   0.3            Q11229  'colonial power'@en   
49580    Q31  Q20181813  P2927   0.8            Q11229  'colonial power'@en   

                      property_label      id lower_bound upper_bound  
7996   'water as percent of area'@en   E7997                          
49580  'water as percent of area'@en  E49581                          
using median distance to k nearest neighbor instead (0.5000000000000001)



  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument

no knee found for these labels:
      entity  node1  label node2 si_units   wd_units     type_label  \
8051    Q142  Q7270  P3270     6           Q24564698  'republic'@en   
20655   Q183  Q7270  P3270     5           Q24564698  'republic'@en   
20661   Q183  Q7270  P3270     6           Q24564698  'republic'@en   
60080    Q38  Q7270  P3270     6           Q24564698  'republic'@en   
67810    Q41  Q7270  P3270     5           Q24564698  'republic'@en   
81727   Q948  Q7270  P3270     6           Q24564698  'republic'@en   

                                property_label      id lower_bound upper_bound  
8051   'compulsory education (minimum age)'@en   E8052                          
20655  'compulsory education (minimum age)'@en  E20656                          
20661  'compulsory education (minimum age)'@en  E20662                          
60080  'compulsory education (minimum age)'@en  E60081                          
67810  'compulsory education (minimum age)'@en  E67811           

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')

no knee found for these labels:
      entity    node1  label node2 si_units   wd_units  \
10124   Q145  Q202686  P2997    18           Q24564698   
66071   Q408  Q202686  P2997    18           Q24564698   

                    type_label        property_label      id lower_bound  \
10124  'Commonwealth realm'@en  'age of majority'@en  E10125               
66071  'Commonwealth realm'@en  'age of majority'@en  E66072               

      upper_bound  
10124              
66071              
using median distance to k nearest neighbor instead (0.0)

only one sample for label:
      entity    node1  label node2 si_units   wd_units          type_label  \
10127   Q145  Q112099  P2999    16           Q24564698  'island nation'@en   

            property_label      id lower_bound upper_bound  
10127  'age of consent'@en  E10128                          

only one sample for label:
      entity      node1  label node2 si_units   wd_units           type_label  \
10128   Q145  Q20181813  P2999

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


no knee found for these labels:
      entity    node1  label node2 si_units   wd_units          type_label  \
10137   Q145  Q112099  P3271    18           Q24564698  'island nation'@en   
18465    Q17  Q112099  P3271    15           Q24564698  'island nation'@en   
37501   Q229  Q112099  P3271    15           Q24564698  'island nation'@en   

                                property_label      id lower_bound upper_bound  
10137  'compulsory education (maximum age)'@en  E10138                          
18465  'compulsory education (maximum age)'@en  E18466                          
37501  'compulsory education (maximum age)'@en  E37502                          
using median distance to k nearest neighbor instead (0.0)

only one sample for label:
      entity    node1  label node2 si_units   wd_units  \
10139   Q145  Q202686  P3271    18           Q24564698   

                    type_label                           property_label  \
10139  'Commonwealth realm'@en  'compulsory education

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


no knee found for these labels:
      entity     node1  label node2 si_units wd_units  \
10539   Q148  Q1520223  P1198     5            Q11229   
33338   Q225  Q1520223  P1198    28            Q11229   
46572    Q30  Q1520223  P1198   6.7            Q11229   

                         type_label          property_label      id  \
10539  'constitutional republic'@en  'unemployment rate'@en  E10540   
33338  'constitutional republic'@en  'unemployment rate'@en  E33339   
46572  'constitutional republic'@en  'unemployment rate'@en  E46573   

      lower_bound upper_bound  
10539                          
33338                          
46572                          
using median distance to k nearest neighbor instead (1.7000000000000002)

only one sample for label:
      entity    node1  label node2 si_units wd_units            type_label  \
10541   Q148  Q842112  P1198     5            Q11229  'socialist state'@en   

               property_label      id lower_bound upper_bound  
1054

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


no knee found for these labels:
      entity     node1  label node2 si_units wd_units  \
11167   Q148  Q1520223  P2219   6.7            Q11229   
33611   Q225  Q1520223  P2219   2.5            Q11229   
47577    Q30  Q1520223  P2219   1.6            Q11229   

                         type_label  \
11167  'constitutional republic'@en   
33611  'constitutional republic'@en   
47577  'constitutional republic'@en   

                                     property_label      id lower_bound  \
11167  'real gross domestic product growth rate'@en  E11168               
33611  'real gross domestic product growth rate'@en  E33612               
47577  'real gross domestic product growth rate'@en  E47578               

      upper_bound  
11167              
33611              
47577              
using median distance to k nearest neighbor instead (0.9000000000000002)

only one sample for label:
      entity    node1  label node2 si_units wd_units            type_label  \
11169   Q148  Q842112 

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument


no knee found for these labels:
      entity    node1  label node2 si_units wd_units          type_label  \
11358   Q148  Q859563  P2855    13            Q11229  'secular state'@en   
14647   Q159  Q859563  P2855    20            Q11229  'secular state'@en   

      property_label      id lower_bound upper_bound  
11358  'VAT-rate'@en  E11359                          
14647  'VAT-rate'@en  E14648                          
using median distance to k nearest neighbor instead (7.0)

no knee found for these labels:
      entity     node1  label node2 si_units wd_units  \
11359   Q148  Q1520223  P2884   220            Q25250   
33743   Q225  Q1520223  P2884   230            Q25250   
47877    Q30  Q1520223  P2884   120            Q25250   

                         type_label      property_label      id lower_bound  \
11359  'constitutional republic'@en  'mains voltage'@en  E11360               
33743  'constitutional republic'@en  'mains voltage'@en  E33744               
47877  'constitut

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument
  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


no knee found for these labels:
      entity    node1  label node2 si_units   wd_units          type_label  \
11370   Q148  Q859563  P2997    18           Q24564698  'secular state'@en   
12803   Q155  Q859563  P2997    18           Q24564698  'secular state'@en   
14663   Q159  Q859563  P2997    18           Q24564698  'secular state'@en   

             property_label      id lower_bound upper_bound  
11370  'age of majority'@en  E11371                          
12803  'age of majority'@en  E12804                          
14663  'age of majority'@en  E14664                          
using median distance to k nearest neighbor instead (0.0)

only one sample for label:
      entity     node1  label node2 si_units   wd_units  \
11371   Q148  Q1520223  P3270     6           Q24564698   

                         type_label                           property_label  \
11371  'constitutional republic'@en  'compulsory education (minimum age)'@en   

           id lower_bound upper_bound  
1

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


no knee found for these labels:
      entity     node1  label node2 si_units wd_units  \
11571   Q148  Q1520223  P5167    28                     
48122    Q30  Q1520223  P5167   778                     

                         type_label                     property_label  \
11571  'constitutional republic'@en  'vehicles per thousand people'@en   
48122  'constitutional republic'@en  'vehicles per thousand people'@en   

           id lower_bound upper_bound  
11571  E11572                          
48122  E48123                          
using median distance to k nearest neighbor instead (750.0)

only one sample for label:
      entity    node1  label node2 si_units wd_units            type_label  \
11573   Q148  Q842112  P5167    28                    'socialist state'@en   

                          property_label      id lower_bound upper_bound  
11573  'vehicles per thousand people'@en  E11574                          

only one sample for label:
      entity    node1  label n

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


no knee found for these labels:
       entity    node1  label node2 si_units wd_units             type_label  \
11652  Q14835  Q134626  P2044   204            Q11573  'district capital'@en   
18800   Q1726  Q134626  P2044   519            Q11573  'district capital'@en   

                       property_label      id lower_bound upper_bound  
11652  'elevation above sea level'@en  E11653                          
18800  'elevation above sea level'@en  E18801                          
using median distance to k nearest neighbor instead (315.0)

no knee found for these labels:
       entity      node1  label node2 si_units wd_units  \
11653  Q14835  Q42744322  P2044   204            Q11573   
18806   Q1726  Q42744322  P2044   519            Q11573   
25230   Q2079  Q42744322  P2044   113            Q11573   

                               type_label                  property_label  \
11653  'urban municipality of Germany'@en  'elevation above sea level'@en   
18806  'urban municipality 

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument
  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')

no knee found for these labels:
        entity     node1  label        node2 si_units wd_units  \
11660   Q15180  Q3024240  P1082  2.93048e+08                     
11688  Q153015  Q3024240  P1082  4.80666e+06                     

                    type_label   property_label      id lower_bound  \
11660  'historical country'@en  'population'@en  E11661               
11688  'historical country'@en  'population'@en  E11689               

      upper_bound  
11660              
11688              
using median distance to k nearest neighbor instead (288240910.0)

only one sample for label:
       entity     node1  label        node2 si_units wd_units  \
11666  Q15180  Q1335818  P2046  2.24022e+07           Q712226   

                     type_label property_label      id lower_bound upper_bound  
11666  'supranational union'@en      'area'@en  E11667                          

no knee found for these labels:
        entity     node1  label        node2 si_units wd_units  \
11667   Q

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
        entity      node1  label        node2 si_units wd_units  \
11687  Q153015  Q26879769  P1082  4.80666e+06                     

                                         type_label   property_label      id  \
11687  'state in the Confederation of the Rhine'@en  'population'@en  E11688   

      lower_bound upper_bound  
11687                          

no knee found for these labels:
        entity    node1  label        node2 si_units wd_units    type_label  \
11689  Q153015  Q417175  P1082  4.80666e+06                    'kingdom'@en   
76323  Q756617  Q417175  P1082  5.70725e+06                    'kingdom'@en   

        property_label      id lower_bound upper_bound  
11689  'population'@en  E11690                          
76323  'population'@en  E76324                          
using median distance to k nearest neighbor instead (900590.0)

only one sample for label:
        entity      node1  label  node2 si_units wd_units  \
11690  Q153015  Q26

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument


no knee found for these labels:
      entity     node1  label node2 si_units   wd_units        type_label  \
12802   Q155  Q4209223  P2997    18           Q24564698  'Rechtsstaat'@en   
14659   Q159  Q4209223  P2997    18           Q24564698  'Rechtsstaat'@en   
20615   Q183  Q4209223  P2997    18           Q24564698  'Rechtsstaat'@en   
27224   Q212  Q4209223  P2997    18           Q24564698  'Rechtsstaat'@en   
64525    Q40  Q4209223  P2997    18           Q24564698  'Rechtsstaat'@en   

             property_label      id lower_bound upper_bound  
12802  'age of majority'@en  E12803                          
14659  'age of majority'@en  E14660                          
20615  'age of majority'@en  E20616                          
27224  'age of majority'@en  E27225                          
64525  'age of majority'@en  E64526                          
using median distance to k nearest neighbor instead (0.0)

no knee found for these labels:
      entity     node1  label node2 si_uni

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument

no knee found for these labels:
      entity     node1  label node2 si_units wd_units        type_label  \
12961   Q155  Q4209223  P6897    92            Q11229  'Rechtsstaat'@en   
20909   Q183  Q4209223  P6897    99            Q11229  'Rechtsstaat'@en   
64795    Q40  Q4209223  P6897    99            Q11229  'Rechtsstaat'@en   

           property_label      id lower_bound upper_bound  
12961  'literacy rate'@en  E12962                          
20909  'literacy rate'@en  E20910                          
64795  'literacy rate'@en  E64796                          
using median distance to k nearest neighbor instead (0.0)

only one sample for label:
      entity    node1  label node2 si_units wd_units          type_label  \
12962   Q155  Q859563  P6897    92            Q11229  'secular state'@en   

           property_label      id lower_bound upper_bound  
12962  'literacy rate'@en  E12963                          

only one sample for label:
      entity     node1  label node2 si_u

only one sample for label:
       entity      node1  label  node2 si_units wd_units  \
13022  Q15887  Q13220204  P2046  197.5           Q712226   

                                              type_label property_label  \
13022  'second-level administrative country subdivisi...      'area'@en   

           id lower_bound upper_bound  
13022  E13023                          

only one sample for label:
       entity    node1  label  node2 si_units wd_units  \
13023  Q15887  Q328584  P2046  197.5           Q712226   

                          type_label property_label      id lower_bound  \
13023  'municipality of Slovenia'@en      'area'@en  E13024               

      upper_bound  
13023              



no knee found for these labels:
      entity    node1  label        node2 si_units wd_units        type_label  \
13561   Q159  Q185145  P2046  1.70754e+07           Q712226  'great power'@en   
13569   Q159  Q185145  P2046  1.71252e+07           Q712226  'great power'@en   
13577   Q159  Q185145  P2046  1.71252e+07           Q712226  'great power'@en   

      property_label      id lower_bound upper_bound  
13561      'area'@en  E13562                          
13569      'area'@en  E13570                          
13577      'area'@en  E13578                          
using median distance to k nearest neighbor instead (9.0)

only one sample for label:
      entity     node1  label      node2 si_units wd_units  \
14256   Q159  Q1323642  P2135  3.335e+11             Q4917   

                          type_label      property_label      id lower_bound  \
14256  'transcontinental country'@en  'total exports'@en  E14257               

      upper_bound  
14256              

only one s

no knee found for these labels:
      entity     node1  label node2 si_units wd_units  \
14640   Q159  Q1323642  P2855    20            Q11229   
71723    Q43  Q1323642  P2855    18            Q11229   

                          type_label property_label      id lower_bound  \
14640  'transcontinental country'@en  'VAT-rate'@en  E14641               
71723  'transcontinental country'@en  'VAT-rate'@en  E71724               

      upper_bound  
14640              
71723              
using median distance to k nearest neighbor instead (2.0)

only one sample for label:
      entity    node1  label node2 si_units wd_units        type_label  \
14641   Q159  Q185145  P2855    20            Q11229  'great power'@en   

      property_label      id lower_bound upper_bound  
14641  'VAT-rate'@en  E14642                          

no knee found for these labels:
      entity     node1  label node2 si_units wd_units  \
14648   Q159  Q1323642  P2884   220            Q25250   
71727    Q43  Q132

no knee found for these labels:
      entity      node1  label node2 si_units   wd_units  \
14662   Q159  Q63791824  P2997    18           Q24564698   
20618   Q183  Q63791824  P2997    18           Q24564698   
24893   Q191  Q63791824  P2997    18           Q24564698   
25807   Q211  Q63791824  P2997    18           Q24564698   
52436    Q33  Q63791824  P2997    18           Q24564698   
54184    Q34  Q63791824  P2997    18           Q24564698   
55915    Q35  Q63791824  P2997    18           Q24564698   
57321    Q36  Q63791824  P2997    18           Q24564698   
58074    Q37  Q63791824  P2997    18           Q24564698   

                                    type_label        property_label      id  \
14662  'countries bordering the Baltic Sea'@en  'age of majority'@en  E14663   
20618  'countries bordering the Baltic Sea'@en  'age of majority'@en  E20619   
24893  'countries bordering the Baltic Sea'@en  'age of majority'@en  E24894   
25807  'countries bordering the Baltic Sea'@en 

no knee found for these labels:
      entity     node1  label node2 si_units   wd_units  \
14728   Q159  Q1323642  P3270     6           Q24564698   
71747    Q43  Q1323642  P3270     6           Q24564698   

                          type_label                           property_label  \
14728  'transcontinental country'@en  'compulsory education (minimum age)'@en   
71747  'transcontinental country'@en  'compulsory education (minimum age)'@en   

           id lower_bound upper_bound  
14728  E14729                          
71747  E71748                          
using median distance to k nearest neighbor instead (0.0)

only one sample for label:
      entity    node1  label node2 si_units   wd_units        type_label  \
14729   Q159  Q185145  P3270     6           Q24564698  'great power'@en   

                                property_label      id lower_bound upper_bound  
14729  'compulsory education (minimum age)'@en  E14730                          

no knee found for these 

only one sample for label:
      entity     node1  label node2 si_units wd_units  \
15072   Q159  Q1323642  P6591  45.4            Q25267   

                          type_label                   property_label      id  \
15072  'transcontinental country'@en  'maximum temperature record'@en  E15073   

      lower_bound upper_bound  
15072                          

only one sample for label:
      entity    node1  label node2 si_units wd_units        type_label  \
15073   Q159  Q185145  P6591  45.4            Q25267  'great power'@en   

                        property_label      id lower_bound upper_bound  
15073  'maximum temperature record'@en  E15074                          

no knee found for these labels:
      entity     node1  label node2 si_units wd_units        type_label  \
15075   Q159  Q4209223  P6591  45.4            Q25267  'Rechtsstaat'@en   
20891   Q183  Q4209223  P6591  40.3            Q25267  'Rechtsstaat'@en   

                        property_label      id lo

no knee found for these labels:
      entity    node1  label node2 si_units wd_units  \
16573    Q16  Q223832  P1125  32.1                     
64972   Q408  Q223832  P1125  30.5                     

                                type_label         property_label      id  \
16573  'dominion of the British Empire'@en  'Gini coefficient'@en  E16574   
64972  'dominion of the British Empire'@en  'Gini coefficient'@en  E64973   

      lower_bound upper_bound  
16573                          
64972                          
using median distance to k nearest neighbor instead (1.599999999999983)

no knee found for these labels:
      entity    node1  label node2 si_units wd_units  \
16577    Q16  Q223832  P1198     7            Q11229   
64976   Q408  Q223832  P1198     6            Q11229   

                                type_label          property_label      id  \
16577  'dominion of the British Empire'@en  'unemployment rate'@en  E16578   
64976  'dominion of the British Empire'@e

no knee found for these labels:
      entity    node1  label node2 si_units   wd_units  \
17593    Q16  Q223832  P3270     6           Q24564698   
66116   Q408  Q223832  P3270     5           Q24564698   

                                type_label  \
17593  'dominion of the British Empire'@en   
66116  'dominion of the British Empire'@en   

                                property_label      id lower_bound upper_bound  
17593  'compulsory education (minimum age)'@en  E17594                          
66116  'compulsory education (minimum age)'@en  E66117                          
using median distance to k nearest neighbor instead (1.0)

no knee found for these labels:
      entity    node1  label  node2 si_units wd_units  \
17601    Q16  Q223832  P3529  41280             Q4917   
66124   Q408  Q223832  P3529  46555             Q4917   

                                type_label      property_label      id  \
17601  'dominion of the British Empire'@en  'median income'@en  E17602   


only one sample for label:
      entity    node1  label node2 si_units wd_units               type_label  \
17784    Q16  Q202686  P7422   -63            Q25267  'Commonwealth realm'@en   

                        property_label      id lower_bound upper_bound  
17784  'minimum temperature record'@en  E17785                          

only one sample for label:
      entity    node1  label node2 si_units wd_units  \
17785    Q16  Q223832  P7422   -63            Q25267   

                                type_label                   property_label  \
17785  'dominion of the British Empire'@en  'minimum temperature record'@en   

           id lower_bound upper_bound  
17785  E17786                          

only one sample for label:
        entity    node1  label  node2 si_units wd_units    type_label  \
17788  Q161140  Q206361  P1128  27355                    'concern'@en   

       property_label      id lower_bound upper_bound  
17788  'employees'@en  E17789                        

no knee found for these labels:
      entity    node1  label   node2 si_units wd_units          type_label  \
18549    Q17  Q112099  P5982  621000                    'island nation'@en   
18551    Q17  Q112099  P5982  635156                    'island nation'@en   

                       property_label      id lower_bound upper_bound  
18549  'annual number of weddings'@en  E18550                          
18551  'annual number of weddings'@en  E18552                          
using median distance to k nearest neighbor instead (14156.0)

no knee found for these labels:
      entity    node1  label node2 si_units wd_units          type_label  \
18553    Q17  Q112099  P6591    41            Q25267  'island nation'@en   
74702   Q695  Q112099  P6591    35            Q25267  'island nation'@en   

                        property_label      id lower_bound upper_bound  
18553  'maximum temperature record'@en  E18554                          
74702  'maximum temperature record'@en  E74703 

only one sample for label:
      entity   node1  label   node2 si_units wd_units  \
18793  Q1726  Q22865  P1540  717308                     

                             type_label        property_label      id  \
18793  'independent city of Germany'@en  'male population'@en  E18794   

      lower_bound upper_bound  
18793                          

only one sample for label:
      entity      node1  label   node2 si_units wd_units  \
18794  Q1726  Q42744322  P1540  717308                     

                               type_label        property_label      id  \
18794  'urban municipality of Germany'@en  'male population'@en  E18795   

      lower_bound upper_bound  
18794                          

only one sample for label:
      entity node1  label   node2 si_units wd_units type_label  \
18795  Q1726  Q515  P1540  717308                    'city'@en   

             property_label      id lower_bound upper_bound  
18795  'male population'@en  E18796                         

using median distance to k nearest neighbor instead (406.0)

only one sample for label:
      entity      node1  label node2 si_units wd_units  \
18808  Q1726  Q85631896  P2044   519            Q11573   

                           type_label                  property_label      id  \
18808  'urban district of Bavaria'@en  'elevation above sea level'@en  E18809   

      lower_bound upper_bound  
18808                          

no knee found for these labels:
      entity     node1  label   node2 si_units wd_units  \
18809  Q1726  Q1066984  P2046  310.71           Q712226   
18821  Q1726  Q1066984  P2046  310.74           Q712226   

                  type_label property_label      id lower_bound upper_bound  
18809  'financial centre'@en      'area'@en  E18810                          
18821  'financial centre'@en      'area'@en  E18822                          
using median distance to k nearest neighbor instead (0.029999999984866008)

no knee found for these labels:
      entity  

only one sample for label:
        entity node1  label node2 si_units wd_units    type_label  \
18858  Q172668  Q532  P2044   478            Q11573  'village'@en   

                       property_label      id lower_bound upper_bound  
18858  'elevation above sea level'@en  E18859                          

only one sample for label:
        entity    node1  label   node2 si_units wd_units  \
18859  Q182809  Q209824  P1082  246238                     

                    type_label   property_label      id lower_bound  \
18859  'oblast of Bulgaria'@en  'population'@en  E18860               

      upper_bound  
18859              

only one sample for label:
        entity    node1  label node2 si_units wd_units  \
18860  Q182809  Q209824  P2044   232            Q11573   

                    type_label                  property_label      id  \
18860  'oblast of Bulgaria'@en  'elevation above sea level'@en  E18861   

      lower_bound upper_bound  
18860                          


no knee found for these labels:
      entity     node1  label node2 si_units wd_units        type_label  \
20609   Q183  Q4209223  P2927   2.3            Q11229  'Rechtsstaat'@en   
64519    Q40  Q4209223  P2927   1.7            Q11229  'Rechtsstaat'@en   

                      property_label      id lower_bound upper_bound  
20609  'water as percent of area'@en  E20610                          
64519  'water as percent of area'@en  E64520                          
using median distance to k nearest neighbor instead (0.5999999999999995)

no knee found for these labels:
      entity   node1  label node2 si_units wd_units          type_label  \
20610   Q183  Q43702  P2927   2.3            Q11229  'federal state'@en   
49582    Q31  Q43702  P2927   0.8            Q11229  'federal state'@en   

                      property_label      id lower_bound upper_bound  
20610  'water as percent of area'@en  E20611                          
49582  'water as percent of area'@en  E49583           

no knee found for these labels:
      entity   node1  label node2 si_units wd_units          type_label  \
20634   Q183  Q43702  P3086   100           Q180154  'federal state'@en   
20640   Q183  Q43702  P3086    50           Q180154  'federal state'@en   
66110   Q408  Q43702  P3086   100           Q180154  'federal state'@en   
66114   Q408  Q43702  P3086    50           Q180154  'federal state'@en   

         property_label      id lower_bound upper_bound  
20634  'speed limit'@en  E20635                          
20640  'speed limit'@en  E20641                          
66110  'speed limit'@en  E66111                          
66114  'speed limit'@en  E66115                          
using median distance to k nearest neighbor instead (0.0)

no knee found for these labels:
      entity   node1  label node2 si_units   wd_units          type_label  \
20652   Q183  Q43702  P3270     5           Q24564698  'federal state'@en   
20658   Q183  Q43702  P3270     6           Q24564698  'f

no knee found for these labels:
      entity  node1  label node2 si_units wd_units     type_label  \
20895   Q183  Q7270  P6591  40.3            Q25267  'republic'@en   
74704   Q695  Q7270  P6591    35            Q25267  'republic'@en   

                        property_label      id lower_bound upper_bound  
20895  'maximum temperature record'@en  E20896                          
74704  'maximum temperature record'@en  E74705                          
using median distance to k nearest neighbor instead (5.2999999999999705)

no knee found for these labels:
      entity     node1  label node2 si_units wd_units            type_label  \
20896   Q183  Q3624078  P6794  9.19             Q4916  'sovereign state'@en   
20902   Q183  Q3624078  P6794  9.35             Q4916  'sovereign state'@en   

          property_label      id lower_bound upper_bound  
20896  'minimum wage'@en  E20897                          
20902  'minimum wage'@en  E20903                          
using median distanc

only one sample for label:
      entity    node1  label  node2 si_units wd_units  \
21106   Q184  Q123480  P1125  0.297                     

                    type_label         property_label      id lower_bound  \
21106  'landlocked country'@en  'Gini coefficient'@en  E21107               

      upper_bound  
21106              

only one sample for label:
      entity    node1  label  node2 si_units wd_units          type_label  \
21107   Q184  Q179164  P1125  0.297                    'unitary state'@en   

              property_label      id lower_bound upper_bound  
21107  'Gini coefficient'@en  E21108                          

only one sample for label:
      entity    node1  label  node2 si_units wd_units         type_label  \
21109   Q184  Q619610  P1125  0.297                    'social state'@en   

              property_label      id lower_bound upper_bound  
21109  'Gini coefficient'@en  E21110                          

only one sample for label:
      entity    nod

only one sample for label:
      entity      node1  label  node2 si_units wd_units  \
25238  Q2079  Q61708099  P2046  297.8           Q712226   

                          type_label property_label      id lower_bound  \
25238  'urban district in Saxony'@en      'area'@en  E25239               

      upper_bound  
25238              

only one sample for label:
      entity    node1  label node2 si_units wd_units          type_label  \
27258   Q212  Q179164  P3529  9218            Q81893  'unitary state'@en   

           property_label      id lower_bound upper_bound  
27258  'median income'@en  E27259                          

only one sample for label:
      entity     node1  label node2 si_units wd_units            type_label  \
27259   Q212  Q3624078  P3529  9218            Q81893  'sovereign state'@en   

           property_label      id lower_bound upper_bound  
27259  'median income'@en  E27260                          

only one sample for label:
      entity     node1  lab

no knee found for these labels:
      entity    node1  label node2 si_units   wd_units  \
28590   Q213  Q123480  P3270     6           Q24564698   
44079    Q28  Q123480  P3270     3           Q24564698   

                    type_label                           property_label  \
28590  'landlocked country'@en  'compulsory education (minimum age)'@en   
44079  'landlocked country'@en  'compulsory education (minimum age)'@en   

           id lower_bound upper_bound  
28590  E28591                          
44079  E44080                          
using median distance to k nearest neighbor instead (3.0)

no knee found for these labels:
      entity    node1  label node2 si_units   wd_units  \
28593   Q213  Q123480  P3271    15           Q24564698   
44081    Q28  Q123480  P3271    16           Q24564698   

                    type_label                           property_label  \
28593  'landlocked country'@en  'compulsory education (maximum age)'@en   
44081  'landlocked country'@en 

no knee found for these labels:
      entity    node1  label node2 si_units wd_units          type_label  \
30507   Q219  Q179164  P6591  45.2            Q25267  'unitary state'@en   
52625    Q33  Q179164  P6591  37.2            Q25267  'unitary state'@en   

                        property_label      id lower_bound upper_bound  
30507  'maximum temperature record'@en  E30508                          
52625  'maximum temperature record'@en  E52626                          
using median distance to k nearest neighbor instead (7.999999999999986)

only one sample for label:
      entity    node1  label node2 si_units wd_units          type_label  \
32235   Q222  Q179164  P2927   5.7            Q11229  'unitary state'@en   

                      property_label      id lower_bound upper_bound  
32235  'water as percent of area'@en  E32236                          

no knee found for these labels:
      entity      node1  label node2 si_units wd_units  \
32445   Q222  Q51576574  P6897    

no knee found for these labels:
      entity     node1  label node2 si_units wd_units  \
33887   Q225  Q1520223  P6897    97            Q11229   
48132    Q30  Q1520223  P6897  99.4            Q11229   

                         type_label      property_label      id lower_bound  \
33887  'constitutional republic'@en  'literacy rate'@en  E33888               
48132  'constitutional republic'@en  'literacy rate'@en  E48133               

      upper_bound  
33887              
48132              
using median distance to k nearest neighbor instead (2.3999999999996664)

only one sample for label:
      entity   node1  label node2 si_units wd_units  \
34260   Q227  Q56061  P1198   5.2            Q11229   

                                   type_label          property_label      id  \
34260  'administrative territorial entity'@en  'unemployment rate'@en  E34261   

      lower_bound upper_bound  
34260                          

only one sample for label:
      entity   node1  label  no

no knee found for these labels:
      entity   node1  label   node2 si_units wd_units  \
34872   Q227  Q56061  P2573  108600                     
34876   Q227  Q56061  P2573  117058                     
34880   Q227  Q56061  P2573   91820                     

                                   type_label  \
34872  'administrative territorial entity'@en   
34876  'administrative territorial entity'@en   
34880  'administrative territorial entity'@en   

                              property_label      id lower_bound upper_bound  
34872  'number of out-of-school children'@en  E34873                          
34876  'number of out-of-school children'@en  E34877                          
34880  'number of out-of-school children'@en  E34881                          
using median distance to k nearest neighbor instead (8458.0)
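
The fallback reported in the log above ("using median distance to k nearest neighbor instead") can be sketched roughly as follows. This is a minimal illustration, not the code in `label_discretization`: the function name `median_knn_distance` and the default `k=1` are assumptions, chosen only because they reproduce the logged values for the integer-valued groups shown here (for example 8458.0 for the values 108600, 117058, 91820).

```python
import numpy as np


def median_knn_distance(values, k=1):
    """Median distance from each value to its k-th nearest neighbor (1-D).

    Hypothetical helper for illustration; `k=1` is an assumption.
    """
    x = np.asarray(values, dtype=float)
    if x.size < 2:
        raise ValueError("need at least two samples to measure a distance")
    # k cannot exceed the number of other samples in the group.
    k = min(k, x.size - 1)
    # Pairwise absolute differences; after sorting each row,
    # column 0 is the zero distance from a point to itself.
    pairwise = np.abs(x[:, None] - x[None, :])
    pairwise.sort(axis=1)
    return float(np.median(pairwise[:, k]))


# Values taken from the 'number of out-of-school children' group above.
print(median_knn_distance([108600, 117058, 91820]))
```

A value computed this way could serve as a neighborhood radius when no knee is found in the sorted distance curve; whether the notebook uses it exactly like that is not shown in this output.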

only one sample for label:
      entity   node1  label node2 si_units wd_units  \
34884   Q227  Q56061  P2855    18            Q11229   

                          

only one sample for label:
      entity    node1  label node2 si_units wd_units         type_label  \
35863   Q228  Q208500  P6897   100            Q11229  'principality'@en   

           property_label      id lower_bound upper_bound  
35863  'literacy rate'@en  E35864                          

only one sample for label:
      entity      node1  label node2 si_units wd_units            type_label  \
36292   Q229  Q11396118  P1198    16            Q11229  'divided country'@en   

               property_label      id lower_bound upper_bound  
36292  'unemployment rate'@en  E36293                          

no knee found for these labels:
      entity  node1  label node2 si_units wd_units  type_label  \
36295   Q229  Q7275  P1198    16            Q11229  'state'@en   
54582    Q35  Q7275  P1198     7            Q11229  'state'@en   
63004    Q40  Q7275  P1198     5            Q11229  'state'@en   

               property_label      id lower_bound upper_bound  
36295  'unemployment ra

only one sample for label:
      entity      node1  label node2 si_units wd_units            type_label  \
37152   Q229  Q11396118  P2219   2.8            Q11229  'divided country'@en   

                                     property_label      id lower_bound  \
37152  'real gross domestic product growth rate'@en  E37153               

      upper_bound  
37152              

no knee found for these labels:
      entity  node1  label node2 si_units wd_units  type_label  \
37155   Q229  Q7275  P2219   2.8            Q11229  'state'@en   
55592    Q35  Q7275  P2219   1.1            Q11229  'state'@en   
64228    Q40  Q7275  P2219   1.5            Q11229  'state'@en   

                                     property_label      id lower_bound  \
37155  'real gross domestic product growth rate'@en  E37156               
55592  'real gross domestic product growth rate'@en  E55593               
64228  'real gross domestic product growth rate'@en  E64229               

      upper_bound  
37

only one sample for label:
      entity      node1  label node2 si_units   wd_units  \
37482   Q229  Q11396118  P2997    18           Q24564698   

                 type_label        property_label      id lower_bound  \
37482  'divided country'@en  'age of majority'@en  E37483               

      upper_bound  
37482              

no knee found for these labels:
      entity  node1  label node2 si_units   wd_units  type_label  \
37485   Q229  Q7275  P2997    18           Q24564698  'state'@en   
55917    Q35  Q7275  P2997    18           Q24564698  'state'@en   
64528    Q40  Q7275  P2997    18           Q24564698  'state'@en   

             property_label      id lower_bound upper_bound  
37485  'age of majority'@en  E37486                          
55917  'age of majority'@en  E55918                          
64528  'age of majority'@en  E64529                          
using median distance to k nearest neighbor instead (0.0)

only one sample for label:
      entity      node1  

only one sample for label:
       entity     node1  label node2 si_units wd_units            type_label  \
40837  Q23482  Q2264924  P2044    12            Q11573  'port settlement'@en   

                       property_label      id lower_bound upper_bound  
40837  'elevation above sea level'@en  E40838                          

only one sample for label:
       entity    node1  label node2 si_units wd_units              type_label  \
40838  Q23482  Q484170  P2044    12            Q11573  'commune of France'@en   

                       property_label      id lower_bound upper_bound  
40838  'elevation above sea level'@en  E40839                          

only one sample for label:
       entity     node1  label   node2 si_units wd_units  \
40840  Q23482  Q2264924  P2046  240.62           Q712226   

                 type_label property_label      id lower_bound upper_bound  
40840  'port settlement'@en      'area'@en  E40841                          

only one sample for label:
  

only one sample for label:
        entity      node1  label node2 si_units wd_units  \
42230  Q241898  Q15273785  P5982   124                     

                                           type_label  \
42230  'Belgian municipality with city privileges'@en   

                       property_label      id lower_bound upper_bound  
42230  'annual number of weddings'@en  E42231                          

only one sample for label:
        entity    node1  label node2 si_units wd_units  \
42231  Q241898  Q493522  P5982   124                     

                         type_label                  property_label      id  \
42231  'municipality of Belgium'@en  'annual number of weddings'@en  E42232   

      lower_bound upper_bound  
42231                          

only one sample for label:
      entity  node1  label node2 si_units wd_units    type_label  \
42697   Q258  Q6256  P8328  7.24                    'country'@en   

             property_label      id lower_bound upper_bound 

only one sample for label:
      entity   node1  label  node2 si_units   wd_units  \
44298   Q283  Q11173  P2056  9.069           Q20966455   

                   type_label      property_label      id lower_bound  \
44298  'chemical compound'@en  'heat capacity'@en  E44299               

      upper_bound  
44298              

only one sample for label:
      entity      node1  label  node2 si_units   wd_units  \
44299   Q283  Q11723014  P2056  9.069           Q20966455   

                         type_label      property_label      id lower_bound  \
44299  'dihydrogen chalcogenide'@en  'heat capacity'@en  E44300               

      upper_bound  
44299              

only one sample for label:
      entity   node1  label  node2 si_units   wd_units  type_label  \
44300   Q283  Q50690  P2056  9.069           Q20966455  'oxide'@en   

           property_label      id lower_bound upper_bound  
44300  'heat capacity'@en  E44301                          

only one sample for label:
  

only one sample for label:
      entity   node1  label node2 si_units wd_units              type_label  \
44316   Q283  Q11173  P2101     0            Q25267  'chemical compound'@en   

           property_label      id lower_bound upper_bound  
44316  'melting point'@en  E44317                          

only one sample for label:
      entity      node1  label node2 si_units wd_units  \
44317   Q283  Q11723014  P2101     0            Q25267   

                         type_label      property_label      id lower_bound  \
44317  'dihydrogen chalcogenide'@en  'melting point'@en  E44318               

      upper_bound  
44317              

only one sample for label:
      entity   node1  label node2 si_units wd_units  type_label  \
44318   Q283  Q50690  P2101     0            Q25267  'oxide'@en   

           property_label      id lower_bound upper_bound  
44318  'melting point'@en  E44319                          

only one sample for label:
      entity   node1  label node2 si_un

no knee found for these labels:
      entity   node1  label  node2 si_units   wd_units  \
44346   Q283  Q11173  P3071  188.8           Q20966455   
44349   Q283  Q11173  P3071   69.9           Q20966455   

                   type_label               property_label      id  \
44346  'chemical compound'@en  'standard molar entropy'@en  E44347   
44349  'chemical compound'@en  'standard molar entropy'@en  E44350   

      lower_bound upper_bound  
44346                          
44349                          
using median distance to k nearest neighbor instead (118.89999999999999)

no knee found for these labels:
      entity      node1  label  node2 si_units   wd_units  \
44347   Q283  Q11723014  P3071  188.8           Q20966455   
44350   Q283  Q11723014  P3071   69.9           Q20966455   

                         type_label               property_label      id  \
44347  'dihydrogen chalcogenide'@en  'standard molar entropy'@en  E44348   
44350  'dihydrogen chalcogenide'@en  'standa

no knee found for these labels:
      entity     node1  label node2 si_units   wd_units            type_label  \
45547    Q29  Q3624078  P2998    18           Q24564698  'sovereign state'@en   
66077   Q408  Q3624078  P2998    18           Q24564698  'sovereign state'@en   
76974   Q801  Q3624078  P2998    21           Q24564698  'sovereign state'@en   

              property_label      id lower_bound upper_bound  
45547  'age of candidacy'@en  E45548                          
66077  'age of candidacy'@en  E66078                          
76974  'age of candidacy'@en  E76975                          
using median distance to k nearest neighbor instead (0.0)

no knee found for these labels:
      entity      node1  label node2 si_units   wd_units  \
45548    Q29  Q51576574  P2998    18           Q24564698   
76975   Q801  Q51576574  P2998    21           Q24564698   

                       type_label         property_label      id lower_bound  \
45548  'Mediterranean country'@en  'age

only one sample for label:
      entity     node1  label node2 si_units wd_units  type_label  \
45699    Q29  Q1250464  P7422   -30            Q25267  'realm'@en   

                        property_label      id lower_bound upper_bound  
45699  'minimum temperature record'@en  E45700                          

no knee found for these labels:
      entity      node1  label node2 si_units wd_units  \
45701    Q29  Q51576574  P7422   -30            Q25267   
60298    Q38  Q51576574  P7422 -49.6            Q25267   

                       type_label                   property_label      id  \
45701  'Mediterranean country'@en  'minimum temperature record'@en  E45702   
60298  'Mediterranean country'@en  'minimum temperature record'@en  E60299   

      lower_bound upper_bound  
45701                          
60298                          
using median distance to k nearest neighbor instead (19.60000000000001)

no knee found for these labels:
      entity     node1  label node2 si_units

only one sample for label:
      entity     node1  label node2 si_units wd_units  \
48125    Q30  Q5255892  P5167   778                     

                     type_label                     property_label      id  \
48125  'democratic republic'@en  'vehicles per thousand people'@en  E48126   

      lower_bound upper_bound  
48125                          

only one sample for label:
      entity     node1  label node2 si_units wd_units       type_label  \
48126    Q30  Q1489259  P6591  56.7            Q25267  'superpower'@en   

                        property_label      id lower_bound upper_bound  
48126  'maximum temperature record'@en  E48127                          

only one sample for label:
      entity     node1  label node2 si_units wd_units  \
48127    Q30  Q1520223  P6591  56.7            Q25267   

                         type_label                   property_label      id  \
48127  'constitutional republic'@en  'maximum temperature record'@en  E48128   

      lowe

only one sample for label:
      entity    node1  label  node2 si_units wd_units        type_label  \
50801    Q32  Q165116  P3529  52493             Q4917  'grand duchy'@en   

           property_label      id lower_bound upper_bound  
50801  'median income'@en  E50802                          

no knee found for these labels:
      entity    node1  label node2 si_units wd_units               type_label  \
50960    Q32  Q123480  P5167   661                    'landlocked country'@en   
50964    Q32  Q123480  P5167   662                    'landlocked country'@en   
61825    Q39  Q123480  P5167   539                    'landlocked country'@en   

                          property_label      id lower_bound upper_bound  
50960  'vehicles per thousand people'@en  E50961                          
50964  'vehicles per thousand people'@en  E50965                          
61825  'vehicles per thousand people'@en  E61826                          
using median distance to k nearest neighbor 

no knee found for these labels:
      entity      node1  label node2 si_units wd_units  \
52636    Q33  Q63791824  P7422 -51.5            Q25267   
54372    Q34  Q63791824  P7422   -53            Q25267   
56170    Q35  Q63791824  P7422 -31.2            Q25267   

                                    type_label  \
52636  'countries bordering the Baltic Sea'@en   
54372  'countries bordering the Baltic Sea'@en   
56170  'countries bordering the Baltic Sea'@en   

                        property_label      id lower_bound upper_bound  
52636  'minimum temperature record'@en  E52637                          
54372  'minimum temperature record'@en  E54373                          
56170  'minimum temperature record'@en  E56171                          
using median distance to k nearest neighbor instead (1.5)

only one sample for label:
         entity     node1  label    node2 si_units wd_units     type_label  \
52639  Q3319685  Q4830453  P2403  1.2e+10             Q4917  'business'@en   


only one sample for label:
       entity     node1  label node2 si_units   wd_units  \
54377  Q34366  Q5852411  P2999    17           Q24564698   

                    type_label       property_label      id lower_bound  \
54377  'state of Australia'@en  'age of consent'@en  E54378               

      upper_bound  
54377              

only one sample for label:
      entity      node1  label node2 si_units wd_units  \
54581    Q35  Q66724388  P1198     7            Q11229   

                                              type_label  \
54581  'autonomous country within the Kingdom of Denm...   

               property_label      id lower_bound upper_bound  
54581  'unemployment rate'@en  E54582                          

only one sample for label:
      entity      node1  label    node2 si_units wd_units  \
54716    Q35  Q66724388  P2046  42925.5           Q712226   

                                              type_label property_label  \
54716  'autonomous country within the Kin

no knee found for these labels:
      entity      node1  label node2 si_units wd_units  \
55901    Q35  Q66724388  P2855     0            Q11229   
55906    Q35  Q66724388  P2855    25            Q11229   

                                              type_label property_label  \
55901  'autonomous country within the Kingdom of Denm...  'VAT-rate'@en   
55906  'autonomous country within the Kingdom of Denm...  'VAT-rate'@en   

           id lower_bound upper_bound  
55901  E55902                          
55906  E55907                          
using median distance to k nearest neighbor instead (25.0)

only one sample for label:
      entity      node1  label node2 si_units wd_units  \
55911    Q35  Q66724388  P2884   230            Q25250   

                                              type_label      property_label  \
55911  'autonomous country within the Kingdom of Denm...  'mains voltage'@en   

           id lower_bound upper_bound  
55911  E55912                          

o

only one sample for label:
      entity      node1  label node2 si_units wd_units           type_label  \
56168    Q35  Q20181813  P7422 -31.2            Q25267  'colonial power'@en   

                        property_label      id lower_bound upper_bound  
56168  'minimum temperature record'@en  E56169                          

only one sample for label:
      entity      node1  label node2 si_units wd_units  \
56171    Q35  Q66724388  P7422 -31.2            Q25267   

                                              type_label  \
56171  'autonomous country within the Kingdom of Denm...   

                        property_label      id lower_bound upper_bound  
56171  'minimum temperature record'@en  E56172                          

only one sample for label:
      entity  node1  label node2 si_units wd_units  type_label  \
56172    Q35  Q7275  P7422 -31.2            Q25267  'state'@en   

                        property_label      id lower_bound upper_bound  
56172  'minimum temper

only one sample for label:
      entity    node1  label node2 si_units wd_units          type_label  \
61666    Q39  Q170156  P2855   7.7            Q11229  'confederation'@en   

      property_label      id lower_bound upper_bound  
61666  'VAT-rate'@en  E61667                          

only one sample for label:
      entity    node1  label node2 si_units wd_units          type_label  \
61670    Q39  Q170156  P2884   230            Q25250  'confederation'@en   

           property_label      id lower_bound upper_bound  
61670  'mains voltage'@en  E61671                          

only one sample for label:
      entity    node1  label node2 si_units   wd_units          type_label  \
61674    Q39  Q170156  P2997    18           Q24564698  'confederation'@en   

             property_label      id lower_bound upper_bound  
61674  'age of majority'@en  E61675                          

only one sample for label:
      entity    node1  label node2 si_units   wd_units          type_lab

only one sample for label:
      entity    node1  label node2 si_units wd_units  \
65112   Q408  Q223832  P2049  4000           Q828224   

                                type_label property_label      id lower_bound  \
65112  'dominion of the British Empire'@en     'width'@en  E65113               

      upper_bound  
65112              

only one sample for label:
      entity     node1  label node2 si_units wd_units            type_label  \
65113   Q408  Q3624078  P2049  4000           Q828224  'sovereign state'@en   

      property_label      id lower_bound upper_bound  
65113     'width'@en  E65114                          

only one sample for label:
      entity   node1  label node2 si_units wd_units          type_label  \
65114   Q408  Q43702  P2049  4000           Q828224  'federal state'@en   

      property_label      id lower_bound upper_bound  
65114     'width'@en  E65115                          

no knee found for these labels:
      entity    node1  label  node2 si

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
      entity    node1  label node2 si_units   wd_units  \
66076   Q408  Q223832  P2998    18           Q24564698   

                                type_label         property_label      id  \
66076  'dominion of the British Empire'@en  'age of candidacy'@en  E66077   

      lower_bound upper_bound  
66076                          

only one sample for label:
      entity   node1  label node2 si_units   wd_units          type_label  \
66078   Q408  Q43702  P2998    18           Q24564698  'federal state'@en   

              property_label      id lower_bound upper_bound  
66078  'age of candidacy'@en  E66079                          

no knee found for these labels:
      entity    node1  label node2 si_units   wd_units  \
66079   Q408  Q202686  P3000    16           Q24564698   
66083   Q408  Q202686  P3000    18           Q24564698   

                    type_label         property_label      id lower_bound  \
66079  'Commonwealth realm'@en  'marriageab

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument


only one sample for label:
      entity  node1  label node2 si_units wd_units     type_label  \
67798    Q41  Q7270  P2997    18                    'republic'@en   

             property_label      id lower_bound upper_bound  
67798  'age of majority'@en  E67799                          

no knee found for these labels:
      entity   node1  label   node2 si_units wd_units  \
69600   Q424  Q41614  P2046  181035           Q712226   
79289   Q869  Q41614  P2046  513120           Q712226   

                         type_label property_label      id lower_bound  \
69600  'constitutional monarchy'@en      'area'@en  E69601               
79289  'constitutional monarchy'@en      'area'@en  E79290               

      upper_bound  
69600              
79289              
using median distance to k nearest neighbor instead (332084.5)

no knee found for these labels:
      entity   node1  label node2 si_units wd_units  \
69915   Q424  Q41614  P2219     7            Q11229   
79639   Q869  Q4

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument

no knee found for these labels:
      entity   node1  label node2 si_units wd_units  \
70068   Q424  Q41614  P2855    10            Q11229   
79745   Q869  Q41614  P2855    10            Q11229   

                         type_label property_label      id lower_bound  \
70068  'constitutional monarchy'@en  'VAT-rate'@en  E70069               
79745  'constitutional monarchy'@en  'VAT-rate'@en  E79746               

      upper_bound  
70068              
79745              
using median distance to k nearest neighbor instead (0.0)

no knee found for these labels:
      entity   node1  label node2 si_units wd_units  \
70071   Q424  Q41614  P2884   230            Q25250   
79747   Q869  Q41614  P2884   220            Q25250   

                         type_label      property_label      id lower_bound  \
70071  'constitutional monarchy'@en  'mains voltage'@en  E70072               
79747  'constitutional monarchy'@en  'mains voltage'@en  E79748               

      upper_bound  
7007

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


no knee found for these labels:
      entity     node1  label node2 si_units   wd_units  \
71739    Q43  Q1323642  P3001    58           Q24564698   
71743    Q43  Q1323642  P3001    60           Q24564698   

                          type_label       property_label      id lower_bound  \
71739  'transcontinental country'@en  'retirement age'@en  E71740               
71743  'transcontinental country'@en  'retirement age'@en  E71744               

      upper_bound  
71739              
71743              
using median distance to k nearest neighbor instead (2.0)

no knee found for these labels:
        entity node1  label node2 si_units wd_units  type_label  \
73212  Q454138    Q5  P1971    10                    'human'@en   
73213  Q454138    Q5  P1971    21                    'human'@en   

                property_label      id lower_bound upper_bound  
73212  'number of children'@en  E73213                          
73213  'number of children'@en  E73214                         

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
      entity      node1  label node2 si_units wd_units  \
73719    Q55  Q15304003  P2219   2.1            Q11229   

                                           type_label  \
73719  'country of the Kingdom of the Netherlands'@en   

                                     property_label      id lower_bound  \
73719  'real gross domestic product growth rate'@en  E73720               

      upper_bound  
73719              

only one sample for label:
      entity      node1  label node2 si_units wd_units  \
73853    Q55  Q15304003  P2884   230            Q25250   

                                           type_label      property_label  \
73853  'country of the Kingdom of the Netherlands'@en  'mains voltage'@en   

           id lower_bound upper_bound  
73853  E73854                          

only one sample for label:
      entity      node1  label node2 si_units wd_units  \
73855    Q55  Q15304003  P2927  18.7            Q11229   

                         

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
      entity      node1  label node2 si_units   wd_units  \
73901    Q55  Q15304003  P3270     5           Q24564698   

                                           type_label  \
73901  'country of the Kingdom of the Netherlands'@en   

                                property_label      id lower_bound upper_bound  
73901  'compulsory education (minimum age)'@en  E73902                          

only one sample for label:
      entity      node1  label node2 si_units   wd_units  \
73903    Q55  Q15304003  P3271    18           Q24564698   

                                           type_label  \
73903  'country of the Kingdom of the Netherlands'@en   

                                property_label      id lower_bound upper_bound  
73903  'compulsory education (maximum age)'@en  E73904                          

only one sample for label:
      entity      node1  label  node2 si_units wd_units  \
73905    Q55  Q15304003  P3529  38584             Q4917   

  

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
      entity     node1  label  node2 si_units wd_units     type_label  \
74114    Q61  Q1549591  P2927  10.67            Q11229  'big city'@en   

                      property_label      id lower_bound upper_bound  
74114  'water as percent of area'@en  E74115                          

only one sample for label:
      entity    node1  label  node2 si_units wd_units             type_label  \
74115    Q61  Q475050  P2927  10.67            Q11229  'federal district'@en   

                      property_label      id lower_bound upper_bound  
74115  'water as percent of area'@en  E74116                          

only one sample for label:
      entity  node1  label  node2 si_units wd_units    type_label  \
74116    Q61  Q5119  P2927  10.67            Q11229  'capital'@en   

                      property_label      id lower_bound upper_bound  
74116  'water as percent of area'@en  E74117                          

only one sample for label:
      entity    

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
      entity   node1  label node2 si_units wd_units             type_label  \
74139   Q663  Q11344  P2101  1220            Q42289  'chemical element'@en   

           property_label      id lower_bound upper_bound  
74139  'melting point'@en  E74140                          

only one sample for label:
      entity    node1  label node2 si_units wd_units          type_label  \
74140   Q663  Q214609  P2101  1220            Q42289  'base material'@en   

           property_label      id lower_bound upper_bound  
74140  'melting point'@en  E74141                          

only one sample for label:
      entity   node1  label node2 si_units wd_units             type_label  \
74141   Q663  Q11344  P2101   660            Q25267  'chemical element'@en   

           property_label      id lower_bound upper_bound  
74141  'melting point'@en  E74142                          

only one sample for label:
      entity    node1  label node2 si_units wd_units          

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


no knee found for these labels:
      entity   node1  label node2 si_units   wd_units             type_label  \
74153   Q663  Q11344  P2404    10           Q21006887  'chemical element'@en   
74163   Q663  Q11344  P2404     5           Q21006887  'chemical element'@en   

                                  property_label      id lower_bound  \
74153  'time-weighted average exposure limit'@en  E74154               
74163  'time-weighted average exposure limit'@en  E74164               

      upper_bound  
74153              
74163              
using median distance to k nearest neighbor instead (5.0)

no knee found for these labels:
      entity    node1  label node2 si_units   wd_units          type_label  \
74154   Q663  Q214609  P2404    10           Q21006887  'base material'@en   
74164   Q663  Q214609  P2404     5           Q21006887  'base material'@en   

                                  property_label      id lower_bound  \
74154  'time-weighted average exposure limit'@en  E7

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
      entity   node1  label node2 si_units wd_units              type_label  \
78682   Q837  Q82794  P2219   0.6            Q11229  'geographic region'@en   

                                     property_label      id lower_bound  \
78682  'real gross domestic product growth rate'@en  E78683               

      upper_bound  
78682              

no knee found for these labels:
      entity   node1  label node2 si_units wd_units              type_label  \
78838   Q837  Q82794  P2855     0            Q11229  'geographic region'@en   
78841   Q837  Q82794  P2855    13            Q11229  'geographic region'@en   

      property_label      id lower_bound upper_bound  
78838  'VAT-rate'@en  E78839                          
78841  'VAT-rate'@en  E78842                          
using median distance to k nearest neighbor instead (13.0)

only one sample for label:
      entity   node1  label node2 si_units wd_units              type_label  \
78844   Q837  Q82794 

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')
  return (a - min(a)) / (max(a) - min(a))
The line is probably not polynomial, try plotting
the difference curve with plt.plot(knee.x_difference, knee.y_difference)
Also check that you aren't mistakenly setting the curve argument


only one sample for label:
      entity     node1  label        node2 si_units wd_units  \
79238   Q869  Q3624078  P1174  3.25883e+07                     

                 type_label          property_label      id lower_bound  \
79238  'sovereign state'@en  'visitors per year'@en  E79239               

      upper_bound  
79238              

only one sample for label:
      entity   node1  label        node2 si_units wd_units  \
79239   Q869  Q41614  P1174  3.25883e+07                     

                         type_label          property_label      id  \
79239  'constitutional monarchy'@en  'visitors per year'@en  E79240   

      lower_bound upper_bound  
79239                          

only one sample for label:
      entity   node1  label node2 si_units wd_units  \
79241   Q869  Q41614  P1198   0.9            Q11229   

                         type_label          property_label      id  \
79241  'constitutional monarchy'@en  'unemployment rate'@en  E79242   

      lower

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
      entity    node1  label node2 si_units wd_units              type_label  \
80514   Q916  Q200464  P6897    66            Q11229  'Portuguese Empire'@en   

           property_label      id lower_bound upper_bound  
80514  'literacy rate'@en  E80515                          

only one sample for label:
          entity    node1  label node2 si_units  wd_units  type_label  \
80549  Q93552342  Q253481  P2665     5           Q2080811  'lager'@en   

               property_label      id lower_bound upper_bound  
80549  'alcohol by volume'@en  E80550                          



  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')

no knee found for these labels:
          entity      node1  label node2 si_units wd_units       type_label  \
80553  Q93552342  Q15075508  P6088    20                    'beer brand'@en   
80566  Q93557205  Q15075508  P6088  29.5                    'beer brand'@en   
80578  Q93558270  Q15075508  P6088    20                    'beer brand'@en   
80592  Q93559285  Q15075508  P6088    15                    'beer brand'@en   
80605  Q93560567  Q15075508  P6088    26                    'beer brand'@en   

             property_label      id lower_bound upper_bound  
80553  'beer bitterness'@en  E80554                          
80566  'beer bitterness'@en  E80567                          
80578  'beer bitterness'@en  E80579                          
80592  'beer bitterness'@en  E80593                          
80605  'beer bitterness'@en  E80606                          
using median distance to k nearest neighbor instead (3.5)

no knee found for these labels:
          entity node1  label 

  kneedle = KneeLocator(range(len(distances)), distances, S=1, curve='convex', direction='increasing', interp_method = 'polynomial')


only one sample for label:
          entity    node1  label node2 si_units  wd_units  type_label  \
80601  Q93560567  Q217825  P2665   5.5           Q2080811  'stout'@en   

               property_label      id lower_bound upper_bound  
80601  'alcohol by volume'@en  E80602                          

only one sample for label:
          entity    node1  label node2 si_units wd_units  type_label  \
80608  Q93560567  Q217825  P6088    26                    'stout'@en   

             property_label      id lower_bound upper_bound  
80608  'beer bitterness'@en  E80609                          

only one sample for label:
          entity    node1  label node2 si_units  wd_units  \
82397  Q97412285  Q524679  P2665   0.4           Q2080811   

                  type_label          property_label      id lower_bound  \
82397  'low-alcohol beer'@en  'alcohol by volume'@en  E82398               

      upper_bound  
82397              
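
The log above shows three outcomes for each group of values: a knee is located, no knee is found and the median distance to the k nearest neighbor is used instead, or the group has only one sample. The following is a minimal sketch of that fallback for 1-D values. The function names, the `find_knee` hook, and the choice `k=1` are hypothetical and are not taken from `label_discretization`.

```python
import numpy as np


def kth_neighbor_distances(values, k=1):
    """Sorted distances from each 1-D value to its k-th nearest other value."""
    arr = np.asarray(values, dtype=float)
    # Row i holds the absolute differences between value i and every value.
    pairwise = np.abs(arr[:, None] - arr[None, :])
    # After sorting each row, column 0 is the distance of a value to itself (0),
    # so column k is the distance to the k-th nearest other value.
    pairwise.sort(axis=1)
    return np.sort(pairwise[:, k])


def choose_distance(values, k=1, find_knee=None):
    """Return a distance threshold, or None when there are too few samples."""
    if len(values) <= k:
        return None  # corresponds to "only one sample for label" when k=1
    distances = kth_neighbor_distances(values, k)
    # find_knee is a hypothetical hook: it takes the sorted distances and
    # returns a distance at the knee, or None when no knee is found.
    knee = find_knee(distances) if find_knee is not None else None
    if knee is None:
        # "no knee found": fall back to the median k-th neighbor distance.
        return float(np.median(distances))
    return float(knee)


print(choose_distance([20, 29.5, 20, 15, 26]))  # 'beer bitterness' values from the log
print(choose_distance([18]))                    # a single sample
```

With `k=1` the sketch returns 3.5 for the five 'beer bitterness' values, which agrees with the fallback value printed in the log. The value of k actually used by `discretize_labels` is not visible in this output.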



In [64]:
display(pd.read_csv("{}/entity_attribute_labels_quantity_bucketed.tsv".format(os.environ["OUT"]), delimiter = '\t', nrows=10).fillna(""))

Unnamed: 0,entity,node1,label,node2,si_units,wd_units,type_label,property_label,id,lower_bound,upper_bound
0,Q1000597,Q3957,P1082,75074.0,,,'town'@en,'population'@en,E1,,
1,Q1011,Q112099,P1081,0.57,,,'island nation'@en,'Human Development Index'@en,E2,0.534,0.616
2,Q1011,Q3624078,P1081,0.57,,,'sovereign state'@en,'Human Development Index'@en,E3,0.565,0.5865
3,Q1011,Q112099,P1081,0.572,,,'island nation'@en,'Human Development Index'@en,E4,0.534,0.616
4,Q1011,Q3624078,P1081,0.572,,,'sovereign state'@en,'Human Development Index'@en,E5,0.565,0.5865
5,Q1011,Q112099,P1081,0.575,,,'island nation'@en,'Human Development Index'@en,E6,0.534,0.616
6,Q1011,Q3624078,P1081,0.575,,,'sovereign state'@en,'Human Development Index'@en,E7,0.565,0.5865
7,Q1011,Q112099,P1081,0.585,,,'island nation'@en,'Human Development Index'@en,E8,0.534,0.616
8,Q1011,Q3624078,P1081,0.585,,,'sovereign state'@en,'Human Development Index'@en,E9,0.565,0.5865
9,Q1011,Q112099,P1081,0.589,,,'island nation'@en,'Human Development Index'@en,E10,0.534,0.616


Aggregating distinct interval labels with positive entity counts

In [65]:
!kgtk query -i $OUT/entity_attribute_labels_quantity_bucketed.tsv \
-o $OUT/candidate_labels_ail_quantity.tsv \
--graph-cache $STORE \
--match 'labels: (type)-[l {label:prop, property_label:lab, si_units:si, wd_units:wd, entity:e, lower_bound:lb, upper_bound:ub}]->(val)' \
--return 'type as type, prop as prop, si as si_units, wd as wd_units, lb as lower_bound, ub as upper_bound, count(e) as positives, lab as property_label, "_" as id' \
--order-by 'count(e) desc'
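
For readers less familiar with Kypher, the aggregation above is roughly a group-by over the label columns with a count of the entity column. The following pandas sketch shows the same idea on made-up rows; the toy values and variable names are illustrative and are not taken from the Q44 data.

```python
import pandas as pd

# Toy stand-in for entity_attribute_labels_quantity_bucketed.tsv (values are made up).
# Bounds are kept as strings so that empty bounds form their own group
# instead of being dropped as NaN by groupby.
bucketed = pd.DataFrame(
    {
        "entity": ["Q1", "Q2", "Q3", "Q4"],
        "node1": ["Q3624078", "Q3624078", "Q3624078", "Q6256"],
        "label": ["P1081", "P1081", "P1082", "P1081"],
        "si_units": ["", "", "", ""],
        "wd_units": ["", "", "", ""],
        "lower_bound": ["0.595", "0.595", "", "0.595"],
        "upper_bound": ["0.9405", "0.9405", "41699680", "0.9405"],
    }
)

group_cols = ["node1", "label", "si_units", "wd_units", "lower_bound", "upper_bound"]

candidate_labels = (
    bucketed.groupby(group_cols)["entity"]
    .count()  # mirrors count(e): one positive per matching row
    .reset_index(name="positives")
    .sort_values("positives", ascending=False)
    .reset_index(drop=True)
)
print(candidate_labels)
```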

In [66]:
rename_cols_and_overwrite_id("$OUT/candidate_labels_ail_quantity", ".tsv", "type prop lower_bound", "node1 label node2")
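
`rename_cols_and_overwrite_id` comes from this repository's `utility` module, and its source is not shown in this notebook. Judging only from the call above and the resulting table, it appears to rename the listed columns and replace the placeholder `_` ids with `E1`, `E2`, and so on. The following is a hypothetical DataFrame-level stand-in for that behavior, not the actual helper.

```python
import pandas as pd


def rename_and_renumber(df, old_cols, new_cols):
    """Hypothetical stand-in: rename columns, then overwrite `id` with E1, E2, ..."""
    # Both arguments are space-separated column lists, as in the call above.
    mapping = dict(zip(old_cols.split(), new_cols.split()))
    out = df.rename(columns=mapping)  # returns a new DataFrame
    out["id"] = ["E{}".format(i) for i in range(1, len(out) + 1)]
    return out


# Made-up rows with the same column roles as the query output.
toy = pd.DataFrame(
    {
        "type": ["Q3624078", "Q6256"],
        "prop": ["P1081", "P2131"],
        "lower_bound": ["0.595", ""],
        "upper_bound": ["0.9405", "333496000000"],
        "id": ["_", "_"],
    }
)
renamed = rename_and_renumber(toy, "type prop lower_bound", "node1 label node2")
print(renamed)
```

If such a result is written with `DataFrame.to_csv`, passing `index=False` avoids an extra unnamed index column, which is the usual source of an `Unnamed: 0` column when a file is read back with `pd.read_csv`.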

In [67]:
display(pd.read_csv("{}/candidate_labels_ail_quantity.tsv".format(os.environ["OUT"]), delimiter = '\t', nrows=10).fillna(""))

Unnamed: 0,node1,label,si_units,wd_units,node2,upper_bound,positives,property_label,id
0,Q3624078,P2131,,Q4917,,800249400000.0,2752,'nominal GDP'@en,E1
1,Q3624078,P1082,,,,41699680.0,2689,'population'@en,E2
2,Q3624078,P2134,,Q4917,,67399250000.0,2529,'total reserves'@en,E3
3,Q3624078,P2132,,Q4917,,21120.5,2424,'nominal GDP per capita'@en,E4
4,Q3624078,P1081,,,0.595,0.9405,1893,'Human Development Index'@en,E5
5,Q3624078,P1279,,Q11229,-5.6,42.7,1516,'inflation rate'@en,E6
6,Q3624078,P4010,,Q550207,,1305344000000.0,1514,'GDP (PPP)'@en,E7
7,Q3624078,P2299,,Q550207,,31357.38,1463,'PPP GDP per capita'@en,E8
8,Q6256,P2131,,Q4917,,333496000000.0,951,'nominal GDP'@en,E9
9,Q6256,P2132,,Q4917,,29427.0,939,'nominal GDP per capita'@en,E10


### 4.3 Combining entity --> attribute interval label mappings into a single table

In [68]:
!kgtk cat \
-i $OUT/entity_attribute_labels_quantity_bucketed.tsv \
-i $OUT/entity_attribute_labels_time.year_bucketed.tsv \
-o $OUT/entity_AILs_all.tsv

## 5. Create RALs with counts of positive entities

### 5.0 Trying out different ways to create RALs

As a reference for the kind of values we should find, we first create RALs from scratch, for quantity attributes only.

In [69]:
!kgtk query -i $ITEM_FILE -i $QUANTITY_FILE \
-i $OUT/type_mapping.tsv -i $LABEL_FILE --graph-cache $STORE \
--match '`'"$ITEM_FILE"'`: (n1)-[l1 {label:p1}]->(n2), type: (n1)-[]->(type1), `'"$LABEL_FILE"'`: (p1)-[:label]->(lab1), `'"$QUANTITY_FILE"'`: (n2)-[l2 {label:p2}]->(n3), type: (n2)-[]->(type2), `'"$LABEL_FILE"'`: (p2)-[:label]->(lab2)' \
--return 'distinct type1 as type1, p1 as prop1, type2 as type2, p2 as prop2, n3 as value, count(distinct n1) as positives, lab1 as prop1_label, lab2 as prop2_label' \
--where 'lab1.kgtk_lqstring_lang_suffix = "en"' \
--where 'lab2.kgtk_lqstring_lang_suffix = "en"' \
--order-by 'count(distinct n1) desc' \
--limit 5 \
| column -t -s $'\t'

type1     prop1  type2     prop2  value   positives  prop1_label               prop2_label
Q3624078  P530   Q3624078  P1081  +0.801  68         'diplomatic relation'@en  'Human Development Index'@en
Q3624078  P530   Q3624078  P1081  +0.809  68         'diplomatic relation'@en  'Human Development Index'@en
Q3624078  P530   Q3624078  P1081  +0.814  68         'diplomatic relation'@en  'Human Development Index'@en
Q3624078  P530   Q3624078  P1081  +0.824  68         'diplomatic relation'@en  'Human Development Index'@en
Q3624078  P530   Q3624078  P1081  +0.829  68         'diplomatic relation'@en  'Human Development Index'@en


Next, we use the REL table and the entity --> attribute label table that we built in steps 2 and 4, keeping track of the count of positive entities for each label.

This attempt reuses the RELs. However, to count positives correctly we would need to sum the num_pos of every REL row that resolves to the same output line, and it is not clear how to do this. As written, the query below does not capture the case where type1 --> x1 and type1 --> x2 resolve to the same label, i.e. type1 --> typex --> val. Note that its results differ from those of the method above, which we are confident in (67 positives here versus 68 above). See further down for an alternative solution.
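
The counting problem can be illustrated on a toy graph with pandas. The sketch below uses made-up entities and types and is not the notebook's query. Three entities of `typeA` point through the same property to two target entities that share a type and an attribute value. The per-target REL counts, their sum, and the count of distinct source entities all come out differently.

```python
import pandas as pd

# Toy graph (all identifiers are made up for illustration).
edges = pd.DataFrame(
    {
        "n1": ["a1", "a3", "a2", "a3"],
        "prop1": ["P530"] * 4,
        "n2": ["x1", "x1", "x2", "x2"],
    }
)
types = {"a1": "typeA", "a2": "typeA", "a3": "typeA", "x1": "typeX", "x2": "typeX"}
# x1 and x2 carry the same attribute value, so they resolve to the same label.
attrs = pd.DataFrame(
    {"entity": ["x1", "x2"], "prop2": ["P1081", "P1081"], "value": ["0.801", "0.801"]}
)

edges["type1"] = edges["n1"].map(types)
attrs["type2"] = attrs["entity"].map(types)
label_cols = ["type1", "prop1", "type2", "prop2", "value"]

# Approach 1: reuse per-target REL counts, then attach the target's attribute.
rel = edges.groupby(["type1", "prop1", "n2"])["n1"].nunique().reset_index(name="num_pos")
via_rel = rel.merge(attrs, left_on="n2", right_on="entity")
per_row = via_rel["num_pos"].tolist()  # one count per target entity
summed = via_rel.groupby(label_cols)["num_pos"].sum().iloc[0]  # counts a3 twice

# Approach 2: join first, then count distinct source entities per label.
joined = edges.merge(attrs, left_on="n2", right_on="entity")
distinct = joined.groupby(label_cols)["n1"].nunique().iloc[0]

print(per_row, summed, distinct)
```

Here each REL row reports 2, their sum is 4, and the distinct count is 3, so neither reusing nor summing the per-target counts recovers the number of distinct source entities.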

In [70]:
!kgtk query -i $OUT/candidate_labels_rel_item.tsv -i $OUT/entity_attribute_labels_quantity.tsv \
--graph-cache $STORE \
--match 'rel: (t1)-[l1 {label:p1, positives:num_pos}]->(v1), entity_attribute: (t2)-[l {entity:v1, label:p2}]->(v2)' \
--return 't1 as type, p1 as prop, t2 as value_type, p2 as value_prop, v2 as value_val, num_pos as positives, "_" as id' \
--order-by "kgtk_quantity_number_int(num_pos) desc" \
--limit 5 \
| column -t -s $'\t'

type      prop  value_type  value_prop  value_val  positives  id
Q3624078  P530  Q3624078    P1081       0.801      67         _
Q3624078  P530  Q4209223    P1081       0.801      67         _
Q3624078  P530  Q43702      P1081       0.801      67         _
Q3624078  P530  Q619610     P1081       0.801      67         _
Q3624078  P530  Q63791824   P1081       0.801      67         _


This approach should work: its results match those of the first method, which we are confident in.

In [72]:
!kgtk query -i $ITEM_FILE -i $OUT/type_mapping.tsv \
-i $OUT/entity_attribute_labels_quantity.tsv --graph-cache $STORE \
--match '`'"$ITEM_FILE"'`: (n1)-[l1 {label:p1}]->(n2), type: (n1)-[]->(t1), entity_attribute: (t2)-[l2 {label:p2, entity:n2}]->(val)' \
--return 't1 as type1, p1 as prop1, t2 as type2, p2 as prop2, val as value, count(distinct n1) as positives, "_" as id' \
--order-by "count(distinct n1) desc" \
--limit 5 \
| column -t -s $'\t'

type1     prop1  type2     prop2  value  positives  id
Q3624078  P530   Q3624078  P1081  0.801  68         _
Q3624078  P530   Q3624078  P1081  0.809  68         _
Q3624078  P530   Q3624078  P1081  0.814  68         _
Q3624078  P530   Q3624078  P1081  0.824  68         _
Q3624078  P530   Q3624078  P1081  0.829  68         _


Now we do this for all attribute types:

### 5.1 RALs created from attribute *value* labels:

In [73]:
!kgtk query -i $ITEM_FILE -i $OUT/type_mapping.tsv -i $LABEL_FILE \
-i $OUT/entity_AVLs_all.tsv -o $OUT/candidate_labels_ravl.tsv --graph-cache $STORE \
--match '`'"$ITEM_FILE"'`: (n1)-[l1 {label:p1}]->(n2), type: (n1)-[]->(t1), entity_AVLs: (t2)-[l2 {label:p2, entity:n2, si_units:si, wd_units:wd}]->(val), `'"$LABEL_FILE"'`: (p2)-[:label]->(lab2)' \
--return 't1 as type1, p1 as prop1, t2 as type2, p2 as prop2, lab2 as prop2_label, val as value, count(distinct n1) as positives, si as si_units, wd as wd_units, "_" as id' \
--order-by "count(distinct n1) desc" \
--where 'lab2.kgtk_lqstring_lang_suffix = "en"'

In [74]:
rename_cols_and_overwrite_id("$OUT/candidate_labels_ravl", ".tsv", "type1 prop1 type2", "node1 label node2")

In [406]:
display(pd.read_csv("{}/candidate_labels_ravl.tsv".format(os.environ["OUT"]), delimiter = '\t', nrows=10).fillna(""))

Unnamed: 0,node1,label,node2,prop2,prop2_label,value,positives,si_units,wd_units,id
0,Q131734,P452,Q8148,P373,'Commons category'@en,Beer brewing,77,,,E1
1,Q131734,P452,Q8148,P580,'start time'@en,-3500,77,,,E2
2,Q3624078,P530,Q3624078,P1081,'Human Development Index'@en,0.801,68,,,E3
3,Q3624078,P530,Q3624078,P1081,'Human Development Index'@en,0.809,68,,,E4
4,Q3624078,P530,Q3624078,P1081,'Human Development Index'@en,0.814,68,,,E5
5,Q3624078,P530,Q3624078,P1081,'Human Development Index'@en,0.824,68,,,E6
6,Q3624078,P530,Q3624078,P1081,'Human Development Index'@en,0.829,68,,,E7
7,Q3624078,P530,Q3624078,P1081,'Human Development Index'@en,0.83,68,,,E8
8,Q3624078,P530,Q3624078,P1081,'Human Development Index'@en,0.834,68,,,E9
9,Q3624078,P530,Q3624078,P1081,'Human Development Index'@en,0.839,68,,,E10


### 5.2 RALs created from attribute *interval* labels:

In [77]:
!kgtk query -i $ITEM_FILE -i $OUT/type_mapping.tsv -i $LABEL_FILE \
-i $OUT/entity_AILs_all.tsv -o $OUT/candidate_labels_rail.tsv --graph-cache $STORE \
--match '`'"$ITEM_FILE"'`: (n1)-[l1 {label:p1}]->(n2), type: (n1)-[]->(t1), entity_AILs: (t2)-[l2 {label:p2, entity:n2, lower_bound:lb, upper_bound:ub, wd_units:wd, si_units:si}]->(val), `'"$LABEL_FILE"'`: (p2)-[:label]->(lab2)' \
--return 't1 as type1, p1 as prop1, t2 as type2, p2 as prop2, lab2 as prop2_label, si as si_units, wd as wd_units, lb as lower_bound, ub as upper_bound, count(distinct n1) as positives, "_" as id' \
--order-by "count(distinct n1) desc" \
--where 'lab2.kgtk_lqstring_lang_suffix = "en"'

In [78]:
rename_cols_and_overwrite_id("$OUT/candidate_labels_rail", ".tsv", "type1 prop1 type2", "node1 label node2")

In [79]:
display(pd.read_csv("{}/candidate_labels_rail.tsv".format(os.environ["OUT"]), delimiter = '\t', nrows=10).fillna(""))

Unnamed: 0,node1,label,node2,prop2,prop2_label,si_units,wd_units,lower_bound,upper_bound,positives,id
0,Q131734,P452,Q8148,P580,'start time'@en,,,,,77,E1
1,Q131734,P17,Q3624078,P1081,'Human Development Index'@en,,,0.595,0.9405,70,E2
2,Q131734,P17,Q3624078,P1279,'inflation rate'@en,,Q11229,-5.6,42.7,70,E3
3,Q131734,P17,Q3624078,P2131,'nominal GDP'@en,,Q4917,,800249000000.0,70,E4
4,Q131734,P17,Q3624078,P2219,'real gross domestic product growth rate'@en,,Q11229,-4.61,,70,E5
5,Q131734,P17,Q3624078,P2250,'life expectancy'@en,,Q577,70.0592,83.0841,70,E6
6,Q131734,P17,Q3624078,P2299,'PPP GDP per capita'@en,,Q550207,,31357.4,70,E7
7,Q131734,P17,Q3624078,P2132,'nominal GDP per capita'@en,,Q4917,,21120.5,69,E8
8,Q131734,P17,Q3624078,P2134,'total reserves'@en,,Q4917,,67399200000.0,69,E9
9,Q131734,P17,Q3624078,P4841,'total fertility rate'@en,,,1.1235,2.1815,69,E10
