# Creating a subset of Wikidata

This notebook illustrates how to create a subset of Wikidata. As an example, we use https://www.wikidata.org/wiki/Q11173 (chemical compound).

Parameters are set up in the first cell so that we can run this notebook in batch mode. Example invocation command:

```
papermill Example8\ -\ Wikidata\ Subset.ipynb example8.out.ipynb \
-p wikidata_home /Users/pedroszekely/Downloads/kypher \
-p wikidata_file all.10.tsv.gz \
-p wikidata_parts_folder /Users/pedroszekely/Downloads/kypher/output.all.10 \
-p subset_name Q11173 \
-p home /Users/pedroszekely/Downloads/kypher \
-p delete_database yes 
```

In [1]:
# Parameters
wikidata_home = "/Users/pedroszekely/Downloads/kypher"
wikidata_file = "wikidata-20200803-all-edges.tsv.gz"
wikidata_file = "all.10.tsv.gz"
wikidata_parts_folder = "/Users/pedroszekely/Downloads/kypher/useful_wikidata_files"
wikidata_parts_folder = "/Users/pedroszekely/Downloads/kypher/output.all.10"
subset_name = "Q11173"
#subset_name = "Q318"
#subset_name = "Q5"
home = "/Users/pedroszekely/Downloads/kypher"
delete_database = "yes"

In [2]:
temp_folder = subset_name + "-temp"
output_folder = subset_name

In [3]:
import io
import os
import subprocess
import sys

import numpy as np
import pandas as pd

# from IPython.display import display, HTML, Image
# from pandas_profiling import ProfileReport

A convenience function to run templated commands, substituting NAME with the name of the dataset and substituting any other keys provided in a dictionary.

In [4]:
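def run_command(command, substitution_dictionary=None):
    """Run a templated command, replacing NAME and any keys in substitution_dictionary."""
    cmd = command.replace("NAME", subset_name)
    # None instead of a mutable default; treat it as "no extra substitutions".
    for k, v in (substitution_dictionary or {}).items():
        cmd = cmd.replace(k, v)

    print(cmd)
    # With shell=True the command is passed as a single string for the shell to parse.
    output = subprocess.run(cmd, shell=True, universal_newlines=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    print(output.stdout)
    print(output.stderr)
    #print(output.returncode)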
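
To make the substitution step easier to inspect, here is a minimal sketch that only expands a template and runs nothing. The `expand_template` helper and the shortened template string are hypothetical and exist only for this illustration.

```python
def expand_template(command, subset_name, substitutions=None):
    """Return the command with NAME and any extra keys substituted; nothing is executed."""
    cmd = command.replace("NAME", subset_name)
    for key, value in (substitutions or {}).items():
        cmd = cmd.replace(key, value)
    return cmd


# Shortened, made-up template in the style of the commands used in this notebook.
template = "kgtk query -i part.TYPE_FILE.tsv.gz -o NAME.part.TYPE_FILE.tsv.gz"
expanded = expand_template(template, "Q11173", {"TYPE_FILE": "time"})
print(expanded)

# Plain string replacement also rewrites NAME inside a shell variable reference.
print(expand_template("$NAME.tsv.gz", "Q11173"))
```

Because the replacement is purely textual, a template that contains `$NAME` would have the variable name itself rewritten, so templates passed to this kind of helper should use the bare `NAME` placeholder.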

### Set up environment variables and folders that we need

In [5]:
# folder containing wikidata broken down into smaller files.
os.environ['WIKIDATA_PARTS'] = wikidata_parts_folder
# path of folder where the wikidata parts folder is stored.
os.environ['WIKIDATA_HOME'] = wikidata_home
# name of the dataset
os.environ['NAME'] = subset_name
# folder where to put the output
os.environ['OUT'] = "{}/{}".format(home, output_folder)
# temporary folder
os.environ['TEMP'] = "{}/{}".format(home, temp_folder)
# kgtk command to run
os.environ['kgtk'] = "kgtk"
os.environ['kgtk'] = "time kgtk --debug"
# absolute path of the db
os.environ['STORE'] = "{}/{}/wikidata.sqlite3.db".format(home, temp_folder)
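
The commands in this notebook refer to these variables as `$OUT`, `$TEMP`, and so on, and the shell expands them when a command runs. As a quick way to preview that expansion from Python, the sketch below uses `os.path.expandvars`. The `DEMO_*` variable names and the path are hypothetical, chosen so that the sketch does not overwrite the real settings.

```python
import os

# Hypothetical values, set under DEMO_* names to leave the real variables untouched.
demo_home = "/tmp/kypher-demo"
demo_subset = "Q11173"
os.environ["DEMO_OUT"] = "{}/{}".format(demo_home, demo_subset)
os.environ["DEMO_KGTK"] = "kgtk"
os.environ.pop("DEMO_UNSET", None)  # make sure this one is really unset

template = "$DEMO_KGTK cat $DEMO_OUT/*.gz | gzip > $DEMO_UNSET/everything.tsv.gz"
preview = os.path.expandvars(template)
print(preview)
```

`os.path.expandvars` leaves references to unset variables unchanged, which makes a missing setting easy to spot in the preview. A real shell would instead substitute an empty string.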

In [6]:
cd $home

/Users/pedroszekely/Downloads/kypher


In [7]:
!mkdir $output_folder
!mkdir $temp_folder

mkdir: Q11173: File exists
mkdir: Q11173-temp: File exists


In [8]:
!rm $OUT/*.tsv $OUT/*.tsv.gz
!rm $TEMP/*.tsv $TEMP/*.tsv.gz

rm: /Users/pedroszekely/Downloads/kypher/Q11173/*.tsv: No such file or directory
rm: /Users/pedroszekely/Downloads/kypher/Q11173-temp/*.tsv: No such file or directory
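
The `mkdir` and `rm` messages above are harmless, but they appear whenever the folders already exist or hold no matching files. A minimal sketch of a setup step that stays quiet in both cases, using only the standard library, could look like this. The `prepare_folder` helper is hypothetical and the demo runs in a throwaway directory.

```python
import glob
import os
import tempfile


def prepare_folder(path):
    """Create path if needed, delete stale .tsv and .tsv.gz files, and return their names."""
    os.makedirs(path, exist_ok=True)  # no error when the folder already exists
    removed = []
    for pattern in ("*.tsv", "*.tsv.gz"):
        for stale in glob.glob(os.path.join(path, pattern)):
            os.remove(stale)
            removed.append(os.path.basename(stale))
    return sorted(removed)


# Demo in a temporary directory so that no real data is touched.
with tempfile.TemporaryDirectory() as scratch:
    folder = os.path.join(scratch, "Q11173-temp")
    first_run = prepare_folder(folder)  # creates the folder, nothing to remove
    open(os.path.join(folder, "old.tsv"), "w").close()
    open(os.path.join(folder, "keep.db"), "w").close()
    removed = prepare_folder(folder)  # second run removes only the .tsv file
    remaining = sorted(os.listdir(folder))
print(first_run, removed, remaining)
```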


In [9]:
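# delete_database is a string, so compare it explicitly: any non-empty string, even "no", is truthy.
if delete_database == "yes":
    !rm $TEMP/wikidata.sqlite3.db
    print("Deleted database")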

Deleted database
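
Since `delete_database` arrives as text, a small parser makes the intent explicit. This is a sketch under the assumption that values such as `yes`, `y`, `true`, and `1` should all mean "delete"; the `is_yes` helper and that list of accepted spellings are hypothetical.

```python
def is_yes(value):
    """Interpret a text parameter as a boolean; anything not listed counts as 'no'."""
    return str(value).strip().lower() in {"yes", "y", "true", "1"}


print(is_yes("yes"), is_yes("no"), bool("no"))
```

The last value printed shows why a bare truthiness test is not enough: the string `"no"` is non-empty and therefore truthy.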


### Extract the Q-nodes for the items we want
Here we assume that the subset is for an individual q-node, so the subset name is the name of that q-node. We should generalize this so that the query can be passed in as a parameter. We construct a file that contains every node1 that has an isa edge to the given NAME q-node.
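
As a rough illustration of what this selection amounts to, the sketch below applies the same filter to a toy edge list in plain Python. The rows and the `select_isa` helper are hypothetical; in the notebook the work is done by `kgtk query`.

```python
def select_isa(edges, target):
    """Return the distinct (node1, label, node2) rows with label 'isa' and node2 equal to target."""
    selected = []
    seen = set()
    for row in edges:
        node1, label, node2 = row
        if label == "isa" and node2 == target and row not in seen:
            seen.add(row)
            selected.append(row)
    return selected


# Made-up rows for illustration only; QA, QB, and QC are not real Wikidata ids.
toy_edges = [
    ("QA", "isa", "Q11173"),
    ("QA", "isa", "Q11173"),       # duplicate, dropped like "distinct" in the query
    ("QB", "isa", "Q5"),           # different target, skipped
    ("QC", "P279star", "Q11173"),  # different label, skipped
]
isa_rows = select_isa(toy_edges, "Q11173")
print(isa_rows)
```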

In [10]:
command = "$kgtk query -i $WIKIDATA_PARTS/all.isa.tsv.gz -i $WIKIDATA_PARTS/all.P279star.tsv.gz \
    --graph-cache $STORE \
    -o $TEMP/all.isa.NAME.tsv.gz  \
    --match 'isa: (n1)-[l:isa]->(n2:NAME)' \
    --return 'distinct n1, l.label, n2'"
run_command(command)

$kgtk query -i $WIKIDATA_PARTS/all.isa.tsv.gz -i $WIKIDATA_PARTS/all.P279star.tsv.gz     --graph-cache $STORE     -o $TEMP/all.isa.Q11173.tsv.gz      --match 'isa: (n1)-[l:isa]->(n2:Q11173)'     --return 'distinct n1 , l.label, n2'

[2020-09-29 17:00:54 sqlstore]: IMPORT graph directly into table graph_1 from /Users/pedroszekely/Downloads/kypher/output.all.10/all.isa.tsv.gz ...
[2020-09-29 17:00:54 sqlstore]: IMPORT graph directly into table graph_2 from /Users/pedroszekely/Downloads/kypher/output.all.10/all.P279star.tsv.gz ...
[2020-09-29 17:00:54 query]: SQL Translation:
---------------------------------------------
  SELECT DISTINCT graph_1_c1."node1", graph_1_c1."label", graph_1_c1."node2"
     FROM graph_1 AS graph_1_c1
     WHERE graph_1_c1."label"=?
     AND graph_1_c1."node2"=?
  PARAS: ['isa', 'Q11173']
---------------------------------------------
[2020-09-29 17:00:54 sqlstore]: CREATE INDEX on table graph_1 column label ...
[2020-09-29 17:00:54 sqlstore]: ANALYZE INDEX on ta

### Generate the parts of this dataset

In [31]:
types = [
    "time",
    "wikibase_item",
    "math",
    "wikibase_form",
    "quantity",
    "string",
    "external_id",
    "commonsMedia",
    "globe_coordinate",
    "monolingualtext",
    "musical_notation",
    "geo_shape",
    "wikibase_property",
    "url",
]
command = "$kgtk query -i $TEMP/all.isa.NAME.tsv.gz -i $WIKIDATA_PARTS/part.TYPE_FILE.tsv.gz --graph-cache $STORE  \
    -o $OUT/NAME.part.TYPE_FILE.tsv.gz  \
    --match 'NAME: (n1)-[]->(), TYPE_FILE: (n1)-[l]->(n2)' \
    --return 'distinct l, n1, l.label, n2' \
    --order-by 'n1, l.label, n2'"
# Use a name that does not shadow the built-in type().
for type_name in types:
    run_command(command, {"TYPE_FILE": type_name.replace("-", "_")})


$kgtk query -i $TEMP/all.isa.Q11173.tsv.gz -i $WIKIDATA_PARTS/part.time.tsv.gz --graph-cache $STORE      -o $OUT/Q11173.part.time.tsv.gz      --match 'Q11173: (n1)-[]->(), time: (n1)-[l]->(n2)'     --return 'distinct l, n1, l.label, n2'     --order-by 'n1, l.label, n2'

[2020-09-29 17:03:08 query]: SQL Translation:
---------------------------------------------
  SELECT DISTINCT graph_4_c2."id", graph_3_c1."node1", graph_4_c2."label", graph_4_c2."node2"
     FROM graph_3 AS graph_3_c1, graph_4 AS graph_4_c2
     WHERE graph_3_c1."node1"=graph_4_c2."node1"
     ORDER BY graph_3_c1."node1" ASC, graph_4_c2."label" ASC, graph_4_c2."node2" ASC
  PARAS: []
---------------------------------------------
        0.72 real         0.57 user         0.13 sys

$kgtk query -i $TEMP/all.isa.Q11173.tsv.gz -i $WIKIDATA_PARTS/part.wikibase_item.tsv.gz --graph-cache $STORE      -o $OUT/Q11173.part.wikibase_item.tsv.gz      --match 'Q11173: (n1)-[]->(), wikibase_item: (n1)-[l]->(n2)'     --return 'distinc


[2020-09-29 17:03:16 query]: SQL Translation:
---------------------------------------------
  SELECT DISTINCT graph_15_c2."id", graph_15_c2."node1", graph_15_c2."label", graph_15_c2."node2"
     FROM graph_15 AS graph_15_c2, graph_3 AS graph_3_c1
     WHERE graph_15_c2."node1"=graph_3_c1."node1"
     ORDER BY graph_15_c2."node1" ASC, graph_15_c2."label" ASC, graph_15_c2."node2" ASC
  PARAS: []
---------------------------------------------
        0.67 real         0.54 user         0.11 sys

$kgtk query -i $TEMP/all.isa.Q11173.tsv.gz -i $WIKIDATA_PARTS/part.wikibase_property.tsv.gz --graph-cache $STORE      -o $OUT/Q11173.part.wikibase_property.tsv.gz      --match 'Q11173: (n1)-[]->(), wikibase_property: (n1)-[l]->(n2)'     --return 'distinct l, n1, l.label, n2'     --order-by 'n1, l.label, n2'

[2020-09-29 17:03:17 query]: SQL Translation:
---------------------------------------------
  SELECT DISTINCT graph_16_c2."id", graph_16_c2."node1", graph_16_c2."label", graph_16_c2."node2"
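
The SQL translations above show that each of these queries is a join on `node1`: an edge from a part file is kept when its `node1` also appears as `node1` in the isa file. A minimal plain-Python sketch of that join follows. The data and the `edges_for_nodes` helper are hypothetical.

```python
def edges_for_nodes(part_edges, wanted_node1s):
    """Keep distinct part edges whose node1 is wanted, sorted by node1, label, node2."""
    kept = {edge for edge in part_edges if edge[1] in wanted_node1s}
    return sorted(kept, key=lambda edge: (edge[1], edge[2], edge[3]))


# Made-up data: rows are (id, node1, label, node2); none of these are real Wikidata edges.
isa_node1s = {"QA", "QB"}
toy_part = [
    ("e3", "QB", "P2", "5"),
    ("e1", "QA", "P1", "hello"),
    ("e2", "QZ", "P1", "ignored"),  # QZ is not in the subset
    ("e1", "QA", "P1", "hello"),    # duplicate row, removed by the set
]
subset_edges = edges_for_nodes(toy_part, isa_node1s)
print(subset_edges)
```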
  

### Generate a P279star file

First generate the P279 and P31 edges for every node2 in the wikibase_item file.

In [12]:
command_p279 = "$kgtk query -i $OUT/NAME.part.wikibase_item.tsv.gz -i $WIKIDATA_PARTS/all.P279.tsv.gz --graph-cache $STORE \
-o $TEMP/node2.P279.tsv.gz \
--match 'NAME: ()-[]->(n1), P279: (n1)-[l]->(n2)' \
--return 'distinct l, n1 as node1, l.label, n2' \
--order-by 'n1, l.label, n2'"

command_p31 = "$kgtk query -i $OUT/NAME.part.wikibase_item.tsv.gz -i $WIKIDATA_PARTS/all.P31.tsv.gz --graph-cache $STORE \
-o $TEMP/node2.P31.tsv.gz \
--match 'NAME: ()-[]->(n1), P31: (n1)-[l]->(n2)' \
--return 'distinct l, n1 as node1, l.label, n2' \
--order-by 'n1, l.label, n2'"

run_command(command_p279)
run_command(command_p31)

!$kgtk cat $TEMP/node2.P279.tsv.gz $TEMP/node2.P31.tsv.gz | gzip > $TEMP/$NAME.P279_P31.tsv.gz


$kgtk query -i $OUT/Q11173.part.wikibase_item.tsv.gz -i $WIKIDATA_PARTS/all.P279.tsv.gz --graph-cache $STORE -o $TEMP/node2.P279.tsv.gz --match 'Q11173: ()-[]->(n1), P279: (n1)-[l]->(n2)' --return 'distinct l, n1 as node1, l.label, n2' --order-by 'n1, l.label, n2'

[2020-09-29 17:01:12 sqlstore]: IMPORT graph directly into table graph_18 from /Users/pedroszekely/Downloads/kypher/Q11173/Q11173.part.wikibase_item.tsv.gz ...
[2020-09-29 17:01:12 sqlstore]: IMPORT graph directly into table graph_19 from /Users/pedroszekely/Downloads/kypher/output.all.10/all.P279.tsv.gz ...
[2020-09-29 17:01:12 query]: SQL Translation:
---------------------------------------------
  SELECT DISTINCT graph_19_c2."id", graph_18_c1."node2" "node1", graph_19_c2."label", graph_19_c2."node2"
     FROM graph_18 AS graph_18_c1, graph_19 AS graph_19_c2
     WHERE graph_18_c1."node2"=graph_19_c2."node1"
     ORDER BY graph_18_c1."node2" ASC, graph_19_c2."label" ASC, graph_19_c2."node2" ASC
  PARAS: []
----------------

In [13]:
!kgtk cat -i $OUT/$NAME.part.*.tsv.gz  | gzip > $TEMP/$NAME.all_1.tsv.gz

In [14]:
command_node1 = "$kgtk query -i $WIKIDATA_PARTS/all.P279star.tsv.gz -i $TEMP/NAME.all_1.tsv.gz \
    --graph-cache $STORE  \
    -o $TEMP/P279star.1.tsv.gz \
    --match 'P279star: (n1)-[l]->(n2), all_1: (n1)-[]->()' \
    --return 'distinct l, n1, l.label, n2'"

command_node2 = "$kgtk query -i $WIKIDATA_PARTS/all.P279star.tsv.gz -i $TEMP/NAME.all_1.tsv.gz \
    --graph-cache $STORE  \
    -o $TEMP/P279star.2.tsv.gz \
    --match 'P279star: (n1)-[l]->(n2), all_1: ()-[]->(n1)' \
    --return 'distinct l, n1 as node1, l.label, n2'" 

cat_command = "$kgtk cat $TEMP/P279star.1.tsv.gz $TEMP/P279star.2.tsv.gz | gzip > $OUT/NAME.P279star.tsv.gz"

run_command(command_node1)
run_command(command_node2)
run_command(cat_command)

$kgtk query -i $WIKIDATA_PARTS/all.P279star.tsv.gz -i $TEMP/Q11173.all_1.tsv.gz     --graph-cache $STORE      -o $TEMP/P279star.1.tsv.gz     --match 'P279star: (n1)-[l]->(n2), all_1: (n1)-[]->()'     --return 'distinct l, n1, l.label, n2'

[2020-09-29 17:01:15 sqlstore]: IMPORT graph directly into table graph_21 from /Users/pedroszekely/Downloads/kypher/Q11173-temp/Q11173.all_1.tsv.gz ...
[2020-09-29 17:01:15 query]: SQL Translation:
---------------------------------------------
  SELECT DISTINCT graph_2_c1."id", graph_2_c1."node1", graph_2_c1."label", graph_2_c1."node2"
     FROM graph_2 AS graph_2_c1, graph_21 AS graph_21_c2
     WHERE graph_21_c2."node1"=graph_2_c1."node1"
  PARAS: []
---------------------------------------------
[2020-09-29 17:01:15 sqlstore]: CREATE INDEX on table graph_21 column node1 ...
[2020-09-29 17:01:15 sqlstore]: ANALYZE INDEX on table graph_21 column node1 ...
[2020-09-29 17:01:15 sqlstore]: CREATE INDEX on table graph_2 column node1 ...
[2020-09-29 17:01

### Construct a file with all the edges so we can produce the final outputs

### Get info on all properties

In [15]:
!$kgtk cat $OUT/*.gz | gzip > $TEMP/everything_1.tsv.gz

        0.66 real         0.54 user         0.11 sys


First get a list of all the properties used in this file.

In [16]:
!$kgtk query -i $TEMP/everything_1.tsv.gz --graph-cache $STORE \
-o $TEMP/properties.tsv \
--match '(n1)-[l]->(n2)' \
--return 'distinct l.label as node1, "dummy" as label, "dummy" as node2' 

[2020-09-29 17:01:18 sqlstore]: IMPORT graph directly into table graph_22 from /Users/pedroszekely/Downloads/kypher/Q11173-temp/everything_1.tsv.gz ...
[2020-09-29 17:01:18 query]: SQL Translation:
---------------------------------------------
  SELECT DISTINCT graph_22_c1."label" "node1", ? "label", ? "node2"
     FROM graph_22 AS graph_22_c1
  PARAS: ['dummy', 'dummy']
---------------------------------------------
        0.72 real         0.59 user         0.13 sys


Now get all the information about these properties.

In [17]:
!$kgtk query -i $TEMP/properties.tsv -i $WIKIDATA_PARTS/part.wikibase_item.tsv.gz --graph-cache $STORE \
-o $OUT/$NAME.properties.tsv.gz \
--match 'wikibase_item: (p)-[l]->(n2), properties: (p)-[]->()' \
--return 'distinct l, p, l.label, n2' 

[2020-09-29 17:01:19 sqlstore]: IMPORT graph directly into table graph_23 from /Users/pedroszekely/Downloads/kypher/Q11173-temp/properties.tsv ...
[2020-09-29 17:01:19 query]: SQL Translation:
---------------------------------------------
  SELECT DISTINCT graph_5_c1."id", graph_5_c1."node1", graph_5_c1."label", graph_5_c1."node2"
     FROM graph_23 AS graph_23_c2, graph_5 AS graph_5_c1
     WHERE graph_23_c2."node1"=graph_5_c1."node1"
  PARAS: []
---------------------------------------------
[2020-09-29 17:01:19 sqlstore]: CREATE INDEX on table graph_23 column node1 ...
[2020-09-29 17:01:19 sqlstore]: ANALYZE INDEX on table graph_23 column node1 ...
        0.71 real         0.55 user         0.13 sys


### Generate the labels, aliases and descriptions
We want the labels, aliases and descriptions for every q-node in our dataset. This means that we need these labels for all q-nodes that appear in the node1 or node2 position.

The first step is to concatenate all the files in our dataset.

In [18]:
!$kgtk cat $OUT/*.gz | gzip > $TEMP/everything_2.tsv.gz

        0.65 real         0.53 user         0.10 sys


Now we extract the labels from our input Wikidata folder. We do this by matching node1, then node2, and then we concatenate the resulting label files.
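
The idea can be sketched in plain Python as a single lookup against the union of all nodes seen in either position. The data and the `labels_for_dataset` helper are hypothetical, and this is an illustration of the logic rather than a replacement for the `kgtk` commands.

```python
def labels_for_dataset(dataset_edges, label_edges):
    """Return the label edges whose node1 occurs as node1 or node2 in dataset_edges."""
    nodes = set()
    for node1, _label, node2 in dataset_edges:
        nodes.add(node1)
        nodes.add(node2)
    # The set removes duplicates for nodes that occur in both positions.
    return sorted({edge for edge in label_edges if edge[0] in nodes})


# Made-up rows for illustration only.
toy_dataset = [("QA", "P1", "QB"), ("QB", "P1", "QA")]
toy_labels = [
    ("QA", "label", "'thing a'@en"),
    ("QB", "label", "'thing b'@en"),
    ("QZ", "label", "'unused'@en"),
]
found = labels_for_dataset(toy_dataset, toy_labels)
print(found)
```

One design difference is worth noting: the sketch deduplicates, whereas concatenating a node1 result and a node2 result may repeat a label edge when the same q-node appears in both positions.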

In [19]:
labels = [
    "label",
    "alias",
    "description"
]

command_node1 = "$kgtk query -i $TEMP/everything_2.tsv.gz -i $WIKIDATA_PARTS/part.LABEL.en.tsv.gz --graph-cache $STORE  \
    -o $TEMP/NAME.LABEL.en.1.tsv.gz  \
    --match 'everything_2: (n1)-[]->(), part: (n1)-[l]->(n2)' \
    --return 'distinct l, n1, l.label, n2' \
    --order-by 'n1, l.label, n2'"

command_node2 = "$kgtk query -i $TEMP/everything_2.tsv.gz -i $WIKIDATA_PARTS/part.LABEL.en.tsv.gz --graph-cache $STORE  \
    -o $TEMP/NAME.LABEL.en.2.tsv.gz  \
    --match 'everything_2: ()-[]->(n1), part: (n1)-[l]->(n2)' \
    --return 'distinct l, n1 as node1, l.label, n2' \
    --order-by 'n1, l.label, n2'"

cat_command = "kgtk cat $TEMP/NAME.LABEL.*.gz | gzip > $OUT/NAME.LABEL.en.tsv.gz"

for label in labels:
    run_command(command_node1, {"LABEL": label})
    run_command(command_node2, {"LABEL": label})
    run_command(cat_command, {"LABEL": label})


$kgtk query -i $TEMP/everything_2.tsv.gz -i $WIKIDATA_PARTS/part.label.en.tsv.gz --graph-cache $STORE      -o $TEMP/Q11173.label.en.1.tsv.gz      --match 'everything_2: (n1)-[]->(), part: (n1)-[l]->(n2)'     --return 'distinct l, n1, l.label, n2'     --order-by 'n1, l.label, n2'

[2020-09-29 17:01:21 sqlstore]: IMPORT graph directly into table graph_24 from /Users/pedroszekely/Downloads/kypher/Q11173-temp/everything_2.tsv.gz ...
[2020-09-29 17:01:21 sqlstore]: IMPORT graph directly into table graph_25 from /Users/pedroszekely/Downloads/kypher/output.all.10/part.label.en.tsv.gz ...
[2020-09-29 17:01:21 query]: SQL Translation:
---------------------------------------------
  SELECT DISTINCT graph_25_c2."id", graph_24_c1."node1", graph_25_c2."label", graph_25_c2."node2"
     FROM graph_24 AS graph_24_c1, graph_25 AS graph_25_c2
     WHERE graph_24_c1."node1"=graph_25_c2."node1"
     ORDER BY graph_24_c1."node1" ASC, graph_25_c2."label" ASC, graph_25_c2."node2" ASC
  PARAS: []
------------

### Summary of what we got

In [20]:
%%bash
for f in $OUT/*.tsv.gz; do
    echo -n `basename $f`
    gzcat $f | wc -l
done

Q11173.P279star.tsv.gz     171
Q11173.alias.en.tsv.gz     929
Q11173.description.en.tsv.gz     325
Q11173.label.en.tsv.gz     318
Q11173.part.commonsMedia.tsv.gz     169
Q11173.part.external_id.tsv.gz    4693
Q11173.part.geo_shape.tsv.gz       1
Q11173.part.globe_coordinate.tsv.gz       1
Q11173.part.math.tsv.gz       1
Q11173.part.monolingualtext.tsv.gz      26
Q11173.part.musical_notation.tsv.gz       1
Q11173.part.quantity.tsv.gz     454
Q11173.part.string.tsv.gz     660
Q11173.part.time.tsv.gz       1
Q11173.part.url.tsv.gz       2
Q11173.part.wikibase_form.tsv.gz       1
Q11173.part.wikibase_item.tsv.gz    1047
Q11173.part.wikibase_property.tsv.gz       1
Q11173.properties.tsv.gz       1
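
The same summary can be produced without `gzcat`, which is not available on every system. The sketch below counts lines with the standard `gzip` module; the `count_lines` helper and the demo file names are hypothetical. Since these TSV files start with a header line, a count of 1 presumably means that the file contains no edges.

```python
import glob
import gzip
import os
import tempfile


def count_lines(folder, pattern="*.tsv.gz"):
    """Map each matching file name in folder to its number of lines, header included."""
    counts = {}
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            counts[os.path.basename(path)] = sum(1 for _ in handle)
    return counts


# Demo on two small made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as scratch:
    with gzip.open(os.path.join(scratch, "demo.part.time.tsv.gz"), "wt", encoding="utf-8") as out:
        out.write("id\tnode1\tlabel\tnode2\n")  # header only
    with gzip.open(os.path.join(scratch, "demo.part.string.tsv.gz"), "wt", encoding="utf-8") as out:
        out.write("id\tnode1\tlabel\tnode2\n")
        out.write("e1\tQA\tP1\thello\n")
    line_counts = count_lines(scratch)
print(line_counts)
```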


Unzip the everything file, as graph-statistics cannot work with gz files.

In [21]:
!rm $TEMP/everything_2.tsv

rm: /Users/pedroszekely/Downloads/kypher/Q11173-temp/everything_2.tsv: No such file or directory


In [22]:
!gunzip --keep $TEMP/everything_2.tsv.gz
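
Some `gunzip` versions do not support `--keep`. As a portable alternative, here is a minimal sketch that decompresses a file with the standard library and leaves the `.gz` file in place. The `gunzip_keep` helper and the demo file name are hypothetical.

```python
import gzip
import os
import shutil
import tempfile


def gunzip_keep(gz_path):
    """Decompress gz_path next to itself, keep the .gz file, and return the new path."""
    if not gz_path.endswith(".gz"):
        raise ValueError("expected a .gz file: " + gz_path)
    plain_path = gz_path[:-len(".gz")]
    # Opening the target with "wb" overwrites any earlier copy, so no prior rm is needed.
    with gzip.open(gz_path, "rb") as source, open(plain_path, "wb") as target:
        shutil.copyfileobj(source, target)
    return plain_path


# Demo on a small made-up file in a temporary directory.
with tempfile.TemporaryDirectory() as scratch:
    gz_path = os.path.join(scratch, "everything_demo.tsv.gz")
    with gzip.open(gz_path, "wt", encoding="utf-8") as out:
        out.write("node1\tlabel\tnode2\n")
    plain_path = gunzip_keep(gz_path)
    both_exist = os.path.exists(gz_path) and os.path.exists(plain_path)
    with open(plain_path, encoding="utf-8") as handle:
        content = handle.read()
print(both_exist, repr(content))
```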

In [23]:
!$kgtk graph-statistics --log $OUT/$NAME.everything.statistics.txt \
    --statistics-only --pagerank -i $TEMP/everything_2.tsv \
    | gzip > $OUT/$NAME.statistics.tsv.gz

       15.21 real         4.85 user         0.86 sys


In [24]:
!cat $OUT/$NAME.everything.statistics.txt

loading the TSV graph now ...
graph loaded! It has 6726 nodes and 7214 edges

###Top relations:
P31	416
P4964	400
P6689	318
P638	275
P274	265
P231	261
P235	255
P234	252
P661	252
P662	251

###PageRank
Max pageranks
300	Q423762	0.006476
37	Q43656	0.013584
3	Q11173	0.016631
169	Q193572	0.010468
160	Q171877	0.009084


In [25]:
!exa -l $WIKIDATA_PARTS

.rw-r--r--  71k pedroszekely 27 Sep  8:37 all-distribution.tsv
.rw-r--r--  555 pedroszekely 27 Sep  8:42 all.isa.Q318.tsv.gz
.rw-r--r--   88 pedroszekely 27 Sep  8:42 all.isa.Q13442814.tsv.gz
.rw-r--r-- 577k pedroszekely 27 Sep  8:42 all.isa.tsv.gz
.rw-r--r-- 842k pedroszekely 27 Sep  8:41 all.P31.tsv.gz
.rw-r--r-- 885k pedroszekely 27 Sep  8:41 all.P3

Example of how to do simple queries

In [26]:
!$kgtk query -i $OUT/$NAME.label.en.tsv.gz --graph-cache $STORE \
--match '(n:P39)-[:label]->(n2)' \
--return 'n as node, n2 as label'

[2020-09-29 17:01:45 sqlstore]: IMPORT graph directly into table graph_28 from /Users/pedroszekely/Downloads/kypher/Q11173/Q11173.label.en.tsv.gz ...
[2020-09-29 17:01:45 query]: SQL Translation:
---------------------------------------------
  SELECT graph_28_c1."node1" "node", graph_28_c1."node2" "label"
     FROM graph_28 AS graph_28_c1
     WHERE graph_28_c1."label"=?
     AND graph_28_c1."node1"=?
  PARAS: ['label', 'P39']
---------------------------------------------
[2020-09-29 17:01:45 sqlstore]: CREATE INDEX on table graph_28 column label ...
[2020-09-29 17:01:45 sqlstore]: ANALYZE INDEX on table graph_28 column label ...
[2020-09-29 17:01:45 sqlstore]: CREATE INDEX on table graph_28 column node1 ...
[2020-09-29 17:01:45 sqlstore]: ANALYZE INDEX on table graph_28 column node1 ...
node	label
        0.90 real         0.59 user         0.18 sys


Example of how to get statistics on the properties. 

In [27]:
!kgtk query -i $TEMP/everything_2.tsv.gz -i $WIKIDATA_PARTS/part.label.en.tsv.gz --graph-cache $STORE \
--match 'everything: (n1)-[l:P106]->(n2), label: (n2)-[:label]->(label)' \
--return 'distinct l.label as property_id, label as property_label, n2 as value, count(n2) as value_count' \
--order-by 'count(n2) desc' \
--limit 10 \
| column -t -s $'\t' 

^C
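
The run above was interrupted, so it shows no result. To illustrate what the query is meant to compute, the sketch below counts how often each value occurs for one property in a toy edge list. The data and the `value_counts` helper are hypothetical, and the join that attaches labels to the values is left out.

```python
from collections import Counter


def value_counts(edges, property_id, limit=10):
    """Return up to limit (value, count) pairs for property_id, most frequent first."""
    counts = Counter(node2 for _node1, label, node2 in edges if label == property_id)
    return counts.most_common(limit)


# Made-up rows for illustration only; QA, QB, QC, QX, and QY are not real Wikidata ids.
toy_edges = [
    ("QA", "P106", "QX"),
    ("QB", "P106", "QX"),
    ("QC", "P106", "QY"),
    ("QA", "P31", "QX"),  # different property, not counted
]
top_values = value_counts(toy_edges, "P106")
print(top_values)
```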
