In [1]:
import kgtk

#### This notebook provides basic examples of how to use KGTK tools on CSKG
#### **CSKG** is a commonsense knowledge graph that combines seven popular sources into a consolidated representation:
* ATOMIC
* ConceptNet
* FrameNet
* Roget
* Visual Genome
* Wikidata (We use the Wikidata-CS subset)
* WordNet

#### **CSKG** stores each edge in ten columns, as follows:
| id | node1 | relation | node2 | node1;label | node2;label | relation;label | relation;dimension | source | sentence |
|---|-------|----------|-------|-------------|-------------|----------------|--------------------|--------|----------|
| /c/en/0-/r/DefinedAs-/c/en/empty_set-0000 | /c/en/0 | /r/DefinedAs | /c/en/empty_set | 0 | empty set | defined as | | CN | [[0]] is the [[empty set]]. |
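
To make the column layout concrete, the following sketch reads the sample edge shown above with Python's standard `csv` module. It assumes the file is plain tab-separated text with a header row, as in the query outputs in this notebook; the `SAMPLE_TSV` string is an in-memory stand-in for `cskg.tsv`, not part of KGTK.

```python
import csv
import io

# Header and one edge copied from the table above; fields are tab-separated.
SAMPLE_TSV = (
    "id\tnode1\trelation\tnode2\tnode1;label\tnode2;label\t"
    "relation;label\trelation;dimension\tsource\tsentence\n"
    "/c/en/0-/r/DefinedAs-/c/en/empty_set-0000\t/c/en/0\t/r/DefinedAs\t"
    "/c/en/empty_set\t0\tempty set\tdefined as\t\tCN\t"
    "[[0]] is the [[empty set]].\n"
)

# QUOTE_NONE keeps any quote characters in labels and sentences literal.
reader = csv.DictReader(
    io.StringIO(SAMPLE_TSV), delimiter="\t", quoting=csv.QUOTE_NONE
)
edges = list(reader)
edge = edges[0]

print(len(reader.fieldnames))  # number of columns in the header
print(edge["node1"], edge["relation"], edge["node2"])
```

Each row becomes a dictionary keyed by column name, so an empty field such as `relation;dimension` in this edge is simply an empty string.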


#### **KGTK** is a Python library for easy manipulation of knowledge graphs. In this notebook, we apply KGTK tools to CSKG.

#### 1. **Query** is the basic operation: it searches CSKG for a specific pattern and returns the results according to a return specification.
* The following example is a simple query with an anonymous edge pattern.
* We use `--limit` to control the number of results returned.

In [9]:
!kgtk query -i cskg.tsv --match '()-[]->()' --limit 5

id	node1	relation	node2	node1;label	node2;label	relation;label	relation;dimension	source	sentence
/c/en/0-/r/DefinedAs-/c/en/empty_set-0000	/c/en/0	/r/DefinedAs	/c/en/empty_set	0	empty set	defined as		CN	[[0]] is the [[empty set]].
/c/en/0-/r/DefinedAs-/c/en/first_limit_ordinal-0000	/c/en/0	/r/DefinedAs	/c/en/first_limit_ordinal	0	first limit ordinal	defined as		CN	[[0]] is the [[first limit ordinal]].
/c/en/0-/r/DefinedAs-/c/en/number_zero-0000	/c/en/0	/r/DefinedAs	/c/en/number_zero	0	number zero	defined as		CN	[[0]] is the [[number zero]].
/c/en/0-/r/HasContext-/c/en/internet_slang-0000	/c/en/0	/r/HasContext	/c/en/internet_slang	0	internet slang	has context		CN	
/c/en/0-/r/HasProperty-/c/en/pronounced_zero-0000	/c/en/0	/r/HasProperty	/c/en/pronounced_zero	0	pronounced zero	has property		CN	[["0"]] is [[pronounced zero]]


#### 2. **Return** formats and modifies the result based on the variables bound in the match pattern. (Avoid time-consuming queries.)

In [10]:
!kgtk query -i cskg.tsv --match '(p)-[r]->(n)' \
                        --limit 5 \
                        --return 'r,p,n,r.relation,r.source'

id	node1	node2	relation	source
/c/en/0-/r/DefinedAs-/c/en/empty_set-0000	/c/en/0	/c/en/empty_set	/r/DefinedAs	CN
/c/en/0-/r/DefinedAs-/c/en/first_limit_ordinal-0000	/c/en/0	/c/en/first_limit_ordinal	/r/DefinedAs	CN
/c/en/0-/r/DefinedAs-/c/en/number_zero-0000	/c/en/0	/c/en/number_zero	/r/DefinedAs	CN
/c/en/0-/r/HasContext-/c/en/internet_slang-0000	/c/en/0	/c/en/internet_slang	/r/HasContext	CN
/c/en/0-/r/HasProperty-/c/en/pronounced_zero-0000	/c/en/0	/c/en/pronounced_zero	/r/HasProperty	CN
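
The effect of the return clause above can be pictured as a column projection. The sketch below is only an illustration in plain Python, not how KGTK implements it; `project` is a hypothetical helper, and the edge is the first row of the output above with a few of its other columns added back.

```python
# Hypothetical helper: keep only the requested columns, in the requested order.
def project(edge, columns):
    return {name: edge[name] for name in columns}

# One edge with more columns than we want to return.
edge = {
    "id": "/c/en/0-/r/DefinedAs-/c/en/empty_set-0000",
    "node1": "/c/en/0",
    "relation": "/r/DefinedAs",
    "node2": "/c/en/empty_set",
    "source": "CN",
    "sentence": "[[0]] is the [[empty set]].",
}

# Same column order as the output of the query above.
row = project(edge, ["id", "node1", "node2", "relation", "source"])
print("\t".join(row.values()))
```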


#### 3. **Where** holds a possibly complex Boolean expression that is evaluated as an additional edge filter.
* The following example extracts knowledge related to bus in ConceptNet (at most 20 edges, because of `--limit`).

In [14]:
!kgtk query -i cskg.tsv --match '(p)-[r]->(n)' \
                        --limit 20 \
                        --where 'r.source= "CN" and p = "/c/en/bus"' \
                        --return 'r,p,n,r.relation,r.source'

id	node1	node2	relation	source
/c/en/bus-/r/AtLocation-/c/en/big_cities-0000	/c/en/bus	/c/en/big_cities	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/bus_station-0000	/c/en/bus	/c/en/bus_station	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/bus_stop-0000	/c/en/bus	/c/en/bus_stop	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/city-0000	/c/en/bus	/c/en/city	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/computer-0000	/c/en/bus	/c/en/computer	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/garage-0000	/c/en/bus	/c/en/garage	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/michigan-0000	/c/en/bus	/c/en/michigan	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/new_york-0000	/c/en/bus	/c/en/new_york	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/school-0000	/c/en/bus	/c/en/school	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/seats-0000	/c/en/bus	/c/en/seats	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/street-0000	/c/en/bus	/c/en/street	/r/AtLocation	CN
/c/en/bus-/r/AtLocation-/c/en/use-0000	

* The following example extracts up to 20 edges in ConceptNet that start at bus and whose relation is CapableOf, IsA, or HasA.

In [15]:
!kgtk query -i cskg.tsv --match '(p)-[r]->(n)' \
                        --limit 20 \
                        --where 'r.source= "CN" and p = "/c/en/bus" and r.relation in ["/r/CapableOf","/r/IsA","/r/HasA"]' \
                        --return 'r,p,n,r.relation,r.source'

id	node1	node2	relation	source
/c/en/bus-/r/CapableOf-/c/en/carry_passengers-0000	/c/en/bus	/c/en/carry_passengers	/r/CapableOf	CN
/c/en/bus-/r/CapableOf-/c/en/drive_down_street-0000	/c/en/bus	/c/en/drive_down_street	/r/CapableOf	CN
/c/en/bus-/r/CapableOf-/c/en/drive_to_city-0000	/c/en/bus	/c/en/drive_to_city	/r/CapableOf	CN
/c/en/bus-/r/CapableOf-/c/en/go_into_town-0000	/c/en/bus	/c/en/go_into_town	/r/CapableOf	CN
/c/en/bus-/r/CapableOf-/c/en/near_terminal-0000	/c/en/bus	/c/en/near_terminal	/r/CapableOf	CN
/c/en/bus-/r/CapableOf-/c/en/seat_capacity-0000	/c/en/bus	/c/en/seat_capacity	/r/CapableOf	CN
/c/en/bus-/r/CapableOf-/c/en/stop_at_corner-0000	/c/en/bus	/c/en/stop_at_corner	/r/CapableOf	CN
/c/en/bus-/r/CapableOf-/c/en/take_across_country-0000	/c/en/bus	/c/en/take_across_country	/r/CapableOf	CN
/c/en/bus-/r/CapableOf-/c/en/transport_many_people_at_once-0000	/c/en/bus	/c/en/transport_many_people_at_once	/r/CapableOf	CN
/c/en/bus-/r/CapableOf-/c/en/transport_people-0000	/c/en/bus	/c/e
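
The Boolean filter in the query above can be read as an ordinary predicate over edges. The following is a minimal sketch of that reading in plain Python, assuming edges are held as dictionaries; the three sample edges come from the outputs in this notebook, and `keep` and `WANTED_RELATIONS` are illustrative names, not KGTK identifiers.

```python
from itertools import islice

# Three edges taken from the outputs above, reduced to the fields the filter uses.
edges = [
    {"node1": "/c/en/0", "relation": "/r/DefinedAs",
     "node2": "/c/en/empty_set", "source": "CN"},
    {"node1": "/c/en/bus", "relation": "/r/AtLocation",
     "node2": "/c/en/bus_stop", "source": "CN"},
    {"node1": "/c/en/bus", "relation": "/r/CapableOf",
     "node2": "/c/en/carry_passengers", "source": "CN"},
]

WANTED_RELATIONS = {"/r/CapableOf", "/r/IsA", "/r/HasA"}

def keep(edge):
    # Mirrors: r.source = "CN" and p = "/c/en/bus" and r.relation in [...]
    return (
        edge["source"] == "CN"
        and edge["node1"] == "/c/en/bus"
        and edge["relation"] in WANTED_RELATIONS
    )

# islice plays the role of --limit: stop after at most 20 matching edges.
matches = list(islice(filter(keep, edges), 20))
for edge in matches:
    print(edge["node1"], edge["relation"], edge["node2"])
```

Only the CapableOf edge passes all three conditions: the first edge starts at a different node, and the AtLocation edge is not in the list of wanted relations.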

* **Query** also supports multi-hop edge patterns.
* This example shows the properties of each part of a bus: it follows a HasA edge from bus to a part, then a HasProperty edge from that part.

In [41]:
!kgtk query -i cskg.tsv --match '(p)-[r]->(n)-[r2 {relation:"/r/HasProperty"}]->(sn)' \
                        --limit 30 \
                        --where 'r.source= "CN" and p = "/c/en/bus" and r.relation ="/r/HasA"' \
                        --return 'r,n,sn,r.relation,r.source'

id	node2	node2	relation	source
/c/en/bus-/r/HasA-/c/en/windows-0000	/c/en/windows	/c/en/clear_and_solid	/r/HasA	CN
/c/en/bus-/r/HasA-/c/en/windows-0000	/c/en/windows	/c/en/for_looking_outside	/r/HasA	CN
/c/en/bus-/r/HasA-/c/en/windows-0000	/c/en/windows	/c/en/opened_or_closed	/r/HasA	CN
/c/en/bus-/r/HasA-/c/en/windows-0000	/c/en/windows	/c/en/unlikely_to_parts_of_fences	/r/HasA	CN
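
A two-hop pattern is a join: the end node of the first edge must be the start node of the second. The sketch below shows that idea in plain Python under simplifying assumptions: edges are `(node1, relation, node2)` tuples, the `source` condition is left out, and the sample edges are the ones implied by the output above plus one unrelated edge. It does not reflect how KGTK executes the query.

```python
from collections import defaultdict

# Edges implied by the output above, plus one edge that should not match.
edges = [
    ("/c/en/bus", "/r/HasA", "/c/en/windows"),
    ("/c/en/windows", "/r/HasProperty", "/c/en/clear_and_solid"),
    ("/c/en/windows", "/r/HasProperty", "/c/en/opened_or_closed"),
    ("/c/en/bus", "/r/CapableOf", "/c/en/carry_passengers"),
]

# Index the second hop by its start node so each first-hop edge
# can look up its continuations directly.
properties_by_node = defaultdict(list)
for node1, relation, node2 in edges:
    if relation == "/r/HasProperty":
        properties_by_node[node1].append(node2)

# Join: bus -HasA-> part -HasProperty-> prop
paths = [
    (node1, part, prop)
    for node1, relation, part in edges
    if node1 == "/c/en/bus" and relation == "/r/HasA"
    for prop in properties_by_node.get(part, [])
]

for bus, part, prop in paths:
    print(bus, part, prop)
```

The HasA edge appears once per matching HasProperty edge, which is why the same edge id repeats in the query output above.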


#### The query output can be stored in a file and visualized

In [53]:
!kgtk query -i cskg.tsv --match '(p)-[r]->(n)' \
                        --limit 30 \
                        --where 'r.source= "CN" and p = "/c/en/bus" and r.relation in ["/r/CapableOf","/r/IsA","/r/HasA"]' \
                        --return 'r,p,n,r.relation as label,r.source'\
                        -o bus-example-query.tsv

In [54]:
!kgtk visualize-graph -i bus-example-query.tsv\
                     --node-color-hex \
                     --show-text above \
                     --edge-color-hex \
                     --show-edge-label \
                     --node-size-default 4\
                     --edge-width-default 0.5\
                     -o show_node_label.html

In [62]:
# An HTML <img> tag is not valid Python; display the image from a code cell instead.
from IPython.display import Image
Image(filename="../visualization.png", width=540, height=480)