# Use cases

We query IMKG to demonstrate several of the use cases it enables. Some ideas:
* memes that include SpongeBob
* the most memable actor in Wikidata
* memes that come from movies that have adult content
* memes that come from books
* gender of characters in memes

## 0. Setup

In [1]:
import os
import os.path

from kgtk.configure_kgtk_notebooks import ConfigureKGTK
from kgtk.functions import kgtk, kypher

In [2]:
# Parameters

# Local folders for the input graph and for the output and temporary files:
input_path = "wikidata"
output_path = "projects"
project_name = "tutorial-kypher"

In [3]:
big_files=["label"]

additional_files = {
    "P31": "derived.P31.tsv.gz",
    "items": "claims.wikibase-item.tsv.gz",
    "P1963": "derived.P1963computed.count.star.tsv.gz",
    "external": "claims.external-id.tsv.gz",
    "indegree": "metadata.in_degree.tsv.gz",
    "outdegree": "metadata.out_degree.tsv.gz",
    "pagerank": "metadata.pagerank.directed.tsv.gz"
}

ck = ConfigureKGTK(big_files)
ck.configure_kgtk(input_graph_path=input_path, 
                  output_path=output_path, 
                  project_name=project_name,
                  additional_files=additional_files)

User home: /Users/filipilievski
Current dir: /Users/filipilievski/mcs/imkg
KGTK dir: /Users/filipilievski/mcs
Use-cases dir: /Users/filipilievski/mcs/use-cases


In [4]:
ck.print_env_variables()

USE_CASES_DIR: /Users/filipilievski/mcs/use-cases
STORE: projects/tutorial-kypher/temp.tutorial-kypher/wikidata.sqlite3.db
GRAPH: wikidata
kgtk: kgtk
OUT: projects/tutorial-kypher
EXAMPLES_DIR: /Users/filipilievski/mcs/examples
KGTK_LABEL_FILE: wikidata/labels.en.tsv.gz
kypher: kgtk query --graph-cache projects/tutorial-kypher/temp.tutorial-kypher/wikidata.sqlite3.db
KGTK_GRAPH_CACHE: projects/tutorial-kypher/temp.tutorial-kypher/wikidata.sqlite3.db
KGTK_OPTION_DEBUG: false
TEMP: projects/tutorial-kypher/temp.tutorial-kypher
label: wikidata/labels.en.tsv.gz
P31: wikidata/derived.P31.tsv.gz
items: wikidata/claims.wikibase-item.tsv.gz
P1963: wikidata/derived.P1963computed.count.star.tsv.gz
external: wikidata/claims.external-id.tsv.gz
indegree: wikidata/metadata.in_degree.tsv.gz
outdegree: wikidata/metadata.out_degree.tsv.gz
pagerank: wikidata/metadata.pagerank.directed.tsv.gz
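The printed paths follow a regular pattern that can be reproduced with a few path joins. The sketch below is only a reading of the output above, not the actual logic inside `ConfigureKGTK`; the helper name `expected_paths` is hypothetical.

```python
import posixpath  # forward slashes, matching the printed output


def expected_paths(input_path, output_path, project_name, files):
    """Rebuild the path layout seen in the printed environment (hypothetical helper)."""
    out = posixpath.join(output_path, project_name)
    temp = posixpath.join(out, f"temp.{project_name}")
    env = {
        "OUT": out,
        "TEMP": temp,
        # Assumption: the cache file name is taken from the printed STORE value.
        "STORE": posixpath.join(temp, "wikidata.sqlite3.db"),
    }
    # Each additional file is resolved relative to the input folder.
    for key, filename in files.items():
        env[key] = posixpath.join(input_path, filename)
    return env


env = expected_paths(
    "wikidata",
    "projects",
    "tutorial-kypher",
    {"P31": "derived.P31.tsv.gz", "pagerank": "metadata.pagerank.directed.tsv.gz"},
)
for key, value in env.items():
    print(f"{key}: {value}")
```

Comparing such a reconstruction with `ck.print_env_variables()` is a quick way to spot a mistyped folder or project name before running any query.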


## Example 1: All memes that depict the entity Q83279 ("SpongeBob SquarePants")

In [7]:
!kgtk query -i $TEMP/imkg.kgtk.gz \
    --match '(h)-[:`m4s:fromImage`]->(:Q83279),\
            (h)-[:`rdf:type`]->(:`kym:Meme`)' \
    --return 'count(distinct h)'

count(DISTINCT graph_33_c1."node1")
130
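The two clauses in the `--match` pattern share the variable `h`, so they behave like a self-join of the edge table on `node1`. The following sketch shows roughly equivalent SQL on a toy in-memory table using Python's `sqlite3`. The meme identifiers are hypothetical, and the SQL that Kypher actually generates will differ in its details.

```python
import sqlite3

# Toy edges in KGTK layout (node1, label, node2). The meme IDs are made up;
# only the property names and Q83279 come from the query above.
edges = [
    ("kym:meme-a", "rdf:type", "kym:Meme"),
    ("kym:meme-a", "m4s:fromImage", "Q83279"),
    ("kym:meme-b", "rdf:type", "kym:Meme"),
    ("kym:meme-b", "m4s:fromImage", "Q83279"),
    ("kym:meme-c", "rdf:type", "kym:Meme"),
    ("kym:meme-c", "m4s:fromImage", "Q5"),       # depicts something else
    ("kym:image-d", "m4s:fromImage", "Q83279"),  # not typed as a meme
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE graph (node1 TEXT, label TEXT, node2 TEXT)")
conn.executemany("INSERT INTO graph VALUES (?, ?, ?)", edges)

# One table alias per match clause, joined on the shared variable h (node1).
sql = """
SELECT count(DISTINCT c1.node1)
FROM graph AS c1
JOIN graph AS c2 ON c1.node1 = c2.node1
WHERE c1.label = 'm4s:fromImage' AND c1.node2 = 'Q83279'
  AND c2.label = 'rdf:type' AND c2.node2 = 'kym:Meme'
"""
meme_count = conn.execute(sql).fetchone()[0]
print(meme_count)
conn.close()
```

Only `kym:meme-a` and `kym:meme-b` satisfy both clauses in this toy graph, which is the same filtering the query applies to IMKG.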


In [8]:
!kgtk query -i $TEMP/imkg.kgtk.gz \
    --match '(h)-[:`m4s:fromImage`]->(:Q83279),\
            (h)-[:`rdf:type`]->(:`kym:Meme`),\
            (h)-[r]->(t)' \
    --return 'distinct h,r.label,t' \
    -o $TEMP/sponge.kgtk.gz

In [9]:
kgtk("""visualize-graph 
        -i $TEMP/sponge.kgtk.gz
        -o viz/iflip_kym.graph.html""")

## Example 2: Who are the most memable people in Wikidata

In [10]:
!kgtk query -i $TEMP/imkg.kgtk.gz \
    --match '(h)-[]->(person),\
            (h)-[:`rdf:type`]->(:`kym:Meme`),\
            (person)-[:P31]->(:Q5)' \
    --return 'person, count(h) as c' \
    --order-by 'c desc' \
    --limit 3

node2	c
Q22686	145
Q18738659	72
Q15935	56
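Note that `count(h)` counts matched rows rather than distinct memes, so a meme linked to the same person through two different properties contributes twice. The sketch below illustrates the difference on hypothetical toy rows; whether such double links actually occur in IMKG is not checked here.

```python
from collections import Counter

# Toy (meme, property, person) rows that already satisfy the match pattern.
# Meme and person IDs are hypothetical placeholders.
rows = [
    ("kym:meme-a", "m4s:fromImage", "person-1"),
    ("kym:meme-a", "m4s:fromAbout", "person-1"),  # same meme, second property
    ("kym:meme-b", "m4s:fromAbout", "person-1"),
    ("kym:meme-c", "m4s:fromAbout", "person-2"),
]

# Analogue of count(h): one count per matched row.
row_counts = Counter(person for _, _, person in rows)

# Analogue of count(distinct h): one count per (meme, person) pair.
pairs = {(meme, person) for meme, _, person in rows}
distinct_counts = Counter(person for _, person in pairs)

# Analogue of --order-by 'c desc' --limit 3.
top = row_counts.most_common(3)
print(top)
print(distinct_counts.most_common(3))
```

If the intent is to rank people by the number of different memes they appear in, `count(distinct h)` in the `--return` clause would be the matching choice.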


## Example 3: Memes that are based on films

In [12]:
!kgtk query -i $TEMP/imkg.kgtk.gz \
    --match '(h)-[:`m4s:fromAbout`]->(t),\
             (t)-[:P31]->(:Q11424)' \
    --return 'count (distinct h)' \
    --limit 10

count(DISTINCT graph_33_c1."node1")
413


Let's visualize the links between memes that are about films:

In [None]:
!kgtk query -i $TEMP/imkg.kgtk.gz \
    --match '(h1)-[:`m4s:fromAbout`]->(t1),\
            (h2)-[:`m4s:fromAbout`]->(t2),\
             (t1)-[:P31]->(:Q11424),\
             (t2)-[:P31]->(:Q11424),\
             (h1)-[r]->(h2)' \
    --return 'h1 as node1, r.label, h2 as node2' \
    --limit 10 \
    -o $TEMP/film_memes.kgtk.gz

In [None]:
kgtk("""visualize-graph 
        -i $TEMP/film_memes.kgtk.gz
        -o viz/film_memes.graph.html""")

## Example 4: Gender of people in IMKG

In [14]:
!kgtk query -i $TEMP/imkg.kgtk.gz \
    --match '()-[]->(person),\
            (person)-[:P21]->(gender)' \
    --return 'gender, count(distinct person)' \
    --limit 15

node2	count(DISTINCT graph_33_c1."node2")
Q1052281	20
Q1097630	4
Q12964198	1
Q2449503	2
Q43445	67
Q44148	170
Q48270	14
Q6581072	2798
Q6581097	10333
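To make the skew easier to read, the counts can be turned into shares. The sketch below parses the query output shown above (copied verbatim as a tab-separated string) and prints each gender item's share of all counted persons; it uses only the standard library and makes no call to KGTK.

```python
import csv
import io

# Query output copied from above: gender QID <tab> number of distinct persons.
output = """\
Q1052281\t20
Q1097630\t4
Q12964198\t1
Q2449503\t2
Q43445\t67
Q44148\t170
Q48270\t14
Q6581072\t2798
Q6581097\t10333
"""

reader = csv.reader(io.StringIO(output), delimiter="\t")
counts = {qid: int(n) for qid, n in reader}
total = sum(counts.values())

# Largest groups first, with their share of the total.
for qid, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{qid}\t{n}\t{n / total:.1%}")
```

Q6581097 and Q6581072 are the Wikidata items for male and female. A person with more than one P21 value is counted once per value, so the total can exceed the number of distinct persons.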
