# Wikidata - Class Hierarchy

Retrieve the Wikidata subclass hierarchy rooted at an item of interest, together with its mappings to external resources.

Execute the following queries and save the results to the indicated CSV files.

The examples below explore the Wikidata item agent (Q24229398), which covers humans and organizations.

* Query the Wikidata class hierarchy (QLever, saved to __class_hierarchy.csv__)

```
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?item ?label ?directSuper {
  ?item wdt:P279* wd:Q24229398 ;
        rdfs:label ?label ;
        wdt:P279 ?directSuper .
  FILTER(lang(?label) = "en")
}
```

* Query the external class mappings (QLever, saved to __class_mappings.csv__)

```
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?item ?pred ?ext {
  { SELECT DISTINCT ?item { ?item wdt:P279* wd:Q24229398 } }
  VALUES ?pred { wdt:P1709 wdt:P2888 wdt:P3950 }
  ?item ?pred ?ext
}
```
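The two queries above can also be run from Python instead of the QLever web interface. The following is a minimal sketch under stated assumptions: the endpoint URL, the helper names `build_csv_request` and `save_query_results`, and the use of an `Accept: text/csv` header to request CSV are not taken from this notebook, so check them against the QLever instance you use.

```python
import urllib.parse
import urllib.request

# Assumption: URL of the public QLever Wikidata endpoint; adjust if yours differs.
QLEVER_ENDPOINT = "https://qlever.cs.uni-freiburg.de/api/wikidata"


def build_csv_request(query, endpoint=QLEVER_ENDPOINT):
    """Build a GET request that asks the SPARQL endpoint for CSV results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Assumption: the endpoint honors content negotiation via the Accept header.
    return urllib.request.Request(url, headers={"Accept": "text/csv"})


def save_query_results(query, csv_path, endpoint=QLEVER_ENDPOINT):
    """Send the query and write the raw CSV response to csv_path."""
    with urllib.request.urlopen(build_csv_request(query, endpoint)) as response:
        data = response.read()
    with open(csv_path, "wb") as out:
        out.write(data)
```

Calling `save_query_results(hierarchy_query, "class_hierarchy.csv")` and `save_query_results(mappings_query, "class_mappings.csv")`, with the query strings copied from the blocks above, would produce the two input files that the rest of this notebook expects.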

Then execute the following cell, which converts both CSV files into RDF (Turtle syntax) and writes the result to __ttl_output.ttl__ for further examination.

```
import csv

# Only mappings to these external ontologies are kept
KEPT_ONTOLOGIES = ("schema.org", "dbpedia", "vcard", "ns/org")


def escape_literal(text):
    # Escape backslashes and double quotes so the label is a valid Turtle string
    return text.replace("\\", "\\\\").replace('"', '\\"')


with open("ttl_output.ttl", "w", encoding="utf-8") as ttl:
    # Declare the rdfs: prefix used by the triples below
    ttl.write("@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n\n")

    with open("class_hierarchy.csv", "r", encoding="utf-8", newline="") as inputs:
        # csv.reader handles quoted labels that contain commas
        for row in csv.reader(inputs):
            # Skip the header row and anything that is not an IRI row
            if len(row) < 3 or not row[0].startswith("http"):
                continue
            item, label, super_class = row[:3]
            ttl.write(f'<{item}> rdfs:subClassOf <{super_class}> ; '
                      f'rdfs:label "{escape_literal(label)}" .\n')

    with open("class_mappings.csv", "r", encoding="utf-8", newline="") as inputs:
        for row in csv.reader(inputs):
            if len(row) < 3 or not row[0].startswith("http"):
                continue
            item, pred, ext_ref = row[:3]
            # Only capture the ontology mappings for the selected references
            if any(name in ext_ref for name in KEPT_ONTOLOGIES):
                ttl.write(f'<{item}> <{pred}> <{ext_ref}> .\n')
```
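One way to examine the hierarchy further is to print it as an indented tree. The sketch below assumes the same three-column layout as __class_hierarchy.csv__ (item, label, direct superclass); the sample rows, the `example.org` IRIs, and the helper names `load_hierarchy` and `render_tree` are hypothetical and only serve the illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows in the same layout as class_hierarchy.csv
SAMPLE_CSV = """item,label,directSuper
http://example.org/A,agent,http://example.org/Top
http://example.org/B,person,http://example.org/A
http://example.org/C,organization,http://example.org/A
http://example.org/D,"company, limited",http://example.org/C
"""


def load_hierarchy(csv_file):
    """Return (labels, children): item -> label and parent -> set of child items."""
    labels, children = {}, defaultdict(set)
    for row in csv.reader(csv_file):
        # Skip the header row and anything that is not an IRI row
        if len(row) < 3 or not row[0].startswith("http"):
            continue
        item, label, parent = row[:3]
        labels[item] = label
        children[parent].add(item)
    return labels, children


def render_tree(root, labels, children, depth=0, path=()):
    """Return indented lines for root and its descendants."""
    lines = ["  " * depth + labels.get(root, root)]
    if root in path:
        return lines  # already on the current path: stop to avoid looping on a cycle
    for child in sorted(children.get(root, ()), key=lambda c: labels.get(c, c)):
        lines.extend(render_tree(child, labels, children, depth + 1, path + (root,)))
    return lines


labels, children = load_hierarchy(io.StringIO(SAMPLE_CSV))
print("\n".join(render_tree("http://example.org/A", labels, children)))
```

For the real data, replace `io.StringIO(SAMPLE_CSV)` with `open("class_hierarchy.csv", newline="", encoding="utf-8")` and pass `"http://www.wikidata.org/entity/Q24229398"` as the root. An item with several direct superclasses appears once under each of them.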