## Olympics and Austrian athletes

Contents
* Combine all files into one large triple store
* List the name of every gold medallist
* List the name of every athlete with at least one medal, alongside their total number of medals (sorted by medal count)
* What else can we ask?

### Install dependencies (uncomment if not installed)

In [1]:
# import sys
# !{sys.executable} -m pip install rdflib pandas

In [2]:
from rdflib import Graph
import pandas as pd

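### Import the Turtle (.ttl) files

Each `parse` call adds one file's triples to the same graph, so every printed number is the running total of triples in the graph.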

In [3]:
g = Graph()
g.parse("ttl/Athletes.ttl", format="turtle")
print(len(g))

15317


In [4]:
g.parse("ttl/NOC_Regions.ttl", format="turtle")
print(len(g))

15547


In [5]:
g.parse("ttl/Games.ttl", format="turtle")
print(len(g))

15832


In [6]:
g.parse("ttl/Events.ttl", format="turtle")
print(len(g))

17143


In [7]:
g.parse("ttl/Medals.ttl", format="turtle")
print(len(g))

17146


In [8]:
g.parse("ttl/Instance.ttl", format="turtle")
print(len(g))

32570


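### List all Austrians who won a gold medal

Note that the query does not filter by nationality, so it returns only Austrians if the loaded files contain only Austrian athletes.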

In [9]:
result = g.query("""
    PREFIX ex: <http://example.org/ontology/olympics/>
    PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>

    SELECT DISTINCT ?name
    WHERE {
      ?instance ex:athlete ?athlete ;
                ex:medal   "Gold" .
      ?athlete  rdfs:label ?name .
    }
""")

pd.set_option("display.max_rows", 500)  # show up to 500 rows before truncating
df = pd.DataFrame(result, columns=result.vars)
df.index += 1  # number rows from 1 instead of 0
df

|    | name |
|----|------|
| 1  | Anton Engelbert "Toni" Sailer |
| 2  | Felix Gottwald |
| 3  | Annemarie Moser-Prll |
| 4  | Mario Stecher |
| 5  | Petra Kronberger |
| 6  | Martin Koch |
| 7  | Alfred Berger |
| 8  | Thomas Morgenstern |
| 9  | Andreas Kofler |
| 10 | Egon Zimmermann |


### List the name of every athlete with at least one medal, alongside their total number of medals (sorted by medal count)

In [12]:
result = g.query("""
PREFIX ex: <http://example.org/ontology/olympics/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?name (COUNT(?name) AS ?noOfMedals)
WHERE {
  ?instance ex:athlete ?athlete ;
            ex:medal   ?medal .
  ?athlete  rdfs:label    ?name .
}
GROUP BY ?name
ORDER BY DESC(?noOfMedals)
""")

print("Total", len(result))  # number of result rows (one per athlete name)

df = pd.DataFrame(result, columns=result.vars)
df.index += 1  # number rows from 1 instead of 0
df

Total 313


|    | name | noOfMedals |
|----|------|------------|
| 1  | Felix Gottwald | 7 |
| 2  | Marlies Schild (-Raich) | 4 |
| 3  | Mario Stecher | 4 |
| 4  | Thomas Morgenstern | 4 |
| 5  | Martin Hllwarth | 4 |
| 6  | Hermann Maier | 4 |
| 7  | Gregor Schlierenzauer | 4 |
| 8  | Stephan Eberharter | 4 |
| 9  | Benjamin Raich | 4 |
| 10 | Klaus Sulzenbacher | 4 |
