## Olympics and Austrian athletes

Contents
1. Lists all Austrian Gold Medalists
2. Lists the names of every athlete, with at least one medal, alongside their total number of medals (sorted by the number of medals)
3. What else can we ask?

### Install dependencies (uncomment if not installed)

In [1]:
# import sys
# !{sys.executable} -m pip install rdflib pandas

In [2]:
from rdflib import Graph
import pandas as pd

### Import ttl files

Set source to the folder that holds the Turtle files.  
"ttl" imports the files generated with OpenRefine  
"tarql" imports the files generated with tarql  
NOC_Regions.ttl and Medals.ttl are always read from the "ttl" folder.

In [3]:
source = "ttl"
# source = "tarql"

In [4]:
g = Graph()
g.parse(source + "/Athletes.ttl", format="turtle")
print(len(g))

15317


In [5]:
g.parse("ttl/NOC_Regions.ttl", format="turtle")
print(len(g))

15547


In [6]:
g.parse(source + "/Games.ttl", format="turtle")
print(len(g))

15832


In [7]:
g.parse(source + "/Events.ttl", format="turtle")
print(len(g))

17143


In [8]:
g.parse("ttl/Medals.ttl", format="turtle")
print(len(g))

17146


In [9]:
g.parse(source + "/Instance.ttl", format="turtle")
print(len(g))

32570
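
The six parse cells above differ only in the file name and in whether the folder follows the source setting. As a minimal sketch, the paths could be built in one place. The names `FILES`, `FIXED_TTL` and `turtle_paths` are illustrative assumptions and do not appear in the original notebook.

```python
# Sketch only: FILES, FIXED_TTL and turtle_paths are hypothetical helper names.

# File stems in the order the notebook parses them.
FILES = ["Athletes", "NOC_Regions", "Games", "Events", "Medals", "Instance"]

# Files the notebook always reads from "ttl", whatever `source` is set to.
FIXED_TTL = {"NOC_Regions", "Medals"}


def turtle_paths(source):
    """Return the Turtle file paths in the order the notebook parses them."""
    if source not in ("ttl", "tarql"):
        raise ValueError(f"unknown source folder: {source!r}")
    return [
        f"{'ttl' if name in FIXED_TTL else source}/{name}.ttl"
        for name in FILES
    ]


print(turtle_paths("tarql"))
```

The list can then drive a single loop that calls `g.parse(path, format="turtle")` for each path and prints `len(g)` after each file, which keeps the running triple counts shown above.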


## 1. Lists all Austrian Gold Medalists

In [10]:
result = g.query("""
    PREFIX ex: <http://example.org/ontology/olympics/>
    PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>

    SELECT DISTINCT ?name
    WHERE {
     ?instance ex:athlete ?athlete ;
               ex:medal   "Gold"@en .
     ?athlete rdfs:label ?name .
    }
""")

pd.set_option('display.max_rows', 500)  # show up to 500 rows before truncating
df = pd.DataFrame(result, columns=result.vars)
df.index += 1  # number rows from 1 instead of 0
df

Unnamed: 0,name
1,Roman Hagara
2,"Elisabeth ""Sissy"" Max-Theurer"
3,Andreas Kofler
4,Michaela Dorfmeister
5,Karl Martin Alois Schfer
6,Thomas Morgenstern
7,Emese Nemeth-Hunyady (-Jrvinen)
8,Doris Neuner
9,Wolfgang Schwarz
10,Christoph Bieler


## 2. Lists the names of every athlete, with at least one medal, alongside their total number of medals (sorted by the number of medals)

In [11]:
result = g.query("""
PREFIX ex: <http://example.org/ontology/olympics/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?name (COUNT(?name) AS ?noOfMedals)
WHERE {
  ?instance ex:athlete ?athlete ;
            ex:medal   ?medal .
  ?athlete  rdfs:label    ?name .
}
GROUP BY ?name
ORDER BY DESC(?noOfMedals)
""")

print("Total ", len(result))

df = pd.DataFrame(result, columns=result.vars)
df.index += 1
df

Total  313


Unnamed: 0,name,noOfMedals
1,Felix Gottwald,7
2,Martin Hllwarth,4
3,Mario Stecher,4
4,Benjamin Raich,4
5,Stephan Eberharter,4
6,Gregor Schlierenzauer,4
7,Hermann Maier,4
8,Klaus Sulzenbacher,4
9,Marlies Schild (-Raich),4
10,Thomas Morgenstern,4


## 3. What else can we ask?

In [12]:
# Maximum age of athletes in Alpine skiing events, per Olympic year
result = g.query("""
PREFIX ex: <http://example.org/ontology/olympics/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?year (MAX(?age) AS ?maxAge)
WHERE {
 ?instance ex:games   ?games ;
           ex:event   ?event ;
           ex:athlete ?athlete .
 
 ?event rdfs:subClassOf <http://example.org/resource/olympics/sport/AlpineSkiing> .
 
 ?games dbo:year ?year .
 
 ?athlete foaf:age ?age .
}
GROUP BY ?year ORDER BY ?year
""")

print("Total ", len(result))

df = pd.DataFrame(result, columns=result.vars)
df.index += 1
df



Total  1


Unnamed: 0,year,maxAge


In [13]:
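# Name fragment to search for. The single quotes are part of the value
# because it is spliced into the query below as a SPARQL string literal.
athlete = "'Maier'"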

result = g.query("""
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX ex: <http://example.org/ontology/olympics/>
PREFIX noc: <http://example.org/resource/olympics/NOC/>
    
SELECT DISTINCT ?name ?year ?cityName ?seasonName
WHERE {
 ?instance ex:games   ?games ;
           ex:athlete ?athlete .
 
 ?games dbo:location ?city ;
        ex:season ?season;
        dbo:year ?year .
 
 ?city rdfs:label ?cityName .
 
 ?season rdfs:label ?seasonName .
 
 ?athlete rdfs:label ?name .
 FILTER CONTAINS(?name, """ + athlete + """)
}
ORDER BY ?year
""")

df = pd.DataFrame(result, columns=result.vars)
df.index += 1
df


Unnamed: 0,name,year,cityName,seasonName
1,Markus Maier,1936,Garmisch-Partenkirchen,Winter
2,Ulrike Maier,1988,Calgary,Winter
3,Ulrike Maier,1992,Albertville,Winter
4,Monika Maierhofer,1992,Albertville,Winter
5,Monika Maierhofer,1994,Lillehammer,Winter
6,Hermann Maier,1998,Nagano,Winter
7,Ariane Maier,2000,Sydney,Summer
8,Alexander Maier,2002,Salt Lake City,Winter
9,Hermann Maier,2006,Torino,Winter
10,Alexander Maier,2006,Torino,Winter
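
The query above splices the search term into the SPARQL text by string concatenation, so the term has to carry its own quotes and a name containing a quote character would break the query. As a hedged sketch, the term could be escaped by a small helper before it is inserted. `sparql_string`, `QUERY_TEMPLATE` and `name_query` are hypothetical names, and the query is shortened to the name lookup only.

```python
# Sketch only: these helpers are illustrative and not part of the notebook.

def sparql_string(value):
    """Quote a Python string as a double-quoted SPARQL string literal."""
    escaped = (
        value.replace("\\", "\\\\")  # backslashes first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


QUERY_TEMPLATE = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/ontology/olympics/>

SELECT DISTINCT ?name
WHERE {
  ?instance ex:athlete ?athlete .
  ?athlete rdfs:label ?name .
  FILTER CONTAINS(?name, %s)
}
"""


def name_query(fragment):
    """Build the query text for athletes whose label contains `fragment`."""
    return QUERY_TEMPLATE % sparql_string(fragment)


print(name_query("Maier"))
```

The returned text can be passed to `g.query(...)` as before, with the search term given as a plain string such as `"Maier"`. rdflib's `Graph.query` also accepts an `initBindings` argument, which may be a cleaner option because it avoids building the query text from user input at all.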
