## Olympics and Austrian athletes

Contents
1. List all Austrian gold medalists
2. List every athlete with at least one medal, together with their total number of medals (sorted by the number of medals)
3. Search for an athlete and show when and where the athlete took part in the Olympic Games, including the medal won where available

### Install dependencies (uncomment if not installed)

In [1]:
# import sys
# !{sys.executable} -m pip install rdflib pandas

In [2]:
from rdflib import Graph
import pandas as pd

### Import ttl files

Choose the source folder for the Turtle files.  
"ttl" imports the files generated with OpenRefine.  
"tarql" imports the files generated with tarql.  
NOC_Regions.ttl and Medals.ttl are always loaded from "ttl".

In [3]:
source = "ttl"
# source = "tarql"
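As a rough sketch of how the switch plays out, the file paths loaded in the following cells could be derived from `source` in one place. The helper name `turtle_paths` and the two constants are hypothetical and not part of the notebook; the file names and their load order are taken from the cells below.

```python
from pathlib import Path

# Files whose folder follows the `source` switch (assumption: mirrors the cells below).
SWITCHED = {"Athletes.ttl", "Games.ttl", "Events.ttl", "Instance.ttl"}

# Load order used in this notebook.
LOAD_ORDER = [
    "Athletes.ttl",
    "NOC_Regions.ttl",
    "Games.ttl",
    "Events.ttl",
    "Medals.ttl",
    "Instance.ttl",
]


def turtle_paths(source):
    """Return the Turtle file paths for the chosen source folder (hypothetical helper)."""
    if source not in ("ttl", "tarql"):
        raise ValueError(f"unknown source folder: {source!r}")
    # NOC_Regions.ttl and Medals.ttl always come from "ttl".
    return [Path(source if name in SWITCHED else "ttl") / name for name in LOAD_ORDER]


for path in turtle_paths("tarql"):
    print(path.as_posix())
```

Rejecting unknown folder names early avoids a less obvious file-not-found error later during parsing.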

In [4]:
# Every parse() call adds triples to the same graph, so len(g) is a running total.
g = Graph()
g.parse(source + "/Athletes.ttl", format="turtle")
print(len(g))

15317


In [5]:
g.parse("ttl/NOC_Regions.ttl", format="turtle")
print(len(g))

15547


In [6]:
g.parse(source + "/Games.ttl", format="turtle")
print(len(g))

15832


In [7]:
g.parse(source + "/Events.ttl", format="turtle")
print(len(g))

17143


In [8]:
g.parse("ttl/Medals.ttl", format="turtle")
print(len(g))

17146


In [9]:
g.parse(source + "/Instance.ttl", format="turtle")
print(len(g))

32442


## 1. List all Austrian gold medalists

In [10]:
result = g.query("""

    PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX ex: <http://example.org/ontology/olympics/>
    PREFIX noc: <http://example.org/resource/olympics/NOC/>

    SELECT ?athleteLabel ?medalLabel ?cityName ?year
    WHERE {
        ?instance ex:athlete ?athlete;
                  ex:medal ?medal;
                  ex:games ?games;
                  ex:event ?event .
                  
        ?athlete rdfs:label ?athleteLabel .
        
        ?medal rdfs:label ?medalLabel .
        
        ?games dbo:location ?city ;
               ex:season ?season ;
               dbo:year ?year .
        
        ?city rdfs:label ?cityName .
         
        ?season rdfs:label ?seasonName .
        
        FILTER CONTAINS(?medalLabel,"Gold") .

    }
    ORDER BY DESC(?year)
    
""")

print("Total ",len(result))

pd.set_option('display.max_rows', 500)  # raise pandas' row display limit to 500
df = pd.DataFrame(result, columns=result.vars)
df.index += 1  # number the rows from 1 instead of 0
df

Total  108


| | athleteLabel | medalLabel | cityName | year |
|---|---|---|---|---|
| 1 | Julia Dujmovits | Gold | Sochi | 2014 |
| 2 | Matthias Mayer | Gold | Sochi | 2014 |
| 3 | Anna Fenninger (-Veith) | Gold | Sochi | 2014 |
| 4 | Mario Matt | Gold | Sochi | 2014 |
| 5 | Bernhard Gruber | Gold | Vancouver | 2010 |
| 6 | Andreas Linger | Gold | Vancouver | 2010 |
| 7 | David Kreiner | Gold | Vancouver | 2010 |
| 8 | Thomas Morgenstern | Gold | Vancouver | 2010 |
| 9 | Wolfgang Linger | Gold | Vancouver | 2010 |
| 10 | Felix Gottwald | Gold | Vancouver | 2010 |


## 2. List every athlete with at least one medal, together with their total number of medals (sorted by the number of medals)

In [11]:
result = g.query("""

    PREFIX ex: <http://example.org/ontology/olympics/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT ?name (COUNT(?name) AS ?noOfMedals)
    WHERE {
        ?instance ex:athlete ?athlete ;
                  ex:medal   ?medal .
        
        ?athlete  rdfs:label    ?name .
    }
    GROUP BY ?name
    ORDER BY DESC(?noOfMedals)
    
""")

print("Total ",len(result))

df = pd.DataFrame(result, columns=result.vars)
df.index +=1
df

Total  313


| | name | noOfMedals |
|---|---|---|
| 1 | Felix Gottwald | 7 |
| 2 | Martin Hllwarth | 4 |
| 3 | Klaus Sulzenbacher | 4 |
| 4 | Benjamin Raich | 4 |
| 5 | Marlies Schild (-Raich) | 4 |
| 6 | Stephan Eberharter | 4 |
| 7 | Gregor Schlierenzauer | 4 |
| 8 | Thomas Morgenstern | 4 |
| 9 | Mario Stecher | 4 |
| 10 | Hermann Maier | 4 |


## 3. Search for an athlete, show when and where the athlete took part in the Olympic Games, and list the medal won if available

In [12]:
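# Search term as a SPARQL string literal: the inner single quotes are part of the query text.
# CONTAINS matches substrings, so 'Maier' also matches names such as 'Maierhofer'.
athlete = "'Maier'"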

result = g.query("""
    PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX ex: <http://example.org/ontology/olympics/>
    PREFIX noc: <http://example.org/resource/olympics/NOC/>
    
    SELECT ?name ?year ?cityName ?seasonName ?medalLabel ?eventLabel
    WHERE {
        ?instance ex:games   ?games ;
                  ex:athlete ?athlete;
                  ex:event ?event .
           
        OPTIONAL { 
            ?instance ex:medal ?medal .
            ?medal rdfs:label ?medalLabel
        } . 
        
        ?athlete rdfs:label ?name .

        ?games dbo:location ?city ;
               ex:season ?season;
               dbo:year ?year .
 
        ?city rdfs:label ?cityName .
 
        ?season rdfs:label ?seasonName .
 
        ?event rdfs:label ?eventLabel
 
        FILTER CONTAINS(?name,""" + athlete +""")
    }
    ORDER BY ?year
""")

df = pd.DataFrame(result, columns=result.vars)
df.index +=1
df.fillna(value='', inplace=True)
df


| | name | year | cityName | seasonName | medalLabel | eventLabel |
|---|---|---|---|---|---|---|
| 1 | Markus Maier | 1936 | Garmisch-Partenkirchen | Winter | | Nordic Combined Men's Individual |
| 2 | Ulrike Maier | 1988 | Calgary | Winter | | Alpine Skiing Women's Slalom |
| 3 | Ulrike Maier | 1988 | Calgary | Winter | | Alpine Skiing Women's Giant Slalom |
| 4 | Ulrike Maier | 1988 | Calgary | Winter | | Alpine Skiing Women's Combined |
| 5 | Ulrike Maier | 1992 | Albertville | Winter | | Alpine Skiing Women's Giant Slalom |
| 6 | Monika Maierhofer | 1992 | Albertville | Winter | | Alpine Skiing Women's Slalom |
| 7 | Ulrike Maier | 1992 | Albertville | Winter | | Alpine Skiing Women's Super G |
| 8 | Ulrike Maier | 1992 | Albertville | Winter | | Alpine Skiing Women's Combined |
| 9 | Monika Maierhofer | 1994 | Lillehammer | Winter | | Alpine Skiing Women's Slalom |
| 10 | Hermann Maier | 1998 | Nagano | Winter | | Alpine Skiing Men's Combined |
