## Playing with Austrian athletes and formatting options

Contents
* combine Athletes_01_AUT.ttl and noc_regions.ttl
* count how many Austrians took part in the Olympic Games (OGs)
* query an athlete, order by age, and see how many times he/she took part in OGs
* format the above output in different ways
* average age of all Austrian athletes
* oldest and youngest athlete

### Uncomment if not installed

In [1]:
# import sys
# !{sys.executable} -m pip install rdflib pandas

In [2]:
from rdflib import Graph
from pandas import DataFrame

### Import ttl files

In [3]:
g = Graph()
g.parse("ttl/Athletes_01_AUT.ttl", format="turtle")
print(len(g))

15317


In [4]:
g.parse("ttl/noc_regions.ttl", format="turtle")
print(len(g))

15547


### How many Austrians took part in the Olympic Games

In [73]:
result = g.query("""
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT (COUNT(DISTINCT ?name) AS ?count)
WHERE {
    ?athlete a foaf:Person;
             rdfs:label ?name;
}
""")

for row in result:
    print("%s Austrians took part" % row)

2334 Austrians took part


### Query by name, order by age, and see how many times an athlete took part in OGs

e.g. Hubert Raudaschl, Christoph Sumann, Hermann Maier, ...

In [74]:
result = g.query("""
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX ex: <http://example.org/ontology/olympics/>
PREFIX noc: <http://example.org/resource/olympics/NOC/>

SELECT ?athleteLabel ?age ?country
WHERE {
    ?athlete a foaf:Person;
             rdfs:label "Hubert Raudaschl";
             foaf:age ?age;
             rdfs:label ?athleteLabel;
             ex:noc ?noc .
    ?noc dbo:ground ?country .
}
ORDER BY ?age
""")

print("Total ",len(result))

for row in result:
    print(row)

Total  9
(rdflib.term.Literal('Hubert Raudaschl'), rdflib.term.Literal('22', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#int')), rdflib.term.Literal('Austria'))
(rdflib.term.Literal('Hubert Raudaschl'), rdflib.term.Literal('26', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#int')), rdflib.term.Literal('Austria'))
(rdflib.term.Literal('Hubert Raudaschl'), rdflib.term.Literal('30', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#int')), rdflib.term.Literal('Austria'))
(rdflib.term.Literal('Hubert Raudaschl'), rdflib.term.Literal('33', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#int')), rdflib.term.Literal('Austria'))
(rdflib.term.Literal('Hubert Raudaschl'), rdflib.term.Literal('37', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#int')), rdflib.term.Literal('Austria'))
(rdflib.term.Literal('Hubert Raudaschl'), rdflib.term.Literal('41', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#int')), rdf

#### Format %

In [30]:
print("Total: ",len(result))

for row in result:
    print("Athlete: %s, Age: %s, COO: %s" % row)

Total:  9
Athlete: Hubert Raudaschl, Age: 22, COO: Austria
Athlete: Hubert Raudaschl, Age: 26, COO: Austria
Athlete: Hubert Raudaschl, Age: 30, COO: Austria
Athlete: Hubert Raudaschl, Age: 33, COO: Austria
Athlete: Hubert Raudaschl, Age: 37, COO: Austria
Athlete: Hubert Raudaschl, Age: 41, COO: Austria
Athlete: Hubert Raudaschl, Age: 46, COO: Austria
Athlete: Hubert Raudaschl, Age: 49, COO: Austria
Athlete: Hubert Raudaschl, Age: 53, COO: Austria


#### With index

In [31]:
for index,row in enumerate(result):
    print(index+1, "Athlete: %s, Age: %s, COO: %s" % row)

1 Athlete: Hubert Raudaschl, Age: 22, COO: Austria
2 Athlete: Hubert Raudaschl, Age: 26, COO: Austria
3 Athlete: Hubert Raudaschl, Age: 30, COO: Austria
4 Athlete: Hubert Raudaschl, Age: 33, COO: Austria
5 Athlete: Hubert Raudaschl, Age: 37, COO: Austria
6 Athlete: Hubert Raudaschl, Age: 41, COO: Austria
7 Athlete: Hubert Raudaschl, Age: 46, COO: Austria
8 Athlete: Hubert Raudaschl, Age: 49, COO: Austria
9 Athlete: Hubert Raudaschl, Age: 53, COO: Austria
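
The same output can also be produced with `str.format` or f-strings, and `enumerate` accepts a `start` argument that avoids the manual `index+1`. The sketch below uses plain tuples as hypothetical stand-ins for the query rows; rows returned by `g.query` are tuples too, so they should unpack the same way.

```python
# Hypothetical rows standing in for the (athleteLabel, age, country) query result
rows = [
    ("Hubert Raudaschl", 22, "Austria"),
    ("Hubert Raudaschl", 26, "Austria"),
]

# str.format: unpack the tuple into positional placeholders
first_line = "Athlete: {}, Age: {}, COO: {}".format(*rows[0])

# f-string with tuple unpacking; start=1 makes the index one-based
lines = []
for index, (name, age, country) in enumerate(rows, start=1):
    lines.append(f"{index} Athlete: {name}, Age: {age}, COO: {country}")

print(first_line)
print("\n".join(lines))
```

Unpacking into named variables makes the format string easier to read than positional `%s` placeholders, at the cost of one extra line.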


#### With pandas DataFrame

In [32]:
DataFrame(result, columns=result.vars)

|   | athleteLabel | age | country |
|---|---|---|---|
| 0 | Hubert Raudaschl | 22 | Austria |
| 1 | Hubert Raudaschl | 26 | Austria |
| 2 | Hubert Raudaschl | 30 | Austria |
| 3 | Hubert Raudaschl | 33 | Austria |
| 4 | Hubert Raudaschl | 37 | Austria |
| 5 | Hubert Raudaschl | 41 | Austria |
| 6 | Hubert Raudaschl | 46 | Austria |
| 7 | Hubert Raudaschl | 49 | Austria |
| 8 | Hubert Raudaschl | 53 | Austria |


### Average age of all Austrian athletes

In [76]:
result = g.query("""
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT (ROUND(AVG(?age)) AS ?rounded_average_age)
WHERE {
    ?athlete a foaf:Person;
             rdfs:label ?name;
             foaf:age ?age; 
}
""")   
for row in result:
    print("Average Age: %s" % row)

Average Age: 27
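
The average above is taken over every matching row, so an athlete with several recorded ages (such as the nine rows for Hubert Raudaschl) contributes several times. The sketch below contrasts that with an average taken per athlete, using hypothetical names and ages; it is plain Python and only illustrates the arithmetic, not the SPARQL engine.

```python
from statistics import mean

# Hypothetical (name, age) rows: one athlete appears three times
rows = [
    ("Athlete One", 20),
    ("Athlete One", 24),
    ("Athlete One", 28),
    ("Athlete Two", 40),
]

# Average over all rows, which is what AVG(?age) over the rows computes
avg_per_row = round(mean(age for _, age in rows))

# Average of each athlete's own mean age, so every athlete counts once
ages_by_athlete = {}
for name, age in rows:
    ages_by_athlete.setdefault(name, []).append(age)
avg_per_athlete = round(mean(mean(ages) for ages in ages_by_athlete.values()))

print("Per row:", avg_per_row)
print("Per athlete:", avg_per_athlete)
```

Which of the two figures is the more meaningful depends on the question being asked; the notebook's query answers the per-row version.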


### Oldest and youngest athlete

In [72]:
result = g.query("""
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT (MAX(?age) as ?maxAge) (MIN(?age) as ?minAge)
WHERE {
    ?athlete a foaf:Person;
             rdfs:label ?name;
             foaf:age ?age;
}
""")   
for row in result:
    print ("Oldest: %s \nYoungest: %s" % row)

Oldest: 76 
Youngest: 12
