# Using Python and Wikidata to identify and compare the heights of various world leaders

Today's challenge consists of finding out who is the tallest of the following leaders:


*   Robert Mugabe
*   Vladimir Putin
*   Vladimir Lenin
*   Adolf Hitler
*   Joseph Stalin



First, we import the library which we will use. 

You may need to run `pip install mkwikidata` first.

In [None]:
!pip install mkwikidata

Collecting mkwikidata
  Downloading https://files.pythonhosted.org/packages/df/f4/76d912e03f037cb78f12eb694b3bb89ed19b4110a2b1d14cb9cbe6a0f777/mkwikidata-0.14-py2.py3-none-any.whl
Collecting requests>=2.25.1 (from mkwikidata)
  Downloading https://files.pythonhosted.org/packages/2d/61/08076519c80041bc0ffa1a8af0cbd3bf3e2b62af10435d269a9d0f40564d/requests-2.27.1-py2.py3-none-any.whl (63kB)
Collecting charset-normalizer~=2.0.0; python_version >= "3" (from requests>=2.25.1->mkwikidata)
  Downloading https://files.pythonhosted.org/packages/06/b3/24afc8868eba069a7f03650ac750a778862dc34941a4bebeb58706715726/charset_normalizer-2.0.12-py3-none-any.whl
Installing collected packages: charset-normalizer, requests, mkwikidata
  Found existing installation: requests 2.21.0
    Uninstalling requests-2.21.0:
      Successfully uninstalled requests-2.21.0
Successfully installed charset-normalizer-2.0.12 mkwikidata-0.14 requests-2.27.1


In [None]:
import mkwikidata 

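We have created and tested our world leader query in advance on the Wikidata Query interface and copied the SPARQL code below. See https://w.wiki/4ugN

The query selects people recorded as male (`wdt:P21 wd:Q6581097`) who have held a position (`wdt:P39`) that is a subclass (`wdt:P279`) of one of the two classes listed under `VALUES`, and who have a height (`wdt:P2048`) on record. Results are sorted tallest first and capped at 55 rows.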

In [None]:
query = """
SELECT DISTINCT ?leader ?leaderLabel ?height WHERE {
  VALUES ?o {
    wd:Q48352
    wd:Q2285706
  }
  ?leader wdt:P21 wd:Q6581097;
    wdt:P39 ?position.
  ?position wdt:P279 ?o.
  ?leader wdt:P2048 ?height.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC (?height)
LIMIT 55
"""


Run the query

In [None]:
query_result = mkwikidata.run_query(query, params={ })

What does the query output?

In [None]:
query_result

{'head': {'vars': ['leader', 'leaderLabel', 'height']},
 'results': {'bindings': [{'leader': {'type': 'uri',
     'value': 'http://www.wikidata.org/entity/Q74660'},
    'leaderLabel': {'xml:lang': 'en',
     'type': 'literal',
     'value': 'Peter Hussing'},
    'height': {'datatype': 'http://www.w3.org/2001/XMLSchema#decimal',
     'type': 'literal',
     'value': '196'}},
   {'leader': {'type': 'uri',
     'value': 'http://www.wikidata.org/entity/Q3579995'},
    'leaderLabel': {'xml:lang': 'en',
     'type': 'literal',
     'value': 'Édouard Philippe'},
    'height': {'datatype': 'http://www.w3.org/2001/XMLSchema#decimal',
     'type': 'literal',
     'value': '194'}},
   {'leader': {'type': 'uri',
     'value': 'http://www.wikidata.org/entity/Q19958436'},
    'leaderLabel': {'xml:lang': 'en',
     'type': 'literal',
     'value': 'Octavian Morariu'},
    'height': {'datatype': 'http://www.w3.org/2001/XMLSchema#decimal',
     'type': 'literal',
     'value': '193'}},
   {'leader': {'

What type of object do we have? 

In [None]:
print(type(query_result))

<class 'dict'>
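Since the result is a plain dictionary, its structure can be explored with ordinary indexing before any data is extracted. The snippet below is a minimal sketch that uses a hand-trimmed copy of the output shown above (`sample_result` is a hypothetical stand-in, not a name from the notebook), so it runs without calling Wikidata.

```python
# Hypothetical miniature of the response: the first two rows of the output above.
sample_result = {
    "head": {"vars": ["leader", "leaderLabel", "height"]},
    "results": {
        "bindings": [
            {
                "leader": {"type": "uri", "value": "http://www.wikidata.org/entity/Q74660"},
                "leaderLabel": {"xml:lang": "en", "type": "literal", "value": "Peter Hussing"},
                "height": {
                    "datatype": "http://www.w3.org/2001/XMLSchema#decimal",
                    "type": "literal",
                    "value": "196",
                },
            },
            {
                "leader": {"type": "uri", "value": "http://www.wikidata.org/entity/Q3579995"},
                "leaderLabel": {"xml:lang": "en", "type": "literal", "value": "Édouard Philippe"},
                "height": {
                    "datatype": "http://www.w3.org/2001/XMLSchema#decimal",
                    "type": "literal",
                    "value": "194",
                },
            },
        ]
    },
}

# "head" lists the selected variables; "results" -> "bindings" holds one dict per row.
columns = sample_result["head"]["vars"]
rows = sample_result["results"]["bindings"]
first_row = rows[0]

print(columns)
print(len(rows))
# Every cell is itself a dict; the actual content sits under its "value" key as a string.
print(first_row["leaderLabel"]["value"], first_row["height"]["value"])
```

Note that the height arrives as the string `'196'`, which is why it has to be converted before any numeric comparison.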


We can extract the data to a list

In [None]:
data = [{"name" : x["leaderLabel"]["value"], "height" : int(x["height"]["value"])} for x in query_result["results"]["bindings"]]

In [None]:
print(data)

[{'name': 'Peter Hussing', 'height': 196}, {'name': 'Édouard Philippe', 'height': 194}, {'name': 'Octavian Morariu', 'height': 193}, {'name': 'Robert Busnel', 'height': 192}, {'name': 'Jean-Luc Rougé', 'height': 190}, {'name': 'Franklin Delano Roosevelt', 'height': 189}, {'name': 'Carlos Roberto Flores', 'height': 188}, {'name': 'Paul Goze', 'height': 188}, {'name': 'Manuel Zelaya', 'height': 187}, {'name': 'Zdravko Hebel', 'height': 187}, {'name': 'Manfred Deckert', 'height': 186}, {'name': 'Saddam Hussein', 'height': 186}, {'name': 'Harald V of Norway', 'height': 186}, {'name': 'Ronald Reagan', 'height': 185}, {'name': 'Rafael Leonardo Callejas Romero', 'height': 185}, {'name': 'Serge Blanco', 'height': 185}, {'name': 'Jean-Patrick Lescarboura', 'height': 185}, {'name': 'George Weah', 'height': 184}, {'name': 'George W. Bush', 'height': 183}, {'name': 'Olivier Girault', 'height': 183}, {'name': 'Gilles Quénéhervé', 'height': 183}, {'name': 'Philippe Bernat-Salles', 'height': 181}, {'
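The heights come back typed as `xsd:decimal`, so a value such as `'175.5'` would make `int()` raise a `ValueError`. The following is a sketch of a more defensive extraction, assuming we are happy to store heights as floats and to skip incomplete rows. The function name `to_records` and the sample rows (apart from Peter Hussing's) are hypothetical and only there to exercise the edge cases.

```python
def to_records(bindings):
    """Convert SPARQL JSON bindings into a list of {"name", "height"} dicts.

    Rows with a missing label, a missing height, or a height that cannot
    be parsed as a number are skipped instead of raising.
    """
    records = []
    for row in bindings:
        try:
            name = row["leaderLabel"]["value"]
            height = float(row["height"]["value"])
        except (KeyError, ValueError):
            continue
        records.append({"name": name, "height": height})
    return records


# Hypothetical rows: one taken from the output above, three made up as edge cases.
sample_bindings = [
    {"leaderLabel": {"value": "Peter Hussing"}, "height": {"value": "196"}},
    {"leaderLabel": {"value": "Example Leader"}, "height": {"value": "175.5"}},
    {"leaderLabel": {"value": "No Height Leader"}},
    {"leaderLabel": {"value": "Bad Height Leader"}, "height": {"value": "unknown"}},
]

print(to_records(sample_bindings))
```

With the real response, the call would be `to_records(query_result["results"]["bindings"])`.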

If we want, we can stick the data in a data frame and plot it

In [None]:
import pandas as pd
pd.DataFrame(data).set_index("name").head(10).plot.barh().invert_yaxis()


And we can look at the full data frame

In [None]:
pd.DataFrame(data).set_index("name")

| name | height |
|---|---|
| Peter Hussing | 196 |
| Édouard Philippe | 194 |
| Octavian Morariu | 193 |
| Robert Busnel | 192 |
| Jean-Luc Rougé | 190 |
| Franklin Delano Roosevelt | 189 |
| Carlos Roberto Flores | 188 |
| Paul Goze | 188 |
| Manuel Zelaya | 187 |
| Zdravko Hebel | 187 |


### But who is the tallest of the leaders listed in our challenge?

Let's create a list of our target leaders, `our_leaders`.

In [None]:
our_leaders = ['Robert Mugabe', 'Vladimir Putin', 'Vladimir Lenin', 'Adolf Hitler', 'Joseph Stalin']

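Then we can iterate over the bindings in our query result and check whether each leader's label is in our list. When it is, we compare that leader's height with the tallest found so far and, if it is greater, update the stored name and height. Note that a leader can only match if their English label appears exactly as written in `our_leaders` and their row is among those returned by the query, which is capped by `LIMIT 55`.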

In [None]:
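# Track the tallest matching leader seen so far.
max_leader = ""
max_height = 0

for x in query_result["results"]["bindings"]:
    # Only consider rows whose English label is one of our target names.
    if x["leaderLabel"]["value"] in our_leaders:
        # Heights arrive as strings, so convert before comparing.
        if int(x["height"]["value"]) > max_height:
            max_height = int(x["height"]["value"])
            max_leader = x["leaderLabel"]["value"]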


Print out our result

In [None]:
print("The tallest leader is " + max_leader + ", who is "+ str(max_height) + "cm tall.")

The tallest leader is Adolf Hitler, who is 174cm tall.
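The same search can be written more compactly with the built-in `max()` and a `key` function, and it is worth reporting which target names were never found, because a name missing from the results is silently ignored by the loop above. This is a sketch rather than the notebook's own method: the helper name `tallest` is hypothetical, and `sample_records` is a handful of rows copied from the outputs above, so most of the target names are reported as unmatched here.

```python
def tallest(records, names):
    """Return (tallest matching record or None, sorted names with no match)."""
    wanted = set(names)
    candidates = [r for r in records if r["name"] in wanted]
    missing = sorted(wanted - {r["name"] for r in candidates})
    # default=None avoids a ValueError when none of the names matched.
    best = max(candidates, key=lambda r: r["height"], default=None)
    return best, missing


# A few rows taken from the outputs shown earlier, not the full result set.
sample_records = [
    {"name": "Peter Hussing", "height": 196},
    {"name": "Saddam Hussein", "height": 186},
    {"name": "Ronald Reagan", "height": 185},
    {"name": "Adolf Hitler", "height": 174},
]
our_leaders = ["Robert Mugabe", "Vladimir Putin", "Vladimir Lenin", "Adolf Hitler", "Joseph Stalin"]

best, missing = tallest(sample_records, our_leaders)
if best is not None:
    print(f"The tallest leader is {best['name']}, who is {best['height']}cm tall.")
print("No match in this sample:", missing)
```

Run against the full `data` list built earlier, the `missing` list shows at a glance whether any of the five leaders fell outside the 55 returned rows or are labelled differently on Wikidata.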
