# Lab | APIS

In this lab, you will collect historical data about the Nobel Prize winners using [this free and non-authenticated API](https://www.nobelprize.org/organization/developer-zone-2/). According to the documentation available [here](https://app.swaggerhub.com/apis/NobelMedia/NobelMasterData/2.1#/default/get_nobelPrizes). The base url is: "http://api.nobelprize.org/2.1/" followed by a string to specify what kind of information do you want to retrieve. The acceptable options are:

* nobelPrices
* nobelPrice/category/year
* laureates
* laureate/laureateID

# Getting the information using requests

Use the Python `requests`, and `json` libraries to obtain the information of ALL the Nobel Prizes. Make sure to verify that you get the proper status code (200).

The json outputs are simple plain text that need to be converted into the corresponding nested dictionary. Use the `.json()` method to cast the output into a Python dictionary.

Use the Pandas library to collect all the information into a Panda's DataFrame.

In [2]:
import requests
import json
import pandas as pd

url = "http://api.nobelprize.org/2.1/nobelPrizes?limit=100000"

response = requests.get(url)

if response.status_code == 200:
    print("All good!")
    print("==============")
    print("\n")
else:
    print(response.status_code)
    raise Exception("Didn't work")
    
info = response.json()
info

All good!




{'nobelPrizes': [{'awardYear': '1901',
   'category': {'en': 'Chemistry', 'no': 'Kjemi', 'se': 'Kemi'},
   'categoryFullName': {'en': 'The Nobel Prize in Chemistry',
    'no': 'Nobelprisen i kjemi',
    'se': 'Nobelpriset i kemi'},
   'dateAwarded': '1901-11-12',
   'prizeAmount': 150782,
   'prizeAmountAdjusted': 10531894,
   'links': [{'rel': 'nobelPrize',
     'href': 'https://api.nobelprize.org/2/nobelPrize/che/1901',
     'action': 'GET',
     'types': 'application/json'}],
   'laureates': [{'id': '160',
     'knownName': {'en': "Jacobus H. van 't Hoff"},
     'fullName': {'en': "Jacobus Henricus van 't Hoff"},
     'portion': '1',
     'sortOrder': '1',
     'motivation': {'en': 'in recognition of the extraordinary services he has rendered by the discovery of the laws of chemical dynamics and osmotic pressure in solutions',
      'se': 'såsom ett erkännande av den utomordentliga förtjänst han inlagt genom upptäckten av lagarna för den kemiska dynamiken och för det osmotiska tryck

In [3]:
len(info)

3

In [5]:
info.keys()

dict_keys(['nobelPrizes', 'meta', 'links'])

In [7]:
len(info['nobelPrizes'])

670

In [9]:
nobel_df = pd.DataFrame(info['nobelPrizes'])

# Processing the output

Process the Pandas DataFrame in order to have only the following columns:

- category
- dateAwarded (as DateTime in "yyyy-mm-dd" format)
- prizeAmount
- prizeAmountAdjusted
- Number_of_laureates
- motivation
- laureate_ids (as a list)

In [10]:
nobel_df.columns

Index(['awardYear', 'category', 'categoryFullName', 'dateAwarded',
       'prizeAmount', 'prizeAmountAdjusted', 'links', 'laureates',
       'topMotivation'],
      dtype='object')

In [11]:
nobel_df.head()

Unnamed: 0,awardYear,category,categoryFullName,dateAwarded,prizeAmount,prizeAmountAdjusted,links,laureates,topMotivation
0,1901,"{'en': 'Chemistry', 'no': 'Kjemi', 'se': 'Kemi'}","{'en': 'The Nobel Prize in Chemistry', 'no': '...",1901-11-12,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '160', 'knownName': {'en': 'Jacobus H....",
1,1901,"{'en': 'Literature', 'no': 'Litteratur', 'se':...","{'en': 'The Nobel Prize in Literature', 'no': ...",1901-11-14,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '569', 'knownName': {'en': 'Sully Prud...",
2,1901,"{'en': 'Peace', 'no': 'Fred', 'se': 'Fred'}","{'en': 'The Nobel Peace Prize', 'no': 'Nobels ...",1901-12-10,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '462', 'knownName': {'en': 'Henry Duna...",
3,1901,"{'en': 'Physics', 'no': 'Fysikk', 'se': 'Fysik'}","{'en': 'The Nobel Prize in Physics', 'no': 'No...",1901-11-12,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '1', 'knownName': {'en': 'Wilhelm Conr...",
4,1901,"{'en': 'Physiology or Medicine', 'no': 'Fysiol...",{'en': 'The Nobel Prize in Physiology or Medic...,1901-10-30,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '293', 'knownName': {'en': 'Emil von B...",


In [60]:
nobel_df['number_of_laureates'] = nobel_df['laureates'].map(len, na_action='ignore')

In [61]:
nobel_df['laureate_ids'] = nobel_df['laureates'].map(lambda el: [i['id'] for i in el], na_action='ignore')

In [41]:
nobel_df['category_en'] = nobel_df['category'].map(lambda elem: elem.get('en'))

In [48]:
nobel_df['dateAwarded_datetime'] = pd.to_datetime(nobel_df['dateAwarded'])

In [62]:
nobel_df['motivation'] = nobel_df['laureates'].map(lambda e: [i['motivation']['en'] for i in e][0], na_action='ignore')

In [64]:
new_df = nobel_df[
    ['category_en', 'dateAwarded_datetime', 'prizeAmount', 'prizeAmountAdjusted',
     'number_of_laureates', 'motivation', 'laureate_ids']
]

In [65]:
new_df.head()

Unnamed: 0,category_en,dateAwarded_datetime,prizeAmount,prizeAmountAdjusted,number_of_laureates,motivation,laureate_ids
0,Chemistry,1901-11-12,150782,10531894,1.0,in recognition of the extraordinary services h...,[160]
1,Literature,1901-11-14,150782,10531894,1.0,in special recognition of his poetic compositi...,[569]
2,Peace,1901-12-10,150782,10531894,2.0,for his humanitarian efforts to help wounded s...,"[462, 463]"
3,Physics,1901-11-12,150782,10531894,1.0,in recognition of the extraordinary services h...,[1]
4,Physiology or Medicine,1901-10-30,150782,10531894,1.0,"for his work on serum therapy, especially its ...",[293]


# Getting a Pandas DataFrame with the details of awarded authors/institutions

If you dive deeper and use the API to retrieve the details of some laureate_ids, you will notice that not allways the Nobel Prize was awarded to individuals. In some cases, the awards were given to institutions.

Get the unique ids from the previous datasets and prepare the following functions:

- get_name(laureate) ( it should return the english name 'fullName' of the individual or 'orgName' of the institution )

- get_gender(laureate) ( it should return the gender or 'Unknown' for individuals, and 'None' for institutions )

- get_birthdate(laureate) ( it should return the birthdate when it's avaialble or 'Unknown' otherwise )

- get_age(laureate) ( it should return the age of the awarded individual or 'Unknown' when it's not avaialble or for institutions )

- get_city(laureate) ( it should return the english name of the city when it's available or 'Unknown' otherwise )

- get_country(laureate) ( it should return the english name of the country when it's available or 'Unknown' otherwise )

- get_continent(laureate) ( it should return the english name of the continent when it's available or 'Unknown' otherwise )

- get_latitude(laureate) ( it should return the city's latitude when it's available or 'Unknown' otherwise )

- get_longitude(laureate) ( it should return the city's longitude
 when it's available or 'Unknown' otherwise )

Create the following dictionaries:

```python
laureates_dict = {"ID": [], "Name": [], "Gender": [], \
                  "Birth_date": [], "Age": [], \
                  "City": [], "Country": [], "Continent": [], \
                  "Latitude": [], "Longitude": []}                        

functions_dict = {"ID": None, "Name": get_name, "Gender": get_gender, \
                  "Birth_date": get_birthdate, "Age": get_age, \
                  "City": get_city, "Country": get_country, "Continent": get_continent, \
                  "Latitude": get_latitude, "Longitude": get_longitude}
```

For each unique `laureate_id` of the previous DataFrame make an API call to get the details of the awarded individual/intitution and iterate of the previous dictionaries keys in order to add the corresponding information of each `laureate_id` in the empty lists of `laureates_dict`.

Finally, create a Pandas DataFrame named `laureates_df` using the `laureates_dict`.

**API call for laureate ID 160 - indiv. Ex.**

In [122]:
url = "http://api.nobelprize.org/2.1/laureate/160"

id_response = requests.get(url)

if id_response.status_code == 200:
    print("All good!")
    print("==============")
    print("\n")
else:
    print(id_response.status_code)
    raise Exception("Didn't work")
    
id_info = id_response.json()
id_info

All good!




[{'id': '160',
  'knownName': {'en': "Jacobus H. van 't Hoff",
   'se': "Jacobus H. van 't Hoff"},
  'givenName': {'en': 'Jacobus H.', 'se': 'Jacobus H.'},
  'familyName': {'en': "van 't Hoff", 'se': "van 't Hoff"},
  'fullName': {'en': "Jacobus Henricus van 't Hoff",
   'se': "Jacobus Henricus van 't Hoff"},
  'fileName': 'hoff',
  'gender': 'male',
  'birth': {'date': '1852-08-30',
   'place': {'city': {'en': 'Rotterdam', 'no': 'Rotterdam', 'se': 'Rotterdam'},
    'country': {'en': 'the Netherlands',
     'no': 'Nederland',
     'se': 'Nederländerna'},
    'cityNow': {'en': 'Rotterdam',
     'no': 'Rotterdam',
     'se': 'Rotterdam',
     'sameAs': ['https://www.wikidata.org/wiki/Q34370',
      'https://www.wikipedia.org/wiki/Rotterdam'],
     'latitude': '51.925205',
     'longitude': '4.489110'},
    'countryNow': {'en': 'the Netherlands',
     'no': 'Nederland',
     'se': 'Nederländerna',
     'sameAs': ['https://www.wikidata.org/wiki/Q55'],
     'latitude': '52.316667',
     'lo

**API call for laureate ID 467 - org. Ex.**

In [148]:
url = "http://api.nobelprize.org/2.1/laureate/467"

org_response = requests.get(url)

if org_response.status_code == 200:
    print("All good!")
    print("==============")
    print("\n")
else:
    print(org_response.status_code)
    raise Exception("Didn't work")
    
org_info = org_response.json()
org_info

All good!




[{'id': '467',
  'orgName': {'en': 'Institute of International Law',
   'no': 'Folkerettsinstituttet',
   'se': 'Institutet för internationell rätt'},
  'nativeName': 'Institut de droit international',
  'fileName': 'international-law',
  'founded': {'date': '1873-00-00',
   'place': {'city': {'en': 'Ghent', 'no': 'Gent', 'se': 'Gent'},
    'country': {'en': 'Belgium',
     'no': 'Belgia',
     'se': 'Belgien',
     'sameAs': 'https://www.wikidata.org/wiki/Q31'},
    'cityNow': {'en': 'Ghent',
     'no': 'Gent',
     'se': 'Gent',
     'sameAs': ['https://www.wikidata.org/wiki/Q1296',
      'https://www.wikipedia.org/wiki/Ghent']},
    'countryNow': {'en': 'Belgium',
     'no': 'Belgia',
     'se': 'Belgien',
     'sameAs': ['https://www.wikidata.org/wiki/Q31']},
    'continent': {'en': 'Europe', 'no': 'Europa', 'se': 'Europa'},
    'locationString': {'en': 'Ghent, Belgium',
     'no': 'Gent, Belgia',
     'se': 'Gent, Belgien'}}},
  'wikipedia': {'slug': 'Institut_de_Droit_Internation

**Use the API to retrieve the details of some laureate_ids**

In [177]:
import time
from tqdm import tqdm

In [179]:
nobel_df.head()

Unnamed: 0,awardYear,category,categoryFullName,dateAwarded,prizeAmount,prizeAmountAdjusted,links,laureates,topMotivation,number_of_laureates,laureate_ids,category_en,dateAwarded_datetime,motivation
0,1901,"{'en': 'Chemistry', 'no': 'Kjemi', 'se': 'Kemi'}","{'en': 'The Nobel Prize in Chemistry', 'no': '...",1901-11-12,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '160', 'knownName': {'en': 'Jacobus H....",,1.0,[160],Chemistry,1901-11-12,in recognition of the extraordinary services h...
1,1901,"{'en': 'Literature', 'no': 'Litteratur', 'se':...","{'en': 'The Nobel Prize in Literature', 'no': ...",1901-11-14,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '569', 'knownName': {'en': 'Sully Prud...",,1.0,[569],Literature,1901-11-14,in special recognition of his poetic compositi...
2,1901,"{'en': 'Peace', 'no': 'Fred', 'se': 'Fred'}","{'en': 'The Nobel Peace Prize', 'no': 'Nobels ...",1901-12-10,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '462', 'knownName': {'en': 'Henry Duna...",,2.0,"[462, 463]",Peace,1901-12-10,for his humanitarian efforts to help wounded s...
3,1901,"{'en': 'Physics', 'no': 'Fysikk', 'se': 'Fysik'}","{'en': 'The Nobel Prize in Physics', 'no': 'No...",1901-11-12,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '1', 'knownName': {'en': 'Wilhelm Conr...",,1.0,[1],Physics,1901-11-12,in recognition of the extraordinary services h...
4,1901,"{'en': 'Physiology or Medicine', 'no': 'Fysiol...",{'en': 'The Nobel Prize in Physiology or Medic...,1901-10-30,150782,10531894,"[{'rel': 'nobelPrize', 'href': 'https://api.no...","[{'id': '293', 'knownName': {'en': 'Emil von B...",,1.0,[293],Physiology or Medicine,1901-10-30,"for his work on serum therapy, especially its ..."


In [201]:
# ids = []
# for l in nobel_df['laureate_ids'].dropna().values:
#     for item in l:
#         ids.append(int(item))

ids = [int(item) for l in nobel_df['laureate_ids'].dropna().values for item in l]
unique_ids = set(ids)
# .values is not required, it works without it just fine
# .values is an attribute in the pandas series class

In [212]:
# the never-ending list of user-defined functions that could have been a few functions less long

def get_name(laureate):
    if 'fullName' in laureate[0]:
        return laureate[0]['fullName'].get('en', 'Unknown')
    elif 'orgName' in laureate[0]:
        return laureate[0]['orgName'].get('en', 'Unknown')


def get_gender(laureate):
    if 'gender' in laureate[0]:
        return laureate[0]['gender']
    elif 'orgName' in laureate[0]:
        return 'None'
    else:
        return 'Unknown'


def get_birthdate(laureate):
    return laureate[0].get('birth', {}).get('date', 'Unknown')


def get_age(laureate):
    if 'birth' in laureate[0]:
        birthday = laureate[0]['birth']['date'][:4]
        awarded = laureate[0]['nobelPrizes'][0]['dateAwarded'][:4]
        age = int(awarded) - int(birthday)
        return age
    elif 'orgName' in laureate[0]:
        return 'Unknown'


def get_city(laureate):
    if 'birth' in laureate[0]:
        return laureate[0]['birth'].get('place', {}).get('cityNow', {}).get('en', 'Unknown')
    elif 'orgName' in laureate[0]:
        return laureate[0].get('founded', {}).get('place', {}).get('cityNow', {}).get('en', 'Unknown')
    else:
        return 'Unknown'


def get_country(laureate):
    if 'birth' in laureate[0]:
        return laureate[0]['birth'].get('place', {}).get('countryNow', {}).get('en', 'Unknown')
    elif 'orgName' in laureate[0]:
        return laureate[0].get('founded', {}).get('place', {}).get('countryNow', {}).get('en', 'Unknown')
    else:
        return 'Unknown'


def get_continent(laureate):
    if 'birth' in laureate[0]:
        return laureate[0]['birth'].get('place', {}).get('continent', {}).get('en', 'Unknown')
    elif 'orgName' in laureate[0]:
        return laureate[0].get('founded', {}).get('place', {}).get('continent', {}).get('en', 'Unknown')
    else:
        return 'Unknown'


def get_latitude(laureate):
    return laureate[0].get('birth', {}).get('place', {}).get('cityNow', {}).get('latitude', 'Unknown')


def get_longitude(laureate):
    return laureate[0].get('birth', {}).get('place', {}).get('cityNow', {}).get('longitude', 'Unknown')


laureates_dict = {"ID": [], "Name": [], "Gender": [], \
                  "Birth_date": [], "Age": [], \
                  "City": [], "Country": [], "Continent": [], \
                  "Latitude": [], "Longitude": []}

functions_dict = {"Name": get_name, "Gender": get_gender, \
                  "Birth_date": get_birthdate, "Age": get_age, \
                  "City": get_city, "Country": get_country, "Continent": get_continent, \
                  "Latitude": get_latitude, "Longitude": get_longitude}

In [213]:
responses = []

for index, id in enumerate(tqdm(unique_ids)):
    url = "https://api.nobelprize.org/2/laureate/" + str(id)
    response = requests.get(url)
    if response.status_code == 200:
        responses.append(response.json())

100%|█████████████████████████████████████████| 992/992 [04:09<00:00,  3.98it/s]


In [215]:
for laureate in responses:
    laureates_dict['ID'].append(id)
    for col_name, func in functions_dict.items():
        laureates_dict[col_name].append(func(laureate))

laureates_df = pd.DataFrame(laureates_dict)
laureates_df

Unnamed: 0,ID,Name,Gender,Birth_date,Age,City,Country,Continent,Latitude,Longitude
0,1034,Wilhelm Conrad Röntgen,male,1845-03-27,56,Remscheid,Germany,Europe,51.178742,7.189696
1,1034,Hendrik Antoon Lorentz,male,1853-07-18,49,Arnhem,the Netherlands,Europe,51.984257,5.910857
2,1034,Pieter Zeeman,male,1865-05-25,37,Zonnemaire,the Netherlands,Europe,51.713056,3.951111
3,1034,Antoine Henri Becquerel,male,1852-12-15,51,Paris,France,Europe,48.860093,2.355954
4,1034,Pierre Curie,male,1859-05-15,44,Paris,France,Europe,48.860093,2.355954
...,...,...,...,...,...,...,...,...,...,...
987,1034,Louis E. Brus,male,1943-00-00,80,"Cleveland, OH",USA,North America,41.496386,-81.710675
988,1034,Aleksey Yekimov,male,1945-00-00,78,St. Petersburg,Russia,Europe,59.956651,30.333547
989,1034,Jon Fosse,male,1959-09-29,64,Haugesund,Norway,Europe,59.410150,5.275511
990,1034,Narges Mohammadi,female,1972-04-21,51,Zanjan,Iran,Asia,36.666667,48.483333


# Country ranking

Get a ranking countries by the number of times that they had been awarded in any category.

In [222]:
# Your code here
laureates_df.groupby('Country')['ID'].count().sort_values(ascending=False).head(60)

Country
USA                       296
United Kingdom             91
Germany                    84
France                     63
Russia                     30
Sweden                     30
Poland                     28
Japan                      28
Switzerland                24
Canada                     22
Italy                      20
Austria                    20
the Netherlands            19
Norway                     13
China                      12
Denmark                    12
Hungary                    11
Scotland                   11
Australia                  11
Belgium                    10
India                       9
South Africa                9
Spain                       7
Israel                      6
Egypt                       6
Ukraine                     6
Czech Republic              6
Finland                     5
Unknown                     5
Ireland                     5
Northern Ireland            5
Argentina                   4
Belarus                     4
Ro