The Nobel Prize has been among the most prestigious international awards since 1901. Each year, awards are bestowed in chemistry, literature, physics, physiology or medicine, economics, and peace. In addition to the honor, prestige, and substantial prize money, the recipient also gets a gold medal with an image of Alfred Nobel (1833 - 1896), who established the prize.

![](Nobel_Prize.png)

The Nobel Foundation has made available a dataset of all prize winners, from the first awards in 1901 through 2023. The dataset used in this project comes from the Nobel Prize API and is stored in the `nobel.csv` file in the `data` folder.

In this project, you'll explore and answer several questions about this prize-winning data. We then encourage you to explore any further questions that interest you!

In [52]:
# Loading in required libraries
import pandas as pd
import seaborn as sns
import numpy as np

# Start coding here!

In [53]:
nobel = pd.read_csv('data/nobel.csv')
nobel.head(25)

Unnamed: 0,year,category,prize,motivation,prize_share,laureate_id,laureate_type,full_name,birth_date,birth_city,birth_country,sex,organization_name,organization_city,organization_country,death_date,death_city,death_country
0,1901,Chemistry,The Nobel Prize in Chemistry 1901,"""in recognition of the extraordinary services ...",1/1,160,Individual,Jacobus Henricus van 't Hoff,1852-08-30,Rotterdam,Netherlands,Male,Berlin University,Berlin,Germany,1911-03-01,Berlin,Germany
1,1901,Literature,The Nobel Prize in Literature 1901,"""in special recognition of his poetic composit...",1/1,569,Individual,Sully Prudhomme,1839-03-16,Paris,France,Male,,,,1907-09-07,Châtenay,France
2,1901,Medicine,The Nobel Prize in Physiology or Medicine 1901,"""for his work on serum therapy, especially its...",1/1,293,Individual,Emil Adolf von Behring,1854-03-15,Hansdorf (Lawice),Prussia (Poland),Male,Marburg University,Marburg,Germany,1917-03-31,Marburg,Germany
3,1901,Peace,The Nobel Peace Prize 1901,,1/2,462,Individual,Jean Henry Dunant,1828-05-08,Geneva,Switzerland,Male,,,,1910-10-30,Heiden,Switzerland
4,1901,Peace,The Nobel Peace Prize 1901,,1/2,463,Individual,Frédéric Passy,1822-05-20,Paris,France,Male,,,,1912-06-12,Paris,France
5,1901,Physics,The Nobel Prize in Physics 1901,"""in recognition of the extraordinary services ...",1/1,1,Individual,Wilhelm Conrad Röntgen,1845-03-27,Lennep (Remscheid),Prussia (Germany),Male,Munich University,Munich,Germany,1923-02-10,Munich,Germany
6,1902,Chemistry,The Nobel Prize in Chemistry 1902,"""in recognition of the extraordinary services ...",1/1,161,Individual,Hermann Emil Fischer,1852-10-09,Euskirchen,Prussia (Germany),Male,Berlin University,Berlin,Germany,1919-07-15,Berlin,Germany
7,1902,Literature,The Nobel Prize in Literature 1902,"""the greatest living master of the art of hist...",1/1,571,Individual,Christian Matthias Theodor Mommsen,1817-11-30,Garding,Schleswig (Germany),Male,,,,1903-11-01,Charlottenburg,Germany
8,1902,Medicine,The Nobel Prize in Physiology or Medicine 1902,"""for his work on malaria, by which he has show...",1/1,294,Individual,Ronald Ross,1857-05-13,Almora,India,Male,University College,Liverpool,United Kingdom,1932-09-16,Putney Heath,United Kingdom
9,1902,Peace,The Nobel Peace Prize 1902,,1/2,464,Individual,Élie Ducommun,1833-02-19,Geneva,Switzerland,Male,,,,1906-12-07,Bern,Switzerland


In [54]:
# Most common gender & birth country
# Compute each frequency table once and reuse it for both the label and the count
sex_counts = nobel['sex'].value_counts()
top_gender = sex_counts.idxmax()
top_gender_count = sex_counts.max()
print(f'The most commonly awarded gender is: {top_gender}, {top_gender_count} awards')

country_counts = nobel['birth_country'].value_counts()
top_country = country_counts.idxmax()
top_country_count = country_counts.max()
print(f'The most commonly awarded birth country is: {top_country}, {top_country_count} awards')

The most commonly awarded gender is: Male, 905 awards
The most commonly awarded birth country is: United States of America, 291 awards


In [55]:
# Decade with highest ratio of US-born vs total winners, in all categories

# Add a decade column (e.g. 1903 -> 1900)
nobel_decades = nobel.copy()
nobel_decades['decade'] = (nobel_decades['year'] // 10) * 10

# All laureates per decade
winners_per_decade = nobel_decades.groupby('decade').size()

# US-born laureates per decade
us_winners_decade = nobel_decades[nobel_decades['birth_country'] == 'United States of America'].groupby('decade').size()

# Share of US-born laureates per decade, computed once and reused
us_ratio_decade = us_winners_decade / winners_per_decade
max_decade_usa = us_ratio_decade.idxmax()
max_decade_usa_count = us_ratio_decade.max().round(2)
print('Decade with most US-born laureates is ' + str(max_decade_usa) + '-09, with a ratio of ' + str(max_decade_usa_count) + ' US:total awards.')

Decade with most US-born laureates is 2000-09, with a ratio of 0.42 US:total awards.
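
The same ratio can also be obtained without dividing two separate group counts, by taking the mean of a boolean column within each decade. The sketch below uses a small made-up table rather than `nobel.csv`; the `toy` DataFrame and the `us_born` column are hypothetical names introduced only for illustration.

```python
import pandas as pd

# Toy data (hypothetical, not taken from nobel.csv)
toy = pd.DataFrame({
    "year": [1901, 1905, 1912, 1915, 1918, 1923],
    "birth_country": [
        "Germany", "United States of America",
        "United States of America", "France",
        "United States of America", "Sweden",
    ],
})

# Same decade logic as in the cell above
toy["decade"] = (toy["year"] // 10) * 10

# True/False flag; the mean of a boolean column is the share of True values
toy["us_born"] = toy["birth_country"] == "United States of America"

us_share = toy.groupby("decade")["us_born"].mean()
best_decade = us_share.idxmax()
best_share = round(us_share.max(), 2)

print(f"{best_decade}s: {best_share}")
```

One difference from the two-count approach: a decade with no US-born laureates gets a share of 0 here instead of NaN, which may or may not matter for a given question.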


In [56]:
# Highest female laureate decade & category combination

female_winners_decade_cat = (
    nobel_decades[nobel_decades['sex'] == 'Female']
    .groupby(['decade', 'category'])
    .size()
)
winners_per_decade_cat = nobel_decades.groupby(['decade', 'category']).size()

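# Decade/category groups with no female laureates are absent from the numerator,
# so their ratio is NaN; idxmax() skips NaN by default
max_female_test = (female_winners_decade_cat / winners_per_decade_cat).idxmax()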
max_female_dict = {max_female_test[0]:max_female_test[1]}

print(max_female_dict)


{2020: 'Literature'}


In [57]:
# First female laureate & category

first_woman = (
    nobel_decades[nobel_decades['sex'] == 'Female']
    .sort_values(by='year')
    .iloc[0]
)

first_woman_name = first_woman['full_name']
first_woman_category = first_woman['category']

print(first_woman_name + ' in ' + first_woman_category)


Marie Curie, née Sklodowska in Physics


In [58]:
# List of names of repeat winners
repeat_list = nobel['full_name'].value_counts()
repeat_list = repeat_list[repeat_list > 1].index.tolist()

print(repeat_list)

['Comité international de la Croix Rouge (International Committee of the Red Cross)', 'Linus Carl Pauling', 'John Bardeen', 'Frederick Sanger', 'Marie Curie, née Sklodowska', 'Office of the United Nations High Commissioner for Refugees (UNHCR)']
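
Counting on `full_name` works when each laureate's name is spelled identically on every row. A possible alternative, assuming `laureate_id` uniquely identifies a laureate (the notebook does not confirm this), is to count on the ID and look up the name afterwards. The sketch below runs on made-up rows; `toy` and the placeholder names are hypothetical.

```python
import pandas as pd

# Toy data (hypothetical IDs and names, not taken from nobel.csv)
toy = pd.DataFrame({
    "laureate_id": [1, 2, 1, 3, 2, 4],
    "full_name": [
        "Laureate A", "Laureate B", "Laureate A",
        "Laureate C", "Laureate B", "Laureate D",
    ],
})

# Count awards per ID and keep the IDs that appear more than once
id_counts = toy["laureate_id"].value_counts()
repeat_ids = id_counts[id_counts > 1].index

# One row per repeat laureate, in order of first appearance
repeat_names = (
    toy[toy["laureate_id"].isin(repeat_ids)]
    .drop_duplicates("laureate_id")["full_name"]
    .tolist()
)

print(repeat_names)
```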
