The Nobel Prize has been among the most prestigious international awards since 1901. Each year, prizes are awarded in chemistry, literature, physics, physiology or medicine, economics, and peace. In addition to the honor, prestige, and substantial prize money, each recipient receives a gold medal bearing the image of Alfred Nobel (1833–1896), who established the prize.

![](Nobel_Prize.png)

The Nobel Foundation has made available a dataset of all prize winners, from the first awards in 1901 through 2023. The dataset used in this project comes from the Nobel Prize API and is stored in the `nobel.csv` file in the `data` folder.

In this project, you'll explore this prizewinning data and answer several questions about it. We also encourage you to go on and explore any further questions that interest you!

In [200]:
# Loading in required libraries
import pandas as pd
import seaborn as sns
import numpy as np

# Import the data and display its first few rows as a dataframe
nobel = pd.read_csv("data/nobel.csv")
nobel.head()

| | year | category | prize | motivation | prize_share | laureate_id | laureate_type | full_name | birth_date | birth_city | birth_country | sex | organization_name | organization_city | organization_country | death_date | death_city | death_country |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1901 | Chemistry | The Nobel Prize in Chemistry 1901 | "in recognition of the extraordinary services ... | 1/1 | 160 | Individual | Jacobus Henricus van 't Hoff | 1852-08-30 | Rotterdam | Netherlands | Male | Berlin University | Berlin | Germany | 1911-03-01 | Berlin | Germany |
| 1 | 1901 | Literature | The Nobel Prize in Literature 1901 | "in special recognition of his poetic composit... | 1/1 | 569 | Individual | Sully Prudhomme | 1839-03-16 | Paris | France | Male | | | | 1907-09-07 | Châtenay | France |
| 2 | 1901 | Medicine | The Nobel Prize in Physiology or Medicine 1901 | "for his work on serum therapy, especially its... | 1/1 | 293 | Individual | Emil Adolf von Behring | 1854-03-15 | Hansdorf (Lawice) | Prussia (Poland) | Male | Marburg University | Marburg | Germany | 1917-03-31 | Marburg | Germany |
| 3 | 1901 | Peace | The Nobel Peace Prize 1901 | | 1/2 | 462 | Individual | Jean Henry Dunant | 1828-05-08 | Geneva | Switzerland | Male | | | | 1910-10-30 | Heiden | Switzerland |
| 4 | 1901 | Peace | The Nobel Peace Prize 1901 | | 1/2 | 463 | Individual | Frédéric Passy | 1822-05-20 | Paris | France | Male | | | | 1912-06-12 | Paris | France |


In [201]:
# Extract the most commonly awarded gender and birth country stored as strings
# value_counts() sorts in descending order by default, so the first index entry is the most frequent value
top_gender = nobel["sex"].value_counts().index[0]
top_country = nobel["birth_country"].value_counts().index[0]
print("Most commonly awarded gender: " + top_gender)
print("Most commonly awarded birth country: " + top_country)

Most commonly awarded gender: Male
Most commonly awarded birth country: United States of America


In [202]:
# Extract the decade with the highest proportion of US-born winners
nobel["US-born"] = (nobel["birth_country"] == "United States of America")
# Integer floor division keeps the decade as an int (1900, 1910, ...)
nobel["decade"] = (nobel["year"] // 10) * 10
# The mean of a boolean column is the share of True values in each group
proportions = nobel.groupby("decade")["US-born"].mean()
max_decade_usa = int(proportions.idxmax())
print("Decade with the highest proportion of US-born winners: " + str(max_decade_usa))

Decade with the highest proportion of US-born winners: 2000
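
The proportion computed above is the number of US-born winners in a decade divided by the number of winners in that decade. As a quick sanity check of that idea, here is a minimal sketch on a small made-up table (the rows are hypothetical and not taken from `nobel.csv`). It shows that averaging a boolean column per group gives the same result as dividing the sum by the count.

```python
import pandas as pd

# Hypothetical toy data for illustration only, not rows from nobel.csv
toy = pd.DataFrame({
    "year": [1901, 1905, 1909, 1912, 1913, 1918],
    "birth_country": [
        "Germany", "United States of America", "France",
        "United States of America", "United States of America", "Sweden",
    ],
})

# Flag US-born winners and bucket each year into its decade
toy["US-born"] = toy["birth_country"] == "United States of America"
toy["decade"] = (toy["year"] // 10) * 10

by_decade = toy.groupby("decade")["US-born"]

# Two equivalent ways to get the per-decade proportion
ratio_of_counts = by_decade.sum() / by_decade.count()
mean_of_flags = by_decade.mean()

print(mean_of_flags)
print("Decade with the highest proportion:", mean_of_flags.idxmax())
```

Selecting the `"US-born"` column before aggregating also avoids summing every other column in the frame, which is wasted work when only one proportion is needed.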


In [203]:
# Extract the decade-category pair with the highest proportion of female laureates
nobel["is_female"] = (nobel["sex"] == "Female")
proportions = nobel.groupby(["decade", "category"])["is_female"].mean()
top_decade, top_category = proportions.idxmax()
max_female_dict = {int(top_decade): top_category}
print("Decade-category pair with the highest proportion of female laureates: " + str(max_female_dict))

Decade-category pair with the highest proportion of female laureates: {2020: 'Literature'}
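
When a Series is grouped by two keys, its index has two levels, so `idxmax()` returns a tuple rather than a single label. The sketch below illustrates this on a small made-up table (hypothetical rows, not from `nobel.csv`) and unpacks the tuple into a one-entry dictionary of the same shape as the result above.

```python
import pandas as pd

# Hypothetical toy data for illustration only, not rows from nobel.csv
toy = pd.DataFrame({
    "decade": [2000, 2000, 2000, 2010, 2010, 2010],
    "category": ["Peace", "Peace", "Physics", "Peace", "Physics", "Physics"],
    "sex": ["Female", "Male", "Male", "Female", "Male", "Female"],
})

toy["is_female"] = toy["sex"] == "Female"

# Share of female laureates for each (decade, category) pair
share = toy.groupby(["decade", "category"])["is_female"].mean()

# idxmax() on a two-level index returns a (decade, category) tuple
top_decade, top_category = share.idxmax()
result = {int(top_decade): top_category}
print(result)
```

Note that `idxmax()` reports only the first maximum it encounters, so if several pairs tie for the highest share, the others are not shown.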


In [204]:
# Extract the first woman to receive a Nobel prize
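# Sort by year so the result does not depend on the row order of the file
first_woman = nobel[nobel["sex"] == "Female"].sort_values("year", kind="stable").iloc[0]
first_woman_name = first_woman["full_name"]
first_woman_category = first_woman["category"]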
print("First woman name: " + first_woman_name)
print("First woman category: " + first_woman_category)

First woman name: Marie Curie, née Sklodowska
First woman category: Physics


In [205]:
# Extract the list of names of individuals or organizations which have won multiple Nobel prizes throughout the years
name_counts = nobel["full_name"].value_counts()
repeat_list = list(name_counts[name_counts > 1].index)
print("List of names of individuals or organizations which have won multiple Nobel prizes: " + str(repeat_list))

List of names of individuals or organizations which have won multiple Nobel prizes: ['Comité international de la Croix Rouge (International Committee of the Red Cross)', 'Linus Carl Pauling', 'John Bardeen', 'Frederick Sanger', 'Marie Curie, née Sklodowska', 'Office of the United Nations High Commissioner for Refugees (UNHCR)']
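
Counting by `full_name` assumes that a laureate's name is written identically in every row and that no two laureates share a name. Whether either case occurs in `nobel.csv` is not checked here. As a hedged alternative, the sketch below counts by the `laureate_id` column instead, assuming it uniquely identifies a laureate. The table is made up for illustration, and its names and ids are hypothetical.

```python
import pandas as pd

# Hypothetical toy data: id 1 appears twice with two spellings of the name,
# while ids 2 and 4 are different laureates who happen to share a name
toy = pd.DataFrame({
    "laureate_id": [1, 2, 1, 3, 4],
    "full_name": [
        "A. Example", "B. Sample", "A. Example, née Demo",
        "C. Placeholder", "B. Sample",
    ],
})

# Name-based count, as in the cell above
by_name = toy["full_name"].value_counts()
repeat_by_name = list(by_name[by_name > 1].index)

# Id-based count: find ids that occur more than once, then report one name per id
by_id = toy["laureate_id"].value_counts()
repeat_ids = by_id[by_id > 1].index
repeat_by_id = (
    toy[toy["laureate_id"].isin(repeat_ids)]
    .drop_duplicates("laureate_id")["full_name"]
    .tolist()
)

print("By name:", repeat_by_name)
print("By id:  ", repeat_by_id)
```

On this toy table the two approaches disagree: the name-based count flags the shared name and misses the laureate whose name is spelled two ways, while the id-based count does the opposite.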
