# Visualizing the History of Nobel Prize Winners 

The Nobel Prize has been among the most prestigious international awards since 1901. Each year, awards are bestowed in chemistry, literature, physics, physiology or medicine, economics, and peace. In addition to the honor, prestige, and substantial prize money, the recipient also gets a gold medal with an image of Alfred Nobel (1833 - 1896), who established the prize.

The Nobel Foundation has made available a dataset of all prize winners from the first awards in 1901 through 2023. The dataset used in this project comes from the Nobel Prize API and is stored in the `nobel.csv` file in the `data` folder.

In this project, you'll explore and answer several questions about this prize-winning data. We then encourage you to explore further questions that interest you!

In [99]:
# Loading in required libraries
import pandas as pd
import seaborn as sns
import numpy as np

# Reading in the Nobel Prize data
nobel = pd.read_csv('data/nobel.csv')

# Taking a look at the first several winners
nobel.head()

Unnamed: 0,year,category,prize,motivation,prize_share,laureate_id,laureate_type,full_name,birth_date,birth_city,birth_country,sex,organization_name,organization_city,organization_country,death_date,death_city,death_country
0,1901,Chemistry,The Nobel Prize in Chemistry 1901,"""in recognition of the extraordinary services ...",1/1,160,Individual,Jacobus Henricus van 't Hoff,1852-08-30,Rotterdam,Netherlands,Male,Berlin University,Berlin,Germany,1911-03-01,Berlin,Germany
1,1901,Literature,The Nobel Prize in Literature 1901,"""in special recognition of his poetic composit...",1/1,569,Individual,Sully Prudhomme,1839-03-16,Paris,France,Male,,,,1907-09-07,Châtenay,France
2,1901,Medicine,The Nobel Prize in Physiology or Medicine 1901,"""for his work on serum therapy, especially its...",1/1,293,Individual,Emil Adolf von Behring,1854-03-15,Hansdorf (Lawice),Prussia (Poland),Male,Marburg University,Marburg,Germany,1917-03-31,Marburg,Germany
3,1901,Peace,The Nobel Peace Prize 1901,,1/2,462,Individual,Jean Henry Dunant,1828-05-08,Geneva,Switzerland,Male,,,,1910-10-30,Heiden,Switzerland
4,1901,Peace,The Nobel Peace Prize 1901,,1/2,463,Individual,Frédéric Passy,1822-05-20,Paris,France,Male,,,,1912-06-12,Paris,France


## What is the most commonly awarded gender and birth country?

In [100]:
# Get the value counts of the columns and select the first index
top_country = nobel['birth_country'].value_counts().index[0]
top_gender = nobel['sex'].value_counts().index[0]

print('The most common gender: ' + top_gender + '\nThe most common country: ' + top_country)

The most common gender: Male
The most common country: United States of America
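The `.index[0]` lookup above relies on `value_counts()` returning counts in descending order. As a minimal sketch on hypothetical toy data (the rows below are invented stand-ins, not records from `nobel.csv`), `idxmax()` gives the same answer more explicitly, and `normalize=True` shows each value's share. Rows with missing values are skipped by default.

```python
import pandas as pd

# Hypothetical toy data standing in for nobel.csv (not real laureate records)
toy = pd.DataFrame({
    "sex": ["Male", "Male", "Female", "Male", None],
    "birth_country": ["USA", "France", "USA", "USA", None],
})

# value_counts() sorts by count (descending) and ignores missing values by default;
# normalize=True converts the counts to proportions of the non-missing rows
sex_share = toy["sex"].value_counts(normalize=True)

# idxmax() returns the label with the largest count, equivalent to .index[0] here
top_sex = toy["sex"].value_counts().idxmax()
top_country = toy["birth_country"].value_counts().idxmax()

print(top_sex, top_country)
print(sex_share.to_dict())
```

Reporting the share alongside the label can be more informative than the label alone, since it shows how dominant the most common value is.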


## Which decade had the highest proportion of US-born winners?

In [101]:
# Creating the is_us_born_winner column, which stores a boolean indicating whether the recipient was born in the United States
nobel['is_us_born_winner'] = nobel['birth_country'] == 'United States of America'

# Creating the decade column, which contains the decade in which each prize was received
nobel['decade'] = (np.floor(nobel['year'] / 10) * 10).astype(int)

# Sorting the proportion of USA winners per decade so that it is in descending order
prop_usa_winners = nobel.groupby('decade', as_index=False)['is_us_born_winner'].mean().sort_values('is_us_born_winner', ascending=False)

# Storing the decade with the maximum proportion of USA recipients
max_decade_usa = prop_usa_winners.iloc[0]['decade'].astype(int)

max_decade_usa

2000
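The decade calculation and the group-wise mean can be checked on a small example. The sketch below uses hypothetical toy rows and swaps in integer floor division (`// 10 * 10`) for the `np.floor` approach in the cell above; both should yield the same decades for integer years. Taking the mean of a boolean column gives the proportion of `True` values, and `idxmax()` returns the decade with the highest proportion without needing a sort.

```python
import pandas as pd

# Hypothetical toy data (not real laureate records)
toy = pd.DataFrame({
    "year": [1901, 1905, 1912, 1915, 1918, 2001],
    "is_us_born_winner": [False, True, False, False, True, True],
})

# Integer floor division truncates each year to the start of its decade
toy["decade"] = toy["year"] // 10 * 10

# The mean of a boolean column is the proportion of True values per group
prop = toy.groupby("decade")["is_us_born_winner"].mean()

# idxmax() returns the index label (the decade) with the highest proportion
best_decade = int(prop.idxmax())

print(prop)
print(best_decade)
```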

## What decade and category pair had the highest proportion of female laureates?

In [102]:
# Create a column of boolean values indicating whether or not the recipient was female
nobel['is_female'] = nobel['sex'] == 'Female'

# Create a DataFrame grouped by decade and category containing the proportion of female laureates, sorted in descending order
prop_female_decade_category = nobel.groupby(['decade', 'category'], as_index=False)['is_female'].mean().sort_values(by='is_female', ascending=False)

# Create a dictionary containing the decade and category in which females had the greatest proportion of laureates 
max_female_dict = {prop_female_decade_category.iloc[0]['decade'] : prop_female_decade_category.iloc[0]['category']}

# Re-extract values for use in answer string 
decade, category = list(max_female_dict.items())[0]

'In the decade ' + str(decade) + ' females had the greatest proportion of laureates compared to males they have ever had. This occurred in the category of ' + category + '.'

'In the decade 2020 females had the greatest proportion of laureates compared to males they have ever had. This occurred in the category of Literature.'
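When grouping by two keys, `idxmax()` on the resulting Series returns a `(decade, category)` tuple directly, which avoids the round trip through a dictionary. The sketch below uses hypothetical toy rows and also reports the group size, on the assumption that a proportion computed from only a handful of awards deserves some caution when interpreted.

```python
import pandas as pd

# Hypothetical toy data (not real laureate records)
toy = pd.DataFrame({
    "decade": [2010, 2010, 2010, 2020, 2020, 2020],
    "category": ["Peace", "Peace", "Physics", "Literature", "Literature", "Physics"],
    "is_female": [True, False, False, True, True, False],
})

# Compute the female proportion ("mean") and the number of awards ("size") per group
summary = toy.groupby(["decade", "category"])["is_female"].agg(["mean", "size"])

# With a two-level index, idxmax() returns a (decade, category) tuple
best_decade, best_category = summary["mean"].idxmax()

print(summary)
print(best_decade, best_category)
```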

## Who was the first woman to receive a Nobel Prize, and in what category?

In [103]:
# Subset nobel df for female laureates
female_laureates = nobel[nobel['is_female']]

# Find the first female to receive a Nobel Prize by finding the row with the min year
first_female_row = female_laureates[female_laureates['year'] == female_laureates['year'].min()]

# Extract the name of the first female laureate 
first_woman_name = first_female_row.at[first_female_row.index[0], 'full_name']

# Extract the category in which the first female laureate received their award
first_woman_category = first_female_row.at[first_female_row.index[0], 'category']

f'The first woman ever to become a Nobel Laureate was none other than {first_woman_name}. She received her award in the field of {first_woman_category} for her research of radiation phenomena.'

'The first woman ever to become a Nobel Laureate was none other than Marie Curie, née Sklodowska. She received her award in the field of Physics for her research of radiation phenomena.'
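The earliest row can also be selected in one step with `idxmin()`, which returns the index label of the smallest year. This is a minimal sketch on hypothetical toy rows (the names are placeholders). If several rows share the minimum year, `idxmin()` returns the first of them, whereas the boolean filter used above keeps them all.

```python
import pandas as pd

# Hypothetical toy data (placeholder names, not real laureate records)
toy = pd.DataFrame({
    "year": [1901, 1903, 1905, 1911],
    "full_name": ["Laureate A", "Laureate B", "Laureate C", "Laureate B"],
    "category": ["Chemistry", "Physics", "Peace", "Chemistry"],
    "is_female": [False, True, True, True],
})

# Keep only the female laureates
women = toy[toy["is_female"]]

# idxmin() gives the index label of the earliest year; .loc fetches that row
first = women.loc[women["year"].idxmin()]

print(first["full_name"], first["category"])
```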

## Which individuals or organizations have won multiple Nobel Prizes throughout the years?

In [104]:
# Count the number of prizes awarded to each full_name
num_prizes_series = nobel['full_name'].value_counts()

# Subset the pandas Series for winners who have received more than one Nobel Prize
repeat_winners = num_prizes_series[num_prizes_series > 1]

# Create a list of the index labels (the repeat winners' names)
repeat_list = repeat_winners.index.to_list()

print('The following are people and organizations who have won more than one Nobel Prize:\n')
for i, item in enumerate(repeat_list):
    print(f'\t{i+1}. {item}')

The following are people and organizations who have won more than one Nobel Prize:

	1. Comité international de la Croix Rouge (International Committee of the Red Cross)
	2. Linus Carl Pauling
	3. John Bardeen
	4. Frederick Sanger
	5. Marie Curie, née Sklodowska
	6. Office of the United Nations High Commissioner for Refugees (UNHCR)
