The Nobel Prize has been among the most prestigious international awards since 1901. Each year, awards are bestowed in chemistry, literature, physics, physiology or medicine, economics, and peace. In addition to the honor, prestige, and substantial prize money, the recipient also gets a gold medal with an image of Alfred Nobel (1833 - 1896), who established the prize.

![](Nobel_Prize.png)

The Nobel Foundation has made available a dataset of all prize winners, from the first awards in 1901 through 2023. The dataset used in this project comes from the Nobel Prize API and is stored in the `nobel.csv` file in the `data` folder.

In this project, you'll explore and answer several questions about this prizewinning data. We also encourage you to go on and explore any further questions that interest you!

In [1]:
# Loading in required libraries
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

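# Load the laureate data
nobel_prize = pd.read_csv("data/nobel.csv")

print(nobel_prize.columns)

# DataFrame.count() returns the number of non-null values in each column,
# so each result below is a per-column Series rather than a single number
male_count = nobel_prize[nobel_prize['sex'] == 'Male'].count()

female_count = nobel_prize[nobel_prize['sex'] == 'Female'].count()

print(male_count)
print(female_count)

# Far more rows have sex == 'Male' (905) than sex == 'Female' (65)
top_gender = "Male"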



Index(['year', 'category', 'prize', 'motivation', 'prize_share', 'laureate_id',
       'laureate_type', 'full_name', 'birth_date', 'birth_city',
       'birth_country', 'sex', 'organization_name', 'organization_city',
       'organization_country', 'death_date', 'death_city', 'death_country'],
      dtype='object')
year                    905
category                905
prize                   905
motivation              840
prize_share             905
laureate_id             905
laureate_type           905
full_name               905
birth_date              903
birth_city              899
birth_country           904
sex                     905
organization_name       708
organization_city       707
organization_country    707
death_date              569
death_city              552
death_country           558
dtype: int64
year                    65
category                65
prize                   65
motivation              58
prize_share             65
laureate_id             65
laur
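
The per-column counts above are read by eye to set `top_gender`. A minimal sketch of deriving the same answer in code, assuming a small made-up frame `toy` in place of `nobel_prize` (its rows are placeholders, not real laureates):

```python
import pandas as pd

# Hypothetical toy data standing in for nobel_prize; not real laureates.
toy = pd.DataFrame({"sex": ["Male", "Female", "Male", None, "Male"]})

# value_counts() tallies each distinct value, skips missing entries,
# and sorts the result from most to least frequent.
sex_counts = toy["sex"].value_counts()

# The first index label is therefore the most common value.
top_gender = sex_counts.index[0]

print(sex_counts)
print(top_gender)
```

Applied to the real data, the same two lines on `nobel_prize["sex"]` would avoid hard-coding the answer.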

In [2]:
# Number of prizes per birth country, largest first
country_counts = nobel_prize.groupby("birth_country")["prize"].count()
print(country_counts.sort_values(ascending=False))

top_country = "United States of America"

birth_country
United States of America                        291
United Kingdom                                   91
Germany                                          67
France                                           58
Sweden                                           30
                                               ... 
French protectorate of Tunisia (now Tunisia)      1
Free City of Danzig (Poland)                      1
Faroe Islands (Denmark)                           1
Ethiopia                                          1
Yemen                                             1
Name: prize, Length: 129, dtype: int64
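
Rather than reading the top row off the sorted output, the label of the largest count can be looked up with `idxmax()`. A short sketch under the assumption of a made-up frame `toy` with placeholder country codes and prize labels:

```python
import pandas as pd

# Hypothetical toy data standing in for nobel_prize; the values are placeholders.
toy = pd.DataFrame({
    "birth_country": ["A", "B", "A", "C", "A", "B"],
    "prize": ["p1", "p2", "p3", "p4", "p5", "p6"],
})

# Same grouping as in the cell above: count prizes per birth country.
country_counts = toy.groupby("birth_country")["prize"].count()

# idxmax() returns the index label (the country) with the highest count.
top_country = country_counts.idxmax()
print(top_country)
```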


In [3]:
nobel_prize["USA_born"] = nobel_prize["birth_country"] == 'United States of America'

print(nobel_prize["USA_born"])

0      False
1      False
2      False
3      False
4      False
       ...  
995     True
996    False
997    False
998    False
999     True
Name: USA_born, Length: 1000, dtype: bool


In [4]:
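# Floor each year to the start of its decade, e.g. 1903 -> 1900
nobel_prize["decade"] = nobel_prize["year"] // 10 * 10

print(nobel_prize["decade"])

# The mean of a boolean column is the share of True values,
# so this is the proportion of US-born laureates in each decade
decade_count = nobel_prize.groupby("decade")["USA_born"].mean()

print(decade_count.idxmax())

# Value reported by idxmax() above
max_decade_usa = 2000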



0      1900
1      1900
2      1900
3      1900
4      1900
       ... 
995    2020
996    2020
997    2020
998    2020
999    2020
Name: decade, Length: 1000, dtype: int64
2000
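
The two steps above (flooring years to decades, then averaging a boolean flag) can be checked by hand on a tiny example. The sketch below assumes a made-up frame `toy`; only the column names and the country string come from the notebook:

```python
import pandas as pd

# Hypothetical toy data; "X" is a placeholder for any other country.
toy = pd.DataFrame({
    "year": [1901, 1909, 1995, 1999, 2003, 2008],
    "birth_country": [
        "X", "United States of America",
        "United States of America", "United States of America",
        "X", "United States of America",
    ],
})

toy["USA_born"] = toy["birth_country"] == "United States of America"

# Integer division drops the last digit; multiplying by 10 restores the scale.
toy["decade"] = toy["year"] // 10 * 10

# True counts as 1 and False as 0, so the mean is the US-born share.
share = toy.groupby("decade")["USA_born"].mean()

# Decade with the highest share, taken from the data instead of typed in.
max_decade_usa = share.idxmax()

print(share)
print(max_decade_usa)
```

Here the 1900s and 2000s each have one US-born laureate out of two (0.5), while both 1990s rows are US-born (1.0), so `idxmax()` picks 1990.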


In [5]:
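nobel_prize["female"] = nobel_prize["sex"] == 'Female'

# Share of female laureates for each (decade, category) pair
female = nobel_prize.groupby(["decade", "category"])["female"].mean()

print(female)
print(female.idxmax())

# idxmax() on the two-level index returns a (decade, category) tuple
max_female_dict = {2020: 'Literature'}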




decade  category  
1900    Chemistry     0.000000
        Literature    0.100000
        Medicine      0.000000
        Peace         0.071429
        Physics       0.076923
                        ...   
2020    Economics     0.111111
        Literature    0.500000
        Medicine      0.125000
        Peace         0.285714
        Physics       0.166667
Name: female, Length: 72, dtype: float64
(2020, 'Literature')
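
Because the grouped result has a two-level index, `idxmax()` returns a `(decade, category)` tuple, which can be unpacked straight into the dictionary. A minimal sketch, assuming a made-up frame `toy` whose values are placeholders:

```python
import pandas as pd

# Hypothetical toy data standing in for nobel_prize.
toy = pd.DataFrame({
    "decade": [2000, 2000, 2010, 2010, 2010],
    "category": ["Peace", "Physics", "Peace", "Peace", "Physics"],
    "female": [False, False, True, False, True],
})

# Share of female laureates per (decade, category) pair.
share = toy.groupby(["decade", "category"])["female"].mean()

# Unpack the tuple label of the largest share.
decade, category = share.idxmax()

# int() converts a possible NumPy integer into a plain Python int key.
max_female_dict = {int(decade): category}
print(max_female_dict)
```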


In [6]:
# Keep only female laureates and order them from earliest to latest award
first_female = nobel_prize[nobel_prize["sex"] == 'Female'].sort_values(by="year", ascending=True)

print(first_female[["year", "category", "full_name"]])

# The first row of the sorted output is the earliest female laureate
first_woman_name = "Marie Curie, née Sklodowska"
first_woman_category = "Physics"
                           


     year    category                                          full_name
19   1903     Physics                        Marie Curie, née Sklodowska
29   1905       Peace  Baroness Bertha Sophie Felicita von Suttner, n...
51   1909  Literature                      Selma Ottilia Lovisa Lagerlöf
62   1911   Chemistry                        Marie Curie, née Sklodowska
128  1926  Literature                                     Grazia Deledda
..    ...         ...                                                ...
982  2022  Literature                                       Annie Ernaux
993  2023     Physics                                    Anne L’Huillier
998  2023       Peace                                   Narges Mohammadi
989  2023    Medicine                                     Katalin Karikó
999  2023   Economics                                     Claudia Goldin

[65 rows x 3 columns]
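
The name and category can also be pulled from the earliest row directly instead of being copied from the printed table. This sketch uses `idxmin()` on the year column rather than a full sort, and assumes a made-up frame `toy` with placeholder names:

```python
import pandas as pd

# Hypothetical toy data; the names are placeholders, not real laureates.
toy = pd.DataFrame({
    "year": [1950, 1920, 1935],
    "category": ["Peace", "Physics", "Chemistry"],
    "full_name": ["Laureate C", "Laureate A", "Laureate B"],
    "sex": ["Female", "Male", "Female"],
})

women = toy[toy["sex"] == "Female"]

# idxmin() gives the index label of the smallest year;
# if several rows tie, it returns the first one encountered.
first = women.loc[women["year"].idxmin()]

first_woman_name = first["full_name"]
first_woman_category = first["category"]
print(first_woman_name, first_woman_category)
```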


In [7]:
# Count how many times each name appears among the laureates
name_count = nobel_prize["full_name"].value_counts()

# Keep only the names that appear at least twice
repeat = name_count[name_count >= 2]

print(repeat)

# The index of the filtered Series holds the repeat winners' names
repeat_list = list(repeat.index)

print(repeat_list)



Comité international de la Croix Rouge (International Committee of the Red Cross)    3
Linus Carl Pauling                                                                   2
John Bardeen                                                                         2
Frederick Sanger                                                                     2
Marie Curie, née Sklodowska                                                          2
Office of the United Nations High Commissioner for Refugees (UNHCR)                  2
Name: full_name, dtype: int64
['Comité international de la Croix Rouge (International Committee of the Red Cross)', 'Linus Carl Pauling', 'John Bardeen', 'Frederick Sanger', 'Marie Curie, née Sklodowska', 'Office of the United Nations High Commissioner for Refugees (UNHCR)']
