The Nobel Prize has been among the most prestigious international awards since 1901. Each year, prizes are awarded in chemistry, literature, physics, physiology or medicine, economics, and peace. In addition to the honor, prestige, and substantial prize money, each recipient receives a gold medal bearing an image of Alfred Nobel (1833-1896), who established the prize.

![](Nobel_Prize.png)

The Nobel Foundation has made available a dataset of all prize winners, from the first awards in 1901 through 2023. The dataset used in this project comes from the Nobel Prize API and is available in the `nobel.csv` file in the `data` folder.

In this project, you'll get a chance to explore and answer several questions about this prizewinning data. We also encourage you to go on and explore any further questions that interest you!

In [90]:
# Loading in required libraries
import pandas as pd
import seaborn as sns
import numpy as np

# Start coding here!
df = pd.read_csv('data/nobel.csv')
df.head(50)

Unnamed: 0,year,category,prize,motivation,prize_share,laureate_id,laureate_type,full_name,birth_date,birth_city,birth_country,sex,organization_name,organization_city,organization_country,death_date,death_city,death_country
0,1901,Chemistry,The Nobel Prize in Chemistry 1901,"""in recognition of the extraordinary services ...",1/1,160,Individual,Jacobus Henricus van 't Hoff,1852-08-30,Rotterdam,Netherlands,Male,Berlin University,Berlin,Germany,1911-03-01,Berlin,Germany
1,1901,Literature,The Nobel Prize in Literature 1901,"""in special recognition of his poetic composit...",1/1,569,Individual,Sully Prudhomme,1839-03-16,Paris,France,Male,,,,1907-09-07,Châtenay,France
2,1901,Medicine,The Nobel Prize in Physiology or Medicine 1901,"""for his work on serum therapy, especially its...",1/1,293,Individual,Emil Adolf von Behring,1854-03-15,Hansdorf (Lawice),Prussia (Poland),Male,Marburg University,Marburg,Germany,1917-03-31,Marburg,Germany
3,1901,Peace,The Nobel Peace Prize 1901,,1/2,462,Individual,Jean Henry Dunant,1828-05-08,Geneva,Switzerland,Male,,,,1910-10-30,Heiden,Switzerland
4,1901,Peace,The Nobel Peace Prize 1901,,1/2,463,Individual,Frédéric Passy,1822-05-20,Paris,France,Male,,,,1912-06-12,Paris,France
5,1901,Physics,The Nobel Prize in Physics 1901,"""in recognition of the extraordinary services ...",1/1,1,Individual,Wilhelm Conrad Röntgen,1845-03-27,Lennep (Remscheid),Prussia (Germany),Male,Munich University,Munich,Germany,1923-02-10,Munich,Germany
6,1902,Chemistry,The Nobel Prize in Chemistry 1902,"""in recognition of the extraordinary services ...",1/1,161,Individual,Hermann Emil Fischer,1852-10-09,Euskirchen,Prussia (Germany),Male,Berlin University,Berlin,Germany,1919-07-15,Berlin,Germany
7,1902,Literature,The Nobel Prize in Literature 1902,"""the greatest living master of the art of hist...",1/1,571,Individual,Christian Matthias Theodor Mommsen,1817-11-30,Garding,Schleswig (Germany),Male,,,,1903-11-01,Charlottenburg,Germany
8,1902,Medicine,The Nobel Prize in Physiology or Medicine 1902,"""for his work on malaria, by which he has show...",1/1,294,Individual,Ronald Ross,1857-05-13,Almora,India,Male,University College,Liverpool,United Kingdom,1932-09-16,Putney Heath,United Kingdom
9,1902,Peace,The Nobel Peace Prize 1902,,1/2,464,Individual,Élie Ducommun,1833-02-19,Geneva,Switzerland,Male,,,,1906-12-07,Bern,Switzerland


In [91]:
# Most common value in each column; mode() can return ties, so take the first entry
top_gender = df['sex'].mode()[0]
print(top_gender)
top_country = df['birth_country'].mode()[0]
print(top_country)

Male
United States of America
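
As a cross-check on `mode()`, `value_counts()` shows the full frequency table, which makes ties and missing values easier to spot. The sketch below runs on a small made-up frame (`toy` and its values are illustrative assumptions, not rows from `nobel.csv`).

```python
import pandas as pd

# Toy data standing in for nobel.csv (values are illustrative, not real records)
toy = pd.DataFrame({
    "sex": ["Male", "Male", "Female", "Male", None],
    "birth_country": ["France", "Germany", "France", "France", "Germany"],
})

# value_counts() drops missing values by default and sorts by frequency,
# so the first index label is the most common value.
sex_counts = toy["sex"].value_counts()
country_counts = toy["birth_country"].value_counts()

top_gender_toy = sex_counts.index[0]
top_country_toy = country_counts.index[0]

print(sex_counts)
print(top_gender_toy, top_country_toy)
```

On the real data, the same two calls on `df['sex']` and `df['birth_country']` should agree with the `mode()` results as long as there is no tie for first place.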


In [92]:
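# Bucket each award year into its decade (e.g. 1987 -> 1980)
df['decade'] = (df['year'] // 10) * 10
# Share of laureates born in the USA within each decade
ratio_by_decade = (
    df.groupby('decade').apply(lambda x: (x['birth_country'] == 'United States of America').mean())
)

# Decade with the highest share of US-born laureates
max_decade_usa = int(ratio_by_decade.idxmax())
print(max_decade_usa)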

2000
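
The same per-decade share can be computed without `groupby(...).apply(...)` by first storing the comparison as a boolean column and then averaging it per group. This is a minimal sketch on made-up rows; `toy`, `usa_born`, and `best_decade` are names introduced here for illustration. Some recent pandas releases may warn about `apply` on grouped frames, which this form avoids.

```python
import pandas as pd

# Toy data standing in for nobel.csv (values are illustrative, not real records)
toy = pd.DataFrame({
    "year": [1901, 1905, 1909, 1912, 1918, 1923, 1925],
    "birth_country": [
        "France", "Germany", "United States of America",
        "United States of America", "United States of America",
        "Sweden", "United States of America",
    ],
})

# Floor each year to the start of its decade
toy["decade"] = (toy["year"] // 10) * 10

# True where the laureate was born in the USA; a missing birth country
# compares as False, so it counts as "not US-born"
toy["usa_born"] = toy["birth_country"] == "United States of America"

# The mean of a boolean column is the fraction of True values in each group
usa_share = toy.groupby("decade")["usa_born"].mean()
best_decade = int(usa_share.idxmax())

print(usa_share)
print(best_decade)
```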


In [93]:
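# Share of female laureates for every (decade, category) pair;
# idxmax() returns the pair with the highest share
max_female = df.groupby(['decade', 'category']).apply(lambda x: (x['sex'] == 'Female').mean()).idxmax()
print(max_female)
# Store the result as {decade: category}
max_female_dict = {max_female[0]: max_female[1]}
print(max_female_dict)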

(2020, 'Literature')
{2020: 'Literature'}
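
A proportion on its own hides how many laureates it is based on, and a recent decade covers only a few award years. The sketch below reports the group size next to the female share, using made-up rows; `toy`, `is_female`, `share`, and `n` are illustrative names, not part of the dataset.

```python
import pandas as pd

# Toy data standing in for nobel.csv (values are illustrative, not real records)
toy = pd.DataFrame({
    "decade": [1900, 1900, 1900, 1910, 1910, 1910],
    "category": ["Physics", "Physics", "Peace", "Peace", "Chemistry", "Chemistry"],
    "sex": ["Female", "Male", "Female", "Male", "Female", "Male"],
})

toy["is_female"] = toy["sex"] == "Female"

# Named aggregation: female share and number of laureates per (decade, category)
summary = toy.groupby(["decade", "category"])["is_female"].agg(share="mean", n="size")

# idxmax() on a MultiIndex-ed Series returns a (decade, category) tuple
best_decade, best_category = summary["share"].idxmax()
result = {int(best_decade): best_category}

print(summary)
print(result)
```

Here the winning pair has a share of 1.0 but rests on a single laureate, which is the kind of case the `n` column is meant to expose.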


In [94]:
# Keep only female laureates
female_laureates = df[df['sex'] == 'Female']
print(female_laureates.head())
# Earliest year in which a woman won a prize
earliest_year = female_laureates['year'].min()
print(earliest_year)
first_female_winners = female_laureates[female_laureates['year'] == earliest_year]
print(first_female_winners)
first_woman_name = first_female_winners['full_name'].iloc[0]
print(first_woman_name)
first_woman_category = first_female_winners['category'].iloc[0]
print(first_woman_category)

     year    category  ... death_country decade
19   1903     Physics  ...        France   1900
29   1905       Peace  ...       Austria   1900
51   1909  Literature  ...        Sweden   1900
62   1911   Chemistry  ...        France   1910
128  1926  Literature  ...         Italy   1920

[5 rows x 19 columns]
1903
    year category  ... death_country decade
19  1903  Physics  ...        France   1900

[1 rows x 19 columns]
Marie Curie, née Sklodowska
Physics
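
A more compact way to find the earliest female laureate is to look up the row at `idxmin()` of the `year` column. This is a sketch on made-up rows (`toy` and the "Laureate A/B/C" names are placeholders, not real records). `idxmin()` returns only the first matching row, so if several women had won in the earliest year, the others would be dropped silently.

```python
import pandas as pd

# Toy data standing in for nobel.csv (values are illustrative, not real records)
toy = pd.DataFrame({
    "year": [1901, 1903, 1905, 1911],
    "category": ["Physics", "Physics", "Peace", "Chemistry"],
    "sex": ["Male", "Female", "Female", "Female"],
    "full_name": ["Laureate A", "Laureate B", "Laureate C", "Laureate B"],
})

women = toy[toy["sex"] == "Female"]

# idxmin() gives the index label of the smallest year; .loc fetches that row
first_row = women.loc[women["year"].idxmin()]
first_name = first_row["full_name"]
first_category = first_row["category"]

print(first_name, first_category)
```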


In [95]:
# Number of awards per laureate name
counts = df['full_name'].value_counts()
print(counts)
# Names that appear two or more times
repeat_list = [name for name, cnt in counts.items() if cnt >= 2]
print(repeat_list)

Comité international de la Croix Rouge (International Committee of the Red Cross)    3
Linus Carl Pauling                                                                   2
John Bardeen                                                                         2
Frederick Sanger                                                                     2
Marie Curie, née Sklodowska                                                          2
                                                                                    ..
Karl Ziegler                                                                         1
Giulio Natta                                                                         1
Giorgos Seferis                                                                      1
Sir John Carew Eccles                                                                1
Claudia Goldin                                                                       1
Name: full_name, Length: 993, dtype: int64
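
The list comprehension over `counts.items()` can also be written as a boolean mask on the counts. Because the dataset has a `laureate_id` column, counting by ID is a possible safeguard against two different laureates sharing a name; whether that matters for `nobel.csv` is an assumption, not something the output above shows. The rows below are made up for illustration.

```python
import pandas as pd

# Toy data standing in for nobel.csv (values are illustrative, not real records)
toy = pd.DataFrame({
    "laureate_id": [1, 2, 1, 3, 2, 1, 4],
    "full_name": ["Org X", "Person A", "Org X", "Person B",
                  "Person A", "Org X", "Person C"],
})

# Boolean mask on the counts: keep names that occur at least twice
counts = toy["full_name"].value_counts()
repeat_list = counts[counts >= 2].index.tolist()

# Alternative keyed on laureate_id, then mapped back to names
id_counts = toy["laureate_id"].value_counts()
repeat_ids = id_counts[id_counts >= 2].index
repeat_by_id = (
    toy.loc[toy["laureate_id"].isin(repeat_ids), "full_name"].unique().tolist()
)

print(repeat_list)
print(repeat_by_id)
```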
