# Nobel Prize Winners

The Nobel Prize has been among the most prestigious international awards since 1901. Each year, awards are bestowed in chemistry, literature, physics, physiology or medicine, economics, and peace. In addition to the honor, prestige, and substantial prize money, the recipient also gets a gold medal with an image of Alfred Nobel (1833 - 1896), who established the prize.

![](Nobel_Prize.png)

The Nobel Foundation has made available a dataset of all prize winners, from the first awards in 1901 through 2023. The dataset used in this project comes from the Nobel Prize API and is available in the `nobel.csv` file in the `data` folder.

In this project, you'll explore and answer several questions about this prizewinning data. We then encourage you to explore any further questions that interest you!

In [322]:
# Loading in required libraries
import pandas as pd
import seaborn as sns
import numpy as np

# Start coding here!

In [323]:
df = pd.read_csv('nobel.csv')

In [324]:
df.head()

,year,category,prize,motivation,prize_share,laureate_id,laureate_type,full_name,birth_date,birth_city,birth_country,sex,organization_name,organization_city,organization_country,death_date,death_city,death_country
0,1901,Chemistry,The Nobel Prize in Chemistry 1901,"""in recognition of the extraordinary services ...",1/1,160,Individual,Jacobus Henricus van 't Hoff,1852-08-30,Rotterdam,Netherlands,Male,Berlin University,Berlin,Germany,1911-03-01,Berlin,Germany
1,1901,Literature,The Nobel Prize in Literature 1901,"""in special recognition of his poetic composit...",1/1,569,Individual,Sully Prudhomme,1839-03-16,Paris,France,Male,,,,1907-09-07,Châtenay,France
2,1901,Medicine,The Nobel Prize in Physiology or Medicine 1901,"""for his work on serum therapy, especially its...",1/1,293,Individual,Emil Adolf von Behring,1854-03-15,Hansdorf (Lawice),Prussia (Poland),Male,Marburg University,Marburg,Germany,1917-03-31,Marburg,Germany
3,1901,Peace,The Nobel Peace Prize 1901,,1/2,462,Individual,Jean Henry Dunant,1828-05-08,Geneva,Switzerland,Male,,,,1910-10-30,Heiden,Switzerland
4,1901,Peace,The Nobel Peace Prize 1901,,1/2,463,Individual,Frédéric Passy,1822-05-20,Paris,France,Male,,,,1912-06-12,Paris,France


In [325]:
df['category'].unique()

array(['Chemistry', 'Literature', 'Medicine', 'Peace', 'Physics',
       'Economics'], dtype=object)

In [326]:
df.columns

Index(['year', 'category', 'prize', 'motivation', 'prize_share', 'laureate_id',
       'laureate_type', 'full_name', 'birth_date', 'birth_city',
       'birth_country', 'sex', 'organization_name', 'organization_city',
       'organization_country', 'death_date', 'death_city', 'death_country'],
      dtype='object')

In [327]:
df.shape

(1000, 18)

In [328]:
df = df.drop_duplicates()

In [329]:
df.shape

(1000, 18)

In [330]:
df['sex'].value_counts(sort=True)

Male      905
Female     65
Name: sex, dtype: int64
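
The two counts sum to 970, not the 1,000 rows reported by `df.shape`, because `value_counts` drops missing values by default. A minimal sketch of how to surface them, using a made-up Series (`toy` is illustrative and not taken from `nobel.csv`):

```python
import pandas as pd

# Hypothetical stand-in for df['sex']; the values are made up.
toy = pd.Series(["Male", "Female", None, "Male", None], name="sex")

# dropna=False keeps missing values as their own row in the counts.
counts = toy.value_counts(dropna=False)

# isna().sum() counts the missing entries directly.
n_missing = int(toy.isna().sum())

print(counts)
print(n_missing)
```

Running the same two calls on `df['sex']` would show how many rows have no recorded sex.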

In [331]:
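# Most common value in the sex column, read from the counts instead of typed by hand
top_gender = df['sex'].value_counts().idxmax()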

In [332]:
df['birth_country'].value_counts(sort = True).head(10)

United States of America    291
United Kingdom               91
Germany                      67
France                       58
Sweden                       30
Japan                        28
Canada                       21
Switzerland                  19
Netherlands                  19
Italy                        18
Name: birth_country, dtype: int64

In [333]:
# Most common birth country, read from the counts instead of typed by hand
top_country = df['birth_country'].value_counts().idxmax()

In [334]:
df['us_born'] = 0
df['non_us_born'] = 0

In [335]:
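# Flag US-born winners without a row loop or chained assignment.
# Rows with a missing birth_country count as non-US-born, as in a row-by-row check.
is_us = df['birth_country'] == 'United States of America'
df['us_born'] = is_us.astype(int)
df['non_us_born'] = (~is_us).astype(int)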

In [336]:
df.head(25)

,year,category,prize,motivation,prize_share,laureate_id,laureate_type,full_name,birth_date,birth_city,birth_country,sex,organization_name,organization_city,organization_country,death_date,death_city,death_country,us_born,non_us_born
0,1901,Chemistry,The Nobel Prize in Chemistry 1901,"""in recognition of the extraordinary services ...",1/1,160,Individual,Jacobus Henricus van 't Hoff,1852-08-30,Rotterdam,Netherlands,Male,Berlin University,Berlin,Germany,1911-03-01,Berlin,Germany,0,1
1,1901,Literature,The Nobel Prize in Literature 1901,"""in special recognition of his poetic composit...",1/1,569,Individual,Sully Prudhomme,1839-03-16,Paris,France,Male,,,,1907-09-07,Châtenay,France,0,1
2,1901,Medicine,The Nobel Prize in Physiology or Medicine 1901,"""for his work on serum therapy, especially its...",1/1,293,Individual,Emil Adolf von Behring,1854-03-15,Hansdorf (Lawice),Prussia (Poland),Male,Marburg University,Marburg,Germany,1917-03-31,Marburg,Germany,0,1
3,1901,Peace,The Nobel Peace Prize 1901,,1/2,462,Individual,Jean Henry Dunant,1828-05-08,Geneva,Switzerland,Male,,,,1910-10-30,Heiden,Switzerland,0,1
4,1901,Peace,The Nobel Peace Prize 1901,,1/2,463,Individual,Frédéric Passy,1822-05-20,Paris,France,Male,,,,1912-06-12,Paris,France,0,1
5,1901,Physics,The Nobel Prize in Physics 1901,"""in recognition of the extraordinary services ...",1/1,1,Individual,Wilhelm Conrad Röntgen,1845-03-27,Lennep (Remscheid),Prussia (Germany),Male,Munich University,Munich,Germany,1923-02-10,Munich,Germany,0,1
6,1902,Chemistry,The Nobel Prize in Chemistry 1902,"""in recognition of the extraordinary services ...",1/1,161,Individual,Hermann Emil Fischer,1852-10-09,Euskirchen,Prussia (Germany),Male,Berlin University,Berlin,Germany,1919-07-15,Berlin,Germany,0,1
7,1902,Literature,The Nobel Prize in Literature 1902,"""the greatest living master of the art of hist...",1/1,571,Individual,Christian Matthias Theodor Mommsen,1817-11-30,Garding,Schleswig (Germany),Male,,,,1903-11-01,Charlottenburg,Germany,0,1
8,1902,Medicine,The Nobel Prize in Physiology or Medicine 1902,"""for his work on malaria, by which he has show...",1/1,294,Individual,Ronald Ross,1857-05-13,Almora,India,Male,University College,Liverpool,United Kingdom,1932-09-16,Putney Heath,United Kingdom,0,1
9,1902,Peace,The Nobel Peace Prize 1902,,1/2,464,Individual,Élie Ducommun,1833-02-19,Geneva,Switzerland,Male,,,,1906-12-07,Bern,Switzerland,0,1


In [337]:
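# .copy() makes df_x independent of df, so adding columns later does not raise SettingWithCopyWarning
df_x = df[['year', 'us_born', 'non_us_born']].copy()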

In [338]:
df_x.head()

,year,us_born,non_us_born
0,1901,0,1
1,1901,0,1
2,1901,0,1
3,1901,0,1
4,1901,0,1


In [339]:
df_x['decade'] = (df_x['year'] // 10) * 10

In [340]:
df_x.head(10)

,year,us_born,non_us_born,decade
0,1901,0,1,1900
1,1901,0,1,1900
2,1901,0,1,1900
3,1901,0,1,1900
4,1901,0,1,1900
5,1901,0,1,1900
6,1902,0,1,1900
7,1902,0,1,1900
8,1902,0,1,1900
9,1902,0,1,1900


In [341]:
df_x = df_x[['us_born', 'non_us_born', 'decade']]

In [342]:
df_x = df_x.groupby('decade').sum()

In [343]:
df_x['ratio'] = round(df_x['us_born'] / df_x['non_us_born'], 2)

In [344]:
df_x

decade,us_born,non_us_born,ratio
1900,1,56,0.02
1910,3,37,0.08
1920,4,50,0.08
1930,14,42,0.33
1940,13,30,0.43
1950,21,51,0.41
1960,21,58,0.36
1970,33,71,0.46
1980,31,66,0.47
1990,42,62,0.68


In [345]:
print(df_x['ratio'].max())

0.73


In [346]:
# Decade (the index label of df_x) with the highest US-born ratio
max_decade_usa = df_x['ratio'].idxmax()
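
The `ratio` column above divides US-born winners by non-US-born winners, which is not the same as the US-born share of all winners (although both peak in the same decade, since the ratio rises whenever the share does). A minimal sketch of the share-based variant, assuming a small made-up frame (`toy` and its rows are illustrative, not taken from `nobel.csv`):

```python
import pandas as pd

# Hypothetical toy data standing in for nobel.csv (values are made up).
toy = pd.DataFrame({
    "year": [1901, 1905, 1912, 1913, 1918, 1921, 1924],
    "birth_country": [
        "France", "United States of America", "Germany",
        "United States of America", "United States of America",
        "Sweden", None,
    ],
})

# Boolean flag; a missing birth_country compares as False (non-US-born).
toy["us_born"] = toy["birth_country"] == "United States of America"
toy["decade"] = (toy["year"] // 10) * 10

# The mean of a boolean column is the share of True values in each group.
us_share = toy.groupby("decade")["us_born"].mean()

# idxmax returns the index label (the decade) with the largest share.
best_decade = us_share.idxmax()

print(us_share)
print(best_decade)
```

Because the share is a mean over every row in the decade, no separate `non_us_born` column is needed.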

In [347]:
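# .copy() makes df_y independent of df, so adding columns later does not raise SettingWithCopyWarning
df_y = df[['year', 'category', 'sex']].copy()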

In [348]:
df_y['decade'] = (df_y['year'] // 10) * 10

In [349]:
df_y.head()

,year,category,sex,decade
0,1901,Chemistry,Male,1900
1,1901,Literature,Male,1900
2,1901,Medicine,Male,1900
3,1901,Peace,Male,1900
4,1901,Peace,Male,1900


In [350]:
df_y = df_y.drop('year', axis=1)

In [351]:
df_z = df_y.groupby(['decade', 'category'])['sex'].value_counts(normalize=True)

In [352]:
df_k = df_y.groupby(['decade', 'category'])['sex']

In [353]:
df_k

<pandas.core.groupby.generic.SeriesGroupBy object at 0x7faafd57ba60>

In [354]:
df_z

decade  category    sex   
1900    Chemistry   Male      1.000000
        Literature  Male      0.900000
                    Female    0.100000
        Medicine    Male      1.000000
        Peace       Male      0.923077
                                ...   
2020    Medicine    Female    0.125000
        Peace       Female    0.500000
                    Male      0.500000
        Physics     Male      0.833333
                    Female    0.166667
Name: sex, Length: 110, dtype: float64

In [355]:
female_winners = df_z.xs('Female', level='sex').sort_values(ascending=False)

In [356]:
female_winners = pd.DataFrame(female_winners)

In [357]:
female_winners.reset_index(inplace=True)

In [358]:
female_winners.head(5)

,decade,category,sex
0,2020,Peace,0.5
1,2020,Literature,0.5
2,2010,Peace,0.5
3,2010,Literature,0.3
4,1990,Literature,0.3


In [359]:
# Rows 0-2 tie at 0.5 in the output above; this picks row 1 (2020, Literature) by position
key = female_winners.iloc[1, 0]
value = female_winners.iloc[1, 1]

max_female_dict = {key: value}

In [360]:
max_female_dict

{2020: 'Literature'}
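
`value_counts(normalize=True)` on `sex` skips rows where `sex` is missing, so those rows never enter the denominator, and picking a row by position depends on how ties happen to be ordered. One possible alternative, sketched here on made-up rows (`toy`, `female_winner`, and `top_pairs` are illustrative names), flags female winners with a boolean column, takes the group mean so that every row counts, and lists every group that reaches the maximum:

```python
import pandas as pd

# Hypothetical toy data (not taken from nobel.csv).
toy = pd.DataFrame({
    "year": [2020, 2021, 2022, 2023, 2020, 2021, 2022, 2015],
    "category": ["Literature", "Literature", "Literature", "Literature",
                 "Peace", "Peace", "Peace", "Peace"],
    "sex": ["Female", "Male", "Female", "Male",
            None, "Male", "Female", "Male"],
})

toy["decade"] = (toy["year"] // 10) * 10
# A missing sex compares as False, so such rows stay in the denominator.
toy["female_winner"] = toy["sex"] == "Female"

# Share of female winners per (decade, category) pair.
share = toy.groupby(["decade", "category"])["female_winner"].mean()

# Keep every pair that reaches the maximum, so ties are visible.
top_pairs = share[share == share.max()].index.tolist()

# Build a {decade: category} dict from the first pair with the top share.
decade, category = share.idxmax()
max_female = {int(decade): category}

print(share)
print(top_pairs)
print(max_female)
```

If `top_pairs` holds more than one entry, the choice among them is a judgment call rather than something the data decides.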

In [361]:
df_t = df[df['sex']=='Female']

In [362]:
df_t

,year,category,prize,motivation,prize_share,laureate_id,laureate_type,full_name,birth_date,birth_city,birth_country,sex,organization_name,organization_city,organization_country,death_date,death_city,death_country,us_born,non_us_born
19,1903,Physics,The Nobel Prize in Physics 1903,"""in recognition of the extraordinary services ...",1/4,6,Individual,"Marie Curie, née Sklodowska",1867-11-07,Warsaw,Russian Empire (Poland),Female,,,,1934-07-04,Sallanches,France,0,1
29,1905,Peace,The Nobel Peace Prize 1905,,1/1,468,Individual,"Baroness Bertha Sophie Felicita von Suttner, n...",1843-06-09,Prague,Austrian Empire (Czech Republic),Female,,,,1914-06-21,Vienna,Austria,0,1
51,1909,Literature,The Nobel Prize in Literature 1909,"""in appreciation of the lofty idealism, vivid ...",1/1,579,Individual,Selma Ottilia Lovisa Lagerlöf,1858-11-20,Mårbacka,Sweden,Female,,,,1940-03-16,Mårbacka,Sweden,0,1
62,1911,Chemistry,The Nobel Prize in Chemistry 1911,"""in recognition of her services to the advance...",1/1,6,Individual,"Marie Curie, née Sklodowska",1867-11-07,Warsaw,Russian Empire (Poland),Female,Sorbonne University,Paris,France,1934-07-04,Sallanches,France,0,1
128,1926,Literature,The Nobel Prize in Literature 1926,"""for her idealistically inspired writings whic...",1/1,597,Individual,Grazia Deledda,1871-09-27,"Nuoro, Sardinia",Italy,Female,,,,1936-08-15,Rome,Italy,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
982,2022,Literature,The Nobel Prize in Literature 2022,"""for the courage and clinical acuity with whic...",1/1,1017,Individual,Annie Ernaux,1940-09-01,Lillebonne,France,Female,,,,,,,0,1
989,2023,Medicine,The Nobel Prize in Physiology or Medicine 2023,"""for their discoveries concerning nucleoside b...",1/2,1024,Individual,Katalin Karikó,1955-01-17,Szolnok,Hungary,Female,Szeged University,Szeged,Hungary,,,,0,1
993,2023,Physics,The Nobel Prize in Physics 2023,"""for experimental methods that generate attose...",1/3,1028,Individual,Anne L’Huillier,1958-08-16,Paris,France,Female,Lund University,Lund,Sweden,,,,0,1
998,2023,Peace,The Nobel Peace Prize 2023,"""for her fight against the oppression of women...",1/1,1033,Individual,Narges Mohammadi,1972-04-21,Zanjan,Iran,Female,,,,,,,0,1


In [363]:
first_woman_name = str(df_t.iloc[0]['full_name'])

In [364]:
first_woman_category = str(df_t.iloc[0]['category'])
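
Taking `iloc[0]` works here because the rows happen to be ordered by year. A small sketch that does not rely on row order, using made-up rows (the names in `toy` are placeholders, not real laureates):

```python
import pandas as pd

# Hypothetical rows, deliberately out of chronological order.
toy = pd.DataFrame({
    "year": [1911, 1903, 1905, 1901],
    "category": ["Chemistry", "Physics", "Peace", "Physics"],
    "sex": ["Female", "Female", "Female", "Male"],
    "full_name": ["Laureate A", "Laureate B", "Laureate C", "Laureate D"],
})

women = toy[toy["sex"] == "Female"]

# idxmin gives the row label of the earliest year; loc fetches that row.
first_row = women.loc[women["year"].idxmin()]
first_name = str(first_row["full_name"])
first_category = str(first_row["category"])

print(first_name, first_category)
```

If several women share the earliest year, `idxmin` returns only the first of them.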

In [365]:
df['laureate_type'].unique()

array(['Individual', 'Organization'], dtype=object)

In [366]:
# Count appearances per name once, then keep the names that appear more than once
name_counts = df['full_name'].value_counts()
repeat_list = name_counts[name_counts > 1].index.tolist()

In [367]:
repeat_list

['Comité international de la Croix Rouge (International Committee of the Red Cross)',
 'Linus Carl Pauling',
 'John Bardeen',
 'Frederick Sanger',
 'Marie Curie, née Sklodowska',
 'Office of the United Nations High Commissioner for Refugees (UNHCR)']
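
Counting on `full_name` assumes each laureate's name is spelled identically in every row. Since the dataset also has a `laureate_id` column, a possible variant counts on that instead. This is only a sketch on made-up rows; whether spellings actually vary in `nobel.csv` is not checked here.

```python
import pandas as pd

# Hypothetical rows: id 6 appears twice with two spellings of the same name.
toy = pd.DataFrame({
    "laureate_id": [6, 6, 160, 462, 482, 482],
    "full_name": ["A. Example", "A. Example, née Sample",
                  "Laureate X", "Laureate Y", "Org Z", "Org Z"],
})

# Count rows per id, then keep the ids that occur more than once.
id_counts = toy["laureate_id"].value_counts()
repeat_ids = id_counts[id_counts > 1].index

# Report one name per repeat id (the first spelling seen).
repeat_names = (
    toy[toy["laureate_id"].isin(repeat_ids)]
    .drop_duplicates("laureate_id")["full_name"]
    .tolist()
)

print(repeat_names)
```

Counting on `full_name` alone would report only `Org Z` for these rows, because the two spellings for id 6 are treated as different names.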