# Nobel Prize Dataset Analysis

## About the Dataset
Between 1901 and 2016, the Nobel Prizes and the Prize in Economic Sciences were awarded 579 times to 911 people and organizations. The Nobel Prize is an international award administered by the Nobel Foundation in Stockholm, Sweden, and based on the fortune of Alfred Nobel, the Swedish inventor and entrepreneur. In 1968, Sveriges Riksbank established the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel, founder of the Nobel Prize. Each prize consists of a medal, a personal diploma, and a cash award.

A person or organization awarded the Nobel Prize is called a Nobel Laureate. The word "laureate" refers to the laurel wreath: in ancient Greece, laurel wreaths were awarded to victors as a sign of honor.

### Analysing the data with basic Python operations

## Read the data from the .json file

In [202]:
import json

In [205]:
# Use a with-statement so the file is closed automatically after reading
with open('./data/prize.json', 'r') as f:
    data = json.load(f)

# Show the first 3 items from the list of records
data[0:3]

[{'Year': 1901,
  'Category': 'Chemistry',
  'Prize': 'The Nobel Prize in Chemistry 1901',
  'Motivation': '"in recognition of the extraordinary services he has rendered by the discovery of the laws of chemical dynamics and osmotic pressure in solutions"',
  'Prize Share': '1/1',
  'Laureate ID': 160,
  'Laureate Type': 'Individual',
  'Full Name': "Jacobus Henricus van 't Hoff",
  'Birth Date': '1852-08-30',
  'Birth City': 'Rotterdam',
  'Birth Country': 'Netherlands',
  'Sex': 'Male',
  'Organization Name': 'Berlin University',
  'Organization City': 'Berlin',
  'Organization Country': 'Germany',
  'Death Date': '1911-03-01',
  'Death City': 'Berlin',
  'Death Country': 'Germany'},
 {'Year': 1901,
  'Category': 'Literature',
  'Prize': 'The Nobel Prize in Literature 1901',
  'Motivation': '"in special recognition of his poetic composition, which gives evidence of lofty idealism, artistic perfection and a rare combination of the qualities of both heart and intellect"',
  'Prize Share
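
Laureate names can contain accented characters, so it can help to state the file encoding explicitly when reading the JSON. The following is a minimal sketch, assuming the file is stored as UTF-8; the two records and the file name `prize_sample.json` are hypothetical stand-ins for the real data.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical two-record sample that mirrors a few of the fields shown above.
sample = [
    {"Year": 1901, "Category": "Chemistry", "Full Name": "Jacobus Henricus van 't Hoff"},
    {"Year": 1903, "Category": "Physics", "Full Name": "Marie Curie, née Sklodowska"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "prize_sample.json"
    # ensure_ascii=False writes "é" as a real UTF-8 character instead of an escape.
    path.write_text(json.dumps(sample, ensure_ascii=False), encoding="utf-8")

    # Naming the encoding avoids depending on the platform default.
    with open(path, "r", encoding="utf-8") as f:
        records = json.load(f)

print(len(records), "records with fields", sorted(records[0]))
print("Round trip preserved the names:", records == sample)
```

If the names still look garbled after this, the file itself may use a different encoding, which would need to be checked separately.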

Now let's find answers to some preliminary questions.

### Which woman was the first to receive a Nobel Prize?

In [208]:
# Field(s) to print in the alternative version below
fem = ['Full Name']

# This assumes the records are ordered by year, so the first match is the earliest
for i in data:
    if i.get('Sex') == 'Female':
        print(i.get('Full Name') + ' was the first female to have received the Nobel Prize.')
        break

# Another way of fetching the first female laureate

#for i in range(len(data)):
#    if ((data[i]['Sex'] == 'Female')):
#        for j in fem:
#            print(j , ':', data[i][j])
#        break
        


Marie Curie, née Sklodowska was the first female to have received the Nobel Prize.


Marie Curie was a Polish and naturalized-French physicist and chemist who conducted pioneering research on radioactivity. She was the first woman to win a Nobel Prize, the first person and only woman to win twice, and the only person to win a Nobel Prize in two different sciences. She was also part of the Curie family legacy of five Nobel Prizes.
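
The loop above stops at the first female record, which gives the right answer only if the records are in chronological order. As a hedged alternative, the sketch below picks the earliest year explicitly with `min`; the three sample records are illustrative and deliberately out of order.

```python
# Illustrative records, deliberately not sorted by year.
sample = [
    {"Year": 1911, "Full Name": "Marie Curie, née Sklodowska", "Sex": "Female"},
    {"Year": 1901, "Full Name": "Jacobus Henricus van 't Hoff", "Sex": "Male"},
    {"Year": 1903, "Full Name": "Marie Curie, née Sklodowska", "Sex": "Female"},
]

# Keep only female laureates, then take the record with the smallest year.
women = [record for record in sample if record.get("Sex") == "Female"]
first_woman = min(women, key=lambda record: record["Year"])

print(first_woman["Year"], "is the year of the first prize awarded to a woman in this sample")
```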

### How many laureates have come from India?
Nobel Prizes have been awarded since 1901, but India became independent only in 1947, so some laureates born before then have `British India (India)` as their birth country. Also include people whose death country is India.

Note: Print category, full name, birth country, death country and sex.

In [219]:
# Attributes to print for each matching laureate
info = ['Category', 'Full Name', 'Sex', 'Birth Country', 'Death Country']

for i in data:
    if (i.get('Birth Country') == 'India' or i.get('Death Country') == 'India' or i.get('Birth Country') == 'British India (India)'):
        for j in info:
            # Fall back to an empty string so a missing key cannot break the concatenation
            print(j + ':' + i.get(j, ''))
        print('\n')

# Another way of deriving the same result

#for i in range(len(data)):
#    if (data[i]['Birth Country'] == 'India' or data[i]['Death Country'] == 'India' or data[i]['Birth Country'] == 'British India (India)') :
#        for j in info :
#            print(j, ':', data[i][j])
#        print('\n')

Category:Medicine
Full Name:Ronald Ross
Sex:Male
Birth Country:India
Death Country:United Kingdom


Category:Literature
Full Name:Rudyard Kipling
Sex:Male
Birth Country:British India (India)
Death Country:United Kingdom


Category:Literature
Full Name:Rabindranath Tagore
Sex:Male
Birth Country:India
Death Country:India


Category:Physics
Full Name:Sir Chandrasekhara Venkata Raman
Sex:Male
Birth Country:India
Death Country:India


Category:Medicine
Full Name:Har Gobind Khorana
Sex:Male
Birth Country:India
Death Country:United States of America


Category:Peace
Full Name:Mother Teresa
Sex:Female
Birth Country:Ottoman Empire (Republic of Macedonia)
Death Country:India


Category:Economics
Full Name:Amartya Sen
Sex:Male
Birth Country:India
Death Country:


Category:Chemistry
Full Name:Venkatraman Ramakrishnan
Sex:Male
Birth Country:India
Death Country:


Category:Peace
Full Name:Kailash Satyarthi
Sex:Male
Birth Country:India
Death Country:




Some surprising results! In addition to the well-known Indian Nobel Laureates, two winners usually associated with the United Kingdom, Ronald Ross and Rudyard Kipling, were born in India. Only one winner was not born in India but died there: Mother Teresa.
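
The same three-part condition is needed again in the next question, so it may be convenient to keep it in one place. Below is a minimal sketch of such a helper; the names `INDIA_BIRTH_LABELS` and `is_india_related` are hypothetical, and the sample rows reuse values from the output above.

```python
# Birth-country labels that count as "born in India" in this dataset.
INDIA_BIRTH_LABELS = {"India", "British India (India)"}


def is_india_related(record):
    """Return True if the laureate was born in India or died in India."""
    return (record.get("Birth Country") in INDIA_BIRTH_LABELS
            or record.get("Death Country") == "India")


# Small sample built from values printed above, plus one non-matching record.
sample = [
    {"Full Name": "Rudyard Kipling", "Birth Country": "British India (India)",
     "Death Country": "United Kingdom"},
    {"Full Name": "Mother Teresa", "Birth Country": "Ottoman Empire (Republic of Macedonia)",
     "Death Country": "India"},
    {"Full Name": "Amartya Sen", "Birth Country": "India", "Death Country": ""},
    {"Full Name": "Jacobus Henricus van 't Hoff", "Birth Country": "Netherlands",
     "Death Country": "Germany"},
]

matches = [record["Full Name"] for record in sample if is_india_related(record)]
print(matches)
```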

### How many prizes per category went to people from India?

In [221]:
def unique(list1, attrib):
    """Return the distinct values of `attrib` in order of first appearance."""
    un_li = []
    for i in list1:
        if i.get(attrib) not in un_li:
            un_li.append(i.get(attrib))
    return un_li

# Call unique() to fetch the distinct Nobel Prize categories
nobel_category = unique(data, 'Category')
print(nobel_category)

# Count, per category, the laureates who were born or died in India
category = {}

for i in nobel_category:
    cat_len = 0
    for j in data:
        if (j.get('Birth Country') == 'India' or j.get('Death Country') == 'India' or j.get('Birth Country') == 'British India (India)') and (j.get('Category') == i):
            cat_len += 1
    category[i] = cat_len

print(category)

#Another way of finding the above result
#for i in nobel_category:
#    cat_len = 0
#    for j in range(len(data)) :
#        if ((data[j]['Birth Country'] == 'India') or (data[j]['Death Country'] == 'India') or (data[j]['Birth Country'] == 'British India (India)')) and (data[j]["Category"]==i):
#            cat_len += 1
#    category[i] = cat_len
        
#print(category)        



['Chemistry', 'Literature', 'Medicine', 'Peace', 'Physics', 'Economics']
{'Chemistry': 1, 'Literature': 2, 'Medicine': 2, 'Peace': 2, 'Physics': 1, 'Economics': 1}
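
The nested loop can also be expressed with `collections.Counter`, which tallies values in a single pass. The sketch below is an illustration only: it starts from the categories of the nine India-related laureates listed earlier rather than from the full dataset, and the variable names are hypothetical.

```python
from collections import Counter

# Categories of the nine India-related laureates, in the order printed earlier.
india_categories = [
    "Medicine", "Literature", "Literature", "Physics", "Medicine",
    "Peace", "Economics", "Chemistry", "Peace",
]

# Counter maps each category to the number of times it occurs.
per_category = Counter(india_categories)

for name, count in sorted(per_category.items()):
    print(name, count)
```

One difference from the loop version: a `Counter` simply omits categories with no matches instead of storing them with a count of 0.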


### Which country has produced the highest number of Nobel laureates in the `Chemistry` category?
Note: Print the country and its count of Nobel laureates.

In [255]:
# Find the unique birth countries (kept for reference; not used below)
country = unique(data, 'Birth Country')
#print(country)

# Collect the birth country of every Chemistry laureate
cont_name = []
for i in data:
    if i.get('Category') == 'Chemistry':
        cont_name.append(i.get('Birth Country'))
b = cont_name


# Second way of writing the same loop
#for i in range(len(data)):
#    if (data[i]["Category"] == "Chemistry"):
#        cont_name.append(data[i]['Birth Country'])
#b = cont_name

# Using pandas to find out which country has the most Nobel Prizes in Chemistry

import pandas as pd
p = pd.Series(b).value_counts()  # counts are sorted in descending order
print(p[0:10])
print('\n')
print("Country with maximum Nobel Prize in Chemistry category is", p.index[0] + " having", p.values[0], 'awards')

United States of America    57
Germany                     23
United Kingdom              22
France                      10
Japan                        7
Sweden                       5
Netherlands                  5
Russia                       4
Canada                       4
Austria                      4
dtype: int64


Country with maximum Nobel Prize in Chemistry category is United States of America having 57 awards


In [256]:
# Another way to find out which country has the most Nobel Prizes in Chemistry
from collections import Counter

c = Counter(b)
max_nobel_winners = max(c.values())

print("Highest count of nobel Winners:", max_nobel_winners)

# Keep every country that reaches the maximum count (there can be ties)
max_country = [name for name, count in c.items() if count == max_nobel_winners]
print("Country with highest count of nobel Winners:", max_country[0])

Highest count of nobel Winners: 57
Country with highest count of nobel Winners: United States of America
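
`Counter` also offers `most_common`, which returns the entries sorted by count and so avoids the separate `max` step. Here is a small sketch on a made-up list of birth countries (the list is hypothetical, not taken from the dataset):

```python
from collections import Counter

# Hypothetical birth countries for a handful of laureates.
birth_countries = [
    "United States of America", "Germany", "United States of America",
    "France", "Germany", "United States of America",
]

counts = Counter(birth_countries)

# most_common(1) returns a list with the single (value, count) pair ranked first.
top_country, top_count = counts.most_common(1)[0]
print(top_country, "has", top_count, "laureates in this sample")
```

When several countries share the top count, `most_common(1)` returns only one of them, so the list-comprehension approach above remains useful for reporting ties.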


### Which organization won the most Nobel Prizes in the "Physics" and "Chemistry" categories?
Note: Print the organization name and its count of Nobel Prizes.

In [263]:
organization = []
category = ['Physics', 'Chemistry']

# Append the organization name of every Physics or Chemistry laureate to a list
for i in data:
    if i.get('Category') in category:
        organization.append(i.get('Organization Name'))
o = organization

#Another way of appending Organization name in a list for Physics and Chemistry Category
#for i in range(len(data)):
#    for j in category :
#        if (data[i]['Category'] == j) :
#            organization.append(data[i]['Organization Name'])
#o= organization


# Using pandas to find the organization with the most Nobel Prizes in the Physics and Chemistry categories
import pandas as pd
p = pd.Series(o).value_counts()  # counts are sorted in descending order

print("Organization that won the maximum Nobel Prize in Physics and Chemistry category is", p.index[0] + " having", p.values[0], 'awards')

Organization that won the maximum Nobel Prize in Physics and Chemistry category is University of California having 24 awards


In [261]:
# Another way to find the organization with the most Nobel Prizes in the Physics and Chemistry categories

from collections import Counter

cont = Counter(o)
max_nobel_winners_PC = max(cont.values())

print("Highest count of nobel Winners in Physics and Chemistry :" , max_nobel_winners_PC)

# Keep every organization that reaches the maximum count (there can be ties)
max_organization = [name for name, count in cont.items() if count == max_nobel_winners_PC]
print("Organization with highest count of nobel Winners:" , max_organization[0])

Highest count of nobel Winners in Physics and Chemistry : 24
Organization with highest count of nobel Winners: University of California
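
One caveat worth checking: living laureates have an empty `Death Country` in the output shown earlier, which suggests that missing values are stored as empty strings. If `Organization Name` is also empty for unaffiliated laureates (an assumption, not verified here), those blanks would be counted like any other name. The sketch below, on hypothetical records, skips empty names before counting.

```python
from collections import Counter

# Hypothetical records; the empty strings stand for laureates without an organization.
sample = [
    {"Category": "Physics", "Organization Name": "University of California"},
    {"Category": "Chemistry", "Organization Name": ""},
    {"Category": "Chemistry", "Organization Name": "Berlin University"},
    {"Category": "Physics", "Organization Name": "University of California"},
    {"Category": "Physics", "Organization Name": ""},
    {"Category": "Physics", "Organization Name": ""},
    {"Category": "Literature", "Organization Name": "University of California"},
]

wanted = {"Physics", "Chemistry"}

# Keep the organization only when the category matches and the name is non-empty.
named_orgs = [
    record["Organization Name"]
    for record in sample
    if record.get("Category") in wanted and record.get("Organization Name")
]

top_org, top_org_count = Counter(named_orgs).most_common(1)[0]
print(top_org, top_org_count)
```

Without the non-empty check, the blank name would rank first in this sample with three occurrences.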


### What was the motivation for awarding the Nobel Prize to Marie Curie, née Sklodowska?

In [190]:
for i in data:
    # Match on the name prefix so the lookup does not depend on how the accented "é" was decoded
    if i.get('Full Name', '').startswith('Marie Curie'):
        print( 'Motivation is ' , i.get('Motivation', ''))
        print('\n')

Motivation is  "in recognition of the extraordinary services they have rendered by their joint researches on the radiation phenomena discovered by Professor Henri Becquerel"


Motivation is  "in recognition of her services to the advancement of chemistry by the discovery of the elements radium and polonium, by the isolation of radium and the study of the nature and compounds of this remarkable element"




### Who received a Nobel Prize in 1994, and in which category?
Note: Print both the category and the full name.

In [201]:
# Print the name and category of every laureate from 1994
for i in data:
    if i.get('Year') == 1994:
        full_name = i.get('Full Name')
        category = i.get('Category')
        print(full_name + ' got the Nobel Prize for ', category)
        print('\n')
    

George A. Olah got the Nobel Prize for  Chemistry


John C. Harsanyi got the Nobel Prize for  Economics


John F. Nash Jr. got the Nobel Prize for  Economics


Reinhard Selten got the Nobel Prize for  Economics


Kenzaburo Oe got the Nobel Prize for  Literature


Alfred G. Gilman got the Nobel Prize for  Medicine


Martin Rodbell got the Nobel Prize for  Medicine


Yasser Arafat got the Nobel Prize for  Peace


Shimon Peres got the Nobel Prize for  Peace


Yitzhak Rabin got the Nobel Prize for  Peace


Bertram N. Brockhouse got the Nobel Prize for  Physics


Clifford G. Shull got the Nobel Prize for  Physics
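
Since several laureates share a category, the same result can be grouped so that each category appears once. The following is a minimal sketch using `collections.defaultdict`; the sample holds only a few of the 1994 records printed above plus one record from another year, and `winners_by_category` is a hypothetical name.

```python
from collections import defaultdict

# Illustrative subset of the records (not the full dataset).
sample = [
    {"Year": 1994, "Category": "Economics", "Full Name": "John C. Harsanyi"},
    {"Year": 1994, "Category": "Economics", "Full Name": "John F. Nash Jr."},
    {"Year": 1994, "Category": "Peace", "Full Name": "Yitzhak Rabin"},
    {"Year": 1901, "Category": "Chemistry", "Full Name": "Jacobus Henricus van 't Hoff"},
]

# Each new category starts with an empty list, so names can be appended directly.
winners_by_category = defaultdict(list)
for record in sample:
    if record.get("Year") == 1994:
        winners_by_category[record["Category"]].append(record["Full Name"])

for category_name, names in winners_by_category.items():
    print(category_name + ": " + ", ".join(names))
```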


