<a href="https://colab.research.google.com/github/Coder-Akshat008/Data_Science_Projects/blob/main/people_on_banknotes.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# People on Banknotes

Whose faces appear on banknotes?

The file `people-on-banknotes.csv` contains data about individuals featured on banknotes from 38 countries. This dataset spans all 22 subregions and sub-subregions of the world, as defined by the United Nations Statistics Division's geoscheme.

It profiles 241 people, detailing their occupations and the year they first appeared on a banknote. Additionally, it includes their year of death — or `NaN` if they were still alive when the dataset was compiled.

Most banknotes were issued after the featured individual’s death. The column `first_death_diff` holds the year of first appearance minus the year of death, so a negative value means the person appeared on a banknote while still alive. It is `NaN` if the person was still living when the dataset was compiled.
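As a minimal sketch of how `first_death_diff` relates to the other two columns, the snippet below rebuilds it for three rows copied from this notebook's outputs plus one made-up living person. The frame name `sample` and the living-person row are illustrative assumptions, not part of the dataset.

```python
import pandas as pd

# Three rows copied from the notebook outputs, plus one hypothetical living person
sample = pd.DataFrame({
    "name": ["Eva Perón", "Manuel Belgrano", "Soekarno", "Hypothetical living person"],
    "first_appearance": [2012, 1970, 1945, 2020],
    "death": [1952, 1820, 1970, None],  # None becomes NaN in a numeric column
})

# Year of first appearance minus year of death; NaN propagates for the living
sample["first_death_diff"] = sample["first_appearance"] - sample["death"]
print(sample)
```

Soekarno's negative value reflects a banknote issued before his death, and the missing death year carries through as `NaN`.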




### Importing Data

In [10]:
import pandas as pd

df = pd.read_csv('/content/people-on-banknotes.csv')
df.head(10)

Unnamed: 0,country,currency,name,gender,occupation,value,first_appearance,death,first_death_diff,currency_code
0,Argentina,Argentine Peso,Eva Perón,F,Activist,100,2012,1952,60.0,ARS
1,Argentina,Argentine Peso,Julio Argentino Roca,M,Head of Gov't,100,1988,1914,74.0,ARS
2,Argentina,Argentine Peso,Domingo Faustino Sarmiento,M,Head of Gov't,50,1999,1888,111.0,ARS
3,Argentina,Argentine Peso,Juan Manuel de Rosas,M,Politician,20,1992,1877,115.0,ARS
4,Argentina,Argentine Peso,Manuel Belgrano,M,Founder,10,1970,1820,150.0,ARS
5,Australia,Australian Dollar,David Unaipon,M,STEM,50,1995,1967,28.0,AUD
6,Australia,Australian Dollar,Mary Gilmore,F,Writer,10,1993,1962,31.0,AUD
7,Australia,Australian Dollar,Reverend John Flynn,M,Religious figure,20,1994,1951,43.0,AUD
8,Australia,Australian Dollar,Banjo Paterson,M,Writer,10,1993,1941,52.0,AUD
9,Australia,Australian Dollar,Edith Cowan,F,Politician,50,1995,1932,63.0,AUD


Number of rows in the CSV (a person who appears on several banknotes has one row per banknote)


In [8]:
len(df)

279

### Quick cleaning

The same person can appear on multiple banknotes. Below we drop the `value` column and remove duplicate people.

In [None]:
# The denomination is not needed for per-person questions
df = df.drop(columns=['value'])
# Keep only the first row for each name
df = df.drop_duplicates(subset="name")
df

Unnamed: 0,country,currency,name,gender,occupation,first_appearance,death,first_death_diff,currency_code
0,Argentina,Argentine Peso,Eva Perón,F,Activist,2012,1952,60.0,ARS
1,Argentina,Argentine Peso,Julio Argentino Roca,M,Head of Gov't,1988,1914,74.0,ARS
2,Argentina,Argentine Peso,Domingo Faustino Sarmiento,M,Head of Gov't,1999,1888,111.0,ARS
3,Argentina,Argentine Peso,Juan Manuel de Rosas,M,Politician,1992,1877,115.0,ARS
4,Argentina,Argentine Peso,Manuel Belgrano,M,Founder,1970,1820,150.0,ARS
...,...,...,...,...,...,...,...,...,...
274,Venezuela,Venezuelan Bolivar,Francisco de Miranda,M,Military,1968,1816,152.0,VES
275,Venezuela,Venezuelan Bolivar,Simón Rodrigues,M,Educator,2007,1854,153.0,VES
276,Venezuela,Venezuelan Bolivar,Ezequiel Zamora,M,Military,2018,1860,158.0,VES
277,Venezuela,Venezuelan Bolivar,Rafael Urdaneta,M,Head of Gov't,2018,1845,173.0,VES


### Project Ideas

- What proportion of individuals featured are male versus female?
	- Hint: Use `value_counts(normalize=True)` to calculate percentages.

- Are writers or politicians more commonly depicted?

- What percentage of featured individuals are musicians?

- What percentage of banknotes were issued before the person’s death?
	- Hint: Look for negative values or NaN in `first_death_diff`.

- Who is the oldest historical figure in the dataset?

- Which countries feature the oldest historical figures on their banknotes?
	- Hint: Group by country and aggregate the year of death using the median. Sort the results.

- What percentage of individuals died at least 100 years before appearing on a banknote?

- Which individuals appeared on a banknote just one year after their death?
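
The group-by hint above can be sketched as follows. This uses only the ten Argentine and Australian rows shown in the preview, so the ranking is illustrative and is not the answer for the full dataset; the names `sample` and `median_death` are assumptions for this example.

```python
import pandas as pd

# Death years for the ten rows shown by df.head(10)
sample = pd.DataFrame({
    "country": ["Argentina"] * 5 + ["Australia"] * 5,
    "death": [1952, 1914, 1888, 1877, 1820, 1967, 1962, 1951, 1941, 1932],
})

# Median year of death per country, earliest first
median_death = sample.groupby("country")["death"].median().sort_values()
print(median_death)
```

Countries at the top of the sorted result are the ones whose featured figures died longest ago, at least by this median measure.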


***What proportion of individuals featured are male versus female?***

In [None]:
df['gender'].value_counts(normalize=True)

Unnamed: 0_level_0,proportion
gender,Unnamed: 1_level_1
M,0.78853
F,0.21147
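
Because one person can occupy several rows, per-row and per-person proportions can differ. The values above look like per-row proportions (220 of 279 rows would give 0.78853), though that is an inference from the numbers rather than something the notebook states. A small sketch of the difference, using names from this notebook's outputs in an illustrative frame called `sample`:

```python
import pandas as pd

# One person appears three times, as happens with multiple denominations
sample = pd.DataFrame({
    "name": ["Bangabandhu Sheikh Mujibur Rahman"] * 3 + ["Eva Perón", "Mary Gilmore"],
    "gender": ["M", "M", "M", "F", "F"],
})

# Proportion over banknote rows
per_row = sample["gender"].value_counts(normalize=True)

# Proportion over unique people
per_person = sample.drop_duplicates(subset="name")["gender"].value_counts(normalize=True)

print(per_row)
print(per_person)
```

In this toy frame the per-row share of men is 0.6, while the per-person share is one third.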


***Which individuals appeared on a banknote just one year after their death?***

In [None]:
df.query('first_death_diff == 1')

Unnamed: 0,country,currency,name,gender,occupation,value,first_appearance,death,first_death_diff,currency_code
63,Colombia,Colombian Peso,Gabriel García Márquez,M,Writer,50000,2015,2014,1.0,COP
173,Nigeria,Nigerian Naira,General Murtala Mohammed,M,Head of Gov't,20,1977,1976,1.0,NGN
190,Philippines,Philippine Piso,Corazon C. Aquino,F,Head of Gov't,500,2010,2009,1.0,PHP
191,Philippines,Philippine Piso,Manuel A. Roxas,M,Head of Gov't,100,1949,1948,1.0,PHP


***What percentage of individuals died at least 100 years before appearing on a banknote?***

In [None]:
df.query('first_death_diff >= 100')

Unnamed: 0,country,currency,name,gender,occupation,value,first_appearance,death,first_death_diff,currency_code
2,Argentina,Argentine Peso,Domingo Faustino Sarmiento,M,Head of Gov't,50,1999,1888,111.0,ARS
3,Argentina,Argentine Peso,Juan Manuel de Rosas,M,Politician,20,1992,1877,115.0,ARS
4,Argentina,Argentine Peso,Manuel Belgrano,M,Founder,10,1970,1820,150.0,ARS
12,Australia,Australian Dollar,Mary Reibey,F,Other,20,1994,1855,139.0,AUD
24,Bolivia,Bolivian Boliviano,Pablo Zárate Willka,M,Revolutionary,50,2018,1905,113.0,BOB
...,...,...,...,...,...,...,...,...,...,...
274,Venezuela,Venezuelan Bolivar,Francisco de Miranda,M,Military,200,1968,1816,152.0,VES
275,Venezuela,Venezuelan Bolivar,Simón Rodrigues,M,Educator,20,2007,1854,153.0,VES
276,Venezuela,Venezuelan Bolivar,Ezequiel Zamora,M,Military,100,2018,1860,158.0,VES
277,Venezuela,Venezuelan Bolivar,Rafael Urdaneta,M,Head of Gov't,10,2018,1845,173.0,VES
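
The query above lists the matching rows but does not yet give a percentage. One way to get it, sketched on the five Argentine rows from the preview (the frame `sample` and the variable `pct_100_plus` are illustrative names), is to take the mean of a boolean mask:

```python
import pandas as pd

# first_death_diff for the five Argentine rows shown earlier
sample = pd.DataFrame({
    "name": ["Eva Perón", "Julio Argentino Roca", "Domingo Faustino Sarmiento",
             "Juan Manuel de Rosas", "Manuel Belgrano"],
    "first_death_diff": [60.0, 74.0, 111.0, 115.0, 150.0],
})

# The mean of a boolean Series is the fraction of True values
pct_100_plus = (sample["first_death_diff"] >= 100).mean() * 100
print(pct_100_plus)
```

Rows with a `NaN` difference compare as `False`, so living people stay in the denominator; whether that is the intended reading of the question is a judgment call.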


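***Who is the oldest historical figure in the dataset?***

The cell below finds the earliest `first_appearance` year, which identifies the oldest banknote rather than the oldest person.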

In [None]:
df['first_appearance'].min()

1869

***Oldest banknotes featuring a person, according to the dataset***


In [None]:
df.query('first_appearance == 1869')

Unnamed: 0,country,currency,name,gender,occupation,value,first_appearance,death,first_death_diff,currency_code
258,United States,US Dollar,Thomas Jefferson,M,Founder,2,1869,1826,43.0,USD
260,United States,US Dollar,George Washington,M,Founder,1,1869,1799,70.0,USD
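
The dataset has no birth year, so one possible proxy for the "oldest historical figure" is the earliest year of death. The sketch below applies that assumption to four rows taken from this notebook's outputs; the result holds only for this sample, and `sample` and `earliest` are illustrative names.

```python
import pandas as pd

# Death years copied from rows shown in this notebook
sample = pd.DataFrame({
    "name": ["Thomas Jefferson", "George Washington",
             "Manuel Belgrano", "Francisco de Miranda"],
    "death": [1826, 1799, 1820, 1816],
})

# idxmin returns the index label of the smallest death year
earliest = sample.loc[sample["death"].idxmin()]
print(earliest["name"], earliest["death"])
```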


***What percentage of banknotes were issued before the person’s death?***

In [None]:
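# Comparing against the string "NaN" never matches missing values,
# so only rows with a negative difference are selected here
df.query('first_death_diff < 0')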

Unnamed: 0,country,currency,name,gender,occupation,value,first_appearance,death,first_death_diff,currency_code
14,Bangladesh,Bangladeshi Taka,Bangabandhu Sheikh Mujibur Rahman,M,Founder,20,1972,1975,-3.0,BDT
15,Bangladesh,Bangladeshi Taka,Bangabandhu Sheikh Mujibur Rahman,M,Founder,1000,1972,1975,-3.0,BDT
16,Bangladesh,Bangladeshi Taka,Bangabandhu Sheikh Mujibur Rahman,M,Founder,500,1972,1975,-3.0,BDT
17,Bangladesh,Bangladeshi Taka,Bangabandhu Sheikh Mujibur Rahman,M,Founder,200,1972,1975,-3.0,BDT
18,Bangladesh,Bangladeshi Taka,Bangabandhu Sheikh Mujibur Rahman,M,Founder,100,1972,1975,-3.0,BDT
19,Bangladesh,Bangladeshi Taka,Bangabandhu Sheikh Mujibur Rahman,M,Founder,50,1972,1975,-3.0,BDT
20,Bangladesh,Bangladeshi Taka,Bangabandhu Sheikh Mujibur Rahman,M,Founder,10,1972,1975,-3.0,BDT
21,Bangladesh,Bangladeshi Taka,Bangabandhu Sheikh Mujibur Rahman,M,Founder,5,1972,1975,-3.0,BDT
22,Bangladesh,Bangladeshi Taka,Bangabandhu Sheikh Mujibur Rahman,M,Founder,2,1972,1975,-3.0,BDT
119,Indonesia,Indonesian Rupiah,Soekarno,M,Head of Gov't,100000,1945,1970,-25.0,IDR
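
Missing values never compare equal to anything, including the string `"NaN"`, so people who were still alive need an explicit `isna()` check. A hedged sketch of the percentage, using three values from this notebook plus one made-up living person (the frame `sample` and that last row are assumptions):

```python
import pandas as pd

sample = pd.DataFrame({
    "name": ["Soekarno", "Bangabandhu Sheikh Mujibur Rahman",
             "Eva Perón", "Hypothetical living person"],
    "first_death_diff": [-25.0, -3.0, 60.0, float("nan")],
})

# Issued before death: negative difference, or no death year at all
before_death = (sample["first_death_diff"] < 0) | sample["first_death_diff"].isna()
pct_before_death = before_death.mean() * 100
print(pct_before_death)
```

On the full data it may be worth deduplicating first, since a person with many denominations would otherwise be counted once per banknote.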


***What percentage of featured individuals are musicians?***

In [None]:
df.query('occupation == "Musician"')

Unnamed: 0,country,currency,name,gender,occupation,value,first_appearance,death,first_death_diff,currency_code
10,Australia,Australian Dollar,Nellie Melba,F,Musician,100,1996,1931,65.0,AUD
47,Cape Verde,Cape Verdean Escudo,Cesária Évora,F,Musician,2000,2014,2011,3.0,CVE
49,Cape Verde,Cape Verdean Escudo,Codé Di Dona,M,Musician,1000,2014,2010,4.0,CVE
81,Czech Republic,Czech Koruna,Ema Destinnová,F,Musician,2000,1996,1930,66.0,CZK
92,Dominican Republic,Peso Dominicano,José Rufino Reyes y Siancas,M,Musician,2000,2000,1905,95.0,DOP
106,Georgia,Georgian Lari,Zakaria Paliashvili,M,Musician,2,1995,1933,62.0,GEL
185,Peru,Peruvian Sol,María Isabel Granda y Larco,F,Musician,10,2021,1983,38.0,PEN
213,Serbia,Serbian Dinar,Stevan Stevanovic Mokranjac,M,Musician,50,2005,1914,91.0,RSD
226,Sweden,Swedish Krona,Birgit Nilsson,F,Musician,500,2016,2005,11.0,SEK
229,Sweden,Swedish Krona,Evert Taube,M,Musician,50,2015,1976,39.0,SEK
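
The table above lists the musicians; turning that into a percentage could look like the sketch below. It uses four Australian rows from this notebook in an illustrative frame named `sample`, so the figure is for the sample only.

```python
import pandas as pd

sample = pd.DataFrame({
    "name": ["Nellie Melba", "Mary Gilmore", "Banjo Paterson", "Edith Cowan"],
    "occupation": ["Musician", "Writer", "Writer", "Politician"],
})

# Share of each occupation, expressed as a percentage
pct_by_occupation = sample["occupation"].value_counts(normalize=True).mul(100)
print(pct_by_occupation["Musician"])
```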


***Are writers or politicians more commonly depicted?***

In [None]:
df[df['occupation'].isin(['Writer', 'Politician'])]['occupation'].value_counts()

Unnamed: 0_level_0,count
occupation,Unnamed: 1_level_1
Writer,45
Politician,27



***Filter the dataset to rows where the country is Israel or Argentina, then count how many times each distinct row occurs***


In [25]:
df.query('country == "Israel" or country == "Argentina"').value_counts()

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Unnamed: 4_level_0,Unnamed: 5_level_0,Unnamed: 6_level_0,Unnamed: 7_level_0,Unnamed: 8_level_0,Unnamed: 9_level_0,count
country,currency,name,gender,occupation,value,first_appearance,death,first_death_diff,currency_code,Unnamed: 10_level_1
Argentina,Argentine Peso,Domingo Faustino Sarmiento,M,Head of Gov't,50,1999,1888,111.0,ARS,1
Argentina,Argentine Peso,Eva Perón,F,Activist,100,2012,1952,60.0,ARS,1
Argentina,Argentine Peso,Juan Manuel de Rosas,M,Politician,20,1992,1877,115.0,ARS,1
Argentina,Argentine Peso,Julio Argentino Roca,M,Head of Gov't,100,1988,1914,74.0,ARS,1
Argentina,Argentine Peso,Manuel Belgrano,M,Founder,10,1970,1820,150.0,ARS,1
Israel,Israeli New Shekel,Leah Goldberg,F,Writer,100,2017,1970,47.0,ILS,1
Israel,Israeli New Shekel,Nathan Alterman,M,Writer,200,2015,1970,45.0,ILS,1
Israel,Israeli New Shekel,Rachel Bluwstein,F,Writer,20,2017,1931,86.0,ILS,1
Israel,Israeli New Shekel,Shaul Tchernichovsky,M,Writer,50,2014,1943,71.0,ILS,1
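
Calling `value_counts()` on the whole frame counts distinct rows, which is why every count above is 1. If the goal is the number of people per country, counting a single column is one alternative. A sketch using the names shown above plus one Australian row to show the filter at work (`sample` and `per_country` are illustrative names):

```python
import pandas as pd

sample = pd.DataFrame({
    "country": ["Argentina"] * 5 + ["Israel"] * 4 + ["Australia"],
    "name": ["Eva Perón", "Julio Argentino Roca", "Domingo Faustino Sarmiento",
             "Juan Manuel de Rosas", "Manuel Belgrano",
             "Leah Goldberg", "Nathan Alterman", "Rachel Bluwstein",
             "Shaul Tchernichovsky", "Edith Cowan"],
})

# Keep the two countries of interest, then count rows per country
selected = sample[sample["country"].isin(["Israel", "Argentina"])]
per_country = selected["country"].value_counts()
print(per_country)
```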
