This notebook uses [world population data from the World Bank](https://data.worldbank.org/indicator/SP.POP.TOTL).

The code below loads the data from the CSV file, skipping the first three lines of the file and using the `Country Name` column as the index.

In [1]:
import pandas as pd
data = pd.read_csv('data/WorldBank_world_population.csv', skiprows=3).set_index('Country Name')
data.head()

Country Name,Country Code,Indicator Name,Indicator Code,1960,1961,1962,1963,1964,1965,1966,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,Unnamed: 63
Aruba,ABW,"Population, total",SP.POP.TOTL,54211.0,55438.0,56225.0,56695.0,57032.0,57360.0,57715.0,...,101669.0,102046.0,102560.0,103159.0,103774.0,104341.0,104872.0,105366.0,105845.0,
Afghanistan,AFG,"Population, total",SP.POP.TOTL,8996973.0,9169410.0,9351441.0,9543205.0,9744781.0,9956320.0,10174836.0,...,29185507.0,30117413.0,31161376.0,32269589.0,33370794.0,34413603.0,35383128.0,36296400.0,37172386.0,
Angola,AGO,"Population, total",SP.POP.TOTL,5454933.0,5531472.0,5608539.0,5679458.0,5735044.0,5770570.0,5781214.0,...,23356246.0,24220661.0,25107931.0,26015780.0,26941779.0,27884381.0,28842484.0,29816748.0,30809762.0,
Albania,ALB,"Population, total",SP.POP.TOTL,1608800.0,1659800.0,1711319.0,1762621.0,1814135.0,1864791.0,1914573.0,...,2913021.0,2905195.0,2900401.0,2895092.0,2889104.0,2880703.0,2876101.0,2873457.0,2866376.0,
Andorra,AND,"Population, total",SP.POP.TOTL,13411.0,14375.0,15370.0,16412.0,17469.0,18549.0,19647.0,...,84449.0,83747.0,82427.0,80774.0,79213.0,78011.0,77297.0,77001.0,77006.0,


In [23]:
data.loc['Aruba', : ]

Country Code                    ABW
Indicator Name    Population, total
Indicator Code          SP.POP.TOTL
1960                          54211
1961                          55438
1962                          56225
1963                          56695
1964                          57032
1965                          57360
1966                          57715
1967                          58055
1968                          58386
1969                          58726
1970                          59063
1971                          59440
1972                          59840
1973                          60243
1974                          60528
1975                          60657
1976                          60586
1977                          60366
1978                          60103
1979                          59980
1980                          60096
1981                          60567
1982                          61345
1983                          62201
1984                        

In [20]:
data['aruba']

KeyError: 'aruba'
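
The `KeyError` above has two likely causes: square brackets on a DataFrame look up a column label rather than a row label, and labels are case-sensitive, so `'aruba'` does not match `'Aruba'`. A minimal sketch of both points, assuming a two-row toy frame (`toy` is a hypothetical name; its values are copied from the output of `data.head()`):

```python
import pandas as pd

# Hypothetical two-row stand-in for `data`; values copied from the head() output.
toy = pd.DataFrame(
    {"2017": [105366.0, 36296400.0], "2018": [105845.0, 37172386.0]},
    index=pd.Index(["Aruba", "Afghanistan"], name="Country Name"),
)

# Square brackets on a DataFrame look up a column label, so a row label fails.
try:
    toy["Aruba"]
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# .loc looks up a row label, and the match is case-sensitive.
aruba_row = toy.loc["Aruba"]
try:
    toy.loc["aruba"]
    lowercase_failed = False
except KeyError:
    lowercase_failed = True

print(column_lookup_failed, lowercase_failed)
print(aruba_row)
```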

# Indexing and selection with Series

`pop2018` is a Series that contains the population data for the year 2018.

Answer the following questions using indexing and selection operators.

In [2]:
pop2018 = data['2018']

In [3]:
pop2018

Country Name
Aruba                                                   1.058450e+05
Afghanistan                                             3.717239e+07
Angola                                                  3.080976e+07
Albania                                                 2.866376e+06
Andorra                                                 7.700600e+04
Arab World                                              4.197906e+08
United Arab Emirates                                    9.630959e+06
Argentina                                               4.449450e+07
Armenia                                                 2.951776e+06
American Samoa                                          5.546500e+04
Antigua and Barbuda                                     9.628600e+04
Australia                                               2.499237e+07
Austria                                                 8.847037e+06
Azerbaijan                                              9.942334e+06
Burundi              

- What is the population of the `United States`?

In [4]:
pop2018['United States']

327167434.0

- Increase that number by 1

In [7]:
pop2018['United States'] = pop2018['United States'] + 1

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


In [8]:
pop2018['United States']

327167435.0
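
The warning printed after the assignment appears because `pop2018` was obtained by selecting a column from `data`, so pandas cannot tell whether the write is meant for the Series alone or for the parent DataFrame as well. Whether the parent is actually changed can depend on the pandas version. One common way to make the intent explicit is to take an independent copy before modifying it. A minimal sketch, assuming a toy frame (`toy` and `pop` are hypothetical names; values are copied from the output of `data.head()`):

```python
import pandas as pd

# Hypothetical stand-in for `data`; values copied from the head() output.
toy = pd.DataFrame(
    {"2018": [105845.0, 37172386.0]},
    index=pd.Index(["Aruba", "Afghanistan"], name="Country Name"),
)

# .copy() gives a Series that is independent of the parent DataFrame.
pop = toy["2018"].copy()

# Modifying the copy leaves the parent untouched.
pop["Aruba"] = pop["Aruba"] + 1

print(pop["Aruba"])
print(toy.loc["Aruba", "2018"])
```

With the copy in place, the increment applies only to `pop`, and the original frame keeps its value.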

- Select the population values of the `United States`, `Canada`, and `Mexico`

In [9]:
pop2018[['United States', 'Canada', 'Mexico']]

Country Name
United States    327167435.0
Canada            37058856.0
Mexico           126190788.0
Name: 2018, dtype: float64

- Select the population values of the first five countries in the series

In [10]:
pop2018[:5]

Country Name
Aruba            105845.0
Afghanistan    37172386.0
Angola         30809762.0
Albania         2866376.0
Andorra           77006.0
Name: 2018, dtype: float64

In [16]:
pop2018.loc['Aruba':'Angola']

Country Name
Aruba            105845.0
Afghanistan    37172386.0
Angola         30809762.0
Name: 2018, dtype: float64
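
The two slices above behave differently: a positional slice such as `[:5]` excludes its end position, while a label slice with `.loc` includes its end label, which is why `'Aruba':'Angola'` returns three rows. A minimal sketch of the difference, assuming a five-row toy Series (`pop` is a hypothetical name; values are copied from the output above) and using `.iloc` to make the positional intent explicit:

```python
import pandas as pd

# Hypothetical five-row stand-in for `pop2018`; values copied from the output above.
pop = pd.Series(
    [105845.0, 37172386.0, 30809762.0, 2866376.0, 77006.0],
    index=pd.Index(
        ["Aruba", "Afghanistan", "Angola", "Albania", "Andorra"],
        name="Country Name",
    ),
    name="2018",
)

# Positional slice: positions 0, 1, 2 (the end position 3 is excluded).
first_three_by_position = pop.iloc[:3]

# Label slice: from 'Aruba' up to and including 'Angola'.
first_three_by_label = pop.loc["Aruba":"Angola"]

# Negative positions count from the end of the Series.
last_two = pop.iloc[-2:]

print(first_three_by_position.equals(first_three_by_label))
print(list(last_two.index))
```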

- Select the population values of the last five countries in the series

In [14]:
pop2018[-5:]

Country Name
Kosovo           1845300.0
Yemen, Rep.     28498687.0
South Africa    57779622.0
Zambia          17351822.0
Zimbabwe        14439018.0
Name: 2018, dtype: float64

- Select the population values that are larger than one billion

In [18]:
pop2018[pop2018 > 1e9]

Country Name
China                                           1.392730e+09
East Asia & Pacific (excluding high income)     2.081652e+09
Early-demographic dividend                      3.249141e+09
East Asia & Pacific                             2.328221e+09
High income                                     1.210312e+09
IBRD only                                       4.772284e+09
IDA & IBRD total                                6.412522e+09
IDA total                                       1.640238e+09
IDA only                                        1.084408e+09
India                                           1.352617e+09
Least developed countries: UN classification    1.009663e+09
Lower middle income                             3.022905e+09
Low & middle income                             6.383958e+09
Late-demographic dividend                       2.288666e+09
Middle income                                   5.678541e+09
OECD members                                    1.303529e+09
Post-demogr
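
The result above mixes countries with aggregate rows such as `High income` and `OECD members`, because the file stores both under `Country Name`. A boolean mask can be combined with a second condition to exclude such rows. A minimal sketch, assuming a hand-picked list of aggregate names (`aggregates` is hypothetical and far from complete; the values are the rounded figures displayed above):

```python
import pandas as pd

# Hypothetical stand-in for `pop2018`; values are the rounded figures shown above.
pop = pd.Series(
    {
        "China": 1.392730e9,
        "India": 1.352617e9,
        "High income": 1.210312e9,
        "OECD members": 1.303529e9,
        "Aruba": 105845.0,
    },
    name="2018",
)

# Hand-picked, incomplete list of aggregate rows (an assumption for illustration).
aggregates = ["High income", "OECD members"]

# Combine two boolean conditions with & and negate the membership test with ~.
mask = (pop > 1e9) & ~pop.index.isin(aggregates)
countries_over_billion = pop[mask]

print(countries_over_billion)
```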

# Indexing and selection with DataFrame

- Select data of years 2017 and 2018

- Select entries of countries `United States`, `Canada`, and `Mexico`

- Select data of countries `United States`, `Canada`, and `Mexico` in years 2017 and 2018

- Select data of years 2000 to 2018

- Select entries of countries from `Aruba` to `Azerbaijan`

- Select entries of countries that had more than a billion people in the year 2000

- Select the first five rows

- Select the first five rows and the first five columns
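
The exercises above map onto a small set of DataFrame selection patterns: a list of labels in square brackets for columns, `.loc` for label-based rows and columns, a boolean mask for conditions, and `.iloc` for positions. The following is a minimal sketch of those patterns rather than a set of answers. It assumes a five-row toy frame (`toy` is a hypothetical name; values are copied from the output of `data.head()`), so the labels, years, and threshold differ from those in the exercises:

```python
import pandas as pd

# Hypothetical stand-in for `data`; values copied from the head() output.
toy = pd.DataFrame(
    {
        "Country Code": ["ABW", "AFG", "AGO", "ALB", "AND"],
        "2016": [104872.0, 35383128.0, 28842484.0, 2876101.0, 77297.0],
        "2017": [105366.0, 36296400.0, 29816748.0, 2873457.0, 77001.0],
        "2018": [105845.0, 37172386.0, 30809762.0, 2866376.0, 77006.0],
    },
    index=pd.Index(
        ["Aruba", "Afghanistan", "Angola", "Albania", "Andorra"],
        name="Country Name",
    ),
)

# Columns by a list of labels.
years = toy[["2017", "2018"]]

# Rows by a list of labels.
rows = toy.loc[["Aruba", "Angola"]]

# Rows and columns together.
both = toy.loc[["Aruba", "Angola"], ["2017", "2018"]]

# A range of columns by label (both end labels are included).
year_range = toy.loc[:, "2016":"2018"]

# A range of rows by label (both end labels are included).
row_range = toy.loc["Aruba":"Angola"]

# Rows that satisfy a condition on one column (threshold chosen for the toy data).
large = toy[toy["2018"] > 1e7]

# First rows by position, then first rows and first columns by position.
head = toy.iloc[:5]
corner = toy.iloc[:2, :2]

print(both)
print(large)
```

The year columns are addressed as strings here because the notebook selects the 2018 column with `data['2018']`.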