# World Population Analysis

Libraries:

In [1]:
import pandas as pd
import numpy as np

Reading the data:

In [2]:
df=pd.read_csv("./world_population.csv")
df.columns.name="Header"
df.index.name="Countries"

In [3]:
df.head()

Countries,Rank,CCA3,Country/Territory,Capital,Continent,2022 Population,2020 Population,2015 Population,2010 Population,2000 Population,1990 Population,1980 Population,1970 Population,Area (km²),Density (per km²),Growth Rate,World Population Percentage
0,36,AFG,Afghanistan,Kabul,Asia,41128771,38972230,33753499,28189672,19542982,10694796,12486631,10752971,652230,63.0587,1.0257,0.52
1,138,ALB,Albania,Tirana,Europe,2842321,2866849,2882481,2913399,3182021,3295066,2941651,2324731,28748,98.8702,0.9957,0.04
2,34,DZA,Algeria,Algiers,Africa,44903225,43451666,39543154,35856344,30774621,25518074,18739378,13795915,2381741,18.8531,1.0164,0.56
3,213,ASM,American Samoa,Pago Pago,Oceania,44273,46189,51368,54849,58230,47818,32886,27075,199,222.4774,0.9831,0.0
4,203,AND,Andorra,Andorra la Vella,Europe,79824,77700,71746,71519,66097,53569,35611,19860,468,170.5641,1.01,0.0


Checking for null values:

In [76]:
df.isnull().any()

Header
Rank                           False
CCA3                           False
Country/Territory              False
Capital                        False
Continent                      False
2022 Population                False
2020 Population                False
2015 Population                False
2010 Population                False
2000 Population                False
1990 Population                False
1980 Population                False
1970 Population                False
Area (km²)                     False
Density (per km²)              False
Growth Rate                    False
World Population Percentage    False
dtype: bool
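`isnull().any()` only reports whether a column contains a missing value. If a column did contain one, counting the missing values and listing the affected rows is usually the next step. The following is a minimal sketch on a hypothetical toy frame (the names `sample`, `null_counts`, and `rows_with_nulls` are illustrative and not part of the notebook):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing value, for illustration only.
sample = pd.DataFrame({
    "Country/Territory": ["A", "B", "C"],
    "2022 Population": [100.0, np.nan, 300.0],
})

# Number of missing values in each column.
null_counts = sample.isnull().sum()
print(null_counts)

# Rows that contain at least one missing value.
rows_with_nulls = sample[sample.isnull().any(axis=1)]
print(rows_with_nulls)
```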

## Top 10 highest populated countries

Sorting by "Rank":

In [5]:
df2=df.copy() # Work on a copy so the original DataFrame stays unchanged
df2=df2.sort_values(by="Rank")
df2.head()

Countries,Rank,CCA3,Country/Territory,Capital,Continent,2022 Population,2020 Population,2015 Population,2010 Population,2000 Population,1990 Population,1980 Population,1970 Population,Area (km²),Density (per km²),Growth Rate,World Population Percentage
41,1,CHN,China,Beijing,Asia,1425887337,1424929781,1393715448,1348191368,1264099069,1153704252,982372466,822534450,9706961,146.8933,1.0,17.88
92,2,IND,India,New Delhi,Asia,1417173173,1396387127,1322866505,1240613620,1059633675,870452165,696828385,557501301,3287590,431.0675,1.0068,17.77
221,3,USA,United States,"Washington, D.C.",North America,338289857,335942003,324607776,311182845,282398554,248083732,223140018,200328340,9372610,36.0935,1.0038,4.24
93,4,IDN,Indonesia,Jakarta,Asia,275501339,271857970,259091970,244016173,214072421,182159874,148177096,115228394,1904569,144.6529,1.0064,3.45
156,5,PAK,Pakistan,Islamabad,Asia,235824862,227196741,210969298,194454498,154369924,115414069,80624057,59290872,881912,267.4018,1.0191,2.96


Checking whether the "Rank" column agrees with the ranking implied by the "2022 Population" column:

In [6]:
(df2["2022 Population"].rank(ascending=False)==df2["Rank"]).all()

True
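The check above uses `Series.rank` with its default tie handling (`method="average"`), which assigns tied values the mean of their positions. It returns `True` here, which suggests that any ties in the data do not affect the comparison. The sketch below, on hypothetical values, shows how ties would behave under the default and under `method="min"`:

```python
import pandas as pd

# Hypothetical populations with a tie between "B" and "C".
population = pd.Series([500, 300, 300, 100], index=["A", "B", "C", "D"])

# Default method="average": tied entries share the mean of their positions.
average_rank = population.rank(ascending=False)
print(average_rank)

# method="min": tied entries share the best (lowest) position.
min_rank = population.rank(ascending=False, method="min").astype(int)
print(min_rank)
```

With the default method, "B" and "C" both receive 2.5, so a comparison against an integer "Rank" column would fail for tied rows.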

Top 10 DataFrame:

In [77]:
top10=df2[["Rank","Country/Territory","2022 Population"]].head(10)
top10

Countries,Rank,Country/Territory,2022 Population
41,1,China,1425887337
92,2,India,1417173173
221,3,United States,338289857
93,4,Indonesia,275501339
156,5,Pakistan,235824862
149,6,Nigeria,218541212
27,7,Brazil,215313498
16,8,Bangladesh,171186372
171,9,Russia,144713314
131,10,Mexico,127504125
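An alternative that does not depend on the "Rank" column is `DataFrame.nlargest`, which selects the rows with the largest values in a given column. The sketch below uses only the five rows shown by `df.head()` earlier, so its result is the top 3 of that sample, not of the full dataset:

```python
import pandas as pd

# The five rows displayed by df.head(), used here as a small sample.
sample = pd.DataFrame({
    "Country/Territory": ["Afghanistan", "Albania", "Algeria", "American Samoa", "Andorra"],
    "2022 Population": [41128771, 2842321, 44903225, 44273, 79824],
})

# Three most populated entries of the sample, largest first.
top3 = sample.nlargest(3, "2022 Population")
print(top3)
```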


## Analysis by continent

Total number of countries per continent:

In [7]:
df2["Continent"].value_counts().sort_values()

Continent
South America    14
Oceania          23
North America    40
Asia             50
Europe           50
Africa           57
Name: count, dtype: int64

Population of each continent:

In [43]:
v=[]
for i in df2["Continent"].unique():
    v.append(df2[df2["Continent"]==i]["2022 Population"].sum())
continents_population=pd.Series(v,index=df2["Continent"].unique())
continents_population.sort_values()

Oceania            45038554
South America     436816608
North America     600296136
Europe            743147538
Africa           1426730932
Asia             4721383274
dtype: int64
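The loop above can likely be expressed more compactly with `groupby`, which sums the population within each continent in one call. This is a sketch on the five rows shown by `df.head()`, so the totals below cover only that sample (the name `continent_totals` is illustrative):

```python
import pandas as pd

# The five rows displayed by df.head(), used here as a small sample.
sample = pd.DataFrame({
    "Country/Territory": ["Afghanistan", "Albania", "Algeria", "American Samoa", "Andorra"],
    "Continent": ["Asia", "Europe", "Africa", "Oceania", "Europe"],
    "2022 Population": [41128771, 2842321, 44903225, 44273, 79824],
})

# Sum the 2022 population per continent, then sort ascending.
continent_totals = sample.groupby("Continent")["2022 Population"].sum().sort_values()
print(continent_totals)
```

Unlike the loop, the resulting Series keeps "Continent" as the index name and "2022 Population" as the Series name.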

## Population growth

Which country had the largest absolute population increase between 1970 and 2022?

In [9]:
maxdif2022_1970=(df2["2022 Population"]-df2["1970 Population"]).idxmax()
df2.loc[maxdif2022_1970,"Country/Territory"]

'India'
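The result above measures absolute growth (people added). Relative growth (the ratio of the 2022 population to the 1970 population) can rank countries differently. The sketch below compares both measures on three rows taken from the sorted table earlier. It covers only that sample and makes no claim about the full dataset:

```python
import pandas as pd

# Three rows taken from the table sorted by "Rank".
sample = pd.DataFrame(
    {
        "2022 Population": [1425887337, 1417173173, 235824862],
        "1970 Population": [822534450, 557501301, 59290872],
    },
    index=["China", "India", "Pakistan"],
)

# Absolute growth: number of people added between 1970 and 2022.
absolute_growth = sample["2022 Population"] - sample["1970 Population"]

# Relative growth: how many times larger the 2022 population is.
relative_growth = sample["2022 Population"] / sample["1970 Population"]

print(absolute_growth.idxmax())  # largest absolute increase in the sample
print(relative_growth.idxmax())  # largest relative increase in the sample
```

Within this sample, India added the most people, while Pakistan grew the most in relative terms.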

## Available land

Available land per person (km²/person):

In [67]:
land_available=pd.Series((df2["Area (km²)"]/df2["2022 Population"]).values,index=df2["Country/Territory"])
land_available.sort_values(ascending=False)

Country/Territory
Greenland           38.360890
Falkland Islands     3.220370
Western Sahara       0.461817
Mongolia             0.460254
Namibia              0.321625
                      ...    
Gibraltar            0.000184
Hong Kong            0.000147
Singapore            0.000119
Monaco               0.000055
Macau                0.000043
Length: 234, dtype: float64
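Land per person is the reciprocal of population density, so the "Density (per km²)" column offers a quick consistency check. The sketch below verifies this on three rows from the `df.head()` output. The tolerance is an assumption chosen to absorb the rounding of the density column to four decimals:

```python
import numpy as np
import pandas as pd

# Three rows from the df.head() output.
sample = pd.DataFrame(
    {
        "2022 Population": [41128771, 2842321, 44903225],
        "Area (km²)": [652230, 28748, 2381741],
        "Density (per km²)": [63.0587, 98.8702, 18.8531],
    },
    index=["Afghanistan", "Albania", "Algeria"],
)

# Land available per person in km².
land_per_person = sample["Area (km²)"] / sample["2022 Population"]

# land_per_person * density should be close to 1 for every row.
product = land_per_person * sample["Density (per km²)"]
is_consistent = np.allclose(product, 1.0, rtol=1e-4)
print(is_consistent)
```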

Where does Colombia place in this ranking?

In [74]:
land_available.rank(ascending=False)["Colombia"]

67.0

Colombia ranks 67th out of the 234 countries and territories in the dataset by land available per person.