# DISCOVERING THE MOST POPULOUS COUNTRIES

## Introduction
We will work with data from the `CIA World Factbook`, which contains concise statistics about all of the countries on Earth. The Factbook contains demographic information such as:
* `population` - The population of the country.
* `population_growth` - The annual population growth rate as a percentage.
* `area` - The total land and water area.

Our goal in this analysis is to find out the following:
* The countries with the most population.
* The most densely populated countries.
* The countries with the highest population growth rate.
* The countries expected to grow the most the following year, etc.

## Connecting To The Database

In [1]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

In [2]:
%%sql
SELECT *
  FROM sqlite_master
 WHERE type='table';

Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"


## Overview Of The Data

In [3]:
%%sql
SELECT *
  FROM facts
  LIMIT 5;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


## Summary Statistics

In [4]:
%%sql
SELECT MIN(population), MAX(population),
       MIN(population_growth), MAX(population_growth)
  FROM facts;

Done.


MIN(population),MAX(population),MIN(population_growth),MAX(population_growth)
0,7256490011,0.0,4.02


As we can see from our table, the minimum population is 0 and the maximum population is just above 7 billion. The maximum is close to the population of the whole world, which no single country could have.

## Exploring Outliers

In [5]:
%%sql
SELECT *
  FROM facts
  WHERE population = (SELECT MIN(population)
                     FROM facts)
  OR population = (SELECT MAX(population)
                  FROM facts);

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
250,ay,Antarctica,,280000.0,,0,,,,
261,xx,World,,,,7256490011,1.08,18.6,7.8,


The outliers appear because our table contains a row for Antarctica, which has no native population, and a row for the World. This explains why our minimum population is 0 and our maximum population equals the population of the world.

## Summary Statistics Revisited

In [6]:
%%sql
SELECT MIN(population) AS min_pop,
       MAX(population) AS max_pop,
       MIN(population_growth) AS min_pop_growth,
       MAX(population_growth) AS max_pop_growth 
  FROM facts
 WHERE name <> 'World';

Done.


min_pop,max_pop,min_pop_growth,max_pop_growth
0,1367485388,0.0,4.02


From our table above, we can see that there is a country with nearly 1.4 billion people. The minimum is still 0 because the Antarctica row has not been excluded.
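The effect of the filter can be reproduced outside the notebook. The sketch below is only an illustration: it assumes an in-memory SQLite table standing in for `factbook.db`, filled with four rows copied from the outputs above, and the extra `population > 0` condition is a hypothetical addition that the notebook itself does not use.

```python
import sqlite3

# In-memory stand-in for factbook.db (assumption); rows copied from the outputs above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("Andorra", 85580),
        ("China", 1367485388),
        ("Antarctica", 0),
        ("World", 7256490011),
    ],
)

# Same filter as the notebook cell: only the World row is excluded.
world_excluded = conn.execute(
    "SELECT MIN(population), MAX(population) FROM facts WHERE name <> 'World'"
).fetchone()

# Hypothetical extra condition: also skip rows whose population is 0.
populated_only = conn.execute(
    "SELECT MIN(population), MAX(population) FROM facts "
    "WHERE name <> 'World' AND population > 0"
).fetchone()

print(world_excluded)  # (0, 1367485388)
print(populated_only)  # (85580, 1367485388)
```

The second minimum is only the smallest value in this four-row sample, not the smallest population in the full table.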

## Exploring The Most Populous Countries

In [7]:
%%sql
SELECT *
  FROM facts
    WHERE name <> 'World'
    ORDER BY population DESC
    LIMIT 10;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
37,ch,China,9596960,9326410.0,270550.0,1367485388,0.45,12.49,7.53,0.44
77,in,India,3287263,2973193.0,314070.0,1251695584,1.22,19.55,7.32,0.04
197,ee,European Union,4324782,,,513949445,0.25,10.2,10.2,2.5
186,us,United States,9826675,9161966.0,664709.0,321368864,0.78,12.49,8.15,3.86
78,id,Indonesia,1904569,1811569.0,93000.0,255993674,0.92,16.72,6.37,1.16
24,br,Brazil,8515770,8358140.0,157630.0,204259812,0.77,14.46,6.58,0.14
132,pk,Pakistan,796095,770875.0,25220.0,199085847,1.46,22.58,6.49,1.54
129,ni,Nigeria,923768,910768.0,13000.0,181562056,2.45,37.64,12.9,0.22
14,bg,Bangladesh,148460,130170.0,18290.0,168957745,1.6,21.14,5.61,0.46
143,rs,Russia,17098242,16377742.0,720500.0,142423773,0.04,11.6,13.69,1.69


China and India are the most populous countries, each with a population of over a billion people, followed by the European Union. Since the European Union is made up of several countries, we are going to redo the query, this time excluding the European Union to give us a more accurate list of the most populous countries.

In [8]:
%%sql
SELECT *
  FROM facts
    WHERE name <> 'World'
    AND name <> 'European Union'
    ORDER BY population DESC
    LIMIT 10;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
37,ch,China,9596960,9326410,270550,1367485388,0.45,12.49,7.53,0.44
77,in,India,3287263,2973193,314070,1251695584,1.22,19.55,7.32,0.04
186,us,United States,9826675,9161966,664709,321368864,0.78,12.49,8.15,3.86
78,id,Indonesia,1904569,1811569,93000,255993674,0.92,16.72,6.37,1.16
24,br,Brazil,8515770,8358140,157630,204259812,0.77,14.46,6.58,0.14
132,pk,Pakistan,796095,770875,25220,199085847,1.46,22.58,6.49,1.54
129,ni,Nigeria,923768,910768,13000,181562056,2.45,37.64,12.9,0.22
14,bg,Bangladesh,148460,130170,18290,168957745,1.6,21.14,5.61,0.46
143,rs,Russia,17098242,16377742,720500,142423773,0.04,11.6,13.69,1.69
85,ja,Japan,377915,364485,13430,126919659,0.16,7.93,9.51,0.0


China, with a population of nearly 1.4 billion, India, with nearly 1.3 billion, and the United States, with about 321 million, are the 3 most populous countries. Indonesia, Brazil, Pakistan, Nigeria, Bangladesh, Russia and Japan, in that order, complete the top 10 most populous countries.
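When several aggregate rows need to be excluded, the chained `<>` conditions can also be written as a single `NOT IN` list. The following is a minimal sketch under stated assumptions: the in-memory table is a stand-in for `factbook.db` with five rows copied from the outputs above, and the `aggregates` tuple is a name introduced here for illustration.

```python
import sqlite3

# In-memory stand-in for factbook.db (assumption); rows copied from the outputs above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("China", 1367485388),
        ("India", 1251695584),
        ("European Union", 513949445),
        ("United States", 321368864),
        ("World", 7256490011),
    ],
)

# Rows that are aggregates rather than countries (illustrative name).
aggregates = ("World", "European Union")

# Build one "?" placeholder per excluded name and pass the names as parameters.
placeholders = ", ".join("?" for _ in aggregates)
query = (
    f"SELECT name FROM facts WHERE name NOT IN ({placeholders}) "
    "ORDER BY population DESC LIMIT 3"
)
top_three = [row[0] for row in conn.execute(query, aggregates)]

print(top_three)  # ['China', 'India', 'United States']
```

Keeping the excluded names in one place makes it easier to reuse the same list in later queries.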

## Exploring Population Density
* As a first look at population density, we are going to find countries with an above-average population and a below-average area.

* We are going to exclude the rows for World and European Union, as they would skew the averages.

In [9]:
%%sql
SELECT AVG(population), AVG(area)
  FROM facts
  WHERE name <> 'World'
  AND name <> 'European Union';

Done.


AVG(population),AVG(area)
30235554.991666667,539893.1895161291


In [10]:
%%sql
SELECT *
  FROM facts
  WHERE population > (SELECT AVG(population)
                      FROM facts
                     WHERE name <> 'World'
                     AND name <> 'European Union')
  AND area < (SELECT AVG(area)
             FROM facts
             WHERE name <> 'World'
             AND name <> 'European Union');
    

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
14,bg,Bangladesh,148460,130170,18290,168957745,1.6,21.14,5.61,0.46
65,gm,Germany,357022,348672,8350,80854408,0.17,8.47,11.42,1.24
80,iz,Iraq,438317,437367,950,37056169,2.93,31.45,3.77,1.62
83,it,Italy,301340,294140,7200,61855120,0.27,8.74,10.19,4.1
85,ja,Japan,377915,364485,13430,126919659,0.16,7.93,9.51,0.0
91,ks,"Korea, South",99720,96920,2800,49115196,0.14,8.19,6.75,0.0
107,my,Malaysia,329847,328657,1190,30513848,1.44,19.71,5.03,0.33
120,mo,Morocco,446550,446300,250,33322699,1.0,18.2,4.81,3.36
124,np,Nepal,147181,143351,3830,31551305,1.79,20.64,6.56,3.86
138,rp,Philippines,300000,298170,1830,100998376,1.61,24.27,6.11,2.09


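The rows above are listed in `id` order, not ranked by density. They are simply countries with an above-average population and a below-average area. Dividing population by area for the rows shown puts Bangladesh first, followed by South Korea and the Philippines, with Germany and Iraq further down.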
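To rank these countries by density rather than by `id`, population can be divided by area and the result sorted. The sketch below assumes an in-memory table standing in for `factbook.db`, loaded with the ten rows shown in the output above; the `density` alias is introduced here for illustration.

```python
import sqlite3

# In-memory stand-in for factbook.db (assumption); the ten rows shown above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, area INTEGER, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Bangladesh", 148460, 168957745),
        ("Germany", 357022, 80854408),
        ("Iraq", 438317, 37056169),
        ("Italy", 301340, 61855120),
        ("Japan", 377915, 126919659),
        ("Korea, South", 99720, 49115196),
        ("Malaysia", 329847, 30513848),
        ("Morocco", 446550, 33322699),
        ("Nepal", 147181, 31551305),
        ("Philippines", 300000, 100998376),
    ],
)

# CAST to REAL so the division is not truncated to an integer.
ranked = conn.execute(
    "SELECT name, CAST(population AS REAL) / area AS density "
    "FROM facts ORDER BY density DESC"
).fetchall()

top_three = [name for name, _ in ranked[:3]]
print(top_three)  # ['Bangladesh', 'Korea, South', 'Philippines']
```

This ranking covers only the ten rows listed here, so it says nothing about countries that are absent from that output.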

## Exploring Population To Area Ratio

In [11]:
%%sql
SELECT name,
       population/area AS pop_area_ratio
  FROM facts
  ORDER BY pop_area_ratio DESC
  LIMIT 10;

Done.


name,pop_area_ratio
Macau,21168
Monaco,15267
Singapore,8141
Hong Kong,6445
Gaza Strip,5191
Gibraltar,4876
Bahrain,1771
Maldives,1319
Malta,1310
Bermuda,1299


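As expected, the places with the highest population-to-area ratio are small territories and city-states such as Macau, Monaco and Singapore, not the countries with the largest populations. This ratio is the population density itself.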
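The ratios above are whole numbers because `population` and `area` are both stored as integers, so SQLite performs integer division and drops the fractional part. The sketch below illustrates the difference, assuming an in-memory stand-in for `factbook.db` with two rows copied from the data overview; the column aliases are illustrative.

```python
import sqlite3

# In-memory stand-in for factbook.db (assumption); rows copied from the overview above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, area INTEGER, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Andorra", 468, 85580),
        ("Albania", 28748, 3029278),
    ],
)

# Integer / integer truncates; casting one operand to REAL keeps the fraction.
rows = conn.execute(
    "SELECT name, "
    "       population / area AS int_ratio, "
    "       ROUND(CAST(population AS REAL) / area, 2) AS real_ratio "
    "FROM facts ORDER BY int_ratio DESC"
).fetchall()

for name, int_ratio, real_ratio in rows:
    print(name, int_ratio, real_ratio)
# Andorra 182 182.86
# Albania 105 105.37
```

For a top-10 ranking the truncation rarely changes the order, but it can create ties between places whose densities differ by less than one.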

## Exploring Countries With High Population Growth Rate

* We are going to find the 10 countries with the highest population growth rates.
* We are going to estimate how much each population will grow in the following year.

In [12]:
%%sql
SELECT *
  FROM facts
  ORDER BY population_growth DESC
  LIMIT 10;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
162,od,South Sudan,644329.0,,,12042910,4.02,36.91,8.18,11.47
106,mi,Malawi,118484.0,94080.0,24404.0,17964697,3.32,41.56,8.41,0.0
29,by,Burundi,27830.0,25680.0,2150.0,10742276,3.28,42.01,9.27,0.0
128,ng,Niger,,1266700.0,300.0,18045729,3.25,45.45,12.42,0.56
182,ug,Uganda,241038.0,197100.0,43938.0,37101745,3.24,43.79,10.69,0.74
141,qa,Qatar,11586.0,11586.0,0.0,2194817,3.07,9.84,1.53,22.39
27,uv,Burkina Faso,274200.0,273800.0,400.0,18931686,3.03,42.03,11.72,0.0
109,ml,Mali,1240192.0,1220190.0,20002.0,16955536,2.98,44.99,12.89,2.26
219,cw,Cook Islands,236.0,236.0,0.0,9838,2.95,14.33,8.03,
80,iz,Iraq,438317.0,437367.0,950.0,37056169,2.93,31.45,3.77,1.62


Seven of the ten countries with the highest population growth rates are African, with South Sudan having the highest rate.

## Finding Countries With The Highest Estimated Growth

In [13]:
%%sql
SELECT name,
       ROUND((population_growth * population)/100) AS Estimated_growth
  FROM facts
  WHERE name <> 'World'
  ORDER BY Estimated_growth DESC
  LIMIT 10;

Done.


name,Estimated_growth
India,15270686.0
China,6153684.0
Nigeria,4448270.0
Pakistan,2906653.0
Ethiopia,2874562.0
Bangladesh,2703324.0
United States,2506677.0
Indonesia,2355142.0
"Congo, Democratic Republic of the",1944691.0
Philippines,1626074.0


Our table shows that India has the highest estimated annual population growth at just over 15 million people, followed by China with just over 6 million and Nigeria with over 4.4 million.
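The estimate is simply the growth rate, read as a percentage, applied to the current population. The sketch below reruns that arithmetic for the top three rows, assuming an in-memory stand-in for `factbook.db` with values copied from the outputs above.

```python
import sqlite3

# In-memory stand-in for factbook.db (assumption); values copied from the outputs above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, population_growth REAL)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("China", 1367485388, 0.45),
        ("India", 1251695584, 1.22),
        ("Nigeria", 181562056, 2.45),
    ],
)

# population_growth is a percentage, so divide by 100 after multiplying.
estimates = conn.execute(
    "SELECT name, ROUND((population_growth * population) / 100) AS estimated_growth "
    "FROM facts ORDER BY estimated_growth DESC"
).fetchall()

for name, growth in estimates:
    print(name, growth)
# India 15270686.0
# China 6153684.0
# Nigeria 4448270.0
```

India overtakes China here because its growth rate is almost three times higher, even though its population is smaller.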

## Exploring Countries With Higher Death Rate Than Birth Rate

In [14]:
%%sql
SELECT *
  FROM facts
  WHERE death_rate > birth_rate;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
10,au,Austria,83871,82445,1426,8665550,0.55,9.41,9.42,5.56
16,bo,Belarus,207600,202900,4700,9589689,0.2,10.7,13.36,0.7
22,bk,Bosnia and Herzegovina,51197,51187,10,3867055,0.13,8.87,9.75,0.38
26,bu,Bulgaria,110879,108489,2390,7186893,0.58,8.92,14.44,0.29
44,hr,Croatia,56594,55974,620,4464844,0.13,9.45,12.18,1.39
47,ez,Czech Republic,78867,77247,1620,10644842,0.16,9.63,10.34,2.33
57,en,Estonia,45228,42388,2840,1265420,0.55,10.51,12.4,3.6
65,gm,Germany,357022,348672,8350,80854408,0.17,8.47,11.42,1.24
67,gr,Greece,131957,130647,1310,10775643,0.01,8.66,11.09,2.32
75,hu,Hungary,93028,89608,3420,9897541,0.22,9.16,12.73,1.33


With the exception of Japan, all of the countries with a higher death rate than birth rate are European. Japan (`id` 85) falls outside the rows shown above, which end at Hungary.
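The same comparison can be checked on a few rows, including Japan. The sketch below assumes an in-memory stand-in for `factbook.db` with four rows copied from the outputs above; the `natural_change` alias (birth rate minus death rate) is a name introduced here for illustration.

```python
import sqlite3

# In-memory stand-in for factbook.db (assumption); rows copied from the outputs above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, birth_rate REAL, death_rate REAL)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Afghanistan", 38.57, 13.89),
        ("Austria", 9.41, 9.42),
        ("Germany", 8.47, 11.42),
        ("Japan", 7.93, 9.51),
    ],
)

# Keep rows where deaths outpace births, largest gap first.
rows = conn.execute(
    "SELECT name, birth_rate - death_rate AS natural_change "
    "FROM facts WHERE death_rate > birth_rate "
    "ORDER BY natural_change"
).fetchall()

shrinking = [name for name, _ in rows]
print(shrinking)  # ['Germany', 'Japan', 'Austria']
```

Sorting by the gap also shows how marginal some cases are: Austria's death rate exceeds its birth rate by only 0.01.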

## Conclusion
Our goal at the start was to find the most populous countries, the countries with high population density, countries with high growth rate, etc. After our analysis we have discovered the following:

1. China, India and The United States are the 3 most populous nations.
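2. By population-to-area ratio, Macau, Monaco and Singapore are the most densely populated. Among the countries shown with above-average population and below-average area, Bangladesh is the densest.
3. Seven of the ten countries with the highest growth rates are African.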
4. India, China and Nigeria are the top 3 countries whose population is expected to grow the most the following year.
5. Most of the countries that have death rate higher than birth rate are European countries with the exception of Japan.