In this project, I work with a dataset from the CIA World Factbook (2015), which provides key demographic information.

Data Source: https://www.cia.gov/library/publications/download/download-2015/index.html

In [1]:
!conda install -yc conda-forge ipython-sql

/bin/sh: 1: conda: not found


In [2]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

First, I extract the first five rows of the table to get familiar with the dataset.

In [3]:
%%sql
SELECT *
  FROM facts
 LIMIT 5;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


Here are the descriptions for some of the columns:

1. name - The name of the country.
2. area - The country's total area (both land and water) in square kilometers.
3. area_land - The country's land area in square kilometers.
4. area_water - The country's water area in square kilometers.
5. population - The country's population.
6. population_growth - The country's population growth as a percentage.
7. birth_rate - The country's birth rate, or the number of births a year per 1,000 people.
8. death_rate - The country's death rate, or the number of deaths a year per 1,000 people.

These columns can be roughly grouped into two categories:
1. People/Population (Population, Growth, Birth & Death Rate, Migration Rate)
2. Area (Land, Water, Total)
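
The column list above can also be checked against the table itself. The following is a minimal sketch using Python's standard `sqlite3` module on an in-memory toy table; the column names are taken from the query output above, while the declared types are assumptions made for illustration and may differ from the real `factbook.db`.

```python
import sqlite3

# Toy copy of the facts table. Column names come from the query output above;
# the declared types are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE facts (
        id INTEGER PRIMARY KEY,
        code TEXT,
        name TEXT,
        area INTEGER,
        area_land INTEGER,
        area_water INTEGER,
        population INTEGER,
        population_growth REAL,
        birth_rate REAL,
        death_rate REAL,
        migration_rate REAL
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(facts)")]
for name, col_type in columns:
    print(f"{name:<18} {col_type}")
```

Against the real database, the same `PRAGMA table_info(facts);` statement should also work from a `%%sql` cell.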

1. Population

101. Which country has the highest population?
102. Is the country with the highest population also the most densely populated?
103. Does this country have the highest birth rate?
104. What is its ratio of birth rate to death rate?

In [4]:
%%sql
SELECT MAX(population) AS Max_Pop, 
       MIN(population) AS Min_Pop, 
       MAX(population_growth) AS Max_PopGrow, 
       MIN(population_growth) AS Min_PopGrow
  FROM facts;

Done.


Max_Pop,Min_Pop,Max_PopGrow,Min_PopGrow
7256490011,0,4.02,0.0


Interestingly, the maximum population shown here is over 7 billion, and the minimum population is 0. Let's look into this:

In [46]:
%%sql
SELECT *
  FROM facts
 WHERE population == (SELECT MAX(population)
                        FROM facts);

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
261,xx,World,,,,7256490011,1.08,18.6,7.8,


In [47]:
%%sql
SELECT *
  FROM facts
 WHERE population == (SELECT MIN(population)
                        FROM facts);

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
250,ay,Antarctica,,280000,,0,,,,


It turns out that the table contains a row for the whole world, and a row for Antarctica, which has no inhabitants. Moving forward, I am going to exclude the "World" row.

In [12]:
%%sql
SELECT *
  FROM facts
 WHERE population == (SELECT MAX(population)
                        FROM facts
                       WHERE name <> 'World'); 

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
37,ch,China,9596960,9326410,270550,1367485388,0.45,12.49,7.53,0.44
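
An alternative way to get the same answer is to filter the aggregate row in the outer query and sort, rather than comparing against a `MAX` subquery. The sketch below uses Python's standard `sqlite3` module on a toy in-memory table with three rows copied from the outputs in this notebook; the two-column table layout is an assumption made for brevity.

```python
import sqlite3

# Toy stand-in for the facts table: three rows copied from the notebook outputs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [("World", 7256490011), ("China", 1367485388), ("Antarctica", 0)],
)

# Exclude the aggregate row first, then take the largest remaining population.
top = conn.execute(
    """
    SELECT name, population
      FROM facts
     WHERE name <> 'World'
     ORDER BY population DESC
     LIMIT 1
    """
).fetchone()
print(top)
```

Filtering in the outer query guarantees that the "World" row can never be returned, even if its population happened to tie with a country's.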


In [89]:
%%sql
SELECT name, population, area_land, population/area_land AS pop_density
  FROM facts
 WHERE pop_density == (SELECT MAX(population/area_land)
                         FROM facts);

Done.


name,population,area_land,pop_density
Macau,592731,28,21168


Macau, also spelled Macao, and officially the Macao Special Administrative Region of the People's Republic of China, is a city in the western Pearl River Delta by the South China Sea.

So far, we've seen that China has the largest population and that the most densely populated region, Macau, is part of China. To put this in context, I want to see the top 10 most densely populated areas/regions.
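
One detail worth noting about the density figures: when both operands are integers, SQLite's `/` operator performs integer division, which is why `pop_density` appears as a whole number. A minimal sketch of the difference, using Python's standard `sqlite3` module and three rows copied from the outputs in this notebook (the `INTEGER` column types are an assumption about the real table):

```python
import sqlite3

# Toy table with rows copied from the notebook outputs; INTEGER types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER, area_land INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [("Macau", 592731, 28), ("Monaco", 30535, 2), ("Singapore", 5674472, 687)],
)

# int_density truncates toward zero; real_density casts one operand to REAL first.
rows = conn.execute(
    """
    SELECT name,
           population / area_land AS int_density,
           ROUND(CAST(population AS REAL) / area_land, 2) AS real_density
      FROM facts
     ORDER BY real_density DESC
    """
).fetchall()
for row in rows:
    print(row)
```

For a ranking of very dense regions the truncation rarely matters, but for ratios below 1 integer division would collapse every value to 0.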

In [93]:
%%sql
SELECT name, population, area_land, population/area_land AS pop_density
  FROM facts
 WHERE pop_density
 ORDER BY pop_density DESC
 LIMIT 10;

Done.


name,population,area_land,pop_density
Macau,592731,28,21168
Monaco,30535,2,15267
Singapore,5674472,687,8259
Hong Kong,7141106,1073,6655
Gaza Strip,1869055,360,5191
Gibraltar,29258,6,4876
Bahrain,1346613,760,1771
Maldives,393253,298,1319
Malta,413965,316,1310
Bermuda,70196,54,1299


Hong Kong, the fourth most densely populated area/region, is also part of the People's Republic of China.

In [95]:
%%sql
SELECT *
  FROM facts
 WHERE birth_rate = (SELECT MAX(birth_rate)
                       FROM facts); 

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
128,ng,Niger,,1266700,300,18045729,3.25,45.45,12.42,0.56


China does not have the highest birth rate; Niger does, at 45.45 births per 1,000 people. According to Statista.com, the fertility rate in Niger was estimated to be 6.49 children per woman in 2017.

In [104]:
%%sql
SELECT population_growth, birth_rate/death_rate
  FROM facts
 WHERE name = 'China'; 

Done.


population_growth,birth_rate/death_rate
0.45,1.658698539176627


This tells us that there were about 1.7 births for every death in China as of 2015.

In [103]:
%%sql
SELECT AVG(population_growth)
  FROM facts; 

Done.


AVG(population_growth)
1.2009745762711863


Although China is the most populous country in the world, its population growth is slow: 0.45%, compared with an average of about 1.20% across all rows of the table. The one-child policy likely contributed to this slow growth.
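
The unfiltered average is computed over every row that has a growth value, including aggregate rows such as "World". The sketch below, which uses Python's standard `sqlite3` module and a toy table with a handful of rows copied from outputs elsewhere in this notebook, illustrates two behaviors: `AVG` skips `NULL` values, and excluding an aggregate row shifts the result. The numbers it prints apply to the toy rows only, not to the full dataset.

```python
import sqlite3

# Toy table: growth values copied from the notebook outputs (Antarctica is NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population_growth REAL)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("World", 1.08),
        ("China", 0.45),
        ("Niger", 3.25),
        ("Russia", 0.04),
        ("Antarctica", None),
    ],
)

# AVG ignores NULLs, so Antarctica does not count toward the denominator.
avg_all = conn.execute("SELECT AVG(population_growth) FROM facts").fetchone()[0]

# Same average with the aggregate "World" row excluded.
avg_countries = conn.execute(
    "SELECT AVG(population_growth) FROM facts WHERE name <> 'World'"
).fetchone()[0]

print(avg_all, avg_countries)
```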

2. Area

201. Which country is the largest country by area?
202. Does this country hold a large population?
203. What is its ratio of water to land?
204. Which country has the highest population/area ratio?

In [122]:
%%sql
SELECT *
  FROM facts
 WHERE name <> 'World'
 ORDER BY area DESC
 LIMIT 20;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
143,rs,Russia,17098242,16377742.0,720500.0,142423773,0.04,11.6,13.69,1.69
32,ca,Canada,9984670,9093507.0,891163.0,35099836,0.75,10.28,8.42,5.66
186,us,United States,9826675,9161966.0,664709.0,321368864,0.78,12.49,8.15,3.86
37,ch,China,9596960,9326410.0,270550.0,1367485388,0.45,12.49,7.53,0.44
24,br,Brazil,8515770,8358140.0,157630.0,204259812,0.77,14.46,6.58,0.14
9,as,Australia,7741220,7682300.0,58920.0,22751014,1.07,12.15,7.14,5.65
197,ee,European Union,4324782,,,513949445,0.25,10.2,10.2,2.5
77,in,India,3287263,2973193.0,314070.0,1251695584,1.22,19.55,7.32,0.04
7,ar,Argentina,2780400,2736690.0,43710.0,43431886,0.93,16.64,7.33,0.0
87,kz,Kazakhstan,2724900,2699700.0,25200.0,18157122,1.14,19.15,8.21,0.41


In [121]:
%%sql
SELECT AVG(population)/(SELECT population
                          FROM facts
                         WHERE name = 'Russia')
  FROM facts;

Done.


AVG(population)/(SELECT population  FROM facts  WHERE name = 'Russia')
0.4359871039388491


Relative to its large area, Russia has a small population. In absolute terms, however, its population is still large: the average population across all rows of the table (which includes aggregate rows such as "World") is only about 44% of Russia's.

In [114]:
%%sql
SELECT name, MAX(area_water/area_land)
  FROM facts
 WHERE name <> 'World';

Done.


name,MAX(area_water/area_land)
British Indian Ocean Territory,905


The British Indian Ocean Territory has the highest water-to-land ratio: its water area is roughly 905 times its land area. It is a British overseas territory situated in the Indian Ocean, about halfway between Tanzania and Indonesia.
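
Two SQLite behaviors are relevant to queries like the ones in this section. String comparison is case-sensitive by default, so a lowercase literal such as 'world' does not match a row named "World". Integer division also truncates any water-to-land ratio below 1 to 0. The sketch below uses Python's standard `sqlite3` module and a toy table with rows copied from the outputs above (the `INTEGER` column types are an assumption); it sorts on a `REAL` ratio instead of relying on a bare column next to `MAX`.

```python
import sqlite3

# Toy table with rows copied from the notebook outputs; INTEGER types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, area_land INTEGER, area_water INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("World", None, None),
        ("Russia", 16377742, 720500),
        ("Canada", 9093507, 891163),
        ("China", 9326410, 270550),
    ],
)

def count_where(condition: str) -> int:
    """Count the rows of the toy table that satisfy a WHERE condition."""
    return conn.execute(f"SELECT COUNT(*) FROM facts WHERE {condition}").fetchone()[0]

kept_lower = count_where("name <> 'world'")                  # "World" row is kept
kept_exact = count_where("name <> 'World'")                  # "World" row is dropped
kept_nocase = count_where("name <> 'world' COLLATE NOCASE")  # dropped, ignoring case

# Cast to REAL so ratios below 1 are not truncated to 0, then sort explicitly.
top = conn.execute(
    """
    SELECT name, CAST(area_water AS REAL) / area_land AS water_to_land
      FROM facts
     WHERE name <> 'World'
     ORDER BY water_to_land DESC
     LIMIT 1
    """
).fetchone()
print(kept_lower, kept_exact, kept_nocase, top)
```

Among these toy rows, Canada has the largest ratio; the full table would still need to be queried to confirm the overall maximum.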