# Analyzing CIA Factbook Data Using SQL by Bevis Lau

Database - CIA World Factbook (https://www.cia.gov/the-world-factbook/)

Objective - To study demographic data across different countries

## Connect notebook with database

In [2]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

## Query database to understand table

In [2]:
%%sql
SELECT *
  FROM sqlite_master
 WHERE type='table';

Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"


## Description of the columns

name - the name of the country.<br>
area - the country's total area (both land and water) in square kilometers.<br>
area_land - the country's land area in square kilometers.<br>
area_water - the country's water area in square kilometers.<br>
population - the country's population.<br>
population_growth - the country's population growth as a percentage.<br>
birth_rate - the country's birth rate, or the number of births per year per 1,000 people.<br>
death_rate - the country's death rate, or the number of deaths per year per 1,000 people.<br>

Quick scan of all the columns

In [3]:
%%sql
SELECT *
FROM facts 
LIMIT 5

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


Understand the limits of the data with MAX and MIN

In [9]:
%%sql
SELECT MAX(population), MIN(population), MAX(population_growth), MIN(population_growth)
FROM facts



Done.


MAX(population),MIN(population),MAX(population_growth),MIN(population_growth)
7256490011,0,4.02,0.0


In [12]:
%%sql
SELECT name
FROM facts
WHERE population = 0


Done.


name
Antarctica


In [13]:
%%sql
SELECT name
FROM facts
WHERE population = 7256490011


Done.


name
World
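
The two lookups above can also be combined into one query that returns the names behind both extremes. The following is a minimal sketch, assuming an in-memory SQLite table with a trimmed two-column schema and a handful of rows copied from the outputs in this notebook; it is a stand-in for `factbook.db`, not the real database.

```python
import sqlite3

# Assumption: in-memory stand-in for factbook.db with a trimmed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("Afghanistan", 32564342),
        ("Andorra", 85580),
        ("Antarctica", 0),
        ("World", 7256490011),
    ],
)

# Scalar subqueries compute MAX and MIN once; the outer query
# returns every row whose population matches either extreme.
query = """
SELECT name, population
  FROM facts
 WHERE population = (SELECT MAX(population) FROM facts)
    OR population = (SELECT MIN(population) FROM facts)
 ORDER BY population;
"""
extremes = conn.execute(query).fetchall()
print(extremes)  # [('Antarctica', 0), ('World', 7256490011)]
```

Seeing the names next to the values makes it clear that both extremes are special rows (an aggregate and an uninhabited territory) rather than ordinary countries.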


In [14]:
%%sql
SELECT MAX(population), MIN(population), MAX(population_growth), MIN(population_growth)
FROM facts
WHERE name != 'World'



Done.


MAX(population),MIN(population),MAX(population_growth),MIN(population_growth)
1367485388,0,4.02,0.0


To obtain the correct AVG of population and area, World has to be excluded

In [15]:
%%sql
SELECT AVG(population), AVG(area)
FROM facts
WHERE name != 'World'


Done.


AVG(population),AVG(area)
32242666.56846473,555093.546184739
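
To illustrate why the World row has to be excluded, the sketch below compares the average with and without it. It assumes an in-memory table holding only the five countries from the quick scan plus the World row, so the numbers differ from the full-database result above; the trimmed schema is likewise an assumption.

```python
import sqlite3

# Assumption: small in-memory sample, not the full factbook.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("Afghanistan", 32564342),
        ("Albania", 3029278),
        ("Algeria", 39542166),
        ("Andorra", 85580),
        ("Angola", 19625353),
        ("World", 7256490011),
    ],
)

# Average over every row, including the World aggregate.
avg_all = conn.execute("SELECT AVG(population) FROM facts").fetchone()[0]

# Average over countries only.
avg_countries = conn.execute(
    "SELECT AVG(population) FROM facts WHERE name != 'World'"
).fetchone()[0]

print(f"including World: {avg_all:,.0f}")
print(f"excluding World: {avg_countries:,.0f}")
```

Because World is the sum of all other rows, leaving it in pulls the average far above any typical country.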


Which countries are densely populated relative to the average, i.e. have an above-average population but a below-average area?

In [3]:
%%sql
WITH avg_table AS (
    SELECT AVG(population) AS avg_population, AVG(area) AS avg_area
    FROM facts
    WHERE name != 'World'
)
SELECT name,population,area
FROM facts 
WHERE population > (SELECT avg_population FROM avg_table) AND area < (SELECT avg_area FROM avg_table)
ORDER BY population DESC ;

Done.


name,population,area
Bangladesh,168957745,148460
Japan,126919659,377915
Philippines,100998376,300000
Vietnam,94348835,331210
Germany,80854408,357022
Thailand,67976405,513120
United Kingdom,64088222,243610
Italy,61855120,301340
"Korea, South",49115196,99720
Spain,48146134,505370


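To understand the population density of the identified countries<br>
In this case, population density is defined as population / area_land. Both columns are integers, so SQLite performs integer division and the results are truncated to whole numbers.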

In [6]:
%%sql
WITH avg_table AS (
    SELECT AVG(population) AS avg_population, AVG(area) AS avg_area
    FROM facts
    WHERE name != 'World'
)
SELECT name,population,area, population/area_land as population_density
FROM facts 
WHERE population > (SELECT avg_population FROM avg_table) AND area < (SELECT avg_area FROM avg_table)
ORDER BY population_density DESC,population DESC ;

Done.


name,population,area,population_density
Bangladesh,168957745,148460,1297
"Korea, South",49115196,99720,506
Japan,126919659,377915,348
Philippines,100998376,300000,338
Vietnam,94348835,331210,304
United Kingdom,64088222,243610,264
Germany,80854408,357022,231
Italy,61855120,301340,210
Uganda,37101745,241038,188
Thailand,67976405,513120,133
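
The density column above contains whole numbers because SQLite divides two integers as integers. A minimal sketch of how fractional densities could be kept instead, assuming an in-memory table with a trimmed schema and three rows whose `population` and `area_land` values appear elsewhere in this notebook:

```python
import sqlite3

# Assumption: in-memory stand-in for factbook.db with a trimmed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, area_land INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Albania", 3029278, 27398),
        ("Andorra", 85580, 468),
        ("Uganda", 37101745, 197100),
    ],
)

# int_density reproduces the truncating integer division;
# density casts one operand to REAL so the fraction is preserved.
query = """
SELECT name,
       population / area_land AS int_density,
       ROUND(CAST(population AS REAL) / area_land, 2) AS density
  FROM facts
 WHERE area_land > 0
 ORDER BY density DESC;
"""
rows = conn.execute(query).fetchall()
for name, int_density, density in rows:
    print(name, int_density, density)
```

Casting only one operand is enough, since SQLite promotes the division to floating point as soon as either side is REAL. The `area_land > 0` filter avoids NULL results from division by zero.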


## Guided Questions

Which country has the most people? Which country has the highest growth rate? <br>
Which countries have the highest ratios of water to land? Which countries have more water than land? <br>
Which countries will add the most people to their populations next year? <br>
Which countries have a higher death rate than birth rate? <br>
Which countries have the highest population/area ratio, and how does it compare to the list we found earlier? <br>

Extra Question:
Is there any pattern between population and death rate?

Which countries have a higher death rate than birth rate?

In [4]:
%%sql
SELECT name,birth_rate,death_rate
FROM facts
WHERE death_rate > birth_rate
ORDER BY death_rate DESC
LIMIT 10;

Done.


name,birth_rate,death_rate
Ukraine,10.72,14.46
Bulgaria,8.92,14.44
Latvia,10.0,14.31
Lithuania,10.1,14.27
Russia,11.6,13.69
Serbia,9.08,13.66
Belarus,10.7,13.36
Hungary,9.16,12.73
Moldova,12.0,12.59
Estonia,10.51,12.4


Which countries have the highest ratios of water to land? Which countries have more water than land?

In [6]:
%%sql
SELECT name,
       ROUND(CAST(area_water AS float) / CAST(area_land AS float), 2)  AS water_to_land_ratio,
       (area_water - area_land) AS water_to_land_diff,
       area_water,
       area_land
FROM facts
WHERE area_water > 0 AND area_land <> 0 
ORDER BY water_to_land_ratio DESC
LIMIT 20;


Done.


name,water_to_land_ratio,water_to_land_diff,area_water,area_land
British Indian Ocean Territory,905.67,54280,54340,60
Virgin Islands,4.52,1218,1564,346
Puerto Rico,0.55,-3949,4921,8870
"Bahamas, The",0.39,-6140,3870,10010
Guinea-Bissau,0.28,-20115,8005,28120
Malawi,0.26,-69676,24404,94080
Netherlands,0.23,-26243,7650,33893
Uganda,0.22,-153162,43938,197100
Eritrea,0.16,-84400,16600,101000
Liberia,0.16,-81271,15049,96320
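
In the output above, the rows with a positive `water_to_land_diff` are the ones with more water than land. A minimal sketch that filters for them directly, assuming an in-memory table with a trimmed schema and four rows copied from the output above:

```python
import sqlite3

# Assumption: in-memory stand-in for factbook.db with a trimmed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, area_water INTEGER, area_land INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("British Indian Ocean Territory", 54340, 60),
        ("Virgin Islands", 1564, 346),
        ("Puerto Rico", 4921, 8870),
        ("Netherlands", 7650, 33893),
    ],
)

# Keep only rows where water area exceeds land area; the CAST keeps
# the ratio fractional instead of truncating it to an integer.
query = """
SELECT name,
       ROUND(CAST(area_water AS REAL) / area_land, 2) AS water_to_land_ratio
  FROM facts
 WHERE area_land > 0
   AND area_water > area_land
 ORDER BY water_to_land_ratio DESC;
"""
more_water = conn.execute(query).fetchall()
print([name for name, _ in more_water])
# ['British Indian Ocean Territory', 'Virgin Islands']
```

Among the rows shown in the output above, only these two have a ratio greater than 1.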
