# Analysing CIA Factbook Data Using SQL

In this notebook we use SQL to explore a SQLite database (`facts.db`) of country-level data from the CIA Factbook. Its `facts` table includes the following columns:

- `area`
- `area_land`
- `area_water`
- `population`
- `population_growth`
- `birth_rate`
- `death_rate`
- `migration_rate`

In [1]:
%%capture
%load_ext sql
%sql sqlite:///facts.db

## Overview of the Data

In [2]:
%%sql
SELECT *
  FROM facts
 LIMIT 5;

 * sqlite:///facts.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


## Summary Statistics

In [6]:
%%sql
SELECT MIN(population), MAX(population), MIN(population_growth), MAX(population_growth)
  FROM facts;

 * sqlite:///facts.db
Done.


MIN(population),MAX(population),MIN(population_growth),MAX(population_growth)
0,7256490011,0.0,4.02


## Exploring Outliers

In [10]:
%%sql
SELECT name, MAX(population)
  FROM facts

 * sqlite:///facts.db
Done.


name,MAX(population)
World,7256490011


In [11]:
%%sql
SELECT name, MIN(population)
  FROM facts

 * sqlite:///facts.db
Done.


name,MIN(population)
Antarctica,0
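
The two queries above rely on a SQLite-specific behaviour: when a query contains a single `MIN()` or `MAX()` aggregate, bare columns such as `name` are taken from the row that holds the extreme value. Most other databases reject such a query or return an arbitrary name. The sketch below shows a more portable form using a subquery. It runs against a small in-memory table whose rows are invented for illustration and are not Factbook data.

```python
import sqlite3

# Toy stand-in for facts.db; these rows are hypothetical, not Factbook data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [("Smallland", 10), ("Midland", 500), ("Bigland", 9000), ("Unknown", None)],
)

# The subquery pins down the extreme value; the outer query finds the row(s) holding it.
largest = conn.execute(
    "SELECT name, population FROM facts "
    "WHERE population = (SELECT MAX(population) FROM facts)"
).fetchall()
smallest = conn.execute(
    "SELECT name, population FROM facts "
    "WHERE population = (SELECT MIN(population) FROM facts)"
).fetchall()

print(largest)
print(smallest)
```

Unlike the bare-column form, this version also returns every row that ties for the extreme value, and rows with a missing population are ignored by `MIN()` and `MAX()`.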


In [24]:
%%sql
SELECT MIN(population), MAX(population), MIN(population_growth), MAX(population_growth)
  FROM facts
 WHERE name != 'World'

 * sqlite:///facts.db
Done.


MIN(population),MAX(population),MIN(population_growth),MAX(population_growth)
0,1367485388,0.0,4.02


## Average Population and Area

In [22]:
%%sql
SELECT AVG(population), AVG(area)
  FROM facts
 WHERE name != 'World'

 * sqlite:///facts.db
Done.


AVG(population),AVG(area)
32242666.56846473,555093.546184739


## Finding Densely Populated Countries

In [26]:
%%sql
SELECT name
  FROM facts
 WHERE population > (SELECT AVG(population) FROM facts) AND area < (SELECT AVG(area) FROM facts)

 * sqlite:///facts.db
Done.


name
Bangladesh
Germany
Japan
Philippines
Thailand
United Kingdom
Vietnam
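
One caveat about the query above: its subqueries compute the averages over every row in `facts`, including the `World` row that the earlier sections excluded, so the population threshold is likely higher than the per-country average. The sketch below illustrates the effect on a made-up in-memory table. The names and numbers are hypothetical, chosen only so that the two thresholds give different answers.

```python
import sqlite3

# Hypothetical rows: three "countries" plus an aggregate 'World' row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER, area INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [("Crowded", 400, 5), ("Roomy", 100, 50), ("Middling", 300, 35), ("World", 800, 90)],
)

# Averages with and without the aggregate row.
avg_all = conn.execute("SELECT AVG(population) FROM facts").fetchone()[0]
avg_countries = conn.execute(
    "SELECT AVG(population) FROM facts WHERE name != 'World'"
).fetchone()[0]

# Same shape as the notebook query: thresholds computed over all rows.
unfiltered = conn.execute(
    "SELECT name FROM facts "
    "WHERE population > (SELECT AVG(population) FROM facts) "
    "AND area < (SELECT AVG(area) FROM facts)"
).fetchall()

# Variant that excludes 'World' from the outer query and from both subqueries.
filtered = conn.execute(
    "SELECT name FROM facts "
    "WHERE name != 'World' "
    "AND population > (SELECT AVG(population) FROM facts WHERE name != 'World') "
    "AND area < (SELECT AVG(area) FROM facts WHERE name != 'World')"
).fetchall()

print(avg_all, avg_countries)
print(unfiltered, filtered)
```

In this toy table the aggregate row pushes the population threshold up far enough that the unfiltered query finds nothing, while the filtered variant finds one match. Whether the real results change depends on the actual data.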


## Countries with the Highest Population Density and the Highest Growth Rate

In [28]:
%%sql
SELECT name, population/area AS population_density
  FROM facts
ORDER BY population_density DESC
 LIMIT 3

 * sqlite:///facts.db
Done.


name,population_density
Macau,21168
Monaco,15267
Singapore,8141


All three countries above are very small, so let's run the query again, this time keeping only countries whose area is equal to or above the average.

In [31]:
%%sql
SELECT name, population/area AS population_density
  FROM facts
 WHERE area >= (SELECT AVG(area) FROM facts)
ORDER BY population_density DESC
 LIMIT 3

 * sqlite:///facts.db
Done.


name,population_density
India,380
Pakistan,250
Nigeria,196
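
The densities in both result tables are whole numbers, presumably because `population` and `area` are stored as integers: SQLite truncates the result when it divides one integer by another. The sketch below contrasts that with casting one operand to `REAL` first. It uses an invented in-memory table, and the place names and figures are hypothetical. Rows with a zero area are filtered out, since dividing by zero yields `NULL` in SQLite.

```python
import sqlite3

# Hypothetical rows chosen to make the truncation visible.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER, area INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [("Tinyport", 1999, 2), ("Widefield", 100, 400), ("Emptyland", 0, 1000), ("NoArea", 50, 0)],
)

rows = conn.execute(
    "SELECT name, "
    "       population / area AS int_density, "                 # integer division truncates
    "       CAST(population AS REAL) / area AS real_density "   # cast before dividing
    "  FROM facts "
    " WHERE area > 0 "
    " ORDER BY real_density DESC"
).fetchall()

for row in rows:
    print(row)
```

Casting before the division keeps the fractional part, which matters most for sparsely populated places whose integer density would otherwise collapse to 0.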


In [33]:
%%sql
SELECT name, population_growth
  FROM facts
ORDER BY population_growth DESC
 LIMIT 3

 * sqlite:///facts.db
Done.


name,population_growth
South Sudan,4.02
Malawi,3.32
Burundi,3.28


## Highest Water-to-Land Ratios

In [42]:
%%sql
SELECT name, CAST(area/area_water AS Float) AS water_to_land_ratio
  FROM facts
ORDER BY water_to_land_ratio DESC
 LIMIT 3

 * sqlite:///facts.db
Done.


name,water_to_land_ratio
Bosnia and Herzegovina,5119.0
Morocco,1786.0
Guinea,1756.0
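
Note that the query above divides total `area` by `area_water`, and applies `CAST` only after the integer division has already happened. A large value therefore means a country has very little water relative to its size, which is the opposite of a high water-to-land ratio. The sketch below shows one way the intended ratio could be computed, dividing `area_water` by `area_land` after casting to `REAL`. It runs on an invented in-memory table with hypothetical names and figures, so it says nothing about which real countries would rank highest.

```python
import sqlite3

# Hypothetical rows: one water-heavy place, one nearly dry, one mixed, one with no water.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, area INTEGER, area_land INTEGER, area_water INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Lakeland", 100, 40, 60),
        ("Dryland", 1000, 999, 1),
        ("Mixedland", 200, 150, 50),
        ("Desertia", 500, 500, 0),
    ],
)

# Same expression as the notebook query: ranks the driest place first.
original_top = conn.execute(
    "SELECT name, CAST(area/area_water AS Float) AS ratio "
    "  FROM facts ORDER BY ratio DESC LIMIT 1"
).fetchone()

# Water area divided by land area, cast before dividing to avoid truncation.
corrected = conn.execute(
    "SELECT name, CAST(area_water AS REAL) / area_land AS water_to_land_ratio "
    "  FROM facts "
    " WHERE area_land > 0 "
    " ORDER BY water_to_land_ratio DESC "
    " LIMIT 3"
).fetchall()
corrected_names = [name for name, _ in corrected]

print(original_top)
print(corrected)
```

With the corrected expression, any place whose ratio exceeds 1 has more water than land, which is the same condition the `area_water > area_land` filter tests.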


In [44]:
%%sql
SELECT name
  FROM facts
 WHERE area_water > area_land

 * sqlite:///facts.db
Done.


name
British Indian Ocean Territory
Virgin Islands


## Countries with the Largest Expected Population Increase Next Year

In [49]:
%%sql
SELECT name, population, population_growth, ROUND(population * population_growth / 100, 2) AS population_increase
  FROM facts
 WHERE name != 'World'
ORDER BY population_increase DESC
 LIMIT 5

 * sqlite:///facts.db
Done.


name,population,population_growth,population_increase
India,1251695584,1.22,15270686.12
China,1367485388,0.45,6153684.25
Nigeria,181562056,2.45,4448270.37
Pakistan,199085847,1.46,2906653.37
Ethiopia,99465819,2.89,2874562.17


## Countries Where the Death Rate Exceeds the Birth Rate

In [50]:
%%sql
SELECT name
  FROM facts
 WHERE death_rate > birth_rate

 * sqlite:///facts.db
Done.


name
Austria
Belarus
Bosnia and Herzegovina
Bulgaria
Croatia
Czech Republic
Estonia
Germany
Greece
Hungary
