In this project, we'll work with data from the [CIA World Factbook](https://www.cia.gov/library/publications/the-world-factbook/), a compendium of statistics about all of the countries on Earth.

You can download the database file [here](https://dsserver-prod-resources-1.s3.amazonaws.com/257/factbook.db).

First, let's connect to the database:

In [2]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

Query information about the tables in the database:

In [3]:
%%sql
SELECT *
  FROM sqlite_master
 WHERE type='table';

 * sqlite:///factbook.db
Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"
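
Outside a notebook, the same catalog query can be run with Python's built-in `sqlite3` module. The following is a minimal sketch, not part of the original notebook: it assumes an in-memory database that recreates the `facts` schema shown above rather than opening the real `factbook.db`, and the variable names `tables` and `columns` are illustrative.

```python
import sqlite3

# Assumption: an in-memory stand-in for factbook.db.
# Replace ":memory:" with "factbook.db" to query the downloaded file instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE "facts" (
        "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
        "code" varchar(255) NOT NULL,
        "name" varchar(255) NOT NULL,
        "area" integer, "area_land" integer, "area_water" integer,
        "population" integer, "population_growth" float,
        "birth_rate" float, "death_rate" float, "migration_rate" float
    )
    """
)

# sqlite_master lists every schema object; keep only the tables.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    )
]

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(facts)")]

print(tables)
print(columns)
```

`PRAGMA table_info` is often easier to read than the raw `sql` column when only the column names are needed.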


The first five rows of the `facts` table:

In [4]:
%%sql
SELECT * FROM facts LIMIT 5;

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


Calculate some summary statistics:

In [5]:
%%sql

SELECT min(population), max(population), min(population_growth), max(population_growth)
FROM facts;

 * sqlite:///factbook.db
Done.


min(population),max(population),min(population_growth),max(population_growth)
0,7256490011,0.0,4.02


A minimum population of 0 and a maximum of more than 7.2 billion both look suspicious. Find the rows with the minimum population:

In [6]:
%%sql

SELECT name FROM facts
WHERE population = (SELECT min(population) FROM facts);

 * sqlite:///factbook.db
Done.


name
Antarctica


Explore countries with the maximum population:

In [7]:
%%sql

SELECT name FROM facts
WHERE population = (SELECT max(population) FROM facts);

 * sqlite:///factbook.db
Done.


name
World


Let's exclude the row for the whole world and recalculate the statistics:

In [8]:
%%sql

SELECT min(population), max(population), min(population_growth), max(population_growth)
FROM facts
WHERE name != 'World';

 * sqlite:///factbook.db
Done.


min(population),max(population),min(population_growth),max(population_growth)
0,1367485388,0.0,4.02
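
The effect of the filter can be reproduced on a small scale. This is a sketch under stated assumptions: an in-memory table holding only four rows whose populations are copied from the outputs above, so the filtered maximum here is the largest value in the sample, not the figure returned by the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT NOT NULL, population INTEGER)")

# Assumption: a tiny stand-in for factbook.db with populations taken from the outputs above.
conn.executemany(
    "INSERT INTO facts (name, population) VALUES (?, ?)",
    [
        ("Afghanistan", 32564342),
        ("Algeria", 39542166),
        ("Antarctica", 0),
        ("World", 7256490011),
    ],
)

query = "SELECT MIN(population), MAX(population) FROM facts"

# Same aggregate, with and without the aggregate 'World' row.
with_world = conn.execute(query).fetchone()
without_world = conn.execute(query + " WHERE name != 'World'").fetchone()

print(with_world)
print(without_world)
```

Only the maximum changes, because `World` is the single row that holds it; the minimum of 0 still comes from `Antarctica`.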


Calculate the average population and area, again excluding `World`:

In [11]:
%%sql

SELECT AVG(population) AS average_population, AVG(area) AS average_area
FROM facts
WHERE name != 'World';

 * sqlite:///factbook.db
Done.


average_population,average_area
32242666.56846473,555093.546184739
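
One detail worth keeping in mind when reading these averages: `AVG` ignores `NULL` values, so the two columns can be averaged over different numbers of rows. The sketch below illustrates this on an in-memory sample. It is an assumption-laden example: the `Antarctica` area is left `NULL` only because it is not shown in the outputs above, not because that is its value in the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT NOT NULL, population INTEGER, area INTEGER)"
)

# Assumption: three sample rows; the Antarctica area is deliberately NULL here.
conn.executemany(
    "INSERT INTO facts (name, population, area) VALUES (?, ?, ?)",
    [
        ("Albania", 3029278, 28748),
        ("Andorra", 85580, 468),
        ("Antarctica", 0, None),
    ],
)

# COUNT(column) counts non-NULL values, which is the denominator AVG(column) uses.
avg_pop, avg_area, n_pop, n_area = conn.execute(
    """
    SELECT AVG(population), AVG(area), COUNT(population), COUNT(area)
      FROM facts
     WHERE name != 'World'
    """
).fetchone()

print(avg_pop, n_pop)
print(avg_area, n_area)
```

Adding `COUNT(population)` and `COUNT(area)` to the notebook query would show how many rows actually contribute to each average.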


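Find densely populated countries, defined here as those with an above-average population and a below-average area. Note that the subqueries take their averages over all rows, including `World`: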

In [12]:
%%sql

SELECT name
FROM facts
WHERE population > (SELECT AVG(population) FROM facts) -- population above average
  AND area < (SELECT AVG(area) FROM facts); -- area below average

 * sqlite:///factbook.db
Done.


name
Bangladesh
Germany
Japan
Philippines
Thailand
United Kingdom
Vietnam
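
The two-threshold filter above is only a rough proxy for density. An alternative is to rank rows by the ratio of population to area. The sketch below is an assumption, not the notebook's method: it uses an in-memory table with the five rows shown earlier, and the alias `density` is introduced purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT NOT NULL, area INTEGER, population INTEGER)"
)

# Assumption: only the five rows shown in the first-five-rows output above.
conn.executemany(
    "INSERT INTO facts (name, area, population) VALUES (?, ?, ?)",
    [
        ("Afghanistan", 652230, 32564342),
        ("Albania", 28748, 3029278),
        ("Algeria", 2381741, 39542166),
        ("Andorra", 468, 85580),
        ("Angola", 1246700, 19625353),
    ],
)

# CAST to REAL avoids integer division; area > 0 skips rows with no usable area.
density_query = """
    SELECT name, CAST(population AS REAL) / area AS density
      FROM facts
     WHERE name != 'World' AND area > 0
     ORDER BY density DESC
"""
ranking = conn.execute(density_query).fetchall()

for name, density in ranking:
    print(f"{name}: {density:.1f} people per unit of area")
```

On this sample the ranking starts with Andorra, a country the threshold query cannot return because its population is far below average.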
