In this project, we'll work with data from the CIA World Factbook, a compendium of statistics about all of the countries on Earth. The Factbook contains demographic information like the following:

* population: the population of the country.
* population_growth: the annual population growth rate, as a percentage.
* area: the total land and water area.

In [1]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

We start by listing the tables in the database:

In [4]:
%%sql
SELECT *
  FROM sqlite_master
 WHERE type='table';

Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"
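
Outside a notebook, the same catalog lookup can be run with Python's built-in sqlite3 module. The sketch below is illustrative only: so that it runs without factbook.db, it builds an in-memory stand-in with an abbreviated `facts` schema (an assumption, not the real table) and then lists the tables.

```python
import sqlite3

# In-memory stand-in for factbook.db (assumption: abbreviated schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "
    "code varchar(255) NOT NULL, "
    "name varchar(255) NOT NULL, "
    "population integer)"
)

# sqlite_master holds one row per schema object; keep only the tables.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
print(tables)
```

The `sqlite_sequence` table appears alongside `facts` because SQLite creates it automatically when a table declares an AUTOINCREMENT column.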


In [6]:
%%sql
SELECT *
 FROM facts
LIMIT 5

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


Let's start by calculating some summary statistics and looking for outlier countries.

In [7]:
%%sql
SELECT MIN(population), MAX(population), MIN(population_growth), MAX(population_growth)
 FROM facts;

Done.


MIN(population),MAX(population),MIN(population_growth),MAX(population_growth)
0,7256490011,0.0,4.02


We see a few interesting things in the summary statistics above:

* There's a country with a population of 0.
* There's a country with a population of 7256490011 (more than 7.2 billion people).

In [9]:
%%sql
SELECT name
 FROM facts
 WHERE population = (SELECT MIN(population) FROM facts);

Done.


name
Antarctica


In [10]:
%%sql
SELECT name
 FROM facts
 WHERE population = (SELECT MAX(population) FROM facts);

Done.


name
World


Now that we know this, we should recalculate the summary statistics we calculated earlier — this time excluding the row for the whole world.

In [12]:
%%sql
SELECT MIN(population), MAX(population), MIN(population_growth), MAX(population_growth)
 FROM facts
 WHERE population < (SELECT MAX(population) FROM facts);

Done.


MIN(population),MAX(population),MIN(population_growth),MAX(population_growth)
0,1367485388,0.0,4.02


Next, we calculate the average population and the average area, again excluding the row for the whole world.

In [13]:
%%sql
SELECT AVG(population), AVG(area)
 FROM facts
 WHERE population < (SELECT MAX(population) FROM facts)

Done.


AVG(population),AVG(area)
32242666.56846473,582949.8523206752
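
To see why the World row matters for averages, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a cut-down, in-memory `facts` table holding only the five sample rows shown earlier plus the World row. It is not the full dataset, so its numbers differ from the output above.

```python
import sqlite3

# Assumption: a tiny stand-in table, not the real factbook.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?)", [
    ("Afghanistan", 32564342),
    ("Albania", 3029278),
    ("Algeria", 39542166),
    ("Andorra", 85580),
    ("Angola", 19625353),
    ("World", 7256490011),
])

# Average over every row, including the aggregate World row.
avg_all = conn.execute("SELECT AVG(population) FROM facts").fetchone()[0]

# Average over countries only.
avg_countries = conn.execute(
    "SELECT AVG(population) FROM facts WHERE name <> 'World'"
).fetchone()[0]

print(f"with World:    {avg_all:,.1f}")
print(f"without World: {avg_countries:,.1f}")
```

A single aggregate row is enough to pull the average far above any individual country in the sample, which is why it has to be filtered out before country-level averages are computed.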


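Next, we'll look for countries that are likely to be densely populated, meaning those with both of the following:

* Above-average values for population.
* Below-average values for area.

Note that the subqueries below average over every row in facts, including the World row, so the population threshold is higher than the average calculated above.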

In [38]:
%%sql
SELECT name AS Most_densely_populated
  FROM facts
 WHERE population > (SELECT AVG(population) FROM facts)
   AND area < (SELECT AVG(area) FROM facts);

Done.


Most_densely_populated
Bangladesh
Germany
Japan
Philippines
Thailand
United Kingdom
Vietnam


We'll finish by answering the following questions:

* Which country has the most people? Which country has the highest growth rate?

In [44]:
%%sql
SELECT name, MAX(population) as Most_people
 FROM facts
 WHERE name <> 'World';

Done.


name,Most_people
China,1367485388


In [43]:
%%sql
SELECT name, MAX(population_growth) as Highest_pop_growth
 FROM facts
 WHERE name <> 'World';

Done.


name,Highest_pop_growth
South Sudan,4.02
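
The two queries above select a plain column (`name`) next to `MAX(...)` without a GROUP BY. SQLite documents that the plain column is then taken from the row holding the maximum, but many other databases reject such a query. A more portable formulation sorts and takes the first row. The sketch below compares both forms on an assumed in-memory table containing only a few sample rows from earlier, not the full dataset.

```python
import sqlite3

# Assumption: a tiny stand-in table, not the real factbook.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?)", [
    ("Afghanistan", 32564342),
    ("Albania", 3029278),
    ("Algeria", 39542166),
    ("World", 7256490011),
])

# SQLite-specific: bare column next to a single MAX() aggregate.
bare = conn.execute(
    "SELECT name, MAX(population) FROM facts WHERE name <> 'World'"
).fetchone()

# Portable alternative: order by the value and keep the top row.
portable = conn.execute(
    "SELECT name, population FROM facts "
    "WHERE name <> 'World' ORDER BY population DESC LIMIT 1"
).fetchone()

print(bare, portable)
```

Both forms return the same row in SQLite. The ORDER BY version also extends to a top-N list by changing the LIMIT.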


* Which countries have the highest ratios of water to land? Which countries have more water than land?

In [84]:
%%sql

SELECT name,
       CAST(area_water AS FLOAT)/area_land as water_to_land_ratio
FROM facts
WHERE name <> 'World'
ORDER BY water_to_land_ratio DESC
LIMIT 5;

Done.


name,water_to_land_ratio
British Indian Ocean Territory,905.6666666666666
Virgin Islands,4.520231213872832
Puerto Rico,0.5547914317925592
"Bahamas, The",0.3866133866133866
Guinea-Bissau,0.2846728307254623


In [73]:
%%sql

SELECT name,
       CAST(area_water AS FLOAT)/area_land as more_water_than_land
FROM facts
WHERE name <> 'World' AND more_water_than_land > 1;

Done.


name,more_water_than_land
British Indian Ocean Territory,905.6666666666666
Virgin Islands,4.520231213872832
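
The CAST in the ratio queries matters because `area_water` and `area_land` are integer columns, and SQLite performs integer division when both operands are integers. The short sketch below illustrates this with Albania's values from the sample rows (1350 and 27398), and also shows that SQLite returns NULL for division by zero, so rows with a land area of 0 produce a NULL ratio.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

int_ratio, float_ratio, zero_div = conn.execute(
    "SELECT 1350 / 27398, "               # integer / integer truncates
    "CAST(1350 AS FLOAT) / 27398, "       # cast first to get a real result
    "CAST(1350 AS FLOAT) / 0"             # division by zero yields NULL
).fetchone()

print(int_ratio, float_ratio, zero_div)
```

Because a NULL ratio never satisfies a comparison such as `> 1`, rows with `area_land = 0` drop out of the "more water than land" filter.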


* Which countries will add the most people to their populations next year?

In [83]:
%%sql
SELECT name,
       ROUND(population * (1 + population_growth/100) - population)
       as highest_absolute_growth
FROM facts
WHERE name <> 'World'
ORDER BY highest_absolute_growth DESC
LIMIT 5;

Done.


name,highest_absolute_growth
India,15270686.0
China,6153684.0
Nigeria,4448270.0
Pakistan,2906653.0
Ethiopia,2874562.0
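
The expression `population * (1 + population_growth/100) - population` simplifies algebraically to `population * population_growth / 100`. As a sanity check, the sketch below applies the simplified form to an assumed in-memory table containing only the five sample rows shown earlier, so its ranking differs from the full-dataset result above.

```python
import sqlite3

# Assumption: a tiny stand-in table, not the real factbook.db.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, population_growth FLOAT)"
)
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("Afghanistan", 32564342, 2.32),
    ("Albania", 3029278, 0.3),
    ("Algeria", 39542166, 1.84),
    ("Andorra", 85580, 0.12),
    ("Angola", 19625353, 2.78),
])

# People added in one year if the current growth rate holds.
growth = conn.execute(
    "SELECT name, ROUND(population * population_growth / 100) AS added "
    "FROM facts ORDER BY added DESC"
).fetchall()

for name, added in growth:
    print(f"{name}: {added:,.0f}")
```

The ordering depends on both population and growth rate: in this sample Angola has the highest rate but ranks third because its population is smaller.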


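* Which countries have a higher death rate than birth rate? (The Factbook expresses both rates per 1,000 people, so the difference below is also per 1,000 people rather than a percentage, despite the column name.)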

In [100]:
%%sql

SELECT name, birth_rate, death_rate, death_rate - birth_rate as percentage_diff
FROM facts
WHERE birth_rate < death_rate
ORDER BY percentage_diff DESC
LIMIT 10;

Done.


name,birth_rate,death_rate,percentage_diff
Bulgaria,8.92,14.44,5.52
Serbia,9.08,13.66,4.58
Latvia,10.0,14.31,4.3100000000000005
Lithuania,10.1,14.27,4.17
Ukraine,10.72,14.46,3.74
Hungary,9.16,12.73,3.5700000000000003
Germany,8.47,11.42,2.9499999999999997
Slovenia,8.42,11.37,2.9499999999999997
Romania,9.14,11.9,2.76
Croatia,9.45,12.18,2.7300000000000004
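
Values such as 4.3100000000000005 in the output above are ordinary binary floating-point artifacts, not data errors. One way to tidy them, sketched here on Latvia's rates from the table, is to wrap the difference in ROUND with two decimal places.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Latvia: death_rate 14.31, birth_rate 10.0 (values from the table above).
raw, rounded = conn.execute(
    "SELECT 14.31 - 10.0, ROUND(14.31 - 10.0, 2)"
).fetchone()

print(raw, rounded)
```

Rounding only changes how the value is presented. If the rounded expression is also used in ORDER BY, rows that differ beyond the second decimal place become ties.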


* Which countries have the highest population/area ratio, and how does that list compare to the densely populated countries we found earlier?

In [102]:
%%sql

SELECT name,
       ROUND(CAST(population AS FLOAT)/area, 2) as population_area_ratio,
       death_rate > birth_rate
FROM facts
WHERE name <> 'World'
ORDER BY population_area_ratio DESC
LIMIT 10;

Done.


name,population_area_ratio,death_rate > birth_rate
Macau,21168.96,0
Monaco,15267.5,1
Singapore,8141.28,0
Hong Kong,6445.04,0
Gaza Strip,5191.82,0
Gibraltar,4876.33,0
Bahrain,1771.86,0
Maldives,1319.64,0
Malta,1310.02,0
Bermuda,1299.93,0


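Among the 10 countries with the highest population/area ratio, only Monaco has a higher death rate than birth rate. None of the seven countries from the earlier list (above-average population, below-average area) appears in this top 10, which consists of small countries and territories.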