# Guided Project: Analyzing CIA Factbook Data Using SQL

> This project analyzes data from the [CIA World Factbook](<https://www.cia.gov/library/publications/the-world-factbook/>),
> a compendium of statistics about all of the countries on Earth.
> The Factbook contains demographic information such as:
> 
> * population - The population as of 2015.
> * population_growth - The annual population growth rate, as a 
> percentage. 
> * area - The total land and water area.
>
> The SQLite database can be downloaded here: [factbook.db](<https://dsserver-prod-resources-1.s3.amazonaws.com/257/factbook.db>)

In [1]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

In [2]:
%%sql
SELECT * FROM sqlite_master WHERE type='table';

Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"
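
The schema string above is hard to scan on one line. As a minimal sketch, assuming only Python's standard `sqlite3` module and a toy in-memory table that copies a few of the column definitions shown above (it is not the real `factbook.db`), `PRAGMA table_info` lists one column per row:

```python
import sqlite3

# Toy in-memory database; the real notebook connects to factbook.db instead.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE facts (
        id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
        code varchar(255) NOT NULL,
        name varchar(255) NOT NULL,
        area integer,
        population integer,
        population_growth float
    )
""")

# Each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(facts)").fetchall()
column_names = [row[1] for row in columns]

for cid, name, col_type, notnull, default, pk in columns:
    print(name, col_type, "NOT NULL" if notnull else "", "PK" if pk else "")
```

In the notebook itself, the same statement could be run in a `%%sql` cell against the real table.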


In [3]:
%%sql
SELECT * FROM facts limit 5;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


### Here are the descriptions of the columns:

|Column Name|Description|
|:----|:----|
|name|The name of the country.|
|area|The country's total area (both land and water).|
|area_land|The country's land area in square kilometers.|
|area_water|The country's water area in square kilometers.|
|population|The country's population.|
|population_growth|The country's population growth as a percentage.|
|birth_rate|The country's birth rate, or the number of births a year per 1,000 people.|
|death_rate|The country's death rate, or the number of deaths a year per 1,000 people.|

In [4]:
# calculate some summary statistics and
# look for any outlier countries

In [5]:
%%sql
SELECT MIN(population) AS minimum_population, 
MAX(population) AS maximum_population, 
MIN(population_growth) AS minimum_population_growth, 
MAX(population_growth) AS maximum_population_growth FROM facts;

Done.


minimum_population,maximum_population,minimum_population_growth,maximum_population_growth
0,7256490011,0.0,4.02


In [6]:
%%sql
SELECT name AS countries_with_least_population FROM facts 
WHERE population == (SELECT MIN(population) FROM facts);

Done.


countries_with_least_population
Antarctica


In [7]:
%%sql
SELECT name AS countries_with_maximum_population FROM facts 
WHERE population == (SELECT MAX(population) FROM facts);

Done.


countries_with_maximum_population
World


> It seems like the table contains a row for the whole world, 
> which explains the population of over 7.2 billion. 
>
> It also seems like the table contains a row for Antarctica, 
> which explains the population of 0. This seems to match the CIA 
> Factbook [page for Antarctica](<https://www.cia.gov/library/publications/the-world-factbook/geos/ay.html>).

In [8]:
# calculating some averages

In [9]:
%%sql
SELECT AVG(population) AS population_average, AVG(area) 
AS area_average FROM facts;

Done.


population_average,area_average
62094928.32231405,555093.546184739
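
Because the table includes an aggregate row named `World`, that row is counted in `AVG(population)` and pulls the average upward. The sketch below illustrates the effect on a toy in-memory table (three rows with populations taken from the outputs above, not the full dataset); the `WHERE name <> 'World'` filter is one possible way to exclude it, offered as an assumption rather than the notebook's method:

```python
import sqlite3

# Toy data: population values copied from the query outputs above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("Albania", 3029278),
        ("Andorra", 85580),
        ("World", 7256490011),  # aggregate row, not a country
    ],
)

avg_with_world = conn.execute(
    "SELECT AVG(population) FROM facts"
).fetchone()[0]

avg_without_world = conn.execute(
    "SELECT AVG(population) FROM facts WHERE name <> 'World'"
).fetchone()[0]

print(avg_with_world, avg_without_world)
```

The same filter could be added to the subqueries used later when comparing countries against the average.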


In [10]:
# find countries that are densely populated

In [11]:
%%sql
SELECT name AS densely_populated_countries FROM facts WHERE 
population > (SELECT AVG(population) FROM facts) and 
area < (SELECT AVG(area) FROM facts);

Done.


densely_populated_countries
Bangladesh
Germany
Japan
Philippines
Thailand
United Kingdom
Vietnam


In [12]:
# countries with the highest ratios of water to land

In [13]:
%%sql
SELECT name AS countries_with_highest_water_to_land_ratio, 
CAST(area_water AS Float)/CAST(area_land AS Float) 
AS ratios_of_water_to_land FROM facts 
ORDER BY ratios_of_water_to_land DESC LIMIT 5;

Done.


countries_with_highest_water_to_land_ratio,ratios_of_water_to_land
British Indian Ocean Territory,905.6666666666666
Virgin Islands,4.520231213872832
Puerto Rico,0.5547914317925592
"Bahamas, The",0.3866133866133866
Guinea-Bissau,0.2846728307254623


In [14]:
# countries that have more water than land

In [15]:
%%sql
SELECT name AS countries_having_more_water_than_land, 
area_water, area_land FROM facts WHERE area_water > area_land;

Done.


countries_having_more_water_than_land,area_water,area_land
British Indian Ocean Territory,54340,60
Virgin Islands,1564,346


In [16]:
# countries with the highest population growth rates
# (note: this ranks by percentage rate, not by the absolute number of people added)

In [17]:
%%sql
SELECT name AS countries_adding_most_people_next_year, 
population_growth FROM facts ORDER BY population_growth 
DESC LIMIT 10;

Done.


countries_adding_most_people_next_year,population_growth
South Sudan,4.02
Malawi,3.32
Burundi,3.28
Niger,3.25
Uganda,3.24
Qatar,3.07
Burkina Faso,3.03
Mali,2.98
Cook Islands,2.95
Iraq,2.93


In [18]:
# countries with a higher death rate than birth rate

In [19]:
%%sql
SELECT name AS countries_with_higher_death_rate,
death_rate, birth_rate FROM facts WHERE death_rate > birth_rate;

Done.


countries_with_higher_death_rate,death_rate,birth_rate
Austria,9.42,9.41
Belarus,13.36,10.7
Bosnia and Herzegovina,9.75,8.87
Bulgaria,14.44,8.92
Croatia,12.18,9.45
Czech Republic,10.34,9.63
Estonia,12.4,10.51
Germany,11.42,8.47
Greece,11.09,8.66
Hungary,12.73,9.16
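
The list above shows which countries have more deaths than births, but not by how much. As a rough sketch, the gap can be computed and sorted directly. The toy in-memory table below uses three rows copied from the output above, and the `natural_decrease` alias is a hypothetical name:

```python
import sqlite3

# Toy data: three rows copied from the query output above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, birth_rate REAL, death_rate REAL)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Austria", 9.41, 9.42),
        ("Bulgaria", 8.92, 14.44),
        ("Germany", 8.47, 11.42),
    ],
)

# Deaths minus births per 1,000 people, largest gap first.
gap_rows = conn.execute("""
    SELECT name, ROUND(death_rate - birth_rate, 2) AS natural_decrease
    FROM facts
    WHERE death_rate > birth_rate
    ORDER BY natural_decrease DESC
""").fetchall()

names_by_gap = [name for name, _ in gap_rows]
for name, gap in gap_rows:
    print(name, gap)
```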
