# Analyzing CIA Factbook Data Using SQL

In this project, we'll work with data from the CIA World Factbook, a compendium of statistics about all of the countries on Earth. The Factbook contains demographic information like:

* population - The population as of 2015.
* population_growth - The annual population growth rate, as a percentage.
* area - The total land and water area, in square kilometers.

In [1]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

In [2]:
%%sql
SELECT * 
  FROM facts 
  LIMIT 5;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46
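
The same preview can be reproduced outside the notebook with Python's built-in `sqlite3` module. The following is a minimal sketch, not the notebook's own setup: it assumes an in-memory database standing in for `factbook.db`, and the column types in the `CREATE TABLE` statement are guesses based on the values shown above.

```python
import sqlite3

# Assumption: an in-memory database stands in for factbook.db.
conn = sqlite3.connect(":memory:")

# Assumption: column types are inferred from the sample output, not from the real schema.
conn.execute(
    """
    CREATE TABLE facts (
        id INTEGER PRIMARY KEY, code TEXT, name TEXT,
        area INTEGER, area_land INTEGER, area_water INTEGER,
        population INTEGER, population_growth REAL,
        birth_rate REAL, death_rate REAL, migration_rate REAL
    )
    """
)

# The five rows shown in the output above.
sample_rows = [
    (1, "af", "Afghanistan", 652230, 652230, 0, 32564342, 2.32, 38.57, 13.89, 1.51),
    (2, "al", "Albania", 28748, 27398, 1350, 3029278, 0.3, 12.92, 6.58, 3.3),
    (3, "ag", "Algeria", 2381741, 2381741, 0, 39542166, 1.84, 23.67, 4.31, 0.92),
    (4, "an", "Andorra", 468, 468, 0, 85580, 0.12, 8.13, 6.96, 0.0),
    (5, "ao", "Angola", 1246700, 1246700, 0, 19625353, 2.78, 38.78, 11.49, 0.46),
]
conn.executemany("INSERT INTO facts VALUES (?,?,?,?,?,?,?,?,?,?,?)", sample_rows)

# Same query as the notebook cell; column names come from cursor.description.
cursor = conn.execute("SELECT * FROM facts LIMIT 5")
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(column_names)
for row in rows:
    print(row)

conn.close()
```

Against the real file, `sqlite3.connect("factbook.db")` would replace the in-memory connection and the table-creation steps would be dropped.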


## Calculating Summary Statistics and Looking for Outlier Countries

In [3]:
%%sql
SELECT MIN(population) AS min_pop,
       MAX(population) AS max_pop,
       MIN(population_growth) AS min_pop_grwth,
       MAX(population_growth) AS max_pop_grwth
  FROM facts;

Done.


min_pop,max_pop,min_pop_grwth,max_pop_grwth
0,7256490011,0.0,4.02


One row has a population of 0, and another has a population of more than 7.2 billion. Let's take a closer look at both.

Countries with the minimum population:

In [4]:
%%sql
SELECT *
  FROM facts
  WHERE population = (SELECT MIN(population) FROM facts);

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
250,ay,Antarctica,,280000,,0,,,,


The only row with a population of 0 is Antarctica, which has no permanent residents.

Countries with the maximum population:

In [5]:
%%sql
SELECT *
  FROM facts
  WHERE population = (SELECT MAX(population) FROM facts);

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
261,xx,World,,,,7256490011,1.08,18.6,7.8,
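
The two lookups above can also be combined into one query that returns both extremes at once. This is a minimal sketch using Python's `sqlite3` module and a small in-memory table. The two-column schema is a simplification assumed for illustration, and the rows are taken from the outputs above.

```python
import sqlite3

# Assumption: a reduced two-column table is enough to illustrate the subqueries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("Afghanistan", 32564342),
        ("Andorra", 85580),
        ("Antarctica", 0),
        ("World", 7256490011),
    ],
)

# Each scalar subquery is evaluated once; rows matching either value are kept.
extremes = conn.execute(
    """
    SELECT name, population
      FROM facts
     WHERE population = (SELECT MIN(population) FROM facts)
        OR population = (SELECT MAX(population) FROM facts)
     ORDER BY population
    """
).fetchall()

print(extremes)  # [('Antarctica', 0), ('World', 7256490011)]
conn.close()
```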


The table contains a row for the whole world, which explains the population of more than 7.2 billion. Let's recalculate the summary statistics, this time excluding that row.

In [6]:
%%sql
SELECT MIN(population) AS min_pop,
       MAX(population) AS max_pop,
       MIN(population_growth) AS min_pop_growth,
       MAX(population_growth) AS max_pop_growth 
  FROM facts
 WHERE name <> 'World';

Done.


min_pop,max_pop,min_pop_growth,max_pop_growth
0,1367485388,0.0,4.02
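
To see the effect of the `WHERE name <> 'World'` filter in isolation, the sketch below runs the same aggregate with and without it. It assumes a small in-memory table built from three rows that appear in this notebook's outputs. The excluded name is passed as a bound parameter, which is a choice made for this example rather than something the notebook does.

```python
import sqlite3

# Assumption: a reduced in-memory table stands in for the full facts table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("Afghanistan", 32564342),
        ("China", 1367485388),
        ("World", 7256490011),
    ],
)

# Without a filter, the aggregate row for the whole world wins.
max_all = conn.execute("SELECT MAX(population) FROM facts").fetchone()[0]

# With the filter, only actual countries are compared.
max_countries = conn.execute(
    "SELECT MAX(population) FROM facts WHERE name <> ?", ("World",)
).fetchone()[0]

print(max_all)        # 7256490011
print(max_countries)  # 1367485388
conn.close()
```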


## Exploring Average Population and Area

In [7]:
%%sql
SELECT AVG(population) AS avg_population, AVG(area) AS avg_area
  FROM facts
 WHERE name <> 'World';

Done.


avg_population,avg_area
32242666.56846473,555093.546184739
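
One detail worth keeping in mind when reading these averages: SQLite's `AVG` skips `NULL` values, so rows with a missing `area` (such as Antarctica above) do not count toward `avg_area`. The sketch below illustrates this on a three-row in-memory table. The reduced schema is an assumption made for the example.

```python
import sqlite3

# Assumption: a reduced in-memory table with one NULL area, mirroring the Antarctica row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, area INTEGER, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Afghanistan", 652230, 32564342),
        ("Andorra", 468, 85580),
        ("Antarctica", None, 0),  # None is stored as NULL
    ],
)

avg_area, rows_with_area, total_rows, naive_avg = conn.execute(
    """
    SELECT AVG(area),                  -- averages only non-NULL areas
           COUNT(area),                -- counts only non-NULL areas
           COUNT(*),                   -- counts every row
           SUM(area) * 1.0 / COUNT(*)  -- what dividing by all rows would give
      FROM facts
    """
).fetchone()

print(avg_area)        # 326349.0
print(rows_with_area)  # 2
print(total_rows)      # 3
print(naive_avg)       # 217566.0
conn.close()
```

Because of this, `avg_population` and `avg_area` can be computed over different numbers of rows.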


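## Finding Densely Populated Countries

The query below selects countries whose population and area are both above the table-wide averages, which picks out large, populous countries rather than dense ones. The subqueries average over every row, so the population average still includes the row for the whole world.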

In [8]:
%%sql
SELECT *
  FROM facts
  WHERE population > (SELECT AVG(population) FROM facts)
  AND area > (SELECT AVG(area) FROM facts);

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
24,br,Brazil,8515770,8358140.0,157630.0,204259812,0.77,14.46,6.58,0.14
37,ch,China,9596960,9326410.0,270550.0,1367485388,0.45,12.49,7.53,0.44
40,cg,"Congo, Democratic Republic of the",2344858,2267048.0,77810.0,79375136,2.45,34.88,10.07,0.27
53,eg,Egypt,1001450,995450.0,6000.0,88487396,1.79,22.9,4.77,0.19
58,et,Ethiopia,1104300,,104300.0,99465819,2.89,37.27,8.19,0.22
61,fr,France,643801,640427.0,3374.0,66553766,0.43,12.38,9.16,1.09
77,in,India,3287263,2973193.0,314070.0,1251695584,1.22,19.55,7.32,0.04
78,id,Indonesia,1904569,1811569.0,93000.0,255993674,0.92,16.72,6.37,1.16
79,ir,Iran,1648195,1531595.0,116600.0,81824270,1.2,17.99,5.94,0.07
114,mx,Mexico,1964375,1943945.0,20430.0,121736809,1.18,18.78,5.26,1.68
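
One way to measure density directly is to divide population by area and sort by the result. The following is a minimal sketch of that idea, not the query used above. It assumes a small in-memory table built from four rows shown in this notebook, and the `density` alias is a name introduced here for illustration.

```python
import sqlite3

# Assumption: a reduced in-memory table stands in for the full facts table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, area INTEGER, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Andorra", 468, 85580),
        ("Brazil", 8515770, 204259812),
        ("France", 643801, 66553766),
        ("India", 3287263, 1251695584),
    ],
)

# CAST avoids integer division; area > 0 also filters out NULL areas.
density_rows = conn.execute(
    """
    SELECT name,
           CAST(population AS REAL) / area AS density
      FROM facts
     WHERE area > 0
     ORDER BY density DESC
    """
).fetchall()

for name, density in density_rows:
    print(f"{name}: {density:.2f} people per unit of area")
conn.close()
```

On the full table, a `name <> 'World'` filter would not change this ranking, since the World row has no `area`, but it keeps the intent explicit.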
