# Analyzing CIA Factbook Data Using SQL

## Introduction

In [20]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

In [24]:
%%sql
SELECT * FROM sqlite_master WHERE type='table';

Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"
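
Outside a notebook, the same schema lookup can be run with Python's built-in `sqlite3` module. The sketch below is only an illustration and is not part of the original notebook: it assumes an in-memory database with a reduced copy of the `facts` table (a subset of the columns), so that it runs without `factbook.db`.

```python
import sqlite3

# Assumption: an in-memory stand-in for factbook.db with a reduced facts table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE facts (
        id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
        code varchar(255) NOT NULL,
        name varchar(255) NOT NULL,
        area integer,
        population integer
    )
    """
)

# sqlite_master holds one row per schema object; keep only the tables.
tables = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
table_names = [name for name, _ in tables]

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(facts)")]

print(table_names)
print(columns)
```

Pointing `sqlite3.connect` at the path of the real database file instead of `":memory:"` would run the same lookup against the actual data.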


## Overview of the Data

In [6]:
%%sql
SELECT * FROM facts LIMIT 5;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


## Summary Statistics

In [7]:
%%sql
SELECT MIN(population) AS "Minimum population",
       MAX(population) AS "Maximum population",
       MIN(population_growth) AS "Minimum population growth",
       MAX(population_growth) AS "Maximum population growth"
FROM facts;

Done.


Minimum population,Maximum population,Minimum population growth,Maximum population growth
0,7256490011,0.0,4.02


## Outliers

In [25]:
%%sql
SELECT * FROM facts WHERE population = 0;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
250,ay,Antarctica,,280000,,0,,,,


Antarctica has no indigenous inhabitants, which explains the minimum population of 0.

In [15]:
%%sql
SELECT * FROM facts WHERE population = (SELECT MAX(population) FROM facts);

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
261,xx,World,,,,7256490011,1.08,18.6,7.8,


The maximum population value belongs to the `World` row, which records the population of the world as a whole rather than that of a single country.

In [17]:
%%sql
SELECT * FROM facts ORDER BY population DESC LIMIT 5;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
261,xx,World,,,,7256490011,1.08,18.6,7.8,
37,ch,China,9596960.0,9326410.0,270550.0,1367485388,0.45,12.49,7.53,0.44
77,in,India,3287263.0,2973193.0,314070.0,1251695584,1.22,19.55,7.32,0.04
197,ee,European Union,4324782.0,,,513949445,0.25,10.2,10.2,2.5
186,us,United States,9826675.0,9161966.0,664709.0,321368864,0.78,12.49,8.15,3.86


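Looking at the five largest populations, the `World` row comes first because it is an aggregate. Among individual countries, China has the largest population. The `European Union` row is likewise an aggregate rather than a single country.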
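
One way to rank only individual countries is to filter the aggregate rows out by name. The following is a minimal sketch using Python's `sqlite3` module and an in-memory table seeded with the five rows shown above. It assumes that `World` and `European Union` are the only aggregates to exclude, which holds for this output but may not hold for the full table.

```python
import sqlite3

# Assumption: an in-memory table holding only the five rows shown above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (id INTEGER PRIMARY KEY, code TEXT, name TEXT, population INTEGER)"
)
rows = [
    (261, "xx", "World", 7256490011),
    (37, "ch", "China", 1367485388),
    (77, "in", "India", 1251695584),
    (197, "ee", "European Union", 513949445),
    (186, "us", "United States", 321368864),
]
conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)", rows)

# Exclude the aggregate rows by name, then rank what is left.
top_countries = conn.execute(
    """
    SELECT name, population
    FROM facts
    WHERE name NOT IN ('World', 'European Union')
    ORDER BY population DESC
    LIMIT 3
    """
).fetchall()

for name, population in top_countries:
    print(name, population)
```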

## Average

In [18]:
%%sql
SELECT AVG(population), AVG(area)
FROM facts;

Done.


AVG(population),AVG(area)
62094928.32231405,555093.546184739
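
Note that `AVG` runs over every row in `facts`, including the `World` row found earlier, so the population average is pulled upward by that aggregate. `AVG` also ignores NULL values, so a row with a NULL `area` (such as `World`) does not contribute to the area average. The sketch below illustrates both effects with Python's `sqlite3` module on a small in-memory sample of rows taken from the outputs above. The sample is an assumption made for illustration, so its averages differ from those of the full database.

```python
import sqlite3

# Assumption: a four-row sample of facts, copied from the outputs above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, area INTEGER, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("World", None, 7256490011),       # aggregate row, area is NULL
        ("China", 9596960, 1367485388),
        ("India", 3287263, 1251695584),
        ("Afghanistan", 652230, 32564342),
    ],
)

# Averages over every row, as in the query above.
with_pop, with_area = conn.execute(
    "SELECT AVG(population), AVG(area) FROM facts"
).fetchone()

# Averages with the World aggregate filtered out.
without_pop, without_area = conn.execute(
    "SELECT AVG(population), AVG(area) FROM facts WHERE name <> 'World'"
).fetchone()

print("population:", with_pop, "->", without_pop)
print("area:", with_area, "->", without_area)
```

In this sample the population average drops sharply once `World` is removed, while the area average is unchanged because the `World` row never contributed an area value.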


## Densely Populated Countries

In [19]:
%%sql
SELECT *
FROM facts
WHERE population > (SELECT AVG(population) FROM facts)
AND area < (SELECT AVG(area) FROM facts);

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
14,bg,Bangladesh,148460,130170,18290,168957745,1.6,21.14,5.61,0.46
65,gm,Germany,357022,348672,8350,80854408,0.17,8.47,11.42,1.24
85,ja,Japan,377915,364485,13430,126919659,0.16,7.93,9.51,0.0
138,rp,Philippines,300000,298170,1830,100998376,1.61,24.27,6.11,2.09
173,th,Thailand,513120,510890,2230,67976405,0.34,11.19,7.8,0.0
185,uk,United Kingdom,243610,241930,1680,64088222,0.54,12.17,9.35,2.54
192,vm,Vietnam,331210,310070,21140,94348835,0.97,15.96,5.93,0.3
