# CIA World Factbook Data
This notebook works with data from the [CIA World Factbook](https://www.cia.gov/library/publications/the-world-factbook/), a compendium of statistics about all of the countries on Earth.

The Factbook contains demographic and geographic information such as:

Data | Description
 --- | ---
population | The population as of 2015.
population_growth | The annual population growth rate, as a percentage.
area | The total land and water area.

## Connecting to the database

In [7]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

Querying `sqlite_master` to list the tables available in the database:

In [13]:
%%sql
SELECT *
  FROM sqlite_master
WHERE type='table';

Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"
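
For readers working outside the notebook magics, the same table listing can be sketched with Python's standard `sqlite3` module. This is a minimal sketch, not the notebook's own method: the helper name `list_tables` is hypothetical, and the in-memory database with a simplified `facts` schema is a stand-in so the example runs without `factbook.db`.

```python
import sqlite3


def list_tables(conn):
    """Return the names of user-defined tables in a SQLite database.

    Internal bookkeeping tables such as sqlite_sequence are skipped
    by filtering out names that start with 'sqlite'.
    """
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# In-memory stand-in (assumption); for the real file, use
# sqlite3.connect("factbook.db") instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (id INTEGER PRIMARY KEY, name TEXT, population INTEGER)"
)

tables = list_tables(conn)
print(tables)
```

Against the real database this would be expected to return only `facts`, since `sqlite_sequence` is filtered out.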


Previewing the first five rows of the `facts` table:

In [15]:
%%sql
select * from facts limit 5

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


## Details regarding the attributes in the table

Attribute | Description
 --- | --- 
name | The name of the country.
area | The country's total area (both land and water) in square kilometers.
area_land | The country's land area in square kilometers.
area_water | The country's water area in square kilometers.
population | The country's population.
population_growth | The country's population growth as a percentage.
birth_rate | The country's birth rate, or the number of births a year per 1,000 people.
death_rate | The country's death rate, or the number of deaths a year per 1,000 people.

## Exploratory Analysis

In [16]:
%%sql
select min(population),
    max(population), 
    min(population_growth), 
    max(population_growth)
from facts

Done.


min(population),max(population),min(population_growth),max(population_growth)
0,7256490011,0.0,4.02


**Observations**:

* One row has a population of 0, i.e. no inhabitants.
* One row has a population of about 7.26 billion, roughly the population of the entire world, so it cannot be a single country.

### Querying the countries with the minimum and maximum populations

* Maximum population 

In [21]:
%%sql
select
    name country_name
from facts 
where population = (select
                        max(population)
                    from facts)

Done.


country_name
World


* Minimum population 

In [22]:
%%sql
select
    name country_name
from facts 
where population = (select
                        min(population)
                    from facts)

Done.


country_name
Antarctica


The table contains a row for the whole world, which explains the population of over 7.2 billion, and a row for Antarctica, which explains the population of 0. Excluding the World row gives summary statistics for individual countries:

In [24]:
%%sql
select 
    min(population),
    max(population), 
    min(population_growth), 
    max(population_growth)
from facts 
where population < (select
                        max(population)
                    from facts)

Done.


min(population),max(population),min(population_growth),max(population_growth)
0,1367485388,0.0,4.02
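
The subquery filter above works only because the World row happens to hold the largest population. A more explicit alternative is to exclude that row by name. The sketch below assumes the aggregate row is named exactly `World` (as the earlier query output shows) and uses a four-row in-memory sample built from values that appear in this notebook. Any other aggregate rows in the full table would need to be excluded as well.

```python
import sqlite3

# In-memory sample (assumption): four rows whose values are copied
# from outputs shown in this notebook, not the full facts table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("World", 7256490011),
        ("China", 1367485388),
        ("Andorra", 85580),
        ("Antarctica", 0),
    ],
)

# Exclude the aggregate row by name instead of by its population value.
min_pop, max_pop = conn.execute(
    "SELECT MIN(population), MAX(population) "
    "FROM facts WHERE name <> 'World'"
).fetchone()
print(min_pop, max_pop)
```

On this sample the result matches the minimum and maximum reported by the subquery version.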


The same filter gives the average population and area per country:

In [28]:
%%sql
select 
    avg(population) avg_population,
    cast(avg(area) as float) avg_area
from facts 
where population < (select
                        max(population)
                    from facts)

Done.


avg_population,avg_area
32242666.56846473,582949.8523206752


### Identifying countries with above-average area and population

In [35]:
%%sql
select * 
from facts 
where area > (
    select cast(avg(area) as float)
    from facts 
    where population < (
        select max(population)
        from facts)
) and population > (
    select avg(population)
    from facts 
    where population < (
        select
        max(population)
        from facts
    )
)

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230.0,0.0,32564342,2.32,38.57,13.89,1.51
3,ag,Algeria,2381741,2381741.0,0.0,39542166,1.84,23.67,4.31,0.92
7,ar,Argentina,2780400,2736690.0,43710.0,43431886,0.93,16.64,7.33,0.0
24,br,Brazil,8515770,8358140.0,157630.0,204259812,0.77,14.46,6.58,0.14
28,bm,Burma,676578,653508.0,23070.0,56320206,1.01,18.39,7.96,0.28
32,ca,Canada,9984670,9093507.0,891163.0,35099836,0.75,10.28,8.42,5.66
37,ch,China,9596960,9326410.0,270550.0,1367485388,0.45,12.49,7.53,0.44
38,co,Colombia,1138910,1038700.0,100210.0,46736728,1.04,16.47,5.4,0.64
40,cg,"Congo, Democratic Republic of the",2344858,2267048.0,77810.0,79375136,2.45,34.88,10.07,0.27
53,eg,Egypt,1001450,995450.0,6000.0,88487396,1.79,22.9,4.77,0.19
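
The query above selects countries that are above average in both area and population, which is not the same thing as population density. One way to rank by density directly is to divide population by land area. This is a minimal sketch under stated assumptions: density is defined here as `population / area_land`, the column alias `people_per_sq_km` is hypothetical, and the in-memory table holds only the five preview rows shown earlier.

```python
import sqlite3

# In-memory sample (assumption): the five preview rows from this notebook.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, area_land INTEGER, population INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Afghanistan", 652230, 32564342),
        ("Albania", 27398, 3029278),
        ("Algeria", 2381741, 39542166),
        ("Andorra", 468, 85580),
        ("Angola", 1246700, 19625353),
    ],
)

# CAST avoids integer division; the WHERE clause skips rows with a
# zero or missing land area, which would otherwise divide by zero.
density = conn.execute(
    """
    SELECT name,
           CAST(population AS REAL) / area_land AS people_per_sq_km
      FROM facts
     WHERE area_land > 0
     ORDER BY people_per_sq_km DESC
    """
).fetchall()

for name, value in density:
    print(f"{name}: {value:.1f}")
```

On this sample, small Andorra ranks first, which illustrates how a density ranking can differ from one based on raw area and population.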


### To-do:
* Which country has the most people? Which country has the highest growth rate?
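
A possible starting point for the first to-do item is an `ORDER BY ... DESC LIMIT 1` query per column. The helper `top_country` and the `ALLOWED` whitelist are hypothetical names, and the in-memory rows are a small sample taken from outputs in this notebook, so the growth-rate result below reflects the sample only and not the full table (whose maximum growth rate is 4.02).

```python
import sqlite3

# Column names cannot be bound as SQL parameters, so restrict them
# to a fixed whitelist before interpolating into the query text.
ALLOWED = {"population", "population_growth"}


def top_country(conn, column):
    """Return (name, value) for the row with the largest value in column."""
    if column not in ALLOWED:
        raise ValueError(f"unsupported column: {column}")
    return conn.execute(
        f"SELECT name, {column} FROM facts "
        f"WHERE name <> 'World' AND {column} IS NOT NULL "
        f"ORDER BY {column} DESC LIMIT 1"
    ).fetchone()


# In-memory sample (assumption). The World growth rate is not shown
# in the notebook's outputs, so it is left as NULL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, population_growth REAL)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("World", 7256490011, None),
        ("China", 1367485388, 0.45),
        ("Angola", 19625353, 2.78),
        ("Andorra", 85580, 0.12),
    ],
)

most_people = top_country(conn, "population")
fastest_growth = top_country(conn, "population_growth")
print(most_people, fastest_growth)
```

Run against the full `facts` table, the same helper would answer both questions, provided `World` is the only aggregate row that needs excluding.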