# CIA World Factbook Project

Analyze population statistics related to different countries

### Load the CIA database

In [1]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

Query the database to see which tables are available

In [3]:
%%sql
SELECT *
  FROM sqlite_master
 WHERE type='table';

Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"


As we can see, the database contains only two tables: facts and sqlite_sequence.
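
The sqlite_sequence table is SQLite's own bookkeeping: it is created automatically once a table declares an AUTOINCREMENT column, as facts does. A minimal sketch of this with Python's sqlite3 module, assuming an in-memory database and a trimmed, hypothetical copy of the facts schema rather than the real factbook.db:

```python
import sqlite3

# In-memory stand-in for factbook.db (assumption: trimmed, hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "
    "name TEXT NOT NULL, "
    "population INTEGER)"
)
conn.execute("INSERT INTO facts (name, population) VALUES ('Andorra', 85580)")

# Same catalog query as the notebook cell, restricted to the name column.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(table_names)
```

Only facts holds data of interest here; sqlite_sequence just tracks the last id that AUTOINCREMENT handed out.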

In [6]:
%%sql
SELECT *
  FROM facts
 LIMIT 5;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


Every SQL cell must begin with the %%sql cell magic.

### Find out which are the extreme cases of population and population growth

In [7]:
%%sql
SELECT MIN(population), MAX(population), MIN(population_growth), MAX(population_growth)
    FROM facts;

Done.


MIN(population),MAX(population),MIN(population_growth),MAX(population_growth)
0,7256490011,0.0,4.02


Next, we find out which rows hold these extreme population values.

In [8]:
%%sql
SELECT name, population
    FROM facts
 WHERE (population == (SELECT MAX(population)
                         FROM facts) OR
        population == (SELECT MIN(population)
                          FROM facts)
       );

Done.


name,population
Antarctica,0
World,7256490011


We see that there is a row for the total world population and another one for Antarctica. These are not countries, so we will exclude them.

In [10]:
%%sql
SELECT MAX(population), MIN(population), MAX(population_growth), MIN(population_growth)
    FROM (SELECT name, population, population_growth
              FROM facts
           WHERE name NOT IN (SELECT name
                                  FROM facts
                               WHERE (population == 
                                      (SELECT MAX(population)
                                           FROM facts) OR 
                                      population == 
                                      (SELECT MIN(population)
                                           FROM facts)
                                     )
                             )
         );

Done.


MAX(population),MIN(population),MAX(population_growth),MIN(population_growth)
1367485388,48,4.02,0.0
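
The exclusion filter above is repeated in every later query. One way to state it only once is a common table expression (CTE). The sketch below is an alternative formulation, not the query used in this notebook; it assumes an in-memory table with a few rows taken from the output above, with NULL as a placeholder for growth values not shown in the excerpt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, population_growth REAL)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Afghanistan", 32564342, 2.32),
        ("Albania", 3029278, 0.3),
        ("Andorra", 85580, 0.12),
        ("Antarctica", 0, None),      # growth not shown in the excerpt: NULL placeholder
        ("World", 7256490011, None),  # growth not shown in the excerpt: NULL placeholder
    ],
)

query = """
WITH extremes AS (
    SELECT MIN(population) AS lo, MAX(population) AS hi
      FROM facts
),
countries AS (
    -- Keep only rows strictly between the two extreme populations.
    SELECT *
      FROM facts
     WHERE population > (SELECT lo FROM extremes)
       AND population < (SELECT hi FROM extremes)
)
SELECT MAX(population), MIN(population),
       MAX(population_growth), MIN(population_growth)
  FROM countries;
"""
result = conn.execute(query).fetchone()
print(result)
```

One difference to be aware of: filtering on population drops rows whose population is NULL, whereas the name-based NOT IN filter keeps them. For MIN and MAX this does not change the result, since these aggregates ignore NULL.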


Now we use the previous query as a template to find the countries whose population is above the average and whose area is below the average.

In [11]:
%%sql
SELECT name, population, area
              FROM facts
           WHERE (name NOT IN (SELECT name
                                  FROM facts
                               WHERE (population == 
                                      (SELECT MAX(population)
                                           FROM facts
                                      ) OR population == 
                                      (SELECT MIN(population)
                                           FROM facts
                                      )
                                     )
                              )
                 ) 
           AND (population > (SELECT AVG(population)
                                  FROM facts
                               WHERE (name NOT IN (SELECT name
                                                       FROM facts
                                                    WHERE (population == 
                                                          (SELECT MAX(population)
                                                               FROM facts
                                                          ) OR population == 
                                                          (SELECT MIN(population)
                                                               FROM facts
                                                          )
                                                          )
                                                  )
                                     )
                             )
                               
               )
            AND (area < (SELECT AVG(area)
                                  FROM facts
                               WHERE (name NOT IN (SELECT name
                                                       FROM facts
                                                    WHERE (population == 
                                                          (SELECT MAX(population)
                                                               FROM facts
                                                          ) OR population == 
                                                          (SELECT MIN(population)
                                                               FROM facts
                                                          )
                                                          )
                                                  )
                                     )
                             )
                               
               )

Done.


name,population,area
Bangladesh,168957745,148460
Germany,80854408,357022
Iraq,37056169,438317
Italy,61855120,301340
Japan,126919659,377915
"Korea, South",49115196,99720
Morocco,33322699,446550
Philippines,100998376,300000
Poland,38562189,312685
Spain,48146134,505370
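
The two average comparisons can be illustrated on a small scale. This is a sketch under stated assumptions: it uses only the five rows shown earlier in the facts preview, loaded into an in-memory table, and assumes the World and Antarctica rows have already been removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER, area INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Afghanistan", 32564342, 652230),
        ("Albania", 3029278, 28748),
        ("Algeria", 39542166, 2381741),
        ("Andorra", 85580, 468),
        ("Angola", 19625353, 1246700),
    ],
)

query = """
WITH averages AS (
    -- Compute both averages once instead of once per comparison.
    SELECT AVG(population) AS avg_population, AVG(area) AS avg_area
      FROM facts
)
SELECT name, population, area
  FROM facts, averages
 WHERE population > avg_population
   AND area < avg_area
 ORDER BY name;
"""
dense_candidates = conn.execute(query).fetchall()
print(dense_candidates)
```

Among these five rows, only Afghanistan is above the sample's average population while below its average area; Algeria and Angola are populous but also larger than average.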


### Questions to answer
* What country has the most people? 
* What country has the highest growth rate?
* Which countries have the highest ratios of water to land? 
* Which countries have more water than land?
* Which countries will add the most people to their population next year?
* Which countries have a higher death rate than birth rate?
* What countries have the highest population/area ratio, and how does this compare to the list we found in the previous screen?


Now let's figure out which are the most populated and the fastest-growing countries.

In [15]:
%%sql
SELECT name, population, population_growth
              FROM facts
           WHERE (name NOT IN (SELECT name
                                  FROM facts
                               WHERE (population == 
                                      (SELECT MAX(population)
                                           FROM facts
                                      ) OR population == 
                                      (SELECT MIN(population)
                                           FROM facts
                                      )
                                     )
                              )
                 ) 
           AND ((population == (SELECT MAX(population)
                                   FROM facts
                                WHERE (name NOT IN (SELECT name
                                                        FROM facts
                                                     WHERE (population == 
                                                           (SELECT MAX(population)
                                                                FROM facts
                                                           ) OR population == 
                                                           (SELECT MIN(population)
                                                                FROM facts
                                                           )
                                                           )
                                                   )
                                      )
                              )
                               
                )
            OR (population_growth == (SELECT MAX(population_growth)
                                          FROM facts
                                      WHERE (name NOT IN (SELECT name
                                                       FROM facts
                                                    WHERE (population == 
                                                          (SELECT MAX(population)
                                                               FROM facts
                                                          ) OR population == 
                                                          (SELECT MIN(population)
                                                               FROM facts
                                                          )
                                                          )
                                                  )
                                     )
                             )
                               
                )
               )

Done.


name,population,population_growth
China,1367485388,0.45
South Sudan,12042910,4.02


Not surprisingly, China is the most populous country in the world. South Sudan is the fastest-growing one.

Now we study the countries with the highest ratio of water to land, and which have more water than land.

In [35]:
%%sql
SELECT name, (CAST(area_water AS Float)/area_land) AS water_land_ratio
    FROM facts
WHERE (name NOT IN (SELECT name
                        FROM facts
                     WHERE (population == 
                            (SELECT MAX(population)
                                 FROM facts
                            ) OR population == 
                            (SELECT MIN(population)
                                 FROM facts
                            )
                           )
                   ) AND
       (CAST(area_water AS Float)/area_land > 0.5
       )
      )
 ORDER BY (CAST(area_water AS Float)/area_land) DESC;


Done.


name,water_land_ratio
British Indian Ocean Territory,905.6666666666666
Virgin Islands,4.520231213872832
Puerto Rico,0.5547914317925592


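Only three territories pass the 0.5 threshold used in the query. Of these, only British Indian Ocean Territory and Virgin Islands have more water area than land area (a ratio above 1); Puerto Rico, at about 0.55, does not. The highest ratio by far belongs to British Indian Ocean Territory, whose water area is a whopping 906 times its land area.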
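
A ratio above 0.5 and "more water than land" are different conditions; the latter corresponds to a ratio above 1. The sketch below shows both, and also why the CAST matters: dividing two integers in SQLite truncates the result. The two "Hypothetical" rows are invented for illustration, and the table is an in-memory stand-in for facts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, area_land INTEGER, area_water INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Albania", 27398, 1350),
        ("Andorra", 468, 0),
        ("Hypothetical Atoll", 10, 9000),     # invented row: far more water than land
        ("Hypothetical Island", 1000, 550),   # invented row: ratio 0.55, still more land
    ],
)

# Without CAST, integer / integer truncates toward zero.
truncated = conn.execute("SELECT 1350 / 27398").fetchone()[0]

query = """
SELECT name,
       CAST(area_water AS REAL) / area_land AS water_land_ratio,
       area_water > area_land AS more_water_than_land
  FROM facts
 WHERE CAST(area_water AS REAL) / area_land > 0.5
 ORDER BY water_land_ratio DESC;
"""
rows = conn.execute(query).fetchall()
for name, ratio, more_water in rows:
    print(name, round(ratio, 2), bool(more_water))
```

Both hypothetical rows pass the 0.5 filter, but only the atoll is flagged as having more water than land.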

Now we take a look at the country that will add the most people to its population next year.

In [41]:
%%sql
SELECT name, (population * population_growth/100) AS population_increase
    FROM facts
WHERE (name NOT IN (SELECT name
                        FROM facts
                     WHERE (population == 
                            (SELECT MAX(population)
                                 FROM facts
                            ) OR population == 
                            (SELECT MIN(population)
                                 FROM facts
                            )
                           )
                   ) AND
       ((population * population_growth/100) == 
        (SELECT MAX(population * population_growth/100)
             FROM facts
          WHERE (name NOT IN 
                 (SELECT name
                      FROM facts
                   WHERE (population == 
                          (SELECT MAX(population)
                               FROM facts
                          ) OR population == 
                          (SELECT MIN(population)
                                 FROM facts
                            )
                           )
                   )
                )
        )
       )
      )

Done.


name,population_increase
India,15270686.1248


Surprise, surprise! The country that will grow the most next year is India. That was rather obvious.
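
The expected increase is simply population times growth rate divided by 100, since population_growth is stored as a percentage. As a worked example, here is the same arithmetic applied to the China and South Sudan rows found earlier (the function name is ours, chosen for illustration):

```python
def yearly_increase(population: int, growth_percent: float) -> float:
    """People added in one year, given a growth rate expressed in percent."""
    return population * growth_percent / 100


# Values taken from the earlier query output.
china = yearly_increase(1367485388, 0.45)
south_sudan = yearly_increase(12042910, 4.02)

print(f"China:       {china:,.0f}")
print(f"South Sudan: {south_sudan:,.0f}")
```

China adds roughly 6.2 million people and South Sudan roughly 0.48 million, so the fastest-growing country in relative terms is far from the largest in absolute terms.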

Now let's find out which countries have a higher death rate than birth rate.

In [44]:
%%sql
SELECT name, (death_rate/birth_rate) AS death_to_birth_rate
    FROM facts
WHERE (name NOT IN (SELECT name
                        FROM facts
                     WHERE (population == 
                            (SELECT MAX(population)
                                 FROM facts
                            ) OR population == 
                            (SELECT MIN(population)
                                 FROM facts
                            )
                           )
                   ) AND
       (death_rate/birth_rate > 1
       )
      )
 ORDER BY (death_rate/birth_rate) DESC;


Done.


name,death_to_birth_rate
Bulgaria,1.6188340807174888
Serbia,1.5044052863436124
Latvia,1.431
Lithuania,1.4128712871287128
Hungary,1.3897379912663756
Monaco,1.3894736842105262
Slovenia,1.350356294536817
Ukraine,1.3488805970149254
Germany,1.3482880755608029
Saint Pierre and Miquelon,1.3099730458221026


OK, it seems that the countries of what is often called Eastern Europe have a large gap between their death and birth rates. Western countries like Germany, Greece and Italy are also on the list. Surprisingly, Japan does not rank near the top; I expected it to be very high due to its very old population.
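
The condition death_rate/birth_rate > 1 can also be written as a direct comparison, death_rate > birth_rate, which gives the same rows for positive rates and does not depend on a division in the filter. A minimal sketch on an in-memory table, using three rows from the facts preview plus one invented row labeled as hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, birth_rate REAL, death_rate REAL)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Afghanistan", 38.57, 13.89),
        ("Albania", 12.92, 6.58),
        ("Andorra", 8.13, 6.96),
        ("Hypothetical Country", 9.0, 14.4),  # invented row with more deaths than births
    ],
)

query = """
SELECT name, death_rate / birth_rate AS death_to_birth_rate
  FROM facts
 WHERE death_rate > birth_rate
 ORDER BY death_to_birth_rate DESC;
"""
shrinking = conn.execute(query).fetchall()
print(shrinking)
```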

Finally, we investigate the most densely populated countries.

In [45]:
%%sql
SELECT name, (population/area) AS population_area
    FROM facts
WHERE (name NOT IN (SELECT name
                        FROM facts
                     WHERE (population == 
                            (SELECT MAX(population)
                                 FROM facts
                            ) OR population == 
                            (SELECT MIN(population)
                                 FROM facts
                            )
                           )
                   )
      )
 ORDER BY (population/area) DESC;

Done.


name,population_area
Macau,21168.0
Monaco,15267.0
Singapore,8141.0
Hong Kong,6445.0
Gaza Strip,5191.0
Gibraltar,4876.0
Bahrain,1771.0
Maldives,1319.0
Malta,1310.0
Bermuda,1299.0


Here we see that the most densely populated countries are islands or city-states, which is logical. These results do not agree well with the results of the query on the previous screen, since many of these states ha