# CIA Factbook 

The [CIA World Factbook](https://www.cia.gov/the-world-factbook/), produced for US policymakers and coordinated throughout the US Intelligence Community, presents the basic realities about the world in which we live.

Information in The Factbook is collected from, and coordinated with, a wide variety of US Government agencies, as well as from hundreds of published sources. It is updated weekly on average.

The purpose of this project is to use SQL to analyze the data and answer questions about it.

## Preparing and exploring the data

In [1]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

In [2]:
%%sql
SELECT *
  FROM sqlite_master
WHERE type='table';

 * sqlite:///factbook.db
Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"


The database contains two tables, but the one we are interested in is `facts`. Let's look at its first rows.
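The second table, `sqlite_sequence`, is bookkeeping that SQLite creates on its own for tables with an `AUTOINCREMENT` column. A minimal sketch of this with Python's built-in `sqlite3` module, using an in-memory database and a simplified `facts` schema as stand-ins for `factbook.db` (both are assumptions for illustration):

```python
import sqlite3

# In-memory stand-in for factbook.db (assumption: simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "
    "name TEXT NOT NULL, population INTEGER)"
)

# sqlite_master lists every object in the database; keep only the tables.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Only `facts` was created explicitly, yet the listing also contains `sqlite_sequence`.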

In [3]:
%%sql
SELECT *
  FROM facts
 LIMIT 5;

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


Here is a description of the columns:
- name: The name of the country.
- area: The country's total area (both land and water) in square kilometers.
- area_land: The country's land area in square kilometers.
- area_water: The country's water area in square kilometers.
- population: The country's population.
- population_growth: The country's population growth as a percentage.
- birth_rate: The country's birth rate, or the number of births a year per 1,000 people.
- death_rate: The country's death rate, or the number of deaths a year per 1,000 people.

In [4]:
%%sql
SELECT COUNT(*)
  FROM facts;

 * sqlite:///factbook.db
Done.


COUNT(*)
261


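There are 261 rows in the dataset. Not every row is a sovereign country: the table also includes territories and, as the queries below reveal, an aggregate `World` row.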

# Answering questions

## Which countries have the maximum and minimum population?

To answer this, let's first look at the numbers behind the question.

In [5]:
%%sql
SELECT MIN(population), MAX(population)
  FROM facts;

 * sqlite:///factbook.db
Done.


MIN(population),MAX(population)
0,7256490011


This looks strange: how can one country have zero people and another more than 7.2 billion? Let's find out which rows these numbers belong to.

#### Name of the country with zero population

In [6]:
%%sql
SELECT *
  FROM facts
 WHERE population == (SELECT MIN(population)
                      FROM facts);

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
250,ay,Antarctica,,280000,,0,,,,


#### Name of the country with the maximum population

In [7]:
%%sql 
SELECT *
  FROM facts
 WHERE population==(SELECT MAX(population)
                     FROM facts);

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
261,xx,World,,,,7256490011,1.08,18.6,7.8,


We can see that the country with 0 population is `Antarctica`, and the "country" with the unreasonably large population is `World`, a row that summarizes all the countries.

A population of zero for `Antarctica` makes sense, since it has no permanent residents.

Knowing this, we can keep Antarctica, but let's exclude the `World` row and answer the same question again.

In [8]:
%%sql
SELECT name, MAX(population)
  FROM facts
 WHERE name <> 'World';

 * sqlite:///factbook.db
Done.


name,MAX(population)
China,1367485388
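The query above relies on SQLite-specific behavior: when a bare column such as `name` appears next to `MAX()`, SQLite takes its value from the row that holds the maximum. Many other database engines reject or leave undefined a non-aggregated column without `GROUP BY`. A more portable sketch uses `ORDER BY ... LIMIT 1`; the in-memory table below holds a few rows copied from this notebook's outputs and is only a stand-in for `factbook.db`:

```python
import sqlite3

# In-memory stand-in for factbook.db with a handful of rows from the outputs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?)", [
    ("China", 1367485388),
    ("India", 1251695584),
    ("Antarctica", 0),
    ("World", 7256490011),
])

# Sort by population and keep the first row instead of mixing
# a bare column with an aggregate.
top = conn.execute(
    "SELECT name, population FROM facts "
    "WHERE name <> 'World' "
    "ORDER BY population DESC LIMIT 1").fetchone()
print(top)
```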


As expected, China has the largest population. Let's see what share of the world's population it represents.

In [9]:
%%sql
SELECT CAST(MAX(population) AS FLOAT) / (SELECT MAX(population) FROM facts)*100 AS 'China_Population_Ratio'
  FROM facts
WHERE name<>'World';

 * sqlite:///factbook.db
Done.


China_Population_Ratio
18.84499786986615


China holds about 18.8% of the global population, nearly one person in five.
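The `CAST(... AS FLOAT)` in the query matters: `population` is stored as an integer, and SQLite truncates integer division. A small sketch of the difference, using the two population figures from the outputs above as literals:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Both operands are integers, so SQLite truncates the quotient to 0
# before multiplying by 100.
int_share = conn.execute(
    "SELECT 1367485388 / 7256490011 * 100").fetchone()[0]

# Casting one operand to FLOAT forces floating-point division.
float_share = conn.execute(
    "SELECT CAST(1367485388 AS FLOAT) / 7256490011 * 100").fetchone()[0]

print(int_share, round(float_share, 2))
```

Without the cast the result would be 0 rather than roughly 18.8.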

## Which countries are above the global population mean?

Let's first look at the numbers behind the question.

In [10]:
%%sql
SELECT AVG(population) AS AVG_population
  FROM facts
 WHERE name <>'World';

 * sqlite:///factbook.db
Done.


AVG_population
32242666.56846473


The mean population is about 32.2 million, so let's now see which countries are above that number.
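One detail worth keeping in mind: `AVG()` skips `NULL` values, so rows without a recorded population (the British Indian Ocean Territory, for example, has none in this dataset) do not pull the mean down. A minimal sketch with a three-row in-memory table, which is an illustrative stand-in and not the full dataset:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?)", [
    ("Albania", 3029278),
    ("Andorra", 85580),
    ("British Indian Ocean Territory", None),  # population not recorded
])

# COUNT(*) counts rows, COUNT(population) counts non-NULL values,
# and AVG(population) divides by the latter.
count_all, count_pop, avg_pop = conn.execute(
    "SELECT COUNT(*), COUNT(population), AVG(population) FROM facts"
).fetchone()
print(count_all, count_pop, avg_pop)
```

The average here is computed over two rows, not three.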

In [11]:
%%sql
SELECT name
  FROM facts
 WHERE population > (SELECT AVG(population)
                FROM facts
                WHERE name<>'World') AND name<> 'World';

 * sqlite:///factbook.db
Done.


name
Afghanistan
Algeria
Argentina
Bangladesh
Brazil
Burma
Canada
China
Colombia
"Congo, Democratic Republic of the"


In [12]:
%%sql
SELECT COUNT(*)
  FROM facts
 WHERE population > (SELECT AVG(population)
                FROM facts
                WHERE name<>'World') AND name<> 'World';

 * sqlite:///factbook.db
Done.


COUNT(*)
41


Only 41 countries are above the global average population. This raises another question: if those countries have so many people, do they have the space to hold them?

### Which countries are above the average population and below the average area?
Let's first find the average area of a country.

In [13]:
%%sql
SELECT AVG(area) AS AVG_area
  FROM facts
 WHERE name <>'World';

 * sqlite:///factbook.db
Done.


AVG_area
555093.546184739


The average area of a country is about 555,000 square kilometers.

In [14]:
%%sql
SELECT *
  FROM facts
 WHERE population>(SELECT AVG(population)
                    FROM facts
                     WHERE name<>'World') AND
        area < (SELECT AVG(area)
                FROM facts
                WHERE name<> 'World');

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
14,bg,Bangladesh,148460,130170,18290,168957745,1.6,21.14,5.61,0.46
65,gm,Germany,357022,348672,8350,80854408,0.17,8.47,11.42,1.24
80,iz,Iraq,438317,437367,950,37056169,2.93,31.45,3.77,1.62
83,it,Italy,301340,294140,7200,61855120,0.27,8.74,10.19,4.1
85,ja,Japan,377915,364485,13430,126919659,0.16,7.93,9.51,0.0
91,ks,"Korea, South",99720,96920,2800,49115196,0.14,8.19,6.75,0.0
120,mo,Morocco,446550,446300,250,33322699,1.0,18.2,4.81,3.36
138,rp,Philippines,300000,298170,1830,100998376,1.61,24.27,6.11,2.09
139,pl,Poland,312685,304255,8430,38562189,0.09,9.74,10.19,0.46
163,sp,Spain,505370,498980,6390,48146134,0.89,9.64,9.04,8.31


These are the countries.

These countries may not have enough space for the population to spread out across the territory, so they may have many buildings or very compact cities.

Now let's look at their population-to-area ratio.

### Which countries have the highest population/area ratio?

In [15]:
%%sql
SELECT name, population, area, ROUND(CAST(population AS FLOAT)/area,2) AS Ratio
 FROM facts
ORDER BY Ratio DESC
LIMIT 10;

 * sqlite:///factbook.db
Done.


name,population,area,Ratio
Macau,592731,28,21168.96
Monaco,30535,2,15267.5
Singapore,5674472,697,8141.28
Hong Kong,7141106,1108,6445.04
Gaza Strip,1869055,360,5191.82
Gibraltar,29258,6,4876.33
Bahrain,1346613,760,1771.86
Maldives,393253,298,1319.64
Malta,413965,316,1310.02
Bermuda,70196,54,1299.93


The area is given in square kilometers.
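Rows with no recorded area, such as `World`, do not distort this ranking: dividing by `NULL` yields `NULL`, and SQLite sorts `NULL` values last in descending order. A short sketch using three rows copied from the outputs above in an in-memory table (a stand-in for `factbook.db`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER, area INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("Macau", 592731, 28),
    ("Monaco", 30535, 2),
    ("World", 7256490011, None),  # no area recorded
])

# Same density expression as in the notebook; the NULL ratio sorts last.
rows = conn.execute(
    "SELECT name, ROUND(CAST(population AS FLOAT) / area, 2) AS ratio "
    "FROM facts ORDER BY ratio DESC").fetchall()
for row in rows:
    print(row)
```

Adding `WHERE area IS NOT NULL` would drop such rows from the result entirely.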

Comparing the two outputs, not a single country from the previous question appears here. Even though those countries are above the average population and below the average area, they are still large compared with the places in this output, many of which are islands, city-states, or small territories such as Hong Kong or Monaco.

This does not necessarily mean that people in these places live in small houses and barely fit; it depends on whether the place is mainly residential or mainly a leisure destination, like Monaco or the Maldives.

So let's add an area filter: only countries larger than 600 km² will be included, to filter out the smallest islands and leisure destinations.

In [16]:
%%sql
SELECT name, population, area, ROUND(CAST(population AS FLOAT)/area,2) AS Ratio
 FROM facts
WHERE area > 600
ORDER BY Ratio DESC
LIMIT 10;

 * sqlite:///factbook.db
Done.


name,population,area,Ratio
Singapore,5674472,697,8141.28
Hong Kong,7141106,1108,6445.04
Bahrain,1346613,760,1771.86
Bangladesh,168957745,148460,1138.07
Mauritius,1339827,2040,656.78
Taiwan,23415126,35980,650.78
Lebanon,6184701,10400,594.68
"Korea, South",49115196,99720,492.53
Rwanda,12661733,26338,480.74
West Bank,2785366,5860,475.32


Now these are countries and territories where people live and work normally, and they are very densely populated.

## Which country has the highest growth rate? 

In [17]:
%%sql
SELECT MAX(population_growth), name
  FROM facts;

 * sqlite:///factbook.db
Done.


MAX(population_growth),name
4.02,South Sudan


The country with the fastest-growing population is `South Sudan`.

## Which countries have a higher death rate than birth rate? 

In [18]:
%%sql
SELECT  name, death_rate, birth_rate
  FROM facts
WHERE death_rate>birth_rate
ORDER BY death_rate-birth_rate DESC;

 * sqlite:///factbook.db
Done.


name,death_rate,birth_rate
Bulgaria,14.44,8.92
Serbia,13.66,9.08
Latvia,14.31,10.0
Lithuania,14.27,10.1
Ukraine,14.46,10.72
Hungary,12.73,9.16
Germany,11.42,8.47
Slovenia,11.37,8.42
Romania,11.9,9.14
Croatia,12.18,9.45


Notably, all ten countries listed above, where `death_rate` exceeds `birth_rate`, are European.
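The `ORDER BY death_rate-birth_rate` expression can also be returned as a column, which makes the size of the gap (in deaths minus births per 1,000 people) visible. A minimal sketch on an in-memory table holding three rows copied from this notebook; the column alias `natural_decrease` is a name chosen here for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, birth_rate FLOAT, death_rate FLOAT)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("Afghanistan", 38.57, 13.89),
    ("Bulgaria", 8.92, 14.44),
    ("Germany", 8.47, 11.42),
])

# Keep only rows where deaths exceed births and expose the gap itself.
rows = conn.execute(
    "SELECT name, ROUND(death_rate - birth_rate, 2) AS natural_decrease "
    "FROM facts "
    "WHERE death_rate > birth_rate "
    "ORDER BY natural_decrease DESC").fetchall()
for row in rows:
    print(row)
```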

## Which countries have more water than land?

In [19]:
%%sql 
SELECT *
  FROM facts
 WHERE area_water > area_land;

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
228,io,British Indian Ocean Territory,54400,60,54340,,,,,
247,vq,Virgin Islands,1910,346,1564,103574.0,0.59,10.31,8.54,7.67


These countries are:
- British Indian Ocean Territory
- Virgin Islands
    

### Which are the top 5 countries by water area?

In [20]:
%%sql
SELECT * 
  FROM facts 
 ORDER BY area_water DESC
LIMIT 5;

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
32,ca,Canada,9984670,9093507,891163,35099836,0.75,10.28,8.42,5.66
143,rs,Russia,17098242,16377742,720500,142423773,0.04,11.6,13.69,1.69
186,us,United States,9826675,9161966,664709,321368864,0.78,12.49,8.15,3.86
77,in,India,3287263,2973193,314070,1251695584,1.22,19.55,7.32,0.04
37,ch,China,9596960,9326410,270550,1367485388,0.45,12.49,7.53,0.44


The top 5 countries with the largest `area_water` are:
- Canada
- Russia
- United States
- India
- China

### Which countries have the highest ratios of water to land? 

In [21]:
%%sql
SELECT name,area_land, area_water, CAST(area_water AS FLOAT)/area_land*100 AS Ratio
  FROM facts
 ORDER BY Ratio DESC
  LIMIT 10;

 * sqlite:///factbook.db
Done.


name,area_land,area_water,Ratio
British Indian Ocean Territory,60,54340,90566.66666666666
Virgin Islands,346,1564,452.0231213872832
Puerto Rico,8870,4921,55.47914317925592
"Bahamas, The",10010,3870,38.66133866133866
Guinea-Bissau,28120,8005,28.46728307254623
Malawi,94080,24404,25.939625850340136
Netherlands,33893,7650,22.571032366565365
Uganda,197100,43938,22.29223744292237
Eritrea,101000,16600,16.435643564356436
Liberia,96320,15049,15.623961794019934


These are the countries with the highest water-to-land ratio, expressed as a percentage of land area.

These countries are islands, lie on the coast, or have lakes or a stretch of sea inside their territory.