# Analyzing CIA Factbook Data Using SQL

The CIA World Factbook collects statistics about all countries on Earth.
The original data can be found [here](https://www.cia.gov/library/publications/the-world-factbook/). <br>
The data is in the public domain; copyright information can be found [here](https://www.cia.gov/library/publications/the-world-factbook/docs/contributor_copyright.html).

This notebook analyzes the `factbook.db` database using SQL.

In [11]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

Let's see the names of the tables in our database.

In [12]:
%%sql
SELECT * FROM sqlite_master WHERE type='table';

 * sqlite:///factbook.db
Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"
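
The `sqlite_sequence` entry above is an internal table that SQLite creates for `AUTOINCREMENT` columns. As a minimal sketch, assuming an in-memory database and a reduced, hypothetical version of the `facts` schema (not the real `factbook.db` file), the snippet below lists all tables and then only the user tables by filtering out names with the `sqlite_` prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: stand-in for factbook.db

# Reduced, hypothetical version of the facts schema. The AUTOINCREMENT
# keyword is what makes SQLite create the internal sqlite_sequence table.
conn.execute(
    "CREATE TABLE facts ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "
    "name varchar(255) NOT NULL)"
)

# Every table recorded in sqlite_master, internal ones included.
all_tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]

# Only user tables: '_' is a LIKE wildcard, so it is escaped with '!'.
user_tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master "
    "WHERE type = 'table' AND name NOT LIKE 'sqlite!_%' ESCAPE '!' "
    "ORDER BY name"
)]

print(all_tables)
print(user_tables)
```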


The only user table in factbook.db is **facts** (`sqlite_sequence` is an internal SQLite table). Let's see the first five rows of facts.

In [13]:
%%sql
SELECT * 
FROM facts
LIMIT 5

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


The main columns of the table are: <br>

- name - The country name.
- area - The total area of the country (land + water), in square kilometers.
- population - The country's population.
- population_growth - The annual population growth, as a percentage.
- birth_rate - The number of births per 1,000 inhabitants.
- death_rate - The number of deaths per 1,000 inhabitants.
- area_land - The land area, in square kilometers.
- area_water - The water area, in square kilometers.
- migration_rate - The difference between the number of persons entering and leaving the country during the year, per 1,000 persons (based on midyear population).
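
The column names and declared types can also be read from the schema itself. The sketch below is a minimal example under stated assumptions: it recreates an empty `facts` table in an in-memory SQLite database from the `CREATE TABLE` statement shown earlier, then lists the columns with `PRAGMA table_info`. The in-memory database and the `describe_table` helper are hypothetical stand-ins; against the real file you would connect to `factbook.db` instead.

```python
import sqlite3

# Schema copied from the sqlite_master output earlier in this notebook.
CREATE_FACTS = """
CREATE TABLE "facts" (
    "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    "code" varchar(255) NOT NULL,
    "name" varchar(255) NOT NULL,
    "area" integer,
    "area_land" integer,
    "area_water" integer,
    "population" integer,
    "population_growth" float,
    "birth_rate" float,
    "death_rate" float,
    "migration_rate" float
)
"""


def describe_table(conn, table):
    """Return (column name, declared type) pairs for `table` (hypothetical helper)."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [(row[1], row[2]) for row in rows]


conn = sqlite3.connect(":memory:")  # assumption: stand-in for factbook.db
conn.execute(CREATE_FACTS)

columns = describe_table(conn, "facts")
for name, declared_type in columns:
    print(f"{name}: {declared_type}")
```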


### Population

In [14]:
%%sql
SELECT MIN(population) AS "Minimum Population",
MAX(population) AS "Maximum Population",
MIN(population_growth) AS "Minimum Population Growth",
MAX(population_growth) AS "Maximum Population Growth"
FROM facts

 * sqlite:///factbook.db
Done.


Minimum Population,Maximum Population,Minimum Population Growth,Maximum Population Growth
0,7256490011,0.0,4.02


There is a country with zero inhabitants, one with zero growth, and one with a population of more than 7 billion people. Let's see which countries correspond to these strange numbers.

In [15]:
%%sql
SELECT name, population
FROM facts
WHERE population = (
    SELECT MIN(population)
    FROM facts
)

 * sqlite:///factbook.db
Done.


name,population
Antarctica,0


The mystery is solved: the total population of Antarctica is zero, probably because, according to [Wikipedia](https://en.wikipedia.org/wiki/Demographics_of_Antarctica), Antarctica has no permanent residents. Between 1,000 and 4,000 people live there depending on the season, but their presence is only temporary and they come from many other countries. Moreover, Antarctica isn't a political entity.

Let's see the country with the maximum population.

In [16]:
%%sql
SELECT name, population
FROM facts
WHERE population = (
    SELECT MAX(population)
    FROM facts
)

 * sqlite:///factbook.db
Done.


name,population
World,7256490011


The maximum population refers to the whole world, so it is not surprising that the number is so large.

Let's calculate the average population and area across the whole dataset.

In [17]:
%%sql
SELECT ROUND(AVG(population), 2) "average population", ROUND(AVG(area),2) "average area km"
FROM facts

 * sqlite:///factbook.db
Done.


average population,average area km
62094928.32,555093.55


The average population across all rows is around 62 million, and the average area is a little over 555,000 square kilometers.
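
Because `facts` also contains the aggregate `World` row, an average over the whole table mixes countries with a non-country entry. The sketch below illustrates the effect under stated assumptions: it loads only a handful of rows copied from the outputs in this notebook into an in-memory database and compares the plain average with one that filters out `World`. The sample is not the full dataset, so the printed values are not the real averages.

```python
import sqlite3

# A few (name, population) rows copied from the outputs above; NOT the full dataset.
SAMPLE_ROWS = [
    ("Afghanistan", 32564342),
    ("Albania", 3029278),
    ("Algeria", 39542166),
    ("Andorra", 85580),
    ("Angola", 19625353),
    ("Antarctica", 0),
    ("World", 7256490011),
]

conn = sqlite3.connect(":memory:")  # assumption: stand-in for factbook.db
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?)", SAMPLE_ROWS)

# Average over every row, including the aggregate World row.
avg_all = conn.execute("SELECT AVG(population) FROM facts").fetchone()[0]

# Average with the World row filtered out.
avg_without_world = conn.execute(
    "SELECT AVG(population) FROM facts WHERE name <> 'World'"
).fetchone()[0]

print(f"average over all sample rows:  {avg_all:,.2f}")
print(f"average without the World row: {avg_without_world:,.2f}")
```

The same `WHERE name <> 'World'` filter could be added to the subquery of any "above average" comparison.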

Let's see which countries are above the average population.

In [18]:
%%sql
SELECT name, population
FROM facts
WHERE population > (SELECT AVG(population) FROM facts)
ORDER BY population

 * sqlite:///factbook.db
Done.


name,population
United Kingdom,64088222
France,66553766
Thailand,67976405
"Congo, Democratic Republic of the",79375136
Turkey,79414269
Germany,80854408
Iran,81824270
Egypt,88487396
Vietnam,94348835
Ethiopia,99465819


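The results are sorted in ascending order, so the rows shown are the least populous of the above-average countries. Excluding the aggregate World row, China is the most populous country.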

### Area and Water-to-Land Ratio

Which countries are below the average area?

In [19]:
%%sql
SELECT name, area
FROM facts
WHERE area < (SELECT AVG(area) FROM facts)
ORDER BY area

 * sqlite:///factbook.db
Done.


name,area
Holy See (Vatican City),0
Monaco,2
Coral Sea Islands,3
Ashmore and Cartier Islands,5
Navassa Island,5
Spratly Islands,5
Clipperton Island,6
Gibraltar,6
Wake Island,6
Paracel Islands,7


The smallest area belongs to Vatican City, with a value of zero in this dataset. This doesn't mean it has no area, only that its area is under 1 square kilometer: it is in fact 0.44 square kilometers (see [here](https://en.wikipedia.org/wiki/Vatican_City)).

Which countries have the highest ratio of water to land?

In [20]:
%%sql
SELECT name, CAST(area_water AS FLOAT) / CAST(area_land AS FLOAT) AS "Ratio Water To Land"
FROM facts
ORDER BY "Ratio Water To Land" DESC
LIMIT 10

 * sqlite:///factbook.db
Done.


name,Ratio Water To Land
British Indian Ocean Territory,905.6666666666666
Virgin Islands,4.520231213872832
Puerto Rico,0.5547914317925592
"Bahamas, The",0.3866133866133866
Guinea-Bissau,0.2846728307254623
Malawi,0.2593962585034013
Netherlands,0.2257103236656536
Uganda,0.2229223744292237
Eritrea,0.1643564356435643
Liberia,0.1562396179401993


The country with the highest ratio of water to land is the British Indian Ocean Territory, with a remarkable ratio of over 905, while the second is the Virgin Islands, with a ratio of about 4.5.
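
The `CAST` in the query matters because `area_water` and `area_land` are integer columns, and SQLite truncates the result when it divides one integer by another. The sketch below is an illustration under stated assumptions: it uses an in-memory database with the two rows from the output above plus one hypothetical row with zero land area, and compares the integer and floating-point ratios.

```python
import sqlite3

# (name, area_water, area_land); the first two rows are copied from this notebook.
SAMPLE_ROWS = [
    ("British Indian Ocean Territory", 54340, 60),
    ("Virgin Islands", 1564, 346),
    ("Zero-land example (hypothetical)", 10, 0),  # invented row: division by zero
]

conn = sqlite3.connect(":memory:")  # assumption: stand-in for factbook.db
conn.execute(
    "CREATE TABLE facts (name TEXT, area_water INTEGER, area_land INTEGER)"
)
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", SAMPLE_ROWS)

rows = conn.execute(
    """
    SELECT name,
           area_water / area_land AS integer_ratio,               -- truncates
           CAST(area_water AS FLOAT) / area_land AS float_ratio   -- keeps decimals
    FROM facts
    ORDER BY float_ratio DESC
    """
).fetchall()

for name, integer_ratio, float_ratio in rows:
    print(name, integer_ratio, float_ratio)
```

In SQLite, dividing by zero yields `NULL` rather than an error, and `NULL` values sort last under `ORDER BY ... DESC`, so rows with no land area end up at the bottom of the ranking.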

Which countries have more water than land?<br>
The answer can easily be deduced from the previous query, because it is true for all countries with a water-to-land ratio above 1, but a more explicit query is shown below.

In [21]:
%%sql
SELECT name, area_water, area_land
FROM facts
WHERE area_water > area_land

 * sqlite:///factbook.db
Done.


name,area_water,area_land
British Indian Ocean Territory,54340,60
Virgin Islands,1564,346


### Population Increase and Decline

According to the documentation of the dataset, the population growth "compares the average annual percent change in populations, resulting from a surplus (or deficit) of births over deaths and the balance of migrants entering and leaving a country." <br>

Let's see which countries will add the most people to their population, deriving this number from the population growth rate and the current population.

In [22]:
%%sql
SELECT name, (population_growth * population)/100 "new people"
FROM facts
ORDER BY "new people" DESC
LIMIT 10

 * sqlite:///factbook.db
Done.


name,new people
World,78370092.1188
India,15270686.1248
China,6153684.246
Nigeria,4448270.372
Pakistan,2906653.3662
Ethiopia,2874562.1691
Bangladesh,2703323.92
United States,2506677.1392
Indonesia,2355141.8008000003
"Congo, Democratic Republic of the",1944690.832


Excluding the World row, India will have the highest increase in population, with over 15 million new inhabitants.
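
The "new people" figure is simply `population * population_growth / 100`, since the growth rate is stored as a percentage. As a worked sketch, assuming an in-memory database loaded with only the first five rows of `facts` shown earlier in this notebook (not the full dataset), the calculation looks like this:

```python
import sqlite3

# (name, population, population_growth in percent), copied from the first
# five rows of facts shown earlier; NOT the full dataset.
SAMPLE_ROWS = [
    ("Afghanistan", 32564342, 2.32),
    ("Albania", 3029278, 0.3),
    ("Algeria", 39542166, 1.84),
    ("Andorra", 85580, 0.12),
    ("Angola", 19625353, 2.78),
]

conn = sqlite3.connect(":memory:")  # assumption: stand-in for factbook.db
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, population_growth FLOAT)"
)
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", SAMPLE_ROWS)

# The World filter has no effect on this sample; it is included to show how
# the aggregate row could be excluded when running against the full table.
rows = conn.execute(
    """
    SELECT name, population * population_growth / 100.0 AS new_people
    FROM facts
    WHERE name <> 'World'
    ORDER BY new_people DESC
    """
).fetchall()

for name, new_people in rows:
    print(f"{name}: {new_people:,.0f}")
```

For Afghanistan, for example, 32,564,342 × 2.32 / 100 gives roughly 755,493 additional people per year.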

Which countries have a higher death rate than birth rate?

In [23]:
%%sql
SELECT name, birth_rate "birth rate", death_rate "death_rate",
ROUND((birth_rate - death_rate), 2) "difference"
FROM facts
WHERE death_rate > birth_rate
ORDER BY difference

 * sqlite:///factbook.db
Done.


name,birth rate,death_rate,difference
Bulgaria,8.92,14.44,-5.52
Serbia,9.08,13.66,-4.58
Latvia,10.0,14.31,-4.31
Lithuania,10.1,14.27,-4.17
Ukraine,10.72,14.46,-3.74
Hungary,9.16,12.73,-3.57
Germany,8.47,11.42,-2.95
Slovenia,8.42,11.37,-2.95
Romania,9.14,11.9,-2.76
Croatia,9.45,12.18,-2.73


Most of the countries with a higher death rate than birth rate are in Europe.
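
The difference between birth rate and death rate is the natural increase per 1,000 inhabitants; a negative value means more deaths than births. The sketch below is a minimal example, assuming an in-memory database that holds four rows copied from the outputs in this notebook. It filters on the real column names rather than on quoted aliases, which avoids depending on how SQLite resolves double-quoted names in a `WHERE` clause.

```python
import sqlite3

# (name, birth_rate, death_rate) per 1,000 inhabitants, copied from this notebook.
SAMPLE_ROWS = [
    ("Afghanistan", 38.57, 13.89),
    ("Albania", 12.92, 6.58),
    ("Bulgaria", 8.92, 14.44),
    ("Germany", 8.47, 11.42),
]

conn = sqlite3.connect(":memory:")  # assumption: stand-in for factbook.db
conn.execute("CREATE TABLE facts (name TEXT, birth_rate FLOAT, death_rate FLOAT)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", SAMPLE_ROWS)

# Keep only rows where deaths outnumber births, most negative difference first.
rows = conn.execute(
    """
    SELECT name, ROUND(birth_rate - death_rate, 2) AS difference
    FROM facts
    WHERE death_rate > birth_rate
    ORDER BY difference
    """
).fetchall()

for name, difference in rows:
    print(name, difference)
```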