# Analyzing CIA Factbook Data Using SQL

## Introduction

This is an analysis of data from the [CIA World Factbook](https://www.cia.gov/the-world-factbook/), a compendium of statistics about all of the countries on Earth. The data isn't current, but the same methods can be applied once the statistics are updated. At the time the data was collected, the world population was roughly 7.26 billion.

The first step is to connect to the database, which the cell below does.

In [6]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db
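
Outside the notebook, the same kind of connection can be made with Python's built-in `sqlite3` module. The sketch below is only an illustration, not part of the original analysis: it uses an in-memory database and a hypothetical `build_sample_db` helper holding two rows taken from the table preview in the next section, so it runs without `factbook.db`. Passing `"factbook.db"` instead of `":memory:"` (and skipping the inserts) should point it at the real file.

```python
import sqlite3


def build_sample_db():
    """Return an in-memory database with a tiny stand-in for the `facts` table."""
    conn = sqlite3.connect(":memory:")
    # Only a few of the real columns are recreated here; types are assumed.
    conn.execute(
        "CREATE TABLE facts (id INTEGER PRIMARY KEY, code TEXT, name TEXT, population INTEGER)"
    )
    conn.executemany(
        "INSERT INTO facts (id, code, name, population) VALUES (?, ?, ?, ?)",
        [(1, "af", "Afghanistan", 32564342), (2, "al", "Albania", 3029278)],
    )
    return conn


conn = build_sample_db()
rows = conn.execute("SELECT name, population FROM facts ORDER BY id LIMIT 5").fetchall()
for name, population in rows:
    print(name, population)
```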

## Overview of the Data

We will print the first five rows to get an idea of what the data looks like. The only table within the dataset is labeled **facts**.

In [7]:
%%sql
SELECT *
  FROM facts
 LIMIT 5;

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


**Here are the descriptions for the columns:**

- name: the name of the country.
- area: the country's total area (both land and water) in square kilometers.
- area_land: the country's land area in square kilometers.
- area_water: the country's water area in square kilometers.
- population: the country's population.
- population_growth: the country's population growth as a percentage.
- birth_rate: the country's birth rate, or the number of births per year per 1,000 people.
- death_rate: the country's death rate, or the number of deaths per year per 1,000 people.
- migration_rate: the country's migration rate.
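
One way to double-check a column list like this is to ask SQLite for the table's schema. The following is a minimal sketch using Python's `sqlite3` module and `PRAGMA table_info`. The column names come from the preview above, but the column types are assumptions, and the table is recreated in memory rather than read from `factbook.db`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names follow the notebook's preview; the types are assumed.
conn.execute("""
    CREATE TABLE facts (
        id INTEGER PRIMARY KEY, code TEXT, name TEXT,
        area INTEGER, area_land INTEGER, area_water INTEGER,
        population INTEGER, population_growth REAL,
        birth_rate REAL, death_rate REAL, migration_rate REAL
    )
""")

# Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(facts)")]
for col_name, col_type in columns:
    print(f"{col_name}: {col_type}")
```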

Below we'll calculate the minimum and maximum population across all countries and identify the countries with the smallest and largest populations. We will also calculate the lowest and highest population growth.

## Statistics of the World Population

In [8]:
%%sql
SELECT MIN(population) AS min_pop,
       MAX(population) AS max_pop,
       MIN(population_growth) AS min_pop_growth,
       MAX(population_growth) AS max_pop_growth
  FROM facts;

 * sqlite:///factbook.db
Done.


min_pop,max_pop,min_pop_growth,max_pop_growth
0,7256490011,0.0,4.02


The results above look wrong. The smallest population is zero, and the largest is 7,256,490,011, which is impossible for a single country; that figure matches the world population. Let's find out which rows these two values come from.

In [9]:
%%sql
SELECT *
  FROM facts
 WHERE population == (SELECT MIN(population)
                        FROM facts
                     );

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
250,ay,Antarctica,,280000,,0,,,,


The row with the smallest population is Antarctica. Antarctica has no permanent population: its only inhabitants are researchers and scientists from various countries who stay at research bases, typically for a single season such as summer or winter.

In [10]:
%%sql
SELECT *
  FROM facts
 WHERE population == (SELECT MAX(population)
                        FROM facts
                     );

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
261,xx,World,,,,7256490011,1.08,18.6,7.8,


The large number from earlier, 7,256,490,011, belongs to a row named **World**. It is the world's population at the time the data was collected, not the population of a particular country.

With this information, we know to exclude the **World** and **Antarctica** rows to find the real minimum and maximum populations among countries.
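
The effect of these two rows can be sketched on a small sample. The example below is an illustration in Python's `sqlite3` module, not the notebook's own query: it loads the Antarctica and World rows together with three countries from the table preview into an in-memory table, then compares `MIN`/`MAX` with and without the filter. Because it is only a subset, the filtered values differ from those of the full table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?)", [
    ("Antarctica", 0),          # no permanent population
    ("World", 7256490011),      # aggregate row, not a country
    ("Afghanistan", 32564342),
    ("Albania", 3029278),
    ("Andorra", 85580),
])

# Unfiltered: the two special rows dominate both ends of the range.
raw = conn.execute("SELECT MIN(population), MAX(population) FROM facts").fetchone()

# Filtered: only the country rows remain.
filtered = conn.execute("""
    SELECT MIN(population), MAX(population)
      FROM facts
     WHERE name NOT IN ('World', 'Antarctica')
""").fetchone()

print("raw:", raw)
print("filtered:", filtered)
```

`NOT IN` is used here as a compact equivalent of chaining two `<>` conditions with `AND`.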

In [26]:
%%sql
SELECT min(population), max(population),
       min(population_growth), max(population_growth)
  FROM facts
 WHERE name <> 'World'
   AND name <> 'Antarctica';

 * sqlite:///factbook.db
Done.


min(population),max(population),min(population_growth),max(population_growth)
48,1367485388,0.0,4.02


The actual minimum population of a country is 48, and the maximum is 1,367,485,388. Below we can see which two countries these are.

In [27]:
%%sql
SELECT name AS Country,
       population AS Total_population,
       population * 100 / (SELECT population
                             FROM facts
                            WHERE name = 'World') AS '%_from_World_population'
  FROM facts
 WHERE population = (SELECT MIN(population)
                       FROM facts
                      WHERE name != 'Antarctica')
    OR population = (SELECT MAX(population)
                       FROM facts
                      WHERE name != 'World');

 * sqlite:///factbook.db
Done.


Country,Total_population,%_from_World_population
China,1367485388,18
Pitcairn Islands,48,0
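
The percentage column in the query above comes out as whole numbers (18 and 0) because SQLite performs integer division when both operands are integers. The sketch below, an assumption-labeled illustration using Python's `sqlite3` module and an in-memory copy of the three relevant rows, compares that with multiplying by `100.0` to force a floating-point result. The `pct_int` and `pct_real` aliases are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?)", [
    ("World", 7256490011),
    ("China", 1367485388),
    ("Pitcairn Islands", 48),
])

query = """
SELECT name,
       population * 100 / (SELECT population FROM facts WHERE name = 'World') AS pct_int,
       population * 100.0 / (SELECT population FROM facts WHERE name = 'World') AS pct_real
  FROM facts
 WHERE name <> 'World'
 ORDER BY population DESC;
"""

# Map each name to its (integer share, floating-point share) pair.
shares = {name: (pct_int, pct_real) for name, pct_int, pct_real in conn.execute(query)}
for name, (pct_int, pct_real) in shares.items():
    print(f"{name}: integer division = {pct_int}, real division = {pct_real:.6f}")
```

With real division, very small shares such as that of the Pitcairn Islands no longer collapse to zero.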


The two countries are China and the Pitcairn Islands. China alone makes up almost a fifth of the world's population. The Pitcairn Islands are a group of four volcanic islands in the Pacific Ocean and a British Overseas Territory.

## Average Population and Area of Countries

In [12]:
%%sql
SELECT AVG(population) AS avg_population, AVG(area) AS avg_area
  FROM facts
 WHERE name <> 'World';

 * sqlite:///factbook.db
Done.


avg_population,avg_area
32242666.56846473,555093.546184739


The average population of a country is roughly 32.2 million, and the average area is roughly 555,094 square kilometers.

## Densely Populated Countries

In [14]:
%%sql
SELECT *
  FROM facts
 WHERE population > (SELECT AVG(population)
                       FROM facts
                      WHERE name <> 'World'
                    )
   AND area < (SELECT AVG(area)
                 FROM facts
                WHERE name <> 'World'
                );

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
14,bg,Bangladesh,148460,130170,18290,168957745,1.6,21.14,5.61,0.46
65,gm,Germany,357022,348672,8350,80854408,0.17,8.47,11.42,1.24
80,iz,Iraq,438317,437367,950,37056169,2.93,31.45,3.77,1.62
83,it,Italy,301340,294140,7200,61855120,0.27,8.74,10.19,4.1
85,ja,Japan,377915,364485,13430,126919659,0.16,7.93,9.51,0.0
91,ks,"Korea, South",99720,96920,2800,49115196,0.14,8.19,6.75,0.0
120,mo,Morocco,446550,446300,250,33322699,1.0,18.2,4.81,3.36
138,rp,Philippines,300000,298170,1830,100998376,1.61,24.27,6.11,2.09
139,pl,Poland,312685,304255,8430,38562189,0.09,9.74,10.19,0.46
163,sp,Spain,505370,498980,6390,48146134,0.89,9.64,9.04,8.31


Here we have 14 countries with a higher-than-average population and a below-average area; in other words, countries that are densely populated.
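
The pattern of comparing each row against a table-wide average can be tried on a small sample. The following sketch uses Python's `sqlite3` module with seven rows copied from the outputs in this notebook. Because the averages are computed over this subset only, they differ from the full-table averages, so the result is illustrative rather than a reproduction of the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, area INTEGER, population INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("Afghanistan", 652230, 32564342),
    ("Albania", 28748, 3029278),
    ("Algeria", 2381741, 39542166),
    ("Andorra", 468, 85580),
    ("Angola", 1246700, 19625353),
    ("Bangladesh", 148460, 168957745),
    ("Japan", 377915, 126919659),
])

# Keep rows whose population is above the sample average
# and whose area is below the sample average.
dense = [row[0] for row in conn.execute("""
    SELECT name
      FROM facts
     WHERE population > (SELECT AVG(population) FROM facts)
       AND area < (SELECT AVG(area) FROM facts)
     ORDER BY name
""")]
print(dense)
```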

## Highest Population, Growth Rate, and Birth Rate

In [17]:
%%sql
SELECT name AS 'Country',
       population AS 'Total_Population',
       population_growth AS 'Growth_Rate_%'
  FROM facts
 WHERE population == (SELECT MAX(population)
                        FROM facts
                       WHERE name != 'World')
    OR population_growth == (SELECT MAX(population_growth)
                               FROM facts
                              WHERE name != 'World');

 * sqlite:///factbook.db
Done.


Country,Total_Population,Growth_Rate_%
China,1367485388,0.45
South Sudan,12042910,4.02


China has the highest total population, and South Sudan has the highest population growth at 4.02%.

In [22]:
%%sql
SELECT name AS Country,
       CAST((birth_rate - death_rate) * population / 1000 AS int) AS Population_Growth_Next_Year
  FROM facts
 WHERE name <> 'World'
 ORDER BY Population_Growth_Next_Year DESC
 LIMIT 10;

 * sqlite:///factbook.db
Done.


Country,Population_Growth_Next_Year
India,15308236
China,6782727
Nigeria,4491845
Pakistan,3203291
Ethiopia,2892466
Indonesia,2649534
Bangladesh,2623913
"Congo, Democratic Republic of the",1969297
Philippines,1834130
Mexico,1645881


Here we have the top 10 countries by projected population growth over the following year, estimated as births minus deaths and ignoring migration. At the top is India, at just over 15 million.
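
Since birth and death rates are expressed per 1,000 people, the estimate is `(birth_rate - death_rate) * population / 1000`. The sketch below reruns that arithmetic with Python's `sqlite3` module on three rows copied from the outputs in this notebook. The `natural_increase` alias is a hypothetical name, and `CAST(... AS INTEGER)` truncates the fractional part rather than rounding it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, birth_rate REAL, death_rate REAL)"
)
conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)", [
    ("Afghanistan", 32564342, 38.57, 13.89),
    ("Andorra", 85580, 8.13, 6.96),
    ("Bangladesh", 168957745, 21.14, 5.61),
])

# Rates are per 1,000 people, hence the division by 1000.
increase = conn.execute("""
    SELECT name,
           CAST((birth_rate - death_rate) * population / 1000 AS INTEGER) AS natural_increase
      FROM facts
     ORDER BY natural_increase DESC
""").fetchall()

for name, people in increase:
    print(f"{name}: {people:,}")
```

The Bangladesh figure from this sample agrees with the value shown in the table above.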

In [23]:
%%sql
SELECT name AS 'country',
       birth_rate, death_rate,
       ROUND(birth_rate - death_rate, 2) AS diff
  FROM facts
 WHERE name <> 'World'
 ORDER BY diff DESC
 LIMIT 10;

 * sqlite:///factbook.db
Done.


country,birth_rate,death_rate,diff
Malawi,41.56,8.41,33.15
Uganda,43.79,10.69,33.1
Niger,45.45,12.42,33.03
Burundi,42.01,9.27,32.74
Mali,44.99,12.89,32.1
Burkina Faso,42.03,11.72,30.31
Zambia,42.13,12.67,29.46
Ethiopia,37.27,8.19,29.08
South Sudan,36.91,8.18,28.73
Tanzania,36.39,8.0,28.39


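These countries have the largest gap between birth rate and death rate (the rate of natural increase per 1,000 people), with Malawi at the top. The highest birth rate in this list belongs to Niger, at 45.45.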
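
Sorting by the difference and sorting by the birth rate alone can produce different rankings. As a small illustration, the sketch below loads the top three rows from the table above into an in-memory SQLite table through Python's `sqlite3` module and orders them both ways. The variable names `by_diff` and `by_birth_rate` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, birth_rate REAL, death_rate REAL)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("Malawi", 41.56, 8.41),
    ("Uganda", 43.79, 10.69),
    ("Niger", 45.45, 12.42),
])

# Ranking by natural increase (births minus deaths per 1,000 people).
by_diff = [row[0] for row in conn.execute(
    "SELECT name FROM facts ORDER BY birth_rate - death_rate DESC"
)]

# Ranking by birth rate alone.
by_birth_rate = [row[0] for row in conn.execute(
    "SELECT name FROM facts ORDER BY birth_rate DESC"
)]

print("by difference:", by_diff)
print("by birth rate:", by_birth_rate)
```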

In [24]:
%%sql
SELECT name AS Country,
       CAST(population / area AS int) AS People_per_km2_Area,
       population AS Population,
       area AS Total_Area_km2
  FROM facts
 WHERE name <> 'World'
 ORDER BY People_per_km2_Area DESC
 LIMIT 10;

 * sqlite:///factbook.db
Done.


Country,People_per_km2_Area,Population,Total_Area_km2
Macau,21168,592731,28
Monaco,15267,30535,2
Singapore,8141,5674472,697
Hong Kong,6445,7141106,1108
Gaza Strip,5191,1869055,360
Gibraltar,4876,29258,6
Bahrain,1771,1346613,760
Maldives,1319,393253,298
Malta,1310,413965,316
Bermuda,1299,70196,54


These are the places with the highest population density, measured in people per square kilometer. None of them appeared in the earlier analysis because their total populations are below average.
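
Density here is simply population divided by area. The sketch below, written with Python's `sqlite3` module, recomputes it for the top three rows of the table above. It assumes both columns are stored as integers, in which case SQLite's `/` already performs integer division; multiplying by `1.0` is one way to keep the fractional part. The aliases `density_int` and `density_real` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumes population and area are integer columns, as the outputs suggest.
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER, area INTEGER)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("Macau", 592731, 28),
    ("Monaco", 30535, 2),
    ("Singapore", 5674472, 697),
])

density = {
    name: (d_int, d_real)
    for name, d_int, d_real in conn.execute("""
        SELECT name,
               population / area AS density_int,        -- integer division
               population * 1.0 / area AS density_real  -- floating-point division
          FROM facts
         ORDER BY density_int DESC
    """)
}

for name, (d_int, d_real) in density.items():
    print(f"{name}: {d_int} vs {d_real:.2f} people per km2")
```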

## Conclusion

This analysis gave insight into countries all around the world, from the country with the highest population to the one with the lowest, and from the most densely populated countries to those expected to see the most significant population growth in the following year. There's a lot of information in this analysis that I wasn't aware of, which made it fun to break down. I hope you enjoyed learning about our world as much as I did.