# Analyzing Country Statistics Using SQL
In this project, we'll be analyzing statistics about all of the countries on Earth. The main aim is just to practice some SQL, so we'll be limiting our exploration to:
- Population and population growth
- Average population and area
- Densely-populated countries

Also note that you won't be able to run this project on Deepnote. I completed it locally and manually uploaded it to Deepnote to publish it. Silly, I know. I'm building a Jupyter publishing service so that I no longer need to do this.


## Data Overview
We'll be working with SQL data from the [CIA World Factbook](https://www.cia.gov/library/publications/the-world-factbook/). 

The Factbook contains demographic information like the following:
- `population`: The country's total population.
- `population_growth`: The annual population growth rate, as a percentage.
- `area`: The total land and water area, in sq km.

Let's preview some of the information:

In [None]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db

'Connected: None@factbook.db'

In [None]:
%%sql
SELECT *
  FROM facts
LIMIT 5;

Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46
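
If you'd like to try the queries without the `%sql` magic, here's a minimal sketch using Python's built-in `sqlite3` module. It doesn't open `factbook.db`. Instead, it loads the five preview rows shown above into an in-memory table, so the helper name `build_sample_db` and the untyped column definitions are assumptions made for illustration, not the real schema.

```python
import sqlite3

# Column names taken from the preview header above.
COLUMNS = ("id", "code", "name", "area", "area_land", "area_water",
           "population", "population_growth", "birth_rate", "death_rate",
           "migration_rate")

# The five preview rows only; the real factbook.db has many more rows.
ROWS = [
    (1, "af", "Afghanistan", 652230, 652230, 0, 32564342, 2.32, 38.57, 13.89, 1.51),
    (2, "al", "Albania", 28748, 27398, 1350, 3029278, 0.3, 12.92, 6.58, 3.3),
    (3, "ag", "Algeria", 2381741, 2381741, 0, 39542166, 1.84, 23.67, 4.31, 0.92),
    (4, "an", "Andorra", 468, 468, 0, 85580, 0.12, 8.13, 6.96, 0.0),
    (5, "ao", "Angola", 1246700, 1246700, 0, 19625353, 2.78, 38.78, 11.49, 0.46),
]


def build_sample_db():
    """Create an in-memory `facts` table holding the sample rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # Untyped columns are valid in SQLite; the real table's types are not known here.
    conn.execute(f"CREATE TABLE facts ({', '.join(COLUMNS)})")
    placeholders = ", ".join("?" * len(COLUMNS))
    conn.executemany(f"INSERT INTO facts VALUES ({placeholders})", ROWS)
    return conn


conn = build_sample_db()
preview = conn.execute("SELECT * FROM facts LIMIT 5").fetchall()
for row in preview:
    print(row)
```

To run against the real database instead, you'd swap `":memory:"` for the path to your local `factbook.db` and skip the `CREATE TABLE` and `INSERT` steps.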


## Data Analysis

### Population and Population Growth
Let's run some SQL queries to explore general population and population growth information:

In [None]:
%%sql
SELECT MIN(population) as min_population
FROM facts;

Done.


min_population
0


In [None]:
%%sql
SELECT MAX(population) as max_population
FROM facts;

Done.


max_population
7256490011


In [None]:
%%sql
SELECT MIN(population_growth) as min_population_growth
FROM facts;

Done.


min_population_growth
0.0


In [None]:
%%sql
SELECT MAX(population_growth) as max_population_growth
FROM facts;

Done.


max_population_growth
4.02
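
The four aggregates above can also be fetched with a single query. Here's a rough sketch using Python's `sqlite3` module, run against an in-memory table that holds only the five preview rows from earlier (not the full `factbook.db`), so the numbers it returns describe that sample only.

```python
import sqlite3

# In-memory stand-in for the facts table: five preview rows, three columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, population_growth REAL)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Afghanistan", 32564342, 2.32),
        ("Albania", 3029278, 0.3),
        ("Algeria", 39542166, 1.84),
        ("Andorra", 85580, 0.12),
        ("Angola", 19625353, 2.78),
    ],
)

# One pass over the table returns all four summary values.
summary = conn.execute(
    """
    SELECT MIN(population)        AS min_population,
           MAX(population)        AS max_population,
           MIN(population_growth) AS min_population_growth,
           MAX(population_growth) AS max_population_growth
      FROM facts
    """
).fetchone()

print(summary)  # (85580, 39542166, 0.12, 2.78) for this five-row sample
```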


From these queries, we can see a few interesting things:

- There's a country with a population of `0`.
- There's a country with a population of `7256490011` (more than 7.2 billion).

Let's see which countries these are:

In [None]:
%%sql
SELECT name
FROM facts
WHERE population = (SELECT MIN(population) FROM facts);

Done.


name
Antarctica


In [None]:
%%sql
SELECT name
FROM facts
WHERE population = (SELECT MAX(population) FROM facts);

Done.


name
World


It looks like the database table contains a row for the `World`, which explains the 7.2 billion. 

Similarly, having a row for `Antarctica` explains the population value of 0.

### Average Population and Area
Let's find the average population and area for a country, excluding `World`, which would bias the results:

In [None]:
%%sql
SELECT AVG(population)
FROM facts
WHERE name != 'World';

Done.


AVG(population)
32242666.56846473


In [None]:
%%sql
SELECT AVG(area)
FROM facts
WHERE name != 'World';

Done.


AVG(area)
555093.546184739


This shows that the average country population is about 32.2 million (`32,242,666`), while the average area is about `555,093` sq km.
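
To get a feel for how much a single aggregate row can skew an average, here's a small sketch with Python's `sqlite3` module. It uses the five preview rows plus a `World` row carrying the population found earlier. The two-column table is a simplification for illustration, and the averages it prints apply to this sample only, not to the full dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("Afghanistan", 32564342),
        ("Albania", 3029278),
        ("Algeria", 39542166),
        ("Andorra", 85580),
        ("Angola", 19625353),
        ("World", 7256490011),  # aggregate row, not a country
    ],
)

# Average over every row, including the World aggregate.
avg_all = conn.execute("SELECT AVG(population) FROM facts").fetchone()[0]

# Average over countries only.
avg_countries = conn.execute(
    "SELECT AVG(population) FROM facts WHERE name <> 'World'"
).fetchone()[0]

print(f"with World:    {avg_all:,.1f}")
print(f"without World: {avg_countries:,.1f}")
```

On this sample, leaving `World` in pushes the average up by well over an order of magnitude, which is why the queries above filter it out.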

### Densely-Populated Countries
Given that we know the average population and area, let's find densely-populated countries. We'll consider them densely populated if they:
- have a `population` size greater than the average
- have an `area` less than the average
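
As a quick sanity check of these two conditions, here's a sketch that applies them to the five preview rows using Python's `sqlite3` module. The in-memory table is a stand-in for `factbook.db`, so the averages used in the comparison are those of the sample, not of the full dataset, and the result will differ from the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, area INTEGER, population INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Afghanistan", 652230, 32564342),
        ("Albania", 28748, 3029278),
        ("Algeria", 2381741, 39542166),
        ("Andorra", 468, 85580),
        ("Angola", 1246700, 19625353),
    ],
)

# Keep rows whose population is above average AND whose area is below average.
rows = conn.execute(
    """
    SELECT name
      FROM facts
     WHERE name <> 'World'
       AND population > (SELECT AVG(population) FROM facts WHERE name <> 'World')
       AND area < (SELECT AVG(area) FROM facts WHERE name <> 'World')
    """
).fetchall()

dense = [name for (name,) in rows]
print(dense)  # ['Afghanistan'] for this five-row sample
```

Note that each condition compares a column against the average of that same column: `population` against `AVG(population)`, and `area` against `AVG(area)`.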

In [None]:
%%sql
SELECT name
FROM facts
WHERE name != 'World'
  AND population > (SELECT AVG(population) FROM facts WHERE name != 'World')
  AND area < (SELECT AVG(area) FROM facts WHERE name != 'World');

Done.


name
Afghanistan
Algeria
Argentina
Bangladesh
Brazil
Burma
Canada
China
Colombia
"Congo, Democratic Republic of the"
