# CIA Factbook Data Analysis with SQL

The CIA Factbook, also known as the [World Factbook](https://www.cia.gov/the-world-factbook/), is an annual publication of the US Central Intelligence Agency. It provides basic intelligence by summarizing information about countries and regions worldwide. The factbook contains demographic and geographic data, among other topics.

This project uses SQL in a Jupyter Notebook to analyze data from the SQLite database [factbook.db](https://dsserver-prod-resources-1.s3.amazonaws.com/257/factbook.db).

## Database Connection

In [1]:
%%capture
%load_ext sql
%sql sqlite:///factbook.db
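
For readers without the `sql` notebook extension, the same kind of query can be run with Python's built-in `sqlite3` module. The following is a minimal sketch, not part of the original notebook: the helper name `run_query` is hypothetical, and an in-memory database stands in for `factbook.db` so that the snippet runs anywhere.

```python
import sqlite3


def run_query(db_path, sql):
    """Run one SQL statement and return (column_names, rows).

    `run_query` is a hypothetical helper used for illustration only.
    """
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(sql)
        # cursor.description holds one 7-tuple per result column; item 0 is the name.
        column_names = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        conn.close()
    return column_names, rows


# ":memory:" is a stand-in; with the downloaded file one would pass "factbook.db".
names, rows = run_query(":memory:", "SELECT 1 AS one;")
print(names, rows)
```

Passing `"factbook.db"` as `db_path` would, under the same assumptions, run the query against the downloaded database file.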

## Data Overview
---
First, let's examine the tables in our database:

In [2]:
%%sql
SELECT *
  FROM sqlite_master
 WHERE type='table';

 * sqlite:///factbook.db
Done.


type,name,tbl_name,rootpage,sql
table,sqlite_sequence,sqlite_sequence,3,"CREATE TABLE sqlite_sequence(name,seq)"
table,facts,facts,47,"CREATE TABLE ""facts"" (""id"" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, ""code"" varchar(255) NOT NULL, ""name"" varchar(255) NOT NULL, ""area"" integer, ""area_land"" integer, ""area_water"" integer, ""population"" integer, ""population_growth"" float, ""birth_rate"" float, ""death_rate"" float, ""migration_rate"" float)"


#### Notes
>- There are two tables in the database, **sqlite_sequence** and **facts**.
>- The sqlite_sequence table is an internal table with two columns, _name_ and _seq_, that SQLite uses to keep track of AUTOINCREMENT values; it holds no factbook data.
>- The facts table contains information on _population, birth rate, migration rate_ and more: this is what we need.
  
We will work with the **facts** table henceforth.
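
The schema shown above can also be inspected programmatically. The sketch below is an illustration rather than part of the original analysis: it recreates the **facts** table in an in-memory database from the `CREATE TABLE` statement printed by `sqlite_master`, then lists the tables and the column names. It also shows that SQLite creates `sqlite_sequence` on its own as soon as a table declares an `AUTOINCREMENT` column.

```python
import sqlite3

# CREATE statement copied from the sqlite_master output above.
CREATE_FACTS = """
CREATE TABLE "facts" (
    "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    "code" varchar(255) NOT NULL,
    "name" varchar(255) NOT NULL,
    "area" integer,
    "area_land" integer,
    "area_water" integer,
    "population" integer,
    "population_growth" float,
    "birth_rate" float,
    "death_rate" float,
    "migration_rate" float
)
"""

# An in-memory database stands in for factbook.db (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(CREATE_FACTS)

# List every table; sqlite_sequence appears because of AUTOINCREMENT.
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;"
    )
]

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(facts);")]
conn.close()

print(tables)
print(columns)
```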

In [3]:
%%sql
-- Examine the first five rows in facts
SELECT *
  FROM facts
 LIMIT 5;

 * sqlite:///factbook.db
Done.


id,code,name,area,area_land,area_water,population,population_growth,birth_rate,death_rate,migration_rate
1,af,Afghanistan,652230,652230,0,32564342,2.32,38.57,13.89,1.51
2,al,Albania,28748,27398,1350,3029278,0.3,12.92,6.58,3.3
3,ag,Algeria,2381741,2381741,0,39542166,1.84,23.67,4.31,0.92
4,an,Andorra,468,468,0,85580,0.12,8.13,6.96,0.0
5,ao,Angola,1246700,1246700,0,19625353,2.78,38.78,11.49,0.46


The facts table comprises **11 columns** with rather intuitive names. Here is a reference dictionary for the column names:
>- **id** - Row identifier (auto-incrementing integer primary key).
>- **code** - The country's two-letter code. The sample values (for example `ag` for Algeria and `an` for Andorra) match the FIPS 10-4 country codes rather than the [internet country codes](https://www.cia.gov/the-world-factbook/field/internet-country-code/).
>- **name** - Name of the country.
>- **area** - The country's total area (land and water) in square kilometers.
>- **area_land** - The country's land area in square kilometers.
>- **area_water** - The country's water area in square kilometers.
>- **population** - The number of inhabitants in the country.
>- **population_growth** - The country's population growth as a percentage.
>- **birth_rate** - The number of births per year per 1,000 inhabitants.
>- **death_rate** - The number of deaths per year per 1,000 inhabitants.
>- **migration_rate** - The difference between the number of persons entering and leaving the country during the year per 1,000 persons.
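
If **area** is the total of land and water, then `area` should equal `area_land + area_water`. The sketch below checks this on the five rows displayed earlier. It is a hedged illustration: the table name `sample_facts` is hypothetical, only the relevant columns are loaded, and the result says nothing about the rest of the database, where values may be missing or inconsistent.

```python
import sqlite3

# The five rows shown in the preview above: (name, area, area_land, area_water).
SAMPLE_ROWS = [
    ("Afghanistan", 652230, 652230, 0),
    ("Albania", 28748, 27398, 1350),
    ("Algeria", 2381741, 2381741, 0),
    ("Andorra", 468, 468, 0),
    ("Angola", 1246700, 1246700, 0),
]

conn = sqlite3.connect(":memory:")
# `sample_facts` is a hypothetical stand-in for the real facts table.
conn.execute(
    "CREATE TABLE sample_facts "
    "(name TEXT, area INTEGER, area_land INTEGER, area_water INTEGER);"
)
conn.executemany("INSERT INTO sample_facts VALUES (?, ?, ?, ?);", SAMPLE_ROWS)

# Rows where the total differs from land + water. Rows with a NULL in any of
# the three columns are not returned, because the comparison evaluates to NULL.
mismatches = conn.execute(
    """
    SELECT name, area, area_land + area_water AS land_plus_water
      FROM sample_facts
     WHERE area <> area_land + area_water;
    """
).fetchall()
conn.close()

print(mismatches)
```

The same `WHERE` clause could be run against the full **facts** table to look for inconsistent entries.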
