# Nigerian Population By State And Region

This notebook analyses a dataset scraped from Wikipedia on Nigerian population by state. The figures come from the 2006 Nigerian population census, so they are well over a decade old and do not represent the current Nigerian population. Nonetheless, the relative rankings of states and regions are expected to still broadly reflect the current reality.

In exploring this dataset, I set out to answer the following questions:
- Which are the five most populous states?
- Which are the five least populous states?
- What is the population of each of the six geopolitical zones?
- Following from the previous question, which are the most and least populous regions?

### Importing Relevant Libraries

In [1]:
import pandas as pd
from numpy import arange 
import matplotlib.pyplot as plt

### Getting the Data

In [2]:
url = "https://en.wikipedia.org/wiki/List_of_Nigerian_states_by_population"

In [3]:
data = pd.read_html(url, header=0)
nigerian_states_pop = data[0]        #getting the first table on the page
#removing unnecessary columns; the remaining population column keeps its original name, "Population (2006)"
nigerian_states_pop = nigerian_states_pop.drop(columns = ["Population (2016)", "Rank (2006)"])
nigerian_states_pop

Unnamed: 0,State,Population (2006)
0,Kano State,9401288
1,Lagos State,9113605
2,Kaduna State,6113503
3,Katsina State,5801584
4,Oyo State,5580894
5,Rivers State,5198605
6,Bauchi State,4653066
7,Jigawa State,4361002
8,Benue State,4253641
9,Anambra State,4177828


### Five most populous states

In [4]:
five_most_populous_states = nigerian_states_pop.head()
five_most_populous_states

Unnamed: 0,State,Population (2006)
0,Kano State,9401288
1,Lagos State,9113605
2,Kaduna State,6113503
3,Katsina State,5801584
4,Oyo State,5580894


### Five least populous states

In [5]:
five_least_populous_states = nigerian_states_pop.tail()
five_least_populous_states

Unnamed: 0,State,Population (2006)
32,Taraba State,2294800
33,Ebonyi State,2176947
34,Nasarawa State,1869377
35,Bayelsa State,1704515
36,Federal Capital Territory,1405201
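
Both cells above use `head()` and `tail()`, which give the right answer only because the Wikipedia table is already sorted by population in descending order. As a minimal sketch that does not depend on row order, `nlargest` and `nsmallest` rank by value instead. The `sample` frame below is an assumption for illustration: it holds six rows copied from the outputs above in shuffled order, not the full table.

```python
import pandas as pd

# Six rows copied from the outputs above, deliberately out of order.
# This is an illustrative sample, not the full 37-row table.
sample = pd.DataFrame(
    {
        "State": [
            "Oyo State",
            "Bayelsa State",
            "Kano State",
            "Taraba State",
            "Lagos State",
            "Federal Capital Territory",
        ],
        "Population (2006)": [5580894, 1704515, 9401288, 2294800, 9113605, 1405201],
    }
)

# nlargest/nsmallest rank by value, so the row order of `sample` does not matter.
two_most_populous = sample.nlargest(2, "Population (2006)")
two_least_populous = sample.nsmallest(2, "Population (2006)")

print(two_most_populous)
print(two_least_populous)
```

On the full table, the same calls with `5` in place of `2` should return the same rows as `head()` and `tail()`, with the least populous states listed in ascending order.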


### Aggregating Population Data by Region

In [6]:
#row labels of the South-South states; the population column is named "Population (2006)"
ss_pop = nigerian_states_pop.loc[[5,11,14,23,26,35],"Population (2006)"].sum()
print('Total South-South Population is {:,}'.format(ss_pop))

In [None]:
sw_pop = nigerian_states_pop.loc[[1,4,15,17,18,28],"Population (2006)"].sum()
print('Total South-West Population is {:,}'.format(sw_pop))

In [None]:
se_pop = nigerian_states_pop.loc[[9,13,21,27,33],"Population (2006)"].sum()
print('Total South-East Population is {:,}'.format(se_pop))

In [None]:
nc_pop = nigerian_states_pop.loc[[8,12,19,24,29,34,36],"Population (2006)"].sum()
print('Total North-Central Population is {:,}'.format(nc_pop))

In [None]:
ne_pop = nigerian_states_pop.loc[[6,10,25,30,31,32],"Population (2006)"].sum()
print('Total North-East Population is {:,}'.format(ne_pop))

In [None]:
nw_pop = nigerian_states_pop.loc[[0,2,3,7,16,20,22],"Population (2006)"].sum()
print('Total North-West Population is {:,}'.format(nw_pop))

In [None]:
reg_pop_dict = {
    "Region": ['South-South', 'South-West', 'South-East', 'North-Central', 'North-East', 'North-West'],
    "Population": [ss_pop, sw_pop, se_pop, nc_pop, ne_pop, nw_pop],
}
labels = [1,2,3,4,5,6]
reg_pop = pd.DataFrame(reg_pop_dict, index=labels)
reg_pop = reg_pop.sort_values(by = "Population", ascending=False)     #largest region first
reg_pop
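
The regional totals above are built from hard-coded row labels, which silently break if the source table is ever reordered. One possible alternative, sketched here under stated assumptions, is to map each state name to its geopolitical zone and aggregate with `groupby`. The `zone_of` dictionary is a hypothetical helper that is not part of the original notebook, and it covers only the ten states shown in the first output, so the sums below are partial totals rather than real regional populations.

```python
import pandas as pd

# The ten most populous states from the first output above (a partial sample).
states = pd.DataFrame(
    {
        "State": [
            "Kano State", "Lagos State", "Kaduna State", "Katsina State", "Oyo State",
            "Rivers State", "Bauchi State", "Jigawa State", "Benue State", "Anambra State",
        ],
        "Population (2006)": [
            9401288, 9113605, 6113503, 5801584, 5580894,
            5198605, 4653066, 4361002, 4253641, 4177828,
        ],
    }
)

# Hypothetical lookup from state name to geopolitical zone.
# It would need to be extended to all 37 rows before use on the full table.
zone_of = {
    "Kano State": "North-West",
    "Lagos State": "South-West",
    "Kaduna State": "North-West",
    "Katsina State": "North-West",
    "Oyo State": "South-West",
    "Rivers State": "South-South",
    "Bauchi State": "North-East",
    "Jigawa State": "North-West",
    "Benue State": "North-Central",
    "Anambra State": "South-East",
}

states["Region"] = states["State"].map(zone_of)

# Fail loudly if any state is missing from the mapping.
assert states["Region"].notna().all(), "unmapped state found"

# Sum the population per region and list the largest first.
partial_totals = (
    states.groupby("Region")["Population (2006)"].sum().sort_values(ascending=False)
)
print(partial_totals)
```

Keying on names rather than positions keeps the aggregation correct even if the rows arrive in a different order, at the cost of maintaining the lookup table.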

### Visualizing Population by Region

In [None]:
col = reg_pop["Region"]

bar_position = arange(6) + 1.0
bar_height = reg_pop["Population"].values

fig,ax = plt.subplots()
ax.bar(bar_position, bar_height, 0.5)

tick_positions = range(1,7)
ax.set_xticks(tick_positions)
ax.set_xticklabels(col,rotation=90)
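ax.set_yticks(range(0,60000000,10000000))       #fixing the tick positions, not just their labels
ax.ticklabel_format(axis="y", style="plain")    #showing full numbers instead of scientific notation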

plt.xlabel("Regions")
plt.ylabel("Population")
plt.title("Nigerian Population by Region")

plt.show()

In [None]:
print('Most Populous Region is the North-West with {:,} people'.format(nw_pop))

In [None]:
print('Least Populous Region is the South-East with {:,} people'.format(se_pop))
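
The two print statements above hard-code the region names, so they would be wrong if the underlying figures changed. A minimal sketch of looking the names up from the data instead uses `idxmax` and `idxmin`. The function `describe_extremes` is a hypothetical helper, not part of the original notebook, and the demo runs on the five most populous states listed earlier because the regional totals are not shown in this notebook's outputs.

```python
import pandas as pd


def describe_extremes(frame, name_col, value_col):
    """Return two sentences naming the rows with the largest and smallest value."""
    top = frame.loc[frame[value_col].idxmax()]
    bottom = frame.loc[frame[value_col].idxmin()]
    most = "Most populous is {} with {:,} people".format(
        top[name_col], int(top[value_col])
    )
    least = "Least populous is {} with {:,} people".format(
        bottom[name_col], int(bottom[value_col])
    )
    return most, least


# Demo on the five most populous states from the earlier output.
top_five = pd.DataFrame(
    {
        "State": ["Kano State", "Lagos State", "Kaduna State", "Katsina State", "Oyo State"],
        "Population (2006)": [9401288, 9113605, 6113503, 5801584, 5580894],
    }
)

most, least = describe_extremes(top_five, "State", "Population (2006)")
print(most)
print(least)
```

In the notebook itself, the equivalent call would be `describe_extremes(reg_pop, "Region", "Population")` once the aggregation cells have run.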