## Northern Ireland Statistics and Research Agency

[2017 Mid Year Population Estimates for Northern Ireland (NEW FORMAT TABLES)
](https://www.nisra.gov.uk/publications/2017-mid-year-population-estimates-northern-ireland-new-format-tables)

In [1]:
from gssutils import *
scraper = Scraper('https://www.nisra.gov.uk/publications/'\
                  '2017-mid-year-population-estimates-northern-ireland-new-format-tables')
scraper


## 2017 Mid Year Population Estimates for Northern Ireland (NEW FORMAT TABLES)

Statistics are available as follows:

### Population Estimates for all geographic levels (not including Super Output
Areas and former Electoral Wards)

Statistics are available for various time periods dependent on the geographic
level selected. For example, Northern Ireland level population estimates are
available from 1971 to 2017, while estimates for Local Government Districts
are available from 2001 to 2017. See the table metadata for more details.

  * Components of population change
  * Population by sex and single year of age
  * Population by sex and age bands
  * Population totals
  * Population Densities
  * Median age

### Population Estimates for Super Output Areas and former Electoral Wards

Time period is 2001 to 2017

  * Population by sex and broad age bands

### Migration Estimates for Northern Ireland

Time period is 2001 to 2017

  * Net migration by sex and single year of age
  * Net migration by sex and age bands
  * Migration flows by type

### Population Estimates by Deprivation Deciles

Northern Ireland population reported by deprivation deciles for both the new
[NIMDM2017](https://www.nisra.gov.uk/statistics/deprivation/northern-ireland-
multiple-deprivation-measure-2017-nimdm2017) and the old
[NIMDM2010](https://www.nisra.gov.uk/statistics/deprivation/northern-ireland-
multiple-deprivation-measure-2010-nimdm2010) measures. Time period is 2001 to
2017.

  * Population by sex and five year age bands

### See also:

  * [Traditional format tables](https://www.nisra.gov.uk/publications/2017-mid-year-population-estimates-northern-ireland)
  * 2017 mid year population and migration [infographic](http://www.ninis2.nisra.gov.uk/InteractiveMaps/DataVis/NI%20Population%202017.pdf "external link opens in a new window / tab") and [interactive map](http://www.ninis2.nisra.gov.uk/InteractiveMaps/Population/Population%20Change/Population%20Estimates%20Broad%20Age%20Bands/atlas.html "external link opens in a new window / tab")
  * 2017 Mid-Year Population Estimates - Components of Population Change - [Interactive Map](http://www.ninis2.nisra.gov.uk/InteractiveMaps/Population/ComponentsPopChange/atlas.html "external link opens in a new window / tab")
  * 2017 Mid-Year Population Estimates - Population Totals - [Interactive Map](http://www.ninis2.nisra.gov.uk/InteractiveMaps/Population/Population%20Change/Population%20Totals/atlas.html "external link opens in a new window / tab")
  * 2017 Mid-Year Population Estimates - [Interactive Population Pyramid](http://www.ninis2.nisra.gov.uk/InteractiveMaps/Population%20Pyramid/2007_2017MYE/NINIS_Pyramid_2017.html "external link opens in a new window / tab")
  * Population Estimates - [UK Comparisons Paper](https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/methodologies/consistencyofmethodsusedforpopulationstatisticsacrossukcountries "external link opens in a new window / tab")


### Distributions

1. All areas - Components of population change ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_CoC.xlsx))
1. All areas - Population by sex and single year of age ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_SYA.xlsx))
1. All areas - Population by sex and age bands ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_AGE_BANDS.xlsx))
1. All areas - Population totals (2001-2017: including historical NI 1821-2017) ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_POP_TOTALS.xlsx))
1. All areas - Population densities (2001-2017) ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_POP_DENSITIES.xlsx))
1. All areas - Median age (2001-2017) ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_MEDIAN_AGE.xlsx))
1. SOAs and Former Electoral Wards - Population by sex and broad age bands (2001-2017) ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_SOA_WARD.xlsx))
1. Northern Ireland - Net migration by sex and single year of age (2001-2017) ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_NETMIG_AGE.xlsx))
1. Northern Ireland - Net migration by sex and age bands (2001-2017) ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_NETMIG_AGE_BANDS.xlsx))
1. Northern Ireland - Migration flows by type (2001-2017) ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_MIG_FLOWS.xlsx))
1. Deprivation Deciles - population by sex and five year age bands (2001-2017) ([MS Excel Spreadsheet](https://www.nisra.gov.uk/sites/nisra.gov.uk/files/publications/MYE17_NIMDM.xlsx))


In [2]:
from databaker.framework import *
import pandas as pd

from rdflib import Namespace
from rdflib.namespace import DCTERMS, VOID

DCAT = Namespace('http://www.w3.org/ns/dcat#')
GDP = Namespace('http://gss-data.org.uk/def/gdp#')
GOVORG = Namespace('https://www.gov.uk/government/organisations/')

def extractMetadata(sheet):
    metadata = {}
    description = []
    prop = None
    name2prop = {
        'National Statistics Theme:': DCAT.theme,
        'Data Subset:': DCAT.theme,
        'Dataset Title:': DCTERMS.title,
        'Coverage:': DCTERMS.spatial,
        'Source:': DCTERMS.publisher,
        'Contact:': DCAT.contactPoint,
        'National Statistics Data?': GDP.nationalStatistics,
        'Responsible Statistician:': DCAT.contactPoint
    }
    section = 'metadata'
    for row in sheet:
        if section == 'metadata':
            if row[1] in name2prop:
                prop = name2prop[row[1]]
            elif row[1] == 'Description of Data':
                section = 'description'
            if section != 'description' and len(row[2]) != 0 and prop:
                if prop in metadata:
                    metadata[prop] = metadata[prop] + " " + row[2]
                else:
                    metadata[prop] = row[2].strip()
        elif section == 'description':
            description.append(row[1])
    metadata[DCTERMS.description] = '\n'.join(description).strip()
    return metadata

In [3]:
book = scraper.distribution(
    title='All areas - Components of population change'
).as_pandas(sheet_name=None) # All sheets as a dict of DataFrames
flat = book['Flat']
%run "NISRA Migration MEY17CoC.ipynb"
all_tidy = tidy.copy()
md1 = extractMetadata(book['Metadata'])

book = scraper.distribution(
    title=lambda x: x.startswith('Northern Ireland - Net migration by sex and age bands')
).as_pandas(sheet_name=None) # All sheets as a dict of DataFrames
flat = book['Flat']
%run "NISRA Migration MYE17 NETMIG AGE BANDS Gender.ipynb"
all_tidy = pd.concat([all_tidy, tidy])
md2 = extractMetadata(book['Metadata'])

book = scraper.distribution(
    title=lambda x: x.startswith('Northern Ireland - Net migration by sex and age bands')
).as_pandas(sheet_name=None) # All sheets as a dict of DataFrames
flat = book['Flat']
%run "NISRA Migration MYE17 NETMIG AGE.ipynb"
all_tidy = pd.concat([all_tidy, tidy])
md3 = extractMetadata(book['Metadata'])

book = scraper.distribution(
    title=lambda x: x.startswith('Northern Ireland - Migration flows by type')
).as_pandas(sheet_name=None) # All sheets as a dict of DataFrames
flat = book['Flat']
%run "NISRA Migration MYE17 NETMIG FLOW.ipynb"
all_tidy = pd.concat([all_tidy, tidy])
md4 = extractMetadata(book['Metadata'])

Unnamed: 0,area,area_code,area_name,year,sort,category,MYE
0,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1,Starting population,1688838
1,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,2,Births,21460
2,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,3,Deaths,14432
3,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,4,Natural Change,7028
4,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,5,Internal Inflows,0
5,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,6,Internal Outflows,0
6,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,7,Internal Net,0
7,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,8,United Kingdom Inflows,12510
8,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,9,United Kingdom Outflows,11589
9,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,10,United Kingdom Net,921


Unnamed: 0,area,area_code,area_name,year,type,gender,age,NETMIG
0,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,00-17,-90
1,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,18-24,-1070
2,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,25-34,1274
3,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,35-44,362
4,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,45-54,181
5,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,55-64,173
6,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,65+,91
7,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,Females,00-17,36
8,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,Females,18-24,-549
9,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,Females,25-34,609


Unnamed: 0,area,area_code,area_name,year,type,gender,age,NETMIG
0,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,00-17,-90
1,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,18-24,-1070
2,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,25-34,1274
3,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,35-44,362
4,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,45-54,181
5,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,55-64,173
6,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,All persons,65+,91
7,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,Females,00-17,36
8,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,Females,18-24,-549
9,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,1. United Kingdom Net,Females,25-34,609


Unnamed: 0,area,area_code,area_name,year,category,sort,MYE
0,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,United Kingdom Inflows,1,12510
1,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,Rest of World Inflows,2,6488
2,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,Total Inflows,3,18998
3,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,United Kingdom Outflows,4,11589
4,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,Rest of World Outflows,5,6393
5,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,Total Outflows,6,17982
6,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,United Kingdom Net,7,921
7,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,Rest of World Net,8,95
8,1. Northern Ireland,N92000002,NORTHERN IRELAND,2001/2002,Total Net,9,1016
9,1. Northern Ireland,N92000002,NORTHERN IRELAND,2002/2003,United Kingdom Inflows,1,11107


We're merging the data from these spreadsheets into the same cube, by making implicit dimensions explicit. The metadata should be the same for each, but it differs slightly: the description for the first table is a bit less verbose; the reference area for the first is a specialization of the rest.

In [4]:
metadata = md4
md4

{rdflib.term.URIRef('http://purl.org/dc/terms/description'): ''}

In [5]:
all_tidy.count()

Mid Year                       16220
Area                           16220
Age                            16220
Sex                            16220
Population Change Component    16220
Measure Type                   16220
Value                          16220
Unit                           16220
dtype: int64

In [6]:
all_tidy.rename(columns={'Area': 'Area of Destination or Origin'}, inplace=True)
all_tidy

Unnamed: 0,Mid Year,Area of Destination or Origin,Age,Sex,Population Change Component,Measure Type,Value,Unit
0,2001-06-30T00:00:00/P1Y,N92000002,all,T,Starting population,Count,1688838,People
1,2001-06-30T00:00:00/P1Y,N92000002,all,T,Births,Count,21460,People
2,2001-06-30T00:00:00/P1Y,N92000002,all,T,Deaths,Count,14432,People
3,2001-06-30T00:00:00/P1Y,N92000002,all,T,Natural Change,Count,7028,People
4,2001-06-30T00:00:00/P1Y,N92000002,all,T,Internal Inflows,Count,0,People
5,2001-06-30T00:00:00/P1Y,N92000002,all,T,Internal Outflows,Count,0,People
6,2001-06-30T00:00:00/P1Y,N92000002,all,T,Internal Net,Count,0,People
7,2001-06-30T00:00:00/P1Y,N92000002,all,T,United Kingdom Inflows,Count,12510,People
8,2001-06-30T00:00:00/P1Y,N92000002,all,T,United Kingdom Outflows,Count,11589,People
9,2001-06-30T00:00:00/P1Y,N92000002,all,T,United Kingdom Net,Count,921,People


In [7]:
out = Path('out')
out.mkdir(exist_ok=True)
all_tidy.to_csv(out / 'migration_nisra.csv', index = False)

NameError: name 'Path' is not defined

Some metadata is in the spreadsheets, above, while some is in the web page at https://www.nisra.gov.uk/publications/2017-mid-year-population-estimates-northern-ireland-new-format-tables

In [None]:
from lxml import html
import os
from rdflib import URIRef, Literal, Dataset, RDFS, RDF
from rdflib.namespace import VOID
import re

pageURL = 'https://www.nisra.gov.uk/publications/2017-mid-year-population-estimates-northern-ireland-new-format-tables'
page = session.get(pageURL)
tree = html.fromstring(page.text)
published = pd.to_datetime(
    tree.xpath("//span[starts-with(text(), 'Date published')]/following-sibling::*/text()")[0].strip()
).tz_localize('Europe/London').date()

metadata[DCTERMS.issued] = Literal(published)

def pathify(label):
    return re.sub('-\$', '',
        re.sub('-+', '-',
            re.sub('[^\\w/]', '-', label.lower())))

base = 'http://gss-data.org.uk'
datasetPath = pathify(os.getenv('JOB_NAME', 'NISRA-NI-migration-estimates'))
metadataURI = URIRef(f'{base}/graph/{datasetPath}/metadata')
datasetURI = URIRef(f'{base}/data/{datasetPath}')

PMD = Namespace('http://publishmydata.com/def/dataset#')
QB = Namespace('http://purl.org/linked-data/cube#')
GDP = Namespace(f'{base}/def/gdp#')

def propObject(prop, s):
    if prop in [DCAT.theme, DCTERMS.title, DCTERMS.spatial, DCTERMS.description,
                DCAT.contactPoint]:
        return Literal(s, 'en')
    elif prop == DCTERMS.publisher:
        return {'NISRA': GOVORG['northern-ireland-statistics-and-research-agency']}.get(
            s, Literal(s, 'en')
        )
    elif prop == GDP.nationalStatistics:
        return Literal(True) if s == 'Yes' else Literal(False)
    else:
        return Literal(s, 'en')

quads = Dataset()
quads.bind('pmd', PMD)
quads.bind('qb', QB)
quads.bind('dct', DCTERMS)
quads.bind('void', VOID)
quads.bind('gdp', GDP)
quads.bind('dcat', DCAT)

mdgraph = quads.graph(metadataURI)
mdgraph.add((datasetURI, RDF.type, PMD.LinkedDataset))
mdgraph.add((datasetURI, RDF.type, PMD.Dataset))
mdgraph.add((datasetURI, RDF.type, QB.DataSet))

for k, v in metadata.items():
    mdgraph.add((datasetURI, k, propObject(k, v)))

mdgraph.add((datasetURI, VOID.sparqlEndpoint, URIRef(f'{base}/sparql')))
mdgraph.add((datasetURI, PMD.graph, URIRef(f'{base}/graph/{datasetPath}')))
mdgraph.add((datasetURI, GDP.family, GDP.migration))

with open(out / 'dataset.trig', 'wb') as f:
    quads.serialize(destination=f, format='trig')