# Migration between Scotland and Overseas

In [1]:
from gssutils import *
scraper = Scraper('https://www.nrscotland.gov.uk/statistics-and-data/statistics/' \
                      'statistics-by-theme/migration/migration-statistics/migration-between-scotland-and-overseas')
scraper

## Migration between Scotland and Overseas

### Description

Migration between Scotland and overseas refers to people moving between
Scotland and any country outside the UK.

Due to the sources of data used by National Records of Scotland (NRS) to
estimate migration, the country or group of countries that overseas migrants
come from cannot be identified.



### Distributions

1. Migration between administrative areas and overseas by sex ([MS Excel Spreadsheet](https://www.nrscotland.gov.uk/files//statistics/migration/2018-july/tab-z1-overseas-mig-flows-admin-sex-hb-2001-02-latest-july-18.xlsx))
1. Migration between administrative areas and overseas by sex ([text/csv](https://www.nrscotland.gov.uk/files//statistics/migration/2018-july/tab-z1-overseas-mig-flows-admin-sex-hb-2001-02-latest-july-18.zip))
1. Migration between Scotland and overseas by age ([MS Excel Spreadsheet](https://www.nrscotland.gov.uk/files//statistics/migration/2018-july/tab-z2-overseas-mig-flows-by-age-scotland-2001-02-latest-july-18.xlsx))
1. Migration between Scotland and overseas by age ([text/csv](https://www.nrscotland.gov.uk/files//statistics/migration/2018-july/tab-z2-overseas-mig-flows-by-age-scotland-2001-02-latest-july-18.zip))


In [2]:
scraper.dataset.theme = metadata.THEME['population']
scraper.dataset

In [3]:
databaker_sheets = {sheet.name: sheet for sheet in scraper.distribution(
    title='Migration between administrative areas and overseas by sex',
    mediaType=Excel).as_databaker()}

In [4]:
next_table = pd.DataFrame()

In [5]:
%%capture

tab = databaker_sheets['Net-Council Area-Sex']
%run "migration-admin-areas-by-sex-net.ipynb"
next_table = pd.concat([next_table, new_table])

tab = databaker_sheets['In-Council Area-Sex']
%run "migration-admin-areas-by-sex-in.ipynb"
next_table = pd.concat([next_table, new_table])

tab = databaker_sheets['Out-Council Area-Sex']
%run "migration-admin-areas-by-sex-out.ipynb"
next_table = pd.concat([next_table, new_table])
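
The cells above grow `next_table` with one `pd.concat` per sheet. As a minimal sketch of an alternative pattern, the per-sheet frames can be collected in a list and concatenated once. The frames and values below are made-up stand-ins for the `new_table` each sub-notebook produces, and `ignore_index=True` is an assumption that the original cells do not make.

```python
import pandas as pd

# Hypothetical stand-ins for the `new_table` produced by each sub-notebook.
frames = [
    pd.DataFrame({'Flow': ['net'], 'Value': [10]}),
    pd.DataFrame({'Flow': ['in'], 'Value': [25]}),
    pd.DataFrame({'Flow': ['out'], 'Value': [15]}),
]

# A single concat at the end avoids re-copying the growing table at each step.
combined = pd.concat(frames, ignore_index=True)
```

With `ignore_index=True` the combined frame gets a fresh 0..n-1 index instead of repeating each sheet's own index.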



In [6]:
distribution = scraper.distribution(
    title='Migration between Scotland and overseas by age',
    mediaType='application/vnd.ms-excel')
tabs = distribution.as_databaker()

In [7]:
%run "migration-by-age-2001-to-2017.ipynb"
next_table = pd.concat([next_table, Final_table])

In [8]:
tab = distribution.as_pandas(sheet_name = 'SYOA Females (2001-)')
%run "migration-by-age-2001-to-2017-females.ipynb"
next_table = pd.concat([next_table, Final_table])

In [9]:
%run "migration-by-age-2001-to-2017-persons.ipynb"
next_table = pd.concat([next_table, Final_table])




In [10]:
%run "migration-by-age-2001-to-2017-males.ipynb"
next_table = pd.concat([next_table, Final_table])

ERROR:File `'\'"migration-by-age-2001-to-2017-males.ipynb"\'.py'` not found.


In [11]:
next_table.count()

Area of Destination or Origin    19439
Mid Year                         19439
Sex                              19439
Age                              19439
Flow                             19439
Measure Type                     19439
Value                            19439
Unit                             19439
dtype: int64

In [12]:
next_table = next_table.rename(columns={'Area of Destination or Origin': 'Area of Destination or Origin1'})

In [13]:
import pandas as pd
# Lookup table mapping area labels to geography codes.
c = pd.read_csv("scottish-geo-lookup.csv")

In [14]:
c

Unnamed: 0,label,notation
0,Scotland,S92000003
1,Clackmannanshire,S12000005
2,Glasgow City,S12000046
3,Dumfries and Galloway,S12000006
4,East Ayrshire,S12000008
5,East Lothian,S12000010
6,East Renfrewshire,S12000011
7,Falkirk,S12000014
8,Fife,S12000015
9,Highland,S12000017


In [15]:
Final_table = pd.merge(next_table, c, how = 'left', left_on = 'Area of Destination or Origin1', right_on = 'label')

In [16]:
Final_table = Final_table.rename(columns={'notation': 'Area of Destination or Origin'})

In [17]:
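# Assign the result back: in-place fillna on a selected column is unreliable in recent pandas.
Final_table['Area of Destination or Origin'] = Final_table['Area of Destination or Origin'].fillna('None')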

In [18]:
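def code_or_label(code, label):
    """Return the geography code, or the original label where the lookup had no match."""
    # 'None' is the placeholder written by the fillna step for unmatched areas.
    if code == 'None':
        return label
    return code

Final_table['Area of Destination or Origin'] = Final_table.apply(
    lambda row: code_or_label(row['Area of Destination or Origin'],
                              row['Area of Destination or Origin1']),
    axis=1)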
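
The merge, placeholder fill and row-wise fallback above can be expressed more compactly. The sketch below shows one way to do it on toy data, assuming the goal is "use the geography code where the lookup matched, otherwise keep the original label". The `observations` and `lookup` frames, the `Area` column name and the `Overseas` row are illustrative only.

```python
import pandas as pd

# Illustrative data only: one area present in the lookup, one absent.
observations = pd.DataFrame({
    'Area': ['Fife', 'Overseas'],
    'Value': [1, 2],
})
lookup = pd.DataFrame({
    'label': ['Fife'],
    'notation': ['S12000015'],
})

# Left merge keeps every observation; unmatched rows get NaN in 'notation'.
merged = observations.merge(lookup, how='left', left_on='Area', right_on='label')

# Take the code where it exists, otherwise fall back to the original label.
merged['Area'] = merged['notation'].fillna(merged['Area'])
merged = merged[['Area', 'Value']]
```

Passing a Series to `fillna` fills each missing value from the same row of that Series, so no string placeholder or row-wise `apply` is needed.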


In [19]:
Final_table = Final_table[['Area of Destination or Origin','Mid Year','Sex','Age', 'Flow','Measure Type','Value','Unit']]

In [20]:
Final_table['Value'] = Final_table['Value'].astype(int)

In [21]:
Final_table = Final_table[Final_table['Mid Year'] != '']

In [22]:
Final_table = Final_table[Final_table['Mid Year'] != 'Year']
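
The two filters above drop rows where `Mid Year` is empty or holds the repeated header text `Year`. A possible single-step form uses `isin`, shown here on a made-up frame rather than the real table.

```python
import pandas as pd

# Toy frame with a blank row and a stray header row mixed into the data.
table = pd.DataFrame({
    'Mid Year': ['2001', '', 'Year', '2002'],
    'Value': [1, 2, 3, 4],
})

# Keep only rows whose 'Mid Year' is neither blank nor the header text.
cleaned = table[~table['Mid Year'].isin(['', 'Year'])]
```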

In [23]:
Final_table['Age'] = Final_table['Age'].map(
    lambda x: {
        'nrs/all' : 'all', 
        'year/all' : 'all',
        }.get(x, x))


In [24]:
Final_table['Flow'] = Final_table['Flow'].str.lower()

In [25]:
Final_table['Flow'] = Final_table['Flow'].map(
    lambda x: {
        'total' : 'resident'
        }.get(x, x))
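
The `map` with a dictionary `.get(x, x)` above recodes selected values and leaves the rest unchanged. As a sketch, `Series.replace` with a dictionary appears to give the same result for exact whole-value matches. The input values here are illustrative, not taken from the source sheets.

```python
import pandas as pd

# Illustrative flow labels; only 'total' has a recoding.
flow = pd.Series(['Total', 'In', 'Out']).str.lower()

# Values missing from the mapping pass through unchanged.
flow = flow.replace({'total': 'resident'})
```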

In [26]:
from pathlib import Path
out = Path('out')
out.mkdir(exist_ok=True)
Final_table.drop_duplicates().to_csv(out / 'tidy.csv', index = False)

In [27]:
scraper.dataset.family = 'migration'
scraper.dataset.license = 'http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/'

# Use a distinct handle name so the `metadata` name used for `metadata.THEME` is not shadowed.
with open(out / 'dataset.trig', 'wb') as trig_file:
    trig_file.write(scraper.generate_trig())