# Manually editing zone translation files

This notebook manually edits the `X_to_Y_spatial.csv` translation files, where `X` is a region-specific zone system (e.g. LSOA2021-YH) and `Y` is a country-wide zone system (e.g. LAD2021). It adds the "missing" zones of the country-wide zone system to each file and infills their translation values with 0. This is needed because translating LSOA2021-YH to LAD2021 (for example) only generates correspondence rows for the LAD2021 zones within the YH subset. As a result, when trying to add DVectors (or apply some other operator), zones are missing from `DVector.data` and the zone columns are then dropped.

This is not a perfect workaround, but because every added translation value is 0, it should have no impact on any calculations.
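
As a rough illustration of the infill idea, the sketch below builds a toy correspondence and pads it out to every country-wide zone with 0. The zone names, column names and the `master_zones` list are all made up for the example, and it uses `reindex` over a full index rather than the concat-and-groupby approach used later in this notebook.

```python
import pandas as pd

# Toy correspondence: two regional zones that both map to country-wide zone "A".
# All zone and column names here are hypothetical, for illustration only.
sub = pd.DataFrame(
    {
        "region_id": ["r1", "r2"],
        "country_id": ["A", "A"],
        "region_to_country": [1.0, 1.0],
    }
)
master_zones = ["A", "B", "C"]  # hypothetical country-wide zone list

# Build every (regional zone, country-wide zone) pair and fill the gaps with 0.
full_index = pd.MultiIndex.from_product(
    [sub["region_id"].unique(), master_zones],
    names=["region_id", "country_id"],
)
infilled = (
    sub.set_index(["region_id", "country_id"])
    .reindex(full_index, fill_value=0.0)
    .reset_index()
)
print(infilled)
```

The original rows keep their values, every other pair gets a factor of 0, and the column total is unchanged, which is why the infill should not affect downstream sums.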

In [1]:
from pathlib import Path
from itertools import product

import pandas as pd

### Define constants and zone systems

In [2]:
# define cache path
CACHE_PATH = Path(r'F:\Working\Land-Use\CACHE')

In [3]:
# define the "master" zone system that we want to add to the translation files
# this should be the country-wide zone system you are trying to translate the region-specific DVectors to
ZONE_SYSTEM = 'LAD2021+SCOTLANDLAD'

In [4]:
# define the regions of the region-specific zone systems
GORS = ['EM', 'EoE', 'Lon', 'NE', 'NW', 'SE', 'SW', 'Wales', 'WM', 'YH']
# GORS = ['Scotland']

In [5]:
# define the zone systems you wish to translate to the ZONE_SYSTEM level (e.g. LSOA2021)
ZONE_SYSTEMS = ['LSOA2021']
# ZONE_SYSTEMS = ['DZ2011']

### Read in the master zone definition to make sure all zones are accounted for

In [6]:
master = pd.read_csv(CACHE_PATH / ZONE_SYSTEM / 'zoning.csv')

In [7]:
master.head()

Unnamed: 0,zone_id,descriptions
0,S12000005,Clackmannanshire
1,S12000006,Dumfries and Galloway
2,S12000008,East Ayrshire
3,S12000010,East Lothian
4,S12000011,East Renfrewshire
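
Since the infill loops over `master['zone_id']`, it may be worth confirming that the ID column has no missing or repeated values before using it. The helper below, `check_zone_ids`, is a hypothetical name and not part of any library, and the small `master` frame is a stand-in for the real `zoning.csv` contents.

```python
import pandas as pd

# Stand-in for the zoning.csv contents; these rows are illustrative only.
master = pd.DataFrame(
    {
        "zone_id": ["S12000005", "S12000006", "S12000008"],
        "descriptions": ["Clackmannanshire", "Dumfries and Galloway", "East Ayrshire"],
    }
)


def check_zone_ids(zoning: pd.DataFrame, id_col: str = "zone_id") -> None:
    """Raise ValueError if the zone ID column has missing or repeated values."""
    if zoning[id_col].isna().any():
        raise ValueError(f"{id_col} contains missing values")
    duplicated = zoning.loc[zoning[id_col].duplicated(), id_col].tolist()
    if duplicated:
        raise ValueError(f"{id_col} contains repeated values: {duplicated}")


check_zone_ids(master)  # passes silently when the IDs are clean
```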


### Add in "missing" zones from the master zone system to the translation files (with 0 data)

In [8]:
for SYSTEM, GOR in product(ZONE_SYSTEMS, GORS):
    translation_file = CACHE_PATH / f'{ZONE_SYSTEM}_{SYSTEM}-{GOR}' / f'{ZONE_SYSTEM}_to_{SYSTEM}-{GOR}_spatial.csv'
    # translation_file = CACHE_PATH / f'{SYSTEM}-{GOR}_{ZONE_SYSTEM}' / f'{SYSTEM}-{GOR}_to_{ZONE_SYSTEM}_spatial.csv'
    # translation_file = CACHE_PATH / f'{SYSTEM}_{ZONE_SYSTEM}' / f'{SYSTEM}_to_{ZONE_SYSTEM}_spatial.csv'
    # read in translation file (this should be a direct output of caf.space.translate)
    sub = pd.read_csv(translation_file)
    dfs = [sub]
    # add zero-valued rows pairing every region-specific zone with each zone in the master zone system
    for zone in master['zone_id'].unique():
        mat = sub.copy()
        mat[f'{ZONE_SYSTEM}_id'] = zone
        mat[f'{SYSTEM}-{GOR}_to_{ZONE_SYSTEM}'] = 0
        mat[f'{ZONE_SYSTEM}_to_{SYSTEM}-{GOR}'] = 0
        # mat[f'{SYSTEM}_to_{ZONE_SYSTEM}'] = 0
        # mat[f'{ZONE_SYSTEM}_to_{SYSTEM}'] = 0
        dfs.append(mat)
    
    # recombine to a single output file
    output = pd.concat(dfs).groupby(
        [f'{SYSTEM}-{GOR}_id', f'{ZONE_SYSTEM}_id']
        # [f'{SYSTEM}_id', f'{ZONE_SYSTEM}_id']
    ).agg(
        {
            f'{SYSTEM}-{GOR}_to_{ZONE_SYSTEM}': 'sum',
            f'{ZONE_SYSTEM}_to_{SYSTEM}-{GOR}': 'sum'
            # f'{SYSTEM}_to_{ZONE_SYSTEM}': 'sum',
            # f'{ZONE_SYSTEM}_to_{SYSTEM}': 'sum'
        }
    ).reset_index()
    # overwrite existing zone translation file
    output.to_csv(translation_file, index=False)
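
Because the loop overwrites the translation files in place, a quick check before writing could be useful. The sketch below repeats the concat-and-groupby pattern on toy data and then tests that every zone pair is present exactly once and that the factor totals are unchanged. `check_infill` and all column and zone names are hypothetical, chosen only for this example.

```python
import pandas as pd


def check_infill(original, infilled, master_ids, sub_col, master_col, factor_cols):
    """Return True if `infilled` covers every zone pair once and keeps the factor totals."""
    expected_rows = original[sub_col].nunique() * len(set(master_ids))
    pairs_unique = not infilled.duplicated([sub_col, master_col]).any()
    all_present = set(infilled[master_col]) == set(master_ids)
    totals_kept = all(
        abs(original[col].sum() - infilled[col].sum()) < 1e-9 for col in factor_cols
    )
    return bool(
        len(infilled) == expected_rows and pairs_unique and all_present and totals_kept
    )


# Toy correspondence with made-up zone and column names.
original = pd.DataFrame(
    {
        "sub_id": ["r1", "r2"],
        "master_id": ["A", "A"],
        "sub_to_master": [1.0, 1.0],
        "master_to_sub": [0.4, 0.6],
    }
)
master_ids = ["A", "B"]
factor_cols = ["sub_to_master", "master_to_sub"]

# Same pattern as the loop above: append zero-valued copies, then sum per zone pair.
dfs = [original]
for zone in master_ids:
    zeros = original.copy()
    zeros["master_id"] = zone
    zeros[factor_cols] = 0.0
    dfs.append(zeros)
infilled = (
    pd.concat(dfs)
    .groupby(["sub_id", "master_id"], as_index=False)[factor_cols]
    .sum()
)

print(check_infill(original, infilled, master_ids, "sub_id", "master_id", factor_cols))
```

The check assumes every country-wide zone in the original file also appears in the master list; if it does not, the `all_present` comparison fails.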