# Combine Partition Maps

This notebook combines the per-country partition maps (train, valid, and all), which are derived from DHS cluster data, into a single set of partition maps covering all countries.


## File System Structure



## Input
<pre style="font-family: monospace;">
./GIS-Image-Stack-Processing
    /AOI/
        Partitions/
            PK/
                <span style="color: blue;">PK_all.json</span> 
                <span style="color: blue;">PK_train.json</span> 
                <span style="color: blue;">PK_valid.json</span> 
            TD/
                <span style="color: blue;">TD_all.json</span> 
                <span style="color: blue;">TD_train.json</span> 
                <span style="color: blue;">TD_valid.json</span> 
</pre>

## Output

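The per-country partition maps, which are based on the location of DHS clusters, are merged into three combined files (`all.json`, `train.json`, and `valid.json`) written to the top level of the `Partitions` directory.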

<pre style="font-family: monospace;">
./GIS-Image-Stack-Processing
    /AOI/
        Partitions/
            <span style="color: blue;">all.json</span> 
            <span style="color: blue;">train.json</span> 
            <span style="color: blue;">valid.json</span> 
            PK/
                PK_all.json
                PK_train.json
                PK_valid.json
            TD/
                TD_all.json
                TD_train.json
                TD_valid.json   

</pre>
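
The merge can be pictured with a small in-memory sketch. The JSON layout used here (a top-level key equal to the country code, with a list of cluster identifiers as its value) is an assumption made for illustration; the real files may hold different values. The file contents and cluster names are hypothetical.

```python
from collections import defaultdict

# Hypothetical contents of two per-country files (layout is an assumption).
per_country = {
    "PK_train.json": {"PK": ["cluster_a", "cluster_b"]},
    "TD_train.json": {"TD": ["cluster_c"]},
}

combined = defaultdict(dict)
for filename, data in per_country.items():
    # "PK_train.json" -> country_code "PK", partition_type "train"
    country_code, partition_type = filename[:-len(".json")].split("_")
    # Merge only when the data is keyed by the file's own country code
    if country_code in data:
        combined[partition_type].update(data)

print(dict(combined))
```

Because the merge relies on `dict.update`, a key that appears in more than one file would be overwritten by the file read last, so this approach presumes that top-level keys are unique per country.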

In [1]:
import os
import json
from collections import defaultdict

In [3]:
PRT_ROOT = './GIS-Image-Stack-Processing/AOI/Partitions'

## Combine Partition Maps

In [4]:
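def combine_partition_maps(directory):
    """Merge per-country partition maps into combined partition maps.

    Reads the *_train.json, *_valid.json, and *_all.json files found in each
    country subfolder of `directory` and writes one combined JSON file per
    partition type (e.g. train.json) into `directory` itself.
    """
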
    # Dictionary to hold combined data: keys are 'train', 'valid', 'all'
    combined_partitions = defaultdict(dict)

    # Traverse each subdirectory within the main directory
    for subfolder in os.listdir(directory):
        
        subfolder_path = os.path.join(directory, subfolder)
        if os.path.isdir(subfolder_path):  # Ensure it is a directory
           
            # Scan the subfolder for json files
            for filename in os.listdir(subfolder_path):
                
                if filename.endswith(('_train.json', '_valid.json', '_all.json')):
                    partition_type = filename.split('_')[-1].replace('.json', '')  # Extract 'train', 'valid', or 'all'
                    country_code = filename.split('_')[0]  # Extract country code like 'PK', 'TD'

                    # Load the JSON data from the file
                    with open(os.path.join(subfolder_path, filename), 'r') as file:
                        data = json.load(file)
                    
                    # Merge the data into the combined map for this partition type,
                    # but only if it has a top-level key matching the country code
                    if country_code in data:
                        combined_partitions[partition_type].update(data)

    # Write out the combined data to new JSON files
    for partition_type, data in combined_partitions.items():
        
        output_path = os.path.join(directory, f"{partition_type}.json")
        
        with open(output_path, 'w') as file:
            json.dump(data, file, indent=4)
        
        print(f"Combined {partition_type} partition map saved to {output_path}")

In [5]:
combine_partition_maps(PRT_ROOT)

Combined train partition map saved to ./GIS-Image-Stack-Processing/AOI/Partitions/train.json
Combined all partition map saved to ./GIS-Image-Stack-Processing/AOI/Partitions/all.json
Combined valid partition map saved to ./GIS-Image-Stack-Processing/AOI/Partitions/valid.json
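
As a quick sanity check after the run, the combined files can be reloaded and their top-level keys listed. This is a minimal sketch: `summarize_combined` is a hypothetical helper that is not part of the notebook, and it assumes each combined file holds a JSON object at the top level.

```python
import json
import os


def summarize_combined(directory, partition_types=("all", "train", "valid")):
    """Return {partition_type: sorted top-level keys} for each combined file found."""
    summary = {}
    for partition_type in partition_types:
        path = os.path.join(directory, f"{partition_type}.json")
        if not os.path.isfile(path):
            continue  # Skip partitions that have not been written yet
        with open(path, "r") as file:
            # Sorting a dict yields its keys in sorted order
            summary[partition_type] = sorted(json.load(file))
    return summary


PRT_ROOT = './GIS-Image-Stack-Processing/AOI/Partitions'
for partition_type, keys in summarize_combined(PRT_ROOT).items():
    print(f"{partition_type}: {keys}")
```

If the per-country files are keyed by country code, each combined file would be expected to list every country that has a matching per-country file; a missing code would suggest that a file was skipped by the `country_code in data` check.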
