# Create Layer Config Backup

This notebook describes how to create a remote backup of the gfw layer configs.

Rough process:

- Run this notebook from the `gfw/data` folder
- Wait...
- Check `_metadata.json` files in the `production` and `staging` folders for changes
- If everything looks good, make a PR
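
The "check `_metadata.json`" step in the list above can be sketched as a small helper, to be run once the backup has finished. This is only an illustration: `summarise_metadata` is a hypothetical name, and it assumes the `_metadata.json` layout written by the final cell of this notebook (a `differences` object with `changed`, `added` and `removed` lists).

```python
import json
from pathlib import Path


def summarise_metadata(layers_path, envs=("staging", "production")):
    """Count changed/added/removed ids recorded in each env's _metadata.json.

    Hypothetical helper, not part of LMIPy. Returns {env: counts or None}.
    """
    summary = {}
    for env in envs:
        meta_file = Path(layers_path) / env / "_metadata.json"
        if not meta_file.exists():
            # No backup has been written for this environment yet
            summary[env] = None
            continue
        with open(meta_file) as f:
            meta = json.load(f)
        diffs = meta.get("differences", {})
        summary[env] = {
            key: len(diffs.get(key, []))
            for key in ("changed", "added", "removed")
        }
    return summary


# './layers' is the path used later in this notebook; adjust if yours differs
for env, counts in summarise_metadata("./layers").items():
    print(env, counts)
```

If every count is zero, nothing has changed since the previous backup and a PR is probably not needed.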

First, install the latest version of LMIPy

In [2]:
!pip install LMIPy

from IPython.display import clear_output
clear_output()

print('LMI ready!')

LMI ready!


Next, import the relevant modules.

In [38]:
import LMIPy as lmi
import os
import json
from pprint import pprint
from datetime import datetime

Then pull the gfw repo and check that the following path points to the `data/layers` folder, which should contain a `production` and a `staging` folder.

In [11]:
envs = ['staging', 'production']

In [4]:
path = './layers'

In [12]:
# Check correct folders are found

if not all([folder in os.listdir(path) for folder in envs]):
    print(f'Boo! Incorrect path: {path}')
else:
    print('Good to go!')

Good to go!
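
If the check above fails, it helps to know which folder is missing. A minimal sketch of a more informative check follows; `missing_env_folders` is a hypothetical helper name, not something provided by LMIPy.

```python
import os


def missing_env_folders(path, envs):
    """Return the names in `envs` that are not sub-directories of `path`."""
    if not os.path.isdir(path):
        # The base path itself is wrong, so every env folder is missing
        return list(envs)
    return [env for env in envs if not os.path.isdir(os.path.join(path, env))]


# Same values as the cells above
missing = missing_env_folders('./layers', ['staging', 'production'])
if missing:
    print(f'Boo! Missing folders: {missing}')
else:
    print('Good to go!')
```

Unlike `os.listdir`, this does not raise when the base path does not exist, and it also rejects plain files that happen to be named `staging` or `production`.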


Run the following to save the current layers as `.json` files and log any changes in `_metadata.json`.

In [58]:
%%time
for env in envs:
    
    # Get all old ids
    old_ids = [file.split('.json')[0] for file in os.listdir(path + f'/{env}') if '_metadata' not in file]
    
    old_datasets = []
    files = os.listdir(path + f'/{env}')
    
    # Extract all old datasets
    for file in files:
        if '_metadata' not in file:
            with open(path + f'/{env}/{file}') as f:
                old_datasets.append(json.load(f))
    
    # Now pull all current gfw datasets and save
    col = lmi.Collection(app=['gfw'], env=env)
    col.save(path + f'/{env}')
    
    # Get all new ids
    new_ids = [file.split('.json')[0] for file in os.listdir(path + f'/{env}') if '_metadata' not in file]
    
    # See which are new, and which have been removed
    added = list(set(new_ids) - set(old_ids))
    removed = list(set(old_ids) - set(new_ids))
    changed = []
    
    # Compare old and new, logging those that have changed
    for old_dataset in old_datasets:
        ds_id = old_dataset['id']
        new_file = path + f'/{env}/{ds_id}.json'
        # A missing file means the dataset was removed, not changed
        if not os.path.exists(new_file):
            continue
        with open(new_file) as f:
            new_dataset = json.load(f)
        
        if old_dataset != new_dataset:
            changed.append(ds_id)
    
    # Create metadata json
    with open(path + f'/{env}/_metadata.json', 'w') as f:
        
        meta = {
            'updatedAt': datetime.today().strftime('%Y-%m-%d@%Hh-%Mm-%Ss'),
            'env': env,
            'differences': {
                'changed': changed,
                'added': added,
                'removed': removed
            }
        }
        
        # And save it too!
        json.dump(meta,f)
        
print('Done!')

0it [00:00, ?it/s]

Saving to path: ./gfw/data/layers/staging


11it [00:01,  8.96it/s]
0it [00:00, ?it/s]

Saving to path: ./gfw/data/layers/production


242it [00:12, 19.65it/s]
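
The comparison logic in the cell above can be isolated from the file handling. The sketch below is a minimal illustration, assuming each snapshot is held in memory as a dict keyed by dataset id; `diff_snapshots` and the sample data are hypothetical and are not part of the notebook or of LMIPy.

```python
def diff_snapshots(old, new):
    """Compare two {id: dataset_dict} snapshots.

    Returns the same 'differences' structure that the notebook writes
    to _metadata.json, with each id list sorted for stable output.
    """
    old_ids, new_ids = set(old), set(new)
    return {
        # Present in both snapshots, but with different content
        'changed': sorted(i for i in old_ids & new_ids if old[i] != new[i]),
        # Only in the new snapshot
        'added': sorted(new_ids - old_ids),
        # Only in the old snapshot
        'removed': sorted(old_ids - new_ids),
    }


# Made-up example data, for illustration only
old = {'a': {'id': 'a', 'name': 'Tree cover'}, 'b': {'id': 'b', 'name': 'Fires'}}
new = {'a': {'id': 'a', 'name': 'Tree cover 2020'}, 'c': {'id': 'c', 'name': 'Mangroves'}}

print(diff_snapshots(old, new))
# {'changed': ['a'], 'added': ['c'], 'removed': ['b']}
```

Restricting `changed` to ids present in both snapshots keeps the three lists disjoint, so a removed dataset is never also reported as changed.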
