# DE Africa Coastlines useful tools <img align="right" src="https://github.com/digitalearthafrica/deafrica-sandbox-notebooks/raw/main/Supplementary_data/DE_Africa_Logo_Stacked_RGB_small.jpg">

This notebook contains useful code snippets for processing and manipulating DE Africa Coastlines data.


---

## Getting started
Set working directory to top level of repo to ensure links work correctly:

In [1]:
cd ..

/g/data/dea-coastlines/deafrica-coastlines


### Load packages

Import the required Python packages and load the notebook extensions used for profiling and automatic module reloading.

In [2]:
%matplotlib inline
%load_ext line_profiler
%load_ext autoreload
%autoreload 2

import os
import sys
import numpy as np
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt


## Extract style table from GeoPackage

In [None]:
import zipfile
with zipfile.ZipFile('../coastlines_v0.2.2.zip', 'r') as zip_ref:
    zip_ref.extractall()

In [3]:
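# Load the 'layer_styles' table from the GeoPackage and export it as a CSV.
# Depending on the geopandas version, a non-spatial table may or may not come
# back with a 'geometry' column, so drop it only if it is present.
layer = gpd.read_file("coastlines_cli_update (6).gpkg", layer="layer_styles")
layer.drop(columns="geometry", errors="ignore").to_csv("coastlines/styles.csv", index=False)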
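
A GeoPackage is an SQLite database, so a non-spatial table such as `layer_styles` can also be read without geopandas. The following is a minimal sketch of that alternative, not the method used elsewhere in this notebook; the helper names `list_gpkg_tables` and `read_gpkg_table` are hypothetical and introduced only for illustration.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def list_gpkg_tables(gpkg_path):
    """Return the names of all tables stored in a GeoPackage (SQLite) file."""
    with closing(sqlite3.connect(gpkg_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [row[0] for row in rows]


def read_gpkg_table(gpkg_path, table):
    """Read one table from a GeoPackage into a plain pandas DataFrame."""
    # Check the name against the file's own table list before building the query
    if table not in list_gpkg_tables(gpkg_path):
        raise ValueError(f"Table {table!r} not found in {gpkg_path}")
    with closing(sqlite3.connect(gpkg_path)) as conn:
        return pd.read_sql_query(f'SELECT * FROM "{table}"', conn)
```

For example, `read_gpkg_table("coastlines_cli_update (6).gpkg", "layer_styles").to_csv("coastlines/styles.csv", index=False)` would write the same kind of style table as the cell above, assuming the file contains a `layer_styles` table.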

## View output files on S3

In [None]:
# !aws s3 --no-sign-request --region=af-south-1 ls --recursive s3://deafrica-data-dev-af/coastlines/ | grep '.gpkg$'

In [None]:
# !aws s3 --no-sign-request --region=af-south-1 ls --recursive s3://deafrica-data-staging-af/coastlines/
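
The same listing can be done from Python instead of the AWS CLI. The sketch below assumes `boto3` is installed and that the buckets allow anonymous access, as the `--no-sign-request` flag above implies; the helper names `anonymous_s3_client` and `list_keys` are hypothetical.

```python
def anonymous_s3_client(region="af-south-1"):
    """Create an unsigned S3 client (the equivalent of --no-sign-request)."""
    # Imported here so the listing helper below works without boto3 installed
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    return boto3.client(
        "s3", region_name=region, config=Config(signature_version=UNSIGNED)
    )


def list_keys(client, bucket, prefix="", suffix=""):
    """Return all object keys under `prefix` that end with `suffix`."""
    paginator = client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix have no 'Contents' entry
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(suffix):
                keys.append(obj["Key"])
    return keys


# Example usage (requires network access):
# client = anonymous_s3_client()
# list_keys(client, "deafrica-data-dev-af", prefix="coastlines/", suffix=".gpkg")
```

Passing the client in as an argument keeps `list_keys` easy to test with a stand-in object and lets the same function be reused for the dev and staging buckets.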


## Run status per tile from Argo YAML

In [None]:
import pandas as pd
import yaml

# Load Argo job status
with open('run_status.yaml') as f:
    data = yaml.safe_load(f)

# Keep only nodes that have inputs (i.e. jobs that were given parameters)
data_cleaned = {
    name: node for name, node in data['status']['nodes'].items() if 'inputs' in node
}

# Record each job's first input parameter and its exit code
# (None if the job has no outputs or no exit code was recorded)
df = pd.DataFrame(
    [
        (
            node["inputs"]["parameters"][0]["value"],
            node.get("outputs", {}).get("exitCode"),
        )
        for node in data_cleaned.values()
    ],
    columns=["id", "error"],
)

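# Drop non-tile jobs (the version string and grid URL also appear as first parameters)
non_tiles = [
    'v0.2.3',
    'https://deafrica-input-datasets.s3.af-south-1.amazonaws.com/deafrica-coastlines/32km_coastal_grid_deafrica.geojson',
]
# Copy the filtered rows so the assignment below does not modify a view of df
df = df.loc[~df.id.isin(non_tiles)].copy()

# Export to CSV that can be merged with tile grid
df['id'] = df['id'].astype(int)
df.to_csv('tile_status.csv', index=False)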
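
One way the exported status table could be merged with the tile grid is sketched below. It assumes the grid has an `id` column that matches the tile IDs in `tile_status.csv`, which is not confirmed by this notebook; the function name `join_tile_status` and the status labels are hypothetical.

```python
import numpy as np
import pandas as pd


def join_tile_status(grid, status, on="id"):
    """Left-join job status onto the tile grid and add a readable status label."""
    # indicator=True adds a '_merge' column marking tiles that had no job at all
    merged = grid.merge(status, on=on, how="left", indicator=True)

    # Exit codes may be strings (from YAML) or numbers (from CSV)
    code = pd.to_numeric(merged["error"], errors="coerce")

    merged["status"] = np.select(
        [merged["_merge"] == "left_only", code.isna(), code == 0],
        ["no job", "no exit code", "succeeded"],
        default="failed",
    )
    return merged.drop(columns="_merge")
```

Because `merge` is also available on a GeoDataFrame, the same call should work if `grid` is loaded with `gpd.read_file`, for example `join_tile_status(grid, pd.read_csv("tile_status.csv"))`.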

***

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 
Digital Earth Africa data is licensed under the [Creative Commons by Attribution 4.0](https://creativecommons.org/licenses/by/4.0/) license.

**Contact:** For assistance with any of the Python code or Jupyter Notebooks in this repository, please post a [GitHub issue](https://github.com/GeoscienceAustralia/DEACoastLines/issues/new).

**Last modified:** May 2022