<img src="../../../img/harmonize_logo.png" align="right" width="66"/>


#### <span style="color:#336699">Introduction to Earth Observation Data Cubes tuned for Health Response (EDPU)<br> STAC functions in Python 
 </span>
<hr style="border:1px solid #0077b9;">

The [**E**ODCtHRS **D**ata **PU**blisher (EDPU)](https://github.com/Harmonize-Brazil/edpu) is a Python package developed by the Harmonize Brazil Team (INPE) to publish data of interest to the project on the EODCtHRS computing platform. The package is a set of scripts that work with GeoServer and the SpatioTemporal Asset Catalog (STAC), the technologies adopted in the scope of the Brazil Data Cube (BDC) project.

Although the Harmonize project dataset contains heterogeneous data (tabular, vector and raster), the workflow to accommodate all of them is the same: each dataset is published to GeoServer as a layer and its metadata is published to STAC. With this in mind, we developed this package to streamline the publication of data to EODCtHRS. The workflow is shown in the figure below.

<img src="../../../img/general_workflow.jpg" align="center"/>

The EDPU provides a set of functions that publish health, climate and drone image data as "features" and "coverages" on GeoServer, along with styles that customize the data visualization according to the values present. It can also enable the temporal dimension of each published layer, so that the layer can be visualized over time.

The metadata for each module of the Harmonize project is published via STAC. The EDPU also provides a set of functions for dynamically creating and sharing the collections and items related to each dataset in the STAC catalog.

#### <span style="color:#336699"><b>First Step</b> 
 </span>


To run the examples in this Jupyter Notebook, you need to install the [edpu](https://github.com/Harmonize-Brazil/edpu.git) package. 

##### <span style="color:#336699"> Note </span>
<hr style="border:1px solid #0077b9;">
<br>
If you want to create a new Python virtual environment, follow these instructions in a terminal before installing the package:

<ul> <li>First, create a new virtual environment linked to Python 3.7:
        
    $ python3.7 -m venv venv
</ul>

<ul><li>Activate the new environment:

    $ source venv/bin/activate
</ul>

<ul><li>Update pip, wheel and setuptools:

    $ pip3 install --upgrade pip setuptools wheel
</ul>
<br>
<hr style="border:1px solid #0077b9;">

##### <span style="color:#336699"> Install </span>

<ul> <li>Use git to clone the software repository:

    $ git clone https://github.com/Harmonize-Brazil/edpu.git
</ul>

<ul><li>Go to the source code folder:

    $ cd edpu
</ul>

<ul><li>Install in development mode:

    $ pip3 install -e .
</ul>

##### <span style="color:#336699"> Import Packages </span>
Let's load the json package and the stac module of the edpu package:

In [1]:
import json
from edpu import stac

##### <span style="color:#336699"> Publishing a new Collection </span>

To publish a new collection, the EDPU uses the BDC-Catalog package. The user must provide a JSON file with the collection's information, following the metadata standard defined by the BDC team. To make this easier, the EDPU provides JSON collection templates for publishing Health, Climate and Drone Image metadata to STAC. These templates can be found in the following directory:
<center><b>./edpu/templates/jsons/*.json</b></center>
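
As a minimal sketch of how these templates could be inspected before choosing one, the snippet below lists the JSON files in a template directory and loads one of them with the standard `json` module. The helpers `list_templates` and `load_template` are hypothetical names used only for illustration and are not part of edpu. The directory path assumes the notebook runs from the root of the cloned repository.

```python
import json
from pathlib import Path


def list_templates(template_dir):
    """Return the sorted paths of the JSON files found in template_dir."""
    directory = Path(template_dir)
    if not directory.is_dir():
        # A missing directory simply yields no templates.
        return []
    return sorted(directory.glob("*.json"))


def load_template(path):
    """Read one template file and return its content as a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Assumed location, relative to the root of the cloned edpu repository.
for template_path in list_templates("./edpu/templates/jsons"):
    print(template_path.name)
```

Reading a template this way shows which keys it already defines, which helps decide what to override in the new collection.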

As the package provides an example collection structure, the user only needs to create a <i>dictionary</i> with the updates they want to make to the new collection. An example is provided below.

In [3]:
new_informations = {
    "name": "dengue_mortality_rate_municipality_week",
    "version": 1,
    "metadata": {
            "wms": {
                "url": "http://localhost:10190/geoserver/bdc_lcc/wms",
                "layerName": "bdc_lcc:dengue_mortality_rate_municipality_week"
            }
    },
    "bands": [
        {
            "name": "TABULAR",
            "common_name": "tabular",
            "description": "This is the data in a tabular format",
            "data_type": "uint8",
            "mime_type": "application/octet-stream",
            "min_value": 0,
            "max_value": 0,
            "nodata": None,
            "resolution_x": None,
            "resolution_y": None,
            "center_wavelength": None,
            "full_width_half_max": None
        },
        {
            "name": "THUMBNAIL",
            "common_name": "thumbnail",
            "description": "This is the quicklook of data",
            "data_type": "uint8",
            "mime_type": "image/png",
            "min_value": 0,
            "max_value": 0,
            "nodata": None,
            "resolution_x": None,
            "resolution_y": None,
            "center_wavelength": None,
            "full_width_half_max": None
        },
        {
            "name": "GEOJSON",
            "common_name": "geojson",
            "description": "This is the data in a geospatial format",
            "data_type": "uint8",
            "mime_type": "application/geo+json",
            "min_value": 0,
            "max_value": 0,
            "nodata": None,
            "resolution_x": None,
            "resolution_y": None,
            "center_wavelength": None,
            "full_width_half_max": None
        },
        {
            "name": "SHAPEFILE",
            "common_name": "shapefile",
            "description": "This is the data in a shapefile format",
            "data_type": "uint8",
            "mime_type": "application/zip",
            "min_value": 0,
            "max_value": 0,
            "nodata": None,
            "resolution_x": None,
            "resolution_y": None,
            "center_wavelength": None,
            "full_width_half_max": None
        }
    ]
}

In [9]:
new_informations
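
To illustrate how a dictionary of updates can be overlaid on a template, here is a minimal sketch of a recursive merge. It is an assumption about the general idea, not necessarily how edpu combines the two internally. `apply_updates` is a hypothetical helper and the template fragment is invented for the example. Nested dictionaries are merged key by key, while any other value, including a list such as "bands", replaces the template value entirely.

```python
import copy


def apply_updates(template, updates):
    """Return a copy of template with updates applied recursively."""
    merged = copy.deepcopy(template)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested dictionaries instead of overwriting them.
            merged[key] = apply_updates(merged[key], value)
        else:
            # Scalars and lists replace the template value.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical template fragment, for illustration only.
template = {
    "name": "template_name",
    "version": 0,
    "description": "template description",
    "metadata": {"wms": {"url": "", "layerName": ""}},
}
updates = {
    "name": "dengue_mortality_rate_municipality_week",
    "version": 1,
    "metadata": {"wms": {"url": "http://localhost:10190/geoserver/bdc_lcc/wms"}},
}

collection = apply_updates(template, updates)
print(collection)
```

Keys that the updates do not mention, such as "description" and "layerName" here, keep their template values, which is why only the changed fields need to be written out.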

Once the metadata dictionary for the new collection is ready, you can publish it on BDC-STAC with the *publish_collection()* function, which accepts the following parameters:

<ul>
    <li> <b>data</b>: this parameter is <b>required</b>. It is a dictionary with the collection's metadata.<br><br>
    <li> <b>template</b>: this parameter is <b>required</b>. It represents the JSON metadata model of the collection. You can pass a keyword such as Health, Climate or Drone, or a path to a template file. A keyword selects the corresponding edpu template, while a path selects a template provided by the user.<br><br>
    <li> <b>output_file</b>: this parameter is <b>optional</b>. It is a string containing the path where the JSON file of the new collection is stored. By default, its value is None. <br><br>
    <li> <b>del_output_file</b>: this parameter controls whether the output_file is deleted. If True, the output_file is deleted. By default, its value is True. <br><br>
    <li> <b>workspace</b>: this parameter receives the name of the GeoServer workspace. By default, its value is bdc_lcc.
</ul>
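
To make the defaults in the list above concrete, the sketch below collects the documented parameters into a dictionary of keyword arguments. `build_publish_args` is a hypothetical helper and not part of edpu. Its defaults are copied from the list above and have not been checked against the actual function signature. The output path is only an example value.

```python
def build_publish_args(data, template, output_file=None,
                       del_output_file=True, workspace="bdc_lcc"):
    """Collect the documented publish_collection() parameters in a dict."""
    if not isinstance(data, dict):
        raise TypeError("data must be a dictionary with the collection metadata")
    if not template:
        raise ValueError("template must be a keyword or a path to a JSON template")
    return {
        "data": data,
        "template": template,
        "output_file": output_file,
        "del_output_file": del_output_file,
        "workspace": workspace,
    }


# Keep the generated JSON file for inspection instead of deleting it.
publish_args = build_publish_args(
    data={"name": "dengue_mortality_rate_municipality_week", "version": 1},
    template="health",
    output_file="./new_collection.json",  # example path
    del_output_file=False,
)
print(publish_args)

# With edpu installed and the services configured, the call would be:
# col_id = stac.publish_collection(**publish_args)
```

Setting del_output_file to False together with an explicit output_file is useful when you want to review the generated collection JSON after publishing.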


In [8]:
col_id = stac.publish_collection(data=new_informations, template='health')
print(col_id)