# <span style="color:#336699; text-align:center">Introduction to the Python Client Library Harmonize (harmonizeds.py)</span>
<hr style="border:2px solid #0077b9;">

<br/>

# Python Client API
<hr style="border:1px solid #0077b9;">

To run the examples in this Jupyter Notebook, you need to install the [Python client](https://github.com/brazil-data-cube/harmonizeds.py). To install it with pip, use the following command:

In [None]:
!pip install git+https://github.com/Harmonize-Brazil/harmonize-ds

To access the functionalities of the client API, import the `HARMONIZEDS` class from the `harmonize_ds` package, as follows:

In [None]:
from harmonize_ds import HARMONIZEDS

After that, you can check the installed version of the package:

In [None]:
import harmonize_ds  # the module itself must be imported to read its attributes
harmonize_ds.__version__
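
If you prefer to check the version without importing the package, the standard library can read it from the installed metadata. The sketch below is only an illustration: the helper `installed_version` is hypothetical, and the distribution name `harmonize-ds` is an assumption based on the repository name, so it may differ from the name actually registered by pip.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# "harmonize-ds" is an assumed distribution name; adjust it if pip reports another one.
for name in ("harmonize-ds", "geopandas", "matplotlib"):
    print(name, installed_version(name) or "not installed")
```

This also gives a quick way to confirm that the other libraries used later in this notebook are available.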

# Listing the Available Datasets
<hr style="border:1px solid #0077b9;">

The ``collections()`` method of the ``HARMONIZEDS`` object lists the datasets (collections) available from the service:

In [None]:
print("Available collections:")
for collection in HARMONIZEDS.collections():
    print(collection)
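
When the service exposes many collections, it can help to narrow the list by keyword. The sketch below is a minimal illustration with a hypothetical helper, `find_collections`. It assumes only that each returned item has a meaningful string representation, which has not been verified against the library. Apart from the Zika collection used later in this notebook, the names in `sample` are made up for the example.

```python
def find_collections(collections, keyword):
    """Return the items whose string form contains `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    return [item for item in collections if keyword in str(item).lower()]


# Stand-in for the result of HARMONIZEDS.collections(); the last two names are hypothetical.
sample = [
    "bdc_lcc:zika_cases_north_mun_week",
    "bdc_lcc:hypothetical_dengue_cases",
    "bdc_lcc:hypothetical_temperature",
]

print(find_collections(sample, "zika"))
```

With the real service, you would pass `HARMONIZEDS.collections()` in place of `sample`.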

# Retrieving the Metadata of a Dataset
<hr style="border:1px solid #0077b9;">

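You can access the metadata of a specific dataset by passing the identifier of the data source (``id``) and the identifier of the collection (``collection_id``) to ``get_collection()``, and then calling ``describe()`` on the result.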

In [None]:
zica = HARMONIZEDS.get_collection(
    id="HARMONIZE-WFS", collection_id="bdc_lcc:zika_cases_north_mun_week"
)

In [None]:
metadata_zica = zica.describe()
metadata_zica

# Retrieving the data
<hr style="border:1px solid #0077b9;">

To retrieve the data of a dataset, use the ``get()`` method. It returns the data as a ``GeoPandas`` ``GeoDataFrame``. The ``filter`` argument restricts the result, here to a single date.

In [None]:
df = zica.get(
        filter={
            'date': '2017-02-26'
        }
)
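
The collection name suggests weekly data, so you may want one filter per week. The sketch below builds such filters with the standard library only. The helper `weekly_filters` is hypothetical, and it assumes that records are keyed by dates seven days apart, starting from the date used above. Whether ``get()`` must be called once per date is also an assumption.

```python
from datetime import date, timedelta


def weekly_filters(start, weeks):
    """Build one {'date': 'YYYY-MM-DD'} filter per week, starting at `start`."""
    first = date.fromisoformat(start)
    return [
        {"date": (first + timedelta(weeks=offset)).isoformat()}
        for offset in range(weeks)
    ]


filters = weekly_filters("2017-02-26", 3)
print(filters)

# Assumed usage with the collection object from this notebook:
# frames = [zica.get(filter=f) for f in filters]
```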

In [None]:
# head() returns the first rows of the GeoDataFrame (five by default)
df.head()

# Visualizing the data
<hr style="border:1px solid #0077b9;">

It is possible to plot the dataset data with the ``plot`` method:

In [None]:
!pip install matplotlib

In [None]:
import matplotlib.pyplot as plt

In [None]:
df.plot(marker='o', color='red', markersize=5, figsize=(20, 20));

# Visualizing the data with GeoPandas and other data
<hr style="border:1px solid #0077b9;">

After retrieving the data of a dataset, you can use any Python library to process it. In this section we show how to use ``GeoPandas`` to load other data and combine it with the dataset. With GeoPandas installed, import the library:

In [None]:
import geopandas as gpd
from matplotlib import pyplot as plt

Next, define the files to import. In this example we use ``read_file()`` to open zipped shapefiles. The data can be found in [unidades_da_federacao](http://servicodados.ibge.gov.br/Download/Download.ashx?u=geoftp.ibge.gov.br/organizacao_do_territorio/malhas_territoriais/malhas_municipais/municipio_2017/Brasil/BR/br_unidades_da_federacao.zip) and [Biomas_250mil](ftp://geoftp.ibge.gov.br/informacoes_ambientais/estudos_ambientais/biomas/vetores/Biomas_250mil.zip).

In [None]:
file_biomas = "https://geoftp.ibge.gov.br/informacoes_ambientais/estudos_ambientais/biomas/vetores/Biomas_250mil.zip"
file_uf = "https://geoftp.ibge.gov.br/organizacao_do_territorio/malhas_territoriais/malhas_municipais/municipio_2020/Brasil/BR/BR_UF_2020.zip"
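
Reading directly from a URL downloads the archive every time the cell runs. If that is too slow, one option is to keep a local copy and read from it instead. The sketch below uses only the standard library. The helpers `local_cache_path` and `download_once` and the `data` directory are hypothetical names introduced for illustration.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_cache_path(url, cache_dir="data"):
    """Return the local path where the file behind `url` would be stored."""
    return Path(cache_dir) / Path(urlparse(url).path).name


def download_once(url, cache_dir="data"):
    """Download `url` into `cache_dir` unless a copy is already there."""
    target = local_cache_path(url, cache_dir)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, target)
    return target


biomas_url = "https://geoftp.ibge.gov.br/informacoes_ambientais/estudos_ambientais/biomas/vetores/Biomas_250mil.zip"
print(local_cache_path(biomas_url))

# Assumed usage: read the cached archive instead of the remote one.
# biomas = gpd.read_file(download_once(biomas_url))
```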

In [None]:
# Load the IBGE biome data
biomas = gpd.read_file(file_biomas)

In [None]:
biomas

In [None]:
uf = gpd.read_file(file_uf)

In [None]:
uf.head()

The code below plots the biomes (``biomas``), the federative units (``uf``) and the Zika samples (``df``) on a single map:

In [None]:
fig, ax = plt.subplots(figsize=(20, 15))

# Biomes as the colored base layer, with a legend
biomas.plot(
    ax=ax, cmap='Set2', column='Bioma', edgecolor='black',
    legend=True, legend_kwds={'title': "Biomes", 'fontsize': 15}
)

# State boundaries as thin outlines
uf.geometry.boundary.plot(ax=ax, color=None, edgecolor='black', linewidth=0.2)

# Zika samples on top
df.plot(ax=ax, color='red', markersize=4, edgecolor='black', linewidth=0.1);

# Save data to file
<hr style="border:1px solid #0077b9;">

You can save the data from a dataset to a shapefile using the ``save_feature`` method. Pass the output file name in the ``filename`` parameter and the ``GeoDataFrame`` to save in the ``gdf`` parameter. In the example below, the data ``df`` from the ``zica`` dataset is saved to a shapefile named ``my_save_bdc_obs.shp``:

In [None]:
HARMONIZEDS.save_feature(filename="my_save_bdc_obs.shp", gdf=df)
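
The shapefile format stores attributes in a DBF table, where field names are limited to 10 characters, so longer column names are usually truncated when the file is written. The sketch below shows one way to choose short, unique names yourself before saving. The helper `shapefile_safe_names` and the column names are hypothetical and are not taken from the dataset.

```python
def shapefile_safe_names(columns, limit=10):
    """Map each column name to a unique name of at most `limit` characters."""
    mapping = {}
    used = set()
    for name in columns:
        candidate = name[:limit]
        suffix = 1
        # On a collision, replace the end of the name with a numeric suffix.
        while candidate in used:
            tail = str(suffix)
            candidate = name[:limit - len(tail)] + tail
            suffix += 1
        used.add(candidate)
        mapping[name] = candidate
    return mapping


# Hypothetical attribute columns (the geometry column is not a DBF field, so leave it out).
columns = ["geocode", "municipality_name", "municipality_code", "cases"]
renames = shapefile_safe_names(columns)
print(renames)

# Assumed usage before saving:
# df = df.rename(columns=renames)
```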