## Query HRVPP catalogue via OpenSearch and directly download data

This notebook shows how to query the HRVPP catalogue service. The HRVPP catalogue service is an alternative to the WEkEO HDA interface and implements a standardized [OpenSearch interface](http://docs.opengeospatial.org/is/17-047r1/17-047r1.html).
The catalogue service also comes with a [Python client](https://github.com/VITObelgium/terracatalogueclient), which allows for easy integration in Python notebooks and Python-based processing chains.

We'll also show how to download data over HTTP using WEkEO credentials.

### Table of contents
* [Install & import packages](#install-import)
* [Discover collections](#discover-collections)
* [Search products](#search-products)
* [Download data](#download-data)

#### Install & import packages <a class="anchor" id="install-import"></a>

Let's start by installing the Python catalogue client:

In [22]:
!pip3 install --user --quiet --index-url=https://artifactory.vgt.vito.be/api/pypi/python-packages/simple terracatalogueclient==0.1.14

Next, we import some required packages and initialize the catalogue client. Make sure to select the HRVPP catalogue environment:

In [23]:
from terracatalogueclient import Catalogue
from terracatalogueclient.config import CatalogueConfig
from terracatalogueclient.config import CatalogueEnvironment

# make sure to retrieve config for the HRVPP catalogue
config = CatalogueConfig.from_environment(CatalogueEnvironment.HRVPP)
catalogue = Catalogue(config)

#### Discover collections <a class="anchor" id="discover-collections"></a>

We'll first query the available HRVPP collections. The output lists one VI (Vegetation Indices) collection in UTM projection, and ST (Seasonal Trajectories) and VPP (Vegetation Phenology and Productivity parameters) collections in both UTM and LAEA projections:

In [24]:
import pandas as pd
collections = catalogue.get_collections()

rows = []
for c in collections:
    rows.append([c.id, c.properties['title']])

df = pd.DataFrame(data = rows, columns = ['Identifier', 'Description'])
df.style.set_properties(**{'text-align': 'left'})

| | Identifier | Description |
|---|---|---|
| 0 | copernicus_r_3035_x_m_hrvpp-st_p_2017-now_v01 | Seasonal Trajectories, 10-daily, LAEA projection |
| 1 | copernicus_r_3035_x_m_hrvpp-vpp_p_2017-now_v01 | Vegetation Phenology and Productivity parameters, yearly, LAEA projection |
| 2 | copernicus_r_utm-wgs84_10_m_hrvpp-st_p_2017-now_v01 | Seasonal Trajectories, 10-daily, UTM projection |
| 3 | copernicus_r_utm-wgs84_10_m_hrvpp-vi_p_2017-now_v01 | Vegetation Indices, daily, UTM projection |
| 4 | copernicus_r_utm-wgs84_10_m_hrvpp-vpp_p_2017-now_v01 | Vegetation Phenology and Productivity parameters, yearly, UTM projection |
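
The identifiers above appear to encode the projection (`3035` for LAEA, `utm-wgs84` for UTM) and the product family (`st`, `vi`, `vpp`). As a minimal sketch, assuming that naming pattern holds, the listing can be narrowed to the UTM collections with plain string matching. The list below is copied from the output above rather than fetched from the catalogue, and `collections_listed` and `utm_collections` are illustrative names.

```python
# Collection identifiers and titles copied from the output above.
collections_listed = [
    ("copernicus_r_3035_x_m_hrvpp-st_p_2017-now_v01",
     "Seasonal Trajectories, 10-daily, LAEA projection"),
    ("copernicus_r_3035_x_m_hrvpp-vpp_p_2017-now_v01",
     "Vegetation Phenology and Productivity parameters, yearly, LAEA projection"),
    ("copernicus_r_utm-wgs84_10_m_hrvpp-st_p_2017-now_v01",
     "Seasonal Trajectories, 10-daily, UTM projection"),
    ("copernicus_r_utm-wgs84_10_m_hrvpp-vi_p_2017-now_v01",
     "Vegetation Indices, daily, UTM projection"),
    ("copernicus_r_utm-wgs84_10_m_hrvpp-vpp_p_2017-now_v01",
     "Vegetation Phenology and Productivity parameters, yearly, UTM projection"),
]

# Assumption: UTM collections carry "utm-wgs84" in their identifier.
utm_collections = [cid for cid, _title in collections_listed if "utm-wgs84" in cid]

for cid in utm_collections:
    print(cid)
```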


#### Search products <a class="anchor" id="search-products"></a>

Let's search for all VPP parameters for one UTM tile for 2020 using the VPP UTM collection.

In [25]:
import pandas as pd
import datetime as dt

rows = []
products = catalogue.get_products(
    "copernicus_r_utm-wgs84_10_m_hrvpp-vpp_p_2017-now_v01",
    start=dt.date(2020, 1, 1),
    end=dt.date(2020, 12, 31),
    tileId="31UFS"
)
for product in products:
    rows.append([product.id, product.data[0].href, (product.data[0].length/(1024*1024))])

df = pd.DataFrame(data = rows, columns = ['Identifier', 'URL', 'Size (MB)'])
df.style.set_properties(**{'text-align': 'left'})

| | Identifier | URL | Size (MB) |
|---|---|---|---|
| 0 | VPP_2020_S2_T31UFS-010m_V101_s1_AMPL | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_AMPL.tif | 235.381 |
| 1 | VPP_2020_S2_T31UFS-010m_V101_s1_EOSD | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_EOSD.tif | 136.215 |
| 2 | VPP_2020_S2_T31UFS-010m_V101_s1_EOSV | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_EOSV.tif | 214.04 |
| 3 | VPP_2020_S2_T31UFS-010m_V101_s1_LENGTH | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_LENGTH.tif | 141.398 |
| 4 | VPP_2020_S2_T31UFS-010m_V101_s1_LSLOPE | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_LSLOPE.tif | 167.612 |
| 5 | VPP_2020_S2_T31UFS-010m_V101_s1_MAXD | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_MAXD.tif | 127.254 |
| 6 | VPP_2020_S2_T31UFS-010m_V101_s1_MAXV | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_MAXV.tif | 235.424 |
| 7 | VPP_2020_S2_T31UFS-010m_V101_s1_MINV | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_MINV.tif | 191.541 |
| 8 | VPP_2020_S2_T31UFS-010m_V101_s1_QFLAG | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_QFLAG.tif | 18.2049 |
| 9 | VPP_2020_S2_T31UFS-010m_V101_s1_RSLOPE | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_RSLOPE.tif | 160.235 |
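
The product identifiers listed above follow a regular pattern, so they can be split into their parts when you need to group or filter results locally. The sketch below is hedged: the field names (`sensor`, `resolution`, `version`, and so on) are an interpretation of the listed identifiers, not an official naming specification, and `parse_product_id` is a hypothetical helper that is not part of `terracatalogueclient`.

```python
from typing import NamedTuple


class ProductName(NamedTuple):
    product: str     # e.g. "VPP"
    year: int        # e.g. 2020
    sensor: str      # e.g. "S2" (assumed to denote the sensor)
    tile: str        # e.g. "31UFS", matching the tileId query parameter
    resolution: str  # e.g. "010m"
    version: str     # e.g. "V101"
    season: str      # e.g. "s1", matching productGroupId
    parameter: str   # e.g. "AMPL", matching productType


def parse_product_id(product_id: str) -> ProductName:
    """Split an identifier such as VPP_2020_S2_T31UFS-010m_V101_s1_AMPL."""
    product, year, sensor, tile_res, version, season, parameter = product_id.split("_")
    tile, resolution = tile_res.split("-")
    # Drop the leading "T" so the tile matches the value passed as tileId.
    if tile.startswith("T"):
        tile = tile[1:]
    return ProductName(product, int(year), sensor, tile, resolution,
                       version, season, parameter)


parsed = parse_product_id("VPP_2020_S2_T31UFS-010m_V101_s1_AMPL")
print(parsed.tile, parsed.season, parsed.parameter)
```

An identifier that does not have exactly seven underscore-separated parts raises a `ValueError`, which is a reasonable signal that the assumed pattern does not apply.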


You can also filter on `productType` and on season (`productGroupId`), as shown below.

Products are returned as a Python generator. To iterate over the results more than once, convert the generator to a list.
Note that this loads all results into memory, which can be substantial when the query returns many products.

In [26]:
import datetime as dt
products = catalogue.get_products(
    "copernicus_r_utm-wgs84_10_m_hrvpp-vpp_p_2017-now_v01",
    start=dt.date(2020, 1, 1),
    end=dt.date(2020, 12, 31),
    tileId="31UFS",
    productType="SOSV",
    productGroupId="s1"
)
product_list = list(products)

rows = []
for product in product_list:
    rows.append([product.id, product.data[0].href, (product.data[0].length/(1024*1024))])

df = pd.DataFrame(data = rows, columns = ['Identifier', 'URL', 'Size (MB)'])
df.style.set_properties(**{'text-align': 'left'})

| | Identifier | URL | Size (MB) |
|---|---|---|---|
| 0 | VPP_2020_S2_T31UFS-010m_V101_s1_SOSV | https://phenology.vgt.vito.be/download/VPP_V01/2020/CLMS/Pan-European/Biophysical/VPP/v01/2020/s1/VPP_2020_S2_T31UFS-010m_V101_s1_SOSV.tif | 218.985 |
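
The note about generators is easy to see with plain Python. The sketch below uses a stand-in generator, `fake_products`, instead of `catalogue.get_products()`, so it runs without catalogue access; the behaviour shown is standard for any Python generator. It also shows `itertools.islice` as one way to preview a few results without loading them all.

```python
from itertools import islice


def fake_products():
    """Stand-in for catalogue.get_products(): yields items lazily."""
    for name in ("AMPL", "EOSD", "EOSV", "LENGTH"):
        yield name


products = fake_products()
first_pass = list(products)   # consumes the generator
second_pass = list(products)  # the generator is exhausted, so this is empty

print(len(first_pass), len(second_pass))  # prints: 4 0

# islice stops after the first two items and never requests the rest.
preview = list(islice(fake_products(), 2))
print(preview)  # prints: ['AMPL', 'EOSD']
```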


#### Download data <a class="anchor" id="download-data"></a>

Let's download this data to the notebook environment. This requires a WEkEO account, so let's retrieve the user credentials first:

In [27]:
wekeo_username = input()

 demo4clms


In [28]:
import getpass
wekeo_password = getpass.getpass()

 ···········


In [29]:
catalogue.authenticate_non_interactive(username=wekeo_username, password=wekeo_password)

<terracatalogueclient.client.Catalogue at 0x7f4f39022ca0>
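
When the notebook runs unattended, the interactive prompts can be replaced by environment variables. This is a minimal sketch under stated assumptions: the variable names `WEKEO_USERNAME` and `WEKEO_PASSWORD` and the helper `read_credentials` are hypothetical and are not defined by WEkEO or by `terracatalogueclient`. Only the `authenticate_non_interactive` call in the trailing comment comes from the cell above.

```python
import os


def read_credentials(env=os.environ):
    """Return (username, password) from env, or raise if either is missing."""
    try:
        # Hypothetical variable names; pick whatever your deployment uses.
        return env["WEKEO_USERNAME"], env["WEKEO_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None


# Usage in the notebook (not executed here):
# wekeo_username, wekeo_password = read_credentials()
# catalogue.authenticate_non_interactive(username=wekeo_username, password=wekeo_password)
```

Passing the mapping as a parameter keeps the helper easy to test with an ordinary dictionary.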

In [30]:
catalogue.download_products(product_list, '/tmp')

You are about to download 229.62 MB, do you want to continue? [Y/n]  Y
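
The prompt reports 229.62 MB, while the search table listed 218.985 for the same file. The two figures agree if the prompt counts decimal megabytes (10^6 bytes), whereas the table divides the byte count by 1024*1024. That the client reports decimal units is an inference from these two numbers, not something stated by the library. The arithmetic can be checked as follows:

```python
# Value from the 'Size (MB)' column, computed there as bytes / (1024 * 1024).
size_mib = 218.985

# Approximate byte count; the table value is rounded, so this is not exact.
size_bytes = size_mib * 1024 * 1024

# The same size expressed in decimal megabytes (10**6 bytes).
size_mb = size_bytes / 1_000_000

print(f"{size_mb:.2f} MB")  # prints: 229.62 MB
```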


Finally, check that the data has been downloaded:

In [31]:
import os
os.listdir('/tmp/VPP_2020_S2_T31UFS-010m_V101_s1_SOSV/')

['VPP_2020_S2_T31UFS-010m_V101_s1_SOSV.tif']
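
For more than a handful of products, listing directories by hand becomes tedious. The sketch below checks each product programmatically. It assumes the layout seen above, `<base_dir>/<product id>/<product id>.tif`, holds for every product; `missing_downloads` is a hypothetical helper, not part of `terracatalogueclient`. The demonstration uses a temporary directory so it runs without any downloaded data.

```python
import tempfile
from pathlib import Path


def missing_downloads(base_dir, product_ids):
    """Return the product ids with no non-empty .tif under base_dir/<id>/."""
    missing = []
    for pid in product_ids:
        tif = Path(base_dir) / pid / f"{pid}.tif"
        if not tif.is_file() or tif.stat().st_size == 0:
            missing.append(pid)
    return missing


# Demonstration with a temporary directory standing in for '/tmp'.
with tempfile.TemporaryDirectory() as tmp:
    pid = "VPP_2020_S2_T31UFS-010m_V101_s1_SOSV"
    (Path(tmp) / pid).mkdir()
    (Path(tmp) / pid / f"{pid}.tif").write_bytes(b"\x00")
    result = missing_downloads(tmp, [pid, "not_downloaded"])

print(result)  # prints: ['not_downloaded']
```

In the notebook, the equivalent call would be `missing_downloads('/tmp', [p.id for p in product_list])`, where an empty list means every file is present.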