## Calculating Performance Assessment Indicators (PAIs) from WaPOR

#### Background

All the waterpip functions have been organized into classes to automate as much of the process as possible. Most importantly, folder structuring and naming are automated by initiating the *WaporStructure* class in the background. A user sets their project directory using the *waterpip_directory* and *project_name* inputs and the functions take care of the rest.

#### Performance Assessment Indicators (PAIs)

Increasing competition for, and limited availability of, water and land resources put a serious constraint on agricultural production systems. Sustainable land and water management practices will be critical to expand production efficiently and address food insecurity while limiting impact on the ecosystem. This requires a good understanding of how agricultural systems are performing and their potential for improvement. Variables affecting the performance of agricultural systems are both biophysical (climate, soil, topography) and socio-ecological (market, infrastructure, farm management, available inputs). The proposed approach is built on performance assessment indicators that look at satellite observations of the actual crop production and water consumption from the WaPOR database. The indicators focus on the actual performance of the agricultural system and the underlying biophysical factors, but as a satellite-based approach it cannot provide information on the underlying socio-ecological variables.
The approach is based on a number of Performance Assessment Indicators (PAIs) that are derived from FAO WaPOR data on crops, water consumption and growth. The indicators estimate Water Productivity (WP) and Land Productivity (yield) performance at various levels for specific crop types for a selected area and time period.


##### NOTE: 
If this is your first time running this notebook, please read the instructions below and then follow the steps to analyse the data. The following description is nearly the same as in *03_waterpip_analysis_basics*, except that it does not require previously existing data to run.

## 1 Import modules/libraries

In [1]:
import os
from datetime import datetime
from waterpip.scripts.analysis.wapor_analysis import WaporAnalysis
from waterpip.scripts.retrieval.wapor_retrieval import WaporRetrieval
print('class imported successfully, you are at the starting line')

class imported successfully, you are at the starting line


## 2 Initiate the retrieval class to make the crop mask

See notebook 02_waterpip_download_basics for details

In [3]:
# initiate the retrieval class to make the crop mask (see notebook 02_waterpip_download_basics for details)
retrieval = WaporRetrieval(            
    waterpip_directory=r'C:\Users\roeland\workspace\projects\waterpip\waterpip_dir',
    shapefile_path=r"C:\Users\roeland\workspace\projects\waterpip\testing\static\Bekaa_boundary_bbox.shp",
    wapor_level=3,
    project_name='lebanon_wheat',
    api_token='your_api_token_goes_here'
)

# run your chosen method below

bbox shapefile based on the input shapefile made and outputted too: C:\Users\roeland\workspace\projects\waterpip\waterpip_dir\lebanon_wheat\L3\00_reference\Bekaa_boundary_bbox_bbox.shp
running check for all wapor wapor_level catalogues and downloading as needed:
Loading WaPOR catalog for wapor_level: 1
catalogue location: C:\Users\roeland\workspace\projects\waterpip\waterpip_dir\metadata\wapor_catalogue_L1.csv
Loading WaPOR catalog for wapor_level: 2
catalogue location: C:\Users\roeland\workspace\projects\waterpip\waterpip_dir\metadata\wapor_catalogue_L2.csv
Loading WaPOR catalog for wapor_level: 3
catalogue location: C:\Users\roeland\workspace\projects\waterpip\waterpip_dir\metadata\wapor_catalogue_L3.csv
wapor_level 3 location shapefile exists skipping retrieval
wapor_level 3 location shapefile: C:\Users\roeland\workspace\projects\waterpip\waterpip_dir\metadata\wapor_L3_locations.shp
loading wapor catalogue for this run:
Loading WaPOR catalog for wapor_level: 3
catalogue location: C:

## 3 Create a crop mask file for use during analysis

Most of the functions in *WaporAnalysis* require a crop mask to carry out an analysis properly. *WaporRetrieval* provides two methods by which to produce a crop mask for use during analysis. 

**WaporRetrieval.create_crop_mask_from_shapefile**: The first and most reliable method bases the mask on your own shapefile. If you are certain that your shapes/geometries cover the crop/fields of interest, this way is best. It masks to the geometries in the given shapefile (if a mask column of 0/1 per geometry is specified, it masks to specific geometries in the shapefile).

The original shapefile is then copied to the reference folder of the project for further use during the rest of the analysis.

**WaporRetrieval.retrieve_crop_mask_from_WAPOR**: The second function uses the input shapefile to retrieve the land classification rasters from WaPOR for a given period and returns one raster of the most common land classification (per cell) over that period. Given a crop/coverage, it then masks to that specific crop to create a mask file. This is considered the raw mask file (a multiple-crop version is to come).

The raw mask is then vectorized to produce a shapefile of the fields for analysis; this is considered the raw shapefile. That shapefile is then cleaned up, with its geometries filtered and fixed, to produce a new shapefile containing geometries that may better fit the fields. Lastly, a crop mask is made from the cleaned-up shapefile. It is up to the user to select which combination to use.
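The "most common land classification per cell" step can be illustrated with a minimal sketch. This is not the waterpip implementation: it uses plain nested lists instead of rasters, and the function names and the class code `WHEAT = 7` are hypothetical values chosen only for illustration.

```python
from collections import Counter

def most_common_class(grids):
    """Return one grid holding the most frequent class per cell across all input grids."""
    rows, cols = len(grids[0]), len(grids[0][0])
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # count how often each class occurs in this cell over the period
            counts = Counter(grid[r][c] for grid in grids)
            row.append(counts.most_common(1)[0][0])
        result.append(row)
    return result

def mask_to_class(grid, class_code):
    """Return a 0/1 mask that is 1 wherever the grid equals class_code."""
    return [[1 if value == class_code else 0 for value in row] for row in grid]

WHEAT = 7  # hypothetical class code, not taken from the WaPOR legend

# three tiny 2x2 land cover grids, one per time step in the period
lcc_grids = [
    [[7, 3], [7, 7]],
    [[7, 3], [3, 7]],
    [[3, 3], [7, 7]],
]

majority = most_common_class(lcc_grids)
raw_mask = mask_to_class(majority, WHEAT)
print(majority)
print(raw_mask)
```

The real function additionally vectorizes and cleans the mask; that part depends on GIS libraries and is left out of this sketch.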

## NOTE:

Whichever method you use, all these files can be found in the project-specific reference folder *project_folder/L{}/00_reference/* for further use during the rest of the analysis. It is recommended that you use these, as it prevents alteration of your original files. Further, a unique id is generated per geometry within each shapefile so that specific fields/geometries can be identified later on. This **wpid** is used automatically during further analysis unless otherwise specified.
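If you want to locate the reference folder from your own scripts, the pattern *project_folder/L{}/00_reference/* can be rebuilt with `pathlib`. The helper below is a hypothetical convenience function, not part of waterpip; the folder layout is assumed from the paths printed in the cell output in this notebook.

```python
from pathlib import Path

def project_subfolder(waterpip_directory, project_name, wapor_level, subfolder="00_reference"):
    """Build <waterpip_directory>/<project_name>/L<level>/<subfolder> (assumed layout)."""
    return Path(waterpip_directory) / project_name / f"L{wapor_level}" / subfolder

reference = project_subfolder("waterpip_dir", "lebanon_wheat", 3)
results = project_subfolder("waterpip_dir", "lebanon_wheat", 3, subfolder="04_results")
print(reference)
print(results)
```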

### 3.1 Using the shapefile to produce crop mask

#### Required Inputs:

**crop**: crop name for the output raster

#### Optional Inputs:

**template_raster_path**: raster providing the metadata for the output raster;
if not provided, a raster is retrieved from WaPOR to use as the template.

**column**: if provided, burns the value found in the specified column into the raster,
so for a mask the values need to be 0 or 1.

**period_start**: standard value used to grab a template raster if one is not provided;
can be ignored if the function works

**period_end**: standard value used to grab a template raster if one is not provided;
can be ignored if the function works

**return_period**: standard value used to grab a template raster if one is not provided;
can be ignored if the function works

*NOTE: if a template_raster_path is not provided, the function will attempt to retrieve one from WaPOR; if nothing is found you may need to change the optional inputs*

*NOTE: the advantage of a template_raster_path is that you can use it to set the extent, resolution etc. of your analysis from this point forward*

In [5]:
# method one using the shapefile
crop_mask_raster_path, crop_mask_shape_path = retrieval.create_crop_mask_from_shapefile(
        crop='your_crop_name_goes_here')

crop was autocorrected to :your_crop_name_goes_here
the following given datacomponents could not be found in the wapor_level catalog or were not available for the specified return period:
 ['L3_BKA_TBP_D', 'L3_BKA_QUAL_LCC_D', 'L3_BKA_PHE_D']
continuing with the remainder
retrieving download info for component: LCC
retrieving download info for wapor_level 3 region: BKA
attempting to retrieve donwload info for 1 rasters from wapor
Download info Progress: |--------------------------------------------------| 0.0% Complete: 0 out of 1

KeyboardInterrupt: 

### 3.2 Using WaPOR land cover classification to produce crop mask

#### Required Inputs:

**crop**: crop to mask to; it has to match the name used in the WaPOR database
classification codes

**period_start**: standard value used to grab a template raster if one is not provided;
can be ignored if the function works (may have been provided on class setup)

**period_end**: standard value used to grab a template raster if one is not provided;
can be ignored if the function works (may have been provided on class setup)

*NOTE: the crop code name provided has to match one available in the dictionary provided in waterpip\scripts\retrieval\wapor_land_cover_classification_codes.py; if it does not match, suggestions will be provided*

*NOTE: even if the crop is in the dictionary, it may not exist in the given area according to the LCC, or may only be found in very small numbers; in both cases the function will provide info*
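To give an idea of how a name check with suggestions can work, here is a minimal sketch using `difflib` from the standard library. It is an assumption about the general technique, not the waterpip code: `KNOWN_CROPS` is a made-up subset and `check_crop_name` is a hypothetical helper. The lowercasing mirrors the "autocorrected" message seen in the cell output.

```python
from difflib import get_close_matches

# hypothetical subset of crop names; the real dictionary lives in
# wapor_land_cover_classification_codes.py
KNOWN_CROPS = ["wheat", "maize", "potato", "sugarcane"]

def check_crop_name(crop, known=KNOWN_CROPS):
    """Return (matched_name, suggestions); matched_name is None if there is no exact match."""
    name = crop.strip().lower()
    if name in known:
        return name, []
    # offer up to three similar names when there is no exact match
    return None, get_close_matches(name, known, n=3, cutoff=0.6)

print(check_crop_name("Wheat"))
print(check_crop_name("wheet"))
```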


In [4]:
# method using the land classification raster from wapor
crop_mask_raster_path, crop_mask_shape_path = retrieval.retrieve_crop_mask_from_WAPOR(
    period_start=datetime(2020,3,5),
    period_end=datetime(2020,4,5),
    crop='Wheat'
)

crop was autocorrected to :wheat
retrieving download info for component: LCC
retrieving download info for wapor_level 3 region: BKA
attempting to retrieve donwload info for 4 rasters from wapor
Download Info Progress: |██████████████████████████████████████████████████| 100.0% Complete: 4 out of 4
attempting to retrieve 4 rasters from wapor
Download Raster Progress: |██████████████████████████████████████████████████| 100.0% Complete: 4 out of 4
Processing\Warping Progress: |██████████████████████████████████████████████████| 100.0% Complete: 4 out of 4
percentage of occurrence of your chosen crop in the raster according to WAPOR is: 8.882
raw crop mask raster made: C:\Users\roeland\workspace\projects\waterpip\waterpip_dir\lebanon_wheat\L3\00_reference\wheat_20200305_20200405_raw_mask.shp
raw crop mask shape made: C:\Users\roeland\workspace\projects\waterpip\waterpip_dir\lebanon_wheat\L3\00_reference\wheat_20200305_20200405_raw_mask.shp
turn this of in the code to set it lower than thi

## 4 Initiate/activate WaporAnalysis class to start analysis

Once you have your crop mask you can analyse some WaPOR data. The WaporAnalysis class was made to make analyzing data retrieved from the WaPOR portal easy. To initiate the class you need to enter/edit the inputs below:

*Note: uses the same inputs as the class WaporRetrieval, except that the api_token is only required if retrieving data.*

#### Required Inputs:

- **waterpip_directory**: path to the directory where the project-specific directory will be created. The class *WaporRetrieval* automatically creates a new directory using the input *project_name* on activation and creates subfolders to organise the data as well. The functions that follow automatically use these folders (**required**).

- **shapefile_path**: path to the shapefile that specifies the location to download data for as well as the projection to output it in. The function retrieves the data for the area(s) shown in the shapefile (**required**).

**Note**: A shapefile is required and provides a lot of the required info for the project, including the extent and the output projection. Any projection (crs) is accepted. WaPOR data is always downloaded in EPSG:4326, so the shapefile bounding box is transformed as needed to match; the retrieved data is then transformed again, if needed, to match the projection (crs) of the input shapefile.

- **wapor_level**: level of WaPOR data to download. There are 3 levels, from low resolution 250m (1) and mid resolution 100m (2) to high resolution 30m (3). All of Africa and part of the Middle East is available at level 1. Specific countries are available at level 2. Only some specific locations around the size of valleys or hydrosheds are available at level 3. For more info on the levels please see: https://wapor.apps.fao.org/home/WAPOR_2/1 (**required**).

**Note**: A spatial check is carried out on the download area specified in your shapefile to see if data is available for it at the given level when running (only level 1 and 3 spatial checks exist currently). Error messages provide details.

- **project_name**: name of the directory that will be created. All data retrieved and analysed can be found in here. Auto set to *test* if not provided.

#### Optional Inputs:

The following inputs are optional. They can also be provided when running the class functions for more flexibility. The advantage of passing them during class setup/initialisation is that it is easy to repeatedly use the class functions with the same inputs, so you are assured it will always run for the same inputs. The advantage of passing them to the class functions is flexibility: by changing only a few inputs you can retrieve different sets of data each time while maintaining the same required class inputs (folder structure, WaPOR level, area of interest (shapefile) etc.).

- **api_token**: the API token retrieved from the WaPOR site goes here. See the instructions above on how to retrieve a token from the WaPOR website.

- **period_start**: date you want to start your data download from, enter as a datetime object. This can also be provided later when running the class functions. Auto sets to the before running if not provided.

- **period_end**: date you want to end your data download at, enter as a datetime object. This can also be provided later when running the class functions. Auto sets to the day of running if not provided.

**datetime objects**: A specific way of formatting dates for python. It is made up of the function datetime followed by the date in brackets split into the sections: Year (4 digits), month (2 or 1 digit), day (2 or 1 digits). (google python datetime object for more details)

*Example*: November 4th 2020 or 4-11-2020: datetime(2020,11,4)  

*Note*: do not use leading zeros for single digit dates (1 not 01). 
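As a short illustration of the datetime inputs, the snippet below builds the two dates used later in this notebook, checks their order, and formats one of them in the compact `YYYYMMDD` style that appears in the output file names. The variable names are only examples.

```python
from datetime import datetime

# year, month, day as plain integers (no leading zeros)
period_start = datetime(2020, 3, 5)
period_end = datetime(2020, 4, 5)

# the end of the period should come after the start
assert period_end > period_start

# a date given as text can be parsed into the same kind of object
parsed = datetime.strptime("04-11-2020", "%d-%m-%Y")

print(period_start.strftime("%Y%m%d"))  # 20200305
print(parsed == datetime(2020, 11, 4))  # True
```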

- **return_period**: return period to download data for, given as a single letter code. Available periods include: I: Daily, D: Dekadal, S: Seasonal, A: Annual (yearly). This can also be provided later when running the class functions. Auto sets to Dekadal (D) if not provided.

- **datacomponents**: datacomponents (parameters of interest such as transpiration and net primary productivity) to download data for. These are input as short code strings in a list, such as: ['T', 'NPP']. If you set the datacomponents input to ['ALL'] it will download all datacomponents available for that return period and level at that location. This can also be provided later when running the class functions. Auto sets to ['ALL'] if not provided.

In [5]:
analysis = WaporAnalysis(
    waterpip_directory=r'C:\Users\roeland\workspace\projects\waterpip\waterpip_dir',
    shapefile_path=crop_mask_shape_path,
    wapor_level=3,
    project_name='lebanon_wheat'
    )

## 5 Calculate WaPOR based PAIs 

Once you have initiated the class you can retrieve some data and calculate PAIs based solely on WaPOR data. The following function does this for some standardised and reliable PAIs that are helpful for understanding how agricultural systems are performing and their potential for improvement.

This function could be considered a small processing chain, as it calls on multiple subfunctions to do its task in a clear and efficient manner. Each of those subfunctions, some of which relate to specific indicators such as crop water deficit, can also be called on their own if you wish. Check out and dive into the code for details.

Below you can find the description taken directly from the class function

        """
        Description:
            calculate all available performance indicators per cell to test for adequacy, efficiency
            reliability and equity for the given period and area as defined by the class shapefile

            beneficial fraction: Sum of Transpiration  / Sum of Evapotranspiration (bf)
            equity here: standard deviation of Evapotranspiration / Evapotranspiration Mean (cov)
            equity safi: standard deviation of summed Evapotranspiration per field / 
            mean of summed evapotranspiration per field (cov)
            crop_water_deficit: Potential evapotranspiration - Sum of Evapotranspiration (cwd)
            relative evapotranspiration: Sum of Evapotranspiration / Potential evapotranspiration (ret)
            temporal relative evapotranspiration: per dekad Sum of Evapotranspiration / Potential evapotranspiration
            (tret)

        Args:
            self: (see class for details)
            api_token: token used to retrieve the data 
            period_start: start of the season in datetime
            period_end: end of the season in datetime
            return_period: return period to retrieve data for, 
            auto set to monthly
            crop_mask_path: path to the crop mask defining the area for analysis
            crop: crop being analysed used in the name
            output_nodata: nodata value to use on output
            fields_shapefile_path: if the path to the fields shapefile_path is provided
            then the field level statistics are also calculated           
            field_stats: list of statistics to carry out during the field level analysis, 
            also used in the column names 
            id_key: name of shapefile column/feature dictionary key providing the feature indices 
            wpid is a reliable autogenerated index provided while making the crop mask
            (note: also handy for joining tables and the crop mask shape/other shapes back later) 
            out_dict: if true outputs a dictionary instead of a shapefile and does not
            write to csv.
        
        Return:
            tuple: list of paths to the performance indicator rasters,  (dataframe/dict, csv of field statistics)
        """   

In [6]:
outputs = analysis.calc_relative_evapotranspiration(
    api_token='your_api_token_goes_here',
    period_start=datetime(2020,3,5), # do not need to provide this if you do it on class setup
    period_end=datetime(2020,4,5), # do not need to provide this if you do it on class setup
    fields_shapefile_path=crop_mask_shape_path,
    crop_mask_path=crop_mask_raster_path,
    crop='wheat',
    output_nodata=-9999,
    )

print(outputs)

retrieving AETI data between 20200305 and 20200405 for the crop: wheat
retrieving download info for component: AETI
retrieving download info for wapor_level 3 region: BKA
attempting to retrieve donwload info for 4 rasters from wapor
Download Info Progress: |██████████████████████████████████████████████████| 100.0% Complete: 4 out of 4
attempting to retrieve 4 rasters from wapor
Download Raster Progress: |██████████████████████████████████████████████████| 100.0% Complete: 4 out of 4
Processing\Warping Progress: |██████████████████████████████████████████████████| 100.0% Complete: 4 out of 4


  outputs = ufunc(*inputs)


Relative evapotranspiration raster calculated: ret (Adequacy PAI)
Calculating ret field statistics...
attempting to claculate zonal stats for a single raster
calculating all feature statistics...
Relative evapotranspiration field statistics calculated: ret (Adequacy PAI)
('C:\\Users\\roeland\\workspace\\projects\\waterpip\\waterpip_dir\\lebanon_wheat\\L3\\04_results\\L3_wheat_ret_20200305_20200405.tif', (      L3_wheat_ret_20200305_20200405_mean
1                                0.000000
2                                0.000000
3                                0.000000
5                                0.000000
6                                0.000000
...                                   ...
3370                             0.013793
3371                             0.000000
3372                             0.000000
3373                             0.000000
3374                             0.000000

[1780 rows x 1 columns], 'C:\\Users\\roeland\\workspace\\projects\\waterpip\\waterpip_d

## 6 Check out the data 
If the code ran successfully you should be able to find some results in the folder:
*<waterpip_directory>/<project_name>/L<number>/04_results*
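
A quick way to confirm that results were written is to list the GeoTIFF files in that folder. The helper below is a hypothetical convenience function, not part of waterpip, and the example path is a placeholder that you should replace with your own results folder.

```python
from pathlib import Path

def list_result_rasters(results_folder):
    """Return the sorted names of all .tif files in the folder (empty list if it is missing)."""
    folder = Path(results_folder)
    if not folder.is_dir():
        return []
    return sorted(path.name for path in folder.glob("*.tif"))

# placeholder path: point this at your own 04_results folder
found = list_result_rasters("waterpip_dir/lebanon_wheat/L3/04_results")
print(f"{len(found)} result raster(s) found")
for name in found:
    print(name)
```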

## 7 Visualise the data

You can check the data using a program such as QGIS or ArcGIS, or in any other way you prefer.

## 8 Rinse and Repeat  

Now that you know how to retrieve and analyse data, feel free to repeat the notebook *04_waterpip_analysis_PAIs* and play around with the parameters. If you feel like it you can even get into the code itself and see what you can code, run, retrieve and analyse!

## 9 Visualising Performance Assessment Indicators (PAIs) for an area

If you feel like it you can also take a look at notebook *05_visualising_waterpip_results.ipynb* where we walk you through the process of producing some more informative visualisations and graphs from some of your previously downloaded data.