# (E) COMPO CREATION

**REQUIREMENT : (A)**

The COMPO index is only required for the summer process, namely scripts (F) and (G) when report_type = year.

The COMPO index is a standardised composite index that summarises information from the BI, SBI and BI2 indices.

COMPO is computed as follows for a given date (i):

> COMPOi = (BIi standardised + SBIi standardised + BI2i standardised) / 3

BIi, SBIi and BI2i are standardised as follows (example for BIi):

> BIi standardised = (BIi - BI mean) / BI std

Means and standard deviations are calculated over a 4-year period (between dteDebut and dteFin, see below).
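
As a rough illustration of the two formulas above, the sketch below standardises three short, made-up series and averages them date by date. It is not the stored procedure used by the pipeline: the function names `standardise` and `compo_series` are hypothetical, the three indices are assumed to be available on the same dates, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say which estimator is applied.

```python
from statistics import mean, pstdev


def standardise(values):
    """Return (value - mean) / std for every value of one index."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population std; the DB may use the sample std
    return [(v - mu) / sigma for v in values]


def compo_series(bi, sbi, bi2):
    """Average the three standardised indices date by date."""
    z_bi, z_sbi, z_bi2 = standardise(bi), standardise(sbi), standardise(bi2)
    return [(a + b + c) / 3 for a, b, c in zip(z_bi, z_sbi, z_bi2)]


# Made-up values for three acquisition dates (illustration only)
bi = [0.10, 0.20, 0.30]
sbi = [0.40, 0.50, 0.60]
bi2 = [0.05, 0.15, 0.25]
compo = compo_series(bi, sbi, bi2)
print(compo)
```

Because each input is centred on its own mean, the resulting COMPO values also average to zero over the reference period.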

The calculation of COMPO is performed by the function

> **compo(sites, dteDebut, dteFin, dteExe)**

For each site, this function performs the following steps:

**1) Compute_sar_compo_stats_by_sar_id**

For each site, calculation of means and standard deviations for the period dteDebut-dteFin.

For each site, calculation of standardised BIi, BI2i and SBIi for each available date (i) over the period dteDebut-dteFin.

**2) Fetch_sar_compo_stats_by_sar_id**

**3) Compute_compo_time_series_stats**

For each site, calculation of COMPOi for each available date (i) over the period dteDebut-dteFin.

**4) Compute_smoothed_compo_time_series_stats**

For each site, the time series of COMPOi is filled and smoothed using a Gaussian filter of 61 days (122 days in total).

The gaussian filter is the same as applied in (C).

**5) For each site, the resulting smoothed COMPO time series is saved in a table**

table_name = '{0}_COMPO_{1}_smoothed'.format(sar_id_segment, dteExe)
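
The filling and smoothing of step 4 can be pictured with the sketch below. It is only an approximation under stated assumptions: the internals of `compute_smoothed_compo_time_series_stats` are not shown in this notebook, so the linear interpolation onto a daily grid, the kernel truncation at 4 sigma and the renormalisation at the edges are all guesses, and the helper names `resample_daily` and `gaussian_smooth` are hypothetical. The value 61 is used as the Gaussian sigma because the function is called with `gaussian_sigma=61`.

```python
import math
from datetime import date, timedelta


def resample_daily(dates, values):
    """Linearly interpolate sorted, distinct observations onto a daily grid."""
    days, filled = [], []
    for i in range(len(dates) - 1):
        span = (dates[i + 1] - dates[i]).days
        for k in range(span):
            days.append(dates[i] + timedelta(days=k))
            filled.append(values[i] + (values[i + 1] - values[i]) * k / span)
    days.append(dates[-1])
    filled.append(values[-1])
    return days, filled


def gaussian_smooth(values, sigma):
    """Gaussian-weighted moving average, renormalised near the series edges."""
    radius = int(4 * sigma)  # assumption: kernel truncated at 4 sigma
    n = len(values)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        weights = [math.exp(-0.5 * ((j - i) / sigma) ** 2) for j in range(lo, hi)]
        total = sum(w * values[j] for w, j in zip(weights, range(lo, hi)))
        smoothed.append(total / sum(weights))
    return smoothed


# Made-up, irregularly spaced COMPO values (illustration only)
dates = [date(2020, 1, 1), date(2020, 1, 11), date(2020, 1, 31)]
values = [0.0, 1.0, -1.0]
days, filled = resample_daily(dates, values)
smoothed = gaussian_smooth(filled, sigma=61)
print(len(days), smoothed[0], smoothed[-1])
```

The output has one value per calendar day, which matches the shape of the table written in step 5 (one `dte`, one `indice` per row).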

## Import libraries and functions

In [None]:
import os, sys, datetime
sys.path.append("/home/gswinnen/SARSAR_Package_RenPri/code/") # location of the SARSAR libraries

#sys.path.append("/home/issep/sarsar-issep/SARSAR_utils/")                   # location of the RenPri modules
#sys.path.append("/home/issep/sarsar-issep/SARSAR_utils/rme_chg_detection_module/") # location of Mattia's function

from issep import sarsar_admin
from os.path import join
from lecture_ini import config
from csv import reader
from select_sites import sites_to_process

## Reading the list of SARs without S2 data

In [None]:
# Read the list of SARs without S2 data
with open("/home/gswinnen/Public/liste_sites_sans_NDVI.txt", 'r') as file:
    csv_reader = reader(file)
    list_sites_without_S2_data = []
    for row in csv_reader:
        if not row:  # csv.reader returns an empty list for a blank line
            continue
        item = row[0].strip()
        # skip whitespace-only lines and the header line
        if item != '' and item != "sar_id_segment":
            list_sites_without_S2_data.append(item)
print(list_sites_without_S2_data)
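
The same reading logic can be wrapped in a small function so that it can be tried on an in-memory sample instead of the real file. This is a sketch only: `read_site_ids` is a hypothetical helper, the sample content is invented around site identifiers that appear elsewhere in this notebook, and returning a `set` rather than a list is a design choice (membership tests such as `sar_id_segment not in ...` are then constant-time).

```python
from csv import reader
from io import StringIO


def read_site_ids(lines, header="sar_id_segment"):
    """Return the first column of each row, skipping blank rows and the header."""
    site_ids = set()
    for row in reader(lines):
        if not row:  # csv.reader yields [] for an empty line
            continue
        site_id = row[0].strip()
        if site_id and site_id != header:
            site_ids.add(site_id)
    return site_ids


# In-memory stand-in for the text file (illustration only)
sample = StringIO("sar_id_segment\n52063-ISA-0020-01\n\n  56078-ISA-0008-01 \n")
sites_without_s2 = read_site_ids(sample)
print(sorted(sites_without_s2))
```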

## Definition of the **compo** function

In [None]:
def compo(sites, dteDebut, dteFin, dteExe):
    """ This function computes the COMPO composite raw and smoothed index values over a given period.
        The smoothed series is stored in a new DB table for each site.
            
            Parameters (the dates are read from "sarsar.ini" by the function "config"):
                sites (list): list of sites (sar_id_segment) to process
                dteDebut (date): date from which we need data (YYYY-MM-DD)
                dteFin (date): date until which we need data (YYYY-MM-DD)
                dteExe: processing/execution date (YYYYMMDD)
    """

    ## CONNECTION TO DB
    # Define Database connection parameters
    # NOTE: password is in ~/.pgpass
    credentials = config(section='postgresql')

    db_credentials = {
        'host': credentials['host'],
        'user': credentials['user'],
        'db' : credentials['database']
    }

    # ALWAYS prepare env at the beginning
    print('> Preparing env (DB credentials, etc)')
    sarsar_admin.prepare_env(db_credentials)
    
    conn = sarsar_admin._create_or_get_db_connection()
    cur = None
    cur2 = None
    sar_id_segment = None  # defined up front so the except block can always print it
    
    ## STARTING COMPO CALCULATION
    
    try:
        import psycopg2.extras
        cur = conn.cursor(cursor_factory = psycopg2.extras.DictCursor)
        cur2 = conn.cursor(cursor_factory = psycopg2.extras.DictCursor)
        
#            strSQL = '''SELECT sar_id_segment, index_name, acq_date, index_mean, pixel_count 
#                        FROM sar_index_stats WHERE sar_id_segment = '{0}' AND index_name NOT IN ('BI2','VV','SBI','NDVI','BI2_part1','BI','BAI','VH') 
#                        AND substring(index_name,1,2) != 'VV' AND acq_date BETWEEN '{1}' AND '{2}' 
#                        ORDER BY sar_id_segment, index_name, acq_date;'''.format(site, dteDebut, dteFin)
        
        compteur = 0
#        cur.execute('SELECT DISTINCT sar_id_segment FROM sar_index_stats;')
#        result = cur.fetchall()    
        
        #result = ['52063-ISA-0020-01'] #DEBUG
        
        for site in sites:
            #sar_id_segment = dict(site)['sar_id_segment'] #Un-comment once debugged
            sar_id_segment = site
            #sar_id_segment = '52063-ISA-0020-01'
#            sar_id_segment = '56078-ISA-0008-01'  # DEBUG compute_sar_compo_stats_by_sar_id error
#            sar_id_segment = '25072-ISA-0013-01'  # DEBUG
            compteur += 1
            print("> ",compteur,sar_id_segment)
            # Check that the current site has S2 data
            if sar_id_segment not in list_sites_without_S2_data:
                # Checking the data availability before calling the sarsar_admin.compute_sar_compo_stats_by_sar_id function;
                # because it calls a stored procedure of the PostgreSQL DB which raises an error when there is no 
                # data in the sar_index_stats table for the requested site, indices and date range...
                strSQL = '''SELECT count(*) FROM sar_index_stats
                            WHERE sar_id_segment = '{0}' AND acq_date BETWEEN '{1}' AND '{2}'
                            AND index_name IN ('BI', 'BI2', 'SBI');'''.format(sar_id_segment, dteDebut, dteFin)
                #print(strSQL)
                cur2.execute(strSQL)
                computables = cur2.fetchone()
                
                # With too few records the result is meaningless (and the std cannot be computed)
                if computables[0] > 10:
                    
                    # Computing and fetching the index components for the site and date range.
                    # The dashes are removed from the dates to suit the compo_stats functions.
                    sarsar_admin.compute_sar_compo_stats_by_sar_id(sar_id_segment, dteDebut.replace('-', ''), dteFin.replace('-', ''))
                    compo_stats = sarsar_admin.fetch_sar_compo_stats_by_sar_id(sar_id_segment, dteDebut.replace('-', ''), dteFin.replace('-', '') )
                else:
                    compo_stats = []

#                print('The are %i records (acquisition dates with COMPO values) for this specific COMPO serie (sar_id_segment AND start_date_incl AND end_date_incl)' % len(compo_stats))


                if len(compo_stats) > 0:
        
                    # Computing COMPO raw values
                    compo_ts = sarsar_admin.compute_compo_time_series_stats(compo_stats)

                    # Smoothing the raw COMPO time series
                    smoothed_compo_ts = sarsar_admin.compute_smoothed_compo_time_series_stats(compo_stats, time_resampling='D', gaussian_sigma=61)

                    # NB: smoothed_compo_ts is a tuple of 2 arrays: first is an array of interpolated dates, second is an array of corresponding smoothed stats
                    smoothed_profile_dico = {}
                    
    ## SAVING OUTPUTS IN TABLES
    
                    #DEBUG : for i in range(len(compo_stats)):
                    #DEBUG : print(len(compo_stats),len(smoothed_compo_ts[0]))
                    for i in range(len(smoothed_compo_ts[0])):
                        # convert smoothed_compo_ts into a dictionary keyed by date
                        smoothed_profile_dico[smoothed_compo_ts[0][i].strftime("%Y-%m-%d")] = smoothed_compo_ts[1][i]

                    # DEBUG : pause = input("press a key...")

                    # Saving the smoothed COMPO time series in a table named "{sar_id_segment}_COMPO_{dteExe}_smoothed"
                    # Defining result table name
                    table_name = '{0}_COMPO_{1}_smoothed'.format(sar_id_segment, dteExe)

#                    print(table_name)  # DEBUG
                    # Dropping any existing table with the same name
                    strSQL = 'DROP TABLE IF EXISTS "{0}";'.format(table_name)
                    cur.execute(strSQL)
                    conn.commit()
                    # Table creation
                    strSQL = 'CREATE TABLE IF NOT EXISTS "{0}" (dte DATE, indice NUMERIC);'.format(table_name)  #, ', '.join(type_valeurs))
                    cur.execute(strSQL)

                    # Filling table with results
                    for item in smoothed_profile_dico:
                        strSQL = 'INSERT INTO "{0}" (dte, indice) VALUES (\'{1}\', {2});'.format(table_name, item, smoothed_profile_dico[item])  # , (', '.join(champs)), str(tuple(valeurs)))
                        cur.execute(strSQL)

                    # Committing all modifications
                    conn.commit()
                else:
                    # Error message if not enough data to compute COMPO
                    print("COMPO can't be computed : not enough data")
                
            else:
                # Error message if a given site has no S2 data
                print("segment ignored")
    except Exception as error:  # also covers psycopg2.DatabaseError, a subclass of Exception
        print(error)
        print('sar_id_segment =', sar_id_segment)
    finally:
        # Close both cursors opened above
        if cur is not None:
            cur.close()
        if cur2 is not None:
            cur2.close()
            
    # ALWAYS release env at the end
    print('> Releasing env')
    sarsar_admin.release_env()
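
The drop / create / insert sequence used to save the smoothed series can also be written with bound parameters and a single bulk insert, which avoids formatting values into the SQL string. The sketch below is an illustration under stated assumptions, not the pipeline's code: `sqlite3` from the standard library stands in for PostgreSQL so that it runs without a server (with psycopg2 the placeholder would be `%s` instead of `?`), `save_smoothed_series` is a hypothetical helper, and the execution date `20230101` and the profile values are made up.

```python
import sqlite3


def save_smoothed_series(conn, table_name, profile):
    """(Re)create the result table and insert all rows in one call."""
    # Quote the identifier: table names start with digits and contain hyphens
    quoted = '"{0}"'.format(table_name.replace('"', '""'))
    cur = conn.cursor()
    cur.execute('DROP TABLE IF EXISTS {0};'.format(quoted))
    cur.execute('CREATE TABLE {0} (dte DATE, indice NUMERIC);'.format(quoted))
    # Values are passed as parameters, never formatted into the statement
    cur.executemany(
        'INSERT INTO {0} (dte, indice) VALUES (?, ?);'.format(quoted),
        sorted(profile.items()),
    )
    conn.commit()
    cur.close()


conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL connection
table_name = '{0}_COMPO_{1}_smoothed'.format('52063-ISA-0020-01', '20230101')
profile = {'2020-01-01': 0.12, '2020-01-02': 0.15}  # made-up smoothed values
save_smoothed_series(conn, table_name, profile)
rows = conn.execute(
    'SELECT dte, indice FROM "{0}" ORDER BY dte;'.format(table_name)
).fetchall()
print(table_name, rows)
```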
            

## Converting the Jupyter notebook to a Python script


In [None]:
!jupyter nbconvert --to script E_compo.ipynb

with open('E_compo.py', 'r', errors='ignore') as f:
    lines = f.readlines()

with open('E_compo.py', 'w') as f:
    for line in lines:
        if 'nbconvert --to script' in line:
            break
        else:
            if '# In[' not in line[:10]:
                f.write(line)
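
The clean-up loop can be tested in isolation by expressing it as a function over a list of lines. In this sketch `strip_notebook_residue` is a hypothetical name and the sample lines are invented; in particular, the `get_ipython().system(...)` form of the exported `!jupyter` command is an assumption about how nbconvert renders shell commands.

```python
def strip_notebook_residue(lines):
    """Keep lines up to the nbconvert call, dropping '# In[...]' cell markers."""
    kept = []
    for line in lines:
        if 'nbconvert --to script' in line:
            break  # everything from the conversion cell onwards is discarded
        if '# In[' not in line[:10]:
            kept.append(line)
    return kept


# Invented sample of an exported script (illustration only)
sample = [
    "# In[1]:\n",
    "import os\n",
    "# In[2]:\n",
    "get_ipython().system('jupyter nbconvert --to script E_compo.ipynb')\n",
    "compo(lstSARs, dteDebut, dteFin, dteExe)\n",
]
cleaned = strip_notebook_residue(sample)
print(cleaned)
```

Note that, because the loop stops at the conversion command, any cell exported after it would not reach the final script if the cells are written in the order shown in this notebook.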

## Calling functions

In [None]:
# Call config
dates = config(section='dates')
dteDebut = dates['deb']
dteFin = dates['fin']
dteExe = dates['exe']

# Call sites_to_process
lstSARs = sites_to_process(dteDebut, dteFin)

# Call compo
compo(lstSARs, dteDebut, dteFin, dteExe)
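
For orientation, the `dates` section read above might look like the sample below. This is a guess at the layout: the implementation of `config` in `lecture_ini` is not shown here, so the sketch uses the standard `configparser` module as a stand-in, `read_section` is a hypothetical helper, and the three dates are made-up values chosen to span four years, in the formats given in the `compo` docstring (YYYY-MM-DD for `deb` and `fin`, YYYYMMDD for `exe`).

```python
from configparser import ConfigParser
from datetime import datetime

# Made-up content standing in for the [dates] section of sarsar.ini
SAMPLE_INI = """
[dates]
deb = 2019-01-01
fin = 2022-12-31
exe = 20230101
"""


def read_section(text, section):
    """Parse INI text and return one section as a plain dictionary."""
    parser = ConfigParser()
    parser.read_string(text)
    return dict(parser[section])


dates = read_section(SAMPLE_INI, 'dates')

# Fail early if a date does not have the expected format
datetime.strptime(dates['deb'], '%Y-%m-%d')
datetime.strptime(dates['fin'], '%Y-%m-%d')
datetime.strptime(dates['exe'], '%Y%m%d')

# compo() strips the dashes before calling the compo_stats functions
compact_start = dates['deb'].replace('-', '')
print(dates, compact_start)
```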

In [None]:
def smooth_time_serie(dteDebut, dteFin, dteExe):
    
    # List all indices
    indices = ['BAI','BI','BI2','NDVI','NDWI2','SBI']
        
    # Define Database connection parameters
    # NOTE: password is in ~/.pgpass
    credentials = config(section='postgresql')

    db_credentials = {
        'host': credentials['host'],
        'user': credentials['user'],
        'db' : credentials['database']
    }

    # ALWAYS prepare env at the beginning
    print('> Preparing env (DB credentials, etc)')
    sarsar_admin.prepare_env(db_credentials)
    
    # Process all sar_id_segments
    conn = sarsar_admin._create_or_get_db_connection()
    cur = None

    try:
        import psycopg2.extras
        cur = conn.cursor(cursor_factory = psycopg2.extras.DictCursor)
        cur.execute('SELECT DISTINCT sar_id_segment FROM sar_index_stats;')    
        result = cur.fetchall()
        for row in result:
            sar_id_segment = dict(row)
            
            
            for indice in indices :
                print('> Processing smoothing for %s, %s' % (sar_id_segment, indice))
                raw_profile = sarsar_admin.fetch_sar_index_stats_by_sar_id(sar_id_segment['sar_id_segment'], indice, dteDebut, dteFin )
                smoothed_profile = sarsar_admin.compute_smoothed_time_series_stats(raw_profile)  # .remove_outliers_from_stats()

                # transpose the two arrays into a dictionary, as expected by the following loop
                smoothed_dates = smoothed_profile[0]
                smoothed_values = smoothed_profile[1]
                smoothed_profile_dico = {}

                for i in range(len(smoothed_dates)):
                    smoothed_profile_dico[smoothed_dates[i].strftime("%Y-%m-%d")] = smoothed_values[i]

                # (Re)create the table "{sar_id_segment}_{indice}_{dteExe}_smoothed" that will hold these data
                table_name = '{0}_{1}_{2}_smoothed'.format(sar_id_segment['sar_id_segment'], indice, dteExe)
                strSQL = 'DROP TABLE IF EXISTS "{0}";'.format(table_name)  # Quote the table name to allow leading digits and embedded hyphens
                cur.execute(strSQL)
                conn.commit()

                strSQL = 'CREATE TABLE IF NOT EXISTS "{0}" (dte DATE, indice NUMERIC);'.format(table_name)  #, ', '.join(type_valeurs))
                cur.execute(strSQL)

                for item in smoothed_profile_dico:
                    strSQL = 'INSERT INTO "{0}" (dte, indice) VALUES (\'{1}\', {2});'.format(table_name, item, smoothed_profile_dico[item])  # , (', '.join(champs)), str(tuple(valeurs)))
                    cur.execute(strSQL)

            # Commit all modifications
            conn.commit()
            
        cur.close()
        
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if cur is not None:
            cur.close()
            
    # ALWAYS release env at the end
    print('> Releasing env')
    sarsar_admin.release_env()

In [None]:
# DEBUG: call the function

dates = config(section='dates')
dteDebut = dates['deb']
dteFin = dates['fin']
dteExe = dates['exe']

smooth_time_serie(dteDebut, dteFin, dteExe)