## Notebook for deploying Cognite Function

Run all cells sequentially up to the `Experimental` section to deploy your Cognite Function.

Make your modifications in the `Inputs` section, where you supply the input parameters required by the calculation and by the deployment of your Cognite Function. The input parameters are stored in `data_dict`. There are two types of input parameters:
- A: General parameters required for deployment of any Cognite Function
- B: Optional (calculation-specific) parameters used as input to your calculation function. These should enter `data_dict["calc_params"]` as key-value pairs.
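
As a rough illustration of the second parameter type, the sketch below shows one way a calculation function could receive `calc_params` as keyword arguments. The names `calc_example`, `scale`, `run_calculation` and `example_data` are hypothetical stand-ins and not part of this repository; the real calculation functions live in `transformation.py` and may consume the parameters differently.

```python
# Hypothetical sketch: calc_example and run_calculation are illustrative
# names, not functions from this repository.

def calc_example(values, scale=1.0):
    """Toy calculation: multiply every input value by `scale`."""
    return [v * scale for v in values]


def run_calculation(calc_func, values, data):
    """Call calc_func, passing the optional parameters as keyword arguments."""
    calc_params = data.get("calc_params", {})
    return calc_func(values, **calc_params)


example_data = {"calc_params": {"scale": 2.0}}
result = run_calculation(calc_example, [1.0, 2.5], example_data)
print(result)  # [2.0, 5.0]
```

With this pattern, a missing `calc_params` entry falls back to the defaults of the calculation function.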

### Inputs

#### A. Required parameters
- `ts_input_names`: names of the input time series (given as a list, even if there is only one input)
- `ts_output_names`: names of the output time series (also given as a list). NB: if there are multiple output time series, the order of `ts_output_names` must correspond to the order of `ts_input_names`.
- `function_name`: name of the Cognite Function to deploy (i.e., the folder named `cf_*function_name*`)
- `calculation_function`: name of the calculation function to run; it should be defined in `transformation.py` (in the folder `cf_*function_name*`) as `calc_*calculation_function*`
- `sampling_rate`: sampling rate of the input time series, given in seconds
- `cron_interval_min`: interval in minutes at which the schedule runs (NB: currently only intervals in [1, 60) are supported)
- `backfill_days`: number of days back in time to perform backfilling
- `cdf_env`: CDF environment to deploy to
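
The constraints in the list above can be checked before deployment. The following is a minimal sketch, not part of the deployment code: `validate_inputs` is a hypothetical helper, and the requirement that both name lists have the same length is an assumption derived from the ordering note for `ts_output_names`.

```python
def validate_inputs(ts_input_names, ts_output_names, cron_interval_min):
    """Raise ValueError if the required parameters are inconsistent."""
    if not isinstance(ts_input_names, list) or not isinstance(ts_output_names, list):
        raise ValueError("ts_input_names and ts_output_names must be lists")
    # Assumption: one output name per input name, in the same order.
    if len(ts_input_names) != len(ts_output_names):
        raise ValueError("each input time series needs one matching output name")
    if not 1 <= int(cron_interval_min) < 60:
        raise ValueError("cron_interval_min must be in the interval [1, 60)")


# Passes silently for a consistent set of inputs.
validate_inputs(["input_ts"], ["output_ts"], "15")
```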

In [13]:
# ts_input_names = ["VAL_17-FI-9101-286:VALUE", "VAL_17-PI-95709-258:VALUE", "VAL_11-PT-92363B:X.Value", "VAL_11-XT-95067B:Z.X.Value"]
ts_input_names = ["VAL_11-LT-95034A:X.Value"]
# ts_output_names = ["VAL_17-FI-9101-286:MULTIPLE.Test", "VAL_17-PI-95709-258:MULTIPLE.Test", "VAL_11-PT-92363B:MULTIPLE.Test", "VAL_11-XT-95067B:MULTIPLE.Test"]
ts_output_names = ["VAL_11-LT-95034A:X.CDF.D.AVG.LeakValue"]
function_name = "daily-avg-drainage" #"multiple_outputs"
calculation_function = "daily_avg_drainage" # "calculation"

sampling_rate = 60  # seconds
cron_interval_min = str(15)  # minutes, kept as a string
assert 1 <= int(cron_interval_min) < 60, "cron_interval_min must be in [1, 60)"
backfill_days = 3
backfill_hour = 23
backfill_min_start = 0

cdf_env = "dev"

#### B. Optional parameters

In [14]:
tank_volume = 1400
derivative_value_excl = 0.002
lowess_frac = 0.001
lowess_delta = 0.01

#### Insert parameters into data dictionary

In [15]:
# End of the backfill window, capped at minute 59
backfill_min_end = min(backfill_min_start + int(cron_interval_min), 59)

data_dict = {
    'ts_input_names': ts_input_names,
    'ts_output_names': ts_output_names,
    'function_name': f"cf_{function_name}",
    'calculation_function': f"calc_{calculation_function}",
    'granularity': sampling_rate,
    'dataset_id': 1832663593546318,  # Center of Excellence - Analytics dataset
    'backfill_days': backfill_days,
    'backfill_hour': backfill_hour,  # 23: backfilling to be scheduled at last hour of day as default
    'backfill_min_start': backfill_min_start,
    'backfill_min_end': backfill_min_end,
    'calc_params': {
        'derivative_value_excl': derivative_value_excl,
        'tank_volume': tank_volume,
        'lowess_frac': lowess_frac,
        'lowess_delta': lowess_delta,
    },
}
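
The backfill window logic from the cell above can be isolated and tried out on its own. This is a small sketch under stated assumptions: `backfill_minute_window` and `payload` are hypothetical names, and the JSON round trip assumes that the data dictionary is sent to the function as JSON.

```python
import json


def backfill_minute_window(backfill_min_start, cron_interval_min):
    """Return (start, end) minutes; end is capped at 59 as in the cell above."""
    end = backfill_min_start + int(cron_interval_min)
    if end >= 60:
        end = 59
    return backfill_min_start, end


print(backfill_minute_window(0, "15"))   # (0, 15)
print(backfill_minute_window(50, "15"))  # (50, 59)

# Assumption: the payload is transferred as JSON, so a round trip through
# json.dumps/json.loads is a cheap check before deployment.
payload = {"backfill_min_start": 0, "backfill_min_end": 15,
           "calc_params": {"tank_volume": 1400}}
assert json.loads(json.dumps(payload)) == payload
```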

### Authentication

In [16]:
from cognite.client.data_classes import functions

from initialize import initialize_client
from deploy_cognite_functions import deploy_cognite_functions

func_suffix = "Development"

In [17]:
# Limits for function calls (possibly not needed; func_limits is not used further in this notebook)
func_limits = functions.FunctionsLimits(timeout_minutes=60, cpu_cores=0.25, memory_gb=1, runtimes=["py39"], response_size_mb=2)
# timeout_minutes is probably capped at 15 regardless of the value set here
client = initialize_client(cdf_env, cache_token=False)

### Deploy Cognite Function in one go

#### Single call

The initial transformation is data-intensive, so a scheduled call will likely time out. Run a separate single call first.

In [18]:
deploy_cognite_functions(data_dict, client, cron_interval_min,
                         single_call=True, scheduled_call=False)

Cognite Function created. Waiting for deployment status to be ready ...
Ready for deployement.
Calling Cognite Function individually ...
... Done


#### Scheduled call

For subsequent calls, transformations are only done on the current date, which is not very data-intensive, so scheduled calls can handle them.

In [19]:
deploy_cognite_functions(data_dict, client, cron_interval_min,
                         single_call=False, scheduled_call=True)

Setting up Cognite Function schedule ...
... Done
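
To make the schedule interval concrete, the sketch below builds the same kind of cron expression that the `Experimental` section uses (`*/{cron_interval_min} * * * *`) and lists the minutes of each hour at which it fires. Whether `deploy_cognite_functions` builds its expression in exactly this way is an assumption; `cron_every_n_minutes` and `minutes_fired` are hypothetical helpers.

```python
def cron_every_n_minutes(interval_min):
    """Build a five-field cron expression with an `*/n` step in the minute field."""
    interval = int(interval_min)
    if not 1 <= interval < 60:
        raise ValueError("interval must be in [1, 60)")
    return f"*/{interval} * * * *"


def minutes_fired(interval_min):
    """Minutes within each hour matched by the `*/n` step in the minute field."""
    return list(range(0, 60, int(interval_min)))


print(cron_every_n_minutes("15"))  # */15 * * * *
print(minutes_fired("15"))         # [0, 15, 30, 45]
```

Note that the step restarts at minute 0 of every hour, so an interval that does not divide 60 (for example 25) gives a shorter gap across the hour boundary.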


## Experimental

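# NB: func_drainage is only defined in the sketch further down
sid = client.functions.schedules.list(function_id=func_drainage.id).to_pandas().id[0]
calls = func_drainage.list_calls(schedule_id=sid, limit=-1).to_pandas()
# retrieve_call expects a single call id, so pick one (here: the last row)
resp = func_drainage.retrieve_call(id=int(calls.id.iloc[-1])).get_response()
resp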

my_func = client.functions.retrieve(external_id=data_dict["function_name"])
my_schedule_id = client.functions.schedules.list(
            name=data_dict["function_name"]).to_pandas().id[0]
all_calls = my_func.list_calls(
            schedule_id=my_schedule_id, limit=-1).to_pandas()
all_calls.tail()

from datetime import datetime

import pandas as pd

# One timestamp per minute ("min" replaces the deprecated alias "T")
pd.date_range(start=datetime(2023,11,16,0,0), end=datetime(2023,11,16,3,51), freq="min")
extid = client.time_series.list(name="VAL_17-FI-9101-286:VALUE")[0].external_id
ts_orig_all = client.time_series.data.retrieve(external_id=extid, limit=20).to_pandas()
ts_orig_all.head()

### Generalizing Cognite Functions - sketch

ts_all = {
        'ts_A': {'granularity':15, 'var':'a'},
        'ts_B': {'granularity':10, 'b_specific':[1,2,3]},
        'ts_X': {'max_days':8, 'thermo_coeff': 0.05, 'filter':'lowess'},
        'ts_Y': {'tot_days': 40},
        'out': 'test',
        'in': {'granularity':15, 'b_specific':[1,2,3,4]},
        }

func_drainage = client.functions.retrieve(external_id="drainage")
func_thermo = client.functions.retrieve(external_id="thermo")

func_drainage_schedule = []
func_thermo_schedule = []

"""Create individual schedules for three time series running drainage-Cognite-function"""
for ts in ['A', 'B', 'Y']:
    func_schedule = client.functions.schedules.create(
        name=f"avg-leak-{ts}",
        cron_expression=f"*/{cron_interval_min} * * * *",
        function_id=func_drainage.id, # SAME function id
        description=f"Leak rate calculation for time series {ts}",
        data=ts_all[f'ts_{ts}'] # DIFFERENT data dictionaries
    )
    func_drainage_schedule.append(func_schedule)

func_drainage_X = client.functions.schedules.create(
    name="avg-leak-X",
    cron_expression=f"*/{cron_interval_min} * * * *",
    function_id=func_drainage.id,
    description="Leak rate calculation for time series X",
    data=ts_all['ts_X'],
)

"""Run schedules for DIFFERENT Cognite Functions on SAME time series Y.
ALTERNATIVE 1: Each one with SAME data dictionary"""
for func in [func_drainage, func_thermo]:
    func_schedule = client.functions.schedules.create(
        name=f"tsY_{func.name}",
        cron_expression=f"*/{cron_interval_min} * * * *",
        function_id=func.id, # DIFFERENT function ids
        description=f"{func.name} calculation for time series Y",
        data=ts_all['ts_Y'] # SAME data dictionary
    )
    func_thermo_schedule.append(func_schedule)

"""ALTERNATIVE 2: Each one with DIFFERENT data dictionaries"""
for func in [func_drainage, func_thermo]:
    func_schedule = client.functions.schedules.create(
        name=f"tsY_{func.name}",
        cron_expression=f"*/{cron_interval_min} * * * *",
        function_id=func.id, # DIFFERENT function ids
        description=f"{func.name} calculation for time series Y",
        data=ts_all['ts_Y'][func.name] # DIFFERENT data dictionaries (requires ts_all['ts_Y'] to be keyed by function name)
    )
    func_thermo_schedule.append(func_schedule)
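
Alternative 2 only works if the data for time series Y is keyed by function name. The sketch below shows one possible layout and assembles the schedule arguments without contacting CDF. The dictionary `ts_Y_by_function`, its values, the helper `build_schedule_specs` and the function names `"drainage"` and `"thermo"` are hypothetical and chosen for illustration only.

```python
# Hypothetical per-function payloads for time series Y; the values are
# made up for illustration.
ts_Y_by_function = {
    "drainage": {"tot_days": 40, "filter": "lowess"},
    "thermo": {"tot_days": 40, "thermo_coeff": 0.05},
}


def build_schedule_specs(function_names, data_by_function, interval_min):
    """Return one dict of schedule keyword arguments per function name."""
    specs = []
    for name in function_names:
        specs.append({
            "name": f"tsY_{name}",
            "cron_expression": f"*/{interval_min} * * * *",
            "description": f"{name} calculation for time series Y",
            "data": data_by_function[name],  # DIFFERENT data dictionaries
        })
    return specs


specs = build_schedule_specs(["drainage", "thermo"], ts_Y_by_function, 15)
for spec in specs:
    print(spec["name"], spec["data"])
```

Each resulting dictionary could then be combined with the matching `function_id` and passed on to `client.functions.schedules.create`, as in the loops above.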
