## Notebook for deploying Cognite Function

Run all cells in order, up to the `Experimental` section, to deploy your Cognite Function.

Make your changes in the `Inputs` section, where you supply the input parameters needed to instantiate, run and deploy your Cognite Function. The input parameters related to calculations and deployment are stored in `data_dict`. There are two types of input parameters:
- A: General parameters required for deployment of any Cognite Function
- B: Optional (calculation-specific) parameters used as input to your calculation function. These should enter `data_dict["calc_params"]` as key-value pairs.

If your Cognite Function is already instantiated, but you want to set up a new schedule, you can omit calling `generate_cf` and skip straight to calling `deploy_cognite_functions` with a modified `data_dict` of parameters that satisfy your scheduled calculation.
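
As a rough sketch of that workflow, the snippet below derives a second parameter set from an existing one by copying it and changing only the schedule-specific entries. The helper `with_new_schedule` and the placeholder time series names are hypothetical and not part of the template; the real `data_dict` is built in the `Inputs` section.

```python
import copy


def with_new_schedule(base, ts_input_names, ts_output_names, schedule_name=None):
    """Return a copy of `base` that targets other time series and a new schedule.

    Hypothetical helper for illustration only.
    """
    # Deep copy so nested dicts such as calc_params are not shared with `base`
    new = copy.deepcopy(base)
    new["ts_input_names"] = list(ts_input_names)
    new["ts_output_names"] = list(ts_output_names)
    # Default to the first input name so each schedule gets a distinct name
    new["schedule_name"] = schedule_name or ts_input_names[0]
    return new


# Stand-in for the real data_dict (placeholder names and values)
base = {
    "ts_input_names": ["TS_A"],
    "ts_output_names": ["TS_A.AVG"],
    "function_name": "cf_avg-drainage",
    "schedule_name": "TS_A",
    "calc_params": {"tank_volume": 240},
}
second = with_new_schedule(base, ["TS_B"], ["TS_B.AVG"])

# With an authenticated client, the new schedule would then be set up with:
# deploy_cognite_functions(second, client, single_call=False, scheduled_call=True)
```

Copying rather than mutating keeps the original parameter set intact, so both schedules can be inspected or redeployed later.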

### --- Authentication ---

In [22]:
import pandas as pd
from cognite.client.data_classes import functions
from cognite.client.data_classes.functions import FunctionSchedulesList
from cognite.client.data_classes.functions import FunctionSchedule

from initialize import initialize_client
from deploy_cognite_functions import deploy_cognite_functions
from generate_cf import generate_cf

cdf_env = "dev"

In [23]:
# Set limits on function calls (probably not strictly necessary)
func_limits = functions.FunctionsLimits(timeout_minutes=60, cpu_cores=0.25, memory_gb=1, runtimes=["py39"], response_size_mb=2)
client = initialize_client(cdf_env)

In [24]:
# client.time_series.delete(external_id="VAL_17-FI-9101-286:VALUE.COPY")
# client.time_series.delete(external_id="CF_IdealPowerConsumption_NEW")

### --- Inputs ---

#### A. Required parameters
- `ts_input_names` (list): 
    - names of the input time series (a list, even if there is only one input). Must be given in the same order as the calculations are performed in `transformation.py`
- `ts_output_names` (list): 
    - names of the output time series (also given as a list). NB: with multiple output time series, the order of `ts_output_names` must correspond to the order of `ts_input_names`.
- `function_name` (string): 
    - name of Cognite Function to deploy (i.e., folder with name `cf_*function_name*`)
- `calculation_function` (string): 
    - name of the main calculation function to run. It should be defined in `transformation.py` (in the folder `cf_*function_name*`) as `main_*calculation_function*`
- `schedule_name` (string):
    - name of the schedule to set up for the Cognite Function. NB: make sure the name is unique, to avoid overwriting existing schedules for the same Cognite Function! When setting up multiple schedules for one Cognite Function, one per input time series, a good way to keep them organized is to name each schedule after its time series.
- `aggregate` (dictionary):
    - information about any aggregations to perform in the calculation.
    - if **not** performing any aggregates, leave the dictionary empty!
    - if performing aggregates, two keys must be specified:
    1. `period` (string):
        - the time range defining the aggregated period
        - valid values: `["minute", "hour", "day", "month", "year"]`
    2. `type` (string):
        - what type of aggregate to perform
        - valid values: any aggregation supported by `pandas`, e.g., `"mean"`, `"max"`, ... 
- `sampling_rate` (int): 
    - sampling rate of input time series, given in seconds
- `cron_interval_min` (string): 
    - interval, in minutes, at which the schedule runs (NB: only intervals in the range [1, 60) are currently supported). The number must be provided as a string.
- `backfill_period` (int): 
    - the period (default: number of days) back in time to perform backfilling
    - if performing aggregates, it is the number of aggregated periods (e.g., when aggregating by month, a value of 3 backfills three months back in time)
- `backfill_hour` (int):
    - the hour of the day to perform backfilling
- `backfill_min_start` (int):
    - performs backfilling for any scheduled call that falls within hour=`backfill_hour` and minute=`[backfill_min_start, backfill_min_start+cron_interval_min]`
- `testing` (bool):
    - defaults to `False`. Set to `True` if running unit tests
- `add_packages` (list): 
    - additional packages required to run the calculations in `transformation.py`
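
The constraints listed above can be checked before deployment. The following is a minimal sketch, assuming a hypothetical helper `validate_inputs` that is not part of the template; it covers only the rules stated in this list (list-typed names, the two `aggregate` keys, the valid periods, and the `[1, 60)` range for `cron_interval_min`).

```python
VALID_PERIODS = ("minute", "hour", "day", "month", "year")


def validate_inputs(aggregate, cron_interval_min, ts_input_names, ts_output_names):
    """Raise ValueError on the first violated constraint; return None otherwise.

    Hypothetical helper for illustration only.
    """
    if not isinstance(ts_input_names, list) or not isinstance(ts_output_names, list):
        raise ValueError("ts_input_names and ts_output_names must be lists")
    if aggregate:  # an empty dict means no aggregation
        missing = {"period", "type"} - set(aggregate)
        if missing:
            raise ValueError(f"aggregate is missing keys: {sorted(missing)}")
        if aggregate["period"] not in VALID_PERIODS:
            raise ValueError(f"invalid aggregate period: {aggregate['period']!r}")
    if not isinstance(cron_interval_min, str):
        raise ValueError("cron_interval_min must be given as a string")
    if not 1 <= int(cron_interval_min) < 60:
        raise ValueError("cron_interval_min must be in [1, 60)")


# Passes silently for the values used in this notebook
validate_inputs(
    {"period": "day", "type": "mean"},
    "15",
    ["VAL_18-LIT-80243:VALUE"],
    ["VAL_18-LIT-80243.CDF.D.AVG.LeakValue"],
)
```

The aggregate `type` is deliberately not checked here, since the set of aggregations accepted by `pandas` is open-ended.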

In [25]:
# ts_input_names = ["VAL_17-FI-9101-286:VALUE", "VAL_17-PI-95709-258:VALUE", "VAL_11-PT-92363B:X.Value", "VAL_11-XT-95067B:Z.X.Value"] # Inputs to IdealPowerConsumption function # ["VAL_11-XT-95067B:Z.X.Value", 87.8, "CF_IdealPowerConsumption"] # Inputs to WasterEnergy function
ts_input_names = ["VAL_18-LIT-80243:VALUE"]
# ts_output_names = ["CF_IdealPowerConsumption"]
ts_output_names = ["VAL_18-LIT-80243.CDF.D.AVG.LeakValue"]

function_name = "avg-drainage"
calculation_function = "aggregate"
schedule_name = ts_input_names[0]

aggregate = {}
aggregate["period"] = "day"
aggregate["type"] = "mean"

sampling_rate = 60  # seconds
cron_interval_min = str(15)  # minutes, given as a string
assert 1 <= int(cron_interval_min) < 60
backfill_period = 3
backfill_hour = 15 # 23
backfill_min_start = 30

add_packages = ["statsmodels"]

#### B. Optional parameters

In [26]:
tank_volume = 240
derivative_value_excl = 0.002
lowess_frac = 0.001
lowess_delta = 0.01

#### Insert parameters into data dictionary

In [27]:
backfill_min_start = min(59, backfill_min_start)

data_dict = {'ts_input_names':ts_input_names,
            'ts_output_names':ts_output_names,
            'function_name': f"cf_{function_name}",
            'schedule_name': schedule_name,
            'calculation_function': f"main_{calculation_function}",
            'granularity': sampling_rate,
            'dataset_id': 1832663593546318, # Center of Excellence - Analytics dataset
            'cron_interval_min': cron_interval_min,
            'aggregate': aggregate,
            'testing': False,
            'backfill_period': backfill_period, # days by default (if not doing aggregates)
            'backfill_hour': backfill_hour, # 23: backfilling to be scheduled at last hour of day as default
            'backfill_min_start': backfill_min_start, 'backfill_min_end': min(59.9, backfill_min_start + int(cron_interval_min)),
            'calc_params': {
                'derivative_value_excl':derivative_value_excl, 'tank_volume':tank_volume,
                'lowess_frac': lowess_frac, 'lowess_delta': lowess_delta, #'aggregate_period': aggregate["period"]
            }}
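
To illustrate how `backfill_hour`, `backfill_min_start` and `backfill_min_end` define the backfill window, here is a minimal sketch of the check a scheduled call might perform. The function `in_backfill_window` is hypothetical, and both the inclusive bounds and the use of UTC are assumptions; the actual logic lives in the template's handler and may differ.

```python
from datetime import datetime, timezone


def in_backfill_window(call_time, backfill_hour, backfill_min_start, backfill_min_end):
    """Return True if `call_time` lies in the backfill hour and minute window.

    Hypothetical helper; bounds are treated as inclusive (an assumption).
    """
    if call_time.hour != backfill_hour:
        return False
    # Fractional minutes, so a bound such as 59.9 can be compared directly
    minute = call_time.minute + call_time.second / 60
    return backfill_min_start <= minute <= backfill_min_end


# Values from the cell above: backfill_hour=15, backfill_min_start=30,
# backfill_min_end=min(59.9, 30 + 15)=45
window = dict(backfill_hour=15, backfill_min_start=30, backfill_min_end=45)
inside = in_backfill_window(datetime(2024, 1, 12, 15, 35, tzinfo=timezone.utc), **window)
outside = in_backfill_window(datetime(2024, 1, 12, 15, 50, tzinfo=timezone.utc), **window)
```

With a 15-minute schedule, roughly one call per day should fall inside a window of this width, which is what triggers the backfill.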

### --- Instantiate Cognite Function ---

Set up folder structure for the Cognite Function as required by the template.

In [18]:
generate_cf(function_name, add_packages)

Writing __init__.py ...
Writing handler.py ...
Writing transformation.py ...
Created requirements.txt in c:/Users/vetnev/OneDrive - Aker BP/Documents/First Task/opshub-task1/src/cf_power-Template
Packages to add:  ['python-dotenv', 'pytest', 'cognite-sdk', 'ipykernel', 'pandas', 'numpy']

Using version ^1.0.0 for python-dotenv

Updating dependencies
Resolving dependencies...

Package operations: 1 install, 0 updates, 0 removals

  • Installing python-dotenv (1.0.0)

Writing lock file

Using version ^7.4.4 for pytest

Updating dependencies
Resolving dependencies...

Package operations: 5 installs, 0 updates, 0 removals

  • Installing colorama (0.4.6)
  • Installing iniconfig (2.0.0)
  • Installing packaging (23.2)
  • Installing pluggy (1.3.0)
  • Installing pytest (7.4.4)

Writing lock file

Using version ^7.13.4 for cognite-sdk

Updating dependencies
Resolving dependencies...

Package operations: 16 installs, 0 updates, 0 removals

  • Installing pycparser (2.21)
  • 

### --- Define transformation function ---

In this step, modify `transformation.py` to include your calculations.

### --- Deploy Cognite Function in one go ---

#### Single call

The initial transformation is data-intensive, so a scheduled call would likely time out. Run it as a separate single call first.

In [28]:
deploy_cognite_functions(data_dict, client,
                         single_call=True, scheduled_call=False)

Cognite Function created. Waiting for deployment status to be ready ...
Ready for deployement.
Calling Cognite Function individually ...
... Done


#### Scheduled call

For subsequent calls, transformations are only done on the current date, which is much less data-intensive and can be handled by scheduled calls.

In [29]:
deploy_cognite_functions(data_dict, client,
                         single_call=False, scheduled_call=True)

Preparing schedule to start sharp at next minute ...
Setting up Cognite Function schedule at time 2024-01-12 15:35:00+00:00 ...
... Done
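
The log above mentions starting the schedule "sharp at next minute". A minimal sketch of how such a start time and a matching cron expression could be computed is shown below. Both `next_full_minute` and `cron_expression` are hypothetical helpers, and the `*/N` form is an assumption; `deploy_cognite_functions` may build its schedule differently.

```python
from datetime import datetime, timedelta, timezone


def next_full_minute(now):
    """Round `now` up to the start of the following minute."""
    return now.replace(second=0, microsecond=0) + timedelta(minutes=1)


def cron_expression(cron_interval_min):
    """Build a standard five-field cron expression that fires every N minutes.

    Hypothetical helper; assumes the plain `*/N` step form.
    """
    interval = int(cron_interval_min)
    if not 1 <= interval < 60:
        raise ValueError("cron_interval_min must be in [1, 60)")
    return f"*/{interval} * * * *"


# Example timestamp chosen to match the start time in the log above
start = next_full_minute(datetime(2024, 1, 12, 15, 34, 20, tzinfo=timezone.utc))
expr = cron_expression("15")
```

Note that a `*/N` step is anchored to minute 0 of each hour rather than to the start time, and that intervals which do not divide 60 evenly produce a shorter final gap before the hour rolls over.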
