## Cash Liquidity Forecast
This notebook shows an example workflow for the Cash Flow data product: it retrieves a logged prediction model from MLflow and applies the trained model to the data product.
The overall prediction involves the following steps:
- Retrieve logged model from MLflow
- Write prediction data to Delta Table
- Expose Delta Table over Delta Share

### Install packages
The following notebook cell installs all packages this notebook needs. To keep the results reproducible, the following packages are installed:
- MLflow: used to track and store our model
- AutoTS: package that lets us run different time series algorithms

In [0]:
%pip install mlflow
%pip install autots['additional']
%restart_python

### Import packages

In [0]:
from pathlib import Path
import mlflow
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, date_trunc, explode, expr
from pyspark.sql.functions import sum as spark_sum  # alias avoids shadowing Python's built-in sum
from delta import *

### Setup Spark Session and consume data product

Please replace the values `<CATALOG_NAME>` and `<SCHEMA_NAME>` with the values that match your use case and group. You can find the correct names in the **Unity Catalog**; they follow the patterns `uc_XXX` and `grpX`. Additionally, please replace the value `<TIME_SERIES_TABLE_NAME>` with the name of your time series table.

Please note:
We adapted the code here to match our use case. Some lines are therefore commented out because they are not needed here, but they can be useful for future applications.

In [0]:
%sql
-- CREATE CATALOG IF NOT EXISTS <CATALOG_NAME>;
SET CATALOG uc_cash_liquidity_forecast;
-- CREATE SCHEMA IF NOT EXISTS <SCHEMA_NAME>;
USE SCHEMA grp1

In [0]:
spark = SparkSession.builder.appName("cash_flow_forecasting").getOrCreate()
data = spark.read.table("prepared_cash_flow_time_series")

### Time Series Forecasting
We retrieve the model from the MLflow run in which it was logged during time series training. We then convert the Spark DataFrame to a pandas DataFrame, pass it to the model's predict function, and obtain the prediction. Finally, we save the prediction data as a Delta Table and expose it via Delta Share back to SAP Business Data Cloud.

Please replace the value `<USERNAME>` with your user name.

In [0]:
time_series_data = data.toPandas().astype({"ds": "datetime64[ns]", "y": float})

In [0]:
notebook_path = dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get()
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/<USERNAME>/Time Series Forecasting")

### Retrieve MLflow model
To get the model logged by our training procedure, we search the MLflow experiment for the most recent successful run and use its run ID to load the model and compute the prediction.

In [0]:
last_run = mlflow.search_runs(order_by=["start_time DESC"])
run_id = last_run[last_run["status"] == "FINISHED"]["run_id"].iloc[0]
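The selection above raises an `IndexError` when the experiment contains no finished run. The following is a minimal sketch of a more defensive selection, written against plain dictionaries so that it also runs outside Databricks. The helper name `latest_finished_run_id` and the example records are hypothetical; the sketch only assumes that each record carries the `run_id`, `status`, and `start_time` fields that the search result provides.

```python
from datetime import datetime
from typing import Iterable, Mapping, Optional


def latest_finished_run_id(runs: Iterable[Mapping]) -> Optional[str]:
    """Return the run_id of the most recently started FINISHED run, or None."""
    finished = [run for run in runs if run.get("status") == "FINISHED"]
    if not finished:
        return None
    # Pick the newest run explicitly instead of relying on the input order.
    newest = max(finished, key=lambda run: run["start_time"])
    return newest["run_id"]


# Made-up records that mimic rows of the search result (hypothetical values).
example_runs = [
    {"run_id": "a1", "status": "FAILED", "start_time": datetime(2024, 5, 3)},
    {"run_id": "b2", "status": "FINISHED", "start_time": datetime(2024, 5, 2)},
    {"run_id": "c3", "status": "FINISHED", "start_time": datetime(2024, 5, 1)},
]
print(latest_finished_run_id(example_runs))  # b2
```

In the notebook, the records could come from `last_run.to_dict("records")`. A `None` result then signals that no model is available yet, which can be handled before the model is loaded.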


In [0]:
logged_model = f'runs:/{run_id}/model'

# Load model as a PyFuncModel
loaded_model = mlflow.pyfunc.load_model(logged_model)
prediction = loaded_model.predict(time_series_data)

In [0]:
prediction = spark.createDataFrame(prediction)

#### Creation of Cashflow Prediction table
We create a prediction table with a primary key constraint on the `date` and `CompanyCode` columns.
As the constraint name must be unique across the complete Databricks catalog, please replace `<CONSTRAINT_NAME>` with an appropriate name for the constraint. Additionally, choose a name for the prediction table and replace the value `<PREDICTION_NAME>` with it.
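Before declaring the key, it can help to confirm that the combination of `date` and `CompanyCode` really is unique in the prediction data. To our understanding, Databricks treats primary key constraints as informational rather than enforced, so duplicates would not be rejected automatically; please confirm this against the current Databricks documentation. The sketch below shows the check on plain Python rows; the helper name `find_duplicate_keys` and the sample rows are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Sequence, Tuple


def find_duplicate_keys(
    rows: Iterable[Mapping], key_columns: Sequence[str]
) -> List[Tuple]:
    """Return every key combination that occurs more than once."""
    counts = Counter(tuple(row[column] for column in key_columns) for row in rows)
    return [key for key, count in counts.items() if count > 1]


# Made-up sample rows; the real data comes from the prediction DataFrame.
sample_rows = [
    {"date": "2024-01-01", "CompanyCode": "1000", "forecast": 10.0},
    {"date": "2024-01-01", "CompanyCode": "2000", "forecast": 12.5},
    {"date": "2024-01-01", "CompanyCode": "1000", "forecast": 11.0},
]
print(find_duplicate_keys(sample_rows, ["date", "CompanyCode"]))
# [('2024-01-01', '1000')]
```

On the Spark side, the same check can be expressed as `prediction.groupBy("date", "CompanyCode").count().filter("count > 1")`, which avoids collecting the data to the driver.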

To be able to install the data product to Datasphere later on, the following requirements need to be fulfilled:

- **Primary keys** are defined 
- **DeletionVectors** = disabled
- **ChangeDataFeed** = enabled 

The following code takes care of this: it sets the primary key and adjusts the settings for `DeletionVectors` and `ChangeDataFeed`.

In [0]:
prediction.write.format("delta").\
    mode("overwrite").\
    option("delta.enableChangeDataFeed", "true").\
    option("delta.enableDeletionVectors", "false").\
    saveAsTable("cashflow_prediction")

In [0]:
%sql
ALTER TABLE cashflow_prediction ALTER COLUMN `date` SET NOT NULL;
ALTER TABLE cashflow_prediction ALTER COLUMN CompanyCode SET NOT NULL;
ALTER TABLE cashflow_prediction ADD CONSTRAINT PK_DATE_COMP1 PRIMARY KEY (`date`, CompanyCode);
ALTER TABLE cashflow_prediction SET TBLPROPERTIES(
  delta.enableDeletionVectors = false,
  delta.enableChangeDataFeed = true
);
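When the table and constraint names are placeholders that change per group, the statements above can also be generated from parameters. The following is a minimal sketch under that assumption; the helper names `quote_identifier` and `build_table_setup_statements` are hypothetical, and the table name is expected to be unqualified (no catalog or schema prefix).

```python
from typing import List, Sequence


def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_table_setup_statements(
    table: str, constraint: str, key_columns: Sequence[str]
) -> List[str]:
    """Return the ALTER TABLE statements in the order they need to run."""
    quoted_table = quote_identifier(table)
    quoted_keys = [quote_identifier(column) for column in key_columns]
    # Key columns must be NOT NULL before the primary key can be added.
    statements = [
        f"ALTER TABLE {quoted_table} ALTER COLUMN {column} SET NOT NULL"
        for column in quoted_keys
    ]
    statements.append(
        f"ALTER TABLE {quoted_table} ADD CONSTRAINT {quote_identifier(constraint)} "
        f"PRIMARY KEY ({', '.join(quoted_keys)})"
    )
    statements.append(
        f"ALTER TABLE {quoted_table} SET TBLPROPERTIES ("
        "delta.enableDeletionVectors = false, delta.enableChangeDataFeed = true)"
    )
    return statements


statements = build_table_setup_statements(
    "cashflow_prediction", "PK_DATE_COMP1", ["date", "CompanyCode"]
)
for statement in statements:
    print(statement)
```

In a notebook cell, each generated statement could be passed to `spark.sql(statement)` instead of being printed.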

### OLD FROM HERE ON -> TO DELETE

In [0]:
%sql
CREATE TABLE IF NOT EXISTS cashflow_prediction_dp (
  `date` TIMESTAMP NOT NULL,
  CompanyCode STRING NOT NULL,
  forecast DOUBLE,
  upper_forecast DOUBLE,
  lower_forecast DOUBLE,
  CONSTRAINT PK_DATE_COMP PRIMARY KEY (`date`, CompanyCode)
)

In [0]:
%sql
ALTER TABLE cashflow_prediction_dp SET TBLPROPERTIES(
  delta.enableDeletionVectors = false,
  delta.enableChangeDataFeed = true
);

In [0]:
# TO DELETE
# Without an explicit mode, saveAsTable fails if the table already exists.
(
    prediction.write.format("delta")
    # .mode("overwrite")
    .option("overwriteSchema", "true")
    .option("delta.enableChangeDataFeed", "true")
    .option("delta.enableDeletionVectors", "false")
    .saveAsTable("cashflow_prediction_dp")
)