## Cash Liquidity Forecast
In this notebook we forecast the cash flow by applying the trained model to the data product `Cash Flow`.

The prediction involves the following steps:
- Retrieve logged trained model for prediction from MLflow
- Write prediction results to Delta Table

### Install packages
All packages needed for this notebook are installed in the following cell. To keep the results reproducible, we install the following packages:
- MLflow: Used for tracking, storing and retrieving our model
- AutoTS: Package allowing us to run different Time Series algorithms

In [0]:
%pip install mlflow
%pip install "autots[additional]"
%restart_python
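
Because the install commands above do not pin versions, it can help to record which versions were actually installed. The following is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, not part of MLflow or AutoTS.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing.

    Hypothetical helper for illustration only.
    """
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            result[name] = None
    return result


# Print the versions of the two packages this notebook relies on.
print(installed_versions(["mlflow", "autots"]))
```

The printed versions can then be pinned in the install cell (for example `mlflow==<VERSION>`) if strict reproducibility is required.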

### Import packages

In [0]:
from pathlib import Path
import mlflow
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, date_trunc, sum, explode, expr
from delta import *

### Setup Spark Session and consume data product

Please replace the values `<CATALOG_NAME>` and `<SCHEMA_NAME>` with the values that match your use case and group. You can find the correct names in the **Unity Catalog**: look for catalog and schema names of the form `uc_XXX` and `grpX`.

Please note: we adapted the code here to match our use case, so some lines are commented out and not needed. They can still be useful for future applications.

In [0]:
%sql
-- CREATE CATALOG IF NOT EXISTS <CATALOG_NAME>;
SET CATALOG <CATALOG_NAME>;
CREATE SCHEMA IF NOT EXISTS <SCHEMA_NAME>;
USE SCHEMA <SCHEMA_NAME>;

Please replace the value `<PREPARED_TABLE_NAME>` with the name of the prepared cash flow table that we created in the previous exercise.

In [0]:
# getOrCreate() returns a SparkSession, so we bind it to `spark`, the name used below
spark = SparkSession.builder.appName("cash_flow_forecasting").getOrCreate()
data = spark.read.table("<PREPARED_TABLE_NAME>")

### Time Series Forecasting
We retrieve the model from the MLflow run in which it was logged during the Time Series training. We then convert the Spark DataFrame to a pandas DataFrame, pass it to the model's predict function, and save the returned prediction as a Delta Table.



In [0]:
time_series_data = data.toPandas().astype({"ds": "datetime64[ns]", "y": float})

Please replace the value `<USERNAME>` with your user name / email, e.g. ac229588u01@sapexperienceacademy.com.

In [0]:
notebook_path = dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get()
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/<USERNAME>/Time Series Forecasting")

### Retrieve MLflow model
To get the model that we logged during training, we search our MLflow experiment for the run ID of the last successful run and pass it to the MLflow functions that load the model and compute the prediction.

In [0]:
last_run = mlflow.search_runs(order_by=["start_time DESC"])
run_id = last_run[last_run["status"] == "FINISHED"]["run_id"].iloc[0]
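
The selection above raises an `IndexError` if the experiment contains no finished run. A minimal sketch of a more explicit variant is shown below; `latest_finished_run_id` is a hypothetical helper (not an MLflow function), and it assumes each record carries the `run_id`, `status` and `start_time` fields that `mlflow.search_runs` returns as columns.

```python
def latest_finished_run_id(records):
    """Return the run_id of the most recent run whose status is FINISHED.

    `records` is a list of dict-like rows with the keys "run_id", "status"
    and "start_time". Hypothetical helper for illustration only.
    """
    finished = [r for r in records if r["status"] == "FINISHED"]
    if not finished:
        raise ValueError(
            "No FINISHED run found in the experiment; run the training notebook first."
        )
    # Pick the run with the latest start time, regardless of input order.
    return max(finished, key=lambda r: r["start_time"])["run_id"]


# Illustrative records only; real ones would come from mlflow.search_runs(...).
example_runs = [
    {"run_id": "a1", "status": "FAILED", "start_time": 3},
    {"run_id": "b2", "status": "FINISHED", "start_time": 2},
    {"run_id": "c3", "status": "FINISHED", "start_time": 1},
]
print(latest_finished_run_id(example_runs))
```

In the notebook, the helper could be applied to the search result with `latest_finished_run_id(last_run.to_dict("records"))`.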


In [0]:
logged_model = f'runs:/{run_id}/model'

# Load model as a PyFuncModel
loaded_model = mlflow.pyfunc.load_model(logged_model)
prediction = loaded_model.predict(time_series_data)

In [0]:
prediction = spark.createDataFrame(prediction)

#### Creation of Cashflow Prediction table
We create a prediction table with a primary key constraint on the `date` and `CompanyCode` columns.
As the constraint name must be unique across the entire Databricks catalog, please replace `<CONSTRAINT_NAME>` with an appropriate name for the constraint. Additionally, replace the variable `<PREDICTION_RESULT_TABLE_NAME>` with the value `cashflow_prediction`.

To be able to install the data product to Datasphere later on, the following requirements need to be fulfilled:

- **Primary keys** are defined 
- **DeletionVectors** = disabled
- **ChangeDataFeed** = enabled 

The following code takes care of this: it sets the primary key and adjusts the settings for `DeletionVectors` and `ChangeDataFeed`.

In [0]:
(
    prediction.write.format("delta")
    .mode("overwrite")
    .option("delta.enableChangeDataFeed", "true")
    .option("delta.enableDeletionVectors", "false")
    .saveAsTable("<PREDICTION_RESULT_TABLE_NAME>")
)

In [0]:
%sql
ALTER TABLE <PREDICTION_RESULT_TABLE_NAME> ALTER COLUMN `date` SET NOT NULL;
ALTER TABLE <PREDICTION_RESULT_TABLE_NAME> ALTER COLUMN CompanyCode SET NOT NULL;
ALTER TABLE <PREDICTION_RESULT_TABLE_NAME> ADD CONSTRAINT <CONSTRAINT_NAME> PRIMARY KEY (`date`, CompanyCode);
ALTER TABLE <PREDICTION_RESULT_TABLE_NAME> SET TBLPROPERTIES(
  delta.enableDeletionVectors = false,
  delta.enableChangeDataFeed = true
);
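
Primary key constraints in Unity Catalog are, to our knowledge, informational and not enforced, so it can be worth checking that the combination of `date` and `CompanyCode` is actually unique in the prediction data. The following is a minimal sketch for small results; `find_duplicate_keys` is a hypothetical helper, and the example rows are made up for illustration.

```python
from collections import Counter


def find_duplicate_keys(rows, key_columns=("date", "CompanyCode")):
    """Return the key tuples that occur more than once, with their counts.

    `rows` can be any iterable of objects that support row[column] lookup,
    such as dicts. Hypothetical helper for illustration only.
    """
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up example rows; the first and third share the same key.
example_rows = [
    {"date": "2024-01-01", "CompanyCode": "1010"},
    {"date": "2024-01-01", "CompanyCode": "1710"},
    {"date": "2024-01-01", "CompanyCode": "1010"},
]
duplicates = find_duplicate_keys(example_rows)
print(duplicates)
```

In the notebook, the rows could come from `prediction.select("date", "CompanyCode").collect()`, assuming the result is small enough to collect to the driver. An empty result means the key columns are unique.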

After successful execution, please validate the table `cashflow_prediction` in the Unity Catalog, in the schema you created, and preview the prediction data:
- Open the `Catalog Explorer` by right-clicking on the table
- Navigate to the tab `sample data`
- Select the compute `Serverless Starter Warehouse`