## Cash Liquidity Forecast
In this notebook we forecast cash flow by applying the trained model to the `Cashflow` data product.

These are the steps for this exercise:
1. Install and import packages
2. Load prepared data
3. Load forecasting model
4. Predict cashflow
5. Persist prediction table

### 1. Install and import packages
The following cell installs the packages this notebook needs. The versions are pinned so that the results are reproducible:
- **mlflow**: tracks our ML model
- **neuralforecast**: a suite of neural-network models for time series forecasting, designed to be scalable, user-friendly, and performant for both researchers and practitioners

In [0]:
%pip install mlflow==3.5.0
%pip install neuralforecast==3.1.2
%restart_python
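
Because the versions are pinned, it can help to confirm after the restart that the kernel actually picked them up. The following is a minimal sketch using only the standard library; the helper name `check_pinned_versions` is hypothetical and not part of the original notebook.

```python
from importlib import metadata

# Pinned versions taken from the %pip cell above.
PINNED = {"mlflow": "3.5.0", "neuralforecast": "3.1.2"}


def check_pinned_versions(pinned):
    """Return {package: installed_version_or_None} for every mismatch."""
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[package] = installed
    return mismatches


# An empty dict means every pinned package is installed at the expected version.
print(check_pinned_versions(PINNED))
```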

In [0]:
from pathlib import Path
import mlflow
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, date_trunc, sum, explode, expr, struct
from delta import *

#### Set parameters
# &#x270D;
Please replace the values `<CATALOG_NAME>` and `<SCHEMA_NAME>` with the values that match your use case and group. You can find the correct names in **Unity Catalog**: look for catalog and schema names of the form `uc_XXX` and `grpX`. Additionally, please replace `<TIME_SERIES_TABLE_NAME>` with the corresponding table name.

In [0]:
%sql
-- CREATE CATALOG IF NOT EXISTS <CATALOG_NAME>;
SET CATALOG uc_cash_liquidity_forecast;
-- CREATE SCHEMA IF NOT EXISTS <SCHEMA_NAME>;
USE SCHEMA grp01
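
Setting the catalog and schema once lets later cells refer to tables by their short name. If you prefer explicit three-level names (`catalog.schema.table`), a small helper can build them. This is only a sketch: `quote_identifier` and `qualified_name` are hypothetical helpers, and the values are the ones used in this notebook's cells.

```python
def quote_identifier(name):
    """Wrap an identifier in backticks, escaping embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def qualified_name(catalog, schema, table):
    """Build a three-level Unity Catalog name: catalog.schema.table."""
    return ".".join(quote_identifier(part) for part in (catalog, schema, table))


# Values taken from the cells in this notebook.
table_name = qualified_name(
    "uc_cash_liquidity_forecast", "grp01", "prepared_cash_flow_time_series"
)
print(table_name)
# In a Databricks notebook this could be passed to spark.read.table(table_name).
```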

### 2. Load prepared data

# &#x270D;
Please replace `<PREPARED_TABLE_NAME>` with the name of the prepared cashflow table that we created in the previous exercise.

In [0]:
data = spark.read.table("prepared_cash_flow_time_series")

In [0]:
time_series_data = data.withColumn("y", data["y"].cast("float"))

In [0]:
time_series_df = time_series_data.toPandas()
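
Before handing the frame to the model, a quick schema check can catch a wrong table name early. The sketch below assumes the prepared table contains at least the columns `ds` (timestamp) and `y` (target), both of which appear in this notebook's cells; any further required columns depend on how the model was logged. The helper `missing_columns` and the example column list are hypothetical.

```python
def missing_columns(columns, required=("ds", "y")):
    """Return the required column names that are absent, in order."""
    present = set(columns)
    return [name for name in required if name not in present]


# Illustrative stand-in for list(time_series_df.columns) in the notebook.
example_columns = ["CompanyCode", "ds", "y"]
print(missing_columns(example_columns))
```

In the notebook you could call `missing_columns(time_series_df.columns)` and stop if the result is not empty.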

### 3. Load forecasting model

Retrieve the logged, trained model from MLflow for prediction.

In [0]:
mlflow.set_registry_uri("databricks-uc")

# &#x270D;
Replace `<MODEL_NAME>` with the name of the trained model that was logged in the previous exercise.

In [0]:
model = mlflow.pyfunc.load_model("models:/neuralforecast_nhits@prod")
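
The URI passed to `load_model` uses MLflow's alias form, `models:/<name>@<alias>`, so the notebook loads whichever model version currently carries the `prod` alias. A minimal sketch of composing such a URI follows; `model_uri` is a hypothetical helper, not an MLflow function.

```python
def model_uri(name, alias):
    """Compose an MLflow registry URI that resolves a model by alias."""
    if not name or not alias:
        raise ValueError("both a model name and an alias are required")
    return f"models:/{name}@{alias}"


uri = model_uri("neuralforecast_nhits", "prod")
print(uri)
# In the notebook: model = mlflow.pyfunc.load_model(uri)
```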

### 4. Predict cashflow

In [0]:
prediction = model.predict(time_series_df)

In [0]:
prediction = prediction.rename(mapper={"ds": "date", "NHITS": "forecast", "NHITS-lo-90": "lower_forecast", "NHITS-hi-90": "upper_forecast"}, axis=1)
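
The rename maps the model's output columns to business-friendly names. Judging by the column names, `NHITS-lo-90` and `NHITS-hi-90` are the lower and upper bounds of a 90% prediction interval. As a sketch, the hypothetical helper below builds the same mapping for an arbitrary model name and interval level; it assumes the `<model>-lo-<level>` / `<model>-hi-<level>` naming pattern seen above.

```python
def forecast_column_mapping(model_name, level):
    """Map raw forecast columns to the names used in the prediction table."""
    return {
        "ds": "date",
        model_name: "forecast",
        f"{model_name}-lo-{level}": "lower_forecast",
        f"{model_name}-hi-{level}": "upper_forecast",
    }


mapping = forecast_column_mapping("NHITS", 90)
print(mapping)
# In the notebook: prediction = prediction.rename(columns=mapping)
```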

In [0]:
prediction = spark.createDataFrame(prediction)

### 5. Persist prediction table
# &#x270D;
We create a prediction table with a primary key constraint on the `date` and `CompanyCode` columns.
Because the constraint name must be unique across the whole Databricks catalog, please replace `<CONSTRAINT_NAME>` with an appropriate name for the constraint. Additionally, replace `<PREDICTION_RESULT_TABLE_NAME>` with the value `cashflow_prediction`.

To be able to install the data product to Datasphere later on, the following requirements need to be fulfilled:

- **Primary keys** are defined 
- **DeletionVectors** = disabled
- **ChangeDataFeed** = enabled 

The following code takes care of this: it sets the primary key and adjusts the `DeletionVectors` and `ChangeDataFeed` settings.
Please replace the `<BOOL>` placeholders accordingly.

In [0]:
# Overwrite the Delta table and set the properties that Datasphere requires.
(
    prediction.write.format("delta")
    .mode("overwrite")
    .option("delta.enableChangeDataFeed", "true")
    .option("delta.enableDeletionVectors", "false")
    .saveAsTable("cashflow_prediction")
)
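
After the write, you may want to confirm that the two table properties ended up as required. The sketch below checks a plain dictionary of properties; in the notebook such a dictionary could be built from the `key` and `value` columns returned by `SHOW TBLPROPERTIES cashflow_prediction`. The helper `unmet_requirements` is hypothetical.

```python
# Requirements listed above for installing the data product.
REQUIRED_PROPERTIES = {
    "delta.enableChangeDataFeed": "true",
    "delta.enableDeletionVectors": "false",
}


def unmet_requirements(properties, required=REQUIRED_PROPERTIES):
    """Return {key: actual_value} for every property that differs.

    A missing property is reported too (with value None), because its
    effective default is not known to this check.
    """
    return {
        key: properties.get(key)
        for key, expected in required.items()
        if str(properties.get(key)).lower() != expected
    }


# Illustrative properties; an empty result means both requirements are met.
example = {
    "delta.enableChangeDataFeed": "true",
    "delta.enableDeletionVectors": "false",
}
print(unmet_requirements(example))
```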

In [0]:
%sql
ALTER TABLE cashflow_prediction ALTER COLUMN `date` SET NOT NULL;
ALTER TABLE cashflow_prediction ALTER COLUMN CompanyCode SET NOT NULL;
ALTER TABLE cashflow_prediction ADD CONSTRAINT PK_DATE_COMP1 PRIMARY KEY (`date`, CompanyCode);
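
The three statements follow a pattern: every key column is first marked `NOT NULL`, then the primary key constraint is added over those columns. As a sketch, the hypothetical helper below generates the same statements for any table, constraint name, and key columns; unlike the cell above it backtick-quotes every column name.

```python
def primary_key_statements(table, constraint, columns):
    """Return the ALTER TABLE statements that define a primary key."""
    quoted = [f"`{column}`" for column in columns]
    # Each key column is made NOT NULL before the constraint is added.
    statements = [
        f"ALTER TABLE {table} ALTER COLUMN {name} SET NOT NULL"
        for name in quoted
    ]
    statements.append(
        f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
        f"PRIMARY KEY ({', '.join(quoted)})"
    )
    return statements


for statement in primary_key_statements(
    "cashflow_prediction", "PK_DATE_COMP1", ["date", "CompanyCode"]
):
    print(statement)
    # In the notebook: spark.sql(statement)
```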


After successful execution, please validate the table `cashflow_prediction` in Unity Catalog, in the schema you created, and preview the prediction data:

1. Open the Catalog Explorer by right-clicking on the table
2. Navigate to the **Sample Data** tab
3. Select the compute **Serverless Starter Warehouse**

You should see sample data like the following:

![cashflow_prediction_sample_view.png](../../images/cashflow_prediction_sample_view.png)