# **Notebook to use the registered model**

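In this notebook we will load the registered machine learning model and use it to predict revenue. We will use the mlflow library, along with others, to load and apply the model. Some important features of this code snippet:

- It loads only the contact records whose revenue is zero.
- It loads the registered model by name and version and generates a prediction for each record.
- It saves the predictions to the Delta table `contact_revenue_predictions`.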



# Instructions
- Please ensure the steps defined in "01-CreateTable" are completed.
- Please ensure the steps defined in "02-CreateMLModel" are completed.
- Please change `lakeHouseName` to the name of your own lakehouse.
- Please change the model name if you changed it while creating the model.

In [None]:
# Lakehouse name; replace with your own
lakeHouseName = "dataverse_development_cds2_workspace_unqf0798579be6eee118bc36045bd003"

# Load data into our DataFrame from the lakehouse tables. Note that we only load records where revenue is zero.

# SQL Query
query = f"""
SELECT 
    c.crffa_cif,c.crffa_revenue,c.crffa_csat,c.crffa_noofreturns,c.crffa_educationalbackground,c.crffa_yearofbirth,c.crffa_recency,
    r.MntWines,r.MntFruits, r.MntMeatProducts,r.MntFishProducts,r.MntBakeryProducts,r.MntBeverageProds,r.MntDairyProds,r.NumDealsPurchases,
    r.NumWebPurchases,r.NumCatalogPurchases,r.NumStorePurchases
FROM 
    {lakeHouseName}.contact AS c
JOIN 
    {lakeHouseName}.retailstoretxnsummary_01 AS r
ON 
    c.crffa_cif = r.ID
WHERE 
    c.crffa_dateforarticle = 1
    AND c.crffa_revenue = 0
"""
dfmain = spark.sql(query)
dfmain.printSchema()
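
The query above interpolates `lakeHouseName` directly into the SQL text, so a typo or stray character in the name only surfaces as a Spark parse error. A minimal sketch of a guard that could run before the query is built is shown below. The helper name `checked_identifier` is hypothetical, and the rule that a lakehouse name contains only letters, digits, and underscores is an assumption rather than something this notebook states.

```python
import re

# Assumption: a lakehouse name is a plain identifier made of letters,
# digits, and underscores. Adjust the pattern if your names differ.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def checked_identifier(name: str) -> str:
    """Return name unchanged if it looks like a plain SQL identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"Unexpected characters in identifier: {name!r}")
    return name


# Validate the name once, then reuse it when building table references.
lakeHouseName = checked_identifier(
    "dataverse_development_cds2_workspace_unqf0798579be6eee118bc36045bd003"
)
table_ref = f"{lakeHouseName}.contact"
```

With this in place, a name containing spaces or punctuation fails early with a `ValueError` instead of producing a malformed query.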

In [None]:
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegressionModel
import mlflow.spark

from pyspark.sql.functions import format_number

# model name
model_name = "MyModelRevenue"

# Load the registered model
model_uri = f"models:/{model_name}/1"  # replace with your model name and version
loaded_model = mlflow.spark.load_model(model_uri)

# Define the feature columns
feature_columns = ["crffa_csat", "crffa_noofreturns", "crffa_educationalbackground", "crffa_yearofbirth", "crffa_recency", "MntWines",
 "MntFruits", "MntMeatProducts", "MntFishProducts", "MntBakeryProducts", "MntBeverageProds", "MntDairyProds", "NumDealsPurchases", 
 "NumWebPurchases", "NumCatalogPurchases", "NumStorePurchases"]

# Assemble the features into a feature vector
assembler = VectorAssembler(inputCols=feature_columns, outputCol="features")
df = assembler.transform(dfmain)

# Transform the data using the loaded model
predictions = loaded_model.transform(df)
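
The cell above loads the model through a URI of the form `models:/<name>/<version>`, with the version fixed at 1. A small sketch of building that URI from separate name and version values follows, which may be convenient if the model is re-registered and the version number changes. The helper `build_model_uri` is hypothetical and not part of mlflow. It only formats the string.

```python
def build_model_uri(name: str, version: int) -> str:
    """Format a registered-model URI as models:/<name>/<version>."""
    if not name:
        raise ValueError("Model name must not be empty")
    if version < 1:
        raise ValueError("Model version must be a positive integer")
    return f"models:/{name}/{version}"


model_name = "MyModelRevenue"
model_uri = build_model_uri(model_name, 1)
print(model_uri)  # models:/MyModelRevenue/1
```

The resulting string can be passed to `mlflow.spark.load_model` exactly as in the cell above.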


In [None]:
# Keep only the contact identifier and the prediction (this drops 'features' and all other columns)
predictions = predictions.select("crffa_cif", "prediction")

# Save the DataFrame to the lakehouse as a Delta table
table_name = "contact_revenue_predictions"
predictions.write.format('delta').mode("overwrite").save(f"Tables/{table_name}")
print(f"Spark DataFrame saved to delta table: {table_name}")


Spark DataFrame saved to delta table: contact_revenue_predictions


