## <span style="color:#ff5f27;">🚀 Batch Inference Pipeline</span>

This notebook performs the following actions:

* Gets a feature view object with its name/version from Hopsworks
* Downloads a Pandas DataFrame with new inference data from Hopsworks using the feature view and the call `feature_view.get_batch_data(start_time="...", end_time="...")`.
* Downloads the model from Hopsworks using its name/version.
* Makes predictions on batch data.

## <span style="color:#ff5f27;">📝 Imports </span>


In [2]:
#!pip install --quiet hopsworks 

In [3]:
import hopsworks
import pandas as pd
import joblib
import os
import numpy as np
from sklearn.metrics import confusion_matrix
from matplotlib import pyplot
import seaborn as sns

In [4]:
# Define version numbers for feature view and model
fv_version = 1
model_version = 1

# Define start and end times for the data
start_time_data = "2016-11-01"
end_time_data = "2016-12-01"
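The two strings above define the time window that is later passed to `get_batch_data`. Judging from the SQL shown in the error trace further down (`issue_d >= ... AND issue_d < ...`), the window appears to be half-open: the start is included and the end is excluded. The sketch below, using only the standard library, checks that such a window is well formed before any query is sent. The helper names `parse_window` and `in_window` are hypothetical and are not part of Hopsworks.

```python
from datetime import datetime


def parse_window(start: str, end: str, fmt: str = "%Y-%m-%d"):
    """Parse two date strings and check that they form a non-empty window."""
    start_dt = datetime.strptime(start, fmt)
    end_dt = datetime.strptime(end, fmt)
    if start_dt >= end_dt:
        raise ValueError(f"start {start!r} must be earlier than end {end!r}")
    return start_dt, end_dt


def in_window(ts: datetime, start_dt: datetime, end_dt: datetime) -> bool:
    """Half-open membership test: the start is included, the end is excluded."""
    return start_dt <= ts < end_dt


# Same values as start_time_data and end_time_data in the cell above
start_dt, end_dt = parse_window("2016-11-01", "2016-12-01")
window_days = (end_dt - start_dt).days

print(window_days)
print(in_window(datetime(2016, 11, 30), start_dt, end_dt))
print(in_window(datetime(2016, 12, 1), start_dt, end_dt))
```

Under this reading, a loan issued exactly on `2016-12-01` would fall into the next batch rather than this one.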

## <span style="color:#ff5f27;"> 🔮 Connect to Hopsworks Feature Store</span>

In [5]:
# `hopsworks` is already imported in the imports cell above
project = hopsworks.login()

fs = project.get_feature_store()

Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/537749




Connected. Call `.close()` to terminate connection gracefully.


## <span style="color:#ff5f27;"> ⚙️ Feature View Retrieval</span>


In [6]:
# Get the 'loans_approvals' feature view
feature_view = fs.get_feature_view(
    name="loans_approvals", 
    version=fv_version,
)

## <span style="color:#ff5f27;">🗄 Model Registry</span>


In [7]:
# Get the model registry
mr = project.get_model_registry()

Connected. Call `.close()` to terminate connection gracefully.


## <span style='color:#ff5f27'>🚀 Fetch and test the model</span>

In [8]:
# Retrieve the model from the Model Registry using the name "lending_model" and specified version
model = mr.get_model(
    "lending_model",
    version=model_version,
)

# Download the model directory from the Model Registry
model_dir = model.download()

# Load the fitted estimator with joblib; os.path.join builds a portable path
model = joblib.load(os.path.join(model_dir, "lending_model.pkl"))

Downloading model artifact (1 dirs, 4 files)... DONE

https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations


## <span style="color:#ff5f27;">🔮  Batch Prediction </span>

In [9]:
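# Initialize batch scoring; the argument is the training dataset version
# whose statistics are used for the feature transformations
feature_view.init_batch_scoring(1)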

# Get batch data for a specified time range from start_time_data to end_time_data
batch_data = feature_view.get_batch_data(
    start_time=start_time_data,
    end_time=end_time_data,
)

# Display the first three rows of the batch data
batch_data.head(3)



Error: Reading data from Hopsworks, using Hive           


DatabaseError: Execution failed on sql: WITH right_fg0 AS (SELECT *
FROM (SELECT `fg1`.`loan_amnt` `loan_amnt`, `fg1`.`term` `term`, `fg1`.`int_rate` `int_rate`, `fg1`.`installment` `installment`, `fg1`.`sub_grade` `sub_grade`, `fg1`.`purpose` `purpose`, `fg1`.`zip_code` `zip_code`, `fg1`.`id` `join_pk_id`, `fg1`.`issue_d` `join_evt_issue_d`, `fg0`.`home_ownership` `home_ownership`, `fg0`.`annual_inc` `annual_inc`, `fg0`.`verification_status` `verification_status`, `fg0`.`dti` `dti`, `fg0`.`open_acc` `open_acc`, `fg0`.`pub_rec` `pub_rec`, `fg0`.`revol_bal` `revol_bal`, `fg0`.`revol_util` `revol_util`, `fg0`.`total_acc` `total_acc`, `fg0`.`initial_list_status` `initial_list_status`, `fg0`.`application_type` `application_type`, `fg0`.`mort_acc` `mort_acc`, `fg0`.`pub_rec_bankruptcies` `pub_rec_bankruptcies`, RANK() OVER (PARTITION BY `fg1`.`id`, `fg1`.`issue_d` ORDER BY `fg0`.`earliest_cr_line` DESC) pit_rank_hopsworks
FROM `airquality_don123_featurestore`.`loans_1` `fg1`
INNER JOIN `airquality_don123_featurestore`.`applicants_1` `fg0` ON `fg1`.`id` = `fg0`.`id` AND `fg1`.`issue_d` >= `fg0`.`earliest_cr_line`
WHERE `fg1`.`issue_d` >= TIMESTAMP '2016-11-01 00:00:00.000' AND `fg1`.`issue_d` < TIMESTAMP '2016-12-01 00:00:00.000') NA
WHERE `pit_rank_hopsworks` = 1) (SELECT `right_fg0`.`loan_amnt` `loan_amnt`, `right_fg0`.`term` `term`, `right_fg0`.`int_rate` `int_rate`, `right_fg0`.`installment` `installment`, `right_fg0`.`sub_grade` `sub_grade`, `right_fg0`.`purpose` `purpose`, `right_fg0`.`zip_code` `zip_code`, `right_fg0`.`home_ownership` `home_ownership`, `right_fg0`.`annual_inc` `annual_inc`, `right_fg0`.`verification_status` `verification_status`, `right_fg0`.`dti` `dti`, `right_fg0`.`open_acc` `open_acc`, `right_fg0`.`pub_rec` `pub_rec`, `right_fg0`.`revol_bal` `revol_bal`, `right_fg0`.`revol_util` `revol_util`, `right_fg0`.`total_acc` `total_acc`, `right_fg0`.`initial_list_status` `initial_list_status`, `right_fg0`.`application_type` `application_type`, `right_fg0`.`mort_acc` `mort_acc`, `right_fg0`.`pub_rec_bankruptcies` `pub_rec_bankruptcies`
FROM right_fg0)
TExecuteStatementResp(status=TStatus(statusCode=3, infoMessages=['*org.apache.hive.service.cli.HiveSQLException:Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask:28:27', 'org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:343', 'org.apache.hive.service.cli.operation.SQLOperation:runQuery:SQLOperation.java:232', 'org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:269', 'org.apache.hive.service.cli.operation.Operation:run:Operation.java:255', 'org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:541', 'org.apache.hive.service.cli.session.HiveSessionImpl:executeStatement:HiveSessionImpl.java:516', 'sun.reflect.GeneratedMethodAccessor210:invoke::-1', 'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43', 'java.lang.reflect.Method:invoke:Method.java:498', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78', 'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36', 'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63', 'java.security.AccessController:doPrivileged:AccessController.java:-2', 'javax.security.auth.Subject:doAs:Subject.java:422', 'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1821', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59', 'com.sun.proxy.$Proxy53:executeStatement::-1', 'org.apache.hive.service.cli.CLIService:executeStatement:CLIService.java:281', 'org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:712', 'org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1557', 'org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1542', 
'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39', 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39', 'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56', 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286', 'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149', 'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624', 'java.lang.Thread:run:Thread.java:750'], sqlState='08S01', errorCode=1, errorMessage='Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask'), operationHandle=None)
unable to rollback
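The trace above shows the Hive query failing inside a Tez task, and the trace alone does not say whether the failure is transient. If it is (an assumption), one common pattern is to retry the read a few times with an increasing delay. The sketch below is a generic helper with the hypothetical name `call_with_retry`. It is not a Hopsworks API, and it does not help if the query itself is at fault.

```python
import time


def call_with_retry(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn() up to `attempts` times, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as err:  # in practice, narrow this to the error you expect
            last_error = err
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # Every attempt failed: surface the last error to the caller
    raise last_error


# Small demonstration with a stand-in function that fails twice, then succeeds.
state = {"calls": 0}


def flaky_read():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("simulated transient failure")
    return "data"


# The sleep is stubbed out so the demonstration does not actually wait.
result = call_with_retry(flaky_read, attempts=3, sleep=lambda seconds: None)
print(result, state["calls"])
```

In the notebook, the call being retried would be the one from the cell above, for example `call_with_retry(lambda: feature_view.get_batch_data(start_time=start_time_data, end_time=end_time_data))`.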

In [None]:
# Make predictions on the batch data using the loaded model
predictions = model.predict(batch_data)

# Display the first 10 predictions
predictions[:10]
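Looking at the first ten predictions gives only a rough impression of the batch. A per-class count is often more informative. The sketch below uses only the standard library and works on any iterable of labels, including the array returned by `model.predict`. The helper name `summarize_predictions` and the sample labels `0` and `1` are hypothetical, since the notebook does not show the model's actual output labels.

```python
from collections import Counter


def summarize_predictions(predictions):
    """Return {label: (count, share of all predictions)} for an iterable of labels."""
    counts = Counter(predictions)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Hypothetical labels standing in for the model output
sample_predictions = [1, 1, 0, 1]
summary = summarize_predictions(sample_predictions)

for label, (count, share) in summary.items():
    print(f"label={label} count={count} share={share:.2f}")
```

A heavily skewed distribution can be a hint that the batch data differs from the training data, although it does not prove it.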

---

### <span style="color:#ff5f27;">🥳 <b> Next Steps  </b> </span>
Congratulations, you've now completed the Loan Approval tutorial for Managed Hopsworks.

Check out our other tutorials on ➡ https://github.com/logicalclocks/hopsworks-tutorials

Or documentation at ➡ https://docs.hopsworks.ai