# Automate loan approvals with business rules in Apache Spark and Scala

### Automating your business decisions at scale in Apache Spark with IBM ODM 8.9.2

This Scala notebook shows you how to execute business rules locally in DSX and Apache Spark. 
You'll learn how to call a rule-based decision service from Apache Spark. This decision service was built with IBM Operational Decision Manager.  

This notebook puts into action a decision service named Miniloan, which is part of the ODM tutorials. It uses business rules to determine whether a customer is eligible for a loan according to specific criteria. The criteria include the amount of the loan, the annual income of the borrower, and the duration of the loan.

First, we load a loan application dataset that was captured as a CSV file. In Scala, we then apply a map to this dataset to run the rule-based reasoning on each application and produce a decision. The rules are executed locally in the Spark service. This notebook shows complete Scala code that can execute any ruleset through the public APIs.

To get the most out of this notebook, you should have some familiarity with the Scala programming language.

## Contents 
This notebook contains the following main sections:

1. [Load the loan validation request dataset.](#accessdataset)
2. [Load the business rule execution and the simple loan application object model libraries.](#loadjars)
3. [Import Scala packages.](#importpackages)
4. [Implement a decision making function.](#implementDecisionServiceMap)
5. [Execute the business rules to approve or reject the loan applications.](#executedecisions) 
6. [View the automated decisions.](#viewdecisions)
7. [Summary and next steps.](#summary)  

<a id="accessdataset"></a>
## 1. Loading a loan application dataset file
A dataset of simple loan applications is already available. You load it into the notebook from its URL.

In [1]:
// @hidden_cell
import scala.sys.process._

"wget https://raw.githubusercontent.com/ODMDev/decisions-on-spark/master/data/miniloan/miniloan-requests-10K.csv".!

--2018-06-05 09:20:42--  https://raw.githubusercontent.com/ODMDev/decisions-on-spark/master/data/miniloan/miniloan-requests-10K.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.48.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.48.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 417500 (408K) [text/plain]
Saving to: ‘miniloan-requests-10K.csv.54’

     0K .......... .......... .......... .......... .......... 12% 17.6M 0s
    50K .......... .......... .......... .......... .......... 24% 20.3M 0s
   100K .......... .......... .......... .......... .......... 36% 14.2M 0s
   150K .......... .......... .......... .......... .......... 49% 18.5M 0s
   200K .......... .......... .......... .......... .......... 61% 14.3M 0s
   250K .......... .......... .......... .......... .......... 73% 6.38M 0s
   300K .......... .......... .......... .......... .......... 85% 20.4M 0s
   350K .......... ........

0

In [None]:
val filename = "miniloan-requests-10K.csv"

The following code loads the dataset of 10,000 simple loan applications, written in CSV format.

In [14]:
val requestData = sc.textFile(filename)
val requestDataCount = requestData.count
println(s"$requestDataCount loan requests read in a CSV format")
println("The first 20 requests:")
requestData.take(20).foreach(println)

10000 loan requests read in a CSV format
The first 20 requests:
John Doe, 550, 80000, 250000, 240, 0.05d
John Woo, 540, 100000, 250000, 240, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.07d
John Doe, 550, 80000, 250000, 240, 0.05d
John Woo, 540, 100000, 250000, 240, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.07d
John Doe, 550, 80000, 250000, 240, 0.05d
John Woo, 540, 100000, 250000, 240, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.07d
John Doe, 550, 80000, 250000, 240, 0.05d
John Woo, 540, 100000, 250000, 240, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.07d
John Doe, 550, 80000, 250000, 240, 0.05d
John Woo, 540, 100000, 250000, 240, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.05d
Peter Woo, 540, 60000, 250000, 120, 0.07d
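
Each line holds six comma-separated fields. As a minimal sketch, the fields can be parsed into a typed record before any rule is executed. The column meanings are inferred from the `makeRequest` function in section 4, and `LoanRequestRow` and `LoanCsv` are hypothetical names that are not part of the Miniloan object model. Returning an `Either` keeps a malformed line from failing the whole job.

```scala
// Hypothetical record type mirroring the six CSV columns (not part of the Miniloan XOM)
final case class LoanRequestRow(
    name: String,
    creditScore: Int,
    yearlyIncome: Int,
    amount: Int,
    duration: Int,
    yearlyInterestRate: Double)

object LoanCsv {
  // Returns Right(row) on success, or Left(message) describing the malformed line
  def parse(line: String): Either[String, LoanRequestRow] = {
    val tokens = line.split(",").map(_.trim)
    if (tokens.length != 6) {
      Left(s"Expected 6 fields but found ${tokens.length}: $line")
    } else {
      try {
        Right(LoanRequestRow(
          tokens(0),
          tokens(1).toInt,
          tokens(2).toInt,
          tokens(3).toInt,
          tokens(4).toInt,
          // parseDouble accepts the trailing 'd' suffix used in the data (0.05d)
          java.lang.Double.parseDouble(tokens(5))))
      } catch {
        case e: NumberFormatException =>
          Left(s"Bad number in line '$line': ${e.getMessage}")
      }
    }
  }
}
```

For example, `LoanCsv.parse("John Doe, 550, 80000, 250000, 240, 0.05d")` yields a `Right` holding the parsed row, while a line with a missing or non-numeric field yields a `Left` with an explanation.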


<a id="loadjars"></a>
## 2. Add libraries for business rule execution and a loan application object model
XXX refers to your object storage or another location where you make these JARs available.

Add the following JARs to execute the deployed decision service:
<ul>
<li>%AddJar https://XXX/j2ee_connector-1_5-fr.jar</li>
<li>%AddJar https://XXX/jrules-engine.jar</li>
<li>%AddJar https://XXX/jrules-res-execution.jar</li>
</ul>

In addition, you need the Jackson annotations library:
<ul>
<li>%AddJar https://XXX/jackson-annotations-2.6.5.jar</li>
</ul>

Business rules apply to a Java eXecutable Object Model (XOM) packaged as a JAR. We need these classes to create the decision requests and to retrieve the response from the rule engine.
<ul>
<li>%AddJar https://XXX/miniloan-xom.jar</li>
</ul>

In [None]:
// @hidden_cell
// The URLs below are accessible for IBM internal use only

%AddJar https://XXX/j2ee_connector-1_5-fr.jar
%AddJar https://XXX/jrules-engine.jar
%AddJar https://XXX/jrules-res-execution.jar
%AddJar https://XXX/jackson-annotations-2.6.5.jar -f

//Loan Application eXecutable Object Model
%AddJar https://XXX/miniloan-xom.jar -f

print("Your notebook is now ready to execute business rules to approve or reject loan applications")

<a id="importpackages"></a>
## 3. Import packages
Import ODM and Apache Spark packages.

In [16]:
import java.util.Map
import java.util.HashMap

import com.fasterxml.jackson.core.JsonGenerationException
import com.fasterxml.jackson.core.JsonProcessingException
import com.fasterxml.jackson.databind.JsonMappingException
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.SerializationFeature

import org.apache.spark.SparkConf
import org.apache.spark.api.java.JavaDoubleRDD
import org.apache.spark.api.java.JavaRDD
import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.api.java.function.Function
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path

import scala.collection.JavaConverters._

import ilog.rules.res.model._

import com.ibm.res.InMemoryJ2SEFactory
import com.ibm.res.InMemoryRepositoryDAO

import ilog.rules.res.session._

import miniloan.Borrower
import miniloan.Loan

import scala.io.Source
import java.net.URL
import java.io.InputStream

<a id="implementDecisionServiceMap"></a>
## 4. Implement a Map function that executes a rule-based decision service

In [17]:
case class MiniLoanRequest(borrower: miniloan.Borrower, 
      loan: miniloan.Loan) 

case class RESRunner(sessionFactory: com.ibm.res.InMemoryJ2SEFactory)  {
    
  def executeAsString(s: String): String = {
    println("executeAsString")
    val request = makeRequest(s)
    val response = executeRequest(request)
    
    response
  }
  
   private def makeRequest(s: String): MiniLoanRequest = {
    val tokens = s.split(",")
       
    // Borrower deserialization from CSV
    val borrowerName = tokens(0)
    val borrowerCreditScore = java.lang.Integer.parseInt(tokens(1).trim())
    val borrowerYearlyIncome = java.lang.Integer.parseInt(tokens(2).trim())
    val loanAmount = java.lang.Integer.parseInt(tokens(3).trim())
    val loanDuration = java.lang.Integer.parseInt(tokens(4).trim())
    val yearlyInterestRate = java.lang.Double.parseDouble(tokens(5).trim())
    val borrower = new miniloan.Borrower(borrowerName, borrowerCreditScore, borrowerYearlyIncome)
       
    // Loan request deserialization from CSV
    val loan = new miniloan.Loan()
    loan.setAmount(loanAmount)
    loan.setDuration(loanDuration)
    loan.setYearlyInterestRate(yearlyInterestRate)
       
    val request = new MiniLoanRequest(borrower, loan)
    request
  }
    
  def executeRequest(request: MiniLoanRequest): String = {
    try {
      val sessionRequest = sessionFactory.createRequest()
      val rulesetPath = "/Miniloan/Miniloan"
      sessionRequest.setRulesetPath(ilog.rules.res.model.IlrPath.parsePath(rulesetPath))

      //sessionRequest.getTraceFilter.setInfoAllFilters(false)
      // Pass the loan and the borrower as ruleset input parameters
      val inputParameters = sessionRequest.getInputParameters
      inputParameters.put("loan", request.loan)
      inputParameters.put("borrower", request.borrower)
      val session = sessionFactory.createStatelessSession()

      val response = session.execute(sessionRequest)

      // Serialize the input and output parameters as pretty-printed JSON
      val mapper = new com.fasterxml.jackson.databind.ObjectMapper()
      mapper.configure(com.fasterxml.jackson.databind.SerializationFeature.FAIL_ON_EMPTY_BEANS, false)
      val results = new java.util.HashMap[String,Object]()
      results.put("input", inputParameters)
      results.put("output", response.getOutputParameters())
      mapper.writerWithDefaultPrettyPrinter().writeValueAsString(results)
    } catch {
      // Any failure, in rule execution or in JSON serialization, is returned as text
      case exception: Exception => exception.toString()
    }
  }
}


val decisionService = new Function[String, String]() {

    @transient private var ruleSessionFactory: InMemoryJ2SEFactory = null
    private val rulesetURL = "https://odmlibserver.mybluemix.net/8901/decisionservices/miniloan-8901.dsar"
    @transient private var rulesetStream: InputStream = null

  def GetRuleSessionFactory(): InMemoryJ2SEFactory = {
    if (ruleSessionFactory == null) {
      ruleSessionFactory = new InMemoryJ2SEFactory()
      // Create the Management Session
      val repositoryFactory = ruleSessionFactory.createManagementSession().getRepositoryFactory()
      val repository = repositoryFactory.createRepository()

      // Deploy the RuleApp with the regular Management Session API.
      val rapp = repositoryFactory.createRuleApp("Miniloan", IlrVersion.parseVersion("1.0"))
      val rs = repositoryFactory.createRuleset("Miniloan", IlrVersion.parseVersion("1.1"))
      rapp.addRuleset(rs)

      //var fileStream = Source.fromResourceAsStream(RulesetFileName)

      // Download the ruleset archive and attach it to the ruleset
      rulesetStream = new java.net.URL(rulesetURL).openStream()

      rs.setRESRulesetArchive(IlrEngineType.DE, rulesetStream)
      repository.addRuleApp(rapp)
    }
    ruleSessionFactory
  }
    
  def call(s: String): String = {
    var runner = new RESRunner(GetRuleSessionFactory())
    return runner.executeAsString(s)
  }
    
  def execute(s: String): String = {
    try {
      var runner = new RESRunner(GetRuleSessionFactory())
      return runner.executeAsString(s)
    } catch {
      case exception: Exception => {
        exception.printStackTrace(System.err)
      }
    }
    "Execution error"
  }
}
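
The decision function above marks its session factory as `@transient` and builds it on first use, so the factory is never serialized with the function and each deserialized copy creates its own. The sketch below illustrates the same idea with a `@transient lazy val` and plain Java serialization standing in for what Spark does when it ships a function to an executor. All names in it (`LazyInitDemo`, `ExpensiveEngine`, `DecisionFunction`, `roundTrip`) are hypothetical, and `ExpensiveEngine` is only a stand-in for a rule session factory.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util.concurrent.atomic.AtomicInteger

object LazyInitDemo {
  // Counts how many engines were built in this JVM (for illustration only)
  val enginesCreated = new AtomicInteger(0)

  // Hypothetical stand-in for an expensive, non-serializable resource
  class ExpensiveEngine {
    def decide(input: String): String = s"decision for $input"
  }

  class DecisionFunction extends Serializable {
    // @transient: the field is skipped when the function is serialized.
    // lazy: it is built on first use, so each deserialized copy builds its own.
    @transient private lazy val engine: ExpensiveEngine = {
      enginesCreated.incrementAndGet()
      new ExpensiveEngine
    }

    def apply(input: String): String = engine.decide(input)
  }

  // Round-trips a value through Java serialization
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(value) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

Calling a `DecisionFunction` several times builds a single engine. A copy obtained through `roundTrip` builds a second one on its first call, which mirrors one rule session factory per deserialized function on the Spark side.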

<a id="executedecisions"></a>
## 5. Automate the decision making on the loan application dataset
You invoke a map on the dataset with the decision function. While the map runs, rule engines process the loan applications in parallel to produce a dataset of answers.

In [18]:
println("Start of Execution")
val answers = requestData.map(decisionService.execute)
printf("Number of rule based decisions: %s \n" , answers.count)
// Cleanup output file
//val fs = FileSystem.get(new URI(outputPath), sc.hadoopConfiguration);
//if (fs.exists(new Path(outputPath)))
   // fs.delete(new Path(outputPath), true)
// Save RDD in a HDFS file
println("End of Execution ")
//answers.saveAsTextFile("swift://DecisionBatchExecution." + securedAccessName + "/miniloan-decisions-10.csv")

println("Decision automation job done")

Start of Execution
Number of rule based decisions: 10000                                           
End of Execution 
Decision automation job done


<a id="viewdecisions"></a>
## 6. View your automated decisions
Each decision is composed of input parameters and output parameters. The loan data contains the approval flag and the computed yearly repayment. When tracing is enabled, a decision trace lists the business rules that were executed in sequence to reach the conclusion. The output shown here contains only the input and output parameters. Each decision is serialized in JSON.

In [19]:
//answers.toDF().show(false)
answers.take(1).foreach(println)

{
  "output" : {
    "ilog.rules.firedRulesCount" : 0,
    "loan" : {
      "amount" : 250000,
      "duration" : 240,
      "yearlyInterestRate" : 0.05,
      "yearlyRepayment" : 19798,
      "approved" : true,
      "messages" : [ ]
    }
  },
  "input" : {
    "loan" : {
      "amount" : 250000,
      "duration" : 240,
      "yearlyInterestRate" : 0.05,
      "yearlyRepayment" : 19798,
      "approved" : true,
      "messages" : [ ]
    },
    "borrower" : {
      "name" : "John Doe",
      "creditScore" : 550,
      "yearlyIncome" : 80000
    }
  }
}
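
The `yearlyRepayment` value of 19798 in this output is consistent with the standard fixed-rate amortization formula, assuming monthly payments and a duration expressed in months. The Miniloan object model source is not shown in this notebook, so the sketch below is an assumption about how the value may be computed rather than the actual implementation. `RepaymentSketch` and its parameter names are hypothetical.

```scala
object RepaymentSketch {
  // Standard fixed-rate amortization, assuming monthly payments,
  // a duration in months, and a strictly positive interest rate
  def yearlyRepayment(amount: Int, durationMonths: Int, yearlyInterestRate: Double): Int = {
    val monthlyRate = yearlyInterestRate / 12
    val monthlyPayment =
      amount * monthlyRate / (1 - math.pow(1 + monthlyRate, -durationMonths))
    // Truncate the yearly total to a whole number
    (monthlyPayment * 12).toInt
  }
}
```

With the first request (`250000`, `240`, `0.05`), `RepaymentSketch.yearlyRepayment` returns `19798`, which matches the value in the JSON above.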


<a id="summary"></a>
## 7. Summary and next steps
Congratulations! You have applied business rules to automatically determine loan approval eligibility. You loaded a loan application dataset and ran a rule engine inside an Apache Spark cluster to make an eligibility decision for each applicant. Each decision is a JSON string that is part of a Spark Resilient Distributed Dataset (RDD). 
Each decision is structured with input parameters (the context of the decision) and output parameters. For audit purposes, the rule engine can emit a decision trace.

Disclaimer: this notebook uses an experimental in-memory RuleSession API that simplifies the deployment pattern. For a customer deployment, you have to use a Rule Execution Server database to store the ruleset. At execution time, the Apache Spark application loads and runs the rules through the regular Java SE RuleSession API. 

In both cases (regular database or experimental in-memory), ODM lets rule engines automate decisions in parallel, locally in the Spark cluster, which provides high scalability.

<a id="authors"></a>
## Authors
Pierre Feillet and Laurent Grateau are business rule engineers at IBM working in the Decision lab located in France.

Copyright © 2018 IBM. This notebook and its source code are released under the terms of the MIT License.