
Client Retention Demo


This project serves as an end-to-end demonstration and integration verification test (IVT) for the Spark on z/OS reference architecture.

The project uses simulated financial data for a retail bank scenario. The retail bank is assumed to have two data sources in its system of record:

  • Client Profile Data (VSAM)
  • Client Transaction History (DB2)

The bank would use the Scala Workbench to distill these data sources into a curated data set, which data explorers would then analyze downstream with the Interactive Insights Workbench.


To perform an IVT of a Spark on z/OS deployment, one would do the following:

  1. Install and configure the IBM z/OS Platform for Apache Spark.
  2. Prime the z/OS data sources with sample demo data.
  3. Install the Scala Workbench.
  4. Install the Interactive Insights Workbench (I2W) and MongoDB.
  5. Run a Scala Notebook to prime MongoDB.
  6. Run an I2W Notebook to visualize downstream analytics.


The project consists of the following components:

  • z System Data Package
    • Sample Client Profile Data (VSAM)
    • Sample Client Transaction History (DB2)
    • Preload scripts
  • Notebooks
    • Sample Scala Notebook that performs data munging on DB2 and VSAM data and writes the results to MongoDB.
    • Sample Python Notebooks that analyze data in MongoDB.
    • Sample Python Notebook that uses Dato to provide a churn analysis on the data in MongoDB. Pending contribution from Dato.


The client retention demo requires the setup steps described below.

Demo Setup

z/OS Data

Prepare the VSAM and DB2 data sources with sample demo data: go into the data/zos/ directory and follow the README there.

Setup Interactive-Insights-Workbench (I2W) Environment

Download I2W and follow the Quickstart instructions.

Setup MongoDB

If you haven't already, download I2W and follow the Sample Database instructions.

Setup Scala-Workbench Environment

Download the Scala-Workbench and follow the setup instructions. The Setup Environment Variables section provides an example template file, template/docker-compose.yml.template. In this file, make sure to fill in the following fields:

  • JDBC_USER - User able to query VSAM and DB2
  • JDBC_PASS - Password for user able to query VSAM and DB2
  • JDBC_HOST - Host system of VSAM and DB2
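At runtime these values are typically assembled into a JDBC connection URL. As a rough illustration only (the port number, database location name, and URL shape below are assumptions; consult your DB2 for z/OS configuration), the workbench-side code might build it like this:

```python
import os

# Read the connection settings that docker-compose injects as
# environment variables (defaults here are hypothetical placeholders).
jdbc_user = os.environ.get("JDBC_USER", "sparkusr")
jdbc_pass = os.environ.get("JDBC_PASS", "secret")
jdbc_host = os.environ.get("JDBC_HOST", "zos.example.com")

# A DB2 for z/OS JDBC URL generally follows this shape; the port (446)
# and location name (LOCDB11) are placeholders for illustration.
jdbc_url = f"jdbc:db2://{jdbc_host}:446/LOCDB11"

print(jdbc_url)
```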

If you are using a username/password with MongoDB, also fill in the following fields:

  • MONGO_USER - User able to access the MongoDB database
  • MONGO_PASS - Password for the user able to access the MongoDB database
  • MONGO_HOST - Host system of the MongoDB instance
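Filled in, the relevant environment section of the generated docker-compose.yml might look like the following sketch (every value is a placeholder, and the service name and surrounding structure come from the template, not from this README):

```yaml
services:
  scala-workbench:
    environment:
      # User able to query VSAM and DB2
      JDBC_USER: sparkusr
      JDBC_PASS: changeme
      JDBC_HOST: zos.example.com
      # Only needed if MongoDB authentication is enabled
      MONGO_USER: mongousr
      MONGO_PASS: changeme
      MONGO_HOST: mongo.example.com
```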

Verification Test

Once the setup steps listed above have been completed, you can verify the setup using the following scripts:

  1. On the Scala-Workbench run the client_retention_demo.ipynb notebook.
  2. On I2W run the client_explore.ipynb and churn_business_value.ipynb notebooks.


The client_retention_demo.ipynb will use IBM z/OS Platform for Apache Spark to access data stored in a DB2 table and in a VSAM data set. It will then calculate aggregate statistics and offload the results to MongoDB.
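The notebook's actual Spark logic lives in the repo; as a plain-Python sketch of the kind of per-client aggregation involved (the field names and numbers below are invented for illustration), it might look like:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transaction records, standing in for rows joined from
# the DB2 transaction table and the VSAM client profile data set.
transactions = [
    {"client_id": 1, "amount": 120.0},
    {"client_id": 1, "amount": 80.0},
    {"client_id": 2, "amount": 200.0},
]

# Group amounts by client, then compute simple aggregate statistics,
# analogous to what the Scala notebook offloads to MongoDB.
by_client = defaultdict(list)
for txn in transactions:
    by_client[txn["client_id"]].append(txn["amount"])

stats = {
    cid: {"total": sum(amts), "mean": mean(amts), "count": len(amts)}
    for cid, amts in by_client.items()
}

print(stats[1])  # → {'total': 200.0, 'mean': 100.0, 'count': 2}
```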


The client_explore.ipynb will read from MongoDB and create several interactive exploratory widgets.


The churn_business_value.ipynb will read from MongoDB and create several interactive widgets that show the business value of target groups.
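The widget code itself is in the notebook; as a rough sketch of the underlying idea (assuming here that "business value" means expected revenue at risk among clients above a churn-probability threshold, with all numbers invented for illustration), the calculation could look like:

```python
# Hypothetical per-client churn scores and annual revenue figures.
clients = [
    {"id": 1, "churn_prob": 0.9, "annual_revenue": 1000.0},
    {"id": 2, "churn_prob": 0.2, "annual_revenue": 5000.0},
    {"id": 3, "churn_prob": 0.7, "annual_revenue": 2000.0},
]

# Target group: clients whose churn probability exceeds a threshold.
THRESHOLD = 0.5
target = [c for c in clients if c["churn_prob"] > THRESHOLD]

# Expected revenue at risk: churn probability times annual revenue,
# summed over the target group.
value_at_risk = sum(c["churn_prob"] * c["annual_revenue"] for c in target)

print(len(target), value_at_risk)  # → 2 2300.0
```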


The client_churn_analysis.ipynb will read from MongoDB and apply machine learning techniques to the data using technology from Dato. Pending contribution from Dato.