Data security and privacy laws and regulations have become more stringent, while businesses are increasingly expected to open up to ecosystem partners. This makes data governance critical for avoiding litigation and the loss of competitive position and trust.
A collaborating application, whether from within the enterprise or from an ecosystem partner, could require access to a data source for both read and write operations. A read operation must therefore mask sensitive data such as name, location, contact details, date of birth, credit card number, and financial details. A write operation must ensure that all data policies are enforced. In such a scenario, a data governance framework plays a critical role: it enforces data security and privacy while enabling the business to pursue its strategy.
The code pattern Mask data for AI applications for security and privacy conformance demonstrated a methodology to mask sensitive data for a collaborating application. This code pattern demonstrates the following aspects:
- Mask sensitive data for a collaborating application from within the enterprise or ecosystem partner.
- Authentication when the collaborating application is a chatbot ensuring data security.
- Maintain control and enforce data policies when the collaborating application writes back to the data source.
Let us consider the following business scenario. An insurance portal application lets a customer register, sign in, purchase a policy, view policy details, and surrender a policy. Currently all this functionality is available on a web portal. The business wants to extend the reach of its application and build new systems of engagement for its customers. In this scenario, a chatbot is introduced to give customers a conversational, interactive experience that is easy to use. The chatbot coexists with the web portal and provides the following features and business services:
- Register as a user
- Sign in to the chatbot
- Buy an insurance policy
- View all policy details
- Surrender a policy
In this scenario, the web portal application owns the data and is responsible for enforcing data governance. To engage the chatbot as a system of experience, the portal must expose data and entities that enable the following capabilities:
- Authenticate users of the chatbot.
- Read access to data from the web portal application with sensitive data masked.
- Write access to data with data policies enforced.
The data security requirements are as follows:
- The web portal application owns the data and hence performs all write operations on the data with data policies enforced.
- The chatbot application performs write operations on the data by invoking APIs exposed by the web portal application. This ensures conformance to data governance and security.
- Every request from the chatbot application to the web portal application must be authenticated.
- The chatbot application must have read access to the data with sensitive information like credit card number masked.
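To make the last requirement concrete, here is a minimal sketch of what a masked credit card number could look like to the chatbot. In this code pattern the masking is enforced by data protection rules in Watson Knowledge Catalog, not by application code, so the `maskCardNumber` helper and its "keep the last four digits" format are assumptions used only for illustration.

```javascript
// Hypothetical helper (not part of the code pattern): masks every digit of a
// card number except the last `visibleDigits` digits.
function maskCardNumber(cardNumber, visibleDigits = 4) {
  // Drop spaces, dashes, and any other non-digit characters.
  const digits = String(cardNumber).replace(/\D/g, "");
  if (digits.length <= visibleDigits) {
    // Too short to reveal anything safely, so mask the whole value.
    return "X".repeat(digits.length);
  }
  const masked = "X".repeat(digits.length - visibleDigits);
  return masked + digits.slice(-visibleDigits);
}
```

The actual output format depends on the masking method chosen when the data protection rule is configured.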
Note: To keep the web portal application simple, we will build only the APIs for the web portal application. We will refer to it as Portal Svc. The focus of this code pattern is on the chatbot application.
In this code pattern, you will learn how to:
- Set up data assets for governance in the Watson Knowledge Catalog
- Create data categories, classes, business terms and data protection rules for the data assets
- Create a virtualized view of the data on Watson Query with data policies enforced
- Create a chatbot application using Watson Assistant that invokes APIs exposed by the Portal Svc for writing data to the data source, and consumes the read-only data with sensitive information masked from Watson Query.
Security Verify has been used to implement authentication for the Chatbot application.
- Create tables in Db2. The Db2 connection and the tables (as Data Assets) are added to the Watson Knowledge Catalog (WKC). The data policies are configured for the data assets in WKC.
- Db2 is added as a data source in Watson Query. The needed tables are virtualized and a View is created by joining the virtualized tables.
- The Watson Query virtualized tables and view are published to WKC. The data policies are configured for the data assets in WKC.
- The user accesses the chatbot and is given the option to Register as a user or Login.
- A new user is provided a web URL for registration. An existing user is authenticated using a one-time passcode sent to the user's email address.
- The user accesses the registration link hosted on the Portal Svc and fills in the registration form.
- A new user is created in Security Verify, and a record is added to a Db2 table with the other customer details.
- After successful one-time passcode authentication, the user can perform the operations that write to the data source: Buy a Policy or Surrender a Policy. The Portal Svc API for the operation is invoked. The Portal Svc validates the request with Security Verify using Token Introspection.
- The Portal Svc then writes to the Db2 database with the data policies applied for the invoked operation.
- The response from the Portal Svc is returned to Watson Assistant, the Chatbot Svc, and eventually to the end user accessing the chatbot interface.
- The user requests to View Active policies or View All policies. Since this is a read operation, the request goes to the Chatbot Svc. The Chatbot Svc validates the request with Security Verify using Token Introspection.
- The Chatbot Svc accesses Watson Query to get the results. The data policies are applied to mask sensitive data in the results.
- All responses are sent back to the user on the chatbot interface.
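Both services validate each request with Token Introspection before doing any work. The sketch below shows the general shape of an OAuth 2.0 introspection request (RFC 7662) and how the response's `active` flag could be checked. The function names are hypothetical, and sending `client_id` and `client_secret` as form fields is an assumption; the services in this code pattern are Java applications and may build the request differently.

```javascript
// Hypothetical helper: describes the HTTP request a service could send to the
// Security Verify introspection endpoint. Nothing is sent over the network here.
function buildIntrospectionRequest(tenantUrl, clientId, clientSecret, token) {
  // Assumption: client credentials travel in the form body with the token.
  const form = new URLSearchParams({
    client_id: clientId,
    client_secret: clientSecret,
    token: token,
  });
  return {
    url: `https://${tenantUrl}/v1.0/endpoint/default/introspect`,
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: form.toString(),
  };
}

// RFC 7662 requires the introspection response to carry a boolean `active` member.
function isTokenActive(introspectionResponse) {
  return Boolean(introspectionResponse) && introspectionResponse.active === true;
}
```

A request would be rejected whenever `isTokenActive` returns false, before any read from Watson Query or write to Db2 takes place.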
- IBM Cloud account
- OpenShift Cluster
- IBM Security Verify account
- Git client
- Clone the repository
- Create IBM Cloud Services
- Configure Security Verify
- Provide access for collaborators to Cloud Pak for Data
- Create Cloud Functions Action
- Setup Watson Assistant Chatbot
- Deploy Applications
- Configure Watson Query
- Configure Watson Knowledge Catalog
- Access the Application
- Summary
From a command terminal, run the below command to clone the repo:
git clone https://github.com/IBM/data-governance-insurance-chatbot-app
2.1 Create Db2, Watson Knowledge Catalog, Watson Query, and Watson Assistant instances on Cloud Pak for Data
In the code pattern, we will be using Cloud Pak for Data.
Cloud Pak For Data is available in two modes -
2.1.1 For fully managed service, click here and follow the steps.
2.1.2 For self managed software, click here and follow the steps.
Go to the Watson Knowledge Catalog console. Select View All Catalogs on the hamburger menu on the top left.
Click on Create Catalog.
Enter a name for the catalog (say InsClCatalog). Enter a description. Select Enforce data policies. Click Create.
Click Security Verify to sign up for Security Verify. After you sign up for an account, the account URL (https://[tenant name].verify.ibm.com/ui/admin) and password are sent in an email.
Note: If you are using Cloud Pak for Data as self-managed software, the same cluster can be used for application deployment.
Go to this link to create an instance of an OpenShift cluster.
Make a note of the Ingress Subdomain URL.
Please follow the instructions here to configure Security Verify.
For fully managed service, click here and follow the steps.
For self managed software, click here and follow the steps.
Log in to your IBM Cloud account. On the dashboard, click on the hamburger menu, navigate to Functions, and click on Actions.
Click the Create button to create a new action.
Enter a name for the action under Action Name. Leave Enclosing Package as (Default Package). Under Runtime, select the option for Node.js.
Click the Create button. You are presented with the action's code editor. Replace the existing code with the JavaScript code here.
Next, in the JavaScript code, update the values of the following variables (defined at the beginning of the file):
//Security Verify Details
var tenant_url = "xxxx.verify.ibm.com"
var client_id = "xxxx"
var client_secret = "xxxx"
//API Details
var REGISTRATION_API_URL = "http://<openshift_url>/ins/portalsvc/register";
var BUY_POLICY_API_URL = "http://<openshift_url>/ins/portalsvc/createpolicy";
var SURRENDER_POLICY_API_URL = "http://<openshift_url>/ins/portalsvc/surrpolicy";
var VIEW_ACTIVE_POLICIES_API_URL = "http://<openshift_url>/ins/chatbotsvc/getallactivepolicies";
var VIEW_ALL_POLICIES_API_URL = "http://<openshift_url>/ins/chatbotsvc/getallpolicies";
Note: Use the Security Verify credentials noted in step 3, and replace openshift_url with the OpenShift ingress subdomain URL noted in step 2.4.
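As a rough sketch, the five endpoint URLs could be derived from the ingress subdomain in one place. The `buildApiUrls` function is hypothetical, and the `portal-svc-governance.` and `chatbot-svc-governance.` host prefixes are an assumption taken from the route URLs used in the deployment steps; check the output of `oc get routes` for the actual host names.

```javascript
// Hypothetical helper: builds the API URLs used by the Cloud Functions action.
// Assumption: routes follow the <service>-governance.<ingress subdomain> pattern.
function buildApiUrls(ingressSubdomain) {
  const portal = `http://portal-svc-governance.${ingressSubdomain}/ins/portalsvc`;
  const chatbot = `http://chatbot-svc-governance.${ingressSubdomain}/ins/chatbotsvc`;
  return {
    // Write operations go to the Portal Svc, which owns the data.
    REGISTRATION_API_URL: `${portal}/register`,
    BUY_POLICY_API_URL: `${portal}/createpolicy`,
    SURRENDER_POLICY_API_URL: `${portal}/surrpolicy`,
    // Read operations go to the Chatbot Svc, which reads masked data.
    VIEW_ACTIVE_POLICIES_API_URL: `${chatbot}/getallactivepolicies`,
    VIEW_ALL_POLICIES_API_URL: `${chatbot}/getallpolicies`,
  };
}
```

Keeping the write paths under `portalsvc` and the read paths under `chatbotsvc` mirrors the split described in the data security requirements.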
Click the Save button on the top right of the code editor.
For the action just created, click Endpoints on the left side navigation menu. Select the checkbox Enable as Web Action. Click the Save button on the top right corner. When saved, the Copy web action url icon under the Web Action section is enabled. Click the icon to copy the webhook URL. This URL will be used in Watson Assistant to call the actions in Cloud Functions.
Go to the Watson Assistant instance that you created earlier. Then click the Launch Watson Assistant button to launch the Watson Assistant dashboard.
- On the Watson Assistant home page, click the Create New + option on the top panel.
- Provide a name of your choice, say InsuranceBot, and click Create assistant.
- Navigate to Assistant Settings toward the bottom of the left panel. Under the Dialog section, click on Activate Dialog. Dialog will now be visible as one of the options in the left panel.
- Click on Dialog > Options > Upload/Download and provide the JSON file available at <cloned repo>/sources/chatbot/dialog/. Click Upload.
- On the left navigation links, click Options > Webhooks and, in the URL text field, enter the web action URL noted in step 5 with .json appended. It should look something like this: https://eu-gb.functions.appdomain.cloud/api/v1/web/.../default/sample.json
Now the chatbot is ready to use.
- On the left panel, click the Preview button.
- Click the Customize web chat button in the top-right corner. Here you can make changes as per your choice.
- The following changes were made for this code pattern:
  - Under the Home Screen tab, toggle the button to set it off.
  - Click on Save and Exit.
- Now, the chatbot can be used in this preview window.
- Optional: If you wish to embed this chatbot into your portal, go to Preview > Customize web chat > Embed (tab). It shows a code snippet like:
  <script>
    window.watsonAssistantChatOptions = {
      integrationID: "cxxx0", // The ID of this integration.
      region: "us-south", // The region your integration is hosted in.
      serviceInstanceID: "fexxxa", // The ID of your service instance.
      onLoad: function(instance) { instance.render(); }
    };
    setTimeout(function(){
      const t=document.createElement('script');
      t.src="https://web-chat.global.assistant.watson.appdomain.cloud/versions/" + (window.watsonAssistantChatOptions.clientVersion || 'latest') + "/WatsonAssistantChatEntry.js";
      document.head.appendChild(t);
    });
  </script>
  This code will later be copied and pasted inside the body tags of chatbot.html in the chatbot svc application. The chatbot will then be integrated with your portal. This step is described during the chatbot-svc deployment.
Note that this code pattern uses the preview option to ease the process.
In the cloned repo folder, go to sources/portal-svc/src/main/resources. Open db.config.
Replace {{host}} and {{port}} with the host and port you noted during Db2 credentials creation. For userid and password, enter the username and password from the Db2 credentials. For schema, enter the username in uppercase. Save the file.
Note: The schema must be the uppercase form of the username noted in the Db2 credentials.
jdbcurl=jdbc:db2://{{host}}:{{port}}/bludb:sslConnection=true;
userid=
password=
schema=
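The substitutions above can be sketched as a small function that produces the finished file content. The `buildPortalDbConfig` name and the shape of its argument are hypothetical; the point it illustrates is that the schema is derived from the username by uppercasing it.

```javascript
// Hypothetical helper: renders the db.config content for the Portal Svc
// from the Db2 service credentials.
function buildPortalDbConfig({ host, port, username, password }) {
  return [
    `jdbcurl=jdbc:db2://${host}:${port}/bludb:sslConnection=true;`,
    `userid=${username}`,
    `password=${password}`,
    // The schema is the Db2 username in uppercase.
    `schema=${username.toUpperCase()}`,
  ].join("\n");
}
```

Avoid committing the generated file to source control, since it contains the database password.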
In the cloned repo folder, go to sources/portal-svc/src/main/resources. Open verify.config.
Make the below changes and save the file:
- Replace {{tenant-id}} with the tenant id of Security Verify noted at the time of creation.
- For clientId and clientSecret, enter the Client ID and Client secret noted on the Sign-on tab of Security Verify.
- For apiClientId and apiClientSecret, enter the Client ID and Client secret noted on the API Access tab of Security Verify.
introspectionUrl=https://{{tenant-id}}.verify.ibm.com/v1.0/endpoint/default/introspect
tokenUrl=https://{{tenant-id}}.verify.ibm.com/v1.0/endpoint/default/token
userInfoUrl=https://{{tenant-id}}.verify.ibm.com/v1.0/endpoint/default/userinfo
clientId=
clientSecret=
usersUrl=https://{{tenant-id}}.verify.ibm.com/v2.0/Users
apiClientId=
apiClientSecret=
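If you prefer to script the placeholder replacement rather than edit the file by hand, a minimal sketch could look like the following. The `fillPlaceholders` function is hypothetical and handles any `{{name}}` placeholder, leaving unknown placeholders untouched so that a missed value stays visible in the file.

```javascript
// Hypothetical helper: replaces {{name}} placeholders in a config template.
function fillPlaceholders(template, values) {
  return template.replace(/\{\{\s*([\w-]+)\s*\}\}/g, (match, key) =>
    // Keep the original placeholder when no value was supplied for it.
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match
  );
}
```

For example, calling it with `{ "tenant-id": "mytenant" }` turns every `https://{{tenant-id}}.verify.ibm.com/...` line into the URL for that tenant.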
On the terminal window, go to the repository folder that we cloned earlier.
Go to the directory sources/portal-svc/src/main/java/com/example/portalsvc/rest/.
Open the file PortalSvcEndpoint.java.
Replace the placeholder {{ingress-sub-domain}} with the ingress subdomain of the OpenShift cluster you noted earlier. Save the file.
private static String ingressSubDomain = "portal-svc-governance.{{ingress-sub-domain}}/";
Now change directory to /sources/portal-svc in the cloned repo folder.
Run the following commands to deploy Portal Service.
oc new-project governance
mvn clean install
oc new-app . --name=portal-svc --strategy=docker
oc start-build portal-svc --from-dir=.
oc logs -f bc/portal-svc
oc expose svc/portal-svc
Ensure that the application has started successfully using the command oc get pods. Also make a note of the route using the command oc get routes.
In this step, we will create two tables in the Db2 database: POLICY_HOLDER and POLICIES.
Invoke the URL http://portal-svc-governance.{{IngressSubdomainURL}}/ins/portalsvc/setupdb
Note: Replace {{IngressSubdomainURL}} with the Ingress subdomain of the OpenShift cluster.
In the cloned repo folder, go to sources/chatbot-svc/src/main/resources. Open db.config.
Here, you need to enter the Watson Query credentials noted earlier. For mode, specify managed if using the SaaS version of Cloud Pak for Data, and self if Cloud Pak for Data is deployed on a self-managed OpenShift cluster. Specify the JDBC URL for Watson Query. For managed mode, the JDBC URL will have the apikey embedded. This apikey is the one that was generated on the Data Collaborator IBM Cloud account.
For a self-managed cluster, enter the user id and password of the collaborator user. Save the file.
jdbcurl=jdbc:db2://
mode=managed
userid=
password=
schema=INSSCHEMA
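Because the required fields differ between the two modes, a quick sanity check before deployment can catch a mismatched file. The sketch below parses the key=value lines and reports problems. Both function names are hypothetical, and the check for the literal text `apikey=` inside the JDBC URL is an assumption about how the key is embedded.

```javascript
// Hypothetical helper: parses simple key=value lines into an object.
function parseProperties(text) {
  const props = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const idx = trimmed.indexOf("="); // split on the first '=' only
    if (idx === -1) continue;
    props[trimmed.slice(0, idx).trim()] = trimmed.slice(idx + 1).trim();
  }
  return props;
}

// Hypothetical check: returns a list of problems found in the chatbot db.config.
function checkChatbotDbConfig(props) {
  const problems = [];
  const url = props.jdbcurl || "";
  if (!["managed", "self"].includes(props.mode)) {
    problems.push("mode must be 'managed' or 'self'");
  }
  if (!url.startsWith("jdbc:db2://") || url === "jdbc:db2://") {
    problems.push("jdbcurl must be a complete jdbc:db2:// URL");
  }
  // Assumption: in managed mode the API key appears as 'apikey=' in the URL.
  if (props.mode === "managed" && !/apikey=/i.test(url)) {
    problems.push("managed mode expects the apikey embedded in jdbcurl");
  }
  if (props.mode === "self" && (!props.userid || !props.password)) {
    problems.push("self mode needs userid and password");
  }
  return problems;
}
```

An empty result means the file is at least structurally consistent with the chosen mode; it does not verify that the credentials are valid.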
In the cloned repo folder, go to sources/chatbot-svc/src/main/resources. Open verify.config.
Make the below changes and save the file:
- Replace {{tenant-id}} with the tenant id of Security Verify noted at the time of creation.
- For clientId and clientSecret, enter the Client ID and Client secret noted on the Sign-on tab of Security Verify.
- For apiClientId and apiClientSecret, enter the Client ID and Client secret noted on the API Access tab of Security Verify.
introspectionUrl=https://{{tenant-id}}.verify.ibm.com/v1.0/endpoint/default/introspect
tokenUrl=https://{{tenant-id}}.verify.ibm.com/v1.0/endpoint/default/token
userInfoUrl=https://{{tenant-id}}.verify.ibm.com/v1.0/endpoint/default/userinfo
clientId=
clientSecret=
usersUrl=https://{{tenant-id}}.verify.ibm.com/v2.0/Users
apiClientId=
apiClientSecret=
Open the file sources/chatbot-svc/src/main/resources/chatbot.html.
Add the embed script you copied during the Watson Assistant setup between the HTML body tags. Save the file.
The chatbot will be accessible at the URL http://chatbot-svc-governance.{{IngressSubdomainURL}}/ins/chatbotsvc/chatbot after deploying the service.
Note: Replace {{IngressSubdomainURL}} with the Ingress subdomain of the OpenShift cluster.
On the terminal window, go to the repository folder that we cloned earlier.
Now change directory to /sources/chatbot-svc in the cloned repo folder.
Run the following commands to deploy Chatbot Service.
oc project governance
mvn clean install
oc new-app . --name=chatbot-svc --strategy=docker
oc start-build chatbot-svc --from-dir=.
oc logs -f bc/chatbot-svc
oc expose svc/chatbot-svc
Ensure that the application has started successfully using the command oc get pods. Also make a note of the route using the command oc get routes.
Log in to Cloud Pak for Data with Data Owner credentials. Go to the Watson Query console.
Select Service settings in the dropdown menu. Click on the Governance tab. Enable Enforce policies within Data Virtualization and Enforce publishing to a governed catalog.
Select Data Sources in the dropdown menu. Click on Add Connection. Select Db2 on Cloud if the instance is on IBM Cloud. Enter the Db2 credentials that you noted earlier, and create the connection.
Select Schemas in the dropdown menu. Click on New schema and create a schema with a name, say INSSCHEMA.
Select Schemas in the dropdown menu. Select the POLICY_HOLDER and POLICIES tables. Add to Cart. Go to the cart, select the Virtualized data option, and click on Virtualize as shown.
Select Virtualized data in the dropdown menu. Select the POLICY_HOLDER and POLICIES tables. Click on Join. On the next page, create a join key from CUST_ID of the POLICY_HOLDER table to CUST_ID of the POLICIES table.
On the next page, select the Virtualized data option. Click Create View.
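Conceptually, the view pairs each policy with its policy holder through the shared CUST_ID key. The sketch below imitates that with in-memory rows. It assumes inner-join behavior, and every column other than CUST_ID (such as NAME and POLICY_ID in the usage below) is a hypothetical name chosen for illustration.

```javascript
// Hypothetical illustration of joining POLICY_HOLDER and POLICIES rows on CUST_ID.
// Assumption: inner join, so policies without a matching holder are dropped.
function joinOnCustId(policyHolders, policies) {
  // Index holders by key so each policy lookup is a single Map access.
  const holdersById = new Map(policyHolders.map((h) => [h.CUST_ID, h]));
  const rows = [];
  for (const policy of policies) {
    const holder = holdersById.get(policy.CUST_ID);
    if (holder) {
      rows.push({ ...holder, ...policy });
    }
  }
  return rows;
}
```

A holder with two policies therefore appears in two rows of the result, which is the shape the chatbot needs when listing all policies for a signed-in customer.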
Select Virtualized data in the dropdown menu. For the POLICYHOLDER_POLICIES_VIEW, select Manage Access. On the access page, click on Grant Access and provide access to the Data Collaborator user.
Log in to Cloud Pak for Data with Data Owner credentials. Go to the Watson Query console.
Click View All Catalogs on the left hamburger menu. Click on the catalog that you created earlier. All the Watson Query Data Assets should appear as shown.
Click on the INSSCHEMA.POLICY_HOLDER data asset. Click on the Asset tab.
Enter the connection details of Watson Query noted earlier.
If it is a fully managed Cloud Pak for Data service:
- On the IBM Cloud Dashboard, go to Manage and select Access (IAM). Create an IBM Cloud API Key. Note the API key.
- On the Asset tab, select API Key as the mode of authentication.
- Enter the API key noted in the earlier step, and click Connect.
If it is self-managed software for Cloud Pak for Data:
- Enter the Data Owner credentials for Cloud Pak for Data.
The data should now be visible on the Asset tab.
For each of the assets INSSCHEMA.POLICY_HOLDER, INSSCHEMA.POLICIES, and INSSCHEMA.POLICYHOLDER_POLICIES_VIEW, go to the Profile tab and click Create Profile.
Click View All Catalogs on the left hamburger menu. Click on Add category and select New category.
Create a category for personal financial information. Enter a name and click Create.
Click Data classes on the left hamburger menu. Click on Add data class and select New data class.
Enter details as shown and click Create.
This will be saved as a Draft. Click Publish to publish the data class.
Click Business terms on the left hamburger menu. Click on Add business term and select New business term.
Enter details as shown and click Create.
This will be saved as a Draft. Click Publish to publish the business term.
Open the Asset tab for the ORDERS table. Assign the data class CC_NUM_CLASS created earlier to the credit card information columns.
Open the Asset tab for the POLICY_HOLDER table. Verify the data class assignment for the mobile and email columns.
Click Rules on the left hamburger menu. Click on Add rule and select New rule.
Next, select Data protection rule. Configure the rule as shown. This rule will mask the credit card data for collaborators. Click on Create.
Similarly, you can add rules for masking mobile, email, and credit card expiry information.
Log in to Watson Query with Data Owner credentials. Preview the POLICYHOLDER_POLICIES_VIEW.
Log in to Watson Query with Data Collaborator credentials. Preview the POLICYHOLDER_POLICIES_VIEW.
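The difference between the two previews can be pictured with a small simulation: the Data Owner sees rows unchanged, while any other user sees the protected columns redacted. This is only a mental model of the effect of the data protection rules, not how Watson Knowledge Catalog implements them. The column names, the role strings, and the full-redaction format are all assumptions.

```javascript
// Hypothetical column names; the real names come from the virtualized tables.
const PROTECTED_COLUMNS = ["CC_NUMBER", "EMAIL", "MOBILE"];

// Replaces every character of the value with 'X'.
function redact(value) {
  return "X".repeat(String(value).length);
}

// Returns the row as a given role would see it in the preview.
function previewRow(row, role, protectedColumns = PROTECTED_COLUMNS) {
  if (role === "owner") {
    return { ...row }; // the owner sees unmasked data
  }
  const visible = {};
  for (const [column, value] of Object.entries(row)) {
    visible[column] = protectedColumns.includes(column) ? redact(value) : value;
  }
  return visible;
}
```

Because the chatbot reads through the Data Collaborator credentials, it only ever receives the redacted form of these columns.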
In the next section, let us access the application and see the data privacy policies enforced for the chatbot.
As mentioned in step 6, this code pattern uses the preview of the chatbot provided by the Watson Assistant service. To access the chatbot, go to the Watson Assistant service instance link on your cloud resources, click Launch Watson Assistant, and then click on Preview in the left panel. Now you can converse with the chatbot as shown.
In this code pattern you saw how to set up a data governance framework with security using Watson Knowledge Catalog, Watson Query and Security Verify to enforce data policies for a collaborating chatbot application. You saw how both write and read operations can be supported for a collaborating application while enforcing data policies.
This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.