<img src="https://raw.githubusercontent.com/Db2-DTE-POC/CPDDVLAB/master/media/Digital Technical Engagement.png">

# IBM Cloud Pak for Data Data Virtualization Demonstration

### Where to find this notebook online
You can find a copy of this notebook at https://github.com/Db2-DTE-POC/CPDDVLAB.

### This notebook demonstrates using the CPD REST APIs to access the Data Virtualization service

All the code in this notebook runs using the main **peter** userid and **PETER** password.

#### RESTful Services
IBM Cloud Pak for Data is built on a set of microservices that communicate with each other and the Console user interface using RESTful APIs. You can use these services to automate anything you can do through the user interface.

This Jupyter Notebook contains examples of how to use the Open APIs to retrieve information from the virtualization service, how to run SQL statements directly against the service through REST, and how to grant authorization to objects. This gives you a way to write your own scripts to automate the setup and configuration of the virtualization service.

The next part of the lab relies on a set of base classes to help you interact with the RESTful Services API for IBM Cloud Pak for Data Virtualization. You can access this library on GitHub. The commands below download the library and run it as part of this notebook.
<pre>
&#37;run CPDDVRestClassV35.ipynb
</pre>
The cell below loads the RESTful service classes and methods directly from GitHub. Loading takes a few seconds, so wait until the "Db2 Extensions Loaded" message is displayed in your notebook before continuing.
1. Click the cell below
2. Click **Run**

In [None]:
!wget -O CPDDVRestClassV35.ipynb https://raw.githubusercontent.com/Db2-DTE-POC/CPDDVLAB/master/CPDDVRestClassV35.ipynb
%run CPDDVRestClassV35.ipynb

## Establishing a Connection to the Console

### Connections
To connect to the Data Virtualization service you need to provide the URL, the service name (v1), and the console user name and password. The next cell connects to the console from inside the IBM CPD Cluster.

In [None]:
# Connect to the Db2 Data Management Console service

Console  = 'https://cpd-cpd-cpd.cp4d-poc-estesexp-283594-73aebe06726e634c608c4167edcc2aeb-0000.tor01.containers.appdomain.cloud'
user     = 'xxxx'
password = 'xxxx'

# Set up the required connection
CPDAPI = Db2(Console)
api = '/v1'
CPDAPI.authenticate(api, user, password)
database = Console
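
The helper class hides the HTTP details of each call. As a rough illustration only, the sketch below builds (but does not send) a token-authenticated REST request with the Python standard library. It assumes the service accepts a bearer token in the `Authorization` header; the function name `build_api_request`, the host, the path `/v1/example`, and the token are all hypothetical and are not taken from the lab library.

```python
import json
import urllib.request


def build_api_request(base_url, path, token, payload=None):
    """Build (but do not send) a bearer-token request for a REST endpoint."""
    headers = {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/json",
    }
    # A JSON body turns the call into a POST; without one it is a GET.
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    method = "POST" if payload is not None else "GET"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers, method=method
    )


# Hypothetical host, path, and token, for illustration only.
request = build_api_request("https://cpd.example.com", "/v1/example", "my-token")
print(request.get_method(), request.full_url)
```

Separating request construction from sending makes it easy to inspect the URL and headers before anything reaches the cluster.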

## Utility Routines

#### Run SQL through the SQL Editor Service
You can also use the SQL Editor service to run your own SQL. Statements are submitted to the editor, and your code then polls the editor service until the script is complete. The Db2 class included in this lab wraps this in a single Python call: the **runScript** routine runs the SQL and the **displayResults** routine formats the returned JSON.

In [None]:
CPDAPI.displayResults(CPDAPI.runScript('SELECT * FROM MSSQL."Service_Order"; SELECT * FROM MSSQL."Shipment"'))
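
The submit-then-poll pattern described above can be sketched in a few lines. This is not the code inside **runScript**; it is a generic polling loop under the assumption that a status call returns nothing until the script has finished. The names `poll_until_done` and `get_result` are hypothetical, and the status call is replaced by a stand-in so the cell runs without a cluster.

```python
import time


def poll_until_done(get_result, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Call get_result() until it returns something other than None."""
    deadline = time.monotonic() + timeout
    while True:
        result = get_result()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("script did not finish within %.0f seconds" % timeout)
        sleep(interval)


# Stand-in for a status call: reports "not ready" twice, then a result.
responses = iter([None, None, {"status": "completed", "rows": 3}])
result = poll_until_done(lambda: next(responses), interval=0.01)
print(result["status"])
```

A timeout keeps a stalled script from blocking the notebook indefinitely.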

### Virtualized Tables and Views
The next two cells show all the virtualized data available to the admin user and the objects available by role.

In [None]:
### Display All Virtualized Tables and Views
display(CPDAPI.getVirtualizedTablesDF())
display(CPDAPI.getVirtualizedViewsDF())

In [None]:
r = CPDAPI.getSchemas()
if (CPDAPI.getStatusCode(r)==200):
    json = CPDAPI.getJSON(r)
    df = pd.DataFrame(json_normalize(json['resources']))
    display(df[['definertype','name']])
else:
    print(CPDAPI.getStatusCode(r))
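
The cell above relies on `json_normalize` to turn nested JSON into a flat table whose column names join nested keys with a dot. To make that step less opaque, here is a minimal standard-library sketch of the same idea. The `flatten` function and the sample payload are hypothetical; the real response from `getSchemas` may contain different fields.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dictionaries into one level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the key prefix along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical payload shaped like a 'resources' list.
payload = {
    "resources": [
        {"name": "MSSQL", "definertype": "U", "owner": {"id": "PETER"}},
    ]
}
rows = [flatten(item) for item in payload["resources"]]
print(rows[0])
```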

### Cloud Pak for Data User Management
The next two cells can be used to list existing CPD users and add a new user to the system.

In [None]:
# Get the list of CPD Users
r = CPDAPI.getUsers()
if (CPDAPI.getStatusCode(r)==200):
    json = CPDAPI.getJSON(r)
    df = pd.DataFrame(json_normalize(json))
    print(', '.join(list(df)))
    display(df[['uid','username','displayName']])
else:
    print(CPDAPI.getStatusCode(r))

### Add Users to CPD and Data Virtualization
The next cell adds a single user, **LABUSER**, to CPD. The cell after it adds that user to the Data Virtualization service.

In [None]:
# Add a Single user to CPD
username = "LABUSER"
displayName = "LABUSER"
email = "xxxx@ca.ibm.com"
user_roles = ["Data Scientist"]
password = 'tsdvlab'
r = CPDAPI.addUser(username, displayName, email, user_roles, password)
if (CPDAPI.getStatusCode(r)==201):
    print('User Added')
else:
    print(CPDAPI.getStatusCode(r))
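
If you need several lab users rather than one, the user definitions can be generated in a loop. The sketch below only builds the definitions; it does not call the service. The function `build_lab_users`, the numbering scheme, and the `example.com` domain are assumptions made for illustration.

```python
def build_lab_users(count, prefix="LABUSER", domain="example.com",
                    role="Data Scientist"):
    """Return a list of user definitions named PREFIX01, PREFIX02, ..."""
    users = []
    for i in range(1, count + 1):
        username = "%s%02d" % (prefix, i)
        users.append({
            "username": username,
            "displayName": username,
            "email": username.lower() + "@" + domain,
            "user_roles": [role],
        })
    return users


lab_users = build_lab_users(3)
for user in lab_users:
    print(user["username"], user["email"])
```

Each definition could then be passed to `CPDAPI.addUser(username, displayName, email, user_roles, password)` in the same way as the single-user cell above.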

In [None]:
# Add LABUSER to the DV Service

userList = {'UserRoot':['LABUSER'],'Role':['User']}
userListDF = pd.DataFrame(userList) 

df = CPDAPI.getUsersDF() # Get existing list of users to get the uid

for row in range(0, len(userListDF)):
    display_name = userListDF['UserRoot'].iloc[row]
    role = userListDF['Role'].iloc[row]
    print(display_name)

    r = CPDAPI.addUserToDV(display_name, role, df)
    if (CPDAPI.getStatusCode(r)==200):
        print('User: '+display_name+' added to Data Virtualization Service')
    else:
        print(CPDAPI.getStatusCode(r))

### Grant Data Engineers Access to Existing Views and Tables

In [None]:
# List the views owned by the logged-in user that will be granted to Data Engineers
ViewsDF = CPDAPI.getVirtualizedViewsDF()
# Exclude views in the COGNOS schema
ViewsDF = ViewsDF[ViewsDF.viewschema != 'COGNOS']
display(ViewsDF)

In [None]:
# Grant Access to Data Engineers to all the Views owned by the logged in user
# Remove specific views
ViewsDF = CPDAPI.getVirtualizedViewsDF()
ViewsDF = ViewsDF[ViewsDF.viewschema != 'COGNOS']

roleToGrant = 'DV_ENGINEER'
for index, row in ViewsDF.iterrows():
    name = row['viewname']
    schema = row['viewschema']

    r = CPDAPI.grantPrivledgeToRole(name, schema, roleToGrant)
    if (CPDAPI.getStatusCode(r)==200):
        print('Access granted')
    else:
        print(CPDAPI.getStatusCode(r))

In [None]:
# Grant Access to Data Engineers to all the Virtualized Tables owned by the logged in user
TablesDF = CPDAPI.getVirtualizedTablesDF()
# Optionally exclude a table by uncommenting the filter below
# TablesDF = TablesDF[TablesDF.table_name != 'STOCK_TRANSACTIONS_DV']

roleToGrant = 'DV_ENGINEER'
for index, row in TablesDF.iterrows():
    name = row['table_name']
    schema = row['table_schema']

    r = CPDAPI.grantPrivledgeToRole(name, schema, roleToGrant)
    if (CPDAPI.getStatusCode(r)==200):
        print('Access granted')
    else:
        print(CPDAPI.getStatusCode(r))
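
The grant loops above print one line per object without saying which object succeeded or failed. One way to make the outcome easier to review is to collect the results and report them at the end. The sketch below assumes a `grant` callable that returns an HTTP status code; `grant_to_all` and `fake_grant` are hypothetical stand-ins, not part of the lab library.

```python
def grant_to_all(objects, grant, ok_status=200):
    """Apply grant(name, schema) to each object and sort the outcomes."""
    granted, failed = [], []
    for schema, name in objects:
        status = grant(name, schema)
        if status == ok_status:
            granted.append((schema, name))
        else:
            failed.append((schema, name, status))
    return granted, failed


# Stand-in for the real grant call: refuses anything in the LOCKED schema.
def fake_grant(name, schema):
    return 200 if schema != "LOCKED" else 403


objects = [("MSSQL", "Service_Order"), ("MSSQL", "Shipment"), ("LOCKED", "SECRET")]
granted, failed = grant_to_all(objects, fake_grant)
print("Granted:", len(granted), "Failed:", failed)
```

With the lab library, the `grant` argument could wrap `CPDAPI.grantPrivledgeToRole(name, schema, roleToGrant)` and return `CPDAPI.getStatusCode(r)`.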

## Credits: IBM 2020, Peter Kohlmann [kohlmann@ca.ibm.com]