In [None]:
# @hidden_cell
# The project token is an authorization token that is used to access project resources like data sources, connections, and used by platform APIs.
# Click on the toolbar icon with the Insert Project token option (three dots)



# Deployment of Functions to Watson ML
This notebook contains steps and code that demonstrate how to deploy Python functions to the Watson Machine Learning service. It uses the [ibm-watson-machine-learning](https://pypi.python.org/pypi/ibm-watson-machine-learning) library, which is available in the PyPI repository. It introduces commands for creating, updating, and deleting spaces, and for listing them and getting detailed information about them. It then publishes and deploys two Python functions: first a very simple Python function, then a function that wraps an NLP model and is deployed as a Python function in Watson ML.

This notebook uses Python 3.10.

## Contents

This notebook contains the following parts:


1.  [Set up the environment](#setup)
2.  [Create new space](#create_space)
3.  [List all existing spaces](#list_space)
4.  [Get details about space](#get_space)
5.  [Set default space](#set_space)
6.  [Deploy Python Function](#deploy_function)
7.  [Delete existing space](#delete_space)
8.  [Summary and next steps](#summary)


<a id="setup"></a>
## 1. Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Create a <a href="https://console.ng.bluemix.net/catalog/services/ibm-watson-machine-learning/" target="_blank" rel="noopener noreferrer">Watson Machine Learning (WML) Service</a> instance (a free plan is offered and information about how to create the instance can be found <a href="https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/ml-service-instance.html?context=analytics" target="_blank" rel="noopener noreferrer">here</a>).

### Connection to WML

Authenticate the Watson Machine Learning service on IBM Cloud. You need to provide platform `api_key` and instance `location`.

**Tip**: Your `Cloud API key` can be generated by going to the [**Users** section of the Cloud console](https://cloud.ibm.com/iam#/users). From that page, click your name, scroll down to the **API Keys** section, and click **Create an IBM Cloud API key**. Give your key a name and click **Create**, then copy the created key and paste it below. You can also get a service-specific URL by going to the [**Endpoint URLs** section of the Watson Machine Learning docs](https://cloud.ibm.com/apidocs/machine-learning). You can check your instance location in your <a href="https://console.ng.bluemix.net/catalog/services/ibm-watson-machine-learning/" target="_blank" rel="noopener noreferrer">Watson Machine Learning (WML) Service</a> instance details.

You can also get a service-specific API key by going to the [**Service IDs** section of the Cloud Console](https://cloud.ibm.com/iam/serviceids). From that page, click **Create**, then copy the created key and paste it below.

**Action**: Enter your `api_key` and `location` in the following cell.

In [None]:
api_key = 'PASTE THE CLOUD API KEY'
location = 'PASTE LOCATION'

In [None]:
wml_credentials = {
    "apikey": api_key,
    "url": 'https://' + location + '.ml.cloud.ibm.com'
}
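
The endpoint URL above is derived from the instance location. As a minimal sketch of the same construction, the hypothetical helper `build_wml_credentials` below (not part of the `ibm-watson-machine-learning` library) builds the dictionary and fails early if a placeholder was left unfilled. The location `us-south` is only an example value.

```python
def build_wml_credentials(api_key: str, location: str) -> dict:
    """Build the credentials dictionary from an API key and an instance location."""
    # Reject the placeholder strings used in this notebook so that a forgotten
    # edit fails here instead of at authentication time.
    for name, value in (("api_key", api_key), ("location", location)):
        if not value or value.startswith("PASTE"):
            raise ValueError(f"{name} has not been filled in")
    return {
        "apikey": api_key,
        "url": "https://" + location + ".ml.cloud.ibm.com",
    }


# "us-south" is an example location; use the one shown in your instance details.
example = build_wml_credentials("my-api-key", "us-south")
print(example["url"])
```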

### Install and import the `ibm-watson-machine-learning` package
**Note:** `ibm-watson-machine-learning` documentation can be found <a href="http://ibm-wml-api-pyclient.mybluemix.net/" target="_blank" rel="noopener noreferrer">here</a>.

In [None]:
from ibm_watson_machine_learning import APIClient
client = APIClient(wml_credentials)

<a id="create_space"></a>
## 2. Create new space

There are two ways to create a Watson Machine Learning deployment space. In this notebook, we will cover option **2. Programmatically using Python**.

**1. Through the menu system**
You need to create a space that will be used for your work. If you do not have a space already created, you can use the [Deployment Spaces Dashboard](https://dataplatform.cloud.ibm.com/ml-runtime/spaces?context=cpdaas) to create one.

- Click New Deployment Space
- Create an empty space
- Select Cloud Object Storage
- Select Watson Machine Learning instance and press Create
- Copy `space_id` and paste it below

**2. Programmatically using Python**

To do this, use the `ibm_watson_machine_learning` SDK to prepare the space for your work. The steps are described below.

The following information is required:

1. The `space_name`. This can be any name you would like to give the deployment space. The space is where all of your assets will be moved to from the Watson Studio project for the deployment.

2. The space metadata:

i. The Watson Machine Learning instance name (`wml_service_name`) and CRN (`wml_crn`).
You can get your WML instance `name` and `crn` by following the instructions from [Setup](#setup). 

ii. The `cos_resource_crn`, which is the Cloud Object Storage `crn`. 
You can get the Cloud Object Storage `crn` by following these steps:

- Go to [IBM Cloud website](https://cloud.ibm.com/)
- Choose storage from your Dashboard
- Select your cloud object storage
- Choose Service Credentials from the Menu on the left
- Create new credentials by clicking New Credentials, or open existing credentials with Writer privileges
- Copy `resource_instance_id` field and paste it below as `cos_resource_crn`

**Tip:** If you already have a space and you want to create a new one, you can get metadata required for space creation from your existing space details by running `client.spaces.get_details(your_space_id)`.

Next, you can create the space by running the following cells.

In [None]:
space_name = 'PASTE THE WATSON ML DEPLOYMENT SPACE NAME'
wml_service_name = 'PASTE THE WATSON ML SERVICE NAME'
wml_crn = 'PASTE THE WATSON ML CRN'
cos_resource_crn = 'PASTE THE COS RESOURCE CRN'
use_existing_space=True

The following code checks if a deployment space with the `space_name` specified above exists. If it does and `use_existing_space` is `True`, the existing deployment space is used. Otherwise, any existing space with that name is deleted and a new deployment space is created.

In [None]:
import time

space_uid = ""
for space in client.spaces.get_details()['resources']:
    if space['entity']['name'] == space_name:
        print("Deployment space with name", space_name, "already exists . .")
        space_uid = space['metadata']['id']
        client.set.default_space(space_uid)
        if not use_existing_space:
            # delete all deployments in the space first, then the space itself
            for deployment in client.deployments.get_details()['resources']:
                print("Deleting deployment", deployment['entity']['name'], "in the space", space_name)
                deployment_id = deployment['metadata']['id']
                client.deployments.delete(deployment_id)
            print("Deleting space", space_name)
            client.spaces.delete(space_uid)
            # give the service some time to finish the deletion
            time.sleep(10)
        else:
            print("Using the existing space")
            
            
if (space_uid=="" or use_existing_space==False):
    print("\nCreating a new deployment space -",space_name)
    # create the space and set it as default
    space_metadata = {
        'name': space_name,
        'description': 'This is a test deployment space',
        'storage': {
            'type': 'bmcos_object_storage',
            'resource_crn': cos_resource_crn
        },
        'compute': {
            'name': wml_service_name,
            'crn': wml_crn
        }
    }
    stored_space_details = client.spaces.store(space_metadata)

    space_uid = stored_space_details['metadata']['id']

    client.set.default_space(space_uid)
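
The lookup step in the cell above can be isolated into a small pure function, which makes it easier to test without calling the service. This is a minimal sketch: `find_space_id` is a hypothetical helper, and the sample dictionary is hand-written in the shape the loop above reads (`resources`, `entity.name`, `metadata.id`), not real service output.

```python
from typing import Optional


def find_space_id(spaces_details: dict, name: str) -> Optional[str]:
    """Return the ID of the first space whose name matches, or None."""
    for space in spaces_details.get("resources", []):
        if space["entity"]["name"] == name:
            return space["metadata"]["id"]
    return None


# Hand-written sample data, only for illustration.
sample = {
    "resources": [
        {"entity": {"name": "dev-space"}, "metadata": {"id": "id-1"}},
        {"entity": {"name": "prod-space"}, "metadata": {"id": "id-2"}},
    ]
}
print(find_space_id(sample, "prod-space"))
print(find_space_id(sample, "missing"))
```

With a live client, the call would look like `find_space_id(client.spaces.get_details(), space_name)`.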


In [None]:
# Details of the deployment spaces uid
print(space_uid)



**Tip**: To check that the space was created successfully, run the next cell. It should return 'active'.

In [None]:
client.spaces.get_details(space_uid)['entity']['status']['state']
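
If the space is not yet active, the check can be repeated until it is. The sketch below is one way to do that under stated assumptions: `wait_for_state` is a hypothetical helper, and the timeout and polling interval are arbitrary defaults, not values recommended by the service.

```python
import time
from typing import Callable


def wait_for_state(get_state: Callable[[], str], target: str = "active",
                   timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll get_state() until it returns target or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if get_state() == target:
            return True
        if time.monotonic() >= deadline:
            # Give up: the target state was not reached in time.
            return False
        time.sleep(interval)
```

With a live client, it could be called as `wait_for_state(lambda: client.spaces.get_details(space_uid)['entity']['status']['state'])`.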

**Action**: If you didn't create a new space in this notebook with `ibm_watson_machine_learning`, assign the ID of your existing space to `space_uid` before you continue.

<a id="list_space"></a>
## 3. List all existing spaces

You can use the `list` method to print all existing spaces.

In [None]:
client.spaces.list()

<a id="get_space"></a>
## 4. Get details about space

You can use the `get_details` method to print details about a given space. You need to provide the `space_id` of the desired space.

In [None]:
client.spaces.get_details(space_uid)

<a id="set_space"></a>
## 5. Set default space

To be able to interact with all resources available in Watson Machine Learning, you need to set the **space** that you will be using.

In [None]:
client.set.default_space(space_uid)

<a id="deploy_function"></a>
## 6. Deploy Python Function

## a. Simple Example

### i. Create the Python Function

In [None]:
#wml_python_function
def my_deployable_function():
    def score(payload):
        # read the two numbers to add from the first input_data entry
        a = payload['input_data'][0]['values'][0]
        b = payload['input_data'][0]['values'][1]
        # add them and convert the result to a string
        c = str(a + b)
        # format everything into a dictionary format WML can interact with
        score_output = {'predictions': [{'values': c}]}
        return score_output
    return score

In [None]:
#calling the function locally
func_result = my_deployable_function()({"input_data": [{"values": [4, 5]}]})
print(func_result)

### ii. Deploy and Test the Python Function

In [None]:
# Look up software specification for the deployable function

software_spec_uid = client.software_specifications.get_uid_by_name("runtime-22.2-py3.10")
software_spec_uid

In [None]:
# Store the deployable function in your Watson Machine Learning repository

meta_data = {
    client.repository.FunctionMetaNames.NAME: 'My Test Python Deployment Function',
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_UID: software_spec_uid
}

function_details = client.repository.store_function(meta_props=meta_data, function=my_deployable_function)

In [None]:
# Get published function ID

function_uid = client.repository.get_function_uid(function_details)

In [None]:
# Deploy the stored function

metadata = {
    client.deployments.ConfigurationMetaNames.NAME: 'My Test Deployment',
    client.deployments.ConfigurationMetaNames.ONLINE: {}
}

function_deployment_details = client.deployments.create(function_uid, meta_props=metadata)
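
To test the deployed function, the same payload used in the local call can be sent to the online deployment. The sketch below wraps that in a hypothetical helper, `score_sum`. It assumes the deployment details expose the ID under `metadata.id` and that `client.deployments.score` accepts the deployment ID and the payload, as used later in this notebook.

```python
def score_sum(client, deployment_details: dict, a, b) -> dict:
    """Send two numbers to the deployed function and return the raw response."""
    # The deployment ID identifies which online deployment to call.
    deployment_id = deployment_details["metadata"]["id"]
    # Same payload shape as the local test of my_deployable_function.
    payload = {"input_data": [{"values": [a, b]}]}
    return client.deployments.score(deployment_id, payload)
```

With a live client, `score_sum(client, function_deployment_details, 4, 5)` should return a dictionary with a `predictions` key, mirroring the local result.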

## b. SemanticSearch Example

In [None]:
%%capture
!pip install sentence-transformers

In [None]:
import pickle
import re
import json
import ast
import pandas as pd
from tqdm.notebook import tqdm
from sentence_transformers import SentenceTransformer, util

### i. Load Data, Model and Create Embeddings

### Import FAQ Question Answer Pairs

In [None]:
qa = json.load(project.get_file('data_for_train.json'))

In [None]:
len(qa)

In [None]:
questions = []
answers = []
for i in range(len(qa)):
    questions.append(qa[i]['Question'])
    answers.append(qa[i]['Answer'][0])

In [None]:
testing = pd.DataFrame({'questions': questions, 'answers': answers})

In [None]:
testing.head()

In [None]:
# model used in the rest of this notebook
model = SentenceTransformer('all-MiniLM-L6-v2')

# alternative models
# model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')
# model = SentenceTransformer('multi-qa-mpnet-base-dot-v1')
# model = SentenceTransformer('sentence-transformers/paraphrase-MiniLM-L6-v2')
# model = SentenceTransformer('clips/mfaq')

In [None]:
model

In [None]:
q_list = testing['questions'].astype(str).tolist()
corpus_embed = model.encode(q_list, convert_to_tensor=True, show_progress_bar=True)

In [None]:
corpus_embed


### ii. Create the Python Function

In [None]:
def faq_answer(question, corpus, q_list, nr, top_k=10, score=0.5):
    # embed the user's question with the same model that produced the corpus embeddings
    qe = model.encode(question)
    # pass top_k by keyword: the third positional parameter of semantic_search is query_chunk_size
    hits = util.semantic_search(qe, corpus, top_k=top_k)
    output = []
    for hit in hits[0]:
        # keep only the hits whose score is above the threshold
        if hit['score'] > score:
            d = {}
            d['question'] = q_list[hit['corpus_id']]
            d['score'] = hit['score']
            d['answer'] = nr.loc[nr['questions'] == q_list[hit['corpus_id']]]['answers'].to_string(index=False)
            output.append(d)
    return output

In [None]:
question = "I visited Sweden this week, it was very cold"
# question = "I watched TV last night, love watching the news"

In [None]:
# question (type: str) is the payload, i.e. the user's input
# corpus_embed (type: torch.Tensor) is the tensor of question embeddings
# testing (type: DataFrame) includes questions and answers
# q_list (type: list) is the list of questions (based on the questions in the testing DataFrame)
# top_k (type: int) is the number of top-scoring hits to return
# score (type: float) is the threshold; only hits scoring above it are returned
# output (type: list) is the list of matching questions with their scores and answers
output = faq_answer(question, corpus_embed, q_list, testing, top_k=10, score=0.0)
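
To make the ranking step concrete, the following is a pure-Python sketch of cosine-similarity ranking with a top-k cut and a score threshold. It is an illustration of the idea, not the `sentence_transformers` implementation; `cosine` and `top_k_hits` are hypothetical names, and the two-dimensional vectors are toy data.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def top_k_hits(query: Sequence[float], corpus: List[Sequence[float]],
               top_k: int = 10, min_score: float = 0.5) -> List[Dict]:
    """Rank corpus vectors by similarity to the query and keep the best matches."""
    scored = [{"corpus_id": i, "score": cosine(query, vec)}
              for i, vec in enumerate(corpus)]
    # Highest similarity first, then keep at most top_k entries above the threshold.
    scored.sort(key=lambda hit: hit["score"], reverse=True)
    return [hit for hit in scored[:top_k] if hit["score"] > min_score]


# Toy 2-dimensional "embeddings"; real sentence embeddings have hundreds of dimensions.
corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for hit in top_k_hits([1.0, 0.1], corpus, top_k=2, min_score=0.0):
    print(hit["corpus_id"], round(hit["score"], 3))
```

Each hit carries a `corpus_id`, which plays the same role as the index used to look up the matching question in `q_list`.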


### iii. Deploy and Test the Python Function

In [None]:
import pickle

In [None]:
# Save the corpus embeddings to the local directory
# This is not saved to the project directory, but to the ephemeral local storage of the Jupyter pod
# We only need access to this momentarily to promote the assets to the deployment space

with open(r"corpus_embed.p", "wb") as output_file:
    pickle.dump(corpus_embed, output_file)

In [None]:
testing.to_csv('testing.csv', index=False)

In [None]:
# Use the WML client API to move (promote) these assets to the deployment space 
asset_details_model = client.data_assets.create('corpus_embed.p', file_path='corpus_embed.p')
# use a separate name so the `testing` DataFrame is not overwritten
asset_details_testing = client.data_assets.create('testing.csv', file_path='testing.csv')

In [None]:
# Get the IDs of the promoted assets 
model_id = asset_details_model['metadata']['guid']
testing_id = asset_details_testing['metadata']['guid']

In [None]:
print('Model id:',model_id,'Testing id:',testing_id)

In [None]:
# Format everything to reside in the assets_dict dictionary object 
assets_dict = {'model_id' : model_id, 'testing_id': testing_id}

In [None]:
# Create the ai_parms dictionary that contains the WML credentials, space ID and the assets_dict dictionary
# This helps access all the information together later on

ai_parms = {"wml_credentials": wml_credentials, "space_uid": space_uid, 'assets' : assets_dict}
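
The deployable functions in this notebook follow a closure pattern: an outer function receives its settings through a default argument, performs the setup, and returns an inner `score` function that handles each payload. The sketch below illustrates that pattern in plain Python with a hypothetical `make_scorer` and a toy setting; it does not call Watson Machine Learning.

```python
def make_scorer(parms={"greeting": "hello"}):
    # Code here runs once, when the outer function is called.
    # Expensive setup (downloads, model loading) belongs here.
    greeting = parms["greeting"].upper()

    def score(payload):
        # Code here runs for every payload and reuses the setup above.
        rows = payload["input_data"][0]["values"]
        values = [[f"{greeting} {row[0]}"] for row in rows]
        return {"predictions": [{"fields": ["message"], "values": values}]}

    return score


scorer = make_scorer()
print(scorer({"input_data": [{"values": [["world"]]}]}))
```

Because the settings travel as a default argument, the returned function needs nothing from the notebook's global state at call time, which is the reason `ai_parms` bundles the credentials, space ID, and asset IDs together.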

In [None]:
# Create a definition of the deployment function
def faq_answer_wml(parms=ai_parms):
    import os
    os.environ["TOKENIZERS_PARALLELISM"] = "false"
    try:
        import subprocess
        subprocess.check_output("pip install sentence-transformers --user", stderr=subprocess.STDOUT, shell=True)
        subprocess.check_output("pip install spacy --user", stderr=subprocess.STDOUT, shell=True)
        subprocess.check_output("python -m spacy download en_core_web_md --user", stderr=subprocess.STDOUT, shell=True)
        subprocess.check_output("pip install autocorrect --user", stderr=subprocess.STDOUT, shell=True)
        
        # import all the necessary packages
        import pickle
        import re
        import json
        import ast
        import pandas as pd
        import spacy
        from autocorrect import Speller, word_regexes
        from tqdm.notebook import tqdm
        from sentence_transformers import SentenceTransformer, util
        from ibm_watson_machine_learning import APIClient

        # instantiate the WML client 
        client = APIClient(parms["wml_credentials"])
        client.set.default_space(parms["space_uid"])
            
        # get the path to the model and the necessary files locally 
        model_path = client.data_assets.download(parms['assets']['model_id'], 'corpus_embed.p')
        testing_path =  client.data_assets.download(parms['assets']['testing_id'], 'testing.csv')

        # read the files locally
        testing = pd.read_csv('testing.csv')

        # Initiate the model
        model = SentenceTransformer('all-MiniLM-L6-v2')

        q_list = testing['questions'].astype(str).tolist()         

        # load the model locally 
        with open(str(model_path), "rb") as input_file:
            corpus_embed = pickle.load(input_file)

        def get_faq_answer(question, corpus, q_list, nr, top_k=10, score=0.5):
            qe = model.encode(question)
            # pass top_k by keyword: the third positional parameter of semantic_search is query_chunk_size
            hits = util.semantic_search(qe, corpus, top_k=top_k)
            output = []
            for hit in hits[0]:
                # keep only the hits whose score is above the threshold
                if hit['score'] > score:
                    d = {}
                    d['question'] = q_list[hit['corpus_id']]
                    d['score'] = hit['score']
                    d['answer'] = nr.loc[nr['questions'] == q_list[hit['corpus_id']]]['answers'].to_string(index=False)
                    output.append(d)
            return output
            
    except subprocess.CalledProcessError as e:        
        install_err = "subprocess.CalledProcessError:\n\n" + "cmd:\n" + e.cmd + "\n\noutput:\n" + e.output.decode()
        raise Exception("Installing failed:\n" + install_err)
    

    
    # define the score function 
    def score(function_payload):
            
        try:
            
            # iterate over each question in the payload 
            all_output = []
            
            for input_values in function_payload["input_data"][0]["values"]:
                question = str(input_values[0])
                output = get_faq_answer(question, corpus_embed, q_list, testing, top_k=10, score=0.0)
                all_output.append(output)
            
            score_response = {
                "predictions": [
                    {
                        "fields": ["question"],
                        "values": [all_output]
                    }
                ]
            } 
            
            return score_response
    

        # if there is an exception 
        except Exception as e:

            # return the error 
            score_response = {
                "predictions": [
                    {
                        "fields": ["error"],
                        "values": [
                            [ e.__repr__() ]
                        ]
                    }
                ]
            } 
            return score_response

    # return the score function 
    return score

In [None]:
#calling the function locally

job_payload = {
    "input_data": [
        {
            "fields": ["question"],
            "values": [
                ["I went to London for a business meeting?"],
                ["I watched TV last night, love watching the news"]
            ]
        }
    ]
}

output = faq_answer_wml()(job_payload)
# faq_answer_wml()({"input_data": [{"values": ["I watched TV last night, love watching the news"]}]})
#faq_answer_wml()({"input_data": [{"values": ['I drive a fast car']}]})


In [None]:
output["predictions"][0]["values"][0][0]
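
The nested indexing above can be wrapped in a small helper that also surfaces the in-band error response produced by `score`. This is a minimal sketch: `best_answers` is a hypothetical name, and the sample response is hand-written in the shape that `score` returns, not real model output.

```python
def best_answers(response: dict) -> list:
    """Return the highest-ranked hit for each question, or None when there is none."""
    prediction = response["predictions"][0]
    if prediction["fields"] == ["error"]:
        # The score function reports failures in the response instead of raising.
        raise RuntimeError(prediction["values"][0][0])
    per_question = prediction["values"][0]
    # Ignore empty entries, if any, and take the first real hit per question.
    return [next((hit for hit in hits if hit), None) for hits in per_question]


# Hand-written sample: two questions, the second one without any usable hit.
sample_response = {
    "predictions": [{
        "fields": ["question"],
        "values": [[
            [{"question": "q1", "score": 0.9, "answer": "a1"}, {}],
            [{}],
        ]],
    }]
}
print(best_answers(sample_response))
```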

### iv. Store and Deploy the Python Function

In [None]:
# Look up software specification for the deployable function
software_spec_uid = client.software_specifications.get_uid_by_name("runtime-22.2-py3.10")
software_spec_uid

In [None]:
# Store the deployable function in your Watson Machine Learning repository
meta_data = {
    client.repository.FunctionMetaNames.NAME: 'NLP FAQ Model',
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_UID: software_spec_uid
}

function_details = client.repository.store_function(meta_props=meta_data, function=faq_answer_wml)

In [None]:
# Get published function ID
function_uid = client.repository.get_function_uid(function_details)

In [None]:
# Deploy the stored function

metadata = {
    client.deployments.ConfigurationMetaNames.NAME: 'NLP FAQ Model',
    client.deployments.ConfigurationMetaNames.ONLINE: {}
}

function_deployment_details = client.deployments.create(function_uid, meta_props=metadata)

<a id="summary"></a>
## 8. Summary and next steps

You successfully completed this notebook! You learned how to use the `ibm-watson-machine-learning` client for Watson Machine Learning space management and for deploying Python functions. Check out our _[Online Documentation](https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/welcome-main.html?context=analytics?pos=2)_ for more samples, tutorials, documentation, how-tos, and blog posts. 

In [None]:
deployment_id = function_deployment_details["metadata"]["id"]

In [None]:
job_payload = {
    "input_data": [
        {
            "fields": ["question"],
            "values": [
                ["I went to London for a business meeting?"],
            ]
        }
    ]
}

print(job_payload)

In [None]:
job_details = client.deployments.score(deployment_id, job_payload)
print(job_details)

In [None]:
job_details["predictions"][0]["values"][0][0]
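
The contents list a step for deleting an existing space. A minimal sketch of that clean-up follows, reusing only the client calls that already appear in the space-creation cell. `delete_space_with_deployments` is a hypothetical helper, and it assumes that all deployments must be removed before the space itself. Deleting a space is destructive, so run it only when the space is no longer needed.

```python
def delete_space_with_deployments(client, space_uid: str) -> list:
    """Delete every deployment in the space, then the space itself."""
    # Deployments are listed relative to the default space, so set it first.
    client.set.default_space(space_uid)
    deleted = []
    for deployment in client.deployments.get_details()["resources"]:
        deployment_id = deployment["metadata"]["id"]
        client.deployments.delete(deployment_id)
        deleted.append(deployment_id)
    client.spaces.delete(space_uid)
    # Return the IDs of the deployments that were removed.
    return deleted
```

With a live client, the call would be `delete_space_with_deployments(client, space_uid)`.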