# Use a custom software_spec to create a statsmodels function describing data with `ibm-watson-machine-learning`

This notebook demonstrates how to deploy a Python function to the Watson Machine Learning service. The function depends on `statsmodels`, so the deployment needs a custom software specification built from a conda YAML file that lists all required libraries.  
Some familiarity with bash is helpful. This notebook uses Python 3.10 with statsmodels.


## Learning goals

The learning goals of this notebook are:

-  Working with the Watson Machine Learning instance
-  Creating custom software specification
-  Online deployment of a Python function
-  Scoring data using the deployed function

## Contents

This notebook contains the following parts:

1. [Setup](#setup)
2. [Function creation](#create)
3. [Function upload](#upload)
4. [Web service creation](#deploy)
5. [Scoring](#score)
6. [Clean up](#cleanup)
7. [Summary and next steps](#summary)

<a id="setup"></a>
## 1. Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Create a <a href="https://console.ng.bluemix.net/catalog/services/ibm-watson-machine-learning/" target="_blank" rel="noopener no referrer">Watson Machine Learning (WML) Service</a> instance (a free plan is offered and information about how to create the instance can be found <a href=" https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/ml-service-instance.html?context=analytics" target="_blank" rel="noopener no referrer">here</a>).

### Connection to WML

Authenticate the Watson Machine Learning service on IBM Cloud. You need to provide platform `api_key` and instance `location`.

You can use the [IBM Cloud CLI](https://cloud.ibm.com/docs/cli/index.html) to retrieve the platform API key and the instance location.

An API key can be generated as follows:
```
ibmcloud login
ibmcloud iam api-key-create API_KEY_NAME
```

Copy the value of `api_key` from the command output.


The location of your WML instance can be retrieved as follows:
```
ibmcloud login --apikey API_KEY -a https://cloud.ibm.com
ibmcloud resource service-instance WML_INSTANCE_NAME
```

Copy the value of `location` from the command output.

**Tip**: Your `Cloud API key` can be generated by going to the [**Users** section of the Cloud console](https://cloud.ibm.com/iam#/users). From that page, click your name, scroll down to the **API Keys** section, and click **Create an IBM Cloud API key**. Give your key a name and click **Create**, then copy the created key and paste it below. You can also get a service-specific URL by going to the [**Endpoint URLs** section of the Watson Machine Learning docs](https://cloud.ibm.com/apidocs/machine-learning). You can check your instance location in your <a href="https://console.ng.bluemix.net/catalog/services/ibm-watson-machine-learning/" target="_blank" rel="noopener no referrer">Watson Machine Learning (WML) Service</a> instance details.

You can also get a service-specific API key by going to the [**Service IDs** section of the Cloud Console](https://cloud.ibm.com/iam/serviceids). From that page, click **Create**, then copy the created key and paste it below.

**Action**: Enter your `api_key` and `location` in the following cell.

In [1]:
api_key = 'PASTE YOUR PLATFORM API KEY HERE'
location = 'PASTE YOUR INSTANCE LOCATION HERE'

In [3]:
wml_credentials = {
    "apikey": api_key,
    "url": 'https://' + location + '.ml.cloud.ibm.com'
}
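
Pasting a key directly into a notebook cell makes it easy to leak when the notebook is shared. As a minimal sketch, the same `wml_credentials` dictionary could be built from environment variables instead. The variable names `WML_API_KEY` and `WML_LOCATION` and the helper `load_wml_credentials` are assumptions made for this example; they are not part of the SDK.

```python
import os


def load_wml_credentials(env=None):
    """Build the credentials dictionary from environment variables.

    WML_API_KEY and WML_LOCATION are hypothetical variable names
    chosen for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("WML_API_KEY", "")
    location = env.get("WML_LOCATION", "")
    if not api_key or not location:
        raise ValueError("Set WML_API_KEY and WML_LOCATION before running the notebook.")
    # Same URL pattern as the cell above: https://<location>.ml.cloud.ibm.com
    return {
        "apikey": api_key,
        "url": f"https://{location}.ml.cloud.ibm.com",
    }
```

The returned dictionary has the same shape as `wml_credentials` above, so it can be used in its place.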

### Install and import the `ibm-watson-machine-learning` package
**Note:** `ibm-watson-machine-learning` documentation can be found <a href="https://ibm.github.io/watson-machine-learning-sdk//" target="_blank" rel="noopener no referrer">here</a>.

In [None]:
!pip install -U ibm-watson-machine-learning

In [2]:
from ibm_watson_machine_learning import APIClient

client = APIClient(wml_credentials)

### Working with spaces

First, create a space that will be used for your work. If you do not already have a space, you can use the [Deployment Spaces Dashboard](https://dataplatform.cloud.ibm.com/ml-runtime/spaces?context=cpdaas) to create one.

- Click New Deployment Space
- Create an empty space
- Select Cloud Object Storage
- Select Watson Machine Learning instance and press Create
- Copy `space_id` and paste it below

**Tip**: You can also use the SDK to prepare the space for your work. More information can be found [here](https://github.com/IBM/watson-machine-learning-samples/blob/master/cloud/notebooks/python_sdk/instance-management/Space%20management.ipynb).

**Action**: Assign the space ID below.

In [4]:
space_id = 'PASTE YOUR SPACE ID HERE'

You can use the `list` method to print all existing spaces.

In [None]:
client.spaces.list(limit=10)

To interact with the resources available in Watson Machine Learning, you need to set the **space** that you will be using.

In [3]:
client.set.default_space(space_id)

'SUCCESS'

<a id="create"></a>
## 2. Create function

In this section you will learn how to create a deployable function
that uses the statsmodels module to compute a description of the given data.  
**Hint**: To install statsmodels execute `!pip install statsmodels`.

#### Create a deployable callable which uses the statsmodels library

In [4]:
def deployable_callable():
    """
    Deployable python function with score
    function implemented.
    """
    try:
        from statsmodels.stats.descriptivestats import describe
    except ModuleNotFoundError as e:
        print(f"statsmodels not installed: {str(e)}")
        
    def score(payload):
        """
        Score method.
        """
        try:
            data = payload['input_data'][0]['values']
            return {
                'predictions': [
                    {'values': str(describe(data))}
                ]
            }
        except Exception as e:
            return {'predictions': [{'values': [repr(e)]}]}
        
    return score
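
The callable above follows a closure pattern: the outer function is expected to run once when the deployment loads it, and the inner `score` function handles each request. The sketch below shows the same pattern using only the standard library, with `statistics.mean` as a stand-in for `describe`. The name `make_scorer` and the column-mean logic are illustrative assumptions, not part of the original function.

```python
import statistics


def make_scorer():
    """Hypothetical stand-in for deployable_callable, standard library only."""
    # Outer scope: set up anything that should be prepared once.
    summarize = statistics.mean  # stand-in for statsmodels' describe

    def score(payload):
        """Return per-column means in the same response shape as above."""
        try:
            rows = payload["input_data"][0]["values"]
            columns = list(zip(*rows))  # transpose rows into columns
            return {"predictions": [{"values": [summarize(col) for col in columns]}]}
        except Exception as e:
            # Same error convention as the original: report the exception text.
            return {"predictions": [{"values": [repr(e)]}]}

    return score


score = make_scorer()
result = score({"input_data": [{"values": [[1.0, 2.0], [3.0, 4.0]]}]})
print(result)
```

Returning errors inside the `predictions` structure, as both versions do, keeps the response shape stable, so a caller can always read `predictions[0]["values"]`.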

#### Test the callable locally

**Hint**: To install numpy execute `!pip install numpy`.

In [5]:
import numpy as np

data = np.random.randn(10, 10)
data_description = deployable_callable()({
    "input_data": [{
        "values" : data
    }]
})

print(data_description["predictions"][0]["values"])

                          0          1          2          3          4  \
nobs              10.000000  10.000000  10.000000  10.000000  10.000000   
missing            0.000000   0.000000   0.000000   0.000000   0.000000   
mean               0.226671   0.118863   0.296608   0.018030  -0.059889   
std_err            0.078214   0.109682   0.077958   0.069102   0.099104   
upper_ci           0.379969   0.333837   0.449404   0.153467   0.134352   
lower_ci           0.073374  -0.096110   0.143813  -0.117407  -0.254130   
std                0.782143   1.096823   0.779584   0.691019   0.991042   
iqr                1.138703   0.979375   1.111591   0.625231   1.315194   
iqr_normal         0.844122   0.726012   0.824023   0.463485   0.974954   
mad                0.623878   0.774813   0.618962   0.513830   0.799944   
mad_normal         0.781915   0.971084   0.775754   0.643991   1.002581   
coef_var           3.450557   9.227584   2.628330  38.326234 -16.547990   
range              2.2899

<a id="upload"></a>
## 3. Upload python function

In this section you will learn how to upload the Python function to the cloud.

#### Custom software_specification
Create a new software specification based on the default Python 3.10 environment, extended with the statsmodels package.

In [1]:
config_yml =\
"""
name: python310
channels:
  - conda-forge
  - nodefaults
dependencies:
  - statsmodels
prefix: /opt/anaconda3/envs/python310
"""

with open("config.yaml", "w", encoding="utf-8") as f:
    f.write(config_yml)
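
If the list of required libraries changes often, the YAML text can be generated rather than edited by hand. The following is a minimal sketch that reproduces only the `name`, `channels`, and `dependencies` keys of the file above; the helper `render_conda_yaml` is a hypothetical name introduced for this example.

```python
def render_conda_yaml(env_name, packages, channels=("conda-forge", "nodefaults")):
    """Render a small conda environment file as text (illustrative helper)."""
    lines = [f"name: {env_name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {package}" for package in packages]
    return "\n".join(lines) + "\n"


yaml_text = render_conda_yaml("python310", ["statsmodels"])
print(yaml_text)
```

A pinned entry can be passed in the same way, for example `"statsmodels=0.13.5"` (an illustrative version), which helps keep the deployed environment reproducible.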

In [7]:
base_sw_spec_uid = client.software_specifications.get_uid_by_name("runtime-22.2-py3.10")

In [8]:
!cat config.yaml

name: python310
channels:
  - conda-forge
  - nodefaults
dependencies:
  - statsmodels
prefix: /opt/anaconda3/envs/python310


The `config.yaml` file describes the package extension. Now store the new package extension with `APIClient`.

In [9]:
meta_prop_pkg_extn = {
    client.package_extensions.ConfigurationMetaNames.NAME: "statsmodels env",
    client.package_extensions.ConfigurationMetaNames.DESCRIPTION: "Environment with statsmodels",
    client.package_extensions.ConfigurationMetaNames.TYPE: "conda_yml"
}

pkg_extn_details = client.package_extensions.store(meta_props=meta_prop_pkg_extn, file_path="config.yaml")
pkg_extn_uid = client.package_extensions.get_uid(pkg_extn_details)
pkg_extn_url = client.package_extensions.get_href(pkg_extn_details)

Creating package extensions
SUCCESS


#### Create a new software specification and add the created package extension to it

In [10]:
meta_prop_sw_spec = {
    client.software_specifications.ConfigurationMetaNames.NAME: "statsmodels software_spec",
    client.software_specifications.ConfigurationMetaNames.DESCRIPTION: "Software specification for statsmodels",
    client.software_specifications.ConfigurationMetaNames.BASE_SOFTWARE_SPECIFICATION: {"guid": base_sw_spec_uid}
}

sw_spec_details = client.software_specifications.store(meta_props=meta_prop_sw_spec)
sw_spec_uid = client.software_specifications.get_uid(sw_spec_details)

client.software_specifications.add_package_extension(sw_spec_uid, pkg_extn_uid)

SUCCESS


'SUCCESS'

#### Get the details of the created software specification

In [11]:
client.software_specifications.get_details(sw_spec_uid)

{'metadata': {'name': 'statsmodels software_spec',
  'asset_id': 'e81c0d6d-45cf-43d6-8047-d9858f0c74f8',
  'href': '/v2/software_specifications/e81c0d6d-45cf-43d6-8047-d9858f0c74f8',
  'asset_type': 'software_specification',
  'created_at': '2022-04-12T09:33:32Z'},
 'entity': {'software_specification': {'type': 'derived',
   'display_name': 'statsmodels software_spec',
   'base_software_specification': {'guid': '12b83a17-24d8-5082-900f-0ab31fbfd3cb',
    'href': '/v2/software_specifications/12b83a17-24d8-5082-900f-0ab31fbfd3cb'},
   'package_extensions': [{'metadata': {'space_id': '680a7515-620c-461f-9c6f-1f4c535bfc47',
      'usage': {'last_updated_at': '2022-04-12T09:33:31Z',
       'last_updater_id': 'IBMid-55000091VC',
       'last_update_time': 1649756011121,
       'last_accessed_at': '2022-04-12T09:33:31Z',
       'last_access_time': 1649756011121,
       'last_accessor_id': 'IBMid-55000091VC',
       'access_count': 0},
      'rov': {'mode': 0,
       'collaborator_ids': {},
  

#### Store the function

In [12]:
meta_props = {
    client.repository.FunctionMetaNames.NAME: "statsmodels function",
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_UID: sw_spec_uid
}

function_details = client.repository.store_function(meta_props=meta_props, function=deployable_callable)
function_id = client.repository.get_function_id(function_details)

#### Get function details

In [14]:
client.repository.get_details(function_id)

{'entity': {'software_spec': {'id': 'e81c0d6d-45cf-43d6-8047-d9858f0c74f8',
   'name': 'statsmodels software_spec'},
  'type': 'python'},
 'metadata': {'created_at': '2022-04-12T09:33:47.753Z',
  'id': '1211c5af-3f0e-475a-927c-30114868bafa',
  'modified_at': '2022-04-12T09:33:52.512Z',
  'name': 'statsmodels function',
  'owner': 'IBMid-55000091VC',
  'space_id': '680a7515-620c-461f-9c6f-1f4c535bfc47'},

**Note:** You can see that the function is successfully stored in the Watson Machine Learning service.

In [None]:
client.repository.list_functions()

<a id="deploy"></a>
## 4. Create online deployment
You can use the commands below to create an online deployment (web service) for the stored function.

#### Create an online deployment of a Python function

In [15]:
metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "Deployment of statsmodels function",
    client.deployments.ConfigurationMetaNames.ONLINE: {}
}

function_deployment = client.deployments.create(function_id, meta_props=metadata)



#######################################################################################

Synchronous deployment creation for uid: '1211c5af-3f0e-475a-927c-30114868bafa' started

#######################################################################################


initializing
Note: online_url is deprecated and will be removed in a future release. Use serving_urls instead.
.........
ready


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='9b723c1c-5d8f-4f1f-bc4e-bf7d582a4ba6'
------------------------------------------------------------------------------------------------




In [None]:
client.deployments.list()

Get the deployment ID.

In [16]:
deployment_id = client.deployments.get_id(function_deployment)
print(deployment_id)

9b723c1c-5d8f-4f1f-bc4e-bf7d582a4ba6


<a id="score"></a>
## 5. Scoring

You can send new scoring records to the web service deployment using the `score` method.

In [17]:
scoring_payload = {
    "input_data": [{
        'values': data
    }]
}

In [18]:
predictions = client.deployments.score(deployment_id, scoring_payload)
print(predictions["predictions"][0]["values"])

                          0          1          2          3          4  \
nobs              10.000000  10.000000  10.000000  10.000000  10.000000   
missing            0.000000   0.000000   0.000000   0.000000   0.000000   
mean               0.226671   0.118863   0.296608   0.018030  -0.059889   
std_err            0.078214   0.109682   0.077958   0.069102   0.099104   
upper_ci           0.379969   0.333837   0.449404   0.153467   0.134352   
lower_ci           0.073374  -0.096110   0.143813  -0.117407  -0.254130   
std                0.782143   1.096823   0.779584   0.691019   0.991042   
iqr                1.138703   0.979375   1.111591   0.625231   1.315194   
iqr_normal         0.844122   0.726012   0.824023   0.463485   0.974954   
mad                0.623878   0.774813   0.618962   0.513830   0.799944   
mad_normal         0.781915   0.971084   0.775754   0.643991   1.002581   
coef_var           3.450557   9.227584   2.628330  38.326234 -16.547990   
range              2.2899
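
The request and response shapes used in this section can be sketched without calling the service. The example below builds a payload from plain Python lists, checks that it can be serialized to JSON, and reads the result out of a response-shaped dictionary. The helpers `build_scoring_payload` and `extract_values` are hypothetical names, and `fake_response` is made up to mirror the structure returned by the function in this notebook.

```python
import json
import random


def build_scoring_payload(rows):
    """Wrap rows of numbers in the input_data structure used above."""
    payload = {"input_data": [{"values": [[float(x) for x in row] for row in rows]}]}
    json.dumps(payload)  # raises TypeError if something is not JSON-serializable
    return payload


def extract_values(response):
    """Read the values of the first prediction from a scoring response."""
    return response["predictions"][0]["values"]


rows = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]
payload = build_scoring_payload(rows)

# Made-up response with the same shape the deployed function returns.
fake_response = {"predictions": [{"values": "summary text"}]}
print(extract_values(fake_response))
```

Converting values to plain floats before sending is a cautious choice: it does not rely on the client library converting array types for you.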

<a id="cleanup"></a>
## 6. Clean up   

If you want to clean up all created assets:
- experiments
- trainings
- pipelines
- model definitions
- models
- functions
- deployments

see the steps in this sample [notebook](https://github.com/IBM/watson-machine-learning-samples/blob/master/cloud/notebooks/python_sdk/instance-management/Machine%20Learning%20artifacts%20management.ipynb).
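
For the assets created in this notebook specifically, a clean-up could look like the sketch below, which deletes them in reverse order of creation. This is an assumption-laden example: it assumes `delete` methods exist on `client.deployments`, `client.repository`, `client.software_specifications`, and `client.package_extensions` in your installed SDK version, and the helper `delete_created_assets` is a hypothetical name. Check the SDK documentation before relying on it.

```python
def delete_created_assets(client, deployment_id, function_id, sw_spec_uid, pkg_extn_uid):
    """Try to delete each asset, newest first, and report what happened."""
    steps = [
        ("deployment", lambda: client.deployments.delete(deployment_id)),
        ("function", lambda: client.repository.delete(function_id)),
        ("software_specification", lambda: client.software_specifications.delete(sw_spec_uid)),
        ("package_extension", lambda: client.package_extensions.delete(pkg_extn_uid)),
    ]
    results = {}
    for name, action in steps:
        try:
            action()
            results[name] = "deleted"
        except Exception as exc:
            # Keep going so one failure does not block the remaining deletions.
            results[name] = f"failed: {exc!r}"
    return results


# Usage with the variables defined earlier in this notebook:
# delete_created_assets(client, deployment_id, function_id, sw_spec_uid, pkg_extn_uid)
```

Deleting the deployment first matters because the function it serves and the software specification it runs on are still in use while the deployment exists.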

<a id="summary"></a>
## 7. Summary and next steps     

You successfully completed this notebook! You learned how to use Watson Machine Learning to deploy and score a Python function with a custom software specification.  
Check out our [Online Documentation](https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/welcome-main.html?context=analytics) for more samples, tutorials, documentation, how-tos, and blog posts.

### Author

**Jan Sołtysik** Intern in Watson Machine Learning.

Copyright © 2020, 2021, 2022 IBM. This notebook and its source code are released under the terms of the MIT License.