# Use custom software_spec to create statsmodels function describing data with `ibm-watson-machine-learning`

This notebook demonstrates how to deploy a Python function that uses `statsmodels` in the Watson Machine Learning service. Doing so requires creating a custom software specification from a conda YAML file that lists all required libraries.  
Some familiarity with bash is helpful. This notebook uses Python 3.10 with statsmodels.


## Learning goals

The learning goals of this notebook are:

-  Working with the Watson Machine Learning instance
-  Creating custom software specification
-  Online deployment of python function
-  Scoring data using deployed function

## Contents

This notebook contains the following parts:

1. [Setup](#setup)
2. [Function creation](#create)
3. [Function upload](#upload) 
4. [Web service creation](#deploy)
5. [Scoring](#score)
6. [Clean up](#cleanup)
7. [Summary and next steps](#summary)

<a id="setup"></a>
## 1. Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Contact your Cloud Pak for Data administrator and ask for your account credentials

### Connection to WML

Authenticate the Watson Machine Learning service on IBM Cloud Pak for Data. You need to provide the platform `url`, your `username` and your `api_key`.

In [None]:
username = 'PASTE YOUR USERNAME HERE'
api_key = 'PASTE YOUR API_KEY HERE'
url = 'PASTE THE PLATFORM URL HERE'

In [2]:
wml_credentials = {
    "username": username,
    "apikey": api_key,
    "url": url,
    "instance_id": 'openshift',
    "version": '4.8'
}

Alternatively, you can use `username` and `password` to authenticate with WML services.

```
wml_credentials = {
    "username": ***,
    "password": ***,
    "url": ***,
    "instance_id": 'openshift',
    "version": '4.8'
}

```
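To avoid pasting secrets into notebook cells, the same dictionary could be assembled from environment variables. The following is a minimal sketch, not part of the original notebook: the helper `build_credentials` and the variable names `WML_USERNAME`, `WML_APIKEY` and `WML_URL` are hypothetical and are not defined by the WML client.

```python
import os


def build_credentials(env=None):
    """Build a WML credentials dict from environment-style variables.

    The variable names below are hypothetical, chosen for this sketch only.
    """
    env = os.environ if env is None else env
    required = ("WML_USERNAME", "WML_APIKEY", "WML_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing settings: {', '.join(missing)}")
    return {
        "username": env["WML_USERNAME"],
        "apikey": env["WML_APIKEY"],
        "url": env["WML_URL"],
        "instance_id": "openshift",
        "version": "4.8",
    }


# Example with an in-memory mapping instead of the real environment.
example = build_credentials({
    "WML_USERNAME": "alice",
    "WML_APIKEY": "secret",
    "WML_URL": "https://cpd.example.com",
})
print(sorted(example))
```

Passing a plain mapping, as in the example, makes the helper easy to try without touching the real environment.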

### Install and import the `ibm-watson-machine-learning` package
**Note:** `ibm-watson-machine-learning` documentation can be found <a href="http://ibm-wml-api-pyclient.mybluemix.net/" target="_blank" rel="noopener noreferrer">here</a>.

In [None]:
!pip install -U ibm-watson-machine-learning

In [3]:
from ibm_watson_machine_learning import APIClient

client = APIClient(wml_credentials)

### Working with spaces

First, you need to create a space that will be used for your work. If you do not have a space yet, you can use `{PLATFORM_URL}/ml-runtime/spaces?context=icp4data` to create one.

- Click New Deployment Space
- Create an empty space
- Go to space `Settings` tab
- Copy `space_id` and paste it below

**Tip**: You can also use SDK to prepare the space for your work. More information can be found [here](https://github.com/IBM/watson-machine-learning-samples/blob/master/cpd4.7/notebooks/python_sdk/instance-management/Space%20management.ipynb).

**Action**: Assign space ID below

In [None]:
space_id = 'PASTE YOUR SPACE ID HERE'

You can use the `list` method to print existing spaces.

In [None]:
client.spaces.list(limit=10)

To be able to interact with all resources available in Watson Machine Learning, you need to set the **space** that you will be using.

In [5]:
client.set.default_space(space_id)

'SUCCESS'
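If you know a space by name rather than by ID, the ID could be looked up from a listing of spaces. The sketch below is illustrative only: `find_space_id` is a hypothetical helper, and the dictionary layout it expects is an assumption about what `client.spaces.get_details()` returns, so verify it against your own output.

```python
def find_space_id(spaces_details, name):
    """Return the ID of the first space whose name matches, or None.

    Assumes a listing shaped like
    {"resources": [{"metadata": {"id": ...}, "entity": {"name": ...}}, ...]}.
    This shape is an assumption, not something confirmed by this notebook.
    """
    for space in spaces_details.get("resources", []):
        if space.get("entity", {}).get("name") == name:
            return space.get("metadata", {}).get("id")
    return None


# Hand-written sample data standing in for a real response.
sample = {
    "resources": [
        {"metadata": {"id": "space-1"}, "entity": {"name": "sandbox"}},
        {"metadata": {"id": "space-2"}, "entity": {"name": "statsmodels demo"}},
    ]
}
print(find_space_id(sample, "statsmodels demo"))
```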

<a id="create"></a>
## 2. Create function

In this section you will learn how to create a deployable function
that uses the statsmodels module to compute a description of given data.  
**Hint**: To install statsmodels execute `!pip install statsmodels`.

#### Create a deployable callable that uses the statsmodels library

In [7]:
def deployable_callable():
    """
    Deployable python function with score
    function implemented.
    """
    try:
        from statsmodels.stats.descriptivestats import describe
    except ModuleNotFoundError as e:
        print(f"statsmodels not installed: {str(e)}")
        
    def score(payload):
        """
        Score method.
        """
        try:
            data = payload['input_data'][0]['values']
            return {
                'predictions': [
                    {'values': str(describe(data))}
                ]
            }
        except Exception as e:
            return {'predictions': [{'values': [repr(e)]}]}
        
    return score
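The callable above follows a closure pattern: the outer function performs imports and setup and returns an inner `score` function that handles each payload. The sketch below shows the same shape with only the standard library, using `statistics.mean` as a stand-in for statsmodels; the name `make_scorer` is hypothetical.

```python
def make_scorer():
    """Outer function: do setup, then return the per-request scorer."""
    import statistics  # imported inside so the function carries its own imports

    def score(payload):
        """Return the mean of each row in the first input_data entry."""
        try:
            rows = payload["input_data"][0]["values"]
            means = [statistics.mean(row) for row in rows]
            return {"predictions": [{"values": means}]}
        except Exception as e:
            # Same approach as the notebook: report the error in the response.
            return {"predictions": [{"values": [repr(e)]}]}

    return score


scorer = make_scorer()
result = scorer({"input_data": [{"values": [[1, 2, 3], [10, 20, 30]]}]})
print(result)
```

Keeping the import inside the outer function means the stored function does not rely on names defined elsewhere in the notebook.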

####  Test callable locally

**Hint**: To install numpy execute `!pip install numpy`.

In [8]:
import numpy as np

data = np.random.randn(10, 10)
data_description = deployable_callable()({
    "input_data": [{
        "values" : data
    }]
})

print(data_description["predictions"][0]["values"])

                          0          1          2          3          4  \
nobs              10.000000  10.000000  10.000000  10.000000  10.000000   
missing            0.000000   0.000000   0.000000   0.000000   0.000000   
mean              -0.087662  -0.162800   0.021847  -0.484764  -0.151347   
std_err            0.086628   0.082336   0.088078   0.154758   0.060011   
upper_ci           0.082126  -0.001424   0.194477  -0.181444  -0.033728   
lower_ci          -0.257450  -0.324175  -0.150783  -0.788085  -0.268967   
std                0.866281   0.823361   0.880783   1.547581   0.600109   
iqr                1.220332   0.774040   1.173032   1.866595   0.455900   
iqr_normal         0.904633   0.573797   0.869570   1.383709   0.337959   
mad                0.696300   0.556041   0.700227   1.257709   0.403634   
mad_normal         0.872682   0.696895   0.877605   1.576305   0.505881   
coef_var          -9.882045  -5.057510  40.315579  -3.192440  -3.965111   
range              2.5147
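A local call passes Python objects directly, whereas an online deployment receives its payload as JSON. As a rough sketch (the helper `is_json_safe` is hypothetical), you could check that a payload survives a JSON round trip before relying on it:

```python
import json


def is_json_safe(payload):
    """Return True if the payload serializes to JSON and reads back unchanged."""
    try:
        return json.loads(json.dumps(payload)) == payload
    except (TypeError, ValueError):
        return False


list_payload = {"input_data": [{"values": [[0.1, 0.2], [0.3, 0.4]]}]}
set_payload = {"input_data": [{"values": {0.1, 0.2}}]}  # sets are not JSON types

print(is_json_safe(list_payload))
print(is_json_safe(set_payload))
```

For a NumPy array, `data.tolist()` produces nested Python lists of the kind used in `list_payload`.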

<a id="upload"></a>
## 3. Upload python function

In this section you will learn how to upload the python function to the Cloud.

#### Custom software_specification
Create a new software specification based on the default Python 3.10 environment, extended with the statsmodels package.

In [9]:
config_yml =\
"""name: python38
channels:
  - empty
  - nodefaults
dependencies:
- pip:
  - statsmodels

prefix: /opt/anaconda3/envs/python38
"""

with open("config.yaml", "w", encoding="utf-8") as f:
    f.write(config_yml)
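When more than one pip package is needed, the YAML text could be generated instead of written by hand. This is a minimal sketch under assumptions: `render_conda_yaml` is a hypothetical helper, the environment name is arbitrary, and the layout simply mirrors the file written above.

```python
def render_conda_yaml(env_name, pip_packages, channels=("nodefaults",)):
    """Render a small conda environment file that installs packages via pip."""
    lines = [f"name: {env_name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines += ["dependencies:", "  - pip:"]
    lines += [f"    - {package}" for package in pip_packages]
    return "\n".join(lines) + "\n"


text = render_conda_yaml("statsmodels-env", ["statsmodels", "scipy"])
print(text)
```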

In [10]:
base_sw_spec_uid = client.software_specifications.get_uid_by_name("runtime-23.1-py3.10")

In [11]:
!cat config.yaml

name: python38
channels:
  - empty
  - nodefaults
dependencies:
- pip:
  - statsmodels

prefix: /opt/anaconda3/envs/python38


The `config.yaml` file describes the details of the package extension. Now you need to store the new package extension with APIClient.

In [12]:
meta_prop_pkg_extn = {
    client.package_extensions.ConfigurationMetaNames.NAME: "statsmodels env",
    client.package_extensions.ConfigurationMetaNames.DESCRIPTION: "Environment with statsmodels",
    client.package_extensions.ConfigurationMetaNames.TYPE: "conda_yml"
}

pkg_extn_details = client.package_extensions.store(meta_props=meta_prop_pkg_extn, file_path="config.yaml")
pkg_extn_uid = client.package_extensions.get_uid(pkg_extn_details)
pkg_extn_url = client.package_extensions.get_href(pkg_extn_details)

Creating package extensions
SUCCESS


#### Create a new software specification and add the created package extension to it.

In [13]:
meta_prop_sw_spec = {
    client.software_specifications.ConfigurationMetaNames.NAME: "statsmodels software_spec",
    client.software_specifications.ConfigurationMetaNames.DESCRIPTION: "Software specification for statsmodels",
    client.software_specifications.ConfigurationMetaNames.BASE_SOFTWARE_SPECIFICATION: {"guid": base_sw_spec_uid}
}

sw_spec_details = client.software_specifications.store(meta_props=meta_prop_sw_spec)
sw_spec_uid = client.software_specifications.get_uid(sw_spec_details)

client.software_specifications.add_package_extension(sw_spec_uid, pkg_extn_uid)

SUCCESS


'SUCCESS'

#### Get the details of created software specification

In [None]:
client.software_specifications.get_details(sw_spec_uid)

#### Store the function

In [14]:
meta_props = {
    client.repository.FunctionMetaNames.NAME: "statsmodels function",
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_UID: sw_spec_uid
}

function_details = client.repository.store_function(meta_props=meta_props, function=deployable_callable)
function_uid = client.repository.get_function_uid(function_details)

#### Get function details

In [15]:
client.repository.get_details(function_uid)

{'entity': {'software_spec': {'id': '62123911-898c-49be-87ae-e056067645d8',
   'name': 'statsmodels software_spec'},
  'type': 'python'},
 'metadata': {'created_at': '2021-02-03T13:02:50.513Z',
  'id': '77cc037c-6aab-4eac-9e45-c3331e010132',
  'modified_at': '2021-02-03T13:02:53.304Z',
  'name': 'statsmodels function',
  'owner': 'IBMid-55000091VC',
  'space_id': 'd70a423e-bab5-4b24-943a-3b0b29ad7527'},

**Note:** You can see that the function is successfully stored in the Watson Machine Learning service.

In [None]:
client.repository.list_functions()

<a id="deploy"></a>
## 4. Create online deployment
You can use the commands below to create an online deployment (web service) for the stored function.

#### Create online deployment of a python function

In [16]:
metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "Deployment of statsmodels function",
    client.deployments.ConfigurationMetaNames.ONLINE: {}
}

function_deployment = client.deployments.create(function_uid, meta_props=metadata)



#######################################################################################

Synchronous deployment creation for uid: '77cc037c-6aab-4eac-9e45-c3331e010132' started

#######################################################################################


initializing....................................................................................................................................................
ready


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='340eb7f7-4a08-4c9c-8371-f6cd907b7a3e'
------------------------------------------------------------------------------------------------




In [None]:
client.deployments.list()

Get deployment id.

In [17]:
deployment_id = client.deployments.get_uid(function_deployment)
print(deployment_id)

340eb7f7-4a08-4c9c-8371-f6cd907b7a3e
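Before scoring, it can be useful to confirm that the deployment has reached the `ready` state shown in the creation log. The sketch below works on a plain dictionary; `is_ready` is a hypothetical helper, and the location of the state field is an assumption about the output of `client.deployments.get_details(deployment_id)`.

```python
def is_ready(deployment_details):
    """Return True when the deployment reports the 'ready' state.

    Assumes the state lives at details["entity"]["status"]["state"];
    verify this layout against your own deployment details.
    """
    status = deployment_details.get("entity", {}).get("status", {})
    return status.get("state") == "ready"


# Hand-written samples standing in for real responses.
print(is_ready({"entity": {"status": {"state": "ready"}}}))
print(is_ready({"entity": {"status": {"state": "initializing"}}}))
```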


<a id="score"></a>
## 5. Scoring

You can send new scoring records to the web-service deployment using the `score` method.

In [18]:
scoring_payload = {
    "input_data": [{
        'values': data
    }]
}

In [19]:
predictions = client.deployments.score(deployment_id, scoring_payload)
print(predictions["predictions"][0]["values"])

                          0          1          2          3          4  \
nobs              10.000000  10.000000  10.000000  10.000000  10.000000   
missing            0.000000   0.000000   0.000000   0.000000   0.000000   
mean              -0.087662  -0.162800   0.021847  -0.484764  -0.151347   
std_err            0.086628   0.082336   0.088078   0.154758   0.060011   
upper_ci           0.082126  -0.001424   0.194477  -0.181444  -0.033728   
lower_ci          -0.257450  -0.324175  -0.150783  -0.788085  -0.268967   
std                0.866281   0.823361   0.880783   1.547581   0.600109   
iqr                1.220332   0.774040   1.173032   1.866595   0.455900   
iqr_normal         0.904633   0.573797   0.869570   1.383709   0.337959   
mad                0.696300   0.556041   0.700227   1.257709   0.403634   
mad_normal         0.872682   0.696895   0.877605   1.576305   0.505881   
coef_var          -9.882045  -5.057510  40.315579  -3.192440  -3.965111   
range              2.5147
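The callable in this notebook returns a string on success and a one-element list holding `repr(e)` on failure, so a caller can tell the two cases apart. The helper below is a hypothetical sketch built on that convention; a failure such as a `NameError` for `describe` would suggest that statsmodels is missing from the deployment environment.

```python
def extract_description(response):
    """Return the description string, or raise if the function reported an error.

    Relies on the convention of this notebook's callable: a string on
    success, a one-element list holding repr(exception) on failure.
    """
    values = response["predictions"][0]["values"]
    if isinstance(values, list):
        raise RuntimeError(f"Scoring failed remotely: {values[0]}")
    return values


# Hand-written samples standing in for real scoring responses.
ok = {"predictions": [{"values": "nobs 10.0\nmean 0.5"}]}
failed = {"predictions": [{"values": ["NameError('describe is not defined')"]}]}

print(extract_description(ok))
try:
    extract_description(failed)
except RuntimeError as err:
    print(err)
```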

<a id="cleanup"></a>
## 6. Clean up   

If you want to clean up all created assets:
- experiments
- trainings
- pipelines
- model definitions
- models
- functions
- deployments

please follow this sample [notebook](https://github.com/IBM/watson-machine-learning-samples/blob/master/cpd4.7/notebooks/python_sdk/instance-management/Machine%20Learning%20artifacts%20management.ipynb).
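For the four assets created in this notebook, a cleanup could look roughly like the sketch below. The helper `delete_created_assets` is hypothetical, and it assumes that your installed client version exposes these `delete` methods; the dry run uses a stand-in object, so nothing is deleted when you try it.

```python
from unittest.mock import MagicMock


def delete_created_assets(client, deployment_id, function_uid, sw_spec_uid, pkg_extn_uid):
    """Delete the assets created in this notebook, dependents first.

    Assumes the client exposes these delete methods; check the documentation
    of your installed ibm-watson-machine-learning version first.
    """
    client.deployments.delete(deployment_id)            # deployment uses the function
    client.repository.delete(function_uid)              # function uses the software spec
    client.software_specifications.delete(sw_spec_uid)  # spec references the extension
    client.package_extensions.delete(pkg_extn_uid)


# Dry run against a stand-in object instead of a live service.
fake_client = MagicMock()
delete_created_assets(fake_client, "dep-1", "fn-1", "spec-1", "ext-1")
print(fake_client.mock_calls)
```

With a real `client`, you would pass `deployment_id`, `function_uid`, `sw_spec_uid` and `pkg_extn_uid` from the earlier cells.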

<a id="summary"></a>
## 7. Summary and next steps     

 You successfully completed this notebook! You learned how to use Watson Machine Learning for function deployment and scoring with custom software_spec.  
 Check out our [Online Documentation](https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/welcome-main.html?context=analytics) for more samples, tutorials, documentation, how-tos, and blog posts. 

### Author

**Jan Sołtysik** Intern in Watson Machine Learning.

Copyright © 2020, 2021, 2022, 2023 IBM. This notebook and its source code are released under the terms of the MIT License.