# Categories Customization

In this notebook, we create a custom model for the categories feature of Watson Natural Language Understanding (NLU) using the Train API.

We will go through the following steps:
- How to create a training data file
- How to train a categories model with the NLU Train API
- How to get the status of the model
- How to use the trained model with the NLU Analyze API

To get started, you need an NLU instance and an API key.


## Add your IBM Cloud service credentials here.
- If you use IAM service credentials, leave 'username' set to 'apikey' and set 'password' to the value of your IAM API key.
- If you use pre-IAM service credentials, set the values to your 'username' and 'password'.

Also set 'url' to the URL for your service instance as provided in your service credentials.
See the following instructions for getting your own credentials: https://cloud.ibm.com/docs/watson?topic=watson-iam


In [8]:
username = 'apikey'
password = 'YOUR_IAM_APIKEY'
url = 'NLU_URI'

## Create Training Data File

Categories training data consists of labels and key phrases. Labels correspond to the names of the classes. Labels can be hierarchical: each entry provides a list of labels, and the order of the list represents the hierarchy, from the top-level category down.

Each label can be provided with one or more "key phrases", which are used to train the categories model. Key phrases should be unique per label and should represent the label. A key phrase can contain more than one word, for example "action movies".

In [9]:
training_data = [
    {
        "labels": ["Laptops"],
        "key_phrases": ["laptops"]
    },
    {
        "labels": ["Cell Phones"],
        "key_phrases": ["mobile phones", "android smart phones", "apple smart phones"]
    },
    {
        "labels": ["Video Games"],
        "key_phrases": ["video games"]
    },
    {
        "labels": ["Appliances"],
        "key_phrases": ["home appliance", "refrigerator", "laundry machine"]
    },
    {
        "labels": ["Cameras"],
        "key_phrases": ["cameras"]
    },
    {
        "labels": ["Cameras", "DSLR Cameras"],
        "key_phrases": ["Digital", "single-lens", "reflex camera"]
    },
    {
        "labels": ["Cameras", "DSLR Cameras", "Lenses"],
        "key_phrases": ["DSLR lens", "lenses"]
    },
    {
        "labels": ["Cameras", "Mirrorless Cameras"],
        "key_phrases": ["mirrorless", "interchangeable-lens", "camera"]
    }
    
]

# Save Training data in a file
import json

training_data_filename = 'training_data.json'

with open(training_data_filename, 'w', encoding='utf-8') as f:
    json.dump(training_data, f)
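
Before uploading, it can help to sanity-check the entries against the format described above. The following is a minimal sketch, not part of the NLU API: `label_path` and `check_training_data` are hypothetical helper names, and the check reads "unique per label" as "no phrase repeated within one entry, ignoring case", which is an assumption. The `/`-joined path form matches the labels shown in the sample analyze output in this notebook.

```python
def label_path(labels):
    """Join a hierarchical label list into a '/'-separated path.

    Assumption: this mirrors labels such as '/Cameras/DSLR Cameras/Lenses'
    seen in this notebook's sample analyze output.
    """
    return "/" + "/".join(labels)


def check_training_data(entries):
    """Return a list of problem descriptions; an empty list means none were found."""
    problems = []
    for i, entry in enumerate(entries):
        labels = entry.get("labels")
        phrases = entry.get("key_phrases")
        if not isinstance(labels, list) or not labels:
            problems.append(f"entry {i}: 'labels' must be a non-empty list")
            continue
        path = label_path(labels)
        if not isinstance(phrases, list) or not phrases:
            problems.append(f"entry {i} ({path}): 'key_phrases' must be a non-empty list")
            continue
        seen = set()
        for phrase in phrases:
            key = phrase.strip().lower()  # compare case-insensitively (assumption)
            if key in seen:
                problems.append(f"entry {i} ({path}): duplicate key phrase {phrase!r}")
            seen.add(key)
    return problems


# Small illustrative sample; the second entry repeats a phrase on purpose.
sample = [
    {"labels": ["Cameras", "DSLR Cameras", "Lenses"], "key_phrases": ["DSLR lens", "lenses"]},
    {"labels": ["Laptops"], "key_phrases": ["laptops", "Laptops"]},
]
sample_problems = check_training_data(sample)
print(label_path(sample[0]["labels"]))
print(sample_problems)
```

Calling `check_training_data(training_data)` on the list defined in this cell would run the same checks on the real training data before it is written to the file.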


## Train Categories model with NLU

In [12]:
import json
import ntpath
import requests
import sys
import time

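from requests.packages.urllib3.exceptions import InsecureRequestWarning

# The requests below pass verify=False, which skips TLS certificate verification.
# Silence the warning that urllib3 would otherwise print on every call.
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)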


######### Create parameters required for making a call to NLU ######### 
feature_to_train = 'categories'

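# Note: this header is not passed to requests.post below. When `files` is given,
# requests builds the multipart Content-Type (including the boundary) itself.
headers = {'Content-Type' : 'multipart/form-data'}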

data = {
    'name':'Categories Custom model #1',
    'language':'en',
    'version':'1.0.1'
}

params = {
    'version': '2021-02-15'
}

uri = url + '/v1/models/{}'.format(feature_to_train)


print('\nCreating custom model...')

training_data_filename = 'training_data.json'

######### Make a call to NLU to train the model ######### 
with open(training_data_filename, 'rb') as f:
    response = requests.post(uri,
                         params=params,
                         data=data,
                         files={'training_data': (ntpath.basename(training_data_filename), f, 'application/json')},
                         auth=(username, password),
                         verify=False,
                        )

######### Parse response from NLU ######### 
    
print('Model creation returned: ', response.status_code)

if response.status_code != 201:
    print('Failed to create model')
    print(response.text)
else:
    print('\nCustom model training started...')
    response_json = response.json()
    model_id = response_json['model_id']
    print('Custom Model ID: ', model_id)


Creating custom model...
Model creation returned:  201

Custom model training started...
Custom Model ID:  2ba30e90-b858-4b9d-b273-7b8d6e536416


## Retrieve custom categories model by ID

In [16]:
import requests

params = {
    'version': '2021-02-15'
}

uri = url + '/v1/models/categories/' + model_id

######### Make a call to NLU ######### 

# A GET request has no body, so no Content-Type header is needed here.
response = requests.get(uri, auth=(username, password), params=params, verify=False)

######### Parse response from NLU ######### 

print('Get model returned: ', response.status_code)

response_json = response.json()
print("Response from NLU:\n", json.dumps(response_json, indent=4, sort_keys=True))

Get model returned:  200
Response from NLU:
 {
    "created": "2021-02-12T09:58:05Z",
    "description": null,
    "features": [
        "categories"
    ],
    "language": "en",
    "last_deployed": "2021-02-12T10:01:07Z",
    "last_trained": "2021-02-12T09:58:05Z",
    "model_id": "2ba30e90-b858-4b9d-b273-7b8d6e536416",
    "model_version": "1.0.1",
    "name": "Categories Custom model #1",
    "status": "available",
    "user_metadata": null,
    "version": "1.0.1",
    "version_description": null,
    "workspace_id": null
}
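
Training takes a while, so the get request usually has to be repeated until the `status` field reads `available`. A minimal polling sketch follows. `wait_for_status` is a hypothetical helper, not part of the NLU API; it takes any function that returns the current status string, so the loop can be tried without a network call. The `"pending"` value in the demo is a placeholder, not a documented NLU status, and the default interval and attempt count are arbitrary.

```python
import time


def wait_for_status(fetch_status, target="available",
                    interval_seconds=30.0, max_attempts=20, sleep=time.sleep):
    """Call fetch_status() until it returns `target` or attempts run out.

    Returns the last status seen, so the caller can tell success from timeout.
    """
    status = None
    for attempt in range(max_attempts):
        status = fetch_status()
        if status == target:
            break
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before asking again
    return status


# Demo with a stub instead of a real request ("pending" is a placeholder value).
fake_statuses = iter(["pending", "pending", "available"])
final_status = wait_for_status(lambda: next(fake_statuses),
                               interval_seconds=0, sleep=lambda seconds: None)
print(final_status)
```

With a live service, `fetch_status` could wrap the get request from the cell above and return `response.json()['status']`. A production loop would probably also stop early if the service reports a failure status; check the NLU documentation for the exact values.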


## Use the model using NLU Analyze API

Once training completes, the `status` returned by the get request above changes to `available`. You can then make an analyze request using the `model_id`.

In [37]:
######### Create request #########

analyze_request_data = {
        "text":"I use Nikon NIKKOR 10-20mm lens for landscape pictures.",
        "language": "en",
        "features": {
            "categories": {
                "model": model_id
            }
        }
}

uri = url + '/v1/analyze'

params = {
    'version': '2021-02-15'
}

headers = {'Content-Type' : 'application/json'}

######### Make a call to NLU #########

response = requests.post(uri,
                         params=params,
                         json=analyze_request_data,
                         headers=headers,
                         auth=(username, password),
                         verify=False,
                        )

if response.status_code != 200:
    print('Failed to make request to model. Reason:')
    print(response.text)

else:
    response_json = response.json()

    print("Successfully analyzed request. Response from NLU:\n")
    print(json.dumps(response_json, indent=4, sort_keys=True))

Successfully analyzed request. Response from NLU:

{
    "categories": [
        {
            "label": "/Cameras/DSLR Cameras/Lenses",
            "score": 0.999968
        },
        {
            "label": "/Cameras/Mirrorless Cameras",
            "score": 0.994649
        },
        {
            "label": "/Cell Phones",
            "score": 0.001869
        }
    ],
    "language": "en",
    "usage": {
        "features": 0,
        "text_characters": 55,
        "text_units": 1
    }
}
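
The response lists every candidate category with a score, including very unlikely ones such as `/Cell Phones` above. One way to post-process it is to keep only the categories above a cutoff. This is a sketch under assumptions: `confident_categories` is a hypothetical helper, and the `0.5` threshold is an arbitrary illustration, not a value recommended by NLU. The sample dictionary copies the `categories` part of the output shown above.

```python
def confident_categories(analyze_response, min_score=0.5):
    """Return categories scoring at least min_score, highest score first."""
    categories = analyze_response.get("categories", [])
    kept = [c for c in categories if c["score"] >= min_score]
    return sorted(kept, key=lambda c: c["score"], reverse=True)


# The 'categories' part of the sample analyze output from this notebook.
sample_response = {
    "categories": [
        {"label": "/Cameras/DSLR Cameras/Lenses", "score": 0.999968},
        {"label": "/Cameras/Mirrorless Cameras", "score": 0.994649},
        {"label": "/Cell Phones", "score": 0.001869},
    ]
}

top = confident_categories(sample_response)
for category in top:
    print(f"{category['score']:.4f}  {category['label']}")
```

In the notebook, the same function could be applied to `response_json` from the analyze cell.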
