# Classifications

In this notebook, we create a custom model with the Classifications feature of IBM Watson Natural Language Understanding (NLU), using the Python SDK for IBM Cloud. If you need to install the Python SDK, please visit [https://github.com/watson-developer-cloud/python-sdk](https://github.com/watson-developer-cloud/python-sdk).

We will go through the following functionalities:
- [How to Create Training Data](#How-to-Create-Training-Data)
   - [Create a JSON File from Data](#2.-Create-JSON-file-from-Data)
   - [Convert Data from NLC to NLU Format](#Convert-Data-from-NLC-to-NLU-Format-(Optional-to-Run))
   - [Fetch Data From NLC](#Fetch-Data-From-NLC-(Optional-to-Run))
- [How to Train a NLU Classifications Model](#3.-How-to-Train-a-NLU-Classifications-Model)
- [How to Get Status of a NLU Classifications Model](#4.-How-to-Get-Status-of-a-NLU-Classifications-Model)
- [How to Use a Trained NLU Classifications Model for Analysis](#5.-How-to-Use-a-Trained-NLU-Classifications-Model-for-Analysis)
- [How to Delete a NLU Classifications Model](#6.-How-to-Delete-a-NLU-Classifications-Model)

To start, we will need an NLU instance, which provides (among other things) the necessary API key. To provision an instance of NLU, visit: [https://cloud.ibm.com/catalog/services/natural-language-understanding](https://cloud.ibm.com/catalog/services/natural-language-understanding).


## 1. Add Your NLU Service Credentials

See the following for authenticating to Watson services: [https://cloud.ibm.com/docs/watson?topic=watson-iam](https://cloud.ibm.com/docs/watson?topic=watson-iam). It will suffice to use the service credentials that were auto-generated when you provisioned the NLU service.


In [1]:
# Add your NLU credentials here
api_key = 'YOUR_API_KEY'
url = 'YOUR_URL'

from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

auth = IAMAuthenticator(api_key)
nlu = NaturalLanguageUnderstandingV1(version='2021-03-25', authenticator=auth)

nlu.set_service_url(url)

print("Successfully connected with the NLU service")


Successfully connected with the NLU service
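
Hard-coding credentials in a notebook makes them easy to leak when the notebook is shared. As a minimal sketch, the API key and URL could instead be read from environment variables. The variable names `NLU_API_KEY` and `NLU_URL` and the helper `load_nlu_credentials` are assumptions made for this example; they are not defined by the SDK.

```python
import os


def load_nlu_credentials(env=os.environ):
    """Read the NLU API key and service URL from a mapping of environment variables.

    NLU_API_KEY and NLU_URL are hypothetical names chosen for this sketch.
    """
    required = ("NLU_API_KEY", "NLU_URL")
    # Collect every missing or empty variable so the error names all of them
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["NLU_API_KEY"], env["NLU_URL"]
```

With both variables set, `api_key, url = load_nlu_credentials()` can replace the two assignments at the top of the cell above.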


# How to Create Training Data

NLU Classifications training data requires a list of text documents, each annotated by one or more labels. 

The training data for NLU Classifications needs to be in the following JSON format:

```json
[
    {
        "text": "document 1",
        "labels": ["label1"]
    },
    {
        "text": "document 2",
        "labels": ["label2", "label3"]
    }
]
```
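
Before uploading, it can help to check that the data really has this shape. The following is a minimal sketch of such a check, based only on the format shown above; `validate_training_data` is a hypothetical helper, and the service may enforce further rules (such as minimum example counts) that this check does not cover.

```python
def validate_training_data(records):
    """Return a list of problems found; an empty list means the structure looks valid."""
    if not isinstance(records, list):
        return ["top-level value must be a list"]

    problems = []
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append(f"item {i}: expected an object")
            continue

        # "text" must be a non-empty string
        text = record.get("text")
        if not isinstance(text, str) or not text.strip():
            problems.append(f"item {i}: 'text' must be a non-empty string")

        # "labels" must be a non-empty list of non-empty strings
        labels = record.get("labels")
        labels_ok = (
            isinstance(labels, list)
            and len(labels) > 0
            and all(isinstance(label, str) and label for label in labels)
        )
        if not labels_ok:
            problems.append(f"item {i}: 'labels' must be a non-empty list of strings")

    return problems


sample = [
    {"text": "document 1", "labels": ["label1"]},
    {"text": "document 2", "labels": ["label2", "label3"]},
]
print(validate_training_data(sample))  # prints []
```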

## 2. Create JSON file from Data

In [2]:
training_data = [
    {
        "text": "How hot is it today?",
        "labels": ["temperature"]
    },
    {
        "text": "Is it hot outside?",
        "labels": ["temperature"]
    },
    {
        "text": "Will it be uncomfortably hot?",
        "labels": ["temperature"]
    },
    {
        "text": "Will it be sweltering?",
        "labels": ["temperature"]
    },
    {
        "text": "How cold is it today?",
        "labels": ["temperature"]
    },
    {
        "text": "Will we get snow?",
        "labels": ["conditions"]
    },
    {
        "text": "Are we expecting sunny conditions?",
        "labels": ["conditions"]
    },
    {
        "text": "Is it overcast?",
        "labels": ["conditions"]
    },
    {
        "text": "Will it be cloudy?",
        "labels": ["conditions"]
    },
    {
        "text": "How much rain will fall today?",
        "labels": ["conditions"]
    }
]

# Save Training data in a file
import json

training_data_filename = 'training_data.json'

with open(training_data_filename, 'w', encoding='utf-8') as f:
    json.dump(training_data, f, indent=4)

print('Data successfully saved locally in ' + training_data_filename)

Data successfully saved locally in training_data.json
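
A quick way to see how the examples are distributed across labels is to count them. This is a small sketch using only the standard library; `count_labels` is a hypothetical helper, and the three-item `sample` is a shortened stand-in for the data above.

```python
from collections import Counter


def count_labels(records):
    """Count how many training examples carry each label."""
    return Counter(label for record in records for label in record["labels"])


sample = [
    {"text": "How hot is it today?", "labels": ["temperature"]},
    {"text": "Is it hot outside?", "labels": ["temperature"]},
    {"text": "Will we get snow?", "labels": ["conditions"]},
]
print(dict(count_labels(sample)))  # prints {'temperature': 2, 'conditions': 1}
```

Applied to the ten examples in this notebook, `count_labels(training_data)` reports five examples for each of the two labels.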


## Convert Data from NLC to NLU Format (Optional to Run)

The cell below provides a way of converting the training data stored **locally** in CSV format (required by [Watson Natural Language Classifier](https://cloud.ibm.com/docs/natural-language-classifier?topic=natural-language-classifier-using-your-data#training-structure)) to JSON format required by NLU Classifications training data.

In [1]:
# Set path to training data file used to train Natural Language Classifier
nlc_training_data_file_name = 'nlc_training_data.csv'

## Imports

import csv
import json


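def convert_nlc_to_nlu(filename):
    """Convert an NLC training CSV (text, label1, label2, ...) into NLU records."""
    nlu_data = []

    # Read from the file passed in as an argument, not from the module-level variable
    with open(filename, 'r', encoding='utf-8', newline='') as csv_file: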
        csv_reader = csv.reader(csv_file, delimiter=',')
        for row in csv_reader:
            text = row[0]
            labels = row[1:]
            # Convert the text and labels into an NLU training data JSON object
            data_dict = {
                'text': text,
                'labels': labels
            }
            nlu_data.append(data_dict)

    return nlu_data

nlu_data = convert_nlc_to_nlu(nlc_training_data_file_name)
        
# Save Training data in a file
training_data_filename = 'training_data.json'

with open(training_data_filename, 'w', encoding='utf-8') as f:
    json.dump(nlu_data, f, indent=4)

print('Data successfully converted to NLU format and saved locally in ' + training_data_filename)
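
To see what the conversion produces without touching any files, the same row-by-row mapping can be applied to CSV text held in memory. This is a sketch under the assumption that each CSV row holds the text in the first column and one label per remaining column; `nlc_csv_to_nlu` is a hypothetical variant of the function above that additionally skips blank rows and empty label cells.

```python
import csv
import io


def nlc_csv_to_nlu(csv_text):
    """Convert NLC-style CSV text (text, label1, label2, ...) to NLU records."""
    records = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines and rows without any text
        if not row or not row[0].strip():
            continue
        # Drop empty label cells, e.g. those produced by trailing commas
        labels = [label.strip() for label in row[1:] if label.strip()]
        records.append({"text": row[0], "labels": labels})
    return records


sample_csv = (
    "How hot is it today?,temperature\n"
    '"Is it hot, or will it rain?",temperature,conditions\n'
)
for record in nlc_csv_to_nlu(sample_csv):
    print(record)
```

The second row shows why a CSV parser is used rather than `str.split(',')`: the quoted text contains a comma that must not be treated as a column separator.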

## Fetch Data From NLC (Optional to Run)

If you already have an existing Watson Natural Language Classifier (NLC) model, you can fetch its training data and make it available for NLU Classifications. To start, you will need to provide the credentials of your NLC service instance. You will also need the URL of the classifier from which the training data is fetched. Directions for finding the classifier URL are in the NLC API documentation: [https://cloud.ibm.com/apidocs/natural-language-classifier?code=python#getclassifier](https://cloud.ibm.com/apidocs/natural-language-classifier?code=python#getclassifier)

In [None]:
NLC_username = "YOUR_NLC_USERNAME"
NLC_api_key = "YOUR_NLC_API_KEY"
NLC_classifier_url = "YOUR_CLASSIFIER_URL"

import json
import requests
import csv

from contextlib import closing

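# Note: the request below uses verify=False, which disables TLS certificate
# verification. These two lines only silence the warning that results from it.
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)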


# Provide the filename to save the data downloaded from existing NLC classifier
nlc_training_data_file_name = "nlc_training_data.csv"

# Fetch data from NLC
# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open(nlc_training_data_file_name, 'w', encoding='utf-8', newline='') as out_file:
    uri = "{}/training_data".format(NLC_classifier_url)
    with closing(requests.get(uri, auth=(NLC_username, NLC_api_key), verify=False, stream=True)) as res:
        lines = (line.decode('utf-8') for line in res.iter_lines())
        csv_writer = csv.writer(out_file)
        for row in csv.reader(lines):
            csv_writer.writerow(row)


# Convert to NLU format    
nlu_data = convert_nlc_to_nlu(nlc_training_data_file_name)
        
# Save Training data in a file
training_data_filename = 'training_data.json'

with open(training_data_filename, 'w', encoding='utf-8') as f:
    json.dump(nlu_data, f, indent=4)
    
print('Data successfully converted to NLU format and saved locally in ' + training_data_filename)

## 3. How to Train a NLU Classifications Model

To train an NLU Classifications model using the data created above, use the `create_classifications_model` method. For the full set of options, see the NLU API documentation: [https://cloud.ibm.com/apidocs/natural-language-understanding?code=python](https://cloud.ibm.com/apidocs/natural-language-understanding?code=python).

In [3]:
with open(training_data_filename, 'rb') as file:
    model = nlu.create_classifications_model(
        language='en',
        training_data=file,
        training_data_content_type='application/json',
        name='MyClassificationsModel',
        model_version='1.0.1'
    ).get_result()
    
    print("Created a NLU Classifications model:")
    print(json.dumps(model, indent=4))


Created a NLU Classifications model:
{
    "name": "MyClassificationsModel",
    "user_metadata": null,
    "language": "en",
    "description": null,
    "model_version": "1.0.1",
    "version": "1.0.1",
    "workspace_id": null,
    "version_description": null,
    "status": "starting",
    "notices": [],
    "model_id": "9bb62cb1-b26a-4e3e-adfb-e62f3037e60c",
    "features": [
        "classifications"
    ],
    "created": "2021-07-27T20:01:06Z",
    "last_trained": "2021-07-27T20:01:06Z",
    "last_deployed": null
}


## 4. How to Get Status of a NLU Classifications Model

We can inspect the recently created NLU Classifications model by calling the `get_classifications_model` method with its `model_id`.

In [7]:
model_id = model['model_id']
model_to_view = nlu.get_classifications_model(model_id=model_id).get_result()

print("Information about the created NLU Classifications model:")
print(json.dumps(model_to_view, indent=4))


Information about the created NLU Classifications model:
{
    "name": "MyClassificationsModel",
    "user_metadata": null,
    "language": "en",
    "description": null,
    "model_version": "1.0.1",
    "version": "1.0.1",
    "workspace_id": null,
    "version_description": null,
    "status": "standby",
    "notices": [],
    "model_id": "9bb62cb1-b26a-4e3e-adfb-e62f3037e60c",
    "features": [
        "classifications"
    ],
    "created": "2021-07-27T20:01:06Z",
    "last_trained": "2021-07-27T20:01:06Z",
    "last_deployed": "2021-07-27T20:07:31Z"
}
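
The `status` field changes as training progresses, so rather than re-running the cell by hand, the status can be polled until the model is ready. The helper below is a minimal sketch: `wait_for_model` is a hypothetical function, and it relies only on the `get_classifications_model(...).get_result()` call used above. Which statuses count as ready is left as a parameter, because the sample output above reports `standby` while the next section refers to `available`. Treating `error` as a failure status is also an assumption.

```python
import time


def wait_for_model(client, model_id, ready=("available",), failed=("error",),
                   interval=30, timeout=1800, sleep=time.sleep):
    """Poll a classifications model until its status is in `ready`.

    `ready` and `failed` are assumptions of this sketch; adjust them to the
    statuses your service instance actually reports.
    """
    waited = 0  # seconds spent sleeping so far (request time is not counted)
    while True:
        result = client.get_classifications_model(model_id=model_id).get_result()
        status = result["status"]
        if status in ready:
            return status
        if status in failed:
            raise RuntimeError(f"Model {model_id} ended in status '{status}'")
        if waited >= timeout:
            raise TimeoutError(f"Model {model_id} still '{status}' after {waited} seconds")
        sleep(interval)
        waited += interval
```

In this notebook the call would look like `wait_for_model(nlu, model_id)`, or `wait_for_model(nlu, model_id, ready=("available", "standby"))` if `standby` should also be accepted.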


## 5. How to Use a Trained NLU Classifications Model for Analysis

Once the NLU Classifications model is fully trained, the `status` returned by the call above changes to `available`, indicating that the model can be used for analysis (training takes a few minutes to complete). Once the model is ready, call the `analyze` method with text, HTML, or a public webpage URL.

In [8]:
from ibm_watson.natural_language_understanding_v1 import Features, ClassificationsOptions

text = "What is the expected high for today?"

analysis = nlu.analyze(
    text=text,
    features=Features(classifications=ClassificationsOptions(model=model_id))
).get_result()

print("Analysis response from trained NLU Classifications model:")
print(json.dumps(analysis, indent=4))

Analysis response from trained NLU Classifications model:
{
    "usage": {
        "text_units": 1,
        "text_characters": 36,
        "features": 0
    },
    "language": "en",
    "classifications": [
        {
            "confidence": 0.562519,
            "class_name": "temperature"
        },
        {
            "confidence": 0.433996,
            "class_name": "conditions"
        }
    ]
}
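
In most applications only the best-scoring class is needed. As a small sketch, the highest-confidence entry can be picked out of the `classifications` list; `top_classification` is a hypothetical helper, and `sample_response` copies the relevant part of the output above so that the example runs without a service call.

```python
def top_classification(analysis):
    """Return the classification entry with the highest confidence, or None if there is none."""
    classes = analysis.get("classifications", [])
    if not classes:
        return None
    return max(classes, key=lambda entry: entry["confidence"])


sample_response = {
    "language": "en",
    "classifications": [
        {"confidence": 0.562519, "class_name": "temperature"},
        {"confidence": 0.433996, "class_name": "conditions"},
    ],
}

best = top_classification(sample_response)
print(best["class_name"], best["confidence"])  # prints: temperature 0.562519
```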


## 6. How to Delete a NLU Classifications Model

Use the `delete_classifications_model` method to remove a specific Classifications model by its `model_id`. The deletion can be verified by listing all existing models with `list_classifications_models`.

In [8]:
deleted_model = nlu.delete_classifications_model(model_id=model_id)
updated_models_list = nlu.list_classifications_models().get_result()

print("The NLU Classifications model created in this tutorial has been deleted:")
print(json.dumps(updated_models_list, indent=4))

The NLU Classifications model created in this tutorial has been deleted:
{
    "models": []
}
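
When the instance holds other models, the list will not be empty after the deletion, so it is more reliable to check for the specific `model_id`. This is a minimal sketch, assuming the list response has the `{"models": [...]}` shape shown above with a `model_id` field on each entry; `model_exists` is a hypothetical helper.

```python
def model_exists(models_response, model_id):
    """Return True if `model_id` appears in a list-models response."""
    return any(
        entry.get("model_id") == model_id
        for entry in models_response.get("models", [])
    )


deleted_id = "9bb62cb1-b26a-4e3e-adfb-e62f3037e60c"
print(model_exists({"models": []}, deleted_id))  # prints False
```

In this notebook the check would be `model_exists(updated_models_list, model_id)`.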
