# Business Entity Recognition Demo

This notebook demonstrates how to use the SAP AI Business Services - Business Entity Recognition service. In this demo, we train a model on a small example dataset, deploy it, and run inference with it.

The notebook uses the Python client library to call the most important functions of the Business Entity Recognition REST API.

## Fetch Python module and repo containing example dataset

This notebook requires the Python package containing the client and a dataset to train a model on. Both are fetched in the cell below.

An example dataset is provided in the repo. You can explore the required dataset structure [here](https://github.wdf.sap.corp/i329525/BER-Client/tree/master/examples/data).

## Settings

The settings under `Environment specific configuration` require a valid service key for the Business Entity Recognition service on SAP Cloud Platform.

The values needed from the service key map to the following variables:
- url: The URL of the service deployment, provided at the top level of the service key JSON file
- uaa_url: The URL of the UAA server used for authentication, provided in the __uaa__ section of the service key JSON file
- uaa_clientid: The client ID used for authentication to the UAA server, provided in the __uaa__ section of the service key JSON file
- uaa_clientsecret: The client secret used for authentication to the UAA server, provided in the __uaa__ section of the service key JSON file

Copy the service key JSON into the file `examples/config.json`, which is read during initialization.

For the `Model specific configuration`, the parameters are explained by the comments below.


# Model specific configuration
model_name = "" # choose an arbitrary model name for the model trained here, will be assigned to the trained model for identification purposes
dataset_folder = "data" # should point to (relative or absolute) path containing dataset
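
The service key fields described in this section can be checked before they are passed to the client. The following is a minimal sketch, assuming the key has the `url` and `uaa` layout described above; `read_credentials` is a hypothetical helper that is not part of the client library, and the sample key contains placeholder values only.

```python
import json

# Hypothetical service key with placeholder values; a real key contains more fields.
SAMPLE_SERVICE_KEY = """
{
  "url": "https://ber.example.invalid",
  "uaa": {
    "url": "https://uaa.example.invalid",
    "clientid": "my-client-id",
    "clientsecret": "my-client-secret"
  }
}
"""


def read_credentials(service_key):
    """Return the four values the notebook needs, or raise if one is missing."""
    uaa = service_key.get("uaa", {})
    credentials = {
        "url": service_key.get("url"),
        "uaa_url": uaa.get("url"),
        "uaa_clientid": uaa.get("clientid"),
        "uaa_clientsecret": uaa.get("clientsecret"),
    }
    # Collect every missing or empty value so the error names all of them at once.
    missing = [name for name, value in credentials.items() if not value]
    if missing:
        raise KeyError("Service key is missing: " + ", ".join(missing))
    return credentials


credentials = read_credentials(json.loads(SAMPLE_SERVICE_KEY))
print(sorted(credentials))
```

Failing early with the names of the missing fields is usually easier to debug than an authentication error raised later by the service.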

In [None]:
# update working directory path

import os

os.chdir('../')

#print(os.getcwd())

import pathlib
pathlib.Path().absolute()

## Initialize Demo

In [None]:
from sap_ber_client import ber_api_client
from pprint import pprint
import json

In [None]:
import importlib
# import sap_ber_client.ber_api_client

importlib.reload(ber_api_client)

In [None]:
import json
config_file = os.path.join(os.getcwd(), 'examples', 'config.json')
# The with statement closes the file automatically, so no explicit close() is needed
with open(config_file, 'rb') as config:
    config_json = json.load(config)

In [None]:
# Instantiate the object used to communicate with the BER REST API
# my_ber_client = pyber.Pyber(url, uaa_clientid, uaa_clientsecret, uaa_url)

url = config_json['url']
uaa_clientid = config_json['uaa']['clientid']
uaa_clientsecret = config_json['uaa']['clientsecret']
uaa_url = config_json['uaa']['url']

my_ber_client = ber_api_client.BER_API_Client(url, uaa_clientid, uaa_clientsecret, uaa_url)

#print(my_ber_client.base_url)

## Display access token

In [None]:
# Token can be used to interact with e.g. swagger UI to explore BER API
#print(my_ber_client.session.headers)
#print("\nYou can use this token to Authorize here and explore the API via Swagger UI: \n{}api/v1/".format(url))

## Create Dataset for training of a new model

In [None]:
# Create Training dataset
response = my_ber_client.create_dataset()
pprint(response.json())

In [None]:
training_dataset_id = response.json()["data"]["datasetId"]
print(training_dataset_id)
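
Indexing the response with `["data"]["datasetId"]` raises a bare `KeyError` if the service returns an error body instead. A minimal sketch of a more descriptive lookup follows; `get_nested` is a hypothetical helper, not part of the client library, and the example payload only mirrors the access pattern used above.

```python
def get_nested(payload, *keys):
    """Walk a nested dict and raise a descriptive error if a key is absent."""
    current = payload
    for depth, key in enumerate(keys):
        if not isinstance(current, dict) or key not in current:
            path = ".".join(keys[: depth + 1])
            raise KeyError(f"Response has no field '{path}': {payload!r}")
        current = current[key]
    return current


# Hypothetical payload shaped like the access pattern used in the cell above.
example_payload = {"data": {"datasetId": "example-dataset-id"}}
print(get_nested(example_payload, "data", "datasetId"))
```

With the client, the same helper could be called as `get_nested(response.json(), "data", "datasetId")`.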

In [None]:
# Upload training documents to the dataset from training directory
import os
dataset_folder = os.getcwd() + '/examples/data/english_training_dataset_annotated.json'
print("Uploading training documents to the dataset")
response = my_ber_client.upload_document_to_dataset(training_dataset_id, dataset_folder)
print("Finished uploading training documents to the dataset")
pprint(response.json())

In [None]:
# Pretty print the dataset statistics
print("Dataset statistics")
dataset_stats = my_ber_client.get_dataset(training_dataset_id)
pprint(dataset_stats.json())

## Training

In [None]:
model_name = "for-client-lib"

In [None]:
# Train the model

print("Starting training job for model with modelName {}".format(model_name))
response = my_ber_client.train_model(model_name, training_dataset_id)
print(response.json())

In [None]:
print(response.json())
jobid = response.json()["data"]["jobId"]
print(jobid)

In [None]:
#Get the status of job
r = my_ber_client.get_training_status(jobid)
pprint(r.json())
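
Training runs asynchronously, so the status cell usually has to be re-run until the job finishes. A minimal polling sketch follows; `wait_until` is a hypothetical helper, and the `status` field and its values (`PENDING`, `RUNNING`, `SUCCEEDED`, `FAILED`) are assumptions made for illustration, so check the actual response printed above before relying on them.

```python
import time


def wait_until(fetch, is_done, interval_seconds=10.0, timeout_seconds=3600.0):
    """Call fetch() repeatedly until is_done(result) is true or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("Job did not finish within the timeout")
        time.sleep(interval_seconds)


# Stand-in for the service: reports a made-up sequence of statuses.
fake_statuses = iter(["PENDING", "RUNNING", "SUCCEEDED"])
final = wait_until(
    fetch=lambda: {"data": {"status": next(fake_statuses)}},
    is_done=lambda body: body["data"]["status"] in ("SUCCEEDED", "FAILED"),
    interval_seconds=0,
)
print(final)
```

With the client, `fetch` could be `lambda: my_ber_client.get_training_status(jobid).json()`, with `is_done` adapted to the fields the service actually returns.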

In [None]:
#Get recently submitted jobs

response_recent = my_ber_client.get_recently_submitted_training_jobs_list()
pprint(response_recent.json())

## Model

In [None]:
# Get trained model versions
response = my_ber_client.get_trained_model_versions(model_name)
pprint(response.json())

In [None]:
# Get a specific trained model version
model_version = 1 
response = my_ber_client.get_trained_model_version(model_name, model_version)
pprint(response.json())

In [None]:
#Get all models
response = my_ber_client.get_trained_models()
pprint(response.json())

## Deployment

In [None]:
# Deploy model
model_version = 1
response = my_ber_client.deploy_model(model_name, model_version) 
pprint(response.json())
deployment_id = response.json()['data']['deploymentId']

In [None]:
# Get all deployments
response = my_ber_client.get_deployments() 
pprint(response.json())

In [None]:
#Get deployment
response = my_ber_client.get_deployed_model(deployment_id) 
pprint(response.json())

In [None]:
# Undeploy model
response = my_ber_client.undeploy_model(deployment_id) 
pprint(response.json())
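
If a deployment is only needed for a short time, the deploy and undeploy calls can be paired so that the model is undeployed even when an intermediate step fails. The following is a minimal sketch using stand-in functions; `temporary_deployment` is a hypothetical helper, not part of the client library.

```python
from contextlib import contextmanager


@contextmanager
def temporary_deployment(deploy, undeploy):
    """Deploy on entry and always undeploy on exit, even if the body raises."""
    deployment_id = deploy()
    try:
        yield deployment_id
    finally:
        undeploy(deployment_id)


# Stand-ins that record calls instead of contacting the service.
calls = []


def fake_deploy():
    calls.append("deploy")
    return "example-deployment-id"


def fake_undeploy(deployment_id):
    calls.append("undeploy " + deployment_id)


with temporary_deployment(fake_deploy, fake_undeploy) as active_id:
    calls.append("work with " + active_id)

print(calls)
```

With the client, `deploy` could wrap `my_ber_client.deploy_model(model_name, model_version)` and return the `deploymentId` from its response, while `undeploy` could call `my_ber_client.undeploy_model`.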


## Inference


<!-- This runs inference on all documents in the test set (stratification is done inside DC service and reproduced here).  
We are working on exposing the stratification results so that this cell can be shortened. -->

In [None]:
# Set these variables again when running only the inference cells in a new session
model_name = "for-client-lib"
model_version = 1

In [None]:
# post inference job
text = 'From: John Doe, john.doe@company.com Sent: Tuesday, December 18, 2018 2:16 PM To: John Doe, john.doe@company.com Subject: FW: M40262 Sony - cuenta actualizada  Hi Ankur  Please clear the payment that it’s on the esprinet account according to the below RA:  596501215 -3.200,00 5022208353 27.195,00 LIQ. PRONTO PAGO-271,95 23.723,05 BR Madalina'
response = my_ber_client.post_inference_job(text, model_name, model_version)
pprint(response.json())
inference_jobid = response.json()["data"]["id"]

In [None]:
#Get inference result
response = my_ber_client.get_inference_job(inference_jobid)
pprint(response.json())

In [None]:
# Create Inference dataset
response = my_ber_client.create_dataset("inference")
pprint(response.json())
inference_dataset_id = response.json()["data"]["datasetId"]
print(inference_dataset_id)
inference_dataset_folder = os.getcwd() + '/examples/data/batch_inference.json'
print("Uploading inference documents to the dataset")
response = my_ber_client.upload_document_to_dataset(inference_dataset_id, inference_dataset_folder)
print("Finished uploading inference documents to the dataset")
pprint(response.json())

# post batch inference job
response = my_ber_client.post_batch_inference_job(inference_dataset_id, model_name, model_version)
pprint(response.json())
batch_inference_job_id = response.json()["data"]["id"]


In [None]:
#Get batch inference result
response = my_ber_client.get_inference_job(batch_inference_job_id)
pprint(response.json())

#Get batch inference result file
response = my_ber_client.get_batch_inference_job_result(batch_inference_job_id)
pprint(response.json())

## Deleting Datasets

In [None]:
#deleting training dataset
response = my_ber_client.delete_dataset(training_dataset_id)
pprint(response.json())

#deleting inference dataset
response = my_ber_client.delete_dataset(inference_dataset_id)
pprint(response.json())