## Notebook 4 – Tone Analyzer 
https://www.ibm.com/watson/developercloud/tone-analyzer.html 
https://www.ibm.com/watson/developercloud/tone-analyzer/api/v3/ 

The Tone Analyzer service uses linguistic analysis to detect emotional and language tones in written text.

If you already have an IBM Cloud / Bluemix account, log in here: https://console.bluemix.net/ 
If you have not registered for IBM Cloud, register for a free account here: https://www.ibm.com/watson/developer/

The LITE plan for Tone Analyzer is free. To create a Tone Analyzer endpoint, go to https://console.bluemix.net/developer/watson/dashboard

### Tone Analyzer Signals

The `tones` request parameter is a comma-separated list of tones for which the service is to return its analysis of the input. The indicated tones apply both to the full document and to individual sentences of the document. You can specify one or more of the following values:
- emotion
- language
- social
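
As a minimal sketch of how that comma-separated value could be assembled, the helper below checks the requested categories against the three values listed above and joins them. The name `build_tones_param` is hypothetical and not part of the Watson SDK; whether your SDK version expects this string or a Python list is something to check against its documentation.

```python
# Hypothetical helper (not part of the Watson SDK): build the comma-separated
# `tones` value from the category names listed above.
VALID_TONE_CATEGORIES = ("emotion", "language", "social")

def build_tones_param(categories):
    """Return e.g. 'emotion,social' after validating each category name."""
    unknown = [c for c in categories if c not in VALID_TONE_CATEGORIES]
    if unknown:
        raise ValueError("unknown tone categories: " + ", ".join(unknown))
    # Drop duplicates while keeping the caller's order.
    ordered = []
    for category in categories:
        if category not in ordered:
            ordered.append(category)
    return ",".join(ordered)

print(build_tones_param(["emotion", "social"]))
```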
    
2016-05-19: The service can return results for the following tone IDs of the different categories:
- For the emotion category: anger, disgust, fear, joy, sadness
- For the language category: analytical, confident, tentative
- For the social category: openness_big5, conscientiousness_big5, extraversion_big5, agreeableness_big5, emotional_range_big5

The service returns scores for all tones of a category, regardless of their values.

    
2017-09-21: The service can return results for the following tone IDs: anger, fear, joy, sadness, analytical, confident, and tentative. The service returns results only for tones whose scores meet a minimum threshold of 0.5.
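
Because the two versions return differently shaped results, code that reads the scores has to know which one it is dealing with. The sketch below flattens either shape into a `{tone_id: score}` dictionary. It assumes the 2016-05-19 response groups tones under `document_tone.tone_categories` (as the code later in this notebook does) and that the 2017-09-21 response uses a flat `document_tone.tones` list; the function name and the sample values are made up for illustration.

```python
def document_tones(result):
    """Return {tone_id: score} from a document-level result of either version."""
    doc = result.get("document_tone", {})
    scores = {}
    # 2016-05-19 shape: tones are grouped under tone_categories.
    for category in doc.get("tone_categories", []):
        for tone in category["tones"]:
            scores[tone["tone_id"]] = tone["score"]
    # Assumed 2017-09-21 shape: a flat list holding only tones scoring >= 0.5.
    for tone in doc.get("tones", []):
        scores[tone["tone_id"]] = tone["score"]
    return scores

# Made-up sample responses, not real service output.
old_style = {"document_tone": {"tone_categories": [
    {"category_id": "emotion_tone", "tones": [
        {"tone_id": "joy", "score": 0.12},
        {"tone_id": "anger", "score": 0.61}]}]}}
new_style = {"document_tone": {"tones": [{"tone_id": "anger", "score": 0.61}]}}

print(document_tones(old_style))
print(document_tones(new_style))
```

Note that under 2017-09-21 a missing tone means "below the threshold", not "score of zero".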

If the Tone Chat endpoint is used, these are the signals it produces:
- sad
- frustrated
- satisfied
- excited
- polite
- impolite
- sympathetic
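
A small sketch of how these signals might be read back: the function below picks the highest-scoring signal for each utterance. It assumes the Tone Chat response carries an `utterances_tone` list whose entries have `utterance_text` and `tones` fields; that shape, the function name, and the sample data are assumptions for illustration, so verify them against the API reference before relying on them.

```python
def top_tone_per_utterance(chat_result):
    """Map each utterance text to its highest-scoring (tone_id, score), or None."""
    summary = {}
    for utterance in chat_result.get("utterances_tone", []):
        tones = utterance.get("tones", [])
        if tones:
            best = max(tones, key=lambda tone: tone["score"])
            summary[utterance["utterance_text"]] = (best["tone_id"], best["score"])
        else:
            # No signal was reported for this utterance.
            summary[utterance["utterance_text"]] = None
    return summary

# Made-up sample response in the assumed shape.
sample_chat = {"utterances_tone": [
    {"utterance_id": 0, "utterance_text": "My order never arrived.",
     "tones": [{"tone_id": "frustrated", "score": 0.75},
               {"tone_id": "sad", "score": 0.5}]},
    {"utterance_id": 1, "utterance_text": "Ok.", "tones": []},
]}

print(top_tone_per_utterance(sample_chat))
```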


## Install dependencies

In [None]:
# Imports: run this cell each time after restarting the kernel
# Uncomment the next two lines if the packages are not installed yet
#!pip install watson_developer_cloud
#!pip install ibm-cos-sdk
import watson_developer_cloud as watson
import json
from botocore.client import Config
import ibm_boto3
import requests


##  Cloud Object Storage - Add Credentials & Bucket Name
If you have not already set up Cloud Object Storage (COS), please see Step 1.

### Credentials
Credentials are also created for you when you create a project. From the service dashboard page, select `Service Credentials` in the left navigation menu, then copy and paste the credentials below:

### Bucket name
Buckets are created for you when you create a project. From the service dashboard page, select `Buckets` in the left navigation menu, then copy and paste your bucket name below:


In [None]:
# For Cloud Object Storage - populate your own information here from "SERVICES" on this page, or Console Dashboard on ibm.com/cloud

# From service dashboard page select Service Credentials from left navigation menu item
credentials_os = {
  "apikey": "",
  "cos_hmac_keys": {
    "access_key_id": "",
    "secret_access_key": ""
  },
  "endpoints": "https://cos-service.bluemix.net/endpoints",
  "iam_apikey_description": "Auto generated apikey during resource-key operation for Instance",
  "iam_apikey_name": "",
  "iam_role_crn": "",
  "iam_serviceid_crn": "",
  "resource_instance_id": ""
}

# Buckets are created for you when you create a project. From the service dashboard page, select Buckets in the left navigation menu.
credentials_os['BUCKET'] = '<bucket_name>' # copy bucket name from COS

In [None]:
# The code was removed by DSX for sharing.

### Create Watson TONE Analyzer service

There are two options for creating a new Tone Analyzer service: (1) click SERVICES above and create or add a new LITE instance of Tone Analyzer; or (2) in the Console Dashboard at ibm.com/cloud, create a LITE Tone Analyzer service. Then click 'SERVICE CREDENTIALS' to get the credentials.

For more information on creating Watson services, see Notebook 1.

In [None]:
# Add your TONE Service credentials here
credentials_tone = {
    "url": '',
    "apikey": ''
}

### Set up Object Storage Client

In [None]:
endpoints = requests.get(credentials_os['endpoints']).json()

iam_host = (endpoints['identity-endpoints']['iam-token'])
cos_host = (endpoints['service-endpoints']['cross-region']['us']['public']['us-geo'])

auth_endpoint = "https://" + iam_host + "/oidc/token"
service_endpoint = "https://" + cos_host


client = ibm_boto3.client(
    's3',
    ibm_api_key_id = credentials_os['apikey'],
    ibm_service_instance_id = credentials_os['resource_instance_id'],
    ibm_auth_endpoint = auth_endpoint,
    config = Config(signature_version='oauth'),
    endpoint_url = service_endpoint
   )





### Tone

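- `process_text()` goes through the text, concatenates the sentences into one transcript, and splits it into chunks of `chunk_size` words
- `analyze_transcript()` reads a transcript from the bucket, calls the Tone Analyzer endpoint on each chunk, and stores the results back in the bucket
- `post_analysis()` prints the tones and their scores for each chunk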
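
To make the expected input concrete, here is a small standalone sketch of the chunking step on a made-up transcript. The JSON shape (`results` -> `alternatives[0]` -> `transcript`) mirrors what `process_text()` reads; the sample sentences and the name `words_in_chunks` are invented for illustration. One deliberate difference: this sketch uses `split()` with no argument, which drops empty strings, whereas the notebook code uses `split(' ')`.

```python
import json

# Made-up Speech to Text output in the shape process_text() reads.
sample_text = json.dumps({"results": [
    {"alternatives": [{"transcript": "hello I would like to "}]},
    {"alternatives": [{"transcript": "change my address please "}]},
]})

def words_in_chunks(text, chunk_size):
    """Concatenate the transcripts, then split into lists of chunk_size words."""
    transcript = "".join(result["alternatives"][0]["transcript"]
                         for result in json.loads(text)["results"])
    words = transcript.split()  # no argument: empty strings are dropped
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]

chunks = words_in_chunks(sample_text, 4)
print(chunks)
```

With nine words and a chunk size of four, the last chunk holds a single word; `analyze_transcript()` skips chunks of two words or fewer.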


In [None]:
from watson_developer_cloud import ToneAnalyzerV3

tone_analyzer = ToneAnalyzerV3(version = '2016-05-19',
                               iam_apikey = credentials_tone['apikey'],
                               url = credentials_tone['url'])


chunk_size=25

def chunk_transcript(transcript, chunk_size):
    transcript = transcript.split(' ')
    return [ transcript[i:i+chunk_size] for i in range(0, len(transcript), chunk_size) ] # chunking data
    

def process_text(text):
    transcript=''
    for sentence in json.loads(text)['results']:
        transcript = transcript + sentence['alternatives'][0]['transcript'] # concatenate sentences
    transcript = chunk_transcript(transcript, chunk_size) # chunk the transcript
    return transcript


def analyze_transcript(file_name):
    transcript = client.get_object(Bucket = credentials_os['BUCKET'], Key = file_name.split('.')[0]+'_text.json')['Body']
    transcript = transcript.read().decode("utf-8")
    tone_analysis={}
    for chunk in process_text(transcript):
        if len(chunk) > 2:
            chunk = ' '.join(chunk)
            tone_analysis[chunk] = tone_analyzer.tone(chunk, content_type='text/plain').get_result()
    res=client.put_object(Bucket = credentials_os['BUCKET'], Key= file_name.split('.')[0]+'_tone.json', Body = json.dumps(tone_analysis))
    return tone_analysis

def print_tones(tones):
    for tone in tones:
        print(tone) ## note for self: update this and show table instead

def post_analysis(result):
    for chunk in result.keys():
        tone_categories = result[chunk]['document_tone']['tone_categories']
        print('\nchunk: ', chunk)
        for tone_category in tone_categories:
            print_tones(tone_category['tones']) #add table instead of prints


In [None]:
file_list = ['sample1-addresschange-positive.ogg',
             'sample2-address-negative.ogg',
             'sample3-shirt-return-weather-chitchat.ogg',
             'sample4-angryblender-sportschitchat-recovery.ogg',
             'sample5-calibration-toneandcontext.ogg',
             'jfk_1961_0525_speech_to_put_man_on_moon.ogg',
             'May 1 1969 Fred Rogers testifies before the Senate Subcommittee on Communications.ogg']
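
For each audio file name, `analyze_transcript()` reads a `_text.json` object from the bucket and writes a `_tone.json` object. The sketch below reproduces that naming rule so you can check which keys must already exist; `derived_keys` is a hypothetical helper name. Because the rule cuts at the first `.`, a file name with extra dots would be truncated there.

```python
def derived_keys(file_name):
    """Return the (transcript_key, tone_key) pair used for an audio file."""
    stem = file_name.split('.')[0]  # same rule as analyze_transcript()
    return stem + '_text.json', stem + '_tone.json'

print(derived_keys('sample1-addresschange-positive.ogg'))
# ('sample1-addresschange-positive_text.json', 'sample1-addresschange-positive_tone.json')
```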


In [None]:
result = analyze_transcript(file_list[0])
post_analysis(result) ## aggregate tones, then show a histogram, then filter

In [None]:
for filename in file_list:
    print("\n\nprocessing file: ", filename)
    result = analyze_transcript(filename)
    post_analysis(result) ## aggregate tones, then show a histogram, then filter
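
The aggregation step noted in the comments could look like the sketch below: it averages each tone's score across all chunks of one transcript and then keeps only the tones at or above a cutoff. It assumes the 2016-05-19 result shape that `post_analysis()` already reads; the function names, the 0.5 default cutoff, and the sample scores are illustrative assumptions.

```python
from collections import defaultdict

def aggregate_tones(result):
    """Average each tone's score across all chunks of one transcript."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for analysis in result.values():
        for category in analysis['document_tone']['tone_categories']:
            for tone in category['tones']:
                totals[tone['tone_id']] += tone['score']
                counts[tone['tone_id']] += 1
    return {tone_id: totals[tone_id] / counts[tone_id] for tone_id in totals}

def filter_tones(averages, threshold=0.5):
    """Keep only tones whose average score reaches the (assumed) threshold."""
    return {tone_id: score for tone_id, score in averages.items()
            if score >= threshold}

# Made-up result in the same shape analyze_transcript() returns.
sample_result = {
    'chunk one': {'document_tone': {'tone_categories': [{'tones': [
        {'tone_id': 'joy', 'score': 0.25}, {'tone_id': 'anger', 'score': 1.0}]}]}},
    'chunk two': {'document_tone': {'tone_categories': [{'tones': [
        {'tone_id': 'joy', 'score': 0.25}, {'tone_id': 'anger', 'score': 0.5}]}]}},
}

averages = aggregate_tones(sample_result)
print(averages)
print(filter_tones(averages))
```

The dictionary returned by `aggregate_tones()` can be passed directly to a plotting library to draw the histogram.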