# Mini-project - creating a dataframe from analysed text data

For this project you are going to use the IBM Watson Tone Analyser API.  You will send text data to it, use security information stored in a config file to keep it secret, receive the results in JSON format, investigate the structure of the results and build a dataframe from them.

Then you will use the results to create a visualisation of tone and to report an overall set of statistics from the data.

---

## Step 1 - sign up for IBM Watson services to use the Tone Analyser

1.  Sign up for IBM Watson: https://www.ibm.com/cloud/watson-studio  
2.  Click 'Try on Cloud at no cost'  
3.  Select the London region (using the nearest servers reduces costs and improves performance)  
4.  Create an IBM Cloud account (enter email and accept terms)  
5.  Follow the instructions to create the account  
6.  Provision the services  
7.  Then go to IBM Watson Studio  
8.  Select Tone Analyzer under the Your Services heading  
9.  You will be shown the **url** for the Tone Analyser API and an **API key** which is needed for using the API.

---

## Step 2 - test to make sure it works

1.  Download this file, which has some text for you to test with, and put it in the same folder as this worksheet for now: https://drive.google.com/file/d/1m65cPQGYQd1mwvEmfZw69-GMUBdo43k0/view?usp=sharing

2.  Create a second text file in the same folder as this worksheet that will hold the credentials for your IBM connection to the Tone Analyser.  Add the following text to this file and save it as 'config.txt'

{"config":{"url": "...the url you got from the IBM Tone Analyser...", "apikey":"... the API key from the analyser ..."}}  

Keeping the credentials in this file means they never appear in your code and are readable only on your device.

3.  Run the code below, which will create a ToneAnalyzer with the credentials from your **config.txt** file, then feed it the text from the **text-for-analysis.txt** file.

4.  Decide what the data looks like and how this might be represented in a pandas dataframe
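
Before running the cell, it can help to confirm that config.txt has the expected shape. The sketch below is one possible check, not part of the worksheet's required code: the function name `check_config_text` is hypothetical and the example values are placeholders, not real credentials.

```python
import json

# The two keys the worksheet expects inside the "config" object
REQUIRED_KEYS = ("url", "apikey")


def check_config_text(config_text):
    """Return a list of problems found in the config text (empty if none)."""
    try:
        data = json.loads(config_text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    section = data.get("config") if isinstance(data, dict) else None
    if not isinstance(section, dict):
        return ["missing top-level 'config' object"]
    # Report every required key that is absent or empty
    return [f"missing '{key}'" for key in REQUIRED_KEYS if not section.get(key)]


# Placeholder values only, in the same layout as config.txt
example = '{"config": {"url": "https://example.invalid", "apikey": "PLACEHOLDER"}}'
problems = check_config_text(example)
print(problems if problems else "config looks fine")
```

To check your real file, read its contents into a string and pass that string to the function.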

In [None]:
from ibm_watson import ToneAnalyzerV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
import os
import json

# get credentials from the file config.txt
def get_secret(key):
    # open the config.txt file and return the value associated with the key (either 'apikey' or 'url')
    # if there is an error, print an error message and return None
    try:
        # the with block closes the file even if parsing fails;
        # json.load reads the whole file, so the JSON may span several lines
        with open('config.txt', 'r') as config:
            return json.load(config)['config'][key]
    except FileNotFoundError:
        print("File not found")
        return None
    except (OSError, json.JSONDecodeError, KeyError) as err:
        print("File error:", err)
        return None


    
def get_text_for_analysis():
    # add code here to open the text-for-analysis.txt file and return the text it reads as one string
    # if there is an error, return None
    return None  # placeholder so the cell runs; replace with your own code


     
    
# create a ToneAnalyzerV3 object, version 2017-09-21 using api key and url from config
authenticator = IAMAuthenticator(apikey=get_secret('apikey'))
tone_analyzer = ToneAnalyzerV3(
    version='2017-09-21',
    authenticator=authenticator
)
tone_analyzer.set_service_url(get_secret('url'))

# get the text for analysis from the file
text = get_text_for_analysis()
if text:
    tone_analysis = tone_analyzer.tone(
        {'text': text},
        content_type='application/json'
    ).get_result()    
    print(tone_analysis)
else:
    print("No data")
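
If you get stuck on the file-reading function, the sketch below shows one way it might be approached. It is an illustration under assumptions rather than the expected answer: the helper name `read_text_file` is hypothetical, it takes the path as a parameter so it can be tried on a temporary file, and it assumes the text is saved as UTF-8.

```python
import os
import tempfile


def read_text_file(path):
    """Return the whole file as one string, or None if it cannot be read."""
    try:
        with open(path, "r", encoding="utf-8") as text_file:
            return text_file.read().strip()
    except (OSError, UnicodeDecodeError) as err:
        print(f"Could not read {path}: {err}")
        return None


# Demonstration on a temporary file, so no real worksheet file is touched
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "text-for-analysis.txt")
    with open(demo_path, "w", encoding="utf-8") as demo_file:
        demo_file.write("First sentence. Second sentence.\n")
    demo_text = read_text_file(demo_path)
    missing_text = read_text_file(os.path.join(folder, "no-such-file.txt"))

print(demo_text)
print(missing_text)
```

Returning None on failure fits the `if text:` check in the cell above, which prints "No data" when nothing could be read.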

### Create (on paper) an idea of how this data might be organised into a data table

1.  How many bits of information are there about the document as a whole?
2.  How many bits of information are there about each sentence?
3.  If all tone analysis records were included in the dataframe, how many rows would there be?
4.  What information would be included in each row?
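
To help with these questions, the sketch below builds a small invented sample in the same general shape as the analyser's result and counts its parts. The key names come from this worksheet ('document_tone', 'sentences_tone', 'tones'); the sentences and scores are made up, and the assumption that a sentence can have an empty 'tones' list should be checked against your own output.

```python
# Invented sample for illustration only; scores are not real results
sample_analysis = {
    "document_tone": {
        "tones": [
            {"score": 0.62, "tone_id": "sadness", "tone_name": "Sadness"},
            {"score": 0.83, "tone_id": "analytical", "tone_name": "Analytical"},
        ]
    },
    "sentences_tone": [
        {"sentence_id": 0, "text": "First sentence.",
         "tones": [{"score": 0.71, "tone_id": "analytical", "tone_name": "Analytical"}]},
        {"sentence_id": 1, "text": "Second sentence.", "tones": []},
        {"sentence_id": 2, "text": "Third sentence.",
         "tones": [
             {"score": 0.55, "tone_id": "sadness", "tone_name": "Sadness"},
             {"score": 0.90, "tone_id": "analytical", "tone_name": "Analytical"},
         ]},
    ],
}

# Tones reported for the document as a whole
document_tone_count = len(sample_analysis["document_tone"]["tones"])
# Sentences that were analysed
sentence_count = len(sample_analysis["sentences_tone"])
# One table row per tone per sentence
row_count = sum(len(s["tones"]) for s in sample_analysis["sentences_tone"])

print(document_tone_count, sentence_count, row_count)
```

Replacing `sample_analysis` with your own `tone_analysis` result gives the counts for the real data.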

### Create a dataframe and start to populate with the data

Before you can create a dataframe from this data you will need to convert it into a table.  One way to do this is to create a list of dictionary records, with one record for each tone of each sentence in the original 'sentences_tone' data.  Loop through the rows in the 'sentences_tone' list and, for each sentence, loop through its 'tones' list.  For each tone, copy across the columns you feel should be included.

_Hint:_

    for row in sentence_data:
        for col in row['tones']:
            new_row = {'sentence_id':row['sentence_id'], 'text':row['text'], 'tone_score':col['score'], 'tone_id':col['tone_id'],'tone_name':col['tone_name']}


In [None]:
import pandas as pd
import numpy as np

# convert json data to table format with one row for each tone for each sentence
def convert_to_tones_table(json_data):
    # add code here to convert the json_data returned by the Tone Analyser into a table form
    # return the data normalized into a dataframe (pd.json_normalize(json_table))
    return None  # placeholder so the cell runs; replace with your own code



tone_data = convert_to_tones_table(tone_analysis)
tone_data
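
The nested loop described in the hint could be completed along the lines below. This is a sketch under assumptions, not the model answer: the function name `tones_table_sketch` is hypothetical, the sample data is invented, and the choice of columns simply follows the hint.

```python
import pandas as pd


def tones_table_sketch(json_data):
    """Build one row per tone per sentence from the 'sentences_tone' list."""
    rows = []
    for row in json_data.get("sentences_tone", []):
        for col in row["tones"]:
            rows.append({
                "sentence_id": row["sentence_id"],
                "text": row["text"],
                "tone_score": col["score"],
                "tone_id": col["tone_id"],
                "tone_name": col["tone_name"],
            })
    # json_normalize turns the list of flat records into a dataframe
    return pd.json_normalize(rows)


# Invented sample for illustration only
sample = {
    "sentences_tone": [
        {"sentence_id": 0, "text": "First sentence.",
         "tones": [{"score": 0.71, "tone_id": "analytical", "tone_name": "Analytical"}]},
        {"sentence_id": 1, "text": "Second sentence.", "tones": []},
        {"sentence_id": 2, "text": "Third sentence.",
         "tones": [
             {"score": 0.55, "tone_id": "sadness", "tone_name": "Sadness"},
             {"score": 0.90, "tone_id": "analytical", "tone_name": "Analytical"},
         ]},
    ]
}

sample_table = tones_table_sketch(sample)
print(sample_table)
```

With this layout a sentence that has no tones contributes no rows, so decide whether that suits the questions you want to answer.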




### Summarise the sentence data
*  Which sentence is the most analytical?
*  Which sentence is the least analytical?
*  What is the average analytical tone score for the sentences?
*  What do the analytical scores look like in a bar chart?
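
One possible way to approach these questions with pandas is sketched below. It assumes a table with the columns from the earlier hint ('sentence_id', 'tone_score', 'tone_id') and uses invented rows; the function name `plot_analytical_scores` is hypothetical.

```python
import pandas as pd

# Invented rows for illustration only
tone_rows = pd.DataFrame([
    {"sentence_id": 0, "tone_id": "analytical", "tone_score": 0.71},
    {"sentence_id": 2, "tone_id": "sadness", "tone_score": 0.55},
    {"sentence_id": 2, "tone_id": "analytical", "tone_score": 0.90},
    {"sentence_id": 3, "tone_id": "analytical", "tone_score": 0.52},
])

# Keep only the rows for the analytical tone
analytical = tone_rows[tone_rows["tone_id"] == "analytical"]

# idxmax / idxmin give the index label of the highest / lowest score
most = analytical.loc[analytical["tone_score"].idxmax()]
least = analytical.loc[analytical["tone_score"].idxmin()]
mean_score = analytical["tone_score"].mean()

print("Most analytical sentence:", most["sentence_id"])
print("Least analytical sentence:", least["sentence_id"])
print("Average analytical score:", round(mean_score, 2))


def plot_analytical_scores(frame):
    """Draw a bar chart of score by sentence (needs matplotlib installed)."""
    return frame.plot.bar(x="sentence_id", y="tone_score", legend=False,
                          title="Analytical tone score by sentence")
```

In the notebook, calling `plot_analytical_scores(analytical)` in a cell should display the chart. Sentences with no analytical tone are absent from this table, so the average covers only the sentences where that tone was detected.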

### Report the tone data for the whole document
---

Play with the data: create a dataframe for the document_tone tones data (for example with pd.json_normalize), then display the document score for each of the tones in the analysis.
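
As a starting point, the sketch below builds a dataframe from an invented 'document_tone' entry and lists the scores from highest to lowest. The sample values are made up; with real results you would pass the 'tones' list from your own `tone_analysis['document_tone']` instead.

```python
import pandas as pd

# Invented sample for illustration only
sample_document_tone = {
    "tones": [
        {"score": 0.62, "tone_id": "sadness", "tone_name": "Sadness"},
        {"score": 0.83, "tone_id": "analytical", "tone_name": "Analytical"},
    ]
}

# Each dictionary in the 'tones' list becomes one row
document_scores = pd.json_normalize(sample_document_tone["tones"])

# Sort so the strongest tone comes first
document_scores = (
    document_scores.sort_values("score", ascending=False)
    .reset_index(drop=True)
)

print(document_scores[["tone_name", "score"]])
```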

### Change the text in the text file and analyse the new text.
---

Here is some alternative, happier text.  Replace the text in the text-for-analysis.txt file with the text below.  Then run the notebook cells again to see the results.

But I feel peaceful. Your success in the ring this morning was, to a small degree, my success. Your future is assured. You will live, secure and safe, Wilbur. Nothing can harm you now. These autumn days will shorten and grow cold. The leaves will shake loose from the trees and fall. Christmas will come, and the snows of winter. You will live to enjoy the beauty of the frozen world, for you mean a great deal to Zuckerman and he will not harm you, ever. Winter will pass, the days will lengthen, the ice will melt in the pasture pond. The song sparrow will return and sing, the frogs will awake, the warm wind will blow again. All these sights and sounds and smells will be yours to enjoy, Wilbur-this lovely world, these precious days.

### Find your own examples of text, replace the text in the file again, and analyse the results.