In [5]:
from hidden import IBM_personality_insights_api

# IBM Watson Personality Insights

To read these notes directly from IBM, visit https://www.ibm.com/watson/developercloud/tone-analyzer/api/v3/python.html?python#error-handling. Most of the notes below are taken directly from that website and reproduced here to aid the reader.

The IBM Watson Personality Insights service enables applications to derive insights from social media, enterprise data, or other digital communications. The service uses linguistic analytics to infer individuals' intrinsic personality characteristics, including Big Five, Needs, and Values, from digital communications such as email, text messages, tweets, and forum posts.

The service can automatically infer, from potentially noisy social media, portraits of individuals that reflect their personality characteristics. The service can infer consumption preferences based on the results of its analysis and, for JSON content that is timestamped, can report temporal behavior.

## Authentication

IBM Cloud is migrating to token-based Identity and Access Management (IAM) authentication. With some service instances, you authenticate to the API by using IAM. You can pass either a bearer token in an Authorization header or an API key. If you pass in the API key, the SDK manages the lifecycle of the tokens.

In [6]:
from watson_developer_cloud import PersonalityInsightsV3

In [7]:
personality_insights = PersonalityInsightsV3(
    version = '2017-09-21',
    iam_apikey = IBM_personality_insights_api,
    url = 'https://gateway-wdc.watsonplatform.net/personality-insights/api'
)
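
The cell above uses the API-key route, where the SDK manages tokens for you. For the bearer-token route mentioned in the text, the following is a minimal sketch that only builds (and never sends) a raw HTTP request with the standard library. The helper name `build_profile_request`, the `/v3/profile` path, and the example host are assumptions made for illustration; check them against the IBM API reference before relying on them.

```python
import urllib.request

def build_profile_request(base_url, bearer_token, text, version='2017-09-21'):
    """Build (but do not send) a POST request that carries a bearer token.

    The '/v3/profile' path is an assumption about the REST endpoint.
    """
    return urllib.request.Request(
        f'{base_url}/v3/profile?version={version}',
        data=text.encode('utf-8'),
        headers={
            # The bearer token goes in the Authorization header
            'Authorization': f'Bearer {bearer_token}',
            'Content-Type': 'text/plain;charset=utf-8',
            'Accept': 'application/json',
        },
        method='POST',
    )

# Hypothetical host and token, used only to show the shape of the request
request = build_profile_request('https://example.invalid/api', 'my-token', 'Some text')
print(request.get_header('Authorization'))
```

With this route, obtaining and refreshing the token is up to the caller, which is why the API-key form used in the notebook is usually the simpler choice.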

## Service endpoint

The service endpoint is based on the location of the service instance. If your API endpoint URL differs from the default, you must set your endpoint. 

To find out which URL to use, view the service credentials by clicking the service instance on the Dashboard. Set the correct service URL by using the ```url``` parameter when you create the service instance or by calling the ```set_url()``` method of the service instance.

In [21]:
url = 'https://gateway-wdc.watsonplatform.net/personality-insights/api'

## Versioning

API requests require a version parameter that takes a date in the format ```version=YYYY-MM-DD```. When IBM changes the API in a **backwards-incompatible** way, they release a new version date.

Specify the version to use on API requests with the version parameter when you create the service instance. The service uses the API version for the date you specify, or the most recent version before that date. Don't default to the current date. Instead, specify a date that matches a version that is compatible with your app, and don't change it until your app is ready for a later version.

In [7]:
version = '2017-09-21'
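
The "most recent version before that date" rule can be sketched in a few lines. This is only an illustration of the selection logic, not IBM's implementation; the release dates in `RELEASED_VERSIONS` and the helper `resolve_version` are hypothetical.

```python
from datetime import date

# Hypothetical release dates, for illustration only; the real list is in
# IBM's release notes.
RELEASED_VERSIONS = [date(2016, 10, 20), date(2017, 10, 13)]

def resolve_version(requested, released=RELEASED_VERSIONS):
    """Return the latest released version dated on or before `requested`."""
    requested_date = date.fromisoformat(requested)
    candidates = [d for d in released if d <= requested_date]
    if not candidates:
        raise ValueError(f'no API version released on or before {requested}')
    return max(candidates).isoformat()

# With the hypothetical list above, a request for 2017-09-21 falls back
# to the earlier release.
print(resolve_version('2017-09-21'))  # 2016-10-20
```

This is why pinning a fixed date is safer than passing today's date: the resolved version only changes when you change the string.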

## Data handling

#### Data Collection
By default, all Watson services log requests and their results. **Logging is done only to improve the services for future users. The logged data is not shared or made public.**

To prevent IBM from accessing your data for general service improvements, set the ```X-Watson-Learning-Opt-Out``` header parameter to ```true``` when you create the service instance. (Any value other than false or 0 disables request logging.) You can set the header using the ```set_default_headers``` method of the service object.

In [8]:
personality_insights.set_default_headers({'x-watson-learning-opt-out': "true"})
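
The rule quoted above ("any value other than false or 0 disables request logging") can be written as a small predicate. This is a sketch of the stated rule only; the helper name `disables_logging` is hypothetical, and treating the value case-insensitively is an assumption.

```python
def disables_logging(header_value):
    """Return True if this X-Watson-Learning-Opt-Out value opts out of logging.

    Assumption: the comparison ignores case and surrounding whitespace.
    """
    return str(header_value).strip().lower() not in ('false', '0')

for value in ('true', 'false', '0', 'yes'):
    print(value, disables_logging(value))
```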

## Methods

#### Personality Profile

Generates a personality profile for the author of the input text. The service accepts a maximum of 20 MB of input content, but it requires much less text to produce an accurate profile; for more information, see "Providing sufficient input" in the IBM documentation. The service analyzes text in Arabic, English, Japanese, Korean, or Spanish and can return its results in a variety of languages. You can provide plain text, HTML, or JSON input by specifying the ```Content-Type``` parameter; the default is ```text/plain```. Request a JSON or comma-separated values (CSV) response by specifying the ```Accept``` parameter; CSV output includes a fixed number of columns and optional headers.

Per the JSON specification, the default character encoding for JSON content is effectively always UTF-8; per the HTTP specification, the default encoding for plain text and HTML is ISO-8859-1 (effectively, the ASCII character set). When specifying a content type of plain text or HTML, include the charset parameter to indicate the character encoding of the input text; for example: Content-Type: text/plain;charset=utf-8. For text/html, the service removes HTML tags and analyzes only the textual content.
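
As a minimal sketch of the charset advice above, the helper below encodes the text explicitly and builds a matching content type string. The name `prepare_content` is hypothetical, and reading "20 MB" as 20 × 1024 × 1024 bytes is an assumption.

```python
MAX_CONTENT_BYTES = 20 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes

def prepare_content(text, media_type='text/plain', charset='utf-8'):
    """Encode `text` and return (body_bytes, content_type_string)."""
    body = text.encode(charset)
    if len(body) > MAX_CONTENT_BYTES:
        raise ValueError('input exceeds the 20 MB limit')
    # Name the charset explicitly so plain text is not read as ISO-8859-1
    return body, f'{media_type};charset={charset}'

body, content_type = prepare_content('café')
print(content_type)  # text/plain;charset=utf-8
print(len(body))     # 5, because 'é' takes two bytes in UTF-8
```

The resulting string is the kind of value you would pass as ```content_type``` when requesting a profile.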

#### Request

```profile(self, content, content_type, accept=None, content_language=None, accept_language=None, raw_scores=None, csv_headers=None, consumption_preferences=None, **kwargs)```

**Input**:

| Name        | Description     | 
| ------------- |:-------------:| 
| content (str):     | A maximum of 20 MB of content to analyze, though the service requires much less text |



In [9]:
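import json
# `t` is the essay text defined in the "Student's Essay" cell further down;
# run that cell before this one.
profile = personality_insights.profile(content=t, content_type='text/plain').get_result()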

In [11]:
import pandas as pd
profile.keys()



# Function to Distill the Big5 Personality Traits into DF

In [36]:
def text_to_big5_personality_pd(profile):
    """Return the five top-level traits as a DataFrame of trait, name, percentile."""
    df = pd.DataFrame(profile['personality'])
    df['trait'] = 'Big5'
    # Watson labels the fifth trait 'Emotional range'; rename it with .loc
    # to avoid the chained-assignment SettingWithCopyWarning.
    df.loc[df['name'] == 'Emotional range', 'name'] = 'Neuroticism'
    return df[['trait', 'name', 'percentile']]

#### example

In [37]:
text_to_big5_personality_pd(profile)


Unnamed: 0,trait,name,percentile
0,Big5,Openness,0.764243
1,Big5,Conscientiousness,0.978601
2,Big5,Extraversion,0.41833
3,Big5,Agreeableness,0.349961
4,Big5,Neuroticism,0.907286


# Functions to Distill "Openness" Sub-Categories Into a Pandas DF

In [40]:
def text_to_openness_sub_personality_pd(profile):
    df = pd.DataFrame(profile['personality'])
    # .iloc[0] takes the single matching row regardless of its index label
    df = pd.DataFrame(df.loc[df['name'] == 'Openness', 'children'].iloc[0])
    df['trait'] = 'Openness'
    return df[['trait','name', 'percentile']]

In [41]:
text_to_openness_sub_personality_pd(profile)

Unnamed: 0,trait,name,percentile
0,Openness,Adventurousness,0.961168
1,Openness,Artistic interests,0.075603
2,Openness,Emotionality,0.241334
3,Openness,Imagination,0.012605
4,Openness,Intellect,0.929353
5,Openness,Authority-challenging,0.844719


# Functions to Distill "Conscientiousness" Sub-Categories Into a Pandas DF

In [42]:
def text_to_conscientiousness_sub_personality_pd(profile):
    df = pd.DataFrame(profile['personality'])
    # .iloc[0] takes the single matching row regardless of its index label
    df = pd.DataFrame(df.loc[df['name'] == 'Conscientiousness', 'children'].iloc[0])
    df['trait'] = 'Conscientiousness'
    return df[['trait','name', 'percentile']]

In [43]:
text_to_conscientiousness_sub_personality_pd(profile)

Unnamed: 0,trait,name,percentile
0,Conscientiousness,Achievement striving,0.974528
1,Conscientiousness,Cautiousness,0.978012
2,Conscientiousness,Dutifulness,0.966628
3,Conscientiousness,Orderliness,0.117107
4,Conscientiousness,Self-discipline,0.884116
5,Conscientiousness,Self-efficacy,0.897544


# Functions to Distill "Extraversion" Sub-Categories Into a Pandas DF

In [44]:
def text_to_extraversion_sub_personality_pd(profile):
    df = pd.DataFrame(profile['personality'])
    # .iloc[0] takes the single matching row regardless of its index label
    df = pd.DataFrame(df.loc[df['name'] == 'Extraversion', 'children'].iloc[0])
    df['trait'] = 'Extraversion'
    return df[['trait','name', 'percentile']]

In [45]:
text_to_extraversion_sub_personality_pd(profile)

Unnamed: 0,trait,name,percentile
0,Extraversion,Activity level,0.996005
1,Extraversion,Assertiveness,0.995737
2,Extraversion,Cheerfulness,0.431066
3,Extraversion,Excitement-seeking,0.067695
4,Extraversion,Outgoing,0.899447
5,Extraversion,Gregariousness,0.572762


# Functions to Distill "Agreeableness" Sub-Categories Into a Pandas DF

In [46]:
def text_to_agreeableness_sub_personality_pd(profile):
    df = pd.DataFrame(profile['personality'])
    # .iloc[0] takes the single matching row regardless of its index label
    df = pd.DataFrame(df.loc[df['name'] == 'Agreeableness', 'children'].iloc[0])
    df['trait'] = 'Agreeableness'
    return df[['trait','name', 'percentile']]

In [47]:
text_to_agreeableness_sub_personality_pd(profile)

Unnamed: 0,trait,name,percentile
0,Agreeableness,Altruism,0.903803
1,Agreeableness,Cooperation,0.919365
2,Agreeableness,Modesty,0.324203
3,Agreeableness,Uncompromising,0.864552
4,Agreeableness,Sympathy,0.93729
5,Agreeableness,Trust,0.989904


# Functions to Distill "Emotional range/Neuroticism" Sub-Categories Into a Pandas DF

In [48]:
def text_to_neuroticism_sub_personality_pd(profile):
    df = pd.DataFrame(profile['personality'])
    # .iloc[0] takes the single matching row regardless of its index label
    df = pd.DataFrame(df.loc[df['name'] == 'Emotional range', 'children'].iloc[0])
    df['trait'] = 'Neuroticism'
    return df[['trait','name', 'percentile']]

In [49]:
text_to_neuroticism_sub_personality_pd(profile)

Unnamed: 0,trait,name,percentile
0,Neuroticism,Fiery,0.022531
1,Neuroticism,Prone to worry,0.059867
2,Neuroticism,Melancholy,0.185419
3,Neuroticism,Immoderation,0.112079
4,Neuroticism,Self-consciousness,0.035895
5,Neuroticism,Susceptible to stress,0.079114


# Function to Distill "Needs" Into DF

In [56]:
def profile_to_needs_df(profile):
    df = pd.DataFrame(profile['needs'])
    df['trait'] = 'Needs'
    return df[['trait','name', 'percentile']]

# Function to Distill "Values" Into DF

In [57]:
def profile_to_values_df(profile):
    df = pd.DataFrame(profile['values'])
    df['trait'] = 'Values'
    return df[['trait','name', 'percentile']]

# Function to Concatenate all Sub-Personality Traits Into One DF

In [60]:
def all_personality_info_to_df(text):
    profile = personality_insights.profile(content = text, content_type='text/plain').get_result()
    frames = [text_to_big5_personality_pd(profile),
              text_to_openness_sub_personality_pd(profile),
              text_to_conscientiousness_sub_personality_pd(profile),
              text_to_extraversion_sub_personality_pd(profile),
              text_to_agreeableness_sub_personality_pd(profile),
              text_to_neuroticism_sub_personality_pd(profile),
              profile_to_needs_df(profile),
              profile_to_values_df(profile)
             ]
    return pd.concat(frames, ignore_index = True)
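
The per-trait functions above all follow the same pattern, so the whole profile can also be flattened in a single pass. The sketch below is an alternative, not the notebook's method: it uses plain Python and a hypothetical helper `flatten_profile`. The `sample_profile` is a heavily truncated stand-in whose trait names and percentiles are copied from the outputs above; the `needs` and `values` entries are placeholders, not real service output.

```python
def flatten_profile(profile):
    """Flatten a profile dict into rows of trait, name, percentile."""
    rename = {'Emotional range': 'Neuroticism'}  # same renaming as above
    traits = profile.get('personality', [])
    rows = []
    # Top-level Big Five traits first
    for trait in traits:
        name = rename.get(trait['name'], trait['name'])
        rows.append({'trait': 'Big5', 'name': name,
                     'percentile': trait['percentile']})
    # Then the facets ("children") of each trait
    for trait in traits:
        parent = rename.get(trait['name'], trait['name'])
        for facet in trait.get('children', []):
            rows.append({'trait': parent, 'name': facet['name'],
                         'percentile': facet['percentile']})
    # Finally needs and values
    for key, label in (('needs', 'Needs'), ('values', 'Values')):
        for item in profile.get(key, []):
            rows.append({'trait': label, 'name': item['name'],
                         'percentile': item['percentile']})
    return rows

# Truncated stand-in for a real response; needs/values are placeholders.
sample_profile = {
    'personality': [
        {'name': 'Openness', 'percentile': 0.764243,
         'children': [{'name': 'Adventurousness', 'percentile': 0.961168}]},
        {'name': 'Emotional range', 'percentile': 0.907286,
         'children': [{'name': 'Fiery', 'percentile': 0.022531}]},
    ],
    'needs': [{'name': 'placeholder need', 'percentile': 0.5}],
    'values': [{'name': 'placeholder value', 'percentile': 0.5}],
}

for row in flatten_profile(sample_profile):
    print(row)
```

Passing the returned list to `pd.DataFrame` should give the same three columns as the concatenated frame, without hard-coding one function per trait.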

# Sample Pipeline: MBA student's "failure" essay 

#### Student's Essay

In [58]:
t = """ In my 2nd year in university, my 2 study partners and I were all working for software companies. We frequently discussed ways to make quantum career leaps. One that fascinated us was starting our own company.

One day we came up with an idea that would increase sales for consumer goods retailers and simultaneously decrease monthly consumer expenses. Each day, we polished our idea together for a couple hours.

After 2 weeks, I decided to get outside feedback. I looked for people who had at least 10 years experience in consumer goods. Finally, I convinced a friend, to connect me with a board member of the 2nd largest consumer goods retailer in my country.

I presented our business model to the board member, and he instructed his right-hand to set us meetings with managers who could evaluate our plans. Over the next month, we went to one meeting after another. The responses varied from enthusiasm to skepticism. Each time, we improved our presentation according to the feedback.

Finally, I managed to set a meeting with the previous CEO of the largest consumer goods retailer. He concluded our meeting with: “Guys, in my opinion, it’s not going to work”.

I couldn’t say if it was the pressure from school and work or the CEO’s negative feedback, but since that meeting, I wasn’t able to motivate the team to go on. We consciously gave up.

2 years later, one of my teammates called out of the blue: “check out this link…it works!”. I think he expected me to feel disappointment. Actually, I felt pride – my first business attempt was viable after all."""

#### Get all personality info

In [63]:
all_personality_info_to_df(t)



Unnamed: 0,trait,name,percentile
0,Big5,Openness,0.764243
1,Big5,Conscientiousness,0.978601
2,Big5,Extraversion,0.41833
3,Big5,Agreeableness,0.349961
4,Big5,Neuroticism,0.907286
5,Openness,Adventurousness,0.961168
6,Openness,Artistic interests,0.075603
7,Openness,Emotionality,0.241334
8,Openness,Imagination,0.012605
9,Openness,Intellect,0.929353


In [64]:
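def df_to_d3(df):
    """Convert each row into {'key': row index, 'values': [{'label', 'value'}, ...]}."""
    data = []
    for key in df.index:
        # One label/value pair per column, read with a single .loc lookup
        values = [{'label': label, 'value': df.loc[key, label]} for label in df.columns]
        data.append({'values': values, 'key': key})
    return data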
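
For readers without pandas at hand, the same key/values shape can be produced from a plain list of row dictionaries and then serialized for a D3 front end. This is a sketch under the assumption that every value is a plain Python string or number; `records_to_d3` is a hypothetical helper, and the single sample row reuses a value from the output above.

```python
import json

def records_to_d3(records):
    """Turn a list of row dicts into the same key/values layout as df_to_d3."""
    return [
        {'values': [{'label': label, 'value': value}
                    for label, value in record.items()],
         'key': position}
        for position, record in enumerate(records)
    ]

records = [{'trait': 'Big5', 'name': 'Openness', 'percentile': 0.764243}]
payload = records_to_d3(records)

# json.dumps produces the text a browser-side script would load
print(json.dumps(payload, indent=1))
```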

In [65]:
df_to_d3(all_personality_info_to_df(t))



[{'values': [{'label': 'trait', 'value': 'Big5'},
   {'label': 'name', 'value': 'Openness'},
   {'label': 'percentile', 'value': 0.7642428287455869}],
  'key': 0},
 {'values': [{'label': 'trait', 'value': 'Big5'},
   {'label': 'name', 'value': 'Conscientiousness'},
   {'label': 'percentile', 'value': 0.9786011328046709}],
  'key': 1},
 {'values': [{'label': 'trait', 'value': 'Big5'},
   {'label': 'name', 'value': 'Extraversion'},
   {'label': 'percentile', 'value': 0.41832957186732167}],
  'key': 2},
 {'values': [{'label': 'trait', 'value': 'Big5'},
   {'label': 'name', 'value': 'Agreeableness'},
   {'label': 'percentile', 'value': 0.3499606189760974}],
  'key': 3},
 {'values': [{'label': 'trait', 'value': 'Big5'},
   {'label': 'name', 'value': 'Neuroticism'},
   {'label': 'percentile', 'value': 0.9072861854188683}],
  'key': 4},
 {'values': [{'label': 'trait', 'value': 'Openness'},
   {'label': 'name', 'value': 'Adventurousness'},
   {'label': 'percentile', 'value': 0.9611676782299261