# Sentence Quality Scorer

This is the third notebook in the series. It computes three metrics for each clause:
* The grade level
* The reading ease
* The quality of the clause, i.e. how much it contributes to the overall sentence

### Introduction to Grade Level and Reading Ease Scoring
Grade level and reading ease are computed using the Flesch-Kincaid readability tests described here: https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests
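
For orientation, the two Flesch-Kincaid formulas can be sketched directly from word, sentence, and syllable counts. This is a minimal illustration of the published formulas, not the code textacy runs internally; the function name `flesch_scores` and its parameters are assumptions made for this example, and the counts are supplied by hand rather than derived from text.

```python
def flesch_scores(n_words, n_sentences, n_syllables):
    """Return (reading_ease, grade_level) from raw counts (illustrative helper)."""
    words_per_sentence = n_words / n_sentences
    syllables_per_word = n_syllables / n_words
    # Flesch reading ease: higher means easier to read
    reading_ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Flesch-Kincaid grade level: higher means harder to read
    grade_level = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return round(reading_ease, 2), round(grade_level, 2)

# A clause made of a single one-syllable word, such as "good"
print(flesch_scores(n_words=1, n_sentences=1, n_syllables=1))  # (121.22, -3.4)
```

Very short clauses therefore sit at the extreme of both scales, which is why negative grade levels and reading-ease values above 100 are expected.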

In [1]:
!pip install textacy
!python3 -m spacy download en
import textacy
import pandas as pd
from textacy.text_stats import TextStats
print("Loaded")

Collecting https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)

    Linking successful
    /srv/conda/lib/python3.6/site-packages/en_core_web_sm -->
    /srv/conda/lib/python3.6/site-packages/spacy/data/en

    You can now load the model via spacy.load('en')

Loaded


The input for this notebook is "abstraction_scored.csv". Its list-valued columns are stored as strings and are parsed back into Python lists on load.
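
As a small illustration of parsing list-valued CSV columns, the sketch below uses `ast.literal_eval`, which accepts only Python literals and so avoids executing arbitrary code. The two sample rows are hypothetical and only mimic the shape of the real file; the standard `csv` module is used here so the example stays self-contained.

```python
import ast
import csv
import io

# Hypothetical two-row sample in the same shape as the real file
sample_csv = io.StringIO(
    'idx,clauses_text_final,abstraction_score\n'
    '344,"[\'rules\']","[0.12]"\n'
    '393,"[\'life for everyone\', \'would be frightening\']","[0.14, 0.14]"\n'
)

rows = []
for row in csv.DictReader(sample_csv):
    # Each list was written out as its string form; turn it back into a list
    row["clauses_text_final"] = ast.literal_eval(row["clauses_text_final"])
    row["abstraction_score"] = ast.literal_eval(row["abstraction_score"])
    rows.append(row)

print(rows[1]["clauses_text_final"])  # ['life for everyone', 'would be frightening']
```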

In [2]:
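import ast

df = pd.read_csv("./abstraction_scored.csv")
# List-valued columns are serialized as strings in the CSV; parse them back into lists.
# ast.literal_eval only accepts Python literals, unlike eval, which can execute arbitrary code.
for col in ["clauses_text_final", "voice", "abstraction_score"]:
    df[col] = df[col].apply(ast.literal_eval)
df.sample(frac = 1).head(10)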

Unnamed: 0,prompt,response,clauses_text_final,voice,idx,abstraction_score,abstraction_score_normalized
223,We could make the world a better place if,"Oh, God – better not to ask the question, and ...","[Oh God better not to ask the question and, ra...","[A_def, A_pron_x, A_def, A_def, A_def, P_bevb_x]",223,"[0.14, 0.14, 0.14, 0.25, 0.14, 0.25]","[0.56, 0.56, 0.56, 1.0, 0.56, 1.0]"
393,If I were in charge,life for everyone would be frightening!,"[life for everyone, would be frightening]","[Undefined, P_bevb_x]",393,"[0.14, 0.14]","[0.56, 0.56]"
131,What I like to do best is,"To Train, give the direction and give support ...","[To Train give the direction and, To Train giv...","[A_def, A_def, A_def, P_bevb_x, A_def, P_bevb_x]",131,"[0.25, 0.25, 0.14, 0.14, 0.14, 0.25]","[1.0, 1.0, 0.56, 0.56, 0.56, 1.0]"
3,When people are helpless,They often don&#039;t know it so they flak aro...,"[They often don, t know it, so they flak aroun...","[P_bevb_x, A_def, A_def, A_def]",3,"[0.14, 0.14, 0.14]","[0.56, 0.56, 0.56]"
349,What gets me into trouble is,"going through the black gates, but I don't do ...","[going through the black gates but, I don t, d...","[A_def, P_bevb_x, P_bevb_x]",349,"[0.25, 0.22, 0.25]","[1.0, 0.88, 1.0]"
115,"These days, work",is a Devine expression of the universe moving ...,"[a Devine expression of the universe, moving t...","[Undefined, A_def, P_bevb_x, A_def]",115,"[0.14, 0.14, 0.22, 0.14]","[0.56, 0.56, 0.88, 0.56]"
360,Sometimes I wish that,I didn't have my sisters...I've got 2,"[I didn t, have my sisters, I ve got 2]","[P_bevb_x, A_pron_x, A_def]",360,"[0.22, 0.14, 0.22]","[0.88, 0.56, 0.88]"
133,When people are helpless,I am inspired to reach out and be of service.,"[I am inspired to reach out and, be of service]","[P_bevb_x, A_def, P_bevb_x]",133,"[0.25, 0.25]","[1.0, 1.0]"
344,Rules,rules,[rules],[Undefined],344,[0.12],[0.48]
154,When they avoided me,hmmm... I\&#039;m stumped. I have no idea how ...,"[I have no idea, how to respond to this, Altho...","[P_bevb_x, A_def, P_bevb_x]",154,"[0.22, 0.14, 0.14]","[0.88, 0.56, 0.56]"


The raw reading ease and grade level values are computed for each clause using Textacy's TextStats.

In [3]:
def score_readability(text):
    doc = textacy.Doc(text, lang = "en")
    ts = TextStats(doc)
    return ts.readability_stats

# Compute the full set of readability metrics for every clause, then keep only the two we need
df['readability_attributes_score'] = df.clauses_text_final.apply(lambda arr: [score_readability(x) for x in arr])
df['grading_level'] = df.readability_attributes_score.apply(lambda dct_arr: [round(dct['flesch_kincaid_grade_level'], 2) for dct in dct_arr])
df['reading_ease'] = df.readability_attributes_score.apply(lambda dct_arr: [round(dct['flesch_reading_ease'], 2) for dct in dct_arr])
# Sample value of readability_attributes_score for a row with a single clause:
# [{'flesch_kincaid_grade_level': 0.6257142857142846, 'flesch_reading_ease': 103.04428571428573, 'smog_index': 3.1291, 'gunning_fog_index': 2.8000000000000003, 'coleman_liau_index': 2.6518669999999993, 'automated_readability_index': 0.23714285714285666, 'lix': 7.0, 'gulpease_index': 93.28571428571428, 'wiener_sachtextformel': -2.5074571428571426}]
del df['readability_attributes_score']
df.sample(frac = 1).head(10)

Unnamed: 0,prompt,response,clauses_text_final,voice,idx,abstraction_score,abstraction_score_normalized,grading_level,reading_ease
359,My father,goes to work,"[goes, to work]","[A_def, A_def]",359,"[0.12, 0.12]","[0.48, 0.48]","[-3.4, -3.01]","[121.22, 120.21]"
399,A teacher has the right to,I don't mean to be oppositional. . but this se...,"[I don, t mean to be oppositional but, this se...","[P_bevb_x, P_bevb_x, P_bevb_x, A_def, P_bevb_x...",399,"[0.22, 0.14, 0.25, 0.14, 0.25, 0.25, 0.22, 0.2...","[0.88, 0.56, 1.0, 0.56, 1.0, 1.0, 0.88, 0.88, ...","[-3.01, 6.42, 0.52, 0.72, 7.37, 2.88, 1.31, -1...","[120.21, 59.75, 102.05, 97.03, 54.7, 83.32, 90..."
201,What I like to do best is,"being in flow with whatever I do, a state of b...","[being in flow with, whatever I do a state of ...","[P_bevb_x, P_bevb_x, P_bevb_x, A_pron_x, A_def]",201,"[0.14, 0.25, 0.25, 0.14, 0.25]","[0.56, 1.0, 1.0, 0.56, 1.0]","[0.72, 6.73, 2.48, 0.72, 7.6]","[97.03, 71.77, 87.95, 97.03, 49.48]"
33,If I were in charge,Of anything right now it would be a challenge ...,"[Of anything right now, it would be a challeng...","[Undefined, P_bevb_x, P_bevb_x, P_bevb_x, A_de...",33,"[0.14, 0.25, 0.25, 0.25, 0.22]","[0.56, 1.0, 1.0, 1.0, 0.88]","[0.72, 0.52, 3.1, 5.68, 7.6]","[97.03, 100.24, 96.02, 66.79, 49.48]"
535,My co-workers and I,are creating a better and fulfilling potentia...,[creating a better and fulfilling potential fo...,"[A_def, A_def, A_pron_x, A_def, A_pron_x, P_be...",535,"[0.25, 0.25, 0.14, 0.11, 0.14, 0.14, 0.14, 0.1...","[1.0, 1.0, 0.56, 0.44, 0.56, 0.56, 0.56, 0.56,...","[12.69, 6.42, 2.28, -3.4, 7.6, 0.72, 8.37, 0.7...","[25.46, 59.75, 92.97, 121.22, 49.48, 97.03, 52..."
470,My main problem is,saying no... because I want to be helpful... b...,"[saying no, because I want to be helpful but, ...","[P_yn, P_bevb_x, P_bevb_x, P_bevb_x, P_get_x, ...",470,"[0.14, 0.25, 0.22, 0.14, 0.14, 0.14]","[0.56, 1.0, 0.88, 0.56, 0.56, 0.56]","[2.89, 2.31, 0.72, 13.11, -2.23, 0.52]","[77.91, 90.96, 97.03, 6.39, 118.18, 102.05]"
405,The thing I like about myself is,"My ability of open mindedness, realizing that ...","[My ability of open mindedness, that we are al...","[Undefined, A_pron_x, A_def, A_def, A_def, A_d...",405,"[0.14, 0.25, 0.14, 0.14, 0.25, 0.25, 0.14]","[0.56, 1.0, 0.56, 0.56, 1.0, 1.0, 0.56]","[7.6, 5.82, 5.25, 3.76, 6.42, 5.25, 0.52]","[49.48, 76.5, 62.79, 82.39, 59.75, 62.79, 100.24]"
293,I just can\'t stand people who,bully me,[who bully me],[A_def],293,[0.14],[0.56],[1.31],[90.99]
231,If my mother,I feel a pull to look back at my own upbringin...,[I feel a pull and my mother s inevitable infl...,"[A_pron_x, A_pron_x, P_bevb_x, P_bevb_x, A_def...",231,"[0.25, 0.25, 0.22, 0.14, 0.14, 0.25, 0.14, 0.1...","[1.0, 1.0, 0.88, 0.56, 0.56, 1.0, 0.56, 0.56, ...","[7.63, 2.31, 0.52, 3.67, 6.62, 3.76, -1.45, 1....","[63.49, 90.96, 100.24, 75.88, 54.73, 82.39, 11..."
327,When I get mad,I sometimes use big voice then I go somewhere ...,"[I sometimes use big voice, then I go somewher...","[A_def, P_get_x, P_bevb_x]",327,"[0.25, 0.22, 0.25]","[1.0, 0.88, 1.0]","[0.52, 2.88, 0.52]","[100.24, 83.32, 102.05]"


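The raw metrics are min-max normalized to a 0-1 scale, using the minimum and maximum over all clauses in the dataset. Reading ease runs in the opposite direction to the quality of the sentence: the easier a clause is to read, the lower its quality score should be. The reading-ease values are therefore negated before normalization, so the easiest clause maps to 0 and the hardest to 1.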
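
A minimal sketch of min-max normalization with optional negation is shown below. The helper `min_max_normalize` and its `invert` parameter are names made up for this illustration, and the three input values are toy data; in the notebook the minimum and maximum are taken over every clause in the dataset, not over a single list.

```python
def min_max_normalize(values, invert=False):
    """Scale values to [0, 1]; with invert=True the largest input maps to 0."""
    if invert:
        # Negating flips the ordering, so high reading ease ends up near 0
        values = [-v for v in values]
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid dividing by zero
        return [0.0 for _ in values]
    return [round((v - lo) / (hi - lo), 2) for v in values]

reading_ease = [121.22, 62.79, 15.64]  # toy values
print(min_max_normalize(reading_ease, invert=True))  # [0.0, 0.55, 1.0]
```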

In [4]:
def normalize(row, x_max, x_min, reverse_arr = False):
    # Min-max scale each value in row to the 0-1 range.
    # With reverse_arr = True the values are negated first; x_max and x_min
    # must then be the max and min of the negated values.
    sign = -1 if reverse_arr else 1
    return [round((sign*x - x_min)/(x_max - x_min), 2) for x in row]

# Reading ease: flatten across all rows and negate, so that easier text scores lower
reading_ease = [-1*x for arr in df['reading_ease'] for x in arr]
x_max, x_min = max(reading_ease), min(reading_ease)
df['reading_ease_normalized'] = df['reading_ease'].apply(lambda arr : normalize(arr, x_max, x_min, reverse_arr = True))

# Grade level: flatten across all rows and normalize directly
grading_levels = [x for arr in df['grading_level'] for x in arr]
x_max, x_min = max(grading_levels), min(grading_levels)
df['grading_level_normalized'] = df['grading_level'].apply(lambda arr : normalize(arr, x_max, x_min, reverse_arr = False))
df.sample(frac = 1).head(10)

Unnamed: 0,prompt,response,clauses_text_final,voice,idx,abstraction_score,abstraction_score_normalized,grading_level,reading_ease,reading_ease_normalized,grading_level_normalized
286,I am,Lucy,[Lucy],[Undefined],286,[0.25],[1.0],[-3.4],[121.22],[0.0],[0.0]
37,Technology,might one day be the next step of evolution. I...,[might one day be the next step of evolution I...,"[P_bevb_x, P_bevb_x, P_bevb_x, P_bevb_x]",37,"[0.22, 0.25, 0.14]","[0.88, 1.0, 0.56]","[9.13, 12.69, 4.45]","[61.67, 25.46, 73.85]","[0.18, 0.28, 0.14]","[0.27, 0.34, 0.17]"
256,When I am criticized,I find myself going inward to first examine my...,"[I find, myself going inward, to first examine...","[A_def, A_def, A_pron_x, A_def, P_bevb_x, A_de...",256,"[0.22, 0.14, 0.25, 0.25, 0.25, 0.25, 0.25, 0.2...","[0.88, 0.56, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.8...","[-3.01, 9.18, 6.78, 2.88, 2.47, 0.52, -2.23, 4...","[120.21, 34.59, 73.17, 83.32, 95.17, 102.05, 1...","[0.0, 0.26, 0.14, 0.11, 0.08, 0.06, 0.01, 0.13...","[0.01, 0.27, 0.22, 0.13, 0.12, 0.08, 0.02, 0.1..."
106,Change is,good.,[good],[Undefined],106,[0.12],[0.48],[-3.4],[121.22],[0.0],[0.0]
468,The past,is best left in the past... Tomorrow presents ...,"[is, best left in the past Tomorrow, presents ...","[P_bevb_x, A_def, A_def]",468,"[0, 0.14, 0.25]","[0.0, 0.56, 1.0]","[-3.4, 2.48, 12.32]","[121.22, 87.95, 15.64]","[0.0, 0.1, 0.31]","[0.0, 0.12, 0.33]"
62,Being with other people,can be rewarding.,[can be rewarding],[P_bevb_x],62,[0.25],[1.0],[5.25],[62.79],[0.17],[0.18]
328,Rules,no running on the concrete.,[no running on the concrete],[A_def],328,[0.25],[1.0],[2.88],[83.32],[0.11],[0.13]
305,When I am nervous,"I would get out of bed, or house",[I would get out of bed or house],[P_bevb_x],305,[0.22],[0.88],[-0.67],[114.12],[0.02],[0.06]
204,If I had more money,I have big visions what I could do with that i...,"[I have big visions, what I could do with that...","[P_bevb_x, P_bevb_x, A_def, P_bevb_x, P_bevb_x...",204,"[0.22, 0.25, 0.22, 0.14, 0.25, 0.25, 0.12, 0.25]","[0.88, 1.0, 0.88, 0.56, 1.0, 1.0, 0.48, 1.0]","[0.72, 6.73, 3.67, 8.79, 3.76, 5.82, 8.79, 0.52]","[97.03, 69.99, 75.88, 35.61, 82.39, 76.5, 35.6...","[0.07, 0.15, 0.13, 0.25, 0.11, 0.13, 0.25, 0.06]","[0.09, 0.21, 0.15, 0.26, 0.15, 0.2, 0.26, 0.08]"
325,When they didn't let me join in,I left them alone.,"[I, left them alone]","[Undefined, A_def]",325,"[0.22, 0.14]","[0.88, 0.56]","[-3.4, -2.62]","[121.22, 119.19]","[0.0, 0.01]","[0.0, 0.02]"


In [5]:
# Sanity check: both normalized metrics should span exactly 0.0 to 1.0
reading_ease = df['reading_ease_normalized'].tolist()
reading_ease = [j for i in reading_ease for j in i]
grade = df['grading_level_normalized'].tolist()
grade = [j for i in grade for j in i]
print(max(reading_ease), min(reading_ease), max(grade), min(grade))
df[["prompt", "response", "clauses_text_final", "voice", "idx", "abstraction_score_normalized", "reading_ease_normalized", "grading_level_normalized"]].to_csv("readability_scored.csv", index = False)

1.0 0.0 1.0 0.0


### Introduction to Computing the Clause's Overall Quality
This part determines how much each clause contributes to the overall intent of the sentence. To do this, we extract key terms from the original sentence (prompt plus response) using SGRank, an unsupervised keyword extraction technique (described here: http://www.aclweb.org/anthology/S15-1013). Key terms are n-grams rather than only single tokens, because an n-gram usually carries more meaning than an individual token. Each clause that contains an n-gram is credited with that n-gram's SGRank score. The quality metric for a clause is then the mean score of the n-grams it contains: sum(SGRank scores of matching n-grams) / number of matching n-grams. A clause with no matching n-gram scores 0.

This output is stored in "keyterm_scored.csv", which will be used to evaluate the final scores and voices.
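
The per-clause averaging can be sketched without textacy, given a table of key-term scores. The function name `clause_importance`, the clauses, and the scores below are all hypothetical; real scores would come from SGRank. Matching is by plain substring, as in the notebook.

```python
def clause_importance(clauses, keyterm_scores):
    """Mean score of the key terms that occur in each clause (0.0 if none occur)."""
    scores = []
    for clause in clauses:
        # Collect the score of every key term that appears in this clause
        matched = [score for term, score in keyterm_scores.items() if term in clause]
        scores.append(round(sum(matched) / len(matched), 2) if matched else 0.0)
    return scores

# Hypothetical key-term scores and clauses, for illustration only
keyterm_scores = {"big voice": 0.6, "voice": 0.2, "quiet": 0.3}
clauses = ["I use a big voice", "then I go somewhere quiet", "to calm down"]
print(clause_importance(clauses, keyterm_scores))  # [0.4, 0.3, 0.0]
```

Note that overlapping n-grams ("big voice" and "voice") both count toward the same clause, so a clause is not rewarded simply for containing many key terms; the mean keeps the score on the same scale as the individual SGRank values.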

In [6]:
from textacy.keyterms import sgrank
df['nlp_doc'] = df.apply(lambda row : textacy.Doc(row['prompt'] + " " + row['response'], lang = "en"), axis = 1)
df['sgrank'] = df['nlp_doc'].apply(lambda doc : sgrank(doc, n_keyterms = len(doc)))

def get_normalized_importance(row):
    # Mean SGRank score of the key terms found in each clause of one row
    clauses = row["clauses_text_final"]
    rank_tuples = dict(row['sgrank'])  # {key term: SGRank score}
    op = []
    for clause in clauses:
        # Key terms are matched as plain substrings of the clause text
        matched = [score for term, score in rank_tuples.items() if term in clause]
        op.append(round(sum(matched) / len(matched), 2) if matched else 0.0)
    return op

df["sgrank_normalized"] = df.apply(get_normalized_importance, axis = 1)
df[["prompt", "response", "clauses_text_final", "voice", "idx", "abstraction_score_normalized", "reading_ease_normalized", "grading_level_normalized", "sgrank_normalized"]].to_csv("keyterm_scored.csv", index = False)