# Automated Essay Evaluation
# SMU, Data Science, Capstone
## Chris Roche, Nathan Deinlein, Darryl Dawkins

To add a model/criteria you need to do three main things:
    
1. Add the call to your model inside the run_criteriaN function
2. Update the output of your run_criteriaN function to be:
   * a string to be displayed to the student
   * a bool for whether the student needs help in this area
3. Add resources for help to the recommender csv file
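
The first two steps above can be sketched as a template. The function name `run_criteria4`, the paragraph-count check, and the `min_paragraphs` threshold are hypothetical placeholders and not part of this notebook; only the `(string, bool)` return shape comes from the steps listed here.

```python
def run_criteria4(essay, min_paragraphs=3):
    """Hypothetical criterion: checks that the essay has enough paragraphs.

    Returns the two values every run_criteriaN function should produce:
    a feedback string for the student and a bool for the recommender.
    """
    # Assumption: paragraphs are separated by blank lines
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    needsHelp = len(paragraphs) < min_paragraphs

    if needsHelp:
        criteria4OutputStr = (
            f"Your essay has {len(paragraphs)} paragraph(s); "
            f"aim for at least {min_paragraphs}."
        )
    else:
        criteria4OutputStr = "Your paragraph structure looks good!"

    return criteria4OutputStr, needsHelp
```

The caller can then unpack the result the same way it does for the existing criteria, for example `criteria4OutputStr, needHelp[3] = run_criteria4(essay)`, once `needHelp` and `runCriteria` have a fourth slot.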


TODO:
1. The run_recommender function has a string with a link for each criterion. Right now these are just placeholders. Have the recommender engine populate those strings dynamically (criteria1Link, criteria2Link, criteria3Link).
2. Add more criteria as necessary. About 10 places need to be updated, but it's pretty straightforward. It'll probably take me about 5 minutes. I'll add more as needed.
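
For TODO item 1, here is one possible sketch of picking a link dynamically. The comments in run_recommender mention a multi-armed bandit; epsilon-greedy is used below only as one simple bandit strategy, not necessarily the one intended. The CSV columns (`criteria`, `url`, `times_shown`, `total_score`), the example.com URLs, and the `pick_link` helper are all assumptions made for illustration.

```python
import csv
import io
import random

# Hypothetical CSV layout; the real recommender csv file may use other columns
SAMPLE_CSV = """criteria,url,times_shown,total_score
Vocabulary,https://example.com/vocab-a,10,32
Vocabulary,https://example.com/vocab-b,4,18
Content,https://example.com/content-a,7,21
"""


def pick_link(rows, criteria, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: usually the best-rated link, sometimes a random one."""
    candidates = [r for r in rows if r["criteria"] == criteria]
    if not candidates:
        return ""

    # Explore: with probability epsilon, try a random resource
    if rng.random() < epsilon:
        return rng.choice(candidates)["url"]

    def avg_score(row):
        shown = int(row["times_shown"])
        return int(row["total_score"]) / shown if shown else 0.0

    # Exploit: otherwise return the resource with the best average student score
    return max(candidates, key=avg_score)["url"]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
criteria1Link = pick_link(rows, "Vocabulary", epsilon=0.0)
```

With `epsilon=0.0` the choice is purely greedy, which makes the result repeatable for testing; a small positive value lets newly added resources get shown and scored.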

NOTE: when you build the UI, it will display below the last cell, but I recommend clicking the 127.0.0.1 link it creates and running it in a browser window.

In [28]:
# Import libraries
import nltk
nltk.download('punkt')
nltk.download('stopwords')
import gradio as gr
import spacy
from statistics import mean, median, mode
from TRUNAJOD import surface_proxies
import TRUNAJOD.ttr
import pytextrank


[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\OaklandHillsMansion\AppData\Roaming\nltk_data
[nltk_data]     ...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\OaklandHillsMansion\AppData\Roaming\nltk_data
[nltk_data]     ...
[nltk_data]   Package stopwords is already up-to-date!


### Functions for Essay Extracting

In [29]:
def get_sent_bounds(doc):
    # Get phrases, vectorize and get sent bounds
    limit_phrases = 4

    phrase_id = 0
    sent_bounds = [ [s.start, s.end, set([])] for s in doc.sents ]

    # Loop through each phrase from the document
    for p in doc._.phrases:
        # ic(phrase_id, p.text, p.rank)

        # Find every sentence the chunk is a part of
        # Loop through each phrase chunk
        for chunk in p.chunks:
            # ic(chunk.start, chunk.end)

            # Loop through all sentences in sent_bounds
            for sent_start, sent_end, sent_vector in sent_bounds:
                # Check if chunk is in the sentence
                if chunk.start >= sent_start and chunk.end <= sent_end:
                    # ic(sent_start, chunk.start, chunk.end, sent_end)

                    # Add phrase_id to sent_vector from sent_bounds
                    sent_vector.add(phrase_id)
                    break

        phrase_id += 1

        if phrase_id == limit_phrases:
            break
    
    return sent_bounds

def get_unit_vector(key_doc):

    # Get phrases, vectorize and get sent bounds
    limit_phrases = 4

    phrase_id = 0
    unit_vector = []

    # Loop through each phrase from the document
    for p in key_doc._.phrases:
        # ic(phrase_id, p.text, p.rank)

        # Add rank to unit_vector list
        unit_vector.append(p.rank)

        phrase_id += 1

        if phrase_id == limit_phrases:
            break

    return unit_vector

def normalize_unit_vector(unit_vector):
    # Sum unit vectors for normalization
    sum_ranks = sum(unit_vector)

    # Normalize unit vector
    unit_vector = [ rank/sum_ranks for rank in unit_vector ]

    return unit_vector


def sent_uv_rank(unit_vector, sent_bounds):
    # Ranking each sentence based on how similar they are,
    # in relation to each unit vector, using sum of squares
    from math import sqrt

    sent_rank = {}
    sent_id = 0

    # Loop through sent_bound list
    for sent_start, sent_end, sent_vector in sent_bounds:
        # ic(sent_vector)
        sum_sq = 0.0
        # ic

        # Loop through each phrase in the key vector and
        # compare it to the sentence
        for phrase_id in range(len(unit_vector)):
            # ic(phrase_id, key_unit_vector[phrase_id])

            # If phrase_id is NOT in the sentence, add the
            # squared weight of that phrase to sum_sq
            if phrase_id not in sent_vector:
                sum_sq += unit_vector[phrase_id]**2.0

        # Get the square root of the sum of squares
        sent_rank[sent_id] = sqrt(sum_sq)
        sent_id += 1

    return sent_rank


def get_top_ranks(doc, sent_rank):
    from operator import itemgetter
    # sent_rank is sorted by rank (ascending) where it is iterated below;
    # sorted() returns a new list, so a bare call here would have no effect

    # limit for the number of top sentences to collect
    limit_sentences = 5

    sent_text = {}
    sent_id = 0
    top_5_ranks = []
    top_5_word_count = []
    # top_5_avg_sent_len = []
    top_5_lex_div = []

    # Create id for each sentence from the document
    for sent in doc.sents:
        sent_text[sent_id] = sent
        sent_id += 1

    num_sent = 0

    # Loop through sorted sent_rank list
    for sent_id, rank in sorted(sent_rank.items(), key=itemgetter(1)):
        # ic(sent_id, sent_text[sent_id])
        num_sent += 1
        top_5_ranks.append(rank)
        
        top_5_word_count.append(surface_proxies.word_count(sent_text[sent_id]))
        # top_5_avg_sent_len.append(surface_proxies.average_sentence_length(doc))
        top_5_lex_div.append(TRUNAJOD.ttr.lexical_diversity_mtld(doc))

        if num_sent == limit_sentences:
            break

    # print(top_5_ranks)
    # min_sent = sent_text[limit_sentences]
    # max_sent = sent_text[0]
    rank_avg = mean(top_5_ranks)
    rank_med = median(top_5_ranks)
    rank_mode = mode(top_5_ranks)

    mean_word_count = mean(top_5_word_count)
    # avg_sent_len = median(top_5_avg_sent_len)
    mean_lex_div = mean(top_5_lex_div)

    return rank_avg, rank_med, rank_mode, mean_word_count, mean_lex_div
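
To make the ranking above easier to follow, here is a small worked example with toy numbers instead of a spaCy document. The phrase ranks and phrase-id sets are made up for illustration, and the `distance` helper is a hypothetical condensed version of what normalize_unit_vector and sent_uv_rank compute for a single sentence.

```python
from math import sqrt

# Toy stand-ins for the TextRank ranks of the top three prompt phrases
phrase_ranks = [0.25, 0.125, 0.125]
total = sum(phrase_ranks)
unit_vector = [r / total for r in phrase_ranks]  # [0.5, 0.25, 0.25]

# Each sentence is represented by the set of phrase ids it contains
sentence_phrase_ids = [{0, 1, 2}, {0}, set()]


def distance(unit_vector, phrase_ids):
    """Square root of the summed squared weights of the phrases a sentence is missing."""
    return sqrt(sum(w ** 2 for i, w in enumerate(unit_vector) if i not in phrase_ids))


scores = [distance(unit_vector, ids) for ids in sentence_phrase_ids]
```

A sentence containing every key phrase scores 0.0, and the score grows as more (or more heavily weighted) phrases are missing, so sorting in ascending order puts the sentences that cover the most key phrases first.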


In [30]:
@spacy.registry.misc("articles_scrubber")
def articles_scrubber():
    def scrubber_func(span: spacy.tokens.Span) -> str:
        for token in span:
            if token.pos_ not in ["DET", "PRON", "ADJ"]:
                break
            
            span = span[1:]
        return span.lemma_
    return scrubber_func

### Criteria 1

In [31]:
########################################################
# Lexical Diversity MTLD Criteria1
########################################################

def run_criteria1(essay, doc):
    needsHelp = False
    
    lexical_score = TRUNAJOD.ttr.lexical_diversity_mtld(doc)
    
    # Calculated in the EDA python notebook
    modelMedianDiversity = 0.5481
    modelMedianTotalWords = 184
    modelMedianUniqueWords = 101
    
    allWords = nltk.tokenize.word_tokenize(essay)
    # Keep alphabetic tokens only, lowercased
    allWords = [w.lower() for w in allWords if w.isalpha()]
    
    # Get basic statistics about the essay
    totalWords = len(allWords)
    vocabWords = len(set(allWords))
    # Guard against essays with no alphabetic tokens
    diversity = vocabWords / totalWords if totalWords else 0.0
    
    # If below average, recommend help
    if diversity < modelMedianDiversity:
        needsHelp = True
    
        # Two most common words:
        stopwords = nltk.corpus.stopwords.words('english')
        allWordExceptStopDist = nltk.FreqDist(w.lower() for w in allWords if w not in stopwords)
        mostCommon= allWordExceptStopDist.most_common(2)
        # Source: https://stackoverflow.com/questions/28392860/
        #         print-10-most-frequently-occurring-words-of-a-text-that-including-and-excluding
        
        thesaurusesStr = f"""I recommend you focus on expanding your vocabulary. For example, your two most common words are '{mostCommon[0][0]}' and '{mostCommon[1][0]}'. Try using alternatives from a thesaurus. """

    
    # criteria1OutputStr = f"""Your essay has {totalWords} total words and {vocabWords} unique words, for a Diversity of {str(round(diversity*100, 2))}%. """ 
    criteria1OutputStr = f"""{thesaurusesStr if needsHelp else "Your vocabulary is in good shape! Keep up the good work!"}""" 
        
    return criteria1OutputStr, needsHelp
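
The diversity check in run_criteria1 is a type-token ratio (unique words divided by total words). Here is a minimal standalone sketch of that calculation. It uses a regular expression instead of nltk's word_tokenize so it runs without any downloads, which is an assumption made for the example and may tokenize slightly differently; `type_token_ratio` is a hypothetical helper name.

```python
import re
from collections import Counter


def type_token_ratio(text):
    """Unique words divided by total words, ignoring case and non-letters."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0, Counter()
    return len(set(words)) / len(words), Counter(words)


ratio, counts = type_token_ratio("The cat saw the other cat.")

# Same comparison as run_criteria1, using the median from the EDA notebook
modelMedianDiversity = 0.5481
needsHelp = ratio < modelMedianDiversity
```

Here there are 6 words and 4 unique words, so the ratio is 4/6 and the sentence would not be flagged. Note that this ratio tends to fall as a text gets longer, so short and long essays are not directly comparable against a single threshold.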

### Criteria 2

In [32]:
########################################################
# Extractive Summarization Criteria2
########################################################

def run_criteria2(doc, key_doc):
    needsHelp = False

    if not key_doc:
        # Return the same (str, bool) pair as the normal path so the caller can unpack it
        criteria2OutputStr = "No prompt or key words were entered."
        return criteria2OutputStr, needsHelp

    sent_bounds = get_sent_bounds(doc)
    
    
    key_unit_vector = get_unit_vector(key_doc)
    key_unit_vector = normalize_unit_vector(key_unit_vector)
    key_sent_rank = sent_uv_rank(key_unit_vector, sent_bounds)
  
    key_rank_mean, key_rank_med, key_rank_mode, key_mean_word_count, key_mean_lex_div = get_top_ranks(doc, key_sent_rank)

    if key_rank_mean > .35:
        criteria2OutputStr = f"""Your essay appears to follow the topic well."""
    
    else:
        # Flag off-topic essays so the recommender can suggest a resource
        needsHelp = True
        criteria2OutputStr = f"""Your essay seems to be a little off topic."""

    return criteria2OutputStr, needsHelp



### Criteria 3

In [33]:
########################################################
# Word count Criteria3
########################################################

def run_criteria3(doc, word_count_req, sent_length_tuple):
    needsHelp = False
    
    num_sents = surface_proxies.sentence_count(doc)
    word_count = surface_proxies.word_count(doc)
    average_sentence_length = surface_proxies.average_sentence_length(doc)

    min_length = sent_length_tuple[0]
    max_length = sent_length_tuple[1]
    
    # Check word count (meeting the requirement exactly also counts)
    if word_count >= word_count_req:
        criteria3OutputStr_wc = "Word count meets the minimum requirement."
    
    else:
        needsHelp = True
        criteria3OutputStr_wc = f"Your word count is {word_count}, which is below the word count requirement of {word_count_req}."
    
    # Check sentence length
    if average_sentence_length > max_length:
        needsHelp = True
        criteria3OutputStr_sl = "Most of your sentences seem to be pretty long. Review your paper and check for run-on sentences."
    
    elif average_sentence_length < min_length:
        needsHelp = True
        criteria3OutputStr_sl = "Most of your sentences seem to be on the shorter side. Review your paper and check for fragmented sentences."

    else:
        criteria3OutputStr_sl = "Your sentence length looks good! Keep up the good work."
    
    criteria3OutputStr = criteria3OutputStr_wc + "\n\n" + criteria3OutputStr_sl


    return criteria3OutputStr, needsHelp

### Recommender

In [34]:
# Returns resources based on whether the criteria needHelp

def run_recommender(recommender_links, needHelp):
    
    # Initialize empty strings
    criteria1ResourceStr = ""
    criteria2ResourceStr = ""
    criteria3ResourceStr = ""
    
    ### Use the Multi-Armed Bandit and the csv file recommender_results.csv 
    #   to populate these url strings:
    #
    #
    #   ...
    #
    #
    # I'm populating it with placeholders here:
    output_url_string_1 = "https://tinyurl.com/46t3j9s6"
    output_url_string_2 = "https://www.khanacademy.org/humanities/grammar"
    output_url_string_3 = "https://www.khanacademy.org/humanities/grammar"

    if needHelp[0]:
        # If criteria1 needs help, make the string not empty
        criteria1ResourceStr = "Here's a resource to help expand your vocabulary: " + output_url_string_1
    
    if needHelp[1]:
        # If criteria2 needs help, make the string not empty
        criteria2ResourceStr = "Here's a resource to help you work on organization: " + output_url_string_2

    if needHelp[2]:
        # If criteria3 needs help, make the string not empty
        criteria3ResourceStr = "Here's a resource to help you work to improve content: " + output_url_string_3

    return criteria1ResourceStr, criteria2ResourceStr, criteria3ResourceStr

### Criteria Evaluation

In [35]:
# Just parses the checkboxes from the UI

def evaluate_criteria(criteria):
    runCriteria = [False,False,False]
    
    if 'Vocabulary' in criteria:
        runCriteria[0] = True
 
    if 'Organization' in criteria:
        runCriteria[1] = True
        
    if 'Content' in criteria:
        runCriteria[2] = True
    
    return runCriteria

### Main function

In [36]:
######################################################################################
# This is the function called when you click submit on the UI

def run_model_with_feedback(essay, criteria, recommender, essayType, essay_propmt, word_count, sent_min, sent_max):

    nlp = spacy.load("en_core_web_md")
    
    # add PyTextRank to the spaCy pipeline
    nlp.add_pipe("textrank", config={"scrubber": {"@misc": "articles_scrubber"}})

    ########################################################
    # Placeholders for outputs and variables 
    ##########################################################
    output_highlighted_list = []  # List of tuples. Refer to example below for format
    recommender_links = []        # append links to this
    
    # These get replaced with the results for the UI
    criteria1OutputStr = "Did not run evaluation on Vocabulary Diversity"
    criteria2OutputStr = "Did not run evaluation on Organization"
    criteria3OutputStr = "Did not run evaluation on Content"
    
    # Check the essay field wasn't left empty before running models
    # Return a warning plus three empty criteria results (one value per UI output)
    if not essay:
        return "Invalid/empty essay field, try again", "", "", ""

    if essay_propmt:
        key_doc = nlp(essay_propmt)
    
    else:
        key_doc = False
    
    # Set these to true if the NLP models say the student needs help
    # Then recommender will make a list of resources based on these
    needHelp=[False,False,False]
    
    # Whether the user asked us to evaluate the criteria
    runCriteria=evaluate_criteria(criteria)
    
    ########################################################
    # All processing gets done in the functions. This just gets the output
    # Shouldn't need to be updated
    ##########################################################

    # Created by processing a string of text with the nlp object
    doc = nlp(essay)

    # Make tuple for the sentence length requirements
    sent_length_tuple = (sent_min, sent_max)

    ## Criteria1:
    if runCriteria[0]:
        criteria1OutputStr, needHelp[0] = run_criteria1(essay, doc)
    
    ## Criteria2:
    if runCriteria[1]:
        criteria2OutputStr, needHelp[1] = run_criteria2(doc, key_doc)
    
    ## Criteria 3:
    if runCriteria[2]:
        criteria3OutputStr, needHelp[2] = run_criteria3(doc, word_count, sent_length_tuple)

    # ## Criteria 4:
    # if runCriteria[3]:
    #     criteria4OutputStr, needHelp[3] = run_criteria3(doc, sent_length_tuple)
    
    
    
    ########################################################
    # These lines are just for highlighter test
    # Force some characters to be highlighted in the output
    ##########################################################
    counter = 0
    for element in essay:
        counter = counter + 1
        if counter < 5:
            output_highlighted_list.append((element,"Vocabulary"))
        elif (counter < 20 and counter > 15):
            output_highlighted_list.append((element,"Organization"))
        elif (counter < 50 and counter > 40):
            output_highlighted_list.append((element,"Content"))
        else:
            output_highlighted_list.append((element, None))
    # End code that should be deleted when we have real output    
    
    # The output is a list of tuples, where the first is the character in the essay
    # and the second is which criteria to highlight it for
    # Example:
        #[('T', None),
        # ('h', None),
        # ('e', None),
        # (' ', None),
        # ('f', 'Criteria1'),
        # ('a', 'Criteria1'),
        # ('s', 'Criteria2'),
        # ('t', 'Criteria2')]
    
    
    
    
    ##########################################################
    # Recommender links:
    ##########################################################
    if recommender:
        criteria1Link, criteria2Link, criteria3Link = run_recommender(recommender_links, needHelp)
        # Put a single space between the feedback and the resource text
        criteria1OutputStr = f"{criteria1OutputStr.rstrip()} {criteria1Link}".strip()
        criteria2OutputStr = f"{criteria2OutputStr.rstrip()} {criteria2Link}".strip()
        criteria3OutputStr = f"{criteria3OutputStr.rstrip()} {criteria3Link}".strip()
    # Else: do nothing and don't append anything
    
    
    
    print("criteria2OutputStr: ", criteria1OutputStr, criteria2OutputStr, criteria3OutputStr)
    ##########################################################
    # Return the results
    ########################################################## 
    return (f"""Evaluated student submission on {" and ".join(criteria)} with recommender turned {"on" if recommender else "off"}""", 
            
            criteria1OutputStr,
            criteria2OutputStr,
            criteria3OutputStr)
            #todo: 
            #output_highlighted_list )
    
    
    
# End function
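
The highlighter test code in the function above only marks fixed character positions. As a possible starting point for real output, here is a sketch that builds the same list-of-tuples format (one `(character, label)` pair per character) by marking every occurrence of chosen words. The `highlight_words` helper and the sample sentence are hypothetical and not part of the notebook.

```python
import re


def highlight_words(essay, words, label):
    """Return [(char, label_or_None), ...] marking every occurrence of the given words."""
    labels = [None] * len(essay)
    for word in words:
        # \b keeps 'book' from matching inside 'bookshelf'
        pattern = rf"\b{re.escape(word)}\b"
        for match in re.finditer(pattern, essay, flags=re.IGNORECASE):
            for i in range(match.start(), match.end()):
                labels[i] = label
    return list(zip(essay, labels))


output_highlighted_list = highlight_words("The fast fox", ["fast"], "Vocabulary")
```

For example, passing the two most common words found in run_criteria1 with the label "Vocabulary" would highlight the repeated words the feedback string mentions.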


### Interface

In [37]:
######################################################################################
# This is the actual interface code
iface = gr.Interface(
    # this is the function call with the UI inputs serving as the arguments
    run_model_with_feedback,
    
    
    
    ##########################################################
    # Inputs to the User Interface
    ##########################################################
    [
        # First argument passed in is the essay, as a string
        gr.inputs.Textbox(lines=10, placeholder="Copy the body of your essay here...", default="", label="Student Essay:"),
        # Second set of options are which rubric/criteria to check
        gr.inputs.CheckboxGroup( 
                                ["Vocabulary", "Organization", "Content"], 
                                default=["Vocabulary", "Organization", "Content"],
                                label="Evaluate on which criteria?"),
        # An option to turn on/off the recommender engine
        gr.inputs.Checkbox(label="Recommend videos for improvement?", default=True),
        gr.inputs.Dropdown(["Persuasive/ Narrative/Expository", "Source Dependent Responses", "N/A"], label="Essay Type"),
        gr.inputs.Textbox(lines=10, placeholder="Paste prompt here...", default="", label="Prompt (or Keywords):"),
        gr.inputs.Number(label="word count", default=150),
        # Defaults were swapped (min 25, max 15), which made the "looks good" branch unreachable
        gr.inputs.Number(label="sentence length min threshold", default=15),
        gr.inputs.Number(label="sentence length max threshold", default=25),

    ],
    
    ##########################################################
    # these are the output components
    ##########################################################
    [
        gr.outputs.Textbox(type="str", label="Evaluation:"),
        
        gr.outputs.Textbox(type="str", label="Vocabulary Diversity Results:"),
        gr.outputs.Textbox(type="str", label="Organization Results:"),
        gr.outputs.Textbox(type="str", label="Content Results:"),
        # gr.outputs.Textbox(type="str", label=" Results:"),
        
        #todo: 
        #gr.outputs.HighlightedText(color_map={"Vocabulary": "green", "Organization": "pink", "Content": "blue"}, show_legend=True, label="Criteria highlighting:"),
    ],
    
    
    
    ##########################################################
    # examples the UI lets you select from .. these are optional
    ##########################################################
    examples=[
        ["How @CAPS4 you feel if your favorite book was taken off the shelves of your school or public library? I, along with many other students, @CAPS4 find this discouraging and distastrous, so I do not believe that censorship should affect books that are on the shelves. Otherwise, a demolished love of reading, crushed individuality, and separated population @MONTH1 be born.     Like the beloved @PERSON2 @PERSON2 series by @PERSON1, many books and series are being taken out of libraries' collections due to people in society finding them offensive. In this case, the world of witchcraft in which this story blooms is against some religious beliefs; therefore, some individuals within a religion campaign to have these books banned. Fortunately, none of the libraries I visit, with their eclectic collections, had banned this series, or I @CAPS4 not have the strong thirst for literature as I do now. All books have the potential to pull a student into the wonderful world of reading, like @PERSON2 did for me, so taking away books that are most likely to spark an interest or start a firework of creativity @CAPS4 not only affect this generation, but the futures of all.          If this censorhip was to be allowed, who is to say what all  could be censored? Who @CAPS4 be the final judge as to what books @CAPS4 be banned? It @CAPS4 all come down to power and who was willing enough to take it. This struggle to be on top has the possibility of seperating people apart like political parties. Disagreements could turn into debates, and those could turn into fights. It can be concluded that people are stubborn for their beliefs, and to have someone choose what everyone is allowed to believe @CAPS4  be wrong. For instance, it @CAPS4 be like an @CAPS1 forcing a @CAPS2 to not believe in @CAPS3; a vegetarian commanding that meat can no longer be eaten; a woman taking away men's voting rights. 
Censorship @CAPS4 lead to the disrespect of other's opinions, and disrespect is never a beneficial thing.     Each and every person has a different opinion on what is offensive or not, so to censor books @CAPS4 be to censor all individual mentality. Without each person's unique thoughts and beliefs, the world @CAPS4 become similiarly vapid and dull. Differences in beliefs is what adds variety to the population and what makes a person special; additionally, free thought is a right all people should have. If someone was to limit the mental, literary stimulants that are out in the world, the amount of creativity and individuality @CAPS4 decrease.     To conclude, censorship @CAPS4 be a disrespect to individuality, personal beliefs, and the overall joy of reading a good book. Just because one might not believe in what a story says, it does not mean that the piece of literature should be forbidden. No one is being forced to read the books that grace the hundreds of shelves in a library, so if someone is offended, simply do not read it. So how @CAPS4 you feel if your favorite book was gone from all libraries? Disrespected? That is how I @CAPS4 feel", 
           ["Vocabulary", "Organization", "Content"], True, "Persuasive/ Narrative/Expository", "Censorship in the Libraries. All of us can think of a book that we hope none of our children or any other children have taken off the shelf. But if I have the right to remove that book from the shelf -- that work I abhor -- then you also have exactly the same right and so does everyone else. And then we have no books left on the shelf for any of us. --Katherine Paterson, Author"],
    ],
    
    
    ##########################################################
    # Other settings for the UI
    ##########################################################
    allow_flagging="never",
    theme="default", #"default", "huggingface", "seafoam", "grass", "peach", "dark",
    title='Essay Evaluation and Feedback',
    
    description="This is an automated tool for student essay feedback. Unlike traditional Automated Essay Scoring \
    systems, this tool focuses on modularity and interpretability. The student inputs their essay, determines which \
    criteria to be graded on, and then receives instant feedback. Not only does the tool make a determination on \
    the selected criteria, it explains how it reached its conclusion and then it recommends resources the student \
    can use to improve. The student can then score the resources they were assigned, which allows the tool to \
    determine how useful the resources are and improve future recommendations. Currently, it only supports 10th \
    grade.",
    
    article="Authors: Chris Roche, Nathan Deinlein, Darryl Dawkins. \
             Developed for the Southern Methodist University, M.S. Data Science program. \
             SMU Data Science Review where this research is published: https://scholar.smu.edu/datasciencereview/all_issues.html \
             GitHub repository: https://github.com/cmroche1/DS_Capstone"
)



# Lastly, launch the application
# adding share=True makes a link you can share for 72hrs
iface.launch(share=True)



# Documentation with examples:
# https://www.gradio.app/docs/



##########################################################
### END
##########################################################



Running on local URL:  http://127.0.0.1:7865/
Running on public URL: https://48304.gradio.app

This share link expires in 72 hours. For free permanent hosting, check out Spaces (https://huggingface.co/spaces)


(<gradio.routes.App at 0x24cb6405f60>,
 'http://127.0.0.1:7865/',
 'https://48304.gradio.app')

Exception in callback None(<Task finishe...> result=None>)
handle: <Handle>
Traceback (most recent call last):
  File "c:\Python310\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
TypeError: 'NoneType' object is not callable


criteria2OutputStr:  I recommend you focus on expanding your vocabulary. For example, your two most common words are 'books' and 'beliefs'. Try using alternatives from a thesaurus. Here's a resource to help expand your vocabulary: https://tinyurl.com/46t3j9s6 Your essay appears to follow the topic well. Word count meets the minimum requirement.

Most of your sentences seem a to be pretty long, review your paper and check for run-on sentences.Here's a resource to help you work to improve content: https://www.khanacademy.org/humanities/grammar
