# Analysing rewrite question answers with edit distance

Because the rewrite answers varied widely in length and validity (some respondents submitted only token answers), I manually sampled four rewrites and measured the Levenshtein edit distance between each original text and its rewrite.

In [1]:
# Sample texts, keyed by question name. Each entry also records the response ID.

texts = {
    "ChoiceBA0": {
        "servey_set": 1,
        "response_id": "R_3gnImmcOVIfpDVn",
        "whyq_response": "Text 2 sounds less human, since it is a dry collection of statements.",
        "original_text": "The definition of political power is the ability to make choices and get others to do things that they would not otherwise do. This is the case because in politics, people often use their authority or influence to get others to conform to their wishes or follow certain rules or guidelines that are set by",
        "response_text": "One may argue that the definition of political power is the ability to make choices and get others to do things that they would not otherwise do. It is perhaps debatable since in politics people often use their authority or influence to get others to conform to their wishes or follow certain rules or guidelines that are set by"
    },

    "ChoiceEF11": {
        "servey_set": 8,
        "response_id": "R_2r1Y11dxZ2pjPBe",
        "whyq_response": "I chose text 2 to be AI-generated because the mistakes made in the text seem illogical. The writer seems to know proper structuring, using commas and hyphens, but cannot write \"engine\" correctly.",
        "original_text": "vandalism in the local newsagents! yesterday, a young woman leaved from a Coffee not with a coffee in one hand and her laptop and in the other, her handbag over the shoulder when she heard a big noise on the corner. She saw a young couple outside their car - left engining turning and stereo playing",
        "response_text": "vandalism in the local newsagents! yesterday a young woman leaved from a Coffee not with a coffee in one hand and her laptop and in the other her handbag, over the shoulder when she heard a big noise on the corner. She saw a young couple outside their car left engining turning and stereo playing"
    },

    "RateEF11": {
        "servey_set": 8,
        "response_id": "R_5iDRu7Nk4WvRb2A",
        "whyq_response": "The writing is a more simplistic style, but it is not indicative of either AI or human authorship. I honestly have no idea.",
        "original_text": "Wow, it would be really fun to go to Sarah's party, but I already have plans that night. I really don't want to make Sarah disappointed, so I need to find a way to politely turn down her invitation. Here's what I can say: \"Hey Sarah, thanks so much for inviting me to your party! I really appreciate it. Unfortunately, I already have something else planned for that night. I'm so sorry to miss out on the fun, but I hope you have a great time!\" How does",
        "response_text": "Wow, it would be really fun to go to Sarah's party, but I already have plans that night. I really don't want to make Sarah disappointed, so I need to find a way to politely turn down her invitation. I know what I can say: \"Hey Sarah, thanks so much for inviting me to your party! I really appreciate it. Unfortunately, I already have something else planned for that night. I'm so sorry to miss out on the fun, but I hope you have a great time!\" How does"
    },

    "RateBA6": {
        "servey_set": 5,
        "response_id": "R_2N3295pEVvdbPAR",
        "whyq_response": "Quite storytelling like, I would expect an AI to be more frank and straightforward, telling a series of facts rather than a narrative",
        "original_text": "But why did Americans, in particular, embrace this ambitious architectural pursuit? The answer lies within a complex interplay of economic prosperity, rapid urbanization, and a distinctive cultural spirit of innovation and progress. During the late 19th and early 20th centuries, the United States experienced an unprecedented era of industrialization and economic growth. Massive fortunes were amassed, leading to an influx of capital and a surging demand for office space in rapidly expanding cities like New York, Chicago, and Philadelphia. Traditional, low-rise buildings simply could not accommodate the burgeoning",
        "response_text": "But why did Americans, in particular, embrace this ambitious architectural pursuit? The answer lies within a complex interplay of economic prosperity, rapid urbanization, and a distinctive cultural spirit of innovation and progress. During the late 19th and early 20th centuries, the United States experienced an unprecedented era of industrialization and economic growth. Massive fortunes were amassed, leading to an influx of capital and a surging demand for office space in rapidly expanding cities like New York, Chicago, and Philadelphia, whilst traditional, low-rise buildings simply could not accommodate the burgeoning"
    },
}

## Levenshtein edit distance measurements
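
The Levenshtein distance between two strings is the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other. The measurements in this section rely on the `python-Levenshtein` package; the pure-Python function below is only an illustrative sketch of the standard dynamic-programming formulation, and the name `levenshtein_distance` is hypothetical rather than part of that package.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    # previous[j] is the distance between the prefix of a processed so far
    # (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of a into an empty string takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory grows with the length of the second string rather than with the product of both lengths.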


In [None]:
%pip install python-Levenshtein

import Levenshtein as levenshtein
import json

# Calculate the Levenshtein distance for all texts
for key, value in texts.items():
    original_text = value["original_text"]
    response_text = value["response_text"]

    distance = levenshtein.distance(original_text, response_text)
    
    texts[key]["levenshtein_distance"] = distance
    print(f"Levenshtein distance for {key}: {distance}")



Collecting python-Levenshtein
  Downloading python_levenshtein-0.27.1-py3-none-any.whl (9.4 kB)
Installing collected packages: python-Levenshtein
Successfully installed python-Levenshtein-0.27.1

[notice] A new release of pip is available: 23.0.1 -> 25.0.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
Levenshtein distance for ChoiceBA0: 42
Levenshtein distance for ChoiceEF11: 5
Levenshtein distance for RateEF11: 6
Levenshtein distance for RateBA6: 9
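
Raw distances are hard to compare across texts of different lengths, since the same number of edits is a smaller share of a longer text. One possible follow-up, sketched here, is to divide each distance by the length of the longer of the two texts. This normalisation is an assumption of the sketch, not something the analysis above performs, and `add_normalised_distance` and the `normalised_distance` key are hypothetical names.

```python
def add_normalised_distance(entries: dict) -> None:
    """Add a length-normalised distance to each entry, in place.

    0.0 means the two texts are identical; 1.0 means every character
    position of the longer text had to be edited.
    """
    for entry in entries.values():
        longest = max(len(entry["original_text"]), len(entry["response_text"]))
        # Two empty strings are identical, so avoid dividing by zero.
        entry["normalised_distance"] = (
            entry["levenshtein_distance"] / longest if longest else 0.0
        )


# Hypothetical toy entry using the same three keys that each entry of
# `texts` has once the distance cell has run.
toy = {
    "Example": {
        "original_text": "kitten",
        "response_text": "sitting",
        "levenshtein_distance": 3,
    }
}
add_normalised_distance(toy)
print(f'{toy["Example"]["normalised_distance"]:.3f}')  # 0.429
```

Calling `add_normalised_distance(texts)` after the distance cell would add the same key to the four sampled entries.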


## Difference highlights
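
The cell in this section wraps every edited character in its own `\hl{}` and inserts the text into LaTeX unescaped. A possible variant, sketched below, groups consecutive characters that share a colour into one `\hl{}` and escapes LaTeX special characters first. It takes the edit operations as `(op, source_pos, dest_pos)` tuples, the shape the cell unpacks from `editops`, so it runs without the package. The helpers `escape_latex`, `wrap_runs`, and `highlight_grouped`, the escape table, and the hand-written operations in the example are all assumptions for illustration; whether `\hl` accepts every escaped sequence depends on the LaTeX setup.

```python
from itertools import groupby

# Assumed escape table for LaTeX special characters (a starting point only).
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    # Escape character by character so already-escaped output is never re-escaped.
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


def wrap_runs(text: str, colours: list) -> str:
    """Wrap each run of equally coloured characters in a single \\hl{}."""
    parts = []
    for colour, run in groupby(zip(text, colours), key=lambda pair: pair[1]):
        chunk = escape_latex("".join(ch for ch, _ in run))
        if colour is None:
            parts.append(chunk)  # unchanged text, no highlight
        else:
            parts.append(r"\sethlcolor{" + colour + r"}\hl{" + chunk + "}")
    return "".join(parts)


def highlight_grouped(original: str, response: str, ops) -> tuple:
    """ops: iterable of (op, source_pos, dest_pos) tuples."""
    original_colours = [None] * len(original)
    response_colours = [None] * len(response)
    for op, i, j in ops:
        if op == "replace":
            original_colours[i] = response_colours[j] = "lightyellow"
        elif op == "delete":
            original_colours[i] = "lightred"
        elif op == "insert":
            response_colours[j] = "lightgreen"
    return wrap_runs(original, original_colours), wrap_runs(response, response_colours)


# Hand-written operations for "a_b" -> "a_b & co": five insertions at the end.
example_ops = [("insert", 3, j) for j in range(3, 8)]
print(highlight_grouped("a_b", "a_b & co", example_ops))
```

In the example, the five inserted characters end up inside one `\hl{}` instead of five, and the underscore and ampersand are escaped.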

In [7]:
from Levenshtein import editops

def highlight_differences(original, response):
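    """
    Mark the differences between the original and response texts with LaTeX highlighting.

    Replaced characters are highlighted in lightyellow in both texts, deleted
    characters in lightred (original only), and inserted characters in
    lightgreen (response only).

    Docs for editops at https://rapidfuzz.github.io/Levenshtein/levenshtein.html#Levenshtein.editops
    Args:
        original (str): The original text.
        response (str): The response text.
    Returns:
        tuple[str, str]: The highlighted original and the highlighted response.
    """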
    ops = editops(original, response)
    original_highlighted = list(original)
    response_highlighted = list(response)

    # Apply highlights for each operation
    for op, i, j in ops:
        if op == 'replace':
            original_highlighted[i] = r"\sethlcolor{lightyellow}\hl{" + str(original[i]) + r"}"
            response_highlighted[j] = r"\sethlcolor{lightyellow}\hl{" + str(response[j]) + r"}"
        elif op == 'delete':
            original_highlighted[i] = r"\sethlcolor{lightred}\hl{" + str(original[i]) + r"}"
        elif op == 'insert':
            response_highlighted[j] = r"\sethlcolor{lightgreen}\hl{" + str(response[j]) + r"}"

    # Join the highlighted text back into strings
    original_highlighted = ''.join(original_highlighted)
    response_highlighted = ''.join(response_highlighted)

    return original_highlighted, response_highlighted

# Highlight differences for all texts
for key, value in texts.items():
    original_text = value["original_text"]
    response_text = value["response_text"]

    original_highlighted, response_highlighted = highlight_differences(original_text, response_text)

    # Print the highlighted differences
    print(f"Highlighted differences for {key}:")
    print("Original:", original_highlighted)
    print("Response:", response_highlighted)
    print("\n")

Highlighted differences for ChoiceBA0:
Original: \sethlcolor{lightyellow}\hl{T}he definition of political power is the ability to make choices and get others to do things that they would not otherwise do. \sethlcolor{lightyellow}\hl{T}\sethlcolor{lightyellow}\hl{h}\sethlcolor{lightred}\hl{i}\sethlcolor{lightred}\hl{s} is \sethlcolor{lightyellow}\hl{t}\sethlcolor{lightred}\hl{h}e\sethlcolor{lightyellow}\hl{ }\sethlcolor{lightyellow}\hl{c}ase\sethlcolor{lightyellow}\hl{ }be\sethlcolor{lightyellow}\hl{c}\sethlcolor{lightyellow}\hl{a}\sethlcolor{lightyellow}\hl{u}\sethlcolor{lightyellow}\hl{s}e in politics\sethlcolor{lightred}\hl{,} people often use their authority or influence to get others to conform to their wishes or follow certain rules or guidelines that are set by
Response: \sethlcolor{lightgreen}\hl{O}\sethlcolor{lightgreen}\hl{n}\sethlcolor{lightgreen}\hl{e}\sethlcolor{lightgreen}\hl{ }\sethlcolor{lightgreen}\hl{m}\sethlcolor{lightgreen}\hl{a}\sethlcolor{lightgreen}\hl{y}\sethlcol