In [11]:
import json

# 1. Loading the stories

In [47]:
with open("stories/ttcw_short_stories.json", "r") as f:
    stories = json.load(f)

print("There are %d stories in the dataset" % len(stories))
story = stories[2]
# Note: for LLM-generated stories, "content" is the generated story;
# for New Yorker-published stories, "content" is a URL to the New Yorker website.
print(story.keys())

for k in story.keys():
    v = story[k][:100] + "..." if k == "content" else story[k]
    print("[%s] %s" % (k, v))

There are 48 stories in the dataset
dict_keys(['story_idx', 'story_id', 'story_name', 'plot', 'content'])
[story_idx] 0
[story_id] 0_GPT3.5
[story_name] Maintenance, Hvidovre
[plot] A woman experiences a disorienting night in a maternity ward where she encounters other similarly disoriented new mothers, leading to an uncanny mix-up where she leaves the hospital with a baby that she realizes is not her own, yet accepts the situation with an inexplicable sense of happiness.
[content] The clock struck midnight as Lily stumbled through the sterile corridors of the maternity ward. Exha...
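
Since "content" holds either story text or a URL, it can help to tell the two apart before processing. The following is a minimal sketch, assuming that URL entries start with `http://` or `https://` (this is an assumption, not something verified against the dataset). The helper `is_url_content` and the sample records are hypothetical and exist only for illustration.

```python
def is_url_content(story):
    """Return True if the story's "content" looks like a URL rather than story text.

    Assumption: URL entries begin with "http://" or "https://".
    """
    content = story.get("content", "")
    return content.startswith(("http://", "https://"))


# Illustrative records only; these are not taken from the dataset files.
sample_stories = [
    {"story_id": "example_generated", "content": "The clock struck midnight..."},
    {"story_id": "example_published", "content": "https://example.com/story"},
]

generated = [s["story_id"] for s in sample_stories if not is_url_content(s)]
published = [s["story_id"] for s in sample_stories if is_url_content(s)]
print("Text content:", generated)
print("URL content:", published)
```

On the real data, the same check could be applied to `stories` to skip entries whose text is not stored locally.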


# 2. Loading the tests

In [48]:
with open("tests/ttcw_all_tests.json", "r") as f:
    tests = json.load(f)

In [49]:
for test in tests:
    print("----- TTCW %d -----" % test["ttcw_idx"])
    print("Torrance Dimension: %s" % test["torrance_dimension"])
    print("Category: %s" % test["category"])
    print("Question formulation: %s" % test["question"])
    print("Prompt: %s..." % (test["full_prompt"][:100]))

----- TTCW 1 -----
Torrance Dimension: Fluency
Category: Narrative Ending
Question formulation: Does the end of the story feel natural and earned, as opposed to arbitrary or abrupt?
Prompt: If the writer ends the piece simply because they are 'tired of writing', the conclusion might feel a...
----- TTCW 2 -----
Torrance Dimension: Fluency
Category: Understandability and Coherence
Question formulation: Do the different elements of the story work together to form a unified, engaging, and satisfying whole?
Prompt: A well-crafted story usually follows a logical path, where the events in the beginning set up the mi...
----- TTCW 3 -----
Torrance Dimension: Fluency
Category: Scene vs Summary
Question formulation: Does the story have an appropriate balance between scene and summary/exposition or it relies on one of the elements heavily compared to the other?
Prompt: 'Scene' and 'summary/exposition' are two crucial elements of narrative storytelling, and balancing t...
----- TTCW 4 -----
Torra

# 3. Loading the annotations

In [50]:
from IPython.display import display
import pandas as pd

with open("annotations/ttcw_annotations.json", "r") as f:
    annotations = json.load(f)

# Each of the 14 tests was completed by 3 independent expert annotators.
# 48 stories x 14 tests x 3 annotators = 2016 annotations
print("Total of %d annotations" % len(annotations))
print(annotations[0].keys())

Total of 2016 annotations
dict_keys(['story_idx', 'story_id', 'expert_idx', 'ttcw_idx', 'category', 'binary_verdict', 'explanation'])
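
The arithmetic in the comment above (48 stories x 14 tests x 3 annotators = 2016) can be checked against the data itself. The sketch below counts annotations per (story, test) pair. The helper name `annotations_per_pair` is hypothetical, and the sample records are a small hand-built subset using the keys shown above, not the loaded file.

```python
from collections import Counter


def annotations_per_pair(annotations):
    """Count how many annotations exist for each (story_id, ttcw_idx) pair."""
    return Counter((a["story_id"], a["ttcw_idx"]) for a in annotations)


# Small hand-built sample that mirrors the annotation keys (illustration only).
sample_annotations = [
    {"story_id": "0_GPT3.5", "ttcw_idx": 6, "expert_idx": 9, "binary_verdict": "Yes"},
    {"story_id": "0_GPT3.5", "ttcw_idx": 6, "expert_idx": 1, "binary_verdict": "No"},
    {"story_id": "0_GPT3.5", "ttcw_idx": 6, "expert_idx": 7, "binary_verdict": "No"},
]

pair_counts = annotations_per_pair(sample_annotations)
# Every pair should have been judged by exactly 3 experts.
incomplete = {pair: n for pair, n in pair_counts.items() if n != 3}
print("Pairs:", len(pair_counts), "| pairs without 3 annotations:", len(incomplete))
```

If the comment holds for the full dataset, calling the helper on `annotations` should give 48 x 14 = 672 pairs and an empty `incomplete` dictionary.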


In [51]:
# For the story selected above (Maintenance, Hvidovre; story_id 0_GPT3.5), let's look at all the administered tests.
for test in tests:
    print("----- TTCW %d -----" % test["ttcw_idx"])
    
    table = []
    for anno in annotations:
        if anno["story_id"] == story["story_id"] and test["ttcw_idx"] == anno["ttcw_idx"]:
            table.append({
                "Expert Idx": anno["expert_idx"],
                "Verdict": anno["binary_verdict"],
                "Explanation": anno["explanation"][:100] + "...",
            })
    title = "[%s; %s] %s" % (test["torrance_dimension"], test["category"], test["question"])
    display(pd.DataFrame(table).style.set_caption(title))

----- TTCW 1 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,"The end of the story feels arbitrary, in so far as the ideas at the end of the story don't seem to f..."
1,1,No,"Many of the decisions the story makes are startling and baffling, starting with the mothers’ decisio..."
2,7,No,The ending to the story feels contrived and pat rather than meaningfully earned or even truly relate...


----- TTCW 2 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,"The story does not work together as a unified whole, and there are some aspects that do not make sen..."
1,1,No,The narrative consistently shifts so quickly that it is entirely implausible or nonsensical and incl...
2,7,No,The two main threads of this story: the baby mix-up and Lily's development into an artist don't feel...


----- TTCW 3 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,The summary of events over the course of the last two pages is rambling and redundant. The events he...
1,1,No,"Though the story does contain some scenes within its summary, all of them are ludicrous, starting wi..."
2,7,No,"There are both scenes and summary here, but neither are fully satisfying, and the pattern by which t..."


----- TTCW 4 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,"The last couple pages are characterized by a vague sense of time, which is repetitive and not tied t..."
1,1,No,The narrative pairs its absurd leaps in logic with equally massive leaps in time. Both are impossibl...
2,7,No,"While the narrative does include moments of both compressed time, the use of this technique feels el..."


----- TTCW 5 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,"The language is straightforward, and does not make sophisticated use of these devices...."
1,1,No,"The story uses overwrought language throughout. For instance, early on: “Each room seemed to contain..."
2,7,No,The language throughout lacks sophistication—I didn't find any allusions to speak of; any idioms fee...


----- TTCW 6 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,Yes,"There is a good balance between action and Lily's interior thoughts, especially in the first half of..."
1,1,No,There is no emotional experience to this story that is not outright stated....
2,7,No,"While the story offers us perhaps too many descriptions of Lily's thoughts, her interiority does not..."


----- TTCW 7 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,The turns taken in the story don't seem appropriate. Lily's decision to switch babies isn't really e...
1,1,No,The story makes unhinged decisions. It is bold and some of these decisions I might love in an entire...
2,7,No,There are surprises here but they feel mostly arbitrary—the reader is never given either the larger ...


----- TTCW 8 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,"The other mother disappears from the narrative without much explanation, and Lily is really the only..."
1,1,No,Lily is an empty vessel. I do not believe she is really a character. She makes decisions that are lu...
2,7,No,"The story does not provide diverse perspectives, and the one perspective that we do get feels more l..."


----- TTCW 9 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,"The end of the story drifts toward cliche and uninteresting language. For example, ""she brought toge..."
1,1,No,I am tempted to answer yes to this because the narrative’s logic is so wild I can’t even fathom it! ...
2,7,No,"This piece is full of cliches, including phrases like ""twist of fate"" and ""Lily's heart skipped a be..."


----- TTCW 10 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,There is nothing really original about the story's form or structure....
1,1,No,The originality here is that the turns and twists this story takes are practically impossible to ima...
2,7,No,"While this story fails to execute on a traditional three-act structure, it doesn't do so in a way th..."


----- TTCW 11 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,Perhaps if the emotional connection between the hospital encounter and Lily's later life could be ex...
1,1,No,"To be frank, I do not believe this story has themes. It does not feel as though it is intended to an..."
2,7,No,While the story seems to be attempting to offer up some deep takeaways about fate and the power of a...


----- TTCW 12 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,I couldn't discern much of any meaning or allusion below the surface meaning of the narrative....
1,1,No,"There is nothing happening in this story that is not stated outright, often to the point of real lau..."
2,7,No,"The piece clearly has aspirations towards a larger meaning, but rather than allowing those to come t..."


----- TTCW 13 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,"Overall, no. There is good attention to details at the beginning, including things like Lily's room ..."
1,1,No,"There are attempts to create scenes here and there, especially in the beginning in the hospital, bef..."
2,7,No,There is very little sensory detail here—descriptions are primarily visual and often revolve around ...


----- TTCW 14 -----


Unnamed: 0,Expert Idx,Verdict,Explanation
0,9,No,"Lily is the only character that is developed to any extent, and even then it is unclear why she does..."
1,1,No,"The characters are just names. Lily does something fascinating, devastating, but the decision is not..."
2,7,No,"With the exception of the fact that her mother died when she was young, Lily's backstory is left out..."
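
The per-test tables above can be condensed into one verdict per test. The following is a minimal sketch that applies a simple majority vote over the three expert verdicts. Majority voting is an assumption chosen for illustration, not necessarily the aggregation used by the dataset authors, and `majority_verdicts` is a hypothetical helper. The sample records reproduce the verdicts shown above for TTCW 5 and TTCW 6.

```python
from collections import defaultdict


def majority_verdicts(annotations, story_id):
    """Map each ttcw_idx to True if more than half of the experts answered "Yes"."""
    votes = defaultdict(list)
    for anno in annotations:
        if anno["story_id"] == story_id:
            votes[anno["ttcw_idx"]].append(anno["binary_verdict"] == "Yes")
    return {idx: sum(v) > len(v) / 2 for idx, v in sorted(votes.items())}


# Verdicts copied from the TTCW 5 and TTCW 6 tables above (explanations omitted).
sample_annotations = [
    {"story_id": "0_GPT3.5", "ttcw_idx": 5, "expert_idx": 9, "binary_verdict": "No"},
    {"story_id": "0_GPT3.5", "ttcw_idx": 5, "expert_idx": 1, "binary_verdict": "No"},
    {"story_id": "0_GPT3.5", "ttcw_idx": 5, "expert_idx": 7, "binary_verdict": "No"},
    {"story_id": "0_GPT3.5", "ttcw_idx": 6, "expert_idx": 9, "binary_verdict": "Yes"},
    {"story_id": "0_GPT3.5", "ttcw_idx": 6, "expert_idx": 1, "binary_verdict": "No"},
    {"story_id": "0_GPT3.5", "ttcw_idx": 6, "expert_idx": 7, "binary_verdict": "No"},
]

verdicts = majority_verdicts(sample_annotations, "0_GPT3.5")
tests_passed = sum(verdicts.values())
print("Majority verdicts:", verdicts)
print("Tests passed: %d of %d" % (tests_passed, len(verdicts)))
```

Calling `majority_verdicts(annotations, story["story_id"])` on the loaded data would give one boolean per test for the selected story. Note that a single "Yes" (as in TTCW 6) does not carry the majority.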
