In the lecture linked below, Kurt Vonnegut, author of Slaughterhouse-Five, Cat's Cradle, and other novels, describes graphing the shapes of stories along two axes.
The image below, which appears in his essay collection A Man Without a Country, shows several story shapes that Vonnegut created.
The lecture and the image above suggest a simple idea: view the shape of a story as a graph of the good or ill fortune of its central character(s) from the beginning of the story to the end.
As Vonnegut says, "there's no reason why the shapes of stories can't be fed into computers...", but can we program a computer to generate the shape of a given story on its own?
This project uses a pretrained transformer model to classify the sentiment of passages of a story. The sentiment score (positive or negative) approximates good fortune or ill fortune of the story's central characters. Plotting a rolling average of the sentiment score yields a line that traces the shape of the story along Vonnegut's axes.
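The smoothing step can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name `rolling_average`, the trailing-window design, and the sample scores are all assumptions made for the example.

```python
from collections import deque
from typing import Iterable, List


def rolling_average(scores: Iterable[float], window: int) -> List[float]:
    """Return the mean of the most recent `window` scores at each position.

    Early positions average over fewer scores, so the output has the same
    length as the input. (Hypothetical helper, not taken from the repository.)
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # automatically drops the oldest score
    averages = []
    for score in scores:
        recent.append(score)
        averages.append(sum(recent) / len(recent))
    return averages


# Made-up signed sentiment scores: positive means good fortune,
# negative means ill fortune.
scores = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
shape = rolling_average(scores, window=3)
```

A larger `window` gives a smoother line at the cost of flattening short swings in fortune.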
How well does this simple approach work? Testing the program on stories that Vonnegut discusses allows us to compare the generated shape to some sort of expected shape.
In another lecture Vonnegut describes how the ambiguity of Hamlet makes it a great story and makes it a difficult story to plot on these two axes. The plot above seems to corroborate that idea. The shape is negative throughout, but does contain significant upward swings.
It would be an interesting exercise to annotate the shape of this plot (and the others below) with the major plot points from the story to see how well the computer-generated shape matches reader expectations.
The shape of Metamorphosis isn't quite as pessimistic as Vonnegut's proposed shape, but the program does generate a shape of general ill fortune.
This shape doesn't quite match what we expect for the Cinderella story. The graph shows the central "peak" followed by a "pit" before rising steeply at the end, but I admit that I had to tweak some of the text processing and the size of the rolling average window to get this result. One problem that Cinderella poses for this model is that the text of the story is very short. To accumulate enough scores to plot a shape, we have to score sentiment over very short segments of the text (i.e., ~100 tokens).
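Cutting a story into short segments might look like the sketch below. It assumes whitespace-separated words as a stand-in for tokens; the model's own tokenizer produces subword tokens, so real counts will differ. The name `split_into_segments` is hypothetical.

```python
from typing import List


def split_into_segments(text: str, segment_size: int = 100) -> List[str]:
    """Split `text` into consecutive segments of about `segment_size` tokens.

    Tokens here are whitespace-separated words, which is only an
    approximation of the model's subword tokens (an assumption of this sketch).
    The final segment may be shorter than the rest.
    """
    if segment_size < 1:
        raise ValueError("segment_size must be at least 1")
    tokens = text.split()
    return [
        " ".join(tokens[start:start + segment_size])
        for start in range(0, len(tokens), segment_size)
    ]


# A made-up 250-word "story" yields two full segments and one partial one.
story = " ".join(f"word{i}" for i in range(250))
segments = split_into_segments(story, segment_size=100)
```

Each segment yields one sentiment score, so a shorter story needs smaller segments to produce enough points for a curve.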
I'm fairly pleased with the results above. Although some hand-engineering went into the parameters used to produce the shapes of these stories, the approach and the pretrained transformer model achieved what I was hoping for. I might be able to remove some of the hand-engineering by dynamically selecting story segmentation and window parameters based on the length of the text.
Although the pretrained model yields sentiment scores that render pleasing story shapes, we might do better with a model trained (or re-trained) on more representative text. The Hugging Face model was fine-tuned on the Stanford Sentiment Treebank, which is drawn from movie reviews rather than literary fiction.
I use a pretrained sentiment analysis pipeline from the Hugging Face transformers library to classify the sentiment of story segments.
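As a rough sketch of how classifier output could become a signed score: the Hugging Face sentiment pipeline returns one dictionary per input with a `label` and a confidence `score`. The helper names below, the `POSITIVE`/`NEGATIVE` label names, and the stand-in classifier are assumptions for illustration, not the repository's code.

```python
from typing import Callable, Dict, List, Sequence


def signed_score(result: Dict) -> float:
    """Map one classifier result to a score in [-1, 1].

    Assumes a result shaped like {"label": "POSITIVE", "score": 0.98};
    any label other than POSITIVE is treated as negative.
    """
    confidence = float(result["score"])
    return confidence if result["label"].upper() == "POSITIVE" else -confidence


def score_segments(
    segments: Sequence[str],
    classifier: Callable[[List[str]], List[Dict]],
) -> List[float]:
    """Classify each segment and return its signed sentiment score."""
    results = classifier(list(segments))
    return [signed_score(result) for result in results]


def fake_classifier(texts: List[str]) -> List[Dict]:
    """Stand-in for the real model so this sketch runs without a download."""
    return [
        {"label": "POSITIVE" if "happy" in text else "NEGATIVE", "score": 0.9}
        for text in texts
    ]


demo_scores = score_segments(["a happy ending", "a grim ending"], fake_classifier)
```

With transformers installed, the object returned by `pipeline("sentiment-analysis")` could be passed in place of `fake_classifier`, since it accepts a list of strings and returns results in this shape.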
To graph story shapes with a "hand-drawn" appearance, I use matplotlib's XKCD sketch mode (matplotlib.pyplot.xkcd()).
My first attempt used a sliding window over the story text to track sentiment rather than a rolling average, but the pretrained model scores were too polarized (i.e., mean absolute value over 0.92) to plot a pleasant shape. The image below shows the result of plotting the sentiment of a sliding window of 100 tokens over Beowulf with the window advanced one line at a time:
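One possible reading of that first attempt is sketched below: a window of about 100 tokens that starts at each line in turn, plus the mean absolute value used to measure how polarized the scores are. The function names are hypothetical, tokens are approximated by whitespace-separated words, and the sample scores are made up.

```python
from typing import List, Sequence


def sliding_windows(lines: Sequence[str], window_tokens: int = 100) -> List[str]:
    """Build one window per starting line, advancing one line at a time.

    Each window gathers whole lines until it holds `window_tokens`
    whitespace-separated tokens, then is trimmed to exactly that many.
    Windows near the end of the text may be shorter.
    """
    windows = []
    for start in range(len(lines)):
        tokens: List[str] = []
        for line in lines[start:]:
            tokens.extend(line.split())
            if len(tokens) >= window_tokens:
                break
        windows.append(" ".join(tokens[:window_tokens]))
    return windows


def mean_absolute(scores: Sequence[float]) -> float:
    """Average magnitude of the scores; values near 1 mean highly polarized."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(abs(score) for score in scores) / len(scores)


demo_windows = sliding_windows(["a b c", "d e", "f g h i"], window_tokens=4)
polarization = mean_absolute([0.99, -0.95, 0.9])
```

When nearly every window scores close to +1 or -1, the plotted line jumps between the extremes instead of tracing a gradual arc.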
This first approach also suggests a drawback of using a pretrained sentiment analysis model: one cannot alter the maximum input length. The pretrained Hugging Face model has a maximum input length of 512 tokens, so it can only classify the sentiment of short samples of the story. As a result, the sliding window approach uses a very small sliding window. Using a rolling average more effectively incorporates sentiment from broader sections of the story.
"The emotional arcs of stories are dominated by six basic shapes" by Reagan et al. provides a much more thorough exploration of computational approaches to the shapes of stories.
If you'd like to play around with this program yourself, you can do so either via a command-line program or in a Python shell.
After cloning the repository, run the following commands to initialize a Python virtual environment and install dependencies:
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
Run the command-line program with:
python -m story_shapes.story_shapes
The command-line usage is shown below:

usage: story_shapes.py [-h] [--story-path STORY_PATH] [--title TITLE] [--shape-path SHAPE_PATH]

Generate a graph of the shape of a story a la Kurt Vonnegut.

optional arguments:
  -h, --help            show this help message and exit
  --story-path STORY_PATH
                        Filepath to read the story from.
  --title TITLE         The title of the story
  --shape-path SHAPE_PATH
                        Filepath to write the shape graph to.
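
For reference, an argparse parser along the following lines would produce the usage text above. This is a reconstruction from the help output, not the repository's source: the function name `build_parser`, the absence of default values, and the example file names are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the flags listed in the usage text."""
    parser = argparse.ArgumentParser(
        prog="story_shapes.py",
        description="Generate a graph of the shape of a story a la Kurt Vonnegut.",
    )
    # All three flags are optional; defaults are left as None here because
    # the usage text does not say what the real defaults are.
    parser.add_argument("--story-path", help="Filepath to read the story from.")
    parser.add_argument("--title", help="The title of the story")
    parser.add_argument("--shape-path", help="Filepath to write the shape graph to.")
    return parser


# Hypothetical file names, used only to show how the flags are parsed.
args = build_parser().parse_args(
    ["--story-path", "hamlet.txt", "--title", "Hamlet", "--shape-path", "hamlet.png"]
)
```

argparse converts the hyphens in flag names to underscores, so the parsed values are read as `args.story_path`, `args.title`, and `args.shape_path`.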