<div align=center id="top"><h1>Donne unDonne</h1></div>

## Contents

I.    [Introduction](#Introduction)

II.   [Sonnets](#Sonnets)

III.  [Early Modern English](#Early-Modern-English)

IV.   [Approaches](#Approaches)

V.   [Description](#Description)

VI.  [Discussion](#Discussion)

VII. [Future Work](#Future-Work)

------------------------------------

## Introduction

This is yet another work in progress, a final project for Dr. Noriko Tomuro's [Topics in AI: Natural Language Processing](http://condor.depaul.edu/ntomuro/courses/NLP594s18) course, CSC 594, at DePaul University: a probabilistic Petrarchan sonnet generator. Specifically, this system is a suite of tools for generating sonnets closely modeled on the work of John Donne (1572-1631), a preacher, lawyer, and poet active in the early 17c, during the reign of King James.

Donne's work reflects his life: he was imprisoned and left destitute after a marriage scandal, and later, after his conversion to Protestantism, was essentially commanded by King James to become Royal Chaplain. His writing ranges from very down-to-earth, witty love poems to intensely emotional religious meditations.

Given Donne's work as input, the system outputs a 14-line poem adhering, to a varying extent, to the Petrarchan sonnet structure as introduced into English by [Sir Thomas Wyatt](https://en.wikipedia.org/wiki/Thomas_Wyatt_%28poet%29#Wyatt's_poetry_and_influence) and implemented by Donne.

The system imposes sundry algorithmic constraints which encode the qualitative features described by literary tradition (i.e., scansion), and generates poetry from the parsed Donne corpus with a mixed set of possible methods, including probabilistic or stochastic context-free grammar, and smoothed and unsmoothed MLE *n*-gram modeling at the word and character levels. Vector space semantics and sentiment analysis inform these techniques alongside resources such as pronunciation and concept dictionaries and thesauri, viz. [WordNet](https://wordnet.princeton.edu/), [SentiWordNet](http://sentiwordnet.isti.cnr.it), [Lin's Thesaurus](http://www.nltk.org/_modules/nltk/corpus/reader/lin.html), and [CMUdict](http://www.speech.cs.cmu.edu/cgi-bin/cmudict).

Sonnets are an ideal and somewhat traditional target for algorithmic generation because of
their highly structured forms: features such as rhyme
schemes (such as *ababab*), metre (e.g., iambic pentameter), and the
number of lines. These writing constraints are useful both for
[human](http://www.huffingtonpost.com/scott-barry-kaufman/does-creativity-require-c_b_948460.html)
creativity, forcing thinkers to construct novel solutions, and for
fruitfully narrowing the space of generative possibilities.

Thus, the foundation, in terms of NLP, begins with solving these
formal problems: evaluating text phonetically to allow for rhyming and
performing
“[scansion](https://owl.english.purdue.edu/owl/resource/570/02/)” to
ensure adherence to rhythmic requirements, with some authorial leeway
over how strictly rhymes are required and whether orthographic
matching is allowed in addition to phonetic matching.
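
As a rough illustration of the phonetic side of this, a rhyme test can compare two ARPAbet pronunciations from the last stressed vowel onward. The sketch below is not the project's code: the mini-lexicon `PRON` is a hand-written stand-in for CMUdict, and `rhyme_part`, `strip_stress`, and `rhymes` are hypothetical names.

```python
# Hypothetical mini-lexicon in CMUdict's ARPAbet style (stress digits on vowels).
PRON = {
    "thee": ["DH", "IY1"],
    "me": ["M", "IY1"],
    "so": ["S", "OW1"],
    "overthrow": ["OW1", "V", "ER0", "TH", "R", "OW2"],
    "die": ["D", "AY1"],
}


def rhyme_part(phones):
    """Return the phones from the last stressed vowel (stress 1 or 2) onward."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i][-1] in "12":
            return phones[i:]
    return phones  # no stressed vowel found: fall back to the whole word


def strip_stress(phones):
    """Drop stress digits so primary and secondary stress compare as equal."""
    return [p.rstrip("012") for p in phones]


def rhymes(a, b, lexicon=PRON):
    """True if both words are in the lexicon and share the same rhyming tail."""
    if a not in lexicon or b not in lexicon:
        return False
    return strip_stress(rhyme_part(lexicon[a])) == strip_stress(rhyme_part(lexicon[b]))
```

Loosening the test (for example, comparing only the final vowel, or falling back to matching final letters for words missing from the dictionary) is one way to expose the authorial leeway mentioned above.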

The general purpose here was to leverage the domain of English literature in a productive NLP task, establishing a foundation from which I can learn and explore more of these techniques and subjects, and create useful tools for writers, literary theorists, and linguists of all stripes: a sort of cross-disciplinary armamentarium. By “productive” I refer to manifesting meaningful intent as code and text, a dynamic we might diagram simply as [Intent] → [Code ↔ Text].

My reasoning is that just as writing workshops and output practice are complementary to reading and listening in literature and language learning, programmatically producing poetry is a powerful means to study Digital Humanities and NLP. Doing so with tightly structured poetry allows for the exploration of algorithmic constraints alongside data-driven models.

The project itself I view as something of a playground for creative, theoretical experimentation, not merely tool-making, but tool-making itself as artistic practice and literary hermeneutics.

This view is perhaps influenced by my desire for a Cognitive Computational Creative Writing, with workshops acting as the labwork of Literature departments, where generating text motivates a more intensive literary analysis, and vice versa.

## Sonnets 

To describe Petrarchan sonnets in more detail: these 14-line poems as written in English typically adhere to a fourfold structure (4-4-4-2), in two main parts. The first eight lines (two 4-line quatrains) are called the octave or octet, which traditionally introduces a problem, and the final six lines form the sestet (a quatrain and a couplet), which might develop a solution. Each line is approximately 10 syllables, with a particular rhythm and rhyme scheme, as shown, where the octave rhyme scheme is [abba, abba] and the sestet is [cddc, ee].

![](assets/img/hsdeath.png)
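
The scheme notation itself is easy to derive mechanically: label each line by the order in which its rhyme sound first appears. In this minimal sketch the rhyme keys are hand-assigned placeholders (in practice they would come from a pronunciation dictionary), and `scheme` is a hypothetical helper name.

```python
from string import ascii_lowercase


def scheme(rhyme_keys):
    """Label each line by the order in which its rhyme key first appears."""
    letters = {}
    out = []
    for key in rhyme_keys:
        if key not in letters:
            letters[key] = ascii_lowercase[len(letters)]
        out.append(letters[key])
    return "".join(out)


# Hand-assigned rhyme keys for a 14-line poem (illustrative, not computed).
keys = ["ee", "o", "o", "ee",
        "ee", "o", "o", "ee",
        "ake", "ie", "ie", "ake",
        "ell", "ell"]
PETRARCHAN = "abbaabbacddcee"
```

Comparing `scheme(keys)` against `PETRARCHAN` gives a simple pass/fail check for the form described above.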

## Early Modern English

As you may have noted from the quotations of Donne’s poems here, a complication is the highly variable English spelling and pronunciation of the period, before the system was standardized over time with the spread of printing. Capitalization and typography are other components that introduce challenging variation, as described at the [OED](https://public.oed.com/blog/early-modern-english-pronunciation-and-spelling/).

To list some examples, there is the silent ⟨e⟩ as seen in “Holy Sonnet X” above, interchangeable ⟨v⟩ and ⟨u⟩ depending on word position: “euer” (“ever”), interchangeable ⟨i⟩, ⟨j⟩ and ⟨y⟩ in certain cases: “ioifull praier” (“joyful prayer”), a possible capital “J” when consonant: “… to rule Justely… ”, capitalization for “important” nouns, and elocutionary punctuation, its use based on its effects on speech/sound...
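
A few of these conventions can be approximated with simple rewrite rules before dictionary lookup. The sketch below is only a heuristic illustration, not part of the project: `modernize` is a hypothetical name, it expects lowercased input, and the rules will misfire on some words.

```python
import re


def modernize(word):
    """Apply a few rough Early Modern -> modern spelling heuristics."""
    w = word.replace("ſ", "s")                       # long s: ſucks -> sucks
    w = re.sub(r"^v(?=[^aeiouy])", "u", w)           # initial v before a consonant: vnto -> unto
    w = re.sub(r"(?<=[aeiou])u(?=[aeiou])", "v", w)  # u between vowels: euer -> ever
    return w
```

Rules for ⟨i⟩/⟨j⟩/⟨y⟩ or the silent ⟨e⟩ are harder to state safely and would likely need a word list to check candidates against.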

Additionally, Donne [disliked](https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1741-4113.2008.00552.x) printing, which has contributed to the many variants of any given poem, as editors and transcribers introduced their own versions over the centuries.

![](assets/img/flea.png)
<center><i>Typographic pun from Donne’s “<a href="https://www.bl.uk/shakespeare/articles/a-close-reading-of-the-flea">The Flea</a>” with long ⟨s⟩, or ſ</i></center>

## Approaches

There have been many approaches to poetry generation with computers. These include using context-free grammars to generate lines, weighted with probabilities for parses induced from analyzing input text. A recent look at PCFG (Probabilistic Context-Free Grammar) for poetry generation can be seen from [Goldberg's critique](https://medium.com/@yoav.goldberg/an-adversarial-review-of-adversarial-generation-of-natural-language-409ac3378bd7) of Rajeswar, et al., [2017](https://arxiv.org/abs/1705.10929).

One can also take an evolutionary approach ([Manurung, 2004](https://www.era.lib.ed.ac.uk/handle/1842/314)) and define a fitness function based on some metric (e.g., grammatical, meaningful, ‘aesthetically pleasing’) and use it to modify encoded language [genetically](https://natureofcode.com/book/chapter-9-the-evolution-of-code/) through an iterative process of selection, combination and mutation. 

And of course, we have various means to use the probability distributions of a text to conditionally sample from a corpus and create new sequences. These productions operate within the constraints of the type of poetic structure we want to create, such as haiku or sonnets.

[Markov chains](https://shiffman.net/a2z/markov/) have been associated with poetry and text generation since their inception, from Andrei Markov’s modeling of Pushkin’s verse novel, *Eugene Onegin* ([1913](https://www.americanscientist.org/article/first-links-in-the-markov-chain)), to Claude Shannon's explorations, which gave such results as “IN NO IST LAT WHEY CRATICT FROURE BIRS GROCID…” ([1948](https://en.wikipedia.org/wiki/A_Mathematical_Theory_of_Communication)).

For this project, I explored a probabilistic context-free grammar approach and *n*-gram likelihood-based generation at both word and character levels.


Description
===========

The chain of imports begins with the main file, main.py, and traces back
to lode.py in a somewhat convoluted sequence, so I will begin
the description from the loading point.

*Components*:

-   *Load (lode.py)*

-   *Sentiment (sins.py)*

-   *Scansion (scansion.py)*

-   *Grammar (gramarye.py)*

-   *POS and Word-based Generators (genesis.py)*

-   *Character-level generator (karkoav.py)*

-   *Graphical User Interface (gui.py)*

-   *Main (main.py)*


```python
from karkoav import *
from gui import *

words, words_set, sents = parse_sonnet()

# Instantiate bauplan, Holy Sonnet X: 'Death be not proud... '
hsdeath = sonnets[titles[12]]
hslines = list(filter(lambda x: x, hsdeath.split('\n')))
alpha = hslines[0].split()[0]
kpunk = end_punk_scheme(hslines)

seed = hslines[0].split()[0].lower()
cfd = nltk.ConditionalFreqDist(nltk.bigrams(tokens))
cpd = nltk.ConditionalProbDist(cfd, nltk.LaplaceProbDist,
                               bins=len(types))

if __name__ == '__main__':
    gg = GenGUI()
    gg.main()
```


As the background section may have suggested, choosing the data to use in this project was difficult. The simplest approach for accomplishing my goals was to balance modernized versions, which simplify parsing, with a few choice archaic versions that reflected my own preferences. In this sense, perhaps the most challenging aspect of selecting data was choosing the right blend, so to speak, to both execute the code smoothly and achieve appealing results.

A small measure of preprocessing was necessary to correct what appeared to be scanning errors that caused problems for tokenization, such as extra spaces around punctuation within words, and to mark titles to make them easier to distinguish when loading.

Initially, the lode.py file loaded
[BEEP](http://svr-www.eng.cam.ac.uk/comp.speech/Section1/Lexical/beep.html),
a British English phonetic dictionary similar to CMU’s Pronunciation
Dictionary, based on
[ARPAbet](https://en.wikipedia.org/wiki/ARPABET)-style transcriptions.
However, early versions of BEEP featured inconsistent stress markings—an
important feature for evaluating the metre of a poem where stress is
alternated over syllable sequences to establish rhythmic articulatory
patterns. Later versions of BEEP simply removed the markings entirely.
Therefore, given the primarily orthographic reception I imagine the
project will have, I simply imported CMUdict from NLTK instead, despite
its North American basis.
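
To make the role of the stress markings concrete: in CMUdict-style entries each vowel phone carries a digit (0, 1, or 2), so a word's stress pattern and syllable count can be read straight off its pronunciation. The entries below are hand-copied stand-ins for a real dictionary lookup, and `stress_pattern` and `syllable_count` are hypothetical names rather than the functions in lode.py or scansion.py.

```python
# Hypothetical entries in CMUdict's format; real lookups would use the full dictionary.
PRON = {
    "death": ["D", "EH1", "TH"],
    "be": ["B", "IY1"],
    "not": ["N", "AA1", "T"],
    "proud": ["P", "R", "AW1", "D"],
    "mighty": ["M", "AY1", "T", "IY0"],
    "dreadful": ["D", "R", "EH1", "D", "F", "AH0", "L"],
}


def stress_pattern(phones):
    """Keep only the stress digits carried by vowel phones, e.g. '10'."""
    return "".join(p[-1] for p in phones if p[-1].isdigit())


def syllable_count(phones):
    """One syllable per vowel phone, i.e. per stress digit."""
    return len(stress_pattern(phones))


def line_pattern(line, lexicon=PRON):
    """Concatenate the stress patterns of the known words in a line."""
    return "".join(stress_pattern(lexicon[w]) for w in line.lower().split() if w in lexicon)
```

Note that a run of monosyllables such as “death be not proud” comes out as all stressed in this scheme, which is one reason a metre check probably needs some leeway rather than an exact match against an iambic template.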

At any rate, as well as importing various corpora and tokenizers, the
loading module parses the types, tokens, lines, and files on three
levels: a larger corpus of John Donne’s poetry and prose; just the
sonnets and sonnet-like poems; and individual sonnets—and defines an
initial syllable counting function. The system is essentially hardcoded
for particular files and formats at the moment, as the guiding,
simplifying idea behind this prototype version of the project was to
generate sonnet structures by directly emulating a template for a given
run, namely a particular Donne sonnet’s rhyme and metrical scheme, etc.,
sampling from a relatively small inventory of syntactic and semantic
elements.

```python
# Sample code from lode.py
import string

import nltk


def parse_sonnet(ix: int = 12) -> list:
    """Collect sonnet's types, tokens, and sentences.
       Default is HSDeath."""
    # `sonnets` and `titles` are module-level collections defined elsewhere in lode.py.
    sonnet = sonnets[titles[ix]]
    sents = sonnet.strip().split('\n')
    # "Send me some token, that my hope may live... "
    # Drop punctuation tokens, but keep apostrophes.
    tokens = [w for sent in sents
                for w in nltk.word_tokenize(sent)
                if w not in string.punctuation.replace("'", "")]
    types = set(tokens)
    return [tokens, types, sents]
```


Next, sins.py trains three models, or loads them if pretrained and
available. The *word* model is a standard Word2Vec model; the *phrase* model uses gensim’s
phrase tools to create embeddings for trigrams (ignoring the stopwords
in the intervening distance between content words). Primarily, this was
intended to expand the resulting thematic clusters obtained from the
gen\_syn() function. This function is a modified version of available code,
here focused on ‘subjectivity’ (1 –
obj\_score), with options to return synsets or orientation (positive or
negative). There is also a function to measure the ‘Word
Mover’s Similarity’ of two input lines, calculating the inverse of the model’s
Euclidean distance result via the equation from gensim’s WmdSimilarity
class.
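
The two scores mentioned here are simple transformations, sketched below with hypothetical function names. The `1 / (1 + distance)` mapping is what I understand gensim's WmdSimilarity to apply; treat that as an assumption to check against the gensim version in use.

```python
def subjectivity(obj_score):
    """Subjectivity as the complement of a SentiWordNet-style objectivity score in [0, 1]."""
    return 1.0 - obj_score


def wm_similarity(distance):
    """Map a non-negative Word Mover's Distance to a similarity in (0, 1]."""
    return 1.0 / (1.0 + distance)
```

Identical lines (distance 0) score 1.0, and the similarity decays toward 0 as the distance grows.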

Also of note is the *[FastText](https://arxiv.org/abs/1607.04606v2)* model, which I discovered after searching
for character-level vector space approaches to complement the character-level Markov
generator, conceptually if nothing else. As it uses subword
features for embeddings, this model is set as the default. It is referred to as the *k* model, as *k* is
how I have preferred to distinguish the unsmoothed character *n*-grams
used with the order *k* Markov chain from the smoothed NLTK bigram-based
generator.

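```python
# Sample code from sins.py
# Note: written against the gensim 3.x API (common_terms, bytes delimiter).
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser


def phrase_model(sins: list):
    """Trigram-based features."""
    try:
        model = Word2Vec.load('data/phrasesmod')
    except Exception:  # no usable saved model, so train one
        # `stopwords` is a module-level collection defined elsewhere in sins.py.
        frasier = Phrases(sins, delimiter=b' ', threshold=0,
                          common_terms=stopwords, scoring="npmi")
        bigram_trans = Phraser(frasier)
        trigram = Phrases(bigram_trans[sins],
                          common_terms=stopwords,
                          delimiter=b' ', threshold=0,
                          scoring="npmi")
        trigram_trans = Phraser(trigram)
        model = Word2Vec(trigram_trans[sins])
        model.save('data/phrasesmod')
    return model
```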


The scansion.py script defines a series of small functions for
evaluating the parsed text according to poetic structure, namely for
comparing lines during the generation process. This includes processing
and filtering, counting characters, counting syllables by referencing
the stress marks in the pronunciation dictionary’s results, evaluating
the numbers indicating the stress, tagging lines, and comparing whether
line ending phonetics (rhymes), metre (e.g., iambic pentameter),
punctuation, and tag patterns match. <p>

The punctuation matching is part of
the aforementioned emulation, which will be discussed shortly. An
additional function analyses the “lexical diversity,” or type-token
ratio, as this has been referenced as a useful metric for establishing
the aesthetic effects of poetry by
[Simonton](https://link.springer.com/article/10.1007/BF00123412), et al.

```python
# Sample code from scansion.py
import nltk
from nltk.stem import SnowballStemmer


def get_ttr(lines: list) -> float:
    """Return type-token ratio."""
    lines = '\n'.join(lines)
    # `filter_lline` is another helper defined in scansion.py.
    tokens = filter_lline(nltk.word_tokenize(lines))
    stemmer = SnowballStemmer("english")
    stems = [stemmer.stem(token) for token in tokens]
    types = set(stems)
    # Guard against empty input.
    return len(types) / len(tokens) if tokens else 0.0


def mimick(line: str, punk: str) -> str:
    """Capitalize line and add required punctuation."""
    return line[:1].upper() + line[1:] + punk
```


The gramarye.py file encompasses my crude attempts at using an
aesthetically defined grammar that I extrapolated from parts of the
sonnets that I prefer, to induce a probabilistic context-free grammar
for generating new lines that mimic that grammar. A function for
defining said grammar writes the rules for the regular expression
parser, which parses a sentence and uses regular expressions to rewrite
it into Penn Treebank format, and creates a Tree from the resulting
string, doing so repeatedly to create a treebank of sorts. 

The other functions in the file are modifications of NLTK’s CFG and ProbDist
generate functions, adding an analysis of probabilities to the tree
traversal and terminal selections.

![](assets/img/pcfg.png)
<center><small><em><a href="https://courses.cs.washington.edu/courses/cse590a/09wi/pcfg.pdf">PCFG example tree via Hoffman, 2009 slide, in turn via Jurafsky &amp; Schutze</a></em></small></center>
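
The induction step behind a PCFG like this is just relative-frequency estimation: each rule's probability is its count divided by the count of all rules with the same left-hand side. The stdlib-only sketch below illustrates that calculation on toy data; `induce_rule_probs` and the productions are hypothetical, and the project itself relies on NLTK for this step.

```python
from collections import Counter, defaultdict


def induce_rule_probs(productions):
    """Estimate P(rhs | lhs) by relative frequency over observed productions.

    `productions` is an iterable of (lhs, rhs) pairs, with rhs a tuple of symbols.
    """
    counts = Counter(productions)
    lhs_totals = Counter()
    for (lhs, _rhs), n in counts.items():
        lhs_totals[lhs] += n
    probs = defaultdict(dict)
    for (lhs, rhs), n in counts.items():
        probs[lhs][rhs] = n / lhs_totals[lhs]
    return dict(probs)


# Toy productions read off three hypothetical parse trees.
observed = [
    ("S", ("NP", "VP")), ("S", ("NP", "VP")), ("S", ("VP",)),
    ("NP", ("death",)), ("NP", ("thou",)),
    ("VP", ("art", "NP")), ("VP", ("die",)), ("VP", ("die",)),
]
rules = induce_rule_probs(observed)
```

Because the probabilities for each left-hand side sum to 1, a generator can sample an expansion for each nonterminal independently while walking the tree.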

Extending from this, the genesis.py module defines a function to parse
the sonnets’ lines per the above grammar functions, by inducing a grammar
from the custom treebank and invoking the modified generate methods. As
this can be a slow process, a timer allows the process to expire after an
adjustable limit. Two functions are associated with the word-level
generator using the Laplace-smoothed conditional frequency distribution
of bigrams. This basic smoothing was used primarily to establish foundational working code for the project.

The first function, *gen\_from\_cfd*, initiates the process within broad
constraints to create a sonnet-like structure, and pre- and post-processes
the results from the *generate\_model* function, which generates
words and evaluates potential results before appending them, using the
scansion functions to compare them with the ‘ur’ sonnet, the template, which
by default is Holy Sonnet X, “Death be not proud… ” At the moment, this
focuses on requiring 10 syllables, rhyming, and a measure of
subjectivity, sentiment-wise.
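
For readers unfamiliar with the smoothing involved, here is a minimal stdlib sketch of Laplace (add-one) bigram estimation and sampling. It mirrors the idea rather than the project's NLTK-based code; `train_bigrams`, `laplace_prob`, and `sample_next` are hypothetical names, and the toy token list stands in for the Donne corpus.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(tokens):
    """Count bigram continuations: counts[prev][next]."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def laplace_prob(counts, vocab, prev, nxt):
    """Add-one estimate of P(nxt | prev) over a fixed vocabulary."""
    return (counts[prev][nxt] + 1) / (sum(counts[prev].values()) + len(vocab))


def sample_next(counts, vocab, prev, rng=random):
    """Draw the next word from the smoothed distribution for `prev`."""
    words = sorted(vocab)
    weights = [laplace_prob(counts, vocab, prev, w) for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


tokens = "death be not proud though some have called thee mighty".split()
vocab = set(tokens)
counts = train_bigrams(tokens)
```

A constrained generator would wrap `sample_next` in a loop that rejects candidates failing the syllable, rhyme, or subjectivity checks. Since smoothing gives every vocabulary word a nonzero probability, rejection can always eventually find a word that satisfies the constraints, at the cost of less fluent sequences.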

```python
# Sample code from genesis.py
from nltk.grammar import Nonterminal, induce_pcfg


def gensen(st: list, n: int = 20, d: int = 4, m: int = 13) -> list:
    """Induce stochastic CFG from sonnet 'treebank'.
       Use modified NLTK functions to probabilistically generate new sentences."""
    gensents = []
    # `gram` and `generated` are the project's own parser and modified generate function.
    sontag = [gram(sent) for sent in st]
    prodsp = [p for _, t in sontag
                       for p in t.productions()]
    grammarp = induce_pcfg(Nonterminal("S"), prodsp)
    for sentence in generated(grammarp, n=n, depth=d):
        # Keep only sentences of at most m tokens.
        if len(sentence) <= m:
            gensents.append(' '.join(sentence))
    return gensents
```

Next, we have karkoav.py, which defines a class and its methods for the
character-level generator. This was a late addition to the project that
became extensive, scratch-built Markov chain code with, ironically, the
most promise in meeting the project goals. It was inspired by Yoav
Goldberg’s demonstration in response to Andrej Karpathy’s famous
char-RNN
[article](http://karpathy.github.io/2015/05/21/rnn-effectiveness/).
Goldberg showed that a char-OMM (to use Jurafsky, et al.’s acronym for
Markov chains, “Observable Markov Models”) of suitable order *k* is very
effective, without the compute costs of the char-RNN.

The constructor first uses a sequence of functions to train or reload a
dictionary containing *k*-gram state keys and their transition
probabilities based on potential next-characters. When *.generate()* is
called in the main file, it operates line-by-line, calling functions
which select *k*-grams to act as seeds and sample subsequent
characters under a rather complicated series of conditions. Essentially, it
attempts to select seeds from thematic clusters found by the FastText
and other models, based on their similarity to previous seeds for
cohesion, and falls back on more random seeds. As with the other
generators, it constrains lines to mimic the sonnet form based on
conditions of length, the metrical results from the scansion
functions, etc., and post-processes lines with a *\_mimickry* function to
emulate the template punctuation.
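
Stripped of the seeding and constraint logic, the core of an order-*k* character model is small. The following is a simplified sketch in the spirit of Goldberg's demonstration, not the karkoav.py implementation: the function names are hypothetical, `k` is assumed to be at least 1, and `~` is assumed not to occur in the training text.

```python
import random
from collections import Counter, defaultdict


def train_char_model(lines, k):
    """Map each k-character state to a Counter of the characters that follow it."""
    model = defaultdict(Counter)
    for line in lines:
        padded = "~" * k + line + "\n"  # '~' pads the start; '\n' marks end of line
        for i in range(len(padded) - k):
            model[padded[i:i + k]][padded[i + k]] += 1
    return model


def generate_line(model, k, rng=random, max_len=80):
    """Sample characters from the start state until end of line or max_len."""
    state, out = "~" * k, []
    while len(out) < max_len:
        followers = model.get(state)
        if not followers:
            break  # unseen state: stop rather than guess
        chars, weights = zip(*sorted(followers.items()))
        ch = rng.choices(chars, weights=weights, k=1)[0]
        if ch == "\n":
            break
        out.append(ch)
        state = (state + ch)[-k:]
    return "".join(out)
```

Training per line, with explicit start and end markers, is what lets the model learn which characters tend to open and close a line, and gives a natural hook for checks such as rhyme at the line ending.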

The results have a title appended based on similar seed logic, searching
for synonyms for the title of the sonnet which contributed the most to
the generated sonnet, where synonym results are available, falling back
on randomly selected sonnet words’ synonyms.

Most recently, I added a simple visualization, as described in the future work section below.

Finally, the main.py file calls functions to load and define the corpora
and template sonnet, and accepts command-line arguments to choose
between the above generators, printing the respective results.

```python
# Sample code from karkoav.py
import string
from collections import Counter, defaultdict


def _train_mod(self) -> dict:
    """Create dictionary of letter k-grams;
       Values are tallies of letters that follow k chars.
    """
    k = self._k
    model = defaultdict(Counter)
    for line in self._imp:
        # Keep printable characters only.
        line = ''.join(filter(lambda x: x in string.printable, line))
        # Start state; preserving final EOL observations to shift probabilities,
        # allow for constraints, such as rhyme check.
        lex = ["<s>"] + list(line)
        for i in range(len(lex)-k): # Update count for distribution.
            model[tuple(lex[i:i+k])][lex[i+k]] += 1
    self._prob_dist(model)
```



Discussion
==========

Analysis
--------
The design of constraints in combination with the generators posed a far
more difficult problem than expected. Using
the basic bigram probability distributions as a foundation, I was
able to incorporate rough phonetic information from CMU's pronunciation dictionary through the
use of the scansion functions described above. Though formally correct
for the most part, there was something lacking in the results. 

Here is an example:

> Of rusty iron ground teach My indisposed parts hee
> 
> This even so But alas is true no;
> 
> Than nature Which hell or think that borrow
> 
> Love clergy only Who abroad I we.
> 
> Yong Contemptuous Yet to thee first me,
> 
> Are aptest to stand stiff till doom be profane lo,
> 
> Candles Which Now receive such gay goe doe,
> 
> Love Philosophy But might try freely.
> 
> Spoken well a calfe an arm and trifle When,
> 
> Which drives them upon Thy Prophets and well;
> 
> Hat to sleepe or make that sacrifice Which hell,
> 
> And flexible to pursue things we then?
> 
> All ages were in ranke itchie lust and country,
> 
> If they spend more Holy mourning as I.

The rhymes match the source sonnet, as well as the punctuation, as
explicitly conditioned and appended. The syllable counts are mostly in
the range of 10 per line, although the mixture of archaic language and
the relatively small amount of data in the CMU dictionary made accuracy
difficult. This became a frequent theme: when performing automatic analysis
on words written in the older style, such as “slipperinesse,” variant
spellings cause errors and a kind of lossiness, missed opportunities.
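
One common mitigation for out-of-vocabulary spellings is to fall back on an orthographic estimate when the pronunciation dictionary has no entry. The sketch below shows that idea with hypothetical helper names; the vowel-group heuristic is crude and is not the syllable counter used in the project.

```python
import re


def estimate_syllables(word):
    """Rough orthographic estimate: count vowel groups, discounting a final silent 'e'."""
    w = word.lower()
    if len(w) > 2 and w.endswith("e") and w[-2] not in "aeiouyl":
        w = w[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", w)))


def count_syllables(word, lexicon):
    """Prefer dictionary phones; fall back to the estimate for out-of-vocabulary words."""
    phones = lexicon.get(word.lower())
    if phones:
        return sum(p[-1].isdigit() for p in phones)  # one stress digit per vowel phone
    return estimate_syllables(word)
```

With this fallback, “slipperinesse” and “sleepe” still receive plausible counts instead of being dropped from the metre check, though the estimate will be wrong for some words.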

Likewise we can see inconsistent capitalization, although in this case I
believe I failed to properly implement line capitalization.
Capitalization is inconsistent and somewhat arbitrary in early Englishes, and I did attempt to
honor these authorial choices when tokenizing the text, although this
can be refined with domain knowledge (recognizing the purpose of emphasis,
common noun-orientation), to counterbalance potential editorial and
compositor errors and influences.

I believe the main fault is the sense of morphological rigidity and a
general syntactic incoherence with abrupt shifts. This could be
mitigated with more careful algorithm design, and/or blending with other
models with subword information, as we’ll see.

From here, I wanted to take a more syntactic approach, but found the
basic CFG a bit too straightforward. Thus, I went with a PCFG. The
problem here was inducing a grammar from the tree productions, without a
pertinent treebank. Perhaps I misunderstood the pipeline, but I had the
impression that I needed to create my own. I took this opportunity to
apply a method I came up with years ago for “reverse canoneering”
(reverse engineering canonized works) structured poems as a creative
practice: essentially applying a series of arbitrary, stylized
algorithmic transformations to the structure of a poem to extrapolate a
template, a form of custom, generative pareidolia, we might say. This might mean deciding to look at the blend of voiced and plosive
sounds, the shape of lines, the inflections and parts-of-speech,
metaphors (containers, etc.), and the like. This notion was originally an exploration of bottom-up techniques to create personalized literary canons, loosening traditional interpretive constraints regarding form especially.

In this case, now aided with a computer, I pored over the sonnets and
focused on the tag patterns from segments of lines, creating a custom
grammar for it. It took time to create rules that wouldn’t recursively,
endlessly expand, or divide the poems too finely, and what I have now is
a bit more baroque than I would like and generally reflects practical more than
aesthetic goals, but at least it’s far more efficient than earlier versions. It
also doesn’t seem to create very good results, although it sometimes
produces real gems; it’s hit or miss:

> Thirst art read from woe flesh
> 
> O his envious soul with me us can kill;
> 
> For all art now
> 
> All this and as than my deign.
> 
> Infinities gained already,
> 
> Jacob profane nor thou on wrath still only,
> 
> Sleep made for all which at dost minerals,
> 
> Nor fills not and holds long and numberless.
> 
> Endless business me made in whose,
> 
> Forgiveness glorified from ends thine end mild weakness;

This is from a shorter, simpler grammar than I have now. I tried out
countless grammars, and suspect I am missing a key piece of logic to
make the system work properly. Perhaps a combination of something like
the much larger Penn Treebank extended with Donne’s sentences, or
perhaps simply parsing Donne’s sentences via Stanford would be best, a process that will require some adjustment to current tag patterns.

Again the problem with archaic language came about, with an
overabundance of nouns in the results. I imagine this made gleaning
information from entropic differences difficult.

From the current, larger grammar, before timing out:

![](assets/img/pcfgen.png)

Finally, I applied the character-level approach, which consumed the bulk
of my experimentation with various constraints and combinations. This
approach proved to produce quite coherent results, presumably because of
the flexibility introduced by the subword-based procedure. However, I
had difficulties constraining this flexible generation with the same
rules applied to the other generators. Instead, I tried to focus on more
semantic and affective aspects, before eventually applying a rather clumsy rhyme constraint that essentially collects rhymes from the source sonnets and appends them to generated lines.

Using Word2Vec’s similar embeddings gave somewhat choppy results, hence
I made use of multiword embeddings, before discovering that I could also
use FastText, which seemed appropriate in its use of subword features. I
did not quantitatively compare the results, but the sets were subtly
different and I felt the latter model added coherence. Since we might
describe such results as thematic (vs. semantic), I felt it plausible
that I might introduce and maintain themes over the poem by integrating
these as seeds in the *k*-gram:char dictionary when possible.

Around this point, I began trying to introduce more properties of
Donne’s style into the sonnets. For instance, his Holy Sonnets are known
for using the Petrarchan tradition to introduce a sort of spiritual
problem in the first 8 lines (two quatrains)—the octet or octave, with a
turn at the 8th line (the “volta”) toward resolving the problem in a
more meditative tone over the final 6 lines (the final 4-line quatrain
and the couplet). I decided I might influence the generated poems by
requiring that the octet have a more subjective sentiment, allowing
it to wane into more neutral tones in the final lines. This actually did
seem to have some effect, though perhaps it was my imagination playing
tricks, in the same way that generating titles seemed to influence my
reading of the poems. 

Here is an example of a result for the char-OMM via the work-in-progress GUI:

![](assets/img/jacob.png)

At one point when evaluating results, I realized I’d
made an indexing error in my code: reading the poems, I sensed a
“turn” around the 7th line where I’d intended it to be at the 8th
(we often see uses of words like “alas” or mid-line full stops
accompanying this in results). So I suppose that basic bit of sentiment
manipulation had some effect, though I had the sense it took a line or
two for the probabilities to shift accordingly.
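
The sentiment schedule can be pictured as a per-line threshold that relaxes after the volta. This is a hypothetical sketch: the function names and the floor values are placeholders, not the ones used in the project.

```python
def min_subjectivity(line_no, volta=8, octave_floor=0.5, sestet_floor=0.0):
    """Subjectivity floor for a 1-based line number; lines after the volta relax."""
    return octave_floor if line_no <= volta else sestet_floor


def accept(line_no, score, **kw):
    """Keep a candidate line only if its subjectivity score meets the floor."""
    return score >= min_subjectivity(line_no, **kw)


# Enumerating from 1 keeps the turn at line 8; a 0-based index compared
# against the same volta value would shift it by one line.
floors = [min_subjectivity(n) for n in range(1, 15)]
```

Making the line numbering convention explicit in the function signature is one way to guard against the kind of off-by-one slip described above.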

As this model took a line-based approach, as opposed to the full circular
scans over the input used by Goldberg in his response to Karpathy, we
preserve these line-specific lexical distributions.

## Further discussion

For the PCFG, despite customizable tree depth limits, it took time to write clauses that wouldn’t cause the induction function to spin endlessly, and ultimately the archaic language, I suspect, is difficult to tag. The parser often sees long strings of nouns or verbs rather than potential predicate-argument structures and the like. Still, it has potential as a constraint for the other models.

For instance, an early result can be seen here:

<div style="font-family: Garamond; text-indent: 0px;">All dim flower pilgrim shall thee arise<br>
Heaven doth my everlasting profane;<br>
Unto dissemble wills's sign and jointure<br>
Him yet who hath and slave death.<br>
It at my st's dwell now wilt,<br>
Of I invest sinned and imprisoned and so i,<br>
Son loath flesh time and let he died,</div>

The Penn POS tags for this were:

<div style="font-family: Garamond; text-indent: 0px;">All/DT dim/NN flower/NN pilgrim/VBP shall/MD thee/VB arise/NN<br>
Heaven/NNP doth/CC my/PRP$ everlasting/NN profane/NN;<br>
Unto/NNP dissemble/JJ wills’s/NN sign/NN and jointure/NN<br>
Him/NNP yet/RB who/WP hath/NN and/CC slave/VB death/NN.<br>
It/PRP at/IN my/PRP$ st’s/NN dwell/NN now/RB wilt/VBP,<br>
Of/IN I/PRP invest/VBP sinned/VBN and/CC imprisoned/VBN and/CC so/RB i/JJ,<br>
Son/NNP loath/NN flesh/JJ time/NN and/CC let/VB he/PP died/VB,</div>

The bigram model was relatively easy to implement and constrain, because we are working with conventional tokens, but the whole-word approach to constructions gave a feeling of morphological stiffness with respect to stems and affixes. When working on combinations at this level, relying on present and previous word conditions, the range of possibilities feels too limited, with a narrow permutation space of roughly 8 words per line versus roughly 40 characters.

Best results were achieved, subjectively speaking, with the character-level approach. With a larger feature space, we can map a single author’s style more effectively, I think, with the small, signature details. We can also handle morphology with respect to lexical stems and functional affixes, giving more fluidity to the results. It was very interesting to watch the model produce “endless infinities” or “endless in me” from occurrences of “numberless infinities” and “endless.” 

We can also manipulate results at the phonemic level. At the same time, I found the lines harder to weave into a consistent poetic form. There is a poor correspondence to dictionary headwords, and the results shift rapidly from verbatim mimicry to gibberish with a change in order from, say, 4 to 5 characters. Yoav Goldberg, in his response to Andrej Karpathy that inspired the use of this model, noted a lack of context awareness for such models, as opposed to an RNN learning about opening and closing brackets, for example.

## Conclusion

This proved to be more challenging that I intended, and many of the problems began with choosing an archaic mode of English to constrain closely, rather than allowing automatic, wide-ranging inference from machine learning procedures. The creative process starts at the choice of data. Any decision affects both practical and artistic acceptability.
<p>
I consider this project a failure in terms of implementing available tools well, and producing something worthwhile. But a success as a learning experience. Building the Markov chain from scratch and attempting to induce a useful grammar gave me a much deeper understanding, complementing my knowledge of neural networks.
<p>
The most fundamental realization I had, practically speaking, was that for acceptable results as a tool or cultural artefact, I need to combine approaches, and ensure the system can be tweaked and tuned by users at every step, at simple or more advanced levels.
<p>
Two observations interested me. First, adding a title to the generated sonnets influenced my perception, as I began searching for a relationship when reading. Second, many latent variables, so to speak, can be captured by low-level models, which reveal frequent patterns such as alliteration.

## Future Work

<p>
One can see the value of production-oriented projects for learning, as they force us to [notice](https://www.victoria.ac.nz/lals/about/staff/publications/paul-nation/2007-Four-strands.pdf#page=3) [PDF, pp. 3-4] gaps in our more receptive abilities that we’ve glossed over when reading and listening, or when writing code where data analysis and interpretation are the final step.
<p>
In this way, the role of NLG in Computer Science curricula might be analogous to how I envision Creative Writing workshops in Literature departments: as STEM-like labs, complementing the receptive with the productive to learn and master concepts.

In interdisciplinary terms, there are rich, untapped resources in theory and criticism for NLG, as one might find when exploring analyses of structured poetry. In turn, NLG can reveal more in literature and augment artistic production.
<p>
Workshops might create a feedback loop, with computation quantifying and cognitive science analyzing writing to augment and attenuate it, and reciprocally, creativity and theory may force awareness of subjectivity and plasticity in these empirical models.


The code is in dire need of refactoring. In particular, some symbolic
restructuring may be needed, using constraint-handling searches and
structures like those in Stuart Russell and Peter Norvig’s seminal
*AIMA* textbook.

We might also blend this model with the PCFG
model, or at the least, part-of-speech tags, by creating Markov chains
based on template sonnet tag schemes, comparing sequences of results to
ensure syntactic validity, and/or by using an evolutionary algorithm to
blend and mutate results from different models, per the work of
Manurung et al. ([2004](https://www.era.lib.ed.ac.uk/handle/1842/314)).

More fine-grained use of sentiment is also of interest, especially in
mirroring source texts to imitate the personality of Donne and create a
kind of “flatline construct”: Donne frequently uses a rough humor in his
language, in contrast to the loftier spiritual tones.

Rhyming is a difficult component, in part because the poems are
transmitted primarily as text and read by people who speak and
subvocalize many Englishes. British English itself no longer resembles
its earlier forms, in which words such as “prove” once rhymed with
“love,” as David [Crystal](https://www.youtube.com/watch?v=YGO7TYQs4dY) and his
son often point out with Shakespeare and what’s called OP, “Original
Pronunciation.” Thus, I did not prioritize the implementation of rhymes
at this stage. I do believe, however, that more complicated algorithms,
looking both forward and backward, might be useful in this regard,
allowing for smoother backing off and re-attempting of constructions. I
have recently found related work by Reddy & Knight
([2011](http://www.aclweb.org/anthology/P11-2014)).
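
To make the pronunciation problem concrete, here is a minimal sketch of a rhyme test over CMUdict-style ARPAbet transcriptions. The tiny `TOY_PRON` dictionary is hand-copied for this example and the function names are hypothetical; a fuller version would presumably load the whole of CMUdict rather than four entries.

```python
# Hand-copied, CMUdict-style entries (modern pronunciations); an assumption
# for illustration only, not the project's lookup table.
TOY_PRON = {
    "love": ["L", "AH1", "V"],
    "dove": ["D", "AH1", "V"],
    "prove": ["P", "R", "UW1", "V"],
    "move": ["M", "UW1", "V"],
}


def rhyme_part(phones):
    """Return the phones from the last stressed vowel to the end of the word."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i][-1] in "12":  # ARPAbet vowels end in a stress digit
            return tuple(phones[i:])
    return tuple(phones)  # no stressed vowel found: compare the whole word


def rhymes(a, b, pron=TOY_PRON):
    """True if two different, known words share the same rhyming part."""
    if a == b or a not in pron or b not in pron:
        return False
    return rhyme_part(pron[a]) == rhyme_part(pron[b])


if __name__ == "__main__":
    for pair in [("love", "dove"), ("love", "prove"), ("prove", "move")]:
        print(pair, rhymes(*pair))
```

Under these modern transcriptions “love” and “prove” do not rhyme, which is exactly the mismatch described above. Passing a different dictionary through the `pron` argument would be one way to experiment with period pronunciations.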

At present, the Tkinter GUI is still in development. I had initially
wanted an interactive interface for the program, via Tkinter or a
web-based GUI, and intend at least to allow for user input which could
weight and shift results, making it a more useful writing tool. This
would also impel me to include piecewise visualization of the generation
process, in a way that would not add too much overhead, perhaps by
keeping logs for later reconstruction rather than visualizing on-line.

More careful proofreading and preprocessing of source texts is
necessary, I think, and this brings us to the tricky matter of the many,
many versions of Donne’s poetry. There is a massive
[tome](http://donnevariorum.tamu.edu/) analyzing the variations that
editorial choices and mistakes have introduced over the centuries since
his work was published. Deciding which versions to use as “legitimate”
ground truth is difficult; I chose versions here that I considered
modern enough to have many dictionary results and consistent matches
(the more archaic versions vary more within texts, I believe), offset by
some more archaic versions that I personally found appealing, to give
the results a more aesthetically pleasing flavor.

This dynamic is also of interest: when designing these as writing
tools, how much should one hardcode one’s personal preferences, and how
much should the code be designed so that it can be attenuated,
generalized, or adapted? We might
attempt to reproduce strictly the nuances of Donne’s work, or create a
general sonnet generator, or make the code itself more a part of the
poems, so that it forms a seamless creative artefact from
author/programmer to the produced text, which may or may not be a
protean, endlessly interactive work, or something more stationary and
immutable.

A visualization I have in mind is one
inspired by encountering Ron Padgett's *Creative Reading* a couple of
years ago:

![](assets/img/padgett.png)

That is, I wanted to trace the path of how the language model “read” the
inputs to produce the sonnet, by looking at the most contributing
sonnets per line and indexing initial parallels, color-coding lines and
tracing as we move across the output’s lines to where they would have
been in the input. I began a function for this with some tools in mind,
which spawned the function I am currently using for title generation.
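
One way such attribution might be approximated is by counting shared character *n*-grams between each output line and each source text. The sketch below is an assumption-laden illustration, not the function mentioned above: the names `char_ngrams` and `best_source` and the toy source strings are made up for the example.

```python
from collections import Counter


def char_ngrams(text, n=4):
    """Multiset of lowercase character n-grams in `text`."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def best_source(line, sources, n=4):
    """Return (name, overlap) for the source sharing the most n-grams with `line`.

    `sources` maps a label (e.g. a sonnet title) to its text.
    Returns (None, 0) when nothing overlaps.
    """
    line_grams = char_ngrams(line, n)
    best_name, best_overlap = None, 0
    for name, text in sources.items():
        # Counter & Counter keeps the minimum count of each shared n-gram.
        overlap = sum((line_grams & char_ngrams(text, n)).values())
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name, best_overlap


if __name__ == "__main__":
    toy_sources = {  # toy strings, not real corpus entries
        "source_a": "numberless infinities",
        "source_b": "endless night",
    }
    print(best_source("endless infinities", toy_sources))
```

Running this per output line would give a label to color-code by, and the overlap count could set the weight of the traced path.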

Currently, the visualization is in the proof-of-concept stages, and rather weak proof, at that.

![](assets/img/jacobvis.png)

I considered analyzing the phonetic inventory of the result poems to
influence linearity and curvature to reproduce the “[bouba/kiki
effect](https://en.wikipedia.org/wiki/Bouba/kiki_effect),” the tendency
to assign “sharp” or “round” visuals to consonants like “k” and “b,”
respectively.

For evolutionary and other purposes, I also need to devise a more
comprehensive metric for the fitness function, which again would be
influenced by user input. This touches on whether the “standard” should
be the source or the user’s perceptions, reflecting perhaps the conflict
in literary theory over how closely to incorporate the author’s intent
into the reading of a work.
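
As a sketch of what a user-weighted fitness function could look like, the example below averages a few component scores using weights that would come from the user. Everything here is hypothetical: the two scorers are deliberately crude stand-ins (line count, and a word count per line standing in for meter), and the targets of 14 lines and 8 words are assumptions for illustration.

```python
def line_count_score(lines, target=14):
    """1.0 when the poem has `target` lines, falling off linearly to 0."""
    return max(0.0, 1.0 - abs(len(lines) - target) / target)


def word_count_score(lines, target=8):
    """Mean closeness of each line's word count to `target` (a crude proxy for meter)."""
    if not lines:
        return 0.0
    per_line = [
        max(0.0, 1.0 - abs(len(line.split()) - target) / target)
        for line in lines
    ]
    return sum(per_line) / len(per_line)


# Hypothetical registry of component scorers; more (rhyme, sentiment) could be added.
SCORERS = {"line_count": line_count_score, "word_count": word_count_score}


def fitness(lines, weights, scorers=SCORERS):
    """Weighted average of component scores; missing weights count as 0."""
    total = sum(weights.get(name, 0.0) for name in scorers)
    if total <= 0:
        return 0.0
    weighted = sum(weights.get(name, 0.0) * fn(lines) for name, fn in scorers.items())
    return weighted / total


if __name__ == "__main__":
    poem = ["one two three four five six seven eight"] * 14
    print(fitness(poem, {"line_count": 1.0, "word_count": 2.0}))
```

Whether the weights should default to values estimated from the source poems or be left entirely to the user is the same question of “standard” raised above.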

When I reviewed capitals in results, I found myself uncertain whether
Donne had intended the capitals mid-line, for various reasons, or
whether bugs in my program had caused this. I realized that the age-old question, for those who choose to ask it, had shifted from “What did the author mean?” to “What properties in the input and code caused the mechanistic production of this result?” The factors to consider include the authors’ intentions and mistakes, with ‘authors’ ranging from Donne, to editors and compositors of varying time periods (e.g., Early Modern English in the early 17th century and its capitalization practices), to the programmers of the various modules used, the computer scientists who designed the algorithms those programmers implemented, the data used to create the models for parsing and sentiment analysis, and of course myself.

The processes of production and comprehension were intimately intertwined
here, and I found that having some measure of domain knowledge made this
process easier, allowing me to compose code with respect to the poetry’s
properties from memory, rather than having to continually consult the
source text(s). Typically, it is knowledge of code and maths that would
streamline that composition process. To me, this emphasized the
uniqueness of systems such as this that cross domains, and the
increasing importance of practitioner diversity and participation for
the ever-widening gyre of code’s integration into society, to ease
cognitive load and/or quickly catch mistakes during development, for
example.