## Generating Free Verse with Markovify

We've already seen in a previous notebook how to generate sentences using [markovify](https://github.com/jsvine/markovify) and an entire novel(!!) using [markov-novel](https://github.com/accraze/python-markov-novel). In this short notebook, I'll show you how to adapt markovify in order to also generate free verse. 

To work with this notebook locally on your own machine in your own Jupyter environment, you'll need to download this .ipynb file, as well as both .txt files ("moon_poetry.txt" and "sea_poetry.txt"). You'll also need to install markovify if you haven't already. 

One potential way to make Markov-generated poetry more semantically coherent is to build the model on top of a thematic corpus, which is to say a corpus of poems all devoted to a similar subject, such as love, nature, illness, war, or peace. I've started a "Datasets" page on our Elms site that includes a link to the [Kaggle poetry dataset](https://www.kaggle.com/michaelarman/poemsdataset). This dataset, which has been released into the public domain, is organized by poetic type and theme. I've consolidated all the lunar poems into a single file to make them easy to work with, and done the same for the nautical poems.

Let's first take a look at the "moon" poetry corpus. We'll use what should by now be familiar syntax to open and read our file:

In [None]:
# Install markovify if you haven't already
%pip install markovify

In [None]:
# Read the whole moon corpus into one string; "with" closes the file afterwards
with open("moon_poetry.txt") as f:
    moon_poetry = f.read()

We can use the Python print() function to display the dataset in our notebook. Since we've assigned the entire moon corpus to the variable "moon_poetry," our print command is succinct:

In [None]:
print(moon_poetry)

Now let's markovify it to generate new lines of poetry using Markov chains. I've annotated the block of code below so you can get a better sense of what's happening. Run the code cell to output seven lines of free verse:

In [None]:
import markovify #import the markovify library

# Build the model, which uses "markovify.NewlineText" instead of the previous "markovify.Text"
# "markovify.Text" has the sentence as its main unit of composition,
# whereas "markovify.NewlineText" has the poetic line as its main unit of composition
# Note the use of "moon_poetry" as our variable, which holds our entire moon poetry corpus
text_model = markovify.NewlineText(moon_poetry)

# Print seven randomly-generated poetic lines of no more than 280 characters
for i in range(7):
    print(text_model.make_short_sentence(280))

If you don't like the output, run the cell again: it will keep generating new text. Copy-and-paste any output that you do like into a markdown cell to preserve it.  
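One thing to watch for when re-running the cell: markovify returns `None` whenever it cannot build a line that satisfies its constraints within its allotted tries, so the loop above will occasionally print the word "None" instead of a poetic line. The following is a minimal sketch of one way to skip those failed attempts. The helper name `generate_lines` and its `max_attempts` parameter are my own inventions for illustration, not part of markovify:

```python
def generate_lines(model, n_lines, max_chars, max_attempts=100):
    """Collect up to n_lines generated lines, skipping failed attempts.

    `model` is assumed to be any object with a make_short_sentence()
    method, such as a markovify.NewlineText model. The name of this
    helper and the max_attempts cap are hypothetical.
    """
    lines = []
    attempts = 0
    # Stop once we have enough lines, or once we have tried too many times
    while len(lines) < n_lines and attempts < max_attempts:
        attempts += 1
        line = model.make_short_sentence(max_chars)
        if line is not None:  # None means this attempt produced nothing
            lines.append(line)
    return lines
```

With the model built above, you could then write `for line in generate_lines(text_model, 7, 280): print(line)`. If the corpus is small or the character limit is very tight, the helper may return fewer lines than you asked for rather than looping forever.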

You can slightly alter the code to (1) make the poetic lines longer or shorter, (2) increase or decrease the number of lines, and (3) adjust the degree of novelty. The Markov order, or state size, is the number of preceding words the model looks at when choosing the next word: the smaller it is, the more novel the output will be, sometimes at the expense of coherence:

In [None]:
import markovify #sample code from https://github.com/jsvine/markovify

# Build the model.
text_model = markovify.NewlineText(moon_poetry, state_size=1)

# Change the number passed to range() to set how many poetic lines you want
# Change the number passed to make_short_sentence() to set the maximum characters per line
# Change the value of state_size, above, to modify the Markov order
for i in range(8):
    print(text_model.make_short_sentence(200))

Here's some sample output that I decided to retain because I found it compelling:

>The tide of words too
>to catch the moon may learn,
>As mellow as a worm.
>There is no more its swelling surge confines,
>Shrunk to a blue-eyed hawk;
>Her Bonnet is the moon is caught lightly,
>Past the charred silos, past the Jade-gate Pass.

I really like "as mellow as a worm"!
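To see why a smaller state size produces more novel lines, it can help to build a tiny chain by hand. The sketch below is a simplified stand-in for what markovify does internally, not its actual implementation: the function `build_chain` and the three-line `toy_corpus` are made up for illustration, and the sketch ignores details such as how markovify marks the beginning and end of each line.

```python
from collections import defaultdict

def build_chain(text, state_size):
    """Map each run of `state_size` words to the words seen right after it."""
    chain = defaultdict(list)
    for line in text.split("\n"):  # treat each poetic line separately
        words = line.split()
        for i in range(len(words) - state_size):
            state = tuple(words[i:i + state_size])
            chain[state].append(words[i + state_size])
    return dict(chain)

# A made-up three-line corpus, just for illustration
toy_corpus = "the moon is bright\nthe sea is calm\nthe moon was calm"

order_1 = build_chain(toy_corpus, 1)
order_2 = build_chain(toy_corpus, 2)

print(order_1[("is",)])         # words that can follow "is"
print(order_2[("moon", "is")])  # words that can follow "moon is"
```

With a state size of 1, the word "is" can be followed by either "bright" or "calm", so the chain can wander into a line like "the moon is calm" that appears nowhere in the corpus. With a state size of 2, "moon is" can only be followed by "bright", so the chain stays closer to its sources.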

Keep in mind that digital poets will often edit and revise output to make it more coherent or intentional, or just more of a collaboration between human and machine. Nanni Balestrini, as we saw last week, altered punctuation, pronouns, and even diction (generally substituting a cognate word for one proffered by the computer) in "Tape Mark I." Carolyn Lamb et al., in ["A Taxonomy of Generative Poetry Techniques"](https://archive.bridgesmathart.org/2016/bridges2016-195.pdf), call this process "Human Enhancement":

>The most obvious way for a human to enhance computer-generated poetry is to edit the poetry generator’s output. While this arguably invalidates the generator’s usefulness, it is an established practice. John Cage, for instance, removed unwanted words from the output of his algorithms. Computational text generation is seen by many as a "jumping-off point" from which they acquire raw material.

As one of your "Going Further" posts, you might subject output from Markovify to this kind of revision and copy-editing. Post the original free verse output along with your tweaked and edited version, explaining what you liked about the original and what interventions you made to further develop it. 

Now that we've experimented with the moon poetry corpus, let's markovify the sea poetry corpus. First let's take a look at what we've got in the original file:

In [None]:
# Read the whole sea corpus into one string; "with" closes the file afterwards
with open('sea_poetry.txt') as f:
    sea_poetry = f.read()
print(sea_poetry)

We're now ready to generate some free verse based on the sea poems. We'll use the same block of code as before, the only difference being that we'll point the code to the sea poetry corpus instead of the moon poetry corpus (swapping out the "moon_poetry" variable for the "sea_poetry" variable):

In [None]:
import markovify #sample code from https://github.com/jsvine/markovify

# Build the model.
text_model = markovify.NewlineText(sea_poetry, state_size=1)

# Change the number passed to range() to set how many poetic lines you want
# Change the number passed to make_short_sentence() to set the maximum characters per line
# Change the value of state_size, above, to modify the Markov order
for i in range(8):
    print(text_model.make_short_sentence(200))

Once again, keep re-running the cell until you get something you like. Then copy-and-paste it into a new markdown cell.  

As one final experiment, we can combine the models built from two different texts. Study the following code, which creates a combined Markov model from both the moon poetry corpus and the sea poetry corpus:

In [None]:
# Assign each model to a different variable and specify the state size
# (models can only be combined if they share the same state size)
model_a = markovify.NewlineText(sea_poetry, state_size=2)
model_b = markovify.NewlineText(moon_poetry, state_size=2)

# Combine them into a new model called "model_combo";
# the second list gives each model's weight, in the same order as the first
model_combo = markovify.combine([model_a, model_b], [1, 2])

The final line of that code cell passes two lists to markovify.combine: the models to combine, and a parallel list of weights. The weight "1" applies to the sea poetry model ("model_a") and the weight "2" applies to the moon poetry model ("model_b"), because each weight occupies the same list position as its model. Together these values weight the moon corpus twice as heavily as the sea corpus, so moon themes and imagery should tend to predominate over sea themes and imagery. You can change those proportions by adjusting the values.
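To get a feel for what the weights do, here is a rough sketch under one assumption: that combining models amounts to multiplying each model's word counts by its weight and then adding them together, which is my understanding of how markovify.combine behaves. The function `combine_counts` and the counts themselves are invented for illustration. They stand for how often each word followed some shared state (say, "the silver") in each corpus:

```python
from collections import Counter

def combine_counts(counts_list, weights):
    """Scale each model's follow-word counts by its weight, then add them up."""
    combined = Counter()
    for counts, weight in zip(counts_list, weights):
        for word, count in counts.items():
            combined[word] += count * weight
    return combined

# Invented counts of the words that follow one shared state in each corpus
sea_counts = Counter({"tide": 3, "light": 1})
moon_counts = Counter({"light": 2, "glow": 1})

# Same weights as in the cell above: sea = 1, moon = 2
combined = combine_counts([sea_counts, moon_counts], [1, 2])
total = sum(combined.values())

# Show each word's weighted count and its share of the total
for word, count in combined.most_common():
    print(word, count, round(count / total, 2))
```

Notice that "tide" keeps a sizeable share even though the sea model has the lower weight, because it was frequent in the sea counts to begin with. If the sketch's assumption holds, the same applies to the real corpora: when one file is much larger than the other, you may need a bigger weight than "2" before the smaller corpus noticeably predominates.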

Let's run the next code cell, which uses the combined model ("model_combo") to generate poetic lines:

In [None]:
for i in range(8):
    print(model_combo.make_short_sentence(200))