For this assignment, I felt quite pleased that I could move away from working with emoji-text. The Markov chain was very interesting to me; as I read over the tutorial, I couldn't help but be enamored with the fact that we could now create 'predictive text'.

A while back, there was a social media trend where people would post their iMessage predictive-text sentences on forums for others to laugh at. The premise was that you could only tap the first word that popped up in the iMessage autofill toolbar, and keep building a sentence from the words the application fed you. The results varied from "I like food" to "I drink cactus juice".

I really enjoyed this interaction between the user and the software, and I thought it would be interesting to explore the relationship between generative text and direct user-inputted text. In other words, I wanted a human and the machine to co-author a piece of text.

In [43]:
import markovify

In [44]:
# read the whole novel; `with` closes the file afterwards
with open("dracula.txt") as f:
    textA = f.read()

In [45]:
# read the whole short story; `with` closes the file afterwards
with open("uglyduckling.txt") as f:
    textB = f.read()

First, I imported markovify and found two texts to use as test subjects while I familiarized myself with Markov chains. I chose Dracula (a full-length novel) and The Ugly Duckling (a short story).

In [46]:
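class SentencesByChar(markovify.Text):
    """Markov model whose units are single characters instead of words."""

    def word_split(self, sentence):
        # treat every character (including spaces) as one unit
        return list(sentence)

    def word_join(self, words):
        # glue the characters back together with no separator
        return "".join(words)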

Initially, I knew that I should start with words as the n-gram unit. Breaking the text down into character units seemed very fine-grained; it would help if I wanted to vary spelling or generate new words (like the mood generator in the tutorial). Since I was mainly concerned with larger sentence portions, I didn't think I would use SentencesByChar, but I kept the code here in case I needed to switch to character n-grams later.
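To make the difference between the two units concrete, here is a small standalone sketch in plain Python (no markovify needed). It splits one sentence into word units and into character units and lists the 2-grams of each. The `ngrams` helper is a hypothetical name introduced only for illustration, and the whitespace split is a rough stand-in for what markovify.Text does by default.

```python
def ngrams(units, n):
    """Return every run of n consecutive units as a tuple."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]

sentence = "I like food"

# Word-level units: roughly what a word-level model works with.
word_units = sentence.split()
# Character-level units: what SentencesByChar.word_split returns.
char_units = list(sentence)

print(ngrams(word_units, 2))
print(ngrams(char_units, 2)[:4])
```

The character version produces many more, much smaller units, which is why it suits invented words better than whole-sentence generation.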

In [47]:
generator_a = markovify.Text(textA)
generator_b = markovify.Text(textB)
combo = markovify.combine([generator_a, generator_b], [0.5, 0.5])

What was really fascinating to me was the markovify.combine feature. Because I intended to combine machine-generated text with user input, I wanted to experiment with it to see what I could do. Following the tutorial, I built one generator from the Dracula text and one from the Ugly Duckling text, combined them with equal 50/50 weights, and stored the result in the 'combo' variable.

In [48]:
print(combo.make_sentence())

I threw on some clothes and got her death-warrant.


Quite the strange output. This most likely was from the Dracula text since the Ugly Duckling had no clothes or death warrant. I printed another output.

In [54]:
print(combo.make_sentence())

There were, he said, six in the way that was heart-breaking to hear.


In [55]:
print(combo.make_sentence())

I admit that in all sorts of shapes, as well as I had to hurry breakfast, for the sake of those dear to his old wound might act detrimentally on Jonathan.


It seemed that both outputs were mainly from Dracula. I wondered if the size of each text might affect the output in some way, despite both weights being equal. I adjusted the weights, this time with more emphasis on the Ugly Duckling.
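One possible explanation for the Dracula-heavy output, offered as an assumption rather than something confirmed here: if markovify.combine scales each model's transition counts by its weight and then adds them, a much larger corpus still contributes far more counts at equal weights. The sketch below illustrates that idea in plain Python with two toy corpora; `build_counts` and `combine_counts` are hypothetical helpers, not markovify functions.

```python
from collections import Counter, defaultdict

def build_counts(text):
    """Count how often each word follows each other word (an order-1 chain)."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def combine_counts(models, weights):
    """Scale each model's counts by its weight, then add them together."""
    combined = defaultdict(Counter)
    for counts, weight in zip(models, weights):
        for state, followers in counts.items():
            for word, n in followers.items():
                combined[state][word] += n * weight
    return combined

def share_of(combined, state, word):
    """Fraction of the weighted counts after `state` that belong to `word`."""
    followers = combined[state]
    return followers[word] / sum(followers.values())

# Toy corpora: the "big" text repeats its sentence 50 times, the "small" one 5 times.
big = build_counts("the count rose . " * 50)
small = build_counts("the duckling swam . " * 5)

equal = combine_counts([big, small], [0.5, 0.5])
skewed = combine_counts([big, small], [0.1, 0.9])

print(round(share_of(equal, "the", "count"), 2))
print(round(share_of(skewed, "the", "count"), 2))
```

In this toy setup the larger text keeps most of the probability after "the" at equal weights, and still slightly more than half at 0.1/0.9, which would be consistent with sentences that keep sounding like Dracula.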

In [56]:
combo = markovify.combine([generator_a, generator_b], [0.2, 0.8])

In [57]:
print(combo.make_sentence())

I, too, shall go with Jack and the occasion, and stood silent, waiting.


In [58]:
combo = markovify.combine([generator_a, generator_b], [0.1, 0.9])

In [59]:
print(combo.make_sentence())

Straightway he began to get down and struck him over the jamb of the dogs, and carrying him in, placed him on a butcher's shop in time.


In [60]:
print(combo.make_sentence())

The Roumanians were wild, and wanted to get down and the other houses.


In [61]:
print(combo.make_sentence())

But the wind blew so hard that the work if I may.


With the weights pushed to the extreme (0.1 for Dracula and 0.9 for Ugly Duckling), the sentences seemed to become more incoherent, which was probably a sign that the Ugly Duckling text was taking over.

After playing around a bit more with the values, I was content with using markovify.combine to attempt a human-machine co-authored piece.

In [49]:
# "word" for a word-level model; any other value uses the character-level SentencesByChar
level = "word"
# controls the length of the n-gram
order = 7
# controls the number of lines to output
output_n = 1
# weights between the models; text A first, text B second.
# if you want to completely exclude one model, set its corresponding value to 0
weights = [0.1, 0.9]
# limit sentence output to this number of characters
length_limit = 150

The tutorial was absolutely amazing to follow. I changed the n-gram level from char level to word level, fiddled around with the order until I hit a sweet spot, and set the output line to 1 with a 150 character limit. I knew that my 'user corpus' would be pretty small compared to Dracula, so I skewed the weights: 0.1 Dracula, 0.9 User Text.
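As a rough illustration of what the order setting does, the sketch below counts how many states (runs of `order` words) in a tiny sample have more than one possible next word. When almost no state has a choice, the chain can only replay its source. The sample sentence and the `branching_states` helper are made up for this example and are not part of markovify.

```python
from collections import defaultdict

def branching_states(words, order):
    """Return (number of distinct states, number of states with 2+ distinct followers)."""
    followers = defaultdict(set)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        followers[state].add(words[i + order])
    branching = sum(1 for f in followers.values() if len(f) > 1)
    return len(followers), branching

sample = "the duck swam and the duck slept and the cat slept".split()

for order in (1, 2, 3):
    total, branching = branching_states(sample, order)
    print(f"order {order}: {total} states, {branching} with a choice")
```

On this toy sample no order-3 state has more than one follower. A word-level order of 7 is likely to behave similarly on a novel-length text, producing sentences that closely track the original, though that is an assumption rather than something measured here.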

In [50]:
textInput = input("Sentence here:")

Sentence here:One day a young woman was walking to the store to buy some bread


In [51]:
model_cls = markovify.Text if level == "word" else SentencesByChar
gen_a = model_cls(textA, state_size=order)
gen_b = model_cls(textInput, state_size=order)
gen_combo = markovify.combine([gen_a, gen_b], weights)
for _ in range(output_n):
    out = gen_combo.make_short_sentence(length_limit, test_output=False)
    # make_short_sentence returns None if it cannot build a sentence within the limit
    if out is not None:
        print(out.replace("\n", " "))
    print()

He find in patience just how is his strength, and what are his powers.



I created a variable to store the user input, and then combined the user input text with the Dracula text using markovify.combine. I received an output, which was a bit surprising since I expected something to go wrong. Of course, given that gen_b only had one line of text to work with, I knew that the output was mainly going to be from Dracula.

To remedy this, I created an empty "userinput" text file and appended whatever the user typed to it. This way, as the user continues to 'write' the story alongside the predictive-text generator, the user text file grows as well.

In [53]:
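while True:
    textInput = input()
    # append the new line to the running user corpus; `with` closes the file each time
    with open("userinput.txt", "a") as textInput1:
        textInput1.write(textInput)
    model_cls = markovify.Text if level == "word" else SentencesByChar
    gen_a = model_cls(textA, state_size=order)
    # note: gen_b is built from the latest line only, not from userinput.txt
    gen_b = model_cls(textInput, state_size=order)
    gen_combo = markovify.combine([gen_a, gen_b], weights)
    for _ in range(output_n):
        out = gen_combo.make_short_sentence(length_limit, test_output=False)
        print()
        # make_short_sentence returns None if it cannot build a sentence within the limit
        if out is not None:
            print(out.replace("\n", " "))
        print()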

One day, a young woman was walking to the store to buy some bread

He had only one outburst and that was yesterday at an unusual time.

Still, the woman thought. She should probably buy some bread for him to feel better

I knew the swaying round forms, the bright hard eyes, the white teeth, the ruddy colour, the voluptuous lips.

Thinking about his features made her think back to when they first met

That going down to the vault a second time was a remarkable piece of daring.



KeyboardInterrupt: Interrupted by user

Here is an example of my first run. I trapped everything in a while loop (I know...but this is only for testing), and the program began by asking me to type a sentence into the box. 

I typed out "One day, a young woman was walking to the store to buy some bread", and the generator built on it with, "He had only one outburst and that was yesterday at an unusual time."

Extending the story, I wrote, "Still, the woman thought. She should probably buy some bread for him to feel better", to which the generator replied, "I knew the swaying round forms, the bright hard eyes, the white teeth, the ruddy colour, the voluptuous lips."

I thought this was very interesting: I was technically the one making up the meaning of this story, but I was also being led by what the machine put out. I imagine that if someone were to keep writing alongside the generator, the story would change as the user input text file grows larger and larger, eventually producing an 'original' work of sorts, co-authored by man and machine.

To check that the user text file was actually being updated with the user's input, I took a peek. Sure enough, each new sentence the user typed was being added to the file.
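Two details of the loop above are worth noting: write() adds no separator between entries, and gen_b is built from textInput (the latest line) rather than from the file. A minimal sketch of one way to address both, assuming one sentence per line is an acceptable file layout; `append_line` and `read_corpus` are hypothetical helpers, and the demo writes to a temporary file rather than the real userinput.txt.

```python
import os
import tempfile

def append_line(path, text):
    """Append one user entry to the corpus file, ending it with a newline."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(text.strip() + "\n")

def read_corpus(path):
    """Return everything written so far as a single string."""
    with open(path, encoding="utf-8") as f:
        return f.read()

# Demo in a temporary directory so the real user file is left untouched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "userinput_demo.txt")  # hypothetical file name
    append_line(demo_path, "One day, a young woman was walking to the store")
    append_line(demo_path, "Still, the woman thought.")
    corpus = read_corpus(demo_path)

print(corpus.splitlines())
```

The string returned by read_corpus could then stand in for textInput when building the user model, for example with markovify.NewlineText, which treats each line as a separate sentence. Whether that noticeably changes the output at weights of 0.1/0.9 is untested here.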

In [66]:
ending = False
while not ending:
    textInput = input()
    # append the new line to the user corpus (this includes the "THE END" line itself)
    with open("userinput.txt", "a") as textInput1:
        textInput1.write(textInput)
    if textInput == "THE END":
        ending = True
    else:
        model_cls = markovify.Text if level == "word" else SentencesByChar
        gen_a = model_cls(textA, state_size=order)
        # note: gen_b is built from the latest line only, not from userinput.txt
        gen_b = model_cls(textInput, state_size=order)
        gen_combo = markovify.combine([gen_a, gen_b], weights)
        for _ in range(output_n):
            out = gen_combo.make_short_sentence(length_limit, test_output=False)
            print()
            # make_short_sentence returns None if it cannot build a sentence within the limit
            if out is not None:
                print(out.replace("\n", " "))
            print()

One day, a young boy was walking through the dark woods

There are many ships weighing anchor at the moment in your so great Port of London.

Yet, he was not interested in these ships. All he wanted to do was to pick some flowers in the forest clearing

The poor fellow was quite broken down; now and again he gave a low groan which he could not suppress--he was thinking of his wife.

Even though he looks like a young boy, he was, in reality, a thirty year old man

You have copied maps of it, and you know it at least more than we do.

His doctor's words echoed through his mind. He remembers looking at the maps of his body, registering his bodily condition

We shall in future be able to ease his bonds for a few hours each day.

THE END


Here is my final output. I made some adjustments by replacing the infinite 'while True' loop with one controlled by an 'ending' flag. The user can keep writing with the generator for as long as they want. When they want to end the story, typing "THE END" in caps stops the loop.
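The same stopping rule can also be written with Python's two-argument iter(callable, sentinel), which keeps calling a function until it returns the sentinel value. A minimal sketch, using a hypothetical `collect_until` helper and a canned list of lines in place of real keyboard input:

```python
def collect_until(read_line, sentinel="THE END"):
    """Call read_line() repeatedly and collect the results until it returns the sentinel."""
    return list(iter(read_line, sentinel))

# Canned lines stand in for what a user would type.
canned = iter([
    "One day, a young boy was walking through the dark woods",
    "Yet, he was not interested in these ships.",
    "THE END",
    "this line is never read",
])
story = collect_until(lambda: next(canned))
print(story)
```

Passing the built-in input as read_line would read from the keyboard instead. Because collection stops before the sentinel is stored, "THE END" never ends up among the collected lines.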

I checked the text file, and saw that my corpus was getting larger:

*One day, a young woman was walking to the store to buy some breadStill, the woman thought. She should probably buy some bread for him to feel betterThinking about his features made her think back to when they first metOne day, a little boy was walking through the dark woodsThis was the unmistakable feeling of ghosts, the boy thought. He felt that he should probably walk a bit fasterHowever, if Lucy could see me now, she would be horrified at my cowardice, the boy thoughtTHE ENDOne day, a young girl was happily skipping through the woodsThe girl was quite happy with this situation. She had gotten off track a few times when exploring the woods, and this time, she didn't want to lose her way"Don't worry," I told The Professor. That young girl won't lose her way in the woods this timeI remembered that in Lucy's Diary, she had mentioned that the girl suffers from frequent memory loss"Anyways" I told the Professor, "I best be going now"One day, a young boy was walking through the dark woodsYet, he was not interested in these ships. All he wanted to do was to pick some flowers in the forest clearingEven though he looks like a young boy, he was, in reality, a thirty year old manHis doctor's words echoed through his mind. He remembers looking at the maps of his body, registering his bodily condition*

For reference, I have attached the text file to this repository as well.

I wrote a short story to test out the process, and the results were quite surprising. I found myself invested in the story I was creating (a 30-year-old man who looks like a young boy, picking flowers in the woods and having flashbacks to the doctor's office about his condition), and it felt as if I was at the whim of the machine. Whatever the generator spat out, I bent over backwards to accommodate it and spin the story in a way that made sense. In this fashion, it seems that the generator had power over me, despite the fact that we were supposed to be co-authors.

I really enjoyed working with Markov chains and predictive text generation, and I feel that I may explore this further for my final project. It would also be interesting to see if I can keep writing with the generator and build a larger corpus over time.