In [15]:
with open("/kaggle/input/text-generation-dataset/1342-0.txt") as f:
    text_a = f.read()
with open("/kaggle/input/text-generation-dataset/84-0.txt") as f:
    text_b = f.read()

In [16]:
a_words = text_a.split()
b_words = text_b.split()

In [17]:
import sys
!{sys.executable} -m pip install markovify



Once the installation in the cell above has finished, run this cell to make the library available in your notebook:

In [18]:
import markovify

The code in the following cell creates a new text generator, using the text in the variable specified to build the Markov model, which is then assigned to the variable `generator_a`.

In [19]:
generator_a = markovify.Text(text_a)
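
To give a rough idea of what "building the Markov model" amounts to, here is a minimal sketch in plain Python. It is not Markovify's actual implementation: the function names `build_model` and `generate`, the toy sentences, and the `BEGIN`/`END` marker strings are all made up for illustration. Each state is a tuple of `state_size` words, mapped to a count of the words seen to follow it.

```python
import random
from collections import Counter, defaultdict

BEGIN, END = "___BEGIN__", "___END__"  # illustrative sentinel markers (assumed names)

def build_model(sentences, state_size=2):
    """Map each run of `state_size` words to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = [BEGIN] * state_size + sentence.split() + [END]
        for i in range(len(words) - state_size):
            state = tuple(words[i:i + state_size])
            model[state][words[i + state_size]] += 1
    return model

def generate(model, state_size=2, rng=random):
    """Walk the model from the BEGIN state, picking followers by count, until END is drawn."""
    state = (BEGIN,) * state_size
    out = []
    while True:
        followers = model[state]
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == END:
            return " ".join(out)
        out.append(word)
        state = state[1:] + (word,)

toy_sentences = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
toy_model = build_model(toy_sentences)
print(generate(toy_model))
```

Running the last line several times gives different sentences, each stitched together from word sequences that occur somewhere in the toy sentences.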

You can then call the `.make_sentence()` method to generate a sentence from the model:

In [20]:
print(generator_a.make_sentence())

Such was Miss Lucas and Mrs. Gardiner had formed, of their meeting in Derbyshire, so often, and in a family on which her marriage would so shortly give the matter had ever raised before; she remembered the style of living in the recital which I want to think it no otherwise than lovely and amiable.


The `.make_short_sentence()` method allows you to specify a maximum length for the generated sentence:

In [21]:
print(generator_a.make_short_sentence(50))

We must not suspect me.


By default, Markovify tries to generate a sentence that is significantly different from any existing sentence in the input text. As a consequence, sometimes the `.make_sentence()` or `.make_short_sentence()` methods will return `None`, which means that in ten tries it wasn't able to generate such a sentence. You can work around this by increasing the number of times it tries to generate a sufficiently unique sentence using the `tries` parameter:

In [22]:
print(generator_a.make_short_sentence(40, tries=100))

Accordingly, when she had chosen.


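Or by disabling the check altogether with `test_output=False` (note that this means the generator will occasionally return stretches of text that are present in the source text). The length limit still applies, though, so `.make_short_sentence()` can return `None` when no sentence short enough turns up within the allotted tries, as happened in the run below: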

In [23]:
print(generator_a.make_short_sentence(40, test_output=False))

None
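
The retry-and-reject behavior described above can be sketched roughly as follows. This is a simplified stand-in, not Markovify's real test: the helper names `too_similar` and `make_with_retries`, the sample strings, and the fixed `max_overlap` of 4 words are assumptions chosen for illustration. The idea is that a candidate is discarded when it reproduces a long enough run of consecutive words from the source, and `None` comes back once the tries are used up.

```python
def too_similar(candidate, source_text, max_overlap=4):
    """True if the candidate repeats more than `max_overlap` consecutive source words."""
    words = candidate.split()
    window = max_overlap + 1
    for i in range(len(words) - window + 1):
        # Plain substring matching: crude, but enough for a sketch.
        if " ".join(words[i:i + window]) in source_text:
            return True
    return False

def make_with_retries(make_candidate, source_text, tries=10, test_output=True):
    """Call `make_candidate` up to `tries` times; return None if every attempt is rejected."""
    for _ in range(tries):
        candidate = make_candidate()
        if not test_output or not too_similar(candidate, source_text):
            return candidate
    return None

source = "it is a truth universally acknowledged that a single man must be in want of a wife"

def copied():
    # Stand-in generator that always reproduces a stretch of the source.
    return "a truth universally acknowledged that a single man"

def fresh():
    # Stand-in generator whose output never matches five source words in a row.
    return "a single man must acknowledged a truth"

print(make_with_retries(copied, source))                     # every try is rejected
print(make_with_retries(copied, source, test_output=False))  # check disabled
print(make_with_retries(fresh, source))
```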


### Changing the order

When you create the model, you can specify the order of the model using the `state_size` parameter. It defaults to 2. Let's make two models with different orders and compare:

In [24]:
gen_a_1 = markovify.Text(text_a, state_size=1)
gen_a_4 = markovify.Text(text_a, state_size=4)

In [25]:
print("order 1")
print(gen_a_1.make_sentence(test_output=False))
print()
print("order 4")
print(gen_a_4.make_sentence(test_output=False))

order 1
I endeavoured to all his return into something to be only creature in talking together, a less pliancy of her spirits; and on Wickham's circumstances are poor.

order 4
It is highly improper.


In general, the higher the order, the more "coherent" the sentences will seem (i.e., the more closely they will resemble the source text). Lower-order models will produce more variation. Deciding on the order is usually a matter of taste and trial-and-error.
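
One way to see why higher orders stick closer to the source: the longer the state, the fewer states have more than one possible continuation, so the generator has fewer chances to wander off. The sketch below counts such "branching" states in a toy text. The helper name `branching_states` and the toy text are made up for this illustration and have nothing to do with Markovify's internals.

```python
from collections import defaultdict

def branching_states(words, order):
    """Count states (runs of `order` words) followed by more than one distinct word."""
    followers = defaultdict(set)
    for i in range(len(words) - order):
        followers[tuple(words[i:i + order])].add(words[i + order])
    return sum(1 for nexts in followers.values() if len(nexts) > 1)

toy_words = "the cat sat on the mat and the cat ate the rat".split()
for order in (1, 2, 3):
    print(order, branching_states(toy_words, order))
```

At order 1 both "the" and "cat" offer a choice of next word; at order 2 only "the cat" does; at order 3 no state branches at all, so the only thing the model could produce is the original text.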

### Changing the level

Markovify, by default, works with *words* as the individual unit. It doesn't come out-of-the-box with support for character-level models. The following code defines a new kind of Markovify generator that implements character-level models. Execute it before continuing:

In [36]:
class SentencesByChar(markovify.Text):
    def word_split(self, sentence):
        return list(sentence)
    def word_join(self, words):
        return "".join(words)

Any of the parameters you passed to `markovify.Text` you can also pass to `SentencesByChar`. The `state_size` parameter still controls the order of the model, but now the n-grams are characters, not words.

The following cell implements a character-level Markov text generator for the word "condescendences":

In [37]:
con_model = SentencesByChar("condescendences", state_size=2)

Execute the cell below to see the output—it'll be a lot like what we implemented by hand earlier!

In [38]:
con_model.make_sentence()

'condendescences'
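
To see where an output like that comes from, it can help to list the character-level transitions directly. The following is a small sketch in plain Python, independent of Markovify; the name `char_transitions` is made up, and for brevity it ignores the begin and end markers a real model would also track.

```python
from collections import Counter, defaultdict

def char_transitions(text, order=2):
    """Map each run of `order` characters to a Counter of the characters that follow it."""
    chars = list(text)  # same splitting idea as word_split in SentencesByChar
    model = defaultdict(Counter)
    for i in range(len(chars) - order):
        model["".join(chars[i:i + order])][chars[i + order]] += 1
    return model

table = char_transitions("condescendences")
for state, followers in sorted(table.items()):
    print(state, dict(followers))
```

States such as `de`, `en` and `ce` each have two possible followers, and those are the points where a generated word can diverge from "condescendences".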

Of course, you can use a character-level model on any text of your choice. So, for example, the following cell creates a character-level order-7 Markov chain text generator from text A:

In [39]:
gen_a_char = SentencesByChar(text_a, state_size=7)

And the cell below prints out a random sentence from this generator. (The `.replace()` is to get rid of any newline characters in the output.)

In [42]:
print(gen_a_char.make_sentence(test_output=False).replace("\n", " "))

Widely different people's engagement, they were not a doubt his antagonist at backgammon.


### Combining models

Markovify has a handy feature that allows you to *combine* models, creating a new model that draws on probabilities from both of the source models. You can use this to create hybrid output that mixes the style and content of two (or more!) different source texts. To do this, you need to create the models independently, and then call `.combine()` to combine them.

In [43]:
generator_a = markovify.Text(text_a)
generator_b = markovify.Text(text_b)
combo = markovify.combine([generator_a, generator_b], [0.5, 0.5])

The bit of code `[0.5, 0.5]` controls the "weights" of the models, i.e., how much to emphasize the probabilities of any model. You can change this to suit your tastes. (E.g., if you want mostly text A with but a *soupçon* of text B, you would write `[0.9, 0.1]`. Try it!) 

Then you can create sentences using the combined model:

In [44]:
print(combo.make_sentence())

And then, you know, and I only mean to lecture you a second time, therefore, was most anxious to get home.
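
A rough way to picture what the weights do is to imagine each model's transition counts being scaled by its weight and then added together. The sketch below is a simplified illustration of that idea, not Markovify's code: `combine_counts` and the two tiny hand-written models are assumptions made up for the example.

```python
from collections import Counter

def combine_counts(models, weights):
    """Merge transition tables by adding each model's counts scaled by its weight."""
    combined = {}
    for model, weight in zip(models, weights):
        for state, followers in model.items():
            target = combined.setdefault(state, Counter())
            for word, count in followers.items():
                target[word] += count * weight
    return combined

# Two toy models that share the state ("I", "am") but disagree on what follows it.
model_a = {("I", "am"): Counter({"happy": 4})}
model_b = {("I", "am"): Counter({"happy": 1, "alone": 3})}

print(combine_counts([model_a, model_b], [0.5, 0.5])[("I", "am")])
print(combine_counts([model_a, model_b], [0.9, 0.1])[("I", "am")])
```

With equal weights, "alone" keeps a sizable share of the combined count; with `[0.9, 0.1]` it becomes a rare continuation, which is the "mostly text A" effect.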


### Bringing it all together

I've pre-written some code below to make it easy for you to experiment and produce output from Markovify. Just make adjustments to the values assigned to the variables in the cell below:

In [45]:
# change to "word" for a word-level model
level = "char"
# controls the length of the n-gram
order = 7
# controls the number of lines to output
output_n = 14
# weights between the models; text A first, text B second.
# if you want to completely exclude one model, set its corresponding value to 0
weights = [0.5, 0.5]
# limit sentence output to this number of characters
length_limit = 280

(The lines beginning with `#` are "comments"—they don't do anything, they're just there to explain what's happening in the code.)

After making your changes above, run the cell below to generate text according to your parameters. Repeat as necessary until you get something you really like!

In [46]:
model_cls = markovify.Text if level == "word" else SentencesByChar
gen_a = model_cls(text_a, state_size=order)
gen_b = model_cls(text_b, state_size=order)
gen_combo = markovify.combine([gen_a, gen_b], weights)
for i in range(output_n):
    out = gen_combo.make_short_sentence(length_limit, test_output=False)
    # make_short_sentence returns None when no sentence fits within the allotted tries
    if out is None:
        continue
    print(out.replace("\n", " "))
    print()

Who would have been passive, if you did love my chains, whose disposed about it is not enough to endured rendered as Lydia's charge, how much to Elizabeth felt the sole expected.

I repassed, in the sun or gentle demeanour and admiration of you is, to my solitary cottage, in which had creator; he was less strangeness of her chair, my eyes.

No one way over the generous cave and aid, he so justly scorned.

When, after success, for the North, received and misery.

It advanced by Henry, languish was to endured.

What a disgraceful company of your account of the dæmons of revenge, henceforth dearer friends to his tale of scaling this exposing and fainted.

All were silent acquaintance which was not that blame which led me that as I plunge me lower in the kind of promise you have promised my path lay through her rays, while his wife; but his marriage.

Eager to satisfaction and vengeance she had force of reflected, yet I own to pursue any measure of Wickham really happy together he was join