#NLP: Building Legible and Grammatically Correct Sentences with Markovify

The goal of this notebook is to generate legible, grammatically correct sentences with the Markovify library, which builds Markov chain models from text. The models are trained on the following texts from NLTK's Gutenberg corpus:
1. Three of Shakespeare's best-known tragedies: Macbeth, Hamlet and Julius Caesar
2. Lewis Carroll's novel Alice's Adventures in Wonderland
3. The King James Bible
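
To make the Markov chain idea concrete, here is a minimal, standard-library sketch of a word-level chain. It is a toy illustration only, not Markovify's actual implementation; the function names `build_chain` and `generate` and the tiny corpus are invented for this example.

```python
import random
from collections import defaultdict

def build_chain(words, state_size=2):
    """Map each run of `state_size` consecutive words to the words seen after it."""
    chain = defaultdict(list)
    for i in range(len(words) - state_size):
        state = tuple(words[i:i + state_size])
        chain[state].append(words[i + state_size])
    return chain

def generate(chain, start, max_words=12, rng=None):
    """Walk the chain from `start`, picking each next word at random."""
    rng = rng or random.Random()
    output = list(start)
    state = tuple(start)
    # Stop at the word limit or when the current state was never followed by anything.
    while len(output) < max_words and state in chain:
        output.append(rng.choice(chain[state]))
        state = tuple(output[-len(start):])
    return ' '.join(output)

# Invented toy corpus, only for illustration.
corpus = "the cat sat on the mat and the cat ran to the mat".split()
chain = build_chain(corpus, state_size=2)
sentence = generate(chain, start=("the", "cat"), rng=random.Random(0))
print(sentence)
```

Every word in the result follows the two words before it somewhere in the corpus, which is why output from such a model tends to be locally plausible even when the sentence as a whole drifts.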


In [1]:
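#Install the necessary libraries
!pip install nltk
!pip install spacy
!pip install markovify
#The 'en' shortcut works in spaCy 2.x (used here); spaCy 3+ requires the full name 'en_core_web_sm'
!python -m spacy download en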

Collecting markovify
  Downloading markovify-0.9.3.tar.gz (28 kB)
Collecting unidecode
  Downloading Unidecode-1.3.2-py3-none-any.whl (235 kB)
Building wheels for collected packages: markovify
  Building wheel for markovify (setup.py) ... done
  Created wheel for markovify: filename=markovify-0.9.3-py3-none-any.whl size=18622 sha256=0b9a689cb2fc12de0487b6d64fc7f461e243f3f52ba6e0ccec420ec60db13a24
  Stored in directory: /root/.cache/pip/wheels/d9/f0/5b/748a27bdf2496bd4df51acb9442dae516efce507ff4849813e
Successfully built markovify
Installing collected packages: unidecode, markovify
Successfully installed markovify-0.9.3 unidecode-1.3.2
Collecting en_core_web_sm==2.2.5
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz (12.0 MB)
✔ Download and installation successful


In [2]:
#importing all the necessary libraries
import spacy
import re
import markovify
from nltk.corpus import gutenberg
import nltk
import warnings
warnings.filterwarnings('ignore')

nltk.download('gutenberg')
!python -m spacy download en

[nltk_data] Downloading package gutenberg to /root/nltk_data...
[nltk_data]   Unzipping corpora/gutenberg.zip.
Collecting en_core_web_sm==2.2.5
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz (12.0 MB)
✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_sm')
✔ Linking successful
/usr/local/lib/python3.7/dist-packages/en_core_web_sm -->
/usr/local/lib/python3.7/dist-packages/spacy/data/en
You can now load the model via spacy.load('en')


In [3]:
#inspect gutenberg text corpus 
print(gutenberg.fileids())

['austen-emma.txt', 'austen-persuasion.txt', 'austen-sense.txt', 'bible-kjv.txt', 'blake-poems.txt', 'bryant-stories.txt', 'burgess-busterbrown.txt', 'carroll-alice.txt', 'chesterton-ball.txt', 'chesterton-brown.txt', 'chesterton-thursday.txt', 'edgeworth-parents.txt', 'melville-moby_dick.txt', 'milton-paradise.txt', 'shakespeare-caesar.txt', 'shakespeare-hamlet.txt', 'shakespeare-macbeth.txt', 'whitman-leaves.txt']


In [4]:
#we are going to use some of the books in Gutenberg for this model

hamlet = gutenberg.raw('shakespeare-hamlet.txt')
macbeth = gutenberg.raw('shakespeare-macbeth.txt')
caesar = gutenberg.raw('shakespeare-caesar.txt')
carroll = gutenberg.raw('carroll-alice.txt')
bible = gutenberg.raw('bible-kjv.txt')

print('\nRaw:\n', hamlet[:3000])
print('\nRaw:\n', macbeth[:3000])
print('\nRaw:\n', caesar[:3000])
print('\nRaw:\n', carroll[:3000])
print('\nRaw:\n', bible[:3000])


Raw:
 [The Tragedie of Hamlet by William Shakespeare 1599]


Actus Primus. Scoena Prima.

Enter Barnardo and Francisco two Centinels.

  Barnardo. Who's there?
  Fran. Nay answer me: Stand & vnfold
your selfe

   Bar. Long liue the King

   Fran. Barnardo?
  Bar. He

   Fran. You come most carefully vpon your houre

   Bar. 'Tis now strook twelue, get thee to bed Francisco

   Fran. For this releefe much thankes: 'Tis bitter cold,
And I am sicke at heart

   Barn. Haue you had quiet Guard?
  Fran. Not a Mouse stirring

   Barn. Well, goodnight. If you do meet Horatio and
Marcellus, the Riuals of my Watch, bid them make hast.
Enter Horatio and Marcellus.

  Fran. I thinke I heare them. Stand: who's there?
  Hor. Friends to this ground

   Mar. And Leige-men to the Dane

   Fran. Giue you good night

   Mar. O farwel honest Soldier, who hath relieu'd you?
  Fra. Barnardo ha's my place: giue you goodnight.

Exit Fran.

  Mar. Holla Barnardo

   Bar. Say, what is Horatio there?
  Hor. A p

In [5]:
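#utility function for text cleaning
def text_cleaner(text):
    #replace double dashes with a space
    text = re.sub(r'--', ' ', text)
    #remove anything in square brackets, e.g. the title line
    text = re.sub(r'\[.*?\]', '', text)
    #remove standalone numbers, e.g. the digits of verse references
    text = re.sub(r'(\b|\s+\-?|^\-?)(\d+|\d*\.\d+)\b', '', text)
    #collapse all runs of whitespace, including newlines, into single spaces
    text = ' '.join(text.split())
    return text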
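
To see what each step of `text_cleaner` does, the sketch below applies the same substitutions one at a time to a short sample string. The sample is invented for illustration (it only imitates the style of the Hamlet header), and the intermediate variable names are not part of the notebook.

```python
import re

# Invented sample that imitates the raw Gutenberg text.
sample = "[The Tragedie of Hamlet 1599]\n\nActus Primus -- Scoena 12 Prima."

step1 = re.sub(r'--', ' ', sample)        # double dashes become a space
step2 = re.sub(r'\[.*?\]', '', step1)     # bracketed text is dropped
step3 = re.sub(r'(\b|\s+\-?|^\-?)(\d+|\d*\.\d+)\b', '', step2)  # standalone numbers are dropped
cleaned = ' '.join(step3.split())         # whitespace runs collapse to single spaces

print(cleaned)  # Actus Primus Scoena Prima.
```

Note that the number pattern removes digits but leaves surrounding punctuation in place, which is why stray colons from verse references survive in the cleaned Bible text.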

In [6]:
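#remove chapter indicators of the form 'Chapter 1'
#note: the pattern is case-sensitive and expects Arabic numerals, so headings such as 'CHAPTER V.' in carroll-alice.txt are left in place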
hamlet = re.sub(r'Chapter \d+', '', hamlet)
macbeth = re.sub(r'Chapter \d+', '', macbeth)
caesar = re.sub(r'Chapter \d+', '', caesar)
carroll = re.sub(r'Chapter \d+', '', carroll)
bible = re.sub(r'Chapter \d+', '', bible)
#apply cleaning function to corpus
hamlet = text_cleaner(hamlet)
caesar = text_cleaner(caesar)
macbeth = text_cleaner(macbeth)
carroll = text_cleaner(carroll)
bible = text_cleaner(bible)

In [7]:
#parse cleaned novels
nlp = spacy.load('en')
hamlet_doc = nlp(hamlet)
macbeth_doc = nlp(macbeth)
caesar_doc = nlp(caesar)
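#only the first 100,000 characters of the Bible are parsed, which keeps the text under spaCy's default max_length of 1,000,000 characters
bible_spliced = bible[:100000]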
bible_doc = nlp(bible_spliced)
carroll_doc = nlp(carroll)

In [8]:
#rejoin the sentences found by spaCy into one string per text, dropping single-character sentences
hamlet_sents = ' '.join([sent.text for sent in hamlet_doc.sents if len(sent.text) > 1])
macbeth_sents = ' '.join([sent.text for sent in macbeth_doc.sents if len(sent.text) > 1])
caesar_sents = ' '.join([sent.text for sent in caesar_doc.sents if len(sent.text) > 1])
carroll_sents = ' '.join([sent.text for sent in carroll_doc.sents if len(sent.text) > 1])
bible_sents = ' '.join([sent.text for sent in bible_doc.sents if len(sent.text) > 1])

In [9]:
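#combine the three plays; join with spaces so the last sentence of one play does not run into the first sentence of the next
shakespeare_sents = ' '.join([hamlet_sents, macbeth_sents, caesar_sents])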

In [10]:
print(shakespeare_sents)



In [11]:
print(carroll_sents)



In [12]:
print(bible_sents)

The Old Testament of the King James Bible The First Book of Moses: Called Genesis: In the beginning God created the heaven and the earth. : And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters. And God said, Let there be light: and there was light. And God saw the light, that it was good: and God divided the light from the darkness. And God called the light Day, and the darkness he called Night. And the evening and the morning were the first day. And God said, Let there be a firmament in the midst of the waters, and let it divide the waters from the waters. And God made the firmament, and divided the waters which were under the firmament from the waters which were above the firmament: and it was so. And God called the firmament Heaven. And the evening and the morning were the second day. And God said, Let the waters under the heaven be gathered together unto one place, and let the dry land appear: 

In [13]:
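#build one Markov model per source; state_size=3 means each next word is chosen based on the previous three words
generator_s = markovify.Text(shakespeare_sents, state_size=3)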
generator_c = markovify.Text(carroll_sents, state_size=3)
generator_b = markovify.Text(bible_sents, state_size=3)

In [23]:
#We will randomly generate three sentences
for i in range(3):
    print(generator_s.make_sentence())


#We will randomly generate three more sentences of no more than 100 characters
for i in range(3):
    print(generator_s.make_short_sentence(max_chars=100))

Ile goe no more : I am afraid, to thinke what I haue seene: see what I see.
To be thus, is nothing, but to be nothing else but mad.
None
O hatefull Error, Melancholies Childe: Why do'st thou leade these men about the streets?
And will he steale out of his heart, and turne him going.
What art thou that vsurp'st this time of meeting Thus much the businesse is.
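
The `None` in the output above appears because `make_sentence` returns `None` when it cannot build a sentence that differs enough from the source text within its allowed number of attempts (10 by default, adjustable with the `tries` keyword). One way to handle this is a small retry wrapper. The sketch below is an assumption about how one might do it; the helper name `first_sentence` is hypothetical, and a stand-in callable replaces the real generator so the block runs on its own.

```python
from typing import Callable, Optional

def first_sentence(make: Callable[[], Optional[str]], attempts: int = 5) -> Optional[str]:
    """Call `make` up to `attempts` times and return the first non-None result."""
    for _ in range(attempts):
        sentence = make()
        if sentence is not None:
            return sentence
    return None

# Stand-in for a generator that fails twice before succeeding.
fake_results = iter([None, None, "A generated sentence."])
print(first_sentence(lambda: next(fake_results)))

# With the notebook's model, the call would look like:
# first_sentence(lambda: generator_s.make_sentence(tries=100))
```

Raising `tries` alone is often enough; the wrapper is mainly useful when a guaranteed number of printed sentences matters.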


In [15]:
for i in range(3):
    print(generator_c.make_sentence())


for i in range(3):
    print(generator_c.make_short_sentence(max_chars=100))
    

For the Mouse was swimming away from her as hard as she could, and soon found herself safe in a thick wood.
None
As there seemed to be full of soup.
Alice waited a little, half expecting to see it trot away quietly into the wood.
She went on growing, and growing, and very soon found an opportunity of taking it away.
CHAPTER V. Advice from a Caterpillar The Caterpillar and Alice looked round, eager to see the Queen.


In [16]:
for i in range(3):
    print(generator_b.make_sentence())


for i in range(3):
    print(generator_b.make_short_sentence(max_chars=100))

: And on the seventh day God ended his work which he had made: : And he dwelt in the plain of Moreh.
: And his mother said unto him, Thou shalt not take a wife for my son of my kindred, and take a wife to my son of the bondwoman will I make a nation, because he is thy seed.
And the LORD appeared unto him the same night, and said, I have gotten a man from the LORD.
And Abimelech called Isaac, and said, Behold, of a surety bear a child, which am old?
And God said unto the servant, What man is this that thou hast done?
And I came this day unto the well, and filled her pitcher, and came up.


In [17]:
#next we will use spaCy's part-of-speech tags to generate more legible text

class POSifiedText(markovify.Text):

    def word_split(self, sentence):
        #tag every token with its part of speech, in the form 'text::POS'
        return ['::'.join((word.orth_, word.pos_)) for word in nlp(sentence)]

    def word_join(self, words):
        #strip the tags again and join the tokens with single spaces
        sentence = ' '.join(word.split('::')[0] for word in words)
        return sentence

#instantiate class with our text
generator_s2 = POSifiedText(shakespeare_sents, state_size=3)
generator_c2 = POSifiedText(carroll_sents, state_size=3)
generator_b2 = POSifiedText(bible_sents, state_size=3)
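
The effect of the `text::POS` encoding can be seen without loading spaCy. In the sketch below the tags are written by hand (they are not spaCy output) and the two functions are standalone copies of the methods' logic. The point it illustrates: the same surface word with different tags becomes two different chain states, so the model is less likely to continue a noun as if it were a verb.

```python
# Hand-tagged tokens; the tags are assumptions for illustration, not spaCy output.
tagged = [("I", "PRON"), ("saw", "VERB"), ("the", "DET"), ("saw", "NOUN")]

def word_split(tagged_tokens):
    """Encode each (text, tag) pair as 'text::TAG'."""
    return ['::'.join(pair) for pair in tagged_tokens]

def word_join(words):
    """Drop the tags and join the tokens with single spaces."""
    return ' '.join(word.split('::')[0] for word in words)

states = word_split(tagged)
print(states)             # ['I::PRON', 'saw::VERB', 'the::DET', 'saw::NOUN']
print(word_join(states))  # I saw the saw
```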

In [18]:
#now we will use the above generator to generate sentences
for i in range(5):
    print(generator_s2.make_sentence())
    
#print sentences of 100 characters or fewer
print('\n')
for i in range(5):
    print(generator_s2.make_short_sentence(max_chars=100))
    

Liue a thousand yeeres , I shall vnfold Ham .
Why sir , Cobble you Fla. Thou art a Scholler ; speake to it , though it haue no tongue , will speake With most myraculous Organ .
Come to the Capitoll , This way will I  Disrobe the Images , If you would graunt the time Banq .
Comes Caesar to the Capitoll , directly heere Bru .
None


Why one faire Daughter , and your Honour .
That I did my Lord , no other occasion Ham .
Betweene the acting of a dreadfull thing , And the rich East to boot Mal .
Indeed , they say , I am bound to heare Gho .
Caesar Caes Let me haue men about me , that are heap'd on Caesar Cassi .


In [19]:
for i in range(5):
    print(generator_c2.make_sentence())


print('\n')
for i in range(5):
    print(generator_c2.make_short_sentence(max_chars=100))

None
So she set to work at once to eat some of the other birds tittered audibly .
The Mouse looked at her rather inquisitively , and seemed to quiver all over with fright .
And it 'll fetch things when you throw them , and all sorts of little birds and beasts , as well as she could do to hold it .
The baby grunted again , and did not venture to say it out loud .


She carried the pepper - box in her hand , and a bright idea came into her head .
When she got back to the game .
As for pulling me out of the house , quite forgetting that she was losing her temper .
A little bright - eyed terrier , you know , and he says it 's so useful , it 's coming down !
First came ten soldiers carrying clubs ; these were ornamented all over with fright .


In [20]:
for i in range(5):
    print(generator_b2.make_sentence())


print('\n')
for i in range(5):
    print(generator_b2.make_short_sentence(max_chars=100))

And Enoch lived sixty and five years , and begat Cainan   And I will put enmity between thee and the woman was taken into Pharaoh 's house .
At the time appointed I will return unto thee according to the time of the evening , even the men of his house , that ruled over all that he had unto Isaac .
He that is born in the house , and to thy seed after thee in their generations for an everlasting covenant , to be his wife .
And he said , I will multiply thy seed as the dust of the ground ,  And Hadoram , and Uzal , and Diklah ,  And Hadoram , and Uzal , and Diklah ,  And to rule over the day and over the fowl of the air , upon all that moveth upon the earth , and of one speech .
He that is born in the house , and from the land of Shinar ; and they dwelt there .


 And the sons of Joktan .
And Abraham gave all that he had , Put , I pray thee , between me and you .
 And they came to the place which God had spoken to him .
And the angel of God called to Hagar out of heaven , and the height o