# Markov Chain Sentence Builder
This program builds random sentences from a text corpus that is fed into it. It uses simple Markov chains keyed on the previous one, two, or three words, and the user chooses how many of these chains (1, 2, or 3) to apply. With 2 or 3 chains selected, the builder cycles through the one-word, two-word, and (for 3) three-word models in turn.

## Import Libraries

In [1]:
import random
from collections import defaultdict

## Load and Process Corpus

In [2]:
def load_training_file(file):
    # encoding="utf-8" is an assumption about the corpus file; change it if yours differs.
    with open(file, encoding="utf-8") as f:
        return f.read()

def prep_training(raw_sentences):
    raw_sentences = raw_sentences.lower()
    # str.strip only trims the two ends of the whole text; the replace calls
    # below are what remove quotes and underscores throughout.
    raw_sentences = raw_sentences.strip(",_”“")
    for character in ('"', '_', '”', '“'):
        raw_sentences = raw_sentences.replace(character, "")
    # split() with no arguments already splits on any whitespace, newlines included.
    corpus = raw_sentences.split()
    return corpus
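
`str.strip` removes characters only from the two ends of a string, so punctuation inside the text has to be removed another way. A minimal sketch of the difference, assuming a hypothetical helper `remove_chars` (not part of the notebook) built on `str.translate`:

```python
def remove_chars(text, unwanted='"_”“'):
    """Delete every occurrence of the characters in `unwanted` (hypothetical helper)."""
    return text.translate(str.maketrans("", "", unwanted))

sample = '“You _must_ create,” he said.'
stripped = sample.strip('"_”“')   # trims matching characters at the ends only
cleaned = remove_chars(sample)    # removes the characters everywhere

print(stripped)
print(cleaned)
```

A single `translate` call could stand in for the chain of `replace` calls, though either approach gives the same corpus.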

## Build Markov Models

In [3]:
def map_word_to_word(corpus):
    limit = len(corpus) - 1
    dict1_to_1 = defaultdict(list)
    for index, word in enumerate(corpus):
        if index < limit:
            suffix = corpus[index + 1]
            dict1_to_1[word].append(suffix)
    return dict1_to_1

def map_2_words_to_word(corpus):
    limit = len(corpus) - 2
    dict2_to_1 = defaultdict(list)
    for index, word in enumerate(corpus):
        if index < limit:
            key = word + ' ' + corpus[index + 1]
            suffix = corpus[index + 2]
            dict2_to_1[key].append(suffix)
    return dict2_to_1

def map_3_words_to_word(corpus):
    limit = len(corpus) - 3
    dict3_to_1 = defaultdict(list)
    for index, word in enumerate(corpus):
        if index < limit:
            key = word + ' ' + corpus[index + 1] + ' ' + corpus[index + 2]
            suffix = corpus[index + 3]
            dict3_to_1[key].append(suffix)
    return dict3_to_1
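
The three mapping functions follow one pattern: a key made of n consecutive words, mapped to every word that follows that key in the corpus. A sketch of that pattern on a toy corpus, assuming a hypothetical generalized function `map_n_words_to_word` (the notebook itself keeps three separate functions):

```python
from collections import defaultdict

def map_n_words_to_word(corpus, n):
    """Map each run of n consecutive words to the words that follow it."""
    suffix_map = defaultdict(list)
    # Stop n words before the end so that corpus[index + n] always exists.
    for index in range(len(corpus) - n):
        key = ' '.join(corpus[index:index + n])
        suffix_map[key].append(corpus[index + n])
    return suffix_map

toy_corpus = "the cat sat on the mat. the cat ran.".split()
one_word = map_n_words_to_word(toy_corpus, 1)
two_words = map_n_words_to_word(toy_corpus, 2)

print(one_word["the"])       # ['cat', 'mat.', 'cat']
print(two_words["the cat"])  # ['sat', 'ran.']
```

Repeated suffixes are kept in the list on purpose: `random.choice` then picks a following word in proportion to how often it appeared after the key.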

## Select a Seed Word

In [4]:
def random_word(corpus):
    # The corpus is lowercased in prep_training, so normalize the seed the same way.
    seed = input("Enter a word to start a sentence: ").strip().lower()
    if seed in corpus:
        word = seed
    else:
        word = None
        print("Try another word as a seed that exists in the corpus used.")
    return word

## Apply the Markov Models

In [5]:
def word_after_single(prefix, suffix_map_1):
    # .get returns None for an unseen prefix without adding it to the defaultdict.
    suffixes = suffix_map_1.get(prefix)
    return list(suffixes) if suffixes is not None else []

def word_after_double(prefix, suffix_map_2):
    suffixes = suffix_map_2.get(prefix)
    return list(suffixes) if suffixes is not None else []

def word_after_triple(prefix, suffix_map_3):
    suffixes = suffix_map_3.get(prefix)
    return list(suffixes) if suffixes is not None else []
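
The lookup functions use `.get` rather than square-bracket indexing. On a `defaultdict`, indexing a missing key silently creates and stores an empty list, while `.get` leaves the mapping unchanged. A small sketch of the difference, using a toy map rather than the notebook's data:

```python
from collections import defaultdict

suffix_map = defaultdict(list)
suffix_map["the"].append("cat")

missing_via_get = suffix_map.get("dog")   # None; no key is added
size_after_get = len(suffix_map)

missing_via_index = suffix_map["dog"]     # an empty list is created and stored
size_after_index = len(suffix_map)

print(missing_via_get, size_after_get)
print(missing_via_index, size_after_index)
```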

## Build a Sentence

In [6]:
def sentence_builder(suffix_map_1, suffix_map_2, suffix_map_3, corpus):
    try:
        number_of_sentences = int(input("How many sentences do you want? "))
        number_of_markov_chains = int(input("Choose 1, 2, or 3 Markov chains to be applied. How many Markov chains would you want to apply? "))
    except ValueError:
        print("You entered something other than integers. Enter only integers.")
        return ""
    # Validate before looping; checking inside the loop would print this message forever.
    if number_of_markov_chains not in (1, 2, 3):
        print("You entered an integer of Markov chains either less than 1 or more than 3, which are not available options to choose. Please only choose 1, 2, or 3 Markov chains to be applied.")
        return ""
    word = random_word(corpus)
    if word is None:
        return ""
    stop_characters = (".", ":", "!", "?")
    # Index 0 is the one-word model, index 1 the two-word model, index 2 the three-word model.
    lookups = [word_after_single, word_after_double, word_after_triple]
    suffix_maps = [suffix_map_1, suffix_map_2, suffix_map_3]
    current_sentence = [word]
    for _ in range(number_of_sentences):
        order = 1  # every sentence starts with the one-word model
        while True:
            prefix = ' '.join(current_sentence[-order:])
            word_choices = lookups[order - 1](prefix, suffix_maps[order - 1])
            if not word_choices:
                # Dead end (for example, the last words of the corpus): stop early.
                return ' '.join(current_sentence)
            word = random.choice(word_choices)
            current_sentence.append(word)
            if word.endswith(stop_characters):
                break
            # Cycle 1 -> 2 -> ... -> number_of_markov_chains -> 1.
            order = order % number_of_markov_chains + 1
    return ' '.join(current_sentence)
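
Because `sentence_builder` reads its settings through `input` and draws from the global random state, its output is hard to reproduce. A minimal sketch of a non-interactive, one-word-model variant, assuming a hypothetical function `generate_sentence` and a `max_words` cap that are not part of the notebook:

```python
import random
from collections import defaultdict

def generate_sentence(suffix_map, seed_word, rng, max_words=50):
    """Follow a one-word Markov chain until a stop character or the word cap."""
    words = [seed_word]
    while len(words) < max_words and not words[-1].endswith((".", ":", "!", "?")):
        choices = suffix_map.get(words[-1])
        if not choices:
            break  # dead end: this word never had a successor
        words.append(rng.choice(choices))
    return ' '.join(words)

toy_corpus = "the cat sat on the mat. the dog sat on the rug.".split()
suffix_map = defaultdict(list)
for current, following in zip(toy_corpus, toy_corpus[1:]):
    suffix_map[current].append(following)

# Two generators created with the same seed produce the same sentence.
first = generate_sentence(suffix_map, "the", random.Random(42))
second = generate_sentence(suffix_map, "the", random.Random(42))
print(first)
```

Passing a `random.Random` instance keeps the randomness local to the call, which makes it practical to compare runs or write tests.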

## Code to Generate Random Sentences

In [7]:
raw_sentences = load_training_file("Frankenstein.txt")
corpus = prep_training(raw_sentences)
suffix_map_1 = map_word_to_word(corpus)
suffix_map_2 = map_2_words_to_word(corpus)
suffix_map_3 = map_3_words_to_word(corpus)

In [8]:
print(sentence_builder(suffix_map_1, suffix_map_2, suffix_map_3, corpus))

How many sentences do you want? 30
Choose 1, 2, or 3 Markov chains to be applied. How many Markov chains would you want to apply? 1
Enter a word to start a sentence: the
the dæmon whose eyes and proportionably large. after having traversed immense wealth rather increased in two countrymen passed rapidly in my liberty and amiable. i bounded within my dear frankenstein, who suffered through the winter, instead of health is not help owning to work to be a kind precaution necessary for i was. he aimed a conversation i must create. the rain again allowed to resign myself for several hours passed; several people of de lacey and lovely moon. a perfect forms of the windows of me from entering on one thought, one circumstance weighs against something whispers to reflect upon you were not debar him with mockery of voices are sorrowful, now know how strange thoughts were covered myself from the banks of the most hated. for nearly the birds, and departed. i do not interrupted by fury; revenge alon

In [9]:
print(sentence_builder(suffix_map_1, suffix_map_2, suffix_map_3, corpus))

How many sentences do you want? 30
Choose 1, 2, or 3 Markov chains to be applied. How many Markov chains would you want to apply? 2
Enter a word to start a sentence: the
the moon, everything was made me tremble, and my persecutor. sometimes i felt as i was; so frightful loudness from thee a flash of footsteps along the mountains, the changes of the sun became the same time, i thought of my enemy. she left to conjecture that some of the old man might be resigned, but how am sent to see you, encompassed by the soul is as his maker owe him towards whom you have already adopted, for there were women were allowed my thoughts, but i therefore, with supernatural force his abhorred myself. but, my pride and i lived; their murderer escapes; he intended to delay. for myself, and sat by his many letters; we are of a fellow professor, would lecture had removed my prejudices against modern philosophers were impressed upon the arrow which had already mounted considerably. the wind was contrary to my

In [10]:
print(sentence_builder(suffix_map_1, suffix_map_2, suffix_map_3, corpus))

How many sentences do you want? 30
Choose 1, 2, or 3 Markov chains to be applied. How many Markov chains would you want to apply? 3
Enter a word to start a sentence: the
the time that the other sailors to be the purport of this book from which felix replied in a sensitive being and inaccessible. this part of a stranger can live in the servants, happening to enslave it. i truly thank you, my best years spent under the exhaustion, a long pause of mankind. but on you will rejoice to alter the course of lectures upon me in the moon had disappeared in the light which canopied me. morning dawned before the resolutions of mont blanc, render him happy before this change seemed ravished with delight on her dark orbs nearly covered with ice, and his companion entered the cabin where i was or fidelity in my own heart the greatest consolation that i was the fire that consumes your own heart. i saw an accumulation of anguish, combined with disdain me. with this deep of agony; let him live with feel