# Markov simple

This notebook is a simplified version of the 'N-Order text generation' code from the notebook [n_order_text_generation](https://github.com/experimental-informatics/hands-on-text-generators/blob/master/n_order_text_generation.ipynb): it is easier to use, but harder to understand.

## Define a function to create a vocabulary

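Pass your text (as a string) to the function.<br>
The optional argument n sets the number of tokens (characters) in each key; the default is 6.<br>
The function returns the vocabulary as a dictionary that maps each key to a list of the tokens that follow it in the text.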

In [1]:
def create_vocabulary(txt, n=6):
    
    # Store all inside a temporary vocabulary.
    temp_vocabulary = {}

    for i in range(len(txt) - n):  # Stop at len(txt) - n so that txt[i+n] stays in range.

        # The n tokens starting at position i (txt[i:i+n]) form the key.
        key = txt[i:i+n]

        # The next token after the last token of key is the corresponding value.
        value = txt[i+n]

        # First check if the key exists in the dictionary already.
        if key in temp_vocabulary:
            # If yes, append the value to the list.
            temp_vocabulary[key].append(value)

        # Else insert the new key + the value in form of a [list].
        else:
            temp_vocabulary[key] = [value]
            
    # Return the vocabulary.
    
    return temp_vocabulary
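
To see what the returned dictionary looks like, here is a minimal sketch on a toy string. It re-implements the same logic in a compact form with `collections.defaultdict` so that it runs on its own; the name `toy_vocabulary` and the string `'abcabd'` are only assumptions for illustration and are not part of the original notebook.

```python
from collections import defaultdict

def toy_vocabulary(txt, n=2):
    # Same idea as create_vocabulary: map every n-character key
    # to the list of characters that follow it in the text.
    vocabulary = defaultdict(list)
    for i in range(len(txt) - n):
        vocabulary[txt[i:i+n]].append(txt[i+n])
    return dict(vocabulary)

print(toy_vocabulary('abcabd', n=2))
# {'ab': ['c', 'd'], 'bc': ['a'], 'ca': ['b']}
```

The key `'ab'` occurs twice in the toy string, so it collects two followers. The final two characters `'bd'` never become a key, because nothing follows them.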

## Define a function to generate text

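Pass in the vocabulary created above.<br>
Optional arguments are
- number_of_tokens: how many tokens to generate (default 100)
- input_: a start text to continue (default is an empty string, in which case a random key is used as the start)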

In [2]:
def generate_text(vocabulary, number_of_tokens=100, input_=''):
    
    import random
    
    # All keys have the same length n, so read it off any key.
    n = len(next(iter(vocabulary)))
    
    generated_text = input_
    
    # append a random key if input_ is empty, too short, or ends in an unknown key
    if len(generated_text) < n or generated_text[-n:] not in vocabulary:
        
        generated_text += random.choice(list(vocabulary.keys()))
    
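    for i in range(number_of_tokens):
        
        # get the last n tokens as key
        key = generated_text[-n:]
        
        # stop early if the key has no known continuation
        # (this can happen at the very end of the source text)
        if key not in vocabulary:
            break
        
        # append a random choice of the options to our text
        generated_text += random.choice(vocabulary[key])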
        
    return generated_text
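
Because a follower is stored once for every time it occurs in the text, `random.choice` picks frequent followers more often than rare ones. The sketch below makes this visible by turning one vocabulary entry into relative frequencies. The helper name `follower_probabilities` and the hand-written toy vocabulary are assumptions for illustration and are not part of the original notebook.

```python
from collections import Counter

def follower_probabilities(vocabulary, key):
    # Count how often each follower occurs for this key
    # and normalise the counts to relative frequencies.
    counts = Counter(vocabulary[key])
    total = sum(counts.values())
    return {token: count / total for token, count in counts.items()}

# Hypothetical vocabulary with n=2, written by hand.
toy = {'th': ['e', 'e', 'e', 'a'], 'he': [' ']}
print(follower_probabilities(toy, 'th'))
# {'e': 0.75, 'a': 0.25}
```

Keeping the duplicates in the list is therefore a simple way to encode the transition probabilities without storing them explicitly.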

## Use the functions to generate text

Once you have executed the two function definitions above, you only need to work with the code below.

In [3]:
''' Load your text. '''

with open('data/wiki_selection.txt', 'r', encoding='utf-8') as f:
    txt = f.read()
print(txt[:100])

Aesthetics is a branch of philosophy that deals with the nature  of  beauty and taste, as well as th


In [4]:
''' Define the vocabulary. '''

vocabulary = create_vocabulary(txt, n=8)

In [5]:
''' Generate text. '''

input_ = 'A generated text is '
new_text = generate_text(vocabulary, number_of_tokens=150, input_=input_)
print(new_text)

A generated text is the hypotheses (and not have to be a predicate is true. If the class of tasks T and perhaps inseparable (though determinacy of meaningless, or the eff


## Appendix: Generate the past

Predictive models are well known for predicting the future based on historical data. But we can also use them to guess about the past. By reversing our text, we can generate new text that leads up to a desired ending.
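
The idea can be sketched on a toy text: build the vocabulary from the reversed text, start from the reversed ending, and reverse the result again. The functions `build` and `generate_backwards` and the toy sentence are assumptions for illustration. Unlike `generate_text`, this sketch simply stops when the ending is unknown instead of appending a random key.

```python
import random
from collections import defaultdict

def build(txt, n):
    # Map every n-character key to the characters that follow it.
    vocabulary = defaultdict(list)
    for i in range(len(txt) - n):
        vocabulary[txt[i:i+n]].append(txt[i+n])
    return vocabulary

def generate_backwards(txt, ending, n=3, number_of_tokens=20):
    # Train on the reversed text and start from the reversed ending.
    vocabulary = build(txt[::-1], n)
    generated = ending[::-1]
    for _ in range(number_of_tokens):
        key = generated[-n:]
        if key not in vocabulary:  # no known continuation: stop early
            break
        generated += random.choice(vocabulary[key])
    # Reverse again so the text reads forwards and ends with `ending`.
    return generated[::-1]

toy_text = 'the cat sat on the mat. the dog sat on the log.'
result = generate_backwards(toy_text, ending='the mat.')
print(result)
```

The printed text varies from run to run, but it always ends with `'the mat.'`, because the model only ever adds characters in front of the ending.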

In [6]:
''' Load and reverse your text. '''

with open('data/wiki_selection.txt', 'r', encoding='utf-8') as f:
    txt = f.read()
    # Reverse text.
    txt = txt[::-1]
print(txt[:100])



.rehtona eno htiw yrav-oc dna tcurtsnoc-oc ygolonhcet dna secrof laicos taht setats taht namdeirF 


In [7]:
''' Define the vocabulary. '''

vocabulary = create_vocabulary(txt, n=6)

In [8]:
''' Generate text. '''

input_ = ' is the aesthetic identity.'
input_ = input_[::-1]
new_text = generate_text(vocabulary, number_of_tokens=150, input_=input_)
print(new_text)
print(new_text[::-1])

.ytitnedi citehtsea eht si scitsitats fo erutcurtsnocer ot seirt yllaitnetop ro elttil evah llits tub ,metsys ot deilppa neeb evah ,nartA ttocS dna tsitnegremE nA :noitazimitpO


Optimization: An Emergentist and Scott Atran, have been applied to system, but still have little or potentially tries to reconstructure of statistics is the aesthetic identity.
