# Sentence Generator

This notebook builds a random sentence generator, one step at a time.

Each cell is a different iteration of the sentence generator, so you can see the process.

Rerun a cell to get a new sentence as output.

In [11]:
from random import choice, random
from ipywidgets import Output, Button

def coin():
    """
    Return a random boolean
    """
    return random() > 0.5
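
Since everything here draws from Python's global `random` module, each rerun gives a different result. When debugging, it can help to make the output repeatable. A minimal sketch, assuming the same `coin` as above; the `flips` helper and the seed value 42 are just for illustration and are not used elsewhere in this notebook:

```python
from random import random, seed

def coin():
    """Return a random boolean."""
    return random() > 0.5

def flips(seed_value, n=5):
    """Hypothetical helper: reseed the generator, then flip the coin n times."""
    seed(seed_value)
    return [coin() for _ in range(n)]

first = flips(42)
second = flips(42)
print(first == second)  # True: the same seed replays the same flips
```

Leave the seed out (as the rest of the notebook does) to get a fresh sentence on every run.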

## Simple Sentences

Start with the simplest sentence: a noun (the subject), a verb, and another noun (the object).

In [12]:
noun_a = [
    'John',
    'Jim',
    'Gordon Ramsay',
    'Steven',
    'Khan',
    'Roy',
    'Michael',
    'Vladmir Putin'
]

verb = [
    'jumps',
    'runs',
    'cooks',
    'annoys',
    'helps',
    'reads',
    'chooses'
]

noun_b = [
    'the fence',
    'the shark',
    'the child',
    'the mistress',
    'the working class',
    'the book'
]

choice(noun_a) + ' ' + choice(verb) + ' ' + choice(noun_b)

'Khan annoys the book'

Kinda redundant... We have two sets of nouns. Could we combine them?

Well, nouns come in two flavors, proper and common (called "improper" throughout this notebook). Improper nouns take articles ("a/an" and "the"), while proper nouns don't (kind of). So we can have a set for proper nouns and a set for improper nouns, and pick between them randomly.

In [13]:
nouns_proper = [
    'John',
    'Jim',
    'Gordon Ramsay',
    'Steven',
    'Khan',
    'Roy',
    'Michael',
    'Vladmir Putin'
]

nouns_improper = [
    'fence',
    'shark',
    'child',
    'mistress',
    'citizen',
    'book',
    'egg'
]

verbs = [
    'jumps',
    'runs',
    'cooks',
    'annoys',
    'helps',
    'reads',
    'chooses',
    'writes',
    'punches'
]

def noun():
    if coin():
        return choice(nouns_proper)
    else:
        return 'the ' + choice(nouns_improper)

# Following the apparent convention
def verb():
    return choice(verbs)
    
# Generate sentence
noun() + ' ' + verb() + ' ' + noun()

'Roy punches John'

Now we've got a bit more intrigue... The improper nouns only begin with "the", since "a/an" needs a bit more code. Namely, we have to programmatically choose between "a" and "an" depending on whether the noun starts with a vowel.

Also, I'm going to treat any noun that doesn't allow both "a/an" and "the" as proper (e.g. "the working class").

In [14]:
nouns_proper = [
    'John',
    'Jim',
    'Gordon Ramsay',
    'Steven',
    'Khan',
    'Roy',
    'Michael',
    'Vladmir Putin',
    'the working class',
    'the biker gang',
    'the sky',
    'France'
]

nouns_improper = [
    'fence',
    'shark',
    'child',
    'mistress',
    'citizen',
    'book',
    'egg'
]

verbs = [
    'jumps',
    'runs',
    'cooks',
    'annoys',
    'helps',
    'reads',
    'chooses',
    'writes',
    'punches'
]

def noun():
    if coin():
        return choice(nouns_proper)
    else:
        noun = choice(nouns_improper)
        if coin():
            return 'the ' + noun
        elif noun[0] in 'aeiou': # Usually not y
            return 'an ' + noun
        else:
            return 'a ' + noun
            

# Following the apparent convention
def verb():
    return choice(verbs)
    
# Generate sentence
noun() + ' ' + verb() + ' ' + noun()

'Gordon Ramsay runs the sky'
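
The a/an choice made inside `noun()` can be pulled out and checked on its own. A small sketch, assuming a hypothetical helper name `indefinite_article`; it uses the same first-letter test as the cell above, with a `.lower()` added so capitalized words work too:

```python
def indefinite_article(noun):
    """Pick 'a' or 'an' by looking only at the first letter of the noun."""
    return 'an' if noun[0].lower() in 'aeiou' else 'a'

for word in ['egg', 'shark', 'citizen']:
    print(indefinite_article(word), word)
# an egg
# a shark
# a citizen
```

This is a spelling-based heuristic. English actually picks the article by sound, so words like "hour" or "university" would come out wrong, but none of the nouns in this notebook hit that case.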

Now comes the first tough part: plural nouns. You can have one citizen cook an egg, or multiple citizens cook an egg. However, implementing this introduces new problems. We need a singular and a plural form of each noun. With most nouns you just attach an 's' to the end, but _some_ nouns are special (e.g. "person" singular, "people" plural), so we have to account for both. _Also_, the verb changes depending on whether the noun is singular or plural. So now we have dependencies between the parts.

We'll have to take a staged approach. We first determine our subject. Then we format the verb based on the subject's plurality, and finally we introduce the object.

As for singular and plural, special nouns are given as `[singular, plural]` lists, and a plain string means "add an 's'". Verbs are stored in their plural form, so for a verb the list is `[plural, singular]`.

In [15]:
nouns_improper = [
    'fence',
    'shark',
    ['child', 'children'],
    ['mistress', 'mistresses'],
    'citizen',
    'book',
    'egg'
]

nouns_proper = [
    'John',
    'Jim',
    'Gordon Ramsay',
    'Steven',
    'Khan',
    'Roy',
    'Michael',
    'Vladmir Putin',
    'the working class',
    'the biker gang',
    'the sky',
    'France'
]

verbs = [
    'jump',
    'run',
    'cook',
    'annoy',
    'help',
    'read',
    'choose',
    'write',
    ['punch', 'punches']
]

def noun_improper(plural):
    value = choice(nouns_improper)
    if plural:
        if type(value) is list:
            word = value[1]
        else:
            word = value + 's'
        return 'the ' + word
    else:
        if type(value) is list:
            word = value[0]
        else:
            word = value
        if coin():
            if word[0] in 'aeiou': # usually not y
                return 'an ' + word
            else:
                return 'a ' + word
        else:
            return 'the ' + word
    
def noun_proper():
    return choice(nouns_proper)

def noun():
    if coin():
        plural = coin()
        return noun_improper(plural), plural
    else:
        return noun_proper(), False
    
def verb(plural):
    value = choice(verbs)
    if type(value) is list and plural:
        return value[0]
    elif type(value) is list:
        return value[1]
    elif plural:
        return value
    else:
        return value + 's'
    
# Build sentence
subject, plural = noun()
action = verb(plural)
object_, _ = noun()
subject + ' ' + action + ' ' + object_

'the books read the sharks'
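
The string-or-list convention for nouns can also be resolved in one place, so the rest of the code never has to check the type. A rough sketch, assuming a hypothetical helper `noun_forms` that is not part of the cell above:

```python
def noun_forms(entry):
    """Return (singular, plural) for a plain string or a [singular, plural] list."""
    if isinstance(entry, list):
        singular, plural = entry
        return singular, plural
    # Plain strings follow the default rule: just add an 's'
    return entry, entry + 's'

print(noun_forms('shark'))                # ('shark', 'sharks')
print(noun_forms(['child', 'children']))  # ('child', 'children')
```

The class-based refactor further down does the same job with a default argument (`plural or (singular + 's')`).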

We're starting to make the sentences more interesting. However, the code is getting kind of unruly, so I'm going to do some refactoring. I've introduced a class structure here.

In [16]:
class RandomValue:
    @classmethod
    def random(cls):
        value = choice(cls.data)
        return cls(*value) if type(value) is list else cls(value)
    

class ProperNoun(RandomValue):
    data = [
        'John',
        'Jim',
        'Gordon Ramsay',
        'Steven',
        'Khan',
        'Roy',
        'Michael',
        'Vladmir Putin',
        'the working class',
        'the biker gang',
        'the sky',
        'France'
    ]
    
    def __init__(self, word):
        self.word = word
        
    def get_word(self, plural):
        return self.word

    
class ImproperNoun(RandomValue):
    data = [
        'fence',
        'shark',
        ['child', 'children'],
        ['mistress', 'mistresses'],
        'citizen',
        'book',
        'egg'
    ]
    
    def __init__(self, singular, plural=None):
        self.singular = singular
        self.plural = plural or (singular + 's')
        
    def get_word(self, plural):
        word = self.plural if plural else self.singular
        article = 'the' if plural or coin() else 'an' if word[0] in 'aeiou' else 'a'
        return article + ' ' + word


def noun():
    if coin():
        value = ImproperNoun.random()
        plural = coin()
        return value.get_word(plural), plural
    else:
        return ProperNoun.random().get_word(False), False
    
    
class Verb(RandomValue):
    data = [
        'jump',
        'run',
        'cook',
        'annoy',
        'help',
        'read',
        'choose',
        'write',
        ['punch', 'punches'],
        'throw',
        ['catch', 'catches']
    ]
        
    def __init__(self, plural, singular=None):
        self.plural = plural
        self.singular = singular or (plural + 's')
        
    def get_word(self, plural):
        return self.plural if plural else self.singular
    

# Build sentence
subject, plural = noun()
action = Verb.random().get_word(plural)
object_, _ = noun()
subject + ' ' + action + ' ' + object_

'John helps Roy'

## Adding More Sentence Parts

So now, I want to add more to the sentence. There is more to a sentence than "subject verbs object"...

You remember those sentence diagram things that we had to do in elementary school? What we've been writing are simple sentences, the most basic of simple sentences mind you.

I found this series of pages [here](https://academicguides.waldenu.edu/writingcenter/grammar/home) on grammar, so I'm gonna study that to see what I can add.

I want to stick with simple sentences for now, and flesh them out as much as possible, before I go on to complex sentences (compound should be pretty easy after that, I think?).

Next up, prepositions! These are the parts of the sentence that go "beside the fireplace" or "with a thing".

So, we'll add another layer: pick a random sentence structure. We can use the same `RandomValue` class hierarchy.

In [17]:
class RandomValue:
    @classmethod
    def random(cls):
        value = choice(cls.data)
        return cls(*value) if type(value) is list else cls(value)
    

class RandomWord(RandomValue):
    def __init__(self, word):
        self.word = word
        
    def generate(self, plural=None):
        return self.word
    
    
class ProperNoun(RandomWord):
    data = [
        'John',
        'Jim',
        'Gordon Ramsay',
        'Steven',
        'Khan',
        'Roy',
        'Michael',
        'Vladmir Putin',
        'the working class',
        'the biker gang',
        'the sky',
        'France'
    ]
    is_pluralible = False

    
class ImproperNoun(RandomValue):
    data = [
        'fence',
        'shark',
        ['child', 'children'],
        ['mistress', 'mistresses'],
        'citizen',
        'book',
        'egg'
    ]
    is_pluralible = True
    
    def __init__(self, singular, plural=None):
        self.singular = singular
        self.plural = plural or (singular + 's')
        
    def generate(self, plural=None):
        plural = coin() if plural is None else plural
        word = self.plural if plural else self.singular
        article = 'the' if plural or coin() else 'an' if word[0] in 'aeiou' else 'a'
        return article + ' ' + word


class Noun:
    subclasses = [
        ProperNoun,
        ImproperNoun
    ]
    
    @classmethod
    def random(cls):
        subclass = choice(cls.subclasses)
        return subclass.random()

    
class Verb(RandomValue):
    data = [
        'jump',
        'run',
        'cook',
        'annoy',
        'help',
        'read',
        'choose',
        'write',
        ['punch', 'punches'],
        'throw',
        ['catch', 'catches']
    ]
        
    def __init__(self, plural, singular=None):
        self.plural = plural
        self.singular = singular or (plural + 's')
        
    def generate(self, plural=None):
        plural = coin() if plural is None else plural
        return self.plural if plural else self.singular


class SubjectVerb:
    @classmethod
    def random(cls):
        return cls(coin(), Noun.random(), Verb.random())
    
    def __init__(self, plural, subject, verb):
        self.plural = subject.is_pluralible and plural
        self.subject = subject
        self.verb = verb
        
    def generate(self):
        return ' '.join([
            self.subject.generate(self.plural),
            self.verb.generate(self.plural)
        ])
    
    
class Preposition(RandomWord):
    data = [
        'with',
        'without',
        'above',
        'around',
        'by',
        'at',
        'below',
        'inside',
        'in the middle of'
    ]
    
    
class PrepositionalPhrase:
    @classmethod
    def random(cls):
        return cls(coin(), Preposition.random(), Noun.random())
    
    def __init__(self, plural, preposition, noun):
        self.plural = noun.is_pluralible and plural
        self.preposition = preposition
        self.noun = noun
        
    def generate(self):
        return ' '.join([
            self.preposition.generate(self.plural),
            self.noun.generate(self.plural)
        ])
    

class Sentence(RandomValue):
    data = [
        [[ SubjectVerb, Noun ]],
        [[ SubjectVerb, Noun, PrepositionalPhrase ]]
    ]
    
    def __init__(self, structure):
        self.structure = structure
        
    def generate(self):
        return ' '.join([ Part.random().generate() for Part in self.structure ])
    

# Build sentence
Sentence.random().generate()

'a mistress helps the sharks inside Khan'

So there is a recursive structure to the sentence generator: a sentence is made up of random parts, which are themselves made up of random parts, and so on. I want a unified class structure to simplify the code. One thing I want to do is handle subject-verb agreement without passing extra parameters to `generate`. That way we have one interface for every random part of the sentence.

In [18]:
class RandomValue:
    @classmethod
    def random(cls):
        values = choice(cls.data)
        return cls(*values) if type(values) is list else cls(values)
    

class RandomWord(RandomValue):
    is_pluralible = False
    
    def __init__(self, word):
        self.word = word
        
    def generate(self):
        return self.word
    

class RandomPart:
    @classmethod
    def random(cls):
        subclasses = choice(cls.data)
        if type(subclasses) is not list:
            values = [ subclasses.random() ]
        else:
            values = [ subclass.random() for subclass in subclasses ]
        return cls(*values)
    
    def __init__(self, *args):
        self.structure = args
        
    def generate(self):
        return ' '.join([ part.generate() for part in self.structure ])
    
    
class ProperNoun(RandomWord):
    data = [
        'John',
        'Jim',
        'Gordon Ramsay',
        'Steven',
        'Khan',
        'Roy',
        'Michael',
        'Vladmir Putin',
        'the working class',
        'the biker gang',
        'the sky',
        'France'
    ]
    is_plural = False
    
    
class Preposition(RandomWord):
    data = [
        'with',
        'without',
        'above',
        'around',
        'by',
        'at',
        'below',
        'inside',
        'in the middle of'
    ]
    

class Pluralible(RandomValue):
    def __init__(self, is_plural=None):
        self.is_plural = is_plural if is_plural is not None else coin()
        
    def generate(self):
        return self.plural if self.is_plural else self.singular
    
    
class ImproperNoun(Pluralible):
    data = [
        'fence',
        'shark',
        ['child', 'children'],
        ['mistress', 'mistresses'],
        'citizen',
        'book',
        'egg'
    ]
    
    def __init__(self, singular, plural=None):
        self.singular = singular
        self.plural = plural or (singular + 's')
        super().__init__()

    def get_article(self, word):
        if self.is_plural or coin():
            article = 'the'
        elif word[0] in 'aeiou':
            article = 'an'
        else:
            article = 'a'
        return article + ' ' + word
        
    def generate(self):
        word = super().generate()
        return self.get_article(word)
    
    
class Verb(Pluralible):
    data = [
        'jump',
        'run',
        'cook',
        'annoy',
        'help',
        'read',
        'choose',
        'write',
        ['punch', 'punches'],
        'throw',
        ['catch', 'catches']
    ]
        
    def __init__(self, plural, singular=None, is_plural=None):
        self.plural = plural
        self.singular = singular or (plural + 's')
        super().__init__(is_plural)


class Noun(RandomPart):
    data = [
        ProperNoun,
        ImproperNoun
    ]
        
    @property
    def is_plural(self):
        return self.structure[0].is_plural


class SubjectVerb(RandomPart):
    data = [[ Noun, Verb ]]
    
    def __init__(self, subject, verb):
        self.subject = subject
        self.verb = verb
        
    def generate(self):
        # Subject verb agreement
        self.verb.is_plural = self.subject.is_plural
        
        # Return phrase
        return ' '.join([
            self.subject.generate(),
            self.verb.generate()
        ])
    
    
class PrepositionalPhrase(RandomPart):
    data = [[ Preposition, Noun ]]
    

class Sentence(RandomPart):
    data = [
        [ SubjectVerb, Noun ],
        [ SubjectVerb, Noun, PrepositionalPhrase ]
    ]
    

# Build sentence
Sentence.random().generate()

'France helps the eggs'
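
The recursive idea behind `RandomPart` can be pictured without any classes. The sketch below is only an illustration of the recursion, not the structure used in this notebook: the `GRAMMAR` table and the `expand` function are hypothetical, the word lists are cut down, and it ignores subject-verb agreement entirely.

```python
from random import choice

# Hypothetical grammar table: each symbol maps to a list of possible expansions.
GRAMMAR = {
    'Sentence': [['Noun', 'Verb', 'Noun'],
                 ['Noun', 'Verb', 'Noun', 'PrepPhrase']],
    'PrepPhrase': [['Preposition', 'Noun']],
    'Noun': [['John'], ['the shark'], ['France']],
    'Verb': [['helps'], ['reads']],
    'Preposition': [['with'], ['inside']],
}

def expand(symbol):
    """Expand a symbol recursively until only plain words are left."""
    if symbol not in GRAMMAR:
        return symbol  # a plain word, nothing left to expand
    parts = choice(GRAMMAR[symbol])
    return ' '.join(expand(part) for part in parts)

sentence = expand('Sentence')
print(sentence)
```

A table like this is compact, but it has nowhere to keep state such as `is_plural`, which is what the class version is for.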

Now let's try compound sentences. A compound sentence is two simple sentences joined by a conjunction.

In [19]:
class RandomValue:
    @classmethod
    def random(cls):
        values = choice(cls.data)
        return cls(*values) if type(values) is list else cls(values)
    

class RandomWord(RandomValue):
    is_pluralible = False
    
    def __init__(self, word):
        self.word = word
        
    def generate(self):
        return self.word
    

class RandomPart:
    @classmethod
    def random(cls):
        subclasses = choice(cls.data)
        if type(subclasses) is not list:
            values = [ subclasses.random() ]
        else:
            values = [ subclass.random() for subclass in subclasses ]
        return cls(*values)
    
    def __init__(self, *args):
        self.structure = args
        
    def generate(self):
        return ' '.join([ part.generate() for part in self.structure ])
    
    
class RandomPluralible(RandomValue):
    def __init__(self, is_plural=None):
        self.is_plural = is_plural if is_plural is not None else coin()
        
    def generate(self):
        return self.plural if self.is_plural else self.singular
    
    
class ProperNoun(RandomWord):
    data = [
        'John',
        'Jim',
        'Gordon Ramsay',
        'Steven',
        'Khan',
        'Roy',
        'Michael',
        'Vladmir Putin',
        'the working class',
        'the biker gang',
        'the sky',
        'France'
    ]
    is_plural = False
    
    
class Preposition(RandomWord):
    data = [
        'with',
        'without',
        'above',
        'around',
        'by',
        'at',
        'below',
        'inside',
        'in the middle of'
    ]
    
    
class Conjunction(RandomWord):
    data = [
        'and',
        'but'
    ]
    
    
class ImproperNoun(RandomPluralible):
    data = [
        'fence',
        'shark',
        ['child', 'children'],
        ['mistress', 'mistresses'],
        'citizen',
        'book',
        'egg'
    ]
    
    def __init__(self, singular, plural=None):
        self.singular = singular
        self.plural = plural or (singular + 's')
        super().__init__()

    def get_article(self, word):
        if self.is_plural or coin():
            article = 'the'
        elif word[0] in 'aeiou':
            article = 'an'
        else:
            article = 'a'
        return article + ' ' + word
        
    def generate(self):
        word = super().generate()
        return self.get_article(word)
    
    
class Verb(RandomPluralible):
    data = [
        'jump',
        'run',
        'cook',
        'annoy',
        'help',
        'read',
        'choose',
        'write',
        ['punch', 'punches'],
        'throw',
        ['catch', 'catches']
    ]
        
    def __init__(self, plural, singular=None, is_plural=None):
        self.plural = plural
        self.singular = singular or (plural + 's')
        super().__init__(is_plural)


class Noun(RandomPart):
    data = [
        ProperNoun,
        ImproperNoun
    ]
        
    @property
    def is_plural(self):
        return self.structure[0].is_plural


class SubjectVerb(RandomPart):
    data = [[ Noun, Verb ]]
    
    def __init__(self, subject, verb):
        self.subject = subject
        self.verb = verb
        
    def generate(self):
        # Subject verb agreement
        self.verb.is_plural = self.subject.is_plural
        
        # Return phrase
        return ' '.join([
            self.subject.generate(),
            self.verb.generate()
        ])
    
    
class PrepositionalPhrase(RandomPart):
    data = [[ Preposition, Noun ]]
    

class SimpleSentence(RandomPart):
    data = [
        [ SubjectVerb, Noun ],
        [ SubjectVerb, Noun, PrepositionalPhrase ]
    ]
    

class Sentence(RandomPart):
    data = [
        SimpleSentence,
        [ SimpleSentence, Conjunction, SimpleSentence ]
    ]
    

# Build sentence
Sentence.random().generate()

'the fences punch the eggs with Michael'

Complex sentences should be easy: we can use the same framework, just with an extra layer.

In [26]:
class RandomValue:
    @classmethod
    def random(cls):
        values = choice(cls.data)
        return cls(*values) if type(values) is list else cls(values)
    

class RandomWord(RandomValue):
    is_pluralible = False
    
    def __init__(self, word):
        self.word = word
        
    def generate(self):
        return self.word
    

class RandomPart:
    @classmethod
    def random(cls):
        subclasses = choice(cls.data)
        if type(subclasses) is not list:
            values = [ subclasses.random() ]
        else:
            values = [ subclass.random() for subclass in subclasses ]
        return cls(*values)
    
    def __init__(self, *args):
        self.structure = args
        
    def generate(self):
        return ' '.join([ part.generate() for part in self.structure ])
    
    
class RandomPluralible(RandomValue):
    def __init__(self, is_plural=None):
        self.is_plural = is_plural if is_plural is not None else coin()
        
    def generate(self):
        return self.plural if self.is_plural else self.singular
    
    
class ProperNoun(RandomWord):
    data = [
        'John',
        'Jim',
        'Gordon Ramsay',
        'Steven',
        'Khan',
        'Roy',
        'Michael',
        'Vladmir Putin',
        'the working class',
        'the biker gang',
        'the sky',
        'France'
    ]
    is_plural = False
    
    
class Preposition(RandomWord):
    data = [
        'with',
        'without',
        'above',
        'around',
        'by',
        'at',
        'below',
        'inside',
        'in the middle of'
    ]
    
    
class CompoundConjunction(RandomWord):
    data = [
        'and',
        'but'
    ]
    

class ComplexConjunction(RandomWord):
    data = [
        'although',
        'because',
        'while'
    ]
    
    
class ImproperNoun(RandomPluralible):
    data = [
        'fence',
        'shark',
        ['child', 'children'],
        ['mistress', 'mistresses'],
        'citizen',
        'book',
        'egg'
    ]
    
    def __init__(self, singular, plural=None):
        self.singular = singular
        self.plural = plural or (singular + 's')
        super().__init__()

    def get_article(self, word):
        if self.is_plural or coin():
            article = 'the'
        elif word[0] in 'aeiou':
            article = 'an'
        else:
            article = 'a'
        return article + ' ' + word
        
    def generate(self):
        word = super().generate()
        return self.get_article(word)
    
    
class Verb(RandomPluralible):
    data = [
        'jump',
        'run',
        'cook',
        'annoy',
        'help',
        'read',
        'choose',
        'write',
        ['punch', 'punches'],
        'throw',
        ['catch', 'catches'],
        'correct',
        ['modify', 'modifies']
    ]
        
    def __init__(self, plural, singular=None, is_plural=None):
        self.plural = plural
        self.singular = singular or (plural + 's')
        super().__init__(is_plural)


class Noun(RandomPart):
    data = [
        ProperNoun,
        ImproperNoun
    ]
        
    @property
    def is_plural(self):
        return self.structure[0].is_plural


class SubjectVerb(RandomPart):
    data = [[ Noun, Verb ]]
    
    def __init__(self, subject, verb):
        self.subject = subject
        self.verb = verb
        
    def generate(self):
        # Subject verb agreement
        self.verb.is_plural = self.subject.is_plural
        
        # Return phrase
        return ' '.join([
            self.subject.generate(),
            self.verb.generate()
        ])
    
    
class PrepositionalPhrase(RandomPart):
    data = [[ Preposition, Noun ]]
    

class SimpleSentence(RandomPart):
    data = [
        [ SubjectVerb, Noun ],
        [ SubjectVerb, Noun, PrepositionalPhrase ]
    ]
    

class ComplexOrSimpleSentence(RandomPart):
    data = [
        [ SimpleSentence ],
        [ ComplexConjunction, SimpleSentence, SimpleSentence ],
        [ SimpleSentence, ComplexConjunction, SimpleSentence ],
    ]
    

class Sentence(RandomPart):
    data = [
        ComplexOrSimpleSentence,
        [ ComplexOrSimpleSentence, CompoundConjunction, ComplexOrSimpleSentence ]
    ]
    

# Build sentence
Sentence.random().generate()

'a mistress jumps Vladmir Putin'
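
The two complex orderings in `ComplexOrSimpleSentence.data` are easier to see on fixed clauses. A small sketch, assuming a hypothetical helper `complex_sentence`; the comma after a leading dependent clause is my addition here, since the generator above joins everything with plain spaces:

```python
def complex_sentence(conjunction, dependent, independent, dependent_first=True):
    """Attach a dependent clause to an independent clause with a conjunction."""
    if dependent_first:
        # Conjunction first: "because X, Y"
        return f'{conjunction} {dependent}, {independent}'
    # Conjunction in the middle: "Y because X"
    return f'{independent} {conjunction} {dependent}'

front = complex_sentence('because', 'Roy cooks an egg', 'John helps the child')
middle = complex_sentence('because', 'Roy cooks an egg', 'John helps the child',
                          dependent_first=False)
print(front)   # because Roy cooks an egg, John helps the child
print(middle)  # John helps the child because Roy cooks an egg
```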