# Probabilistic Tavern Name Generation

This notebook uses spaCy, an industrial-strength NLP library, and a standard Markov chain (via markovify) to generate tavern names.

Many online sources offer randomly generated names, but their "random generation" usually amounts to picking a name from a prebuilt list.

The method below instead trains a stochastic model on roughly 50,000 UK pub names and generates new names from it.
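To make the idea concrete, here is a minimal sketch of a word-level Markov chain in plain Python. It is a simplification, not the model used in this notebook: it conditions on one previous word only (markovify defaults to a state size of 2), and the helper names `build_chain` and `generate_name` as well as the four sample names are assumptions made for illustration.

```python
import random
from collections import defaultdict

BEGIN, END = "___BEGIN__", "___END__"

def build_chain(names):
    """Map each word to the list of words observed directly after it."""
    chain = defaultdict(list)
    for name in names:
        words = [BEGIN] + name.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def generate_name(chain, rng, max_words=6):
    """Walk the chain from BEGIN until END is drawn or max_words is reached."""
    word, output = BEGIN, []
    while len(output) < max_words:
        word = rng.choice(chain[word])
        if word == END:
            break
        output.append(word)
    return " ".join(output)

# Tiny hand-picked training set (illustration only, not the real corpus).
sample_names = ["The Red Lion", "The White Hart", "Red Cow Inn", "White Hart Inn"]
chain = build_chain(sample_names)
rng = random.Random(0)
generated = [generate_name(chain, rng) for _ in range(5)]
print(generated)
```

Because followers are stored with repetition, `rng.choice` samples each transition in proportion to how often it was seen, which is what makes the output resemble the training names while still allowing new combinations.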


In [20]:
import spacy
import markovify

In [5]:
import pandas as pd

In [6]:
columns = ['fsa_id', 'name',
           'address', 'postcode',
           'easting', 'northing',
           'latitude', 'longitude',
           'local_authority']

pub_names = pd.read_csv('data/uk_pub_data.csv', names=columns)

In [8]:
# Join the names with '. ' so that each pub name reads as its own sentence.
corpus = '. '.join(pub_names['name'])

In [9]:
len(corpus)

972245

In [12]:
pub_names['length'] = pub_names['name'].str.len()

In [14]:
pub_names.length.mean()

16.85441957879223
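As a rough cross-check, the two figures printed above are enough to estimate how many names the corpus holds. The sketch below assumes only what the notebook shows: the corpus is the names joined by the two-character separator `'. '`, so its length is the summed name lengths plus one separator between each pair of names. The variable names are chosen for illustration.

```python
corpus_chars = 972245                  # len(corpus) printed above
mean_name_chars = 16.85441957879223    # pub_names.length.mean() printed above
separator_chars = len(". ")            # separator used when joining the names

# corpus_chars = n * mean_name_chars + (n - 1) * separator_chars, solved for n.
estimated_names = (corpus_chars + separator_chars) / (mean_name_chars + separator_chars)
print(round(estimated_names))
```

The estimate comes out at a little over 51,000 names.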

In [26]:
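# The "en" shortcut was removed in spaCy 3; load the model by its full package name.
# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")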

In [27]:
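class PosifiedText(markovify.Text):
    """markovify model whose states are "word::POS" pairs rather than bare words."""

    def word_split(self, sentence):
        # Tag every token with its part of speech: "<word>::<POS>".
        return ["::".join((word.orth_, word.pos_)) for word in nlp(sentence)]

    def word_join(self, words):
        # Strip the POS tags again; punctuation attaches to the preceding word.
        tagged_words = [word.split("::") for word in words]
        sentence = ''
        for word, pos_tag in tagged_words:
            if pos_tag == "PUNCT":
                sentence += word
            else:
                sentence += " " + word
        return sentence.lstrip()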
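The joining rule is easier to see without spaCy in the loop. The sketch below repeats the `word_join` logic as a standalone function (`join_tagged` is a name made up for this example) and feeds it hand-written tags chosen for illustration rather than real tagger output.

```python
def join_tagged(words):
    """Rebuild a name from "word::POS" strings, gluing punctuation to the left."""
    sentence = ""
    for tagged in words:
        word, pos_tag = tagged.split("::")
        if pos_tag == "PUNCT":
            sentence += word
        else:
            sentence += " " + word
    return sentence.lstrip()

# Hand-written tags (assumed, not produced by spaCy here).
tokens = ["Crow::PROPN", "'s::PART", "Nest::PROPN", ".::PUNCT"]
joined = join_tagged(tokens)
print(joined)  # Crow 's Nest.
```

Only tokens tagged `PUNCT` are attached without a space, so a possessive `'s` that the tokenizer splits off and tags differently comes back with a leading space. That is why the raw output contains names such as "Crow 's Nest." and why a clean-up step is useful.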

In [28]:
# Build the model.
text_model = PosifiedText(corpus)

for i in range(15):
    print(text_model.make_short_sentence(20, tries=25))

THE WHITE HART INN.
Yates 's Wine Bar.
Crow 's Nest.
City Golf Club Ltd.
Monton Bowling Club.
Up The Junction Pub.
New Red Lion.
Dolgellau Golf Club.
Stag 's Head.
Two Point Bar.
Paxton 's Head.
Live at Home.
Traveller 's Rest.
Bouchra At The Bank.
New York Club.


In [44]:
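import re

def normalise_name(text):
    # make_short_sentence() returns None when no sentence is found within `tries`.
    if not text:
        return text
    # Expand "PH" / "Ph." only as a standalone word, so names such as
    # "PHOENIX" are left untouched.
    text = re.sub(r"\b(?:PH\b|Ph\.)", "Public House", text)
    # Re-attach possessives that were split off as separate tokens.
    text = text.replace(" 's", "'s")
    # Drop the periods introduced when the corpus was joined.
    return text.replace(".", "")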

In [40]:
def make_tavern():
    return normalise_name(text_model.make_short_sentence(20, tries=25))
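Since markovify's `make_short_sentence` returns `None` when it cannot produce a sentence within the allowed tries, a caller that formats the result straight into a string can end up printing "None". One possible guard is sketched below; the helper `first_valid_name`, its parameters, and the fallback name are all hypothetical and not part of the original notebook.

```python
def first_valid_name(generate, attempts=5, fallback="The Nameless Tavern"):
    """Call generate() until it returns a non-empty name, else use the fallback."""
    for _ in range(attempts):
        name = generate()
        if name:
            return name
    return fallback

# Stand-in generator that fails twice before succeeding (illustration only).
canned_results = iter([None, None, "Frog & Bucket"])
result = first_valid_name(lambda: next(canned_results))
print(result)  # Frog & Bucket
```

In this notebook the same helper could wrap the generator as `first_valid_name(make_tavern)`.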

In [49]:
print("Walking into {}".format(make_tavern()))

Walking into Ye Olde Wine House


In [50]:
print("Our travellers prefer it to {}, which sat across the path from {}.".format(make_tavern(), make_tavern()))

Our travellers prefer it to The Cock & Seaman, which sat across the path from The Red Cow Inn.


In [51]:
print("Their plan tonight was to drink here, then move to {}".format(make_tavern()))
print("However, Durin was looking forward to the ale at {}".format(make_tavern()))

Their plan tonight was to drink here, then move to Old Town Hall Bars
However, Durin was looking forward to the ale at The Hop & Barley


In [52]:
print("They finished their pints and walked past the rowdy crowd at {}".format(make_tavern()))

They finished their pints and walked past the rowdy crowd at Balcony Bar & Grill


In [55]:
print("Looking forward to the final stop: {}".format(make_tavern()))

Looking forward to the final stop: Frog & Bucket
