# Bad-Libs

Automatically converts any book into a Mad-Libs style game of silliness.

This is a tutorial notebook to teach the basics of spaCy from scratch.

Requires spaCy: https://spacy.io

How it works:
 1. First it scans the text for adjectives, nouns, verbs, people, and locations.
 2. Then it creates placeholders that the "player" (that's you!) can provide their own answers for.
 3. It then fills in the blanks, replacing each missing spot with a word of the appropriate part of speech.

Finally, we read the finished bad-lib result while hilarity ensues.
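
The three steps can be sketched on a toy sentence without spaCy. In this sketch the labels are written by hand (an assumption standing in for what spaCy would assign), and `TAGGED`, `make_placeholders`, and `fill` are hypothetical names used only for illustration.

```python
# Toy walk-through of the three steps. The (word, label) pairs are
# hand-written stand-ins for what a tagger such as spaCy would produce.
TAGGED = [
    ("The", None), ("hungry", "ADJ"), ("dog", "NOUN"),
    ("chased", "VERB"), ("Alice", "PERSON"), ("through", None),
    ("London", "GPE"), (".", None),
]

def make_placeholders(tagged):
    """Steps 1 and 2: record the position and label of each replaceable word."""
    return [(i, label) for i, (_, label) in enumerate(tagged) if label]

def fill(tagged, placeholders, answers):
    """Step 3: swap each placeholder word for the player's answer."""
    words = [word for word, _ in tagged]
    for (i, _), answer in zip(placeholders, answers):
        words[i] = answer
    # Join with spaces, then pull the final period back onto the last word.
    return " ".join(words).replace(" .", ".")

slots = make_placeholders(TAGGED)
result = fill(TAGGED, slots, ["sleepy", "toaster", "tickled", "Bob", "Narnia"])
print([label for _, label in slots])
print(result)
```

The rest of the notebook does the same thing, except that spaCy finds the labels and the positions are character offsets into the paragraph.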

In [4]:
#Mount your Google Drive so the notebook can read files stored there
from google.colab import drive
drive.mount('/content/drive/')
path = '/content/drive/My Drive/Colab Notebooks/'

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).


In [2]:
#Download the large model!
!python -m spacy download 'en_core_web_lg'

Collecting en_core_web_lg==2.1.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.1.0/en_core_web_lg-2.1.0.tar.gz (826.9MB)
     |████████████████████████████████| 826.9MB 1.1MB/s
Building wheels for collected packages: en-core-web-lg
  Building wheel for en-core-web-lg (setup.py) ... done
  Created wheel for en-core-web-lg: filename=en_core_web_lg-2.1.0-cp36-none-any.whl size=828255076 sha256=2e063f2f804d66d1d10044adbbb195b4fc28724b7905fa6ebcd78ea2fb2e35d7
  Stored in directory: /tmp/pip-ephem-wheel-cache-c_pknvds/wheels/b4/d7/70/426d313a459f82ed5e06cc36a50e2bb2f0ec5cb31d8e0bdf09
Successfully built en-core-web-lg
Installing collected packages: en-core-web-lg
Successfully installed en-core-web-lg-2.1.0
✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_lg')


In [0]:
import io
import re
import spacy
import datetime
import math
import random
from operator import itemgetter

In [0]:
#Initialize with the large model, which has the best accuracy.
nlp = spacy.load('en_core_web_lg')

### Load the content

Always understand the content structure first. We'll be splitting the text into chapters and then into the paragraphs of each chapter, so we really need to know what it looks like before we start processing it!

In [0]:
filename = path+"great_expectations.txt"
with io.open(filename, mode="r", encoding="utf-8") as f:
    content = f.read()

In [6]:
print(len(content))

1013443


In [7]:
#Inspect the text to see what it looks like.  Note the breaks with '\n\nChapter X\n\n'
print(content[0:10000])

The Project Gutenberg EBook of Great Expectations, by Charles Dickens

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.org


Title: Great Expectations

Author: Charles Dickens

Posting Date: August 20, 2008 [EBook #1400]
Release Date: July, 1998
Last Updated: March 4, 2018

Language: English

Character set encoding: UTF-8

*** START OF THIS PROJECT GUTENBERG EBOOK GREAT EXPECTATIONS ***




Produced by An Anonymous Volunteer





GREAT EXPECTATIONS

[1867 Edition]

by Charles Dickens


[Project Gutenberg Editor's Note: There is also another version of
this work etext98/grexp10.txt scanned from a different edition]




Chapter I

My father's family name being Pirrip, and my Christian name Philip, my
infant tongue could make of both names nothing longer or more explicit
than Pip. So, I called m
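
The inspection above shows that the novel proper begins after a `*** START OF ... ***` line. A minimal sketch of trimming everything up to that line follows; `strip_header` is a hypothetical helper, and the sample string is a shortened stand-in for the real file.

```python
def strip_header(text, marker="*** START OF"):
    """Return the text after the first line containing `marker`.

    If the marker is absent, the text is returned unchanged.
    """
    pos = text.find(marker)
    if pos == -1:
        return text
    line_end = text.find("\n", pos)
    if line_end == -1:
        return ""
    # Skip the marker line itself and any blank lines that follow it.
    return text[line_end + 1:].lstrip("\n")

# Shortened stand-in for the Project Gutenberg file shown above.
sample = (
    "Title: Great Expectations\n\n"
    "*** START OF THIS PROJECT GUTENBERG EBOOK GREAT EXPECTATIONS ***\n\n\n"
    "GREAT EXPECTATIONS\n\nChapter I\n\nMy father's family name being Pirrip"
)
body = strip_header(sample)
print(body)
```

This step is optional here: the chapter split in the next section keeps only the pieces after the first `Chapter` break, which discards the preamble anyway.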

### Breaking up the chapters and paragraphs

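spaCy, like many other NLP libraries, doesn't break up text by paragraph, so we need to do some string nerding on this thing. A more professional name is "text preprocessing". We will split on chapters using a breaker string, then split each chapter into paragraphs. We also do some text cleanup: double hyphens become spaces, and the line breaks inside each paragraph are removed.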

In [8]:
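chapters = []
# Replace double hyphens (used as dashes in this text) with a space
content = content.replace('--',' ')
ch_split = '\n\nChapter '
pa_split = '\n\n'
# [1:] drops everything before the first chapter break (the Gutenberg preamble)
for chapter in content.split(ch_split)[1:]:
    # Piece 0 is the chapter numeral. Note that [2:] also drops piece 1, which
    # in the text shown above is the chapter's opening paragraph ([1:] keeps it)
    paragraphs = chapter.split(pa_split)[2:]
    # Join the hard-wrapped lines of each paragraph into a single line
    paragraphs = [para.replace('\n',' ').strip() for para in paragraphs]
    chapters.append(paragraphs)
    print('Chapter',len(chapters),': ',len(paragraphs))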

Chapter 1 :  45
Chapter 2 :  64
Chapter 3 :  46
Chapter 4 :  55
Chapter 5 :  73
Chapter 6 :  5
Chapter 7 :  92
Chapter 8 :  105
Chapter 9 :  73
Chapter 10 :  56
Chapter 11 :  128
Chapter 12 :  26
Chapter 13 :  68
Chapter 14 :  8
Chapter 15 :  93
Chapter 16 :  17
Chapter 17 :  75
Chapter 18 :  133
Chapter 19 :  110
Chapter 20 :  73
Chapter 21 :  43
Chapter 22 :  109
Chapter 23 :  49
Chapter 24 :  51
Chapter 25 :  54
Chapter 26 :  54
Chapter 27 :  65
Chapter 28 :  40
Chapter 29 :  119
Chapter 30 :  70
Chapter 31 :  35
Chapter 32 :  46
Chapter 33 :  64
Chapter 34 :  30
Chapter 35 :  61
Chapter 36 :  75
Chapter 37 :  36
Chapter 38 :  110
Chapter 39 :  104
Chapter 40 :  117
Chapter 41 :  53
Chapter 42 :  46
Chapter 43 :  58
Chapter 44 :  77
Chapter 45 :  55
Chapter 46 :  44
Chapter 47 :  40
Chapter 48 :  64
Chapter 49 :  82
Chapter 50 :  48
Chapter 51 :  66
Chapter 52 :  42
Chapter 53 :  84
Chapter 54 :  82
Chapter 55 :  64
Chapter 56 :  35
Chapter 57 :  107
Chapter 58 :  65
Chapter 59 :  1
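
To see what the two splits produce, here is a small sketch on a made-up sample string. `split_chapters` is a hypothetical helper, not part of the notebook. Unlike the cell above, it keeps every piece after the chapter numeral (`[1:]`) and drops empty paragraphs; both choices are assumptions made for the illustration.

```python
def split_chapters(text, ch_split="\n\nChapter ", pa_split="\n\n"):
    """Split text into a list of chapters, each a list of paragraph strings."""
    chapters = []
    # Everything before the first chapter break is front matter, so skip it.
    for chapter in text.split(ch_split)[1:]:
        pieces = chapter.split(pa_split)
        # pieces[0] is the chapter numeral ("I", "II", ...); the rest are paragraphs.
        paragraphs = [p.replace("\n", " ").strip() for p in pieces[1:]]
        chapters.append([p for p in paragraphs if p])
    return chapters

# Made-up sample with the same layout as the Gutenberg text.
sample = (
    "Front matter\n\n"
    "Chapter I\n\n"
    "First paragraph of\nchapter one.\n\n"
    "Second paragraph.\n\n"
    "Chapter II\n\n"
    "Only paragraph of chapter two."
)
chapters_demo = split_chapters(sample)
for number, paragraphs in enumerate(chapters_demo, start=1):
    print("Chapter", number, ":", paragraphs)
```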

### Parsing the paragraphs with spaCy nlp

This takes the first two paragraphs of each chapter and makes a spaCy "Doc" for each one, so `docs` is a flat list of paragraphs rather than of whole chapters.

In [9]:
docs = []
for paragraphs in chapters:
    for text in paragraphs[0:2]:
        docs.append(nlp(text))
        print(text)
        print('')
    print('---------')

I give Pirrip as my father's family name, on the authority of his tombstone and my sister, Mrs. Joe Gargery, who married the blacksmith. As I never saw my father or my mother, and never saw any likeness of either of them (for their days were long before the days of photographs), my first fancies regarding what they were like were unreasonably derived from their tombstones. The shape of the letters on my father's, gave me an odd idea that he was a square, stout, dark man, with curly black hair. From the character and turn of the inscription, “Also Georgiana Wife of the Above,” I drew a childish conclusion that my mother was freckled and sickly. To five little stone lozenges, each about a foot and a half long, which were arranged in a neat row beside their grave, and were sacred to the memory of five little brothers of mine, who gave up trying to get a living, exceedingly early in that universal struggle, I am indebted for a belief I religiously entertained that they had all been born on

### Exploring the parsed structure

We'll look at the sentences, tokens, nouns, and entities of the first parsed paragraph.

In [0]:
chapter_1 = docs[0]

In [11]:
sents = 0
for sent in chapter_1.sents:
    print(sent)
    print('----------')
    sents+=1
print(sents)

I give Pirrip as my father's family name, on the authority of his tombstone and my sister, Mrs. Joe Gargery, who married the blacksmith.
----------
As I never saw my father or my mother, and never saw any likeness of either of them (for their days were long before the days of photographs), my first fancies regarding what they were like were unreasonably derived from their tombstones.
----------
The shape of the letters on my father's, gave me an odd idea that he was a square, stout, dark man, with curly black hair.
----------
From the character and turn of the inscription, “Also Georgiana Wife of the Above,” I drew a childish conclusion that my mother was freckled and sickly.
----------
To five little stone lozenges, each about a foot and a half long, which were arranged in a neat row beside their grave, and were sacred to the memory of five little brothers of mine, who gave up trying to get a living, exceedingly early in that universal struggle, I am indebted for a belief I religiousl

In [12]:
sents = 0
toks = 0
for sent in chapter_1.sents:
    sents+=1
    for tok in sent:
        print(tok.pos_,' | ',tok.text)
        toks+=1
print(sents,toks)

PRON  |  I
VERB  |  give
PROPN  |  Pirrip
ADP  |  as
DET  |  my
NOUN  |  father
PART  |  's
NOUN  |  family
NOUN  |  name
PUNCT  |  ,
ADP  |  on
DET  |  the
NOUN  |  authority
ADP  |  of
DET  |  his
NOUN  |  tombstone
CCONJ  |  and
DET  |  my
NOUN  |  sister
PUNCT  |  ,
PROPN  |  Mrs.
PROPN  |  Joe
PROPN  |  Gargery
PUNCT  |  ,
PRON  |  who
VERB  |  married
DET  |  the
NOUN  |  blacksmith
PUNCT  |  .
ADP  |  As
PRON  |  I
ADV  |  never
VERB  |  saw
DET  |  my
NOUN  |  father
CCONJ  |  or
DET  |  my
NOUN  |  mother
PUNCT  |  ,
CCONJ  |  and
ADV  |  never
VERB  |  saw
DET  |  any
NOUN  |  likeness
ADP  |  of
DET  |  either
ADP  |  of
PRON  |  them
PUNCT  |  (
ADP  |  for
DET  |  their
NOUN  |  days
VERB  |  were
ADJ  |  long
ADP  |  before
DET  |  the
NOUN  |  days
ADP  |  of
NOUN  |  photographs
PUNCT  |  )
PUNCT  |  ,
DET  |  my
ADJ  |  first
NOUN  |  fancies
VERB  |  regarding
PRON  |  what
PRON  |  they
VERB  |  were
INTJ  |  like
VERB  |  were
ADV  |  unreasonably
VERB  |  derived
A

In [13]:
sents = 0
toks = 0
poss = set()
for sent in chapter_1.sents:
    sents+=1
    for tok in sent:
        toks+=1
        poss.add(tok.pos_)
print(poss)

{'PRON', 'ADP', 'NOUN', 'ADV', 'INTJ', 'NUM', 'PROPN', 'CCONJ', 'PART', 'PUNCT', 'ADJ', 'VERB', 'DET'}



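## Here's a handy list of the fine-grained PoS tags and example words for each

These are the Penn Treebank-style tags that spaCy exposes as `tok.tag_`; the coarser labels printed above (`NOUN`, `VERB`, ...) come from `tok.pos_`.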

#### CC: conjunction, coordinating
```
& 'n and both but either et for less minus neither nor or plus so
therefore times v. versus vs. whether yet
```

#### CD: numeral, cardinal
```
mid-1890 nine-thirty forty-two one-tenth ten million 0.5 one forty-
seven 1987 twenty '79 zero two 78-degrees eighty-four IX '60s .025
fifteen 271,124 dozen quintillion DM2,000 ...
```

#### DT: determiner
```
all an another any both del each either every half la many much nary
neither no some such that the them these this those
```

#### EX: existential there
```
there
```

#### IN: preposition or conjunction, subordinating
```
astride among uppon whether out inside pro despite on by throughout
below within for towards near behind atop around if like until below
next into if beside ...
```

#### JJ: adjective or numeral, ordinal
```
third ill-mannered pre-war regrettable oiled calamitous first separable
ectoplasmic battery-powered participatory fourth still-to-be-named
multilingual multi-disciplinary ...
```

#### JJR: adjective, comparative
```
bleaker braver breezier briefer brighter brisker broader bumper busier
calmer cheaper choosier cleaner clearer closer colder commoner costlier
cozier creamier crunchier cuter ...
```

#### JJS: adjective, superlative
```
calmest cheapest choicest classiest cleanest clearest closest commonest
corniest costliest crassest creepiest crudest cutest darkest deadliest
dearest deepest densest dinkiest ...
```

#### LS: list item marker
```
A A. B B. C C. D E F First G H I J K One SP-44001 SP-44002 SP-44005
SP-44007 Second Third Three Two * a b c d first five four one six three
two
```

#### MD: modal auxiliary
```
can cannot could couldn't dare may might must need ought shall should
shouldn't will would
```

#### NN: noun, common, singular or mass
```
common-carrier cabbage knuckle-duster Casino afghan shed thermostat
investment slide humour falloff slick wind hyena override subhumanity
machinist ...
```

#### NNP: noun, proper, singular
```
Motown Venneboerger Czestochwa Ranzer Conchita Trumplane Christos
Oceanside Escobar Kreisler Sawyer Cougar Yvette Ervin ODI Darryl CTCA
Shannon A.K.C. Meltex Liverpool ...
```

#### NNS: noun, common, plural
```
undergraduates scotches bric-a-brac products bodyguards facets coasts
divestitures storehouses designs clubs fragrances averages
subjectivists apprehensions muses factory-jobs ...
```

#### PDT: pre-determiner
```
all both half many quite such sure this
```

#### POS: genitive marker
```
' 's
```

#### PRP: pronoun, personal
```
hers herself him himself hisself it itself me myself one oneself ours
ourselves ownself self she thee theirs them themselves they thou thy us
```

#### PRP$: pronoun, possessive
```
her his mine my our ours their thy your
```

#### RB: adverb
```
occasionally unabatingly maddeningly adventurously professedly
stirringly prominently technologically magisterially predominately
swiftly fiscally pitilessly ...
```

#### RBR: adverb, comparative
```
further gloomier grander graver greater grimmer harder harsher
healthier heavier higher however larger later leaner lengthier less-
perfectly lesser lonelier longer louder lower more ...
```

#### RBS: adverb, superlative
```
best biggest bluntest earliest farthest first furthest hardest
heartiest highest largest least less most nearest second tightest worst
```

#### RP: particle
```
aboard about across along apart around aside at away back before behind
by crop down ever fast for forth from go high i.e. in into just later
low more off on open out over per pie raising start teeth that through
under unto up up-pp upon whole with you
```

#### TO: "to" as preposition or infinitive marker
```
to
```

#### UH: interjection
```
Goodbye Goody Gosh Wow Jeepers Jee-sus Hubba Hey Kee-reist Oops amen
huh howdy uh dammit whammo shucks heck anyways whodunnit honey golly
man baby diddle hush sonuvabitch ...
```

#### VB: verb, base form
```
ask assemble assess assign assume atone attention avoid bake balkanize
bank begin behold believe bend benefit bevel beware bless boil bomb
boost brace break bring broil brush build ...
```

#### VBD: verb, past tense
```
dipped pleaded swiped regummed soaked tidied convened halted registered
cushioned exacted snubbed strode aimed adopted belied figgered
speculated wore appreciated contemplated ...
```

#### VBG: verb, present participle or gerund
```
telegraphing stirring focusing angering judging stalling lactating
hankerin' alleging veering capping approaching traveling besieging
encrypting interrupting erasing wincing ...
```

#### VBN: verb, past participle
```
multihulled dilapidated aerosolized chaired languished panelized used
experimented flourished imitated reunifed factored condensed sheared
unsettled primed dubbed desired ...
```

#### VBP: verb, present tense, not 3rd person singular
```
predominate wrap resort sue twist spill cure lengthen brush terminate
appear tend stray glisten obtain comprise detest tease attract
emphasize mold postpone sever return wag ...
```

#### VBZ: verb, present tense, 3rd person singular
```
bases reconstructs marks mixes displeases seals carps weaves snatches
slumps stretches authorizes smolders pictures emerges stockpiles
seduces fizzes uses bolsters slaps speaks pleads ...
```

#### WDT: WH-determiner
```
that what whatever which whichever
```

#### WP: WH-pronoun
```
that what whatever whatsoever which who whom whosoever
```

#### WRB: Wh-adverb
```
how however whence whenever where whereby whereever wherein whereof why
```

In [0]:
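# All the singular common nouns (fine-grained tag NN) in all the sentences.
# The set holds Token objects, not strings, so a repeated word such as
# "father" can appear more than once.
sents = 0
toks = 0
nouns = set()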
for sent in chapter_1.sents:
    sents+=1
    for tok in sent:
        toks+=1
        if tok.tag_ == 'NN':
            nouns.add(tok)
print(sents,toks)
print(nouns)

5 232
{belief, inscription, mother, father, tombstone, idea, shape, father, grave, struggle, family, memory, man, state, turn, blacksmith, row, foot, mother, stone, conclusion, likeness, existence, name, sister, hair, father, mine, authority, half, character, living}


In [0]:
# All the noun chunks (chunked using the spaCy model and algorithm)
chunks = 0
for concept in chapter_1.noun_chunks:
    print(concept)
    chunks += 1
print(chunks)

I
Pirrip
my father's family name
the authority
his tombstone
my sister
Mrs. Joe Gargery
who
the blacksmith
I
my father
my mother
any likeness
them
their days
the days
photographs
my first fancies
what
they
their tombstones
The shape
the letters
my father
me
an odd idea
he
a square, stout, dark man
curly black hair
the character
the inscription
Also Georgiana Wife
the Above
I
a childish conclusion
my mother
five little stone lozenges
a foot
a neat row
their grave
the memory
five little brothers
mine
who
a living
that universal struggle
I
a belief
I
they
their backs
their hands
their trousers-pockets
them
this state
existence
56


In [14]:
#Entities in each sentence
for sent in chapter_1.sents:
    for ent in sent.ents:
        if(ent.text.strip()):
            print(ent, ent.label_, ent.start, ent.end)

Pirrip ORG 2 3
Joe Gargery PERSON 21 23
their days DATE 50 52
the days DATE 55 57
first ORDINAL 62 63
Georgiana Wife of the Above PERSON 118 123
five CARDINAL 139 140
about a foot QUANTITY 145 148
half CARDINAL 150 151
five CARDINAL 171 172


In [15]:
# All the entities in all the sentences
for sent in chapter_1.sents:
    if len(sent.ents)>0:
        print('--------')
        print(sent)
    for ent in sent.ents:
        if(ent.text.strip()):
            print('     ',ent.text,' | ',ent.label_)

--------
I give Pirrip as my father's family name, on the authority of his tombstone and my sister, Mrs. Joe Gargery, who married the blacksmith.
      Pirrip  |  ORG
      Joe Gargery  |  PERSON
--------
As I never saw my father or my mother, and never saw any likeness of either of them (for their days were long before the days of photographs), my first fancies regarding what they were like were unreasonably derived from their tombstones.
      their days  |  DATE
      the days  |  DATE
      first  |  ORDINAL
--------
From the character and turn of the inscription, “Also Georgiana Wife of the Above,” I drew a childish conclusion that my mother was freckled and sickly.
      Georgiana Wife of the Above  |  PERSON
--------
To five little stone lozenges, each about a foot and a half long, which were arranged in a neat row beside their grave, and were sacred to the memory of five little brothers of mine, who gave up trying to get a living, exceedingly early in that universal struggle, I

### Kinds of entities we have in our text

Let's list the entity types that exist, and explore our interests for our Bad-Libs project!

In [16]:
labels = set()
for doc in docs:
    for ent in doc.ents:
        labels.add(ent.label_)
print(labels)

{'NORP', 'MONEY', 'TIME', 'EVENT', 'PERSON', 'PRODUCT', 'ORG', 'DATE', 'LOC', 'LANGUAGE', 'GPE', 'WORK_OF_ART', 'FAC', 'ORDINAL', 'CARDINAL', 'QUANTITY'}


### Bad Libs will use People and Geo-Political Entities

We want to ask the player for replacements for these names.

In [17]:
placeholders = []
entity_filters = {'PERSON', 'GPE'}
for doc in docs:
    ph = []
    for ent in doc.ents:
        if len(ent.text)>0 and ent.label_ in entity_filters:
            ph.append({
                'text':  ent.text,
                'kind':  ent.label_,
                'tag':   ent.label_,
                'start': ent.start_char,
                'end':   ent.end_char,
                'length':   len(ent.text)
            })
            #print(idx,ent.label_,ent.start,ent.end)
    placeholders.append(ph)
print(placeholders[1])

[{'text': 'Philip Pirrip', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 358, 'end': 371, 'length': 13}, {'text': 'Georgiana', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 403, 'end': 412, 'length': 9}, {'text': 'Alexander', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 463, 'end': 472, 'length': 9}, {'text': 'Bartholomew', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 474, 'end': 485, 'length': 11}, {'text': 'Abraham', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 487, 'end': 494, 'length': 7}, {'text': 'Tobias', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 496, 'end': 502, 'length': 6}, {'text': 'Roger', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 508, 'end': 513, 'length': 5}, {'text': 'Pip', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 949, 'end': 952, 'length': 3}]


In [18]:
# Normalize and group entities (this is just an example and isn't used in the game, but could be)
uniques = set()
for ph in placeholders[1]:
    uniques.add(ph['text'].lower())
print(uniques)

{'roger', 'abraham', 'pip', 'georgiana', 'bartholomew', 'philip pirrip', 'alexander', 'tobias'}
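
If the grouping were used in the game, one option is to count how often each normalized name occurs, so that every mention of the same character could later receive the same replacement. The sketch below uses a made-up list of mentions and a hypothetical `normalize` helper.

```python
from collections import Counter

def normalize(name):
    """Collapse runs of whitespace and lowercase the name."""
    return " ".join(name.split()).lower()

# Made-up mentions; in the notebook these would be the ph['text'] values.
mentions = ["Pip", "pip", "Philip Pirrip", " Joe ", "Joe", "PIP"]

counts = Counter(normalize(m) for m in mentions)
print(counts.most_common())
```

Lowercasing alone does not merge different spellings of the same person ("Pip" and "Philip Pirrip" stay separate), so this is only a first step.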


### Getting some Parts-of-Speech

We're going to also get replacements for nouns, verbs, and adjectives.

In [0]:
pos_filters = {'NOUN', 'ADJ', 'VERB'}
tag_filters = {'JJS', 'JJR', 'NN', 'NNS', 'VB', 'VBD', 'VBG'}
for idx,doc in enumerate(docs):
    nouns = []
    verbs = []
    adjs  = []
    for tok in doc:
        if len(tok.text)>0 and tok.pos_ in pos_filters and tok.tag_ in tag_filters:
            obj = {
                'text':  tok.text,
                'kind':  tok.pos_,
                'tag':   tok.tag_,
                'start': tok.idx,
                'end':   tok.idx+len(tok.text),
                'length':   len(tok.text)
            }
            if tok.pos_ == 'NOUN':
                nouns.append(obj)
            if tok.pos_ == 'VERB':
                verbs.append(obj)
            if tok.pos_ == 'ADJ':
                adjs.append(obj)
            #print(idx,tok.pos_,tok.idx,tok.idx+len(tok.text))
    nouns.sort(key=itemgetter('length'), reverse=True)
    verbs.sort(key=itemgetter('length'), reverse=True)
    adjs.sort(key=itemgetter('length'), reverse=True)
    placeholders[idx] += nouns[:8] + verbs[:8] + adjs[:8]
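
The cell above keeps at most the eight longest nouns, verbs, and adjectives per paragraph. The same selection can be sketched in a more general form on hand-made records; `longest_per_kind` is a hypothetical helper, and the sample records are invented for the illustration.

```python
from operator import itemgetter

def longest_per_kind(records, limit):
    """Keep at most `limit` of the longest records for each 'kind'."""
    by_kind = {}
    for rec in records:
        by_kind.setdefault(rec["kind"], []).append(rec)
    kept = []
    for kind_records in by_kind.values():
        # The sort is stable, so equally long words keep their text order.
        kind_records.sort(key=itemgetter("length"), reverse=True)
        kept.extend(kind_records[:limit])
    return kept

# Invented records in the same shape as the notebook's placeholder dicts.
records = [
    {"text": "impression", "kind": "NOUN", "length": 10},
    {"text": "sea", "kind": "NOUN", "length": 3},
    {"text": "churchyard", "kind": "NOUN", "length": 10},
    {"text": "found", "kind": "VERB", "length": 5},
    {"text": "were", "kind": "VERB", "length": 4},
]
kept = longest_per_kind(records, limit=2)
print([r["text"] for r in kept])
```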

### Seeing what our placeholders look like

These will be queried for replacements by the players (that's us!).

In [20]:
# Put each paragraph's placeholders back into reading order
for placeholder in placeholders:
    placeholder.sort(key=itemgetter('start'))
for ph in placeholders[1]:
    print(ph)

{'text': 'impression', 'kind': 'NOUN', 'tag': 'NN', 'start': 130, 'end': 140, 'length': 10}
{'text': 'identity', 'kind': 'NOUN', 'tag': 'NN', 'start': 148, 'end': 156, 'length': 8}
{'text': 'have', 'kind': 'VERB', 'tag': 'VB', 'start': 182, 'end': 186, 'length': 4}
{'text': 'afternoon', 'kind': 'NOUN', 'tag': 'NN', 'start': 218, 'end': 227, 'length': 9}
{'text': 'found', 'kind': 'VERB', 'tag': 'VBD', 'start': 262, 'end': 267, 'length': 5}
{'text': 'churchyard', 'kind': 'NOUN', 'tag': 'NN', 'start': 337, 'end': 347, 'length': 10}
{'text': 'Philip Pirrip', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 358, 'end': 371, 'length': 13}
{'text': 'Georgiana', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 403, 'end': 412, 'length': 9}
{'text': 'were', 'kind': 'VERB', 'tag': 'VBD', 'start': 432, 'end': 436, 'length': 4}
{'text': 'Alexander', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 463, 'end': 472, 'length': 9}
{'text': 'Bartholomew', 'kind': 'PERSON', 'tag': 'PERSON', 'start': 474, 'end': 485, '

### A sample document

Let's take one paragraph (from here on, a "lib") and see what it looks like when we insert blank spaces into the text for our placeholders.

In [21]:
lib = docs[1].text
print(lib)

Ours was the marsh country, down by the river, within, as the river wound, twenty miles of the sea. My first most vivid and broad impression of the identity of things seems to me to have been gained on a memorable raw afternoon towards evening. At such a time I found out for certain that this bleak place overgrown with nettles was the churchyard; and that Philip Pirrip, late of this parish, and also Georgiana wife of the above, were dead and buried; and that Alexander, Bartholomew, Abraham, Tobias, and Roger, infant children of the aforesaid, were also dead and buried; and that the dark flat wilderness beyond the churchyard, intersected with dikes and mounds and gates, with scattered cattle feeding on it, was the marshes; and that the low leaden line beyond was the river; and that the distant savage lair from which the wind was rushing was the sea; and that the small bundle of shivers growing afraid of it all and beginning to cry, was Pip.


In [0]:
for ph in placeholders[1]:
    lib = lib.replace(
        ph['text'],
        '_' * len(ph['text'])
    )

In [23]:
lib

'Ours was the marsh country, down by the river, within, as the river wound, twenty miles of the sea. My first most vivid and broad __________ of the ________ of things seems to me to ____ been gained on a memorable raw _________ towards evening. At such a time I _____ out for certain that this bleak place overgrown with nettles was the __________; and that _____________, late of this parish, and also _________ wife of the above, ____ dead and buried; and that _________, ___________, _______, ______, and _____, infant ________ of the _________, ____ also dead and buried; and that the dark flat __________ beyond the __________, intersected with dikes and mounds and gates, with scattered cattle _______ on it, was the marshes; and that the low leaden line beyond was the river; and that the distant savage lair from which the wind was _______ was the sea; and that the small bundle of shivers _______ afraid of it all and _________ to cry, was ___.'
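
Note that `str.replace` blanks every occurrence of a placeholder's text, including occurrences inside longer words. Since each placeholder already stores `start` and `end` character offsets, one alternative is to blank only that span. A minimal sketch, with a hypothetical `blank_by_offsets` helper and a made-up sentence:

```python
def blank_by_offsets(text, placeholders):
    """Replace only the characters in each [start, end) span with underscores."""
    chars = list(text)
    for ph in placeholders:
        for i in range(ph["start"], ph["end"]):
            chars[i] = "_"
    return "".join(chars)

# Made-up sentence: "pan" occurs on its own and inside "pancake".
text = "I found the pan and the pancake."
placeholders_demo = [{"text": "pan", "start": 12, "end": 15}]

by_replace = text.replace("pan", "_" * len("pan"))
by_offsets = blank_by_offsets(text, placeholders_demo)
print(by_replace)
print(by_offsets)
```

Whether repeated words should all be blanked is a design choice; the offset version simply makes it explicit which occurrence each placeholder refers to.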

In [24]:
number = random.randint(2,len(placeholders)-1)
lib = docs[number].text
for ph in placeholders[number]:
    lib = lib.replace(
        ph['text'],
        '_' * len(ph['text'])
    )
print('Bad-Lib #',number)
print('------------------------')
print(lib)

Bad-Lib # 89
------------------------
What a doleful night! How anxious, how dismal, how long! There was an inhospitable smell in the room, of cold soot and hot dust; and, as I looked up into the _______ of the tester over my head, I _______ what a number of blue-bottle flies from the ________', and _______ from the market, and grubs from the _______, must be _______ on up there, lying by for next summer. This led me to _________ whether any of them ever _______ down, and then I _______ that I felt light falls on my face, a disagreeable turn of _______, __________ other and more objectionable __________ up my back. When I had lain awake a little while, those extraordinary voices with which silence teems began to make themselves audible. The closet _________, the _________ sighed, the little washing-stand ticked, and one guitar-string played occasionally in the chest of drawers. At about the same time, the eyes on the wall ________ a new __________, and in every one of those staring rou

## Bad Libs is ready!

All we need is some user interface stuff, and we're good to go!

First we'll make some instructions based on part of speech tags:

In [25]:
tag_descriptions = {
    'PERSON': 'Person',
    'GPE': 'Place',
    'JJS': 'Adjective ending in "est"', 
    'JJR': 'Adjective ending in "er"', 
    'NN': 'Singular noun', 
    'NNS': 'Plural noun', 
    'VB': 'Verb in base form', 
    'VBD': 'Verb ending in "ed"', 
    'VBG': 'Verb ending in "ing"'
}

#Pick a paragraph: uncomment the next line for a random one, or use a fixed index
#number = random.randint(2,len(placeholders)-1)
number = 98
print(number, len(docs[number].text))

98 352


In [26]:
lib = docs[number].text
for ph in placeholders[number]:
    tag = ph['tag']
    print('"", #', tag_descriptions[tag])

"", # Person
"", # Verb ending in "ed"
"", # Place
"", # Verb ending in "ed"
"", # Singular noun
"", # Verb ending in "ed"
"", # Plural noun
"", # Verb ending in "ed"
"", # Singular noun
"", # Verb ending in "ing"
"", # Adjective ending in "est"
"", # Plural noun
"", # Plural noun
"", # Verb ending in "ed"
"", # Plural noun
"", # Verb ending in "ed"
"", # Verb ending in "ing"
"", # Singular noun
"", # Singular noun


In [0]:
answers = [
    "Julia Roberts", # Person
    "shined", # Verb ending in "ed"
    "Rochester", # Place
    "electrocuted", # Verb ending in "ed"
    "worm", # Singular noun
    "fiddled", # Verb ending in "ed"
    "butterflies", # Plural noun
    "trapped", # Verb ending in "ed"
    "pigeon", # Singular noun
    "singing", # Verb ending in "ing"
    "luckiest", # Adjective ending in "est"
    "horses", # Plural noun
    "shoehorns", # Plural noun
    "witnessed", # Verb ending in "ed"
    "blankets", # Plural noun
    "tested", # Verb ending in "ed"
    "leaping", # Verb ending in "ing"
    "pillow", # Singular noun
    "refridgerator", # Singular noun
]
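
Instead of typing the answers into a list by hand, the prompts could be asked one at a time. The sketch below is one way to do that; `collect_answers` is a hypothetical helper, and its `ask` parameter defaults to the built-in `input` so that a canned function can be swapped in for a demonstration.

```python
def collect_answers(placeholders, descriptions, ask=input):
    """Ask for one replacement per placeholder, in order."""
    answers = []
    for ph in placeholders:
        # Fall back to the raw tag if no friendly description exists.
        label = descriptions.get(ph["tag"], ph["tag"])
        answers.append(ask(label + ": ").strip())
    return answers

# Small made-up inputs in the same shape as the notebook's data.
demo_placeholders = [{"tag": "PERSON"}, {"tag": "VBD"}, {"tag": "NN"}]
demo_descriptions = {
    "PERSON": "Person",
    "VBD": 'Verb ending in "ed"',
    "NN": "Singular noun",
}

# Canned replies stand in for a player typing at the keyboard.
canned = iter(["Julia Roberts", "shined ", "worm"])
demo_answers = collect_answers(
    demo_placeholders, demo_descriptions, ask=lambda prompt: next(canned)
)
print(demo_answers)
```

In the notebook, calling `collect_answers(placeholders[number], tag_descriptions)` with the default `ask` would prompt the player interactively.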

In [29]:
badlib = docs[number].text
for idx,ph in enumerate(placeholders[number]):
    badlib = badlib.replace(
        ph['text'],
        answers[idx]
    )
print(badlib)

When Julia Roberts shined been down to Rochester and electrocuted his worm, he fiddled back to me at our butterflies, and trapped the pigeon to singing on me. He was the luckiest of horses, and at stated shoehorns witnessed off the blankets, and tested them in the leaping pillow that was kept ready, and put them on again, with a patient refridgerator that I was deeply grateful for.
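
Chained `str.replace` calls can also rewrite an answer that was inserted earlier, or replace a word the placeholder did not point at. A sketch of filling by the stored offsets instead is shown below. `fill_by_offsets` is a hypothetical helper, the sentence is made up, and the sketch assumes the placeholder spans do not overlap.

```python
def fill_by_offsets(text, placeholders, answers):
    """Build the result left to right, copying untouched text between spans.

    Assumes the [start, end) spans do not overlap.
    """
    pieces = []
    cursor = 0
    pairs = sorted(zip(placeholders, answers), key=lambda pair: pair[0]["start"])
    for ph, answer in pairs:
        pieces.append(text[cursor:ph["start"]])  # text before this span
        pieces.append(answer)                    # the player's word
        cursor = ph["end"]
    pieces.append(text[cursor:])                 # text after the last span
    return "".join(pieces)

# Made-up sentence with two placeholders.
text = "Joe went to town and Joe sang."
slots = [
    {"text": "Joe", "start": 0, "end": 3},
    {"text": "town", "start": 12, "end": 16},
]
answers_demo = ["town", "Paris"]

# Chained replace, as in the cell above, for comparison.
chained = text
for ph, answer in zip(slots, answers_demo):
    chained = chained.replace(ph["text"], answer)

filled = fill_by_offsets(text, slots, answers_demo)
print(chained)
print(filled)
```

Here the chained version turns the first answer ("town") into "Paris" as well, while the offset version leaves each answer where the player put it.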
