# Harry Potter Stretches His Legs
A [scathing review](https://www.latimes.com/archives/la-xpm-2003-sep-19-oe-bloom19-story.html) by Harold Bloom of Harry Potter and Stephen King has given rise to a popular copy-pasted [meme](https://www.reddit.com/r/copypasta/comments/5rissv/harry_potter_plebeian_pasta/), in which the following is claimed:

>I went to the Yale bookstore and bought and read a copy of “Harry Potter and the Sorcerer’s Stone.” I suffered a great deal in the process. The writing was dreadful; the book was terrible. As I read, I noticed that every time a character went for a walk, **the author wrote instead that the character “stretched his legs.”** I began marking on the back of an envelope every time that phrase was repeated. I stopped only after I had marked the envelope **several dozen times.**

In fact, Harold Bloom misquotes his earlier [review](https://www.wsj.com/articles/SB963270836801555352): there he had counted not dozens of occurrences of the phrase "stretched his legs", but dozens of clichés akin to it. Regardless, let's investigate this claim: how many times does the phrase really appear in the Harry Potter books?

## Reading Harry Potter
Let's not inflict more pain on Harold Bloom and let Python handle the reading. We legally purchased the books as .epub files, which were uncompressed into folders containing one xhtml file per chapter. We can read the content of the books by parsing these xhtml files: we look for the 'p' tags, strip a few formatting characters (soft hyphens and zero-width spaces), and split the text roughly into sentences on the period delimiter.

In [369]:
from os import listdir
from os.path import isdir, join
from bs4 import BeautifulSoup
from unicodedata import normalize

# the folder names for each book
book_folders = ['hp{}'.format(n) for n in range(1,8)]

# a list of lists: one list of sentences per book
all_text = []

# loop through each book folder
for book in book_folders:
    print('Reading book {}...'.format(book))
    book_sentences = []
    chapter_count = 0
    # loop through each chapter
    for chapter in listdir(book):
        if chapter.endswith('.xhtml'):
            with open(join(book,chapter), encoding='utf-8') as f:
                chapter_count+=1
                content = f.read()
                # parse the content using BeautifulSoup
                parsed_content = BeautifulSoup(content, 'html.parser')
                # clean up each sentence
                for paragraph in parsed_content.find_all('p'):
                    paragraph_text=paragraph.getText(strip=True)
                    paragraph_sentences = paragraph_text.replace('\xad', '').replace('\u200b', '').split('. ')
                    for sentence in paragraph_sentences:
                        if len(sentence.strip()) >1:
                            book_sentences.append(sentence)
    print('   {} chapters read, total {} sentences.'.format(chapter_count, len(book_sentences)))
    all_text.append(book_sentences)

Reading book hp1...
   18 chapters read, total 6044 sentences.
Reading book hp2...
   15 chapters read, total 6425 sentences.
Reading book hp3...
   23 chapters read, total 8054 sentences.
Reading book hp4...
   38 chapters read, total 13382 sentences.
Reading book hp5...
   37 chapters read, total 16553 sentences.
Reading book hp6...
   31 chapters read, total 11521 sentences.
Reading book hp7...
   36 chapters read, total 12798 sentences.

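To make the cleaning step concrete, here is a minimal sketch of the same replace-and-split logic applied to a single string. The helper name `split_sentences` and the sample paragraph are made up for illustration; they are not part of the notebook above.

```python
def split_sentences(paragraph_text):
    """Mirror the cleaning step: drop soft hyphens and zero-width
    spaces, split on '. ', and discard near-empty fragments."""
    cleaned = paragraph_text.replace('\xad', '').replace('\u200b', '')
    return [s for s in cleaned.split('. ') if len(s.strip()) > 1]

# Invented paragraph containing a soft hyphen and a zero-width space
sample_paragraph = 'The owl stretch\xaded its wings. It was\u200b late. "Ready?" she asked.'

for sentence in split_sentences(sample_paragraph):
    print(repr(sentence))
```

Because the split is on `'. '` only, a sentence ending in a question mark or exclamation point stays attached to whatever follows it, and every sentence except the last one in a paragraph loses its final period. That is good enough for counting, but it is a rough split rather than real sentence segmentation.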

Let's make sure it all checks out by taking a few random sentences:

In [374]:
from random import sample
for n in range(5):
    random_book = sample(all_text,1)[0]
    print(sample(random_book,1)[0],'\n')

“Search me.” 

And at long last, Harry believed him 

“Here,” said Harry, and they placed him in a niche where a suit of armor had stood earlier 

“What’s that thing — hanging underneath?” said Ron, a slight quiver in his voice. 

Bagman didn't seem in any particular rush to spill the beans, though 



## Counting up
We are now ready to do the counting. Since there are many variations of the expression "stretching legs", we'll just look for occurrences where both "stretch" and "leg" appear in the same sentence. This casts a wide net: it should catch every variation, at the cost of some false positives that we will weed out by hand.

In [384]:
i=1
occurences = []
for book in all_text:
    print('------In book', i,'------')
    i+=1
    stretches_in_book = 0
    for sentence in book:
        lowered_sentence = sentence.lower()
        if 'stretch' in lowered_sentence and 'leg' in lowered_sentence:
            print(sentence, '\n')
            stretches_in_book+=1
            occurences.append(sentence)
    if stretches_in_book == 0:
        print('No occurences \n')

------In book 1 ------
He was in a very good mood until lunchtime, when he thought he’d stretch his legs and walk across the road to buy himself a bun from the bakery. 

------In book 2 ------
He’d probably thought it was a shame that the monster had been cooped up so long, and thought it deserved the chance to stretch its many legs; Harry could just imagine the thirteen-year-old Hagrid trying to fit a leash and collar on it 

------In book 3 ------
No occurences 

------In book 4 ------
It stretched out its legs rigidly, then did a back flip, breaking the thread and landing on the desk, where it began to cartwheel in circles 

Four fully grown, enormous, vicious-looking dragons were rearing onto their hind legs inside an enclosure fenced with thick planks of wood, roaring and snorting - torrents of fire were shooting into the dark sky from their open, fanged mouths, fifty feet above the ground on their outstretched necks 

Harry watched the dragon nearest to them teeter dangerously on

## Assessing the results
>He was in a very good mood until lunchtime, when he thought he’d **stretch his legs** and walk across the road to buy himself a bun from the bakery. 

This is a valid usage of the expression. +1 

>He’d probably thought it was a shame that the monster had been cooped up so long, and thought it deserved the chance to **stretch its many legs**; Harry could just imagine the thirteen-year-old Hagrid trying to fit a leash and collar on it 

Another good one. +1

>It **stretched out its legs** rigidly, then did a back flip, breaking the thread and landing on the desk, where it began to cartwheel in circles 

Not sure if this counts as the expression for taking a walk!

>Four fully grown, enormous, vicious-looking dragons were rearing onto their hind **leg**s inside an enclosure fenced with thick planks of wood, roaring and snorting - torrents of fire were shooting into the dark sky from their open, fanged mouths, fifty feet above the ground on their out**stretch**ed necks 

This one definitely doesn't count.

>Harry watched the dragon nearest to them teeter dangerously on its back **leg**s; its jaws **stretch**ed wide in a silent howl; its nostrils were suddenly devoid of flame, though still smoking - then, very slowly, it fell 

Neither does this one.

>“It's all right,” said Moody, sitting down and **stretching out his wooden leg** with a groan 

>He moved over to his desk, sat down, **stretched out his wooden leg** with a slight groan, and pulled out his hip flask. 

>She got up, **stretched her front legs**, and then moved aside for him to pass.

>'Ginny's had a word with us about you,' said Fred, **stretching out his legs** on the table in front of them and causing several booklets on careers with the Ministry of Magic to slide off on to the floor 

Here again, the words are used not as an expression, but in the literal sense.

>You think you've had it bad, at least you've been able to get out and about, **stretch your legs**, get into a few fights… I've been stuck inside for a month.'

Finally, another one. +1

>A boy was sitting on top of the gray blankets, his **legs stretched out** in front of him, holding a book. 

Another literal use.
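Several of the false positives above come from plain substring matching: "outstretched" contains "stretch", and "leg" can sit anywhere in the sentence. As a sketch of a tighter filter (an assumption for illustration, not what was run above), the regular expression below requires the whole word "stretch" with an optional inflection, followed within at most three words by "leg" or "legs". The names `STRETCH_LEGS` and `mentions_stretching_legs` are hypothetical, and the test strings are fragments of the sentences quoted above.

```python
import re

# Assumed pattern: "stretch", "stretches", "stretched" or "stretching" as a
# whole word, then at most three intervening words, then "leg" or "legs".
STRETCH_LEGS = re.compile(
    r"\bstretch(?:es|ed|ing)?\b(?:\W+\w+){0,3}?\W+legs?\b",
    re.IGNORECASE,
)

def mentions_stretching_legs(sentence):
    """Return True if a form of 'stretch' is closely followed by 'leg(s)'."""
    return STRETCH_LEGS.search(sentence) is not None

examples = [
    "stretch his legs and walk across the road to buy himself a bun from the bakery",
    "teeter dangerously on its back legs; its jaws stretched wide in a silent howl",
    "sitting down and stretching out his wooden leg with a groan",
    "his legs stretched out in front of him, holding a book",
]

results = [mentions_stretching_legs(s) for s in examples]
for sentence, matched in zip(examples, results):
    print(matched, '|', sentence)
```

This pattern drops the dragon sentence and the "legs stretched out" one, but it still keeps literal uses such as Moody's wooden leg. Telling the idiom apart from the literal sense still takes a human read.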

## Final results
**Total Occurrences = 11**

**Actual Occurrences of the expression = 4** (the three marked +1 above, plus the doubtful one from book 4)
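To put these counts in perspective, they can be set against the sentence totals printed while reading the books. The short calculation below uses only numbers reported above; the variable names are ours.

```python
# Sentence totals per book, copied from the output of the reading step
sentences_per_book = [6044, 6425, 8054, 13382, 16553, 11521, 12798]

total_hits = 11  # sentences containing both "stretch" and "leg"
idiom_hits = 4   # hits counted as the actual expression

total_sentences = sum(sentences_per_book)
hits_per_10k = 1e4 * total_hits / total_sentences
idiom_per_10k = 1e4 * idiom_hits / total_sentences

print('Total sentences:', total_sentences)
print('Keyword hits per 10,000 sentences: {:.2f}'.format(hits_per_10k))
print('Idiom uses per 10,000 sentences: {:.2f}'.format(idiom_per_10k))
```

Keep in mind that the sentence totals come from the rough period split, so these rates are approximate.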

## Conclusion
This joke is vastly inaccurate. Well done Slytherin, however...