# Sentiment analysis with TFLearn

In this notebook, we'll continue Andrew Trask's work by building a network for sentiment analysis on the movie review data. Instead of a network written with Numpy, we'll be using [TFLearn](http://tflearn.org/), a high-level library built on top of TensorFlow. TFLearn makes it simpler to build networks just by defining the layers. It takes care of most of the details for you.

We'll start off by importing all the modules we'll need, then load and prepare the data.

In [1]:
import pandas as pd
import numpy as np
import tensorflow as tf
import tflearn
from tflearn.data_utils import to_categorical

## Preparing the data

Following along with Andrew, our goal here is to convert our reviews into word vectors. The word vectors will have elements representing words in the total vocabulary. If the second position represents the word 'the', for each review we'll count up the number of times 'the' appears in the text and set the second position to that count. I'll show you examples as we build the input data from the reviews data. Check out Andrew's notebook and video for more about this.
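
To make the idea concrete, here is a minimal sketch using a made-up five-word vocabulary. The names `toy_vocab` and `toy_review` are illustrative only and are not part of the review data.

```python
# Toy vocabulary; the second position (index 1) stands for 'the'
toy_vocab = ['movie', 'the', 'was', 'great', 'bad']
toy_review = 'the movie was great and the acting was great'

# Count how often each vocabulary word occurs in the review
words = toy_review.split(' ')
toy_vector = [words.count(word) for word in toy_vocab]
print(toy_vector)
```

Words that are not in the vocabulary ('and', 'acting') simply do not contribute to the vector.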

### Read the data

Use the pandas library to read the reviews and positive/negative labels from comma-separated files. The data we're using has already been preprocessed a bit, and we know it uses only lower case characters. If we were working from raw data, where we didn't know it was all lower case, we would want to add a step here to convert it. That way we treat different variations of the same word, like `The`, `the`, and `THE`, all the same way.

In [2]:
reviews = pd.read_csv('reviews.txt', header=None)
labels = pd.read_csv('labels.txt', header=None)

### Counting word frequency

To start off we'll need to count how often each word appears in the data. We'll use this count to create a vocabulary we'll use to encode the review data. This resulting count is known as a [bag of words](https://en.wikipedia.org/wiki/Bag-of-words_model). We'll use it to select our vocabulary and build the word vectors. You should have seen how to do this in Andrew's lesson. Try to implement it here using the [Counter class](https://docs.python.org/2/library/collections.html#collections.Counter).

> **Exercise:** Create the bag of words from the reviews data and assign it to `total_counts`. The reviews are stored in the `reviews` [Pandas DataFrame](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html). If you want the reviews as a Numpy array, use `reviews.values`. You can iterate through the rows in the DataFrame with `for idx, row in reviews.iterrows():` ([documentation](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iterrows.html)). When you break up the reviews into words, use `.split(' ')` instead of `.split()` so your results match ours.
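
The difference between the two calls is easy to see on a short string. This is a small sketch with a made-up `text` value; it also shows how an empty string can end up being counted as a "word" when you split on single spaces.

```python
text = 'a great  movie '   # note the double space and the trailing space

on_space = text.split(' ')   # splits on every single space, keeping empty strings
on_any = text.split()        # splits on runs of whitespace, dropping empty strings

print(on_space)
print(on_any)
```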

In [11]:
from collections import Counter

total_counts = Counter()

for idx, row in reviews.iterrows():
    for review in row.values:
        for word in review.split(' '):
            total_counts[word] += 1

print("Total words in data set: ", len(total_counts))

Total words in data set:  74074


Let's keep the 10000 most frequent words. As Andrew noted, most of the words in the vocabulary are rarely used, so they will have little effect on our predictions. Below, we'll sort the words in `total_counts` by their count and keep the 10000 most frequent ones as `vocab`.

In [12]:
vocab = sorted(total_counts, key=total_counts.get, reverse=True)[:10000]
print(vocab[:60])

['', 'the', '.', 'and', 'a', 'of', 'to', 'is', 'br', 'it', 'in', 'i', 'this', 'that', 's', 'was', 'as', 'for', 'with', 'movie', 'but', 'film', 'you', 'on', 't', 'not', 'he', 'are', 'his', 'have', 'be', 'one', 'all', 'at', 'they', 'by', 'an', 'who', 'so', 'from', 'like', 'there', 'her', 'or', 'just', 'about', 'out', 'if', 'has', 'what', 'some', 'good', 'can', 'more', 'she', 'when', 'very', 'up', 'time', 'no']
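
As an aside, `Counter` also provides a `most_common` method that returns the top entries directly. The sketch below compares it with the sorting approach used above on toy counts; the words and counts are made up, and there are no ties among them.

```python
from collections import Counter

toy_counts = Counter({'the': 5, 'movie': 3, 'great': 2, 'zebra': 1})

# Same approach as above: sort the keys by count, then slice
top_sorted = sorted(toy_counts, key=toy_counts.get, reverse=True)[:2]

# Alternative: most_common returns (word, count) pairs, highest count first
top_common = [word for word, _ in toy_counts.most_common(2)]

print(top_sorted, top_common)
```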


What's the last word in our vocabulary? We can use this to judge if 10000 is too few. If the last word is pretty common, we probably need to keep more words.

In [13]:
print(vocab[-1], ': ', total_counts[vocab[-1]])

assassins :  30


The last word in our vocabulary appears only 30 times across all 25000 reviews, so it can show up in at most 30 of them. I think it's fair to say this is a tiny proportion of reviews. We are probably fine with this number of words.

**Note:** When you run this cell, you may see a different word from the one shown above, but it will also have the value `30`. That's because there are many words tied for that number of counts, and the `Counter` class does not guarantee which one will be returned in the case of a tie.

Now for each review in the data, we'll make a word vector. First we need a mapping from word to index, which is easy to build with a dictionary comprehension.

> **Exercise:** Create a dictionary called `word2idx` that maps each word in the vocabulary to an index. The first word in `vocab` has index `0`, the second word has index `1`, and so on.

In [29]:
# Map each word to its position in `vocab` (word -> index)
word2idx = {word: index for index, word in enumerate(vocab)}

# Show the first 10 entries as "index word"; index 0 is the empty string
for word in vocab[:10]:
    print(word2idx[word], word)

0 
1 the
2 .
3 and
4 a
5 of
6 to
7 is
8 br
9 it


### Text to vector function

Now we can write a function that converts some text to a word vector. The function will take a string of words as input and return a vector with the words counted up. Here's the general algorithm to do this:

* Initialize the word vector with [np.zeros](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html). It should be the length of the vocabulary.
* Split the input string of text into a list of words with `.split(' ')`. Again, if you call `.split()` instead, you'll get slightly different results than what we show here.
* For each word in that list, increment the element at the index associated with that word, which you get from `word2idx`.

**Note:** Since not every word is in the vocabulary, you'll get a key error if you look up one of the missing words directly. You can use the `.get` method of the `word2idx` dictionary to specify a default value that is returned when the key is missing. For example, `word2idx.get(word, None)` returns `None` if `word` doesn't exist in the dictionary.
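
Here is a minimal sketch of that lookup on a made-up two-word dictionary (`toy_word2idx` is an illustrative name, not the real mapping):

```python
toy_word2idx = {'the': 0, 'movie': 1}

found = toy_word2idx.get('movie', None)     # word is in the dictionary
missing = toy_word2idx.get('zebra', None)   # word is missing, so the default comes back

print(found, missing)
```

Because index `0` is a valid position, compare the result against `None` rather than relying on its truthiness.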

In [51]:
def text_to_vector(text):
    # One slot per vocabulary word, all counts starting at zero
    word_vector = np.zeros(len(vocab), dtype=np.int_)

    for word in text.split(' '):
        # Words outside the vocabulary return None and are skipped
        index = word2idx.get(word, None)
        if index is not None:
            word_vector[index] += 1
    return word_vector

If you do this right, the following code should return

```
text_to_vector('The tea is for a party to celebrate '
               'the movie so she has no time for a cake')[:65]

array([0, 1, 0, 0, 2, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0])
```

In [52]:
text_to_vector('The tea is for a party to celebrate '
               'the movie so she has no time for a cake')[:65]

array([0, 1, 0, 0, 2, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0])

Now, run through our entire review data set and convert each review to a word vector.

In [None]:
word_vectors = np.zeros((len(reviews), len(vocab)), dtype=np.int_)
for ii, (_, text) in enumerate(reviews.iterrows()):
    word_vectors[ii] = text_to_vector(text[0])

In [None]:
# Printing out the first 5 word vectors
word_vectors[:5, :23]

### Train, Validation, Test sets

Now that we have the word vectors, we're ready to split our data into train, validation, and test sets. Remember that we train on the train data, use the validation data to set the hyperparameters, and at the very end measure the network performance on the test data. Here we're using the function `to_categorical` from TFLearn to reshape the target data so that we'll have two output units and can classify with a softmax activation function. We won't actually create the validation set here; TFLearn will do that for us later.
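
The two steps, shuffling the record indices before splitting and turning each label into a two-element target, can be sketched in plain Python. The `one_hot` helper below is a hypothetical stand-in for what `to_categorical` is used for here, and the labels are made up.

```python
import random

labels_int = [1, 0, 0, 1, 1, 0, 1, 0, 1, 0]   # 1 = positive, 0 = negative

# Shuffle the record indices, then cut them at 90%
indices = list(range(len(labels_int)))
random.seed(0)                      # fixed seed only so the sketch is repeatable
random.shuffle(indices)
cut = int(len(indices) * 0.9)
train_idx, test_idx = indices[:cut], indices[cut:]

# One-hot encode: class 0 -> [1, 0], class 1 -> [0, 1]
def one_hot(label, n_classes=2):
    return [1 if k == label else 0 for k in range(n_classes)]

train_targets = [one_hot(labels_int[i]) for i in train_idx]
print(len(train_idx), len(test_idx))
print(one_hot(0), one_hot(1))
```

With this encoding, the second column of the targets is `1` for positive reviews and the first column is `1` for negative ones.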

In [None]:
Y = (labels=='positive').astype(np.int_)
records = len(labels)

shuffle = np.arange(records)
np.random.shuffle(shuffle)
train_fraction = 0.9  # fraction of records used for training; the rest is the test set
split_idx = int(records*train_fraction)

train_split, test_split = shuffle[:split_idx], shuffle[split_idx:]
# Take column 0 so the labels are passed to to_categorical as a 1-D array
trainX, trainY = word_vectors[train_split,:], to_categorical(Y.values[train_split, 0], 2)
testX, testY = word_vectors[test_split,:], to_categorical(Y.values[test_split, 0], 2)

In [None]:
trainY

## Building the network

[TFLearn](http://tflearn.org/) lets you build the network by [defining the layers](http://tflearn.org/layers/core/). 

### Input layer

For the input layer, you just need to tell it how many units you have. For example, 

```
net = tflearn.input_data([None, 100])
```

would create a network with 100 input units. The first element in the list, `None` in this case, sets the batch size. Setting it to `None` here leaves it at the default batch size.

The number of inputs to your network needs to match the size of your data. For this example, we're using 10000 element long vectors to encode our input data, so we need 10000 input units.


### Adding layers

To add new hidden layers, you use 

```
net = tflearn.fully_connected(net, n_units, activation='ReLU')
```

This adds a fully connected layer where every unit in the previous layer is connected to every unit in this layer. The first argument `net` is the network you created in the `tflearn.input_data` call. It's telling the network to use the output of the previous layer as the input to this layer. You can set the number of units in the layer with `n_units`, and set the activation function with the `activation` keyword. You can keep adding layers to your network by repeatedly calling `net = tflearn.fully_connected(net, n_units)`.

### Output layer

The last layer you add is used as the output layer. Therefore, you need to set the number of units to match the target data. In this case we are predicting two classes, positive or negative sentiment. You also need to set the activation function so it's appropriate for your model. Again, we're trying to predict if some input data belongs to one of two classes, so we should use softmax.

```
net = tflearn.fully_connected(net, 2, activation='softmax')
```
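
A minimal sketch of what softmax does to the two output units, written in plain Python rather than TFLearn. The `softmax` helper and the scores are illustrative assumptions.

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability; the result is unchanged
    shifted = [s - max(scores) for s in scores]
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 3.0])   # made-up scores for [negative, positive]
print(probs, sum(probs))
```

The outputs are positive and sum to one, so they can be read as the probabilities of the two classes.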

### Training
To set how you train the network, use 

```
net = tflearn.regression(net, optimizer='sgd', learning_rate=0.1, loss='categorical_crossentropy')
```

Again, this is passing in the network you've been building. The keywords: 

* `optimizer` sets the training method, here stochastic gradient descent
* `learning_rate` is the learning rate
* `loss` determines how the network error is calculated, in this example with the categorical cross-entropy
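
To give a feel for what the categorical cross-entropy measures, here is a minimal pure-Python sketch for a single example. The function name and the probabilities are made up for illustration, and TFLearn's own implementation may differ in details such as clipping and averaging over a batch.

```python
import math

def categorical_crossentropy(target, predicted):
    # Sum of -t * log(p) over the classes; only the true class contributes
    return -sum(t * math.log(p) for t, p in zip(target, predicted))

target = [0, 1]   # one-hot label for "positive"
confident = categorical_crossentropy(target, [0.1, 0.9])
unsure = categorical_crossentropy(target, [0.6, 0.4])
print(confident, unsure)
```

The loss is small when the network puts most of the probability on the correct class and grows as that probability shrinks.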

Finally you put all this together to create the model with `tflearn.DNN(net)`. So it ends up looking something like 

```
net = tflearn.input_data([None, 10])                          # Input
net = tflearn.fully_connected(net, 5, activation='ReLU')      # Hidden
net = tflearn.fully_connected(net, 2, activation='softmax')   # Output
net = tflearn.regression(net, optimizer='sgd', learning_rate=0.1, loss='categorical_crossentropy')
model = tflearn.DNN(net)
```

> **Exercise:** Below in the `build_model()` function, you'll put together the network using TFLearn. You get to choose how many layers to use, how many hidden units, etc.

In [None]:
# Network building
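def build_model():
    # This resets all parameters and variables, leave this here
    tf.reset_default_graph()
    
    #### Your code: one possible network, layer sizes are untuned starting points ####
    net = tflearn.input_data([None, len(vocab)])                  # Input: one unit per vocabulary word
    net = tflearn.fully_connected(net, 200, activation='ReLU')    # Hidden
    net = tflearn.fully_connected(net, 25, activation='ReLU')     # Hidden
    net = tflearn.fully_connected(net, 2, activation='softmax')   # Output
    net = tflearn.regression(net, optimizer='sgd', learning_rate=0.1, loss='categorical_crossentropy')
    
    model = tflearn.DNN(net)
    return model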

## Initializing the model

Next we need to call the `build_model()` function to actually build the model. In my solution I haven't included any arguments to the function, but you can add arguments so you can change parameters in the model if you want.

> **Note:** You might get a bunch of warnings here. TFLearn uses a lot of deprecated code in TensorFlow. Hopefully it gets updated to the new TensorFlow version soon.

In [None]:
model = build_model()

## Training the network

Now that we've constructed the network, saved as the variable `model`, we can fit it to the data. Here we use the `model.fit` method. You pass in the training features `trainX` and the training targets `trainY`. Below I set `validation_set=0.1`, which reserves 10% of the training data as the validation set. You can also set the batch size and number of epochs with the `batch_size` and `n_epoch` keywords, respectively. Below is the code to fit the network to our word vectors.

You can rerun `model.fit` to train the network further if you think you can increase the validation accuracy. Remember, all hyperparameter adjustments must be done using the validation set. **Only use the test set after you're completely done training the network.**
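
As a rough sense of scale, the arithmetic below works out how many reviews land in each set. It assumes TFLearn holds out the given fraction of the rows passed to `fit`; the variable names are illustrative.

```python
n_reviews = 25000                      # size of the review data set
n_train = int(n_reviews * 0.9)         # rows in trainX after the 90/10 split above
n_validation = int(n_train * 0.1)      # held out by validation_set=0.1
n_fit = n_train - n_validation         # rows actually used for weight updates
n_test = n_reviews - n_train           # rows kept aside for the final test

print(n_train, n_validation, n_fit, n_test)
```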

In [None]:
# Training
model.fit(trainX, trainY, validation_set=0.1, show_metric=True, batch_size=128, n_epoch=10)

## Testing

After you're satisfied with your hyperparameters, you can run the network on the test set to measure its performance. Remember, *only do this after finalizing the hyperparameters*.

In [None]:
predictions = (np.array(model.predict(testX))[:,0] >= 0.5).astype(np.int_)
test_accuracy = np.mean(predictions == testY[:,0], axis=0)
print("Test accuracy: ", test_accuracy)
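
The accuracy calculation above compares the first column of the predictions with the first column of the targets. The sketch below repeats the same logic in plain Python on made-up probabilities, assuming that column 0 corresponds to the negative class, as in the one-hot targets built earlier.

```python
# Made-up softmax outputs, one [P(negative), P(positive)] row per review
probs = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.1, 0.9]]
# Matching one-hot targets
targets = [[1, 0], [0, 1], [0, 1], [0, 1]]

# Column 0 is 1 for a negative review, so threshold P(negative) at 0.5
predicted_negative = [int(row[0] >= 0.5) for row in probs]
actual_negative = [row[0] for row in targets]

# Accuracy is the fraction of reviews where prediction and target agree
matches = [p == a for p, a in zip(predicted_negative, actual_negative)]
accuracy = sum(matches) / len(matches)
print(accuracy)
```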

## Try out your own text!

In [None]:
# Helper function that uses your model to predict sentiment
def test_sentence(sentence):
    positive_prob = model.predict([text_to_vector(sentence.lower())])[0][1]
    print('Sentence: {}'.format(sentence))
    print('P(positive) = {:.3f} :'.format(positive_prob), 
          'Positive' if positive_prob > 0.5 else 'Negative')

In [None]:
sentence = "Moonlight is by far the best movie of 2016."
test_sentence(sentence)

sentence = "It's amazing anyone could be talented enough to make something this spectacularly awful"
test_sentence(sentence)