# Sentiment Analysis

## Using XGBoost in SageMaker

_Deep Learning Nanodegree Program | Deployment_

---

As our first example of using Amazon's SageMaker service, we will construct a gradient-boosted tree model to predict the sentiment of a movie review. You may have seen a version of this example in a previous lesson, although it would have been done using the sklearn package. Here, we will instead use the XGBoost package as it is provided to us by Amazon.

## Instructions

Some template code has already been provided for you, and you will need to implement additional functionality to successfully complete this notebook. You will not need to modify the included code beyond what is requested. Sections that begin with '**TODO**' in the header indicate that you need to complete or implement some portion within them. Instructions will be provided for each section and the specifics of the implementation are marked in the code block with a `# TODO: ...` comment. Please be sure to read the instructions carefully!

In addition to implementing code, there may be questions for you to answer which relate to the task and your implementation. Each section where you will answer a question is preceded by a '**Question:**' header. Carefully read each question and provide your answer below the '**Answer:**' header by editing the Markdown cell.

> **Note**: Code and Markdown cells can be executed using the **Shift+Enter** keyboard shortcut. In addition, a cell can typically be edited by clicking it (double-click for Markdown cells) or by pressing **Enter** while it is highlighted.

## Step 1: Downloading the data

The dataset we are going to use is very popular among researchers in Natural Language Processing, usually referred to as the [IMDb dataset](http://ai.stanford.edu/~amaas/data/sentiment/). It consists of movie reviews from the website [imdb.com](http://www.imdb.com/), each labeled as either '**pos**itive', if the reviewer enjoyed the film, or '**neg**ative' otherwise.

> Maas, Andrew L., et al. [Learning Word Vectors for Sentiment Analysis](http://ai.stanford.edu/~amaas/data/sentiment/). In _Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies_. Association for Computational Linguistics, 2011.

We begin by using some Jupyter Notebook magic to download and extract the dataset.

In [1]:
%mkdir ../data
!wget -O ../data/aclImdb_v1.tar.gz http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
!tar -zxf ../data/aclImdb_v1.tar.gz -C ../data

--2020-09-13 14:16:54--  http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
Resolving ai.stanford.edu (ai.stanford.edu)... 171.64.68.10
Connecting to ai.stanford.edu (ai.stanford.edu)|171.64.68.10|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 84125825 (80M) [application/x-gzip]
Saving to: ‘../data/aclImdb_v1.tar.gz’


2020-09-13 14:16:57 (22.4 MB/s) - ‘../data/aclImdb_v1.tar.gz’ saved [84125825/84125825]
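
The same download-and-extract step can also be written in plain Python, which makes it easy to skip the download when the archive is already on disk. The following is a minimal sketch using only the standard library; the helper name `fetch_and_extract` and its parameters are assumptions made for illustration and are not part of the original notebook.

```python
import os
import tarfile
import urllib.request


def fetch_and_extract(url, data_dir, archive_name):
    """Download `url` into `data_dir` (unless already present) and unpack it there.

    Hypothetical helper, not part of the original notebook.
    """
    # Unlike a plain mkdir, this is a no-op when the directory already exists.
    os.makedirs(data_dir, exist_ok=True)

    archive_path = os.path.join(data_dir, archive_name)
    if not os.path.exists(archive_path):
        # Only hit the network when the archive is missing.
        urllib.request.urlretrieve(url, archive_path)

    # Equivalent of `tar -zxf <archive> -C <data_dir>`.
    with tarfile.open(archive_path, 'r:gz') as archive:
        archive.extractall(data_dir)

    return archive_path


# Usage with the URL and paths from the cell above (left commented out
# so that running this sketch does not trigger the 80 MB download):
# fetch_and_extract(
#     'http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz',
#     '../data',
#     'aclImdb_v1.tar.gz',
# )
```

Because the download is guarded by an existence check, re-running the cell after a kernel restart only repeats the extraction.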



## Step 2: Preparing the data

The data we have downloaded is split into various files, each of which contains a single review. It will be much easier going forward if we combine these individual files into two large files, one for training and one for testing.

In [9]:
import os
import glob

def read_imdb_data(data_dir='../data/aclImdb'):
    """Read the raw reviews into nested dicts keyed by data split and sentiment."""
    data = {}
    labels = {}
    
    for data_type in ['train', 'test']:
        data[data_type] = {}
        labels[data_type] = {}
        
        for sentiment in ['pos', 'neg']:
            data[data_type][sentiment] = []
            labels[data_type][sentiment] = []
            
            # Each review is stored in its own .txt file under <data_type>/<sentiment>/
            path = os.path.join(data_dir, data_type, sentiment, '*.txt')
            print("path : ", path)
            files = glob.glob(path)
            
            for f in files:
                # Read with an explicit encoding so the result does not depend on the platform default
                with open(f, encoding='utf-8') as review:
                    data[data_type][sentiment].append(review.read())
                    # Here we represent a positive review by '1' and a negative review by '0'
                    labels[data_type][sentiment].append(1 if sentiment == 'pos' else 0)
                    
            assert len(data[data_type][sentiment]) == len(labels[data_type][sentiment]), \
                    "{}/{} data size does not match labels size".format(data_type, sentiment)
                
    return data, labels
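
To see the structure this kind of reader returns without loading the full dataset, the glob-and-label pattern can be tried on a throwaway directory. The sketch below is an illustration under stated assumptions: `read_reviews` is a hypothetical compact variant of the function above, and the toy file names and review texts are made up.

```python
import glob
import os
import tempfile


def read_reviews(data_dir):
    """Hypothetical compact reader: texts and 0/1 labels per split and sentiment."""
    data, labels = {}, {}
    for split in ['train', 'test']:
        data[split], labels[split] = {}, {}
        for sentiment in ['pos', 'neg']:
            pattern = os.path.join(data_dir, split, sentiment, '*.txt')
            texts = []
            # sorted() is used only to make the toy output deterministic.
            for path in sorted(glob.glob(pattern)):
                with open(path, encoding='utf-8') as review:
                    texts.append(review.read())
            data[split][sentiment] = texts
            labels[split][sentiment] = [1 if sentiment == 'pos' else 0] * len(texts)
    return data, labels


# Build a temporary directory with the same <split>/<sentiment>/*.txt layout
# as aclImdb/ (file names and contents are invented for this example).
toy_reviews = {
    ('train', 'pos'): ['A wonderful film.', 'Loved every minute.'],
    ('train', 'neg'): ['Dull and far too long.'],
    ('test', 'pos'): ['Surprisingly good.'],
    ('test', 'neg'): ['I want my two hours back.'],
}

with tempfile.TemporaryDirectory() as toy_dir:
    for (split, sentiment), texts in toy_reviews.items():
        folder = os.path.join(toy_dir, split, sentiment)
        os.makedirs(folder)
        for i, text in enumerate(texts):
            file_path = os.path.join(folder, '{}.txt'.format(i))
            with open(file_path, 'w', encoding='utf-8') as fh:
                fh.write(text)
    toy_data, toy_labels = read_reviews(toy_dir)

print(toy_data['train']['pos'])  # ['A wonderful film.', 'Loved every minute.']
print(toy_labels['train'])       # {'pos': [1, 1], 'neg': [0]}
```

The nested layout keeps each sentiment's texts and labels aligned by index.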

In [8]:
data, labels = read_imdb_data()
print("IMDB reviews: train = {} pos / {} neg, test = {} pos / {} neg".format(
            len(data['train']['pos']), len(data['train']['neg']),
            len(data['test']['pos']), len(data['test']['neg'])))

path :  ../data/aclImdb/train/pos/*.txt
f :  ../data/aclImdb/train/neg/3725_4.txt
f :  ../data/aclImdb/train/neg/10216_3.txt
f :  ../data/aclImdb/train/n

f :  ../data/aclImdb/train/neg/6831_4.txt
f :  ../data/aclImdb/train/neg/5413_1.txt
f :  ../data/aclImdb/train/neg/9672_3.txt
f :  ../data/aclImdb/train/neg/12160_4.txt
f :  ../data/aclImdb/train/neg/1470_1.txt
f :  ../data/aclImdb/train/neg/8481_4.txt
f :  ../data/aclImdb/train/neg/6791_1.txt
f :  ../data/aclImdb/train/neg/10930_1.txt
f :  ../data/aclImdb/train/neg/5270_2.txt
f :  ../data/aclImdb/train/neg/5382_3.txt
f :  ../data/aclImdb/train/neg/1478_4.txt
f :  ../data/aclImdb/train/neg/7257_1.txt
f :  ../data/aclImdb/train/neg/10964_1.txt
f :  ../data/aclImdb/train/neg/4289_3.txt
f :  ../data/aclImdb/train/neg/2502_1.txt
f :  ../data/aclImdb/train/neg/9200_3.txt
f :  ../data/aclImdb/train/neg/242_1.txt
f :  ../data/aclImdb/train/neg/2712_2.txt
f :  ../data/aclImdb/train/neg/4634_2.txt
f :  ../data/aclImdb/train/neg/11094_1.txt
f :  ../data/aclImdb/train/neg/6857_4.txt
f :  ../data/aclImdb/train/neg/1393_2.txt
f :  ../data/aclImdb/train/neg/7272_4.txt
f :  ../data/aclImdb/train/neg/

f :  ../data/aclImdb/train/neg/9268_3.txt
f :  ../data/aclImdb/train/neg/6509_4.txt
f :  ../data/aclImdb/train/neg/5048_2.txt
f :  ../data/aclImdb/train/neg/12188_4.txt
f :  ../data/aclImdb/train/neg/206_2.txt
f :  ../data/aclImdb/train/neg/4279_2.txt
f :  ../data/aclImdb/train/neg/58_3.txt
f :  ../data/aclImdb/train/neg/421_4.txt
f :  ../data/aclImdb/train/neg/5579_2.txt
f :  ../data/aclImdb/train/neg/1000_4.txt
f :  ../data/aclImdb/train/neg/2401_3.txt
f :  ../data/aclImdb/train/neg/9805_1.txt
f :  ../data/aclImdb/train/neg/4932_4.txt
f :  ../data/aclImdb/train/neg/3308_3.txt
f :  ../data/aclImdb/train/neg/3073_4.txt
f :  ../data/aclImdb/train/neg/3542_1.txt
f :  ../data/aclImdb/train/neg/3862_4.txt
f :  ../data/aclImdb/train/neg/12393_4.txt
f :  ../data/aclImdb/train/neg/2190_2.txt
f :  ../data/aclImdb/train/neg/9045_1.txt
f :  ../data/aclImdb/train/neg/6121_2.txt
f :  ../data/aclImdb/train/neg/77_4.txt
f :  ../data/aclImdb/train/neg/4174_3.txt
f :  ../data/aclImdb/train/neg/6049_4.

f :  ../data/aclImdb/train/neg/10587_1.txt
f :  ../data/aclImdb/train/neg/279_1.txt
f :  ../data/aclImdb/train/neg/1364_4.txt
f :  ../data/aclImdb/train/neg/3724_1.txt
f :  ../data/aclImdb/train/neg/7774_3.txt
f :  ../data/aclImdb/train/neg/5463_1.txt
f :  ../data/aclImdb/train/neg/4465_2.txt
f :  ../data/aclImdb/train/neg/1456_1.txt
f :  ../data/aclImdb/train/neg/10062_1.txt
f :  ../data/aclImdb/train/neg/6619_1.txt
f :  ../data/aclImdb/train/neg/2201_3.txt
f :  ../data/aclImdb/train/neg/4927_3.txt
f :  ../data/aclImdb/train/neg/4468_1.txt
f :  ../data/aclImdb/train/neg/2893_2.txt
f :  ../data/aclImdb/train/neg/9592_4.txt
f :  ../data/aclImdb/train/neg/10977_1.txt
f :  ../data/aclImdb/train/neg/1360_4.txt
f :  ../data/aclImdb/train/neg/9897_4.txt
f :  ../data/aclImdb/train/neg/10559_3.txt
f :  ../data/aclImdb/train/neg/3053_1.txt
f :  ../data/aclImdb/train/neg/4129_1.txt
f :  ../data/aclImdb/train/neg/8113_4.txt
f :  ../data/aclImdb/train/neg/1372_1.txt
f :  ../data/aclImdb/train/neg/

f :  ../data/aclImdb/train/neg/5710_1.txt
f :  ../data/aclImdb/train/neg/7736_2.txt
f :  ../data/aclImdb/train/neg/10798_1.txt
f :  ../data/aclImdb/train/neg/2311_1.txt
f :  ../data/aclImdb/train/neg/6132_2.txt
f :  ../data/aclImdb/train/neg/10656_4.txt
f :  ../data/aclImdb/train/neg/2595_3.txt
f :  ../data/aclImdb/train/neg/10689_4.txt
f :  ../data/aclImdb/train/neg/3108_3.txt
f :  ../data/aclImdb/train/neg/1779_1.txt
f :  ../data/aclImdb/train/neg/8149_1.txt
f :  ../data/aclImdb/train/neg/2047_1.txt
f :  ../data/aclImdb/train/neg/9378_4.txt
f :  ../data/aclImdb/train/neg/5647_3.txt
f :  ../data/aclImdb/train/neg/1940_3.txt
f :  ../data/aclImdb/train/neg/1719_1.txt
f :  ../data/aclImdb/train/neg/11478_4.txt
f :  ../data/aclImdb/train/neg/1618_3.txt
f :  ../data/aclImdb/train/neg/8618_1.txt
f :  ../data/aclImdb/train/neg/5260_1.txt
f :  ../data/aclImdb/train/neg/2064_1.txt
f :  ../data/aclImdb/train/neg/3805_2.txt
f :  ../data/aclImdb/train/neg/11556_1.txt
f :  ../data/aclImdb/train/ne

f :  ../data/aclImdb/train/neg/12428_2.txt
f :  ../data/aclImdb/train/neg/308_1.txt
f :  ../data/aclImdb/train/neg/2269_1.txt
f :  ../data/aclImdb/train/neg/7158_1.txt
f :  ../data/aclImdb/train/neg/11391_2.txt
f :  ../data/aclImdb/train/neg/12489_1.txt
f :  ../data/aclImdb/train/neg/8041_1.txt
f :  ../data/aclImdb/train/neg/8144_3.txt
f :  ../data/aclImdb/train/neg/11706_3.txt
f :  ../data/aclImdb/train/neg/8470_2.txt
f :  ../data/aclImdb/train/neg/3431_4.txt
f :  ../data/aclImdb/train/neg/8894_4.txt
f :  ../data/aclImdb/train/neg/8886_3.txt
f :  ../data/aclImdb/train/neg/6628_1.txt
f :  ../data/aclImdb/train/neg/1350_1.txt
f :  ../data/aclImdb/train/neg/5782_3.txt
f :  ../data/aclImdb/train/neg/4074_2.txt
f :  ../data/aclImdb/train/neg/8360_3.txt
f :  ../data/aclImdb/train/neg/2574_3.txt
f :  ../data/aclImdb/train/neg/3922_3.txt
f :  ../data/aclImdb/train/neg/10617_1.txt
f :  ../data/aclImdb/train/neg/11882_1.txt
f :  ../data/aclImdb/train/neg/2109_4.txt
f :  ../data/aclImdb/train/ne

f :  ../data/aclImdb/train/neg/3653_1.txt
f :  ../data/aclImdb/train/neg/4158_1.txt
f :  ../data/aclImdb/train/neg/7990_4.txt
f :  ../data/aclImdb/train/neg/5936_3.txt
f :  ../data/aclImdb/train/neg/4286_1.txt
f :  ../data/aclImdb/train/neg/396_3.txt
f :  ../data/aclImdb/train/neg/11725_4.txt
f :  ../data/aclImdb/train/neg/2376_4.txt
f :  ../data/aclImdb/train/neg/11755_3.txt
f :  ../data/aclImdb/train/neg/412_4.txt
f :  ../data/aclImdb/train/neg/12203_2.txt
f :  ../data/aclImdb/train/neg/5204_1.txt
f :  ../data/aclImdb/train/neg/556_1.txt
f :  ../data/aclImdb/train/neg/12165_3.txt
f :  ../data/aclImdb/train/neg/7716_4.txt
f :  ../data/aclImdb/train/neg/5129_1.txt
f :  ../data/aclImdb/train/neg/6046_2.txt
f :  ../data/aclImdb/train/neg/9707_3.txt
f :  ../data/aclImdb/train/neg/4631_2.txt
f :  ../data/aclImdb/train/neg/772_4.txt
f :  ../data/aclImdb/train/neg/4556_2.txt
f :  ../data/aclImdb/train/neg/3681_1.txt
f :  ../data/aclImdb/train/neg/410_4.txt
f :  ../data/aclImdb/train/neg/2989

f :  ../data/aclImdb/test/pos/2054_9.txt
f :  ../data/aclImdb/test/pos/785_7.txt
f :  ../data/aclImdb/test/pos/5629_10.txt
f :  ../data/aclImdb/test/pos/5855_7.txt
f :  ../data/aclImdb/test/pos/7641_8.txt
f :  ../data/aclImdb/test/pos/12359_7.txt
f :  ../data/aclImdb/test/pos/6738_10.txt
f :  ../data/aclImdb/test/pos/5416_8.txt
f :  ../data/aclImdb/test/pos/11969_10.txt
f :  ../data/aclImdb/test/pos/5242_10.txt
f :  ../data/aclImdb/test/pos/5556_10.txt
f :  ../data/aclImdb/test/pos/9875_10.txt
f :  ../data/aclImdb/test/pos/1769_10.txt
f :  ../data/aclImdb/test/pos/11741_10.txt
f :  ../data/aclImdb/test/pos/6849_8.txt
f :  ../data/aclImdb/test/pos/2800_8.txt
f :  ../data/aclImdb/test/pos/741_7.txt
f :  ../data/aclImdb/test/pos/3355_9.txt
f :  ../data/aclImdb/test/pos/6114_10.txt
f :  ../data/aclImdb/test/pos/2622_8.txt
f :  ../data/aclImdb/test/pos/3347_8.txt
f :  ../data/aclImdb/test/pos/4109_7.txt
f :  ../data/aclImdb/test/pos/8981_9.txt
f :  ../data/aclImdb/test/pos/3929_8.txt
f :  .

f :  ../data/aclImdb/test/pos/11355_10.txt
f :  ../data/aclImdb/test/pos/9044_8.txt
f :  ../data/aclImdb/test/pos/3225_10.txt
f :  ../data/aclImdb/test/pos/11395_10.txt
f :  ../data/aclImdb/test/pos/3775_10.txt
f :  ../data/aclImdb/test/pos/11144_7.txt
f :  ../data/aclImdb/test/pos/3134_10.txt
f :  ../data/aclImdb/test/pos/2198_9.txt
f :  ../data/aclImdb/test/pos/10274_8.txt
f :  ../data/aclImdb/test/pos/2964_7.txt
f :  ../data/aclImdb/test/pos/7728_9.txt
f :  ../data/aclImdb/test/pos/4333_9.txt
f :  ../data/aclImdb/test/pos/3369_7.txt
f :  ../data/aclImdb/test/pos/1456_8.txt
f :  ../data/aclImdb/test/pos/11972_7.txt
f :  ../data/aclImdb/test/pos/2360_7.txt
f :  ../data/aclImdb/test/pos/7956_10.txt
f :  ../data/aclImdb/test/pos/8873_10.txt
f :  ../data/aclImdb/test/pos/1114_10.txt
f :  ../data/aclImdb/test/pos/5670_8.txt
f :  ../data/aclImdb/test/pos/7001_10.txt
f :  ../data/aclImdb/test/pos/7447_8.txt
f :  ../data/aclImdb/test/pos/3162_10.txt
f :  ../data/aclImdb/test/pos/11946_7.txt


f :  ../data/aclImdb/test/pos/5468_10.txt
f :  ../data/aclImdb/test/pos/7414_9.txt
f :  ../data/aclImdb/test/pos/4292_7.txt
f :  ../data/aclImdb/test/pos/460_8.txt
f :  ../data/aclImdb/test/pos/8510_8.txt
f :  ../data/aclImdb/test/pos/7999_7.txt
f :  ../data/aclImdb/test/pos/5390_7.txt
f :  ../data/aclImdb/test/pos/10718_10.txt
f :  ../data/aclImdb/test/pos/1856_7.txt
f :  ../data/aclImdb/test/pos/2968_7.txt
f :  ../data/aclImdb/test/pos/11804_10.txt
f :  ../data/aclImdb/test/pos/1667_10.txt
f :  ../data/aclImdb/test/pos/2926_7.txt
f :  ../data/aclImdb/test/pos/8539_8.txt
f :  ../data/aclImdb/test/pos/11388_10.txt
f :  ../data/aclImdb/test/pos/2264_7.txt
f :  ../data/aclImdb/test/pos/11146_7.txt
f :  ../data/aclImdb/test/pos/6705_8.txt
f :  ../data/aclImdb/test/pos/9013_10.txt
f :  ../data/aclImdb/test/pos/11918_10.txt
f :  ../data/aclImdb/test/pos/8281_9.txt
f :  ../data/aclImdb/test/pos/3104_7.txt
f :  ../data/aclImdb/test/pos/3772_8.txt
f :  ../data/aclImdb/test/pos/5214_8.txt

f :  ../data/aclImdb/test/pos/5375_7.txt
f :  ../data/aclImdb/test/pos/5451_8.txt
f :  ../data/aclImdb/test/pos/329_10.txt
f :  ../data/aclImdb/test/pos/8706_8.txt
f :  ../data/aclImdb/test/pos/5803_9.txt
f :  ../data/aclImdb/test/pos/8502_10.txt
f :  ../data/aclImdb/test/pos/5578_10.txt
f :  ../data/aclImdb/test/pos/4549_8.txt
f :  ../data/aclImdb/test/pos/6787_10.txt
f :  ../data/aclImdb/test/pos/8659_9.txt
f :  ../data/aclImdb/test/pos/5490_9.txt
f :  ../data/aclImdb/test/pos/5755_8.txt
f :  ../data/aclImdb/test/pos/9561_10.txt
f :  ../data/aclImdb/test/pos/8997_10.txt
f :  ../data/aclImdb/test/pos/10755_9.txt
f :  ../data/aclImdb/test/pos/873_8.txt
f :  ../data/aclImdb/test/pos/3147_10.txt
f :  ../data/aclImdb/test/pos/11941_7.txt
f :  ../data/aclImdb/test/pos/9385_10.txt
f :  ../data/aclImdb/test/pos/338_10.txt
f :  ../data/aclImdb/test/pos/7165_10.txt
f :  ../data/aclImdb/test/pos/5232_8.txt
f :  ../data/aclImdb/test/pos/10499_8.txt
f :  ../data/aclImdb/test/pos/6180_10.txt

f :  ../data/aclImdb/test/pos/6090_10.txt
f :  ../data/aclImdb/test/pos/5238_7.txt
f :  ../data/aclImdb/test/pos/5816_10.txt
f :  ../data/aclImdb/test/pos/4596_7.txt
f :  ../data/aclImdb/test/pos/10765_10.txt
f :  ../data/aclImdb/test/pos/5023_7.txt
f :  ../data/aclImdb/test/pos/7031_10.txt
f :  ../data/aclImdb/test/pos/12448_10.txt
f :  ../data/aclImdb/test/pos/1464_9.txt
f :  ../data/aclImdb/test/pos/4210_7.txt
f :  ../data/aclImdb/test/pos/8287_10.txt
f :  ../data/aclImdb/test/pos/5415_9.txt
f :  ../data/aclImdb/test/pos/8742_10.txt
f :  ../data/aclImdb/test/pos/6121_9.txt
f :  ../data/aclImdb/test/pos/11238_10.txt
f :  ../data/aclImdb/test/pos/7336_7.txt
f :  ../data/aclImdb/test/pos/1803_10.txt
f :  ../data/aclImdb/test/pos/12126_8.txt
f :  ../data/aclImdb/test/pos/6354_10.txt
f :  ../data/aclImdb/test/pos/11440_8.txt
f :  ../data/aclImdb/test/pos/5088_7.txt
f :  ../data/aclImdb/test/pos/10951_7.txt
f :  ../data/aclImdb/test/pos/5093_10.txt
f :  ../data/aclImdb/test/pos/559_10.txt

f :  ../data/aclImdb/test/pos/12111_8.txt
f :  ../data/aclImdb/test/pos/8479_9.txt
f :  ../data/aclImdb/test/pos/11011_10.txt
f :  ../data/aclImdb/test/pos/6143_9.txt
f :  ../data/aclImdb/test/pos/6846_8.txt
f :  ../data/aclImdb/test/pos/12030_8.txt
f :  ../data/aclImdb/test/pos/6509_8.txt
f :  ../data/aclImdb/test/pos/2038_8.txt
f :  ../data/aclImdb/test/pos/2855_10.txt
f :  ../data/aclImdb/test/pos/12386_10.txt
f :  ../data/aclImdb/test/pos/5258_7.txt
f :  ../data/aclImdb/test/pos/9142_8.txt
f :  ../data/aclImdb/test/pos/11008_10.txt
f :  ../data/aclImdb/test/pos/4058_8.txt
f :  ../data/aclImdb/test/pos/435_7.txt
f :  ../data/aclImdb/test/pos/6873_8.txt
f :  ../data/aclImdb/test/pos/9213_10.txt
f :  ../data/aclImdb/test/pos/11743_9.txt
f :  ../data/aclImdb/test/pos/8132_7.txt
f :  ../data/aclImdb/test/pos/2432_10.txt
f :  ../data/aclImdb/test/pos/11736_9.txt
f :  ../data/aclImdb/test/pos/415_10.txt
f :  ../data/aclImdb/test/pos/3141_10.txt
f :  ../data/aclImdb/test/pos/5486_7.txt

f :  ../data/aclImdb/test/pos/7178_10.txt
f :  ../data/aclImdb/test/pos/959_8.txt
f :  ../data/aclImdb/test/pos/2063_10.txt
f :  ../data/aclImdb/test/pos/5739_7.txt
f :  ../data/aclImdb/test/pos/11196_10.txt
f :  ../data/aclImdb/test/pos/9429_7.txt
f :  ../data/aclImdb/test/pos/7704_10.txt
f :  ../data/aclImdb/test/pos/9049_10.txt
f :  ../data/aclImdb/test/pos/8952_9.txt
f :  ../data/aclImdb/test/pos/3682_9.txt
f :  ../data/aclImdb/test/pos/1201_10.txt
f :  ../data/aclImdb/test/pos/11165_10.txt
f :  ../data/aclImdb/test/pos/6541_7.txt
f :  ../data/aclImdb/test/pos/9001_8.txt
f :  ../data/aclImdb/test/pos/3180_10.txt
f :  ../data/aclImdb/test/pos/5216_10.txt
f :  ../data/aclImdb/test/pos/2746_8.txt
f :  ../data/aclImdb/test/pos/1484_10.txt
f :  ../data/aclImdb/test/pos/6510_7.txt
f :  ../data/aclImdb/test/pos/144_9.txt
f :  ../data/aclImdb/test/pos/1409_10.txt
f :  ../data/aclImdb/test/pos/1563_8.txt
f :  ../data/aclImdb/test/pos/9117_10.txt
f :  ../data/aclImdb/test/pos/4057_7.txt

f :  ../data/aclImdb/test/pos/6795_10.txt
f :  ../data/aclImdb/test/pos/5853_8.txt
f :  ../data/aclImdb/test/pos/7365_8.txt
f :  ../data/aclImdb/test/pos/12403_10.txt
f :  ../data/aclImdb/test/pos/7538_7.txt
f :  ../data/aclImdb/test/pos/11722_9.txt
f :  ../data/aclImdb/test/pos/8890_7.txt
f :  ../data/aclImdb/test/pos/11328_9.txt
f :  ../data/aclImdb/test/pos/10834_10.txt
f :  ../data/aclImdb/test/pos/4705_10.txt
f :  ../data/aclImdb/test/pos/5042_7.txt
f :  ../data/aclImdb/test/pos/682_9.txt
f :  ../data/aclImdb/test/pos/6085_8.txt
f :  ../data/aclImdb/test/pos/9770_10.txt
f :  ../data/aclImdb/test/pos/9843_7.txt
f :  ../data/aclImdb/test/pos/4399_9.txt
f :  ../data/aclImdb/test/pos/846_7.txt
f :  ../data/aclImdb/test/pos/335_10.txt
f :  ../data/aclImdb/test/pos/10474_10.txt
f :  ../data/aclImdb/test/pos/47_10.txt
f :  ../data/aclImdb/test/pos/8927_8.txt
f :  ../data/aclImdb/test/pos/3475_9.txt
f :  ../data/aclImdb/test/pos/2644_10.txt
f :  ../data/aclImdb/test/pos/3112_10.txt
f :  .

f :  ../data/aclImdb/test/pos/6576_10.txt
f :  ../data/aclImdb/test/pos/1564_8.txt
f :  ../data/aclImdb/test/pos/6382_10.txt
f :  ../data/aclImdb/test/pos/1928_9.txt
f :  ../data/aclImdb/test/pos/3289_10.txt
f :  ../data/aclImdb/test/pos/322_8.txt
f :  ../data/aclImdb/test/pos/6863_9.txt
f :  ../data/aclImdb/test/pos/10313_10.txt
f :  ../data/aclImdb/test/pos/10336_7.txt
f :  ../data/aclImdb/test/pos/9268_9.txt
f :  ../data/aclImdb/test/pos/5358_9.txt
f :  ../data/aclImdb/test/pos/7143_10.txt
f :  ../data/aclImdb/test/pos/5952_9.txt
f :  ../data/aclImdb/test/pos/4688_10.txt
f :  ../data/aclImdb/test/pos/10195_10.txt
f :  ../data/aclImdb/test/pos/353_10.txt
f :  ../data/aclImdb/test/pos/7244_8.txt
f :  ../data/aclImdb/test/pos/1266_8.txt
f :  ../data/aclImdb/test/pos/0_10.txt
f :  ../data/aclImdb/test/pos/8847_10.txt
f :  ../data/aclImdb/test/pos/6437_7.txt
f :  ../data/aclImdb/test/pos/3554_10.txt
f :  ../data/aclImdb/test/pos/3551_7.txt
f :  ../data/aclImdb/test/pos/2680_9.txt
f :  ..

f :  ../data/aclImdb/test/pos/10972_10.txt
f :  ../data/aclImdb/test/pos/4456_10.txt
f :  ../data/aclImdb/test/pos/4044_10.txt
f :  ../data/aclImdb/test/pos/6255_10.txt
f :  ../data/aclImdb/test/pos/10126_9.txt
f :  ../data/aclImdb/test/pos/8438_8.txt
f :  ../data/aclImdb/test/pos/4511_7.txt
f :  ../data/aclImdb/test/pos/8468_8.txt
f :  ../data/aclImdb/test/pos/215_8.txt
f :  ../data/aclImdb/test/pos/2803_8.txt
f :  ../data/aclImdb/test/pos/2704_9.txt
f :  ../data/aclImdb/test/pos/2372_10.txt
f :  ../data/aclImdb/test/pos/976_8.txt
f :  ../data/aclImdb/test/pos/4587_8.txt
f :  ../data/aclImdb/test/pos/8754_10.txt
f :  ../data/aclImdb/test/pos/5570_10.txt
f :  ../data/aclImdb/test/pos/3253_9.txt
f :  ../data/aclImdb/test/pos/1251_10.txt
f :  ../data/aclImdb/test/pos/3924_7.txt
f :  ../data/aclImdb/test/pos/1673_8.txt
f :  ../data/aclImdb/test/pos/11411_8.txt
f :  ../data/aclImdb/test/pos/6426_7.txt
f :  ../data/aclImdb/test/pos/6425_9.txt
f :  ../data/aclImdb/test/pos/6135_9.txt
f :  ..

f :  ../data/aclImdb/test/pos/4522_7.txt
f :  ../data/aclImdb/test/pos/5340_8.txt
f :  ../data/aclImdb/test/pos/2629_10.txt
f :  ../data/aclImdb/test/pos/9805_10.txt
f :  ../data/aclImdb/test/pos/1274_7.txt
f :  ../data/aclImdb/test/pos/3431_10.txt
f :  ../data/aclImdb/test/pos/3933_7.txt
f :  ../data/aclImdb/test/pos/3057_9.txt
f :  ../data/aclImdb/test/pos/2881_9.txt
f :  ../data/aclImdb/test/pos/5734_10.txt
f :  ../data/aclImdb/test/pos/10546_10.txt
f :  ../data/aclImdb/test/pos/4636_10.txt
f :  ../data/aclImdb/test/pos/4298_8.txt
f :  ../data/aclImdb/test/pos/3243_9.txt
f :  ../data/aclImdb/test/pos/11851_8.txt
f :  ../data/aclImdb/test/pos/12486_9.txt
f :  ../data/aclImdb/test/pos/10496_10.txt
f :  ../data/aclImdb/test/pos/7173_10.txt
f :  ../data/aclImdb/test/pos/6852_7.txt
f :  ../data/aclImdb/test/pos/6116_10.txt
f :  ../data/aclImdb/test/pos/368_8.txt
f :  ../data/aclImdb/test/pos/8394_9.txt
f :  ../data/aclImdb/test/pos/1904_10.txt
f :  ../data/aclImdb/test/pos/7434_10.txt

f :  ../data/aclImdb/test/pos/3474_10.txt
f :  ../data/aclImdb/test/pos/8637_10.txt
f :  ../data/aclImdb/test/pos/5731_9.txt
f :  ../data/aclImdb/test/pos/8542_9.txt
f :  ../data/aclImdb/test/pos/9936_10.txt
f :  ../data/aclImdb/test/pos/6294_7.txt
f :  ../data/aclImdb/test/pos/5525_7.txt
f :  ../data/aclImdb/test/pos/8900_7.txt
f :  ../data/aclImdb/test/pos/11368_10.txt
f :  ../data/aclImdb/test/pos/9704_7.txt
f :  ../data/aclImdb/test/pos/11863_8.txt
f :  ../data/aclImdb/test/pos/7908_7.txt
f :  ../data/aclImdb/test/pos/3753_10.txt
f :  ../data/aclImdb/test/pos/5034_7.txt
f :  ../data/aclImdb/test/pos/1899_10.txt
f :  ../data/aclImdb/test/pos/6686_10.txt
f :  ../data/aclImdb/test/pos/6966_7.txt
f :  ../data/aclImdb/test/pos/10682_10.txt
f :  ../data/aclImdb/test/pos/11713_9.txt
f :  ../data/aclImdb/test/pos/1968_8.txt
f :  ../data/aclImdb/test/pos/11623_10.txt
f :  ../data/aclImdb/test/pos/3785_8.txt
f :  ../data/aclImdb/test/pos/3667_10.txt
f :  ../data/aclImdb/test/pos/725_8.txt

f :  ../data/aclImdb/test/pos/4183_10.txt
f :  ../data/aclImdb/test/pos/776_8.txt
f :  ../data/aclImdb/test/pos/4335_9.txt
f :  ../data/aclImdb/test/pos/7878_7.txt
f :  ../data/aclImdb/test/pos/8804_10.txt
f :  ../data/aclImdb/test/pos/10421_8.txt
f :  ../data/aclImdb/test/pos/5700_9.txt
f :  ../data/aclImdb/test/pos/3508_10.txt
f :  ../data/aclImdb/test/pos/1767_10.txt
f :  ../data/aclImdb/test/pos/11067_10.txt
f :  ../data/aclImdb/test/pos/11066_7.txt
f :  ../data/aclImdb/test/pos/70_8.txt
f :  ../data/aclImdb/test/pos/10937_8.txt
f :  ../data/aclImdb/test/pos/1152_8.txt
f :  ../data/aclImdb/test/pos/6005_9.txt
f :  ../data/aclImdb/test/pos/8811_7.txt
f :  ../data/aclImdb/test/pos/10047_10.txt
f :  ../data/aclImdb/test/pos/8944_10.txt
f :  ../data/aclImdb/test/pos/9204_8.txt
f :  ../data/aclImdb/test/pos/3855_10.txt
f :  ../data/aclImdb/test/pos/2280_10.txt
f :  ../data/aclImdb/test/pos/11687_10.txt
f :  ../data/aclImdb/test/pos/1458_8.txt
f :  ../data/aclImdb/test/pos/6716_8.txt

f :  ../data/aclImdb/test/pos/2821_10.txt
f :  ../data/aclImdb/test/pos/7497_8.txt
f :  ../data/aclImdb/test/pos/10243_9.txt
f :  ../data/aclImdb/test/pos/4075_7.txt
f :  ../data/aclImdb/test/pos/4971_8.txt
f :  ../data/aclImdb/test/pos/11236_10.txt
f :  ../data/aclImdb/test/pos/10317_10.txt
f :  ../data/aclImdb/test/pos/897_8.txt
f :  ../data/aclImdb/test/pos/9157_10.txt
f :  ../data/aclImdb/test/pos/7987_8.txt
f :  ../data/aclImdb/test/pos/1780_10.txt
f :  ../data/aclImdb/test/pos/6325_9.txt
f :  ../data/aclImdb/test/pos/8295_10.txt
f :  ../data/aclImdb/test/pos/6556_8.txt
f :  ../data/aclImdb/test/pos/4385_8.txt
f :  ../data/aclImdb/test/pos/8868_7.txt
f :  ../data/aclImdb/test/pos/4518_10.txt
f :  ../data/aclImdb/test/pos/6826_7.txt
f :  ../data/aclImdb/test/pos/2858_9.txt
f :  ../data/aclImdb/test/pos/9014_9.txt
f :  ../data/aclImdb/test/pos/10378_8.txt
f :  ../data/aclImdb/test/pos/4694_10.txt
f :  ../data/aclImdb/test/pos/1250_9.txt
f :  ../data/aclImdb/test/pos/7054_7.txt

f :  ../data/aclImdb/test/pos/9686_10.txt
f :  ../data/aclImdb/test/pos/3063_10.txt
f :  ../data/aclImdb/test/pos/6669_9.txt
f :  ../data/aclImdb/test/pos/761_10.txt
f :  ../data/aclImdb/test/pos/3458_9.txt
f :  ../data/aclImdb/test/pos/7924_8.txt
f :  ../data/aclImdb/test/pos/9149_8.txt
f :  ../data/aclImdb/test/pos/3761_8.txt
f :  ../data/aclImdb/test/pos/10593_8.txt
f :  ../data/aclImdb/test/pos/11211_10.txt
f :  ../data/aclImdb/test/pos/5538_8.txt
f :  ../data/aclImdb/test/pos/11258_8.txt
f :  ../data/aclImdb/test/pos/1083_7.txt
f :  ../data/aclImdb/test/pos/11515_7.txt
f :  ../data/aclImdb/test/pos/9305_7.txt
f :  ../data/aclImdb/test/pos/7169_10.txt
f :  ../data/aclImdb/test/pos/1010_9.txt
f :  ../data/aclImdb/test/pos/8204_7.txt
f :  ../data/aclImdb/test/pos/8429_8.txt
f :  ../data/aclImdb/test/pos/3829_10.txt
f :  ../data/aclImdb/test/pos/7014_10.txt
f :  ../data/aclImdb/test/pos/8124_8.txt
f :  ../data/aclImdb/test/pos/6727_10.txt
f :  ../data/aclImdb/test/pos/7606_10.txt

f :  ../data/aclImdb/test/pos/8966_8.txt
f :  ../data/aclImdb/test/pos/700_7.txt
f :  ../data/aclImdb/test/pos/10254_8.txt
f :  ../data/aclImdb/test/pos/1743_8.txt
f :  ../data/aclImdb/test/pos/5164_10.txt
f :  ../data/aclImdb/test/pos/1477_8.txt
f :  ../data/aclImdb/test/pos/2421_8.txt
f :  ../data/aclImdb/test/pos/8662_9.txt
f :  ../data/aclImdb/test/pos/1276_9.txt
f :  ../data/aclImdb/test/pos/3868_7.txt
f :  ../data/aclImdb/test/pos/7470_8.txt
f :  ../data/aclImdb/test/pos/11145_8.txt
f :  ../data/aclImdb/test/pos/6569_9.txt
f :  ../data/aclImdb/test/pos/5493_10.txt
f :  ../data/aclImdb/test/pos/11522_8.txt
f :  ../data/aclImdb/test/pos/6099_7.txt
f :  ../data/aclImdb/test/pos/128_10.txt
f :  ../data/aclImdb/test/pos/7757_10.txt
f :  ../data/aclImdb/test/pos/10246_9.txt
f :  ../data/aclImdb/test/pos/7264_8.txt
f :  ../data/aclImdb/test/pos/12466_10.txt
f :  ../data/aclImdb/test/pos/9127_8.txt
f :  ../data/aclImdb/test/pos/11108_10.txt
f :  ../data/aclImdb/test/pos/11115_7.txt

f :  ../data/aclImdb/test/pos/9077_9.txt
f :  ../data/aclImdb/test/pos/10863_10.txt
f :  ../data/aclImdb/test/pos/10973_7.txt
f :  ../data/aclImdb/test/pos/8288_10.txt
f :  ../data/aclImdb/test/pos/4319_9.txt
f :  ../data/aclImdb/test/pos/6149_10.txt
f :  ../data/aclImdb/test/pos/2498_7.txt
f :  ../data/aclImdb/test/pos/7283_9.txt
f :  ../data/aclImdb/test/pos/9382_10.txt
f :  ../data/aclImdb/test/pos/4860_10.txt
f :  ../data/aclImdb/test/pos/2558_10.txt
f :  ../data/aclImdb/test/pos/625_10.txt
f :  ../data/aclImdb/test/pos/2400_8.txt
f :  ../data/aclImdb/test/pos/925_10.txt
f :  ../data/aclImdb/test/pos/7409_8.txt
f :  ../data/aclImdb/test/pos/6400_10.txt
f :  ../data/aclImdb/test/pos/8722_9.txt
f :  ../data/aclImdb/test/pos/1082_9.txt
f :  ../data/aclImdb/test/pos/4284_9.txt
f :  ../data/aclImdb/test/pos/2161_7.txt
f :  ../data/aclImdb/test/pos/11521_10.txt
f :  ../data/aclImdb/test/pos/1736_10.txt
f :  ../data/aclImdb/test/pos/12395_10.txt
f :  ../data/aclImdb/test/pos/4009_8.txt

f :  ../data/aclImdb/test/pos/8329_10.txt
f :  ../data/aclImdb/test/pos/5265_10.txt
f :  ../data/aclImdb/test/pos/2601_10.txt
f :  ../data/aclImdb/test/pos/7446_10.txt
f :  ../data/aclImdb/test/pos/6544_10.txt
f :  ../data/aclImdb/test/pos/6574_10.txt
f :  ../data/aclImdb/test/pos/7408_9.txt
f :  ../data/aclImdb/test/pos/9394_8.txt
f :  ../data/aclImdb/test/pos/9112_10.txt
f :  ../data/aclImdb/test/pos/4498_9.txt
f :  ../data/aclImdb/test/pos/3030_9.txt
f :  ../data/aclImdb/test/pos/11677_10.txt
f :  ../data/aclImdb/test/pos/6320_10.txt
f :  ../data/aclImdb/test/pos/7332_7.txt
f :  ../data/aclImdb/test/pos/831_7.txt
f :  ../data/aclImdb/test/pos/12020_10.txt
f :  ../data/aclImdb/test/pos/1108_8.txt
f :  ../data/aclImdb/test/pos/5310_9.txt
f :  ../data/aclImdb/test/pos/5519_7.txt
f :  ../data/aclImdb/test/pos/2023_7.txt
f :  ../data/aclImdb/test/pos/12209_9.txt
f :  ../data/aclImdb/test/pos/1838_10.txt
f :  ../data/aclImdb/test/pos/6227_8.txt
f :  ../data/aclImdb/test/pos/10646_10.txt

f :  ../data/aclImdb/test/neg/9140_2.txt
f :  ../data/aclImdb/test/neg/7893_1.txt
f :  ../data/aclImdb/test/neg/11537_3.txt
f :  ../data/aclImdb/test/neg/7212_2.txt
f :  ../data/aclImdb/test/neg/2392_2.txt
f :  ../data/aclImdb/test/neg/5921_3.txt
f :  ../data/aclImdb/test/neg/3302_1.txt
f :  ../data/aclImdb/test/neg/7444_4.txt
f :  ../data/aclImdb/test/neg/3242_1.txt
f :  ../data/aclImdb/test/neg/8586_3.txt
f :  ../data/aclImdb/test/neg/3528_1.txt
f :  ../data/aclImdb/test/neg/1874_1.txt
f :  ../data/aclImdb/test/neg/4328_1.txt
f :  ../data/aclImdb/test/neg/9324_4.txt
f :  ../data/aclImdb/test/neg/3401_3.txt
f :  ../data/aclImdb/test/neg/5257_1.txt
f :  ../data/aclImdb/test/neg/6746_1.txt
f :  ../data/aclImdb/test/neg/4610_1.txt
f :  ../data/aclImdb/test/neg/12061_1.txt
f :  ../data/aclImdb/test/neg/11199_4.txt
f :  ../data/aclImdb/test/neg/7962_1.txt
f :  ../data/aclImdb/test/neg/1675_4.txt
f :  ../data/aclImdb/test/neg/1297_2.txt
f :  ../data/aclImdb/test/neg/2039_4.txt
f :  ../data/

f :  ../data/aclImdb/test/neg/7095_1.txt
f :  ../data/aclImdb/test/neg/6408_2.txt
f :  ../data/aclImdb/test/neg/12378_1.txt
f :  ../data/aclImdb/test/neg/2894_4.txt
f :  ../data/aclImdb/test/neg/10062_4.txt
f :  ../data/aclImdb/test/neg/4510_1.txt
f :  ../data/aclImdb/test/neg/851_1.txt
f :  ../data/aclImdb/test/neg/6055_1.txt
f :  ../data/aclImdb/test/neg/6337_1.txt
f :  ../data/aclImdb/test/neg/12263_1.txt
f :  ../data/aclImdb/test/neg/11347_1.txt
f :  ../data/aclImdb/test/neg/8260_1.txt
f :  ../data/aclImdb/test/neg/7871_1.txt
f :  ../data/aclImdb/test/neg/3338_2.txt
f :  ../data/aclImdb/test/neg/613_4.txt
f :  ../data/aclImdb/test/neg/232_4.txt
f :  ../data/aclImdb/test/neg/4029_1.txt
f :  ../data/aclImdb/test/neg/3346_3.txt
f :  ../data/aclImdb/test/neg/11726_3.txt
f :  ../data/aclImdb/test/neg/10059_1.txt
f :  ../data/aclImdb/test/neg/2875_2.txt
f :  ../data/aclImdb/test/neg/1351_1.txt
f :  ../data/aclImdb/test/neg/3237_1.txt
f :  ../data/aclImdb/test/neg/5667_1.txt
f :  ../data/

f :  ../data/aclImdb/test/neg/1871_2.txt
f :  ../data/aclImdb/test/neg/8803_3.txt
f :  ../data/aclImdb/test/neg/10247_1.txt
f :  ../data/aclImdb/test/neg/2358_3.txt
f :  ../data/aclImdb/test/neg/1927_1.txt
f :  ../data/aclImdb/test/neg/10159_4.txt
f :  ../data/aclImdb/test/neg/3047_2.txt
f :  ../data/aclImdb/test/neg/1100_4.txt
f :  ../data/aclImdb/test/neg/2436_4.txt
f :  ../data/aclImdb/test/neg/6785_3.txt
f :  ../data/aclImdb/test/neg/8692_1.txt
f :  ../data/aclImdb/test/neg/647_1.txt
f :  ../data/aclImdb/test/neg/8653_2.txt
f :  ../data/aclImdb/test/neg/8714_2.txt
f :  ../data/aclImdb/test/neg/10787_3.txt
f :  ../data/aclImdb/test/neg/10711_2.txt
f :  ../data/aclImdb/test/neg/7743_1.txt
f :  ../data/aclImdb/test/neg/12485_4.txt
f :  ../data/aclImdb/test/neg/2745_4.txt
f :  ../data/aclImdb/test/neg/10860_2.txt
f :  ../data/aclImdb/test/neg/8953_1.txt
f :  ../data/aclImdb/test/neg/12113_1.txt
f :  ../data/aclImdb/test/neg/9769_3.txt
f :  ../data/aclImdb/test/neg/11747_3.txt
f :  ../d

f :  ../data/aclImdb/test/neg/10149_1.txt
f :  ../data/aclImdb/test/neg/1544_3.txt
f :  ../data/aclImdb/test/neg/860_4.txt
f :  ../data/aclImdb/test/neg/2906_1.txt
f :  ../data/aclImdb/test/neg/8097_2.txt
f :  ../data/aclImdb/test/neg/10328_1.txt
f :  ../data/aclImdb/test/neg/9052_3.txt
f :  ../data/aclImdb/test/neg/3307_1.txt
f :  ../data/aclImdb/test/neg/9635_3.txt
f :  ../data/aclImdb/test/neg/2347_3.txt
f :  ../data/aclImdb/test/neg/5698_1.txt
f :  ../data/aclImdb/test/neg/5001_1.txt
f :  ../data/aclImdb/test/neg/4294_2.txt
f :  ../data/aclImdb/test/neg/7986_1.txt
f :  ../data/aclImdb/test/neg/661_1.txt
f :  ../data/aclImdb/test/neg/2574_2.txt
f :  ../data/aclImdb/test/neg/10631_1.txt
f :  ../data/aclImdb/test/neg/4891_3.txt
f :  ../data/aclImdb/test/neg/278_1.txt
f :  ../data/aclImdb/test/neg/2725_3.txt
f :  ../data/aclImdb/test/neg/9819_4.txt
f :  ../data/aclImdb/test/neg/2961_4.txt
f :  ../data/aclImdb/test/neg/12417_4.txt
f :  ../data/aclImdb/test/neg/2743_1.txt
f :  ../data/ac

f :  ../data/aclImdb/test/neg/10511_1.txt
f :  ../data/aclImdb/test/neg/7892_1.txt
f :  ../data/aclImdb/test/neg/6781_1.txt
f :  ../data/aclImdb/test/neg/10820_3.txt
f :  ../data/aclImdb/test/neg/7203_1.txt
f :  ../data/aclImdb/test/neg/5180_1.txt
f :  ../data/aclImdb/test/neg/6277_1.txt
f :  ../data/aclImdb/test/neg/10526_2.txt
f :  ../data/aclImdb/test/neg/12167_1.txt
f :  ../data/aclImdb/test/neg/12409_4.txt
f :  ../data/aclImdb/test/neg/129_1.txt
f :  ../data/aclImdb/test/neg/8203_3.txt
f :  ../data/aclImdb/test/neg/11044_1.txt
f :  ../data/aclImdb/test/neg/7998_4.txt
f :  ../data/aclImdb/test/neg/10438_3.txt
f :  ../data/aclImdb/test/neg/2599_1.txt
f :  ../data/aclImdb/test/neg/5045_2.txt
f :  ../data/aclImdb/test/neg/5867_2.txt
f :  ../data/aclImdb/test/neg/5238_2.txt
f :  ../data/aclImdb/test/neg/5956_1.txt
f :  ../data/aclImdb/test/neg/3892_4.txt
f :  ../data/aclImdb/test/neg/8611_1.txt
f :  ../data/aclImdb/test/neg/8344_1.txt
f :  ../data/aclImdb/test/neg/3148_2.txt
f :  ../da

f :  ../data/aclImdb/test/neg/3352_4.txt
f :  ../data/aclImdb/test/neg/404_1.txt
f :  ../data/aclImdb/test/neg/3122_4.txt
f :  ../data/aclImdb/test/neg/8493_1.txt
f :  ../data/aclImdb/test/neg/4110_4.txt
f :  ../data/aclImdb/test/neg/9081_2.txt
f :  ../data/aclImdb/test/neg/10010_2.txt
f :  ../data/aclImdb/test/neg/9821_3.txt
f :  ../data/aclImdb/test/neg/4397_4.txt
f :  ../data/aclImdb/test/neg/9413_1.txt
f :  ../data/aclImdb/test/neg/2910_3.txt
f :  ../data/aclImdb/test/neg/3837_4.txt
f :  ../data/aclImdb/test/neg/6045_3.txt
f :  ../data/aclImdb/test/neg/1230_2.txt
f :  ../data/aclImdb/test/neg/4400_4.txt
f :  ../data/aclImdb/test/neg/980_2.txt
f :  ../data/aclImdb/test/neg/881_4.txt
f :  ../data/aclImdb/test/neg/11274_3.txt
f :  ../data/aclImdb/test/neg/359_4.txt
f :  ../data/aclImdb/test/neg/4208_3.txt
f :  ../data/aclImdb/test/neg/12108_1.txt
f :  ../data/aclImdb/test/neg/10447_2.txt
f :  ../data/aclImdb/test/neg/7436_4.txt
f :  ../data/aclImdb/test/neg/11716_2.txt
f :  ../data/ac

f :  ../data/aclImdb/test/neg/11935_3.txt
f :  ../data/aclImdb/test/neg/1729_1.txt
f :  ../data/aclImdb/test/neg/8677_1.txt
f :  ../data/aclImdb/test/neg/1548_1.txt
f :  ../data/aclImdb/test/neg/9623_2.txt
f :  ../data/aclImdb/test/neg/837_4.txt
f :  ../data/aclImdb/test/neg/5506_4.txt
f :  ../data/aclImdb/test/neg/3516_2.txt
f :  ../data/aclImdb/test/neg/6127_2.txt
f :  ../data/aclImdb/test/neg/7659_3.txt
f :  ../data/aclImdb/test/neg/6864_2.txt
f :  ../data/aclImdb/test/neg/10508_1.txt
f :  ../data/aclImdb/test/neg/3339_4.txt
f :  ../data/aclImdb/test/neg/6744_2.txt
f :  ../data/aclImdb/test/neg/2333_2.txt
f :  ../data/aclImdb/test/neg/1093_4.txt
f :  ../data/aclImdb/test/neg/8279_2.txt
f :  ../data/aclImdb/test/neg/9899_1.txt
f :  ../data/aclImdb/test/neg/5904_3.txt
f :  ../data/aclImdb/test/neg/10636_3.txt
f :  ../data/aclImdb/test/neg/9714_3.txt
f :  ../data/aclImdb/test/neg/321_1.txt
f :  ../data/aclImdb/test/neg/1453_2.txt
f :  ../data/aclImdb/test/neg/1192_1.txt
f :  ../data/ac

f :  ../data/aclImdb/test/neg/9291_4.txt
f :  ../data/aclImdb/test/neg/7827_4.txt
f :  ../data/aclImdb/test/neg/5375_1.txt
f :  ../data/aclImdb/test/neg/10665_1.txt
f :  ../data/aclImdb/test/neg/4154_4.txt
f :  ../data/aclImdb/test/neg/1114_4.txt
f :  ../data/aclImdb/test/neg/11085_1.txt
f :  ../data/aclImdb/test/neg/10771_1.txt
f :  ../data/aclImdb/test/neg/10784_4.txt
f :  ../data/aclImdb/test/neg/11115_2.txt
f :  ../data/aclImdb/test/neg/12260_1.txt
f :  ../data/aclImdb/test/neg/1411_4.txt
f :  ../data/aclImdb/test/neg/8904_2.txt
f :  ../data/aclImdb/test/neg/2096_4.txt
f :  ../data/aclImdb/test/neg/239_2.txt
f :  ../data/aclImdb/test/neg/10838_2.txt
f :  ../data/aclImdb/test/neg/480_3.txt
f :  ../data/aclImdb/test/neg/12465_2.txt
f :  ../data/aclImdb/test/neg/12466_4.txt
f :  ../data/aclImdb/test/neg/67_3.txt
f :  ../data/aclImdb/test/neg/4346_3.txt
f :  ../data/aclImdb/test/neg/11029_3.txt
f :  ../data/aclImdb/test/neg/10371_4.txt
f :  ../data/aclImdb/test/neg/3312_3.txt
f :  ../d

f :  ../data/aclImdb/test/neg/5517_2.txt
f :  ../data/aclImdb/test/neg/1902_4.txt
f :  ../data/aclImdb/test/neg/9298_1.txt
f :  ../data/aclImdb/test/neg/9062_3.txt
f :  ../data/aclImdb/test/neg/9730_1.txt
f :  ../data/aclImdb/test/neg/2329_2.txt
f :  ../data/aclImdb/test/neg/5932_1.txt
f :  ../data/aclImdb/test/neg/3198_1.txt
f :  ../data/aclImdb/test/neg/413_4.txt
f :  ../data/aclImdb/test/neg/3048_1.txt
f :  ../data/aclImdb/test/neg/1022_3.txt
f :  ../data/aclImdb/test/neg/927_3.txt
f :  ../data/aclImdb/test/neg/4752_1.txt
f :  ../data/aclImdb/test/neg/4648_2.txt
f :  ../data/aclImdb/test/neg/4673_4.txt
f :  ../data/aclImdb/test/neg/2649_2.txt
f :  ../data/aclImdb/test/neg/9247_2.txt
f :  ../data/aclImdb/test/neg/8115_3.txt
f :  ../data/aclImdb/test/neg/8806_1.txt
f :  ../data/aclImdb/test/neg/944_2.txt
f :  ../data/aclImdb/test/neg/4446_1.txt
f :  ../data/aclImdb/test/neg/5678_2.txt
f :  ../data/aclImdb/test/neg/2900_1.txt
f :  ../data/aclImdb/test/neg/8877_3.txt
f :  ../data/aclImd

f :  ../data/aclImdb/test/neg/4751_2.txt
f :  ../data/aclImdb/test/neg/1201_2.txt
f :  ../data/aclImdb/test/neg/4765_3.txt
f :  ../data/aclImdb/test/neg/9919_4.txt
f :  ../data/aclImdb/test/neg/4468_4.txt
f :  ../data/aclImdb/test/neg/1362_3.txt
f :  ../data/aclImdb/test/neg/10943_3.txt
f :  ../data/aclImdb/test/neg/6578_4.txt
f :  ../data/aclImdb/test/neg/486_4.txt
f :  ../data/aclImdb/test/neg/605_4.txt
f :  ../data/aclImdb/test/neg/6775_2.txt
f :  ../data/aclImdb/test/neg/2088_1.txt
f :  ../data/aclImdb/test/neg/6104_1.txt
f :  ../data/aclImdb/test/neg/4452_1.txt
f :  ../data/aclImdb/test/neg/9339_3.txt
f :  ../data/aclImdb/test/neg/3740_1.txt
f :  ../data/aclImdb/test/neg/7206_3.txt
f :  ../data/aclImdb/test/neg/9840_2.txt
f :  ../data/aclImdb/test/neg/1155_2.txt
f :  ../data/aclImdb/test/neg/10125_4.txt
f :  ../data/aclImdb/test/neg/6842_1.txt
f :  ../data/aclImdb/test/neg/11049_1.txt
f :  ../data/aclImdb/test/neg/7683_4.txt
f :  ../data/aclImdb/test/neg/5210_4.txt
f :  ../data/ac

f :  ../data/aclImdb/test/neg/6596_1.txt
f :  ../data/aclImdb/test/neg/9467_4.txt
f :  ../data/aclImdb/test/neg/1904_2.txt
f :  ../data/aclImdb/test/neg/2354_3.txt
f :  ../data/aclImdb/test/neg/7636_1.txt
f :  ../data/aclImdb/test/neg/6639_2.txt
f :  ../data/aclImdb/test/neg/11278_4.txt
f :  ../data/aclImdb/test/neg/6171_2.txt
f :  ../data/aclImdb/test/neg/7964_1.txt
f :  ../data/aclImdb/test/neg/4247_4.txt
f :  ../data/aclImdb/test/neg/1826_4.txt
f :  ../data/aclImdb/test/neg/2711_3.txt
f :  ../data/aclImdb/test/neg/11333_3.txt
f :  ../data/aclImdb/test/neg/7317_2.txt
f :  ../data/aclImdb/test/neg/2335_1.txt
f :  ../data/aclImdb/test/neg/7059_3.txt
f :  ../data/aclImdb/test/neg/328_4.txt
f :  ../data/aclImdb/test/neg/10397_1.txt
f :  ../data/aclImdb/test/neg/11287_4.txt
f :  ../data/aclImdb/test/neg/6951_1.txt
f :  ../data/aclImdb/test/neg/296_4.txt
f :  ../data/aclImdb/test/neg/11205_2.txt
f :  ../data/aclImdb/test/neg/1928_4.txt
f :  ../data/aclImdb/test/neg/5943_4.txt
f :  ../data/

f :  ../data/aclImdb/test/neg/7376_2.txt
f :  ../data/aclImdb/test/neg/3258_2.txt
f :  ../data/aclImdb/test/neg/4425_3.txt
f :  ../data/aclImdb/test/neg/7154_1.txt
f :  ../data/aclImdb/test/neg/8242_4.txt
f :  ../data/aclImdb/test/neg/1052_3.txt
f :  ../data/aclImdb/test/neg/11449_2.txt
f :  ../data/aclImdb/test/neg/2334_3.txt
f :  ../data/aclImdb/test/neg/9188_3.txt
f :  ../data/aclImdb/test/neg/4762_3.txt
f :  ../data/aclImdb/test/neg/1561_3.txt
f :  ../data/aclImdb/test/neg/6838_1.txt
f :  ../data/aclImdb/test/neg/2469_1.txt
f :  ../data/aclImdb/test/neg/3287_2.txt
f :  ../data/aclImdb/test/neg/454_1.txt
f :  ../data/aclImdb/test/neg/8103_4.txt
f :  ../data/aclImdb/test/neg/9232_1.txt
f :  ../data/aclImdb/test/neg/2489_3.txt
f :  ../data/aclImdb/test/neg/8400_2.txt
f :  ../data/aclImdb/test/neg/7010_4.txt
f :  ../data/aclImdb/test/neg/2815_4.txt
f :  ../data/aclImdb/test/neg/8688_3.txt
f :  ../data/aclImdb/test/neg/4789_4.txt
f :  ../data/aclImdb/test/neg/8283_4.txt
f :  ../data/acl

f :  ../data/aclImdb/test/neg/712_3.txt
f :  ../data/aclImdb/test/neg/7755_3.txt
f :  ../data/aclImdb/test/neg/6774_4.txt
f :  ../data/aclImdb/test/neg/6007_1.txt
f :  ../data/aclImdb/test/neg/408_3.txt
f :  ../data/aclImdb/test/neg/6229_1.txt
f :  ../data/aclImdb/test/neg/9608_1.txt
f :  ../data/aclImdb/test/neg/8674_1.txt
f :  ../data/aclImdb/test/neg/5676_1.txt
f :  ../data/aclImdb/test/neg/10527_3.txt
f :  ../data/aclImdb/test/neg/8728_1.txt
f :  ../data/aclImdb/test/neg/3345_3.txt
f :  ../data/aclImdb/test/neg/6591_2.txt
f :  ../data/aclImdb/test/neg/2558_4.txt
f :  ../data/aclImdb/test/neg/8652_1.txt
f :  ../data/aclImdb/test/neg/7862_1.txt
f :  ../data/aclImdb/test/neg/8068_2.txt
f :  ../data/aclImdb/test/neg/4442_1.txt
f :  ../data/aclImdb/test/neg/9130_4.txt
f :  ../data/aclImdb/test/neg/6712_2.txt
f :  ../data/aclImdb/test/neg/540_1.txt
f :  ../data/aclImdb/test/neg/3585_2.txt
f :  ../data/aclImdb/test/neg/7766_1.txt
f :  ../data/aclImdb/test/neg/10102_2.txt
f :  ../data/aclI

f :  ../data/aclImdb/test/neg/4930_1.txt
f :  ../data/aclImdb/test/neg/12157_3.txt
f :  ../data/aclImdb/test/neg/12124_3.txt
f :  ../data/aclImdb/test/neg/8540_2.txt
f :  ../data/aclImdb/test/neg/5949_4.txt
f :  ../data/aclImdb/test/neg/7480_1.txt
f :  ../data/aclImdb/test/neg/3703_2.txt
f :  ../data/aclImdb/test/neg/1624_1.txt
f :  ../data/aclImdb/test/neg/4633_3.txt
f :  ../data/aclImdb/test/neg/3733_4.txt
f :  ../data/aclImdb/test/neg/8420_4.txt
f :  ../data/aclImdb/test/neg/2246_1.txt
f :  ../data/aclImdb/test/neg/7588_4.txt
f :  ../data/aclImdb/test/neg/8122_1.txt
f :  ../data/aclImdb/test/neg/11910_2.txt
f :  ../data/aclImdb/test/neg/6235_1.txt
f :  ../data/aclImdb/test/neg/11985_1.txt
f :  ../data/aclImdb/test/neg/2284_3.txt
f :  ../data/aclImdb/test/neg/1568_4.txt
f :  ../data/aclImdb/test/neg/12111_1.txt
f :  ../data/aclImdb/test/neg/12333_1.txt
f :  ../data/aclImdb/test/neg/1242_1.txt
f :  ../data/aclImdb/test/neg/447_4.txt
f :  ../data/aclImdb/test/neg/5182_3.txt
f :  ../dat

f :  ../data/aclImdb/test/neg/5637_2.txt
f :  ../data/aclImdb/test/neg/7245_1.txt
f :  ../data/aclImdb/test/neg/9828_4.txt
f :  ../data/aclImdb/test/neg/10548_1.txt
f :  ../data/aclImdb/test/neg/485_4.txt
f :  ../data/aclImdb/test/neg/7847_2.txt
f :  ../data/aclImdb/test/neg/3390_4.txt
f :  ../data/aclImdb/test/neg/1765_1.txt
f :  ../data/aclImdb/test/neg/5858_1.txt
f :  ../data/aclImdb/test/neg/5980_2.txt
f :  ../data/aclImdb/test/neg/11990_4.txt
f :  ../data/aclImdb/test/neg/7863_1.txt
f :  ../data/aclImdb/test/neg/9494_1.txt
f :  ../data/aclImdb/test/neg/1147_1.txt
f :  ../data/aclImdb/test/neg/2738_1.txt
f :  ../data/aclImdb/test/neg/9993_2.txt
f :  ../data/aclImdb/test/neg/4503_1.txt
f :  ../data/aclImdb/test/neg/7818_2.txt
f :  ../data/aclImdb/test/neg/4330_3.txt
f :  ../data/aclImdb/test/neg/12181_4.txt
f :  ../data/aclImdb/test/neg/5819_2.txt
f :  ../data/aclImdb/test/neg/10713_3.txt
f :  ../data/aclImdb/test/neg/882_2.txt
f :  ../data/aclImdb/test/neg/727_3.txt
f :  ../data/ac

f :  ../data/aclImdb/test/neg/2778_2.txt
f :  ../data/aclImdb/test/neg/8360_3.txt
f :  ../data/aclImdb/test/neg/337_1.txt
f :  ../data/aclImdb/test/neg/1702_2.txt
f :  ../data/aclImdb/test/neg/7968_4.txt
f :  ../data/aclImdb/test/neg/1115_3.txt
f :  ../data/aclImdb/test/neg/4261_1.txt
f :  ../data/aclImdb/test/neg/6782_3.txt
f :  ../data/aclImdb/test/neg/9721_4.txt
f :  ../data/aclImdb/test/neg/9804_1.txt
f :  ../data/aclImdb/test/neg/3859_1.txt
f :  ../data/aclImdb/test/neg/1222_2.txt
f :  ../data/aclImdb/test/neg/6132_1.txt
f :  ../data/aclImdb/test/neg/5083_4.txt
f :  ../data/aclImdb/test/neg/4179_1.txt
f :  ../data/aclImdb/test/neg/5106_1.txt
f :  ../data/aclImdb/test/neg/6811_4.txt
f :  ../data/aclImdb/test/neg/7834_3.txt
f :  ../data/aclImdb/test/neg/9791_2.txt
f :  ../data/aclImdb/test/neg/7470_4.txt
f :  ../data/aclImdb/test/neg/8917_4.txt
f :  ../data/aclImdb/test/neg/1849_4.txt
f :  ../data/aclImdb/test/neg/7155_2.txt
f :  ../data/aclImdb/test/neg/946_1.txt
f :  ../data/aclIm

f :  ../data/aclImdb/test/neg/4886_3.txt
f :  ../data/aclImdb/test/neg/6644_1.txt
f :  ../data/aclImdb/test/neg/7750_3.txt
f :  ../data/aclImdb/test/neg/6358_4.txt
f :  ../data/aclImdb/test/neg/7545_1.txt
f :  ../data/aclImdb/test/neg/10074_3.txt
f :  ../data/aclImdb/test/neg/11956_2.txt
f :  ../data/aclImdb/test/neg/3715_1.txt
f :  ../data/aclImdb/test/neg/11608_1.txt
f :  ../data/aclImdb/test/neg/7500_1.txt
f :  ../data/aclImdb/test/neg/10639_4.txt
f :  ../data/aclImdb/test/neg/5147_1.txt
f :  ../data/aclImdb/test/neg/9133_3.txt
f :  ../data/aclImdb/test/neg/10627_2.txt
f :  ../data/aclImdb/test/neg/7112_4.txt
f :  ../data/aclImdb/test/neg/1127_1.txt
f :  ../data/aclImdb/test/neg/8873_4.txt
f :  ../data/aclImdb/test/neg/5700_1.txt
f :  ../data/aclImdb/test/neg/6466_1.txt
f :  ../data/aclImdb/test/neg/3325_1.txt
f :  ../data/aclImdb/test/neg/2313_1.txt
f :  ../data/aclImdb/test/neg/10118_1.txt
f :  ../data/aclImdb/test/neg/10488_3.txt
f :  ../data/aclImdb/test/neg/10172_1.txt
f :  ../

In [29]:
labels

{'train': {'pos': [1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,


In [12]:
from sklearn.utils import shuffle

def prepare_imdb_data(data, labels):
    """Prepare training and test sets from IMDb movie reviews."""
    
    # Combine positive and negative reviews and labels
    data_train = data['train']['pos'] + data['train']['neg']
    data_test = data['test']['pos'] + data['test']['neg']
    labels_train = labels['train']['pos'] + labels['train']['neg']
    labels_test = labels['test']['pos'] + labels['test']['neg']
    
    # Shuffle reviews and corresponding labels within training and test sets
    data_train, labels_train = shuffle(data_train, labels_train)
    data_test, labels_test = shuffle(data_test, labels_test)
    
    # Return unified training data, test data, training labels, test labels
    return data_train, data_test, labels_train, labels_test
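
The important property of the shuffle step is that each review keeps its own label: both lists must be reordered with the same permutation. The sketch below illustrates that idea with the standard library only. `shuffle_together` is a hypothetical helper written for this illustration, not how `sklearn.utils.shuffle` is implemented, and the four toy reviews are made up.

```python
import random

def shuffle_together(items, labels, seed=None):
    """Shuffle two equal-length lists with the same random permutation."""
    if len(items) != len(labels):
        raise ValueError("items and labels must have the same length")
    order = list(range(len(items)))
    random.Random(seed).shuffle(order)  # one permutation, applied to both lists
    return [items[i] for i in order], [labels[i] for i in order]

# Toy data (assumption: invented for this example, not taken from IMDb)
reviews = ["great film", "terrible plot", "loved it", "fell asleep"]
sentiments = [1, 0, 1, 0]
original_pairs = set(zip(reviews, sentiments))

shuffled_reviews, shuffled_sentiments = shuffle_together(reviews, sentiments, seed=0)

# Every review is still paired with its original label after shuffling
assert set(zip(shuffled_reviews, shuffled_sentiments)) == original_pairs
```

If reproducible splits matter, `sklearn.utils.shuffle` also accepts a `random_state` argument that plays the same role as `seed` here.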

In [13]:
train_X, test_X, train_y, test_y = prepare_imdb_data(data, labels)
print("IMDb reviews (combined): train = {}, test = {}".format(len(train_X), len(test_X)))

IMDb reviews (combined): train = 25000, test = 25000


In [15]:
train_y[100]

1

## Step 3: Processing the data

Now that our training and testing datasets are merged and ready to use, we need to process the raw data into something our machine learning algorithm can use. To begin with, we remove any HTML formatting that appears in the reviews and perform some standard natural language processing (lower-casing, stopword removal, and stemming) to homogenize the data.

In [16]:
import nltk
nltk.download("stopwords")
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
stemmer = PorterStemmer()

[nltk_data] Downloading package stopwords to
[nltk_data]     /home/ec2-user/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


In [47]:
stopwords.words("english")

['i',
 'me',
 'my',
 'myself',
 'we',
 'our',
 'ours',
 'ourselves',
 'you',
 "you're",
 "you've",
 "you'll",
 "you'd",
 'your',
 'yours',
 'yourself',
 'yourselves',
 'he',
 'him',
 'his',
 'himself',
 'she',
 "she's",
 'her',
 'hers',
 'herself',
 'it',
 "it's",
 'its',
 'itself',
 'they',
 'them',
 'their',
 'theirs',
 'themselves',
 'what',
 'which',
 'who',
 'whom',
 'this',
 'that',
 "that'll",
 'these',
 'those',
 'am',
 'is',
 'are',
 'was',
 'were',
 'be',
 'been',
 'being',
 'have',
 'has',
 'had',
 'having',
 'do',
 'does',
 'did',
 'doing',
 'a',
 'an',
 'the',
 'and',
 'but',
 'if',
 'or',
 'because',
 'as',
 'until',
 'while',
 'of',
 'at',
 'by',
 'for',
 'with',
 'about',
 'against',
 'between',
 'into',
 'through',
 'during',
 'before',
 'after',
 'above',
 'below',
 'to',
 'from',
 'up',
 'down',
 'in',
 'out',
 'on',
 'off',
 'over',
 'under',
 'again',
 'further',
 'then',
 'once',
 'here',
 'there',
 'when',
 'where',
 'why',
 'how',
 'all',
 'any',
 'both',
 'each

In [70]:
import re
from bs4 import BeautifulSoup

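def review_to_words(review):
    """Convert a raw review string into a list of stemmed, lower-case words."""
    text = BeautifulSoup(review, "html.parser").get_text() # Remove HTML tags
    text = re.sub(r"[^a-zA-Z0-9]", " ", text.lower()) # Lower-case and replace non-alphanumerics with spaces
    words = text.split() # Split string into words
    stop_words = set(stopwords.words("english")) # Build the stopword set once per review
    words = [w for w in words if w not in stop_words] # Remove stopwords
    words = [stemmer.stem(w) for w in words] # Stem with the shared PorterStemmer created above
    
    print("words : ", words) # Debug output; very verbose on the full dataset
    
    return words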
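
One side effect of the tag-removal step is visible in the sample output further down: the raw text `haha<br /><br />Gruner` ends up as the single token `'hahagrun'`, because the tags are dropped without leaving a space behind. The sketch below shows one possible way around this, assuming it is acceptable to replace every tag with a space. It uses a plain regular expression instead of BeautifulSoup so that it runs without extra packages, and `strip_tags_with_space` is a hypothetical helper, not part of the notebook.

```python
import re

def strip_tags_with_space(review):
    """Replace each HTML tag with a space so neighbouring words stay separate."""
    text = re.sub(r"<[^>]+>", " ", review)             # each tag becomes a space
    text = re.sub(r"[^a-zA-Z0-9]", " ", text.lower())  # lower-case, drop punctuation
    return text.split()

# Fragment of the first sample review shown in this notebook
sample = "And kick! haha<br /><br />Gruner character is a graduate student"
tokens = strip_tags_with_space(sample)
print(tokens)
# ['and', 'kick', 'haha', 'gruner', 'character', 'is', 'a', 'graduate', 'student']
```

With BeautifulSoup itself, passing a separator to `get_text` (for example `get_text(separator=" ")`) should have a similar effect. Note that changing the tokenization changes the vocabulary, so any cached preprocessing output would need to be regenerated.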

In [71]:
import os
import pickle

cache_dir = os.path.join("../cache", "sentiment_analysis")  # where to store cache files
os.makedirs(cache_dir, exist_ok=True)  # ensure cache directory exists

def preprocess_data(data_train, data_test, labels_train, labels_test,
                    cache_dir=cache_dir, cache_file="preprocessed_data.pkl"):
    """Convert each review to words; read from cache if available."""
    
    
    # print("datatrain : ", data_train)
    # print("data_test : ", data_test)
    print("labels_train : ", labels_train)
    print("labels_test : ", labels_test)
    # If cache_file is not None, try to read from it first
    cache_data = None
    if cache_file is not None:
        print(" if cache_file is not None: 1")
        try:
            with open(os.path.join(cache_dir, cache_file), "rb") as f:
                cache_data = pickle.load(f)
            print("Read preprocessed data from cache file:", cache_file)
        except (OSError, EOFError, pickle.UnpicklingError):
            print("pass")  # unable to read from cache, but that's okay
    
    # If cache is missing, then do the heavy lifting
    if cache_data is None:
        print("if cache_data is None:")
        
        # Preprocess training and test data to obtain words for each review
        #words_train = list(map(review_to_words, data_train))
        #words_test = list(map(review_to_words, data_test))
        words_train = [review_to_words(review) for review in data_train]
        words_test = [review_to_words(review) for review in data_test]
        
        # Write to cache file for future runs
        if cache_file is not None:
            print(" if cache_file is not None: 2")
            cache_data = dict(words_train=words_train, words_test=words_test,
                              labels_train=labels_train, labels_test=labels_test)
            with open(os.path.join(cache_dir, cache_file), "wb") as f:
                pickle.dump(cache_data, f)
            print("Wrote preprocessed data to cache file:", cache_file)
    else:
        print("cache data exists")
        # Unpack data loaded from cache file
        words_train, words_test, labels_train, labels_test = (cache_data['words_train'],
                cache_data['words_test'], cache_data['labels_train'], cache_data['labels_test'])
    
    return words_train, words_test, labels_train, labels_test
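
A caveat with this caching scheme: `preprocess_data` always reads and writes the same file name, so a cache written for one input (for instance a 50-review sample) is returned unchanged when the function is later called with a different input (the full dataset). The sketch below shows one possible guard, assuming it is acceptable to derive the cache file name from the reviews themselves. `cache_path_for` and the toy reviews are hypothetical and not part of the notebook.

```python
import hashlib
import os
import pickle
import tempfile

def cache_path_for(data_train, data_test, cache_dir, prefix="preprocessed"):
    """Build a cache file name that depends on the input reviews."""
    digest = hashlib.sha256()
    for split in (data_train, data_test):
        digest.update(b"\x01")           # marks the start of a split
        for review in split:
            digest.update(review.encode("utf-8"))
            digest.update(b"\x00")       # marks the end of a review
    return os.path.join(cache_dir, "{}_{}.pkl".format(prefix, digest.hexdigest()[:16]))

with tempfile.TemporaryDirectory() as tmp_dir:
    # A small sample and a larger dataset map to different cache files
    sample_path = cache_path_for(["good film"], ["bad film"], tmp_dir)
    full_path = cache_path_for(["good film", "great cast"], ["bad film"], tmp_dir)

    # Writing the sample cache does not create a cache for the full data
    with open(sample_path, "wb") as f:
        pickle.dump({"words_train": [["good", "film"]]}, f)
    sample_cached = os.path.exists(sample_path)
    full_cached = os.path.exists(full_path)

print(sample_cached, full_cached)  # True False
```

A simpler alternative is to pass a distinct `cache_file` argument (or `cache_file=None`) whenever `preprocess_data` is called on a subset.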

In [72]:
total_train_X_data = len(train_X)
total_test_X_data = len(test_X)
total_train_Y_data = len(train_y)
total_test_Y_data = len(test_y)


print("total_train_X_data : ", total_train_X_data)
print("total_test_X_data : ", total_test_X_data)
print("total_train_Y_data : ", total_train_Y_data)
print("total_test_Y_data : ", total_test_Y_data)

total_train_X_data :  25000
total_test_X_data :  25000
total_train_Y_data :  25000
total_test_Y_data :  25000


In [76]:
array_target = 50

sample_train_X = train_X[:array_target]
sample_test_X = test_X[:array_target]
sample_train_y = train_y[:array_target]
sample_test_y = test_y[:array_target]

sample_train_X[0]

"Oliver Gruner is totally unknown to me. My friend showed me this film because he had seen Gruner in, what he called a pretty good sci-fi film, Nemesis. So as we watched this, we found ourselves fastforwarding through the BS drama parts just to get to the unbelievable action sequences. Gruner loves to kick and kick and kick. And kick! haha<br /><br />Gruner character is a graduate student who is forced to stay in a ghetto close to the one that he grew up in. He finds himself watching after the boy who lives with him because he really wants to join in the Mexican gang that keeps tormenting his family. Instead of joining up, Gruner tells the boy to fight back (against a gang? too crazy). Gruner plays a typical Van Damme character who kills everyone (or maims them pretty bad) and works to rid his block of these gangmembers.<br /><br />The plot was very cheesy and easy to think of. Gruner is probably not very well known because of his script-choosing if this movie is anything to compare po

In [77]:
result_sample_train_X, result_sample_test_X, result_sample_train_y, result_sample_test_y = preprocess_data(sample_train_X, sample_test_X, sample_train_y, sample_test_y)

labels_train :  [0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1]
labels_test :  [1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0]
 if cache_file is not None: 1
pass
if cache_data is None:
words :  ['oliv', 'gruner', 'total', 'unknown', 'friend', 'show', 'film', 'seen', 'gruner', 'call', 'pretti', 'good', 'sci', 'fi', 'film', 'nemesi', 'watch', 'found', 'fastforward', 'bs', 'drama', 'part', 'get', 'unbeliev', 'action', 'sequenc', 'gruner', 'love', 'kick', 'kick', 'kick', 'kick', 'hahagrun', 'charact', 'graduat', 'student', 'forc', 'stay', 'ghetto', 'close', 'one', 'grew', 'find', 'watch', 'boy', 'live', 'realli', 'want', 'join', 'mexican', 'gang', 'keep', 'torment', 'famili', 'instead', 'join', 'gruner', 'tell', 'boy', 'fight', 'back', 'gang', 'crazi', 'gruner', 'play', 'typic', 'van', '

words :  ['honesti', 'someon', 'told', 'director', 'lemoni', 'snicket', 'seri', 'unfortun', 'event', 'citi', 'angel', 'casper', 'go', 'neat', 'littl', 'low', 'budget', 'indi', 'film', 'real', 'good', 'say', 'person', 'must', 'joke', 'director', 'brad', 'siberl', 'realli', 'good', '10', 'item', 'less', 'similar', 'conceit', 'film', 'like', 'sunris', 'lost', 'translat', 'recent', 'involv', 'chanc', 'meet', 'two', 'peopl', 'serendip', 'put', 'probabl', 'never', 'cross', 'path', 'say', 'word', 'one', 'like', 'film', '10', 'item', 'less', 'focus', 'relationship', 'build', 'charact', 'come', 'understand', 'build', 'strength', 'weak', 'stori', 'involv', 'morgan', 'freeman', 'play', 'unnam', 'actor', 'goe', 'research', 'role', 'groceri', 'store', 'employe', 'upcom', 'independ', 'movi', 'thing', 'beyond', 'control', 'end', 'spend', 'day', 'ladi', '10', 'item', 'less', 'lane', 'play', 'paz', 'vega', 'rotten', 'marriag', 'hope', 'land', 'new', 'job', 'secretari', 'initi', 'freeman', 'charact', 'n

words :  ['film', 'artemisia', 'may', 'consid', 'treason', 'true', 'artist', 'licens', 'might', 'one', 'aver', 'document', 'histori', 'artemisia', 'gentileschi', 'subject', 'thumbscrew', 'still', 'affirm', 'r', 'ed', 'mari', 'garrard', 'gloria', 'steinem', 'eloqu', 'affirm', 'movi', 'differ', 'tortur', 'refus', 'condemn', 'lover', 'violat', 'may', 'movi', 'deviat', 'much', 'receiv', 'histori', 'yet', 'still', 'inform', 'human', 'heart', 'answer', 'hard', 'find', 'movi', 'director', 'cast', 'fill', 'gape', 'hole', 'histor', 'record', 'power', 'imagin', 'led', 'conclus', 'differ', 'record', 'find', 'record', 'movi', 'compel', 'movi', 'seem', 'histori', 'artemisia', 'painter', 'els', 'vision', 'frame', 'ravish', 'sic', 'film', 'composit', 'truli', 'grate', 'seldom', 'seen', 'movi', 'compel', 'eye', 'david', 'broadhurst']
words :  ['favorit', 'part', 'film', 'old', 'man', 'attempt', 'cure', 'neighbor', 'ill', 'put', 'strong', 'medicin', 'bath', 'sens', 'famili', 'sens', 'commun']
words :  

words :  ['movi', 'basic', 'human', 'relat', 'interact', 'main', 'charact', 'old', 'ladi', 'twilight', 'life', 'start', 'journey', 'past', 'analysi', 'live', 'life', 'journey', 'precipit', 'son', 'econom', 'crisi', 'intent', 'put', 'nurs', 'home', 'honest', 'look', 'issu', 'ask', 'point', 'life', 'plenti', 'secondari', 'idea', 'discuss', 'movi', 'famili', 'legaci', 'real', 'love', 'marriag', 'destini', 'although', 'type', 'movi', 'melodrama', 'noth', 'new', 'one', 'use', 'watch', 'famili', 'member', 'discuss', 'idea', 'good', 'perform', 'actor', 'charact', 'believ', 'time', 'charact', 'mayb', 'fulli', 'develop', 'realli', 'recommend', 'movi', 'quiet', 'saturday', 'afternoon']
words :  ['rip', '1984', 'hit', 'gremlin', 'quit', 'possibl', 'biggest', 'train', 'wreck', 'movi', 'ever', 'made', 'even', 'b', 'grade', 'movi', 'cheap', 'horror', 'movi', 'platform', 'complet', 'dwarf', 'movi', 'term', 'plot', 'act', 'good', 'begin', 'random', 'old', 'secur', 'guard', 'younger', 'punki', 'secur',

words :  ['decid', 'whether', 'one', 'favourit', 'movi', 'good', 'thriller', 'emot', 'core', 'still', 'decid', 'definit', 'like', 'first', 'movi', 'terri', 'gilliam', 'seen', 'first', 'impress', 'engag', 'till', 'end', 'complex', 'confus', 'movi', 'set', 'futur', 'man', 'jame', 'cole', 'bruce', 'willi', 'sent', 'futur', 'order', 'get', 'inform', 'past', '1996', 'specif', 'viru', 'kill', '5', 'billion', 'peopl', 'sent', 'futur', 'get', 'inform', 'also', 'involv', 'psychiatrist', 'call', 'kathryn', 'railli', 'love', 'stori', 'portray', 'beauti', 'realli', 'feel', 'long', 'love', 'long', 'regular', 'life', 'loos', 'end', 'tie', 'interest', 'manner', 'end', 'one', 'thing', 'like', 'movi', 'unlik', 'post', 'apocalypt', 'movi', 'movi', 'prefer', 'give', 'bore', 'social', 'commentari', 'instead', 'focus', 'one', 'guy', 'long', 'regular', 'life', 'want', 'see', 'ocean', 'especi', 'poignant', 'line', 'movi', 'choos', 'focu', 'tension', 'confus', 'person', 'mind', 'therefor', 'exactli', 'post', 

words :  ['scifi', 'past', 'weekend', 'check', 'scienc', 'fiction', 'vampir', 'erika', 'eleniak', 'could', 'go', 'wrong', 'b', 'movi', 'lot', 'start', 'even', 'classifi', 'b', 'movi', 'would', 'put', 'leagu', 'roger', 'corman', 'movi', 'even', 'meet', 'expect', 'money', 'spent', 'contact', 'lens', 'vampir', 'secondli', 'cast', 'horribl', 'ye', 'cast', 'udo', 'kier', 'captain', 'demet', 'smart', 'move', 'director', 'clearli', 'even', 'get', 'kier', 'memor', 'line', 'cast', 'eleniak', 'vampir', 'movi', 'also', 'smart', 'move', 'mean', 'bunch', 'horni', 'guy', 'go', 'buy', 'rent', 'record', 'flick', 'watch', 'get', 'seduc', 'vampir', 'director', 'writer', 'produc', 'screw', 'one', 'grant', 'got', 'money', 'poor', 'unfortu', 'soul', 'enjoy', 'watch', 'vampir', 'movi', 'hot', 'women', 'one', 'go', 'rememb', 'movi', 'anoth', 'two', 'three', 'year', 'thirdli', 'littl', 'thing', 'emphas', 'lazi', 'movi', 'exampl', 'van', 'hels', 'call', 'cross', 'crucifix', 'mina', 'stake', 'coffin', 'viewer',

words :  ['alway', 'fan', 'jackass', 'well', 'viva', 'la', 'bam', 'wildboyz', 'fan', 'someth', 'expect', 'high', 'whatev', 'hero', 'might', 'star', 'one', 'thing', 'learn', 'expect', 'lot', 'peopl', 'simpli', 'love', 'watch', 'listen', 'never', 'expect', 'much', 'caus', '99', '100', 'time', 'get', 'disappoint', 'although', 'heard', 'jackass', '2', 'come', 'thought', 'even', 'turn', 'expect', 'movi', 'result', 'sat', 'today', 'readi', 'laugh', 'also', 'readi', 'say', 'end', 'well', 'ok', 'littl', 'disappoint', 'wrong', 'everi', 'singl', 'member', 'jackass', 'crew', 'bring', 'movi', 'way', 'first', 'one', 'show', 'one', 'crazi', 'ass', 'stunt', 'make', 'whole', 'world', 'see', 'noth', 'wont', 'tri', 'harm', 'love', 'cri', 'eye', 'laugh', 'first', 'minut', 'till', 'last', 'second', 'movi', 'time', 'even', 'shout', 'laughter', 'abl', 'control', 'stunt', 'stunt', 'prank', 'prank', 'hilari', 'comment', 'flow', 'simpli', 'get', 'better', 'amaz', 'start', 'till', 'end', 'guarante', 'make', 'la

words :  ['georg', 'sander', 'play', 'saint', 'penultim', 'time', 'good', 'job', 'good', 'script', 'usual', 'good', 'rko', 'cast', 'around', 'non', 'charteri', 'stori', 'bristl', 'murder', 'good', 'clean', 'fun', 'thread', '1', 'new', 'york', 'polic', 'inspector', 'fernack', 'templar', 'friend', 'frame', 'corrupt', 'scandal', 'disgrac', 'st', 'come', 'london', 'tri', 'put', 'thing', 'right', 'nice', 'simpl', 'far', 'realli', '90', '000', 'world', 'thread', '2', 'anoth', 'tale', 'woman', 'take', 'reveng', 'peopl', 'murder', 'brother', 'wendi', 'barri', 'well', 'bump', 'nasti', 'men', 'saint', 'fall', 'love', 'boot', 'includ', 'baddi', 'direct', 'protect', 'fernack', 'cellar', 'creepi', 'shot', 'dead', 'stare', 'car', 'take', 'back', 'got', 'paul', 'guilfoyl', 'pearli', 'gate', 'must', 'suppos', 'homosexu', 'wit', 'dress', 'gown', 'begin', 'later', 'beguil', 'comment', 'st', 'think', 'keep', 'pet', 'palm', 'spring', 'fernack', 'play', 'jonathan', 'hale', 'usual', 'time', 'beaten', 'defla

words :  ['5', 'year', 'old', 'daughter', 'barbi', 'seri', 'movi', 'mix', 'feel', 'want', 'buy', 'whole', 'barbi', 'doll', 'imag', 'thing', 'recogn', 'movi', 'market', 'ploy', 'convinc', 'young', 'girl', 'buy', 'doll', 'make', 'money', 'mattel', 'morn', 'though', 'ask', 'watch', 'movi', 'lazi', 'saturday', 'morn', 'much', 'els', 'agre', 'know', 'movi', 'made', 'help', 'market', 'doll', 'seem', 'lose', 'appeal', 'bit', 'heard', 'doll', 'market', 'movi', 'like', 'bit', 'whichev', 'case', 'admit', 'somewhat', 'surpris', 'half', 'bad', 'fun', 'imagin', 'stori', 'full', 'magic', 'place', 'peopl', 'memor', 'charact', 'good', 'evil', 'essenti', 'annika', 'play', 'barbi', 'find', 'way', 'build', 'wand', 'light', 'revers', 'evil', 'spell', 'wizard', 'wenlock', 'among', 'thing', 'turn', 'sister', 'fli', 'hors', 'parent', 'stone', 'anim', 'pretti', 'good', 'disney', 'calibr', 'one', 'think', 'disney', 'standard', 'aspir', 'gener', 'pretti', 'good', 'movi', 'obvious', 'tailor', 'young', 'girl', 'r

In [None]:
result_sample_train_X

In [62]:
cache_file="preprocessed_data.pkl"
with open(os.path.join(cache_dir, cache_file), "rb") as f:
    cache_data = pickle.load(f)

In [63]:
cache_data

{'words_train': [['oliv',
   'gruner',
   'total',
   'unknown',
   'friend',
   'show',
   'film',
   'seen',
   'gruner',
   'call',
   'pretti',
   'good',
   'sci',
   'fi',
   'film',
   'nemesi',
   'watch',
   'found',
   'fastforward',
   'bs',
   'drama',
   'part',
   'get',
   'unbeliev',
   'action',
   'sequenc',
   'gruner',
   'love',
   'kick',
   'kick',
   'kick',
   'kick',
   'hahagrun',
   'charact',
   'graduat',
   'student',
   'forc',
   'stay',
   'ghetto',
   'close',
   'one',
   'grew',
   'find',
   'watch',
   'boy',
   'live',
   'realli',
   'want',
   'join',
   'mexican',
   'gang',
   'keep',
   'torment',
   'famili',
   'instead',
   'join',
   'gruner',
   'tell',
   'boy',
   'fight',
   'back',
   'gang',
   'crazi',
   'gruner',
   'play',
   'typic',
   'van',
   'damm',
   'charact',
   'kill',
   'everyon',
   'maim',
   'pretti',
   'bad',
   'work',
   'rid',
   'block',
   'gangmemb',
   'plot',
   'cheesi',
   'easi',
   'think',
   'gru

In [73]:
# Preprocess data
train_X, test_X, train_y, test_y = preprocess_data(train_X, test_X, train_y, test_y)

labels_train :  [0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0,

words :  ['haha', 'smile', 'smile', 'actual', 'made', 'way', 'movi', 'like', 'say', 'someth', 'guess', 'movi', 'creat', 'think', 'sort', 'psycholog', 'test', 'like', 'sort', 'drug', 'take', 'place', 'never', 'wittgenstein', 'wrote', 'famou', 'first', 'philosoph', 'piec', 'tractacu', 'sp', 'said', 'meaningless', 'useless', 'read', 'done', 'would', 'take', 'new', 'level', 'like', 'ladder', 'could', 'throw', 'away', 'work', 'see', 'thing', 'clariti', 'true', 'understand', 'movi', 'think', 'movi', 'without', 'doubt', 'worst', 'movi', 'seen', 'long', 'long', 'time', 'uniqu', 'way', 'first', 'snipe', 'love', 'watch', 'guy', 'kick', 'ass', 'variou', 'movi', 'suffer', 'weak', 'one', 'howev', 'although', 'know', 'movi', 'might', 'suck', 'would', 'never', 'suspect', 'could', 'bad', 'actual', 'fun', 'mean', 'snipe', 'know', 'might', 'good', 'alright', 'right', 'smile', 'thing', 'everi', 'level', 'pure', 'boredom', 'pure', 'unorigin', 'refer', 'profession', 'dead', 'obviou', 'yet', 'poorli', 'done

words :  ['honesti', 'someon', 'told', 'director', 'lemoni', 'snicket', 'seri', 'unfortun', 'event', 'citi', 'angel', 'casper', 'go', 'neat', 'littl', 'low', 'budget', 'indi', 'film', 'real', 'good', 'say', 'person', 'must', 'joke', 'director', 'brad', 'siberl', 'realli', 'good', '10', 'item', 'less', 'similar', 'conceit', 'film', 'like', 'sunris', 'lost', 'translat', 'recent', 'involv', 'chanc', 'meet', 'two', 'peopl', 'serendip', 'put', 'probabl', 'never', 'cross', 'path', 'say', 'word', 'one', 'like', 'film', '10', 'item', 'less', 'focus', 'relationship', 'build', 'charact', 'come', 'understand', 'build', 'strength', 'weak', 'stori', 'involv', 'morgan', 'freeman', 'play', 'unnam', 'actor', 'goe', 'research', 'role', 'groceri', 'store', 'employe', 'upcom', 'independ', 'movi', 'thing', 'beyond', 'control', 'end', 'spend', 'day', 'ladi', '10', 'item', 'less', 'lane', 'play', 'paz', 'vega', 'rotten', 'marriag', 'hope', 'land', 'new', 'job', 'secretari', 'initi', 'freeman', 'charact', 'n

words :  ['film', 'artemisia', 'may', 'consid', 'treason', 'true', 'artist', 'licens', 'might', 'one', 'aver', 'document', 'histori', 'artemisia', 'gentileschi', 'subject', 'thumbscrew', 'still', 'affirm', 'r', 'ed', 'mari', 'garrard', 'gloria', 'steinem', 'eloqu', 'affirm', 'movi', 'differ', 'tortur', 'refus', 'condemn', 'lover', 'violat', 'may', 'movi', 'deviat', 'much', 'receiv', 'histori', 'yet', 'still', 'inform', 'human', 'heart', 'answer', 'hard', 'find', 'movi', 'director', 'cast', 'fill', 'gape', 'hole', 'histor', 'record', 'power', 'imagin', 'led', 'conclus', 'differ', 'record', 'find', 'record', 'movi', 'compel', 'movi', 'seem', 'histori', 'artemisia', 'painter', 'els', 'vision', 'frame', 'ravish', 'sic', 'film', 'composit', 'truli', 'grate', 'seldom', 'seen', 'movi', 'compel', 'eye', 'david', 'broadhurst']
words :  ['favorit', 'part', 'film', 'old', 'man', 'attempt', 'cure', 'neighbor', 'ill', 'put', 'strong', 'medicin', 'bath', 'sens', 'famili', 'sens', 'commun']
words :  

words :  ['movi', 'basic', 'human', 'relat', 'interact', 'main', 'charact', 'old', 'ladi', 'twilight', 'life', 'start', 'journey', 'past', 'analysi', 'live', 'life', 'journey', 'precipit', 'son', 'econom', 'crisi', 'intent', 'put', 'nurs', 'home', 'honest', 'look', 'issu', 'ask', 'point', 'life', 'plenti', 'secondari', 'idea', 'discuss', 'movi', 'famili', 'legaci', 'real', 'love', 'marriag', 'destini', 'although', 'type', 'movi', 'melodrama', 'noth', 'new', 'one', 'use', 'watch', 'famili', 'member', 'discuss', 'idea', 'good', 'perform', 'actor', 'charact', 'believ', 'time', 'charact', 'mayb', 'fulli', 'develop', 'realli', 'recommend', 'movi', 'quiet', 'saturday', 'afternoon']
words :  ['rip', '1984', 'hit', 'gremlin', 'quit', 'possibl', 'biggest', 'train', 'wreck', 'movi', 'ever', 'made', 'even', 'b', 'grade', 'movi', 'cheap', 'horror', 'movi', 'platform', 'complet', 'dwarf', 'movi', 'term', 'plot', 'act', 'good', 'begin', 'random', 'old', 'secur', 'guard', 'younger', 'punki', 'secur',

words :  ['decid', 'whether', 'one', 'favourit', 'movi', 'good', 'thriller', 'emot', 'core', 'still', 'decid', 'definit', 'like', 'first', 'movi', 'terri', 'gilliam', 'seen', 'first', 'impress', 'engag', 'till', 'end', 'complex', 'confus', 'movi', 'set', 'futur', 'man', 'jame', 'cole', 'bruce', 'willi', 'sent', 'futur', 'order', 'get', 'inform', 'past', '1996', 'specif', 'viru', 'kill', '5', 'billion', 'peopl', 'sent', 'futur', 'get', 'inform', 'also', 'involv', 'psychiatrist', 'call', 'kathryn', 'railli', 'love', 'stori', 'portray', 'beauti', 'realli', 'feel', 'long', 'love', 'long', 'regular', 'life', 'loos', 'end', 'tie', 'interest', 'manner', 'end', 'one', 'thing', 'like', 'movi', 'unlik', 'post', 'apocalypt', 'movi', 'movi', 'prefer', 'give', 'bore', 'social', 'commentari', 'instead', 'focus', 'one', 'guy', 'long', 'regular', 'life', 'want', 'see', 'ocean', 'especi', 'poignant', 'line', 'movi', 'choos', 'focu', 'tension', 'confus', 'person', 'mind', 'therefor', 'exactli', 'post', 

words :  ['posh', 'spice', 'victoria', 'beckham', 'alleg', 'new', 'adventur', 'move', 'la', 'work', 'purpos', 'footbal', 'hubbi', 'david', 'galaxi', 'la', 'player', 'transfer', 'real', 'madrid', 'origin', 'go', 'full', 'seri', 'thank', 'abridg', 'one', 'hour', 'even', 'form', 'still', 'numbingli', 'intermin', 'like', 'virtual', 'realiti', 'tv', 'show', 'incid', 'come', 'across', 'blatantli', 'fake', 'programm', 'even', 'admit', 'posh', 'newli', 'appoint', 'person', 'assist', 'actress', 'ugli', 'betti', 'lookalik', 'hear', 'lame', 'written', 'perform', 'banter', 'earli', 'obviou', 'joke', 'beck', 'appar', 'dallianc', 'previou', 'rather', 'glamor', 'pa', 'rebecca', 'loo', 'though', 'name', 'mention', 'sequenc', 'involv', 'fake', 'blow', 'doll', 'trick', 'paparazzi', 'hopeless', 'attempt', 'pitch', 'basebal', 'could', 'entertain', 'acknowledg', 'piec', 'fluff', 'actress', 'imperson', 'lead', 'role', 'talent', 'impressionist', 'ronni', 'ancona', 'would', 'perfect', 'better', 'posh', 'posh'

words :  ['wonderland', 'fascin', 'film', 'chronicl', 'x', 'rate', 'film', 'star', 'john', 'c', 'holm', 'involv', 'brutal', 'wonderland', 'murder', 'movi', 'promot', 'mislead', 'one', 'think', 'romantic', 'portray', 'porn', 'industri', 'vein', 'boogi', 'night', 'case', 'fact', 'except', 'refer', 'made', 'newscast', 'john', 'holm', 'porn', 'star', 'brief', 'montag', 'real', 'life', 'footag', 'john', 'holm', 'film', 'strictli', 'drama', 'fallen', 'celebr', 'involv', 'murder', 'happen', 'despit', 'mislead', 'film', 'actual', 'engag', 'act', 'cast', 'excel', 'like', 'say', 'val', 'kilmer', 'amaz', 'abil', 'get', 'manner', 'john', 'holm', 'complet', 'convinc', 'watch', 'john', 'c', 'holm', 'probabl', 'look', 'act', 'like', 'real', 'life', 'john', 'c', 'holm', 'fan', 'like', 'stori', 'hollywood', 'think', 'enjoy', 'watch', 'wonderland']
words :  ['movi', 'demonstr', 'mood', 'music', 'textur', 'enough', 'make', 'good', 'film', 'sure', 'viewer', 'treat', 'numer', 'fine', 'scene', 'lo', 'angel'

words :  ['deliri', 'romant', 'comedi', 'intertwin', 'subplot', 'mesh', 'beauti', 'actor', 'bounc', 'line', 'precis', 'comic', 'time', 'feat', 'beauti', 'behold', 'cher', 'spineless', 'fianc', 'ask', 'help', 'make', 'peac', 'estrang', 'moodi', 'younger', 'brother', 'one', 'could', 'dream', 'consequ', 'follow', 'operat', 'symbol', 'cathol', 'church', 'confess', 'love', 'bite', 'fall', 'snow', 'moonstruck', 'timeless', 'smooth', 'take', '15', 'minut', 'pictur', 'rhythm', 'kick', 'earli', 'sequenc', 'grandfath', 'dog', 'cemeteri', 'littl', 'rough', 'follow', 'scene', 'cosmo', 'elderli', 'man', 'gate', 'seem', 'obtus', 'patchwork', 'plot', 'interwoven', 'nimbl', 'skill', 'movi', 'wobbl', 'tone', 'kooki', 'spirit', 'infecti', '1', '2']
words :  ['realli', 'well', 'made', 'movi', 'sumitra', 'bhave', 'alway', 'made', 'sensibl', 'cinema', 'favourit', 'film', 'movi', 'nation', 'award', 'would', 'pick', 'repres', 'india', 'oscar', 'least', 'thousand', 'time', 'better', 'shaaw', 'go', 'oscar', 'i

words :  ['clara', 'bow', 'hula', 'calhoun', 'daughter', 'plantat', 'owner', 'albert', 'gran', 'bill', 'calhoun', 'mainli', 'interest', 'play', 'card', 'booz', 'friend', 'interest', 'ride', 'countrysid', 'engin', 'clive', 'brook', 'anthoni', 'haldan', 'show', 'build', 'dam', 'one', 'father', 'friend', 'arlett', 'marchal', 'mr', 'bane', 'compet', 'attent', 'wife', 'maud', 'truax', 'margaret', 'haldan', 'show', 'contriv', 'final', 'lot', 'pre', 'code', 'element', 'like', 'nude', 'bath', 'wonder', 'locat', 'shoot', 'hawaii']
words :  ['repetit', 'music', 'annoy', 'narrat', 'terribl', 'cinematographi', 'effect', 'half', 'plot', 'seem', 'center', 'around', 'shock', 'valu', 'half', 'seem', 'focus', 'appeas', 'type', 'crowd', 'would', 'nag', 'peopl', 'start', 'fight', 'one', 'best', 'scene', 'delet', 'scene', 'section', 'one', 'principl', 'offic', 'mom', 'understand', 'cut', 'movi', 'seem', 'desper', 'make', 'point', 'anyth', 'could', 'domino', 'talk', 'soror', 'would', 'highlight', 'movi', '

words :  ['well', 'gave', 'away', '95', 'minut', '47', 'second', 'never', 'get', 'back', 'piec', 'trash', 'heard', 'someon', 'onlin', 'describ', 'movi', 'villain', 'subhuman', 'cannib', 'thought', 'promis', 'thought', 'would', 'like', 'descent', 'wrong', 'descent', 'psycholog', 'thriller', 'dynam', 'charact', 'strong', 'storylin', 'villain', 'total', 'unrealist', 'part', 'perform', 'enjoy', 'watch', 'movi', 'controversi', 'seen', 'level', 'gore', 'mani', 'film', 'movi', 'plain', 'suck', 'synopsi', 'blond', 'think', 'real', 'hot', 'admir', 'admir', 'friend', 'rememb', 'name', 'go', 'wood', 'car', 'break', 'warn', 'leav', 'man', 'name', 'mark', 'blond', 'get', 'unreason', 'hyster', 'next', 'morn', 'find', 'admir', 'friend', 'admir', 'impal', 'foot', 'whoop', 'worri', 'much', 'upset', 'car', 'start', 'get', 'impal', 'nail', 'nanosecond', 'coax', 'blond', 'leav', 'find', 'help', 'event', 'ensu', 'cannot', 'rememb', 'throughout', 'movi', 'shown', 'grotesqu', 'tortur', 'scene', 'substanc', '

words :  ['come', 'clean', 'reason', 'even', 'found', 'dvd', 'domin', 'monaghan', 'favorit', 'actor', 'mine', 'heard', 'titl', 'film', 'thought', 'go', 'differ', 'perhap', 'good', 'way', 'wrong', 'read', 'review', 'short', 'actual', 'excit', 'see', 'sent', 'copi', 'soon', 'abl', 'receiv', 'week', 'later', 'needless', 'say', 'disappoint', 'film', 'follow', 'jack', 'insomniac', 'often', 'plagu', 'condit', 'caus', 'doubt', 'realiti', 'head', 'give', 'away', 'happen', 'tell', 'film', 'sometim', 'frighten', 'realism', 'direct', 'fantast', 'focus', 'essenti', 'stori', 'without', 'allow', 'lose', 'entertain', 'thought', 'provok', 'moment', 'give', 'great', 'film', '9', '10', 'go', 'far', 'beyond', 'thought', 'short', 'could', 'achiev']
words :  ['aw', 'simpli', 'aw', 'prove', 'theori', 'star', 'power', 'suppos', 'great', 'tv', 'guy', 'direct', 'battlestar', 'titanica', 'guy', 'direct', 'shlop', 'schtock', 'schtick', 'chick', 'b', 'r', 'n', 'g', 'find', 'someth', 'thousand', 'time', 'interest'

words :  ['popular', 'radio', 'storytel', 'gabriel', 'one', 'robin', 'william', 'scraggi', 'speak', 'hush', 'hypnot', 'tone', 'becom', 'acquaint', 'friend', 'fourteen', 'year', 'old', 'boy', 'wisconsin', 'name', 'pete', 'logand', 'rori', 'culkin', 'written', 'book', 'detail', 'sexual', 'abus', 'parent', 'boot', 'pete', 'aid', 'compel', 'gabriel', 'still', 'sinc', 'partner', 'jess', 'bobbi', 'cannaval', 'good', 'happen', 'survivor', 'hiv', 'also', 'acquaint', 'pete', 'guardian', 'woman', 'name', 'donna', 'toni', 'collett', 'brilliant', 'gabriel', 'decid', 'want', 'meet', 'talk', 'two', 'person', 'goe', 'wisconsin', 'discov', 'secret', 'natur', 'prepar', 'find', 'base', 'real', 'event', 'happen', 'armistead', 'maupin', 'co', 'wrote', 'screenplay', 'terri', 'anderson', 'direct', 'patrick', 'stetner', 'film', 'move', 'lot', 'faster', '90', 'min', 'mayb', 'minut', 'longer', 'one', 'might', 'think', 'movi', 'genr', 'would', 'run', 'good', 'keep', 'action', 'storylin', 'lean', 'clear', 'bad',

words :  ['saw', 'pictur', '1940', '11', 'would', 'like', 'secur', 'dvd', '2006', 'film', 'greatest', 'adventur', 'time', 'like', 'epic', 'still', 'entertain', 'marvel', 'b', 'w', 'get', 'sens', 'real', 'bond', 'friendship', 'chemistri', 'actor', 'perform', 'sam', 'jaff', 'eduardo', 'cianelli', 'outstand', 'could', 'done', 'today', 'particularli', 'like', 'end', 'colonel', 'recit', 'end', 'kipl', 'poem', 'bodi', 'gunga', 'din', 'tell', 'untouch', 'better', 'man', 'gunga', 'din', 'make', 'movi', 'charact', 'today', 'cast', 'member', 'still', 'aliv', 'today', 'joan', 'fontain']
words :  ['movi', 'start', 'hilari', '15', 'second', 'mark', 'continu', 'throughout', 'movi', 'cannot', 'recal', 'scene', 'turn', 'look', 'peopl', 'laugh', 'perfect', 'actor', 'roll', 'way', 'look', 'way', 'dress', 'comed', 'part', 'great', 'see', 'actor', 'big', 'popular', 'see', 'peopl', 'like', 'movi', 'current', 'rate', '7', '9', 'imdb', 'think', '250', 'let', 'put', 'way', 'seen', 'funni', 'movi', 'sinc', 'am

words :  ['horror', 'fan', 'speak', '12', 'although', '12', 'apologis', 'might', 'deem', 'insult', 'short', 'appreci', 'imagin', 'disturb', 'well', 'written', 'origin', 'storytel', 'punctuat', 'unpredict', 'well', 'plant', 'scare', 'deliv', 'via', 'convinc', 'perform', 'heartili', 'recommend', 'avoid', 'steamer', 'made', 'director', 'appar', 'long', 'sinc', 'past', 'sell', 'date', 'accid', 'almost', 'everi', 'episod', 'feel', 'made', '1980', 'put', 'blame', 'squar', 'shoulder', 'old', 'boy', 'inde', '80', 'would', 'without', 'certain', 'movi', 'like', 'argento', 'carpent', 'landi', 'dant', 'barker', 'actual', 'clive', 'wtf', 'glad', 'see', 'romero', 'good', 'sens', 'give', 'miss', 'sure', 'ask', 'partak', 'perhap', 'point', 'finger', 'creator', 'mick', 'garri', 'whose', 'credenti', 'includ', 'logic', 'defi', 'depressingli', 'ill', 'advis', 'tv', 'remak', 'stanley', 'kubrick', 'masterpiec', 'shine', 'perhap', 'indic', 'state', 'televis', 'today', 'starv', 'good', 'tv', 'horror', 'applau

words :  ['parasomnia', 'interest', 'premis', 'stori', 'poorli', 'done', 'without', 'tension', 'even', 'logic', 'approach', 'cast', 'unconvinc', 'even', 'patrick', 'kilpatrick', 'play', 'great', 'role', 'movi', 'like', 'scanner', 'cop', '2', 'open', 'fire', 'sieg', '2', 'eras', 'rest', 'cast', 'unknown', 'good', 'except', 'jeffrey', 'comb', 'herbert', 'west', 'great', 'anim', 'trilog', 'play', 'role', 'like', 'sleep', 'littl', 'main', 'problem', 'action', 'charact', 'make', 'sens', 'stori', 'rather', 'dull', 'predict', 'cheap', 'comput', 'effect', 'mix', 'gori', 'scene', 'especi', 'end', 'could', 'much', 'better', 'get', 'good', 'review', 'one', 'averag', 'realli']
words :  ['made', 'french', 'brother', 'jule', 'giddeon', 'naudet', 'narrat', 'robert', 'de', 'niro', 'firefight', 'jame', 'hanlon', 'compel', 'heartbreak', 'tale', 'new', 'york', 'finest', 'shone', 'darkest', 'day', 'first', 'saw', 'young', 'naiv', '12', 'year', 'old', 'age', 'still', 'touch', 'know', 'seriou', '9', '11', '

words :  ['mysteri', 'scienc', 'theatr', '3000', 'fan', 'withstand', 'motion', 'pictur', 'foist', 'upon', 'absolut', 'reason', 'rate', 'super', 'action', 'blockbust', 'video', 'section', 'given', 'dread', 'restrict', 'view', 'sticker', 'assum', 'method', 'film', 'maker', 'ha', 'robert', 'napton', 'could', 'use', 'get', 'least', '4', '50', 'one', 'unsuspect', 'person', 'shame', 'robert', 'napton', 'shame', 'exploit', 'poor', 'mexican', 'actor', 'probabl', 'promis', 'hope', 'make', 'big', 'american', 'cinema', 'disgrac', 'one', 'moment', 'movi', 'hold', 'slightest', 'bit', 'action', 'use', 'snot', 'peopl', 'oh', 'look', 'rave', 'field', 'like', '6', 'asian', 'guy', 'background', 'alway', 'daytim', 'take', '1', '2', 'movi', 'show', 'anyth', 'importantli', 'watch', '1', '2', 'ps', 'owe', '4', '50']
words :  ['love', 'tudor', 'chirila', 'mayb', 'enjoy', 'movi', 'much', 'two', 'day', 'movi', 'premier', 'went', 'see', 'concert', 'saw', 'trailer', 'video', 'zmeu', 'movi', 'thought', 'figur', '

words :  ['go', 'say', 'first', 'given', 'film', '3', '10', 'thought', 'go', 'give', 'straight', '1', 'got', 'coupl', 'extra', 'point', 'bodi', 'count', 'would', 'let', 'explain', 'paid', 'liter', '1', 'dvd', 'supermarket', 'tend', 'lot', 'faith', 'bargain', 'horror', 'flick', 'b', 'movi', 'especi', 'film', 'aim', 'b', 'statu', 'suspect', 'number', 'reason', 'touch', 'sec', 'fail', 'magnific', 'shoot', 'b', 'miss', 'land', 'somewher', 'around', 'f', 'film', 'mani', 'opportun', 'good', 'pretti', 'much', 'fail', 'account', 'say', 'like', 'film', 'aim', 'b', 'statu', 'seem', 'tri', 'achiev', 'tri', 'blend', 'humour', 'horror', 'either', 'good', 'bad', 'exampl', 'later', 'freddi', 'film', 'dream', 'warrior', 'onward', 'freddi', 'style', 'nose', 'thumb', 'work', 'great', 'film', 'complet', 'bomb', 'respect', 'time', 'tri', 'inject', 'humour', 'mostli', 'stupid', 'admit', 'though', 'toward', 'begin', 'film', 'humour', 'good', 'fact', 'half', 'hour', 'like', 'film', 'prepar', 'congratul', 'an

words :  ['show', 'look', 'like', 'show', 'type', 'mid', '90', 'thing', 'one', 'differ', 'use', 'lot', 'comedi', 'action', 'one', 'mayb', 'littl', 'bit', 'drama', 'person', 'thought', 'good', 'show', 'understand', 'would', 'cancel', 'good', 'thing', 'fan', 'base', 'show', 'still', 'aliv', 'ever', 'sinc', '1997', 'date', 'hope', 'wb', 'bring', 'back', 'show', 'even', 'movi', 'know', 'gonna', 'imposs', 'hey', 'hurt', 'dream', 'anyway', 'would', 'recommend', 'seen', 'find', 'dvd', '13', 'episod', 'charact', 'great', 'stori', 'line', 'good', 'comedi', 'good', 'well', 'whole', 'show', 'great']
words :  ['come', 'shortli', 'imposit', 'moral', 'code', 'darken', 'spirit', 'writer', 'director', 'actor', 'first', 'film', 'adapt', 'w', 'somerset', 'maugham', 'human', 'bondag', 'titil', 'countless', 'moviego', 'shock', 'valu', 'today', 'fine', 'act', 'cast', 'excel', 'bett', 'davi', 'first', 'great', 'role', 'one', 'lesli', 'howard', 'best', 'perform', 'howard', 'english', 'wannab', 'parisian', 'a

words :  ['journey', 'center', 'earth', 'stori', 'tourist', 'hawaii', 'three', 'sibl', 'one', 'young', 'british', 'nanni', 'babysit', 'dog', 'sibl', 'accident', 'drive', 'jeep', 'basket', 'dog', 'biscuit', 'nanni', 'follow', 'might', 'safer', 'purchas', 'way', 'cave', 'sibl', 'intend', 'explor', 'guess', 'reason', 'actual', 'go', 'cave', 'place', 'start', 'cave', 'tri', 'get', 'avail', 'except', 'six', 'year', 'old', 'sister', 'tell', 'go', 'get', 'help', 'meanwhil', 'move', 'around', 'cave', 'continu', 'plummet', 'toward', 'earth', 'cavern', 'core', 'behold', 'find', 'citi', 'atlanti', 'bizarr', 'alien', 'habit', 'live', 'oppress', 'rule', 'one', 'alien', 'want', 'ask', 'mani', 'question', 'world', 'extern', 'see', 'rusti', 'lemorand', 'name', 'director', 'film', 'provid', 'comment', 'film', 'explain', 'part', 'latter', 'half', 'film', 'actual', 'sequel', 'alien', 'l', 'well', 'whatev', 'amazingli', 'cheap', 'movi', 'would', 'rank', 'slightli', 'higher', 'citi', 'limit', '1988', 'sci'

words :  ['robert', 'siodmak', 'fabul', 'job', 'b', 'noir', 'star', 'ella', 'rain', 'franchot', 'tone', 'alan', 'curti', 'might', 'add', 'without', 'lot', 'help', 'male', 'actor', 'e', 'curti', 'tone', 'rain', 'way', 'pretti', 'leggi', 'actress', 'one', 'reason', 'anoth', 'never', 'reach', 'statu', 'noir', 'counterpart', 'siodmak', 'use', 'sex', 'light', 'shadow', 'music', 'truli', 'remark', 'tackl', 'genr', 'shadow', 'light', 'effect', 'camera', 'angl', 'effect', 'highlight', 'film', 'take', 'place', 'nightclub', 'sexual', 'drum', 'riff', 'elisha', 'cook', 'eg', 'excit', 'rain', 'scene', 'bring', 'phantom', 'ladi', 'new', 'territori', 'siodmak', 'commit', 'materi', 'match', 'rain', 'give', 'sincer', 'perform', 'woman', 'love', 'tri', 'save', 'man', 'franchot', 'tone', 'phone', 'one', 'alan', 'curti', 'seem', 'upset', 'might', 'die', 'seem', 'happi', 'live', 'never', 'except', 'brief', 'moment', 'prison', 'seem', 'love', 'rain', 'amus', 'thing', 'mani', 'film', 'world', 'war', 'ii', 'p

words :  ['previous', 'wrote', 'love', 'titan', 'cri', 'end', 'mani', 'time', 'guy', '60', 'also', 'wonder', 'great', 'movi', 'mani', 'award', 'applaud', 'mani', 'critic', 'given', '7', '0', 'rate', 'imdb', 'com', 'user', 'well', 'look', 'breakdown', 'user', 'rate', '29', '0', 'vote', 'gave', '10', 'rate', '10', '7', 'gave', '1', 'rate', '10', '7', 'irrat', 'imdb', 'user', 'effect', 'pull', 'overal', 'rate', '7', '0', 'previou', 'comment', 'blame', 'unusu', 'vote', 'pattern', 'sudden', 'surg', '1', 'rate', 'high', '10', 'rate', 'drop', 'gradual', 'suddenli', 'revers', 'cours', 'jump', '1', 'rate', 'level', 'one', 'thing', 'hatr', 'leonardo', 'dicaprio', 'believ', 'tune', 'enough', 'chat', 'room', 'see', 'banter', 'young', 'peopl', 'young', 'men', 'mostli', 'defam', 'left', 'right', 'absolut', 'hate', 'man', 'part', 'give', 'credit', 'titan', 'answer', 'one', 'user', 'talk', 'someon', 'realli', 'like', 'movi', 'much', 'gave', '5', '6', 'etc', 'everyon', 'entitl', 'tast', 'one', 'convinc

words :  ['vijay', 'krishna', 'acharya', 'tashan', 'hype', 'styliz', 'product', 'sure', 'one', 'stylish', 'film', 'come', 'content', 'even', 'mass', 'reject', 'one', 'film', 'script', 'amateur', '2', 'year', 'old', 'babi', 'script', 'king', 'without', 'good', 'script', 'even', 'greatest', 'director', 'time', 'cannot', 'anyth', 'tashan', 'produc', 'success', 'product', 'banner', 'yash', 'raj', 'film', 'mega', 'star', 'appear', 'noth', 'earth', 'save', 'script', 'bland', 'thumb', 'perform', 'anil', 'kapoor', 'veteran', 'actor', 'could', 'okay', 'role', 'like', 'akshay', 'kumar', 'great', 'actor', 'fact', 'sole', 'save', 'grace', 'kareena', 'kapoor', 'never', 'look', 'hot', 'look', 'stun', 'leav', 'stand', 'saif', 'ali', 'khan', 'get', 'due', 'sanjay', 'mishra', 'manoj', 'phawa', 'yashpal', 'sharma', 'wast', 'tashan', 'bore', 'film', 'film', 'failur', 'box', 'offic', 'keep', 'away']
words :  ['whoever', 'play', 'game', 'video', 'game', 'anybodi', 'notic', 'gta', 'vice', 'citi', 'mansion',

words :  ['probabl', 'best', 'televis', 'show', 'ever', 'seen', 'first', 'saw', 'comedi', 'central', 'sever', 'year', 'ago', 'time', 'unawar', 'dramat', 'edit', 'shown', 'order', 'watch', 'three', 'seri', 'order', 'unedit', 'thank', 'internet', 'wondrou', 'seri', 'tube', 'glad', 'rediscov', 'think', 'comedi', 'central', 'sort', 'pick', 'chose', 'way', 'seri', 'one', 'two', 'make', 'season', 'tri', 'get', 'friend', 'famili', 'watch', 'nobodi', 'realli', 'seem', 'like', 'need', 'new', 'friend', 'made', 'best', 'could', 'even', 'felt', 'like', 'wane', 'bit', 'still', 'felt', 'compel', 'continu', 'watch', 'year', 'discov', 'littl', 'britain', 'immedi', 'recogn', 'paulin', 'log', 'influenc', 'marjori', 'fat', 'fighter', 'also', 'love', 'idea', 'writer', 'act', 'entir', 'show', 'new', 'done', 'impecc', 'lb', 'noth', 'log', 'offens', 'matt', 'david', 'love', 'inde', 'darkli', 'comed', 'piec', 'geniu', 'serial', 'murder', 'impli', 'cannib', 'name', 'probabl', 'found', 'wonder', 'uniqu', 'piec'

words :  ['look', 'forward', 'ride', 'horribl', 'disappoint', 'easili', 'amus', 'roller', 'coaster', 'amus', 'park', 'ride', 'roller', 'coaster', 'part', 'okay', '30', 'second', '90', 'second', 'ride', 'visual', 'dull', 'poorli', 'execut', 'tri', 'desper', 'like', 'mixtur', 'far', 'superior', 'indiana', 'jone', 'space', 'mountain', 'ride', 'disneyland', 'fail', 'everi', 'aspect', 'thrill', 'excit', 'least']
words :  ['david', 'lynch', 'new', 'short', 'lynchian', 'piec', 'full', 'dark', 'tension', 'silenc', 'discreet', 'textur', 'background', 'music', 'featur', 'two', 'beauti', 'actress', 'blond', 'brunett', 'recurr', 'theme', 'work', 'charact', 'creat', 'intrigu', 'slave', 'mistress', 'relationship', 'could', 'seen', 'direct', 'follow', 'kind', 'relationship', 'featur', 'mulholland', 'dr', 'beauti', 'lynch', 'fan']
words :  ['ah', 'ye', 'vs', 'seri', 'mvc2', 'pinnacl', 'said', 'get', 'half', 'crew', 'fell', 'asleep', 'job', 'unfortun', 'gameplay', 'half', 'get', 'wrong', 'fun', 'get', 

words :  ['noth', 'uniqu', 'either', 'tv', 'seri', 'movi', 'prequel', 'tv', 'show', 'found', 'everywher', 'els', 'life', 'entertain', 'david', 'lynch', 'disgust', 'style', 'stori', 'tell', 'moment', 'bodi', 'poor', 'misguid', 'girl', 'wash', 'beach', 'introduc', 'mind', 'numb', 'shadi', 'immor', 'charact', 'twin', 'peak', 'mind', 'numb', 'almost', 'pedophilia', 'disgust', 'way', 'movi', 'seem', 'romant', 'tell', 'destruct', 'human', 'life', 'random', 'psychedel', 'phenomena', 'movi', 'twin', 'peak', 'fire', 'come', 'walk', 'watch', 'make', 'sure', 'miss', 'anyth', 'simpli', 'one', 'man', 'obviou', 'sexual', 'fetish', 'extend', 'long', 'seri', 'fallow', 'ridicul', 'overli', 'pornograph', 'movi', 'save', 'self', 'agoni', 'suspens', 'watch', 'anyth', 'els', 'least', 'abil', 'tell', 'stori', 'rather', 'seduc', 'kind', 'mental', 'porn', 'movi', 'heard', 'lot', 'review', 'rant', 'rave', 'great', 'david', 'lynch', 'abil', 'defin', 'miseri', 'tragedi', 'make', 'kind', 'wonder', 'thing', 'life'

words :  ['pet', 'sematari', 'succe', 'two', 'major', 'situat', 'first', 'scari', 'horror', 'movi', 'produc', 'day', 'second', 'emot', 'clever', 'movi', 'overal', 'look', 'chill', 'scare', 'creepi', 'visual', 'stun', 'set', 'great', 'act', 'dialong', 'gruesom', 'effect', 'movi', 'look', 'classic', 'truli', 'must', 'see', 'horror', 'fan', 'probabl', 'best', 'adapt', 'king', 'novel', 'event', 'feel', 'littl', 'rush', 'compar', 'novel', 'mean', 'underr', 'movi', 'complet', 'horror', 'drama', 'accomplish', 'stephen', 'king', 'novel', 'wide', 'known', 'emot', 'gruesom', 'time', 'movi', 'captur', 'feel', 'mainli', 'great', 'charact', 'develop', 'feel', 'love', 'relationship', 'member', 'everyth', 'seem', 'happi', 'technic', 'happi', 'titl', 'pet', 'sematari', 'offer', 'appi', 'tragic', 'event', 'chang', 'movi', 'atmospher', 'turn', 'dark', 'movi', 'sinist', 'feel', 'sinc', 'open', 'credit', 'gage', 'kill', 'movi', 'becom', 'sad', 'gray', 'creepi', 'deal', 'loss', 'babi', 'son', 'someth', 'ru

words :  ['movi', 'basic', 'girl', 'cathol', 'school', 'end', 'get', 'troubl', 'put', 'red', 'dye', 'one', 'one', 'school', 'mate', 'shampoo', 'reprimand', 'act', 'decid', 'take', 'florida', 'vacat', 'way', 'meet', 'guy', 'local', 'diner', 'decid', 'would', 'meet', 'anoth', 'locat', 'later', 'girl', 'end', 'road', 'side', 'near', 'wood', 'stop', 'awhil', 'one', 'girl', 'decid', 'walk', 'around', 'bit', 'see', 'murder', 'happen', 'local', 'sheriff', 'involv', 'becom', 'scare', 'run', 'tell', 'other', 'happen', 'girl', 'decid', 'go', 'take', 'look', 'two', 'get', 'kill', 'killer', 'two', 'remain', 'girl', 'caught', 'killer', 'place', 'local', 'jail', 'cell', 'deputi', 'sheriff', 'meanwhil', 'keep', 'watch', 'girl', 'despit', 'insist', 'sheriff', 'killer', 'ignor', 'act', 'ignor', 'everybodi', 'els', 'movi', 'put', 'two', 'two', 'togeth', 'much', 'less', 'lousi', 'detect', 'work', 'best', 'part', 'rape', 'scene', 'killer', 'one', 'girl', 'decid', 'rape', 'jail', 'cell', 'seem', 'girl', 'a

words :  ['thank', 'jymn', 'magon', 'creat', 'disney', '2', 'best', 'cartoon', 'ever', 'show', 'improv', 'much', 'year', 'kid', 'like', 'thought', 'rip', 'ducktal', 'favorit', 'disney', 'thing', 'time', 'like', 'grandmoffromero', 'later', 'though', 'good', 'great', 'read', 'review', 'decid', 'give', 'anoth', 'chanc', 'bought', 'dvd', 'set', 'watch', 'whole', 'pilot', 'first', 'day', 'got', 'pleasantli', 'surpris', 'still', 'favorit', 'episod', 'although', 'seri', 'live', 'end', 'disc', '1', 'knew', 'go', 'top', 'tenner', 'charact', 'complex', 'charm', 'favorit', 'got', 'wildcat', 'absolut', 'hilari', 'sweet', 'boot', 'next', 'favorit', 'baloo', 'best', 'pilot', 'show', 'see', 'ol', 'jymn', 'built', 'show', 'around', 'kit', 'cloudkick', 'baloo', 'best', 'relationship', 'seri', 'louie', 'jim', 'cum', 'perfect', 'job', 'imperson', 'origin', 'voic', 'rebecca', 'made', 'laugh', 'pretti', 'hard', 'believ', 'baloo', 'eventu', 'marri', 'final', 'hero', 'molli', 'although', 'least', 'favorit', 

words :  ['night', 'bachelor', 'parti', 'paul', 'coleman', 'jason', 'lee', 'meet', 'gorgeou', 'dancer', 'becki', 'julia', 'stile', 'bar', 'drink', 'lot', 'togeth', 'next', 'morn', 'wake', 'bed', 'futur', 'mother', 'law', 'call', 'inform', 'fianc', 'e', 'karen', 'selma', 'blair', 'might', 'arriv', 'apart', 'desper', 'ask', 'becki', 'leav', 'place', 'hurri', 'sooner', 'find', 'crab', 'later', 'prepar', 'wed', 'dinner', 'parti', 'realiz', 'becki', 'cousin', 'karen', 'begin', 'funni', 'comedi', 'hilari', 'situat', 'first', 'attract', 'movi', 'certainli', 'central', 'trio', 'actress', 'actor', 'julia', 'stile', 'selma', 'blair', 'excel', 'actress', 'extrem', 'gorgeou', 'jason', 'lee', 'amazingli', 'funni', 'good', 'perform', 'laugh', 'lot', 'along', 'stori', 'scene', 'realli', 'hilari', 'exampl', 'paul', 'find', 'becki', 'bed', 'find', 'paint', 'imagin', 'mani', 'situat', 'drugstor', 'tri', 'buy', 'get', 'explan', 'crab', 'medicin', 'scene', 'neighbor', 'minist', 'karen', 'call', 'depart', 

words :  ['satir', 'commerci', 'lighthearted', 'war', 'john', 'cusack', 'play', 'brand', 'hauser', 'assassin', 'sent', 'turaqistan', 'take', 'omar', 'sharif', 'oil', 'busi', 'spell', 'troubl', 'former', 'vice', 'presid', 'us', 'compani', 'addit', 'hauser', 'must', 'juggl', 'fake', 'posit', 'trade', 'show', 'produc', 'wed', 'pop', 'princess', 'yonica', 'hillari', 'duff', 'nosi', 'liber', 'journalist', 'natali', 'marisa', 'tomei', 'assess', 'technic', 'aspect', 'act', 'main', 'charact', 'least', 'good', 'expect', 'john', 'cusack', 'dialogu', 'quit', 'obvious', 'written', 'often', 'seem', 'uncomfort', 'say', 'mayb', 'unrealist', 'accur', 'joan', 'put', 'forth', 'great', 'often', 'hilari', 'perform', 'marisa', 'tomei', 'never', 'big', 'fan', 'suitabl', 'role', 'work', 'well', 'hillari', 'duff', 'howev', 'pretti', 'terribl', 'need', 'attract', 'middl', 'eastern', 'russian', 'whatev', 'accent', 'suppos', 'pop', 'star', 'unfortun', 'went', '0', '3', 'like', 'said', 'write', 'seem', 'littl', '

words :  ['damsel', 'distress', 'delight', 'great', 'gershwin', 'song', 'fred', 'astair', 'joan', 'fontain', 'terrif', 'support', 'cast', 'head', 'graci', 'allen', 'georg', 'burn', 'typic', 'silli', 'plot', 'astair', 'film', 'american', 'danc', 'star', 'england', 'burn', 'publicist', 'allen', 'secretari', 'concoct', 'stori', 'love', 'bug', 'women', 'fall', 'victim', 'left', 'right', 'run', 'fontain', 'held', 'captiv', 'castl', 'domin', 'aunt', 'docil', 'father', 'silli', 'plot', 'great', 'song', 'includ', 'foggi', 'day', 'thing', 'look', 'nice', 'work', 'get', 'bother', 'fontain', 'sing', 'brief', 'decent', 'number', 'astair', 'surprisingli', 'good', 'danc', 'number', 'astair', 'burn', 'allen', 'includ', 'invent', 'fun', 'romp', 'amus', 'park', 'also', 'cast', 'reginald', 'gardin', 'constanc', 'collier', 'montagu', 'love', 'harri', 'watson', 'albert', 'ray', 'nobl', 'favorit', 'jan', 'duggan', 'lead', 'madrig', 'singer', 'jan', 'duggan', 'middl', 'swooni', 'trio', 'sing', 'nice', 'work

words :  ['utter', 'crap', 'pretti', 'well', 'sum', 'movi', 'rather', 'examin', 'colon', 'african', 'eleph', 'penlight', 'sit', 'think', 'wast', 'enough', 'time', 'watch', 'movi', 'need', 'wast', 'comment']
words :  ['uk', 'newspap', 'review', 'seem', 'concentr', 'fact', 'review', 'tend', 'know', 'tobi', 'young', 'journalist', 'whose', 'real', 'life', 'experi', 'movi', 'base', 'key', 'word', 'base', 'lose', 'friend', 'fictiti', 'romcom', 'sidney', 'young', 'join', 'prestigi', 'gossip', 'magazin', 'new', 'york', 'proce', 'make', 'gaff', 'gaff', 'final', 'get', 'right', 'make', 'involv', 'sell', 'movi', 'seriou', 'point', 'make', 'journalist', 'integr', 'howev', 'overdon', 'main', 'substanc', 'remain', 'comedi', 'centr', 'around', 'sidney', 'misadventur', 'script', 'cake', 'eat', 'sidney', 'stupid', 'well', 'mean', 'buffoon', 'time', 'smart', 'moder', 'obnoxi', 'skill', 'writer', 'contradict', 'never', 'much', 'issu', 'simon', 'pegg', 'sidney', 'project', 'likabl', 'well', 'jeff', 'bridg

words :  ['interest', 'way', 'look', 'human', 'often', 'behav', 'sometim', 'blind', 'desir', 'achiev', 'perfect', 'time', 'destroy', 'foundat', 'tri', 'achiev', 'also', 'address', 'issu', 'tend', 'ignor', 'among', 'us', 'outspoken', 'may', 'miss', 'great', 'opportun', 'inject', 'comedi', 'also', 'make', 'watch', 'film', 'enjoy', 'experi', 'must', 'see', 'anyon', 'interest', 'reflect', 'yet', 'comic', 'look', 'life', 'eagerli', 'look', 'forward', 'next', 'product', 'hope', 'continu', 'provid', 'us', 'qualiti', 'entertain', 'excel', 'work', 'joann']
words :  ['sad', 'lucian', 'pintili', 'stop', 'make', 'movi', 'get', 'wors', 'everi', 'time', 'niki', 'flo', '2003', 'depress', 'stab', 'camera', 'unfortun', 'mani', 'movi', 'made', 'yearli', 'romania', 'worst', 'get', 'sent', 'abroad', 'e', 'g', 'chicago', 'intern', 'film', 'festiv', 'movi', 'without', 'plot', 'act', 'script', 'wast', 'time', 'money', 'score', '0', '02', '10']
words :  ['haunt', 'secret', 'ben', 'thoma', 'smith', 'look', 're

words :  ['well', 'mayb', 'pc', 'version', 'game', 'impress', 'mayb', 'finish', 'play', 'ps2', 'version', 'pretti', 'much', 'complet', 'mess', 'coupl', 'element', 'okay', 'promis', 'mention', 'first', 'quickli', 'first', 'idea', 'histor', 'gta', 'like', 'game', 'great', 'one', 'game', 'gun', 'histor', 'gta', 'like', 'game', 'unlik', 'mafia', 'gun', 'excel', 'love', 'see', 'game', 'set', 'mafia', 'era', 'done', 'right', 'next', 'storylin', 'well', 'written', 'stori', 'make', 'sens', 'dramat', 'arc', 'use', 'unusu', 'devic', 'much', 'game', 'backstori', 'interest', 'final', 'graphic', 'especi', 'use', 'cutscen', 'impress', 'mafia', 'design', 'seem', 'focu', 'get', 'graphic', 'right', 'place', 'gta', 'skimp', 'effort', 'especi', 'charact', 'unfortun', 'mani', 'area', 'graphic', 'kinda', 'stink', 'much', 'rather', 'excel', 'gameplay', 'impress', 'look', 'charact', 'gameplay', 'sink', 'titl', 'low', 'first', 'control', 'camera', 'absolut', 'suck', 'first', 'focu', 'game', 'develop', 'releas

words :  ['love', 'batman', 'tv', 'seri', 'realli', 'look', 'forward', 'tri', 'much', 'stori', 'adam', 'west', 'burt', 'ward', 'tri', 'recov', 'batmobil', 'beyond', 'want', 'knock', 'burt', 'adam', 'way', 'look', '35', 'year', 'sinc', 'appear', 'batman', 'robin', 'see', 'dress', 'dress', 'suit', 'fight', 'badguy', 'kinda', 'sad', 'would', 'rather', 'seen', 'ex', 'star', 'commentari', 'batmobil', 'side', 'stori', 'stupid', 'flashback', 'movi', 'think', 'short', 'left', 'way', 'much', 'realli', 'quick', 'overview', 'opinion', 'like', 'background', 'show', 'penguin', 'joker', 'minut', 'tell', 'stuff', 'alreadi', 'knew', 'joker', 'mustach', 'makeup', 'penguin', 'smoke', 'even', 'though', 'hate', 'ex', 'smoker', '2', 'love', 'read', 'book', 'sure', 'show', 'like', '2', 'riddler', '3', 'catwoman', '3', 'mister', 'freez', 'commishion', 'gordon', 'cheif', 'ohara', 'alfr', 'mister', 'freez', 'king', 'tut', 'etc', 'list', 'goe', 'like', 'said', 'even', 'one', 'one', 'bare', 'disappoint', 'realli

words :  ['unassum', 'subtl', 'lean', 'film', 'man', 'white', 'suit', 'yet', 'anoth', 'breath', 'fresh', 'air', 'filmic', 'format', 'eal', 'studio', 'suspect', 'modern', 'viewer', 'may', 'initi', 'find', 'obscur', 'doubt', 'mani', 'would', 'fail', 'charm', 'expert', 'way', 'plot', 'theme', 'charact', 'languidli', 'relay', 'film', 'cours', 'genuin', 'great', 'alec', 'guin', 'give', 'anoth', 'fine', 'character', 'film', 'perhap', 'obvious', 'virtuoso', 'eal', 'inspir', 'kind', 'heart', 'coronet', '1949', 'time', 'mere', 'play', 'one', 'charact', 'rather', 'eight', 'unworldli', 'inventor', 'scientist', 'sidney', 'stratton', 'alway', 'find', 'correct', 'tone', 'express', 'along', 'guin', 'subtl', 'express', 'perform', 'rest', 'cast', 'effect', 'main', 'player', 'cecil', 'parker', 'ernest', 'thesig', 'stand', 'thesig', 'compellingli', 'absurd', 'crippl', 'influenti', 'busi', 'grande', 'parker', 'depend', 'ineffectu', 'yet', 'pivot', 'mill', 'owner', 'father', 'father', 'joan', 'greenwood', 

words :  ['terribl', 'terribl', 'film', 'one', 'worst', 'movi', 'seen', 'life', 'usual', 'love', 'movi', 'like', 'whole', 'guy', 'meet', 'eccentr', 'woman', 'like', 'happen', 'alreadi', 'involv', 'someon', 'right', 'expect', 'someth', 'predict', 'mind', 'movi', 'alway', 'entertain', 'mix', 'right', 'amount', 'romanc', 'comedi', 'one', 'everi', 'singl', 'joke', 'fall', 'flat', 'romanc', 'make', 'want', 'vomit', 'titl', 'charact', 'one', 'pleas', 'kill', 'charact', 'ever', 'wit', 'televis', 'eccentr', 'woman', 'eccentr', 'like', 'quirki', 'annoy', 'someon', 'reason', 'matur', 'person', 'film', 'also', 'happen', 'annoy', 'film', 'flat', 'suck', 'way', 'around', 'wast', 'time']
words :  ['movi', 'scare', 'heck', 'kid', 'citizen', 'kane', 'moment', 'arm', 'rip', 'scene', 'good', 'plot', 'good', 'even', 'charact', 'could', 'someth', 'act', 'put', 'top', 'name', 'peopl', 'role', 'see', 'get', 'one', 'shoot', 'edit', 'littl', 'distribut', 'coupl', 'month', 'type', 'movi', 'classic', 'low', 'bu

words :  ['excel', 'littl', 'film', 'loneli', 'singl', 'man', 'phillip', 'harel', 'notr', 'hero', 'bit', 'like', 'amalgam', 'robert', 'de', 'niro', 'taxi', 'driver', 'inspector', 'clouseau', 'stoicism', 'chauncey', 'gardin', 'also', 'peter', 'seller', 'singl', 'yet', 'clue', 'attract', 'opposit', 'sex', 'fact', 'realli', 'make', 'effort', 'stoicism', 'fatal', 'defi', 'hope', 'ever', 'achiev', 'coupledom', 'friend', 'jose', 'garcia', 'tisserand', 'plight', 'yet', 'least', 'make', 'brave', 'effort', 'transcend', 'extend', 'virginhood', '28', 'admit', 'never', 'sex', 'good', 'outdoor', 'shot', 'pari', 'rouen', 'two', 'softwar', 'peopl', 'travel', 'busi', 'tri', 'variou', 'nightclub', 'place', 'avail', 'theori', 'tri', 'wrong', 'place', 'go', 'less', 'youth', 'nightclub', 'tri', 'type', 'older', 'peopl', 'age', 'harel', 'increasingli', 'becom', 'isol', 'littl', 'de', 'niro', 'effort', 'taxi', 'driver', 'urg', 'friend', 'colleagu', 'go', 'stab', 'bloke', 'pull', 'nice', 'look', 'girl', 'nig

words :  ['attent', 'spoiler', 'mani', 'peopl', 'told', 'planet', 'ape', 'tim', 'burton', 'worst', 'movi', 'apart', 'much', 'weaker', 'origin', 'film', 'decid', 'see', 'anoth', 'friend', 'mine', 'seen', 'movi', 'yet', 'advis', 'watch', 'spite', 'tim', 'burton', 'movi', 'still', 'tim', 'burton', 'movi', 'decid', 'found', 'right', 'clear', 'remak', 'famou', 'film', 'planet', 'ape', 'automat', 'influenc', 'commerci', 'think', 'still', 'tim', 'burton', 'manag', 'film', 'repres', 'weird', 'play', 'well', 'beetlejuic', 'batman', 'alreadi', 'fond', 'burton', 'movi', 'hard', 'like', 'one', 'film', 'even', 'flaw', 'nerv', 'rack', 'monkey', 'squeal', 'dress', 'ape', 'lead', 'actor', 'could', 'without', 'difficulti', 'replac', 'anbodi', 'els', 'film', 'give', 'us', 'first', 'place', 'answer', 'question', 'result', 'tim', 'burton', 'instruct', 'creat', 'remak', 'first', 'burton', 'burton', 'refus', 'call', 'remak', 'start', 'imagin', 'hand', 'burton', 'know', 'almost', 'everi', 'viewer', 'movi', '

words :  ['4th', 'juli', 'weekend', 'hearten', 'see', 'spirit', 'declar', 'independ', 'aliv', 'well', 'film', 'war', 'inc', 'found', 'father', 'gave', 'back', 'collect', 'hand', 'king', 'georg', 'iii', 'film', 'expos', 'hilari', 'fashion', 'craven', 'war', 'profit', 'current', 'crop', 'capitalist', 'creep', 'intent', 'indec', 'privat', 'govern', 'includ', 'privat', 'war', 'cast', 'satir', 'absolut', 'shine', 'john', 'cusack', 'wonder', 'droll', 'conflict', 'corpor', 'assassin', 'beauti', 'marisa', 'tomei', 'superb', 'love', 'interest', 'gosh', 'georg', 'costanza', 'right', 'marisa', 'tomei', 'attract', 'john', 'sister', 'joan', 'cusack', 'realli', 'steal', 'film', 'portray', 'bossi', 'yet', 'simultan', 'sycophant', 'person', 'assist', 'priceless', 'stop', 'laugh', 'brillianc', 'perform', 'possess', 'fantast', 'comic', 'time', 'face', 'express', 'one', 'could', 'ever', 'wish', 'actor', 'dan', 'ackroyd', 'short', 'effect', 'cameo', 'film', 'head', 'compani', 'run', 'war', 'tamerlan', 'co

words :  ['one', 'episod', 'one', 'indisput', 'error', 'storytel', 'handl', 'ralphi', 'situat', 'christoph', 'state', 'heard', 'pie', 'death', 'fire', 'accid', 'import', 'detail', 'context', 'quit', 'obviou', 'christoph', 'know', 'begin', 'toni', 'one', 'must', 'kill', 'ralphi', 'howev', 'way', 'chri', 'could', 'heard', 'accid', 'told', 'time', 'torn', 'delirium', 'toni', 'call', 'nobodi', 'els', 'inform', 'toni', 'know', 'make', 'even', 'wors', 'hear', 'christoph', 'talk', 'pie', 'death', 'could', 'therefor', 'lead', 'toni', 'conclus', 'chri', 'set', 'fire', 'given', 'impress', 'elabor', 'write', 'process', 'told', 'writer', 'dvd', 'realli', 'wonder', 'none', 'realiz', 'problem', 'stori', 'work', 'way', 'unnecessari', 'add', 'huge', 'fan', 'soprano', 'otherwis', 'certainli', 'care']
words :  ['1999', 'go', 'histori', 'year', 'movi', 'critic', 'lead', 'gener', 'public', 'astray', 'first', 'sent', 'us', 'eye', 'wide', 'shut', 'hype', 'blair', 'witch', 'project', 'magnolia', 'far', 'wors

words :  ['previous', 'seen', 'zu', 'warrior', 'magic', 'mountain', 'film', 'set', 'take', 'place', 'china', 'mountain', 'legend', 'zu', 'look', 'like', 'anoth', 'dimens', 'thank', 'tsui', 'hark', 'extens', 'use', 'cgi', 'effect', 'abl', 'portray', 'vision', 'mountain', 'float', 'cloud', 'land', 'be', 'fli', 'freeli', 'power', 'rang', 'razor', 'sharp', 'wing', 'blade', 'split', 'sword', 'ultra', 'cool', 'moon', 'orb', 'mani', 'charact', 'one', 'focu', 'mainli', 'king', 'sky', 'enigma', 'romanc', 'aspect', 'although', 'movi', 'seem', 'much', 'darker', 'predecessor', 'cecilia', 'cheung', 'beauti', 'presenc', 'screen', 'make', 'movi', 'worth', 'watch', 'begin', 'like', 'resembl', 'countess', 'zu', 'warrior', 'well', 'play', 'enigma', 'deal', 'face', 'past', 'life', 'oh', 'way', 'mention', 'cecilia', 'appeal', 'eye', 'truth', 'zu', 'warrior', 'comedi', 'element', 'special', 'effect', 'limit', 'due', 'time', '1983', 'tsui', 'hark', 'take', 'whole', 'new', 'level', 'set', 'new', 'standard', 

words :  ['imagin', 'school', 'would', 'like', 'world', 'like', 'kid', 'one', 'big', 'gang', 'realli', 'good', 'tast', 'music', 'unit', 'bad', 'headmast', 'teacher', 'rock', 'n', 'roll', 'high', 'school', 'take', 'place', 'world', 'like', 'ramon', 'record', 'come', 'life', 'charact', 'silli', 'innoc', 'charm', 'ramon', 'song', 'music', 'cours', 'fantast', 'high', 'school', 'comedi', 'realli', 'chang', 'year', 'compar', 'movi', 'like', 'american', 'pie', 'late', '70', 'classic', 'tasteless', 'sex', 'joke', 'made', 'sinc', 'remak', 'appar', 'work', 'probabl', 'expect', 'charm', 'origin', 'get', 'lost', 'along', 'way', 'get', 'replac', 'vulgar', 'half', 'funni', 'dick', 'joke', 'bill', 'hick', 'use', 'call', 'howev', 'main', 'problem', 'ramon', 'cannot', 'replac', 'perfect', 'band', 'movi', 'one', 'els', 'could', 'even', 'come', 'close', 'take', 'place', 'best', 'thing', 'would', 'leav', 'origin', 'alon', 'quirki', 'charm', 'gabba', 'gabba', 'hey']
words :  ['compel', 'write', 'review', '

words :  ['person', 'vision', 'hell', 'lock', 'room', 'without', 'abil', 'close', 'eye', 'block', 'ear', 'movi', 'play', 'etern', 'everi', 'avail', 'surfac', 'room', 'whole', 'notion', 'streisand', 'play', 'boy', 'man', 'begin', 'scratch', 'surfac', 'ridicul', 'premis', 'movi', 'singl', 'import', 'thing', 'watch', 'movi', 'concept', 'will', 'suspens', 'disbelief', 'imposs', 'movi']
words :  ['eras', 'thought', 'nearli', 'twenti', 'seven', 'time', 'feel', 'conquer', 'review', 'complex', 'french', 'drama', 'read', 'lip', 'written', 'five', 'hundr', 'review', 'never', 'found', 'loss', 'word', 'director', 'jacqu', 'audiard', 'subtl', 'yet', 'inspir', 'love', 'stori', 'thought', 'pour', 'love', 'hate', 'film', 'love', 'overpow', 'element', 'hate', 'spark', 'debat', 'within', 'mind', 'read', 'lip', 'drama', 'precis', 'charact', 'driven', 'drama', 'fuse', 'social', 'uncertainti', 'crime', 'lord', 'doldrum', 'everyday', 'offic', 'work', 'review', 'begin', 'crumbl', 'item', 'much', 'much', 'vie

words :  ['secret', 'kell', 'independ', 'anim', 'featur', 'give', 'us', 'one', 'fabl', 'stori', 'surround', 'book', 'kell', 'illumin', 'manuscript', 'middl', 'age', 'featur', 'four', 'gospel', 'new', 'testament', 'know', 'book', 'actual', 'exist', 'know', 'make', 'interpret', 'analysi', 'much', 'lot', 'easier', 'stori', 'idea', 'float', 'around', 'book', 'came', 'wrote', 'surviv', '1', '000', 'year', 'one', 'introduc', 'brendan', 'orphan', 'live', 'abbey', 'kell', 'ireland', 'uncl', 'abbot', 'cellach', 'voic', 'brendan', 'gleeson', 'abbot', 'cellach', 'construct', 'massiv', 'wall', 'around', 'abbey', 'protect', 'villag', 'monk', 'brendan', 'fond', 'wall', 'neither', 'monk', 'focus', 'read', 'write', 'someth', 'abbot', 'cellach', 'time', 'anymor', 'fear', 'northmen', 'plunder', 'leav', 'town', 'villag', 'empti', 'burnt', 'ground', 'one', 'day', 'travel', 'come', 'island', 'iona', 'near', 'scotland', 'brother', 'aidan', 'wise', 'man', 'carri', 'special', 'book', 'yet', 'finish', 'abbot',

words :  ['lee', 'chang', 'dong', 'except', 'secret', 'sunshin', 'singl', 'emot', 'ravag', 'experi', 'year', 'instantli', 'sober', 'brutal', 'honest', 'charact', 'piec', 'reverber', 'loss', 'grace', 'memento', 'mori', 'reson', 'strike', 'densiti', 'thought', 'yet', 'remain', 'inscrut', 'emot', 'observ', 'layer', 'natur', 'stunningli', 'trenchant', 'view', 'small', 'town', 'dynam', 'lee', 'implicitli', 'deconstruct', 'tradit', 'korean', 'melodrama', 'pull', 'apart', 'cinemat', 'excess', 'rip', 'shred', 'arc', 'shape', 'charact', 'ground', 'proceed', 'crush', 'grind', 'stoic', 'realism', 'secret', 'sunshin', 'remain', 'immens', 'compel', 'fluid', 'work', 'throughout', '142', 'minut', 'runtim', 'bravura', 'first', 'hour', 'fill', 'brim', 'subtextu', 'insinu', 'remark', 'foreshadow', 'adroit', 'revers', 'tone', 'brought', 'humanist', 'caprici', 'adapt', 'short', 'stori', 'lee', 'infus', 'film', 'sensit', 'sublim', 'paradox', 'life', 'last', 'seen', 'transgress', 'comic', 'irrever', 'oasi',

words :  ['okay', 'let', 'start', 'say', 'noth', 'come', 'surpris', 'anyon', 'read', 'comment', 'said', 'g', 'movi', 'reek', 'mean', 'wow', 'know', 'possibl', 'throw', 'much', 'money', 'obvious', 'someth', 'still', 'come', 'wors', 'roger', 'corman', 'movi', 'corman', 'probabl', 'pitch', 'movi', 'point', 'declin', 'due', 'poor', 'qualiti', 'script', 'reason', 'movi', 'got', 'made', 'first', 'place', 'someon', 'said', 'hey', 'zombi', 'popular', 'video', 'game', 'popular', 'game', 'get', 'hold', 'zombi', 'resid', 'evil', 'someon', 'els', 'got', 'first', 'silent', 'hill', 'silent', 'peopl', 'never', 'sit', 'dark', 'room', 'scare', 'silli', 'hey', 'sega', 'game', 'peopl', 'ran', 'around', 'shoot', 'zombi', 'platform', 'biz', 'could', 'get', 'penni', 'basic', 'tri', 'best', 'make', 'movi', 'felt', 'like', 'video', 'game', 'even', 'shoot', 'combat', 'charact', 'actual', 'play', 'game', 'first', 'major', 'problem', 'origin', 'game', 'horrid', 'mean', 'bad', 'movi', 'merchandis', 'made', 'wors'

words :  ['go', 'slasher', 'movi', 'good', 'look', 'peopl', 'act', 'everyon', 'realli', 'good', 'kid', 'play', 'prank', 'phone', 'call', 'parent', 'kill', 'killer', 'front', 'kid', '20', 'year', 'still', 'friend', 'go', 'huge', 'hous', 'fun', 'drug', 'sex', 'nuditi', 'least', 'half', 'hour', 'movi', 'start', 'make', 'prank', 'phone', 'call', 'killer', 'come', 'back', 'kill', 'one', 'one', 'killer', 'big', 'black', 'coat', 'axe', 'like', 'urban', 'legend', 'movi', 'death', 'scene', 'realli', 'weird', 'realli', 'odd', 'time', 'nice', 'slasher', 'movi', 'part', 'would', 'gave', '7', '10', 'twist', 'end', 'movi', 'made', 'whole', 'movi', 'kinda', 'pointlessth', 'twist', 'kill', 'movi', 'go', 'give', '4', '10']
words :  ['saw', 'tv', 'glad', 'go', 'cinema', 'see', 'spend', 'money', 'rental', 'movi', 'total', 'predict', 'corrupt', 'owner', 'planner', 'snake', 'electr', 'cabl', 'plot', 'realli', 'weak', 'unbeliev', 'avalanch', 'expert', 'guy', 'get', 'hit', '20', 'foot', 'wave', 'bone', 'brea

words :  ['spot', 'high', 'rate', 'imdb', 'decid', 'go', 'see', 'movi', 'beyond', 'high', 'rate', 'intent', 'avoid', 'read', 'review', 'want', 'go', 'theater', 'clean', 'slate', 'without', 'know', 'plot', 'predetermin', 'expect', 'given', 'rate', 'see', 'disappoint', 'enjoy', 'develop', 'main', 'charact', 'mike', 'enslin', 'also', 'enjoy', 'hotel', 'manag', 'attempt', 'talk', 'enter', 'hotel', 'room', 'time', 'enslin', 'enter', 'room', 'readi', 'scari', 'stuff', 'first', 'chocol', 'appear', 'pillow', 'toilet', 'paper', 'fold', 'enslin', 'react', 'believ', 'manner', 'freak', 'encourag', 'think', 'go', 'good', 'peopl', 'made', 'movi', 'understand', 'less', 'happen', 'next', 'big', 'let', 'subtl', 'quickli', 'replac', 'predict', 'shotgun', 'approach', 'blast', 'audienc', 'everi', 'hollywood', 'scari', 'trick', 'book', 'hope', 'someth', 'work', 'let', 'see', 'clock', 'radio', 'turn', 'good', 'alway', 'scari', 'object', 'move', 'around', 'room', 'good', 'complain', 'blood', 'drip', 'wall', 

words :  ['way', 'disast', 'fellini', 'work', 'toward', 'life', 'line', 'absurd', 'masterpiec', 'free', 'associ', 'bullshit', 'small', 'categori', 'film', 'ultim', 'fit', 'often', 'depend', 'person', 'feel', 'said', 'casanova', 'left', 'cold', 'admir', 'set', 'littl', 'cannot', 'sum', 'adequ', 'bukowski', 'casanova', 'die', 'old', 'guy', 'big', 'cock', 'long', 'tongu', 'gut', 'say', 'live', 'well', 'true', 'say', 'could', 'spit', 'grave', 'without', 'feel', 'also', 'true', 'ladi', 'usual', 'go', 'biggest', 'fool', 'find', 'human', 'race', 'stand', 'today', 'bred', 'clever', 'last', 'casanova', 'hollow', 'insid', 'like', 'easter', 'bunni', 'foster', 'upon', 'poor', 'children', 'far', 'could', 'make', 'posit', 'fellini', 'take', 'regard', 'subject', 'grant', 'empathi', 'disgust', 'nonetheless', 'casanova', 'environ', 'made', 'decay', 'incestu', 'behavior', 'theme', 'fellini', 'dealt', 'pointedli', 'satyricon', 'success', 'plot', 'characterist', 'soft', 'porn', 'without', 'coher', 'donald

words :  ['like', 'realli', 'pleas', 'think', 'idiot', 'admit', 'enjoy', 'film', 'expect', 'crap', 'crap', 'sometim', 'ok', 'relax', 'watch', 'crappi', 'film', 'concentr', 'much', 'expect', 'hidden', 'mean', 'moral', 'matter', 'watch', 'entertain', 'entertain', 'throughout', 'film', 'like', 'ben', 'stiller', 'excus', 'someth', 'mari', 'vinc', 'vaughn', 'howev', 'spell', 'last', 'name', 'bother', 'check', 'job', 'ok', 'watch', 'crap', 'film', 'long', 'expect', 'much', 'one', 'shall', 'take', 'stand', 'jog', 'perhap', 'run', 'drive', 'car', 'blockbust', 'video', 'even', 'choic', 'rent', 'bunch', 'toilet', 'humour', 'film', 'stay', 'one', 'night', 'watch', 'good', 'day', 'reader', 'p', 'say', 'comment', 'help', 'like', 'say', 'help', 'god', 'bless', 'go', 'heaven']
words :  ['despit', 'budget', 'limit', 'great', 'film', 'proof', 'effort', 'imagin', 'overcom', 'lack', 'cash', 'open', 'cave', 'paint', 'seem', 'show', 'dinosaur', 'least', 'surviv', 'age', 'human', 'be', 'nice', 'red', 'her',

words :  ['turn', 'back', 'away', 'gonna', 'get', 'big', 'troubl', 'boyfriend', 'back', 'happi', 'end', 'bloom', 'innoc', 'full', 'gloom', 'doom', 'moment', 'watch', 'safe', 'say', 'entir', 'movi', 'fall', 'apart', 'sarcast', 'approach', 'tribut', 'zombi', 'show', 'defi', 'nonsens', 'max', 'get', 'name', 'like', 'johnni', 'everi', 'often', 'johnni', 'nowher', 'go', 'specif', 'reason', 'dead', 'corps', 'crawl', 'grave', 'surviv', 'prom', 'night', 'render', 'movi', 'total', 'useless', 'without', 'feel', 'sorrow', 'mother', 'convinc', 'tell', 'doctor', 'dead', 'johnni', 'take', 'bite', 'eddi', 'arm', 'afterward', 'viewer', 'ask', 'tough', 'question', 'movi', 'cornbal', 'answer', 'resembl', 'person', 'live', 'dead', 'pure', 'coincident', 'live', 'coincid', 'dead', 'noth', 'common', 'movi', 'show', 'one', 'girlfriend', 'skip', 'senior', 'prom', 'turn', 'life', 'desert', 'ruin', 'blah']
words :  ['earli', 'film', 'pilot', 'hit', 'canadian', 'tv', 'show', 'trailer', 'park', 'boy', 'play', 'ex

words :  ['summari', 'say', 'made', 'ignor', 'comment', 'ever', 'heard', 'rpg', 'serious', 'thought', 'gay', 'retard', 'went', 'go', 'save', 'best', 'friend', 'someon', 'decid', 'good', 'heart', 'help', 'seriou', 'debt', 'man', 'lavitz', 'good', 'person', 'time', 'help', 'made', 'closer', 'friend', 'gay', 'lover', 'like', 'bitch', 'let', 'know', 'game', 'set', 'mediev', 'time', 'period', 'back', 'women', 'prepar', 'meal', 'men', 'fought', 'even', 'know', 'histori', 'know', 'long', 'took', 'women', 'accept', 'armi', 'present', 'day', 'game', 'contain', 'lot', 'realism', 'even', 'though', 'damn', 'slow', 'obvious', 'catch', 'realli', 'need', 'spit', 'solid', 'proof', 'instead', 'ignor', 'assumpt', 'base', 'misguid', 'act', 'interpret', 'stori']
words :  ['seri', '2', 'got', 'great', 'start', 'think', 'need', 'watch', 'seri', '1', 'get', 'grasp', 'what', 'happen', 'like', 'seri', 'nice', 'feel', 'sens', 'charact', 'care', 'happen', 'show', 'make', 'think', 'like', '4', '30', 'someth', 'wo

words :  ['saw', 'movi', 'recent', '2', 'hour', 'later', 'head', 'still', 'hurt', 'laugh', 'plot', 'soo', 'aw', 'joke', 'soo', 'bad', 'count', '1', '2', 'scene', 'movi', 'pat', 'jay', 'pose', 'caus', 'enough', 'laughter', '2', 'kick', 'windshield', 'decapit', 'evil', 'doer', 'movi', '20', 'time', 'better', 'rush', 'hour', 'seri', 'copi', 'even', 'came', 'disclam', 'say', 'like', 'movi', 'send', 'certif', 'hbo', 'consid', 'date', 'send', 'januari', '1991', 'also', 'caus', 'wacki', 'ensu']
words :  ['poor', 'ingrid', 'suffer', 'suffer', 'went', 'itali', 'tire', 'hollywood', 'glamor', 'treatment', 'first', 'suffer', 'torment', 'volcan', 'island', 'stromboli', 'arti', 'failur', 'would', 'kill', 'career', 'less', 'resili', 'actress', 'europa', '51', 'anoth', 'tediou', 'exercis', 'soggi', 'sentiment', 'stori', 'much', 'alexand', 'knox', 'anoth', 'thankless', 'role', 'long', 'suffer', 'husband', 'tri', 'comfort', 'suicid', 'death', 'young', 'son', 'least', 'one', 'better', 'product', 'valu', 

words :  ['thought', 'read', 'somewher', 'last', 'monogram', 'product', 'whether', 'true', 'matter', 'deadli', 'dull', 'affair', 'star', 'john', 'carradin', 'gray', 'hair', 'make', 'appear', 'like', 'older', 'scientist', 'experi', 'aid', 'young', 'apprentic', 'robert', 'shayn', 'bring', 'dead', 'back', 'life', 'everi', 'time', 'subject', 'reviv', 'seem', 'whitish', 'face', 'like', 'marbl', 'lie', 'strap', 'laboratori', 'tabl', 'big', 'deal', 'carradin', 'manag', 'restor', 'faith', 'dog', 'life', 'dead', 'mutt', 'gain', 'unusu', 'abil', 'walk', 'wall', 'ghostlik', 'fashion', 'wooooooooohhhh', 'wrote', 'ultra', 'cheap', 'monogram', 'quicki', 'thing', 'least', 'actual', 'util', 'fanci', 'schmanci', 'lab', 'setup', 'usual', 'allot', 'funniest', 'run', 'joke', 'movi', 'older', 'doctor', 'carradin', 'constantli', 'refer', 'young', 'assist', 'shayn', 'boy', 'fact', 'carradin', 'actual', '40', 'shayn', '45', 'made']
words :  ['idea', 'ia', 'short', 'film', 'lot', 'inform', 'interest', 'enterta

words :  ['averag', 'film', 'act', 'partli', 'spoil', 'complet', 'predict', 'stori', 'line', 'even', 'music', 'chosen', 'word', 'fit', 'action', 'everi', 'time', 'scent', 'pleasantvil', 'camp', 'hang', 'around', 'flick', 'period', 'piec', 'accur', 'depict', 'tragedi', 'compani', 'town', 'lack', 'upward', 'mobil', 'sketchi', 'move', 'chri', 'cooper', 'turn', 'first', 'class', 'perform', 'howard', 'coal', 'miner', 'daddi']
words :  ['reason', 'sort', 'vendetta', 'awesom', 'show', 'somebodi', 'involv', 'therein', 'would', 'best', 'show', 'seen', 'year', 'cancel', 'addict', 'saw', 'show', 'randomli', 'last', 'fall', 'immedi', 'love', 'watch', 'everi', 'week', 'went', 'away', 'tri', 'tivo', 'air', 'forgot', 'awhil', 'found', 'episod', 'abc', 'websit', 'want', 'agre', 'everybodi', 'els', 'rest', 'junk', 'tv', 'today', 'refresh', 'see', 'someth', 'well', 'round', 'develop', 'watch', 'boston', 'legal', 'eccentr', 'comed', 'fix', 'hous', 'intellectu', 'mysteri', 'jackass', 'fix', 'wife', 'love'

words :  ['boggl', 'mind', 'movi', 'nomin', 'seven', 'oscar', 'one', 'abysm', 'given', 'collect', 'credenti', 'creativ', 'team', 'behind', 'realli', 'ought', 'deserv', 'everi', 'categori', 'nomin', 'prizzi', 'honor', 'disappoint', 'would', 'argu', 'old', 'hollywood', 'pioneer', 'john', 'huston', 'lost', 'point', 'career', 'buy', 'previou', 'year', 'sign', 'superb', 'volcano', 'dark', 'charact', 'studi', 'set', 'mexico', 'rank', 'among', 'finest', 'ever', 'prizzi', 'honor', 'hand', 'film', 'load', 'star', 'power', 'good', 'intent', 'decent', 'script', 'prove', 'major', 'letdown', 'overal', 'tone', 'plot', 'gangster', 'fall', 'love', 'femal', 'hit', 'man', 'prefigur', 'quirki', 'crimedi', 'caught', 'hollywood', 'storm', 'earli', '90', 'script', 'convolut', 'sake', 'motiv', 'whole', 'stori', 'seem', 'unsur', 'exactli', 'tri', 'romant', 'comedi', 'crime', 'drama', 'gangster', 'saga', 'etc', 'jack', 'nicholson', 'brooklyn', 'accent', 'work', 'perfectli', 'de', 'niro', 'sound', 'unconvinc', 

words :  ['believ', 'dvd', 'copi', 'movi', 'bought', 'walgreen', 'great', 'big', 'whole', 'dollar', 'still', 'sure', 'dollar', 'well', 'spent', 'dollar', 'foolishli', 'wast', 'pretti', 'amaz', 'set', 'design', 'costum', 'appar', 'much', 'thought', 'effort', 'went', 'make', 'set', 'design', 'mexican', 'styliz', 'like', 'lot', 'santa', 'impress', 'one', 'impress', 'santa', 'moviedom', 'guess', 'origin', 'intent', 'purpos', 'movi', 'someth', 'uplift', 'cheer', 'kid', 'audienc', 'somehow', 'come', 'across', 'quit', 'derang', 'fact', 'left', 'stun', 'derang', 'mayb', 'english', 'dub', 'make', 'seem', 'derang', 'bizarr', 'one', 'reason', 'prefer', 'experi', 'movi', 'origin', 'languag', 'made', 'use', 'english', 'subtitl', 'dub', 'often', 'give', 'unintend', 'strang', 'kid', 'voic', 'least', 'dub', 'actual', 'kid', 'voic', 'oppos', 'women', 'pretend', 'kid', 'tend', 'sound', 'weird', 'know', 'demon', 'hell', 'spend', 'free', 'time', 'danc', 'around', 'ballet', 'longjohn', 'neither', 'watch', 

words :  ['said', 'visual', 'effect', 'stun', 'breathtak', 'person', 'use', 'blender', 'graphic', 'like', 'easi', 'movi', 'plot', 'confus', 'overal', 'conflict', 'clear', 'exampl', 'first', 'scene', 'proog', 'emo', 'tri', 'run', 'away', 'know', 'conflict', 'seem', 'man', 'natur', 'later', 'enter', 'room', 'bottomless', 'pit', 'proog', 'explain', 'one', 'step', 'place', 'dead', 'precis', 'conflict', 'careless', 'man', 'natur', 'movi', 'progress', 'clear', 'conflict', 'exist', 'man', 'natur', 'suddenli', 'conflict', 'exist', 'man', 'man', 'proog', 'nowher', 'murder', 'emo', 'proog', 'immedi', 'chang', 'care', 'guardian', 'look', 'lost', 'child', 'sick', 'man', 'betray', 'us', 'depress', 'care', 'conflict', 'charact', 'thought', 'action', 'develop', 'stori', 'someon', 'struggl', 'emerg', 'stronger', 'depress', 'point', 'great', 'truth', 'human', 'soul', 'world', 'brought', 'light', 'like', 'great', 'drama', 'opinion', 'movi', 'sever', 'underdevelop', 'aspect', 'howev', 'graphic', 'stun', 

words :  ['take', 'long', 'see', 'code', 'name', 'diamond', 'head', 'make', 'onto', 'network', 'schedul', 'tv', 'pilot', 'movi', 'get', 'past', 'credit', 'obviou', 'bad', 'go', 'mayb', 'miss', 'someth', 'plot', 'make', 'whole', 'lot', 'sens', 'base', 'got', 'muddl', 'mess', 'terrorist', 'thief', 'someth', 'name', 'tree', 'ian', 'mcshane', 'goe', 'hawaii', 'steal', 'someth', 'secret', 'weapon', 'world', 'dullest', 'secret', 'agent', 'johnni', 'paul', 'roy', 'thinn', 'stop', 'might', 'trust', 'realli', 'matter', 'anyway', 'action', 'movi', 'action', 'suspens', 'moment', 'suspens', 'dramat', 'moment', 'drama', 'none', 'code', 'name', 'diamond', 'head', 'seen', 'other', 'use', 'word', 'turgid', 'describ', 'made', 'tv', 'snoozer', 'better', 'one', 'word', 'descript', 'come', 'none', 'charact', 'least', 'bit', 'excit', 'worth', 'care', 'roy', 'thinn', 'make', 'worst', 'lead', 'imagin', 'charisma', 'slightli', 'north', 'slug', 'ian', 'mcshane', 'easili', 'best', 'thing', 'movi', 'go', 'unfort

words :  ['littl', 'parent', 'got', 'movi', 'watch', 'realli', 'like', 'watch', 'even', '3rd', 'grade', 'still', 'watch', 'time', 'time', 'recent', 'watch', 'sake', 'nostalgia', 'though', 'show', 'aim', 'age', 'group', 'late', 'teen', 'still', 'found', 'entertain', 'educ', 'show', 'teach', 'good', 'lesson', 'imagin', 'get', 'along', 'well', 'other', 'part', 'found', 'quit', 'entertain', 'also', 'show', 'bad', 'content', 'leav', 'kid', 'alon', 'show', 'worri', 'pick', 'bad', 'languag', 'whatnot', 'would', 'recommend']
words :  ['recent', 'caught', 'littl', 'gem', 'film', 'cabl', 'took', 'surpris', 'even', 'though', 'expect', 'team', 'involv', 'movi', 'henri', 'bromwel', 'direct', 'film', 'sure', 'hand', 'show', 'one', 'alway', 'wonder', 'secret', 'life', 'hit', 'killer', 'one', 'go', 'far', 'realiz', 'probabl', 'one', 'neighbor', 'social', 'acquaint', 'even', 'friend', 'differ', 'us', 'least', 'surfac', 'stori', 'grandfath', 'despic', 'charact', 'hesit', 'elimin', 'anyon', 'right', 'pri

words :  ['like', 'origin', 'gut', 'wrench', 'laughter', 'like', 'movi', 'young', 'old', 'love', 'movi', 'hell', 'even', 'mom', 'like', 'great', 'camp']
words :  ['film', 'regard', 'classic', 'idea', 'terribl', 'point', 'laughabl', 'save', 'grace', 'movi', 'deliveri', 'cheesi', 'line', 'toe', 'curlingli', 'embarrass', 'choic', 'laugh', 'coupl', 'good', 'song', 'good', 'choreographi', 'film', 'plot', 'set', 'theatr', 'chang', 'sceneri', 'michael', 'dougla', 'depress', 'ever', 'brother', 'forc', 'watch', 'film', 'said', 'believ', 'bad', 'film', 'get', 'right', 'normal', 'film', 'dread', 'would', 'recommend', 'peopl', 'watch', 'case', 'think', 'peopl', 'put', 'everi', 'bad', 'film', 'seen', 'perspect']
words :  ['came', 'back', 'first', 'show', 'basic', 'instinct', '2', 'go', 'think', 'would', 'crappi', 'base', 'preview', 'critic', 'pleasantli', 'surpris', 'like', 'origin', 'basic', 'instinct', 'think', 'enjoy', '2', 'much', 'great', 'stori', 'alway', 'keep', 'wonder', 'think', 'music', '

words :  ['thing', 'come', 'earli', 'sci', 'fi', 'film', 'show', 'imagin', 'world', 'everytown', '100', 'year', 'break', '4', 'differ', 'scene', 'part', 'film', 'span', '1940', '2036', 'mainli', 'ruler', 'boss', 'want', 'get', 'capabl', 'fli', 'airplan', 'everytown', 'bomb', 'war', 'broke', 'film', '3', 'fault', 'audio', 'muddi', 'video', 'quirk', 'charact', 'deep', 'overal', 'plot', 'altogeth', 'solid', 'plot', 'lack', 'someth', 'put', 'finger', 'seem', 'littl', 'fluffi', 'love', 'sci', 'fi', 'interest', 'h', 'g', 'well', 'though', 'might', 'happen', 'next', 'hundr', 'year', 'must', 'see', 'worth', 'see', 'learn', 'everyon', 'fear', 'long', 'drawn', 'war', 'go', 'war', 'germani', 'threat', 'biolog', 'weapon', 'everyth', 'thing', 'come', 'pretti', 'good', 'movi', 'peopl', 'need', 'see']
words :  ['sight', 'kareena', 'kapoor', 'two', 'piec', 'bikini', 'thing', 'wake', 'sleep', 'watch', 'tashan', 'mega', 'disappoint', 'mind', 'numb', 'new', 'film', 'cinema', 'weekend', 'bad', 'film', 'ba

words :  ['anoth', 'downey', 'must', 'see', 'obsess', 'fan', 'like', 'got', 'see', 'movi', 'play', 'alex', 'finch', '22', 'year', 'old', 'yale', 'grad', 'realiz', 'life', 'came', 'life', 'left', '26', 'year', 'earlier', 'alex', 'incarn', 'louie', 'jeffri', 'nonsens', 'lawyer', 'happili', 'marri', 'corrin', 'cybil', 'sheppard', 'louie', 'kill', 'one', 'year', 'anniversari', 'hit', 'car', 'demand', 'go', 'back', 'time', 'bodi', 'alex', 'finch', 'enter', 'robert', 'downey', 'jr', 'lot', 'confus', 'lot', 'laugh', 'although', 'movi', '15', 'year', 'old', 'still', 'make', 'wonder', 'realli', 'thing', 'incarn', 'often', 'meet', 'soul', 'life', 'life', 'know', 'answer', 'know', 'need', 'see', 'movi', 'riot', 'downey', 'look', 'good', 'tuxedo', 'film', 'make', 'believ', 'love', 'true', 'love', 'never', 'die', 'get', 'recycl']
words :  ['saw', 'film', 'prior', 'join', 'british', 'armi', 'went', 'basic', 'train', 'first', 'difficult', 'progress', 'much', 'easier', 'time', 'spent', 'height', 'trou

words :  ['watch', 'john', 'cassavet', 'film', 'open', 'night', 'remind', 'someth', 'quentin', 'tarantino', 'said', 'interview', 'person', 'experi', 'creator', 'art', 'act', 'refer', 'exampl', 'say', 'ran', 'dog', 'way', 'act', 'play', 'end', 'life', 'would', 'affect', 'without', 'doubt', 'would', 'bring', 'experi', 'stage', 'even', 'light', 'comedi', 'otherwis', 'said', 'help', 'think', 'word', 'watch', 'gena', 'rowland', 'charact', 'myrtl', 'gordon', 'almost', 'whole', 'week', 'goe', 'similar', 'scenario', 'cassavet', 'film', 'cours', 'sinc', 'theater', 'work', 'around', 'star', 'actress', 'emot', 'human', 'natur', 'mean', 'look', 'play', 'charact', 'one', 'live', 'one', 'like', 'myrtl', 'gordon', 'theater', 'near', 'begin', 'film', 'exit', 'perform', 'myrtl', 'sign', 'autograph', 'one', 'fan', 'name', 'nanci', 'come', 'favorit', 'star', 'pour', 'heart', 'myrtl', 'touch', 'littl', 'moment', 'last', 'get', 'car', 'pour', 'rain', 'watch', 'horror', 'girl', 'stood', 'right', 'next', 'ca

words :  ['movi', 'worst', 'movi', 'ever', 'made', 'planet', 'like', 'barney', 'movi', 'graphic', 'suck', 'half', 'movi', 'anim', 'death', 'suck', 'readi', 'sue', 'peopl', 'made', 'movi', 'pleas', 'wast', 'hour', 'life', 'watch', 'movi', 'good', 'part', 'movi', 'end', 'movi', '50', 'percent', 'jurass', 'park', '1', 'percent', 'sabretooth', '49', '9', 'percent', 'dumb', 'pleas', 'wast', 'time', 'watch', 'movi', 'regret', 'want', 'know', 'movi', 'suck', 'well', 'cover', 'suck', 'graphic', 'suck', 'blood', 'look', 'mean', 'ketchup', 'peopl', 'tri', 'blow', 'colleg', 'student', 'think', 'stand', 'anim', 'mean', '5', 'ft', 'tiger', 'run', 'straight', 'woman', 'throw', 'spear', '100', 'ft', 'away', 'wait', 'till', 'actual', 'hit', 'act', 'horribl', 'jurras', 'park', 'actual', 'good', 'movi', 'go', 'ruin']
words :  ['spite', 'great', 'futur', 'design', 'touch', 'clever', 'asimov', 'premis', 'smith', 'depend', 'cool', 'perform', 'movi', 'live', 'expect', 'clich', 'come', 'thick', 'fast', 'wake

words :  ['sum', 'exactli', 'feelgood', 'right', 'touch', 'film', 'sever', 'week', 'dvd', 'leap', 'shelf', 'everi', 'time', 'went', 'store', 'seen', 'steve', 'carrel', 'coupl', 'film', 'previous', 'want', 'smear', 'thought', 'process', 'resist', 'resist', 'final', 'grab', 'hell', 'attitud', 'surpris', 'wish', 'purchas', 'earlier', 'watch', 'three', 'time', 'two', 'day', 'still', 'smile', 'portray', 'widow', 'struggl', 'three', 'daughter', 'yearn', 'miss', 'sinc', 'pass', 'belov', 'wife', 'thu', 'meet', 'intrigu', 'woman', 'charm', 'profound', 'interest', 'dare', 'say', 'bookish', 'way', 'throw', 'whole', 'differ', 'light', 'onto', 'life', 'make', 'realiz', 'search', 'snag', 'woman', 'brother', 'girl', 'complic', 'matter', 'portray', 'dan', 'comic', 'shi', 'heartfelt', 'chagrin', 'see', 'someon', 'special', 'bring', 'fun', 'enjoy', 'famili', 'home', 'well', 'brother', 'life', 'realli', 'begin', 'feel', 'blind', 'date', 'occur', 'ruthi', 'draper', 'turn', 'point', 'mari', 'estim', 'dan',

words :  ['true', 'stori', 'bunch', 'junki', 'rob', 'honest', 'businessman', 'drug', 'jewelri', 'gun', 'money', 'would', 'say', 'tragic', 'tale', 'america', 'excess', 'eighti', 'high', 'peac', 'free', 'love', 'sixti', 'crash', 'drug', 'aid', 'honestli', 'regular', 'peopl', 'aim', 'life', 'sit', 'around', 'get', 'high', 'decid', 'rob', 'ruthless', 'man', 'second', 'part', 'master', 'plan', 'stuff', 'sit', 'around', 'get', 'high', 'great', 'plan', 'even', 'know', 'stori', 'suspens', 'movi', 'surpris', 'fact', 'cox', 'tri', 'make', 'kind', 'folk', 'hero', 'charact', 'parti', 'scene', 'montag', 'loot', 'weak', 'insult', 'stori', 'better', 'straight', 'forward', 'approach', 'sad', 'stori', 'small', 'time', 'drug', 'dealer', 'get', 'kill', 'big', 'time', 'drug', 'dealer', 'bigger', 'stori', 'way', 'one', 'john', 'holm', 'center', 'stori', 'anyway', 'movi', 'life', 'one', 'wonderland', 'wonder', 'fade', 'away', 'p', 'although', 'offici', 'boogi', 'night', 'better', 'version', 'holm', 'life', 

words :  ['entir', 'impress', 'film', 'origin', 'name', 'sin', 'eater', 'stay', 'way', 'consid', 'talk', 'last', 'half', 'film', 'even', 'sure', 'first', '20', 'minut', 'film', 'rest', 'slow', 'pick', 'robocop', 'peter', 'weller', 'one', 'main', 'actor', 'sad', 'point', 'would', 'say', 'check', 'thing', 'deal', 'cathol', 'religion', 'expect', 'exorcist', 'stigmata', 'film', 'sure', 'flop', 'day', 'word', 'get']
words :  ['plot', 'plausibl', 'banal', 'e', 'beauti', 'neglect', 'wife', 'wealthi', 'power', 'man', 'fling', 'psychot', 'hunk', 'tri', 'cover', 'psycho', 'stalk', 'blackmail', 'develop', 'stupefyingli', 'illog', 'despit', 'resourc', 'avail', 'usual', 'coupl', 'money', 'influenc', 'privileg', 'hero', 'heroin', 'appear', 'one', 'domest', 'attorney', 'local', 'polic', 'say', 'noth', 'dispos', 'grappl', 'suspens', 'terror', 'privat', 'secur', 'staff', 'fanci', 'secur', 'system', 'mishandl', 'household', 'ground', 'staff', 'chauffeur', 'etc', 'even', 'appar', 'fund', 'hire', 'privat'

words :  ['saw', 'film', '80', 'minut', 'long', 'thought', 'troubl', 'condens', 'gigant', 'w', 'somerset', 'maugham', 'novel', 'movi', 'clock', 'hour', 'half', 'seem', 'like', 'disast', 'wait', 'happen', 'know', 'movi', 'half', 'bad', 'even', 'manag', 'retain', 'much', 'make', 'book', 'reson', 'much', 'reader', 'heard', 'mani', 'film', 'buff', 'complain', 'lesli', 'howard', 'wet', 'noodl', 'actor', 'think', 'anyon', 'suit', 'play', 'role', 'philip', 'carey', 'wet', 'noodl', 'certainli', 'carey', 'howard', 'play', 'well', 'mean', 'want', 'shake', 'slap', 'upsid', 'head', 'repeatedli', 'final', 'take', 'buy', 'spine', 'ah', 'bett', 'girl', 'carey', 'obsess', 'bring', 'world', 'crash', 'around', 'know', 'earth', 'appeal', 'mildr', 'book', 'movi', 'stay', 'true', 'detail', 'play', 'davi', 'becom', 'fascin', 'charact', 'stori', 'nasti', 'unlik', 'least', 'dynam', 'person', 'screen', 'given', 'time', 'davi', 'perform', 'credit', 'chang', 'cours', 'screen', 'act', 'much', 'brando', 'would', '

words :  ['okay', 'got', 'back', 'start', 'review', 'let', 'tell', 'one', 'thing', 'want', 'like', 'movi', 'know', 'neg', 'past', 'hope', 'surpris', 'actual', 'come', 'like', 'film', 'fact', 'everi', 'horror', 'clich', 'imagin', 'fact', 'make', 'everi', 'littl', 'thing', 'jump', 'scare', 'walk', 'basebal', 'bat', 'left', 'floor', 'kid', 'scari', 'one', 'thing', 'surpris', 'blood', 'thought', 'go', 'say', 'much', 'film', 'start', 'donna', 'drop', 'lisa', 'mom', 'hous', 'come', 'goe', 'upstair', 'camera', 'pan', 'father', 'dead', 'couch', 'spooki', 'goe', 'upstair', 'aforement', 'basebal', 'bat', 'scene', 'happen', 'find', 'brother', 'bed', 'appar', 'dead', 'could', 'tell', 'spot', 'blood', 'killer', 'come', 'donna', 'hide', 'bed', 'mom', 'die', 'run', 'outsid', 'scream', 'help', 'killer', 'behind', 'us', 'cut', 'therapi', 'session', 'confus', 'lot', 'peopl', 'everyon', 'ask', 'whether', 'famili', 'actual', 'die', 'imagin', 'mention', 'nightmar', 'start', 'come', 'back', 'filler', 'dialo

words :  ['main', 'reason', 'want', 'see', 'movi', 'wonder', 'cast', 'ton', 'favorit', 'actor', 'one', 'movi', 'equal', 'amaz', 'actual', 'see', 'movi', 'caught', 'guard', 'expect', 'sinc', 'seen', 'rememb', 'could', 'stop', 'laugh', 'cast', 'script', 'amazingli', 'written', 'everi', 'time', 'expect', 'someth', 'happen', 'happen', 'mani', 'twist', 'turn', 'fit', 'whole', 'tone', 'movi', 'instead', 'come', 'pretenti', 'cinematographi', 'along', 'set', 'absolut', 'beauti', 'well', 'realli', 'say', 'anyth', 'bad', 'movi', 'expcept', 'would', 'andrew', 'davoli', 'littl', 'screen', 'time']
words :  ['like', 'stand', 'stay', 'away', 'earli', 'round', 'fact', 'good', 'comic', 'unless', 'got', 'cute', 'qualiti', 'got', 'snowbal', 'chanc', 'controversi', 'materi', 'think', 'hurt', 'much', 'discard', 'comedian', 'see', 'crook', 'judg', 'let', '1', 'top', '4', 'made', 'preliminari', 'half', 'finalist', 'given', '0', 'laugh', 'sever', 'lift', 'materi', 'elsewher', 'someth', 'judg', 'seem', 'proble

words :  ['seven', 'pound', 'movi', 'convinc', 'smith', 'realli', 'go', 'go', 'make', 'cri', 'film', 'one', 'thing', 'give', 'ton', 'credit', 'man', 'cri', 'thing', 'move', 'stori', 'smith', 'prove', 'time', 'time', 'act', 'take', 'extrem', 'depress', 'stori', 'nevertheless', 'still', 'good', 'movi', 'admit', 'made', 'cri', 'felt', 'stand', 'perform', 'rosario', 'dawson', 'absolut', 'love', 'girl', 'ever', 'sinc', 'saw', '25th', 'hour', 'ed', 'norton', 'knew', 'girl', 'go', 'go', 'far', 'beauti', 'charm', 'funni', 'talent', 'wait', 'see', 'much', 'career', 'go', 'go', 'smith', 'sure', 'great', 'chemistri', 'film', 'need', 'would', 'made', 'great', 'film', 'two', 'year', 'ago', 'tim', 'thoma', 'car', 'crash', 'caus', 'use', 'mobil', 'phone', 'seven', 'peopl', 'die', 'six', 'stranger', 'fianc', 'e', 'year', 'crash', 'quit', 'job', 'aeronaut', 'engin', 'tim', 'donat', 'lung', 'lobe', 'brother', 'ben', 'ir', 'employe', 'six', 'month', 'later', 'donat', 'part', 'liver', 'child', 'servic', '

words :  ['titan', 'classic', 'realli', 'surpris', 'movi', 'solid', 'ten', 'overal', 'imdb', 'user', 'rank', 'mayb', 'cool', 'give', 'titan', 'credit', 'nowaday', 'first', 'made', 'realli', 'someth', 'movi', 'came', 'peopl', 'flock', 'theater', 'came', 'video', 'sister', 'would', 'watch', 'twice', 'day', 'month', 'safe', 'say', 'obsess', 'good', 'reason', 'disast', 'scene', 'hard', 'forgot', 'like', 'frozen', 'babi', 'guy', 'commit', 'suicid', 'kill', 'someon', 'unruli', 'crowd', 'mani', 'peopl', 'die', 'ship', 'convey', 'film', 'immediaci', 'emot', 'need', 'hard', 'challeng', 'jame', 'cameron', 'step', 'let', 'forget', 'amaz', 'romanc', 'jack', 'rose', 'whether', 'relationship', 'figment', 'someon', 'imagin', 'love', 'bare', 'knew', 'would', 'die', 'trust', 'sure', 'hell', 'give', 'romeo', 'juliet', 'run', 'money', 'never', 'let', 'go', 'jack', 'titan', 'great', 'film', 'core', 'power', 'stori', 'told', 'brilliant', 'act', 'excel', 'cinematographi', 'beauti', 'music', 'crew', 'full', 

words :  ['thoma', 'inc', 'alway', 'knack', 'bring', 'simpl', 'homespun', 'stori', 'life', 'full', 'flair', 'italian', 'film', 'solid', 'act', 'particularli', 'georg', 'beban', 'father', 'silent', 'child', 'actor', 'georg', 'beban', 'jr', 'wonder', 'set', 'convey', 'realist', 'feel', 'earli', 'immigr', 'tenement', 'new', 'york', 'give', '1915', 'film', 'authent', 'unusu', 'featur', 'vintag', 'film', 'begin', 'modern', 'day', 'man', 'georg', 'beban', 'modern', 'cloth', 'read', 'stori', 'italian', 'immigr', 'transit', 'stori', 'georg', 'play', 'immigr', 'rais', 'enough', 'money', 'bring', 'fianc', 'e', 'itali', 'america', 'marri', 'son', 'time', 'hard', 'famili', 'struggl', 'surviv', 'found', 'wonder', 'mother', 'breastfe', 'child', 'avoid', 'complic', 'dirti', 'formula', 'oh', 'well', 'even', 'earli', 'dream', 'factori', 'push', 'polit', 'correct', 'behaviour', 'women', '1915', 'best', 'scene', 'pictur', 'beban', 'chanc', 'seek', 'reveng', 'crime', 'boss', 'inadvert', 'put', 'jail', 'la

words :  ['rate', '21', 'film', 'snob', 'see', 'blog', 'see', 'next', 'detail', 'rate', 'system', 'movi', 'claw', 'face', 'attempt', 'earn', 'releas', 'screen', 'tedium', 'wring', 'hand', 'roll', 'eye', 'sigh', 'popcorn', 'inde', 'movi', 'averagous', 'claw', 'face', 'begin', 'claw', 'face', 'begin', 'must', 'start', 'lower', 'portion', 'need', 'upper', 'portion', 'handi', 'tear', 'duct', 'intact', 'truli', 'tear', 'jerk', 'third', 'act', 'may', 'bring', 'knee', 'claw', 'way', 'clear', 'entir', 'theatr', 'season', 'celebr', 'joe', 'six', 'pack', 'hockey', 'mom', 'new', 'gold', 'standard', 'leadership', 'foreign', 'diplomaci', 'permayb', 'movi', 'tedium', 'welcom', 'thing', 'anyon', 'could', 'creat', 'watch', 'howev', 'much', 'danger', 'undertak', 'stori', 'sidney', 'young', 'london', 'publish', 'fourth', 'tier', 'celebr', 'entertain', 'magazin', 'see', 'magazin', 'go', 'need', 'miracl', 'get', 'phone', 'call', 'new', 'york', 'citi', 'usa', 'publish', 'sharp', 'magazin', 'clayton', 'hard

words :  ['saw', 'film', 'store', 'cheap', 'section', 'actual', 'vividli', 'rememb', 'see', 'commerci', 'trailer', 'year', 'ago', 'thought', 'hey', 'bought', 'basic', 'plot', 'sound', 'interest', 'clair', 'dane', 'alway', 'someon', 'talent', 'eye', 'also', 'becam', 'huge', 'kate', 'beckinsal', 'fan', 'two', 'girl', 'sneak', 'vacat', 'bangkok', 'get', 'bust', 'narcot', 'innoc', 'sent', 'thailand', 'prison', 'film', 'follow', 'happen', 'time', 'question', 'innoc', 'clair', 'dane', 'kate', 'beckinsal', 'give', 'great', 'perform', 'plot', 'film', 'wrap', 'unconvent', 'rais', 'nice', 'moral', 'discuss', 'question', 'think', 'solid', 'good', 'film', 'could', 'improv', 'could', 'longer', 'would', 'help', 'solidifi', 'charact', 'insight', 'polit', 'thailand', 'justic', 'system', 'would', 'help', 'nevertheless', 'good', 'film', 'great', 'perform', 'p', 'pop', 'cultur', 'junki', 'lookout', 'two', 'minut', 'role', 'paul', 'walker', 'even', 'notic', 'first', 'time', 'saw', 'film']
words :  ['know'

words :  ['love', 'cinema', 'confess', 'embarrass', 'deepli', 'given', 'thumb', 'che', 'part', '1', '2', 'without', 'seen', 'film', 'terribl', 'know', 'felt', 'trap', 'perpetr', 'realli', 'know', 'neg', 'word', 'mouth', 'spread', 'like', 'wild', 'fire', 'matter', 'smart', 'think', 'fell', 'thank', 'bump', 'argeninean', 'film', 'director', 'martin', 'donovan', 'man', 'love', 'admir', 'told', 'go', 'see', 'knew', 'film', 'failur', 'look', 'readi', 'punch', 'right', 'face', 'donovan', 'pacifist', 'took', 'asid', 'told', 'much', 'love', 'film', 'went', 'see', 'part', 'straight', 'away', 'che', 'part', '1', '2', 'best', 'film', 'seen', '2008', 'puriti', 'remind', 'work', 'great', 'master', 'past', 'regard', 'audienc', 'someth', 'use', 'anymor', 'know', 'ever', 'rivet', 'move', 'without', 'concess', 'benicio', 'del', 'toro', 'extraordinari', 'see', 'soul', 'actual', 'perceiv', 'human', 'man', 'overwhelm', 'thank', 'martin', 'donovan', 'educ', 'honestli', 'bravo', 'del', 'toro', 'bravo', 'sod

words :  ['titl', 'robot', 'jox', '1990', 'director', 'stuart', 'gordon', 'cast', 'gari', 'graham', 'ann', 'mari', 'johnson', 'paul', 'koslo', 'review', 'stuart', 'gordon', 'usual', 'associ', 'extrem', 'gori', 'horror', 'film', 'anim', 'beyond', 'dagon', 'castl', 'freak', 'took', 'small', 'detour', 'littl', 'sci', 'fi', 'flick', 'stress', 'word', 'littl', 'sinc', 'low', 'budget', 'flick', 'lie', 'main', 'weak', 'stori', 'take', 'place', 'futur', 'world', 'great', 'superpow', 'accord', 'movi', 'unit', 'state', 'russia', 'duke', 'differ', 'go', 'full', 'blown', 'world', 'war', 'fight', 'gladiat', 'style', 'battl', 'gigant', 'robot', 'hero', 'achil', 'must', 'go', 'evil', 'russian', 'robot', 'fighter', 'call', 'alexand', 'lot', 'cheap', 'stop', 'motion', 'anim', 'ensu', 'well', 'idea', 'awesom', 'guess', 'great', 'nation', 'settlel', 'territori', 'disput', 'giant', 'robot', 'interest', 'premis', 'one', 'could', 'handl', 'properli', 'proper', 'budget', 'avail', 'unfortun', 'could', 'fun', 

words :  ['old', 'secur', 'guard', 'guy', 'die', 'kevin', 'world', 'biggest', 'wuss', 'kevin', 'want', 'impress', 'incred', 'insensit', 'bratti', 'virgin', 'girlfriend', 'ami', 'return', 'work', 'random', 'hous', 'find', 'friend', 'sexual', 'confus', 'red', 'short', 'kyle', 'truli', 'revolt', 'sluttish', 'daphn', 'soon', 'join', 'daphn', 'boyfriend', 'trigger', 'happi', 'sex', 'craze', 'macho', 'lunkhead', 'nick', 'titl', 'creatur', 'horrid', 'littl', 'dogear', 'puppet', 'kill', 'peopl', 'give', 'heart', 'desir', 'kyle', 'heart', 'desir', 'mate', 'creepi', 'yucki', 'woman', 'spandex', 'nick', 'heart', 'desir', 'throw', 'grenad', 'grade', 'school', 'cafeteria', 'mean', 'nightclub', 'kevin', 'heart', 'desir', 'beat', 'skinni', 'thug', 'nunchuck', 'ami', 'heart', 'desir', 'disgust', 'slut', 'daphn', 'alreadi', 'disgust', 'slut', 'heart', 'desir', 'along', 'way', 'truli', 'hideou', 'band', 'sing', 'truli', 'odd', 'song', 'hobgoblin', 'randomli', 'go', 'back', 'came', 'blow', 'citizen', 'ka

words :  ['could', 'seen', 'coup', 'toward', 'sexual', 'revolut', 'purpos', 'use', 'quotat', 'word', 'jean', 'eustach', 'wrote', 'direct', 'mother', 'whore', 'poetic', 'damn', 'critiqu', 'seem', 'get', 'enough', 'love', 'messag', 'film', 'hope', 'messag', 'would', 'come', 'fact', 'els', 'ben', 'hur', 'length', 'featur', 'offer', 'order', 'love', 'honestli', 'level', 'happi', 'real', 'truth', 'possibl', 'two', 'lover', 'tri', 'outcom', 'one', 'realli', 'realli', 'want', 'feel', 'even', 'express', 'say', 'want', 'truth', 'relationship', 'alexandr', 'jean', 'pierr', 'leaud', 'women', 'around', 'twenti', 'someth', 'pseudo', 'intellectu', 'seem', 'job', 'live', 'woman', 'mari', 'bernadett', 'lafont', 'slightli', 'older', 'usual', 'alway', 'lover', 'last', 'possibl', 'love', 'life', 'left', 'right', 'away', 'pick', 'woman', 'see', 'street', 'veronika', 'fran', 'ois', 'lebrun', 'perhap', 'remind', 'soon', 'unfold', 'subtli', 'torrid', 'love', 'triangl', 'ever', 'put', 'film', 'psycholog', 'st

words :  ['comment', 'miniseri', 'perspect', 'someon', 'read', 'novel', 'first', 'perspect', 'honestli', 'say', 'enjoy', 'see', 'rebroadcast', 'anytim', 'recent', 'specif', 'mini', 'seriou', 'problem', '1', 'terribl', 'miscast', 'actor', 'play', 'younger', 'gener', '15', '20', 'year', 'older', 'charact', 'ali', 'mcgraw', '45', 'time', 'play', 'natali', 'jastrow', 'suppos', '26', 'jan', 'michael', 'vincent', '39', 'time', 'play', 'byron', 'henri', 'suppos', '22', 'henri', 'children', 'pamela', 'tudsburi', 'also', 'play', 'actor', 'way', 'old', 'charact', 'suppos', '20', '2', 'act', 'absolut', 'aw', 'ali', 'mcgraw', 'time', 'almost', 'made', 'mini', 'unwatch', 'seen', 'convinc', 'perform', 'high', 'school', 'play', '3', 'direct', 'poor', 'fair', 'ali', 'mcgraw', 'bad', 'act', 'charact', 'develop', 'probabl', 'direct', 'portray', 'hitler', 'way', 'overdon', 'charact', 'came', 'look', 'behav', 'like', 'cartoon', 'villain', 'charismat', 'sometim', 'charm', 'alway', 'diabol', 'geniu', 'herma

KeyboardInterrupt: 

### Extract Bag-of-Words features

For the model we will be implementing, rather than using the reviews directly, we are going to transform each review into a Bag-of-Words feature representation. Keep in mind that 'in the wild' we will only have access to the training set so our transformer can only use the training set to construct a representation.
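
To make the representation concrete, here is a minimal sketch of Bag-of-Words that uses only the standard library. The helper names `build_vocabulary` and `to_bow` and the toy documents are illustrative assumptions, not part of the notebook; `CountVectorizer` does the same job far more efficiently. The vocabulary is built from the training documents only, and words not seen during training are dropped when a test document is encoded.

```python
from collections import Counter

def build_vocabulary(train_docs, vocabulary_size):
    """Map the most frequent training words to column indices."""
    counts = Counter(word for doc in train_docs for word in doc)
    # Sort by descending count, then alphabetically, so ties are resolved deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    kept = sorted(word for word, _ in ranked[:vocabulary_size])
    return {word: index for index, word in enumerate(kept)}

def to_bow(doc, vocabulary):
    """Count occurrences of each vocabulary word; unknown words are ignored."""
    row = [0] * len(vocabulary)
    for word in doc:
        index = vocabulary.get(word)
        if index is not None:
            row[index] += 1
    return row

# Toy, already-tokenized documents (hypothetical data).
toy_train_docs = [["movi", "good", "movi"], ["movi", "bad", "plot"]]
toy_test_doc = ["good", "plot", "plot", "unseen"]

toy_vocabulary = build_vocabulary(toy_train_docs, vocabulary_size=3)
toy_train_features = [to_bow(doc, toy_vocabulary) for doc in toy_train_docs]
toy_test_features = to_bow(toy_test_doc, toy_vocabulary)
```

Because `plot` falls outside the three-word vocabulary and `unseen` never appears in training, neither contributes to the test row.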

In [78]:
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
# from sklearn.externals import joblib
import joblib
# joblib is an enhanced version of pickle that is more efficient for storing NumPy arrays

def extract_BoW_features(words_train, words_test, vocabulary_size=5000,
                         cache_dir=cache_dir, cache_file="bow_features.pkl"):
    """Extract Bag-of-Words for a given set of documents, already preprocessed into words."""
    
    # If cache_file is not None, try to read from it first
    cache_data = None
    if cache_file is not None:
        try:
            with open(os.path.join(cache_dir, cache_file), "rb") as f:
                cache_data = joblib.load(f)
            print("Read features from cache file:", cache_file)
        except Exception:
            pass  # unable to read from cache, but that's okay
    
    # If cache is missing, then do the heavy lifting
    if cache_data is None:
        # Fit a vectorizer to training documents and use it to transform them
        # NOTE: Training documents have already been preprocessed and tokenized into words;
        #       pass in dummy functions to skip those steps, e.g. preprocessor=lambda x: x
        vectorizer = CountVectorizer(max_features=vocabulary_size,
                preprocessor=lambda x: x, tokenizer=lambda x: x)  # already preprocessed
        features_train = vectorizer.fit_transform(words_train).toarray()

        # Apply the same vectorizer to transform the test documents (ignore unknown words)
        features_test = vectorizer.transform(words_test).toarray()
        
        # NOTE: .toarray() converts the sparse matrix returned by the vectorizer into a dense NumPy array
        
        # Write to cache file for future runs (store vocabulary as well)
        if cache_file is not None:
            vocabulary = vectorizer.vocabulary_
            cache_data = dict(features_train=features_train, features_test=features_test,
                             vocabulary=vocabulary)
            with open(os.path.join(cache_dir, cache_file), "wb") as f:
                joblib.dump(cache_data, f)
            print("Wrote features to cache file:", cache_file)
    else:
        # Unpack data loaded from cache file
        features_train, features_test, vocabulary = (cache_data['features_train'],
                cache_data['features_test'], cache_data['vocabulary'])
    
    # Return both the extracted features as well as the vocabulary
    return features_train, features_test, vocabulary

In [79]:
train_X, test_X, vocabulary = extract_BoW_features(result_sample_train_X, result_sample_test_X)

Wrote features to cache file: bow_features.pkl


In [87]:
train_X

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,2107,2108,2109,2110,2111,2112,2113,2114,2115,2116


In [82]:
vocabulary

{'oliv': 1291,
 'gruner': 838,
 'total': 1917,
 'unknown': 1974,
 'friend': 759,
 'show': 1668,
 'film': 703,
 'seen': 1624,
 'call': 287,
 'pretti': 1415,
 'good': 812,
 'sci': 1600,
 'fi': 696,
 'nemesi': 1246,
 'watch': 2035,
 'found': 744,
 'fastforward': 682,
 'bs': 272,
 'drama': 564,
 'part': 1328,
 'get': 790,
 'unbeliev': 1956,
 'action': 49,
 'sequenc': 1639,
 'love': 1102,
 'kick': 1016,
 'hahagrun': 849,
 'charact': 327,
 'graduat': 821,
 'student': 1803,
 'forc': 734,
 'stay': 1766,
 'ghetto': 791,
 'close': 376,
 'one': 1293,
 'grew': 830,
 'find': 708,
 'boy': 250,
 'live': 1090,
 'realli': 1477,
 'want': 2025,
 'join': 994,
 'mexican': 1173,
 'gang': 772,
 'keep': 1009,
 'torment': 1915,
 'famili': 670,
 'instead': 966,
 'tell': 1867,
 'fight': 698,
 'back': 166,
 'crazi': 452,
 'play': 1376,
 'typic': 1950,
 'van': 1992,
 'damm': 479,
 'kill': 1018,
 'everyon': 638,
 'maim': 1118,
 'bad': 168,
 'work': 2082,
 'rid': 1528,
 'block': 224,
 'gangmemb': 773,
 'plot': 1380,

In [None]:
# Extract Bag of Words features for both training and test datasets
# train_X, test_X, vocabulary = extract_BoW_features(train_X, test_X)

## Step 4: Classification using XGBoost

Now that we have created the feature representation of our training (and testing) data, it is time to start setting up and using the XGBoost classifier provided by SageMaker.

### (TODO) Writing the dataset

The XGBoost classifier that we will be using requires the dataset to be written to a file and stored using Amazon S3. To do this, we will start by splitting the training dataset into two parts: the data we will train the model with and a validation set. Then, we will write those datasets to files and upload the files to S3. In addition, we will write the test set input to a file and upload that file to S3, so that we can use SageMaker's Batch Transform functionality to test our model once we've fit it.

In [83]:
import pandas as pd

val_X = pd.DataFrame(train_X[:10000])
train_X = pd.DataFrame(train_X[10000:])

val_y = pd.DataFrame(train_y[:10000])
train_y = pd.DataFrame(train_y[10000:])
# TODO: Split the train_X and train_y arrays into the DataFrames val_X, train_X and val_y, train_y. Make sure that
#       val_X and val_y contain 10 000 entries while train_X and train_y contain the remaining 15 000 entries.


In [85]:
val_X

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,2107,2108,2109,2110,2111,2112,2113,2114,2115,2116
0,0,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,1,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
2,0,0,2,0,0,1,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
7,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
9,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


The documentation for the XGBoost algorithm in SageMaker requires that the saved datasets contain no header row and no index column, and that, for the training and validation data, the label comes first in each row.
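
The required layout can be sketched with the standard library alone. The helper `to_xgboost_csv` and the toy rows below are hypothetical and only illustrate the format: no header, no index, and the label in the first column when one is supplied.

```python
import csv
import io

def to_xgboost_csv(features, labels=None):
    """Serialize rows as header-free CSV, with the label first when labels are given."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for position, row in enumerate(features):
        # Training and validation rows start with the label; test rows do not.
        prefix = [] if labels is None else [labels[position]]
        writer.writerow(prefix + list(row))
    return buffer.getvalue()

toy_X = [[0, 2, 1], [1, 0, 0]]  # two samples, three word counts each
toy_y = [1, 0]                  # 1 = positive, 0 = negative

labelled_lines = to_xgboost_csv(toy_X, toy_y).splitlines()
unlabelled_lines = to_xgboost_csv(toy_X).splitlines()
```

With the DataFrames used in this notebook, `pd.concat([val_y, val_X], axis=1).to_csv(path, header=False, index=False)` would produce the same layout.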

For more information about this and other algorithms, the SageMaker developer documentation can be found on __[Amazon's website.](https://docs.aws.amazon.com/sagemaker/latest/dg/)__

In [None]:
# First we make sure that the local directory in which we'd like to store the training and validation csv files exists.
data_dir = '../data/xgboost'
if not os.path.exists(data_dir):
    os.makedirs(data_dir)

In [None]:
# First, save the test data to test.csv in the data_dir directory. Note that we do not save the associated ground truth
# labels, instead we will use them later to compare with our model output.

pd.DataFrame(test_X).to_csv(os.path.join(data_dir, 'test.csv'), header=False, index=False)

# TODO: Save the training and validation data to train.csv and validation.csv in the data_dir directory.
#       Make sure that the files you create are in the correct format.


In [None]:
# To save a bit of memory we can set test_X, train_X, val_X, train_y and val_y to None.

test_X = train_X = val_X = train_y = val_y = None

### (TODO) Uploading Training / Validation files to S3

Amazon's S3 service allows us to store files that can be accessed both by the built-in training models, such as the XGBoost model we will be using, and by custom models such as the one we will see a little later.

For this, and most other tasks we will be doing using SageMaker, there are two methods we could use. The first is to use the low-level functionality of SageMaker, which requires knowing each of the objects involved in the SageMaker environment. The second is to use the high-level functionality, in which certain choices have been made on the user's behalf. The low-level approach gives the user a great deal of flexibility, while the high-level approach makes development much quicker. For our purposes we will opt for the high-level approach, although the low-level approach is certainly an option.

Recall the method `upload_data()`, which is a member of the object representing our current SageMaker session. This method uploads the data to the default bucket (which is created if it does not exist) under the path given by the `key_prefix` argument. To see this for yourself, once you have uploaded the data files, go to the S3 console and look at where the files have been uploaded.
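
As a rough sketch of how the three uploads could be organized, the hypothetical helper below calls `upload_data()` once per file and collects the returned S3 locations. `_StubSession` and the bucket name `example-bucket` are stand-ins invented so the sketch runs without AWS credentials; in the notebook you would pass the real `session` object instead.

```python
import os

def upload_splits(session, data_dir, prefix, names=("test", "validation", "train")):
    """Upload <name>.csv for each split and return the resulting S3 URIs by name."""
    locations = {}
    for name in names:
        local_path = os.path.join(data_dir, name + ".csv")
        # upload_data returns the S3 URI of the uploaded file.
        locations[name] = session.upload_data(local_path, key_prefix=prefix)
    return locations

class _StubSession:
    """Stand-in for a SageMaker session (assumption: mimics only upload_data)."""

    def upload_data(self, path, bucket=None, key_prefix="data"):
        return "s3://example-bucket/{}/{}".format(key_prefix, os.path.basename(path))

example_locations = upload_splits(_StubSession(), "../data/xgboost", "sentiment-xgboost")
```

Keeping the locations in a dictionary is one design choice among many; assigning them to `test_location`, `val_location` and `train_location` individually works just as well.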

For additional resources, see the __[SageMaker API documentation](http://sagemaker.readthedocs.io/en/latest/)__ and the __[SageMaker Developer Guide.](https://docs.aws.amazon.com/sagemaker/latest/dg/)__

In [None]:
import sagemaker

session = sagemaker.Session() # Store the current SageMaker session

# S3 prefix (which folder will we use)
prefix = 'sentiment-xgboost'

# TODO: Upload the test.csv, train.csv and validation.csv files which are contained in data_dir to S3 using session.upload_data().
test_location = None
val_location = None
train_location = None

### (TODO) Creating the XGBoost model

Now that the data has been uploaded it is time to create the XGBoost model. To begin with, we need to do some setup. At this point it is worth discussing what a model is in SageMaker. It is easiest to think of a model as comprising three different objects in the SageMaker ecosystem, which interact with one another.

- Model Artifacts
- Training Code (Container)
- Inference Code (Container)

The Model Artifacts are what you might think of as the actual model itself. For example, if you were building a neural network, the model artifacts would be the weights of the various layers. In our case, for an XGBoost model, the artifacts are the actual trees that are created during training.

The other two objects, the training code and the inference code, are then used to manipulate the model artifacts. More precisely, the training code uses the training data that is provided and creates the model artifacts, while the inference code uses the model artifacts to make predictions on new data.

The way that SageMaker runs the training and inference code is by making use of Docker containers. For now, think of a container as being a way of packaging code up so that dependencies aren't an issue.

In [None]:
from sagemaker import get_execution_role

# Our current execution role is required when creating the model as the training
# and inference code will need to access the model artifacts.
role = get_execution_role()

In [None]:
# We need to retrieve the location of the container which is provided by Amazon for using XGBoost.
# As a matter of convenience, the training and inference code both use the same container.
from sagemaker.amazon.amazon_estimator import get_image_uri

container = get_image_uri(session.boto_region_name, 'xgboost')

In [None]:
# TODO: Create a SageMaker estimator using the container location determined in the previous cell.
#       It is recommended that you use a single training instance of type ml.m4.xlarge. It is also
#       recommended that you use 's3://{}/{}/output'.format(session.default_bucket(), prefix) as the
#       output path.

xgb = None


# TODO: Set the XGBoost hyperparameters in the xgb object. Don't forget that in this case we have a binary
#       label so we should be using the 'binary:logistic' objective.
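
For orientation, a set of hyperparameters for a binary classifier might look like the sketch below. The helper name and every numeric value are illustrative assumptions rather than recommendations from this notebook; only the `binary:logistic` objective is dictated by the task.

```python
def binary_xgboost_hyperparameters(num_round=500):
    """Illustrative starting values for a binary classifier; tune them for real use."""
    return {
        "max_depth": 5,                  # maximum depth of each tree
        "eta": 0.2,                      # learning rate (step size shrinkage)
        "gamma": 4,                      # minimum loss reduction needed to split a node
        "min_child_weight": 6,           # minimum sum of instance weight in a child
        "subsample": 0.8,                # fraction of rows sampled for each tree
        "objective": "binary:logistic",  # the model outputs a probability in [0, 1]
        "early_stopping_rounds": 10,     # stop when validation stops improving
        "num_round": num_round,          # upper bound on boosting rounds
    }

example_hyperparameters = binary_xgboost_hyperparameters()
# In the notebook these could be applied to the estimator with:
#     xgb.set_hyperparameters(**example_hyperparameters)
```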


### Fit the XGBoost model

Now that our model has been set up we simply need to attach the training and validation datasets and then ask SageMaker to set up the computation.

In [None]:
s3_input_train = sagemaker.s3_input(s3_data=train_location, content_type='csv')
s3_input_validation = sagemaker.s3_input(s3_data=val_location, content_type='csv')

In [None]:
xgb.fit({'train': s3_input_train, 'validation': s3_input_validation})

### (TODO) Testing the model

Now that we've fit our XGBoost model, it's time to see how well it performs. To do this we will use SageMaker's Batch Transform functionality. Batch Transform is a convenient way to perform inference on a large dataset in a way that is not real-time. That is, we don't necessarily need to use our model's results immediately, and instead we can perform inference on a large number of samples. An example of this in industry might be producing an end-of-month report. This method of inference is also useful to us because it means we can perform inference on our entire test set.

To perform a Batch Transform we first need to create a transformer object from our trained estimator object.

In [None]:
# TODO: Create a transformer object from the trained model. Using an instance count of 1 and an instance type of ml.m4.xlarge
#       should be more than enough.
xgb_transformer = None

Next we actually perform the transform job. When doing so we need to make sure to specify the type of data we are sending so that it is serialized correctly in the background. In our case we are providing our model with csv data so we specify `text/csv`. Also, if the test data that we have provided is too large to process all at once then we need to specify how the data file should be split up. Since each line is a single entry in our data set we tell SageMaker that it can split the input on each line.
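
The sequence described above (create a transformer, start the job with a content type and split type, then wait) can be sketched as a small helper. The function name `run_batch_transform` and the two stub classes are assumptions made so the sketch runs without AWS; in the notebook the trained `xgb` estimator and `test_location` would take their place.

```python
def run_batch_transform(estimator, data_location, instance_type="ml.m4.xlarge"):
    """Create a transformer, run it on CSV data split by line, and wait for it to finish."""
    transformer = estimator.transformer(instance_count=1, instance_type=instance_type)
    # 'text/csv' tells SageMaker how to serialize the data; 'Line' allows splitting per row.
    transformer.transform(data_location, content_type="text/csv", split_type="Line")
    transformer.wait()
    return transformer

class _StubTransformer:
    """Stand-in that records calls instead of starting a real transform job."""

    def __init__(self):
        self.calls = []

    def transform(self, data, content_type=None, split_type=None):
        self.calls.append(("transform", data, content_type, split_type))

    def wait(self):
        self.calls.append(("wait",))

class _StubEstimator:
    """Stand-in for a trained estimator (assumption: mimics only transformer())."""

    def transformer(self, instance_count, instance_type):
        return _StubTransformer()

stub_transformer = run_batch_transform(_StubEstimator(), "s3://example-bucket/test.csv")
```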

In [None]:
# TODO: Start the transform job. Make sure to specify the content type and the split type of the test data.


Currently the transform job is running but it is doing so in the background. Since we wish to wait until the transform job is done and we would like a bit of feedback we can run the `wait()` method.

In [None]:
xgb_transformer.wait()

Now the transform job has executed and the result, the estimated sentiment of each review, has been saved on S3. Since we would rather work on this file locally we can perform a bit of notebook magic to copy the file to the `data_dir`.

In [None]:
!aws s3 cp --recursive $xgb_transformer.output_path $data_dir

The last step is to read in the output from our model and convert it to something a little more usable, in this case a sentiment of either `1` (positive) or `0` (negative), and then compare the result to the ground truth labels.
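
As a small worked example of this step, the sketch below thresholds predicted probabilities at 0.5 and computes the fraction of correct labels. The helper names and the toy numbers are made up for illustration; `accuracy_score` from scikit-learn computes the same ratio.

```python
def to_labels(probabilities, threshold=0.5):
    """Map each predicted probability to 1 (positive) or 0 (negative)."""
    return [1 if p > threshold else 0 for p in probabilities]

def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the ground truth."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)

toy_probabilities = [0.91, 0.08, 0.62, 0.37]  # hypothetical model output
toy_truth = [1, 0, 0, 0]                      # hypothetical ground truth

toy_predictions = to_labels(toy_probabilities)
toy_accuracy = accuracy(toy_truth, toy_predictions)
```

Here three of the four toy predictions agree with the ground truth, so the accuracy is 0.75.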

In [None]:
predictions = pd.read_csv(os.path.join(data_dir, 'test.csv.out'), header=None)
predictions = [round(num) for num in predictions.squeeze().values]

In [None]:
from sklearn.metrics import accuracy_score
accuracy_score(test_y, predictions)

## Optional: Clean up

The default notebook instance on SageMaker doesn't have a lot of excess disk space available. As you continue to complete and execute notebooks you will eventually fill up this disk space, leading to errors which can be difficult to diagnose. Once you are completely finished using a notebook it is a good idea to remove the files that you created along the way. Of course, you can do this from the terminal or from the notebook hub if you would like. The cell below contains some commands to clean up the created files from within the notebook.

In [None]:
# First we will remove all of the files contained in the data_dir directory
!rm $data_dir/*

# And then we delete the directory itself
!rmdir $data_dir

# Similarly we will remove the files in the cache_dir directory and the directory itself
!rm $cache_dir/*
!rmdir $cache_dir
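
If you would rather clean up from Python than from the shell, a sketch along the following lines should work. The helper `remove_directory` is a hypothetical name, and the example runs against a throwaway temporary directory rather than the real `data_dir` or `cache_dir`.

```python
import os
import shutil
import tempfile

def remove_directory(path):
    """Delete a directory and everything in it; do nothing if it does not exist."""
    shutil.rmtree(path, ignore_errors=True)

# Demonstrate on a scratch directory so no real data is touched.
scratch_dir = tempfile.mkdtemp()
with open(os.path.join(scratch_dir, "train.csv"), "w") as handle:
    handle.write("1,0,2\n")

remove_directory(scratch_dir)
scratch_removed = not os.path.exists(scratch_dir)
```

Unlike `rm $data_dir/*` followed by `rmdir`, `shutil.rmtree` also removes hidden files and subdirectories, so use it only on directories you are sure you no longer need.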