# Amazon Customer Reviews Classification with Babble

### For this task, you will work with Amazon Customer Reviews and write explanations of how to classify their sentiment as positive or negative.

Only 1 star and 5 star reviews are included.


## Load and Prepare the Data

The reviews are available [via Amazon](https://s3.amazonaws.com/amazon-reviews-pds/readme.html).
You can download them from there, or supply a password to unzip the bundled archive in the cell below.

You must replace `PASSWORD` with the password to unzip the data.

In [None]:
!unzip -P PASSWORD data/data.zip
!ls

In [1]:
from data.preparer import load_amazon_dataset

DELIMITER = "#"
df_train, df_dev, df_valid, df_test = load_amazon_dataset(delimiter=DELIMITER)
print("{} training examples".format(len(df_train)))
print("{} development examples".format(len(df_dev)))
print("{} validation examples".format(len(df_valid)))
print("{} test examples".format(len(df_test)))

2500 training examples
500 development examples
399750 validation examples
250 test examples


In [2]:
# Define the label constants
ABSTAIN = 0
NEGATIVE = 1
POSITIVE = 2

Transform the data into a format compatible with Babble Labble:

In [3]:
# Candidate is a helper class that transforms our data into a format Babble can parse
from babble.Candidate import Candidate

dfs = [df_train, df_dev, df_test]

for df in dfs:
    df["id"] = range(len(df))

Cs = [df.apply(lambda x: Candidate(x), axis=1) for df in dfs]

# Babble Labble uses 1 and 2 for labels, while our data uses 0 and 1,
# so add 1 to convert
Ys = [df.label.values + 1 for df in dfs]

In [4]:
from babble import BabbleStream

aliases = {}
babbler = BabbleStream(Cs, Ys, balanced=True, shuffled=True, seed=456, aliases=aliases)

Grammar construction complete.


## Labeling Instructions

All reviews were submitted with either a 1 star (negative) or a 5 star (positive) rating. Your task is to create labeling functions that take the text of a review as input and output a NEGATIVE, POSITIVE, or ABSTAIN label.
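
To make the input/output contract concrete, here is a minimal sketch of what hand-written labeling functions of this kind could look like in plain Python. The function names and the keyword rules are hypothetical illustrations; they are not what Babble generates from an explanation, only the shape of the mapping from review text to a label.

```python
import re

# Same label convention as the notebook: 0 abstains, 1 is negative, 2 is positive
ABSTAIN, NEGATIVE, POSITIVE = 0, 1, 2

def lf_contains_bad(text):
    """Hypothetical rule: vote NEGATIVE if the whole word 'bad' appears."""
    return NEGATIVE if re.search(r"\bbad\b", text, flags=re.IGNORECASE) else ABSTAIN

def lf_contains_great(text):
    """Hypothetical rule: vote POSITIVE if the whole word 'great' appears."""
    return POSITIVE if re.search(r"\bgreat\b", text, flags=re.IGNORECASE) else ABSTAIN

# Made-up review snippets, used only for illustration
reviews = [
    "Bad battery, stopped working after a week.",
    "Great sound quality for the price.",
    "Arrived on Tuesday.",
]

# One (lf_contains_bad, lf_contains_great) vote pair per review
votes = [(lf_contains_bad(r), lf_contains_great(r)) for r in reviews]
print(votes)
```

A function abstains whenever its rule does not apply, so a single review may receive no votes, one vote, or conflicting votes from different functions.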

## Create Explanations

Creating explanations generally happens in five steps:
1. View candidates
2. Write explanations
3. Get feedback
4. Update explanations 
5. Apply label aggregator

Steps 3-5 are optional; explanations may be submitted without any feedback on their quality. In our experience, however, checking how well explanations are parsed and what accuracy and coverage they achieve on a dev set (if one is available) quickly leads to simple improvements that yield significantly more useful labeling functions. Once you have collected a few labeling functions, you can use the label aggregator to identify candidates that are being mislabeled and write additional explanations targeting those failure modes.
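
The last point, finding mislabeled candidates, can be sketched in a few lines. This is an illustration under assumptions: `y_pred` stands in for whatever labels the aggregator produces on the dev set, `y_gold` for the gold dev labels, and `find_mislabeled` is a hypothetical helper, not part of Babble or MeTaL.

```python
ABSTAIN = 0  # same convention as the notebook: 0 means "no label"

def find_mislabeled(y_pred, y_gold):
    """Return indices where a non-abstain prediction disagrees with the gold label."""
    return [
        i
        for i, (pred, gold) in enumerate(zip(y_pred, y_gold))
        if pred != ABSTAIN and pred != gold
    ]

# Toy values for illustration only (1 = NEGATIVE, 2 = POSITIVE)
y_gold = [1, 2, 2, 1, 2]
y_pred = [1, 1, 0, 1, 2]

mislabeled = find_mislabeled(y_pred, y_gold)
print(mislabeled)
```

Reading the candidates at the returned indices is usually the quickest way to decide which new explanation to write next. Abstentions are skipped here because they indicate missing coverage rather than a wrong label.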

### Collection

Use `babbler` to show candidates:

In [5]:
candidate = babbler.next()
print(candidate)

{'key': 3223807, 'text': 'My treo was not at all durable. After one year the flip part did not work so the phone only worked with a headset. After a while the headset would not pick up calls. It also locked up and had to be rebooted in the middle of calls on a regular basis. There was a rebate with the phone and handspring never sent the rebate even with the appropriate documentation.', 'label': 0, 'id': 131, 'tokens': ['My', 'treo', 'was', 'not', 'at', 'all', 'durable', '.', 'After', 'one', 'year', 'the', 'flip', 'part', 'did', 'not', 'work', 'so', 'the', 'phone', 'only', 'worked', 'with', 'a', 'headset', '.', 'After', 'a', 'while', 'the', 'headset', 'would', 'not', 'pick', 'up', 'calls', '.', 'It', 'also', 'locked', 'up', 'and', 'had', 'to', 'be', 'rebooted', 'in', 'the', 'middle', 'of', 'calls', 'on', 'a', 'regular', 'basis', '.', 'There', 'was', 'a', 'rebate', 'with', 'the', 'phone', 'and', 'handspring', 'never', 'sent', 'the', 'rebate', 'even', 'with', 'the', 'appropriate', 'docum

In [6]:
explanations = []

In [7]:
from babble import Explanation
explanation = Explanation(
    name='check_out', # name of this rule, for your reference
    label=NEGATIVE, # label to assign
    condition='The word "bad" is in the text', # natural language description of why you label the candidate this way
    candidate=candidate.mention_id # optional argument, the candidate should be an example labeled by this rule
)
explanations.append(explanation)

Babble will parse your explanations into functions, then filter out functions that are duplicates, that incorrectly label their given candidate, or that assign the same label to all examples.
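
The idea behind two of these filters can be sketched in plain Python. This is a simplified illustration, not Babble's implementation: `signature` and `filter_lfs` are hypothetical helpers, and only the "same label for all examples" and "duplicate" checks are modeled (the duplicate check here compares labeling behavior on a small sample).

```python
ABSTAIN, NEGATIVE, POSITIVE = 0, 1, 2

def signature(lf, texts):
    """The tuple of labels an LF assigns to a fixed list of texts."""
    return tuple(lf(t) for t in texts)

def filter_lfs(lfs, texts):
    """Drop LFs with a uniform signature or a signature already seen."""
    kept, seen = [], set()
    for lf in lfs:
        sig = signature(lf, texts)
        if len(set(sig)) == 1:   # labels nothing, or labels everything the same
            continue
        if sig in seen:          # behaves identically to an LF already kept
            continue
        seen.add(sig)
        kept.append(lf)
    return kept

def lf_bad(text):
    return NEGATIVE if "bad" in text else ABSTAIN

def lf_bad_dup(text):
    return NEGATIVE if "bad" in text.lower() else ABSTAIN

def lf_always_positive(text):
    return POSITIVE

def lf_never(text):
    return ABSTAIN

# Made-up texts for illustration
texts = ["bad product", "great product", "it is fine"]
kept = filter_lfs([lf_bad, lf_bad_dup, lf_always_positive, lf_never], texts)
print([lf.__name__ for lf in kept])
```

An LF that votes the same way on every example carries no signal for distinguishing candidates, which is why it is discarded along with exact behavioral duplicates.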

In [8]:
parses, filtered = babbler.apply(explanations)

Building list of target candidate ids...
Collected 1 unique target candidate ids from 1 explanations.
Gathering desired candidates...
Found 1/1 desired candidates
Linking explanations to candidates...
Linked 1/1 explanations
1 explanation(s) out of 1 were parseable.
1 parse(s) generated from 1 explanation(s).
1 parse(s) remain (0 parse(s) removed by DuplicateSemanticsFilter).
Note: 1 LFs did not have candidates and therefore could not be filtered.
1 parse(s) remain (0 parse(s) removed by ConsistencyFilter).
Applying labeling functions to investigate labeling signature.

1 parse(s) remain (0 parse(s) removed by UniformSignatureFilter: (0 None, 0 All)).
1 parse(s) remain (0 parse(s) removed by DuplicateSignatureFilter).
1 parse(s) remain (0 parse(s) removed by LowestCoverageFilter).


### Analysis
See how your explanations were parsed and filtered

In [9]:
babbler.analyze(parses)

| | j | Polarity | Coverage | Overlaps | Conflicts | Correct | Incorrect | Emp. Acc. |
|---|---|---|---|---|---|---|---|---|
| check_out_0 | 0 | 1.0 | 0.072 | 0.0 | 0.0 | 25 | 11 | 0.694444 |


In [10]:
babbler.filtered_analysis(filtered)

No filtered parses to analyze.


In [11]:
babbler.commit()

Added 1 parse(s) from 1 explanations to set. (Total # parses = 1)

Applying labeling functions to split 1

Added 36 labels to split 1: L.nnz = 36, L.shape = (500, 1).
Applying labeling functions to split 2

Added 16 labels to split 2: L.nnz = 16, L.shape = (250, 1).


### Evaluation
Get feedback on the performance of your explanations
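
The Coverage and Emp. Acc. columns can be checked by hand from the counts the notebook already reports for the dev split: 500 dev examples, of which `check_out_0` labeled 25 correctly and 11 incorrectly. The sketch below assumes the usual definitions (coverage is the labeled fraction, empirical accuracy is the correct fraction among labeled examples), which match the reported values.

```python
# Counts taken from the analysis output for check_out_0 on the dev split
n_dev = 500
correct, incorrect = 25, 11

n_labeled = correct + incorrect   # examples where the LF did not abstain
coverage = n_labeled / n_dev      # fraction of the dev set that was labeled
emp_acc = correct / n_labeled     # accuracy among the labeled examples

print(n_labeled, round(coverage, 3), round(emp_acc, 6))
```

The labeled count of 36 also matches the `L.nnz = 36` reported when the labeling function was applied to split 1.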

In [12]:
from metal.analysis import lf_summary

Ls = [babbler.get_label_matrix(split) for split in [0,1,2]]
lf_names = [lf.__name__ for lf in babbler.get_lfs()]
lf_summary(Ls[1], Ys[1], lf_names=lf_names)

Retrieved label matrix for split 0: L.nnz = 148, L.shape = (2500, 1)
Retrieved label matrix for split 1: L.nnz = 36, L.shape = (500, 1)
Retrieved label matrix for split 2: L.nnz = 16, L.shape = (250, 1)


| | j | Polarity | Coverage | Overlaps | Conflicts | Correct | Incorrect | Emp. Acc. |
|---|---|---|---|---|---|---|---|---|
| check_out_0 | 0 | 1 | 0.072 | 0.0 | 0.0 | 25 | 11 | 0.694444 |


In [13]:
from metal import LabelModel
from metal.tuners import RandomSearchTuner

search_space = {
    'n_epochs': [50, 100, 500],
    'lr': {'range': [0.01, 0.001], 'scale': 'log'},
    'show_plots': False,
}

tuner = RandomSearchTuner(LabelModel, seed=123)

label_aggregator = tuner.search(
    search_space, 
    train_args=[Ls[0]], 
    X_dev=Ls[1], Y_dev=Ys[1], 
    max_search=20, verbose=False, metric='f1')

[SUMMARY]
Best model: [1]
Best config: {'n_epochs': 500, 'show_plots': False, 'lr': 0.0012223249524949424, 'seed': 123}
Best score: 0.6859395532194481


If you'd like to save the explanations you've generated, you can use the `ExplanationIO` object to write them to, or read them from, a file.

In [None]:
from babble.utils import ExplanationIO

FILE = "my_explanations.tsv"
exp_io = ExplanationIO()
# Save the explanations, then load them back from the same file
exp_io.write(explanations, FILE)
explanations = exp_io.read(FILE)