This notebook extracts the passage/review pairs in the validation set that the model (the one loaded in testing_util) matches correctly. Looking at these pairs, as well as the pairs it failed to match, can provide some insight into what kinds of features it has learned.

In [33]:
from dataloading import get_dataset

In [34]:
_, evalset = get_dataset()
evalset_size = len(evalset)

In [35]:
from testing_util import tok, get_batch_tokens, model
from util import generate_indices
import torch

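def encode_and_val(pass_mbs, rev_mbs):
    """Encode passage and review microbatches, then score them with model.cLoss."""
    with torch.no_grad():
        # Encode one microbatch at a time; each entry is a (tokens, masks) tuple.
        pass_encs = [model.encodeX(tokens, masks)
                     for (tokens, masks) in pass_mbs]

        rev_encs = [model.encodeY(tokens, masks)
                    for (tokens, masks) in rev_mbs]

        # Loss and accuracy are computed over all encodings concatenated together.
        test_loss, test_acc = model.cLoss(torch.cat(pass_encs), torch.cat(rev_encs))
    return pass_encs, rev_encs, test_loss, test_acc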
    
MB_SIZE = 4

pass_t, pass_m, rev_t, rev_m = get_batch_tokens(evalset, torch.arange(evalset_size))
mb_inds = generate_indices(evalset_size, MB_SIZE, shuffle = False)

pass_mbs = [(pass_t[ind], pass_m[ind]) for ind in mb_inds]
rev_mbs = [(rev_t[ind], rev_m[ind]) for ind in mb_inds]

pass_encs, rev_encs, _, total_acc = encode_and_val(pass_mbs, rev_mbs)
pass_encs = torch.cat(pass_encs)
rev_encs = torch.cat(rev_encs)

print(total_acc)
logits = pass_encs @ rev_encs.T


tensor(0.1800, device='cuda:0')


In [36]:
success_pairs = []
fail_pairs = []

for i, row in enumerate(logits):
    if row.argmax().item() == i:
        success_pairs.append(evalset[i])
    else:
        fail_pairs.append(evalset[i])
        

In [37]:
for i, pair in enumerate(success_pairs):
    print(str(i) + ":" + pair[1])
    print("")

0:Sam3 seems oddly familiar with Jane0's name. Did she know Jane0 before this evening? Perhaps I missed that.

1:A comment about height might be rude, but I'm not sure how it could be much other than innocent (or am I too vanilla?)

2:Setting aside Sam0 'hiding' Sam1's name and their prior relationship from the reader, I do like the way that she teases Sam1 with it

3:Slightly odd that in an essentially romantic story the writer would have them fall into each others arms and then - nothing. There's definitely a feel of something missing (however slight and brief).

4:The circularity, back to Sam0 admiring the view, is good - though entirely expected. But expectation is usually a good thing. In my view, it connects the reader to the writer. I don't think the adjective thing is a 'rule' either, but there is some truth in it. 'radiant', 'brilliant' stand out like a sort thumb.

5:[the predawn night] - if it's predawn then it's night - [the predawn twilight] might work

6:[her mother and S

One commonality among the successes is that they have long, descriptive reviews. Many of them also leak quotes from the passage into the review, which is bad (that is, in some cases the model is clearly cheating). There does seem to be a lot of word matching happening.
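One rough way to put a number on the word-matching impression is to measure how many of a review's distinct words also occur in its passage. The sketch below is only an illustration: `word_overlap` is a hypothetical helper that is not part of this notebook, the tokenization is a simple regex, and the toy pairs stand in for `success_pairs` and `fail_pairs` (with the passage at index 0 and the review at index 1, as in the cells here).

```python
import re

def word_overlap(passage, review):
    """Fraction of distinct review words that also appear in the passage."""
    def tokenize(text):
        # Lowercase and keep runs of letters, digits and apostrophes.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    review_words = tokenize(review)
    if not review_words:
        return 0.0
    return len(review_words & tokenize(passage)) / len(review_words)

# Toy (passage, review) pairs; these are made up for illustration.
toy_pairs = [
    ("She watched the predawn night fade over the hills.",
     "[the predawn night] - if it's predawn then it's night"),
    ("The train pulled out of the station.", "like it"),
]

overlaps = [word_overlap(passage, review) for passage, review in toy_pairs]
print(overlaps)
```

Running the same helper over `success_pairs` and `fail_pairs` and comparing the two distributions would show whether the successes really do share more vocabulary with their passages.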

In [23]:
for i, pair in enumerate(fail_pairs[:100]):
    print(str(i) + ": " + pair[1])
    print("")

0:The change of pace to something slower and more intimate is good. Things weren't all bad before - just focusing on what seemed to be problems.

1:[quote] - like it

2:Mmm Perhaps Sam2 knows Jane0 so well that Jane0 was prepared to be extremely informal. It would be pretty lax of Jane0 though

3:Sam0 writer seems more comfortable with this, more relaxed, action.

4:As a reader, I'm more comfortable with this level of descriptiveness - as opposed to what feels to me like 'trying too hard' in the opening and the conclusion. However, it is a matter of taste.

5:Has she previously told John0 that her name is now Sam0 If so, the reader didn't know.

6:Nitpick" [he diesel engine’s horn] - an engine doesn't have a horn

7:Over the past several paragraphs, the filing of head-hopping has gone. I think it's because with the two characters interacting directly, the POV has moved slightly away from both of the, into a more third-person overview.

8:Excellent the issue of the room and the key was 

Many of the failure cases seem to have short, unspecific comments (e.g. good ending, lol, WOAH!, punctuation, awkward phrasing) that could apply to any story.
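The length difference between the two groups can be checked with a quick word count. This is a minimal sketch under stated assumptions: `review_word_counts` is a hypothetical helper, whitespace splitting is a crude proxy for length, and the toy lists stand in for `success_pairs` and `fail_pairs`.

```python
from statistics import median

def review_word_counts(pairs):
    """Word count of the review (index 1) in each (passage, review) pair."""
    return [len(pair[1].split()) for pair in pairs]

# Toy stand-ins for success_pairs and fail_pairs; the passages are placeholders.
toy_success = [
    ("passage", "The circularity, back to the view, is good"),
    ("passage", "Slightly odd that nothing follows the embrace"),
]
toy_fail = [
    ("passage", "lol"),
    ("passage", "good ending"),
    ("passage", "awkward phrasing"),
]

success_median = median(review_word_counts(toy_success))
fail_median = median(review_word_counts(toy_fail))
print(success_median, fail_median)
```

If the observation holds on the real data, the median for `success_pairs` should be noticeably higher than for `fail_pairs`.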

In [32]:
print(success_pairs[185][0])

In a world where butterflies rule the skies during their annual Sam0 migration over Sam1 their massive, groovy kaleidoscope gets knocked off course in a storm, and flies smack into…
