# Estimate sentence probability with BERT
## Calculating the probability more carefully
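P_f and P_b are the sentence probabilities computed in the forward (left-to-right) and backward (right-to-left) directions, respectively:
P_f = P(w_0) * P(w_1|w_0) * P(w_2|w_0, w_1) * ... * P(w_N|w_0, w_1, ..., w_{N-1})
P_b = P(w_N) * P(w_{N-1}|w_N) * P(w_{N-2}|w_{N-1}, w_N) * ... * P(w_0|w_1, w_2, ..., w_N)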

P_f and P_b shrink as the sentence gets longer, so I normalize them by sentence length.
Because each one is a product of conditional probabilities, I use the geometric mean:
```
mean(P_f) = (P(w_0) * P(w_1|w_0) * ... * P(w_N|w_0, w_1, ..., w_{N-1})) ^ (1/N),
```
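where N is the sentence length in tokens. In the code below, sent_len plays this role; it also counts the [CLS] and [SEP] tokens, so the exponent is slightly smaller than a strict geometric mean over the scored tokens would use.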
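
To see why the normalization matters, here is a small sketch with made-up per-token probabilities (illustrative assumptions, not model outputs). The raw product shrinks quickly with sentence length, while the geometric mean stays on a comparable scale. The helper names `raw_product` and `geometric_mean` are hypothetical and not part of the notebook.

```python
import numpy as np

def raw_product(token_probs):
    """Unnormalized probability: the plain product of per-token probabilities."""
    return float(np.prod(token_probs))

def geometric_mean(token_probs):
    """Length-normalized probability: the n-th root of the product of n values."""
    token_probs = np.asarray(token_probs, dtype=float)
    return float(np.prod(token_probs) ** (1.0 / len(token_probs)))

# Hypothetical per-token probabilities, chosen only for illustration.
short_sentence = [0.1] * 5
long_sentence = [0.1] * 20

# The raw products differ by many orders of magnitude ...
print(raw_product(short_sentence), raw_product(long_sentence))
# ... while the geometric means are essentially the same.
print(geometric_mean(short_sentence), geometric_mean(long_sentence))
```

With equally likely tokens, both sentences end up with the same normalized score, which is the behavior the normalization is meant to provide.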

Finally, the sentence probability P(S) is the geometric mean of the normalized forward and backward probabilities:
```
P(S) = (mean(P_f(S)) * mean(P_b(S))) ^ (1/2)
```
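
As a quick arithmetic check of this formula, the sketch below combines the two directional means that the notebook prints for "The test was a success." in the output further down. The function name `combine_directions` is a hypothetical helper, not part of the notebook.

```python
import math

def combine_directions(mean_forward, mean_backward):
    """Geometric mean of the two length-normalized directional probabilities."""
    return math.sqrt(mean_forward * mean_backward)

# Values copied from the printed output for "The test was a success."
mean_f = 0.0633631213660846
mean_b = 0.10875349434011992

p_s = combine_directions(mean_f, mean_b)
print(p_s)  # close to the printed 0.08301181157437063
```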

In [1]:
import numpy as np
import torch
from transformers import BertTokenizer, BertForMaskedLM

BOS_TOKEN = '[CLS]'
EOS_TOKEN = '[SEP]'
MASK_TOKEN = '[MASK]'

# Load pre-trained model (weights) and switch to evaluation mode (disables dropout)
model = BertForMaskedLM.from_pretrained('bert-large-uncased')
model.eval()
# Load pre-trained model tokenizer (vocabulary)
tokenizer = BertTokenizer.from_pretrained('bert-large-uncased')

In [2]:
def print_top_predictions(probs, k=5):
    """Print the k most probable vocabulary tokens and their probabilities."""
    probs = probs.detach().numpy()
    top_indexes = np.argpartition(probs, -k)[-k:]
    sorted_indexes = top_indexes[np.argsort(-probs[top_indexes])]
    top_tokens = tokenizer.convert_ids_to_tokens(sorted_indexes)
    print(f"Ordered top predicted tokens: {top_tokens}")
    print(f"Ordered top predicted values: {probs[sorted_indexes]}")
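
The top-k selection used above can be tried on a toy array without loading the model. This is a minimal sketch; `top_k_indices` and `toy_probs` are hypothetical names introduced only for illustration.

```python
import numpy as np

def top_k_indices(values, k):
    """Indices of the k largest entries of `values`, ordered from largest to smallest."""
    values = np.asarray(values)
    # argpartition moves the k largest entries into the last k slots, in arbitrary order
    candidates = np.argpartition(values, -k)[-k:]
    # sort only those k candidates by descending value
    return candidates[np.argsort(-values[candidates])]

toy_probs = np.array([0.05, 0.40, 0.10, 0.30, 0.15])  # hypothetical distribution
print(top_k_indices(toy_probs, k=3))  # [1 3 4]
```

Partitioning first and sorting only the k candidates avoids a full sort over the vocabulary, which matters when the array has tens of thousands of entries.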

In [3]:
def get_sentence_prob(sentence, verbose=False):
    """Return the length-normalized bidirectional probability of a sentence under the masked LM."""
    # Pre-process sentence, adding special tokens
    tokenized_input = tokenizer.tokenize(sentence)
    if tokenized_input[0] != BOS_TOKEN:
        tokenized_input.insert(0, BOS_TOKEN)
    if tokenized_input[-1] != EOS_TOKEN:
        tokenized_input.append(EOS_TOKEN)
    sent_len = len(tokenized_input)
    ids_input = tokenizer.convert_tokens_to_ids(tokenized_input)
    print(f"Processing sentence: {tokenized_input}\n")
    
    sm = torch.nn.Softmax(dim=0) # converts the vocabulary logits at one position to probabilities
    
    sent_prob_forward = 1
    sent_prob_backwards = 1
    # Mask non-special tokens in forward and backwards directions; calculate their probabilities
    for i in range(1, len(tokenized_input) - 1): # Don't loop first and last tokens
        probs_forward = get_directional_prob(sm, tokenized_input, i, 'forward', verbose=verbose)
        probs_backwards = get_directional_prob(sm, tokenized_input, i, 'backwards', verbose=verbose)
        prob_forward = probs_forward[ids_input[i]] # Prediction for masked word 
        prob_backwards = probs_backwards[ids_input[i]] # Prediction for masked word 
        # Accumulate the length-normalized product; note that sent_len includes [CLS] and [SEP]
        sent_prob_forward *= np.power(prob_forward.detach().numpy(), 1/sent_len)
        sent_prob_backwards *= np.power(prob_backwards.detach().numpy(), 1/sent_len)

        print(f"Word: {tokenized_input[i]} \t Prob_forward: {prob_forward}; Prob_backwards: {prob_backwards}")

    print(f"Geometric-mean forward sentence probability: {sent_prob_forward}")
    print(f"Geometric-mean backward sentence probability: {sent_prob_backwards}\n")
    
    # Obtain geometric average of forward and backward probs
    geom_mean_sent_prob = np.sqrt(sent_prob_forward * sent_prob_backwards)
    print(f"Average normalized sentence prob: {geom_mean_sent_prob}\n")
    return geom_mean_sent_prob
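
The same normalized product can also be accumulated in log space, which avoids multiplying many small numbers directly. The sketch below is an alternative formulation, not the notebook's own code; `normalized_prob_from_logs` is a hypothetical helper. It uses the forward probabilities printed for "The test was a success." and the same exponent as `get_sentence_prob` (1/sent_len, with sent_len = 8 because [CLS] and [SEP] are counted).

```python
import math

def normalized_prob_from_logs(token_probs, sent_len):
    """exp(sum(log p_i) / sent_len), i.e. prod(p_i ** (1 / sent_len)) computed in log space."""
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(log_sum / sent_len)

# Forward probabilities copied from the printed output for "The test was a success."
forward_probs = [
    0.006482909899204969,
    9.367070742882788e-05,
    0.3028062582015991,
    0.002898415084928274,
    0.5157564878463745,
    0.9452541470527649,
]

# Six scored tokens plus [CLS] and [SEP] give sent_len = 8, as in get_sentence_prob.
forward_score = normalized_prob_from_logs(forward_probs, sent_len=8)
print(forward_score)
```

Up to floating-point rounding, the result agrees with the printed "Geometric-mean forward sentence probability" of 0.0633631213660846.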

In [4]:
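def get_directional_prob(sm, tokenized_input, i, direction, verbose=False):
    """Return the model's vocabulary distribution at position i, given only one side of the context."""
    current_tokens = tokenized_input[:]
    if direction == 'backwards':
        # Mask positions 1..i so that only the right context remains visible
        current_tokens[1:i+1] = [MASK_TOKEN for j in range(i)]
    elif direction == 'forward':
        # Mask positions i..end (excluding [SEP]) so that only the left context remains visible
        current_tokens[i:-1] = [MASK_TOKEN for j in range(len(tokenized_input) - 1 - i)]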
    else:
        raise ValueError("Direction can only be 'forward' or 'backwards'")
    if verbose: 
        print()
        print(current_tokens)
        
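    masked_input = torch.tensor([tokenizer.convert_tokens_to_ids(current_tokens)])
    with torch.no_grad(): # inference only, so no gradients are needed
        predictions = model(masked_input)
    predictions = predictions[0] # logits with shape (batch, sequence length, vocabulary size)
    probs = sm(predictions[0, i]) # Softmax to get probabilities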
    if verbose: 
        print_top_predictions(probs)
    
    return probs # Model predictions
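
The masking pattern can be inspected without running the model. The sketch below restates only the masking step for illustration; `mask_for_direction` is a hypothetical helper, and the token list is the one printed for "The test was a success."

```python
MASK = '[MASK]'

def mask_for_direction(tokens, i, direction):
    """Return a copy of `tokens` masked for scoring position i in the given direction."""
    masked = list(tokens)
    if direction == 'forward':
        # hide position i and everything to its right, keeping the final [SEP]
        masked[i:-1] = [MASK] * (len(tokens) - 1 - i)
    elif direction == 'backwards':
        # hide position i and everything to its left, keeping the leading [CLS]
        masked[1:i + 1] = [MASK] * i
    else:
        raise ValueError("direction must be 'forward' or 'backwards'")
    return masked

tokens = ['[CLS]', 'the', 'test', 'was', 'a', 'success', '.', '[SEP]']
print(mask_for_direction(tokens, 3, 'forward'))
print(mask_for_direction(tokens, 3, 'backwards'))
```

For position 3 ("was"), the forward pass sees only "[CLS] the test" plus the closing [SEP], and the backward pass sees only "a success . [SEP]" plus the opening [CLS].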

In [14]:
get_sentence_prob("The test was a success.")
get_sentence_prob("The party was a success.")
get_sentence_prob("The plan was a success.")
get_sentence_prob("The was test a success.")
get_sentence_prob("The test was success a.")

Processing sentence: ['[CLS]', 'the', 'test', 'was', 'a', 'success', '.', '[SEP]']

Word: the 	 Prob_forward: 0.006482909899204969; Prob_backwards: 0.9590326547622681
Word: test 	 Prob_forward: 9.367070742882788e-05; Prob_backwards: 0.003962513990700245
Word: was 	 Prob_forward: 0.3028062582015991; Prob_backwards: 0.9035378694534302
Word: a 	 Prob_forward: 0.002898415084928274; Prob_backwards: 0.3296686112880707
Word: success 	 Prob_forward: 0.5157564878463745; Prob_backwards: 2.4018852855078876e-05
Word: . 	 Prob_forward: 0.9452541470527649; Prob_backwards: 0.7197229266166687
Geometric-mean forward sentence probability: 0.0633631213660846
Geometric-mean backward sentence probability: 0.10875349434011992

Average normalized sentence prob: 0.08301181157437063

Processing sentence: ['[CLS]', 'the', 'party', 'was', 'a', 'success', '.', '[SEP]']

Word: the 	 Prob_forward: 0.006482909899204969; Prob_backwards: 0.9431305527687073
Word: party 	 Prob_forward: 0.001439713523723185; Prob_backwar

0.010037710023262371

In [5]:
get_sentence_prob("He answered unequivocally.")
get_sentence_prob("He answered quickly.", verbose=True)

Processing sentence: ['[CLS]', 'he', 'answered', 'une', '##qui', '##vo', '##cal', '##ly', '.', '[SEP]']

Word: he 	 Prob_forward: 0.0005044482531957328; Prob_backwards: 0.2814375162124634
Word: answered 	 Prob_forward: 0.0002913153439294547; Prob_backwards: 0.010978045873343945
Word: une 	 Prob_forward: 2.243901064957754e-07; Prob_backwards: 0.9987058639526367
Word: ##qui 	 Prob_forward: 0.0005762826185673475; Prob_backwards: 0.0008089180919341743
Word: ##vo 	 Prob_forward: 0.05035709589719772; Prob_backwards: 3.2233551792160142e-06
Word: ##cal 	 Prob_forward: 0.9999289512634277; Prob_backwards: 2.658714583958499e-05
Word: ##ly 	 Prob_forward: 0.9982821941375732; Prob_backwards: 0.0010952855227515101
Word: . 	 Prob_forward: 0.9998167157173157; Prob_backwards: 0.7406781911849976
Geometric-mean forward sentence probability: 0.015776195322160777
Geometric-mean backward sentence probability: 0.013302663224079483

Average normalized sentence prob: 0.014486732320575367

Processing sentence: 

0.05712756923398538

In [6]:
get_sentence_prob("The guy with small hands demanded a quid pro quo.")
get_sentence_prob("The guy with small hands demanded an exchange.")

Processing sentence: ['[CLS]', 'the', 'guy', 'with', 'small', 'hands', 'demanded', 'a', 'qui', '##d', 'pro', 'quo', '.', '[SEP]']

Word: the 	 Prob_forward: 0.020117253065109253; Prob_backwards: 0.6742717027664185
Word: guy 	 Prob_forward: 4.888691910309717e-05; Prob_backwards: 0.0006307517760433257
Word: with 	 Prob_forward: 0.0006999199977144599; Prob_backwards: 0.06413324922323227
Word: small 	 Prob_forward: 4.887327304459177e-05; Prob_backwards: 0.008030521683394909
Word: hands 	 Prob_forward: 0.0014287744415923953; Prob_backwards: 7.560867379652336e-05
Word: demanded 	 Prob_forward: 2.932926236098865e-06; Prob_backwards: 1.76298769360983e-07
Word: a 	 Prob_forward: 0.000463048490928486; Prob_backwards: 0.06345833837985992
Word: qui 	 Prob_forward: 3.862149696942652e-06; Prob_backwards: 0.5241817235946655
Word: ##d 	 Prob_forward: 0.23301775753498077; Prob_backwards: 7.432691973008332e-07
Word: pro 	 Prob_forward: 3.778058089665137e-05; Prob_backwards: 0.004886653274297714
Word: qu

0.008648924871896689

In [39]:
get_sentence_prob("This is a sentence.")
get_sentence_prob("This is a macrame.", verbose=False)
get_sentence_prob("This is a joke.", verbose=False)
get_sentence_prob("Are you kidding me?", verbose=False)


Processing sentence: ['[CLS]', 'this', 'is', 'a', 'sentence', '.', '[SEP]']

Word: this 	 Prob_forward: 0.003717252053320408; Prob_backwards: 0.16919253766536713
Word: is 	 Prob_forward: 0.2988516092300415; Prob_backwards: 0.0976012647151947
Word: a 	 Prob_forward: 0.0826600193977356; Prob_backwards: 0.2026401311159134
Word: sentence 	 Prob_forward: 9.11332608666271e-05; Prob_backwards: 1.6831930338412349e-07
Word: . 	 Prob_forward: 0.9075567722320557; Prob_backwards: 0.9580832123756409
Geometric-mean forward sentence probability: 0.06919501892920826
Geometric-mean backward sentence probability: 0.04742574703773452

Average normalized sentence prob: 0.05728547341174622

Processing sentence: ['[CLS]', 'this', 'is', 'a', 'mac', '##ram', '##e', '.', '[SEP]']

Word: this 	 Prob_forward: 0.00048383252578787506; Prob_backwards: 0.06995750218629837
Word: is 	 Prob_forward: 0.15630875527858734; Prob_backwards: 0.15581747889518738
Word: a 	 Prob_forward: 0.19458439946174622; Prob_backwards: 0.0

0.11492692635310144

In [40]:
get_sentence_prob("Rachel was wearing a lovely satin dress last night.")

Processing sentence: ['[CLS]', 'rachel', 'was', 'wearing', 'a', 'lovely', 'satin', 'dress', 'last', 'night', '.', '[SEP]']

Word: rachel 	 Prob_forward: 0.00026082462863996625; Prob_backwards: 0.0007910796557553113
Word: was 	 Prob_forward: 0.14712725579738617; Prob_backwards: 0.9898462891578674
Word: wearing 	 Prob_forward: 0.0004857216263189912; Prob_backwards: 0.04084348678588867
Word: a 	 Prob_forward: 0.8745661973953247; Prob_backwards: 0.47447705268859863
Word: lovely 	 Prob_forward: 0.0003499957674648613; Prob_backwards: 1.9559100110200234e-05
Word: satin 	 Prob_forward: 0.000256720173638314; Prob_backwards: 2.4062626380327856e-06
Word: dress 	 Prob_forward: 0.018947714939713478; Prob_backwards: 1.652981518418528e-05
Word: last 	 Prob_forward: 0.00010001149348681793; Prob_backwards: 0.051961563527584076
Word: night 	 Prob_forward: 0.9452313780784607; Prob_backwards: 7.455756076524267e-06
Word: . 	 Prob_forward: 0.9814563989639282; Prob_backwards: 0.9604282379150391
Geometric-mea

0.011063568234790695

In [41]:
get_sentence_prob("Rachel was wearing a lovely satin dress last night.")
get_sentence_prob("Grandma was wearing a lovely satin dress last night.")
get_sentence_prob("Mother was wearing a lovely satin dress last night.")
get_sentence_prob("She was wearing a lovely satin dress last night.")
get_sentence_prob("He was wearing a lovely satin dress last night.")
get_sentence_prob("I was wearing a lovely satin dress last night.")
get_sentence_prob("Angela was wearing a lovely satin dress last night.")
get_sentence_prob("Roberta was wearing a lovely satin dress last night.")
get_sentence_prob("Running was wearing a lovely satin dress last night.")

Processing sentence: ['[CLS]', 'rachel', 'was', 'wearing', 'a', 'lovely', 'satin', 'dress', 'last', 'night', '.', '[SEP]']

Word: rachel 	 Prob_forward: 0.00026082462863996625; Prob_backwards: 0.0007910796557553113
Word: was 	 Prob_forward: 0.14712725579738617; Prob_backwards: 0.9898462891578674
Word: wearing 	 Prob_forward: 0.0004857216263189912; Prob_backwards: 0.04084348678588867
Word: a 	 Prob_forward: 0.8745661973953247; Prob_backwards: 0.47447705268859863
Word: lovely 	 Prob_forward: 0.0003499957674648613; Prob_backwards: 1.9559100110200234e-05
Word: satin 	 Prob_forward: 0.000256720173638314; Prob_backwards: 2.4062626380327856e-06
Word: dress 	 Prob_forward: 0.018947714939713478; Prob_backwards: 1.652981518418528e-05
Word: last 	 Prob_forward: 0.00010001149348681793; Prob_backwards: 0.051961563527584076
Word: night 	 Prob_forward: 0.9452313780784607; Prob_backwards: 7.455756076524267e-06
Word: . 	 Prob_forward: 0.9814563989639282; Prob_backwards: 0.9604282379150391
Geometric-mea

Word: was 	 Prob_forward: 0.06433098763227463; Prob_backwards: 0.9898462891578674
Word: wearing 	 Prob_forward: 0.0012692866148427129; Prob_backwards: 0.04084348678588867
Word: a 	 Prob_forward: 0.8440825343132019; Prob_backwards: 0.47447705268859863
Word: lovely 	 Prob_forward: 0.00037023151526227593; Prob_backwards: 1.9559100110200234e-05
Word: satin 	 Prob_forward: 0.0001819009630708024; Prob_backwards: 2.4062626380327856e-06
Word: dress 	 Prob_forward: 0.01633988507091999; Prob_backwards: 1.652981518418528e-05
Word: last 	 Prob_forward: 0.00011782765068346635; Prob_backwards: 0.051961563527584076
Word: night 	 Prob_forward: 0.9161355495452881; Prob_backwards: 7.455756076524267e-06
Word: . 	 Prob_forward: 0.9734354019165039; Prob_backwards: 0.9604282379150391
Geometric-mean forward sentence probability: 0.01519521011944542
Geometric-mean backward sentence probability: 0.004266871782342074

Average normalized sentence prob: 0.008052081301466124

Processing sentence: ['[CLS]', 'runnin

0.005019735338399365

In [42]:
get_sentence_prob("The man ate the steak.")
get_sentence_prob("The man who arrived late ate the steak with a glass of wine.")
get_sentence_prob("The steak was eaten by the man.")
get_sentence_prob("The stake ate the man.")

Processing sentence: ['[CLS]', 'the', 'man', 'ate', 'the', 'steak', '.', '[SEP]']

Word: the 	 Prob_forward: 0.005224968772381544; Prob_backwards: 0.9681575894355774
Word: man 	 Prob_forward: 0.0058873724192380905; Prob_backwards: 0.0023271343670785427
Word: ate 	 Prob_forward: 0.0001978118234546855; Prob_backwards: 0.030350716784596443
Word: the 	 Prob_forward: 0.05299869179725647; Prob_backwards: 0.4090028405189514
Word: steak 	 Prob_forward: 0.004556961823254824; Prob_backwards: 1.2853566659032367e-05
Word: . 	 Prob_forward: 0.9913156032562256; Prob_backwards: 0.9829498529434204
Geometric-mean forward sentence probability: 0.033145627284672706
Geometric-mean backward sentence probability: 0.06584567069157952

Average normalized sentence prob: 0.04671719232843935

Processing sentence: ['[CLS]', 'the', 'man', 'who', 'arrived', 'late', 'ate', 'the', 'steak', 'with', 'a', 'glass', 'of', 'wine', '.', '[SEP]']

Word: the 	 Prob_forward: 0.021069437265396118; Prob_backwards: 0.940187633037

0.02113022204556181

In [43]:
get_sentence_prob("He was born in Berlin.")
get_sentence_prob("He was born in Santiago.")
get_sentence_prob("He was born in France.")
get_sentence_prob("He was born in window.")
get_sentence_prob("He was born in was.")


Processing sentence: ['[CLS]', 'he', 'was', 'born', 'in', 'berlin', '.', '[SEP]']

Word: he 	 Prob_forward: 0.0021003505680710077; Prob_backwards: 0.8305719494819641
Word: was 	 Prob_forward: 0.21391922235488892; Prob_backwards: 0.9997050166130066
Word: born 	 Prob_forward: 0.00040079050813801587; Prob_backwards: 0.01874382235109806
Word: in 	 Prob_forward: 0.980291485786438; Prob_backwards: 0.018296195194125175
Word: berlin 	 Prob_forward: 0.02419455163180828; Prob_backwards: 9.928996587404981e-05
Word: . 	 Prob_forward: 0.9961835741996765; Prob_backwards: 0.9829498529434204
Geometric-mean forward sentence probability: 0.08986879181068508
Geometric-mean backward sentence probability: 0.11362871557646786

Average normalized sentence prob: 0.10105283461564618

Processing sentence: ['[CLS]', 'he', 'was', 'born', 'in', 'santiago', '.', '[SEP]']

Word: he 	 Prob_forward: 0.0021003505680710077; Prob_backwards: 0.7335055470466614
Word: was 	 Prob_forward: 0.21391922235488892; Prob_backwards:

0.028633058233584703

In [44]:
get_sentence_prob("I fed my cat some of it and he damn near passed out.")
get_sentence_prob("I fed my dog some of it and he damn near passed out.")
get_sentence_prob("I fed my window some of it and he damn near passed out.")
get_sentence_prob("I fed my the some of it and he damn near passed out.")

Processing sentence: ['[CLS]', 'i', 'fed', 'my', 'cat', 'some', 'of', 'it', 'and', 'he', 'damn', 'near', 'passed', 'out', '.', '[SEP]']

Word: i 	 Prob_forward: 0.00788011122494936; Prob_backwards: 0.9571204781532288
Word: fed 	 Prob_forward: 0.0002093437797157094; Prob_backwards: 0.1745494306087494
Word: my 	 Prob_forward: 0.001939286245033145; Prob_backwards: 0.02095809206366539
Word: cat 	 Prob_forward: 0.0075833857990801334; Prob_backwards: 2.67108316620579e-06
Word: some 	 Prob_forward: 0.02979872189462185; Prob_backwards: 0.0016446709632873535
Word: of 	 Prob_forward: 0.002569902455434203; Prob_backwards: 0.014875434339046478
Word: it 	 Prob_forward: 0.0014934015925973654; Prob_backwards: 4.503015225054696e-05
Word: and 	 Prob_forward: 0.43204018473625183; Prob_backwards: 0.4202370047569275
Word: he 	 Prob_forward: 0.037493497133255005; Prob_backwards: 0.09001420438289642
Word: damn 	 Prob_forward: 3.998847751063295e-05; Prob_backwards: 0.8230639696121216
Word: near 	 Prob_forwar

0.013058838399427147

In [45]:
print("Should have similar/high probs\n")
get_sentence_prob("I forgot to take my medicine.")
get_sentence_prob("I forgot to take my medicines.")
get_sentence_prob("I forgot to take my medication.")
get_sentence_prob("I forgot to take my pills.")
print("Should have low probs\n")
get_sentence_prob("I forgot to take my turn.")
get_sentence_prob("I forgot to take my medical.")
get_sentence_prob("I forgot to take my medically.")
get_sentence_prob("I forgot to take my turned.")

Should have similar/high probs

Processing sentence: ['[CLS]', 'i', 'forgot', 'to', 'take', 'my', 'medicine', '.', '[SEP]']

Word: i 	 Prob_forward: 0.009378484450280666; Prob_backwards: 0.9732086658477783
Word: forgot 	 Prob_forward: 0.00021250064310152084; Prob_backwards: 0.015700489282608032
Word: to 	 Prob_forward: 0.03624541684985161; Prob_backwards: 0.055403366684913635
Word: take 	 Prob_forward: 0.017103174701333046; Prob_backwards: 0.02183195948600769
Word: my 	 Prob_forward: 0.00037714111385867; Prob_backwards: 2.8737551474478096e-05
Word: medicine 	 Prob_forward: 0.01692848652601242; Prob_backwards: 1.0145240594283678e-06
Word: . 	 Prob_forward: 0.9715754985809326; Prob_backwards: 0.9818249344825745
Geometric-mean forward sentence probability: 0.027014838021123058
Geometric-mean backward sentence probability: 0.02007291597720789

Average normalized sentence prob: 0.023286617911063946

Processing sentence: ['[CLS]', 'i', 'forgot', 'to', 'take', 'my', 'medicines', '.', '[SEP]']

0.014468761225455259

In [54]:
get_sentence_prob("We will explore the elements used to construct sentences, and what parts of speech are used to expand and elaborate on them.")
get_sentence_prob("Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project by a community of volunteer editors.")
get_sentence_prob("Once she gave her a little cap of red velvet, which suited her so well that she would never wear anything else.")

Processing sentence: ['[CLS]', 'we', 'will', 'explore', 'the', 'elements', 'used', 'to', 'construct', 'sentences', ',', 'and', 'what', 'parts', 'of', 'speech', 'are', 'used', 'to', 'expand', 'and', 'elaborate', 'on', 'them', '.', '[SEP]']

Word: we 	 Prob_forward: 0.0005438847583718598; Prob_backwards: 0.022272974252700806
Word: will 	 Prob_forward: 0.008487698622047901; Prob_backwards: 0.024045974016189575
Word: explore 	 Prob_forward: 9.5668867288623e-05; Prob_backwards: 0.0005457077058963478
Word: the 	 Prob_forward: 0.1310628205537796; Prob_backwards: 0.08164729177951813
Word: elements 	 Prob_forward: 1.3173785191611387e-05; Prob_backwards: 3.286916762590408e-05
Word: used 	 Prob_forward: 1.640969276195392e-05; Prob_backwards: 0.9119576811790466
Word: to 	 Prob_forward: 0.7283820509910583; Prob_backwards: 0.7407114505767822
Word: construct 	 Prob_forward: 0.004861429799348116; Prob_backwards: 0.0010620743269100785
Word: sentences 	 Prob_forward: 4.218660615151748e-05; Prob_backward

0.026338711941297832

In [15]:
# Load pre-trained model (weights) and switch to evaluation mode (disables dropout)
model = BertForMaskedLM.from_pretrained('bert-base-uncased')
model.eval()
# Load pre-trained model tokenizer (vocabulary)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

In [16]:
#get_sentence_prob("I fed my cat some of it and he damn near passed out")
get_sentence_prob("He was born in Berlin.")
get_sentence_prob("He was born in Santiago.")
get_sentence_prob("He was born in France.")
get_sentence_prob("He was born in window.")
get_sentence_prob("He was born in was.")

Processing sentence: ['[CLS]', 'he', 'was', 'born', 'in', 'berlin', '.', '[SEP]']
Word: he 	 Prob_forward: 0.0021003505680710077; Prob_backwards: 0.8305719494819641

Word: was 	 Prob_forward: 0.21391922235488892; Prob_backwards: 0.9997050166130066

Word: born 	 Prob_forward: 0.00040079050813801587; Prob_backwards: 0.01874382235109806

Word: in 	 Prob_forward: 0.980291485786438; Prob_backwards: 0.018296195194125175

Word: berlin 	 Prob_forward: 0.02419455163180828; Prob_backwards: 9.928996587404981e-05

Word: . 	 Prob_forward: 0.9961835741996765; Prob_backwards: 0.9829498529434204


Geometric-mean forward sentence probability: 0.08986879181068508


Geometric-mean backward sentence probability: 0.11362871557646786


Average normalized sentence prob: 0.10105283461564618

Processing sentence: ['[CLS]', 'he', 'was', 'born', 'in', 'santiago', '.', '[SEP]']
Word: he 	 Prob_forward: 0.0021003505680710077; Prob_backwards: 0.7335055470466614

Word: was 	 Prob_forward: 0.21391922235488892; Prob_b

0.028633058233584703

In [17]:
get_sentence_prob("I fed my cat some of it and he damn near passed out.")
get_sentence_prob("I fed my dog some of it and he damn near passed out.")
get_sentence_prob("I fed my window some of it and he damn near passed out.")
get_sentence_prob("I fed my the some of it and he damn near passed out.")

Processing sentence: ['[CLS]', 'i', 'fed', 'my', 'cat', 'some', 'of', 'it', 'and', 'he', 'damn', 'near', 'passed', 'out', '.', '[SEP]']
Word: i 	 Prob_forward: 0.00788011122494936; Prob_backwards: 0.9571204781532288

Word: fed 	 Prob_forward: 0.0002093437797157094; Prob_backwards: 0.1745494306087494

Word: my 	 Prob_forward: 0.001939286245033145; Prob_backwards: 0.02095809206366539

Word: cat 	 Prob_forward: 0.0075833857990801334; Prob_backwards: 2.67108316620579e-06

Word: some 	 Prob_forward: 0.02979872189462185; Prob_backwards: 0.0016446709632873535

Word: of 	 Prob_forward: 0.002569902455434203; Prob_backwards: 0.014875434339046478

Word: it 	 Prob_forward: 0.0014934015925973654; Prob_backwards: 4.503015225054696e-05

Word: and 	 Prob_forward: 0.43204018473625183; Prob_backwards: 0.4202370047569275

Word: he 	 Prob_forward: 0.037493497133255005; Prob_backwards: 0.09001420438289642

Word: damn 	 Prob_forward: 3.998847751063295e-05; Prob_backwards: 0.8230639696121216

Word: near 	 Pr

0.013058838399427147

In [18]:
print("Should have similar/high probs\n")
get_sentence_prob("I forgot to take my medicine.")
get_sentence_prob("I forgot to take my medicines.")
get_sentence_prob("I forgot to take my medication.")
get_sentence_prob("I forgot to take my pills.")
print("Should have low probs\n")
get_sentence_prob("I forgot to take my turn.")
get_sentence_prob("I forgot to take my medical.")
get_sentence_prob("I forgot to take my medically.")
get_sentence_prob("I forgot to take my turned.")

Should have similar/high probs

Processing sentence: ['[CLS]', 'i', 'forgot', 'to', 'take', 'my', 'medicine', '.', '[SEP]']
Word: i 	 Prob_forward: 0.009378484450280666; Prob_backwards: 0.9732086658477783

Word: forgot 	 Prob_forward: 0.00021250064310152084; Prob_backwards: 0.015700489282608032

Word: to 	 Prob_forward: 0.03624541684985161; Prob_backwards: 0.055403366684913635

Word: take 	 Prob_forward: 0.017103174701333046; Prob_backwards: 0.02183195948600769

Word: my 	 Prob_forward: 0.00037714111385867; Prob_backwards: 2.8737551474478096e-05

Word: medicine 	 Prob_forward: 0.01692848652601242; Prob_backwards: 1.0145240594283678e-06

Word: . 	 Prob_forward: 0.9715754985809326; Prob_backwards: 0.9818249344825745


Geometric-mean forward sentence probability: 0.027014838021123058


Geometric-mean backward sentence probability: 0.02007291597720789


Average normalized sentence prob: 0.023286617911063946

Processing sentence: ['[CLS]', 'i', 'forgot', 'to', 'take', 'my', 'medicines', '.'

0.014468761225455259