# Inference in prior language models

*Computing $p(x)$*

In [1]:
from os import chdir, getcwd

if getcwd().endswith('notebooks'):
    chdir('..')

## One-hot data padding for $n$-grams

* If $L < n$:

If the sequence length $L$ is smaller than $n$, the order of the $n$-gram, the sequence is padded with closing boundaries `')'`.
For instance, the string `'(b)'` is processed by the 4-gram as `'(b))'`, so that the language model can make at least one inference.

In training, the probabilities of this kind of $n$-gram are computed by lower-order $(n-k)$-gram models.
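
The padding rule above can be sketched at the string level. The helpers `pad_to_order` and `ngrams` are hypothetical names introduced only for illustration; they are not part of the `lm` package and do not claim to mirror its implementation.

```python
def pad_to_order(word: str, n: int, closing: str = ')') -> str:
    """Append closing boundaries until the word has at least n characters."""
    return word + closing * max(0, n - len(word))


def ngrams(word: str, n: int) -> list[str]:
    """List the consecutive n-grams of a word after padding it to order n."""
    padded = pad_to_order(word, n)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


print(pad_to_order('(b)', 4))   # (b))
print(ngrams('(b)', 4))         # ['(b))']
print(ngrams('(abcb)', 4))      # ['(abc', 'abcb', 'bcb)']
```

A string that is already long enough is left unchanged, so only sequences with $L < n$ receive extra closing boundaries.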

* If $L < L_{\textrm{max}}$:

Running inference on a batch of strings forces the algorithm to process $n$-grams that contain empty padding characters. To neutralize their meaningless probabilities, the empty characters are represented by a "full one-hot" vector whose entries are all equal to `1`. The transition probabilities computed at these positions can then exceed 1, so they are afterwards capped at 1 (an element-wise minimum with 1).
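
To see why an all-ones vector is neutral once the result is capped, here is a minimal sketch on a bigram model. The table `P` and the functions `raw_score` and `transition_prob` are assumptions made for illustration and say nothing about how `NGramLM` stores or contracts its probabilities. With a one-hot target the contraction selects a single entry of `P`. With an all-ones target it sums a whole row, which gives exactly 1. With all-ones vectors on both sides it sums several rows, which exceeds 1 and is capped.

```python
# Hypothetical bigram table: P[i][j] = p(next = j | previous = i); each row sums to 1.
P = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]


def raw_score(prev_vec, next_vec):
    """Contract the table with the two (possibly all-ones) indicator vectors."""
    return sum(
        prev_vec[i] * P[i][j] * next_vec[j]
        for i in range(len(P))
        for j in range(len(P))
    )


def transition_prob(prev_vec, next_vec):
    """Cap the contracted score at 1 so that padded positions become neutral."""
    return min(1.0, raw_score(prev_vec, next_vec))


a, b = [1, 0, 0], [0, 1, 0]   # ordinary one-hot symbols
pad = [1, 1, 1]               # "full one-hot" padding symbol

print(transition_prob(a, b))      # 0.25
print(raw_score(a, pad))          # 1.0
print(raw_score(pad, pad))        # 3.0
print(transition_prob(pad, pad))  # 1.0
```

Because every padded transition contributes a factor of 1, the product of transition probabilities over a padded string equals the product over its real characters.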

In [3]:
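import torch
from torch.nn.functional import one_hot
from torch.nn.utils.rnn import pack_sequence

from lm.PriorLM import NGramLM

# Vocabulary: three symbols plus the opening and closing boundaries.
V = {'a': 0, 'b': 1, 'c': 2, '(': 3, ')': 4}
V_inv = ['a', 'b', 'c', '(', ')']
raw_batch = ['(ab)', '(abcb)', '(cba)', '(b)']

# One-hot encode each string as an (L_i, V) tensor, then pack the variable-length batch.
# num_classes is fixed so the encoding width never depends on which symbols occur.
batch = pack_sequence(
    [one_hot(torch.LongTensor([V[c] for c in w]), num_classes=len(V)) for w in raw_batch],
    enforce_sorted=False,
)

model = NGramLM([], 4)  # 4-gram, as in the example above
# exp() maps the returned values back to the 0/1 encoding printed below.
paddedData = torch.exp(model.padDataToNgram((batch, None, None)))
for w in range(len(raw_batch)):
    print(f'"{raw_batch[w]}"')
    print(paddedData[:, w, :])  # dim = (L, V)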



"(ab)"
tensor([[0., 0., 0., 1., 0.],
        [1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
"(abcb)"
tensor([[0., 0., 0., 1., 0.],
        [1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 1.]])
"(cba)"
tensor([[0., 0., 0., 1., 0.],
        [0., 0., 1., 0., 0.],
        [0., 1., 0., 0., 0.],
        [1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1.],
        [1., 1., 1., 1., 1.]])
"(b)"
tensor([[0., 0., 0., 1., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 1.],
        [0., 0., 0., 0., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
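
The layout printed above can be reproduced without the `lm` package. The function `pad_batch` below is a hypothetical stand-in written for illustration, not the actual implementation of `padDataToNgram`. It assumes the two rules described earlier: closing boundaries are appended up to order $n$, and the remaining positions up to the longest string are filled with all-ones rows.

```python
import torch
from torch.nn.functional import one_hot

V = {'a': 0, 'b': 1, 'c': 2, '(': 3, ')': 4}


def pad_batch(words, n, vocab=V, closing=')'):
    """Return an (L_max, B, V) tensor: one-hot rows, then all-ones padding rows."""
    # Rule 1: strings shorter than n receive extra closing boundaries.
    words = [w + closing * max(0, n - len(w)) for w in words]
    max_len = max(len(w) for w in words)
    # Rule 2: start from all ones, so untouched positions are "full one-hot".
    out = torch.ones(max_len, len(words), len(vocab))
    for b, w in enumerate(words):
        idx = torch.tensor([vocab[c] for c in w])
        out[:len(w), b, :] = one_hot(idx, num_classes=len(vocab)).float()
    return out


padded = pad_batch(['(ab)', '(abcb)', '(cba)', '(b)'], n=4)
print(padded.shape)      # torch.Size([6, 4, 5])
print(padded[:, 3, :])   # same rows as the "(b)" block above
```

Starting from a tensor of ones and overwriting only the real positions keeps the padding logic in one place.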
