<p style="align: center;"><img align=center src="https://s8.hostingkartinok.com/uploads/images/2018/08/308b49fcfbc619d629fe4604bceb67ac.jpg" style="height:450px;" width=500/></p>

<h3 style="text-align: center;"><b>Deep Learning School, Phystech School of Applied Mathematics and Informatics, MIPT</b></h3>
<h3 style="text-align: center;"><b>Advanced track (part 2). Spring 2021</b></h3>

<h1 style="text-align: center;"><b>Language modeling.</b></h1>

To begin, let's load a dataset of Python code samples. The dataset is provided by GitHub. [About the dataset](https://github.blog/2019-09-26-introducing-the-codesearchnet-challenge/).

For preprocessing we will use the `datasets` library from Hugging Face, which we have already seen.

In [None]:
!pip install -q datasets


In [None]:
!wget https://s3.amazonaws.com/code-search-net/CodeSearchNet/v2/python.zip
!unzip -p python.zip python/final/jsonl/train/python_train_0.jsonl.gz > train.jsonl.gz
!unzip -p python.zip python/final/jsonl/test/python_test_0.jsonl.gz > test.jsonl.gz

--2021-03-06 10:28:54--  https://s3.amazonaws.com/code-search-net/CodeSearchNet/v2/python.zip
Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.217.70.198
Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.217.70.198|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 940909997 (897M) [application/zip]
Saving to: ‘python.zip’


2021-03-06 10:29:04 (95.0 MB/s) - ‘python.zip’ saved [940909997/940909997]



In [None]:
# decompress this gzip file
!gzip -d train.jsonl.gz
!gzip -d test.jsonl.gz

Datasets can be loaded not only from the hub but also from disk. To do that, it is enough to specify the format and the path to the file.

In [None]:
from datasets import load_dataset  
dataset = load_dataset(
    "json",
    data_files=[
        "train.jsonl",
    ],
)

Using custom data configuration default-c29742d500581c14


Downloading and preparing dataset json/default (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to /root/.cache/huggingface/datasets/json/default-c29742d500581c14/0.0.0/dc7ee63ec8b554c48ecc5a8a6fbe27af8071408c244e4347cf9222d6206d83a2...



Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/default-c29742d500581c14/0.0.0/dc7ee63ec8b554c48ecc5a8a6fbe27af8071408c244e4347cf9222d6206d83a2. Subsequent calls will reuse this data.


In [None]:
dataset

DatasetDict({
    train: Dataset({
        features: ['repo', 'path', 'func_name', 'original_string', 'language', 'code', 'code_tokens', 'docstring', 'docstring_tokens', 'sha', 'url', 'partition'],
        num_rows: 30000
    })
})

Limit the vocabulary to the `40000` most frequent tokens.

In [None]:
from collections import Counter


vocab_size = 40000
stats = Counter()

for item in dataset["train"]:
    stats.update(item["code_tokens"])
tokens = dict(stats.most_common(vocab_size)).keys()

Let's look at the most frequent tokens.

In [None]:
stats.most_common(20)

[('(', 299562),
 (')', 299456),
 ('.', 283659),
 (',', 267888),
 ('=', 238494),
 (':', 176959),
 ('[', 99465),
 (']', 99448),
 ('self', 71449),
 ('if', 65763),
 ('return', 40852),
 ('None', 37311),
 ('def', 32278),
 ('in', 29951),
 ('*', 21773),
 ('not', 21291),
 ('name', 21205),
 ('ret', 20862),
 ('1', 19494),
 ('for', 19364)]

Add the special tokens.

In [None]:
PAD = 0
UNK = 1
EOS = 2

token2idx = {"[PAD]": 0, "[UNK]": 1, "[EOS]": 2}

for idx, token in enumerate(tokens):
    token2idx[token] = idx + 3

Convert tokens to indices.

In [None]:
def encode(token):
    # Out-of-vocabulary tokens are mapped to UNK.
    return token2idx.get(token, UNK)

In [None]:
dataset = dataset.map(
    lambda item: {
        "features": [encode(token) for token in item["code_tokens"]] + [EOS]
    }
)





## N-gram

Let's start with the simplest model, which is purely statistical. In language modeling we want to maximize the probability that the model assigns to our text, that is:
 $$
\mathrm{P}(\mathrm{W})=\mathrm{P}\left(\mathrm{w}_{1}, \mathrm{w}_{2}, \mathrm{w}_{3}, \mathrm{w}_{4}, \mathrm{w}_{5} \ldots \mathrm{w}_{\mathrm{n}}\right)
$$


Recall that a joint probability can be rewritten with the chain rule:

$$
P\left(x_{1}, x_{2}, x_{3}, \ldots, x_{n}\right)=P\left(x_{1}\right) P\left(x_{2} \mid x_{1}\right) P\left(x_{3} \mid x_{1}, x_{2}\right) \ldots P\left(x_{n} \mid x_{1}, \ldots, x_{n-1}\right)
$$

Then:

$$
P\left(w_{1} w_{2} \ldots w_{n}\right)=\prod_{i} P\left(w_{i} \mid w_{1} w_{2} \ldots w_{i-1}\right)
$$

However, the number of conditional probabilities of the form $P\left(w_{i} \mid w_{1} w_{2} \ldots w_{i-1}\right)$ grows exponentially with the length of the context. We therefore make a simplifying assumption known as the **Markov assumption**. It is stated as follows:

$$
P\left(w_{1} w_{2} \ldots w_{n}\right) \approx \prod_{i} P\left(w_{i} \mid w_{i-k} \ldots w_{i-1}\right)
$$

That is, we assume that the current word depends only on the $k$ previous words.

$$
P\left(w_{i} \mid w_{1} w_{2} \ldots w_{i-1}\right) \approx P\left(w_{i} \mid w_{i-k} \ldots w_{i-1}\right)
$$
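
To make the Markov assumption concrete, here is a minimal sketch with $k = 1$ (a bigram model) estimated by counting. The three-line corpus and the helper name `bigram_prob` are made up for illustration and are not part of the dataset used in this notebook.

```python
from collections import Counter, defaultdict

# Toy corpus (hypothetical): each sentence is already tokenized.
corpus = [
    ["def", "f", "(", ")", ":"],
    ["def", "g", "(", "x", ")", ":"],
    ["def", "f", "(", "x", ")", ":"],
]

# counts[prev][cur] = how many times `cur` follows `prev`.
counts = defaultdict(Counter)
for sentence in corpus:
    for prev, cur in zip(sentence, sentence[1:]):
        counts[prev][cur] += 1


def bigram_prob(prev, cur):
    """Maximum-likelihood estimate of P(cur | prev)."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


# P(f | def) = 2/3: "def" is followed by "f" twice and by "g" once.
p_f_given_def = bigram_prob("def", "f")

# Under the Markov assumption, the probability of the continuation
# "f ( x" after "def" is a product of bigram probabilities.
p_continuation = (
    bigram_prob("def", "f") * bigram_prob("f", "(") * bigram_prob("(", "x")
)
print(p_f_given_def, p_continuation)
```

A larger $k$ only changes the key of the table: instead of one previous token, the context becomes a tuple of the $k$ previous tokens.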


In [None]:
import numpy as np
from collections import Counter, defaultdict

from tqdm.notebook import tqdm


class NGramModel(object):
    """
    A count-based n-gram language model.

    self.ngrams is a dict that maps each context, a tuple of the n-1
        preceding token ids, to the frequencies of the token that follows it.
        The frequencies are stored in a Counter.
    self.tokenize_func is a placeholder for a text tokenization function.
        It is not used here because the dataset is already tokenized.
    """
    def __init__(self, n=2):
        self.ngrams = defaultdict(Counter)
        self.n = n
        self.tokenize_func = None

    def compute_ngrams(self, dataset):
        self.ngrams = defaultdict(Counter)
        for row in tqdm(dataset):
            # Sliding window over the last n tokens, left-padded with PAD.
            ngram = [PAD] * self.n
            for token in row["features"]:
                ngram[:-1] = ngram[1:]
                ngram[-1] = token
                self.ngrams[tuple(ngram[:-1])].update([ngram[-1]])

    def get_log_probs(self, prefix, min_log_pr=-15):
        """
        Return a dict {token: log-probability} for the tokens seen after the
        last n-1 tokens of `prefix`. An unseen context gives an empty dict.
        `min_log_pr` is kept for compatibility and is not used.
        """
        prefix = list(prefix)
        if len(prefix) < self.n - 1:
            prefix = [PAD] * (self.n - len(prefix) - 1) + prefix
        else:
            prefix = prefix[len(prefix) - self.n + 1:]
        # .get avoids inserting empty counters into the defaultdict
        # and taking the logarithm of a zero total.
        possible_ends = self.ngrams.get(tuple(prefix), {})
        if not possible_ends:
            return {}
        log_total = np.log(sum(possible_ends.values()))
        return {e: np.log(c) - log_total for e, c in possible_ends.items()}

    def sample(self, prefix):
        log_probs = self.get_log_probs(prefix)
        if len(log_probs) > 0:
            probs = np.exp(list(log_probs.values()))
            # Renormalize to guard against floating-point drift.
            return np.random.choice(list(log_probs.keys()), p=probs / probs.sum())
        return EOS

In [None]:
n_gram_model = NGramModel(n=5)

In [None]:
n_gram_model.compute_ngrams(dataset["train"])





In [None]:
idx2token = {idx: token for token, idx in token2idx.items()}

In [None]:
prefix = ["def", "train", "("]
encoded_prefix = [token2idx[token] for token in prefix]
length=100

for i in range(length):
    cur_token = n_gram_model.sample(encoded_prefix)
    if cur_token == EOS:
        break
    encoded_prefix += [cur_token]


decoded_text = [idx2token[idx] for idx in encoded_prefix]
print(" ".join(decoded_text))

def train ( self ) : channel_id = self . [UNK] + self . content_feature ] [ 0 ] , ) + crop_target ) else : size = kernel_size * kernel_size * inputs_shape [ - 1 : ] == [UNK] : [UNK] try : ar = [UNK] . [UNK] ( project_name , updatetime , md5sum ) assert project_data , [UNK] if project_data . get ( 'exception' ) : ret = { 'result' : success , 'state' : { 'old' : [UNK] , 'new' : [UNK] } ret_communities [ 'changes' ] . update ( kwargs ) return ubq


In [None]:
test_dataset = load_dataset(
    "json",
    data_files=[
        "test.jsonl",
    ],
)

Using custom data configuration default-3e2c20c277017e63


Downloading and preparing dataset json/default (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to /root/.cache/huggingface/datasets/json/default-3e2c20c277017e63/0.0.0/dc7ee63ec8b554c48ecc5a8a6fbe27af8071408c244e4347cf9222d6206d83a2...



Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/default-3e2c20c277017e63/0.0.0/dc7ee63ec8b554c48ecc5a8a6fbe27af8071408c244e4347cf9222d6206d83a2. Subsequent calls will reuse this data.


In [None]:
max_seq_len=128

test_dataset = test_dataset.map(
    lambda item: {
        "features": [encode(token) for token in item["code_tokens"]][:max_seq_len-1] + [EOS]
    }
)





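### Quality metric

We evaluate language models with perplexity, the exponent of the entropy:

$$
P P(p):=2^{H(p)}=2^{-\sum_{x} p(x) \log _{2} p(x)}
$$

Equivalently, we can use the natural logarithm. The value is the same, because changing the base rescales the entropy and the exponent together:

$$
P P' (p):=e^{H(p)}=e^{-\sum_{x} p(x) \ln p(x)}
$$

In practice, perplexity is estimated on a held-out text of $n$ tokens as the exponent of the average negative log-likelihood:

$$
P P' \approx e^{-\frac{1}{n}\sum_{i=1}^{n} \ln P\left(w_{i} \mid w_{1} w_{2} \ldots w_{i-1}\right)}
$$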
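
As a sanity check on the definition, a model that spreads probability uniformly over $V$ tokens has perplexity exactly $V$. The sketch below computes the exponent of the average negative log-probability from a list of per-token probabilities; the function name `perplexity` and the numbers are illustrative assumptions.

```python
import numpy as np


def perplexity(token_probs):
    """exp of the average negative log-probability assigned to each token."""
    token_probs = np.asarray(token_probs, dtype=float)
    return float(np.exp(-np.mean(np.log(token_probs))))


# A uniform model over V = 8 tokens gives every token probability 1/8,
# so its perplexity equals the vocabulary size.
uniform_pp = perplexity([1 / 8] * 5)

# A model that puts high probability on the observed tokens scores lower.
confident_pp = perplexity([0.9, 0.8, 0.95, 0.7, 0.9])
print(uniform_pp, confident_pp)
```

Perplexity can therefore be read as the effective number of tokens the model is choosing between at each step.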

In [None]:
def count_perplexity(model, dataset, max_iter_num: int = 1000, unseen_log_prob: float = -10.0):
    # `unseen_log_prob` is an assumed fixed log-probability for tokens that
    # the model has never seen after the given context.
    neg_log_likelihood = 0.0
    num_words = 0
    num_items = min(max_iter_num, len(dataset))
    for iter_num, item in enumerate(tqdm(dataset, total=num_items)):
        if iter_num >= max_iter_num:
            break
        output_so_far = [item["features"][0]]

        for token in item["features"][1:]:
            num_words += 1
            log_probs = model.get_log_probs(output_so_far)
            # dict.get falls back to the penalty instead of raising KeyError.
            neg_log_likelihood += -log_probs.get(token, unseen_log_prob)
            output_so_far.append(token)
    mean_entropy = neg_log_likelihood / num_words
    return np.e ** mean_entropy

In [None]:
count_perplexity(n_gram_model, test_dataset["train"])

## CNN

![](https://lena-voita.github.io/resources/lectures/lang_models/neural/cnn/cnn_main-min.png)



In [None]:
dataset.set_format(type="torch", columns=["features"])
test_dataset.set_format(type="torch", columns=["features"])

In [None]:
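def collate_fn(batch):
    # TextSampler yields a whole list of indices as one "sample", so the
    # DataLoader (batch_size=1) wraps the already batched dict in a list.
    batch = batch[0]
    max_len = max(len(f_t) for f_t in batch["features"])
    # Right-pad every sequence with PAD (= 0) up to the longest one.
    input_embeds = torch.zeros((len(batch["features"]), max_len), dtype=torch.long)
    for idx, row in enumerate(batch["features"]):
        input_embeds[idx][:len(row)] += row
    return {
        "features": input_embeds,
    }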

In [None]:
from torch.utils.data import Sampler


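class TextSampler(Sampler):
    """Groups indices into batches of at most `batch_size_tokens` tokens after padding."""
    def __init__(self, sampler, batch_size_tokens=1e4):
        self.sampler = sampler
        self.batch_size_tokens = batch_size_tokens

    def __iter__(self):
        batch = []
        max_len = 0
        for ix in self.sampler:
            row = self.sampler.data_source[ix]
            max_len = max(max_len, len(row["features"]))
            # Padded batch size = number of rows * longest row.
            # `batch` must be non-empty, otherwise a single over-long row
            # would make us yield an empty batch.
            if batch and (len(batch) + 1) * max_len > self.batch_size_tokens:
                yield batch
                batch = []
                max_len = len(row["features"])
            batch.append(ix)
        if len(batch) > 0:
            yield batch

    def __len__(self):
        # Number of examples, not batches: the number of batches is only
        # known after iterating.
        return len(self.sampler)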

In [None]:
from torch.utils.data import DataLoader, SequentialSampler, RandomSampler, random_split


train_sampler = RandomSampler(dataset["train"])
valid_sampler = SequentialSampler(test_dataset["train"])

loaders = {
    "train": DataLoader(
        dataset["train"], 
        collate_fn=collate_fn, 
        sampler=TextSampler(train_sampler,)
    ),
    "valid": DataLoader(
        test_dataset["train"],
        collate_fn=collate_fn, 
        sampler=TextSampler(
            valid_sampler, 
        )
    )
}

In [None]:
import torch
import torch.nn as nn


class CNNLM(nn.Module):
    def __init__(self, vocab_size, emb_size, hidden_size, num_layers=3, kernel_size: int = 5):
        super().__init__()

        self.emb = nn.Embedding(vocab_size, emb_size)
        layers = []
        for layer_idx in range(num_layers):
            if layer_idx > 0:
                # Non-linearity between convolutions; without it the stack
                # collapses into a single linear map.
                layers.append(nn.ReLU())
            # Left-pad the time axis so that position t sees only tokens <= t.
            layers.append(nn.ConstantPad1d((kernel_size-1, 0), 0.0))
            in_channels = emb_size if layer_idx == 0 else hidden_size
            layers.append(nn.Conv1d(in_channels, hidden_size, kernel_size=kernel_size))
        self.conv_layers = nn.Sequential(*layers)
        # Each extra layer widens the visible context by kernel_size - 1 tokens.
        self.receptive_field = kernel_size + (kernel_size-1)*(num_layers-1)
        self.pred = nn.Linear(hidden_size, vocab_size)

    def forward(self, input_ids):
        embed = self.emb(input_ids)
        # Conv1d expects (batch, channels, seq_len).
        embed = embed.permute(0, 2, 1)
        features = self.conv_layers(embed)
        features = features.permute(0, 2, 1)
        logits = self.pred(features)
        return logits

In [None]:
device = "cuda:0" if torch.cuda.is_available() else "cpu"

model = CNNLM(len(tokens) + 3, 300, 100, num_layers=1).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
criterion = nn.CrossEntropyLoss(ignore_index=PAD)

In [None]:
from tqdm.notebook import tqdm, trange


def train(
    num_epochs: int, 
    model: nn.Module,
    train_loader: DataLoader,
    valid_loader: DataLoader,
    criterion: nn.Module,
    optimizer: torch.optim.Optimizer,
    max_grad_norm: float = None
):
    for epoch in trange(num_epochs):
        pbar = tqdm(train_loader, leave=False, total=len(train_loader)//20)
        pbar.set_description("Train epoch")
        model.train()
        for batch in pbar:
            optimizer.zero_grad()
            features = batch["features"].to(device)
            predictions = model(features[:, :-1])
            loss = criterion(
                predictions.reshape(-1, predictions.size(-1)),
                features[:, 1:].reshape(-1)
            )
            loss.backward()
            if max_grad_norm is not None:
                torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
            optimizer.step()
        model.eval()
        mean_loss = 0
        pbar = tqdm(valid_loader, leave=False, total=len(valid_loader)//100)
        pbar.set_description("Valid epoch")
        num_iter=0
        for batch in pbar:
            features = batch["features"].to(device)
            with torch.no_grad():
                predictions = model(features[:, :-1])
                loss = criterion(
                    predictions.reshape(-1, predictions.size(-1)),
                    features[:, 1:].reshape(-1)
                )
            mean_loss += loss.item()
            num_iter += 1
        mean_loss /= num_iter
        print(f"Epoch: {epoch}; mean loss: {mean_loss}; perplexity: {np.exp(mean_loss)}")
            

In [None]:
train(
    num_epochs=1,
    model=model, 
    train_loader=loaders["train"],
    valid_loader=loaders["valid"],
    criterion=criterion,
    optimizer=optimizer,
)


Epoch: 0; mean loss: 3.3694700441862406; perplexity: 29.063120806004164



![](https://lena-voita.github.io/resources/lectures/lang_models/neural/cnn/receptive_field-min.png)


How can we increase the receptive field?

Add more layers.

How can we train such a deep network?

Add residual connections.


![](https://lena-voita.github.io/resources/lectures/lang_models/neural/cnn/cnn_with_residual-min.png)
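
A minimal sketch of one residual convolutional layer, assuming PyTorch and the same left-padding trick as in `CNNLM` above. The class name `ResidualCausalConv` and the helper `receptive_field` are hypothetical and are not used elsewhere in this notebook.

```python
import torch
import torch.nn as nn


class ResidualCausalConv(nn.Module):
    """One causal Conv1d layer with a skip connection (illustrative sketch)."""

    def __init__(self, hidden_size: int, kernel_size: int = 5):
        super().__init__()
        # Left-pad by kernel_size - 1 so position t never sees positions > t.
        self.pad = nn.ConstantPad1d((kernel_size - 1, 0), 0.0)
        self.conv = nn.Conv1d(hidden_size, hidden_size, kernel_size=kernel_size)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, hidden_size, seq_len). The output has the same shape,
        # which is what makes the element-wise sum with the input possible.
        return x + self.act(self.conv(self.pad(x)))


def receptive_field(kernel_size: int, num_layers: int) -> int:
    """Number of input positions that influence one output position."""
    return 1 + (kernel_size - 1) * num_layers


block = ResidualCausalConv(hidden_size=16, kernel_size=5)
x = torch.randn(2, 16, 10)
y = block(x)
print(tuple(y.shape), receptive_field(5, 1), receptive_field(5, 3))
```

The skip connection gives gradients a direct path through the stack, so several such blocks can be chained while the receptive field grows by `kernel_size - 1` tokens per layer.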



In [None]:
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
from ipywidgets import interactive
from IPython import display

sns.set(style="whitegrid", font_scale=1.4)

sample = np.random.randn(10)
def plot_temperature(T: float = 1.0):
    plt.figure(figsize=(12, 8))
    plt.title(f"Temperature = {T}")
    probs = np.exp(sample / T) / sum(np.exp(sample / T))
    plt.bar(range(10), probs)
    plt.xlabel("tokens")
    plt.ylabel("probs")
    plt.show()


v = interactive(
    plot_temperature, T=(0.02, 10)
)

In [None]:
display.display(v)


In [None]:
from typing import List
from torch.distributions import Categorical

@torch.no_grad()
def generate(
    prefix, model, length: int = 100, receptive_field: int = 5, T: float = 1.
) -> torch.Tensor:
    # prefix: 1-D numpy array of token ids; returns a (1, seq_len) tensor.
    prefix = torch.from_numpy(prefix)
    prefix = prefix.unsqueeze(0).to(device)
    model.eval()
    for iter_idx in range(length):
        # Feed only the last `receptive_field` tokens to the model.
        preds = model(prefix[:, -receptive_field:])
        # Temperature rescales the logits: T < 1 sharpens the distribution,
        # T > 1 flattens it.
        probs = torch.softmax(preds[:, -1]/T, dim=-1)
        distribution = Categorical(probs)
        sampled = distribution.sample()
        if sampled.item() == EOS:
            break
        prefix = torch.cat((prefix, sampled.unsqueeze(0)), dim=1)
    return prefix

In [None]:
prefix = ["def", "train", "("]
encoded_prefix = np.array([token2idx[t] for t in prefix])


for t in np.logspace(0.002, 1, 10):
    generated = generate(
        encoded_prefix, 
        model, 
        receptive_field=model.receptive_field, 
        length=20,
        T=t-1
    )
    print(f"Temperature: {t-1}")
    print(" ".join([idx2token[idx] for idx in generated.cpu().numpy().flatten()]))

Temperature: 0.004615790278395204
def train ( self , y , y , y , y , y , y , y , y , y ,
Temperature: 0.29684743947274783
def train ( self , other , other , other ) : self . [UNK] . emit ( ) ) editor . [UNK]
Temperature: 0.6740860511469413
def train ( self , buf , ctypes . byref ( local_struct_pack ) , ctypes . byref ( round ( y0 ) unmatched
Temperature: 1.161059212781561
def train ( self = 4 , broker_name = // 3 ) . Y_batch ) [ 'code' ] environments = language = cwd
Temperature: 1.7896874942291365
def train ( expr , >= , finished_seq
Temperature: 2.6011768069240193
def train ( query_info ) values jls parse . metric_type cm Menu service_root yp_masked_test font num_entities_in_instance new_class VMwareSaltError ) pl . start_session start_open
Temperature: 3.648719407300855
def train ( bus_number "q" usable_url ip_info video_hparams logprob_i bad_probs render_template alignments queue = GraphDef nni exports [UNK] effective_path interpreter , n_iter "_id"
Temperature: 5.000980592306576
d

## LSTM

![](https://lena-voita.github.io/resources/lectures/lang_models/neural/rnn/rnn_simple-min.png)

In [None]:
class LSTM(nn.Module):
    def __init__(self, vocab_size, emb_size, hidden_size):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_size)
        self.lstm = nn.LSTM(emb_size, hidden_size, batch_first=True)
        self.pred = nn.Linear(hidden_size, vocab_size)
        
    def forward(self, input_ids):
        embs = self.emb(input_ids)
        output, _ = self.lstm(embs)
        return self.pred(output)

In [None]:
model = LSTM(len(token2idx), 300, 50).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

In [None]:
train(
    num_epochs=1,
    model=model,
    train_loader=loaders["train"],
    valid_loader=loaders["valid"],
    criterion=criterion,
    optimizer=optimizer,
)


Epoch: 0; mean loss: 3.1772768689875015; perplexity: 23.981360169501063



## Text generation methods

### Greedy Search

$$
w_t = \operatorname{argmax}_{w} P\left(w \mid w_{1: t-1}\right)
$$

![](https://huggingface.co/blog/assets/02_how-to-generate/greedy_search.png)

**Problem**: the model quickly starts repeating the same phrase.
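
A minimal sketch of greedy search that shows how repetition arises. The table `next_token_probs` is a hypothetical bigram model with made-up probabilities, not one of the models trained above.

```python
# Hypothetical next-token distributions of a tiny bigram model.
next_token_probs = {
    "self": {".": 0.6, ",": 0.4},
    ".": {"x": 0.5, "y": 0.3, "self": 0.2},
    "x": {"=": 0.55, ")": 0.45},
    "=": {"self": 0.7, "1": 0.3},
    ",": {"x": 1.0},
    ")": {":": 1.0},
}


def greedy_decode(start, steps):
    tokens = [start]
    for _ in range(steps):
        options = next_token_probs.get(tokens[-1])
        if not options:
            break
        # Greedy search: always take the single most probable next token.
        tokens.append(max(options, key=options.get))
    return tokens


generated = greedy_decode("self", 8)
print(" ".join(generated))
# → self . x = self . x = self
```

Because the choice at every step is deterministic, the decoder falls into the cycle `self . x =` as soon as it revisits a context it has already seen.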

### Beam search

![](https://huggingface.co/blog/assets/02_how-to-generate/beam_search.png)

**Problem**: the model still produces text that is too predictable compared with human speech.
![](https://blog.fastforwardlabs.com/images/2019/05/Screen_Shot_2019_05_08_at_3_06_36_PM-1557342561886.png)
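
Beam search can be sketched in a few lines of plain Python. The toy distributions below are made up and conditioned on the last token only, and the function name `beam_search` is an assumption for this example.

```python
import math

# Hypothetical next-token distributions, conditioned on the last token only.
next_token_probs = {
    "The": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.5, "drives": 0.3, "turns": 0.2},
}


def beam_search(start, steps, num_beams):
    # Each hypothesis is (sum of log-probabilities, list of tokens).
    beams = [(0.0, [start])]
    for _ in range(steps):
        candidates = []
        for score, tokens in beams:
            options = next_token_probs.get(tokens[-1], {})
            if not options:
                # Nothing to extend with: keep the hypothesis as it is.
                candidates.append((score, tokens))
                continue
            for token, prob in options.items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the `num_beams` most probable hypotheses.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
    return beams


greedy = beam_search("The", steps=2, num_beams=1)[0]
best = beam_search("The", steps=2, num_beams=2)[0]
print(greedy[1], round(math.exp(greedy[0]), 2))
print(best[1], round(math.exp(best[0]), 2))
```

With a single beam the search is greedy and commits to `nice`, ending with probability 0.2. With two beams it keeps `dog` alive for one more step and finds `The dog has`, whose probability is 0.36.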

### Sampling

$$
w_{t} \sim P\left(w \mid w_{1: t-1}\right)
$$

![](https://huggingface.co/blog/assets/02_how-to-generate/sampling_search_with_temp.png)

**Problem**: the coherence of the text suffers. Some phrases come out too random.

### Top-K Sampling


![](https://huggingface.co/blog/assets/02_how-to-generate/top_k_sampling.png)

Another option is top-p sampling: greedily collect the most probable tokens until their cumulative probability reaches p, then sample only from that set.
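
Both filters can be sketched with NumPy on a single probability vector. The function names `top_k_filter` and `top_p_filter` and the example distribution are assumptions made for illustration.

```python
import numpy as np


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    probs = np.asarray(probs, dtype=float)
    keep = np.argsort(probs)[::-1][:k]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()


def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose mass reaches p."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    # Position of the first token at which the cumulative mass reaches p.
    cutoff = int(np.searchsorted(cumulative, p)) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()


probs = np.array([0.5, 0.25, 0.125, 0.0625, 0.0625])
rng = np.random.default_rng(0)

top_k = top_k_filter(probs, k=2)    # only tokens 0 and 1 survive
top_p = top_p_filter(probs, p=0.8)  # tokens 0, 1 and 2 survive (mass 0.875)
sampled = rng.choice(len(probs), p=top_p)
print(top_k, top_p, sampled)
```

Top-k always keeps a fixed number of candidates, while top-p adapts: a peaked distribution keeps few tokens and a flat one keeps many.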


In [None]:
prefix = ["def", "train", "("]
encoded_prefix = np.array([token2idx[t] for t in prefix])

generated = generate(encoded_prefix, model)

In [None]:
prefix = ["def", "train", "("]
encoded_prefix = np.array([token2idx[t] for t in prefix])


for t in np.logspace(0.002, 1, 10):
    generated = generate(
        encoded_prefix, 
        model, 
        receptive_field=20, 
        length=20,
        T=t-1
    )
    print(f"Temperature: {t-1}")
    print(" ".join([idx2token[idx] for idx in generated.cpu().numpy().flatten()]))

Temperature: 0.004615790278395204
def train ( self , [UNK] , [UNK] ) : if not isinstance ( self , [UNK] ) : raise CommandExecutionError ( [UNK]
Temperature: 0.29684743947274783
def train ( self , name , * * kwargs ) : if self . [UNK] : [UNK] = [UNK] . format (
Temperature: 0.6740860511469413
def train ( self , * ) : self . [UNK] = [UNK] ( [UNK] . format ( ) ) if [UNK] in
Temperature: 1.161059212781561
def train ( create_from_ll , data , models : payload = { self . plot_obj . get ( difference ) , # type: bool cl
Temperature: 1.7896874942291365
def train ( "text/plain" EISDIR pos num_expired sequenceOutput init_h RQInvalidArgument _urlopen tw tok_tokens file_count list_cache Summary ABCSeries font_o _gluster_ok segments RuntimeError end_date_ixs regex_re
Temperature: 2.6011768069240193
def train ( per_cpu time_started around_and_astype pretrained_settings version_ f student_w numpy_doc resize_height_factor _mxnet_utils remember_bias pages [UNK] 'code' newb upload x_index back_color skey 

## References



1.   [Notes from the YSDA (Yandex School of Data Analysis) course.](https://lena-voita.github.io/nlp_course/language_modeling.html)
2.   [Hugging Face blog post on text generation.](https://huggingface.co/blog/how-to-generate) Don't worry for now about which model is used in the example. We will cover it in detail in one of the upcoming classes.

