# **Make your own Rick Sanchez (bot) with Transformers and DialoGPT fine-tuning**

![alt text](https://media.giphy.com/media/ZdBnIqBeVWBYTobH8U/giphy.gif)


The road to a virtual Rick was not an easy one. I tried many models and approaches, but for a long time virtual Rick refused to find common ground with me. Rick's first answer to my greeting was "I'm not your buddy" (to be honest, at first that was the only phrase he knew). But I did not give up, and with the help of the excellent Transformers library and Microsoft's Large-scale Pretrained Response Generation Model (DialoGPT) I managed to have a heart-to-heart talk with Rick. Here is one of our conversations:
**User**:Hey, Rick, make me a time machine <br>
**RickBot**: You're a monster! <br>
**User**:No, I just want to travel in time <br>
**RickBot**: To the past. <br>
**User**:yes <br>
**RickBot**: No time machine, please. <br>
**User**:But I need it <br>
**RickBot**: You don't need it, Morty. Just do it. <br>
**User**:I want to see antient Greece <br>
**RickBot**: Oh, geez. <br>

![alt text](https://media.giphy.com/media/WVGSfyhUYBoiGLbZ7N/giphy.gif)

# **A bit of theory**

In this article I will show you how to create a virtual character whose lines are based on the transcripts of my favorite animated science-fiction sitcom. You can use characters and sitcoms of your own choice. I have added code that converts a regular text file with dialogues into a format the model understands.
As I already mentioned, the [Transformers](https://huggingface.co/transformers/) library, which contains recent NLP models (such as [BERT](https://huggingface.co/transformers/model_doc/bert.html), [XLNet](https://huggingface.co/transformers/model_doc/xlnet.html), [GPT-2](https://huggingface.co/transformers/model_doc/gpt2.html)), will help us with this task. You can read more about transformers in the beautifully illustrated [article](http://jalammar.github.io/illustrated-transformer/) by Jay Alammar.

![alt text](http://jalammar.github.io/images/t/transformer_resideual_layer_norm_3.png)

The Transformers model collection includes Microsoft's [DialoGPT](https://huggingface.co/transformers/model_doc/dialogpt.html). DialoGPT is a GPT-2 model trained on 147M multi-turn dialogues from Reddit discussion threads (you can learn more about GPT-2 [here](http://jalammar.github.io/illustrated-gpt2/)). This model is well suited to creating a virtual character for an engaging conversation, and even the small variant can hold a coherent dialogue, as we will see next.

## First dialogue with DialoGPT

We will run all experiments in Google Colab; its resources are enough to train the small DialoGPT model. First, we connect to Google Drive and install the required modules.

In [1]:
from google.colab import drive
drive.mount('/content/drive/')

Mounted at /content/drive/


Install Transformers.

In [2]:
! pip install transformers==2.11.0

Collecting transformers==2.11.0

Let's try chatting with DialoGPT without fine-tuning.

In [3]:
from transformers import AutoModelWithLMHead, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelWithLMHead.from_pretrained("microsoft/DialoGPT-small")


### Let's chat a bit without fine-tuning

In [4]:
# Chat for, say, 5 turns
for step in range(5):
    # Encode the new user input, append the eos_token and return a PyTorch tensor
    new_user_input_ids = tokenizer.encode(input(">> Black:") + tokenizer.eos_token, return_tensors='pt')

    # Append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # Generate a response, limiting the total chat history to 1000 tokens
    chat_history_ids = model.generate(
        bot_input_ids, max_length=1000,
        pad_token_id=tokenizer.eos_token_id
    )

    # Print only the newly generated tokens (everything after the input)
    print("Rick: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))

>> Black:Hello Rick
Rick: I'm not your buddy, friend.
>> Black:what are you?
Rick: I'm not your guy, buddy.
>> Black:Hi
Rick: Hi guy
>> Black:hi how are you?
Rick: Hi guy
>> Black:Who is Morty?
Rick: Hi guy
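
The indexing in the loop above can look cryptic, so here is a minimal sketch of the same bookkeeping with plain Python lists instead of tensors. `fake_generate` is a hypothetical stand-in for `model.generate` (it only appends a canned reply), and all token ids are made up. The point is that the generated sequence contains the whole input, so the new reply is whatever comes after the input length.

```python
EOS = 0  # made-up end-of-sequence id, for illustration only


def fake_generate(input_ids, reply):
    """Hypothetical stand-in for model.generate: returns input + reply + EOS."""
    return input_ids + reply + [EOS]


def chat_turn(history, user_ids, reply):
    """Run one turn and return (new_history, reply_tokens_without_eos)."""
    # The user turn is terminated with EOS and appended to the history so far.
    bot_input = history + user_ids + [EOS]
    new_history = fake_generate(bot_input, reply)
    # Everything after the input length is the newly generated reply.
    new_tokens = new_history[len(bot_input):]
    return new_history, [t for t in new_tokens if t != EOS]


history, reply_1 = chat_turn([], [11, 12], reply=[21, 22, 23])
history, reply_2 = chat_turn(history, [13], reply=[24])
print(reply_1, reply_2, history)
```

Because the history grows with every turn, the `max_length=1000` limit in the real loop bounds the whole conversation, not a single reply.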


![alt text](https://media.giphy.com/media/L3WevKXIKFDaZBvV8Q/giphy.gif)

Not bad, but not too impressive. We will fix that with fine-tuning.

## Initial model configuration

Let's train our own Rick chatbot. To start, we need a basic configuration and a dataset. The configuration and training scripts are mostly based on this script from Huggingface and an excellent [tutorial](https://nathancooper.io/i-am-a-nerd/chatbot/deep-learning/gpt2/2020/05/12/chatbot-part-1.html) by Nathan Cooper.

Fine-tuning the library models for language modeling on a text file (GPT, GPT-2, BERT, RoBERTa).
GPT and GPT-2 are fine-tuned using a causal language modeling (CLM) loss, while BERT and RoBERTa are fine-tuned
using a masked language modeling (MLM) loss.
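
To make the CLM/MLM distinction concrete, here is a minimal sketch on a toy token list. It is an illustration only: the function names, the `[MASK]` placeholder, and the flat 15% masking rate are assumptions of this sketch, and the real training code computes these losses inside the model. In CLM every token is predicted from the tokens to its left; in MLM some tokens are hidden and predicted from both sides.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol, for illustration only


def clm_pairs(tokens):
    """Causal LM: each position predicts the next token from the left context."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def mlm_example(tokens, mask_prob=0.15, seed=0):
    """Masked LM (simplified): hide some tokens and remember them as targets."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok  # position -> original token to predict
        else:
            masked.append(tok)
    return masked, targets


tokens = ["I", "got", "a", "surprise", "for", "you", "Morty"]
pairs = clm_pairs(tokens)
masked, targets = mlm_example(tokens, mask_prob=0.3)
print(pairs[0], pairs[-1])
print(masked, targets)
```

DialoGPT is a GPT-2 model, so only the causal variant matters for the rest of this notebook.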

In [4]:
import glob
import logging
import os
import pickle
import random
import re
import shutil
from typing import Dict, List, Tuple

import pandas as pd
import numpy as np
import torch

from sklearn.model_selection import train_test_split

from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset, RandomSampler, SequentialSampler
from torch.utils.data.distributed import DistributedSampler
from tqdm.notebook import tqdm, trange

from pathlib import Path

from transformers import (
    MODEL_WITH_LM_HEAD_MAPPING,
    WEIGHTS_NAME,
    AdamW,
    AutoConfig,
    AutoModelWithLMHead,
    AutoTokenizer,
    PreTrainedModel,
    PreTrainedTokenizer,
    get_linear_schedule_with_warmup,
)


try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError:
    from tensorboardX import SummaryWriter


Configuration

In [5]:
logger = logging.getLogger(__name__)

MODEL_CONFIG_CLASSES = list(MODEL_WITH_LM_HEAD_MAPPING.keys())
MODEL_TYPES = tuple(conf.model_type for conf in MODEL_CONFIG_CLASSES)

Args that make it easy to convert the Python script into a notebook

In [6]:
class Args():
    def __init__(self):
        self.output_dir = 'output-small'
        self.model_type = 'gpt2'
        self.model_name_or_path = 'microsoft/DialoGPT-small'
        self.config_name = 'microsoft/DialoGPT-small'
        self.tokenizer_name = 'microsoft/DialoGPT-small'
        self.cache_dir = 'cached'
        self.block_size = 512
        self.do_train = True
        self.do_eval = True
        self.evaluate_during_training = False
        self.per_gpu_train_batch_size = 4
        self.per_gpu_eval_batch_size = 4
        self.gradient_accumulation_steps = 1
        self.learning_rate = 5e-5
        self.weight_decay = 0.0
        self.adam_epsilon = 1e-8
        self.max_grad_norm = 1.0
        self.num_train_epochs = 3
        self.max_steps = -1
        self.warmup_steps = 0
        self.logging_steps = 1000
        self.save_steps = 3500
        self.save_total_limit = None
        self.eval_all_checkpoints = False
        self.no_cuda = False
        self.overwrite_output_dir = True
        self.overwrite_cache = True
        self.should_continue = False
        self.seed = 42
        self.local_rank = -1
        self.fp16 = False
        self.fp16_opt_level = 'O1'

args = Args()

## Data preprocessing

Our dialogue dataset is based on the dataset used in Andrada Olteanu's article on sentiment analysis of Rick and Morty. Many thanks for her work, and also to Gabriel Hernandes, the author of the original text dataset!

![alt text](https://media.giphy.com/media/U6LOakQja88ImTnE6T/giphy.gif)

Load the data.

In [7]:
from google.colab import files
file = files.upload()

Saving RickAndMortyScripts.csv to RickAndMortyScripts.csv


Let's look at the data.

In [8]:
all_rick = pd.read_csv('RickAndMortyScripts.csv')
all_rick.head(10)

Unnamed: 0,index,season no.,episode no.,episode name,name,line
0,0,1,1,Pilot,Rick,Morty! You gotta come on. Jus'... you gotta co...
1,1,1,1,Pilot,Morty,"What, Rick? What’s going on?"
2,2,1,1,Pilot,Rick,"I got a surprise for you, Morty."
3,3,1,1,Pilot,Morty,It's the middle of the night. What are you tal...
4,4,1,1,Pilot,Rick,"Come on, I got a surprise for you. Come on, h..."
5,5,1,1,Pilot,Morty,Ow! Ow! You're tugging me too hard!
6,6,1,1,Pilot,Rick,"We gotta go, gotta get outta here, come on. Go..."
7,7,1,1,Pilot,Rick,"What do you think of this... flying vehicle, M..."
8,8,1,1,Pilot,Morty,"Yeah, Rick... I-it's great. Is this the surprise?"
9,9,1,1,Pilot,Rick,Morty. I had to... I had to do it. I had— I ha...


We will convert this dataset so that every response row contains the n previous responses as context. For our purposes, seven previous responses will be enough.

In [9]:
contexted = []

n = 7

for i in range(n, len(all_rick['line'])):
  row = []
  prev = i - 1 - n # subtract 1 so that the row contains the current response and the 7 previous responses
  for j in range(i, prev, -1):
    row.append(all_rick['line'][j])
  contexted.append(row)  

In [10]:
len(contexted)

1898
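
The windowing above is easier to follow on a toy input. The sketch below repeats the same index arithmetic on a five-item list with `n = 2`; `build_context_rows` and the single-letter "lines" are hypothetical names and data used only for illustration.

```python
def build_context_rows(lines, n):
    """Return rows of [response, previous line, ..., n-th previous line]."""
    rows = []
    for i in range(n, len(lines)):
        # Walk backwards from the response at index i over its n predecessors.
        rows.append([lines[j] for j in range(i, i - n - 1, -1)])
    return rows


toy_lines = ["a", "b", "c", "d", "e"]
rows = build_context_rows(toy_lines, n=2)
for row in rows:
    print(row)
```

Each row starts with the response and then lists the context from newest to oldest, and the number of rows is `len(lines) - n`.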

In [11]:
columns = ['response', 'context'] 
columns = columns + ['context/'+str(i) for i in range(n-1)]
columns

['response',
 'context',
 'context/0',
 'context/1',
 'context/2',
 'context/3',
 'context/4',
 'context/5']

In [12]:
df = pd.DataFrame.from_records(contexted, columns=columns)
df.head(5)

Unnamed: 0,response,context,context/0,context/1,context/2,context/3,context/4,context/5
0,"What do you think of this... flying vehicle, M...","We gotta go, gotta get outta here, come on. Go...",Ow! Ow! You're tugging me too hard!,"Come on, I got a surprise for you. Come on, h...",It's the middle of the night. What are you tal...,"I got a surprise for you, Morty.","What, Rick? What’s going on?",Morty! You gotta come on. Jus'... you gotta co...
1,"Yeah, Rick... I-it's great. Is this the surprise?","What do you think of this... flying vehicle, M...","We gotta go, gotta get outta here, come on. Go...",Ow! Ow! You're tugging me too hard!,"Come on, I got a surprise for you. Come on, h...",It's the middle of the night. What are you tal...,"I got a surprise for you, Morty.","What, Rick? What’s going on?"
2,Morty. I had to... I had to do it. I had— I ha...,"Yeah, Rick... I-it's great. Is this the surprise?","What do you think of this... flying vehicle, M...","We gotta go, gotta get outta here, come on. Go...",Ow! Ow! You're tugging me too hard!,"Come on, I got a surprise for you. Come on, h...",It's the middle of the night. What are you tal...,"I got a surprise for you, Morty."
3,What?! A bomb?!,Morty. I had to... I had to do it. I had— I ha...,"Yeah, Rick... I-it's great. Is this the surprise?","What do you think of this... flying vehicle, M...","We gotta go, gotta get outta here, come on. Go...",Ow! Ow! You're tugging me too hard!,"Come on, I got a surprise for you. Come on, h...",It's the middle of the night. What are you tal...
4,We're gonna drop it down there just get a whol...,What?! A bomb?!,Morty. I had to... I had to do it. I had— I ha...,"Yeah, Rick... I-it's great. Is this the surprise?","What do you think of this... flying vehicle, M...","We gotta go, gotta get outta here, come on. Go...",Ow! Ow! You're tugging me too hard!,"Come on, I got a surprise for you. Come on, h..."


Split our dataset into training and test parts.

In [13]:
trn_df, val_df = train_test_split(df, test_size = 0.1)
trn_df.head()

Unnamed: 0,response,context,context/0,context/1,context/2,context/3,context/4,context/5
1443,Yes.,By changing you from a pickle to a human.,"It's a serum that I need to, uh, to stay alive...",Dad I would like you to tell me what's in the ...,"Yeah, thank you. Uh, sweetie, you don't still ...",I've heard a lot about you today. Your family ...,Mm-hmm.,You must be Rick.
563,Boom! In your face Summer! You really dropped ...,"Actually, sorry Summer, I gotta back the M bom...",Yeah huh!,Nuh uh!,"Uh, what? It was your job, Morty.","Uh, just to be clear, it was Summer's job to p...","Yup, it really makes you appreciate how fickle...","God, I feel terrible."
760,"Somebody call the planet of Tron, we have thre...",I'm gonna pee-hee-hee.,No it doesn't. It hurts.,"Doesn't feel so good, does it?",I'm gonna pee my pants.,Are you guys Power Rangers? But only on one sm...,"Um, hold the phone, where did you guys get tho...","Hey guys, we're home.Morty & Summer"
1332,"Okay, come on. This can't really be the way I ...","Oh, crap, that sun is bright.","Whoa! Oh! Whoa, whoa, whoa!","Whoa, whoa, whoa, whoa! I know what it looks l...","Hey, hey, be careful with that. It's for somet...",Great. We'll see you later.,Can't argue with that.,"Well, I mean, you don't want to get pierced by..."
961,Morty!,The heads are displeased!,I’m sure that has… that has nothing to do with...,"Nup-oo-nup-oo-nup-oo-nups… ooh, tough crowd.","Mr. President, if I’ve learned one thing today...","I’m really bad at this, Morty! There are way t...",Hey! Look at the heads! Looks like the heads a...,Headward free now to rise!


Now we will convert our dataset into a format suitable for our model. Basically, we concatenate the responses of each row into a single sequence (additionally, we add a special end-of-string token between responses so that the model understands where each response in the sequence ends).

In [14]:
def construct_conv(row, tokenizer, eos = True):
    flatten = lambda l: [item for sublist in l for item in sublist]
    conv = list(reversed([tokenizer.encode(x) + [tokenizer.eos_token_id] for x in row]))
    conv = flatten(conv)
    return conv

class ConversationDataset(Dataset):
    def __init__(self, tokenizer: PreTrainedTokenizer, args, df, block_size=512):

        block_size = block_size - (tokenizer.max_len - tokenizer.max_len_single_sentence)

        directory = args.cache_dir
        cached_features_file = os.path.join(
            directory, args.model_type + "_cached_lm_" + str(block_size)
        )

        if os.path.exists(cached_features_file) and not args.overwrite_cache:
            logger.info("Loading features from cached file %s", cached_features_file)
            with open(cached_features_file, "rb") as handle:
                self.examples = pickle.load(handle)
        else:
            logger.info("Creating features from dataset file at %s", directory)

            self.examples = []
            for _, row in df.iterrows():
                conv = construct_conv(row, tokenizer)
                self.examples.append(conv)

            logger.info("Saving features into cached file %s", cached_features_file)
            with open(cached_features_file, "wb") as handle:
                pickle.dump(self.examples, handle, protocol=pickle.HIGHEST_PROTOCOL)

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, item):
        return torch.tensor(self.examples[item], dtype=torch.long)
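
As a rough illustration of what `construct_conv` produces, here is a sketch with a toy tokenizer. `toy_encode`, `toy_construct_conv`, the vocabulary, and `EOS_ID = 0` are assumptions of this sketch; the real code uses the DialoGPT tokenizer and `tokenizer.eos_token_id`. The row arrives as response first, oldest context last, so it is reversed to put the conversation in chronological order.

```python
EOS_ID = 0  # made-up id; the real value comes from tokenizer.eos_token_id


def toy_encode(text, vocab):
    """Hypothetical tokenizer: one id per whitespace-separated word."""
    return [vocab.setdefault(word, len(vocab) + 1) for word in text.split()]


def toy_construct_conv(row, vocab):
    """Row is [response, newest context, ..., oldest context]."""
    encoded = [toy_encode(text, vocab) + [EOS_ID] for text in row]
    # Reverse so the oldest line comes first and the response comes last.
    return [tok for turn in reversed(encoded) for tok in turn]


vocab = {}
row = ["fine thanks", "how are you", "hi"]
conv = toy_construct_conv(row, vocab)
print(conv)
```

Every turn ends with the end-of-sequence id, which is how the model can tell where one speaker stops and the next begins.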

Caching and storing data / checkpoints

In [15]:
def load_and_cache_examples(args, tokenizer, df_trn, df_val, evaluate=False):
    return ConversationDataset(tokenizer, args, df_val if evaluate else df_trn)


def set_seed(args):
    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    if args.n_gpu > 0:
        torch.cuda.manual_seed_all(args.seed)


def _sorted_checkpoints(args, checkpoint_prefix="checkpoint", use_mtime=False) -> List[str]:
    ordering_and_checkpoint_path = []

    glob_checkpoints = glob.glob(os.path.join(args.output_dir, "{}-*".format(checkpoint_prefix)))

    for path in glob_checkpoints:
        if use_mtime:
            ordering_and_checkpoint_path.append((os.path.getmtime(path), path))
        else:
            regex_match = re.match(".*{}-([0-9]+)".format(checkpoint_prefix), path)
            if regex_match and regex_match.groups():
                ordering_and_checkpoint_path.append((int(regex_match.groups()[0]), path))

    checkpoints_sorted = sorted(ordering_and_checkpoint_path)
    checkpoints_sorted = [checkpoint[1] for checkpoint in checkpoints_sorted]
    return checkpoints_sorted


def _rotate_checkpoints(args, checkpoint_prefix="checkpoint", use_mtime=False) -> None:
    if not args.save_total_limit:
        return
    if args.save_total_limit <= 0:
        return

    # Check whether older checkpoints should be deleted
    checkpoints_sorted = _sorted_checkpoints(args, checkpoint_prefix, use_mtime)
    if len(checkpoints_sorted) <= args.save_total_limit:
        return

    number_of_checkpoints_to_delete = max(0, len(checkpoints_sorted) - args.save_total_limit)
    checkpoints_to_be_deleted = checkpoints_sorted[:number_of_checkpoints_to_delete]
    for checkpoint in checkpoints_to_be_deleted:
        logger.info("Deleting older checkpoint [{}] due to args.save_total_limit".format(checkpoint))
        shutil.rmtree(checkpoint)
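
The checkpoint helpers above sort by the step number extracted from the directory name rather than by the raw string. The sketch below shows why, and which directories a rotation would pick for deletion. It works on plain strings and never touches the file system; `sort_checkpoints`, `checkpoints_to_delete`, and the example paths are hypothetical.

```python
import re


def sort_checkpoints(paths, prefix="checkpoint"):
    """Order checkpoint paths by the integer step in their name."""
    numbered = []
    for path in paths:
        match = re.match(r".*{}-([0-9]+)".format(prefix), path)
        if match:
            numbered.append((int(match.group(1)), path))
    return [path for _, path in sorted(numbered)]


def checkpoints_to_delete(paths, save_total_limit):
    """Return the oldest checkpoints beyond the limit (nothing is deleted here)."""
    if not save_total_limit or save_total_limit <= 0:
        return []
    ordered = sort_checkpoints(paths)
    excess = max(0, len(ordered) - save_total_limit)
    return ordered[:excess]


paths = ["out/checkpoint-10500", "out/checkpoint-3500", "out/checkpoint-7000"]
print(sorted(paths))            # plain string sort puts 10500 first
print(sort_checkpoints(paths))  # numeric sort puts 3500 first
print(checkpoints_to_delete(paths, save_total_limit=2))
```

With `save_total_limit = None`, as in the `Args` class of this notebook, rotation is skipped and every checkpoint is kept.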

## Training and evaluation

Training our model takes quite a lot of code, but don't worry: everything should work as is. The main thing is to give the model the dataset in the right format.
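
As a rough orientation before the long training function, the total number of optimizer steps follows from a short piece of arithmetic. The sketch below mirrors how the training code derives it from the batch size, gradient accumulation, and epoch count, with incomplete batches dropped. The figure of 1708 training rows is an assumption (about 90% of the 1898 rows built earlier, on a single GPU); `total_optimization_steps` is a hypothetical helper.

```python
def total_optimization_steps(num_examples, batch_size, grad_accum_steps, num_epochs):
    """Number of optimizer updates when incomplete batches are dropped."""
    batches_per_epoch = num_examples // batch_size  # drop_last=True
    return batches_per_epoch // grad_accum_steps * num_epochs


# Assumed values: 1708 training rows, batch size 4, no accumulation, 3 epochs.
steps = total_optimization_steps(1708, 4, 1, 3)
print(steps)
```

Under these assumptions the run is shorter than `save_steps = 3500`, so no intermediate checkpoint would be written and only the final model is saved at the end of the run.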

![alt text](https://media.giphy.com/media/KetvQljQJdEMscR83K/giphy.gif)

In [16]:
def train(args, train_dataset, model: PreTrainedModel, tokenizer: PreTrainedTokenizer) -> Tuple[int, float]:
    """ Train the model """
    if args.local_rank in [-1, 0]:
        tb_writer = SummaryWriter()

    args.train_batch_size = args.per_gpu_train_batch_size * max(1, args.n_gpu)

    def collate(examples: List[torch.Tensor]):
        if tokenizer._pad_token is None:
            return pad_sequence(examples, batch_first=True)
        return pad_sequence(examples, batch_first=True, padding_value=tokenizer.pad_token_id)

    train_sampler = RandomSampler(train_dataset) if args.local_rank == -1 else DistributedSampler(train_dataset)
    train_dataloader = DataLoader(
        train_dataset, sampler=train_sampler, batch_size=args.train_batch_size, collate_fn=collate, drop_last = True
    )

    if args.max_steps > 0:
        t_total = args.max_steps
        args.num_train_epochs = args.max_steps // (len(train_dataloader) // args.gradient_accumulation_steps) + 1
    else:
        t_total = len(train_dataloader) // args.gradient_accumulation_steps * args.num_train_epochs

    model = model.module if hasattr(model, "module") else model  # Take care of distributed/parallel training
    model.resize_token_embeddings(len(tokenizer))
    # add_special_tokens_(model, tokenizer)


    # Prepare optimizer and schedule (linear warmup and decay)
    no_decay = ["bias", "LayerNorm.weight"]
    optimizer_grouped_parameters = [
        {
            "params": [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)],
            "weight_decay": args.weight_decay,
        },
        {"params": [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)], "weight_decay": 0.0},
    ]
    optimizer = AdamW(optimizer_grouped_parameters, lr=args.learning_rate, eps=args.adam_epsilon)
    scheduler = get_linear_schedule_with_warmup(
        optimizer, num_warmup_steps=args.warmup_steps, num_training_steps=t_total
    )

    # Check if saved optimizer or scheduler states exist
    if (
        args.model_name_or_path
        and os.path.isfile(os.path.join(args.model_name_or_path, "optimizer.pt"))
        and os.path.isfile(os.path.join(args.model_name_or_path, "scheduler.pt"))
    ):
        # Load in optimizer and scheduler states
        optimizer.load_state_dict(torch.load(os.path.join(args.model_name_or_path, "optimizer.pt")))
        scheduler.load_state_dict(torch.load(os.path.join(args.model_name_or_path, "scheduler.pt")))

    if args.fp16:
        try:
            from apex import amp
        except ImportError:
            raise ImportError("Please install apex from https://www.github.com/nvidia/apex to use fp16 training.")
        model, optimizer = amp.initialize(model, optimizer, opt_level=args.fp16_opt_level)

    # Multi-GPU training (should be after apex fp16 initialization)
    if args.n_gpu > 1:
        model = torch.nn.DataParallel(model)

    # Distributed training (should be after apex fp16 initialization)
    if args.local_rank != -1:
        model = torch.nn.parallel.DistributedDataParallel(
            model, device_ids=[args.local_rank], output_device=args.local_rank, find_unused_parameters=True
        )

    # Train!
    logger.info("***** Running training *****")
    logger.info("  Num examples = %d", len(train_dataset))
    logger.info("  Num Epochs = %d", args.num_train_epochs)
    logger.info("  Instantaneous batch size per GPU = %d", args.per_gpu_train_batch_size)
    logger.info(
        "  Total train batch size (w. parallel, distributed & accumulation) = %d",
        args.train_batch_size
        * args.gradient_accumulation_steps
        * (torch.distributed.get_world_size() if args.local_rank != -1 else 1),
    )
    logger.info("  Gradient Accumulation steps = %d", args.gradient_accumulation_steps)
    logger.info("  Total optimization steps = %d", t_total)

    global_step = 0
    epochs_trained = 0
    steps_trained_in_current_epoch = 0
    # Check if continuing training from a checkpoint
    if args.model_name_or_path and os.path.exists(args.model_name_or_path):
        try:
            # Set global_step to the global_step of the last saved checkpoint from the model path
            checkpoint_suffix = args.model_name_or_path.split("-")[-1].split("/")[0]
            global_step = int(checkpoint_suffix)
            epochs_trained = global_step // (len(train_dataloader) // args.gradient_accumulation_steps)
            steps_trained_in_current_epoch = global_step % (len(train_dataloader) // args.gradient_accumulation_steps)

            logger.info("  Continuing training from checkpoint, will skip to saved global_step")
            logger.info("  Continuing training from epoch %d", epochs_trained)
            logger.info("  Continuing training from global step %d", global_step)
            logger.info("  Will skip the first %d steps in the first epoch", steps_trained_in_current_epoch)
        except ValueError:
            logger.info("  Starting fine-tuning.")

    tr_loss, logging_loss = 0.0, 0.0

    model.zero_grad()
    train_iterator = trange(
        epochs_trained, int(args.num_train_epochs), desc="Epoch", disable=args.local_rank not in [-1, 0]
    )
    set_seed(args)  # Added here for reproducibility
    for _ in train_iterator:
        epoch_iterator = tqdm(train_dataloader, desc="Iteration", disable=args.local_rank not in [-1, 0])
        for step, batch in enumerate(epoch_iterator):

            # Skip past any already trained steps if resuming training
            if steps_trained_in_current_epoch > 0:
                steps_trained_in_current_epoch -= 1
                continue

            inputs, labels = (batch, batch)
            if inputs.shape[1] > 1024: continue
            inputs = inputs.to(args.device)
            labels = labels.to(args.device)
            model.train()
            outputs = model(inputs, labels=labels)
            loss = outputs[0]  # model outputs are always tuples in transformers (see doc)

            if args.n_gpu > 1:
                loss = loss.mean()  # mean() to average on multi-GPU parallel training
            if args.gradient_accumulation_steps > 1:
                loss = loss / args.gradient_accumulation_steps

            if args.fp16:
                with amp.scale_loss(loss, optimizer) as scaled_loss:
                    scaled_loss.backward()
            else:
                loss.backward()

            tr_loss += loss.item()
            if (step + 1) % args.gradient_accumulation_steps == 0:
                if args.fp16:
                    torch.nn.utils.clip_grad_norm_(amp.master_params(optimizer), args.max_grad_norm)
                else:
                    torch.nn.utils.clip_grad_norm_(model.parameters(), args.max_grad_norm)
                optimizer.step()
                scheduler.step()  # Update the learning rate schedule
                model.zero_grad()
                global_step += 1

                if args.local_rank in [-1, 0] and args.logging_steps > 0 and global_step % args.logging_steps == 0:
                    # Log metrics
                    if (
                        args.local_rank == -1 and args.evaluate_during_training
                    ):  # Only evaluate when using a single GPU, otherwise metrics may not average well
                        results = evaluate(args, model, tokenizer)
                        for key, value in results.items():
                            tb_writer.add_scalar("eval_{}".format(key), value, global_step)
                    tb_writer.add_scalar("lr", scheduler.get_lr()[0], global_step)
                    tb_writer.add_scalar("loss", (tr_loss - logging_loss) / args.logging_steps, global_step)
                    logging_loss = tr_loss

                if args.local_rank in [-1, 0] and args.save_steps > 0 and global_step % args.save_steps == 0:
                    checkpoint_prefix = "checkpoint"
                    # Save model checkpoint
                    output_dir = os.path.join(args.output_dir, "{}-{}".format(checkpoint_prefix, global_step))
                    os.makedirs(output_dir, exist_ok=True)
                    model_to_save = (
                        model.module if hasattr(model, "module") else model
                    )  # Take care of distributed/parallel training
                    model_to_save.save_pretrained(output_dir)
                    tokenizer.save_pretrained(output_dir)

                    torch.save(args, os.path.join(output_dir, "training_args.bin"))
                    logger.info("Saving model checkpoint to %s", output_dir)

                    _rotate_checkpoints(args, checkpoint_prefix)

                    torch.save(optimizer.state_dict(), os.path.join(output_dir, "optimizer.pt"))
                    torch.save(scheduler.state_dict(), os.path.join(output_dir, "scheduler.pt"))
                    logger.info("Saving optimizer and scheduler states to %s", output_dir)

            if args.max_steps > 0 and global_step > args.max_steps:
                epoch_iterator.close()
                break
        if args.max_steps > 0 and global_step > args.max_steps:
            train_iterator.close()
            break

    if args.local_rank in [-1, 0]:
        tb_writer.close()

    return global_step, tr_loss / global_step

# Evaluation of the model

def evaluate(args, model: PreTrainedModel, tokenizer: PreTrainedTokenizer, df_trn, df_val, prefix="") -> Dict:

    eval_output_dir = args.output_dir

    eval_dataset = load_and_cache_examples(args, tokenizer, df_trn, df_val, evaluate=True)
    os.makedirs(eval_output_dir, exist_ok=True)
    args.eval_batch_size = args.per_gpu_eval_batch_size * max(1, args.n_gpu)
    # Note that DistributedSampler samples randomly

    def collate(examples: List[torch.Tensor]):
        if tokenizer._pad_token is None:
            return pad_sequence(examples, batch_first=True)
        return pad_sequence(examples, batch_first=True, padding_value=tokenizer.pad_token_id)

    eval_sampler = SequentialSampler(eval_dataset)
    eval_dataloader = DataLoader(
        eval_dataset, sampler=eval_sampler, batch_size=args.eval_batch_size, collate_fn=collate, drop_last = True
    )

    # multi-gpu evaluate
    if args.n_gpu > 1:
        model = torch.nn.DataParallel(model)

    # Eval!
    logger.info("***** Running evaluation {} *****".format(prefix))
    logger.info("  Num examples = %d", len(eval_dataset))
    logger.info("  Batch size = %d", args.eval_batch_size)
    eval_loss = 0.0
    nb_eval_steps = 0
    model.eval()

    for batch in tqdm(eval_dataloader, desc="Evaluating"):
        inputs, labels = (batch, batch)
        inputs = inputs.to(args.device)
        labels = labels.to(args.device)

        with torch.no_grad():
            outputs = model(inputs, labels=labels)
            lm_loss = outputs[0]
            eval_loss += lm_loss.mean().item()
        nb_eval_steps += 1

    eval_loss = eval_loss / nb_eval_steps
    perplexity = torch.exp(torch.tensor(eval_loss))

    result = {"perplexity": perplexity}

    output_eval_file = os.path.join(eval_output_dir, prefix, "eval_results.txt")
    with open(output_eval_file, "w") as writer:
        logger.info("***** Eval results {} *****".format(prefix))
        for key in sorted(result.keys()):
            logger.info("  %s = %s", key, str(result[key]))
            writer.write("%s = %s\n" % (key, str(result[key])))

    return result
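
The evaluation function above reports perplexity as the exponential of the mean per-batch loss. A minimal sketch of that last step, using only the standard library and made-up loss values (`perplexity_from_losses` is a hypothetical helper, not part of the notebook):

```python
import math


def perplexity_from_losses(batch_losses):
    """Perplexity as exp of the mean cross-entropy loss over batches."""
    mean_loss = sum(batch_losses) / len(batch_losses)
    return math.exp(mean_loss)


# Made-up per-batch losses, chosen so the mean is ln(8).
losses = [math.log(4), math.log(16)]
print(perplexity_from_losses(losses))
```

A useful reference point: a model that spreads its probability uniformly over k tokens has a cross-entropy of ln(k) and therefore a perplexity of k, so lower values mean the model is less "surprised" by the held-out dialogues.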

Main runner

In [17]:
def main(df_trn, df_val):
    args = Args()
    
    if args.should_continue:
        sorted_checkpoints = _sorted_checkpoints(args)
        if len(sorted_checkpoints) == 0:
            raise ValueError("Used --should_continue but no checkpoint was found in --output_dir.")
        else:
            args.model_name_or_path = sorted_checkpoints[-1]

    if (
        os.path.exists(args.output_dir)
        and os.listdir(args.output_dir)
        and args.do_train
        and not args.overwrite_output_dir
        and not args.should_continue
    ):
        raise ValueError(
            "Output directory ({}) already exists and is not empty. Use --overwrite_output_dir to overcome.".format(
                args.output_dir
            )
        )

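    # Setup CUDA, GPU & distributed training (fall back to CPU if no GPU is available or no_cuda is set)
    device = torch.device("cuda" if torch.cuda.is_available() and not args.no_cuda else "cpu")
    args.n_gpu = 0 if args.no_cuda else torch.cuda.device_count()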
    args.device = device

    # Setup logging
    logging.basicConfig(
        format="%(asctime)s - %(levelname)s - %(name)s -   %(message)s",
        datefmt="%m/%d/%Y %H:%M:%S",
        level=logging.INFO if args.local_rank in [-1, 0] else logging.WARN,
    )
    logger.warning(
        "Process rank: %s, device: %s, n_gpu: %s, distributed training: %s, 16-bits training: %s",
        args.local_rank,
        device,
        args.n_gpu,
        bool(args.local_rank != -1),
        args.fp16,
    )

    # Set seed
    set_seed(args)

    config = AutoConfig.from_pretrained(args.config_name, cache_dir=args.cache_dir)
    tokenizer = AutoTokenizer.from_pretrained(args.tokenizer_name, cache_dir=args.cache_dir)
    model = AutoModelWithLMHead.from_pretrained(
        args.model_name_or_path,
        from_tf=False,
        config=config,
        cache_dir=args.cache_dir,
    )
    model.to(args.device)
    
    logger.info("Training/evaluation parameters %s", args)

    # Training
    if args.do_train:
        train_dataset = load_and_cache_examples(args, tokenizer, df_trn, df_val, evaluate=False)

        global_step, tr_loss = train(args, train_dataset, model, tokenizer)
        logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)

    # Saving best practices: if you use save_pretrained for the model and tokenizer, you can reload them using from_pretrained()
    if args.do_train:
        # Create the output directory if needed
        os.makedirs(args.output_dir, exist_ok=True)

        logger.info("Saving model checkpoint to %s", args.output_dir)
        # Save the trained model, configuration, and tokenizer using save_pretrained().
        # They can then be reloaded using from_pretrained().
        model_to_save = (
            model.module if hasattr(model, "module") else model
        )  # Take care of distributed/parallel training
        model_to_save.save_pretrained(args.output_dir)
        tokenizer.save_pretrained(args.output_dir)

        # Good practice: save your training arguments together with the trained model
        torch.save(args, os.path.join(args.output_dir, "training_args.bin"))

        # Load the fine-tuned model and vocabulary
        model = AutoModelWithLMHead.from_pretrained(args.output_dir)
        tokenizer = AutoTokenizer.from_pretrained(args.output_dir)
        model.to(args.device)

    # Evaluation
    results = {}
    if args.do_eval and args.local_rank in [-1, 0]:
        checkpoints = [args.output_dir]
        if args.eval_all_checkpoints:
            checkpoints = list(
                os.path.dirname(c) for c in sorted(glob.glob(args.output_dir + "/**/" + WEIGHTS_NAME, recursive=True))
            )
            logging.getLogger("transformers.modeling_utils").setLevel(logging.WARN)  # Reduce logging
        logger.info("Evaluate the following checkpoints: %s", checkpoints)
        for checkpoint in checkpoints:
            global_step = checkpoint.split("-")[-1] if len(checkpoints) > 1 else ""
            prefix = checkpoint.split("/")[-1] if checkpoint.find("checkpoint") != -1 else ""

            model = AutoModelWithLMHead.from_pretrained(checkpoint)
            model.to(args.device)
            result = evaluate(args, model, tokenizer, df_trn, df_val, prefix=prefix)
            result = dict((k + "_{}".format(global_step), v) for k, v in result.items())
            results.update(result)

    return results
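
The evaluation loop in `main` suffixes each metric name with a value derived from the checkpoint path, which is why a single-checkpoint run returns the key `perplexity_` with nothing after the underscore. The sketch below mirrors only that naming logic. The helper name `eval_result_keys` and the `checkpoint-500` style directory names are assumptions made for illustration.

```python
def eval_result_keys(checkpoints, metric="perplexity"):
    """Mirror how main() derives the suffix of each result key."""
    keys = []
    for checkpoint in checkpoints:
        # Same expression as in main(): empty suffix when there is one checkpoint
        global_step = checkpoint.split("-")[-1] if len(checkpoints) > 1 else ""
        keys.append(metric + "_{}".format(global_step))
    return keys

# Single checkpoint (the default path through main)
print(eval_result_keys(["output-small"]))
# → ['perplexity_']

# Several checkpoints (hypothetical directory names)
print(eval_result_keys(["output-small/checkpoint-500", "output-small/checkpoint-1000"]))
# → ['perplexity_500', 'perplexity_1000']
```

Because the suffix is whatever follows the last hyphen, a directory such as `output-small` evaluated alongside other checkpoints would produce the key `perplexity_small`.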

Time to train our model!

![alt text](https://media.giphy.com/media/Tia3dkakIp2m4uGoDI/giphy.gif)

In [18]:
main(trn_df, val_df)

02/20/2021 21:29:44 - INFO - filelock -   Lock 139725378292536 acquired on cached/c3a09526c725b854c685b72cf60c50f1fea9b0e4d6227fa41573425ef4bd4bc6.36332ed8c811a8f0488055ce93ce78909e2f8e027955fba2faa82bfc6728105b.lock
02/20/2021 21:29:44 - INFO - transformers.file_utils -   https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/DialoGPT-small/config.json not found in cache or force_download set to True, downloading to /content/cached/tmp55xrd886


02/20/2021 21:29:44 - INFO - transformers.file_utils -   storing https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/DialoGPT-small/config.json in cache at cached/c3a09526c725b854c685b72cf60c50f1fea9b0e4d6227fa41573425ef4bd4bc6.36332ed8c811a8f0488055ce93ce78909e2f8e027955fba2faa82bfc6728105b
02/20/2021 21:29:44 - INFO - transformers.file_utils -   creating metadata file for cached/c3a09526c725b854c685b72cf60c50f1fea9b0e4d6227fa41573425ef4bd4bc6.36332ed8c811a8f0488055ce93ce78909e2f8e027955fba2faa82bfc6728105b
02/20/2021 21:29:44 - INFO - filelock -   Lock 139725378292536 released on cached/c3a09526c725b854c685b72cf60c50f1fea9b0e4d6227fa41573425ef4bd4bc6.36332ed8c811a8f0488055ce93ce78909e2f8e027955fba2faa82bfc6728105b.lock
02/20/2021 21:29:44 - INFO - transformers.configuration_utils -   loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/DialoGPT-small/config.json from cache at cached/c3a09526c725b854c685b72cf60c50f1fea9b0e4d6227fa41573425e




02/20/2021 21:29:45 - INFO - transformers.file_utils -   storing https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/DialoGPT-small/vocab.json in cache at cached/78725a31b87003f46d5bffc3157ebd6993290e4cfb7002b5f0e52bb0f0d9c2dd.1512018be4ba4e8726e41b9145129dc30651ea4fec86aa61f4b9f40bf94eac71
02/20/2021 21:29:45 - INFO - transformers.file_utils -   creating metadata file for cached/78725a31b87003f46d5bffc3157ebd6993290e4cfb7002b5f0e52bb0f0d9c2dd.1512018be4ba4e8726e41b9145129dc30651ea4fec86aa61f4b9f40bf94eac71
02/20/2021 21:29:45 - INFO - filelock -   Lock 139725377142512 released on cached/78725a31b87003f46d5bffc3157ebd6993290e4cfb7002b5f0e52bb0f0d9c2dd.1512018be4ba4e8726e41b9145129dc30651ea4fec86aa61f4b9f40bf94eac71.lock
02/20/2021 21:29:45 - INFO - filelock -   Lock 139725377139768 acquired on cached/570e31eddfc57062e4d0c5b078d44f97c0e5ac48f83a2958142849b59df6bbe6.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda.lock
02/20/2021 21:29:45 - INFO - transformers.f




02/20/2021 21:29:45 - INFO - transformers.file_utils -   storing https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/DialoGPT-small/merges.txt in cache at cached/570e31eddfc57062e4d0c5b078d44f97c0e5ac48f83a2958142849b59df6bbe6.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
02/20/2021 21:29:45 - INFO - transformers.file_utils -   creating metadata file for cached/570e31eddfc57062e4d0c5b078d44f97c0e5ac48f83a2958142849b59df6bbe6.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
02/20/2021 21:29:45 - INFO - filelock -   Lock 139725377139768 released on cached/570e31eddfc57062e4d0c5b078d44f97c0e5ac48f83a2958142849b59df6bbe6.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda.lock
02/20/2021 21:29:45 - INFO - transformers.tokenization_utils -   loading file https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/DialoGPT-small/vocab.json from cache at cached/78725a31b87003f46d5bffc3157ebd6993290e4cfb7002b5f0e52bb0f0d9c2dd.1512018b

02/20/2021 21:29:50 - INFO - transformers.file_utils -   storing https://cdn.huggingface.co/microsoft/DialoGPT-small/pytorch_model.bin in cache at cached/9eab12d0b721ee394e9fe577f35d9b8b22de89e1d4f6a89b8a76d6e1a82bceae.906a78bee3add2ff536ac7ef16753bb3afb3a1cf8c26470f335b7c0e46a21483
02/20/2021 21:29:50 - INFO - transformers.file_utils -   creating metadata file for cached/9eab12d0b721ee394e9fe577f35d9b8b22de89e1d4f6a89b8a76d6e1a82bceae.906a78bee3add2ff536ac7ef16753bb3afb3a1cf8c26470f335b7c0e46a21483
02/20/2021 21:29:50 - INFO - filelock -   Lock 139725377141616 released on cached/9eab12d0b721ee394e9fe577f35d9b8b22de89e1d4f6a89b8a76d6e1a82bceae.906a78bee3add2ff536ac7ef16753bb3afb3a1cf8c26470f335b7c0e46a21483.lock
02/20/2021 21:29:50 - INFO - transformers.modeling_utils -   loading weights file https://cdn.huggingface.co/microsoft/DialoGPT-small/pytorch_model.bin from cache at cached/9eab12d0b721ee394e9fe577f35d9b8b22de89e1d4f6a89b8a76d6e1a82bceae.906a78bee3add2ff536ac7ef16753bb3afb3a1cf




02/20/2021 21:29:54 - INFO - transformers.modeling_utils -   Weights of GPT2LMHeadModel not initialized from pretrained model: ['transformer.h.0.attn.masked_bias', 'transformer.h.1.attn.masked_bias', 'transformer.h.2.attn.masked_bias', 'transformer.h.3.attn.masked_bias', 'transformer.h.4.attn.masked_bias', 'transformer.h.5.attn.masked_bias', 'transformer.h.6.attn.masked_bias', 'transformer.h.7.attn.masked_bias', 'transformer.h.8.attn.masked_bias', 'transformer.h.9.attn.masked_bias', 'transformer.h.10.attn.masked_bias', 'transformer.h.11.attn.masked_bias']
02/20/2021 21:30:08 - INFO - __main__ -   Training/evaluation parameters <__main__.Args object at 0x7f14598a0438>
02/20/2021 21:30:08 - INFO - __main__ -   Creating features from dataset file at cached
02/20/2021 21:30:10 - INFO - __main__ -   Saving features into cached file cached/gpt2_cached_lm_512
02/20/2021 21:30:10 - INFO - __main__ -   ***** Running training *****
02/20/2021 21:30:10 - INFO - __main__ -     Num examples = 1708


[Progress bars: 3 epochs, 427 iterations per epoch]

02/20/2021 21:37:03 - INFO - __main__ -    global_step = 1281, average loss = 2.0120008991715688
02/20/2021 21:37:03 - INFO - __main__ -   Saving model checkpoint to output-small
02/20/2021 21:37:03 - INFO - transformers.configuration_utils -   Configuration saved in output-small/config.json






02/20/2021 21:37:05 - INFO - transformers.modeling_utils -   Model weights saved in output-small/pytorch_model.bin
02/20/2021 21:37:05 - INFO - transformers.configuration_utils -   loading configuration file output-small/config.json
02/20/2021 21:37:05 - INFO - transformers.configuration_utils -   Model config GPT2Config {
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_layer": 12,
  "n_positions": 1024,
  "resid_pdrop": 0.1,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "conversational": {
      "max_length": 1000
    }
  },
  "vocab_size": 50257
}

02/20/2021 21:37:05 - INFO - trans

[Progress bar: Evaluating, 47 iterations]

02/20/2021 21:37:19 - INFO - __main__ -   ***** Eval results  *****
02/20/2021 21:37:19 - INFO - __main__ -     perplexity = tensor(3.7678)





{'perplexity_': tensor(3.7678)}
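
The numbers in the training log fit together with simple arithmetic: 1708 examples, 427 iterations per epoch, and 3 epochs give the final `global_step` of 1281. The sketch below reproduces that calculation. The batch size of 4 is inferred from 1708 / 427 rather than read from the log, and the helper `total_optimizer_steps` along with its assumption of one optimizer step per batch (no gradient accumulation) is hypothetical.

```python
import math

def total_optimizer_steps(num_examples, batch_size, num_epochs, grad_accum_steps=1):
    """Estimate optimizer steps for a run, assuming a partial final batch is kept."""
    batches_per_epoch = math.ceil(num_examples / batch_size)
    return (batches_per_epoch // grad_accum_steps) * num_epochs

num_examples = 1708          # "Num examples" in the log
iterations_per_epoch = 427   # length of each "Iteration" progress bar
num_epochs = 3               # length of the "Epoch" progress bar

# Inferred value, not logged: 1708 / 427
batch_size = num_examples // iterations_per_epoch
print(batch_size)

print(total_optimizer_steps(num_examples, batch_size, num_epochs))
```

The same helper gives a quick estimate of how long a run on a different dataset will take before you start it.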

# Chatting with Rick

The model is ready, so it's time to chat with Rick. But don't forget that Rick can be rude. I warned you.

A variety of methods can be used to generate responses. You can find more details about these methods at this [link](https://huggingface.co/blog/how-to-generate).

![alt text](https://media.giphy.com/media/ftl8NiWz2C3tuXOn3w/giphy.gif)


In [19]:
tokenizer = AutoTokenizer.from_pretrained('microsoft/DialoGPT-small')
model = AutoModelWithLMHead.from_pretrained('output-small')

02/20/2021 21:37:27 - INFO - transformers.configuration_utils -   loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/microsoft/DialoGPT-small/config.json from cache at /root/.cache/torch/transformers/c3a09526c725b854c685b72cf60c50f1fea9b0e4d6227fa41573425ef4bd4bc6.36332ed8c811a8f0488055ce93ce78909e2f8e027955fba2faa82bfc6728105b
02/20/2021 21:37:27 - INFO - transformers.configuration_utils -   Model config GPT2Config {
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_layer": 12,
  "n_positions": 1024,
  "resid_pdrop": 0.1,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_param

In [20]:
for step in range(5):
    # Encode the new user input and append the end-of-sequence token
    new_user_input_ids = tokenizer.encode(input(">> Black:") + tokenizer.eos_token, return_tensors='pt')

    # Append the new user input to the chat history (there is no history on the first turn)
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # Generate a response by sampling with top-k and top-p (nucleus) filtering
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=200,
        pad_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=3,
        do_sample=True,
        top_k=100,
        top_p=0.7,
        temperature=0.8,
    )

    # Pretty-print the last output tokens from the bot
    print("RickBot: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))

>> Black:Hi
RickBot: Hiya.
>> Black:How are you?
RickBot: I'm fine. How are you today?
>> Black:Fine
RickBot: You're my hero, Morty. I mean, really.
>> Black:Okey
RickBot: Whoa, whoa! Whoooa! Whoa! Who's there?!
>> Black:I am Black
RickBot: I-I-Am!
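
The `generate` call above samples each next token after applying `temperature`, `top_k`, and `top_p`. The following is a simplified, pure-Python sketch of what those three settings do to a next-token distribution. It is not the Transformers implementation, and the function name `filter_next_token_probs` and the toy logits are hypothetical.

```python
import math
import random

def filter_next_token_probs(logits, top_k=100, top_p=0.7, temperature=0.8):
    """Return {token_id: probability} after temperature, top-k and top-p filtering."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring token ids
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax over the surviving tokens (subtract the max for numerical stability)
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p
    kept, cumulative = [], 0.0
    for token_id, p in zip(ranked, probs):
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1
    norm = sum(p for _, p in kept)
    return {token_id: p / norm for token_id, p in kept}

# Toy logits over a hypothetical 5-token vocabulary
dist = filter_next_token_probs([2.0, 1.0, 0.5, -1.0, -3.0], top_k=3)
next_token = random.choices(list(dist), weights=list(dist.values()))[0]
print(dist, next_token)
```

Lowering `top_p` or `temperature` makes the replies more predictable, while raising them lets less likely tokens through, which suits a character as erratic as Rick.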


**Here are a few more dialogues to show that Ricky is now our Rick.**

![alt text](https://media.giphy.com/media/eNpXWzGIMRjIo4lXT8/giphy.gif)

**Black**:Who are you? <br>
**RickBot**: A musician, I'm not a musician, but I'm pretty close to you. <br>
**Black**:What kind of musician are you? <br>
**RickBot**: Who cares? I'm a musician. <br>
**Black**:Ok, play a song for me <br>
**RickBot**: Shut up and listen to me, Morty. Listen to me. Listen! I'm the leader of a bunch of different death worlds, and I'm just trying to assemble a band. I'm gonna make this one-piece, and you're gonna be part of it. <br>
**Black**:ok <br>
**RickBot**: Whoa, whoa, hey, who's this? <br>
**Black**:This is Morty <br>
**RickBot**: Geez!!! <br>

![alt text](https://i.kym-cdn.com/news/posts/desktop/000/000/247/Screenshot_21.jpg)

[news.knowyourmeme.com](https://news.knowyourmeme.com/news/elon-musks-rick-and-morty-cameo-is-a-meme-reference)

**Black**:What do you think about Elon Musk? <br>
**RickBot**: I think about him a lot. <br>
**Black**:But what exactly do you think? <br>
**RickBot**: He's not a bad guy. He's just... not a very charismatic one. <br>
**Black**:I think he is charismatic <br>
**RickBot**: He is. <br>
**Black**:What else you can say about him? <br>
**RickBot**: You don't have to say anything. He is the leader of your class. <br>
**Black**:ok, thank you <br>
**RickBot**: !!!

![alt text](https://media.giphy.com/media/YSTmV6MWEkgW0HblZQ/giphy.gif)


**Black**:hey, go get me a burger <br>
**RickBot**: Whoa, whoa, hold on a second, let's not get crazy. This is gonna be good. <br>
**Black**:I'm hungry! c'mon, buy me a burger or better two <br>
**RickBot**: Holy crap. <br>
**Black**:no crappy burger, a good one! <br>
**RickBot**: What the hell? <br>
**Black**:don't be so bad, feed me <br>
**RickBot**: I'll be back as fast as I can, Morty. <br>
**Black**:good boy, and don't forget ketchup and fries <br>
**RickBot**: !!!!!! <br>
**Black**:what a hell? can you reply with words? <br>
**RickBot**: Yes!! I will!! We will go out on a date. <br>

# **Conclusion**

![alt text](https://media.giphy.com/media/YQBWwGLJtFDQIATck5/giphy.gif)



**Hooray!!!**
* Our virtual Rick is alive (almost)! By fine-tuning our model on a small dataset, we were able to create a virtual character with whom we can hold interesting dialogues.

* Using the proposed approach, you can create many interesting virtual characters based on an arbitrary dialogue dataset (just a csv file with lines, one line per row).
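
To adapt this approach to your own character, the one-line-per-row CSV has to be turned into training examples that pair each response with the lines preceding it. The sketch below shows one way to do that with the standard library. The column name `line`, the helper `build_context_rows`, the context size, and the sample dialogue are all assumptions for illustration, not the notebook's exact preprocessing.

```python
import csv
import io

def build_context_rows(lines, n_context=2):
    """Pair each line with the n_context lines that precede it.

    Each row is [response, previous line, line before that, ...].
    """
    rows = []
    for i in range(n_context, len(lines)):
        rows.append([lines[i - k] for k in range(n_context + 1)])
    return rows

# Hypothetical CSV with a single "line" column, one utterance per row
sample_csv = 'line\nHi Rick\n"What do you want, Morty?"\nA time machine\nNo.\n'
lines = [row["line"] for row in csv.DictReader(io.StringIO(sample_csv))]

for row in build_context_rows(lines):
    print(row)
```

A larger `n_context` gives the model more conversation history per example, at the cost of fewer usable rows from a short transcript.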