<img src="imgs/hpe_logo.png" alt="HPE Logo" width="300">

# Data Science Summit Workshop: 
## Creating chatbots by finetuning GPT models with Machine Learning Development Environment
 ----

## Objective: Train your own chatbot
This notebook walks you through finetuning your own chatbot.
We will learn what Generative Pretrained Transformers (GPT) are and how you can finetune them for domain-specific chatbots.

## Motivation: The rise of generative language modeling
Chatbots can be useful across many enterprise applications:
* `Enterprise`: Chatbots for helpdesk support
* `Healthcare`: Chatbots for scheduling appointments, managing coverage, and processing claims
* `Manufacturing`: Chatbots for checking supplies and inventory
* `Financial Services`: Chatbots for investment and account support

Generative language models like GPT-4 and ChatGPT enable exciting applications that were not possible previously! Unfortunately, enterprises can't use ChatGPT and GPT-4 if the models need to analyze proprietary data. Here we show how to finetune open source GPT models for domain-specific applications and host them on prem.

The Machine Learning Development Environment can help data scientists and ML engineers finetune language models for enterprise use cases!

## Why MLDE

Developing robust, high-performing Deep Learning (DL) applications is challenging. Deploying DL applications successfully requires great infrastructure. Building and managing distributed training, automatic checkpointing, hyperparameter search, and metrics tracking is critical, and it is challenging for small teams.

Machine Learning Development Environment (MLDE) can remove the burden of writing and maintaining a custom training infrastructure and offers a streamlined approach to onboard new models to a state-of-the-art training platform, offering the following integrated platform features:

<img src="imgs/det_components.jpg" alt="Determined Components" width="900">

MLDE provides high-level framework APIs for PyTorch, Keras, and Estimators that let users describe their model without boilerplate code. Its built-in training loop provides distributed training, hyperparameter search, automatic mixed precision, reproducibility, and many more features.

## Overview of Workshop

* Step 1: Overview of GPT models
* Step 2: Test a Data Science Chatbot using pretrained weights
* Step 3: Dive into what model training looks like without Determined
* Step 4: Overview of integrating PyTorch training code into MLDE
* Step 5 - 6: Update the model configuration files and complete the User Task
* Step 7: Finetuning a Chatbot on a Data Science Textbook
* Step 8: Launch a distributed training Experiment
* Step 9: Test a Chatbot to convert English text to LaTeX
* Step 10: Explore and preprocess a custom dataset for finetuning
* Step 11: Finetune Chatbot to convert English text to LaTeX
* Step 12: Run inference on finetuned model
* Step 13 (Optional): Improve inference with few shot prompting
* User Exercise 1: Finetune a chatbot on a book of your choice


Let's get started!

This Demo is based on the following works: 

* https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm_no_trainer.py
* https://github.com/sinanuozdemir/oreilly-transformers-video-series/blob/main/notebooks/8%20Hands_on_GPT.ipynb


---

In [None]:
from transformers import GPT2Tokenizer, TextDataset, DataCollatorForLanguageModeling, GPT2LMHeadModel, pipeline, \
                         Trainer, TrainingArguments
import torch
from datasets import Dataset
import pandas as pd
from determined.experimental import Determined
from utils import load_model_from_checkpoint
from determined.experimental import client
from determined import pytorch

# Step 1: Overview of GPT models

<img src="imgs/openAI-gpt2.png" alt="OpenAI GPT-2" width="900">

GPT2 is a transformer-based language model created by OpenAI. The model is a causal (unidirectional) transformer pre-trained using language modeling on a large corpus with long range dependencies.

Since GPT2 was released, many larger GPT variants have been open sourced. Examples available on Hugging Face include GPT2-Medium, GPT2-Large, and GPT2-XL.

For a deeper dive into the model architecture, take a look at this article: http://jalammar.github.io/illustrated-gpt2/
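
To make "causal (unidirectional)" concrete, here is a minimal sketch of the attention pattern such a model uses: the token at position `i` may only look at positions `0..i`, never at later tokens. The helper name `causal_mask` is illustrative and not part of any library.

```python
def causal_mask(seq_len):
    """Return a seq_len x seq_len matrix: entry [i][j] is 1 if token i may attend to token j."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]

tokens = ["A", "test", "statistic", "is"]
mask = causal_mask(len(tokens))
for token, row in zip(tokens, mask):
    # Collect the tokens that are visible from this position.
    visible = [t for t, allowed in zip(tokens, row) if allowed]
    print(f"{token!r} attends to {visible}")
```

Because each position only sees its left context, the model can be trained to predict the next token at every position of a sequence at once.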

# Step 2: Test a Data Science Chatbot using pretrained weights

We will load a pretrained model and see how it responds to data science questions. The model was pretrained on the WebText dataset (about 8 million documents collected from 45 million links, roughly 40 GB of text).
More info about the pretrained model can be viewed here: https://huggingface.co/docs/transformers/model_doc/gpt2

Prompt we will run: `A test statistic is`

In [None]:
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')  # load up a GPT2 model
pretrained_generator = pipeline(
    'text-generation', model=model, tokenizer='gpt2',
    config={'max_length': 200, 'do_sample': True, 'top_p': 0.9, 'temperature': 0.7, 'top_k': 10}
)

In [None]:
PROMPT='A test statistic is'

In [None]:
# Run cell to see how model responds to prompt
print('----------')
for generated_sequence in pretrained_generator(PROMPT, num_return_sequences=3):
    print(generated_sequence['generated_text'])
    print('----------')

We can see that the model produces fluent text, but for the most part the responses are nonsense and not usable as answers.
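
The generation settings used above (`temperature`, `top_k`, `top_p`) all reshape the next-token distribution before a token is sampled. The following is a simplified sketch of that idea in plain Python, not the Hugging Face implementation; the function name `filter_distribution` and the toy logits are made up for illustration.

```python
import math

def filter_distribution(logits, temperature=0.7, top_k=10, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature < 1 sharpens the distribution, > 1 flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # top-k: keep only the k most likely tokens, most likely first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    top_total = sum(probs[i] for i in ranked)
    # top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / top_total
        if cumulative >= top_p:
            break
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}

toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # hypothetical scores for a 5-token vocabulary
print(filter_distribution(toy_logits))
```

Lower values of all three settings make the output more conservative; higher values make it more varied.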

# Step 3: Dive into what model training looks like without Determined
If a data scientist were to write their own training code, this is what it would look like:

```python
import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader, RandomSampler
from tqdm import tqdm, trange
from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel, GPT2Tokenizer,
                          TextDataset, get_linear_schedule_with_warmup, set_seed)

# Settings that the training code has to manage by hand
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
train_batch_size, num_train_epochs, gradient_accumulation_steps = 32, 10, 1
learning_rate, adam_epsilon, warmup_steps = 5e-5, 1e-8, 0
fp16, local_rank = False, -1  # fp16 requires NVIDIA apex (`from apex import amp`)

set_seed(0)
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
tokenizer.pad_token = tokenizer.eos_token

dataset = TextDataset(
    tokenizer=tokenizer,
    file_path='./data/PDS2.txt',  # Principles of Data Science - Sinan Ozdemir
    block_size=32  # length of each chunk of text to use as a datapoint
)

model = GPT2LMHeadModel.from_pretrained('gpt2')
model.to(device)

train_sampler = RandomSampler(dataset)
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
train_dataloader = DataLoader(dataset, collate_fn=data_collator, sampler=train_sampler, batch_size=train_batch_size)

# Total number of optimizer steps over the whole run
t_total = len(train_dataloader) // gradient_accumulation_steps * num_train_epochs
# Prepare optimizer and schedule (linear warmup and decay)
optimizer = AdamW(model.parameters(), lr=learning_rate, eps=adam_epsilon)
scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=warmup_steps, num_training_steps=t_total)

model.resize_token_embeddings(len(tokenizer))
model.zero_grad()
train_iterator = trange(int(num_train_epochs), desc="Epoch", disable=local_rank not in [-1, 0])
for _ in train_iterator:
    epoch_iterator = tqdm(train_dataloader, desc="Iteration", disable=local_rank not in [-1, 0])
    for step, batch in enumerate(epoch_iterator):
        # batch is a dict that contains 'input_ids' and 'labels'
        inputs = batch['input_ids'].to(device)
        labels = batch['labels'].to(device)
        model.train()

        outputs = model(inputs, labels=labels)
        loss = outputs[0]  # the loss is the first element of the model output
        if fp16:
            with amp.scale_loss(loss, optimizer) as scaled_loss:
                scaled_loss.backward()
        else:
            loss.backward()
        # Update the weights and the learning rate, then reset the gradients
        if (step + 1) % gradient_accumulation_steps == 0:
            optimizer.step()
            scheduler.step()
            model.zero_grad()
```

Here is what it would look like to do hyperparameter search on the same training code:

```python
import numpy as np
import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader, RandomSampler
from tqdm import tqdm, trange
from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel, GPT2Tokenizer,
                          TextDataset, get_linear_schedule_with_warmup, set_seed)

# Fixed settings shared by every run
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
train_batch_size, num_train_epochs, gradient_accumulation_steps = 32, 10, 1
adam_epsilon, warmup_steps = 1e-8, 0
fp16, local_rank = False, -1  # fp16 requires NVIDIA apex (`from apex import amp`)

def train(lr, m):
    set_seed(0)
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    tokenizer.pad_token = tokenizer.eos_token

    dataset = TextDataset(
        tokenizer=tokenizer,
        file_path='./data/PDS2.txt',  # Principles of Data Science - Sinan Ozdemir
        block_size=32  # length of each chunk of text to use as a datapoint
    )

    model = GPT2LMHeadModel.from_pretrained('gpt2')
    model.to(device)
    train_sampler = RandomSampler(dataset)
    data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
    train_dataloader = DataLoader(dataset, collate_fn=data_collator, sampler=train_sampler, batch_size=train_batch_size)

    t_total = len(train_dataloader) // gradient_accumulation_steps * num_train_epochs

    # m is used as Adam's first-moment decay rate (beta1), its momentum-like term
    optimizer = AdamW(model.parameters(), lr=lr, betas=(m, 0.999), eps=adam_epsilon)
    scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=warmup_steps, num_training_steps=t_total)

    model.resize_token_embeddings(len(tokenizer))
    model.zero_grad()
    train_iterator = trange(int(num_train_epochs), desc="Epoch", disable=local_rank not in [-1, 0])
    for _ in train_iterator:
        epoch_iterator = tqdm(train_dataloader, desc="Iteration", disable=local_rank not in [-1, 0])
        for step, batch in enumerate(epoch_iterator):
            # batch is a dict that contains 'input_ids' and 'labels'
            inputs = batch['input_ids'].to(device)
            labels = batch['labels'].to(device)
            model.train()

            outputs = model(inputs, labels=labels)
            loss = outputs[0]  # the loss is the first element of the model output
            if fp16:
                with amp.scale_loss(loss, optimizer) as scaled_loss:
                    scaled_loss.backward()
            else:
                loss.backward()
            if (step + 1) % gradient_accumulation_steps == 0:
                optimizer.step()
                scheduler.step()
                model.zero_grad()
    return model, loss.item()

def hp_grid_search():
    for lr in np.logspace(-4, -2, num=10):
        for m in np.linspace(0.7, 0.95, num=10):
            print(f"Training model with learning rate {lr} and momentum {m}")
            model, loss = train(lr, m)
            print(f"Train Loss: {loss}\n")

try:
    hp_grid_search()
except KeyboardInterrupt:
    pass
```

#### What's Missing?
This approach works in theory -- we could get a good model, save it, and use it for predictions. But we're missing a lot from the ideal state:

* Distributed training
* Parallel search
* Intelligent checkpointing
* Interruptibility and fault tolerance
* Logging of experiment configurations and results
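
To get a feel for why parallel search matters, the sketch below enumerates the same 10 x 10 grid as the hand-written search above, using only the standard library. The `logspace` and `linspace` helpers are small stand-ins for the NumPy functions, and `minutes_per_run` is a purely hypothetical duration.

```python
import itertools

def logspace(start_exp, stop_exp, num):
    """Return num values evenly spaced on a log10 scale, like numpy.logspace."""
    step = (stop_exp - start_exp) / (num - 1)
    return [10 ** (start_exp + i * step) for i in range(num)]

def linspace(start, stop, num):
    """Return num evenly spaced values from start to stop, like numpy.linspace."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

# Every (learning rate, momentum) pair is one full training run.
grid = list(itertools.product(logspace(-4, -2, 10), linspace(0.7, 0.95, 10)))
minutes_per_run = 10  # hypothetical duration of a single training run
hours = len(grid) * minutes_per_run / 60
print(f"{len(grid)} sequential runs, about {hours:.1f} hours in total")
```

Run one after another on a single machine, the cost grows with the product of the grid dimensions, and a crash partway through loses all progress unless checkpointing is added by hand.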

# Step 4: Overview of integrating PyTorch training code into MLDE

Here we will see how to implement the same training loop in MLDE, but automatically enable distributed training, automated checkpointing, and automatic hyperparameter search.

The main components for any deep learning training loop are the following:
* Datasets
* Dataloader
* Model
* Optimizer
* (Optional) Learning rate schedule
* Training a batch, evaluating a batch

We will show how to integrate each core part into MLDE using the PyTorchTrial API. Note that there is another API, the Core API, which offers more flexibility if your team wants to integrate more complex machine learning codebases.

### Template Class that integrates DL code with MLDE
The template class below is the class we need to fill in to implement our training loop:

```python
import filelock
import os
from typing import Any, Dict, Sequence, Tuple, Union, cast

import torch
import torch.nn as nn
from torch import optim
from determined.pytorch import DataLoader, PyTorchTrial, PyTorchTrialContext

import data

TorchData = Union[Dict[str, torch.Tensor], Sequence[torch.Tensor], torch.Tensor]

class GPT2Trial(PyTorchTrial):
    def __init__(self, context: PyTorchTrialContext) -> None:
        # Trial context contains info about the trial, such as the hyperparameters for training
        self.context = context
        
        # init and wrap model, optimizer, LRscheduler, datasets
       

    def build_training_data_loader(self) -> DataLoader:
        # create train dataloader from dataset
        return DataLoader()

    def build_validation_data_loader(self) -> DataLoader:
        # create validation dataloader from dataset
        return DataLoader()

    def train_batch(self, batch: TorchData, epoch_idx: int, batch_idx: int)  -> Dict[str, Any]:
        return {}

    def evaluate_batch(self, batch: TorchData) -> Dict[str, Any]:
        return {}
```

### Wrapping the Model
* Wrapping the model in the TrialContext allows MLDE to reduce boilerplate code by providing a training loop with distributed training, hyperparameter search, automatic mixed precision, reproducibility, and many more features
* All models, optimizers, and LR schedulers must be wrapped with `wrap_model`, `wrap_optimizer`, and `wrap_lr_scheduler` respectively

```python
self.model = GPT2LMHeadModel.from_pretrained('gpt2')
# Wrapping model to the TrialContext 
self.model = self.context.wrap_model(self.model)
```

### Wrapping the Optimizer

```python
self.optimizer = self.context.wrap_optimizer(
    AdamW(optimizer_grouped_parameters, lr=self.learning_rate, eps=self.adam_epsilon)
)
```

### Wrapping the Learning Rate Scheduler

```python
self.scheduler = self.context.wrap_lr_scheduler(
    get_linear_schedule_with_warmup(self.optimizer, num_warmup_steps=self.warmup_steps,
                                    num_training_steps=self.t_total),
    # MANUAL_STEP: the trial is responsible for calling self.scheduler.step() itself
    LRScheduler.StepMode.MANUAL_STEP
)
```

### Integrating a Dataset
Here we integrate the same TextDataset (used in Step 3), which formats and preprocesses the text file that we finetune our GPT model on.

```python
self.dataset = TextDataset(
    tokenizer=tokenizer,
    file_path='/run/determined/workdir/shared_fs/workshop_data/PDS2.txt',
    block_size=32  # length of each chunk of text to use as a datapoint
)
```

### Implement Train Dataloader and Validation Dataloader

```python
def build_training_data_loader(self) -> DataLoader:
    '''Return the DataLoader that feeds shuffled training batches to MLDE.'''
    self.train_sampler = RandomSampler(self.dataset)
    self.train_dataloader = DataLoader(self.dataset, collate_fn=self.data_collator, sampler=self.train_sampler, batch_size=self.train_batch_size)
    return self.train_dataloader

def build_validation_data_loader(self) -> DataLoader:
    '''Return the DataLoader that feeds validation batches in a fixed order.'''
    self.eval_sampler = SequentialSampler(self.dataset)
    self.validation_dataloader = DataLoader(self.dataset, collate_fn=self.data_collator, sampler=self.eval_sampler, batch_size=self.eval_batch_size)
    return self.validation_dataloader
```

### Implement Train Batch and Evaluate Batch

```python
def train_batch(self, batch, epoch_idx, batch_idx):
    '''Run the forward and backward pass for one batch and return the training loss.'''
    inputs, labels = self.format_batch(batch)
    outputs = self.model(inputs, labels=labels)
    loss = outputs[0]
    train_result = {
        'loss': loss
    }
    self.context.backward(train_result["loss"])
    self.context.step_optimizer(self.optimizer)
    # The scheduler is wrapped with MANUAL_STEP, so we advance it ourselves
    self.scheduler.step()
    return train_result

def evaluate_batch(self, batch):
    '''Compute the validation loss and perplexity for one batch.'''
    inputs, labels = self.format_batch(batch)
    outputs = self.model(inputs, labels=labels)
    lm_loss = outputs[0]
    eval_loss = lm_loss.mean().item()
    # Perplexity is the exponential of the cross-entropy loss
    perplexity = torch.exp(torch.tensor(eval_loss))

    results = {
        "eval_loss": eval_loss,
        "perplexity": perplexity
    }
    return results
```
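
The perplexity reported by `evaluate_batch` is the exponential of the cross-entropy loss. As a small worked example in plain Python (the probabilities are made up for illustration), a model that assigns probability 0.25 to every correct next token has a perplexity of 4, meaning it is as uncertain as a uniform choice among four tokens.

```python
import math

def perplexity(token_probabilities):
    """Perplexity is exp of the average negative log-likelihood of the correct tokens."""
    nll = [-math.log(p) for p in token_probabilities]
    return math.exp(sum(nll) / len(nll))

# Hypothetical probabilities a model assigned to the correct next token at four positions
uniform_guess = perplexity([0.25, 0.25, 0.25, 0.25])
confident_model = perplexity([0.9, 0.8, 0.95, 0.7])
print(uniform_guess, confident_model)
```

Lower is better for both metrics. Note that averaging per-batch perplexities is not the same as exponentiating the average loss, which is why `eval_loss` is the safer metric to search on.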

# Step 5: Defining Training Experiment with Model Config
In Determined, a trial is a training task that consists of a dataset, a deep learning model, and values for all of the model’s hyperparameters. An experiment is a collection of one or more trials: an experiment can either train a single model (with a single trial), or can define a search over a user-defined hyperparameter space.


Here is what a configuration file looks like for a single trial experiment:
```yaml
name: gpt2_finetune_data_science_chatbot
workspace: <your_workspace>
project: <your_project>
description: "DS Workshop"
hyperparameters:
    global_batch_size: 32
    weight_decay: 0.0
    learning_rate: 5e-5
    adam_epsilon: 1e-8
    warmup_steps: 0
    epochs: 10
    gradient_accumulation_steps: 1
    dataset_name: 'PDS2'
environment:
    image: "hugcyrill/workshops:chat_0.1"
records_per_epoch: 147 # 4696 examples total, shortening for experimentation
resources:
    slots_per_trial: 1
min_validation_period:
  batches: 4
min_checkpoint_period:
  batches: 147
searcher:
    name: single
    metric: eval_loss
    max_length:
        epochs: 30
    smaller_is_better: true
max_restarts: 0
entrypoint: model_def:GPT2Finetune
```
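
The numbers in this configuration are related by simple arithmetic, sketched below. The sketch only does the division; how MLDE itself treats a trailing partial batch is not covered here, so treat the rounding as an assumption.

```python
import math

total_examples = 4696      # size of the full dataset, from the comment in the config
global_batch_size = 32     # hyperparameters.global_batch_size
records_per_epoch = 147    # records_per_epoch

# Batches needed to see every example of the full dataset once
batches_full_dataset = math.ceil(total_examples / global_batch_size)
# Full batches that fit into the shortened epoch defined by records_per_epoch
full_batches_per_epoch = records_per_epoch // global_batch_size

print(batches_full_dataset)    # 147
print(full_batches_per_epoch)  # 4
```

With these values, `min_validation_period: batches: 4` corresponds to validating roughly once per shortened epoch.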

Here is what a configuration YAML file looks like for a hyperparameter search:
```yaml
name: adaptive_gpt2_finetune_data_science_chatbot
workspace: <your_workspace>
project: <your_project>
description: "DS Workshop"
hyperparameters:
    global_batch_size: 32
    weight_decay: 0.0
    learning_rate:
        type: log
        minval: -6.0
        maxval: -4.0
        base: 10.0
    adam_epsilon:
        type: log
        minval: -10.0
        maxval: -4.0
        base: 10.0
    warmup_steps: 0
    epochs: 10
    gradient_accumulation_steps: 1
    dataset_name: 'PDS2'
environment:
    image: "hugcyrill/workshops:chat_0.1"
records_per_epoch: 147 # 4696 examples total, shortening for experimentation
resources:
    slots_per_trial: 1
min_validation_period:
  batches: 4
min_checkpoint_period:
  batches: 147
searcher:
    name: adaptive_asha
    metric: eval_loss
    max_length:
        epochs: 30
    smaller_is_better: true
    max_trials: 4
max_restarts: 0
entrypoint: model_def:GPT2Finetune

```
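
A `type: log` hyperparameter is specified through exponents: `minval: -6.0`, `maxval: -4.0`, and `base: 10.0` describe learning rates between 1e-6 and 1e-4. The sketch below shows one way such a range can be sampled. How the `adaptive_asha` searcher actually picks values is internal to MLDE; drawing the exponent uniformly at random is an assumption made for illustration, and `sample_log_hyperparameter` is a hypothetical helper.

```python
import random

def sample_log_hyperparameter(minval, maxval, base=10.0, rng=random):
    """Draw an exponent uniformly from [minval, maxval] and return base ** exponent."""
    exponent = rng.uniform(minval, maxval)
    return base ** exponent

rng = random.Random(0)  # seeded so the sketch is repeatable
learning_rates = [sample_log_hyperparameter(-6.0, -4.0, rng=rng) for _ in range(4)]
for lr in learning_rates:
    print(f"{lr:.2e}")
```

Sampling the exponent rather than the value spreads trials evenly across orders of magnitude, which suits parameters such as the learning rate.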

# Step 6: Updating model configuration files

## Step 6.1: Update `const_ds_chatbot.yaml`
Please update the `const_ds_chatbot.yaml` file in the `determined_files/` folder: look for the `workspace` and `project` fields and replace the placeholders with the workspace name and project name you created during the preparation session:

```yaml
workspace: <your_workspace>
project: <your_project>
```

## Step 6.2: Update `const_eng_to_latex.yaml`
Update the `const_eng_to_latex.yaml` file in the `determined_files/` folder in the same manner: replace the placeholders with your workspace name and project name.

# Step 7: Finetuning a Chatbot on a Data Science Textbook
Run the cell below to start finetuning the chatbot on a data science book. We have already configured the experiment in `determined_files/const_ds_chatbot.yaml`, which trains on a text file called `PDS2.txt`. This is the text of the Packt Publishing book "Principles of Data Science". Link to the book: https://www.packtpub.com/product/principles-of-data-science/9781785887918

In [None]:
!det experiment create \
    determined_files/const_ds_chatbot.yaml \
    determined_files/ 

### Step 7.1: See the result
Once the experiment has completed in MLDE, replace `<EXP_ID>` with your experiment ID. Only then can you test your finetuned model on the book.

In [None]:
experiment_id = <EXP_ID>
MODEL_NAME = "gpt2"
checkpoint = client.get_experiment(experiment_id).top_checkpoint(sort_by="eval_loss", smaller_is_better=True)
print(checkpoint.uuid)
loaded_model = load_model_from_checkpoint(checkpoint)

In [None]:
finetuned_generator = pipeline(
    'text-generation', model=loaded_model, tokenizer=tokenizer,
    config={'max_length': 200,  'do_sample': True, 'top_p': 0.9, 'temperature': 0.7, 'top_k': 10}
)

Here is how the chatbot responds to a similar prompt after finetuning:

In [None]:
PROMPT='A test statistic is a value'

In [None]:
print('----------')
for generated_sequence in finetuned_generator(PROMPT, num_return_sequences=3):
    print(generated_sequence['generated_text'])
    print('----------')

# Step 8: Launch a distributed training Experiment
With Determined, scaling to a multi-GPU distributed training job requires only a single-line configuration change. There is no need to worry about setting up frameworks like Horovod or PyTorch Lightning.

The configuration for a 2-GPU distributed training job lives in `dist_ds_chatbot.yaml`. Copy `const_ds_chatbot.yaml` and rename the copy to `dist_ds_chatbot.yaml`. Change the <b>slots_per_trial field from 1 to 2</b> to run a distributed training job on 2 GPUs. The cell below launches the job. Make sure that the workspace and project are filled in:
```yaml
workspace: <your_workspace>
project: <your_project>
```

In [None]:
!det experiment create \
    determined_files/dist_ds_chatbot.yaml \
    determined_files/ 
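
When `slots_per_trial` is raised, `global_batch_size` stays the same and each GPU processes a share of it. A minimal sketch of that arithmetic follows, assuming the global batch is split evenly across slots; `per_slot_batch_size` is an illustrative helper, not an MLDE function.

```python
def per_slot_batch_size(global_batch_size, slots_per_trial):
    """Return the number of examples each GPU (slot) processes per training step."""
    if global_batch_size % slots_per_trial != 0:
        raise ValueError("global_batch_size should be divisible by slots_per_trial")
    return global_batch_size // slots_per_trial

print(per_slot_batch_size(32, 1))  # 32: single-GPU experiment
print(per_slot_batch_size(32, 2))  # 16: each of the 2 GPUs sees half of the batch
```

Since the global batch size is unchanged, the optimization behaves like the single-GPU run while each step takes less time.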

# Step 9: Test a Chatbot to convert English text to LaTeX

Here we download pretrained weights and see how GPT handles a request to convert an English description into LaTeX.

In [None]:
MODEL='gpt2'
non_finetuned_latex_generator = pipeline(
    'text-generation', 
    model=GPT2LMHeadModel.from_pretrained(MODEL),  # not fine-tuned!
    tokenizer=tokenizer
)

In [None]:
# Add our singular prompt
CONVERSION_PROMPT = 'LCT\n'  # LaTeX conversion task

CONVERSION_TOKEN = 'LaTeX:'

text_sample='x to the fourth power'

conversion_text_sample = f'{CONVERSION_PROMPT}English: {text_sample}\n{CONVERSION_TOKEN}'

print(conversion_text_sample)
print(non_finetuned_latex_generator(
    conversion_text_sample, num_beams=5, early_stopping=True, temperature=0.7,
    max_length=len(tokenizer.encode(conversion_text_sample)) + 20
)[0]['generated_text'])

# Step 10: Explore and preprocess a custom dataset for finetuning
To finetune the model on this niche task, we need a dataset. Here we look at the dataset and how we preprocess it for finetuning.

In [None]:
data = pd.read_csv('./data/english_to_latex.csv')

print(data.shape)

data.head(2)

This is how the dataset is preprocessed so that GPT learns the task via prompting:

In [None]:
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

tokenizer.pad_token = tokenizer.eos_token

# Add our singular prompt
CONVERSION_PROMPT = 'LCT\n'  # LaTeX conversion task

CONVERSION_TOKEN = 'LaTeX:'
# This is our "training prompt" that we want GPT2 to recognize and learn
training_examples = f'{CONVERSION_PROMPT}English: ' + data['English'] + '\n' + CONVERSION_TOKEN + ' ' + data['LaTeX'].astype(str)

print(training_examples[0])
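
The prompt template above can also be written as two small helper functions, which makes it easier to see that the inference prompt is simply a training example with the answer left off. This is a sketch in plain Python; the helper names and the sample LaTeX answer are illustrative and not taken from the dataset.

```python
CONVERSION_PROMPT = 'LCT\n'  # LaTeX conversion task
CONVERSION_TOKEN = 'LaTeX:'

def format_training_example(english, latex):
    """Build one training example: task prefix, English input, then the LaTeX answer."""
    return f'{CONVERSION_PROMPT}English: {english}\n{CONVERSION_TOKEN} {latex}'

def format_inference_prompt(english):
    """Build the prompt used at inference time; the model continues after the token."""
    return f'{CONVERSION_PROMPT}English: {english}\n{CONVERSION_TOKEN}'

example = format_training_example('x to the fourth power', 'x^{4}')
prompt = format_inference_prompt('x to the fourth power')
print(example)
```

Keeping both formats in one place helps ensure that training and inference use exactly the same prefix and token.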

# Step 11: Finetune Chatbot to convert English text to LaTeX
Now that the dataset is preprocessed, we can finetune the model.

In [None]:
!det experiment create \
    determined_files/const_eng_to_latex.yaml \
    determined_files/ 

# Step 12: Run inference on finetuned model
Once the experiment has completed in MLDE, replace `<EXP_ID>` with your experiment ID. Only then can you test your finetuned model on the English-to-LaTeX task.

In [None]:
# Get the best checkpoint from the training
experiment_id = <EXP_ID>
MODEL_NAME = "gpt2"
checkpoint = client.get_experiment(experiment_id).top_checkpoint(sort_by="eval_loss", smaller_is_better=True)
print(checkpoint.uuid)
loaded_model = load_model_from_checkpoint(checkpoint)

In [None]:
latex_generator = pipeline('text-generation', model=loaded_model, tokenizer=tokenizer)

In [None]:
text_sample = 'x to the fourth power'

conversion_text_sample = f'{CONVERSION_PROMPT}English: {text_sample}\n{CONVERSION_TOKEN}'
print(latex_generator(
    conversion_text_sample, num_beams=5, early_stopping=True, temperature=0.7,
    max_length=len(tokenizer.encode(conversion_text_sample)) + 20
)[0]['generated_text'])
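
The generated text contains the prompt followed by the model's continuation. A small helper, sketched below, can pull out just the LaTeX answer. `extract_latex` is a hypothetical name, and the sample string stands in for real model output.

```python
def extract_latex(generated_text, conversion_token='LaTeX:'):
    """Return the first line the model produced after the last conversion token."""
    if conversion_token not in generated_text:
        return ''
    # Keep only what follows the last occurrence of the token.
    _, _, completion = generated_text.rpartition(conversion_token)
    stripped = completion.strip()
    return stripped.splitlines()[0].strip() if stripped else ''

sample_output = 'LCT\nEnglish: x to the fourth power\nLaTeX: x^{4}\nLCT'  # made-up output
print(extract_latex(sample_output))
```

Cutting at the first line break discards anything the model keeps generating after the answer, such as the start of another example.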

# Step 13 (Optional): Improve inference with few shot prompting
Here we include examples of correct conversions in the prompt to give GPT additional context.

In [None]:
# Raw string so that the LaTeX backslashes are kept literally
few_shot_prompt = r"""LCT
English: f of x is sum from 0 to x of x squared
LaTeX: f(x) = \sum_{0}^{x} x^2
###
LCT
English: f of x equals integral from 0 to pi of x to the fourth power
LaTeX: f(x) = \int_{0}^{\pi} x^4 \,dx
###
LCT
English: f of x is x to the third power
LaTeX:"""

In [None]:
print(latex_generator(
    few_shot_prompt, num_beams=5, early_stopping=True, temperature=0.7,
    max_length=len(tokenizer.encode(few_shot_prompt)) + 20
)[0]['generated_text'])
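
A few-shot prompt like the one above can be assembled from a list of example pairs, which makes it easy to try different numbers of examples. This is a minimal sketch; `build_few_shot_prompt` is an illustrative helper and assumes the same `LCT` prefix and `###` separator used in this step.

```python
def build_few_shot_prompt(examples, query, separator='###'):
    """Join solved (english, latex) pairs and one unsolved query into a single prompt."""
    blocks = [f'LCT\nEnglish: {english}\nLaTeX: {latex}' for english, latex in examples]
    # The final block has no answer: the model is expected to complete it.
    blocks.append(f'LCT\nEnglish: {query}\nLaTeX:')
    return f'\n{separator}\n'.join(blocks)

examples = [
    ('f of x equals integral from 0 to pi of x to the fourth power',
     r'f(x) = \int_{0}^{\pi} x^4 \,dx'),
]
built_prompt = build_few_shot_prompt(examples, 'f of x is x to the third power')
print(built_prompt)
```

More examples give the model more context but also lengthen the prompt, so `max_length` has to grow with it.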

# User Exercise 1: Finetune a chatbot on a book of your choice
Let's form groups, download a text from the web, and train our own chatbot!

Steps to integrate a custom dataset:
* Go to Project Gutenberg and pick a book (e.g. https://www.gutenberg.org/ebooks/1787)
* Copy the URL of the "Plain Text UTF-8" `.txt` file and download it:
    - Example: `wget -O hamlet.txt https://www.gutenberg.org/cache/epub/1787/pg1787.txt`
* NOTE: Make sure that the file is a text file and that its name contains no spaces
* Copy the file to the shared directory: `cp hamlet.txt /run/determined/workdir/shared_fs/workshop_data/ -v`
* Copy `const_ds_chatbot.yaml` and rename the copy:
    - Example: `cp determined_files/const_ds_chatbot.yaml determined_files/const_hamlet_chatbot.yaml`
* In the new file, change the `name` field, which is the name of the experiment:
    - Example: `name: gpt2_finetune_hamlet_chatbot`
* Change `dataset_name` to the name of the text file (e.g. `hamlet`):
    - Do not include `.txt` in the dataset name
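
The naming rules in the list above can be checked before launching an experiment. The helper below is a hypothetical convenience, not part of the workshop code: it derives the `dataset_name` value from a file name and rejects names that break the rules.

```python
from pathlib import Path

def dataset_name_from_file(file_name):
    """Return the dataset_name for a text file, i.e. its name without the .txt suffix."""
    path = Path(file_name)
    if path.suffix != '.txt':
        raise ValueError(f'{file_name!r} is not a .txt file')
    if any(ch.isspace() for ch in path.name):
        raise ValueError(f'{file_name!r} must not contain spaces')
    return path.stem

print(dataset_name_from_file('hamlet.txt'))  # hamlet
```

A name that fails the check should be fixed by renaming the file before it is copied to the shared directory.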

In [None]:
!wget -O <CHANGE_NAME>.txt <URL_TO_TXT_FILE>

In [None]:
!cp <CHANGE_NAME>.txt /run/determined/workdir/shared_fs/workshop_data/ -v

In [None]:
!cp determined_files/const_ds_chatbot.yaml determined_files/const_<CHANGE_NAME>_chatbot.yaml

Edit the `name`, `workspace`, `project`, and `dataset_name` fields in `const_<CHANGE_NAME>_chatbot.yaml`:
```yaml
name: <CHANGE_EXPERIMENT_NAME>
workspace: <your_workspace>
project: <your_project>
description: "DS Workshop"
hyperparameters:
    global_batch_size: 32
    weight_decay: 0.0
    learning_rate: 5e-5
    adam_epsilon: 1e-8
    warmup_steps: 0
    epochs: 10
    device: 'cuda'
    gradient_accumulation_steps: 1
    dataset_name: <DATASET_NAME_TO_NEW_NAME>
    train_batch_size: 32
    eval_batch_size: 32
environment:
    image: "hugcyrill/workshops:chat_0.1"
records_per_epoch: 147 # 4696 examples total, shortening for experimentation
resources:
    slots_per_trial: 1
min_validation_period:
  batches: 4
min_checkpoint_period:
  batches: 147
searcher:
    name: single
    metric: eval_loss
    max_length:
        epochs: 30
    smaller_is_better: true
max_restarts: 0
entrypoint: model_def:GPT2Finetune
```

In [None]:
# Run finetuning job 
!det experiment create determined_files/const_<CHANGE_NAME>_chatbot.yaml determined_files/

Once the experiment has completed in MLDE, replace `<EXP_ID>` with your experiment ID. Only then can you test your finetuned model on your book.

In [None]:
# Run inference
experiment_id = <EXP_ID>
MODEL_NAME = "gpt2"
checkpoint = client.get_experiment(experiment_id).top_checkpoint(sort_by="eval_loss", smaller_is_better=True)
print(checkpoint.uuid)
loaded_model = load_model_from_checkpoint(checkpoint)

finetuned_generator = pipeline(
    'text-generation', model=loaded_model, tokenizer=tokenizer,
    config={'max_length': 200,  'do_sample': True, 'top_p': 0.9, 'temperature': 0.7, 'top_k': 10}
)

In [None]:
PROMPT='My name' # Example: PROMPT='Hamlet:'

In [None]:
print('----------')
for generated_sequence in finetuned_generator(PROMPT, num_return_sequences=3):
    print(generated_sequence['generated_text'])
    print('----------')