## Self-supervised fine-tuning
Goal is to train the Transformer as a Masked Language Model (MLM) in order to fine-tune the underlying language model. Will come back to if have time in the future

In [14]:
from transformers import AutoModelForMaskedLM
from transformers import AutoTokenizer
import torch
HF_HUB_DISABLE_SYMLINKS_WARNING = False

In [15]:
model_checkpoint = "distilbert-base-uncased"
model = AutoModelForMaskedLM.from_pretrained(model_checkpoint)

In [16]:
distilbert_num_parameters = model.num_parameters() / 1_000_000
print(f"'>>> DistilBERT number of parameters: {round(distilbert_num_parameters)}M'")
print(f"'>>> BERT number of parameters: 110M'")

'>>> DistilBERT number of parameters: 67M'
'>>> BERT number of parameters: 110M'


In [17]:
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

In [18]:
text = "This is an awful [MASK]."

In [19]:
inputs = tokenizer(text, return_tensors="pt")
token_logits = model(**inputs).logits
# Find the location of [MASK] and extract its logits
mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
mask_token_logits = token_logits[0, mask_token_index, :]
# Pick the [MASK] candidates with the highest logits
top_5_tokens = torch.topk(mask_token_logits, 5, dim=1).indices[0].tolist()

for token in top_5_tokens:
    print(f"'>>> {text.replace(tokenizer.mask_token, tokenizer.decode([token]))}'")

'>>> This is an awful sight.'
'>>> This is an awful thing.'
'>>> This is an awful idea.'
'>>> This is an awful coincidence.'
'>>> This is an awful mess.'


To be continued...
<hr>

# Supervised fine-tuning  
Following [this](https://github.com/huggingface/notebooks/blob/main/examples/summarization.ipynb) notebook

In [3]:
!pip install torch --user
!pip install transformers --user
!pip install datasets --user
!pip install evaluate --user
!pip install nltk --user
!pip install rouge_score --user

Collecting torch
  Downloading torch-1.13.1-cp37-cp37m-manylinux1_x86_64.whl (887.5 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m887.5/887.5 MB[0m [31m3.9 MB/s[0m eta [36m0:00:00[0m00:01[0m00:01[0m
[?25hCollecting nvidia-cuda-runtime-cu11==11.7.99
  Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m849.3/849.3 kB[0m [31m57.0 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting nvidia-cublas-cu11==11.10.3.66
  Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m317.1/317.1 MB[0m [31m10.4 MB/s[0m eta [36m0:00:00[0m00:01[0m00:01[0m
[?25hCollecting nvidia-cuda-nvrtc-cu11==11.7.99
  Downloading nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m21.0/21.0 MB[0m [31m75.9 MB/s[0m eta 



In [1]:
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
    DataCollatorForSeq2Seq,
)
import datasets
import evaluate
import pandas as pd
import numpy as np
import re
import nltk
from nltk import tokenize
HF_HUB_DISABLE_SYMLINKS_WARNING = False
no_deprecation_warning = True
#nltk.download('punkt')

## Loading, Cleaning, Preprocessing Data

In [2]:
files = ['data/Computing.csv','data/Economics.csv','data/Humanities.csv','data/Math.csv','data/Science.csv']
df_list = []

for file in files:
    df_list.append(pd.read_csv(file))
    
df = pd.concat(df_list).dropna()

In [3]:
len(df)

8261

In [4]:
def remove_phrase(text, phrase):
    result = " ".join(list(filter(lambda x : phrase.lower() not in x.lower(), tokenize.sent_tokenize(text))))
    return result if len(result) > 0 else None

In [5]:
df['about'] = df['about'].map(lambda entry : remove_phrase(entry, "created by"))
df['transcript'] = df['transcript'].map(lambda entry : remove_phrase(re.sub(r"-?\s*(\[.+?\]|\(.+?\))\s*-?", " ", entry).strip(), "subtitles by"))
df = df.dropna()
df['transcript'] = df['transcript'].map(lambda entry : entry.replace("\n", " ").replace("NARRATOR:", "").replace("Voiceover:",""))
df = df.reset_index()
df = df.drop(columns=['index'])

In [6]:
len(df)

8103

In [7]:
raw_dataset = datasets.Dataset.from_pandas(df[['transcript','about']])
#metric = load_metric("rouge")
metric = evaluate.load('rouge')

In [8]:
raw_dataset

Dataset({
    features: ['transcript', 'about'],
    num_rows: 8103
})

In [9]:
raw_dataset[0]

{'transcript': "Hi, welcome to programming! If you've never learned to program before, you might be wondering what programming actually is. Well, when we write a program, we're giving the computer a series of commands that kind of look like a weird form of English. You can think of a computer as a very obedient dog, listening to your every command, and doing whatever you tell it to do. So what's so cool about programming? Well, it really depends on what you think is cool. Because as it turns out, you can use programming for almost everything. Programs control robots that can take care of patients, and my favorite, robots that can roam around Mars and look for water on the surface. Programs help self-driving cars know which way to turn-- which is pretty important! Programs help doctors cure diseases by processing huge amounts of medical data. Programs can be really fun games, like Doodle Jump, Angry Birds, Minecraft. Programs make it possible for Pixar to put out their awesome 3-D anima

In [10]:
#model = AutoModelForSeq2SeqLM.from_pretrained("philschmid/bart-large-cnn-samsum") #facebook/bart-large-cnn
#tokenizer = AutoTokenizer.from_pretrained("philschmid/bart-large-cnn-samsum")

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")

Downloading (…)lve/main/config.json:   0%|          | 0.00/1.58k [00:00<?, ?B/s]

Downloading (…)"pytorch_model.bin";:   0%|          | 0.00/1.63G [00:00<?, ?B/s]

Downloading (…)neration_config.json:   0%|          | 0.00/363 [00:00<?, ?B/s]

Downloading (…)olve/main/vocab.json:   0%|          | 0.00/899k [00:00<?, ?B/s]

Downloading (…)olve/main/merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

Downloading (…)/main/tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

In [11]:
max_input_length = 1024
max_target_length = 128

def preprocess_function(examples):
    inputs = [doc for doc in examples["transcript"]]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)

    # Setup the tokenizer for targets
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(examples["about"], max_length=max_target_length, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

In [12]:
tokenized_datasets = raw_dataset.train_test_split(test_size=0.1).map(preprocess_function, batched=True)

  0%|          | 0/8 [00:00<?, ?ba/s]

  "`as_target_tokenizer` is deprecated and will be removed in v5 of Transformers. You can tokenize your "


  0%|          | 0/1 [00:00<?, ?ba/s]

In [13]:
tokenized_datasets

DatasetDict({
    train: Dataset({
        features: ['transcript', 'about', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 7292
    })
    test: Dataset({
        features: ['transcript', 'about', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 811
    })
})

## Training

In [14]:
batch_size = 2
args = Seq2SeqTrainingArguments(
    "FB Model",
    evaluation_strategy = "epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01,
    save_total_limit=1,
    save_strategy="no",
    num_train_epochs=5,
    predict_with_generate=True,
    fp16=False,
    push_to_hub=False,
)

In [15]:
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

In [16]:
def compute_metric(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]
    
    return metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)

In [17]:
trainer = Seq2SeqTrainer(
    model,
    args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"], # "validation"
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metric
)

In [20]:
trainer.train()

The following columns in the training set don't have a corresponding argument in `BartForConditionalGeneration.forward` and have been ignored: about, transcript. If about, transcript are not expected by `BartForConditionalGeneration.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 7292
  Num Epochs = 5
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 4
  Gradient Accumulation steps = 1
  Total optimization steps = 9115
  Number of trainable parameters = 406290432


Epoch,Training Loss,Validation Loss,Rouge1,Rouge2,Rougel,Rougelsum
1,0.3529,1.683492,0.357713,0.21952,0.298069,0.326035
2,0.3554,1.683492,0.357713,0.21952,0.298069,0.326035
3,0.3354,1.683492,0.357713,0.21952,0.298069,0.326035
4,0.3367,1.683492,0.357713,0.21952,0.298069,0.326035
5,0.3572,1.683492,0.357713,0.21952,0.298069,0.326035


The following columns in the evaluation set don't have a corresponding argument in `BartForConditionalGeneration.forward` and have been ignored: about, transcript. If about, transcript are not expected by `BartForConditionalGeneration.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 811
  Batch size = 4
Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_s

TrainOutput(global_step=9115, training_loss=0.3463998054843386, metrics={'train_runtime': 14503.0243, 'train_samples_per_second': 2.514, 'train_steps_per_second': 0.628, 'total_flos': 7.893516826873037e+16, 'train_loss': 0.3463998054843386, 'epoch': 5.0})

In [21]:
trainer.save_model('Saved-Model-FB-10')

Saving model checkpoint to Saved-Model-FB-10
Configuration saved in Saved-Model-FB-10/config.json
Configuration saved in Saved-Model-FB-10/generation_config.json
Model weights saved in Saved-Model-FB-10/pytorch_model.bin
tokenizer config file saved in Saved-Model-FB-10/tokenizer_config.json
Special tokens file saved in Saved-Model-FB-10/special_tokens_map.json


In [4]:
tr = "Alright, good afternoon, class. What's up? What's going on?\nGood afternoon.\nRight. So today is our.\nSession.\nTo review stuff and I'm happy to answer any questions. I could spend a few minutes on the last slide from the memory hierarchy set of slides if anyone is interested in that.\nOr I could answer any other questions anyone has.\nI wonder if we could go over a Boolean simplification.\nOK.\nSo you mean just the algebraic simplifications?\nMostly the algebraic stuff, but they wouldn't be upset if we did K maps as well.\nOK. Oh, open up the slide set.\nAlright so.\nThere's some properties that are useful when we do algebraic simplifications for a Boolean equations, so we've got.\nLet's let me see. Let me go back here. Oh, that's not part of this slide set now anyway, so we can go from here.\nUh, so we we've built, as an example is circuit here that says is like any automatic door if there is a person detected and.\nOr if there is a switch forcing the hold open right or if there is a key that's forcing it to be closed, right? So there's a few conditions that we have to take into account. We want the door to be open when H = 1 and C = 0. So this key that's forcing it to be closed should not be there, right? Or if equals 0 and P = 1.\nSo there isn't the switch forcing the hold open, but we've detected a person and the key that's forcing it closed is is not there, right? So if we take into account these conditions as they are, we get F equals HC prime or H prime PC prime.\nSo.\nAnd yeah, go ahead and mute yourself please.\nYou don't need to hear and everyone else. So if we build a circuit like that, we get this particular circuit right? Is there any questions about this circuit that we've got here? We've got the three inputs 8C and P and we're saying that there is an end condition for age and C prime. So H is going straight through, C is getting inverted and there's an end condition for age, prime, PC prime. So there is a P that's going straight through.\nAnd each is getting inverted and see is also getting inverted. So this is a slightly large circuit away and we're trying to see can we simplify this circuit and build a smaller circuit? OK, so any clues? Any suggestions on what we might do here?\nC prime can be factored.\nC prime can be factored, right? So we can.\nWell, first, we're moving things around by the commutative property. And again, if there is a question on this on the exam, I don't really care about anybody mentioning any of the properties or not. I want you to be able to do this stuff, not so much like name which property you're using and so on. So factoring C prime out of this, we get get age or H prime P.\nAnd with C prime, right? What else might we be able to do over here?\nWhat is this search? As far as simplification goes?\nDis.\nTribute to age plus.\nSo you could do see prime as it is outside of the parentheses and then do age or age prime and or P OK, so this is a little bit tricky.\nIt's that second distributive property. So this didn't make any sense. Think about it a little bit, let me know if you have.\nQuestions on how this came about.\nAnd if not, then we can proceed to say anything or its complement. Wright is always going to be one 'cause. One of these is bound bound to be one and or anything with the one you get a one. So we end up with C, prime one and HPF or simplifying that further we get C prime H4P.\nQuestions on this I have more examples, this is just one of them.\nI mean, I might be considered helpless, but I got completely lost at the 4th step.\nLet me see if I can show you the slides that have the properties listed.\nAlright, let me present a different slide here.\nThat is a tough one to to follow. I agree. So there's two distributive properties, right? A prime.\nAnd B or C = A prime B or A prime. See this one. We're familiar with 'cause. We do this in math quite a lot, right? So this one is not that tricky. There's a different one or B prime C = A or B&A or C.\nOK.\nSo if you think of this and you go back to what we had earlier, let's see. Where's that slide?\nUmm, let me get Outlook slide show mode and just bring it over.\nSo we had age or age prime P over here.\nRight.\nThis part.\nIs what we're doing A or B prime C = A or B&A or C?\nSo making more sense now.\nIt's extra confusing because we have H&H prime as they're two different variables.\nWhereas we had a B&C in the example, so our A is the age our be is the age prime and our C is the P.\nQuestions on that one.\nOK, so I think.\nThat's good. Let's look at maybe another example. Learn. And we should also think there's De Morgan's laws. Let me get rid of.\nIs it a Flyers?\nSo there is a few additional properties that.\nWe we already looked at, but let me go full screen.\nUhm. A or one, right? Anything I ordered with the one is a one. Anything ended with a 0 is zero Y or A is the same as A and is the same as a. A prime prime is the same as A and then the De Morgans again they can be a little tricky to follow along.\nCome here or be whole. Prime is a Prime B prime and then maybe hold prime is a prime or be prime.\nOK, so let's continue looking at some examples here.\nWhich is if we had a circuit which was as equals ABC prime, right? Well, actually there's a different. So this is one way to build this particular circuit and I understand you've not looked at these slides before. So maybe I should go a little slow.\nUhm, unless you studied them for the midterm hum. So we're essentially building a circuit for the laboratory light that goes on or goes off in, in an airplane when the laboratories are in use. OK, so we could say that we have something that says a laboratory is available if, right, any one of the so AB&C are the three lavatories, any of the three is available or not.\nUh clothes bright. So if we do that, we get s = a prime or be prime or C prime and to get the opposite of that, if we wanted to.\nGet the opposite of that. We do the whole thing primed. Can you see this? Or do you want me to zoom in or go full screen or presentation mode?\nThat looks OK to me.\nLooks OK. All right, right. So when we do that, we're getting a prime prime and we prime prime and C prime prime, which just gives us a.\nAnd BNC solar essentially reducing one gate by going this route instead of the other one.\nLet me go more.\nExamples of simplification from the slides.\nWe could actually work one out.\nHow about here?\nRemove this and we can try to work it out.\nSo why don't you take a minute and just try to do?\nThe De Morgan's on this and then do the simplification on it.\nJust grab a pencil and paper and just work it out.\nSword Answer did you get?\nAnyone.\nI'm still final answer you got from applying the De Morgan theorem simplifying this.\nI gotta be prime or C prime.\nI think the steps are worked out in the previous slide. So we let's go take a look.\nSo we did a. So this is the first application of dim organs. We get ABC whole prime and a prime B whole primed. Then we apply De Morgan's again inside of each of the parentheses. So we get a prime or be prime or C prime and A or B prime right?\nUh oh, actually I made a mistake on my own.\nExample here on paper so.\nOnce we get that, we start distributing these out, right. So we get a the first term here and a prime or be prime or C prime or be prime, which is the second term here and this whole thing a prime or be prime or C prime, right. So as we start opening these up and a prime is 0.\nRight. And then?\nHe be prime is a second run. We get a C prime is a third term we get and then we get be prime. Whoops. Sorry, B prime and a prime B prime and B prime which is just be prime and then be prime and C prime.\nEveryone with me so far on this step.\nAny questions?\nAlright, so we can rearrange these two. We get a be primacie prime, a prime B prime or B prime and B prime C prime. Then we can take.\nFor the AV prime and a prime B prime term, we can take be prime common and we get a or a prime.\nWe just saw that anything or its complement is A1, not a zero, right? Quite often students make a mistake by making this zero and getting rid of the V prime terms. That's incorrect.\nUhm.\nOr we've already got the B prime and the other terms AC prime and B prime C prime. So now we have be prime or a C prime or B prime C prime. Now one interesting thing to always remember, if I had like 15 other terms which had be prime and do something over here, they would all end up being consumed by the B prime. OK. And the way we see that happening here is we can say V prime.\nIs the same as B prime earned one right? So we can take be prime common from out of that and we get one or C prime again one or anything else is just one so we get B prime or A C prime.\nDoes that help? Do you have questions?\nDoing that.\nI mean, I have questions not relating to this specific example.\nOK.\nLet me just finish up a point I made verbally by just making up some more terms here. So what I was saying was if we had a B, prime C as another term or anything else with B prime in it, because we have be prime by itself, right? We could have a prime.\nC prime, like all of these terms, they would all be consumed into one term.\nWhich would just be to be prime because it would be one or any of these combinations that making sense.\nAnd when you have one or anything, all those, any things don't matter. They can just disappear.\nSo yeah, that's the one additional point I wanted to make question, Tim.\nSo if you were given.\nI make it like wasean to simplify.\nUh-huh.\nOne that you don't know the steps for ahead of time. What do you work for? What kind of tips and tricks do you use to figure out what rules and simplification things to use to simplify it?\nYeah. So that's a good question. I would say look for.\nUhm.\nSome obvious patterns, which is basically taking out anything that's common, anything that's repeated.\nUhm.\nStuff like bored. I just showed you like you've got one term by itself, right? So you've got a or ABC or ABCD or whatever else, right? Any combination of stuff with with that, a you can drop all the other eight terms because you've got a by itself or with all of those things. So yeah, I mean, it's it's stuff like that. I know what you're asking.\nI don't have a great.\nYou know step by step guide for you. Like you look for this first. Then you look for that. It's really just a matter of practice to some extent.\nOK. Thank you.\nIf anyone else has any tips, I'm happy to hear.\nOK. And your nails have any questions?\nDo you wanna go over K maps? Someone mentioned there?\nK maps.\nWe get to the main part. I'm assuming you understand the basics of K maps.\nHum.\nSo here's the one slide that I think that captures stuff went once you've gone beyond the basics.\nThere's some dog. There's making wisest. Sorry.\nSo.\nSome things that you want to keep in mind, these ovals that we're trying to create can cross boundaries both on the top and the bottom.\nThe Olds must have 1/2, four or eight, so it's it's a.\nMultiple order multiple, but a power of two.\nOK, so 356 is not allowed. Six is divisible by two, but it's not a power of two. OK, they don't correspond to algebraic transformations that combine terms, so this is not OK to have like 3 here you would divide this into two groups of two. OK, you do.\nFrom the XY prime.\nZ1 Z so this over here or XY prime. Sorry. And then the second term would be XZ.\nThose were the two term that would be obtained from here.\nIt's not making sense. Do you want a deeper explanation or is this is this OK?\nI thought you could only do.\nFor some reason, multiples of two in the circles, so I guess I really.\nYeah.\nIt's powers, powers of two.\nOK.\nYeah. So the logic behind that is that the powers of two, because of the way we lay it out with only one term changing at a time, they represent the invariable terms.\nAnd what I mean by that is let me.\nPull back up one slide.\nThis one let me duplicate this and maybe write on it a little bit O2, right? I need to move to a different screen, hold on one SEC.\nSo the part that often becomes confusing when you're learning K maps is like, how do I figure out which term it is right and what's going on here? The logic behind all of this is let me zoom in.\nIn particular part here.\nOops.\nOh, it's gone to a different slide.\nRight. So what's happening here is to keep in mind the values of XY&Z. OK, so X is zero, and this first row, right? And what we're seeing is that the output stays one.\nEven when Y or Z sorry, changes from zero to 1, right? This is the change.\nNZ going from zero to one and what that's indicating is that the the value of Z doesn't matter for this particular change that we're seeing and that's why this term becomes X prime, right, because X is zero and Y prime because why is 0 in both of these? So the change of Z going from zero to 1 does not affect the output, which is why we can eliminate that and we get the term as X prime Y prime.\nStart making sense.\nI'm sorry you said the seed doesn't.\nChange.\nNo, no change in Z doesn't change the output.\nSusie, in this particular example here, does not affect whether we get the output of one or zero.\nOK.\nRight. So these are the changing values of Z and they don't affect the outputs and that's why we dropped from the combination of terms that we are getting here.\nI'll give you one more example on the same slide here. So as you can see, X is 1 throughout all of these, right? And both Y&Z are changing. So like if we go across this whole.\nFour term Oval.\nBoth Y&Z change and yet the output doesn't change based on that. So for this term, why or the output? Sorry not why the output is entirely dependent on X if X is 1, the output is going to be one irrespective of whether Y or Z change and that's why this term comes out simply as X.\nThat makes sense.\nOK. I gotcha. Yeah.\nYeah. And that's why now if we go to that other slide where we said that you can't. So if you look at this group of three ones, right or whether it was three ones or 6 ones, we don't have that condition because in in this case, yes, X is constant. And why is changing over here, why is the same over here but it's changing from this term to this term. And then Z is changing from this term to this term.\nSo we don't have a, we're not capturing that condition where only one of the variables is changing or both are changing through all of their permutations.\nThat's why 3 is not OK or same logic would apply if it was like this, like if it was these six, that would still not capture the change that we're looking for.\nOther questions about K maps that I can answer.\nGenerally speaking, other questions that I might be able to answer.\nCan we do maybe a brief overview of pipelining at some point?\nOh, OK.\nSpecifically the like the.\nData caches or or instruction caches maybe?\nFigure with.\nOr instruction caches or separate from pipelining.\nUh.\nYou want more pipelining, or do you want to talk about caches?\nNot, not caches, but I'm sorry I'm forgetting the name of them in the.\nWhen you're pipelining, there's the like the cash.\nThe interface buffers.\nYeah, the buffer. Sorry, the buffers. That's where I'm I'm talking about.\nSure. Yeah. Let's take a look.\nRight now it's a highly marked up slide, alright, so here's.\nHere's a slide with the interstage buffers and.\nI mean, I I really just want you to understand the concept of the interstage buffers and why we would require them what they actually hold is not something I'm too concerned about you guys memorizing or anything like that. OK, so let me duplicate this slide and let's talk about it a little bit.\nSo what I was trying to say is basically the need for the interstage buffers comes from the fact that when, let's say we haven't.\nA few instructions going on right. OK, let me. I'm trying to see how to best.\nDepict these instructions. So let's say there are these five instructions. When this instruction here right, which is in its right backstage, he's trying to write back some value to the register file, right?\nAt that point, this other instruction here is the one that's being decoded, right? So these values that we get.\nUhm 4.\nWhich register we want to write to let me zoom in?\nCuz I myself can't see this very well.\nSo this right register right it's indicated from some combination of bits. Normally that would be coming in from the instruction, right? It's usually what is it instruction 20 through 16 or instruction was it?\n21 I mean 25 through 21.\nSo it's one of those two, right, if we if we go back to our non pipelined implementation.\nLet's go back all the way to this date of birth. So we take the bits of the instruction to figure out which.\nHere registered we want to write to right. So the problem with when we do pipelining is that that information is for the instruction that's currently in its decode stage. But we are doing a right back for a different instruction, so that kind of information needs to be conveyed to the hardware based on which instruction is where. And so that's why we have Interstate buffers that keep track of that kind of stuff. So maybe a simpler example is when we go from decode.\nTo execute right, we want to carry over some things like for example what what did or did we just read from the registry, the registers that this current instruction that is in execute needs for its execution.\nRight. We also want to keep track of whether this current instruction for which we're doing the execution is that instruction, a load word instruction, for example. Is it going to read from memory later on or is it going to write to memory later on? So that kind of information that we get from the instruction itself when we decode.\nNeeds to be retained across pipeline stages.\nAlong with that instruction that needs to flow with the instruction, because if you know prior to pipelining, all that information was with us the whole time, right, because there was only one instruction in play. Sorry.\nUhm, record. Just go back and get that information. All of these signals would come from that one instruction that was in the pipeline. The problem is now there are five instructions in the pipeline and we don't really know which instructions control signal to use where. And to illustrate that issue. Right. So because one instruction is going to be.\nIn execute while others else in decode and another one is in the in the right backstage, right? Where does the this these where do these control signals really come from?\nIs that making sense?\nYeah, yeah, that makes a lot more sense. Thank you.\nOK. So it's just keeping the relevant information along with the instruction as it passes through, and sometimes it's just information. Sometimes they actually did our values as well, like as I was saying, the values that we read from the registers need to be buffered somewhere. The result of the ALU needs to be buffered somewhere, so that kind of stuff also resides in these interstage buffers.\nGood other questions.\nAs I mentioned the other day, I'm going to leave you with 10 minutes to spare so you can do the course evaluation. So we've got another 8 minutes. Then I can devote to answering any questions you might have.\nI have a question.\nYeah.\nSo for the control unit.\nUh-huh.\nI remember you saying that.\nThe R code with the app code going into. It's basically a look up table.\nYou can think of it as a look up table here.\nThey remember in the book that they asked a question.\nWhether specific UM.\nLions were set to one or not.\nAnd I managed to get all of them wrong based on just in time.\nWishing based on the nature of the car instructions and wondering if there's a better way is if it or even if I have to worry about that on the exam.\nUh, my well, yeah, I'll answer your second question first. You do have to worry about it for the exam.\nBecause the exam is going to have a question similar to what you did in in one of the quizzes where I would be asking you, this is a load word instruction right for example. And then what would be the value on say this control line, the MEM to ridgeline?\nRight. Or the MEM right line and let me test your intuition for this particular question then.\nUh, if this was a lowered word, would the value on MEM write this signal here or why? I can't really select it, but men right? Would that be a zero or would that be a one?\nFor a load word.\nUh-huh.\nI it would be set to zero I believe.\nThat is correct. And what's the explanation?\nIt's reading from memory, not reading from memory.\nPerfect. So you've got good intuition. What do you have to worry about?\nThe maybe the register destination one and they'll you everything.\nOK. Yeah. OK. So register destination, right. So that's again, it's just picking between these two guys here in terms of which one is going to be the right register. So in load word how many registers do we have?\nUh.\nOne, I believe.\nNo, the your options are between 2:00 and 3:00.\nNo.\nI believe it's true.\nYeah. So just look up and load word instruction somewhere. So yeah, it's two instruction. I mean, two registers, right. So again, intuitively, if we have only two registers, you're going to be stuck with these two, right? The first register is always a source register. It goes straight through. There is no multiplexer. There's no choice there. The second register has the option of being a right register. If you have only two registers, then your second register is likely to be the right register.\nSo it making sense.\nAnd that would imply that register destination would be set to 0.\nCorrect.\nExactly.\nOh boy. OK.\nYeah.\nThe earliest source is actually even easier, because it's really a question of do I have a 15? I mean R zero through 15 right is 16 bit offset where it's going to be useful for this instruction. If the answer is yes, then this is the the one value needs to go through right? So this alias source is going to be one. If the answer is no, I have two source registers right? So for example. So this is where it gets a little complicated for the branch instruction because you haven't.\nAn immediate value, but you have two registers, and. In fact, in the branch you want to compare the two registers, right? So even though you have an offset value, that offset is being used elsewhere in the hardware. It's used here and so you are going through with two source registers, so that's. That's the kind of logic that you need to figure out. So if I want two source registers, this value needs to be 0 for all the other I type instructions.\nWhere you are actually the Lu is using the offset that you provided. At that point you don't need the second register, but you need the offset which means a loose or should be one.\nYou're OK with that?\nThat was, uh, I didn't quite follow that. But uh.\nSure.\nWell, let me say it again, so for.\nJust about all I type instructions right, we're going to use the offset that 16 bit offset that we provide as part of the instruction for all of those. We know that these 16 bit values sign extended to 32. This is the one that should pass through. So for all of those that ALU source control line should be a one.\nRight.\nThere's only one exception that comes to my mind right away, which is the branch instruction which is also and I type instruction, but for that one.\nWould actually comparing the two registers right?\nAnd the the offset that we have is not used by the ALU, that offset is instead used to to figure out what is the branch target address.\nSo that's the one exception where we have an offset or or an immediate field, but we're still going to use the two registers, which means alias source needs to be 0.\nLittle better kind of said the same things so.\nFor all R type instructions right where you have three registers, you're going to have two source registers. So this for all R type instructions. Alias source is always going to be 0 because both of the values that you read from the registers are being input to the ALU.\nThat part makes sense. I got that part.\nOK.\nOK.\nSo we've got two minutes left.\nFour other questions can I answer?\nWe might take some of these same concepts tomorrow when we go through the sample final exam.\nAnd if you, if you think of any other questions, please, please bring those up tomorrow. If we don't get to those tomorrow, then on Friday we'll do small group sessions and you can ask your questions there as well.\nK your final exam I believe is on Wednesday.\nTwo to four.\nIt will be on canvas. It'll be very similar in format to the midterm, but it'll be similar in content to the sample final exam.\nDid you say Wednesday?\nI believe it's on Wednesday, yes.\nNot Monday. Monday.\nDelete my 1:00 o'clock section has their exam on Monday.\nUnless I have it backwards.\nIt's ours is 2 to four Wednesday.\nOK.\nOr is it OK?\nOK. Well, I will let you guys go.\nAnd please take the time to fill out the evaluation for this course and.\nI will see you tomorrow.\nWhy?\nYes, I see Wednesday now looking good.\nI did.\n"

## Comparing Model Output
### samsum

In [23]:
#Original Model (reload cause "Expected all tensors to be on the same device" when using .generate)
model = AutoModelForSeq2SeqLM.from_pretrained("philschmid/bart-large-cnn-samsum") #facebook/bart-large-cnn
tokenizer = AutoTokenizer.from_pretrained("philschmid/bart-large-cnn-samsum")

inputs = tokenizer(tr, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(inputs['input_ids'])
tokenizer.batch_decode(summary_ids, skip_special_tokens=True)

loading configuration file config.json from cache at /home/kajan/.cache/huggingface/hub/models--philschmid--bart-large-cnn-samsum/snapshots/e49b3d60d923f12db22bdd363356f1a4c68532ad/config.json
Model config BartConfig {
  "_name_or_path": "philschmid/bart-large-cnn-samsum",
  "_num_labels": 3,
  "activation_dropout": 0.0,
  "activation_function": "gelu",
  "add_final_layer_norm": false,
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "classif_dropout": 0.0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 2,
  "dropout": 0.1,
  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "force_bos_token_to_be_generated": true,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "gradi

Generate config GenerationConfig {
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.26.0"
}



["Today's session is to review stuff and answer any questions. The last slide from the memory hierarchy set of slides is about algebraic simplifications for a Boolean equations. C prime can be factored out of the commutative property to get age, PC prime and PC prime."]

In [24]:
#Tuned Model
tuned_model = AutoModelForSeq2SeqLM.from_pretrained('./Saved-Model-Samsum-5')
tuned_tokenizer = AutoTokenizer.from_pretrained('./Saved-Model-Samsum-5')

inputs = tuned_tokenizer(tr, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = tuned_model.generate(inputs['input_ids']) #min_length=
tuned_tokenizer.batch_decode(summary_ids, skip_special_tokens=True)

loading configuration file ./Saved-Model-Samsum-5/config.json
Model config BartConfig {
  "_name_or_path": "./Saved-Model-Samsum-5",
  "_num_labels": 3,
  "activation_dropout": 0.0,
  "activation_function": "gelu",
  "add_final_layer_norm": false,
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "classif_dropout": 0.0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 2,
  "dropout": 0.1,
  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "force_bos_token_to_be_generated": true,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "gradient_checkpointing": false,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "init_std": 0.02,
  "is_encoder_

['Reviewing and answering any questions regarding simplifications of arithmetic and algebraic reasoning. Going over the example of a door that opens when a person is detected and closes when a key is not present. Thinking about how we can use the commutative property to make a larger, more efficient circuit.']

In [25]:
#Tuned Model
tuned_model = AutoModelForSeq2SeqLM.from_pretrained('./Saved-Model-Samsum-10')
tuned_tokenizer = AutoTokenizer.from_pretrained('./Saved-Model-Samsum-10')

inputs = tuned_tokenizer(tr, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = tuned_model.generate(inputs['input_ids']) #min_length=
tuned_tokenizer.batch_decode(summary_ids, skip_special_tokens=True)

loading configuration file ./Saved-Model-Samsum-10/config.json
Model config BartConfig {
  "_name_or_path": "./Saved-Model-Samsum-10",
  "_num_labels": 3,
  "activation_dropout": 0.0,
  "activation_function": "gelu",
  "add_final_layer_norm": false,
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "classif_dropout": 0.0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 2,
  "dropout": 0.1,
  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "force_bos_token_to_be_generated": true,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "gradient_checkpointing": false,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "init_std": 0.02,
  "is_encode

['Reviewing and answering any questions regarding simplifications of arithmetic and algebraic reasoning. Going over the example of a door that opens when a person is detected and closes when a key is not present. Thinking about how we can use the commutative property to make a larger, more efficient circuit.']

### bart-large-cnn

In [26]:
#Original Model (reload cause "Expected all tensors to be on the same device" when using .generate)
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")

inputs = tokenizer(tr, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(inputs['input_ids'])
tokenizer.batch_decode(summary_ids, skip_special_tokens=True)

loading configuration file config.json from cache at /home/kajan/.cache/huggingface/hub/models--facebook--bart-large-cnn/snapshots/3d224934c6541b2b9147e023c2f6f6fe49bd27e1/config.json
Model config BartConfig {
  "_name_or_path": "facebook/bart-large-cnn",
  "_num_labels": 3,
  "activation_dropout": 0.0,
  "activation_function": "gelu",
  "add_final_layer_norm": false,
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "classif_dropout": 0.0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 2,
  "dropout": 0.1,
  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "force_bos_token_to_be_generated": true,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "gradient_checkpointing"

['Professor Wright opens up the exam with a series of questions and answers. The first question is about algebraic simplifications for a Boolean equations. The second question is a question on the properties of a Boolean simplification. The third question is on the use of the K maps. The fourth question is the question of how to simplify a Boolean equation.']

In [27]:
#Tuned Model
tuned_model = AutoModelForSeq2SeqLM.from_pretrained('./Saved-Model-FB-5')
tuned_tokenizer = AutoTokenizer.from_pretrained('./Saved-Model-FB-5')

inputs = tuned_tokenizer(tr, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = tuned_model.generate(inputs['input_ids']) #min_length=
tuned_tokenizer.batch_decode(summary_ids, skip_special_tokens=True)

loading configuration file ./Saved-Model-FB-5/config.json
Model config BartConfig {
  "_name_or_path": "./Saved-Model-FB-5",
  "_num_labels": 3,
  "activation_dropout": 0.0,
  "activation_function": "gelu",
  "add_final_layer_norm": false,
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "classif_dropout": 0.0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 2,
  "dropout": 0.1,
  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "force_bos_token_to_be_generated": true,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "gradient_checkpointing": false,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "init_std": 0.02,
  "is_encoder_decoder"

['Review of algebraic and logistic simplifications of a door-opening circuit and using commutative and distributive properties to find a smaller circuit. Class discussion of how to use these properties in algebraic simplifications and how they can be used in K-map form.']

In [28]:
#Tuned Model
tuned_model = AutoModelForSeq2SeqLM.from_pretrained('./Saved-Model-FB-10')
tuned_tokenizer = AutoTokenizer.from_pretrained('./Saved-Model-FB-10')

inputs = tuned_tokenizer(tr, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = tuned_model.generate(inputs['input_ids']) #min_length=
tuned_tokenizer.batch_decode(summary_ids, skip_special_tokens=True)

loading configuration file ./Saved-Model-FB-10/config.json
Model config BartConfig {
  "_name_or_path": "./Saved-Model-FB-10",
  "_num_labels": 3,
  "activation_dropout": 0.0,
  "activation_function": "gelu",
  "add_final_layer_norm": false,
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "classif_dropout": 0.0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 2,
  "dropout": 0.1,
  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "force_bos_token_to_be_generated": true,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "gradient_checkpointing": false,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "init_std": 0.02,
  "is_encoder_decode

['Review of algebraic and logistic simplifications of a door-opening circuit and using commutative and distributive properties to find a smaller circuit. Class discussion of how to use these properties in algebraic simplifications and how they can be used in K-map form.']

It appears there is no difference between 5 and 10 epochs. This could be due to the model reaching a limit after 5 epochs, but I think this is more likely an error with using trainer.train() twice in a row (a suggested way to continue training). Doesn't really matter since the 5 epoch models are already much better, but we could easily re-visit in the future if needed. 
  
**First 5 epochs**
![m1.PNG](m1.PNG)  
  
**Next 5 epochs**  
![m2.PNG](m2.PNG)
  
### Testing with Long Summarization

In [2]:
#Brought in from the transformers notebook
def long_summarize(text, min_len=0, max_len=512, beams=1, sample=False, temp=1.0, k=50, p=1.0):
    #Split document into sentences
    sentences = nltk.tokenize.sent_tokenize(text)
    length = 0
    chunk = ""
    chunks = []

    #Aggregate sentences into chunks that are < model_max_length
    for i, sen in enumerate(sentences):
        combined_length = len(tokenizer.tokenize(sen)) + length
        if combined_length <= tokenizer.max_len_single_sentence:
            chunk += sen + " "
            length = combined_length
            if i == len(sentences) - 1:
                chunks.append(chunk.strip())          
        else:
            chunks.append(chunk.strip())
            chunk = sen + " "
            length = len(tokenizer.tokenize(sen))
            
    #Generate and decode summaries for each chunk
    res = ""
    for i in [tokenizer(c, return_tensors='pt') for c in chunks]:
        summary_ids = model.generate(i["input_ids"], num_beams=beams, \
                                     min_length=min_len, max_length=max_len, \
                                     do_sample=sample, temperature=temp, top_k=k, top_p=p)
        summary = tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
        res += summary + " "
        print(summary,"*")
    return res

In [16]:
#Renamed model to reflect tuning
model = AutoModelForSeq2SeqLM.from_pretrained('./bart-large-cnn-khan')
tokenizer = AutoTokenizer.from_pretrained('./bart-large-cnn-khan')

In [17]:
long_summarize(tr)

Reviewing K maps and simplifications of linear equations. *
How to use De Morgan's Laws to rewrite complex circuits. *
K-map introduction. *
Introduction to inter-stage buffers and instruction caches. *
More questions on instruction pipelining and control unit. *
Use the offset for the branch instruction. *


"Reviewing K maps and simplifications of linear equations. How to use De Morgan's Laws to rewrite complex circuits. K-map introduction. Introduction to inter-stage buffers and instruction caches. More questions on instruction pipelining and control unit. Use the offset for the branch instruction. "

Idea: instead of returning one big summary for the entire lecture, return shorter summaries for each chunk/section that could be shown alongside the transcript. I believe this would be more useful since the model tends to return shorter summaries now anyways (forcing with min_length can lead to bad results).