In [1]:
from transformers import AutoTokenizer, pipeline


In [2]:
text = """Dear Amazon, last week I ordered an Optimus Prime action figure \
from your online store in Germany. Unfortunately, when I opened the package, \
I discovered to my horror that I had been sent an action figure of Megatron \
instead! As a lifelong enemy of the Decepticons, I hope you can understand my \
dilemma. To resolve the issue, I demand an exchange of Megatron for the \
Optimus Prime figure I ordered. Enclosed are copies of my records concerning \
this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

In [3]:
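# No model is passed, so the pipeline falls back to its default checkpoint (see the log below).
generator = pipeline("text-generation")
response = "Dear Bumblebee, It is your fault."
# Append the start of the reply so the model continues the customer-service response.
prompt = text + "\n\nCustomer service response:\n" + response
# max_length counts the prompt tokens plus the newly generated tokens.
outputs = generator(prompt, max_length=200)
print(outputs[0]['generated_text'])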

No model was supplied, defaulted to openai-community/gpt2 and revision 6c0e608 (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.


config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/548M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/1.04M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Dear Amazon, last week I ordered an Optimus Prime action figure from your online store in Germany. Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead! As a lifelong enemy of the Decepticons, I hope you can understand my dilemma. To resolve the issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered. Enclosed are copies of my records concerning this purchase. I expect to hear from you soon. Sincerely, Bumblebee.

Customer service response:
Dear Bumblebee, It is your fault. The product is not yours.

We will refund the purchase price, in full and in installments.

I made a couple of requests, but I ordered one of 2 (you have to pay $5 each) and was disappointed.


I sent you a copy of the packaging and requested a refund, but you sent me a shipping address that wasn't


In [4]:
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

In [5]:
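# The trailing space becomes a token of its own (id 220 in the output below).
input_text = "Today I went to the park and "
# Despite the variable name, this is a BatchEncoding holding input_ids and attention_mask.
input_ids = tokenizer([input_text], return_tensors='pt')
print(input_ids)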

{'input_ids': tensor([[8888,  314, 1816,  284,  262, 3952,  290,  220]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1]])}


In [6]:
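# TextStreamer prints tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer)
# Sample (do_sample=True) instead of decoding greedily; repetition_penalty > 1 discourages repeats.
_ = model.generate(**input_ids, streamer=streamer, repetition_penalty=1.1, do_sample=True, temperature=1.0, max_new_tokens=100)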

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Today I went to the park and  didn't realise this was where he is. We were supposed, that's how long we're going there if you want us back... it should have been a little bit later (or better).
I also heard rumours of his father being in bed with him at work today - did anyone know? It has just started yet......<|endoftext|>
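
The `repetition_penalty=1.1` argument in cell [6] rescales the scores of tokens that already appear in the sequence. My understanding is that the penalty divides positive logits and multiplies negative ones, so a repeated token always becomes less likely. The sketch below illustrates that rule on plain Python lists; the helper name `apply_repetition_penalty` and the toy logits are assumptions for illustration, not the library's code.

```python
def apply_repetition_penalty(logits, previous_token_ids, penalty=1.1):
    """Return a copy of `logits` with already-seen tokens made less likely.

    Positive scores are divided by `penalty` and negative scores are
    multiplied by it, so both move toward "less probable" when penalty > 1.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    adjusted = list(logits)
    for token_id in set(previous_token_ids):
        score = adjusted[token_id]
        adjusted[token_id] = score * penalty if score < 0 else score / penalty
    return adjusted


# Hypothetical logits for a 3-token vocabulary; tokens 0 and 1 were already generated.
toy_logits = [2.0, -1.0, 0.5]
penalized = apply_repetition_penalty(toy_logits, previous_token_ids=[0, 1, 1], penalty=2.0)
print(penalized)
```

A penalty of exactly 1.0 leaves the logits unchanged, which is why values only slightly above 1 (such as 1.1) are a gentle nudge rather than a hard ban on repeats.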


In [7]:
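# encode() returns only the token ids, so no attention mask is passed to generate().
input_ids = tokenizer.encode(input_text, return_tensors='pt')

# No sampling arguments are given, so decoding is greedy with GPT-2's default generation settings.
output = model.generate(input_ids, max_length=100)

print(tokenizer.decode(output[0], skip_special_tokens=True))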

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Today I went to the park and  I saw a lot of people there. I was really excited to see what they were doing. I was really excited to see what they were doing. I was really excited to see what they were doing. I was really excited to see what they were doing. I was really excited to see what they were doing. I was really excited to see what they were doing. I was really excited to see what they were doing. I was really excited to see
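
The looping sentence above is typical of greedy decoding, which always picks the single most likely next token, whereas cell [6] sampled from the distribution with `do_sample=True` and `temperature=1.0`. The following standard-library sketch contrasts the two choices on a toy distribution. The logits, the four-token vocabulary, and the helper name `softmax_with_temperature` are all hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a 4-token vocabulary.
toy_logits = [2.0, 1.0, 0.5, -1.0]

# Greedy decoding: always take the argmax, so the same context gives the same token.
greedy_choice = max(range(len(toy_logits)), key=lambda i: toy_logits[i])

# Sampling: draw a token in proportion to its probability.
rng = random.Random(0)  # seeded only to make the sketch reproducible
probs = softmax_with_temperature(toy_logits, temperature=1.0)
sampled_choice = rng.choices(range(len(toy_logits)), weights=probs, k=1)[0]

# A temperature below 1 concentrates more mass on the top token.
sharp_probs = softmax_with_temperature(toy_logits, temperature=0.5)

print("greedy:", greedy_choice, "sampled:", sampled_choice)
print("T=1.0:", [round(p, 3) for p in probs])
print("T=0.5:", [round(p, 3) for p in sharp_probs])
```

Because greedy decoding is deterministic, once the model re-enters a context it has already produced it tends to emit the same continuation again, which is one plausible reading of the repeated sentence in the output above.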


In [10]:
import evaluate

# Load the perplexity metric; it scores each input string with the model named in model_id.
perplexity = evaluate.load("perplexity", module_type="metric")
input_texts = ["lorem ipsum", "Happy Birthday!", "Bienvenue"]
results = perplexity.compute(model_id='gpt2',
                             add_start_token=False,
                             predictions=input_texts)
print(list(results.keys()))

Downloading builder script:   0%|          | 0.00/8.46k [00:00<?, ?B/s]

  0%|          | 0/1 [00:00<?, ?it/s]

['perplexities', 'mean_perplexity']


In [11]:
print(round(results["mean_perplexity"], 2))

646.74


In [12]:
print(results["perplexities"][1])

1499.6920166015625


In [13]:
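import torch
# Not used below: passing `labels` to the model computes the cross-entropy loss internally,
# and `loss` is reassigned in the next cell.
loss = torch.nn.CrossEntropyLoss()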

In [14]:
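inputs_text = tokenizer("Happy Birthday!", return_tensors="pt")
# With labels equal to the inputs, the model returns the mean next-token cross-entropy.
loss = model(input_ids=inputs_text["input_ids"], labels=inputs_text["input_ids"]).loss
# Perplexity is the exponential of that mean loss.
ppl = torch.exp(loss)
print(ppl)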

tensor(1499.6876, grad_fn=<ExpBackward0>)
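
The value above agrees with `results["perplexities"][1]` to within rounding because both compute perplexity as the exponential of the mean negative log-likelihood per predicted token. A minimal standard-library sketch of that definition follows. The per-token probabilities are made up for illustration and the helper name `perplexity_from_logprobs` is an assumption.

```python
import math


def perplexity_from_logprobs(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over the predicted tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical natural-log probabilities for three predicted tokens.
example_logprobs = [math.log(0.5), math.log(0.25), math.log(0.125)]
ppl_example = perplexity_from_logprobs(example_logprobs)
print(round(ppl_example, 4))
```

Equivalently, perplexity is the reciprocal of the geometric mean of the token probabilities, so a model that assigned probability 1 to every token would reach the minimum value of 1.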


In [15]:
import numpy as np
inputs = tokenizer(["Today is"], return_tensors="pt")

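# Return the per-step scores alongside the generated sequences.
outputs = model.generate(**inputs, max_new_tokens=5, return_dict_in_generate=True, output_scores=True)
# normalize_logits=True applies a log-softmax, so each score is a log probability.
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=True
)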

input_length = inputs.input_ids.shape[1]
generated_tokens = outputs.sequences[:, input_length:]
for tok, score in zip(generated_tokens[0], transition_scores[0]):
    # | token | token string | log probability | probability
    print(f"| {tok:5d} | {tokenizer.decode(tok):8s} | {score.numpy():.4f} | {np.exp(score.numpy()):.2%}")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


|   262 |  the     | -1.4135 | 24.33%
|  1110 |  day     | -2.6088 | 7.36%
|   618 |  when    | -2.0095 | 13.41%
|   356 |  we      | -1.8593 | 15.58%
|   460 |  can     | -2.5082 | 8.14%
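
Since `normalize_logits=True` makes each score a log probability, the last two columns of the table are related by `exp`, and the scores can be summed to score the whole continuation. The sketch below reuses the five values printed above; only the variable names are my own.

```python
import math

# Log probabilities copied from the table above (" the", " day", " when", " we", " can").
logprobs = [-1.4135, -2.6088, -2.0095, -1.8593, -2.5082]

# Per-token probability: exp(log p), which reproduces the last column of the table.
probs = [math.exp(lp) for lp in logprobs]

# Probability of the full 5-token continuation given the prompt: exp(sum of log p).
joint_prob = math.exp(sum(logprobs))

# Perplexity of the continuation: exp(-mean log p).
continuation_ppl = math.exp(-sum(logprobs) / len(logprobs))

for lp, p in zip(logprobs, probs):
    print(f"{lp:.4f} -> {p:.2%}")
print(f"joint probability: {joint_prob:.3e}")
print(f"perplexity: {continuation_ppl:.2f}")
```

Summing in log space is the usual choice here, since multiplying many small probabilities directly underflows quickly for longer sequences.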


In [20]:
# print(generated_outputs)
# Leftover cell: `generated_outputs` is not defined in any cell shown here.

In [18]:
model = AutoModelForCausalLM.from_pretrained("KnutJaegersberg/gpt2-chatbot")
tokenizer = AutoTokenizer.from_pretrained("KnutJaegersberg/gpt2-chatbot")

config.json:   0%|          | 0.00/1.01k [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/43.4k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.96G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/1.27G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/119 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/476 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/798k [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/2.11M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/131 [00:00<?, ?B/s]

In [21]:
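# The backslash line continuations join these lines without newlines (see the echoed prompt below).
input_text = "### System:\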
You are an AI assistant. User will give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.\
### User: \
How do you fine tune a large language model? \
### Assistant:\
"
input_ids = tokenizer([input_text], return_tensors='pt')
streamer = TextStreamer(tokenizer)
_ = model.generate(**input_ids, streamer=streamer, max_new_tokens=100)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


### System:You are an AI assistant. User will give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.### User: How do you fine tune a large language model? ### Assistant:
Fine-tuning a large language model involves fine-tuning the model's parameters, such as the number of hidden layers, the number of hidden units, and the number of hidden units per layer.

Here are some steps you can take to fine-tune a large language model:

1. Train the model: Train the model on a large dataset of text data. This will help you fine-tune the model's parameters.
2. Fine-tune
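
The echoed prompt above runs the `### System:`, `### User:`, and `### Assistant:` sections together on one line, because a backslash at the end of a line inside a regular string literal removes the newline. If the model was trained with one section per line (an assumption worth checking against its model card), building the prompt with explicit `\n` characters may be closer to the intended template. A small sketch, where `build_prompt` is a hypothetical helper:

```python
# Backslash-newline inside a regular string literal joins the lines with no separator.
joined = "### System:\
You are an AI assistant."


def build_prompt(system, user):
    """Hypothetical helper: one section per line, ending with an open Assistant turn."""
    return f"### System:\n{system}\n### User:\n{user}\n### Assistant:\n"


prompt = build_prompt(
    "You are an AI assistant.",
    "How do you fine tune a large language model?",
)
print(repr(joined))
print(prompt)
```

The resulting `prompt` string can be passed to the tokenizer in place of `input_text`. Whether this changes the quality of the generation depends on how the checkpoint was fine-tuned.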
