## Inference with PEFT

### Loading a Saved PEFT Model

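```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import AutoPeftModelForCausalLM

# GPT-2 has no dedicated padding token, so reuse the end-of-sequence token.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Base model, loaded for a side-by-side comparison.
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Reads the adapter config saved under "gpt2-lora", loads the base model it
# names, and attaches the saved LoRA weights.
lora_model = AutoPeftModelForCausalLM.from_pretrained("gpt2-lora")
```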

### Generating Text from a PEFT Model

```python
inputs = tokenizer("Hello, my name is ", return_tensors="pt", padding=True)

# Shared settings so both models are compared under identical conditions.
# do_sample is not set, so decoding is greedy and top_k has no effect here.
generation_kwargs = dict(
    max_new_tokens=50,
    num_return_sequences=1,
    no_repeat_ngram_size=2,
    top_k=50,
)

# Base GPT-2. **inputs passes input_ids and attention_mask as keywords.
outputs = model.generate(**inputs, **generation_kwargs)
print(tokenizer.batch_decode(outputs))

# GPT-2 with the LoRA adapter attached.
lora_outputs = lora_model.generate(**inputs, **generation_kwargs)
print(tokenizer.batch_decode(lora_outputs))
```

```text
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.

["Hello, my name is _____. I am a student at the University of California, Berkeley.\n\nI am an English teacher. My name was ____. And I'm a professor at a university in the United States. So I have a lot of experience in teaching"]
['Hello, my name is !!"!#!$!%!&!\'!(!)!*!+!,!-!.!/!0!1!2!3!4!5!6!7!8!9!']
```
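
The adapter output above is not random noise; it follows a regular pattern. The following is a minimal sketch of one possible reading, under the assumption that the adapter model's logits carried no usable signal (for example all equal, or non-finite), so that greedy decoding always falls back to the lowest token id not banned by `no_repeat_ngram_size=2`. The helper `simulate_uninformative_greedy`, the placeholder prompt ids, and the id-to-character mapping (GPT-2's byte-level vocabulary is assumed to assign ids 0 to 93 to the printable ASCII characters `!` through `~`) are illustrative and not taken from the notebook.

```python
def simulate_uninformative_greedy(prompt_ids, max_new_tokens):
    """Greedy decoding with flat logits and a no-repeat-bigram constraint."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        last = ids[-1]
        # Tokens that already followed `last` would repeat a bigram, so ban them.
        banned = {nxt for prev, nxt in zip(ids, ids[1:]) if prev == last}
        # Assumption: with flat logits, argmax resolves to the lowest unbanned id.
        token = 0
        while token in banned:
            token += 1
        ids.append(token)
    return ids[len(prompt_ids):]


# Placeholder prompt ids (hypothetical); only their being outside 0..93 matters.
prompt_ids = [1000, 1001, 1002, 1003, 1004, 1005]
new_ids = simulate_uninformative_greedy(prompt_ids, max_new_tokens=50)

# Assumed mapping: GPT-2 token id k (0 <= k < 94) decodes to chr(33 + k).
simulated_text = "".join(chr(33 + k) for k in new_ids)
print(simulated_text)
```

The printed string matches the 50 characters generated after the prompt in the adapter output, which suggests the problem lies in the adapter model's logits rather than in the decoding settings. It does not identify the cause; inspecting the logits of a single forward pass of `lora_model` for NaN or constant values would be a reasonable next check.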

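
Both `generate` calls pass `top_k=50` without `do_sample=True`, so decoding is greedy. The sketch below uses hypothetical logits and a hypothetical `top_k_filter` helper (not a library function) to illustrate why top-k filtering could not change a greedy choice even if it were applied: the highest-scoring token always survives the filter.

```python
import math


def top_k_filter(logits, k):
    """Keep the k largest logits and set the rest to -inf."""
    if k >= len(logits):
        return list(logits)
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else -math.inf for x in logits]


def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical next-token logits over a six-token vocabulary.
logits = [0.1, 2.3, -1.0, 1.7, 0.4, 2.0]
filtered = top_k_filter(logits, k=3)

# The greedy pick is identical before and after filtering.
print(argmax(logits), argmax(filtered))
```

For `top_k` to influence the generations in this section, `do_sample=True` would also have to be passed to `generate`; the outputs would then vary between runs.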