In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import warnings
warnings.filterwarnings("ignore")

### GPT-2

In [2]:
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Explicitly setting pad token
model = AutoModelForCausalLM.from_pretrained(model_name)
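
GPT-2 ships without a pad token, which is why the cell above reuses the end-of-text token. Padding only matters once sequences of different lengths are batched, and an attention mask then tells the model which positions hold real tokens. The following is a minimal plain-Python sketch of that idea. `pad_batch` is a hypothetical helper (not a `transformers` function) and the token ids are made up. It pads on the left by default, which is the usual choice for decoder-only generation.

```python
def pad_batch(sequences, pad_id, side="left"):
    """Pad token-id lists to equal length and build matching attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        pads, ones = [pad_id] * n_pad, [1] * len(seq)
        if side == "left":
            input_ids.append(pads + list(seq))
            attention_mask.append([0] * n_pad + ones)
        else:
            input_ids.append(list(seq) + pads)
            attention_mask.append(ones + [0] * n_pad)
    return input_ids, attention_mask

# 50256 is GPT-2's end-of-text id, reused here as the pad id.
ids, mask = pad_batch([[11, 12, 13], [21]], pad_id=50256)
print(ids)   # [[11, 12, 13], [50256, 50256, 21]]
print(mask)  # [[1, 1, 1], [0, 0, 1]]
```

In `transformers`, the tokenizer call already returns an `attention_mask` next to `input_ids`, and `generate` accepts it as a keyword argument.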

In [3]:
import textwrap

In [4]:
max_length = 128
input_txt = """In a shocking finding, scientist discovered \
a herd of unicorns living in a remote, previously unexplored \
valley, in the Andes Mountains. Even more surprising to the \
researchers was the fact that the unicorns spoke perfect English.\n\n
"""
input_ids = tokenizer(input_txt, padding=True, return_tensors="pt")["input_ids"]
output_greedy = model.generate(input_ids, max_length=max_length, do_sample=False)
print(textwrap.fill(tokenizer.decode(output_greedy[0]), width=100))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously
unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that
the unicorns spoke perfect English.   "The unicorns were very intelligent, and they were very
intelligent," said Dr. David S. Siegel, a professor of anthropology at the University of California,
Berkeley. "They were very intelligent, and they were very intelligent, and they were very
intelligent, and they were very intelligent, and they were very intelligent, and they were very
intelligent, and they were very intelligent, and they were very
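
The repetition in the greedy output follows from the strategy itself: at every step the single most probable token is appended, so once the model enters a high-probability loop it never leaves it. A toy sketch of this behavior, assuming a made-up bigram table (`NEXT` is hypothetical; a real language model conditions on the whole prefix):

```python
# Hypothetical next-token probabilities, keyed by the previous token only.
NEXT = {
    "they": {"were": 0.9, "spoke": 0.1},
    "were": {"very": 0.8, "able": 0.2},
    "very": {"intelligent,": 0.7, "shy,": 0.3},
    "intelligent,": {"and": 0.6, "said": 0.4},
    "and": {"they": 0.9, "then": 0.1},
}

def greedy_decode(start, max_new_tokens):
    """Always append the single most probable next token."""
    tokens = [start]
    for _ in range(max_new_tokens):
        probs = NEXT.get(tokens[-1])
        if probs is None:  # no known continuation: stop early
            break
        tokens.append(max(probs, key=probs.get))
    return tokens

print(" ".join(greedy_decode("they", 9)))
# they were very intelligent, and they were very intelligent, and
```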


In [5]:
output_beam = model.generate(input_ids, max_length=max_length, num_beams=5, do_sample=False)
print(textwrap.fill(tokenizer.decode(output_beam[0]), width=100))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously
unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that
the unicorns spoke perfect English.   The researchers, from the University of California, San Diego,
and the University of California, Santa Cruz, found that the unicorns were able to communicate with
each other in a way that was similar to that of human speech.   "The unicorns were able to
communicate with each other in a way that was similar to that of human speech," said study co-lead
author Dr. David J.
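
Beam search keeps several candidate continuations alive instead of committing to the best token at each step, so it can find a sequence whose overall probability is higher than the greedy one. The sketch below illustrates this on a made-up three-entry model (`MODEL` is hypothetical). It omits end-of-sequence handling and length penalties, which real implementations include.

```python
import math

# Hypothetical toy model: next-token probabilities keyed by the full prefix.
MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.4, "dog": 0.3, "end": 0.3},
    ("a",): {"unicorn": 0.9, "end": 0.1},
}

def beam_search(num_beams, steps):
    """Keep the num_beams highest-scoring prefixes after every step."""
    beams = [((), 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for prefix, score in beams:
            for token, p in MODEL.get(prefix, {}).items():
                candidates.append((prefix + (token,), score + math.log(p)))
        if not candidates:  # nothing left to expand
            break
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:num_beams]
    return beams

greedy = beam_search(num_beams=1, steps=2)[0]
best = beam_search(num_beams=2, steps=2)[0]
print(greedy[0], round(math.exp(greedy[1]), 2))  # ('the', 'cat') 0.24
print(best[0], round(math.exp(best[1]), 2))      # ('a', 'unicorn') 0.36
```

With one beam the search is greedy and settles on the locally best first token; with two beams the lower-ranked first token survives long enough to win.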


In [11]:
output_beam = model.generate(input_ids, max_length=max_length, num_beams=5, do_sample=False, no_repeat_ngram_size=2)
print(textwrap.fill(tokenizer.decode(output_beam[0]), width=100))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously
unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that
the unicorns spoke perfect English.   The researchers, from the University of California, San Diego,
and the National Science Foundation (NSF) in Boulder, Colorado, were able to translate the words of
the unicorn into English, which they then translated into Spanish.  "This is the first time that we
have translated a language into an English language," said study co-author and NSF professor of
linguistics and evolutionary biology Dr.
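
Setting `no_repeat_ngram_size=2` forbids any two-token sequence from appearing twice. Conceptually, before each step the decoder looks up which tokens would complete an n-gram that already occurs in the text and excludes them. A minimal sketch of that lookup, assuming whitespace-separated words stand in for tokens (`banned_next_tokens` is a hypothetical helper, not the library's implementation):

```python
def banned_next_tokens(tokens, ngram_size):
    """Return tokens that would repeat an n-gram already present in `tokens`."""
    if ngram_size < 1 or len(tokens) < ngram_size - 1:
        return set()
    # The last (n - 1) tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (ngram_size - 1):])
    banned = set()
    for i in range(len(tokens) - ngram_size + 1):
        ngram = tuple(tokens[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned

history = "they were very intelligent and they".split()
print(banned_next_tokens(history, ngram_size=2))  # {'were'}
```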


In [12]:
output_temp = model.generate(input_ids, max_length=max_length, do_sample=True, temperature=2.0, top_k=50)
print(textwrap.fill(tokenizer.decode(output_temp[0]), width=100))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously
unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that
the unicorns spoke perfect English.   An archaeotherms of Peru says you'll never stop hearing new
noises and noises, but just after being driven off a trail near the Orane Mountains, they spotted
something completely alien.   Somehow humans came home together after a hunting game they played
(that is one of the few examples where a human actually used that field field field):  But that will
take an eternity or worse…. but
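
Temperature rescales the logits before the softmax: values above 1 flatten the distribution, which explains the incoherent text at `temperature=2.0`, while values below 1 sharpen it. `top_k` then restricts sampling to the k highest-scoring tokens. A small sketch with made-up logits (`softmax` and `sample_top_k` are illustrative helpers, not library functions):

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_top_k(logits, k, temperature=1.0, rng=random):
    """Sample an index from the k highest-scoring logits only."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top], temperature)
    return rng.choices(top, weights=probs, k=1)[0]

logits = [4.0, 2.0, 1.0, 0.0]  # made-up scores for a four-token vocabulary
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax(logits, t)])
print(sample_top_k(logits, k=2, temperature=2.0, rng=random.Random(0)))
```

The printed rows show the probability of the top token shrinking as the temperature grows, while the last line can only ever return index 0 or 1.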


In [13]:
output_topp = model.generate(input_ids, max_length=max_length, do_sample=True, top_p=0.90)
print(textwrap.fill(tokenizer.decode(output_topp[0]), width=100))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously
unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that
the unicorns spoke perfect English.   When researchers brought the unicorns to Bolivia, they found
that they spoke the same vocabulary. This finding is quite shocking. First of all, the scientists
knew nothing about the ancient language of the natives of Peru. Secondly, the unicorns are quite
similar to the traditional Latin-language spoken by people of the Amazon region. And yet the
researchers discovered that the unicorns themselves did not have a specific language.
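
Nucleus (top-p) sampling keeps the smallest set of tokens whose cumulative probability reaches `top_p` and samples only from that set, so the cutoff adapts to how peaked the distribution is. A minimal sketch of the filtering step on a made-up distribution (`top_p_filter` is a hypothetical helper):

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalize so the surviving probabilities sum to 1.
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]  # made-up distribution over four tokens
nucleus = top_p_filter(probs, top_p=0.90)
print(sorted(nucleus))  # [0, 1, 2]
```

With `top_p=0.90` the least likely token is dropped here, and the remaining three are renormalized before sampling.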


### LLaMA-based Text Generation

In [14]:
from transformers import LlamaTokenizer, LlamaForCausalLM

In [15]:
model_path = 'openlm-research/open_llama_3b'
# model_path = 'openlm-research/open_llama_7b'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto',
)
prompt = 'Q: Which is the largest animal?\nA:'
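# Keep the prompt on the same device as the model (device_map='auto' may select a GPU).
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)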
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=300
)
print(tokenizer.decode(generation_output[0]))

You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565


<s>Q: Which is the largest animal?
A: The blue whale.
Q: Which is the smallest animal?
A: The mosquito.
Q: Which is the fastest animal?
A: The cheetah.
Q: Which is the strongest animal?
A: The lion.
Q: Which is the most intelligent animal?
A: The human.
Q: Which is the most dangerous animal?
A: The tiger.
Q: Which is the most dangerous animal in the world?
A: The lion.
Q: Which is the most dangerous animal in the world?
A: The tiger.
Q: Which is the most dangerous animal in the world?
A: The lion. The tiger is the most dangerous animal in the world.
Q: Which is the most dangerous animal in the world?
A: The tiger is the most dangerous animal in the world.
Q: Which is the most dangerous animal in the world?
A: The tiger is the most dangerous animal in the world. The tiger is the most dangerous animal in the world.
Q: Which is the most dangerous animal in the world?
A: The tiger is the most dangerous animal in the world. The tiger is the most dangerous animal in the world.
Q: Which is th
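
The base model answers the question and then keeps inventing further question-answer pairs until `max_new_tokens` is exhausted. One simple remedy is to post-process the decoded string and cut it at the next `Q:`. The sketch below assumes the decoded text starts with the `<s>` marker and the prompt, as in the output above. `first_answer` is a hypothetical helper.

```python
def first_answer(decoded, prompt, stop="\nQ:", bos="<s>"):
    """Return only the text generated for the first answer."""
    text = decoded
    # Drop the leading BOS marker and the echoed prompt, if present.
    for prefix in (bos, prompt):
        if text.startswith(prefix):
            text = text[len(prefix):]
    # Cut at the first stop string; keep everything if it never appears.
    cut = text.find(stop)
    if cut != -1:
        text = text[:cut]
    return text.strip()

decoded = (
    "<s>Q: Which is the largest animal?\nA: The blue whale.\n"
    "Q: Which is the smallest animal?\nA: The mosquito."
)
print(first_answer(decoded, "Q: Which is the largest animal?\nA:"))
# The blue whale.
```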

In [None]:
prompt = 'Q: When did Pakistan get independence?\nA:'
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=300
)
print(tokenizer.decode(generation_output[0]))

### Falcon

In [None]:
model_path = "tiiuae/falcon-7b-instruct"  # Technology Innovation Institute (UAE)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto', trust_remote_code=True,
)
prompt = 'Q: Which is the largest animal?\nA:'
# Keep the prompt on the same device as the model (device_map='auto' may select a GPU).
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
generation_output = model.generate(input_ids=input_ids, max_new_tokens=32)
print(tokenizer.decode(generation_output[0]))


In [None]:
prompt = 'Q: When did Pakistan get independence?\nA:'
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=300)
print(tokenizer.decode(generation_output[0]))

In [None]:
text = "Karachi, Islamabad and Lahore are a few of the major cities in Pakistan."
query = "identify the entities in the following text: "
prompt = query + "\n" + text
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=300)
print(tokenizer.decode(generation_output[0]))

In [None]:
text = """Dear Amazon, last week I ordered an Optimus Prime action figure from your online store in Germany. 
Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead! 
As a lifelong enemy of the Decepticons, I hope you can understand my dilemma. 
To resolve the issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered. 
Enclosed are copies of my records concerning this purchase. I expect to hear from you soon. Sincerely, Bumblebee."""

In [None]:
query = "identify the entities and their types (use standard NER terminology) in the following text: "
prompt = query + "\n" + text
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=800)
print(tokenizer.decode(generation_output[0]))

In [None]:
query = "summarize the following text: "
prompt = query + "\n" + text
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=600)
print(tokenizer.decode(generation_output[0]))

In [None]:
query = "Describe the sentiment of the following text as either positive, negative or neutral: "
prompt = query + "\n" + text
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=600)
print(tokenizer.decode(generation_output[0]))
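
The NER, summarization, and sentiment cells differ only in the instruction placed before the text. A small sketch of how that pattern could be factored out, using the same instruction strings and the same `query + "\n" + text` layout (`TASKS` and `build_prompt` are hypothetical names, and the sample text is made up):

```python
# Instruction strings copied from the cells above, keyed by a short task name.
TASKS = {
    "ner": "identify the entities and their types (use standard NER terminology) in the following text: ",
    "summary": "summarize the following text: ",
    "sentiment": "Describe the sentiment of the following text as either positive, negative or neutral: ",
}

def build_prompt(task, text):
    """Join a task instruction and the input text with a newline."""
    return TASKS[task] + "\n" + text

sample = "I ordered Optimus Prime but received Megatron."
print(build_prompt("summary", sample))
```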

In [None]:
response = "Dear Bumblebee, I am sorry to hear that your order was mixed up. "
prompt = text + "\n\nCustomer service response:\n" + response

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=900)
print(tokenizer.decode(generation_output[0]))
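
For decoder-only models, `generate` returns the prompt tokens followed by the continuation, so decoding `generation_output[0]` prints the customer letter again before the reply. Slicing off the first `len(prompt)` ids keeps only the new text. The sketch below shows the slicing on plain lists with made-up ids (`new_tokens_only` is a hypothetical helper). The same indexing should apply to the tensors in the cell above, for example `generation_output[0][input_ids.shape[1]:]`.

```python
def new_tokens_only(output_ids, prompt_length):
    """Drop the echoed prompt ids and keep only the generated continuation."""
    return output_ids[prompt_length:]

prompt_ids = [101, 102, 103]             # made-up ids for the prompt
output_ids = prompt_ids + [7, 8, 9, 10]  # prompt followed by generated ids
print(new_tokens_only(output_ids, len(prompt_ids)))  # [7, 8, 9, 10]
```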