In [None]:
# Load setup.py file
%load ../utils/setup.py
%run ../utils/setup.py

In [None]:
useGPU()

Have fun with this chapter!🥳


# Imports

In [None]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch



# About this notebook

In this notebook, you will explore different sampling methods and see how they change the output of a model. You will first apply them to GPT-2 and then try various sampling strategies with the Falcon 7B model. This practical exercise demonstrates how strongly the choice of sampling method can influence the model's output.

I encourage you to experiment with the different strategies! Try out how greedy search, beam search, top-k sampling, and top-p (nucleus) sampling change the output. Observe how each strategy affects the style and content of the generated text.

As highlighted in the book, adjust the temperature setting to see its effect on the model's creativity. Notice how a temperature closer to 1 tends to produce more varied outputs, while a temperature close to 0 makes the output more predictable.
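
To make the effect of temperature concrete, here is a minimal sketch in plain Python. It assumes the usual formulation in which the raw scores (logits) are divided by the temperature before the softmax. The function name `softmax_with_temperature` and the four toy logits are made up for illustration and are not taken from GPT-2 or Falcon.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    highest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for four candidate tokens (illustrative only)
toy_logits = [2.0, 1.0, 0.5, 0.1]

for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At a low temperature almost all of the probability mass lands on the highest-scoring token, so sampling behaves nearly like greedy search. As the temperature approaches 1, the lower-ranked tokens keep a noticeable share and are drawn more often.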

Also examine the differences in output between the methods. This will give you a deeper understanding of how each sampling technique guides the model's decision-making process.

This hands-on exploration complements the theoretical knowledge from the book, giving you a tangible feel for the nuances of each sampling method with GPT-2 and Falcon 7B.

# Deterministic Sampling

In [None]:
#@title Greedy Search
# Instantiate the model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Define an input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')


greedy_output = model.generate(
    input_ids,
    max_length=100
)


# Now we print the result
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
print(f"{tokenizer.decode(greedy_output[0], skip_special_tokens=True)}\n")
print("-" * 140, "\n")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Output:
In a world where AI has become ubiquitous, it's hard to imagine how it will ever be able to replace humans.

"We're going to have to see how we do it," said Dr. Michael S. Hirsch, a professor of psychology at the University of California, San Francisco. "We're going to have to see how we do it in a way that's not just a matter of human beings, but also of machines."

The AI revolution is already happening.

-------------------------------------------------------------------------------------------------------------------------------------------- 
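
The greedy cell above always picks the single most likely next token. As a rough sketch of that idea, the following toy example replaces the language model with a hand-written table of next-token probabilities. The table `NEXT_TOKEN_PROBS` and the function `greedy_decode` are hypothetical and only illustrate the selection rule, not how `generate` is implemented.

```python
# Hypothetical next-token probabilities (illustrative only, not from GPT-2)
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"robot": 0.5, "future": 0.3, "<eos>": 0.2},
    "a": {"robot": 0.7, "<eos>": 0.3},
    "robot": {"learns": 0.6, "<eos>": 0.4},
    "future": {"<eos>": 1.0},
    "learns": {"<eos>": 1.0},
}

def greedy_decode(start="<s>", max_steps=10):
    """Repeatedly append the most probable next token until <eos> or max_steps."""
    tokens = [start]
    for _ in range(max_steps):
        candidates = NEXT_TOKEN_PROBS[tokens[-1]]
        best = max(candidates, key=candidates.get)  # argmax over the candidates
        tokens.append(best)
        if best == "<eos>":
            break
    return tokens

print(greedy_decode())
```

Because there is no randomness involved, running the function twice yields the same sequence, which is why the greedy output of the notebook cell is reproducible.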



## Beam Search

![Google Drive Image](https://drive.google.com/uc?export=view&id=10iJOlzJt-1-bFWI2mxZ7UhbJ8RKL2Eco)

In [None]:
# Instantiate the model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Define an input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')


beam_outputs = model.generate(
    input_ids,
    max_length=100,
    num_beams=5,
    num_return_sequences=5,
    no_repeat_ngram_size=2,
    early_stopping=True
)


# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, beam_output in enumerate(beam_outputs):
    print(f"Output {i+1}:\n{tokenizer.decode(beam_output, skip_special_tokens=True)}\n")
    print("-" * 140, "\n")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Output:
Output 1:
In a world where AI has become ubiquitous, it's hard to imagine a better time to be a part of it than right now.

"I think we're going to see a lot more of that in the next few years," he said.

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 2:
In a world where AI has become ubiquitous, it's hard to imagine a better time to be a part of it than right now.

"I think we're going to see a lot more of that in the next few years," he says.

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 3:
In a world where AI has become ubiquitous, it's hard to imagine a better time to be a part of it than right now.

"I think we're going to see a lot more of that in the next few years," he said. "It's a very exciting time."

----------------------------------------------------------
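
Beam search keeps several partial sequences alive instead of committing to one token at a time. The sketch below shows the core loop on a hand-written toy model. `TOY_MODEL` and `beam_search` are hypothetical names, and the sketch leaves out extras such as `no_repeat_ngram_size` or length normalization that the cell above relies on.

```python
import math

# Hypothetical next-token probabilities (illustrative only, not from GPT-2)
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"robot": 0.4, "future": 0.35, "world": 0.25},
    "a": {"robot": 0.9, "future": 0.1},
    "robot": {"<eos>": 1.0},
    "future": {"<eos>": 1.0},
    "world": {"<eos>": 1.0},
}

def beam_search(num_beams=2, max_steps=5):
    """Return the num_beams best (log_probability, tokens) pairs, best first."""
    beams = [(0.0, ["<s>"])]
    for _ in range(max_steps):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == "<eos>":
                candidates.append((score, tokens))  # finished beams are carried over
                continue
            for token, prob in TOY_MODEL[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the highest-scoring partial sequences
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
        if all(tokens[-1] == "<eos>" for _, tokens in beams):
            break
    return beams

for score, tokens in beam_search(num_beams=2):
    print(round(math.exp(score), 2), " ".join(tokens))
```

With `num_beams=1` the function reduces to greedy search and ends on "the robot" with a total probability of 0.24. With two beams it also follows the less likely first token "a" and finds "a robot", whose total probability of 0.36 is higher.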

# Probabilistic Methods

## Top-k Sampling

![Google Drive Image](https://drive.google.com/uc?export=view&id=1n984J6XPmVi-b1uvfLiDOkNBgS11YrVd)

In [None]:
# Instantiate the model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Define an input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')


top_k_outputs = model.generate(
    input_ids,
    max_length=400,
    num_return_sequences=5,
    do_sample=True,
    top_k=50,
    temperature=1.5
)


# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, top_k_output in enumerate(top_k_outputs):
    print(f"Output {i+1}:\n{tokenizer.decode(top_k_output, skip_special_tokens=True)}\n")
    print("-" * 140, "\n")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Output:
Output 1:
In a world where AI has become ubiquitous, humans may well become one giant animal to rule the world and dominate the rest. And as robots make billions faster, the chance of one living soul being captured within or on board is dwindling each second. It means we really shouldn't expect to be there first thing in the morning, because when it comes time not just for food though, but for our physical wellbeing, it truly is very worth it.

(Images Credit: Shutterstock, Shutterstock)

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 2:
In a world where AI has become ubiquitous, we could soon learn to live with them as technology advances. Perhaps in just twenty-twenty years, AI would probably need much help in learning which emotions it shows in its human form? That's why the latest findings from artificial intelligences—such as human emotions itself. A lot easier than g
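
The `top_k=50` setting in the cell above restricts each sampling step to the 50 most likely tokens. A minimal sketch of that filtering step, using a made-up five-token distribution and the hypothetical helper `top_k_sample`, could look like this:

```python
import random

def top_k_sample(probs, k, rng):
    """Draw one token from the k most likely entries of a {token: probability} dict."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens = [token for token, _ in top]
    weights = [p for _, p in top]  # random.choices treats these as relative weights
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical next-token distribution (illustrative only)
toy_probs = {"humans": 0.40, "robots": 0.25, "we": 0.20, "cats": 0.10, "zebra": 0.05}

rng = random.Random(0)  # fixed seed so the experiment can be repeated
samples = [top_k_sample(toy_probs, k=3, rng=rng) for _ in range(1000)]
print({token: samples.count(token) for token in toy_probs})
```

Tokens outside the top k are never drawn, no matter how often you sample. Setting `k=1` removes all randomness and gives the same choice as greedy search.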

## Nucleus (Top-p) Sampling

![Google Drive Image](https://drive.google.com/uc?export=view&id=1pJM6jtIO29qOx2JTPAKHLxSqzjtRowbZ)


In [None]:
# Instantiate the model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Define an input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate text output with nucleus sampling
nucleus_outputs = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    top_p=0.92,  # Set p for nucleus sampling
    num_return_sequences=5  # How many outputs to generate
)

# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, output in enumerate(nucleus_outputs):
    print(f"Output {i+1}:\n{tokenizer.decode(output, skip_special_tokens=True)}\n")
    print("-" * 140, "\n")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Output:
Output 1:
In a world where AI has become ubiquitous, this can only mean the cost of human labour, rather than the benefits. It also means that the cost of learning is more than sufficient to make even an average human job feasible. This is one reason why China's AI might not be the most innovative, or even the most secure, of the five leading AI companies currently competing in the U.S. in 2015, but a key reason why it would be a boon for the world.

The

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 2:
In a world where AI has become ubiquitous, many of us are familiar with the concept of artificial intelligence, or AI. And when it comes to the real world, we are often surprised by our own intelligence, as AI is simply too big for our own good.

For many people this is a great time to take a step back and consider the reality that most of us will be less able to learn. An
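
Unlike top-k, nucleus sampling does not fix the number of candidates. It keeps the smallest set of tokens whose cumulative probability reaches `top_p`. The sketch below illustrates this with two made-up distributions. The helpers `nucleus_filter` and `nucleus_sample` are hypothetical names, not part of the `transformers` API.

```python
import random

def nucleus_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    kept = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}  # renormalize

def nucleus_sample(probs, top_p, rng):
    """Draw one token from the renormalized nucleus."""
    kept = nucleus_filter(probs, top_p)
    return rng.choices(list(kept), weights=list(kept.values()), k=1)[0]

# Hypothetical distributions (illustrative only): one flat, one sharply peaked
flat_probs = {"humans": 0.40, "robots": 0.25, "we": 0.20, "cats": 0.10, "zebra": 0.05}
peaked_probs = {"the": 0.95, "a": 0.03, "an": 0.02}

print(list(nucleus_filter(flat_probs, top_p=0.92)))    # four tokens survive
print(list(nucleus_filter(peaked_probs, top_p=0.92)))  # only one token survives
print(nucleus_sample(flat_probs, top_p=0.92, rng=random.Random(0)))
```

The size of the candidate set adapts to the model's confidence: a flat distribution keeps many tokens, while a peaked one keeps very few.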

## Temperature Sampling

![Google Drive Image](https://drive.google.com/uc?export=view&id=142UUXEJh32oKF90iBYN5jZvVE42pUqJI)


In [None]:
# Instantiate the model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Define an input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate text output with temperature sampling
temperature_outputs = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    temperature=0.7,  # Control randomness with the temperature parameter
    num_return_sequences=5  # How many outputs to generate
)

# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, output in enumerate(temperature_outputs):
    print(f"Output {i+1}:\n{tokenizer.decode(output, skip_special_tokens=True)}\n")
    print("-" * 140, "\n")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Output:
Output 1:
In a world where AI has become ubiquitous, and the number of people using it has increased, the number of things we use for our business is growing rapidly—a few billion of us are already working on the AI platform," says Michael Oster, a professor at the University of California, Los Angeles (UCL) and a former consultant to the National Security Agency. "In the U.S., the number of jobs that are being filled by AI is increasing at an exponential rate. Now you

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 2:
In a world where AI has become ubiquitous, one can almost feel like we're not really living in a utopia, but one where it's possible to achieve a level of autonomy that's not as constrained by the rules of human behavior.

The new game in the series, called "The Last of Us" will be the first title in the series to be a fully-fledged multiplayer experience. I

In [None]:
# Instantiate the model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Define an input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate text output with temperature sampling
temperature_outputs = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    temperature=0.7,  # Control randomness with the temperature parameter
    num_return_sequences=5,  # How many outputs to generate
    eos_token_id=tokenizer.eos_token_id,  # Stop generation at EOS token
    pad_token_id=tokenizer.eos_token_id   # GPT-2 defines no pad token, so reuse EOS to pad sequences that stop early
)

# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, output in enumerate(temperature_outputs):
    generated_text = tokenizer.decode(output, skip_special_tokens=True).strip()
    print(f"Output {i+1}:\n{generated_text}\n")
    print("-" * 140, "\n")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Output:
Output 1:
In a world where AI has become ubiquitous, the technology that makes it possible to do anything at all is still a mystery.

But with the help of the Internet, we can build computers that can perform complex tasks at a fraction of the cost of a human hand. This is an exciting time for the robot industry.

The company has launched its first robot, the robot that won the first prize of the 2011 International Robotics Championship. In a press release, Robot Robotics, the company behind

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 2:
In a world where AI has become ubiquitous, it becomes even more important to preserve a sense of control.

In this paper, we focus on this concept through the use of a network of high-level abstraction and in-depth analysis of the relationship between the network, hardware, and software. We present a model that uses the concept of "net

In [None]:
# Instantiate the model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Define an input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate text output with temperature sampling
temperature_outputs = model.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    temperature=0.7,
    num_return_sequences=5
)

# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, output in enumerate(temperature_outputs):
    # Keep special tokens so the EOS marker is still visible in the decoded text
    generated_text = tokenizer.decode(output, skip_special_tokens=False).strip()
    if tokenizer.eos_token in generated_text:
        # The model stopped on its own: keep everything before the first EOS token
        generated_text = generated_text.split(tokenizer.eos_token)[0].strip()
    elif '. ' in generated_text:
        # Otherwise, trim to the last complete sentence for coherence
        generated_text = generated_text.rsplit('. ', 1)[0] + '.'
    print(f"Output {i+1}:\n{generated_text}\n")
    print("-" * 140, "\n")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Output:
Output 1:
In a world where AI has become ubiquitous, we should think twice about the use of technology in our daily lives, especially what we might call "the 'digital age.'"

When we can't rely on the Internet to make transactions, we use technology to do things we don't want to do. That's why we've been so fascinated by the Internet's ability to give us a sense of control over our lives, through our interactions with websites and apps. If we don't want to be tracked and monitored, that's where our technology comes in. We use that to make our lives more comfortable and less stressful.

When we can't rely on the Internet to make transactions, we use technology to do things we don't want to do.

The Internet offers a few options for dealing with your online life. These include:

Online privacy: When you're offline for long periods of time, you can easily find and change your private information.

--------------------------------------------------------------------------------

In [None]:
# Instantiate the model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Define an input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate text output with temperature sampling
temperature_outputs = model.generate(
    input_ids,
    max_length=400,  # More generous max_length
    do_sample=True,
    temperature=0.7,
    num_return_sequences=5,
    eos_token_id=tokenizer.eos_token_id
)

# Post-process outputs: Trim after the last full stop before a desired length
def trim_output(text, desired_max_length=250):
    if len(text) <= desired_max_length:
        return text
    last_full_stop = text.rfind('.', 0, desired_max_length)
    if last_full_stop == -1:
        return text[:desired_max_length]  # Fallback: truncate at desired_max_length
    return text[:last_full_stop + 1]

# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, output in enumerate(temperature_outputs):
    full_output = tokenizer.decode(output, skip_special_tokens=True)
    trimmed_output = trim_output(full_output)
    print(f"Output {i+1}:\n{trimmed_output}\n")
    print("-" * 140, "\n")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Output:
Output 1:
In a world where AI has become ubiquitous, it's often hard to know how to predict how the future will look.

But at least one research team has taken a step forward and has published its findings in the journal Nature Neuroscience.

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 2:
In a world where AI has become ubiquitous, there's little doubt that humans will be the driving force behind the next step in artificial intelligence, according to a new paper in the American Journal of Psychology.

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 3:
In a world where AI has become ubiquitous, we need more of it.

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 4:
In a w

# Using Falcon 7B with Sampling Methods

Here you can try out the Falcon 7B model with different combinations of sampling settings. This should give you an idea of how the output of your chosen model can vary with the sampling strategy. Take a close look at the outputs of the LLM and examine how they differ from one cell to the next. As mentioned in the book, the closer the temperature parameter is set to 1, the more "creative" the model will be.
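
The Falcon cells in this section combine `temperature` with `top_k`. The following sketch shows how the two settings can interact in a single sampling step. It assumes the logits are scaled by the temperature first and filtered to the top k afterwards. The helper `sample_token` and the toy logits are hypothetical and do not come from Falcon.

```python
import math
import random

def sample_token(logits, temperature, top_k, rng):
    """Scale {token: logit} by the temperature, keep the top_k entries, then sample."""
    scaled = {token: value / temperature for token, value in logits.items()}
    top = sorted(scaled.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    highest = top[0][1]  # subtract the maximum for numerical stability
    weights = [math.exp(value - highest) for _, value in top]
    return rng.choices([token for token, _ in top], weights=weights, k=1)[0]

# Hypothetical scores for five candidate tokens (illustrative only)
toy_logits = {"it": 2.0, "we": 1.5, "the": 1.2, "humans": 0.8, "robots": 0.3}

for temperature in (0.1, 0.7):
    rng = random.Random(0)  # same seed for both settings
    draws = [sample_token(toy_logits, temperature, top_k=3, rng=rng) for _ in range(1000)]
    print(f"temperature={temperature}:", {token: draws.count(token) for token in toy_logits})
```

At a temperature of 0.1 nearly every draw is the top-scoring token, so repeated generations look almost identical. At 0.7 the other two surviving candidates are chosen regularly, which shows up as more variety between the returned sequences.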

In [None]:
# Model and tokenizer
model_name = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Define the input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate text output with temperature sampling
outputs = model.generate(
    input_ids,
    max_length=200,  # max_length to match the desired output
    do_sample=True,
    temperature=0.7,  # Control randomness with the temperature parameter
    top_k=10,
    num_return_sequences=5,  # How many outputs to generate
    eos_token_id=tokenizer.eos_token_id,  # Stop generation at EOS token
    pad_token_id=tokenizer.eos_token_id  # No pad token is defined for this tokenizer, so reuse EOS to pad sequences that stop early
)

# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, output in enumerate(outputs):
    generated_text = tokenizer.decode(output, skip_special_tokens=True).strip()
    print(f"Output {i+1}:\n{generated_text}\n")
    print("-" * 140, "\n")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.
The current implementation of Falcon calls `torch.scaled_dot_product_attention` directly, this will be deprecated in the future in favor of the `BetterTransformer` API. Please install the latest optimum library with `pip install -U optimum` and call `model.to_bettertransformer()` to benefit from `torch.scaled_dot_product_attention` and future performance optimizations.


Output:
Output 1:
In a world where AI has become ubiquitous, where the “smart” and “connected” have become a reality, the 4th industrial revolution is taking shape.
It’s a revolution that promises to change how we work, live and play and to transform the global economy.
It’s a revolution that will impact every corner of our world.
It’s a revolution that will have profound impacts on our society.
It’s a revolution that has already begun to transform the way we do business.
And it’s a revolution that will be the catalyst for an entirely new set of challenges to face the world.
What are the main challenges the 4th Industrial Revolution will present to us?
The main challenge is that we don’t know what these new technologies will look like or how they will work.
The second challenge is that the technologies that will transform our lives will be so radically different from what we’ve come to expect.

----------------------------------------------------------------------------------------

In [None]:
# Model and tokenizer
model_name = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Define the input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate text output with temperature sampling
outputs = model.generate(
    input_ids,
    max_length=100,  # Adjusted max_length to match the desired output
    do_sample=True,
    temperature=0.1,  # Control randomness with the temperature parameter
    top_k=10,
    num_return_sequences=5,  # How many outputs to generate
    eos_token_id=tokenizer.eos_token_id,  # Stop generation at EOS token
    pad_token_id=tokenizer.eos_token_id  # No pad token is defined for this tokenizer, so reuse EOS to pad sequences that stop early
)

# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, output in enumerate(outputs):
    generated_text = tokenizer.decode(output, skip_special_tokens=True).strip()
    print(f"Output {i+1}:\n{generated_text}\n")
    print("-" * 140, "\n")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


Output:
Output 1:
In a world where AI has become ubiquitous, it’s easy to forget that it’s still a relatively new technology.
In fact, it’s only been around for about 20 years.
But that doesn’t mean it’s not a powerful tool.
In fact, AI is one of the most powerful tools we have.
It can help us solve problems, make decisions, and even predict the future.
But it’s also a tool that can be used

-------------------------------------------------------------------------------------------------------------------------------------------- 

Output 2:
In a world where AI has become ubiquitous, it is important to understand the impact of AI on the future of work. AI is already being used in many industries, including healthcare, finance, and manufacturing. It is also being used in the home, where it can help with tasks such as cooking and cleaning. In the future, AI will likely be used in more industries, including transportation and education.
AI is already being used in many industries, inc

In [None]:

# Instantiate the model and tokenizer
model_name = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Define an input prompt
prompt = "In a world where AI has become ubiquitous,"

# Encode the input prompt and prepare it for the model
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate text output
outputs = model.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    temperature=0.7,
    num_return_sequences=5
)

# Print the resulting sentences
print("\033[1m" + "Output:\n" + 140 * '=' + "\033[0m")
for i, output in enumerate(outputs):
    # Keep special tokens so the EOS marker is still visible in the decoded text
    generated_text = tokenizer.decode(output, skip_special_tokens=False).strip()
    eos_token = tokenizer.eos_token
    if eos_token and eos_token in generated_text:
        # The model stopped on its own: keep everything before the first EOS token
        generated_text = generated_text.split(eos_token)[0].strip()
    elif '. ' in generated_text:
        # Otherwise, trim to the last complete sentence for coherence
        generated_text = generated_text.rsplit('. ', 1)[0] + '.'
    print(f"Output {i+1}:\n{generated_text}\n")
    print("-" * 140, "\n")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


Output:
Output 1:
In a world where AI has become ubiquitous, it is possible to develop AI systems that work well in many different circumstances. The problem is that these systems are often designed to work in one particular environment, and they may not work well in a different environment. The best solution is to develop AI systems that can work in any environment, and that is what we are trying to do with our new research, called Meta-AI.
We are working on a new approach to AI, called Meta-AI. This approach is based on the idea that AI systems should be able to learn from their experiences in the world. In other words, they should be able to learn from their own actions and from the actions of others.
This approach is very different from the traditional AI approach that we have all been taught. In the traditional approach, AI systems are designed to learn from their own behaviors, and they are not designed to learn from the behaviors of others.

---------------------------------