# LAB GenAI - LLMs - OpenAI GPT API Exercises

## 1. Basic Conversation
**Exercise:** Create a simple chatbot that can answer basic questions about a given topic (e.g., history, technology).  
**Parameters to explore:** `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `n`, `stop`.

Comment on what happens when you change the parameters
(read the documentation!)

## **Installing Required Libraries**

In this step, we install the necessary libraries for text generation tasks:

1. **Transformers**: The Hugging Face library that provides pre-trained models for NLP tasks.
2. **PyTorch**: A deep learning framework that supports model training and inference.

These libraries are essential for loading and using GPT-Neo for text generation.


In [None]:
# Install the Hugging Face Transformers library for text generation
!pip install transformers

# Install PyTorch to support GPT-Neo and other models
!pip install torch



## **Initializing the Chatbot Model**

In this step, we create a text generation pipeline using **GPT-Neo 1.3B**, an open-source language model from EleutherAI that is hosted on the Hugging Face Hub.

- **Pipeline**: Simplifies model usage by wrapping model loading and inference.
- **Flexible Chatbot Function**: We define a function `basic_chatbot()` that allows parameter tuning for generating diverse responses.

The chatbot leverages GPU (`cuda:0`) for faster processing.


In [None]:
from transformers import pipeline
import warnings

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore")

# Create a text generation pipeline using GPT-Neo with PyTorch
chatbot = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B", framework="pt")

# Define a flexible chatbot function to accept more generation parameters
def basic_chatbot(prompt, max_length=100, num_return_sequences=1, temperature=1.0, top_p=1.0, repetition_penalty=1.0):
    response = chatbot(
        prompt,
        max_length=max_length,
        num_return_sequences=num_return_sequences,
        temperature=temperature,
        top_p=top_p,
        repetition_penalty=repetition_penalty,
        truncation=True  # To handle truncation warnings
    )
    return response[0]['generated_text'].strip()

Device set to use cuda:0


## **Basic Chatbot Response with Tuned Parameters**

Now, we test the chatbot with a factual question:  
**"Who designed the street layout of Washington DC?"**

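- **`max_length=50`**: Caps the total sequence length (prompt plus generated tokens) at 50 tokens.
- **`temperature=0.3`**: Sharpens the token distribution so that sampling is less random. This makes the output more predictable, not necessarily more factual.
- **`top_p=0.9`**: Nucleus sampling; only the most likely tokens whose cumulative probability reaches 0.9 are candidates for sampling.
- **`repetition_penalty=1.2`**: Makes tokens that were already generated less likely, which discourages repetitive text.

These settings tune the chatbot toward shorter, more focused answers.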
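
To make the effect of `temperature` and `top_p` concrete, here is a minimal pure-Python sketch on a toy distribution. The logits and the helper names `softmax_with_temperature` and `top_p_filter` are hypothetical and only illustrate the idea; this is not the code path GPT-Neo or the pipeline actually runs.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append(index)
        cumulative += p
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize so the surviving tokens sum to 1
    return {i: probs[i] / mass for i in kept}

# Hypothetical scores for four candidate next tokens
toy_logits = [2.0, 1.0, 0.5, -1.0]

sharp = softmax_with_temperature(toy_logits, temperature=0.3)
flat = softmax_with_temperature(toy_logits, temperature=1.0)
nucleus = top_p_filter(flat, top_p=0.9)

print("T=0.3:", [round(p, 3) for p in sharp])
print("T=1.0:", [round(p, 3) for p in flat])
print("Tokens kept by top_p=0.9:", sorted(nucleus))
```

A lower temperature concentrates probability on the top token, while `top_p` removes the unlikely tail before sampling. Neither step checks whether the chosen token is factually correct.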


In [None]:
# Adjust parameters for more focused and factual responses
response = basic_chatbot(
    "Who designed the street layout of Washington DC?",
    max_length=50,          # Limits the response length
    num_return_sequences=1, # Generates only one answer
    temperature=0.3,        # Lower temperature for less randomness
    top_p=0.9,              # Nucleus sampling
    repetition_penalty=1.2  # Avoid repetitive answers
)
print("Chatbot Response:", response)


Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Chatbot Response: Who designed the street layout of Washington DC?

The street layout of Washington DC is a fascinating topic. It’s a city that has a lot of history and it’s a city that has a lot of interesting architecture.


## **Enhancing the Chatbot with Additional Parameters**

To keep control over the chatbot's output, we redefine `basic_chatbot()` with the same signature as before. The Hugging Face generation API does not accept OpenAI-style parameters such as `frequency_penalty` or `presence_penalty`, so we:

- Focus on `temperature`, `top_p`, and `repetition_penalty`.
- Keep **`truncation=True`** to address the truncation warning; the output length itself is capped by `max_length`.

This lets us experiment further with output control.


In [None]:
# Update the chatbot function to include more parameters for fine-tuning responses
def basic_chatbot(prompt, max_length=100, num_return_sequences=1, temperature=1.0, top_p=1.0,
                  repetition_penalty=1.0):
    response = chatbot(
        prompt,
        max_length=max_length,
        num_return_sequences=num_return_sequences,
        temperature=temperature,
        top_p=top_p,
        repetition_penalty=repetition_penalty,
        truncation=True  # To handle truncation warnings
    )
    return response[0]['generated_text'].strip()

## **Experimenting with Generation Parameters**

In this section, we explore how different parameters influence the chatbot's behavior:

1. **Higher Repetition Penalty** (`repetition_penalty=1.5`): Reduces repetitive phrases.
2. **Lower Temperature** (`temperature=0.3`): Makes sampling less random, giving more predictable (but not necessarily more factual) responses.
3. **Multiple Outputs** (`num_return_sequences=3`): Generates diverse responses for the same input.

These experiments provide insight into how parameter tuning shapes the model's performance.
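
As a rough illustration of what a repetition penalty does, the sketch below applies the rule that we understand Transformers to use (taken from the CTRL paper): scores of already-generated tokens are divided by the penalty when positive and multiplied by it when negative. The function name and toy values are hypothetical.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Penalize tokens that were already generated.

    Positive scores are divided by the penalty and negative scores are
    multiplied by it, so a penalized token always becomes less likely.
    """
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted

# Hypothetical scores for a four-token vocabulary; tokens 0 and 2 were already used
toy_logits = [3.0, 1.5, -0.5, 0.2]
already_generated = [0, 2, 0]

penalized = apply_repetition_penalty(toy_logits, already_generated, penalty=1.5)
print(penalized)
```

Note that the penalty is applied once per distinct token, regardless of how many times that token has appeared, and that `penalty=1.0` leaves the scores unchanged.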


In [None]:
# 1. Test with higher repetition_penalty to reduce repeated phrases
response = basic_chatbot("Tell me about AI.", repetition_penalty=1.5)
print("Response with High Repetition Penalty:", response)
# Comment: Increasing `repetition_penalty` discourages repeated phrases in the output.

# 2. Test with lower temperature for more focused responses
response = basic_chatbot("Tell me about AI.", temperature=0.3)
print("Response with Low Temperature:", response)
# Comment: Lower `temperature` makes responses more predictable; it does not guarantee factual accuracy.

# 3. Generate multiple responses using `num_return_sequences`
responses = chatbot("What is Python?", num_return_sequences=3, truncation=True)
for idx, res in enumerate(responses):
    print(f"Response {idx + 1}: {res['generated_text'].strip()}")
# Comment: Using `num_return_sequences=3` generates diverse answers to the same question.


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Response with High Repetition Penalty: Tell me about AI.

I don’t need to tell you that Artificial Intelligence is amazing. But I will tell you that it’s not all that it’s cracked up to be.
As I’ve stated many times in the past two years, the technology really isn’t there yet. We have been training AI robots for over 2 decades and they remain essentially the same. The machines are not able to think. They can run through a series


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Response with Low Temperature: Tell me about AI.

I’ve been thinking a lot about AI lately. I’ve been thinking about it for a long time, and I’ve been thinking about it for a long time, and I’ve been thinking about it for a long time. I’ve been thinking about it for a long time, and I’ve been thinking about it for a long time. I’ve been thinking about it for a long time,
Response 1: What is Python?

Learn all about Python programming. You will learn how to use all of the Python library features. This is a practical course to help you make a career out of the Python language and community.

All Python applications are
Response 2: What is Python? Python is a free, high-performance open-source scripting language used for creating graphical applications for interactive computing. A lot of tools are available in many programming languages to simplify the work of developing graphical applications. Python is best known for
Response 3: What is Python?

What is Python?

Python is an operating s

## **Key Observations**

- **Repetition Penalty**: Increasing `repetition_penalty` discouraged repetitive phrases but occasionally made the text more fragmented.
- **Temperature**: Lowering `temperature` made the output more predictable, but in this run the model fell into a loop and repeated the same phrase instead of producing factual content.
- **Multiple Responses**: Setting `num_return_sequences=3` highlighted the model's ability to generate varied outputs for the same prompt.

These experiments illustrate how parameter adjustments can tailor chatbot responses to specific needs, whether for creativity or factual accuracy.


## 2. Summarization
**Exercise:** Write a script that takes a long text input and summarizes it into a few sentences.  
**Parameters to explore:** `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `best_of`, `logprobs`.

Comment on what happens when you change the parameters
(read the documentation!)

## **Initializing the Summarization Model**

In this step, we set up a summarization pipeline using the **BART** model (`facebook/bart-large-cnn`), a checkpoint fine-tuned for summarizing news articles.

- **Pipeline**: Simplifies the summarization process with pre-trained models.
- **Summarization Function**: `summarize_text()` exposes `max_length`, `min_length`, and `length_penalty` to control the summary. It also accepts `temperature` and `top_p`, but as written it does not pass them on to the pipeline.


In [None]:
from transformers import pipeline
import warnings

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore")

# Create a summarization pipeline
summarizer = pipeline("summarization", model="facebook/bart-large-cnn", framework="pt")

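# Define a flexible summarization function
# Note: `temperature` and `top_p` are accepted but not forwarded to the pipeline below,
# so they do not affect the output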
def summarize_text(text, max_length=130, min_length=30, temperature=1.0, top_p=1.0, length_penalty=1.0):
    summary = summarizer(
        text,
        max_length=max_length,
        min_length=min_length,
        length_penalty=length_penalty,
        truncation=True
    )
    return summary[0]['summary_text'].strip()


Device set to use cuda:0


## **Generating a Basic Summary**

We apply the summarization model to a sample text about **Artificial Intelligence**. This provides an initial understanding of how the model summarizes key points from longer texts.


In [None]:
# Sample long text
text = """
Artificial Intelligence (AI) is transforming industries across the globe, from healthcare to finance. AI enables computers
and machines to mimic human intelligence and perform tasks such as learning, reasoning, problem-solving, and understanding language.
In healthcare, AI is being used to develop predictive models for disease detection, personalized treatment plans, and even robotic surgeries.
In finance, AI algorithms are used to detect fraud, automate trading, and enhance customer service. As AI continues to evolve,
ethical considerations about data privacy, bias, and the future of work are becoming increasingly important. Policymakers and
industry leaders are grappling with how to regulate AI technologies while fostering innovation.
"""

# Generate a basic summary
summary = summarize_text(text)
print("Summary:", summary)


Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


Summary: Artificial Intelligence (AI) is transforming industries across the globe. It enables computers and machines to mimic human intelligence. Policymakers are grappling with how to regulate AI technologies while fostering innovation.


## **Exploring Summarization Parameters**

To understand how different parameters affect the summarization, we adjust the following:

1. **Temperature**: Controls randomness when sampling is enabled; lower values give more predictable summaries.
2. **Top-p (Nucleus Sampling)**: Limits sampling to the most probable tokens; higher values allow more diverse wording.
3. **Max/Min Length**: Dictates the length of the output summary, allowing for concise or detailed results.

These experiments help tailor summarization outputs for specific needs.
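
One thing to keep in mind: `summarize_text()` as defined above accepts `temperature` and `top_p` but never passes them to `summarizer`. In the Transformers generation API these two parameters only take effect when `do_sample=True`. The helper below is a hypothetical sketch (the name `build_generation_kwargs` is ours) of how the sampling arguments could be assembled and forwarded; it is one possible approach, not the notebook's original method.

```python
def build_generation_kwargs(temperature=1.0, top_p=1.0, sample=None):
    """Return keyword arguments that switch sampling on when it is requested."""
    if sample is None:
        # Enable sampling only when a sampling parameter departs from its neutral value
        sample = temperature != 1.0 or top_p != 1.0
    kwargs = {"do_sample": sample}
    if sample:
        kwargs["temperature"] = temperature
        kwargs["top_p"] = top_p
    return kwargs

print(build_generation_kwargs())
print(build_generation_kwargs(temperature=0.3))

# Assumed usage with the pipeline defined earlier (not executed here):
# summarizer(text, max_length=130, min_length=30, truncation=True,
#            **build_generation_kwargs(temperature=0.3))
```

With sampling enabled, repeated calls may return different summaries, so results will no longer be reproducible unless a random seed is fixed.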


In [None]:
# 1. Lower temperature for more predictable summaries
summary_low_temp = summarize_text(text, temperature=0.3)
print("\nLow Temperature Summary:", summary_low_temp)
# Comment: `summarize_text()` does not forward `temperature`, so this summary matches the baseline.

# 2. Higher top_p for more diverse summaries
summary_high_top_p = summarize_text(text, top_p=0.95)
print("\nHigh Top-p Summary:", summary_high_top_p)
# Comment: `top_p` is not forwarded either, so no variation appears here.

# 3. Adjust max_length and min_length to control the summary size
short_summary = summarize_text(text, max_length=50, min_length=20)
print("\nShort Summary:", short_summary)
# Comment: Adjusting max_length and min_length controls the brevity and detail of the summary.

# 4. Apply a higher length_penalty
summary_high_penalty = summarize_text(text, length_penalty=2.0)
print("\nHigh Length Penalty Summary:", summary_high_penalty)
# Comment: With beam search, `length_penalty` > 0 favors longer summaries; larger values strengthen that preference.

# 5. Compare with a lower length_penalty
summary_low_penalty = summarize_text(text, length_penalty=0.5)
print("\nLow Length Penalty Summary:", summary_low_penalty)
# Comment: Lower values weaken the preference for length; negative values favor shorter summaries.


Low Temperature Summary: Artificial Intelligence (AI) is transforming industries across the globe. It enables computers and machines to mimic human intelligence. Policymakers are grappling with how to regulate AI technologies while fostering innovation.

High Top-p Summary: Artificial Intelligence (AI) is transforming industries across the globe. It enables computers and machines to mimic human intelligence. Policymakers are grappling with how to regulate AI technologies while fostering innovation.

Short Summary: Artificial Intelligence (AI) is transforming industries across the globe. It enables computers and machines to mimic human intelligence. Policymakers are grappling with how to regulate AI technologies while fostering innovation.

High Length Penalty Summary: Artificial Intelligence (AI) is transforming industries across the globe. It enables computers and machines to mimic human intelligence. Policymakers are grappling with how to regulate AI technologies while fostering inn

## **Key Observations**

- **Lower Temperature**: The summary was identical to the baseline. `summarize_text()` never forwards `temperature` to the pipeline, so the setting could not take effect.
  
- **Higher Top-p**: Also identical, for the same reason: `top_p` is not forwarded. Both parameters only matter when sampling is enabled (`do_sample=True`).

- **Max/Min Length Adjustments**: With `max_length=50` the summary was unchanged, most likely because the baseline summary already fit within that limit.

- **Length Penalty**: Both higher and lower `length_penalty` values resulted in similar outputs. This parameter only re-ranks beam-search candidates by length, so it changes the result only when candidates of different lengths score closely; here the input text may also have been concise enough to leave little room for variation.
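
To see why `length_penalty` may leave a summary unchanged, it helps to look at how beam candidates are scored. The sketch below assumes the length-normalized form described in the Transformers documentation, where the summed log-probability is divided by the length raised to `length_penalty`. The two beams and their scores are made up for illustration.

```python
def beam_score(sum_logprobs, length, length_penalty=1.0):
    """Length-normalized score: summed log-probability divided by length ** penalty."""
    return sum_logprobs / (length ** length_penalty)

# Two hypothetical finished beams: (summed log-probability, length in tokens)
short_beam = (-6.0, 10)
long_beam = (-9.0, 20)

for penalty in (0.0, 1.0, 2.0):
    short_score = beam_score(*short_beam, length_penalty=penalty)
    long_score = beam_score(*long_beam, length_penalty=penalty)
    winner = "long" if long_score > short_score else "short"
    print(f"length_penalty={penalty}: short={short_score:.3f}, long={long_score:.3f} -> {winner}")
```

Because log-probabilities are negative, dividing by a larger length moves the score toward zero, so positive penalties favor the longer beam. If one candidate leads by a wide margin, changing the penalty does not alter the winner.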

---

### **Conclusion**

Exploring different parameters provided insights into how summarization models behave. Although the output didn't vary significantly, understanding how these parameters interact is crucial for optimizing models in more complex scenarios. In future applications, experimenting with diverse datasets and models could yield more noticeable differences.


## 3. Translation
**Exercise:** Develop a tool that translates text from one language to another using the API.  
**Parameters to explore:** `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `echo`, `logit_bias`.

Comment on what happens when you change the parameters
(read the documentation!)

## **Initializing the Translation Model**

In this step, we set up a translation pipeline using **MarianMT**, a popular model for machine translation tasks.

- **Pipeline**: Simplifies the translation process with pre-trained models for various language pairs.
- **Translation Function**: `translate_text()` will allow adjustments to parameters like `temperature`, `top_p`, and `max_length` to fine-tune the translation style and quality.


In [None]:
from transformers import pipeline
import warnings

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore")

# Create a translation pipeline (English to French in this case)
translator = pipeline("translation_en_to_fr", model="Helsinki-NLP/opus-mt-en-fr", framework="pt")

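# Define a flexible translation function
# Note: `do_sample` is not set here, so `temperature` and `top_p` are not expected to
# change the output unless `do_sample=True` is passed through **kwargs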
def translate_text(text, max_length=100, temperature=1.0, top_p=1.0, **kwargs):
    translation = translator(
        text,
        max_length=max_length,
        temperature=temperature,
        top_p=top_p,
        truncation=True,
        **kwargs  # Allows additional parameters for experimentation
    )
    return translation[0]['translation_text'].strip()

Device set to use cuda:0


## **Generating a Basic Translation**

We apply the translation model to a sample English sentence. This gives an initial understanding of how the model handles basic language conversion tasks.


In [None]:
# Sample text for translation (slightly more complex for better testing)
text = "Artificial Intelligence is rapidly transforming industries across the globe, revolutionizing healthcare, finance, and education."

# Generate a basic translation
translation = translate_text(text)
print("Translation:", translation)


Translation: L'intelligence artificielle transforme rapidement les industries à travers le monde, révolutionnant les soins de santé, les finances et l'éducation.


## **Exploring Translation Parameters**

To understand how different parameters affect translations, we adjust the following:

1. **Temperature**: Controls randomness during sampling. The pipeline here does not enable sampling, so changing it is expected to leave the translation unchanged.
2. **Top-p (Nucleus Sampling)**: Limits sampling to the most probable tokens; it likewise has no effect unless sampling is enabled.
3. **Max Length**: Caps the length of the output in tokens; a value that is too small cuts the translation off.

These experiments help refine translation quality and explore the model's flexibility, though some parameters may not have a noticeable impact in this case.


In [None]:
# 1. Lower temperature
translation_low_temp = translate_text(text, temperature=0.3)
print("\nLow Temperature Translation:", translation_low_temp)
# Comment: The output matches the baseline; `temperature` has no effect because sampling is not enabled.

# 2. Higher top_p
translation_high_top_p = translate_text(text, top_p=0.95)
print("\nHigh Top-p Translation:", translation_high_top_p)
# Comment: Again identical to the baseline, for the same reason.

# 3. Adjust max_length for concise or extended translations
short_translation = translate_text(text, max_length=20)
print("\nShort Translation:", short_translation)
# Comment: Reducing `max_length` truncated the translation mid-sentence, demonstrating how this parameter directly affects output length.



Low Temperature Translation: L'intelligence artificielle transforme rapidement les industries à travers le monde, révolutionnant les soins de santé, les finances et l'éducation.


Your input_length: 20 is bigger than 0.9 * max_length: 20. You might consider increasing your max_length manually, e.g. translator('...', max_length=400)



High Top-p Translation: L'intelligence artificielle transforme rapidement les industries à travers le monde, révolutionnant les soins de santé, les finances et l'éducation.

Short Translation: L'intelligence artificielle transforme rapidement les industries à travers le monde, révolutionnant les soins de


## **Model Behavior Note**

While parameters like **temperature** and **top_p** introduce diversity and randomness when a model samples its output, they have no visible effect here because the **MarianMT** pipeline decodes deterministically by default (sampling is not enabled). This explains why the outputs remain consistent across different parameter settings.

For greater variation in translation styles, experimenting with different translation models or fine-tuning datasets may yield more noticeable changes.
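
A small toy example shows why temperature cannot change a deterministic choice: dividing all scores by the same positive number never changes which one is largest. The logits and the helper `greedy_pick` are hypothetical, and this is a simplification of the actual decoding used by the pipeline.

```python
def greedy_pick(logits, temperature=1.0):
    """Return the index of the highest score after temperature scaling."""
    scaled = [x / temperature for x in logits]
    return max(range(len(scaled)), key=scaled.__getitem__)

# Hypothetical scores for four candidate tokens
toy_logits = [1.2, 3.4, 0.7, 2.9]

picks = {t: greedy_pick(toy_logits, temperature=t) for t in (0.3, 1.0, 2.0)}
print(picks)
```

Every temperature selects the same token. Variation only appears once the decoder draws randomly from the distribution, which in Transformers requires `do_sample=True`.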


## **Key Observations**

- **Lower Temperature**: The translation was identical to the baseline; no change in literalness was observed.
- **Higher Top-p**: The translation was again identical to the baseline, since sampling was not enabled.
- **Length Adjustments**: Changing `max_length` was the only setting with a visible effect. Setting `max_length=20` triggered a length warning and cut the translation off mid-sentence.

Understanding these parameters helps optimize translations for different purposes, whether for formal accuracy or more natural language flow.


## **Parameter Compatibility Note**

Some parameters like **`frequency_penalty`**, **`presence_penalty`**, and **`logit_bias`** are more applicable in OpenAI's API for text generation. In Hugging Face's **MarianMT** translation pipeline, these parameters may not influence the output significantly or may not be supported at all.

When exploring translation parameters in this setup, **`max_length`** has the most observable effect; **`temperature`** and **`top_p`** only matter once sampling is enabled.
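
For reference, OpenAI's documentation describes these two penalties as subtractions from a token's logit: the frequency penalty scales with how many times the token has already appeared, and the presence penalty is applied once if it has appeared at all. The sketch below follows that description on toy values; the function name is hypothetical, and it is not part of the Hugging Face pipeline used here.

```python
from collections import Counter

def apply_openai_style_penalties(logits, generated_ids,
                                 frequency_penalty=0.0, presence_penalty=0.0):
    """Subtract count * frequency_penalty + presence_penalty from each seen token."""
    counts = Counter(generated_ids)
    adjusted = list(logits)
    for token_id, count in counts.items():
        adjusted[token_id] -= count * frequency_penalty + presence_penalty
    return adjusted

# Hypothetical scores for a three-token vocabulary; token 0 appeared twice, token 1 once
toy_logits = [2.0, 1.0, 0.5]
history = [0, 0, 1]

adjusted_logits = apply_openai_style_penalties(
    toy_logits, history, frequency_penalty=0.5, presence_penalty=0.25
)
print(adjusted_logits)
```

This differs from the Hugging Face `repetition_penalty`, which rescales the scores of seen tokens rather than subtracting from them, so values are not interchangeable between the two APIs.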


## 4. Sentiment Analysis
**Exercise:** Implement a sentiment analysis tool that determines the sentiment of a given text (positive, negative, neutral).  
**Parameters to explore:** `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `n`, `logprobs`.

Comment on what happens when you change the parameters
(read the documentation!)

## **Initializing the Sentiment Analysis Model**

In this step, we set up a sentiment analysis pipeline using a pre-trained model from Hugging Face, optimized for text classification tasks.

- **Pipeline**: Simplifies sentiment classification by using ready-to-go models.
- **Sentiment Function**: `analyze_sentiment()` forwards `max_length` (with truncation) to the pipeline. It also accepts `temperature` and `top_p` for the sake of the exercise, but does not use them, since classification involves no sampling.


In [None]:
from transformers import pipeline
import warnings

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore")

# Create a sentiment analysis pipeline
sentiment_analyzer = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english", framework="pt")

# Define a flexible sentiment analysis function
def analyze_sentiment(text, max_length=512, temperature=1.0, top_p=1.0, **kwargs):
    sentiment = sentiment_analyzer(
        text,
        truncation=True,
        max_length=max_length,
        **kwargs  # Allows additional parameter experimentation
    )
    return sentiment[0]


Device set to use cuda:0


## **Analyzing Basic Sentiment**

We apply the sentiment analysis model to a sample sentence to observe its initial sentiment classification capabilities. This helps establish a baseline for further experimentation with model parameters.


In [None]:
# Sample text for sentiment analysis
text = "I absolutely love the new design of the website! It's clean, fast, and easy to navigate."

# Generate basic sentiment analysis
sentiment = analyze_sentiment(text)
print("Sentiment:", sentiment)


Sentiment: {'label': 'POSITIVE', 'score': 0.9998801946640015}


## **Exploring Sentiment Analysis Parameters**

To understand how different parameters affect sentiment analysis, we adjust the following:

1. **Temperature**: Controls randomness in generation tasks. A classifier does not sample tokens, so it is not expected to change the prediction.
2. **Top-p (Nucleus Sampling)**: Also a sampling parameter; it does not apply to classification.
3. **Max Length**: Sets the point at which longer texts are truncated; anything beyond it is not seen by the classifier.

These experiments help explore the model's flexibility and robustness under different parameter settings.

**Note:** While parameters like `temperature` and `top_p` are designed for text generation tasks, testing them here helps confirm that sentiment classification models like **DistilBERT** are more deterministic and less influenced by these settings.
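
The determinism mentioned in the note follows from how a classifier produces its answer: it computes one score per class, applies a softmax, and reports the most probable label, with no random draw involved. The sketch below shows this on made-up logits; the label order and the helper name `classify` are assumptions for illustration.

```python
import math

def classify(logits, labels=("NEGATIVE", "POSITIVE")):
    """Softmax over class scores, then return the most probable label."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}

# Hypothetical logits for one sentence; repeated calls give the same result
first = classify([-4.2, 4.8])
second = classify([-4.2, 4.8])
print(first)
print("Identical on repeat:", first == second)
```

The same input always yields the same label and score, which is why sampling parameters have nothing to act on in this task.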


In [None]:
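def display_sentiment_result(description, result):
    print(f"{description}: Label = {result['label']}, Confidence = {result['score']:.4f}")

# 1. Lower temperature (accepted by analyze_sentiment() but not used by the classifier)
sentiment_low_temp = analyze_sentiment(text, temperature=0.3)
display_sentiment_result("Low Temperature Sentiment", sentiment_low_temp)

# 2. Higher top_p (likewise not used in classification)
sentiment_high_top_p = analyze_sentiment(text, top_p=0.95)
display_sentiment_result("High Top-p Sentiment", sentiment_high_top_p)

# 3. Test with longer text to observe the effect of max_length
# NOTE: the original long text is missing from this notebook; `long_text` is a
# placeholder, so its label may differ from the recorded output below.
long_text = ("The checkout flow looks nicer than before, but the app crashed twice, "
             "lost my cart, and support never replied. ") * 20
sentiment_long_text = analyze_sentiment(long_text, max_length=128)
display_sentiment_result("Long Text Sentiment", sentiment_long_text)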


Low Temperature Sentiment: Label = POSITIVE, Confidence = 0.9999
High Top-p Sentiment: Label = POSITIVE, Confidence = 0.9999
Long Text Sentiment: Label = NEGATIVE, Confidence = 0.9998


## **Key Observations**

- **Lower Temperature**: Had minimal effect on the sentiment classification, as this parameter is more impactful in text generation than classification.
- **Higher Top-p**: Introduced negligible changes, reaffirming the deterministic nature of the classification model.
- **Length Adjustments**: Longer texts with mixed sentiments showed how the model leans towards the dominant sentiment in the text.

Understanding these behaviors is important for refining sentiment analysis in real-world applications, particularly when handling complex or nuanced texts.


## 5. Text Completion
**Exercise:** Create a text completion application that generates text based on an initial prompt.  
**Parameters to explore:** `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `stop`, `best_of`.

Comment on what happens when you change the parameters
(read the documentation!)

## **Initializing the Text Completion Model**

In this step, we set up a text generation pipeline using **GPT-Neo**, a model designed for generating coherent and contextually relevant continuations of input prompts.

- **Pipeline**: Simplifies text generation by using pre-trained models for completion tasks.
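- **Completion Function**: `complete_text()` allows adjustments to `temperature`, `top_p`, and `max_length` to control creativity, coherence, and length of generated text. It also accepts the OpenAI-style `frequency_penalty`, `presence_penalty`, and `stop`, but does not forward them to the pipeline.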


In [None]:
from transformers import pipeline
import warnings

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore")

# Create a text generation pipeline using GPT-Neo
text_generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B", framework="pt")

# Define a flexible text completion function
# Note: `frequency_penalty`, `presence_penalty`, and `stop` mirror the OpenAI API
# but are not forwarded to the pipeline, so they do not change the output
def complete_text(prompt, max_length=100, temperature=1.0, top_p=1.0, frequency_penalty=0.0, presence_penalty=0.0, stop=None, **kwargs):
    completion = text_generator(
        prompt,
        max_length=max_length,
        temperature=temperature,
        top_p=top_p,
        repetition_penalty=1.0,  # 1.0 means no penalty; kept at a valid value to avoid a ValueError
        **kwargs  # Extra generation arguments supported by the pipeline
    )
    return completion[0]['generated_text'].strip()


Device set to use cuda:0


## **Generating a Basic Text Completion**

We apply the text completion model to a simple prompt to observe how it generates coherent continuations. This serves as a baseline for exploring parameter effects.


In [None]:
# Basic prompt for text completion
prompt = "The future of artificial intelligence is"

# Generate a basic text completion
completion = complete_text(prompt)
print("Completed Text:", completion)


Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Completed Text: The future of artificial intelligence is at stake

"Falling to our own conclusions isn't unusual in AI -- but the AI research field seems to be working on a self-fulfilling prophecy. Artificial Intelligence and Cognitive Science have shown many years of progress, but AI development still lags behind and the long-term prospects of AI at the cutting edge appear dim."

In February this year, AI researcher Chris Langford gave an academic talk to the Royal Society of Arts in which he


## **Exploring Text Completion Parameters**

To understand how different parameters affect text generation, we adjust the following:

1. **Temperature**: Controls randomness; higher values introduce more creative, less predictable content, while lower values make outputs more deterministic.
2. **Top-p (Nucleus Sampling)**: Determines diversity by focusing on a subset of likely words; higher values yield more diverse completions.
3. **Max Length**: Limits the length of the generated content, producing concise or extended responses.
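4. **Frequency Penalty**: In the OpenAI API, discourages repeated tokens. `complete_text()` accepts it but does not apply it; the closest Hugging Face counterpart is `repetition_penalty`.
5. **Stop Sequences**: Define specific stopping points in the OpenAI API. `complete_text()` accepts `stop` but does not apply it, so the text must be cut after generation.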
6. **Multiple Completions**: By generating multiple outputs, we can manually select the best completion, simulating `best_of` behavior.

These adjustments help fine-tune the creativity, coherence, and length of generated content.
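
Since `complete_text()` does not act on `stop`, one simple way to approximate stop sequences is to cut the generated text afterwards. The helper below is a hypothetical post-processing sketch; it trims text that was already generated rather than halting generation early, as the OpenAI API does.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut the text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        position = text.find(stop)
        if position != -1:
            cut = min(cut, position)
    return text[:cut].rstrip()

# Made-up completion used only to demonstrate the helper
sample = "AI is advancing quickly. However, many challenges remain. But progress continues."
print(truncate_at_stop(sample, ["However", "But"]))
```

Because the pipeline returns the prompt together with the completion, a stop sequence that occurs inside the prompt would cut the text too early; in that case the search should start after the prompt.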


In [None]:
# 1. Lower temperature for more focused, predictable responses
completion_low_temp = complete_text(prompt, temperature=0.3)
print("\nLow Temperature Completion:", completion_low_temp)

# 2. Higher top_p for more diverse completions
completion_high_top_p = complete_text(prompt, top_p=0.95)
print("\nHigh Top-p Completion:", completion_high_top_p)

# 3. Shorter text with reduced max_length
short_completion = complete_text(prompt, max_length=30)
print("\nShort Completion:", short_completion)

# 4. Pass frequency_penalty (accepted but not forwarded, so no penalty is actually applied)
completion_freq_penalty = complete_text(prompt, frequency_penalty=1.5)
print("\nHigh Frequency Penalty Completion:", completion_freq_penalty)

# 5. Pass stop sequences (accepted but not forwarded, so generation is not cut short)
completion_with_stop = complete_text(prompt, stop=["However", "But"])
print("\nCompletion with Stop Sequence:", completion_with_stop)

# 6. Generate multiple completions and manually select the best
completions = [complete_text(prompt) for _ in range(3)]
print("\nGenerated Multiple Completions:")
for i, text in enumerate(completions, 1):
    print(f"Completion {i}: {text}")
# Note: generating several outputs and picking one by hand approximates `best_of`.


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



Low Temperature Completion: The future of artificial intelligence is being shaped by the rapid development of machine learning, artificial general intelligence (AGI), and deep learning. In the past, AI researchers have focused on the development of AI systems that can learn from data and generalize to new situations. However, the rapid development of AI systems has led to the development of AI systems that can learn from data and generalize to new situations. In this paper, we propose a novel framework for the development of AI systems that can learn from


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



High Top-p Completion: The future of artificial intelligence is now and it's big

This article is the first in our annual series where we examine the key trends that will shape technology in coming years.

The year has started and while most of the headlines were about the financial markets and the world economy, there's a lot that can be said about the future of artificial intelligence (AI), as it's become one of the top trends in the business world.

Here's a preview of the big things that


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



Short Completion: The future of artificial intelligence is “very scary,” says AIXR research scientist Chris Calcaterra at Stanford, and it


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



High Frequency Penalty Completion: The future of artificial intelligence is a very exciting prospect for the tech industry. The idea of making computers think is not new, but now, thanks to recent progress in deep neural networks, we're starting to see the world of AI extend its influence beyond just science fiction and become the foundation of a lot of technologies we use every day. From machines that can recognize and recognize certain objects, all sorts of smart devices that help us get along in society, to AI that can really do good things, AI


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



Completion with Stop Sequence: The future of artificial intelligence is not simply a question of computer power, memory or the right AI chip that fits in your brain. It is a question of human agency, and the responsibility of governments to protect everyone’s rights in their future potential.

Last week, the UK government’s independent advisor on Artificial Intelligence, Dr. Jonathan Porritt, warned about the dangers of the technology and how it could pose risks to society. He cited an earlier study by the Department of


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



Generated Multiple Completions:
Completion 1: The future of artificial intelligence is a matter of intense debate and uncertainty in the tech industry. We’re beginning to see major shifts in the artificial intelligence market, but it’s unclear whether the market will ever become more mature or whether companies’ current methods of developing AI will be sufficient to compete in the long term.

One major issue in the AI competition that has yet to be addressed is the AI-related risk that exists outside of the “AI hype” period
Completion 2: The future of artificial intelligence is being reshaped by technology that is advancing exponentially. It is becoming possible to create computers with unlimited capability for advanced intelligence. The technology has already started to deliver a large portion of the AI capabilities necessary for making the greatest impact on society.

In early May of 2019, we were first introduced to a new technology that creates computer-human interfaces that make 

## **Key Observations**

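- **Lower Temperature**: Produced the most predictable text, but also the most repetitive: the `temperature=0.3` sample repeats the clause "AI systems that can learn from data and generalize to new situations" nearly verbatim. Lower temperature makes output more deterministic, not necessarily more factual.
- **Higher Top-p**: With `top_p=0.95` the continuation was more varied (a news-style article opening) while remaining coherent.
- **Length Adjustments**: `max_length=30` cut the completion off mid-sentence; the limit truncates the output rather than asking the model to be concise.
- **Frequency Penalty**: The `frequency_penalty=1.5` sample shows less phrase-level repetition than the low-temperature sample, although it still contains "recognize and recognize".
- **Stop Sequences**: Control where text generation ends, which is useful for specific formatting or constraints. In this run neither "However" nor "But" appears in the sampled text, and the text ends mid-sentence, so the effect of the stop sequences cannot be seen in the output above.
- **Multiple Completions**: Generating several outputs allowed side-by-side comparison; selecting the most relevant one by hand approximates the `best_of` parameter.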

Understanding these parameters helps in tailoring text generation for diverse applications, from creative writing to factual content generation.
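
Stop sequences and best-of selection can also be applied as post-processing on text that has already been generated. The sketch below is one possible approach, not the notebook's `complete_text` implementation: the helper names are hypothetical, and the selection heuristic (least word-level repetition) is a stand-in for the log-probability ranking that `best_of` uses in the OpenAI API.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut text at the earliest occurrence of any stop sequence (stop text excluded)."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut].rstrip()

def repetition_ratio(text):
    """Fraction of words that duplicate an earlier word; lower means more varied text."""
    words = text.lower().split()
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / len(words)

def pick_least_repetitive(candidates):
    """Return the candidate with the lowest repetition ratio (a simple heuristic)."""
    return min(candidates, key=repetition_ratio)

# Illustrative strings only; in the notebook these would come from complete_text(prompt).
sample = "AI is useful. However, it has risks. But wait."
print(truncate_at_stop(sample, ["However", "But"]))

candidates = [
    "the cat sat on the mat the cat",
    "a quick brown fox",
]
print(pick_least_repetitive(candidates))
```

When applying `truncate_at_stop` to a completion that echoes the prompt, pass only the newly generated part so that a stop word inside the prompt does not cut the text early.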


# BONUS: Google Vertex AI

## 1. Basic Conversation
**Exercise:** Create a basic chatbot using Google Vertex AI to answer questions about a given topic.  
**Parameters to explore:** `temperature`, `max_output_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `n`, `stop`.

Comment on what happens when you change the parameters
(read the documentation!)
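
Several parameter names in these exercises are written in OpenAI style. The sketch below shows one way to translate them before building a Vertex AI request. The mapping reflects my understanding of the field names on the Vertex AI Python SDK's `GenerationConfig` (for example `candidate_count` for `n` and `stop_sequences` for `stop`); treat it as an assumption and verify it against the current documentation. The helper `to_vertex_config` is hypothetical and makes no API call.

```python
# Assumed mapping from the OpenAI-style names used in the exercises to
# Vertex AI GenerationConfig field names; verify against the current docs.
OPENAI_TO_VERTEX = {
    "temperature": "temperature",
    "top_p": "top_p",
    "max_output_tokens": "max_output_tokens",
    "max_tokens": "max_output_tokens",
    "frequency_penalty": "frequency_penalty",
    "presence_penalty": "presence_penalty",
    "n": "candidate_count",
    "stop": "stop_sequences",
}

def to_vertex_config(params):
    """Translate a dict of OpenAI-style parameters into Vertex-style keyword arguments."""
    config = {}
    for name, value in params.items():
        if name not in OPENAI_TO_VERTEX:
            raise KeyError(f"No known Vertex AI equivalent for {name!r}")
        key = OPENAI_TO_VERTEX[name]
        # A list of stop strings is expected, so wrap a single string.
        if key == "stop_sequences" and isinstance(value, str):
            value = [value]
        config[key] = value
    return config

config_kwargs = to_vertex_config({"temperature": 0.2, "n": 2, "stop": "\n\n"})
print(config_kwargs)
```

Under the same assumption, the resulting dictionary could be unpacked into `GenerationConfig(**config_kwargs)` from `vertexai.generative_models`. Parameters with no entry in the mapping, such as `best_of`, `logprobs`, `echo`, and `logit_bias`, raise a `KeyError` here so that they are looked up in the documentation rather than silently dropped.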

## 2. Summarization
**Exercise:** Develop a script that summarizes long text inputs using Google Vertex AI.  
**Parameters to explore:** `temperature`, `max_output_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `best_of`, `logprobs`.

Comment on what happens when you change the parameters
(read the documentation!)

## 3. Translation
**Exercise:** Create a tool that translates text from one language to another using Google Vertex AI.  
**Parameters to explore:** `temperature`, `max_output_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `echo`, `logit_bias`.

Comment on what happens when you change the parameters
(read the documentation!)

## 4. Sentiment Analysis
**Exercise:** Implement a sentiment analysis tool using Google Vertex AI to determine the sentiment of a given text.  
**Parameters to explore:** `temperature`, `max_output_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `n`, `logprobs`.

Comment on what happens when you change the parameters
(read the documentation!)

## 5. Text Completion
**Exercise:** Develop a text completion application using Google Vertex AI to generate text based on an initial prompt.  
**Parameters to explore:** `temperature`, `max_output_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `stop`, `best_of`.

Comment on what happens when you change the parameters
(read the documentation!)