## Text Generation

In [1]:
# --------------------
# Import pipeline
# --------------------
from huggingface_hub import HfApi
api = HfApi()

from transformers import pipeline
import pandas as pd

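# Example of a text generation task with pipeline

text_generation_pipeline = pipeline(task="text-generation", model="gpt2")

text = "Once upon a time"

# do_sample=False selects greedy decoding, so the output is deterministic;
# max_length counts the prompt tokens as well as the generated ones
output = text_generation_pipeline(text, max_length=50, do_sample=False)

# The pipeline returns a list of dicts, one per generated sequence
print(output[0]["generated_text"])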







2024-03-20 12:00:22.335683: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Once upon a time, the world was a place of great beauty and great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great danger, and the world was a


In [None]:
# --------------------
# Generate multiple responses
# --------------------
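# Returning several different sequences requires sampling, so do_sample is set explicitly
responses = text_generation_pipeline("Once upon a time", do_sample=True, num_return_sequences=3)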

# Print each generated response
for i, response in enumerate(responses):
    print(f"Response {i+1}: {response['generated_text']}")


In [None]:
# ----------------------------------------
# Even more control over the generated text
# ----------------------------------------
from transformers import pipeline

# ----------------------------------------
# Generate text with custom parameters
# ----------------------------------------

responses = text_generation_pipeline(
    "Once upon a time",
    max_length=20,
    do_sample=True,    # sampling must be on for temperature, top_k, and top_p to take effect
    temperature=0.7,   # controls the randomness of the generated text; lower values are more predictable
    top_k=50,          # sample only from the 50 most probable tokens at each step; a higher top_k increases diversity
    top_p=0.95,        # nucleus sampling threshold (see the note on top_p below)
    num_return_sequences=3,
)


# ----------------------------------------------------------------------------------------------------------------------------------------------------------------
# With top_p (nucleus sampling), the model samples from the smallest set of most probable tokens whose cumulative probability reaches the threshold p.
# Instead of considering a fixed number of top tokens, it chooses the size of this set dynamically at each step.
# A higher value includes more tokens in the sampling pool, allowing more diverse and often more creative text.
# A lower value restricts the sampling pool, resulting in more focused and predictable text.
# ----------------------------------------------------------------------------------------------------------------------------------------------------------------


# ----------------------------------------
# Print each generated response
# ----------------------------------------
for i, response in enumerate(responses):
    print(f"Response {i+1}: {response['generated_text']}")
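
To make the effect of `temperature`, `top_k`, and `top_p` more concrete, here is a minimal sketch that applies them to a toy distribution in plain Python. The vocabulary, the scores, and the helper names (`softmax_with_temperature`, `top_k_filter`, `top_p_filter`) are made up for illustration; this is not how `transformers` implements these options internally, only the idea behind them.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most probable token indices and renormalize their probabilities."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

# Toy vocabulary and made-up scores (assumptions, for illustration only)
vocab = ["the", "a", "dragon", "princess", "banana"]
logits = [2.0, 1.5, 1.0, 0.5, -1.0]

probs = softmax_with_temperature(logits, temperature=0.7)
top_k_pool = top_k_filter(probs, k=3)
top_p_pool = top_p_filter(probs, p=0.95)

print("top_k pool:", [vocab[i] for i in top_k_pool])
print("top_p pool:", [vocab[i] for i in top_p_pool])
```

Changing `temperature`, `k`, or `p` in this toy example and re-running it shows how each setting widens or narrows the pool of tokens that sampling can draw from.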


## Experiment with the hyperparameters to complete a sentence. Here are some example prompts:


- She opened the ancient book and discovered

- In a world where robots

- The secret to happiness is

- The hidden treasure was finally

- As the sun set over the horizon

- The star player took a deep breath, lined up the shot, and
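
One possible way to organize these experiments is to loop over prompts and hyperparameter values and collect the results side by side. The sketch below assumes a hypothetical helper, `run_experiments`, that accepts any callable with the same calling convention as `text_generation_pipeline`. It uses a stand-in `fake_generate` function so that it runs without downloading a model.

```python
from itertools import product

def run_experiments(generate, prompts, temperatures, top_ps, max_length=30):
    """Call `generate` once per (prompt, temperature, top_p) combination."""
    results = []
    for prompt, temperature, top_p in product(prompts, temperatures, top_ps):
        outputs = generate(
            prompt,
            max_length=max_length,
            do_sample=True,
            temperature=temperature,
            top_p=top_p,
        )
        results.append({
            "prompt": prompt,
            "temperature": temperature,
            "top_p": top_p,
            # Keep only the first returned sequence for each setting
            "generated_text": outputs[0]["generated_text"],
        })
    return results

def fake_generate(prompt, **kwargs):
    """Hypothetical stand-in for text_generation_pipeline; it just echoes the prompt."""
    return [{"generated_text": f"{prompt} ..."}]

prompts = ["In a world where robots", "The secret to happiness is"]
results = run_experiments(
    fake_generate,
    prompts,
    temperatures=[0.7, 1.2],
    top_ps=[0.8, 0.95],
)

for row in results:
    print(row)
```

To try it with the real model, pass `text_generation_pipeline` in place of `fake_generate`. Since `pandas` is already imported in this notebook, `pd.DataFrame(results)` would give a table that makes the settings easier to compare.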

