# 16 GPT: Generative Pre-trained Transformer

GPT is a family of language models, and the training approach behind them, developed by OpenAI. It is primarily known for generating coherent and contextually relevant text over long passages.

* Architecture: GPT is based on the Transformer architecture. Unlike some other models that use both an encoder and a decoder, GPT exclusively utilizes the decoder part of the Transformer for its tasks.
* Pre-training and Fine-tuning:
    - Pre-training: GPT is first pre-trained on a large corpus of text (books, articles, websites, etc.). During this phase, it learns to predict the next token (a word or word piece) given the preceding text. This objective pushes the model to pick up grammar, facts about the world, and some degree of reasoning and common sense.
    - Fine-tuning: After pre-training, the model can be fine-tuned on a specific task, such as translation, question-answering, or summarization, using a smaller, task-specific dataset.

* Autoregressive Nature: GPT generates text in an autoregressive manner. This means it produces one token at a time and uses everything generated so far as context for the next token.
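
The autoregressive loop described above can be sketched with a toy model. The lookup table `NEXT_TOKEN` below is a hypothetical stand-in for a trained network (it is not GPT, and a real model conditions on the whole context rather than only the last token). The point is only the control flow: predict one token, append it, and feed the extended sequence back in.

```python
# Toy stand-in for a language model: maps the last token to the next one.
# A real GPT conditions on the entire context; this table is illustrative only.
NEXT_TOKEN = {
    "once": "upon",
    "upon": "a",
    "a": "time",
    "time": "<eos>",
}

def generate(prompt_tokens, max_new_tokens=10):
    """Greedy autoregressive generation: one token per step."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = NEXT_TOKEN.get(tokens[-1], "<eos>")
        if next_token == "<eos>":
            break
        tokens.append(next_token)  # the output becomes part of the next input
    return tokens

print(generate(["once"]))
```

Generation stops either at the end-of-sequence marker or after `max_new_tokens` steps, which mirrors the role of a length limit in real text generation.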

#### Key Features:

* Generative Abilities: As the name suggests, GPT excels at generating text. It can produce text that is often indistinguishable from what a human might write.
* Few-Shot Learning: Popularized by GPT-3, this capability allows the model to perform a task from only a few examples placed directly in the prompt (sometimes just one, or none at all). Given a task described in natural language, GPT-3 can often perform it without any fine-tuning or weight updates.
* Versatility: Unlike many models that are trained for a specific task, GPT models, especially GPT-3, are versatile and can handle a wide range of tasks without task-specific training. This includes writing essays, answering questions, creating poetry, generating code, and much more.
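
A few-shot prompt of the kind described above is ordinary text: an instruction, a handful of worked examples, and the new input left open for the model to complete. The helper below is a minimal sketch; the function name and the `Input:`/`Output:` layout are assumptions chosen for illustration, not a format the model requires.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt."""
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # Leave the final output empty so the model fills it in.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("house", "maison")],
    "book",
)
print(prompt)
```

With an empty `examples` list the same helper produces a zero-shot prompt.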

#### Versions:

* GPT: The original model introduced by OpenAI.

* GPT-2: A larger and more powerful version that garnered significant attention due to its impressive text generation capabilities. OpenAI initially withheld the fully trained model due to concerns about misuse, then released it in stages after reporting no strong evidence of misuse.

* GPT-3: The third iteration, with 175 billion parameters, which made it one of the largest language models at the time of its release. It popularized few-shot and zero-shot prompting, further advancing the state of the art in various NLP tasks.

### Exercise: Exploring Creative Writing with GPT

Your task is to use OpenAI's GPT model to generate creative content. You'll explore various prompts and settings to see how GPT responds and creates different outputs.

**Load GPT-2 Model and Tokenizer. GPT-2 is freely available in Hugging Face's model hub and is still highly effective.**

In [1]:
!pip install transformers

Collecting transformers
  Downloading transformers-4.32.1-py3-none-any.whl (7.5 MB)
Collecting huggingface-hub<1.0,>=0.15.1 (from transformers)
  Downloading huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers)
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Collecting safetensors>=0.3.1 (from transformers)
  Downloading safetensors-0.3.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)

In [6]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = 'gpt2-medium'  # You can start with 'gpt2' (smaller) and then experiment with larger models
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

**Generate Creative Content Function**

The temperature parameter controls the creativity and variability of the generated text. It only takes effect when sampling is enabled (do_sample=True).

With temperature=1.0 (the default), the model's original probability distribution is preserved.

Values below 1.0, such as 0.7, sharpen the distribution and give more deterministic, less creative output. Values above 1.0 flatten it and give more random output.
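
To make the effect of temperature concrete, the sketch below applies it to a made-up set of logits: each logit is divided by the temperature before the softmax. The numbers are invented for illustration and do not come from GPT-2.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

The lower the temperature, the more probability mass moves to the top-scoring token; the higher the temperature, the closer the distribution gets to uniform.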



In [13]:
def generate_creative_content(prompt, max_length=150, temperature=1.0):
    """Generate creative content using GPT-2 based on a given prompt."""

    # Encode the prompt
    input_ids = tokenizer.encode(prompt, return_tensors='pt')

    # Sample a continuation. temperature, top_k and top_p only take effect
    # because do_sample=True; no_repeat_ngram_size=2 forbids repeated bigrams.
    output = model.generate(input_ids,
                            max_length=max_length,
                            temperature=temperature,
                            pad_token_id=tokenizer.eos_token_id,
                            do_sample=True,
                            top_k=50,
                            top_p=0.95,
                            no_repeat_ngram_size=2)

    # Decode only the newly generated tokens (skip the prompt)
    generated_text = tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True)

    return generated_text
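
The call above combines top_k and top_p with sampling. As a rough illustration of what these two filters do to a probability distribution, here is a simplified sketch in plain Python. It is not the Hugging Face implementation (which works on logits and differs in details); the function name and the example probabilities are assumptions made for this example.

```python
def top_k_top_p_filter(probs, top_k=50, top_p=0.95):
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative probability reaches top_p, and renormalize what is left."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    ranked = ranked[:top_k]
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {index: p / total for index, p in kept}

probs = [0.5, 0.3, 0.15, 0.04, 0.01]  # made-up next-token probabilities
filtered = top_k_top_p_filter(probs, top_k=4, top_p=0.9)
print(filtered)
```

Sampling then draws only from the surviving tokens, which cuts off the long tail of unlikely tokens that tends to derail the text at higher temperatures.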

#### Experiment:
**Use various prompts and observe GPT's creative capabilities.
Change parameters like *max_length* and *temperature* to see their impact. (Note: A higher temperature value makes output more random, while a lower value makes it more deterministic.)**

In [15]:
prompts = [
  "Érase una vez, en un reino muy lejano,",
  "En un futuro distópico, donde la IA gobierna el mundo,",
  "El último dinosaurio en la Tierra no era como los demás. Éste",
  "En las profundidades del océano, una civilización secreta"
]
for prompt in prompts:
    print(f"Prompt: {prompt}")
    print(generate_creative_content(prompt))
    print("\n" + "-"*50 + "\n")

Prompt: Érase una vez, en un reino muy lejano,
 por lo que yo me siente, me un viejo diferente para lo comentarios, afectivo hombre.

El más de la veza, para aquí el último con un cara de mejor, que me toda el año, así, es que parece mi dio en mi santa para me despuerso de vista de las fierro, poder un área, con lo tudo, qué tú los métores. (¿Cómo es uno, señor?!)


--------------------------------------------------

Prompt: En un futuro distópico, donde la IA gobierna el mundo,
 cómo lo siento no se le estarelo. Esse eso es una fuerza de un área, é que lo tienen que no le ha bló las han ha bien esos que hocative un esa. É que los comienzos que bientos ser vies de su su casa, según el trabajo de la fuente. Si no cambio se vinculado a diferentes sonos, porque con su familia que un poco se ha está si se descansando esté le

--------------------------------------------------

Prompt: El último dinosaurio en la Tierra no era como los demás. Éste
 hacerla puede en los formos del trastro d

* **Next, generate with different settings. The first configuration uses max_length=250 and temperature=2.0.**

In [5]:
def generate_creative_content(prompt, max_length=250, temperature=2.0):
    """Generate creative content using GPT-2 based on a given prompt."""

    # Encode the prompt
    input_ids = tokenizer.encode(prompt, return_tensors='pt')

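    # Generate text. Note: do_sample is not set here, so generate() decodes
    # greedily and the temperature value has no effect on the output.
    output = model.generate(input_ids, max_length=max_length, temperature=temperature, pad_token_id=tokenizer.eos_token_id)

    # Decode only the newly generated tokens (skip the prompt)
    generated_text = tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True)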

    return generated_text
###########################
prompts = [
    "Once upon a time, in a kingdom far away,",
    "In a dystopian future, where AI rules the world,",
    "The last dinosaur on Earth was not like the others. It",
    "Deep beneath the ocean waves, a secret civilization"
]

for prompt in prompts:
    print(f"Prompt: {prompt}")
    print(generate_creative_content(prompt))
    print("\n" + "-"*50 + "\n")

Prompt: Once upon a time, in a kingdom far away,




 there lived a king who was a great warrior. He was a great warrior, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he

--------------------------------------------------

Prompt: In a dystopian future, where AI rules the world,
 the only way to survive is to b

* **The second configuration uses max_length=300 and temperature=3.0.**

In [None]:
def generate_creative_content(prompt, max_length=300, temperature=3.0):
    """Generate creative content using GPT-2 based on a given prompt."""

    # Encode the prompt
    input_ids = tokenizer.encode(prompt, return_tensors='pt')

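    # Generate text. Note: do_sample is not set here, so generate() decodes
    # greedily and the temperature value has no effect on the output.
    output = model.generate(input_ids, max_length=max_length, temperature=temperature, pad_token_id=tokenizer.eos_token_id)

    # Decode only the newly generated tokens (skip the prompt)
    generated_text = tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True)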

    return generated_text
###########################
prompts = [
    "Once upon a time, in a kingdom far away,",
    "In a dystopian future, where AI rules the world,",
    "The last dinosaur on Earth was not like the others. It",
    "Deep beneath the ocean waves, a secret civilization"
]

for prompt in prompts:
    print(f"Prompt: {prompt}")
    print(generate_creative_content(prompt))
    print("\n" + "-"*50 + "\n")

Prompt: Once upon a time, in a kingdom far away,
 there lived a king who was a great warrior. He was a great warrior, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he was a great king. He was a great king, and he w
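
The two configurations above produce the same opening text even though the temperature changed from 2.0 to 3.0. A likely explanation is that generate was called without do_sample=True, so decoding is greedy: the most probable token is taken at every step, and dividing the logits by a positive temperature never changes which token is the most probable. The sketch below checks this on made-up logits (the values are illustrative, not taken from GPT-2).

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_choice(logits, temperature):
    """Index of the most probable token after temperature scaling."""
    probs = softmax(logits, temperature)
    return max(range(len(probs)), key=probs.__getitem__)

logits = [1.2, 3.4, 0.7, 2.9]  # made-up scores for four candidate tokens
choices = {t: greedy_choice(logits, t) for t in (0.5, 1.0, 2.0, 3.0)}
print(choices)  # the same index is selected for every temperature
```

Greedy decoding is also a common cause of the looping sentences seen in these outputs; enabling sampling, as in the first version of the function, is one way to break the loop.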

**Discussion and Analysis:**


- **Analyze the quality of the generated text: coherence, relevancy, and creativity.**

First prompt: The generated text has little coherence and becomes repetitive and meaningless. It opens with a coherent premise but dissolves into repeated phrases. Relevance declines as the text goes on. There is no creativity, since the model appears to recycle the same phrases.

Second prompt: Coherence holds at first, and the opening is relevant, with the idea of surviving as a cyborg. However, the text quickly loses coherence as it starts to repeat itself, and relevance declines. Creativity is limited, since the model sticks to the same idea without developing it.

Third prompt: The generated text lacks coherence from the start and turns meaningless with repeated phrases. Relevance to the prompt is lost. Creativity is minimal, as the model struggles to produce meaningful content.

Fourth prompt: The generated text starts coherently, but relevance is lost when it unexpectedly shifts to describing a video game. Coherence and relevance suffer greatly, and there is no creativity, since the model regurgitates existing information.
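
The repetition noted in this analysis can also be estimated numerically. The function below is one possible rough measure (an assumption for illustration, not a standard metric from the exercise): the fraction of word n-grams in a text that duplicate an n-gram seen earlier. The two sample strings are short stand-ins, not full model outputs.

```python
def repeated_ngram_fraction(text, n=3):
    """Fraction of word n-grams that duplicate an earlier n-gram in the text."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)

looping = "He was a great king, and he was a great king. " * 5
varied = "The king rode north at dawn to meet the envoys from the coast."

print(round(repeated_ngram_fraction(looping), 2))
print(round(repeated_ngram_fraction(varied), 2))
```

A value near 0 means almost every n-gram is new; a value near 1 means the text is mostly looping. It says nothing about coherence or relevance, which still need a human reading.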

- **Discuss how different prompts influence the direction of the story.**

As the examples above show, the model tends to repeat itself and to lose relevance and coherence when asked to continue a story without a clear stimulus.

- **Experiment with custom prompts to generate different genres of creative content (e.g., horror, sci-fi, romance).**

In [None]:
# Create new prompts for a specific genre of creative content, in this case horror
prompts = [
    "Under the full moon, in the silence of the night, a shadow lurked in the darkness,",
    "The trees whispered secrets, and the wind carried a shiver with it,",
    "In an abandoned house, footsteps echoed, though no living being was present,"
]

for prompt in prompts:
    print(f"Prompt: {prompt}")
    print(generate_creative_content(prompt))
    print("\n" + "-"*50 + "\n")

Prompt: Under the full moon, in the silence of the night, a shadow lurked in the darkness,
 and the moonlight was like a shadow upon the earth.

The shadow was like a shadow upon the earth, and the moonlight was like a shadow upon the earth.

The shadow was like a shadow upon the earth, and the moonlight was like a shadow upon the earth.

The shadow was like a shadow upon the earth, and the moonlight was like a shadow upon the earth.

The shadow was like a shadow upon the earth, and the moonlight was like a shadow upon the earth.

The shadow was like a shadow upon the earth, and the moonlight was like a shadow upon the earth.

The shadow was like a shadow upon the earth, and the moonlight was like a shadow upon the earth.

The shadow was like a shadow upon the earth, and the moonlight was like a shadow upon the earth.

The shadow was like a shadow upon the earth, and the moonlight was like a shadow upon the earth.

The shadow was like a shadow upon the earth, and the moonlight was like

**Tasks:**

* **Use a larger GPT-2 variant (gpt2-large or gpt2-xl) and compare the quality of outputs**.

To use a larger variant of the model, only the model name needs to change, to "gpt2-large" or "gpt2-xl". Here I use gpt2-large.

In [None]:
model_name_l = 'gpt2-large'
model_l = GPT2LMHeadModel.from_pretrained(model_name_l)
tokenizer_l = GPT2Tokenizer.from_pretrained(model_name_l)


**We keep the creative-content generation function from the start of this exercise, adapted to the larger model.**

In [None]:
def generate_creative_content_l(prompt, max_length=150, temperature=1.0):
    """Generate creative content using GPT-2 based on a given prompt."""

    # Encode the prompt
    input_ids = tokenizer_l.encode(prompt, return_tensors='pt')

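    # Generate text with the large model. do_sample is not set, so decoding is
    # greedy and temperature has no effect. Use tokenizer_l for the pad token too.
    output = model_l.generate(input_ids, max_length=max_length, temperature=temperature, pad_token_id=tokenizer_l.eos_token_id)

    # Decode only the newly generated tokens (skip the prompt)
    generated_text = tokenizer_l.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True)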

    return generated_text

* **Incorporate user feedback loops, where after getting an initial piece of text, they can provide a follow-up prompt to continue or steer the story.**

In [None]:
prompts = [
    "Once upon a time, in a kingdom far away,",
    "In a dystopian future, where AI rules the world,",
    "The last dinosaur on Earth was not like the others. It",
    "Deep beneath the ocean waves, a secret civilization"
]

for prompt in prompts:
    user_feedback = ""
    while not user_feedback.lower().startswith("exit"):
        print(f"Prompt: {prompt}")
        generated = generate_creative_content_l(prompt)
        print(generated)
        user_feedback = input("Provide feedback or enter a follow-up prompt (type 'exit' to move to the next prompt): ").strip()
        # Append the follow-up unless the user asked to move on
        if not user_feedback.lower().startswith("exit"):
            prompt += " " + user_feedback
        print("-" * 50)
    print("=" * 50)

Prompt: Once upon a time, in a kingdom far away,
 the king and queen were blessed with a beautiful baby girl. Ever since that day, the princess has been growing up in a world of magic and adventure. But one day, the king and queen are kidnapped by a band of thieves. Now, the princess must find her way home and save her kingdom.

The game is set in a fantasy world where the player controls a young girl named Princess Zelda. The game is set in a fantasy world where the player controls a young girl named Princess Zelda. The game is set in a fantasy world where the player controls a young girl named Princess Zelda. The game is set in a fantasy world where the player controls a young girl named Princess Zelda.


Provide feedback or enter a follow-up prompt (type 'exit' to move to the next prompt):   The princess was brave and beautiful


--------------------------------------------------
Prompt: Once upon a time, in a kingdom far away,  The princess was brave and beautiful
, And the king was kind and gentle. But one day, the king was cruel and cold, And the princess was sad and alone. And the king was angry and cold, And the princess was sad and alone. And the king was cruel and cold, And the princess was sad and alone. And the king was cruel and cold, And the princess was sad and alone. And the king was cruel and cold, And the princess was sad and alone. And the king was cruel and cold, And the princess was sad and alone. And the king was cruel and cold, And the princess was sad and alone. And the king was cruel and cold,


Provide feedback or enter a follow-up prompt (type 'exit' to move to the next prompt):  exit


--------------------------------------------------
Prompt: In a dystopian future, where AI rules the world,
 the only way to survive is to become a super-soldier.

The game is set in a future where the world is ruled by a super-intelligence called the AI. The AI is a powerful and intelligent being that has been created by the government to protect the world from the threat of a super-intelligence.

The game is set in a dystopian future where the world is ruled by a super-intelligence called the AI. The AI is a powerful and intelligent being that has been created by the government to protect the world from the threat of a super-intelligence.

The game is set in a dystopian future where the world is ruled by a super-intelligence called the


Provide feedback or enter a follow-up prompt (type 'exit' to move to the next prompt):  A handsome and strong young man


--------------------------------------------------
Prompt: In a dystopian future, where AI rules the world, A handsome and strong young man
 named John is sent to a remote island to find a girl named Jane. But when he arrives, he finds that the island is a prison, and the only way to escape is to kill the guards.

The film is based on the novel by John Scalzi, and stars John Cusack, Jennifer Jason Leigh, and John Goodman.

The film is set to be released on October 10, 2017.

Source: Variety


Provide feedback or enter a follow-up prompt (type 'exit' to move to the next prompt):  exit


--------------------------------------------------
Prompt: The last dinosaur on Earth was not like the others. It
 was a giant, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked, long-necked


Provide feedback or enter a follow-up prompt (type 'exit' to move to the next prompt):  While the wild animals of the forest.


--------------------------------------------------
Prompt: The last dinosaur on Earth was not like the others. It While the wild animals of the forest.
 The dinosaurs were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They were not like the other animals. They


Provide feedback or enter a follow-up prompt (type 'exit' to move to the next prompt):  exit


--------------------------------------------------
Prompt: Deep beneath the ocean waves, a secret civilization
 has been building a massive, underground city. The city is called the City of the Dead, and it is home to a mysterious race of undead. The city is a living, breathing, breathing city, and it is the only place in the world where the dead can live. The City of the Dead is a living, breathing, breathing city. It is a living, breathing, breathing city. It is a living, breathing, breathing city. It is a living, breathing, breathing city. It is a living, breathing, breathing city. It is a living, breathing, breathing city. It is a living, breathing, breathing city. It is a living, breathing, breathing city. It


Provide feedback or enter a follow-up prompt (type 'exit' to move to the next prompt):  exit


--------------------------------------------------
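
In the loop above, only the user's follow-up is appended to the prompt; the text the model generated is discarded, so each turn restarts the story from the prompt. One possible alternative, sketched below, keeps the generated text in the running context. The names `continue_story` and `fake_generate` are hypothetical, and the stub generator exists only so the sketch runs without downloading a model; in the notebook, a function such as generate_creative_content_l could be passed instead.

```python
def continue_story(prompt, generate_fn, follow_ups):
    """Extend a story over several turns, keeping generated text in the context.

    generate_fn is any callable that maps a prompt string to a continuation.
    """
    story = prompt
    for follow_up in follow_ups:
        story += generate_fn(story)      # keep what the model wrote
        follow_up = follow_up.strip()
        if follow_up.lower() == "exit":
            break
        story += " " + follow_up         # steer the next continuation
    return story

# Stub generator (hypothetical) so the example is self-contained.
def fake_generate(text):
    return f" [continuation {text.count('[') + 1}]"

result = continue_story(
    "Once upon a time,",
    fake_generate,
    ["The princess was brave.", "exit"],
)
print(result)
```

With a real model the story grows every turn, while max_length in generate counts the prompt tokens as well and GPT-2 has a context window of 1024 tokens, so a practical version would need to raise the length limit and truncate the oldest text.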
