<a href="https://colab.research.google.com/github/mayasrikanth/CS-236G-Project/blob/main/Generate_Story_Prompts.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Working on code for in-context prompt generation (currently debugging!)

In [1]:
!pip install transformers



Import transformers library and classes for loading model/tokenizer

In [2]:
import transformers # import transformers library 
# Using HuggingFace GPT-Neo (open-source alternative to GPT-3, large GPT-J) to generate story prompts 
from transformers import AutoModelForCausalLM, AutoTokenizer

Load model (currently EleutherAI's gpt-neo with 1.3billion parameters)

In [4]:
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")  

Downloading:   0%|          | 0.00/200 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/779k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/446k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/90.0 [00:00<?, ?B/s]

PromptTune: function that takes in prompt and outputs a prompt (max length=100 chars) generated by GPT-Neo. We are leveraging the model's zero-shot learning capabilities.

In [45]:
def PromptTune(prompt, model, tokenizer, temp=0.9, max_len=100):
  ''' Takes prompt as input, issues this to the model, and returns the decoded 
      output prompt limited to max_length characters. 
    Inputs: 
        - prompt: tuple containing a string representing the input sequence 
        - model: the specified transformer-based model 
        - max_len: max length of generated output (default=100 characters)
        - temp: temperature (higher temperature gives more diverse outputs, default=0.9)
    Output:
        - output: decoded string output 
  '''
  tokenized_input = tokenizer(prompt, return_tensors="pt")#.input_ids # tokenize prompt
  gen_tokens = model.generate(
    **tokenized_input,
    do_sample=True,
    temperature=temp,
    max_length=max_len,
    ) 
  gen_text = tokenizer.batch_decode(gen_tokens) # decode output tokens
  output = gen_text[0]
  print('output: ', output)
  # Return unique output (without repeating input)
  return gen_text

Prompt Type 1: Paraphrase with entity-linking

*   The objective of this prompt is to paraphrase a prompt which contains multiple sentences into a sentence that a text2im model can easily digest. Ideally, this step should link pronouns to a single entity in a consistent manner. We can think of this step as summarizing a "single page" in a simple children's storybook. 

In [48]:
prompt = ["Passage: All was calm under the deep sea. Suddenly, a dolphin appeared. He began swimming towards shallower waters. \n"

          "Summary: A dolphin swims in shallow sea water. \n"
          
          # "Passage: The sun rose, casting shadows over the mountain top. Sally the bear climbed the mountain top. She stood at the very tip of the peak. \n "
          # "Summary: A bear stands on a mountaintop at sunrise. \n "

          #"Passage: In the light of the moon, a little egg lay on a leaf. The edge hatched, and suddenly a tiny caterpillar appeared. \n ", # inspo from "The very hungry Caterpillar"
          #"Summary: \n ",
          # "Passage: A baby turtle with a purple shell was walking on the beach. He was so close to the ocean water. He wanted to plunge into the water for a cold swim. \n" 
          "Passage: A baby turtle with a purple shell was walking on the beach. He was approaching the ocean."
          "Summary: A baby turtle with a purple shell walks on the beach towards the ocean. \n"
          "Passage: "
          # "Summary: \n"
          
          
]

In [49]:
output = PromptTune(prompt, model, tokenizer, temp=0.95, max_len=300)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


output:  Passage: All was calm under the deep sea. Suddenly, a dolphin appeared. He began swimming towards shallower waters. 
Summary: A dolphin swims in shallow sea water. 
Passage: A baby turtle with a purple shell was walking on the beach. He was approaching the ocean.Summary: A baby turtle with a purple shell walks on the beach towards the ocean. 
Passage: 
Passage: 
Passage: 
Passage: 

The dolphin, now walking along the shore with a fish hanging from his mouth. A small baby turtle with a green shell is walking down the beach with the fish hanging from his mouth. Suddenly, a small red fish swimming towards the ocean. 
Summary: A green turtle, carrying a fish, walks on the beach with a red fish swimming towards the ocean.A baby turtle with a red shell walks on the beach with a baby turtle with a red shell. A baby turtle with a red shell walks on the beach with a green turtle with a red shell.A green turtle with a red shell walks on the beach, carrying a fish. A baby turtle with a r

In [44]:
print(output)

[]


Prompt Type 2: Add Visual Descriptors

Prompt Type 2: Open Ended (Plot Generation)

In [None]:
# for a creative ending 

OLD CODE (will remove later)

In [13]:
prompt = (
    "Sentence: A black cat sits next to a pumpkin \n" 
    "Image: A cat with fluffy black fur sits on a stone wall in the backyard next to a carved Halloween pumpkin \n"
    "Sentence: A swan on a lake \n"
) # generate prompt 

In [14]:
tokenized_input = tokenizer(prompt, return_tensors="pt")#.input_ids # tokenize prompt


In [18]:
gen_tokens = model.generate(
    **tokenized_input,
    do_sample=True,
    temperature=0.9,
    max_length=100,
) # generate output tokens 

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [19]:
gen_text = tokenizer.batch_decode(gen_tokens)[0] # decode output tokens

In [22]:
gen_text = tokenizer.batch_decode(gen_tokens)

In [26]:
prompt

'Sentence: A black cat sits next to a pumpkin \nImage: A cat with fluffy black fur sits on a stone wall in the backyard next to a carved Halloween pumpkin \nSentence: A swan on a lake \n'

In [27]:
print(gen_text[0][len(prompt):])

Image: A little girl sits on the ground next to her mother who is wearing a red and white striped apron. It is a scene similar to a fairy tale 
Sentence: A man in tight trousers and a polka-dot shirt sits before
