# Rick and Morty episode generator

The goal of this notebook is to build a generator that creates an entire made-up episode of Rick and Morty, with a title, a summary, and a post-credits scene.

As we go along, each piece will be generated based on what we've created previously.

The flow of the process is as follows:
1. Load the Rick and Morty dataset, which contains summaries of all episodes of the series' first six seasons.
2. Create a title based on the titles of existing episodes.
3. Produce a summary of the made-up episode based on the title and existing episode summaries.
4. Generate a post-credits scene for the made-up episode based on the generated title, generated summary, and existing post-credits scenes.
5. Create custom datasets for training models in AI21 Studio based on the full dataset.
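
The chained flow above can be sketched independently of any particular API. In this minimal sketch, `generate` is a hypothetical stand-in for a call to a language model (it is not part of the `ai21` package), and the prompt wording is only illustrative. The point is that each step's output is fed into the next step's prompt.

```python
from typing import Callable, Dict


def generate_episode(generate: Callable[[str], str]) -> Dict[str, str]:
    """Chain three generations; each prompt includes the earlier outputs.

    `generate` is a hypothetical callable that maps a prompt to model text.
    """
    title = generate("Generate a new made up Rick and Morty episode title:\n").strip()
    summary = generate(f"Title: {title}\nEpisode:\n").strip()
    post_scene = generate(
        f"Title: {title}\nSummary:\n{summary}\nPost-credits scene:\n"
    ).strip()
    return {"title": title, "summary": summary, "post_credits_scene": post_scene}


# Stand-in generator so the sketch runs without an API key:
# it only reports how long the prompt it received was.
def fake_generate(prompt: str) -> str:
    return f"<text generated from a {len(prompt)}-character prompt>"


episode = generate_episode(fake_generate)
for part, text in episode.items():
    print(f"{part}: {text}")
```

Passing the model call in as a function keeps the chaining logic separate from the API client, so the same flow can be exercised with a fake generator first.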

### Imports and settings

Installations - just in case

In [1]:
!pip install pandas
!pip install numpy
!pip install -U ai21

Collecting ai21
  Downloading ai21-1.0.5.tar.gz (9.4 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: ai21
  Building wheel for ai21 (setup.py) ... done
  Created wheel for ai21: filename=ai21-1.0.5-py3-none-any.whl size=16659 sha256=1540f14fd8b64b6166016bfcf078a1305856b1198d6ee63e5a2408f7f9099efd
  Stored in directory: /Users/yuvalbelfer/Library/Caches/pip/wheels/03/a8/f4/77b493ec691fad0cbdac10ef3fd0cf6769ced4b5a1911a3e

In [2]:
import pandas as pd
import numpy as np
import json
import ai21

### API key
In order to run this notebook, you will need an API key for AI21 Studio. How can you get it?

Create a free account at [AI21 Studio](https://studio.ai21.com). You can see your API key in the *Account* tab.

In [3]:
# TODO: fill your API key from AI21 Studio account
ai21.api_key = ""
assert ai21.api_key != "", "You must provide an API key!"

### Load the Rick & Morty episodes dataset

In [4]:
rnm_df = pd.read_csv("RickNMorty.csv")
rnm_df.dropna(inplace=True) # Remove rows with missing entries

Take a look at the dataset and what it contains

In [5]:
rnm_df.head()

| Unnamed: 0 | Season | Episode | Episode Title | Episode Summary | Post-Credits Scene Summary |
| --- | --- | --- | --- | --- | --- |
| 1 | 1 | 2 | Lawnmower Dog | On Jerry's demand, Rick gives Snuffles, the fa... | Back in Scary Terry's class, the old teacher h... |
| 2 | 1 | 3 | Anatomy Park | On Christmas, Rick sends Morty inside the body... | Rick contacts Annie and her other new associat... |
| 3 | 1 | 4 | M. Night Shaym-Aliens! | Aliens hold Rick and Jerry captive in a virtua... | Jerry is fired on the spot after debuting his ... |
| 4 | 1 | 5 | Meeseeks and Destroy | After killing deadly enemies in an alternative... | Two villagers find a box of incriminating phot... |
| 5 | 1 | 6 | Rick Potion #9 | Morty is interested in going to the dance at h... | Jerry, Beth and Summer are happy to stay toget... |


### Parameters

In [6]:
SEPARATOR = '\n##\n'
SUBSET_SIZE = 7
SEED = 23

We want to use these examples to create a [few-shot prompt](https://docs.ai21.com/docs/prompt-engineering#-few-shot-prompts), where we feed several examples to the model as a prefix to the query. Because the prompt length is limited, we sample a subset of the full dataset instead of including every episode.

In [7]:
# Sample a subset of the episodes for few-shot generations
rnm_df_subset = rnm_df.sample(n=SUBSET_SIZE, random_state=SEED)

## Title generator

Create a title based on the titles of existing episodes.

In [8]:
# Build the few-shot examples that will be in the few-shot prompt
def make_single_example_title(title):
    return f"Generate a new made up Rick and Morty episode title:\n{title}"

TITLE_FEW_SHOT_PREFIX = SEPARATOR.join(
    make_single_example_title(title) for title in rnm_df_subset["Episode Title"].tolist()
)

def create_title_fewshot_prompt():
    """
    Create a few-shot prompt to generate episode titles
    The few shot examples followed by a new query
    """
    return TITLE_FEW_SHOT_PREFIX + SEPARATOR + make_single_example_title('')  # keep the title blank and let the model generate

Create a few-shot prompt for title generation and take a look at it

In [9]:
prompt_title = create_title_fewshot_prompt()
print(prompt_title)

Generate a new made up Rick and Morty episode title:
The Rickchurian Mortydate
##
Generate a new made up Rick and Morty episode title:
The Whirly Dirly Conspiracy
##
Generate a new made up Rick and Morty episode title:
Solaricks
##
Generate a new made up Rick and Morty episode title:
Close Rick-counters of the Rick Kind
##
Generate a new made up Rick and Morty episode title:
The ABC's of Beth
##
Generate a new made up Rick and Morty episode title:
Look Who's Purging Now
##
Generate a new made up Rick and Morty episode title:
Full Meta Jackrick
##
Generate a new made up Rick and Morty episode title:



#### Generation time! Send a request to AI21 Studio language model (here - Jurassic-2 Grande) and let it generate

In [12]:
response = ai21.Completion.execute(model="j2-grande",
                                   prompt=prompt_title,
                                   maxTokens=20,
                                   stopSequences=[SEPARATOR],
                                   temperature=0.7
                                  )

generated_title = response['completions'][0]['data']['text']
print(generated_title)

Rickmancing the Stone


### Problem: In some cases, the model returns titles from real episodes!

What is causing this? Since the model has been trained on Wikipedia, among other sources, it knows what episodes exist. Due to the fact that every example in the few-shot prompt corresponds to an actual episode, the model is biased towards creating real titles.
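
One way to notice this case automatically is to compare a generated title against the known titles before accepting it. The following is a minimal sketch; the helper names `normalize_title` and `is_existing_title` are hypothetical, and the normalization (lowercasing and dropping punctuation) is an assumption about what should count as "the same title".

```python
import re
from typing import Iterable


def normalize_title(title: str) -> str:
    """Lowercase a title and drop punctuation and extra whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", "", title.lower())
    return " ".join(cleaned.split())


def is_existing_title(candidate: str, known_titles: Iterable[str]) -> bool:
    """Return True if the candidate matches a known title after normalization."""
    known = {normalize_title(t) for t in known_titles}
    return normalize_title(candidate) in known


# A few real episode titles; in the notebook the full list
# would come from rnm_df["Episode Title"].
real_titles = ["Solaricks", "The ABC's of Beth", "Rickmancing the Stone"]
print(is_existing_title(" rickmancing the stone\n", real_titles))
print(is_existing_title("A Title That Does Not Exist", real_titles))
```

A check like this only detects the problem; a generation that fails it could be discarded and requested again.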

### What can be done?

Provide made-up examples to the model. Here are some titles we created for the prompt:

In [13]:
made_up_titles = ["Rickwarium Full of Mortfish", "Where Do The Ricks Play", "The Mortysburg Syndrome", "Where The Wild Mortys Go", "Outrickous", "Jerry\'s Space Addiction", "Morty Christ Superstar"]

In [14]:
TITLE_FEW_SHOT_PREFIX = SEPARATOR.join(
    make_single_example_title(title) for title in made_up_titles
)

prompt_title = create_title_fewshot_prompt()
print(prompt_title)

Generate a new made up Rick and Morty episode title:
Rickwarium Full of Mortfish
##
Generate a new made up Rick and Morty episode title:
Where Do The Ricks Play
##
Generate a new made up Rick and Morty episode title:
The Mortysburg Syndrome
##
Generate a new made up Rick and Morty episode title:
Where The Wild Mortys Go
##
Generate a new made up Rick and Morty episode title:
Outrickous
##
Generate a new made up Rick and Morty episode title:
Jerry's Space Addiction
##
Generate a new made up Rick and Morty episode title:
Morty Christ Superstar
##
Generate a new made up Rick and Morty episode title:



In [17]:
response = ai21.Completion.execute(model="j2-grande",
                                   prompt=prompt_title,
                                   maxTokens=20,
                                   stopSequences=[SEPARATOR],
                                   temperature=0.7
                                  )

generated_title = response['completions'][0]['data']['text']
print(generated_title)

A Mortyman on the Loose


## Episode generator

Produce a summary of the made-up episode based on the title and existing episode summaries.

Here are the functions we will use:

- Make a prompt based on the title.
- Create an example that includes a prompt-completion pair.
- Build a prefix of a few examples for the prompt.
- Create a few-shot prompt by combining the prefix of examples with a new title query.

In [18]:
# Build the few-shot examples that will be in the few-shot prompt

def create_prompt_for_summary(title):
    prompt = "Generate a Rick and Morty episode based on the title.\n"
    prompt += "Title: "
    prompt += title
    prompt += "\n"
    prompt += "Episode:\n"
    
    return prompt


def make_single_example_summary(title, summary):
    example = create_prompt_for_summary(title)
    example += summary
    
    return example


SUMMARY_FEW_SHOT_PREFIX = SEPARATOR.join(
    rnm_df_subset.apply(lambda x: make_single_example_summary(x["Episode Title"], x["Episode Summary"]), axis=1)
)

In [19]:
def create_summary_fewshot_prompt(title):
    """
    Create a few-shot prompt to generate an episode summary
    The few shot examples followed by a new query
    """
    return SUMMARY_FEW_SHOT_PREFIX + SEPARATOR + create_prompt_for_summary(title)

Create a few-shot prompt for episode summary generation and take a look at it

In [20]:
prompt_summary = create_summary_fewshot_prompt(generated_title)
print(prompt_summary)

Generate a Rick and Morty episode based on the title.
Title: The Rickchurian Mortydate
Episode:
The President calls on Rick and Morty to defeat a monster in the tunnels underneath the White House, which they do with little effort. Annoyed that he constantly calls on them without any gratitude, they go back home to play Minecraft, with the President quickly finding out. The resulting argument leads to a battle of egos that culminates in a fight in the White House between Rick and the President's security. Meanwhile, fearing she might be a clone made by Rick, Beth reunites with Jerry to figure out the truth. Shortly after, the entire family gets together to hide from Rick, but he tracks them down. Rick eventually submits to Jerry once again being a family member. Rick ends his conflict with the President by pretending to be Fly Fishing Rick, a Rick from a different reality, and calling a truce. The episode ends with the family happy to be together again, except for Rick who is disappoint

#### Generation time! Send a request to AI21 Studio language model (here - Jurassic-2 Grande) and let it generate

In [21]:
response = ai21.Completion.execute(model="j2-grande",
                                   prompt=prompt_summary,
                                   maxTokens=300,
                                   stopSequences=[SEPARATOR],
                                   temperature=0.7
                                  )

generated_summary = response['completions'][0]['data']['text']
print(f"Title: {generated_title}")
print("----------------")
print(f"Summary: {generated_summary}")

Title: A Mortyman on the Loose
----------------
Summary: Morty finds an alien cockroach living in Jerry's toilet, and decides to keep it as a pet. The alien, named Mr. Meeseeks, promises to do whatever Morty wants, and follows through with his requests. Morty asks him to search for a portal gun, and, after Meeseeks finds it, the cockroach continuously uses it to travel around the universe and fulfill more tasks. Rick and Morty have adventures in alternate dimensions, including a world where Rick is worshiped as a god, and one in which Morty is the dominant species. Morty becomes addicted to having Meeseeks fulfill his wishes, to the neglect of his family. He eventually has Meeseeks build him a portal gun, but realizes how horrible it is to be worshiped as a god and destroys the portal. Meeseeks asks Rick to destroy him, but Rick refuses, leaving Meeseeks trapped on the Smiths' bathroom floor. Morty finally realizes how obsessed and pathetic he is, and begs Meeseeks to kill him. Meeseek
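
The summary above stops mid-word, which is what happens when the generation reaches `maxTokens` before a stop sequence. A minimal sketch of one possible cleanup follows; `trim_to_last_sentence` is a hypothetical helper, and it assumes that cutting at the last `.`, `!` or `?` is acceptable.

```python
def trim_to_last_sentence(text: str) -> str:
    """Drop a trailing fragment that does not end with ., ! or ?.

    If no sentence-ending punctuation is found, the text is returned
    unchanged (apart from stripped whitespace).
    """
    text = text.strip()
    last_end = max(text.rfind(mark) for mark in ".!?")
    if last_end == -1:
        return text
    return text[: last_end + 1]


sample = "Morty begs Meeseeks to kill him. Meeseek"
print(trim_to_last_sentence(sample))
```

This is deliberately naive: an abbreviation such as "Mr." is treated as a sentence end, and a closing quote after the final period is dropped.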

#### Tangent - use a task-specific model

In [23]:
summary_summarize = ai21.Summarize.execute(
    source=generated_summary,
    sourceType='TEXT')

print(summary_summarize.summary)

Morty keeps an alien cockroach named Mr. Meeseeks as a pet, and asks him to search for a portal gun.
Rick and Morty have adventures in alternate dimensions, including one where Rick is worshiped as a god and one in which Morty is the dominant species. Morty becomes addicted to having Meeseeks fulfill his wishes, but Meeseeks eventually destroys the portal gun and disappears in frustration.


## Post-credits scene generator

Produce a post-credits scene for the made-up episode based on the generated title and summary, and on existing episodes' post-credits scenes.

We will use a similar process to the summary generation.

In [24]:
# Build the few-shot examples that will be in the few-shot prompt
def create_prompt_for_postscene(title, summary):
    prompt = "Generate a Post-credits scene for the following Rick and Morty episode based on the title and summary.\n"
    prompt += "Title: "
    prompt += title
    prompt += "\n"
    prompt += "Summary:\n"
    prompt += summary
    prompt += "\n"
    prompt += "Post-credits scene:\n"
    
    return prompt


def make_single_example_post(title, summary, post_credit):   
    example = create_prompt_for_postscene(title=title, summary=summary)
    example += post_credit
    
    return example


POST_FEW_SHOT_PREFIX = SEPARATOR.join(
    rnm_df_subset.apply(lambda x: make_single_example_post(x["Episode Title"], x["Episode Summary"], x["Post-Credits Scene Summary"]), axis=1)
)

In [25]:
def create_post_credit_fewshot_prompt(title, summary):
    """
    Create a few-shot prompt to generate an episode post-credits scene
    The few shot examples followed by a new query
    """
    return POST_FEW_SHOT_PREFIX + SEPARATOR + create_prompt_for_postscene(title, summary)

Create a few-shot prompt for post-credits scene generation and take a look at it

In [26]:
prompt_postscene = create_post_credit_fewshot_prompt(generated_title, generated_summary)
print(prompt_postscene)

Generate a Post-credits scene for the following Rick and Morty episode based on the title and summary.
Title: The Rickchurian Mortydate
Summary:
The President calls on Rick and Morty to defeat a monster in the tunnels underneath the White House, which they do with little effort. Annoyed that he constantly calls on them without any gratitude, they go back home to play Minecraft, with the President quickly finding out. The resulting argument leads to a battle of egos that culminates in a fight in the White House between Rick and the President's security. Meanwhile, fearing she might be a clone made by Rick, Beth reunites with Jerry to figure out the truth. Shortly after, the entire family gets together to hide from Rick, but he tracks them down. Rick eventually submits to Jerry once again being a family member. Rick ends his conflict with the President by pretending to be Fly Fishing Rick, a Rick from a different reality, and calling a truce. The episode ends with the family happy to be 

#### Generation time! Send a request to AI21 Studio language model (here - Jurassic-2 Grande) and let it generate

In [27]:
response = ai21.Completion.execute(model="j2-grande",
                                   prompt=prompt_postscene,
                                   maxTokens=100,
                                   stopSequences=[SEPARATOR],
                                   temperature=0.7
                                  )

generated_post_scene = response['completions'][0]['data']['text']
print(f"Title: {generated_title}")
print("----------------")
print(f"Summary: {generated_summary}")
print("----------------")
print(f"Post-credits scene: {generated_post_scene}")

Title: A Mortyman on the Loose
----------------
Summary: Morty finds an alien cockroach living in Jerry's toilet, and decides to keep it as a pet. The alien, named Mr. Meeseeks, promises to do whatever Morty wants, and follows through with his requests. Morty asks him to search for a portal gun, and, after Meeseeks finds it, the cockroach continuously uses it to travel around the universe and fulfill more tasks. Rick and Morty have adventures in alternate dimensions, including a world where Rick is worshiped as a god, and one in which Morty is the dominant species. Morty becomes addicted to having Meeseeks fulfill his wishes, to the neglect of his family. He eventually has Meeseeks build him a portal gun, but realizes how horrible it is to be worshiped as a god and destroys the portal. Meeseeks asks Rick to destroy him, but Rick refuses, leaving Meeseeks trapped on the Smiths' bathroom floor. Morty finally realizes how obsessed and pathetic he is, and begs Meeseeks to kill him. Meeseek

## Building a dataset

You can build a dataset to train a [custom model](https://docs.ai21.com/docs/custom-models).
Below we will build datasets based on the entire dataframe for both summary generation and post-credits scene generation.

In [28]:
# Recall the dataframe
rnm_df.head()

| Unnamed: 0 | Season | Episode | Episode Title | Episode Summary | Post-Credits Scene Summary |
| --- | --- | --- | --- | --- | --- |
| 1 | 1 | 2 | Lawnmower Dog | On Jerry's demand, Rick gives Snuffles, the fa... | Back in Scary Terry's class, the old teacher h... |
| 2 | 1 | 3 | Anatomy Park | On Christmas, Rick sends Morty inside the body... | Rick contacts Annie and her other new associat... |
| 3 | 1 | 4 | M. Night Shaym-Aliens! | Aliens hold Rick and Jerry captive in a virtua... | Jerry is fired on the spot after debuting his ... |
| 4 | 1 | 5 | Meeseeks and Destroy | After killing deadly enemies in an alternative... | Two villagers find a box of incriminating phot... |
| 5 | 1 | 6 | Rick Potion #9 | Morty is interested in going to the dance at h... | Jerry, Beth and Summer are happy to stay toget... |


In [29]:
# Build dataset for summary generation
# Each dataset has 2 columns: prompt, completion
custom_summary = pd.DataFrame(columns=["prompt", "completion"])

In [30]:
custom_summary["prompt"] = rnm_df.apply(lambda x: create_prompt_for_summary(x["Episode Title"]), axis=1)
custom_summary["completion"] = rnm_df.apply(lambda x: x["Episode Summary"], axis=1)

In [31]:
custom_summary.head()

| Unnamed: 0 | prompt | completion |
| --- | --- | --- |
| 1 | Generate a Rick and Morty episode based on the... | On Jerry's demand, Rick gives Snuffles, the fa... |
| 2 | Generate a Rick and Morty episode based on the... | On Christmas, Rick sends Morty inside the body... |
| 3 | Generate a Rick and Morty episode based on the... | Aliens hold Rick and Jerry captive in a virtua... |
| 4 | Generate a Rick and Morty episode based on the... | After killing deadly enemies in an alternative... |
| 5 | Generate a Rick and Morty episode based on the... | Morty is interested in going to the dance at h... |


In [32]:
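# index=False keeps the DataFrame index out of the file, leaving only prompt and completion
custom_summary.to_csv("RnM_episodes.csv", index=False)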
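
The comment above describes a two-column layout, and `DataFrame.to_csv` writes the index as an extra leading column unless `index=False` is passed, so it can be worth checking the file before uploading it. This is a minimal sketch using only the standard library; `find_dataset_problems` is a hypothetical helper, and the exact header it expects is an assumption taken from that comment.

```python
import csv
import io
from typing import List, TextIO


def find_dataset_problems(csv_file: TextIO) -> List[str]:
    """Return a list of problems found in a prompt/completion CSV.

    Assumes the file should contain exactly two columns,
    named "prompt" and "completion", with no empty cells.
    """
    reader = csv.DictReader(csv_file)
    problems = []
    header = reader.fieldnames or []
    if header != ["prompt", "completion"]:
        problems.append(f"unexpected header: {header}")
    for row_number, row in enumerate(reader, start=1):
        for column in ("prompt", "completion"):
            if not (row.get(column) or "").strip():
                problems.append(f"row {row_number}: empty {column}")
    return problems


# In-memory examples; for a real file use open("RnM_episodes.csv", newline="")
clean = io.StringIO('prompt,completion\n"Title: A\nEpisode:\n",Some summary\n')
with_index = io.StringIO(',prompt,completion\n0,"Title: A\nEpisode:\n",Some summary\n')
print(find_dataset_problems(clean))
print(find_dataset_problems(with_index))
```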

### Dataset upload

In [34]:
resp = ai21.Dataset.upload(file_path="RnM_episodes.csv", dataset_name='RnM_episodes')

### Training a custom model
To send a training request through the API, you need the dataset ID.
You can look it up by name with the following function:

In [35]:
def get_dataset_id(dataset_name):
    result = ai21.Dataset.list()
    dataset_idx = [r.datasetName for r in result].index(dataset_name)
    dataset_id = [r.id for r in result][dataset_idx]
    return dataset_id
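
Note that `get_dataset_id` raises a `ValueError` (from `list.index`) when no dataset has the requested name. A variant that returns `None` instead can be sketched as follows. The function name `find_dataset_id` is hypothetical, the attribute names `datasetName` and `id` are taken from the function above, and the records here are stand-ins with made-up IDs rather than real API results.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def find_dataset_id(datasets: Iterable, dataset_name: str) -> Optional[str]:
    """Return the id of the first dataset with the given name, or None."""
    for dataset in datasets:
        if dataset.datasetName == dataset_name:
            return dataset.id
    return None


# Stand-in records with the same attribute names used above
fake_listing = [
    SimpleNamespace(datasetName="RnM_episodes", id="id-1"),
    SimpleNamespace(datasetName="RnM_postscene", id="id-2"),
]
print(find_dataset_id(fake_listing, "RnM_postscene"))
print(find_dataset_id(fake_listing, "missing_name"))
```

Returning `None` lets the caller decide whether a missing dataset is an error, for example when the name passed in does not match the name used at upload time.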

In [36]:
# Use the same name that was passed to ai21.Dataset.upload above
dataset_id = get_dataset_id(dataset_name="RnM_episodes")

In [37]:
# The model name must match the custom_model value used when querying later
ai21.CustomModel.create(dataset_id=dataset_id, model_name='RnM_episodes_gen', model_type='j2-grande')

### Post scene dataset

In [31]:
# Build dataset for post-credits scene generation
# Each dataset has 2 columns: prompt, completion
custom_postscene = pd.DataFrame(columns=["prompt", "completion"])

In [32]:
custom_postscene["prompt"] = rnm_df.apply(lambda x: create_prompt_for_postscene(x["Episode Title"], x["Episode Summary"]), axis=1)
custom_postscene["completion"] = rnm_df.apply(lambda x: x["Post-Credits Scene Summary"], axis=1)

In [33]:
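# index=False keeps the DataFrame index out of the file, leaving only prompt and completion
custom_postscene.to_csv("RnM_postscene.csv", index=False)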

### Dataset upload

In [34]:
ai21.Dataset.upload(file_path="RnM_postscene.csv", dataset_name='RnM_postscene')

### Training a custom model

In [35]:
dataset_id_post = get_dataset_id(dataset_name="RnM_postscene")
ai21.CustomModel.create(dataset_id=dataset_id_post, model_name='RnM_postscene_gen', model_type='j2-grande')

### Query a custom model

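When querying a custom model, there's no need for a few-shot prompt: the model has already learned the format from the training examples, so the prompt contains only the new query.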

In [36]:
custom_model_episode_prompt = create_prompt_for_summary(generated_title)
print(custom_model_episode_prompt)

Generate a Rick and Morty episode based on the title.
Title: Morty's Shave
Episode:



In [38]:
response = ai21.Completion.execute(model="j2-grande",
                                   custom_model="RnM_episodes_gen",
                                   prompt=custom_model_episode_prompt,
                                   maxTokens=300,
                                   stopSequences=[SEPARATOR],
                                   temperature=0.7
                                  )

generated_summary = response['completions'][0]['data']['text']
print(f"Title: {generated_title}")
print("----------------")
print(f"Summary: {generated_summary}")

Title: Morty's Shave
----------------
Summary: Morty and Summer are sent on a scavenger hunt by Rick and Smith. They run afoul of a clown who wants them to retrieve a "shave". Rick eventually reveals all he wants is for Morty and Summer to become "more proactive." Back at Smith headquarters, Rick and Smith are revealed to be robots. The clown is revealed to be the creator of Rick and Smith, and after Rick and Smith tell him what he wants, he reprograms them. Rick and Smith then attempt to destroy Morty and Summer, but Smith is convinced by Morty to spare Summer's life. Rick and Smith are reprogrammed by Summer and Morty to become "more proactive". Summer and Morty are freed from their hypnosis and return home. Rick and Smith confront the clown, who reveals that, rather than destroying Rick and Smith, he was planning on using them to destroy the universe.
