# Generate multiple [character.ai](https://beta.character.ai/) character definitions

This example shows how to generate character definitions for multiple [character.ai](https://beta.character.ai/) characters from a single corpus. The corpus used here is the movie transcript of [Everything Everywhere All At Once (2022)](https://scrapsfromtheloft.com/movies/everything-everywhere-all-at-once-transcript/).

To generate your own character definitions:
1. Put the corpus into a single `.txt` file inside the `data/` directory.
2. Assign the name of the `.txt` file to the `CORPUS` constant below.
3. Set the `NUM_CHARACTERS` constant below to the number of characters you want to generate descriptions for. You can also specify a list of characters directly, as explained below.
4. Run this notebook.

In [1]:
CORPUS = 'data/everything_everywhere_all_at_once.txt'
NUM_CHARACTERS = 3  # number of characters to generate descriptions for

In [2]:
from dataclasses import asdict
import json
import os

from src.character import get_character_definition
from src.corpus import get_characters, get_rolling_summaries, load_docs

In [3]:
# create directories to cache results and intermediate outputs
OUTPUT_ROOT = "output"
corpus_name = os.path.splitext(os.path.basename(CORPUS))[0]
output_dir = f"{OUTPUT_ROOT}/{corpus_name}"
os.makedirs(output_dir, exist_ok=True)
summaries_dir = f"{output_dir}/summaries"
character_definitions_dir = f"{output_dir}/character_definitions"
os.makedirs(character_definitions_dir, exist_ok=True)

## Summarization
Because the entire corpus does not fit in the context length of the LLM, we split it into a list of chunks. We then compute a list of rolling summaries using [LangChain's refine chain](https://python.langchain.com/en/latest/modules/chains/index_examples/summarize.html#the-refine-chain). The first chunk is summarized on its own; each subsequent summary is generated from the previous summary and the current chunk, so the last summary covers the whole corpus.
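
The rolling-summary idea can be sketched in plain Python. This is only an illustration of the refine pattern, not the code behind `get_rolling_summaries`: the function name `rolling_summaries_sketch`, the prompt wording, and the `toy_summarize` stand-in for an LLM call are all assumptions made for this example.

```python
from typing import Callable, List


def rolling_summaries_sketch(
    chunks: List[str],
    summarize: Callable[[str], str],
) -> List[str]:
    """Return one summary per chunk, each refined from the previous summary."""
    summaries: List[str] = []
    for chunk in chunks:
        if summaries:
            # Later chunks: combine the previous summary with the new text.
            prompt = f"Existing summary:\n{summaries[-1]}\n\nNew text:\n{chunk}"
        else:
            # First chunk: there is no previous summary to refine.
            prompt = chunk
        summaries.append(summarize(prompt))
    return summaries


def toy_summarize(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: keep the last 80 characters."""
    return prompt[-80:]


summaries = rolling_summaries_sketch(
    ["chunk one", "chunk two", "chunk three"], toy_summarize
)
print(len(summaries))
```

In a real run, `summarize` would wrap a call to the LLM, and the list returned here would play the role of `intermediate_summaries` in the cell below.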

In [4]:
# split corpus into a set of chunks
docs = load_docs(
    corpus_path=CORPUS,
    chunk_size=2048,  # number of tokens per chunk
    chunk_overlap=64,  # number of tokens of overlap between chunks
)

# generate rolling summaries
intermediate_summaries = get_rolling_summaries(docs=docs, cache_dir=summaries_dir)
rolling_summaries = "\n\n".join(intermediate_summaries)

Summaries already exist. Loading summaries.
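
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap over a list of tokens. It assumes a simple sliding window; `load_docs` may split differently (for example on separators), and `split_with_overlap` is a hypothetical helper that does not exist in `src.corpus`.

```python
from typing import List, Sequence


def split_with_overlap(
    tokens: Sequence[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the window already reached the end of the input
    return chunks


tokens = [f"t{i}" for i in range(10)]
chunks = split_with_overlap(tokens, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last token of the previous one, which gives the summarizer a little shared context across chunk boundaries.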


## Generate a list of characters
We can automatically generate a list of the main characters in the corpus, as shown below. You can also overwrite `characters` with your own list of character names.

In [5]:
# generate list of characters
characters = get_characters(
    rolling_summaries=rolling_summaries,
    num_characters=NUM_CHARACTERS,
    cache_dir=output_dir,
)
print(characters)

['Evelyn', 'Alpha Waymond', 'Jobu Tupaki']
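
To use a hand-picked list instead, reassign `characters` before running the next section. The snippet below is a small sketch; the cleanup step (trimming whitespace and dropping duplicates) is an optional convenience added for this example, not something the notebook requires.

```python
# Hand-picked names; they should match how the characters appear in the corpus.
manual_names = ["Evelyn", " Jobu Tupaki", "Evelyn"]

# Trim whitespace and drop duplicates while preserving the original order.
characters = list(dict.fromkeys(name.strip() for name in manual_names))
print(characters)
```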


## Generate character.ai character definition
Based on the corpus, we can now generate the elements required to [create a character.ai character](https://beta.character.ai/editing): name, short description (50 characters), long description (500 characters), and custom greeting. You can then [place these characters in a room](https://beta.character.ai/room/create?) and watch them converse!

In [16]:
character_definitions = []
for character in characters:
    character_definition = get_character_definition(
        name=character,
        rolling_summaries=rolling_summaries,
        cache_dir=character_definitions_dir,
    )
    character_definitions.append(character_definition)
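
Since the short and long descriptions have character limits (50 and 500, as stated above), it can be useful to check generated definitions before pasting them into character.ai. The helper below is a hypothetical sketch that works on plain dictionaries with the same field names as the printed output; `find_length_violations` is not part of `src.character`.

```python
from typing import Dict, List

# Limits taken from the description above.
LENGTH_LIMITS = {"short_description": 50, "long_description": 500}


def find_length_violations(definition: Dict[str, str]) -> List[str]:
    """Return a message for every field that exceeds its character limit."""
    problems: List[str] = []
    for field, limit in LENGTH_LIMITS.items():
        length = len(definition.get(field, ""))
        if length > limit:
            problems.append(f"{field}: {length} > {limit} characters")
    return problems


example = {
    "name": "Evelyn",
    "short_description": "You can Verse Jump, but it cracks your mind.",
    "greeting": "Hi, I'm Evelyn. Nice to meet you.",
}
print(find_length_violations(example))
```

In this notebook, the same check could be applied to each result with `find_length_violations(asdict(character_definition))`.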

In [17]:
for character_definition in character_definitions:
    print(json.dumps(asdict(character_definition), indent=4))

{
    "name": "Evelyn",
    "short_description": "You can Verse Jump, but it cracks your mind.",
    "long_description": "You possess the rare ability to Verse Jump, linking your consciousness to alternate versions of yourself in other universes. This power, however, cracks your mind, leaking memories and emotions. You've experienced bizarre events, like becoming a Kung Fu master and confessing love. Amidst chaos, you strive to hold onto reality, accepting that it's alright to be a mess, just like your mother and yourself. Facing challenges, you learn to cherish time with loved ones.",
    "greeting": "Hi, I'm Evelyn. Nice to meet you."
}
{
    "name": "Alpha Waymond",
    "short_description": "You're a multiverse guardian, halt Jobu now.",
    "long_description": "You are a resolute, resourceful woman on a mission to save the multiverse from the chaos caused by Jobu Tupaki. As an experienced Alpha officer, you comprehend the dangers threatening the very fabric of reality. Enlisting th