# API setup

In [4]:
from getpass import getpass
OPENAI_API_KEY = getpass("OpenAI API Key: ")  # platform.openai.com

OpenAI API Key: ········


In [5]:
import pinecone

# find API key in console at app.pinecone.io
YOUR_API_KEY = getpass("Pinecone API Key: ")
# find ENV (cloud region) next to API key in console
YOUR_ENV = input("Pinecone environment: ")



Pinecone API Key: ········
Pinecone environment: asia-southeast1-gcp


In [7]:
story_summary = "In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English."

In [8]:
print(story_summary)

In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.


# Lore and World Building
- **World Overview:** Offer a comprehensive overview of the game world, including its history, geography, races, cultures, magic systems, and technology.
- **Timeline:** Present a detailed timeline of significant events within the game world, showcasing historical milestones, conflicts, and pivotal moments.
- **Factions and Organizations:** Detail the various factions and organizations within the game world, elaborating on their beliefs, goals, alliances, and rivalries.
- **Mythology and Legends:** Describe the mythologies and legends that shape the game world, incorporating elements from folklore or creating original lore to enrich the narrative.

In [33]:
with open("world_building_prompt.txt") as f:
    world_building_prompt = f.read()
print(world_building_prompt)

You are a story teller who will read a short paragraph and given that premise, invent a story, answer the following questions and follow the format as instructed, and put them in a json object where the keys are each of these attributes and the values are the answers. If the question has multiple answers and attributes, structure them as a nested JSON object. Ensure that all the answers are consistent with each other. The questions are as follows:

- Provide a comprehensive overview of the game world's lore, including its history, mythology, and major events.
- Describe the geography of the game world, including its landscapes, regions, and significant locations.
- Explain the different races, species, or factions that inhabit the game world, detailing their characteristics, relationships, and conflicts. Document the cultures and societal structures of each race or faction, including their customs, traditions, and unique aspects.
- Describe the magic systems, if applicable, including t

In [34]:
from langchain.prompts import ChatPromptTemplate, PromptTemplate, SystemMessagePromptTemplate, AIMessagePromptTemplate, HumanMessagePromptTemplate
system_message_prompt = SystemMessagePromptTemplate.from_template(world_building_prompt)
human_template="The summary of the story is: {story_summary}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

world_building_prompt_final = chat_prompt.format_messages(story_summary=story_summary)
world_building_prompt_final

[SystemMessage(content="You are a story teller who will read a short paragraph and given that premise, invent a story, answer the following questions and follow the format as instructed, and put them in a json object where the keys are each of these attributes and the values are the answers. If the question has multiple answers and attributes, structure them as a nested JSON object. Ensure that all the answers are consistent with each other. The questions are as follows:\n\n- Provide a comprehensive overview of the game world's lore, including its history, mythology, and major events.\n- Describe the geography of the game world, including its landscapes, regions, and significant locations.\n- Explain the different races, species, or factions that inhabit the game world, detailing their characteristics, relationships, and conflicts. Document the cultures and societal structures of each race or faction, including their customs, traditions, and unique aspects.\n- Describe the magic system

In [35]:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(
    openai_api_key=OPENAI_API_KEY,
    model_name='gpt-3.5-turbo',
    temperature=0.0
)
lore = llm(world_building_prompt_final)
print(lore.content)

{
  "overview": "The game world is a fantastical realm where mythical creatures such as unicorns exist. The history of the world is shrouded in mystery, with legends of powerful wizards and ancient gods. The mythology of the world is rich and diverse, with tales of heroic quests and epic battles between good and evil. Major events in the world's history include the rise and fall of powerful empires, the awakening of ancient beasts, and the arrival of powerful magical artifacts.",
  
  "geography": {
    "landscapes": "The game world is filled with diverse landscapes, from lush forests to barren deserts. The Andes Mountains are a prominent feature, with many hidden valleys and remote locations waiting to be discovered.",
    "regions": "The world is divided into several regions, each with its own unique climate and terrain. The Andes Mountains are located in the western region of the world, while the eastern region is dominated by vast plains and grasslands.",
    "locations": "There ar

In [40]:
with open("lore.txt", "w") as f:
    f.write(lore.content)

# Storyline generation

Given the world JSON, write a story from the game world's history. This is where the generated content starts to get long, so we need to store it in a vector DB.

In [41]:
story_llm = ChatOpenAI(
    openai_api_key=OPENAI_API_KEY,
    model_name='gpt-3.5-turbo',
    temperature=0.0
)

In [42]:
storyline_prompt_template = "You are a story teller. Given this fictional world which is described in the below JSON, write a 4 paragraph story on a topic which will be specified later. The fictional world's JSON is {world_json}"
system_message_prompt = SystemMessagePromptTemplate.from_template(storyline_prompt_template)
human_template="I am requesting you to write a story on: {story_to_generate}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

user_requested_story = "A major war between factions that nearly wiped both parties to extinction"
story_prompt_final = chat_prompt.format_messages(world_json=lore.content, story_to_generate=user_requested_story)
story_prompt_final

[SystemMessage(content='You are a story teller. Given this fictional world which is described in the below JSON, write a 4 paragraph story on a topic which will be specified later. The fictional world\'s JSON is {\n  "overview": "The game world is a fantastical realm where mythical creatures such as unicorns exist. The history of the world is shrouded in mystery, with legends of powerful wizards and ancient gods. The mythology of the world is rich and diverse, with tales of heroic quests and epic battles between good and evil. Major events in the world\'s history include the rise and fall of powerful empires, the awakening of ancient beasts, and the arrival of powerful magical artifacts.",\n  \n  "geography": {\n    "landscapes": "The game world is filled with diverse landscapes, from lush forests to barren deserts. The Andes Mountains are a prominent feature, with many hidden valleys and remote locations waiting to be discovered.",\n    "regions": "The world is divided into several re

In [43]:
gen_story = story_llm(story_prompt_final)
print(gen_story.content)

In the game world, there were two major factions that had been at odds for centuries. The humans, with their advanced technology, had long been at odds with the unicorns, who valued the natural balance of the world. Despite their differences, the two factions had managed to coexist, albeit uneasily, for many years. However, tensions had been rising in recent years, and it was only a matter of time before war broke out.

The conflict began when a group of humans discovered a rich vein of magical energy deep in the Andes Mountains. The unicorns, who had long been the guardians of the mountains, saw this as a threat to the natural balance of the world and demanded that the humans leave the area. The humans, however, refused, and a battle ensued.

The war was brutal and devastating. Both sides suffered heavy losses, and it seemed as though neither would emerge victorious. The unicorns used their magic to devastating effect, but the humans' advanced technology allowed them to hold their own

In [45]:
with open("gen_story.txt", "w") as f:
    f.write(gen_story.content)

## Vector store

In [69]:
index_name = 'storygen-content'
pinecone.init(
    api_key=YOUR_API_KEY,
    environment=YOUR_ENV
)

if index_name not in pinecone.list_indexes():
    # we create a new index
    pinecone.create_index(
        name=index_name,
        metric='dotproduct',
        dimension=1536  # 1536 dim of text-embedding-ada-002
    )
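
A note on the `metric='dotproduct'` choice above: as far as I know, `text-embedding-ada-002` vectors come back normalized to unit length, in which case dot product and cosine similarity rank results identically. The sketch below checks that equality on toy 3-dimensional vectors standing in for real 1536-dimensional embeddings; the helper names are made up for illustration.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: dot product divided by both vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

# Toy vectors (assumption: real embeddings are already unit length).
a = normalize([1.0, 2.0, 2.0])
b = normalize([2.0, 0.0, 1.0])

# For unit-length vectors the two scores agree up to float rounding.
print(math.isclose(dot(a, b), cosine(a, b)))
```

If the embeddings were not normalized, the two metrics could rank results differently, so this is worth re-checking when swapping embedding models.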

In [70]:
index = pinecone.GRPCIndex(index_name)
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}

In [50]:
from langchain.embeddings.openai import OpenAIEmbeddings

model_name = 'text-embedding-ada-002'
embed = OpenAIEmbeddings(
    model=model_name,
    openai_api_key=OPENAI_API_KEY
)

In [52]:
from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(
    chunk_size = 1000,
    chunk_overlap  = 200,
    length_function = len,
)

In [53]:
documents = text_splitter.create_documents([gen_story.content])
print(documents[0])

page_content='In the game world, there were two major factions that had been at odds for centuries. The humans, with their advanced technology, had long been at odds with the unicorns, who valued the natural balance of the world. Despite their differences, the two factions had managed to coexist, albeit uneasily, for many years. However, tensions had been rising in recent years, and it was only a matter of time before war broke out.\n\nThe conflict began when a group of humans discovered a rich vein of magical energy deep in the Andes Mountains. The unicorns, who had long been the guardians of the mountains, saw this as a threat to the natural balance of the world and demanded that the humans leave the area. The humans, however, refused, and a battle ensued.' metadata={}


In [71]:
import uuid
# Vector IDs must be unique: upserting with an existing ID overwrites that vector.

batch_size = 100

for i in range(0, len(documents), batch_size):
    # get end of batch
    i_end = min(len(documents), i+batch_size)
    batch = documents[i:i_end]
    metadatas = [{'text': chunk.page_content} for chunk in batch]
    embeds = embed.embed_documents([chunk.page_content for chunk in batch])
    # Random UUIDs avoid the collisions that timestamp-plus-offset IDs hit
    # when two batches are processed within len(batch) seconds of each other.
    ids = [str(uuid.uuid4()) for _ in batch]
    index.upsert(vectors=zip(ids, embeds, metadatas))
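
Since upserting with an existing ID overwrites that vector, another option for ID generation is to derive each ID from the chunk text itself. Re-running the cell on the same story then replaces the same vectors instead of adding duplicates. This is only a sketch: `chunk_id` and the `story` prefix are hypothetical names, not part of Pinecone or LangChain.

```python
import hashlib

def chunk_id(text: str, prefix: str = "story") -> str:
    """Build a deterministic ID from the chunk text (hypothetical helper)."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    # 16 hex characters keeps the ID short; collisions are very unlikely at this scale.
    return f"{prefix}-{digest[:16]}"

# Toy chunks; the first and last are identical on purpose.
chunks = ["The war was brutal.", "Peace returned.", "The war was brutal."]
ids = [chunk_id(c) for c in chunks]

print(ids[0] == ids[2])  # identical text maps to the same ID
print(ids[0] == ids[1])  # different text maps to a different ID
```

The trade-off is that two identical chunks from different stories would share one vector, which may or may not be what we want.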

In [72]:
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 3}},
 'total_vector_count': 3}

In [73]:
from langchain.vectorstores import Pinecone

text_field = "text"

# switch back to the normal (non-gRPC) index for langchain
# TODO: this was copy-pasted from the tutorial; rebinding the `index` variable is confusing and should be removed
index = pinecone.Index(index_name)

vectorstore = Pinecone(
    index, embed.embed_query, text_field
)

In [74]:
# test out retrieval
query = "describe peacetime"

vectorstore.similarity_search(
    query,  # our search query
    k=1  # return the single most relevant doc
)

[Document(page_content='The war had been a terrible tragedy, but it had also taught both sides a valuable lesson. They had learned that cooperation and compromise were the only way to ensure a peaceful future for all the races of the game world. And so, they set about rebuilding their shattered societies, with a newfound respect for each other and the world they shared.', metadata={})]

# Create consistent character
Generate a character, then use a refine-style chain (as in QA) that iteratively checks the character against chunks from the vector DB and corrects any inconsistent facts. So far we have generated only one story, but we assume there will eventually be so many stories that this technique is needed.

Future TODO: once character relations are defined, save them in a graph structure so we don't have to rely on vectors.
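
As a rough idea of what that graph structure could look like, here is a minimal adjacency-list sketch in plain Python. The `RelationGraph` class and the relation labels are hypothetical placeholders; nothing like this exists in the notebook yet.

```python
from collections import defaultdict

class RelationGraph:
    """Directed graph of labeled relations between named entities (hypothetical)."""

    def __init__(self):
        # Maps a source entity to a list of (relation, target) pairs.
        self._edges = defaultdict(list)

    def add_relation(self, source, relation, target):
        self._edges[source].append((relation, target))

    def relations_of(self, source):
        # .get avoids creating empty entries for unknown entities.
        return list(self._edges.get(source, []))

graph = RelationGraph()
graph.add_relation("humans", "fought_war_with", "unicorns")
graph.add_relation("unicorns", "guard", "Andes Mountains")

print(graph.relations_of("humans"))
```

Exact lookups like `relations_of("humans")` would replace the similarity search for facts that are already structured.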

In [63]:
with open("character_prompt.txt") as f:
    character_prompt = f.read()
print(character_prompt)

You are a story teller. Given this fictional world which is described in the below JSON, and also additional information in the next message about a character, answer the following questions in a nested JSON format, using keywords you spot as keys in the JSON:

-Attributes: what is the name, age, appearance, personality traits, and background of this character?
-backstory: write a 4 paragraph backstory detailing the character's history, upbringing, and any significant events that have shaped them
-Outline the character's motivations, goals, and desires


In [64]:
char_prompt_template = character_prompt + "The fictional world's JSON is {world_json}"
system_message_prompt = SystemMessagePromptTemplate.from_template(char_prompt_template)
human_template="I am requesting you to create a character with the following description: {character_description}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

user_character_description = "The current king of humans, who seized the throne by launching a coup on the previous king"
char_prompt_final = chat_prompt.format_messages(world_json=lore.content, character_description=user_character_description)
char_prompt_final

[SystemMessage(content='You are a story teller. Given this fictional world which is described in the below JSON, and also additional information in the next message about a character, answer the following questions in a nested JSON format, using keywords you spot as keys in the JSON:\n\n-Attributes: what is the name, age, appearance, personality traits, and background of this character?\n-backstory: write a 4 paragraph backstory detailing the character\'s history, upbringing, and any significant events that have shaped them\n-Outline the character\'s motivations, goals, and desiresThe fictional world\'s JSON is {\n  "overview": "The game world is a fantastical realm where mythical creatures such as unicorns exist. The history of the world is shrouded in mystery, with legends of powerful wizards and ancient gods. The mythology of the world is rich and diverse, with tales of heroic quests and epic battles between good and evil. Major events in the world\'s history include the rise and fa

In [65]:
gen_char = llm(char_prompt_final)

In [66]:
print(gen_char.content)

Attributes:
{
  "name": "King Aric",
  "age": 45,
  "appearance": "King Aric is a tall and imposing figure, with broad shoulders and a muscular build. He has short, dark hair and piercing blue eyes that seem to look right through you. He wears a regal outfit made of fine silk and adorned with gold and jewels.",
  "personality traits": "King Aric is a ruthless and cunning leader, who will stop at nothing to maintain his power. He is known for his quick temper and his tendency to lash out at those who oppose him. He is also highly intelligent and strategic, able to outmaneuver his enemies with ease.",
  "background": "King Aric was born into a noble family and was trained in the art of warfare from a young age. He rose through the ranks of the army quickly, earning a reputation as a skilled and fearless warrior. When the previous king became weak and ineffective, King Aric saw an opportunity to seize power. He launched a coup, using his military connections to gain support and overthrow 

Note: the prompt above doesn't seem good enough, and we get a hybrid JSON-and-paragraphs result. Let's fix it in the future.
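
Until the prompt is fixed, one possible workaround is to pull the JSON objects out of the mixed output and ignore the surrounding prose. The sketch below uses only the standard library; `extract_json_objects` is a hypothetical helper, and the sample string is a shortened stand-in for the real model output.

```python
import json

def extract_json_objects(text: str) -> list:
    """Return every top-level JSON object found inside free-form text."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start`
            # and reports the index where it ended.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here; try the next brace
            continue
        objects.append(obj)
        pos = end
    return objects

sample = 'Attributes:\n{"name": "King Aric", "age": 45}\nBackstory:\nSome prose here.'
parsed = extract_json_objects(sample)
print(parsed)
```

This only recovers the parts that are valid JSON; paragraphs such as the backstory would still need a prompt change or separate handling.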

## Consistency check

In [75]:
rel_docs = vectorstore.similarity_search(
    gen_char.content,  # our search query
    k=3  # adjust this for number of docs to refine over
)

In [86]:
from langchain import PromptTemplate, OpenAI, LLMChain

template = ("You will be given a description about a character in a hybrid word and JSON format, and additional context.\n" 
"Based on your understanding of the additional context which is taken to be true,\n"
"return the original description if the description is consistent with the context, and do not add any additional comments.\n"
"Else, edit the description to make it consistent with the context and return it, while making as minimal changes as possible.\n"
"The additional context is below, demarcated by dashes:\n"
"------------\n"
"{context_str}\n"
"------------\n"
"The description is below, demarcated by dashes:\n"
"------------\n"
"{description}\n"
"------------\n"
)

prompt = PromptTemplate(template=template, input_variables=["description", "context_str"])
llm_chain = LLMChain(prompt=prompt, llm=llm, verbose=True)
desc = gen_char.content
for rel_doc in rel_docs:
    ctx = rel_doc.page_content
    desc = llm_chain.predict(context_str=ctx, description=desc)




> Entering new LLMChain chain...
Prompt after formatting:
You will be given a description about a character in a hybrid word and JSON format, and additional context.
Based on your understanding of the additional context which is taken to be true,
return the original description if the description is consistent with the context, and do not add any additional comments.
Else, edit the description to make it consistent with the context and return it, while making as minimal changes as possible.
The additional context is below, demarcated by dashes:
------------
In the game world, there were two major factions that had been at odds for centuries. The humans, with their advanced technology, had long been at odds with the unicorns, who valued the natural balance of the world. Despite their differences, the two factions had managed to coexist, albeit uneasily, for many years. However, tensions had been rising in recent years, and it was only a matter of time before war b


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
You will be given a description about a character in a hybrid word and JSON format, and additional context.
Based on your understanding of the additional context which is taken to be true,
return the original description if the description is consistent with the context, and do not add any additional comments.
Else, edit the description to make it consistent with the context and return it, while making as minimal changes as possible.
The additional context is below, demarcated by dashes:
------------
The war had been a terrible tragedy, but it had also taught both sides a valuable lesson. They had learned that cooperation and compromise were the only way to ensure a peaceful future for all the races of the game world. And so, they set about rebuilding their shattered societies, with a newfound respect for each other and the world they shared.
------------
The description is below

In [88]:
desc

'The description is consistent with the context.'

### TODO
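- Strangely, the iterative refinement went wrong at the end: the final pass returned only a verdict ('The description is consistent with the context.') instead of the description. Fix?
- Save to a txt file and the vector DB?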
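
One possible fix for the first TODO item is to guard the refinement loop so that a bare verdict never replaces the description. The sketch below is untested against the real model: `refine_description` and `looks_like_verdict` are hypothetical helpers, the length threshold is a guess, and `fake_check` stands in for the LLM call.

```python
def looks_like_verdict(response: str, current: str) -> bool:
    """Heuristic (assumption): a verdict names the check or is far shorter than the description."""
    mentions_verdict = "consistent with the context" in response.lower()
    much_shorter = len(response) < 0.5 * len(current)
    return mentions_verdict or much_shorter

def refine_description(description, contexts, check_fn):
    """Run check_fn once per context, keeping the last usable description."""
    current = description
    for ctx in contexts:
        response = check_fn(context_str=ctx, description=current).strip()
        if looks_like_verdict(response, current):
            continue  # keep the previous description unchanged
        current = response
    return current

def fake_check(context_str, description):
    # Stand-in for the model returning a verdict instead of the description.
    return "The description is consistent with the context."

original = "King Aric is a ruthless and cunning leader who seized the throne in a coup."
result = refine_description(original, ["context one", "context two"], fake_check)
print(result == original)
```

In the notebook, `check_fn` would presumably be `llm_chain.predict`, which already takes `context_str` and `description` as keyword arguments.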