## Data for Learning to Speak and Act in a Fantasy Text Adventure Game
 

We previously used the LIGHT dataset (Learning in Interactive Games with Humans and Text) from the Facebook AI Research paper [Learning to Speak and Act in a Fantasy Text Adventure Game](https://arxiv.org/abs/1903.03094).  In our previous homework, we used its location and object descriptions.

Here we will use a different part of the data, which contains characters with first-person "persona" descriptions, plus dialogues between pairs of characters.


## Load the data

The LIGHT data was released as part of Facebook's ParlAI system. I extracted the data into several JSON files:
* ```light_environment_train.json``` contains information about the locations, objects, and characters in the text-adventure games.  
* ```light_dialogue_data.json``` contains sample conversations between pairs of characters.   We'll use this later in the semester. 



In [None]:
!wget https://raw.githubusercontent.com/interactive-fiction-class/interactive-fiction-class-data/master/light_dialogue/light_environment_train.json

--2022-03-23 01:00:37--  https://raw.githubusercontent.com/interactive-fiction-class/interactive-fiction-class-data/master/light_dialogue/light_environment_train.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3541467 (3.4M) [text/plain]
Saving to: ‘light_environment_train.json’


2022-03-23 01:00:38 (91.9 MB/s) - ‘light_environment_train.json’ saved [3541467/3541467]



In [None]:
import sys
import os
import json
from collections import defaultdict


json_filename = 'light_environment_train.json'

# Use a context manager so the file is closed after loading
with open(json_filename) as f:
  light_environment = json.load(f)

def get_categories(light_environment):
  return light_environment['categories'].values()
categories = get_categories(light_environment)

def get_room_name(room_id, rooms_by_id):
  return rooms_by_id[room_id]['setting']

def print_rooms_for_category(category, rooms_by_category, rooms_by_id):
  rooms = rooms_by_category[category]
  print(category.capitalize())
  for room_id in rooms:
    print('\t', room_id, '-', get_room_name(room_id, rooms_by_id))


def sort_objects_by_property(objects_by_id):
  objects_by_property = defaultdict(set)
  for object_id, obj in objects_by_id.items(): 
    name = obj['name']
    for label, value in obj.items():
      if label.startswith('is_') and value == 1:
        objects_by_property[label].add(object_id)
  return objects_by_property


rooms_by_id = light_environment['rooms']
rooms_by_category = defaultdict(set)
for room_id in rooms_by_id:
  category = light_environment['rooms'][room_id]['category']
  rooms_by_category[category].add(room_id)
objects_by_id = light_environment['objects']
objects_by_property = sort_objects_by_property(objects_by_id)




# Characters in LIGHT 


Characters have a description, a persona (a first-person description of who they are and what their motivations might be), a character type (person, creature or object), a location (```in_room_ids```) and an inventory (```carrying_objects```).

The Gravedigger character is listed in the Unfinished Mausoleum's ``in_characters`` variable.  The ``in_characters`` are characters that are explicitly mentioned in the location's ``description`` or ``background`` variables. 
```
light_environment['characters']['203']

{'base_form': ['gravedigger'],
 'carrying_objects': [890],
 'char_type': 'person',
 'character_id': 203,
 'corrected_name': 'gravedigger',
 'desc': 'You might want to talk to the gravedigger, specially if your looking for a friend, he might be odd but you will find a friend in him.',
 'ex_room_ids': [100, 349],
 'in_room_ids': [62],
 'is_plural': 0,
 'name': 'gravedigger',
 'orig_room_id': 349,
 'personas': ["I am low paid labor in this town. I do a job that many people shun because of my contact with death. I am very lonely and wish I had someone to talk to who isn't dead."],
 'wearing_objects': [],
 'wielding_objects': []}
 ```


In [None]:
light_environment['characters']['203']

{'base_form': ['gravedigger'],
 'carrying_objects': [890],
 'char_type': 'person',
 'character_id': 203,
 'corrected_name': 'gravedigger',
 'desc': 'You might want to talk to the gravedigger, specially if your looking for a friend, he might be odd but you will find a friend in him.',
 'ex_room_ids': [100, 349],
 'in_room_ids': [62],
 'is_plural': 0,
 'name': 'gravedigger',
 'orig_room_id': 349,
 'personas': ["I am low paid labor in this town. I do a job that many people shun because of my contact with death. I am very lonely and wish I had someone to talk to who isn't dead."],
 'wearing_objects': [],
 'wielding_objects': []}
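
As a rough sketch of how these fields fit together, the function below turns one character record into a one-line summary. It runs on a toy stand-in for `light_environment`, not the real file. The helper name `summarize_character`, the object name "shovel", and the assumption that `objects` is keyed by string IDs (the way `characters` is) are mine, so check them against the real data before relying on them.

```python
# Toy stand-in for light_environment; the real records have many more fields.
toy_environment = {
    "characters": {
        "203": {
            "name": "gravedigger",
            "personas": ["I am low paid labor in this town."],
            "carrying_objects": [890],
        }
    },
    # Assumption: objects are keyed by string IDs. "shovel" is a made-up name.
    "objects": {"890": {"name": "shovel"}},
}


def summarize_character(environment, character_id):
    """Return 'Name - persona (carrying: ...)' for one character."""
    # Character IDs are integers inside the records but strings as dictionary keys.
    character = environment["characters"][str(character_id)]
    persona = character["personas"][0] if character["personas"] else ""
    carried = [
        environment["objects"][str(object_id)]["name"]
        for object_id in character["carrying_objects"]
        if str(object_id) in environment["objects"]
    ]
    summary = "{} - {}".format(character["name"].capitalize(), persona)
    if carried:
        summary += " (carrying: {})".format(", ".join(carried))
    return summary


print(summarize_character(toy_environment, 203))
```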

# Dialogue Data


In [None]:
!wget https://raw.githubusercontent.com/interactive-fiction-class/interactive-fiction-class-data/master/light_dialogue/light_dialogue_data_train.json.gz
!gunzip light_dialogue_data_train.json.gz

In [None]:
import json
light_dialogue_json_filename = 'light_dialogue_data_train.json'
# Use a context manager so the file is closed after loading
with open(light_dialogue_json_filename) as f:
  light_dialogues = json.load(f)

In [None]:
def get_dialogue_description(dialogue):
  """
  Constructs a string representation of the dialogue.
  """
  # static things
  agents = dialogue["agents"] # A list of dictionaries with keys "name" and "persona"
  setting = dialogue["setting"] # A dictionary with keys "name", "category", "description", "background"
  context = dialogue["context"][0] # A second-person description of the set-up (maybe presented to Turkers?)
  object_descriptions = dialogue["all_descriptions"]

  character_order = dialogue["character"]
  speech = dialogue["speech"]
  emotes = dialogue["emote"]
  actions = dialogue["action"]

  turns = []
  for i, _ in enumerate(character_order):
    turns.append((character_order[i], speech[i], emotes[i], actions[i]))

  # Setting description
  setting_str = ""
  setting_str += "Setting:\n"
  setting_str += "* {setting} - {description}\n".format(setting=setting["name"], description=setting["description"])
  # Personas of the agents
  setting_str += "Dramatis personae:\n"
  for agent in agents:
    setting_str +="* {character} - {persona}\n".format(character=agent["name"].capitalize(), persona=agent["persona"])
  # Conversation 
  dialogue_str = ""
  dialogue_str += "Conversation:\n"
  for character, line, emote, action in turns:
    dialogue_str += '{character}: "{line}"\n'.format(character=character.capitalize(), line=line.capitalize().strip())
    if emote:
      dialogue_str += "{character}: Gestures - {emote}\n".format(character=character.capitalize(), emote=emote.capitalize().strip())
    if action:
      dialogue_str += "{character}: Stage Direction - {action}\n".format(character=character.capitalize(), action=action.capitalize().strip())
  dialogue_str += "===\n"
  return setting_str, dialogue_str


In [None]:
for i in range(0, 5):
  dialogue = light_dialogues[i]
  setting_str, dialogue_str = get_dialogue_description(dialogue)
  print(setting_str, dialogue_str)

Setting:
* Watchtower - The tower is the largest section of the castle. It contains an observatory for nighttime scouting, but is also used by the wise men to study the stars. Armed guardsmen are always to be found keeping watch.
Dramatis personae:
* Court wizard - I am an advisor of anything magical. I sell spells to those who need them. I am wealthy and hold an important place in political life
* Soldier - I came from the fertile valley when I was conscripted. The king needed strong farmer's sons to fight in the war. I am very unhappy here in the cold, damp, rainy north. I miss my friends and my dog. I hope to go back to my father's farm when the war ends.
 
Conversation:
Court wizard: "A quiet night this evening..."
Soldier: "Yes it is"
Court wizard: "Have any else come up this eve? i had hoped for a quiet night to examine the stars"
Court wizard: Gestures - Ponder
Soldier: "Yes, a few came through, but it is a cold night for me, i am used to warmer weather"
Soldier: Gestures - Nod
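
The dialogue records store each conversation as parallel lists (`character`, `speech`, `emote`, `action`), with one entry per turn. Here is a minimal sketch of lining those lists up with `zip`, using a toy record. The helper name `get_turns` is hypothetical, and I am assuming a missing emote or action is stored as `None` (the code above only checks that the value is falsy).

```python
def get_turns(dialogue):
    """Return a list of (character, speech, emote, action) tuples, one per turn."""
    return list(zip(dialogue["character"], dialogue["speech"],
                    dialogue["emote"], dialogue["action"]))


# Toy record with the same keys as a LIGHT dialogue (values shortened).
toy_dialogue = {
    "character": ["court wizard", "soldier"],
    "speech": ["a quiet night this evening...", "yes it is"],
    "emote": [None, "nod"],   # assumption: None marks "no emote"
    "action": [None, None],
}

for character, line, emote, action in get_turns(toy_dialogue):
    print('{}: "{}"'.format(character.capitalize(), line.capitalize().strip()))
    if emote:
        print("{}: Gestures - {}".format(character.capitalize(), emote.capitalize()))
```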


# Create a Few-Shot Prompt for GPT3



In [None]:
%%capture
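# This notebook uses the legacy (pre-1.0) openai package: openai.Completion
# and the `openai api fine_tunes.*` commands were removed in version 1.0.
!pip install --upgrade "openai<1.0"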
!pip install jsonlines

You can find your OpenAI API key [here](https://beta.openai.com/account/api-keys).


In [None]:
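import os
import openai
from getpass import getpass

# getpass hides the key as you type, so it is not saved in the notebook output
openai.api_key = getpass('Enter OpenAI API key: ')

os.environ['OPENAI_API_KEY'] = openai.api_key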

Enter OpenAI API key:


In [None]:
few_shot_prompt = """
===
Setting:
* Dining Hall - The dining hall has a long, elegant wooden table with ornate chairs at both ends and along the sides. There are place settings at each seat with a rustic silverware set and gold plates. The wall is decorated with paintings of knights and past royalty.
Dramatis personae:
* Descendant of the sons - My great-grandfather was a duke.  My family used to own a castle.  I have out coat of arms on my mantle.
* Ghost - I am a ghost who haunts a small village.  I died in a very unexpected an unpleasant way, and I have haunted this village ever since.  Everyone who passes through here fears an encounter with me.

Conversation:
Descendant of the sons: "I love these arms, they are a symbol of my families heritage. only if someone else was here to admire them withe me.."
Descendant of the sons: ** Grin **
Ghost: "These halls were to me miiine!  how dare you admire them, they are miiiine!"
Ghost: ** Scream **
Descendant of the sons: "Who goes there? name yourself! "
Descendant of the sons: ** Frown **
Ghost: "Ah, how soon they forget!  how soon they live their pathetic lives and forget me... but i was to be queen... empresss! meeee!"
Descendant of the sons: "What business do you have in my dining hall? are you here to haunt me for eternity. "
Ghost: "Do you not know what your forbears did to me?  what right have you to talk so, with my blood on your hands?  me, but an innocent village girl... did you father not speak of me, not once?"
Ghost: ** Cry **
Descendant of the sons: "There, there. i was never told this tale. how much pain you must be in, poor ghost. "
Descendant of the sons: Hug ghost 
Ghost: "He said he loved me... he said he would marry me and make me queen... and then... that faithless wretch!  get back, you wretch! you are just like him!"
Ghost: Hit descendant of the sons 
Descendant of the sons: "You foul being! leave these halls at once! it is no business of mine what you did with my father. "
Descendant of the sons: ** Gasp **
Ghost: "He slew me as i slept!  that beast you call a father, he stabbed me over and over.  you are a monster just like him!  blood will tell!"
Ghost: ** Scream **
Descendant of the sons: "Ahh! stop this screaming at once! what do you want from me ghost, for which you will leave me in peace? "
Descendant of the sons: Hit ghost 
Ghost: "Ha! you think you can appease me!  you see violence, even now! i have not even a proper burial but was flung in a ditch, less than garbage.  what could you possible do to right your father's evil doings!"
Descendant of the sons: "I see now that you are trapped in theses halls until we find your worldly body and give you a burial under the tapestry. "
Descendant of the sons: ** Stare **
Ghost: "A burial... rest... peace... i cannot even contemplate such things... it has been so long since i rested."
Ghost: ** Sigh **
===

===
Setting:
* Hidden passageway - This is a narrow underground path that is surrounded by stone walls and a dirty and grimy floor.  Many insects and rodents dwell here underneath the temple.
Dramatis personae:
* Priest - I am the king's personal priest. I ensure that all religious duties are taken care of. I believe in the beyond.
* Old woman - I am old and near death. I have seen and learned many things. I wish I could live longer.

Conversation:
Priest: "What brings you here at this late hour, old woman?"
Old woman: "I have come to seek the king's help. I have been ill for a long time and I wish to be made young again."
Priest: "Your story is not unfamiliar to me. Many people come to me with the same request. They ask that I use my power to make them young again. I am sorry to tell you that it is not possible. The gods have given us our bodies and our time here is short. When it is over, it is over. There is no returning."
Old woman: "I understand. I realize that you are only following the teachings of your god, but can you help me anyway? I do not have much time left on this earth. I have run out of my savings and I have no family to turn to. I will be dead soon. Can you help me?"
Priest: "I am sorry, old woman. It is not the king's wish to help you."
Old woman: "Please, priest, I will do anything. I have more money. I will give you everything I have if you will use your power to make me young."
Priest: "I am sorry. I cannot help you."
Old woman: "I will give you all of my money and I will offer you my service. You can command me to do whatever you wish. I will do anything you ask of me."
Priest: "I am sorry. You are wasting your time. The king will never grant the wishes of an old woman."
Old woman: "I am sorry to hear that. I truly am. If I had known, I would have never come here."
===
"""

setting = """
Setting:
* Bazaar - This place is made from oak wood. It is very small due to the wood aisles. The front of the store has a big wooden table as well as a scale.
Dramatis personae:
* The bazaar owner - I own the bazaar in the village.  I sell a variety of items that I know you must need.  But I also have some items in the back that I only sell to certain people.
* An assistant - I am only the blacksmith's assistant, but everyone has pressured me to go on this wolf hunt. This has me very worried - people say I'm strong but I've never killed anything before. I don't know that I won't run away when the wolves charge us.
Conversation:
"""


In [None]:
def get_dialogue_turn(few_shot_prompt, setting, turns, current_character):
  conversation = ""
  for character, turn in turns:
    conversation += character + ': ' + turn + '\n'
  conversation += current_character + ': '
  response = openai.Completion.create(
     engine="davinci",
     prompt=few_shot_prompt + setting + conversation,
     stop=["\n"],
     temperature=0.7,
     max_tokens=64,
     top_p=1,
     frequency_penalty=0,
     presence_penalty=0
   )
  turn = response['choices'][0]['text']
  return turn
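
To see exactly what text is sent to the model, it can help to build the prompt without calling the API. The sketch below repeats the string assembly from `get_dialogue_turn`: the few-shot examples, then the new setting, then the conversation so far, and finally the name of the character whose turn it is. The helper name `build_prompt` and the toy inputs are mine.

```python
def build_prompt(few_shot_prompt, setting, turns, current_character):
    """Assemble the prompt text for the next turn, without calling the API."""
    conversation = "".join(
        "{}: {}\n".format(character, turn) for character, turn in turns
    )
    # Ending with "Name: " cues the model to write that character's next line.
    return few_shot_prompt + setting + conversation + current_character + ": "


toy_prompt = build_prompt(
    "===\n(example dialogue)\n===\n",
    "Setting:\n* Bazaar\nConversation:\n",
    [("The bazaar owner", '"What can I get for you?"')],
    "An assistant",
)
print(toy_prompt)
```

Since the stop sequence is a newline, the model returns a single line, which is the current character's turn.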


In [None]:
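characters = ["The bazaar owner", "An assistant"]
# Start a new conversation. To extend an existing one instead,
# comment out the next line and re-run this cell.
turns = []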
for i in range(6):
  current_character = characters[i % 2]
  turn = get_dialogue_turn(few_shot_prompt, setting, turns, current_character)
  turns.append((current_character, turn))




In [None]:
turns

[('The bazaar owner',
  '\xa0"I see you are new here in the village. \xa0You must be here to buy something. \xa0What can I get for you?"'),
 ('An assistant',
  '\xa0"I\'m looking for something to hunt wolves with. \xa0My master said I need something powerful to bring down a wolf."'),
 ('The bazaar owner',
  '\xa0"I have just the thing for you, but it is very expensive. \xa0I will sell it for no less than 1000 gold coins. \xa0Would you like to purchase it?"'),
 ('An assistant',
  '\xa0"1000 gold coins! \xa0I don\'t have that much money. \xa0How about something cheaper?"'),
 ('The bazaar owner', '\xa0"I am sorry. \xa0This is the only thing I have."'),
 ('The bazaar owner', '\xa0"I will sell it to you for 500 gold coins."'),
 ('An assistant',
  '\xa0"I only have 200 gold coins right now. \xa0Could I get a discount?"'),
 ('The bazaar owner',
  '\xa0"I am sorry, I cannot give you a discount. \xa0I have a lot of expenses to pay."'),
 ('An assistant',
  '\xa0"I understand. \xa0I don\'t have t
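
The generated turns above contain non-breaking spaces (`\xa0`). If you want plain text before showing a turn to a player or feeding it back into a prompt, one option is to normalize the whitespace. This is a small sketch; the helper name `clean_turn` is mine.

```python
def clean_turn(text):
    """Replace non-breaking spaces and collapse runs of whitespace into single spaces."""
    return " ".join(text.replace("\xa0", " ").split())


raw_turn = '\xa0"I am sorry. \xa0This is the only thing I have."'
print(clean_turn(raw_turn))
```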

# Format Data for Fine-Tuning 

Below, I show how to create data to fine-tune an OpenAI model.  The OpenAI API documentation has a [guide to fine-tuning models](https://beta.openai.com/docs/guides/fine-tuning) that you should read.   The basic format of fine-tuning data is a JSONL file (one JSON object per line) with two key-value pairs: `prompt` and `completion`.

```
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...
```

In the code below, the prompt is the string that describes a dialogue's setting and the personas of its two characters (the `Setting:` and `Dramatis personae:` block), and the completion is the conversation itself, which ends with a `===` line.

In [None]:
def create_dialogue_finetuning_data(filename, max_dialogues=1000):
  fine_tuning_data = []
  for i in range(min(max_dialogues, len(light_dialogues))): 
    dialogue = light_dialogues[i]
    setting_str, dialogue_str = get_dialogue_description(dialogue)
    data = {}
    data['prompt'] = setting_str
    data['completion'] = dialogue_str
    fine_tuning_data.append(data)

  with open(filename, 'w') as out:
    for data in fine_tuning_data:
        out.write(json.dumps(data))
        out.write('\n')

jsonl_filename='fine_tune_LIGHT_dialogue.jsonl'
create_dialogue_finetuning_data(jsonl_filename)
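
Before uploading, it is worth checking that every line of the file parses as JSON and has exactly the two expected keys. Here is a minimal sketch of such a check, run on a small toy file. The function name `validate_finetuning_file` and the toy records are mine; you could call it on `jsonl_filename` in the same way.

```python
import json
import os
import tempfile


def validate_finetuning_file(filename):
    """Return the number of records; raise ValueError on a malformed line."""
    count = 0
    with open(filename) as f:
        for line_number, line in enumerate(f, start=1):
            record = json.loads(line)  # raises a ValueError subclass on bad JSON
            if set(record) != {"prompt", "completion"}:
                raise ValueError(
                    "line {}: unexpected keys {}".format(line_number, sorted(record))
                )
            count += 1
    return count


# Write a two-record toy file in the same format and check it.
toy_records = [
    {"prompt": "Setting:\n* Watchtower\n", "completion": "Conversation:\n===\n"},
    {"prompt": "Setting:\n* Bazaar\n", "completion": "Conversation:\n===\n"},
]
toy_path = os.path.join(tempfile.mkdtemp(), "toy.jsonl")
with open(toy_path, "w") as out:
    for record in toy_records:
        out.write(json.dumps(record) + "\n")

num_records = validate_finetuning_file(toy_path)
print(num_records)
```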

# Fine-tune GPT3 with the OpenAI API

Next, we'll perform fine-tuning with this data using OpenAI. 

In [None]:
!head '{jsonl_filename}'
!wc -lw '{jsonl_filename}'

{"prompt": "Setting:\n* Watchtower - The tower is the largest section of the castle. It contains an observatory for nighttime scouting, but is also used by the wise men to study the stars. Armed guardsmen are always to be found keeping watch.\nDramatis personae:\n* Court wizard - I am an advisor of anything magical. I sell spells to those who need them. I am wealthy and hold an important place in political life\n* Soldier - I came from the fertile valley when I was conscripted. The king needed strong farmer's sons to fight in the war. I am very unhappy here in the cold, damp, rainy north. I miss my friends and my dog. I hope to go back to my father's farm when the war ends.\n", "completion": "Conversation:\nCourt wizard: \"A quiet night this evening...\"\nSoldier: \"Yes it is\"\nCourt wizard: \"Have any else come up this eve? i had hoped for a quiet night to examine the stars\"\nCourt wizard: Gestures - Ponder\nSoldier: \"Yes, a few came through, but it is a cold night for me, i am use

Next, we'll make the fine-tuning API call via the command line.  Here the `-m` argument gives the model.  There are 4 sizes of GPT3 models.  They go in alphabetical order from smallest to largest.
* Ada 
* Babbage
* Curie
* Davinci

As the model sizes increase, so do their quality and their cost.  Davinci is the highest quality and highest cost model.  I recommend starting by fine-tuning smaller models to debug your code first so that you don't rack up costs.

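Fine-tuning curie on 1000 dialogues costs about $6.20, and fine-tuning davinci on the same data costs about $62 (see the `Fine-tune costs` lines in the logs below).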


In [None]:
!openai api fine_tunes.create -t '{jsonl_filename}' -m curie
#!openai api fine_tunes.create -t '{jsonl_filename}' -m davinci


Logging requires wandb to be installed. Run `pip install wandb`.
Upload progress: 100% 2.05M/2.05M [00:00<00:00, 3.10Git/s]
Uploaded file from fine_tune_LIGHT_dialogue.jsonl: file-bhyZbtcfRqT7T14y8UThKOnA
Created fine-tune: ft-wczaBdF9amHpCiJgCWgjRSjb
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-03-23 02:42:54] Created fine-tune: ft-wczaBdF9amHpCiJgCWgjRSjb
[2022-03-23 02:43:02] Fine-tune costs $62.04
[2022-03-23 02:43:02] Fine-tune enqueued. Queue number: 0
[2022-03-23 02:43:06] Fine-tune started

Stream interrupted. Job is still running.
To resume the stream, run:

  openai api fine_tunes.follow -i ft-wczaBdF9amHpCiJgCWgjRSjb

To cancel your job, run:

  openai api fine_tunes.cancel -i ft-wczaBdF9amHpCiJgCWgjRSjb



In [None]:
#!openai api fine_tunes.cancel -i ft-NwXfffYxfrc3BIqYACBSSDFG

In [None]:
# Curie
#!openai api fine_tunes.follow -i ft-83yYKphzn8sfrYRTJIpI1o9T
# Davinci
#!openai api fine_tunes.follow -i ft-wczaBdF9amHpCiJgCWgjRSjb


Logging requires wandb to be installed. Run `pip install wandb`.
[2022-03-23 02:26:46] Created fine-tune: ft-83yYKphzn8sfrYRTJIpI1o9T
[2022-03-23 02:26:56] Fine-tune costs $6.20
[2022-03-23 02:26:57] Fine-tune enqueued. Queue number: 0
[2022-03-23 02:27:00] Fine-tune started
[2022-03-23 02:33:23] Completed epoch 1/4
[2022-03-23 02:39:05] Completed epoch 2/4
[2022-03-23 02:44:50] Completed epoch 3/4

Stream interrupted (client disconnected).
To resume the stream, run:

  openai api fine_tunes.follow -i ft-83yYKphzn8sfrYRTJIpI1o9T



You should copy down the fine-tune ID and the name of the uploaded model, which look like this:

```
Created fine-tune: ft-VzQpTwfnWAzDXNKgPTFtiZg2

[2022-01-21 23:22:47] Uploaded model: curie:ft-ccb-lab-members-2022-01-21-23-22-46
```

If you forget to write them down, you can list your fine-tuning runs and models this way. These model names aren't mnemonic, so it is probably a good idea to make a note of what your model's inputs and outputs are. 

In [None]:
!openai api fine_tunes.list

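You can run your fine-tuned model in the OpenAI Playground.  After the model has finished fine-tuning, you'll find it in the Engine dropdown menu.  

You'll need to give it input in the same format as the "prompt" in its training data.  The example below uses a model that was fine-tuned on location descriptions, so its prompt is a `Category:` and a `Location:` name.  You can also add the `Description:` part of the completion if you want. You should also set the stop sequence to be `###`. (For the dialogue model fine-tuned above, the prompt is instead the `Setting:` and `Dramatis personae:` block, and each training completion ends with `===`.) For example, we could type this input into the playground: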

```
Category: Dark Forest
Location: Winter's Glade
Description:
```
If you press the "Generate" button, your fine-tuned model will output something like:

> The forest at night is not a place one would willingly set foot in. It is cold and dark and seems to go on forever. Trees loom over you, blocking out the light of the moon and stars. The only sound is the occasional howl of a wolf and the occasional splashing of water.

If you don't like the description, you can press the "Regenerate" button to get other outputs like:

> The winter's glade is a dark and eerie place. It is home to many animals, but little else. The trees are barren and the ground is covered in snow.

Or

> The dark forest is a place where not even a ray of light can pierce the tangled web of branches overhead. Needles from vast numbers of trees protrude at awkward angles, their branches thin and frail, more like twigs than the strong trunks they resemble. Between the branches, a darkness deeper than night reigns. It is from this darkness that the trees themselves appear as ghosts, for the branches do not embrace the earth so much as merely touch it. The tips of the branches move slightly with every breath taken by the trees, and the dark forest seems to breathe along with them.

Or 

> The forest has now turned white.  The trees are barren and dead, their branches thin and broken.  There's a light dusting of snow on the ground, and it looks as if the forest is trying to erase all traces of life from the earth.


You can press the "Code" button to get a snippet of code that you can adapt into your own Python programs.  

```
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Completion.create(
  model="curie:ft-ccb-lab-members-2022-01-21-23-22-46",
  prompt="Category: Dark Forest\nLocation: Winter's Glade\nDescription:",
  temperature=0.7,
  max_tokens=64,
  top_p=1,
  frequency_penalty=0,
  presence_penalty=0,
  stop=["###"]
)
```

Here's an example of how to write a function using the code that the OpenAI API provides.

In [None]:
def get_location_description(category, location_name, finetuned_model):
  response = openai.Completion.create(
      model=finetuned_model,
      prompt="Category: {category}\nLocation: {location}\nDescription:".format(
          category=category.capitalize(),
          location=location_name.capitalize()
      ),
      temperature=0.7,
      max_tokens=64,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["###"]
      )
  return response['choices'][0]['text']

# Replace with your model's name
finetuned_model = "curie:ft-ccb-lab-members-2022-01-21-23-22-46"
category = "Dark Forest"
location_name = "Winter's Glade"

description = get_location_description(category, location_name, finetuned_model)
print(description)
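
For the dialogue model fine-tuned in this notebook, the prompt should follow the layout of the `prompt` field in the training data: a `Setting:` line followed by the `Dramatis personae:` list. Below is a minimal sketch of a helper that builds such a prompt. The name `format_dialogue_prompt` and the shortened toy personas are mine, and the layout is copied from `get_dialogue_description` above.

```python
def format_dialogue_prompt(setting_name, setting_description, agents):
    """Build a prompt in the same layout as the training prompts."""
    prompt = "Setting:\n"
    prompt += "* {} - {}\n".format(setting_name, setting_description)
    prompt += "Dramatis personae:\n"
    for agent in agents:
        prompt += "* {} - {}\n".format(agent["name"].capitalize(), agent["persona"])
    return prompt


toy_agents = [
    {"name": "the bazaar owner", "persona": "I own the bazaar in the village."},
    {"name": "an assistant", "persona": "I am only the blacksmith's assistant."},
]
dialogue_prompt = format_dialogue_prompt(
    "Bazaar", "This place is made from oak wood.", toy_agents
)
print(dialogue_prompt)
```

You would then pass `dialogue_prompt` as the `prompt` argument in place of the location prompt. Because each training completion ends with `===`, that string is a reasonable stop sequence to try, and you will probably want a larger `max_tokens` than 64 for a whole conversation.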