## Data for Learning to Speak and Act in a Fantasy Text Adventure Game
 

Facebook AI Research released a dataset for their paper [Learning to Speak and Act in a Fantasy Text Adventure Game](https://arxiv.org/abs/1903.03094).

Here's the paper's abstract:

> We introduce a large-scale crowdsourced text adventure game as a research platform for studying grounded dialogue. In it, agents can perceive, emote, and act while conducting dialogue with other agents. Models and humans can both act as characters within the game. We describe the results of training state-of-the-art generative and retrieval models in this setting. We show that in addition to using past dialogue, these models are able to effectively use the state of the underlying world to condition their predictions. In particular, we show that grounding on the details of the local environment, including location descriptions, and the objects (and their affordances) and characters (and their previous actions) present within it allows better predictions of agent behavior and dialogue. We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relate to agents that can talk and act successfully.

Their data is called the LIGHT dataset (Learning in Interactive Games with Humans and Text).  It contains 663 locations, 3462 objects and 1755 characters.  I have divided this data into training/dev/test splits.


## Load the data

The LIGHT data was released as part of Facebook's ParlAI system. I extracted the data into several JSON files:
* ```light_environment_train.json``` contains information about the locations, objects, and characters in the text-adventure games.  
* ```light_dialogue_data.json``` contains sample conversations between pairs of characters.   We'll use this later in the semester. 



In [1]:
!wget https://raw.githubusercontent.com/interactive-fiction-class/interactive-fiction-class-data/master/light_dialogue/light_environment_train.json

--2022-01-30 17:19:15--  https://raw.githubusercontent.com/interactive-fiction-class/interactive-fiction-class-data/master/light_dialogue/light_environment_train.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3541467 (3.4M) [text/plain]
Saving to: ‘light_environment_train.json’


2022-01-30 17:19:15 (65.6 MB/s) - ‘light_environment_train.json’ saved [3541467/3541467]



In [2]:
import sys
import os
import json

json_filename = 'light_environment_train.json'

# Use a context manager so the file is closed once the JSON has been loaded
with open(json_filename) as f:
  light_environment = json.load(f)


# LIGHT Environment Data

This section of the Python Notebook will walk you through the LIGHT environment data to show you the different elements of the JSON file.  We will use different pieces of these to fine-tune GPT3 in order to generate new locations and objects for our own text adventure games.
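
Before looking at each part of the file, it can help to count how many entries each top-level section holds. The sketch below is only an illustration: `summarize_environment` is a hypothetical helper name, and `tiny_environment` is a hand-made stand-in (not real LIGHT data) that mimics the five top-level keys of `light_environment` so the snippet runs without the downloaded file.

```python
def summarize_environment(environment):
  """Return a {section_name: number_of_entries} summary of a LIGHT-style dict."""
  return {section: len(entries) for section, entries in environment.items()}

# Hand-made miniature stand-in for light_environment (an assumption, not real data)
tiny_environment = {
  'categories': {'11': 'Graveyard'},
  'rooms': {'62': {'setting': 'An Unfinished Mausoleum', 'category': 'Graveyard'}},
  'neighbors': {},
  'characters': {'203': {'name': 'gravedigger'}},
  'objects': {},
}

summary = summarize_environment(tiny_environment)
for section, count in summary.items():
  print(section, count)
```

Calling `summarize_environment(light_environment)` on the real data should work the same way, since each section is a dictionary keyed by ID.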


### Categories

The locations in the LIGHT environment are grouped by categories. 

```
categories =  light_environment['categories']

categories

 {'11': 'Graveyard',
 '12': 'Wasteland',
 '13': 'Abandoned',
 '14': 'Mountain',
 '15': 'Cave',
 '16': 'Tavern',
 '17': 'Jungle',
 '18': 'Trail',
 '19': 'Town',
 '2': 'Forest',
 '20': 'Dungeon',
 '21': 'Inside Cottage',
 ... }
```


I split the LIGHT environment data into training/dev/test splits based on categories.  Here are the categories that ended up in the training partition.

In [3]:
def get_categories(light_environment):
  return light_environment['categories'].values()
categories = get_categories(light_environment)

print("\n".join(categories))

Forest
Shore
Countryside
Port
Swamp
Lake
Graveyard
Abandoned
Cave
Trail
Dungeon
Outside Cottage
Inside Castle
Outside Castle
Inside Church
Outside Church
Inside Temple
Outside Temple
Inside Tower
Outside Tower
Inside Palace
Outside Palace
Farm
city in the clouds
magical realm
netherworld
supernatural
underwater aquapolis



### Rooms

In text-adventure games, locations or settings are often called "rooms".  Rooms have a primary description of the location, a secondary description of the location with its background information, connections to neighboring rooms, and they can contain objects or non-player characters. 

Here's what the data structure looks like for a particular room in LIGHT (room number 62, 'An Unfinished Mausoleum', part of the 'Graveyard' category).

```
rooms = light_environment['rooms']
rooms['62']

{'background': "Bright white stone was all the fad for funerary architecture, once upon a time. It's difficult to understand why someone would abandon such a large and expensive undertaking. If they didn't have the money to finish it, they could have sold the stone, surely - or the mausoleum itself. Maybe they just haven't needed it yet? A bit odd, though, given how old it is. Maybe the gravedigger remembers... if he's sober.",
 'category': 'Graveyard',
 'description': 'Two-and-a-half walls of the finest, whitest stone stand here, weathered by the passing of countless seasons. There is no roof, nor sign that there ever was one. All indications are that the work was abruptly abandoned. There is no door, nor markings on the walls. Nor is there any indication that any coffin has ever lain here... yet.',
 'ex_characters': [204, 75, 156, 720],
 'ex_objects': [1791, 1792, 439],
 'in_characters': [203, 203],
 'in_objects': [1790],
 'neighbors': [108, 109],
 'room_id': 62,
 'setting': 'An Unfinished Mausoleum'}
```

The **in_objects** and **in_characters** are things and people that are explicitly mentioned in the description or the backstory.  The **ex_characters** and **ex_objects** are characters and objects that are possibly present but not mentioned directly. These characters and objects are referenced by numeric IDs, which are stored in a separate part of the LIGHT environment file.
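
Because rooms store only numeric IDs, turning them into names takes a lookup in the matching section of the file. The following is a minimal sketch of that lookup, assuming the tables are keyed by string IDs as in the examples in this notebook; `names_for_ids` is a hypothetical helper, and `sample_environment` is a hand-made stand-in rather than the real data.

```python
def names_for_ids(ids, table, name_field='name'):
  """Look up each numeric ID in a table keyed by string IDs; skip missing IDs."""
  names = []
  for numeric_id in ids:
    entry = table.get(str(numeric_id))
    if entry is not None:
      names.append(entry[name_field])
  return names

# Hand-made stand-in that mimics the shape of the LIGHT environment file
sample_environment = {
  'rooms': {'62': {'in_characters': [203, 203], 'ex_characters': [204, 75]}},
  'characters': {
    '203': {'name': 'gravedigger'},
    '204': {'name': 'thief'},
    '75': {'name': 'peasant'},
  },
}

room = sample_environment['rooms']['62']
in_names = names_for_ids(room['in_characters'], sample_environment['characters'])
ex_names = names_for_ids(room['ex_characters'], sample_environment['characters'])
print(in_names)
print(ex_names)
```

Skipping missing IDs is a design choice for this sketch; raising an error instead may be preferable if you expect every ID to resolve.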



Here are the rooms that are in the 'Graveyard' category. 

In [4]:
from collections import defaultdict

rooms_by_id = light_environment['rooms']

rooms_by_category = defaultdict(set)
for room_id in rooms_by_id:
  category = light_environment['rooms'][room_id]['category']
  rooms_by_category[category].add(room_id)

def get_room_name(room_id, rooms_by_id=rooms_by_id):
  return rooms_by_id[room_id]['setting']

def print_rooms_for_category(category, rooms_by_category, rooms_by_id):
  rooms = rooms_by_category[category]
  print(category.capitalize())
  for room_id in rooms:
    print('\t', room_id, '-', get_room_name(room_id))

print_rooms_for_category('Graveyard', rooms_by_category, rooms_by_id)



Graveyard
	 702 - Main street
	 431 - Abandoned workers shed
	 340 - A cursed grave
	 62 - An Unfinished Mausoleum
	 277 - Graveyard
	 661 - Main graveyard
	 462 - Dead Tree
	 100 - Old Crypt
	 158 - the fountain
	 162 - Reception area
	 144 - Cemetery
	 386 - Tombstones of the Kings


### Neighbors

Rooms are connected to other rooms.  The LIGHT dataset stores the connections in a variable called ```light_environment['neighbors']```.  Here is an example of the information that is stored about these connections.

```
 '108': {'connection': 'walking carefully between fallen headstones',
  'destination': 'Fresh Grave',
  'direction': 'West',
  'inverse_id': None,
  'room_id': 62},
 '109': {'connection': 'following a dirt trail behind the mausoleum',
  'destination': 'Dead Tree',
  'direction': 'South',
  'inverse_id': None,
  'room_id': 62},
```

These can be thought of as arcs in a directed graph, where the rooms are nodes, and these elements are the arcs that connect a pair of nodes.  The head of the arc (the ***to node***) is specified by the ```destination``` field (a description rather than an ID), and the tail of the arc (the ***from node***) is specified by the ```room_id```.
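
One way to make the directed-graph view concrete is an adjacency list that maps each source room to its outgoing arcs. This is a sketch under assumptions: `build_adjacency` is a hypothetical helper, the name-to-ID dictionary is passed in by the caller, and the sample arcs are a trimmed copy of the example above with 'Fresh Grave' deliberately left out of the name table to show how unresolvable destinations are skipped.

```python
from collections import defaultdict

def build_adjacency(arcs, room_names_to_id):
  """Map each source room ID to a list of (direction, target room ID) pairs.

  Arcs whose destination name has no matching room are skipped.
  """
  adjacency = defaultdict(list)
  for arc in arcs.values():
    target_id = room_names_to_id.get(arc['destination'])
    if target_id is None:
      continue
    adjacency[str(arc['room_id'])].append((arc['direction'], str(target_id)))
  return adjacency

# Trimmed sample arcs; the name table below is hand-made for illustration
sample_arcs = {
  '108': {'destination': 'Fresh Grave', 'direction': 'West', 'room_id': 62},
  '109': {'destination': 'Dead Tree', 'direction': 'South', 'room_id': 62},
}
sample_room_names_to_id = {'Dead Tree': '462', 'An Unfinished Mausoleum': '62'}

adjacency = build_adjacency(sample_arcs, sample_room_names_to_id)
print(dict(adjacency))
```

Matching on the destination name is fragile when two rooms share a name, so treat the result as approximate.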

In [5]:
arcs = light_environment['neighbors']

# Create a dictionary that maps room names ('setting') to IDs
room_names_to_id = {room['setting']:room_id for (room_id,room) in rooms_by_id.items()}


def make_connections(arcs):
#  direction, connected_location, travel_description
  for arc_id, arc in arcs.items():
    try:
      source_id = str(arc['room_id'])
      target_id = str(room_names_to_id[arc['destination']])
      direction = arc['direction']
      travel_description = arc['connection']
      source_name = get_room_name(source_id)
      target_name = get_room_name(target_id)
      # Print out the room connections in the Graveyard
      if source_id in rooms_by_category['Graveyard']:
        print('====')
        print(source_name, '-->', target_name)
        print(direction)
        print(travel_description)
    except KeyError:
      # Skip arcs whose destination name does not match any room in this split
      continue

make_connections(arcs)

====
An Unfinished Mausoleum --> Dead Tree
South
following a dirt trail behind the mausoleum
====
Old Crypt --> Abandoned workers shed
South
walking down the cobbled path
====
Cemetery --> Main street
West
following the cobblestone path
====
Reception area --> Main graveyard
East
walking
====
Tombstones of the Kings --> Church
North
exiting the graveyard
====
Abandoned workers shed --> Old Crypt
North
walking down the cobbled path
====
Main street --> Cemetery
South
traveling the road south
====
Main street --> Cemetery
East
following the cobblestone path



### Characters 


Characters have a description, a persona (a first-person description of who they are and what their motivations might be), a character type (person, creature or object), a location (```in_room_ids```) and an inventory (```carrying_objects```).

The Gravedigger character is listed in the Unfinished Mausoleum's ``in_characters`` variable.  The ``in_characters`` are characters that are explicitly mentioned in the location's ``description`` or ``background`` variables.  In this case, the Gravedigger is mentioned in the Unfinished Mausoleum's ``background`` variable. 
```
light_environment['characters']['203']

{'base_form': ['gravedigger'],
 'carrying_objects': [890],
 'char_type': 'person',
 'character_id': 203,
 'corrected_name': 'gravedigger',
 'desc': 'You might want to talk to the gravedigger, specially if your looking for a friend, he might be odd but you will find a friend in him.',
 'ex_room_ids': [100, 349],
 'in_room_ids': [62],
 'is_plural': 0,
 'name': 'gravedigger',
 'orig_room_id': 349,
 'personas': ["I am low paid labor in this town. I do a job that many people shun because of my contact with death. I am very lonely and wish I had someone to talk to who isn't dead."],
 'wearing_objects': [],
 'wielding_objects': []}

 ```
 Here are the ``ex_characters`` from the Unfinished Mausoleum.  They are not explicitly mentioned in the room's description or background, but the annotators thought that these characters were the kinds of characters that might be found there.


In [6]:
# Character IDs are stored as integers in the room but as strings in the characters table
for character_id in [204, 75, 156, 720]:
  print(light_environment['characters'][str(character_id)]['corrected_name'])

thief
peasant
mouse
bat


Here is the Gravedigger character.  Characters have descriptions, name, and personas.  We'll use personas later in the semester when we look at generating dialogue for characters.

In [7]:
light_environment['characters']['203']

{'base_form': ['gravedigger'],
 'carrying_objects': [890],
 'char_type': 'person',
 'character_id': 203,
 'corrected_name': 'gravedigger',
 'desc': 'You might want to talk to the gravedigger, specially if your looking for a friend, he might be odd but you will find a friend in him.',
 'ex_room_ids': [100, 349],
 'in_room_ids': [62],
 'is_plural': 0,
 'name': 'gravedigger',
 'orig_room_id': 349,
 'personas': ["I am low paid labor in this town. I do a job that many people shun because of my contact with death. I am very lonely and wish I had someone to talk to who isn't dead."],
 'wearing_objects': [],
 'wielding_objects': []}

In [8]:
from collections import Counter

characters_by_id = light_environment['characters']

def count_character_types(characters_by_id):
  character_types = Counter()
  for character_id in characters_by_id:
    character = characters_by_id[character_id]
    char_type = character['char_type']
    character_types[char_type] += 1
  return character_types

character_types = count_character_types(characters_by_id)
print(character_types)


Counter({'person': 1028, 'creature': 304, 'object': 38})


### Objects

Objects are inanimate things in the game.  They have descriptions, locations, and a set of properties that could be used to govern how a player interacts with them.  The properties of objects in the LIGHT dataset are:
* is_container
* is_drink
* is_food
* is_gettable
* is_plural
* is_surface
* is_weapon
* is_wearable

These properties have numeric values associated with them.  The values seem to be something like 0.0 = false, 1.0 = true, 0.5 = possibly. 
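
If that reading of the values is right, a small helper can turn the scores into labels. The sketch below simply encodes the guess stated above (0.0 = false, 1.0 = true, anything else = possibly); `interpret_property` is a hypothetical helper and `sample_object` is a made-up object, not an entry from the dataset.

```python
def interpret_property(value):
  """Translate a property score into a rough label (assumed interpretation)."""
  if value == 1.0:
    return 'yes'
  if value == 0.0:
    return 'no'
  return 'possibly'

# Made-up object for illustration only
sample_object = {
  'name': 'wooden crate',
  'is_gettable': 1.0,
  'is_food': 0.0,
  'is_surface': 0.5,
}

labels = {
  label: interpret_property(value)
  for label, value in sample_object.items()
  if label.startswith('is_')
}
print(labels)
```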

Here is an example object:
```
light_environment['objects']['1188']

 {'base_form': ['sword', 'Sword'],
 'desc_entries': 2,
 'descriptions': ['The sword is very old, you would assume it had once belonged to a legendary warrior.',
  "The sword's legend is known by everyone, it is famous throughout the land."],
 'ex_room_ids': [],
 'holding_character_ids': [],
 'in_room_ids': [12],
 'is_container': 0.0,
 'is_drink': 0.0,
 'is_food': 0.0,
 'is_gettable': 1.0,
 'is_plural': 1.0,
 'is_surface': 0.0,
 'is_weapon': 1.0,
 'is_wearable': 0.0,
 'link_entries': 1,
 'name': 'Legendary swords',
 'object_id': 1188}
 ```

In [None]:
obj = light_environment['objects']['1188']
print(obj['name'])
print(obj['object_id'])
for label, value in obj.items():
  if label.startswith('is_') and value == 1.0:
    print(label, value)

Legendary swords
1188
is_gettable 1.0
is_weapon 1.0
is_plural 1.0


In [55]:
objects_by_id = light_environment['objects']

def sort_objects_by_property(objects_by_id):
  objects_by_property = defaultdict(set)
  for object_id, obj in objects_by_id.items():
    for label, value in obj.items():
      # Only keep properties that are definitely true (score of exactly 1.0)
      if label.startswith('is_') and value == 1.0:
        objects_by_property[label].add(object_id)
  return objects_by_property

objects_by_property = sort_objects_by_property(objects_by_id)

# print 20 objects for each property
for prop in objects_by_property:
  print(prop)
  for counter, object_id in enumerate(objects_by_property[prop]):
    if counter < 20:
      obj_name = objects_by_id[object_id]['name']
      print('\t', obj_name)


is_gettable
	 sand
	 water buckets
	 a barrel
	 black rock
	 needles
	 religious idols
	 underground plant-life
	 chamber pot
	 benches made from wood
	 fetilizer
	 mace
	 a holy book on a shelf underneath the seat
	 Daggers
	 two home made wooden crosses
	 hammer
	 finest linens
	 cutting board
	 bedding
	 branches
	 silks
is_plural
	 herd some sheep
	 water buckets
	 needles
	 tiles
	 fancy dinners
	 religious idols
	 underground plant-life
	 benches made from wood
	 fruit trees
	 ornate railings
	 two home made wooden crosses
	 tombs for every empress and emperor
	 stalagmites
	 finest linens
	 open plains
	 ceilings
	 branches
	 jeweled eyes
	 beds
	 silks
is_weapon
	 whip
	 Self-portraits
	 black rock
	 needles
	 compass
	 igneous rock
	 A broken fishing pole
	 stars
	 hunting rifles
	 garden bench
	 broken cutlass
	 benches made from wood
	 Pitch fork
	 a cooking pot
	 mace
	 a holy book on a shelf underneath the seat
	 Swords
	 Daggers
	 Empty mug
	 bows and arrows
is_surface
	 

# Format Data for Fine-Tuning 

Below, I show how to create data to fine-tune OpenAI models.  The OpenAI API documentation has a [guide to fine-tuning models](https://beta.openai.com/docs/guides/fine-tuning) that you should read.   The basic format of fine-tuning data is a JSONL file (one JSON object per line) with two keys: `prompt` and `completion`.

```
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...
```
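
To see why one JSON object per line works even when prompts contain newlines, here is a small round-trip sketch. It writes records to an in-memory buffer instead of a file; `write_jsonl` and `read_jsonl` are hypothetical helpers, and the sample record is made up. `json.dumps` escapes newlines inside strings, so each record stays on a single line.

```python
import io
import json

def write_jsonl(records, out):
  """Write one JSON object per line to a file-like object."""
  for record in records:
    out.write(json.dumps(record))
    out.write('\n')

def read_jsonl(lines):
  """Parse JSONL lines and check each record has exactly the two expected keys."""
  records = []
  for line_number, line in enumerate(lines, start=1):
    if not line.strip():
      continue
    record = json.loads(line)
    if set(record) != {'prompt', 'completion'}:
      raise ValueError('line {} has unexpected keys: {}'.format(line_number, sorted(record)))
    records.append(record)
  return records

# Made-up record for illustration
sample_records = [
  {'prompt': 'Category: Forest\nSetting: Woodlands\n',
   'completion': 'Description: A quiet wood.\n###\n'},
]

buffer = io.StringIO()
write_jsonl(sample_records, buffer)
round_tripped = read_jsonl(buffer.getvalue().splitlines())
print(round_tripped == sample_records)
```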

In the code below, I'll extract a prompt that contains the `Category` and `Setting` variables from a LIGHT Environment room, and I'll have the completion be the room's `Description`.

In [9]:

def get_room_description(room_id, rooms_by_id, light_environment):
  """
  Generate a prompt and a completion that can be used to fine-tune an OpenAI model.
  This version puts the room's category and setting in the prompt and its
  description in the completion.
  """
  prompt = ""
  completion = ""
  prompt += "Category: {category}\n".format(category=rooms_by_id[room_id]['category'].capitalize())
  prompt += "Setting: {setting}\n".format(setting=rooms_by_id[room_id]['setting'].capitalize())
  completion += "Description: {description}\n".format(description=rooms_by_id[room_id]['description'])
  completion += "###\n"

  return prompt, completion


def create_location_finetuning_data(filename='fine_tuning_location_descriptions.jsonl'):
  fine_tuning_data = []
  for category in categories:
    rooms = rooms_by_category[category]
    for room_id in rooms:
      data = {}
      prompt, completion = get_room_description(room_id, rooms_by_id, light_environment)
      data['prompt'] = prompt
      data['completion'] = completion
      print(prompt, end="")
      print(completion)
      fine_tuning_data.append(data)

  with open(filename, 'w') as out:
    for data in fine_tuning_data:
        out.write(json.dumps(data))
        out.write('\n')

create_location_finetuning_data()

Category: Forest
Setting: Woodlands
Description: The lush hunting grounds cover acres of land, tall trees and thickets of bushes sprawling across the land. Fields of open land are under the deep blue sky, showing the depth of the landscape. The grounds forest is thick but full of light, letting the multitude of wildlife be seen by the naked eye. The abundance of greenery makes it a prime location for hogs and oxen as they explore the land.
###

Category: Forest
Setting: Forbidden forest
Description: The forbidden forest is a dark and very scary place. The trees are covered in spider web nets and huge spiders. The forest echoes sounds of witches' laughs and creepy crawly animals.  There is a brisk breeze that runs through the trees that will put chills down your spine.
###

Category: Forest
Setting: North forest
Description: Filled with huge trees, the forest is home to many types of animals. Huge predators such as tigers and bears roam the grounds as do small animals like squirrels. Co

# Fine-tune GPT3 with the OpenAI API

Next, we'll perform fine-tuning with this data using OpenAI. 

In [6]:
%%capture
!pip install --upgrade openai
!pip install jsonlines

Once you've got access to the OpenAI API, you can find your OpenAI API key [here](https://beta.openai.com/account/api-keys).

In [None]:
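import os
import openai
from getpass import getpass

# getpass hides the key as you type it, so it is not echoed into the notebook output
openai.api_key = getpass('Enter OpenAI API key: ')

os.environ['OPENAI_API_KEY'] = openai.api_key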

In [10]:
!head fine_tuning_location_descriptions.jsonl

{"prompt": "Category: Forest\nSetting: Woodlands\n", "completion": "Description: The lush hunting grounds cover acres of land, tall trees and thickets of bushes sprawling across the land. Fields of open land are under the deep blue sky, showing the depth of the landscape. The grounds forest is thick but full of light, letting the multitude of wildlife be seen by the naked eye. The abundance of greenery makes it a prime location for hogs and oxen as they explore the land.\n###\n"}
{"prompt": "Category: Forest\nSetting: Forbidden forest\n", "completion": "Description: The forbidden forest is a dark and very scary place. The trees are covered in spider web nets and huge spiders. The forest echoes sounds of witches' laughs and creepy crawly animals.  There is a brisk breeze that runs through the trees that will put chills down your spine.\n###\n"}
{"prompt": "Category: Forest\nSetting: North forest\n", "completion": "Description: Filled with huge trees, the forest is home to many types of 

Next, we'll make the fine-tuning API call via the command line.  Here the -m argument gives the model.  There are 4 sizes of GPT3 models.  They go in alphabetical order from smallest to largest.
* Ada 
* Babbage
* Curie
* Davinci

As the model sizes increase, so do their quality and their cost.  Davinci is the highest quality and highest cost model.  I recommend starting by fine-tuning smaller models to debug your code first so that you don't rack up costs.

Fine-tuning curie costs about $0.50 for this data.


In [60]:
light_environment.keys()

dict_keys(['categories', 'rooms', 'neighbors', 'characters', 'objects'])

In [None]:
!openai api fine_tunes.create -t fine_tuning_location_descriptions.jsonl -m curie


Found potentially duplicated files with name 'fine_tuning_location_descriptions.jsonl', purpose 'fine-tune' and size 196498 bytes
file-isIVJnKirOczR5QKVQKYfHfg
Enter file ID to reuse an already uploaded file, or an empty string to upload this file anyway: 
Upload progress: 100% 196k/196k [00:00<00:00, 262Mit/s]
Uploaded file from fine_tuning_location_descriptions.jsonl: file-uTIbP7fRx8RADabfCSD3fE6n
Created fine-tune: ft-967ExyrfGqnZ4zV29R61yVlt
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-01-28 18:13:47] Created fine-tune: ft-967ExyrfGqnZ4zV29R61yVlt
[2022-01-28 18:13:57] Fine-tune costs $0.47
[2022-01-28 18:13:57] Fine-tune enqueued. Queue number: 0
[2022-01-28 18:14:01] Fine-tune started
[2022-01-28 18:17:46] Completed epoch 1/4
[2022-01-28 18:20:52] Completed epoch 2/4
[2022-01-28 18:23:56] Completed epoch 3/4
[2022-01-28 18:27:02] Completed epoch 4/4
[2022-01-28 18:27:33] Uploaded model: curie:ft-cis-700-

You should copy down the fine-tune numbers which look like this:

```
Created fine-tune: ft-VzQpTwfnWAzDXNKgPTFtiZg2

[2022-01-21 23:22:47] Uploaded model: curie:ft-ccb-lab-members-2022-01-21-23-22-46
```

If you forget to write it down, you can list your fine-tune runs and models this way. These model names aren't mnemonic, so it is probably a good idea to make a note of what your model's inputs and outputs are. 

In [None]:
!openai api fine_tunes.list

{
  "data": [
    {
      "created_at": 1643171130,
      "fine_tuned_model": "babbage:ft-cis-700-31-2022-01-26-04-28-54",
      "hyperparams": {
        "batch_size": 1,
        "learning_rate_multiplier": 0.05,
        "n_epochs": 4,
        "prompt_loss_weight": 0.1
      },
      "id": "ft-gQz3VvxCdlXkPZXKfIKPb7Pz",
      "model": "babbage",
      "object": "fine-tune",
      "organization_id": "org-F69R2DHlJEWY5pRMNnAC0oYV",
      "result_files": [
        {
          "bytes": 16255,
          "created_at": 1643171337,
          "filename": "compiled_results.csv",
          "id": "file-fuqAemUsZ2CZG1VM5dCYYqeV",
          "object": "file",
          "purpose": "fine-tune-results",
          "status": "processed",
          "status_details": null
        }
      ],
      "status": "succeeded",
      "training_files": [
        {
          "bytes": 5827,
          "created_at": 1643171129,
          "filename": "fine_tuning_intent_determination-1.jsonl",
          "id": "file-pWTYKK

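The listing is plain JSON, so it can be filtered in Python once you have it as a string. The sketch below relies only on the `data`, `id`, `fine_tuned_model`, and `status` fields visible in the output above; `succeeded_models` is a hypothetical helper, and `sample_listing` is a shortened, hand-written listing with placeholder IDs and a made-up second status value.

```python
import json

def succeeded_models(listing_json):
  """Return (fine-tune id, model name) pairs for runs whose status is 'succeeded'."""
  listing = json.loads(listing_json)
  return [
    (run['id'], run['fine_tuned_model'])
    for run in listing['data']
    if run.get('status') == 'succeeded'
  ]

# Hand-written listing with placeholder values (not real fine-tune IDs)
sample_listing = json.dumps({
  'data': [
    {'id': 'ft-EXAMPLE1', 'fine_tuned_model': 'babbage:ft-example-1', 'status': 'succeeded'},
    {'id': 'ft-EXAMPLE2', 'fine_tuned_model': None, 'status': 'pending'},
  ]
})

models = succeeded_models(sample_listing)
print(models)
```
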
You can run your fine-tuned model in the OpenAI Playground.  After the model has finished fine-tuning, you'll find it in the Engine dropdown menu.  

You'll need to give the inputs that we used as the "prompt" in our training data.  In this case we gave it a `Category:` and a `Setting:` name.  You can also add the `Description:` part of the completion if you want. You should also set the stop sequence to be `###`. For example, we could type this input into the playground:

```
Category: Dark Forest
Setting: Winter's Glade
Description:
```
If you press the "Generate" button, your fine-tuned model will output something like:

> The forest at night is not a place one would willingly set foot in. It is cold and dark and seems to go on forever. Trees loom over you, blocking out the light of the moon and stars. The only sound is the occasional howl of a wolf and the occasional splashing of water.

If you don't like the description, you can press the "Regenerate" button to get other outputs like:

> The winter's glade is a dark and eerie place. It is home to many animals, but little else. The trees are barren and the ground is covered in snow.

Or

> The dark forest is a place where not even a ray of light can pierce the tangled web of branches overhead. Needles from vast numbers of trees protrude at awkward angles, their branches thin and frail, more like twigs than the strong trunks they resemble. Between the branches, a darkness deeper than night reigns. It is from this darkness that the trees themselves appear as ghosts, for the branches do not embrace the earth so much as merely touch it. The tips of the branches move slightly with every breath taken by the trees, and the dark forest seems to breathe along with them.

Or 

> The forest has now turned white.  The trees are barren and dead, their branches thin and broken.  There's a light dusting of snow on the ground, and it looks as if the forest is trying to erase all traces of life from the earth.


You can press the "Code" button to get a snippet of code that you can adapt into your own Python programs.  

```
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Completion.create(
  model="curie:ft-ccb-lab-members-2022-01-21-23-22-46",
  prompt="Category: Dark Forest\nSetting: Winter's Glade\nDescription:",
  temperature=0.7,
  max_tokens=64,
  top_p=1,
  frequency_penalty=0,
  presence_penalty=0,
  stop=["###"]
)
```

Here's an example of how to write a function using the code that the OpenAI API provides.

In [13]:
def get_location_description(category, location_name, finetuned_model):
  response = openai.Completion.create(
      model=finetuned_model,
      prompt="Category: {category}\nSetting: {location}\nDescription:".format(
          category=category.capitalize(),
          location=location_name.capitalize()
      ),
      temperature=0.7,
      max_tokens=64,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["###"]
      )
  return response['choices'][0]['text']

# Replace with your model's name
finetuned_model = "curie:ft-cis-700-31-2022-01-28-18-27-32"
category = "Campus"
location_name = "Winter's UPenn"

description = get_location_description(category, location_name, finetuned_model)
print(description)

 The sky is overcast and murky. The wind blows snow in drifts everywhere. The ground is covered in a white blanket that seems to go on for miles. This is Winter's upenn. The temperature is well below freezing.



# TODO: Fine-Tune Additional Models for Text Adventure Games

In this assignment, we'll ask you to fine-tune models to perform the following tasks:
1. Describe a location (I've given you this code.  You can adapt it for other models)
- inputs: category, location name
- output: location description 
2. List the items that are at a location
- inputs: category, location name, location description, number of items
- output: list of item names
3. Describe an item
- inputs: category, location name, location description, item name
- output: item description 
4. Get an item's properties
- inputs: item name, item description, property (e.g. gettable)
- output: True or False if the item has that property
5. List connections from the current location
- inputs: category, location name, location description, and optionally a partial list of existing connections (direction, location name) tuples 
- output: a list of (direction, location name) tuples
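
For the item-properties task, one possible way to format a training example is sketched below. The prompt layout (`Item:`, `Description:`, `Property:`, `Answer:`) and the helper name `get_item_property_example` are assumptions made for illustration, not a format prescribed by the assignment or by OpenAI. The sketch also treats any score other than 1.0 as False, which is a simplification. The sample object copies a few fields from object 1188 shown earlier.

```python
def get_item_property_example(obj, property_name):
  """Build one prompt/completion pair asking whether an item has a property.

  The prompt layout here is an assumed format, chosen only for illustration.
  """
  description = obj['descriptions'][0] if obj['descriptions'] else ''
  prompt = 'Item: {name}\nDescription: {description}\nProperty: {prop}\n'.format(
      name=obj['name'], description=description, prop=property_name)
  # Simplification: only a score of exactly 1.0 counts as True
  has_property = obj['is_' + property_name] == 1.0
  completion = 'Answer: {answer}\n###\n'.format(answer=has_property)
  return prompt, completion

# A few fields copied from object 1188 in the notebook
sample_item = {
  'name': 'Legendary swords',
  'descriptions': ['The sword is very old, you would assume it had once belonged to a legendary warrior.'],
  'is_weapon': 1.0,
  'is_food': 0.0,
}

weapon_prompt, weapon_completion = get_item_property_example(sample_item, 'weapon')
food_prompt, food_completion = get_item_property_example(sample_item, 'food')
print(weapon_prompt, end='')
print(weapon_completion)
```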


In [31]:
# get all items in one specific location
def get_items(room_id, rooms_by_id, light_environment):
  """
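  Generate a prompt and a completion that can be used to fine-tune an OpenAI model.
  This version puts the room's category, setting, and description in the prompt
  and the names of the room's in_objects in the completion.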
  """
  prompt = ""
  completion = ""
  prompt += "Category: {category}\n".format(category=rooms_by_id[room_id]['category'].capitalize())
  prompt += "Setting: {setting}\n".format(setting=rooms_by_id[room_id]['setting'].capitalize())
  prompt += "Description: {description}\n".format(description=rooms_by_id[room_id]['description'])
  in_objects=rooms_by_id[room_id]['in_objects']
  objects_array=[]
  for obj in in_objects:
    objects_array.append(light_environment["objects"][str(obj)]["name"])
  completion += "Items: {items}\n".format(items=objects_array)
  completion += "###\n"

  return prompt, completion


def create_items_finetuning_data(filename='fine_tuning_location_items.jsonl'):
  fine_tuning_data = []
  for category in categories:
    rooms = rooms_by_category[category]
    for room_id in rooms:
      data = {}
      prompt, completion = get_items(room_id, rooms_by_id, light_environment)
      data['prompt'] = prompt
      data['completion'] = completion
      print(prompt, end="")
      print(completion)
      fine_tuning_data.append(data)

  with open(filename, 'w') as out:
    for data in fine_tuning_data:
        out.write(json.dumps(data))
        out.write('\n')

create_items_finetuning_data()

Category: Forest
Setting: Woodlands
Description: The lush hunting grounds cover acres of land, tall trees and thickets of bushes sprawling across the land. Fields of open land are under the deep blue sky, showing the depth of the landscape. The grounds forest is thick but full of light, letting the multitude of wildlife be seen by the naked eye. The abundance of greenery makes it a prime location for hogs and oxen as they explore the land.
Items: []
###

Category: Forest
Setting: Forbidden forest
Description: The forbidden forest is a dark and very scary place. The trees are covered in spider web nets and huge spiders. The forest echoes sounds of witches' laughs and creepy crawly animals.  There is a brisk breeze that runs through the trees that will put chills down your spine.
Items: []
###

Category: Forest
Setting: North forest
Description: Filled with huge trees, the forest is home to many types of animals. Huge predators such as tigers and bears roam the grounds as do small animal

In [32]:
!head fine_tuning_location_items.jsonl

{"prompt": "Category: Forest\nSetting: Woodlands\nDescription: The lush hunting grounds cover acres of land, tall trees and thickets of bushes sprawling across the land. Fields of open land are under the deep blue sky, showing the depth of the landscape. The grounds forest is thick but full of light, letting the multitude of wildlife be seen by the naked eye. The abundance of greenery makes it a prime location for hogs and oxen as they explore the land.\n", "completion": "Items: []\n###\n"}
{"prompt": "Category: Forest\nSetting: Forbidden forest\nDescription: The forbidden forest is a dark and very scary place. The trees are covered in spider web nets and huge spiders. The forest echoes sounds of witches' laughs and creepy crawly animals.  There is a brisk breeze that runs through the trees that will put chills down your spine.\n", "completion": "Items: []\n###\n"}
{"prompt": "Category: Forest\nSetting: North forest\nDescription: Filled with huge trees, the forest is home to many types

In [33]:
!openai api fine_tunes.create -t fine_tuning_location_items.jsonl -m curie

Upload progress:   0% 0.00/224k [00:00<?, ?it/s]Upload progress: 100% 224k/224k [00:00<00:00, 393Mit/s]
Uploaded file from fine_tuning_location_items.jsonl: file-eNHOBuLarPn5uvkKswfljQjy
Created fine-tune: ft-uBmnY7J7B3LoJoH2IXYcEoU3
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-01-30 18:20:43] Created fine-tune: ft-uBmnY7J7B3LoJoH2IXYcEoU3
[2022-01-30 18:20:49] Fine-tune costs $0.57
[2022-01-30 18:20:49] Fine-tune enqueued. Queue number: 0
[2022-01-30 18:20:52] Fine-tune started
[2022-01-30 18:24:40] Completed epoch 1/4
[2022-01-30 18:27:47] Completed epoch 2/4
[2022-01-30 18:30:52] Completed epoch 3/4
[2022-01-30 18:33:58] Completed epoch 4/4
[2022-01-30 18:34:25] Uploaded model: curie:ft-cis-700-31-2022-01-30-18-34-23
[2022-01-30 18:34:28] Uploaded result file: file-PvCoPAuwbMzyExTCmCZkt39J
[2022-01-30 18:34:28] Fine-tune succeeded

Job complete! Status: succeeded 🎉
Try out your fine-tuned model:

openai a

In [36]:
def get_items_at_location(category, location_name, location_description, finetuned_model):
  response = openai.Completion.create(
      model=finetuned_model,
      prompt="Category: {category}\nSetting: {location}\nDescription: {description}\nItems:".format(
          category=category.capitalize(),
          location=location_name.capitalize(),
          description=location_description.capitalize()
      ),
      temperature=0.7,
      max_tokens=64,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["###"]
      )
  string_result=response['choices'][0]['text'].replace("[","").replace("]","").replace("'","").strip()
  array=string_result.split(", ")
  return array

# Replace with your model's name
finetuned_model = "curie:ft-cis-700-31-2022-01-30-18-34-23"
category = "Underwater aquapolis"
location_name = "Neptune's throne room"
location_description = "A small room in the castle, Neptune's throne room is adorned with painted walls as blue as the sea itself. Tables constructed of beautiful coral lean against all four walls with decorative shells serving as a border for them. The ceiling is painted to resemble waves."

items = get_items_at_location(category, location_name, location_description, finetuned_model)
print(items)


['painted walls as blue as the sea itself', 'tables constructed of beautiful coral', 'decorative shells', 'ceiling']
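The clean-up in `get_items_at_location` strips quotes and splits on `", "`, so an item name that contains a comma or an apostrophe would be mangled. Because the fine-tuning completions are Python list literals, one alternative is to parse them with `ast.literal_eval` and fall back to splitting only when parsing fails. This is a minimal sketch; `parse_item_list` is a hypothetical helper, not part of the original notebook.

```python
import ast

def parse_item_list(completion_text):
    """Parse a completion such as " ['sword', 'shield']" into a list of strings."""
    text = completion_text.strip()
    try:
        parsed = ast.literal_eval(text)
        if isinstance(parsed, (list, tuple)):
            return [str(item).strip() for item in parsed if str(item).strip()]
    except (ValueError, SyntaxError):
        pass  # Not a valid literal (e.g. cut off by max_tokens); fall back below.
    # Fallback: the same clean-up the notebook uses, minus empty entries.
    cleaned = text.replace("[", "").replace("]", "").replace("'", "")
    return [part.strip() for part in cleaned.split(", ") if part.strip()]
```

Unlike the original clean-up, this returns an empty list for `Items: []` rather than a list containing one empty string.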


In [17]:
# get single item description in one specific location
def get_single_item_description(room_id, rooms_by_id, light_environment, item_name, item_description):
  """
  This generates a prompt and a completion which can be used to fine-tune OpenAI.
  This version generates the description of a single item, given its location and name.
  """
  prompt = ""
  completion = ""
  prompt += "Category: {category}\n".format(category=rooms_by_id[room_id]['category'].capitalize())
  prompt += "Setting: {setting}\n".format(setting=rooms_by_id[room_id]['setting'].capitalize())
  prompt += "Description: {description}\n".format(description=rooms_by_id[room_id]['description'])
  prompt += "Item: {item_name}\n".format(item_name=item_name)
  completion += "Item_Description: {item_description}\n".format(item_description=item_description)
  completion += "###\n"

  return prompt, completion


def create_item_description_finetuning_data(filename='fine_tuning_single_item_description.jsonl'):
  fine_tuning_data = []
  for category in categories:
    rooms = rooms_by_category[category]
    for room_id in rooms:
      items=rooms_by_id[room_id]['ex_objects']
      for item_id in items:
        data = {}
        item=light_environment["objects"][str(item_id)]
        item_name=item["name"]
        item_description=item["descriptions"]
        prompt, completion = get_single_item_description(room_id, rooms_by_id, light_environment, item_name, item_description)
        data['prompt'] = prompt
        data['completion'] = completion
        print(prompt, end="")
        print(completion)
        fine_tuning_data.append(data)

  with open(filename, 'w') as out:
    for data in fine_tuning_data:
        out.write(json.dumps(data))
        out.write('\n')

create_item_description_finetuning_data()

Streaming output truncated to the last 5000 lines.
###

Category: Inside castle
Setting: Dining hall
Description: The dining hall is a grandiose room with enormously tall ceilings. Their is one long wooden dining table, with two huge chairs on each end, and normal benches on the sides. The walls of the room are decorated with huge elegant tapestries depicting memories and moments of the kingdom. Their is a stage for performers, and there are many windows that offer spectacular views of the beautiful castle grounds.
Item: Mugs
Item_Description: ['The engravings on the mugs detail the ancient battles of heroes.', 'This mug would be perfect for having a hot drink. It is glassy so you can see right through it.']
###

Category: Inside castle
Setting: Banquet hall
Description: Cherry wood walls with paintings of historic kings and queens outlined in gold. A very long cherry wood colored table with chairs on all sides. Candle pieces are on the table about a couple feet apart wit

In [None]:
!head fine_tuning_single_item_description.jsonl

{"prompt": "Category: Forest\nSetting: Mysterious forest outskirts\nDescription: The forest is a dark and mysterious place, where few but the bravest hunters dare wander. The trees loom over you, obscuring any sort of sunlight. Despite this, it is a valuable food and lumber resource for the nearby castle's inhabitants, for the trees are tall and thick and a plentiful amount of deer wander about within.\nItem: rope bridge\n", "completion": "Item_Description: ['The ancient rope bridge appears fragile and rickety, swaying to and fro in the wind.  The bridge floor is made of planks of rotting wood which, if stepped through, would abruptly pull the victim down to an unsuspecting death.']\n###\n"}
{"prompt": "Category: Forest\nSetting: Mysterious forest outskirts\nDescription: The forest is a dark and mysterious place, where few but the bravest hunters dare wander. The trees loom over you, obscuring any sort of sunlight. Despite this, it is a valuable food and lumber resource for the nearby 

In [None]:
!openai api fine_tunes.create -t fine_tuning_single_item_description.jsonl -m curie

Upload progress:   0% 0.00/579k [00:00<?, ?it/s]
Upload progress: 100% 579k/579k [00:00<00:00, 1.02Git/s]
Uploaded file from fine_tuning_single_item_description.jsonl: file-KSNKlhvImmU2LwypaxiSO9mp
Created fine-tune: ft-ETW5qUKhlW3c2xltWH8DQJPZ
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-01-29 01:46:52] Created fine-tune: ft-ETW5qUKhlW3c2xltWH8DQJPZ
[2022-01-29 01:47:03] Fine-tune costs $1.49
[2022-01-29 01:47:03] Fine-tune enqueued. Queue number: 0
[2022-01-29 01:47:06] Fine-tune started

Stream interrupted (client disconnected).
To resume the stream, run:

  openai api fine_tunes.follow -i ft-ETW5qUKhlW3c2xltWH8DQJPZ



In [None]:
!openai api fine_tunes.follow -i ft-ETW5qUKhlW3c2xltWH8DQJPZ

[2022-01-29 01:46:52] Created fine-tune: ft-ETW5qUKhlW3c2xltWH8DQJPZ
[2022-01-29 01:47:03] Fine-tune costs $1.49
[2022-01-29 01:47:03] Fine-tune enqueued. Queue number: 0
[2022-01-29 01:47:06] Fine-tune started
[2022-01-29 01:53:58] Completed epoch 1/4
[2022-01-29 02:00:06] Completed epoch 2/4
[2022-01-29 02:06:16] Completed epoch 3/4
[2022-01-29 02:12:24] Completed epoch 4/4
[2022-01-29 02:12:55] Uploaded model: curie:ft-cis-700-31-2022-01-29-02-12-53
[2022-01-29 02:12:58] Uploaded result file: file-6bFHX4MjCbZfgI9MkIvegkNM
[2022-01-29 02:12:59] Fine-tune succeeded

Job complete! Status: succeeded 🎉
Try out your fine-tuned model:

openai api completions.create -m curie:ft-cis-700-31-2022-01-29-02-12-53 -p <YOUR_PROMPT>


In [38]:
def get_item_description(category, item_name, finetuned_model, location_name="", location_description=""):
  response = openai.Completion.create(
      model=finetuned_model,
      prompt="Category: {category}\nSetting: {location}\nDescription: {description}\nItem: {item_name}\nItem_Description:".format(
          category=category.capitalize(),
          location=location_name.capitalize(),
          description=location_description.capitalize(),
          item_name=item_name.capitalize()
      ),
      temperature=0.7,
      max_tokens=64,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["###"]
      )
  return response['choices'][0]['text']

# Replace with your model's name
finetuned_model = "curie:ft-cis-700-31-2022-01-29-02-12-53"
category = "Underwater aquapolis"
location_name = "Neptune's throne room"
location_description = "A small room in the castle, Neptune's throne room is adorned with painted walls as blue as the sea itself. Tables constructed of beautiful coral lean against all four walls with decorative shells serving as a border for them. The ceiling is painted to resemble waves."
item_name = "king's crown"


item_description = get_item_description(category, item_name, finetuned_model, location_name=location_name, location_description=location_description)
print(item_description)

 ['The crown is magnificent, made of solid gold and diamond encrusted.']



In [None]:
light_environment['neighbors']
# room_names_to_id

{'1': {'connection': 'down the hill',
  'destination': 'creek',
  'direction': 'West',
  'inverse_id': None,
  'room_id': 1},
 '10': {'connection': 'other docks',
  'destination': 'Boats',
  'direction': 'West',
  'inverse_id': None,
  'room_id': 9},
 '1005': {'connection': 'walking',
  'destination': 'Weapon Closet',
  'direction': 'Down',
  'inverse_id': None,
  'room_id': 596},
 '1007': {'connection': 'Atlantic Ocean.',
  'destination': 'An abandoned castle on an ocean cliff side',
  'direction': 'South',
  'inverse_id': None,
  'room_id': 634},
 '1011': {'connection': '{}',
  'destination': 'Church Entryway',
  'direction': 'Outside',
  'inverse_id': None,
  'room_id': 662},
 '1015': {'connection': 'taking a rocky path',
  'destination': 'the bone pit',
  'direction': 'South',
  'inverse_id': None,
  'room_id': 567},
 '1016': {'connection': 'following the horse trail',
  'destination': "the horse's stables",
  'direction': 'South',
  'inverse_id': None,
  'room_id': 568},
 '1017': 

In [19]:
# get connections of one specific location
def get_location_connection(room_id, rooms_by_id, light_environment, current_connections):
  """
  This generates a prompt and a completion which can be used to fine-tune OpenAI.
  This version generates the full list of connections for a location, given the connections that are already known.
  """
  prompt = ""
  completion = ""
  prompt += "Category: {category}\n".format(category=rooms_by_id[room_id]['category'].capitalize())
  prompt += "Setting: {setting}\n".format(setting=rooms_by_id[room_id]['setting'].capitalize())
  prompt += "Description: {description}\n".format(description=rooms_by_id[room_id]['description'])
  prompt += "Current_Connections: {current_connections}\n".format(current_connections=str(current_connections))
  connections=[]
  for neighbor_code in rooms_by_id[room_id]['neighbors']:
    connections.append((light_environment['neighbors'][str(neighbor_code)]["direction"],light_environment['neighbors'][str(neighbor_code)]["destination"]))
  completion += "Connections: {connections}\n".format(connections=str(connections))
  completion += "###\n"

  return prompt, completion

possible_connections=[]
n=0
neighbors=None
def backtrack(i, array):
  if i==n:
    possible_connections.append(array[:])
    return
  neighbor_code = neighbors[i]
  array.append((light_environment['neighbors'][str(neighbor_code)]["direction"],light_environment['neighbors'][str(neighbor_code)]["destination"]))
  backtrack(i+1, array)
  array.pop()
  backtrack(i+1, array)

def create_location_connections_finetuning_data(filename='fine_tuning_location_connections.jsonl'):
  global possible_connections, n, neighbors
  # fine_tuning_data = []
  with open(filename, 'w') as out:
    for category in categories:
      rooms = rooms_by_category[category]
      for room_id in rooms:
        data = {}
        possible_connections=[]
        neighbors=rooms_by_id[room_id]['neighbors']
        n = len(neighbors)
        backtrack(0, [])
        for current_connections in possible_connections:
          prompt, completion = get_location_connection(room_id, rooms_by_id, light_environment, current_connections)
          data['prompt'] = prompt
          data['completion'] = completion
          out.write(json.dumps(data))
          out.write('\n')
          print(prompt[:], end="")
          print(completion[:])
        

create_location_connections_finetuning_data()

Streaming output truncated to the last 5000 lines.
###

Category: Inside tower
Setting: Dining hall
Description: A somewhat ornate room with stone walls, hanging tapestries, and a detailed look dinner table in the center with many chairs.  There's candles on the dinner tables along with place mats for those who come to eat.
Current_Connections: [('Down', 'Dungeon')]
Connections: [('North', 'Main entrance'), ('Down', 'Dungeon')]
###

Category: Inside tower
Setting: Dining hall
Description: A somewhat ornate room with stone walls, hanging tapestries, and a detailed look dinner table in the center with many chairs.  There's candles on the dinner tables along with place mats for those who come to eat.
Current_Connections: []
Connections: [('North', 'Main entrance'), ('Down', 'Dungeon')]
###

Category: Inside tower
Setting: Weapon closet
Description: The closet is a small dark room. There are different types of swords, axes, bows and arrows, and other types of weapons organize
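`backtrack` enumerates every subset of a room's neighbors by either including or skipping each one, so a room with n neighbors yields 2^n training examples. The same enumeration can be sketched without global variables using `itertools`. `all_subsets` is a hypothetical name, and the subsets come out in a different order than in the recursive version.

```python
from itertools import chain, combinations

def all_subsets(items):
    """Return every subset of items as a list, from the empty set to the full set."""
    items = list(items)
    return [list(subset) for subset in chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1))]

connections = [("North", "Main entrance"), ("Down", "Dungeon")]
for subset in all_subsets(connections):
    print(subset)
```

Because the count doubles with each neighbor, rooms with many exits contribute far more lines to the training file than rooms with few.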

In [73]:
!head fine_tuning_location_connections.jsonl

{"prompt": "Category: Forest\nSetting: Mysterious forest outskirts\nDescription: The forest is a dark and mysterious place, where few but the bravest hunters dare wander. The trees loom over you, obscuring any sort of sunlight. Despite this, it is a valuable food and lumber resource for the nearby castle's inhabitants, for the trees are tall and thick and a plentiful amount of deer wander about within.\nCurrent_Connections: [('East', \"Hunters' cabins\"), ('Inside', 'Mythical dragon castle'), ('Outside', 'castle entrance')]\n", "completion": "Connections: [('East', \"Hunters' cabins\"), ('Inside', 'Mythical dragon castle'), ('Outside', 'castle entrance')]\n###\n"}
{"prompt": "Category: Forest\nSetting: Mysterious forest outskirts\nDescription: The forest is a dark and mysterious place, where few but the bravest hunters dare wander. The trees loom over you, obscuring any sort of sunlight. Despite this, it is a valuable food and lumber resource for the nearby castle's inhabitants, for th

In [77]:
!openai api fine_tunes.create -t fine_tuning_location_connections.jsonl -m curie

Upload progress:   0% 0.00/982k [00:00<?, ?it/s]
Upload progress: 100% 982k/982k [00:00<00:00, 1.23Git/s]
Uploaded file from fine_tuning_location_connections.jsonl: file-5QmZ486f8rOFcMUKpUAsgQrb
Created fine-tune: ft-Wh2fqxmljuNDBigyORpQ7AWP
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-01-29 23:09:43] Created fine-tune: ft-Wh2fqxmljuNDBigyORpQ7AWP
[2022-01-29 23:09:54] Fine-tune costs $2.67
[2022-01-29 23:09:54] Fine-tune enqueued. Queue number: 0
[2022-01-29 23:09:59] Fine-tune started

Stream interrupted (client disconnected).
To resume the stream, run:

  openai api fine_tunes.follow -i ft-Wh2fqxmljuNDBigyORpQ7AWP



In [80]:
!openai api fine_tunes.follow -i ft-Wh2fqxmljuNDBigyORpQ7AWP

[2022-01-29 23:09:43] Created fine-tune: ft-Wh2fqxmljuNDBigyORpQ7AWP
[2022-01-29 23:09:54] Fine-tune costs $2.67
[2022-01-29 23:09:54] Fine-tune enqueued. Queue number: 0
[2022-01-29 23:09:59] Fine-tune started
[2022-01-29 23:21:29] Completed epoch 1/4
[2022-01-29 23:46:37] Completed epoch 3/4
[2022-01-29 23:57:24] Completed epoch 4/4
[2022-01-29 23:57:55] Uploaded model: curie:ft-cis-700-31-2022-01-29-23-57-53
[2022-01-29 23:57:59] Uploaded result file: file-IrVQrBxle5O2FLoE7k3KoUnb
[2022-01-29 23:57:59] Fine-tune succeeded

Job complete! Status: succeeded 🎉
Try out your fine-tuned model:

openai api completions.create -m curie:ft-cis-700-31-2022-01-29-23-57-53 -p <YOUR_PROMPT>


In [48]:
def get_connections(category, location_name, location_description, finetuned_model, current_connections="[]"):
  response = openai.Completion.create(
      model=finetuned_model,
      # The field names must match the fine-tuning prompts ("Setting", not "Location").
      prompt="Category: {category}\nSetting: {location}\nDescription: {description}\nCurrent_Connections: {cconnections}\nConnections:".format(
          category=category.capitalize(),
          location=location_name.capitalize(),
          description=location_description.capitalize(),
          # str() accepts a list or its string form; capitalize() would lowercase the room names.
          cconnections=str(current_connections)
      ),
      temperature=0.7,
      max_tokens=64,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["###"]
      )
  string_result=response['choices'][0]['text'].replace("[","").replace("]","").strip()
  array=string_result.split("), (")
  result=[]
  for item in array:
    item=item.replace("(","").replace(")","").replace("'","")
    parts=item.split(", ")
    if len(parts)!=2:
      continue
    to_append=(parts[0],parts[1])
    result.append(to_append)
  return result

# Replace with your model's name
finetuned_model = "curie:ft-cis-700-31-2022-01-29-23-57-53"
category = "Underwater aquapolis"
location_name = "Neptune's throne room"
location_description = "A small room in the castle, Neptune's throne room is adorned with painted walls as blue as the sea itself. Tables constructed of beautiful coral lean against all four walls with decorative shells serving as a border for them. The ceiling is painted to resemble waves."
current_connections = "[('east', 'Underwater park')]"

connections = get_connections(category, location_name, location_description, finetuned_model, current_connections)
print(connections)

[('outside', 'Courtyard'), ('east', 'underwater park')]
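The parsing in `get_connections` removes every single quote and splits on `"), ("`, which does not handle destinations that Python prints with double quotes, such as `"Hunters' cabins"` in the training data. Since the completion is a Python list of tuples, it can be parsed with `ast.literal_eval` instead. This is a minimal sketch; `parse_connections` is a hypothetical helper, not part of the original notebook.

```python
import ast

def parse_connections(completion_text):
    """Parse " [('North', 'Main entrance')]" into (direction, destination) pairs."""
    try:
        parsed = ast.literal_eval(completion_text.strip())
    except (ValueError, SyntaxError):
        return []  # Malformed or truncated completion: report no connections.
    if not isinstance(parsed, (list, tuple)):
        return []
    result = []
    for entry in parsed:
        # Keep only well-formed (direction, destination) pairs of strings.
        if (isinstance(entry, tuple) and len(entry) == 2
                and all(isinstance(part, str) for part in entry)):
            result.append((entry[0].strip(), entry[1].strip()))
    return result
```

A truncated completion yields an empty list here, whereas the original parser keeps whatever complete pairs it can find; which behavior is preferable depends on how often `max_tokens` cuts the output short.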


In [22]:
# get property of one item
def get_property(item_name, item_description, property_description, property_score):
  """
  This generates a prompt and a completion which can be used to fine-tune OpenAI.
  This version generates a true/false score for one property of an item.
  """
  prompt = ""
  completion = ""
  prompt += "Item: {item_name}\n".format(item_name=item_name)
  prompt += "Item_Description: {item_description}\n".format(item_description=item_description)
  prompt += "Property_Description: {property_description}\n".format(property_description=property_description)
  completion += "True_Or_False: {score}\n".format(score=property_score)
  completion += "###\n"

  return prompt, completion


def create_item_property_finetuning_data(filename='fine_tuning_single_item_property.jsonl'):
  fine_tuning_data = []
  for item_id in light_environment["objects"].keys():
    for property_description in ["is_container", "is_drink", "is_food", "is_gettable", "is_surface", "is_weapon", "is_wearable"]:
      data = {}
      item=light_environment["objects"][item_id]
      item_name=item["name"]
      item_description=item["descriptions"]
      property_score=item[property_description]
      prompt, completion = get_property(item_name, item_description, property_description, property_score)
      data['prompt'] = prompt
      data['completion'] = completion
      print(prompt, end="")
      print(completion)
      fine_tuning_data.append(data)

  with open(filename, 'w') as out:
    for data in fine_tuning_data:
        out.write(json.dumps(data))
        out.write('\n')

create_item_property_finetuning_data()

Streaming output truncated to the last 5000 lines.
###

Item: vase
Item_Description: ['The vase looks like a mosaic covered in emeralds and rubies.']
Property_Description: is_container
True_Or_False: 1.0
###

Item: vase
Item_Description: ['The vase looks like a mosaic covered in emeralds and rubies.']
Property_Description: is_drink
True_Or_False: 0.0
###

Item: vase
Item_Description: ['The vase looks like a mosaic covered in emeralds and rubies.']
Property_Description: is_food
True_Or_False: 0.0
###

Item: vase
Item_Description: ['The vase looks like a mosaic covered in emeralds and rubies.']
Property_Description: is_gettable
True_Or_False: 1.0
###

Item: vase
Item_Description: ['The vase looks like a mosaic covered in emeralds and rubies.']
Property_Description: is_surface
True_Or_False: 0.0
###

Item: vase
Item_Description: ['The vase looks like a mosaic covered in emeralds and rubies.']
Property_Description: is_weapon
True_Or_False: 0.0
###

Item: vase
Item_Description

In [94]:
!head fine_tuning_single_item_property.jsonl

{"prompt": "Item: towering pine trees\nItem_Description: ['the tree is tall and leafy']\nProperty_Description: is_container\n", "completion": "True_Or_False: 0.0\n###\n"}
{"prompt": "Item: towering pine trees\nItem_Description: ['the tree is tall and leafy']\nProperty_Description: is_drink\n", "completion": "True_Or_False: 0.0\n###\n"}
{"prompt": "Item: towering pine trees\nItem_Description: ['the tree is tall and leafy']\nProperty_Description: is_food\n", "completion": "True_Or_False: 0.0\n###\n"}
{"prompt": "Item: towering pine trees\nItem_Description: ['the tree is tall and leafy']\nProperty_Description: is_gettable\n", "completion": "True_Or_False: 1.0\n###\n"}
{"prompt": "Item: towering pine trees\nItem_Description: ['the tree is tall and leafy']\nProperty_Description: is_surface\n", "completion": "True_Or_False: 0.0\n###\n"}
{"prompt": "Item: towering pine trees\nItem_Description: ['the tree is tall and leafy']\nProperty_Description: is_weapon\n", "completion": "True_Or_False: 0.

In [95]:
!openai api fine_tunes.create -t fine_tuning_single_item_property.jsonl -m curie

Upload progress:   0% 0.00/2.69M [00:00<?, ?it/s]
Upload progress: 100% 2.69M/2.69M [00:00<00:00, 5.23Git/s]
Uploaded file from fine_tuning_single_item_property.jsonl: file-BLR5HrxDGE8x0bF7XN36IWRc
Created fine-tune: ft-lo4O9zmygNjxibLcPIbztVXy
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-01-30 01:13:47] Created fine-tune: ft-lo4O9zmygNjxibLcPIbztVXy
[2022-01-30 01:13:56] Fine-tune costs $7.58
[2022-01-30 01:13:57] Fine-tune enqueued. Queue number: 0
[2022-01-30 01:14:01] Fine-tune started

Stream interrupted (client disconnected).
To resume the stream, run:

  openai api fine_tunes.follow -i ft-lo4O9zmygNjxibLcPIbztVXy



In [96]:
!openai api fine_tunes.follow -i ft-lo4O9zmygNjxibLcPIbztVXy

[2022-01-30 01:13:47] Created fine-tune: ft-lo4O9zmygNjxibLcPIbztVXy
[2022-01-30 01:13:56] Fine-tune costs $7.58
[2022-01-30 01:13:57] Fine-tune enqueued. Queue number: 0
[2022-01-30 01:14:01] Fine-tune started
[2022-01-30 01:22:55] Completed epoch 1/4
[2022-01-30 01:31:05] Completed epoch 2/4
[2022-01-30 01:39:16] Completed epoch 3/4
[2022-01-30 01:47:27] Completed epoch 4/4
[2022-01-30 01:47:54] Uploaded model: curie:ft-cis-700-31-2022-01-30-01-47-52
[2022-01-30 01:47:57] Uploaded result file: file-LuljqyvT9WcaoqcFLT5qHmwi
[2022-01-30 01:47:57] Fine-tune succeeded

Job complete! Status: succeeded 🎉
Try out your fine-tuned model:

openai api completions.create -m curie:ft-cis-700-31-2022-01-30-01-47-52 -p <YOUR_PROMPT>


In [42]:
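def get_item_property(property_name, item_name, item_description, finetuned_model, property_description=""):
  # The fine-tuning data puts the property name (e.g. "is_gettable") in the
  # Property_Description field, so the prompt uses property_name to match it.
  # property_description is accepted for the callers below but is not sent to the model.
  response = openai.Completion.create(
      model=finetuned_model,
      prompt="Item: {item}\nItem_Description: {description}\nProperty_Description: {property_des}\nTrue_Or_False:".format(
          item=item_name,
          description=item_description,
          property_des=property_name
      ),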
      temperature=0.7,
      max_tokens=64,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["###"]
      )
  return float(response['choices'][0]['text'].strip()[:3])==1.0

def is_gettable(item_name, item_description, finetuned_model, property_description="A player can pick this item up and add it to their inventory."):
  return get_item_property("is_gettable", item_name, item_description, finetuned_model, property_description)

def is_weapon(item_name, item_description, finetuned_model, property_description="This item can be used as a weapon."):
  return get_item_property("is_weapon", item_name, item_description, finetuned_model, property_description)

def is_surface(item_name, item_description, finetuned_model, property_description="Another item can be placed on top of this item."):
  return get_item_property("is_surface", item_name, item_description, finetuned_model, property_description)

def is_container(item_name, item_description, finetuned_model, property_description="Other items can be stored inside of this item."):
  return get_item_property("is_container", item_name, item_description, finetuned_model, property_description)

def is_wearable(item_name, item_description, finetuned_model, property_description="This item can be worn as an item of clothing."):
  return get_item_property("is_wearable", item_name, item_description, finetuned_model, property_description)

def is_drink(item_name, item_description, finetuned_model, property_description="This item is a liquid that can be drunk."):
  return get_item_property("is_drink", item_name, item_description, finetuned_model, property_description)

def is_food(item_name, item_description, finetuned_model, property_description="This item can be eaten."):
  return get_item_property("is_food", item_name, item_description, finetuned_model, property_description)

print(is_gettable("water", "water tastes good", "curie:ft-cis-700-31-2022-01-30-01-47-52"))
print(is_weapon("water", "water tastes good", "curie:ft-cis-700-31-2022-01-30-01-47-52"))
print(is_surface("water", "water tastes good", "curie:ft-cis-700-31-2022-01-30-01-47-52"))
print(is_container("water", "water tastes good", "curie:ft-cis-700-31-2022-01-30-01-47-52"))
print(is_wearable("water", "water tastes good", "curie:ft-cis-700-31-2022-01-30-01-47-52"))
print(is_drink("water", "water tastes good", "curie:ft-cis-700-31-2022-01-30-01-47-52"))
print(is_food("water", "water tastes good", "curie:ft-cis-700-31-2022-01-30-01-47-52"))


True
True
False
False
False
True
False
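`get_item_property` converts the first three characters of the completion with `float(...)`, which raises a `ValueError` if the model returns anything other than a number. A more forgiving conversion might look like the following minimal sketch; `parse_binary_label` is a hypothetical helper, and treating unparseable output as `False` is an assumption rather than something the notebook specifies.

```python
def parse_binary_label(completion_text, default=False):
    """Interpret a completion such as " 1.0\n" as True and " 0.0\n" as False."""
    tokens = completion_text.strip().split()
    if not tokens:
        return default  # Empty completion.
    try:
        return float(tokens[0]) == 1.0
    except ValueError:
        return default  # The model produced text instead of a number.
```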


# TODO: Generate A Game

You now have all of the pieces that you need to generate a game!

Build a game using your automatic methods, and then export it in the same JSON format as the LIGHT Environment Data.  

You'll upload your JSON file to Gradescope along with this notebook.

If you'd like, you can build a game using the same theme and location names as the one that you did in HW1.

In [61]:
#Escape Room Game (categories, rooms, neighbors, objects)
def build_game(category="apartment", initial_location_name="bedroom"):
  #return get_connections("apartment", "bedroom", "you wake up in the bedroom", "curie:ft-cis-700-31-2022-01-29-23-57-53", "")
  room_count=1
  connection_count=0
  object_count=0
  myDict={"categories": {"1": category}}
  myDict["rooms"]={}
  myDict["neighbors"]={}
  myDict["objects"]={}
  rooms_visited=set()
  rooms_visited.add(initial_location_name.strip().capitalize())
  objects_visited={}
  flag = 1
  def set_up(location_name):
    nonlocal room_count, connection_count, object_count, myDict, rooms_visited, objects_visited, flag
    if flag == 10:
      return
    flag+=1
    myDict["rooms"][str(room_count)]={"category": category, "setting": location_name}
    description = get_location_description(category, location_name, "curie:ft-cis-700-31-2022-01-28-18-27-32")
    in_objects = get_items_at_location(category, location_name, description, "curie:ft-cis-700-31-2022-01-30-18-34-23")
    tmp=[]
    for obj in in_objects:
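      # Look up the same normalized key that is stored below; checking the raw
      # name would miss earlier entries and create duplicate objects.
      if obj.strip().capitalize() not in objects_visited: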
        object_count+=1
        objects_visited[obj.strip().capitalize()]=object_count
        item_description=get_item_description(category, obj, "curie:ft-cis-700-31-2022-01-29-02-12-53", location_name)
        myDict["objects"][str(object_count)]={}
        myDict["objects"][str(object_count)]["is_container"]=float(is_container(obj, item_description, "curie:ft-cis-700-31-2022-01-30-01-47-52"))
        myDict["objects"][str(object_count)]["is_drink"]=float(is_drink(obj, item_description, "curie:ft-cis-700-31-2022-01-30-01-47-52"))
        myDict["objects"][str(object_count)]["is_food"]=float(is_food(obj, item_description, "curie:ft-cis-700-31-2022-01-30-01-47-52"))
        myDict["objects"][str(object_count)]["is_gettable"]=float(is_gettable(obj, item_description, "curie:ft-cis-700-31-2022-01-30-01-47-52"))
        myDict["objects"][str(object_count)]["is_surface"]=float(is_surface(obj, item_description, "curie:ft-cis-700-31-2022-01-30-01-47-52"))
        myDict["objects"][str(object_count)]["is_wearable"]=float(is_wearable(obj, item_description, "curie:ft-cis-700-31-2022-01-30-01-47-52"))
        myDict["objects"][str(object_count)]["is_weapon"]=float(is_weapon(obj, item_description, "curie:ft-cis-700-31-2022-01-30-01-47-52"))
        myDict["objects"][str(object_count)]["name"]=obj
      tmp.append(objects_visited[obj.strip().capitalize()])
    myDict["rooms"][str(room_count)]["in_objects"] = tmp
    myDict["rooms"][str(room_count)]["description"] = description
    myDict["rooms"][str(room_count)]["neighbors"] = []
    connections = get_connections(category, location_name, str(description), "curie:ft-cis-700-31-2022-01-29-23-57-53", str([]))
    neighbors=[]
    original_room_count=room_count
    for direction, neighbor in connections:
      connection_count+=1
      myDict["rooms"][str(original_room_count)]["neighbors"].append(connection_count)
      myDict["neighbors"][str(connection_count)]={"destination": neighbor, "direction": direction, "room_id": original_room_count}
      # Compare using the normalized form that rooms_visited stores.
      if neighbor.strip().capitalize() not in rooms_visited:
        room_count+=1
        rooms_visited.add(neighbor.strip().capitalize())
        set_up(neighbor)
  set_up(initial_location_name)
  game = myDict
  return game

game = build_game()

In [62]:
def export_game_json(game, output_filename="my_gpt3_game.json"):
  with open(output_filename, "w") as outfile:
    json.dump(game, outfile, indent=4)

export_game_json(game)
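Before uploading the exported file, it may be worth checking that the generated game is internally consistent, for example that every object and neighbor id listed in a room actually exists. The sketch below uses the field names produced by `build_game` (`rooms`, `neighbors`, `objects`, `in_objects`, `room_id`); `check_game` is a hypothetical helper, and the checks are assumptions about what a consistent game looks like rather than a specification of the LIGHT format.

```python
def check_game(game):
    """Return a list of human-readable problems found in a generated game dict."""
    problems = []
    for key in ("categories", "rooms", "neighbors", "objects"):
        if key not in game:
            problems.append("missing top-level key: " + key)
    rooms = game.get("rooms", {})
    neighbors = game.get("neighbors", {})
    objects = game.get("objects", {})
    for room_id, room in rooms.items():
        # Ids are stored as ints inside rooms but as string keys at the top level.
        for object_id in room.get("in_objects", []):
            if str(object_id) not in objects:
                problems.append("room {} lists unknown object {}".format(room_id, object_id))
        for neighbor_id in room.get("neighbors", []):
            if str(neighbor_id) not in neighbors:
                problems.append("room {} lists unknown neighbor {}".format(room_id, neighbor_id))
    for neighbor_id, neighbor in neighbors.items():
        if str(neighbor.get("room_id")) not in rooms:
            problems.append("neighbor {} points from unknown room {}".format(
                neighbor_id, neighbor.get("room_id")))
    return problems
```

Calling `check_game(game)` on the result of `build_game()` should return an empty list if all references resolve.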

# TODO: Evaluation

An important part of NLP and machine learning is determining how good your models are.  It's very tricky to reliably evaluate generation output automatically.  For now, we'll evaluate the predictions of the model.

For your model's attribute predictions, you should compute its precision and recall for each attribute type on the LIGHT development data.


In [53]:
!wget https://raw.githubusercontent.com/interactive-fiction-class/interactive-fiction-class-data/master/light_dialogue/light_environment_dev.json

--2022-01-30 20:38:24--  https://raw.githubusercontent.com/interactive-fiction-class/interactive-fiction-class-data/master/light_dialogue/light_environment_dev.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 485111 (474K) [text/plain]
Saving to: ‘light_environment_dev.json’


2022-01-30 20:38:24 (16.9 MB/s) - ‘light_environment_dev.json’ saved [485111/485111]



In [59]:
with open('light_environment_dev.json') as f:
  light_environment_dev = json.load(f)
gold_standard_objects_by_property = sort_objects_by_property(light_environment_dev['objects'])

# You can modify this function definition
def compute_precision_and_recall_for_each_properites(gold_standard_objects_by_property):
  # Note: despite its name, this returns a single pooled score: the fraction of
  # predictions that match the gold labels, summed over all properties.
  model = "curie:ft-cis-700-31-2022-01-30-01-47-52"
  # Map each property name to the helper that predicts it.
  predictors = {
      "is_gettable": is_gettable,
      "is_weapon": is_weapon,
      "is_surface": is_surface,
      "is_container": is_container,
      "is_wearable": is_wearable,
      "is_drink": is_drink,
      "is_food": is_food,
  }
  total_count=0
  correct=0
  for property_name, predict in predictors.items():
    for obj_number in gold_standard_objects_by_property[property_name]:
      obj=light_environment_dev['objects'][obj_number]
      obj_description=str(obj["descriptions"])
      correct+=float(predict(obj["name"], obj_description, model))==obj[property_name]
      total_count+=1
  return correct/total_count

compute_precision_and_recall_for_each_properites(gold_standard_objects_by_property)

0.5518394648829431
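
The number above is a single pooled fraction of matching predictions, not precision and recall. For one attribute, both can be computed from paired gold and predicted labels, as in the following minimal sketch. `precision_recall` and the toy labels are hypothetical; in practice the predictions would come from the `is_*` helpers and the gold labels from `light_environment_dev`.

```python
def precision_recall(gold_labels, predicted_labels):
    """Compute (precision, recall) for the boolean labels of one attribute."""
    pairs = list(zip(gold_labels, predicted_labels))
    true_pos = sum(1 for gold, pred in pairs if gold and pred)
    false_pos = sum(1 for gold, pred in pairs if not gold and pred)
    false_neg = sum(1 for gold, pred in pairs if gold and not pred)
    # By convention here, report 0.0 when a denominator is zero.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall

gold = [True, True, False, False, True]
pred = [True, False, True, False, True]
print(precision_recall(gold, pred))
```

The gold labels in LIGHT are stored as `1.0` and `0.0`, which Python treats as true and false, so they can be passed in without conversion.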