## Tutorial and exercises: Few-Shot prompting for non-programmers
*This Colab notebook is part of the [AI for media, art, and design](https://github.com/PerttuHamalainen/MediaAI) course of Aalto University and is also used in the [Art of Writing](https://aalto.my.site.com/courses/s/course/a057T0000043wJBQAY/art-of-writing?language=en_US) course.*

This notebook demonstrates and lets you practice few-shot prompting of Large Language Models (LLMs).

**What is few-shot prompting?**

Few-shot prompting means that one includes *high-quality examples* of desired outputs in the LLM prompt. This can greatly improve the generated results. Similar to explaining things to a human, using concrete examples is often the most *efficient and precise way* to explain what one wants.

*Finding/writing the few-shot examples is a key skill to practise* in AI-assisted creative writing. Typically, this includes both writing example text yourself, and browsing and/or scraping the Internet for examples.
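
To make the idea concrete, here is a minimal sketch of how a few-shot prompt can be assembled from a title line and a list of examples. The helper name `build_few_shot_prompt` and the shortened example texts are illustrative assumptions; the `---` separator mirrors the prompts used later in this notebook.

```python
# Minimal sketch: assemble a few-shot prompt from a title and a list of examples.
# build_few_shot_prompt and SEPARATOR are illustrative names, not part of any library.
SEPARATOR = "\n\n---\n\n"

def build_few_shot_prompt(title, examples):
    """Join the title and examples, ending with an open slot for the model to fill."""
    parts = [title] + list(examples)
    return SEPARATOR.join(parts) + SEPARATOR

examples = [
    "An FPS game where time only moves when the player moves or performs actions.",
    "A Sokoban-style game where the player pushes around blocks that form the game's rules.",
]
prompt = build_few_shot_prompt("A list of novel and experimental indie game ideas:", examples)
print(prompt)
```

Because the prompt ends with a separator, a continuation model will most likely carry on by writing one more item in the same style.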

**Learning goals:**

* Practice producing the few-shot examples and learn to anticipate how different examples change the results
* Compare different LLMs and understand the differences between base models and finetuned models, as well as continuation models and chat models.
* Practice combinatorial few-shot prompting. For example, we will create game ideas that systematically explore combinations of game genres and mechanics

**New to Colab notebooks?**

Colab notebooks are browser-based learning environments consisting of *cells* that include either text or code. The code is executed in a Google virtual machine instead of your own computer. You can run code cell-by-cell (click the "play" symbol of each code cell), and selecting "Run all" as instructed below is usually the first step to verify that everything works. For more info, see Google's [Intro video](https://www.youtube.com/watch?v=inN8seMm7UI) and [curated example notebooks](https://colab.google/notebooks/).


### Getting started
1. Select "Run all" from the Runtime menu. This will run all the code in this notebook so that you can inspect the results.
2. Below, enter your OpenAI API key when requested. The class instructor should give you a temporary key. If you want to use your own key, follow [OpenAI's instructions](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key).
3. Proceed through the notebook, following the exercise instructions.

Some exercises will ask you to edit the code, but this is limited to only defining the prompt and requires no programming knowledge. The code you should not touch is clearly marked.

In [None]:
# @title
from getpass import getpass
try: OPENAI_API_KEY
except NameError: OPENAI_API_KEY = getpass('Enter OpenAI API key: ')

### Do some setup and connect to OpenAI.
This part will install and import libraries. It will take a while, but only needs to be done once.

A rotating "stop" symbol below means the code is executing and you should wait.

In [None]:
# @title
#First, install and import code packages
#! in the beginning of a Colab code line allows running Linux shell commands
!pip install openai
!pip install tiktoken
!pip install umap-learn
import tiktoken
import time
import asyncio
import pickle
import hashlib
import os
import openai
import numpy as np
import pandas as pd
import itertools
import matplotlib.pyplot as plt
import json
import plotly.express as px
import textwrap

client=openai.OpenAI(api_key=OPENAI_API_KEY)
client_async=openai.AsyncOpenAI(api_key=OPENAI_API_KEY)

In [None]:
# To make things easier, this cell defines some code helpers.
# source: https://github.com/PerttuHamalainen/LLMCode
import json

#Colab is already running an asyncio event loop => need this hack for async OpenAI API calling
import nest_asyncio
nest_asyncio.apply()

#progress bar helper
def print_progress_bar (iteration, total, prefix = '', suffix = '', decimals = 1, length = 100, fill = '█', printEnd = "\r"):
    """
    Call in a loop to create terminal progress bar
    @params:
        iteration   - Required  : current iteration (Int)
        total       - Required  : total iterations (Int)
        prefix      - Optional  : prefix string (Str)
        suffix      - Optional  : suffix string (Str)
        decimals    - Optional  : positive number of decimals in percent complete (Int)
        length      - Optional  : character length of bar (Int)
        fill        - Optional  : bar fill character (Str)
        printEnd    - Optional  : end character (e.g. "\r", "\r\n") (Str)
    """
    percent = ("{0:." + str(decimals) + "f}").format(100 * (iteration / float(total)))
    filledLength = int(length * iteration // total)
    bar = fill * filledLength + '-' * (length - filledLength)
    print(f'\r{prefix} |{bar}| {percent}% {suffix}', end = printEnd)
    # Print New Line on Complete
    if iteration == total:
        print()

#LLM response cache. For reducing API costs, if the exact same prompt has been
#used before, the helpers below offer an option to return results from cache.
cache_dir = "./llm_cache"
def cache_keys_equal(key1,key2):
    if (type(key1) is np.ndarray) and (type(key2) is np.ndarray):
        return np.array_equal(key1,key2)
    return key1==key2

def cache_hash(key):
    return hashlib.md5(key).hexdigest()

def load_cached(key):
    cached_name= cache_dir + "/" + cache_hash(key)
    if os.path.exists(cached_name):
        # Use a context manager so that the cache file is always closed
        with open(cached_name,"rb") as f:
            cached=pickle.load(f)
        if cache_keys_equal(cached["key"],key):
            #cache_copy_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), "cache_copy") #for debugging which files are actually used...
            #shutil.copy(cached_name, cache_copy_dir+"/" + cache_hash(key))
            return cached["value"]
    return None

def cache(key,value):
    if not os.path.exists(cache_dir):
        os.mkdir(cache_dir)
    cached_name= cache_dir + "/" + cache_hash(key)
    # Use a context manager so that the cache file is flushed and closed
    with open(cached_name,"wb") as f:
        pickle.dump({"key":key,"value":value},f)


#Some info for tokenizing text (e.g., for calculating the number of prompt tokens)
tiktoken_encodings = {
    "gpt-4-turbo": tiktoken.get_encoding("cl100k_base"),
    "gpt-4-turbo-preview": tiktoken.get_encoding("cl100k_base"),
    "gpt-4": tiktoken.get_encoding("cl100k_base"),
    "gpt-3.5-turbo": tiktoken.get_encoding("cl100k_base"),
    "gpt-3.5-turbo-instruct": tiktoken.get_encoding("cl100k_base"),
    "gpt-3.5-turbo-16k": tiktoken.get_encoding("cl100k_base"),
    "davinci-002": tiktoken.get_encoding("cl100k_base"),
    "text-davinci-003": tiktoken.get_encoding("p50k_base"),
    "text-davinci-002": tiktoken.get_encoding("p50k_base"),
    "text-davinci-001": tiktoken.get_encoding("r50k_base"),
    "text-curie-001": tiktoken.get_encoding("r50k_base"),
    "text-babbage-001": tiktoken.get_encoding("r50k_base"),
    "text-ada-001": tiktoken.get_encoding("r50k_base"),
    "davinci": tiktoken.get_encoding("r50k_base"),
    "curie": tiktoken.get_encoding("r50k_base"),
    "babbage": tiktoken.get_encoding("r50k_base"),
    "ada": tiktoken.get_encoding("r50k_base"),
}

#Maximum number of tokens supported by different models
max_llm_context_length = {
    "gpt-4-turbo": 128000,
    "gpt-4-turbo-preview": 128000,
    "gpt-3.5-turbo-16k": 16384,
    "gpt-4": 8192,
    "gpt-3.5-turbo": 4096,
    "gpt-3.5-turbo-instruct": 4096,
    "text-davinci-003": 4096,
    "text-davinci-002": 4096,
    "text-davinci-001": 2049,
    "text-curie-001": 2049,
    "text-babbage-001": 2049,
    "text-ada-001": 2049,
    "davinci": 2049,
    "curie": 2049,
    "babbage": 2049,
    "ada": 2049
}

#Does a model only support the newer chat API and not the continuations API?
def is_chat_model(model):
    # Explicit parentheses: "and" binds tighter than "or", so this matches the original behavior
    return ("gpt-4" in model) or (("gpt-3.5-turbo" in model) and ("gpt-3.5-turbo-instruct" not in model))

#Calculate the number of tokens for a string
def num_tokens_from_string(string: str, model: str) -> int:
    """Returns the number of tokens in a text string."""
    if not model in tiktoken_encodings:
        raise Exception(f"Tiktoken encoding unknown for LLM: {model}")
    encoding = tiktoken_encodings[model]
    num_tokens = len(encoding.encode(string))
    return num_tokens

# Queries an LLM for continuations of a batch of prompts given as a list
def query_LLM_batch(model, prompt_batch, max_tokens, use_cache=None, temperature=None,system_message=None,stop=None):
    if use_cache is None:
        use_cache=False
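    # Always include the model name in the key; model.join() would omit it for a single-prompt batch
    cache_key=(model+"\n"+"\n".join(prompt_batch)).encode('utf-8')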
    if use_cache:
        cached_result=load_cached(cache_key)
        if cached_result is not None:
            return cached_result

    #choose whether to use the chat API or the older query API
    if is_chat_model(model):
        if system_message is None:
            system_message = "You are a helpful assistant."

        # each prompt in the batch becomes its own asynchronous chat completion request
        async def batch_request(prompt_batch):
            tasks=[]
            for prompt in prompt_batch:
                messages = [
                    {"role": "system", "content": system_message},
                    {"role": "user", "content": prompt},
                ]
                tasks.append(client_async.chat.completions.create(
                    model=model,
                    messages=messages,
                    temperature=temperature,
                    max_tokens=max_tokens,
                    n=1,  # one completion per prompt
                    stop=stop,
                    frequency_penalty=0.0,
                    presence_penalty=0.0,
                ))
            return await asyncio.gather(*tasks)

        loop = asyncio.get_event_loop()
        responses = loop.run_until_complete(batch_request(prompt_batch))
        continuations = [response.choices[0].message.content.strip() for response in responses]

        # before we return the continuations, ensure that we don't violate OpenAI's rate limits
        total_tokens = 0
        for prompt in prompt_batch:
            total_tokens += num_tokens_from_string(string=system_message, model=model)
            total_tokens += num_tokens_from_string(string=prompt, model=model)
        for continuation in continuations:
            total_tokens += num_tokens_from_string(string=continuation, model=model)
        max_tokens_per_minute = 90000  # currently imposed limit for ChatGPT models
        wait_seconds = (total_tokens / max_tokens_per_minute) * 60.0
        #print(f"Waiting {wait_seconds} seconds to ensure staying within rate limit")
        time.sleep(wait_seconds)

    else:
        # The old completions API supports batched prompts out-of-the-box
        # No extra key setup is needed here: `client` was created with the API key above
        response = client.completions.create(
            model=model,
            prompt=prompt_batch,
            temperature=temperature,
            max_tokens=max_tokens,
            top_p=1.0,
            frequency_penalty=0.0,
            presence_penalty=0.0,
            stop=stop,
            n=1  # one completion per prompt
        )
        # extract continuations
        continuations = [choice.text for choice in response.choices]

        # before we return the continuations, ensure that we don't violate OpenAI's rate limits
        total_tokens = 0
        for prompt in prompt_batch:
            total_tokens += num_tokens_from_string(string=prompt, model=model)
        for continuation in continuations:
            total_tokens += num_tokens_from_string(string=continuation, model=model)
        max_tokens_per_minute = 90000  # currently imposed limit for ChatGPT models
        wait_seconds = (total_tokens / max_tokens_per_minute) * 60.0
        #print(f"Waiting {wait_seconds} seconds to ensure staying within rate limit")
        time.sleep(wait_seconds)

    if use_cache:
        cache(key=cache_key,value=continuations)
    return continuations


#Query LLM with a list of prompts
def query_LLM(model, prompts, max_tokens, use_cache=None, temperature=None, system_message=None,stop=None):
    #Query the LLM in batches
    continuations=[]
    batch_size = 10  # The exact max batch_size for each model is unknown. This seems to work for all, and provides a nice speed-up.
    N = len(prompts)
    for i in range(0, N, batch_size):
        prompt_batch=prompts[i:min([N, i + batch_size])]
        continuations+=query_LLM_batch(model=model,
                                 prompt_batch=prompt_batch,
                                 max_tokens=max_tokens,
                                 use_cache=use_cache,
                                 temperature=temperature,
                                 system_message=system_message,
                                 stop=stop)
        print_progress_bar(min([N, i + batch_size]), N,printEnd="")
    return continuations



def embed(texts,use_cache=None,model=None):
    if model is None:
        model = "text-embedding-3-small"
    if use_cache is None:
        use_cache = True
    cache_key=(model+("".join(texts))).encode('utf-8')
    if use_cache:
        cached_result=load_cached(cache_key)
        if cached_result is not None:
            print("Loaded embeddings from cache, hash", cache_hash(cache_key))
            return cached_result


    #query embeddings from the API
    texts=[json.dumps(s) for s in texts]  #make sure we escape quotes in a way compatible with GPT-3 API's internal use of json
    # No extra key setup is needed here: `client` was created with the API key above
    batch_size = 32
    N = len(texts)

    embed_matrix=[]
    for i in range(0, N, batch_size):
        print_progress_bar(i, N)
        embed_batch=texts[i:min([N, i + batch_size])]
        embeddings = client.embeddings.create(input=embed_batch, model=model)
        for j in range(len(embed_batch)):
            embed_matrix.append(embeddings.data[j].embedding)
    print("")
    embed_matrix=np.array(embed_matrix)
    #dim = len(embeddings['data'][0]['embedding'])
    #embed_matrix = np.zeros([N, dim])
    #for i in range(N):
    #    embed_matrix[i, :] = embeddings['data'][i]['embedding']

    #update cache
    if use_cache:
        cache(cache_key,embed_matrix)

    #return results
    return embed_matrix


def reduce_embedding_dimensionality(embeddings,num_dimensions,method="UMAP",use_cache=True,n_neighbors=None):
    if isinstance(embeddings,list):
        #embeddings is a list of embedding matrices => pack all to one big matrix for joint dimensionality reduction
        all_emb = np.concatenate(embeddings, axis=0)
    else:
        all_emb = embeddings
    def unpack(x,embeddings_list):
        row = 0
        result = []
        for e in embeddings_list:
            N = e.shape[0]
            result.append(x[row:row + N])
            row += N
        return result

    cache_key=(str(all_emb.tobytes())+str(num_dimensions)+method+str(n_neighbors)).encode('utf-8')
    if use_cache:
        cached_result=load_cached(cache_key)
        if cached_result is not None:
            print("Loaded dimensionality reduction results from cache, hash ", cache_hash(cache_key))
            if isinstance(embeddings, list):
                return unpack(cached_result,embeddings)
            else:
                return cached_result
    from sklearn.manifold import MDS
    from sklearn.manifold import TSNE
    import umap
    from sklearn.decomposition import PCA
    #cosine distance
    all_emb=all_emb/np.linalg.norm(all_emb,axis=1,keepdims=True)

    if method=="MDS":
        mds=MDS(n_components=num_dimensions,dissimilarity="precomputed")
        cosine_sim = np.inner(all_emb, all_emb)
        cosine_dist = 1 - cosine_sim
        x=mds.fit_transform(cosine_dist)
    elif method=="TSNE":
        tsne=TSNE(n_components=num_dimensions)
        x=tsne.fit_transform(all_emb)
    elif method=="PCA":
        pca=PCA(n_components=num_dimensions)
        x=pca.fit_transform(all_emb)
    elif method=="UMAP":
        if n_neighbors is None:
            n_neighbors=5
        reducer = umap.UMAP(n_components=num_dimensions,metric='cosine',n_neighbors=n_neighbors)
        x=reducer.fit_transform(all_emb)
    else:
        raise Exception("Invalid dimensionality reduction method!")

    if use_cache:
        cache(cache_key,x)

    if isinstance(embeddings, list):
        return unpack(x,embeddings)
    return x


#some quick test methods
def test_embeddings():
    texts=["queen","king","man","woman"]
    embeddings=embed(texts)
    embeddings=reduce_embedding_dimensionality(embeddings,method="PCA",num_dimensions=2)
    df=pd.DataFrame()
    df["texts"]=texts
    df["x"]=embeddings[:,0]
    df["y"]=embeddings[:,1]
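    # Keep a reference to the figure and show it; inside a function the result is otherwise discarded
    fig=px.scatter(df,
                #width=1300, height=1000, #The codes should be approximately 1:1 aspect ratio, but need space for the color bar
                x="x",
                y="y",
                hover_name="texts")
    fig.show()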

def test_batched_prompting():
    prompts=["what is 1+1?","what is 1+2?","what is 1+3?"]
    print(query_LLM("davinci-002",prompts,max_tokens=20))



### Exercise: Generating game ideas and comparing different LLMs

1. Run the code below 5 times to get alternative generations. Increase the "max_tokens" parameter if needed.
2. Select "davinci-002" from the "model" menu. davinci-002 is the non-finetuned base model, whereas gpt-3.5-turbo-instruct is the finetuned version.
3. Run the code again 5 times.
4. Reduce the temperature to 0.3.
5. Run the code again 5 times.



In [None]:
###################################################################
#Parameters that you can edit
model="gpt-3.5-turbo-instruct" # @param ["davinci-002", "gpt-3.5-turbo-instruct"] {allow-input: false}
temperature=1 # @param {type:"slider", min:0, max:1, step:0.1}
max_tokens=200 # @param {type:"slider", min:10, max:1000, step:10}

prompt="""A list of novel and experimental indie game ideas:

---

An FPS game where time only moves when the player moves or performs actions. This allows Matrix-style slow-motion gun ballet, and transforms real-time action into a puzzle.

---

A Sokoban-style game where the player pushes around blocks that are variables, operators, and definitions of a programming language. Blocks that connect to each other form statements that allow the player to define and alter the game's rules. For example, the player can connect "floor", "is", and "lava" to make the floor deadly.

---

"""

###################################################################
#Code that you should not touch if you don't know what you're doing.
#Feel free to view it, though.


#Query the OpenAI completions API
response = client.completions.create(
  model=model,
  prompt=prompt,
  temperature=temperature,
  max_tokens=max_tokens,
)

#Print the response
print(response.choices[0].text)


**What you should observe above:**

* gpt-3.5-turbo-instruct obeys instructions better than davinci-002, but the generated ideas can be more repetitive and generic.
* Especially with davinci-002, quality decreases over time if you generate many ideas in one go. This is because each generated token becomes part of the prompt for the next generated token. In other words, *the generated ideas become new few-shot examples for the next ideas*, and random errors gradually accumulate and make the output diverge. This is common to all LLMs, just more noticeable with base models like davinci-002.
* With davinci-002, reducing the temperature can help in balancing diversity and quality. The parameter controls how random the generation is. Temperature=1 is the default, where the model picks every next token randomly according to their probabilities. Temperature=0 means the model only picks the most probable tokens and disregards any alternatives.
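
The effect of temperature can be illustrated with a small numerical sketch. It assumes the common textbook formulation in which the model's raw token scores (logits) are divided by the temperature before a softmax turns them into probabilities; the exact implementation inside OpenAI's models is not documented here, and the scores below are made up for illustration.

```python
import numpy as np

def token_probabilities(logits, temperature):
    """Convert raw token scores (logits) into sampling probabilities."""
    logits = np.asarray(logits, dtype=float)
    if temperature == 0:
        # Limit case: all probability mass goes to the most probable token
        probs = np.zeros_like(logits)
        probs[np.argmax(logits)] = 1.0
        return probs
    scaled = logits / temperature
    scaled = scaled - scaled.max()  # subtract the maximum for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

# Hypothetical scores for three candidate next tokens
logits = [2.0, 1.0, 0.0]
for temperature in [1.0, 0.3, 0]:
    print(temperature, np.round(token_probabilities(logits, temperature), 3))
```

Lower temperatures concentrate probability on the top-scoring token, which is why the output becomes less random.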

**Why is davinci-002 less reliable but more diverse?**

davinci-002 is the so-called "base" OpenAI model, trained on very large and diverse data. gpt-3.5-turbo-instruct has been finetuned to follow instructions, which tends to increase generation quality but reduce diversity and produce the recognizably boring voice of ChatGPT. Finetuning is typically formulated as an optimization problem for which the optimal solution is to always give the single best answer to a particular request or question. The finetuning datasets are also smaller, which inevitably limits diversity.

**Key takeaway:**

Different models have different strengths and weaknesses. If correctness is a priority - for example, in generating code - you typically need a finetuned model. But if the finetuned model's output is too boring or vanilla, it may be good to try a base model.

### Exercise: Few-shot prompting in chat mode
Try the above, but with a chat model such as ChatGPT, Anthropic Claude, or Google Gemini.

You can access these in various ways for free:
- ChatGPT is available in the Edge browser if you are signed into a Microsoft Account
- ChatGPT also has a free trial if accessed via https://chat.openai.com
- Google Gemini is available here in Colab (top-right corner of the window) and at https://gemini.google.com/app, although this is only the less capable variant. Gemini Pro requires a subscription.
- Claude is available for free at: https://www.anthropic.com/pricing

In this case, you have to format your input a bit differently, such as:

```
Please give me a novel and experimental indie game idea.

Here are some examples of how to describe the idea in a way that clearly explains the core mechanics and/or design hook:

- An FPS game where time only moves when the player moves or performs actions. This allows Matrix-style slow-motion gun ballet, and transforms real-time action into a puzzle.

- A Sokoban-style game where the player pushes around blocks that are variables, operators, and definitions of a programming language. Blocks that connect to each other form statements that allow the player to define and alter the game's rules. For example, the player can connect "floor", "is", and "lava" to make the floor deadly.

```

**What you should observe:**

Again, each model has its own default flavor/tone of writing. Typically, as the chat models are heavily finetuned, it can be hard to break away from their default tone. Thus, using a non-chat base model can be a better alternative, depending on the output you want.
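
If you would rather send the chat-style prompt from code than paste it into a chat window, the following sketch shows one possible way to package it. `build_chat_messages` is a hypothetical helper written for this illustration; the role/content message format is the one used by the chat helper earlier in this notebook, and the example texts are shortened.

```python
# Sketch: package a request plus few-shot examples as chat messages.
# build_chat_messages is a hypothetical helper, not part of any library.
def build_chat_messages(request, intro, examples, system_message="You are a helpful assistant."):
    """Return a message list in the role/content format used by chat models."""
    bullets = "\n\n".join(f"- {example}" for example in examples)
    user_text = f"{request}\n\n{intro}\n\n{bullets}"
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_text},
    ]

messages = build_chat_messages(
    request="Please give me a novel and experimental indie game idea.",
    intro="Here are some examples of how to describe the idea:",
    examples=[
        "An FPS game where time only moves when the player moves or performs actions.",
        "A Sokoban-style game where the player pushes around blocks that define the game's rules.",
    ],
)
print(messages[1]["content"])

# With the `client` object created earlier in this notebook, the request could be sent as:
# response = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
# print(response.choices[0].message.content)
```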

### Exercise: Craft your own few-shot examples

Above, the few-shot examples describe the indie games Superhot and Baba Is You in a way that makes the core mechanics clear. These award-winning indie games are some of my own personal favorites.

Now, your task is to replace the examples with descriptions of some other games, either real or fictional.

If your mind draws a blank, here are some sources for finding good examples:
- The Steam store descriptions of Steam's [hidden gems](https://steam250.com/hidden_gems). For each game, click on the Steam symbol that takes you to the store page.
- The descriptions of [Independent Games Festival finalists](https://igf.com/finalists-and-winners). For each finalist game, click to see the IGF entry page, e.g. [https://igf.com/entry/2024/1000xresist](https://igf.com/entry/2024/1000xresist)

Instead of games, you can also try creating other texts. Note that you will also need to rewrite the prompt start, accordingly. Here are some sources for examples:
- Generating 6-word stories based on examples from [r/sixwordstories](https://www.reddit.com/r/sixwordstories/top/?t=all)
- To get inspiration for longer stories, start by generating good opening sentences using [these examples](https://www.penguin.co.uk/articles/2022/04/best-first-lines-in-books).
- More inspiration for stories and aphorisms can be found via Kindle's [most highlighted passages](https://www.amazon.com/amazonbookreview/read/B09KVGXNQQ?)

In [None]:
###################################################################
#Parameters that you can edit
model="gpt-3.5-turbo-instruct" # @param ["davinci-002", "gpt-3.5-turbo-instruct"] {allow-input: false}
temperature=1 # @param {type:"slider", min:0, max:1, step:0.1}
max_tokens=200 # @param {type:"slider", min:10, max:1000, step:10}

prompt="""A list of novel and experimental indie game ideas:

---

An FPS game where time only moves when the player moves or performs actions. This allows Matrix-style slow-motion gun ballet, and transforms real-time action into a puzzle.

---

A Sokoban-style game where the player pushes around blocks that are variables, operators, and definitions of a programming language. Blocks that connect to each other form statements that allow the player to define and alter the game's rules. For example, the player can connect "floor", "is", and "lava" to make the floor deadly.

---

"""


###################################################################
#Code that you should not touch if you don't know what you're doing.
#Feel free to view it, though.


#Query the OpenAI completions API
response = client.completions.create(
  model=model,
  prompt=prompt,
  temperature=temperature,
  max_tokens=max_tokens,
)

#Print the response
print(response.choices[0].text)


### Exercise: Curate your generated content and examples

Try this:
- Generate at least 10 ideas or stories
- Pick the best generation(s) and add them to your few-shot examples
- Repeat the steps above until you have between 5 and 10 examples. Do you notice an increase in the quality of generations?
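
If you prefer to do the bookkeeping in code rather than by copy-pasting, the curation loop could be sketched roughly as follows. The helper names and the placeholder texts are assumptions made for this illustration; the sketch also assumes that the model separates its ideas with `---`, as in the prompts of this notebook.

```python
# Sketch of the curation loop: split a raw generation into items, then add picks to the examples.
# split_generations and add_to_examples are hypothetical helpers for illustration.
def split_generations(text, separator="---"):
    """Split a raw continuation into individual, non-empty items."""
    return [item.strip() for item in text.split(separator) if item.strip()]

def add_to_examples(examples, picked):
    """Return a new example list with the picked items appended, skipping duplicates."""
    result = list(examples)
    for item in picked:
        if item not in result:
            result.append(item)
    return result

# Placeholder text standing in for an actual model output
raw_output = "Idea A about gardens.\n\n---\n\nIdea B about trains.\n\n---\n\n"
generated = split_generations(raw_output)

examples = ["An FPS game where time only moves when the player moves."]
examples = add_to_examples(examples, [generated[1]])  # keep only the item you liked best
print(examples)
```

The human judgment stays in the middle of the loop: the code only stores what you picked.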


In [None]:
###################################################################
#Parameters that you can edit
model="gpt-3.5-turbo-instruct" # @param ["davinci-002", "gpt-3.5-turbo-instruct"] {allow-input: false}
temperature=1 # @param {type:"slider", min:0, max:1, step:0.1}
max_tokens=200 # @param {type:"slider", min:10, max:1000, step:10}

prompt="""A list of novel and experimental indie game ideas:

---

An FPS game where time only moves when the player moves or performs actions. This allows Matrix-style slow-motion gun ballet, and transforms real-time action into a puzzle.

---

A Sokoban-style game where the player pushes around blocks that are variables, operators, and definitions of a programming language. Blocks that connect to each other form statements that allow the player to define and alter the game's rules. For example, the player can connect "floor", "is", and "lava" to make the floor deadly.

---

"""


###################################################################
#Code that you should not touch if you don't know what you're doing.
#Feel free to view it, though.


#Query the OpenAI completions API
response = client.completions.create(
  model=model,
  prompt=prompt,
  temperature=temperature,
  max_tokens=max_tokens,
)

#Print the response
print(response.choices[0].text)


### Exercise: Guide the generation in a specific direction
Try combining few-shot examples with the following:
- Rewrite the prompt start to be more specific, e.g., add adjectives
- Guide the new generations by writing the start of the generated text yourself
- Test both base models and finetuned models (davinci-002 and gpt-3.5-turbo-instruct)

For example, you can try the prompt below and then rewrite the last line by combining some other game genre and emotion, such as "Space Invaders but about heartbreak".

In [None]:
###################################################################
#Parameters that you can edit
model="gpt-3.5-turbo-instruct" # @param ["davinci-002", "gpt-3.5-turbo-instruct"] {allow-input: false}
temperature=1 # @param {type:"slider", min:0, max:1, step:0.1}
max_tokens=100 # @param {type:"slider", min:10, max:1000, step:10}

prompt="""A list of novel, witty, and experimental indie game ideas that clearly describe the core design hook:

---

An FPS game where time only moves when the player moves or performs actions. This allows Matrix-style slow-motion gun ballet, and transforms real-time action into a puzzle.

---

A Sokoban-style game where the player pushes around blocks that are variables, operators, and definitions of a programming language. Blocks that connect to each other form statements that allow the player to define and alter the game's rules. For example, the player can connect "floor", "is", and "lava" to make the floor deadly.

---

An FPS game about love
"""


###################################################################
#Code that you should not touch if you don't know what you're doing.
#Feel free to view it, though.


#Query the OpenAI completions API
response = client.completions.create(
  model=model,
  prompt=prompt,
  temperature=temperature,
  max_tokens=max_tokens,
)

#Print the response
print(response.choices[0].text)


### Combinatorial few-shot prompting
A common creativity method for humans is to pick random elements and recombine them. For example, the VNA game ideation method combines a random Verb, Noun, and Adjective.

This can help one think outside the box, and it can also be useful to force an LLM to generate more diverse ideas, especially when using a finetuned model that by default produces high quality but less diverse generations.

Here, we extend few-shot prompting with randomly combined modifiers: We prompt game ideas by *systematically combining genres and mechanics*.

**Exercises:**
* Run the code a few times and try different models. Try adding a third modifier, e.g., "Single Player, Multiplayer".

* Change the prompt start, modifiers, and examples to generate different kinds of games. For example, you could try combining classic games such as Space Invaders and Pac Man with different emotions such as sadness, love, grief, heartbreak. If it's hard to come up with the few-shot examples, you can try prompting multiple times with fewer or no examples and then add the best results as examples.

* Generate story ideas that combine character archetypes from https://thescriptlab.com/blogs/38406-250-character-archetypes-to-use-in-your-screenplay/ with a type of conflict the characters face.
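
To explore the combinations systematically instead of editing the last prompt line by hand, one could generate a prompt for every genre and mechanic pair. The sketch below uses Python's `itertools.product`; the modifier lists, the shortened base prompt, and the helper name `combinatorial_prompts` are illustrative assumptions.

```python
import itertools

# Hypothetical modifier lists; replace them with your own
genres = ["Dating sim", "Platformer", "Puzzle"]
mechanics = ["Time manipulation", "Rule manipulation"]

def combinatorial_prompts(base_prompt, modifier_lists, separator="\n\n---\n\n"):
    """Create one prompt per combination, each ending with a 'A + B:' label to be continued."""
    prompts = []
    for combo in itertools.product(*modifier_lists):
        label = " + ".join(combo)
        prompts.append(f"{base_prompt}{separator}{label}:")
    return prompts

# Shortened base prompt: a title line followed by labeled few-shot examples
base_prompt = (
    "A list of novel, witty, and experimental indie game ideas:"
    "\n\n---\n\n"
    "FPS + Time manipulation: An FPS game where time only moves when the player moves."
)
prompts = combinatorial_prompts(base_prompt, [genres, mechanics])
print(len(prompts), "prompts generated")
print(prompts[0])

# The list could then be passed to the query_LLM helper defined earlier in this notebook:
# ideas = query_LLM(model, prompts, max_tokens=200, temperature=1)
```

Adding a third modifier list, such as single player versus multiplayer, only requires appending it to the list passed as `modifier_lists`.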


In [None]:
###################################################################
#Parameters that you can edit
model="gpt-3.5-turbo-instruct" # @param ["davinci-002", "gpt-3.5-turbo-instruct"] {allow-input: false}
temperature=1 # @param {type:"slider", min:0, max:1, step:0.1}
max_tokens=200 # @param {type:"slider", min:10, max:1000, step:10}

prompt="""A list of novel, witty, and experimental indie game ideas that clearly describe the core design hook:

---

FPS + Time manipulation: An FPS game where time only moves when the player moves or performs actions. This allows Matrix-style slow-motion gun ballet, and transforms real-time action into a puzzle.

---

Puzzle + Rule manipulation: A Sokoban-style game where the player pushes around blocks that are variables, operators, and definitions of a programming language. Blocks that connect to each other form statements that allow the player to define and alter the game's rules. For example, the player can connect "floor", "is", and "lava" to make the floor deadly.

---

Platformer + Camera manipulation: An FPS puzzle game with a "portal gun" that can create portals between two flat planes. For example, the player can create one portal on the floor and the second on the ceiling and push an object so that it falls down the floor portal and drops from the ceiling portal on top of some target.

---

Dating sim + Time manipulation:
"""

###################################################################
#Code that you should not touch if you don't know what you're doing.
#Feel free to view it, though.


#Query the OpenAI completions API
response = client.completions.create(
  model=model,
  prompt=prompt,
  temperature=temperature,
  max_tokens=max_tokens,
)

#Print the response
print(response.choices[0].text)
