# Introduction to Prompt Engineering

This Jupyter notebook serves as an introduction to prompt engineering in the context of the workshop example provided.

This notebook briefly covers:
1. Setting up Docker & the back-end component (very briefly)
2. Setting up the env for the backend component
3. Configuring how the OpenAI API is called using `config.yaml` file
4. Zero-shot prompting
5. Adding context
6. Additional concepts related to LLMs
7. Additional ideas

For ease of use, this notebook is also executable as a stand-alone Jupyter notebook.

This notebook is designed to expose participants to many facets of LLMs (in simplified form), and does not require them to finish each of the tasks sequentially; feel free to flip through the sections and check out any external references included in the notebook based on what interests you.

## Pre-requisite: Launching the microservice

While it is assumed that participants are already able to launch the front-end + back-end, this section is included as a refresher / in case this assumption is not valid.

### Using Docker
The recommended way of launching the microservice is via Docker. Docker allows you to spin up [containers](https://www.docker.com/resources/what-container/) (you can think of them as applications running in a virtual machine) without having to worry about the dependencies (e.g. installing NodeJS, Python etc directly into your computer environment).

For example, you can use Docker to spin up the backend microservice (which uses NodeJS and optionally Python) without having to worry about what version of NodeJS/Python you are running, or the libraries you need to install.

To install Docker:

If you are only using Docker for this workshop, you can use Docker Desktop, which helps you set up and manage a Docker environment on your computer that you can interact with using the Docker CLI:
- Windows: https://docs.docker.com/desktop/install/windows-install/
- Mac: https://docs.docker.com/desktop/install/mac-install/

If you use Docker Desktop for your work, you will need a license. Most of us don't have a license, so you will have to use one of the open source alternatives to Docker Desktop in order to use Docker:

- Rancher Desktop: https://rancherdesktop.io/
- Podman Desktop: https://iongion.github.io/podman-desktop-companion/
- Colima (Mac / Linux only): https://github.com/abiosoft/colima

### Running the microservice directly on your machine
- You will need to install [NodeJS](https://nodejs.org/en) on your machine.
- Run `npm install` in the directory of `/backend` to install the required dependencies.
- You may also need Python 3; if so, run `pip install openai==0.27.9` to install the Python `openai` module.
- If you cannot find `pip`, you will either need to re-install Python and check the box to add Python to your PATH or install pip on your command line.



## Setting up the environment
In order to send requests to OpenAI, you will need to configure your API key. Replace `<OPENAI-API-KEY>` in `/backend/.env` with the key you are provided.
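If you want to reuse the same key in a stand-alone Python script, one option is to read it from the `.env` file instead of pasting it into code. The sketch below is a minimal stdlib parser; the variable name `OPENAI_API_KEY` and the `KEY=value` layout are assumptions, so check `/backend/.env` for the actual contents.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env_file(path: str) -> dict:
    """Read a .env file from disk; returns {} if the file is missing."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    return parse_env_text(env_path.read_text(encoding="utf-8"))


# Hypothetical file contents, for illustration only
sample = "# OpenAI settings\nOPENAI_API_KEY=<OPENAI-API-KEY>\n"
env = parse_env_text(sample)

# Stays False until the placeholder has been replaced with a real key
key_is_set = env.get("OPENAI_API_KEY", "") not in ("", "<OPENAI-API-KEY>")
print("API key configured:", key_is_set)
```

A check like `key_is_set` is a cheap way to fail early with a clear message rather than waiting for an authentication error from the API.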

## Running the backend component and backend endpoints of interest
1. To start/stop the service, you can follow the instructions provided in the `/backend` readme. If debugging your setup, you may wish to run the container using `docker compose up --build` which allows you to view any console logs directly via the CLI, and you can stop it by CTRL-C or closing the terminal.
2. When the service is running, access the Swagger UI of the microservice (basically an interface for you to interact with the API of the microservice) via http://localhost:9000/swagger/
3. The endpoints of interest are:
  1. Create Todo: You will first need to create a Todo object with a description. The description will then be used as part of the prompt context for OpenAI to generate content. Upon executing an API call to the endpoint, the ID for the todo object is returned as part of the response.
  2. Generate Todo Content: This endpoint accepts an id of the todo object that you want to generate content for. In this workshop example, we assume that the objective (e.g. "generate me a todo list from this description") is fixed and pre-determined by the Data Scientists / Developers for simplicity.

## Control flow of generating content using OpenAI API
1. Calling the generate todo content endpoint calls `generateTodoContent` in `/backend/routes/methods.ts`.
2. This in turn calls `generateContent` in `/backend/logic/prompt.ts`.
3. In `generateContent`, you are given the option to configure how you want to call OpenAI; if you are more familiar with NodeJS, you can change `generateContent` to call the `chatCompletion` function. If you are more familiar with Python, you can leave the implementation as-is (`generateContent` calls `pythonCompletion`), which triggers the code in `openai_bridge.py`.
    1. Note: if you intend to explore this at a deeper level, you may find that examples related to machine learning / prompt engineering tend to be in python, but for the purposes of this notebook, both are fine.
4. Ultimately, the code will eventually call OpenAI API to generate text using the `description` field in the todo object and store the output from OpenAI in the todo object's `content` field.
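The flow above can be sketched in a few lines of Python. This is an illustration only, not the application's actual code: the in-memory `todos` store, the function names, and the `fake_completion` stub are all hypothetical stand-ins for the real endpoints and the OpenAI call.

```python
from typing import Callable, Dict

# In-memory stand-in for the application's todo store (hypothetical)
todos: Dict[int, dict] = {}


def create_todo(description: str) -> int:
    """Create a todo and return its ID, like the Create Todo endpoint."""
    todo_id = len(todos) + 1
    todos[todo_id] = {"description": description, "content": None}
    return todo_id


def generate_todo_content(todo_id: int, complete: Callable[[str], str]) -> dict:
    """Send the description to the LLM and store the reply in `content`."""
    todo = todos[todo_id]
    todo["content"] = complete(todo["description"])
    return todo


def fake_completion(description: str) -> str:
    """Offline stub standing in for the real OpenAI call."""
    return f"1. Break down: {description}"


todo_id = create_todo("Learn the guitar")
print(generate_todo_content(todo_id, fake_completion))
```

Passing the completion function in as an argument keeps the flow testable offline; the real application would call OpenAI at that point.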

## Basic configuration

The application is designed so that you can change its behaviour by editing only the `config.yaml` file, without touching the implementation. The `config.yaml` file exposes the parameters below; their descriptions are quoted verbatim from the [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create):

```
# Model Configuration
model: gpt-3.5-turbo # Keep this as-is

# What sampling temperature to use, between 0 and 2.
# Higher values like 0.8 will make the output more random,
# while lower values like 0.2 will make it more focused and deterministic.
# We generally recommend altering this or top_p but not both.
temperature: 1

# An alternative to sampling with temperature, called nucleus sampling,
# where the model considers the results of the tokens with top_p probability mass.
# So 0.1 means only the tokens comprising the top 10% probability mass are considered.
# We generally recommend altering this or temperature but not both.
topP: 1

# Positive values penalize new tokens based on whether they appear in the text so far,
# increasing the model's likelihood to talk about new topics.
presencePenalty: 0

# Positive values penalize new tokens based on their existing frequency in the text so far,
# decreasing the model's likelihood to repeat the same line verbatim.
frequencyPenalty: 0

# The maximum number of tokens to generate in the chat completion.
maxTokens: 200

# Prompt Configuration
systemPrompt: |
  <You can modify the system prompt freely as you see fit>
```
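To make the link between the config keys and the API call concrete, here is a minimal sketch that maps the camelCase keys above to the snake_case parameter names used by the Chat Completions API and checks that the values are in range. The mapping function is an assumption about how such a bridge could work, not the application's actual implementation; the penalty range of -2 to 2 comes from the OpenAI API reference, and the 0 to 1 range for `top_p` follows from it being a probability mass.

```python
# Values mirror the defaults shown in config.yaml above
config = {
    "model": "gpt-3.5-turbo",
    "temperature": 1,
    "topP": 1,
    "presencePenalty": 0,
    "frequencyPenalty": 0,
    "maxTokens": 200,
}

# camelCase config key -> snake_case API parameter name
KEY_MAP = {
    "model": "model",
    "temperature": "temperature",
    "topP": "top_p",
    "presencePenalty": "presence_penalty",
    "frequencyPenalty": "frequency_penalty",
    "maxTokens": "max_tokens",
}

# Allowed (low, high) range for each numeric sampling parameter
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
}


def to_api_params(cfg: dict) -> dict:
    """Rename known keys and reject out-of-range sampling values."""
    params = {KEY_MAP[key]: value for key, value in cfg.items() if key in KEY_MAP}
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            raise ValueError(f"{name}={params[name]} is outside [{low}, {high}]")
    return params


print(to_api_params(config))
```

Validating locally gives a clearer error than letting the API reject the request.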

# Zero-shot prompting
Zero-shot prompts usually contain the following parts, not necessarily in this order:
1. The context (e.g. the Todo description)
2. The instruction(s) (e.g. I want a list of 5 items)
3. Any additional constraints of the instructions (e.g. The list should be concise)
4. Any other formatting that demarcates the sections (e.g. "Context: ... Query: ... Output: ")

In zero shot prompting, the model is not provided with additional examples to guide the type of response it is expected to produce.
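As a rough illustration of the four parts listed above, the sketch below assembles a zero-shot prompt from a context, an instruction and optional constraints, using labels to demarcate the sections. The function name and the exact section labels are hypothetical choices for this example, not a required format.

```python
from typing import List


def build_zero_shot_prompt(context: str, instruction: str, constraints: List[str]) -> str:
    """Join the prompt parts into labelled sections separated by blank lines."""
    sections = [
        f"Context: {context}",
        f"Instruction: {instruction}",
    ]
    if constraints:
        bullet_list = "\n".join(f"- {item}" for item in constraints)
        sections.append(f"Constraints:\n{bullet_list}")
    # A trailing label signals where the model's answer should begin
    sections.append("Output:")
    return "\n\n".join(sections)


prompt = build_zero_shot_prompt(
    context="I want to start learning how to play the guitar.",
    instruction="I want a list of 5 items.",
    constraints=["The list should be concise"],
)
print(prompt)
```

Note that no example answers are included anywhere in the prompt, which is what makes it zero-shot.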

In the context of the workshop application, the prompt sent to OpenAI API is in this format:
1. `systemPrompt` from `config.yaml` is used as the system message.
2. `description` from the todo object is used as the user message.

It uses the Chat Completions API as documented [here](https://platform.openai.com/docs/guides/gpt/chat-completions-api); the relevant documentation is quoted verbatim for your reference:

> The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. However note that the system message is optional and the model’s behavior without a system message is likely to be similar to using a generic message such as "You are a helpful assistant."

> The user messages provide requests or comments for the assistant to respond to. Assistant messages store previous assistant responses, but can also be written by you to give examples of desired behavior.
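In code, this pairing of system and user messages is just a list of dictionaries with `role` and `content` keys. The helper below is a small sketch (the name `build_messages` is hypothetical) that also reflects the point in the quote that the system message is optional.

```python
from typing import List, Optional


def build_messages(user_prompt: str, system_prompt: Optional[str] = None) -> List[dict]:
    """Build the `messages` list expected by the Chat Completions API."""
    messages = []
    if system_prompt:
        # The system message sets the assistant's behaviour; it may be omitted
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


messages = build_messages(
    user_prompt="I want to start learning how to play the guitar.",
    system_prompt="You are a helpful assistant that suggests steps to achieve to-dos.",
)
print(messages)
```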


In [1]:
# @title Parameters
# @markdown You can execute the code in this jupyter notebook without having to integrate it with the application by setting the parameters here (and directly modifying the code in this notebook).

# @markdown To execute a codeblock, open it in Google Colab & press the play icon beside it.

# @markdown If you change the config, you will have to run the codeblock(s) again to update the config.


OPENAI_API_KEY = '<OPENAI-API-KEY>' # @param {type:"string"}
model = "gpt-3.5-turbo" # @param {type:"string"}
temperature = 1 # @param {type: "raw"}
top_p = 1 # @param {type: "raw"}
presence_penalty = 0 # @param {type: "raw"}
frequency_penalty = 0 # @param {type: "raw"}
max_tokens = 200 # @param {type: "raw"}

# @markdown In the context of the application, "systemPrompt" refers to the prompt you configure for OpenAI, "userPrompt" refers to the Todo item's description by default.
system_prompt = "You are a helpful assistant that suggest steps to achieving to-dos of a user. Read the following task and generate 5 simple steps to achieve it." # @param {type:"string"}
user_prompt = "I want to start learning how to play the guitar, but I am a complete beginner at music." # @param {type:"string"}

## Code
The code below is almost equivalent to the one in `openai_bridge.py`.

In [2]:
%%capture pip_output
# Suppress pip output using %%capture to make it neat
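# Pin the version: the code below uses openai.ChatCompletion, which was removed in openai>=1.0
!pip install openai==0.27.9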

In [8]:
import os
import json
import openai
import pprint

def call_openai_api(payload: dict) -> str:
    """
    Call the OpenAI API and return the generated text.
    May raise an exception if the call fails (e.g. invalid API key).
    Reference: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_format_inputs_to_ChatGPT_models.ipynb
    """
    response = openai.ChatCompletion.create(
        model=payload["model"],
        messages=[
            {"role": "system", "content": payload["system_prompt"]},
            {"role": "user", "content": payload["user_prompt"]},
        ],
        temperature=payload["temperature"],
        top_p=payload["top_p"],
        frequency_penalty=payload["frequency_penalty"],
        presence_penalty=payload["presence_penalty"],
        max_tokens=payload["max_tokens"]
    )

    return response['choices'][0]['message']['content']

def run_with_params():
    """
    Call the API using the model_params defined in the global scope.
    """
    # OpenAI Config
    openai.api_key = OPENAI_API_KEY
    model_params = {
        "model": model,
        "temperature": temperature,
        "top_p": top_p,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
        "max_tokens": max_tokens,
        "system_prompt": system_prompt,
        "user_prompt": user_prompt
    }
    # Pretty print the model params
    pp = pprint.PrettyPrinter(indent=4)
    pp.pprint(model_params)

    # Use the input
    output = call_openai_api(model_params)

    # Output to stdout
    print(output)

    return output

In [7]:
output = run_with_params()

{   'frequency_penalty': 0,
    'max_tokens': 200,
    'model': 'gpt-3.5-turbo',
    'presence_penalty': 0,
    'system_prompt': 'You are a helpful assistant that suggest steps to '
                     'achieving to-dos of a user. Read the following task and '
                     'generate 5 simple steps to achieve it.',
    'temperature': 1,
    'top_p': 1,
    'user_prompt': 'I want to start learning how to play the guitar, but I am '
                   'a complete beginner at music.'}
1. Research and choose the right beginner guitar: Start by researching different types of guitars and their suitability for beginners. Consider factors such as size, price, and ease of playability. Once you have narrowed down your options, make a decision based on your preferences and budget.

2. Learn the basic guitar chords: Begin by learning some of the basic chords on the guitar, such as C, D, E, G, and A. Familiarize yourself with the finger positions and practice transitioning between these cho

## Improving the prompt: CO-STAR framework
Great, we managed to get OpenAI to return an appropriate (hopefully!) response based on what we sent it. Of course, there's still a lot of room for improvement, and many times when you're working with AI, your primary objective is to find ways to improve upon an existing solution (in this case, our baseline is the example prompt that has been provided).

There are several ways to improve the quality of the output from the LLM. One of the easiest ways to do so is to iteratively modify the prompt we use so that we can get a better response.

In the context of zero-shot prompting, we assume that we do not have any examples that we can use.

### CO-STAR framework
Launchpad (one of our product teams working with LLMs) proposes the CO-STAR framework for prompt crafting. CO-STAR stands for Context, Objective, Style, Tone, Audience and Response, and it is described in further detail in this [PDF](https://drive.google.com/drive/folders/15rnYzCv4O0iRY-FTQ890WuVyGpV_MgYY) on pages 27 - 41 (with examples).

For a concise summary, CO-STAR refers to:

**Context**: Any information that the AI needs to know to provide a good answer (e.g. domain knowledge on the topic, situation-specific information).

For now, we will focus on providing basic hard-coded context; in the next section we will cover how you can introduce context from other sources.

**Objective**: The task the AI should perform, clearly stated. Including the purpose of the output may also help the AI provide a better answer (e.g. "Summarise this document into a list of items that will then be used in a powerpoint slide").

**Style**: You may want to specify a persona / style that the AI should mimic (e.g. as a career coach, be concise etc).

**Tone**: The AI can also be influenced to respond in a particular tone (e.g. casual, professional, humorous).

**Audience**: The AI can tailor its response and choose words and phrases that the audience would understand or resonate with better.

**Response (Length & Format)**: If you already have something in mind for what your response should look like, include it in the prompt (e.g. "should be no more than 200 words", "in bullet point form", "a short excerpt followed by a bullet point list" etc).
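One way to keep the six CO-STAR parts explicit while iterating is to store them separately and render them into a single system prompt. The sketch below is only an illustration: the `CoStarPrompt` class and the `# LABEL #` delimiters are hypothetical choices, not something prescribed by the framework description above.

```python
from dataclasses import dataclass


@dataclass
class CoStarPrompt:
    context: str
    objective: str
    style: str
    tone: str
    audience: str
    response: str

    def render(self) -> str:
        """Render the non-empty parts as labelled sections, in CO-STAR order."""
        parts = [
            ("CONTEXT", self.context),
            ("OBJECTIVE", self.objective),
            ("STYLE", self.style),
            ("TONE", self.tone),
            ("AUDIENCE", self.audience),
            ("RESPONSE", self.response),
        ]
        return "\n\n".join(f"# {label} #\n{text}" for label, text in parts if text)


costar = CoStarPrompt(
    context="The user is a working adult who can only practise in the evenings.",
    objective="Suggest steps to achieve the user's to-do item.",
    style="An experienced music teacher.",
    tone="Encouraging and casual.",
    audience="Complete beginners at music.",
    response="A bullet point list of at most 5 short items.",
)
print(costar.render())
```

The rendered text could then be assigned to `system_prompt` before calling `run_with_params()`, so that each part can be changed independently between runs.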

## Example

In [10]:
# Update the prompts

# You can make the prompt less generic
system_prompt = "You are a musician with several years of teaching experience. Students with no more than a few weeks worth of experience often come to you with questions. You are to advise them using them short, actionable and concise bullet point items to help them improve their playing."

# For example, you can add more context to the description
user_prompt = "I want to start learning how to play the guitar as a hobby, but I am a complete beginner at music. I am a working adult with a 9-5 job on the weekdays and only have time to practice on the evenings or weekends."

# Run it again with the new prompt. run_with_params will execute the code using the global variables `system_prompt` and `user_prompt`
output = run_with_params()

{   'frequency_penalty': 0,
    'max_tokens': 200,
    'model': 'gpt-3.5-turbo',
    'presence_penalty': 0,
    'system_prompt': 'You are a musician with several years of teaching '
                     'experience. Students with no more than a few weeks worth '
                     'of experience often come to you with questions. You are '
                     'to advise them using them short, actionable and concise '
                     'bullet point items to help them improve their playing.',
    'temperature': 1,
    'top_p': 1,
    'user_prompt': 'I want to start learning how to play the guitar as a '
                   'hobby, but I am a complete beginner at music. I am a '
                   'working adult with a 9-5 job on the weekdays and only have '
                   'time to practice on the evenings or weekends.'}
- Start by learning basic chords like C, G, D, A, and E.
- Dedicate 15-30 minutes a day to practice. Consistency is key.
- Use online tutorials or a beginner's g

# Where to go from here?

After you have completed zero-shot prompt engineering, you've pretty much learnt the basics of prompt engineering; most of prompt engineering is about iteratively improving your prompt based on the current use case and on your preferences for what the response should look like.

Due to the workshop's time constraints, there is limited time to explore everything that LLMs have to offer. Hence, this notebook is designed to allow participants to choose what is most interesting or valuable to them, and follow up on these areas as they please.

The sections below provide 2 basic areas of exploration, and a last "catch-all" area for everything else:

1) Prompt

The prompt is the input that the AI receives. If your response does not require the AI to have domain-specific contextual knowledge (or any knowledge of events that happened after 2021), modifying the prompt is sometimes sufficient. It is the easiest and quickest thing to modify and iteratively refine, as it only involves adjusting the instructions sent to the AI.

This section provides additional exercises which you can follow to improve the prompt for your application.

2) Context

Many times you might find that the AI is limited by its lack of domain-specific knowledge, or it is unable to output an answer that is sufficiently specific because it does not have the appropriate context. Adding relevant context into the prompt can steer the AI toward producing a higher quality and more relevant answer.

This section provides additional exercises which you can follow to provide additional relevant context to the AI for your application.

3) Extensions

The field is growing at a very rapid pace and there are new ways to do things that were not possible just a few months ago. Some interesting new areas of exploration will be mentioned at the end of this notebook.

## What if I don't want to code but I want to read more about AI & LLMs
Aight I got your back bro

- Challenges and Applications of Large Language Models: https://arxiv.org/abs/2307.10169
- Technological, socio-economic and policy considerations: https://www.oecd.org/publications/ai-language-models-13d38f92-en.htm



## How can I use LLMs in my daily work?

It's cool to know how LLMs can be used to create complex systems, but perhaps it might be cooler to know that you can use LLMs (more specifically, ChatGPT) in your daily work.

Examples include:
- Summarisation
- Writing (email, article, essay, contents for slides etc)

It's generally good for:

Generative Tasks
- Question Answering
- Conversation / Roleplay
- Text Generation of Code / Essay
- Reasoning (usually requires some prompt engineering & guidance)

Discriminative Tasks
- Summarization
- Information Extraction
- Text Classification
- Clustering / Topic Modelling

As it is ultimately a text prediction system, it can also be used creatively for other purposes, although its efficacy in those use cases is not assured; it is best used with safeguards or humans verifying the output of the AI.


## What resources are available to me?
Unfortunately there are currently limited existing WOG resources to access LLMs (beyond a personal ChatGPT subscription plan / freely available alternatives like Bard):
1. Pair by OGP (requires whitelisting and there's a waitlist)
2. Launchpad by DSAID (also requires requesting for early access)

## What are some important caveats to keep in mind when using LLMs?
1. Ensure that the information you give OpenAI does not have to be kept secret. Anything that is passed as a prompt (or used to [fine-tune](https://platform.openai.com/docs/guides/fine-tuning) any OpenAI models) should be screened and verified to be of the correct data classification before usage.
2. Verify the correctness of the output before using it.

# 1) Improving the prompt

## Improving the prompt in the context of your application

Here are some guiding questions on how you can improve the prompt you currently have:

1. Who is the user of the application? What are their demographics?
2. What is the input most likely going to be like? Any domain in particular that would be brought up more often than others?
3. In what format would the user prefer the output to be in? What should the response focus on?

## Exercises

Here are some exercises you can try to improve the example prompts provided. I recommend trying them with yourself in mind; what kind of todos would you like to work on, and what kinds of prompts / information would be best for those types of items?


### Basic

1. Create a system prompt that outputs a plan (e.g. actionable steps and timeline) when the description of a todo item is used as the user prompt.
2. Modify the system prompt to include a persona (or description of a person) the plan should be tailored towards.
3. Create a system prompt that can take a long-term goal and break it down into smaller, shorter-term goals. Specify a finite number to reduce the number of goals generated.
4. Create a system prompt that can take a goal and suggest follow-up goals that would be relevant to the user.

### Advanced
The following exercises explore combining multiple types of prompts together, either by joining them to form a single more complicated prompt, or by chaining multiple prompts together via multiple API calls. They also look into integrating the prompts you have crafted into the application.

You can skip ahead to the next section [2) Adding context](https://colab.research.google.com/drive/1cjzqYL9XjJqfzkpa_4jKOu8Y3DkDThWr#scrollTo=LGavFmGhBJZE&line=1&uniqifier=1) if you're more interested in finding out how you can provide additional context to the AI.

5. Integrate the prompt created at step (2) into the workshop application, hard-coding the persona into the system prompt.
6. Integrate the prompt created at step (3) into the workshop application. This can be done by combining the prompt crafted at step (3) with the prompt crafted at step (2), or you can call the API multiple times with the separate prompts and concatenate the results into one.
7. Integrate the prompt created at step (4) into the workshop application.
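For the "chaining multiple prompts" option, a minimal sketch could look like the one below: the first call breaks a goal into sub-goals, and a second call is made per sub-goal to produce a plan, with the results concatenated. The prompts, the function names and the `fake_complete` stub are hypothetical; the stub only exists so the flow can be run offline.

```python
from typing import Callable

# (system_prompt, user_prompt) -> reply text
Complete = Callable[[str, str], str]


def plan_with_subgoals(goal: str, complete: Complete, max_goals: int = 3) -> str:
    """Chain two prompts: split the goal, then plan each sub-goal."""
    breakdown_prompt = (
        f"Break the user's long-term goal into at most {max_goals} "
        "short-term goals, one per line."
    )
    reply = complete(breakdown_prompt, goal)
    subgoals = [line.strip() for line in reply.splitlines() if line.strip()]
    subgoals = subgoals[:max_goals]  # Enforce the limit even if the model ignores it

    plan_prompt = "Write a short, actionable plan for the user's goal."
    plans = [complete(plan_prompt, subgoal) for subgoal in subgoals]

    # Concatenate the results of the separate calls into one output
    return "\n\n".join(f"{subgoal}\n{plan}" for subgoal, plan in zip(subgoals, plans))


def fake_complete(system_prompt: str, user_prompt: str) -> str:
    """Offline stub standing in for a real API call."""
    if system_prompt.startswith("Break"):
        return "Buy a guitar\nLearn three chords"
    return f"Plan for: {user_prompt}"


print(plan_with_subgoals("Learn to play the guitar", fake_complete))
```

In this notebook, `complete` could be a thin wrapper around `call_openai_api`. Keep in mind that each chained step is a separate API call, so latency and cost grow with the number of sub-goals.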


## Addendum: Few shot prompting
Reference: https://github.com/dair-ai/Prompt-Engineering-Guide/blob/main/guides/prompts-advanced-usage.md#few-shot-prompting

If you have a few examples at hand and the responses / prompts are relatively short, you can consider few-shot prompting to get better responses.

Unfortunately, in the context of the workshop application, few-shot prompting is not very applicable because the examples in this situation may be too lengthy. There is a fixed number of tokens you can pass to the API; for the base `gpt-3.5-turbo` model, the context window is ~4k tokens (4,096, covering both the input you pass to the API and the output it generates). Any more and the API call will fail. If your examples are too long, you may encounter issues that will need to be resolved by [splitting the input](https://python.langchain.com/docs/modules/data_connection/document_transformers/), calling the API multiple times with the split input, and then merging the responses into a single output (e.g. [langchain mapreduce](https://python.langchain.com/docs/modules/chains/document/map_reduce)).
  1. For simplicity (and time / API budget reasons), automated handling of long documents is not covered in this workshop.

However, it can still be useful to significantly improve the likelihood of getting a response in a format similar to that of the examples provided (in terms of length, structure, command of English etc)
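As a sketch of what few-shot prompting looks like with the Chat Completions message format, the examples can be supplied as alternating user and assistant messages placed before the real user prompt. The helper names and the example texts below are hypothetical, and `rough_token_count` uses the common rule of thumb of about 4 characters per token for English text; it is only a coarse estimate, and an exact count needs a tokenizer such as `tiktoken`.

```python
from typing import List, Tuple


def build_few_shot_messages(
    system_prompt: str, examples: List[Tuple[str, str]], user_prompt: str
) -> List[dict]:
    """Place (input, desired output) examples before the real user prompt."""
    messages = [{"role": "system", "content": system_prompt}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def rough_token_count(messages: List[dict]) -> int:
    """Heuristic only: roughly 4 characters per token for English text."""
    return sum(len(message["content"]) for message in messages) // 4


examples = [
    ("Learn to bake bread", "- Buy flour and yeast\n- Follow a beginner recipe"),
    ("Run a 5km race", "- Get running shoes\n- Start a couch-to-5k plan"),
]
messages = build_few_shot_messages(
    "Suggest short bullet point steps for the user's to-do.",
    examples,
    "Learn to play the guitar",
)
print(len(messages), "messages, roughly", rough_token_count(messages), "tokens")
```

Comparing the estimate (plus `max_tokens` for the reply) against the model's context window before sending is a simple guard against failed calls.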


# 2) Adding context

Some questions require additional context which the LLM may not have. With access to the relevant context, a model would also be able to produce higher quality responses.

There are a few ways to feed this contextual knowledge to the OpenAI model. One way is through [fine-tuning](https://platform.openai.com/docs/guides/fine-tuning) a model. In doing so, the model is able to train on more examples than can be fit in a prompt, which improves on few-shot learning.

For ease of implementation (and to avoid the need to collect & prepare data for fine-tuning etc), we will be focusing on the other way of providing context: by feeding them directly into the prompt.

In this notebook we briefly (and non-exhaustively) cover 2 ways in which this can be done:

1. **Use the AI to generate relevant context**, which is then fed back into the AI in a subsequent API call
2. **Retrieve the relevant context based on the user prompt**. The system can then use another API (e.g. database retrieval technique) to retrieve the context. The API call to the AI is then made with the context. This is often referred to as *retrieval-augmented generation* (or *retrieval-augmented question & answering* depending on your use case).

Some have also explored using the AI to indicate what contextual knowledge it needs to answer the question (see [AutoGPT](https://github.com/Significant-Gravitas/Auto-GPT)). The system can then retrieve the context (using the techniques mentioned in idea 2), then feed the information in a subsequent API call.
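To give a feel for the retrieval-based idea, here is a deliberately simple sketch that picks the most relevant snippet by counting shared words and places it in the prompt ahead of the question. The `DOCUMENTS` list, the function names and the word-overlap scoring are illustrative assumptions; a real system would typically query a database or search API instead.

```python
import re
from typing import List

# Hypothetical knowledge base, for illustration only
DOCUMENTS = [
    "The Basic Theory Test covers traffic rules and road signs.",
    "A guitar has six strings tuned E A D G B E.",
    "Practice sessions of 15 minutes a day build muscle memory.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_augmented_prompt(query: str, documents: List[str]) -> str:
    """Put the retrieved context in front of the user's question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"


print(build_augmented_prompt("How many strings does a guitar have?", DOCUMENTS))
```

The resulting text would be sent as the user message, so the model answers with the retrieved context in view.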


## Basic

### 2a) Hardcoding the context directly into the prompt / directly injecting the context from the user
As shown in the [example](https://colab.research.google.com/drive/1cjzqYL9XjJqfzkpa_4jKOu8Y3DkDThWr#scrollTo=xlKkVFRN-5L_&line=1&uniqifier=1) provided above, hardcoding / directly injecting the context from the user is a valid, minimal effort approach. If the users are the ones who possess the context needed by the AI, it may simply be easier to expose a new user input to allow users to provide additional context for the AI.

## Advanced

### 2b) Make the LLM generate additional relevant context before providing an answer

If the context required by the LLM was publicly available before Sep 2021, the LLM can be used to generate more specific pieces of context which can be used as part of the prompt. This is described in further detail in this [notebook](https://github.com/dair-ai/Prompt-Engineering-Guide/blob/main/guides/prompts-advanced-usage.md#generated-knowledge-prompting), which also includes other techniques.

Note: regarding the data cut-off date, the official year is 2021, but [there has been mention that this is not actually the case](https://community.openai.com/t/knowledge-cutoff-date-of-september-2021/66215/11). The technology is always evolving so as always, take care not to send any sensitive information to OpenAI.

A simple zero-shot example (unrelated to the workshop) is as follows:

In [12]:
# Get a baseline result first so you know if your solution improves upon it
# Note that the output from ChatGPT is stochastic (i.e. random).
# So in practice we will perform the comparison multiple times with different test cases
# But for the purposes of this workshop we'll keep it simple
user_input = "How do I get a driving license in Singapore?"

# Run a plain prompt to obtain a baseline answer
system_prompt = "You will be given a question. Answer the question concisely and accurately."
user_prompt = user_input
# run_with_params will execute the code using the global variables `system_prompt` and `user_prompt`
output = run_with_params()

{   'frequency_penalty': 0,
    'max_tokens': 200,
    'model': 'gpt-3.5-turbo',
    'presence_penalty': 0,
    'system_prompt': 'You will be given a question. Answer the question '
                     'concisely and accurately.',
    'temperature': 1,
    'top_p': 1,
    'user_prompt': 'How do I get a driving license in Singapore?'}
To get a driving license in Singapore, you need to follow these steps:
1. Enroll in a driving school approved by the Singapore Traffic Police.
2. Complete and pass the Basic Theory Test (BTT) at a driving center.
3. Attend and pass the practical driving lessons and Final Theory Test (FTT).
4. Upon passing the FTT, you will be issued a Provisional Driving License (PDL).
5. Book and pass the Practical Driving Test (PDT) to obtain a Qualified Driving License (QDL).
6. Once you have the QDL, you can apply for the actual driving license at any post office in Singapore.


In [9]:
# Example user input
user_input = "How do I get a driving license in Singapore?"

# Run a first prompt to obtain the contextual information needed to answer the question
system_prompt = "You will be given a question. Identify 3 of the most important pieces of contextual information necessary to provide an accurate answer."
user_prompt = user_input
# run_with_params will execute the code using the global variables `system_prompt` and `user_prompt`
context_for_2b = run_with_params()

{   'frequency_penalty': 0,
    'max_tokens': 200,
    'model': 'gpt-3.5-turbo',
    'presence_penalty': 0,
    'system_prompt': 'You will be given a question. Identify 3 of the most '
                     'important pieces of contextual information necessary to '
                     'provide an accurate answer.',
    'temperature': 1,
    'top_p': 1,
    'user_prompt': 'How do I get a driving license in Singapore?'}
1) Age requirement: It is important to know the minimum age requirement for obtaining a driving license in Singapore. As of March 2022, individuals must be at least 18 years old to apply for a driving license.

2) Residency status: The process and requirements for obtaining a driving license may vary depending on the individual's residency status in Singapore. Singapore citizens, permanent residents, and foreign residents may have different procedures to follow.

3) Validity of foreign driving license: If the individual already holds a driving license from another country

In [13]:
# Include the context as an additional part of the prompt sent to OpenAI
def call_openai_api_2b():
    # OpenAI Config
    openai.api_key = OPENAI_API_KEY
    system_prompt = "You will be given a question together with 3 pieces of relevant information. Answer the question concisely and accurately."
    model_params = {
        "model": model,
        "temperature": temperature,
        "top_p": top_p,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
        "max_tokens": max_tokens,
        "system_prompt": system_prompt,
        "context": context_for_2b,
        "user_input": user_input
    }
    # Pretty print the model params
    pp = pprint.PrettyPrinter(indent=4)
    pp.pprint(model_params)

    response = openai.ChatCompletion.create(
            model=model_params["model"],
            messages=[
                {"role": "system", "content": model_params["system_prompt"]},
                {"role": "user", "content": model_params["context"]},
                {"role": "user", "content": model_params["user_input"]},
            ],
            temperature=model_params["temperature"],
            top_p=model_params["top_p"],
            frequency_penalty=model_params["frequency_penalty"],
            presence_penalty=model_params["presence_penalty"],
            max_tokens=model_params["max_tokens"]
        )

    return response['choices'][0]['message']['content']

print(call_openai_api_2b())

{   'context': '1) Age requirement: It is important to know the minimum age '
               'requirement for obtaining a driving license in Singapore. As '
               'of March 2022, individuals must be at least 18 years old to '
               'apply for a driving license.\n'
               '\n'
               '2) Residency status: The process and requirements for '
               'obtaining a driving license may vary depending on the '
               "individual's residency status in Singapore. Singapore "
               'citizens, permanent residents, and foreign residents may have '
               'different procedures to follow.\n'
               '\n'
               '3) Validity of foreign driving license: If the individual '
               'already holds a driving license from another country, it is '
               'crucial to understand the validity and recognition of that '
               'license in Singapore. Depending on the country of issuance, '
               'indiv

### Exercise
For this workshop, there's only 1 exercise related to adding context:

1. Incorporate `2b) Make the LLM generate additional relevant context before providing an answer` with the prompts you have created and add it into the application workflow.

## Very advanced (outside the workshop's scope)

### 2c) Search for relevant context (with a database /search engine)

One way to retrieve information is to use a search engine, searching either your own database of information or the web via Google's [Programmable Search Engine API](https://developers.google.com/custom-search). This notebook won't go into detail on how to implement this, but the high-level flow is:

1. Distil the user prompt into something that can be used to search for relevant documents. There are various ways to do so:
  - You can perform keyword extraction / named entity recognition to retrieve the things in the query that the AI will most likely require context on. This can be done using a separate query to the LLM, or via existing [keyphrase extraction models](https://huggingface.co/models?other=keyphrase-extraction)
  - You can convert the query into an embedding, which can then be compared against a database of embeddings to identify the most relevant documents. [Example](https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb)
2. Implement / Gain access to a mechanism that allows you to search a database. For example, if you want to use Google Search, you'll have to use their [API](https://www.youtube.com/watch?v=D4tWHX2nCzQ) that returns a list of relevant webpages based on your search query.
3. Select excerpts of the most relevant pieces of information from the documents returned by the search. This is ultimately an [information retrieval problem](https://stackoverflow.com/questions/43489969/how-to-find-most-relevant-strings-in-a-textfile) that can be solved in a variety of ways; existing libraries / solutions include [Whoosh](https://pypi.org/project/Whoosh/), [Elasticsearch](https://www.elastic.co/elasticsearch/), [a database of embeddings](https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb) etc.
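
To make the flow above a little more concrete, here is a minimal sketch of the "compare the query against a database of embeddings" idea. It is only an illustration under some loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model (no API is called), the three documents are made-up placeholder text, and the helper names (`embed`, `cosine_similarity`, `retrieve`, `build_context`) are hypothetical rather than part of the workshop code.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    ASSUMPTION: a real setup would call an embedding model here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list, top_k: int = 2) -> list:
    """Return up to top_k documents most similar to the query, dropping zero scores."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_context(excerpts: list) -> str:
    """Number the excerpts so they can be sent as the context message."""
    return "\n\n".join(f"{i}) {text}" for i, text in enumerate(excerpts, start=1))


# Placeholder documents, invented purely for illustration.
documents = [
    "Applicants for a driving license must pass a theory test and a practical test.",
    "Public buses and trains accept contactless payment cards.",
    "A foreign driving license may need to be converted to a local license.",
]

query = "How do I get a driving license?"
context = build_context(retrieve(query, documents))
print(context)
```

The resulting `context` string could then be passed to the LLM in the same way as `context_for_2b` earlier in this notebook. Swapping `embed` for a real embedding model and `documents` for your own data store is where the actual work lies.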

# 3) Extensions

Techniques using LLMs evolve at a rapid pace as people prototype and try LLMs for different use cases using different setups. Some techniques include:

Routing (multi-system)
- The AI can be used to classify user input. In some cases, you may already have a few (AI) systems in place that are able to tackle specific domain problems. A possible approach is then to use LLMs to classify incoming queries, and then ["route"](https://betterprogramming.pub/unifying-llm-powered-qa-techniques-with-routing-abstractions-438e2499a0d0) the queries to the most appropriate system. [Langchain](https://python.langchain.com/docs/modules/chains/foundational/router) gives a simplified example of this by routing incoming queries to the most relevant domain-specific prompts.
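
As a rough sketch of the routing idea (not how Langchain implements it), the snippet below classifies a query and hands it to a matching handler. Everything here is an assumption made for illustration: the labels, the handler functions, and especially `classify_query`, which uses simple keyword matching as a placeholder for the LLM call that would normally return the label.

```python
from typing import Callable, Dict


def classify_query(query: str) -> str:
    """Placeholder classifier.

    ASSUMPTION: in practice this would be an LLM call that is prompted to
    reply with exactly one of the known labels.
    """
    lowered = query.lower()
    if any(word in lowered for word in ("invoice", "refund", "payment")):
        return "billing"
    if any(word in lowered for word in ("error", "crash", "bug")):
        return "technical"
    return "general"


# Hypothetical domain-specific systems; each could be its own prompt or service.
def handle_billing(query: str) -> str:
    return f"[billing system] {query}"


def handle_technical(query: str) -> str:
    return f"[technical system] {query}"


def handle_general(query: str) -> str:
    return f"[general system] {query}"


ROUTES: Dict[str, Callable[[str], str]] = {
    "billing": handle_billing,
    "technical": handle_technical,
    "general": handle_general,
}


def route(query: str, classifier: Callable[[str], str] = classify_query) -> str:
    """Classify the query, then dispatch it to the matching handler."""
    label = classifier(query)
    # Fall back to the general handler if the classifier returns an unknown label.
    handler = ROUTES.get(label, handle_general)
    return handler(query)


print(route("I need a refund for my last order"))
print(route("The app shows an error when I log in"))
```

The fallback in `route` matters more once a real LLM does the classification, since it may occasionally return a label that is not in `ROUTES`.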

Function calling
- The AI can interact with scripts and machine executable code to [validate its results](https://github.com/dair-ai/Prompt-Engineering-Guide/blob/main/guides/prompts-applications.md#pal-program-aided-language-models) or to perform tasks beyond text generation.
  - For example, the prompt can include information on what functions it has access to, and it can be instructed to call the relevant functions where applicable. [More info](https://platform.openai.com/docs/guides/gpt/function-calling).
  - It can also be used to generate functional code. This allows the AI to either use executable code to improve the correctness of its results, or perform tasks beyond generating text, especially if you give the AI an environment (preferably isolated) to run the code it generates.
  - An example of this in action is described in this [article](https://www.zdnet.com/article/the-moment-i-realized-chatgpt-plus-was-a-game-changer-for-my-business/), which describes how the author used ChatGPT Plus to process a CSV document using techniques based on this idea. The idea shown here is that instead of feeding the user information (context) into the system prompt, the information is stored in the file and ChatGPT only writes code (presumably Python) to interact with the file. The code is then executed to produce the appropriate responses.
  - Other examples include [Auto-GPT](https://github.com/Significant-Gravitas/Auto-GPT) which incorporates various strategies in an (ongoing) attempt to create a fully autonomous AI.
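
To illustrate the "call the relevant functions" part, here is a small sketch of the application side of function calling: the model names a function and supplies JSON-encoded arguments, and your code looks the function up and runs it. Note the assumptions: `add_todo` and `execute_function_call` are hypothetical helpers, and `simulated_function_call` is a hand-written payload shaped like a name plus a JSON string of arguments, not a real API response.

```python
import json


def add_todo(title: str, priority: str = "normal") -> dict:
    """Hypothetical application function the model is allowed to call."""
    return {"title": title, "priority": priority, "done": False}


# Only functions listed here can be executed on the model's behalf.
AVAILABLE_FUNCTIONS = {"add_todo": add_todo}


def execute_function_call(function_call: dict):
    """Run the function requested by the model, if it is on the allow-list."""
    name = function_call["name"]
    if name not in AVAILABLE_FUNCTIONS:
        raise ValueError(f"Model requested an unknown function: {name}")
    # The arguments arrive as a JSON string, so they need to be decoded first.
    arguments = json.loads(function_call["arguments"])
    return AVAILABLE_FUNCTIONS[name](**arguments)


# ASSUMPTION: hand-written stand-in for what the model might return.
simulated_function_call = {
    "name": "add_todo",
    "arguments": json.dumps({"title": "Book driving theory test", "priority": "high"}),
}

result = execute_function_call(simulated_function_call)
print(result)
```

Keeping an explicit allow-list like `AVAILABLE_FUNCTIONS` is one simple way to keep the model from triggering code you did not intend to expose.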


# Conclusion

You've come to the end of the workshop's contents. You don't have to stick to the context of the workshop if you don't find it interesting; feel free to experiment with other use cases beyond Todo items etc.

That said... try to avoid using LLMs unsupervised. Their limitations (e.g. hallucinations, inability to verify their own correctness) mean they are best used in workflows that keep them in check (e.g. with humans in the loop checking their output).


# Additional ideas to play with

- Use the OpenAI API to create a timetable to incorporate all the plans that you have asked it to create.
- Use the OpenAI API to improve upon / fine-tune an existing plan via simulation / identifying & mitigating risks.
- Use the OpenAI API to classify and prioritise your goals.
- Use the OpenAI API to suggest additional features and to write code to implement them in your prototype.
