In [1]:
# set up the notebook

%load_ext autoreload
%autoreload 2

import sys
import os
import logging
logging.basicConfig(level=logging.ERROR)

# Preparation

This part will help you get access to the OpenAI API and run your first completion.
- For more information about model size and pricing, see [this page](https://openai.com/api/pricing/).
- For a detailed API walkthrough (with examples), see [this documentation page](https://beta.openai.com/docs/api-reference/completions).

In [4]:
# get OpenAI access and run a test completion

import openai
import json

with open("./credential.json", "r") as f:
    credential = json.load(f)["openai"]
openai.api_key = credential

# Test whether you have access successfully, using the cheapest model.
completion = openai.Completion.create(
  model="ada",
  prompt="Say this is a test",
  max_tokens=7,
  temperature=0
)
completion

<OpenAIObject text_completion id=cmpl-6VBvNZ9r9yhK0in21fWSYqZpKtGaB at 0x7ff3d02c4dd0> JSON: {
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": " of your ability to read the signs"
    }
  ],
  "created": 1672892025,
  "id": "cmpl-6VBvNZ9r9yhK0in21fWSYqZpKtGaB",
  "model": "ada",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 7,
    "prompt_tokens": 5,
    "total_tokens": 12
  }
}

In [5]:
# If you got the completion successfully, you should be able to access the generated text through this line:
completion["choices"][0]["text"]

' of your ability to read the signs'
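
The response behaves like a nested dictionary, so the generated text and the token counts can both be read with plain indexing. Below is a minimal sketch that uses a plain dict copied from the output above in place of a live API response; `summarize_completion` is a hypothetical helper name, not part of the `openai` library.

```python
# A plain dict standing in for the response shown above (no API call is made).
example_response = {
    "choices": [
        {
            "finish_reason": "length",
            "index": 0,
            "logprobs": None,
            "text": " of your ability to read the signs",
        }
    ],
    "model": "ada",
    "usage": {"completion_tokens": 7, "prompt_tokens": 5, "total_tokens": 12},
}


def summarize_completion(response):
    """Hypothetical helper: return the first choice's text and token usage."""
    first_choice = response["choices"][0]
    return {
        "text": first_choice["text"].strip(),  # drop the leading space
        "finish_reason": first_choice["finish_reason"],
        "total_tokens": response["usage"]["total_tokens"],
    }


summary = summarize_completion(example_response)
print(summary)
```

A `finish_reason` of `"length"` indicates that the output stopped because it reached `max_tokens`, which is worth checking before you judge the quality of a truncated answer.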

# Writeup 0: Task and Strategy Description

Here, you should describe what task you are working on and which workflow/pipeline you intend to replicate (and from which crowdsourcing paper). As a reminder, you should [pick a crowdsourcing paper here](https://docs.google.com/spreadsheets/d/1nIoU04CulTH128-r6rtykhuqNdd37UJXShw2J19IEdE/edit?usp=sharing). The spreadsheet also points to example tasks in the crowdsourcing papers; however, you DON'T have to stick to the paper-provided input/output. Please feel free to come up with your own tasks, as long as they seem suitable for the paper/pipeline you are replicating.
 

**EDIT THIS PART TO PROVIDE AN OVERVIEW OF YOUR ATTEMPTS**

- **Task Description**: (description of your testing task.)
- **Example Input/output**: Write >=3 input-output pairs for your task. You should test your strategy on all three examples.
    ```
    Input: [INPUT EXAMPLE]
    Output: [OUTPUT EXAMPLE]
    
    Input: [INPUT EXAMPLE]
    Output: [OUTPUT EXAMPLE]
    
    Input: [INPUT EXAMPLE]
    Output: [OUTPUT EXAMPLE]
    ```
- **Workflow prompting strategy**: Describe your designed pipeline.
- **Crowdsourcing paper**: [PAPER TITLE & URL](paper_url) -- where the pipelining strategy comes from.

## Coding: Use this part to do your prompting.

**Please use `text-davinci-003` when you generate your final model outputs.** `text-davinci-003` is effectively InstructGPT; compared with other versions of GPT-3, it can do any language task with better quality, longer output, and more consistent instruction-following ([as discussed on this page](https://platform.openai.com/docs/models/overview)). Using the latest model can give you a better idea of what these models are (not) capable of.

Tips for prompting:
- [OpenAI playground](https://beta.openai.com/playground) offers more interactive prompting experiments.
- Read the [API documentation](https://beta.openai.com/docs/api-reference/completions?lang=python) to use the right parameters (e.g., `temperature`, `max_tokens`).
- Read the [tutorial doc](https://beta.openai.com/docs/introduction) to get a general sense of GPT-3.
- Read [OpenAI examples](https://beta.openai.com/examples) to get a sense of what GPT-3 can do.

Tips for saving credits:

- You can use a cheaper model for tweaking your prompts (see [pricing](https://openai.com/api/pricing/)) before generating the final responses with the most capable model, `text-davinci-003`. However, be aware that a prompt that does not seem to work for any other model may still work for `text-davinci-003`.
- [AI21](https://www.ai21.com/studio) provides a model similar to GPT-3 and an API service similar to OpenAI's, and it offers more free credits for you to try. If you choose to use it, remember to also get its API key and set up the credential.
- Play with [ChatGPT](https://chat.openai.com/chat), which is a free and similarly capable model. It uses a dialog interface (whereas InstructGPT / `text-davinci-003` uses a text-completion interface), so your way of interacting with the model may naturally change a bit (e.g., you will ask questions there but use incomplete sentences here).

## Baseline prompting

Use this section to do a one-step prompting for your selected task.
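
As a starting point, one-step prompting can be sketched as a single prompt built from an instruction and an input, sent to the model once. The sketch below rests on several assumptions: `build_baseline_prompt`, `complete_with_openai`, `run_baseline`, and `echo_complete` are hypothetical names; the summarization instruction is only a placeholder task; and `complete_with_openai` reuses the `openai.Completion.create` call from earlier in this notebook. The demo at the bottom uses a stub so that it runs without spending credits.

```python
def build_baseline_prompt(instruction, task_input):
    """Combine the instruction and the input in the Input/Output format."""
    return f"{instruction.strip()}\n\nInput: {task_input.strip()}\nOutput:"


def complete_with_openai(prompt, model="text-davinci-003", max_tokens=256, temperature=0):
    """Send one prompt to the completion endpoint used earlier in this notebook."""
    import openai  # imported lazily so the stub demo below needs no API access

    response = openai.Completion.create(
        model=model, prompt=prompt, max_tokens=max_tokens, temperature=temperature
    )
    return response["choices"][0]["text"].strip()


def run_baseline(instruction, task_input, complete_fn):
    """One-step prompting: build a single prompt and call the model once."""
    return complete_fn(build_baseline_prompt(instruction, task_input))


def echo_complete(prompt):
    """Stub model used for a dry run; it only reports the prompt length."""
    return f"<model output for a {len(prompt)}-character prompt>"


instruction = "Summarize the text in one sentence."  # placeholder task
task_input = "Crowdsourcing splits large tasks into small ones."
baseline_prompt = build_baseline_prompt(instruction, task_input)
print(baseline_prompt)
print(run_baseline(instruction, task_input, echo_complete))
```

Passing `complete_with_openai` instead of `echo_complete` to `run_baseline` would send the same prompt to the real model.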

In [None]:
# Write your code here.

## Crowdsourcing-Pipeline-inspired prompting

Use this section to apply your selected pipeline prompting strategy to your task.
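
A pipeline differs from the baseline in that the task is split into several prompts, with each step's output fed into the next step's prompt. The sketch below shows one possible way to structure such a chain. The names `run_pipeline`, `STEPS`, and `make_stub` are hypothetical, and the three step templates are placeholders that you would replace with the steps from the paper you chose. A stub model is used so that the example runs without an API key.

```python
def run_pipeline(task_input, step_templates, complete_fn):
    """Run the steps in order, feeding each output into the next prompt."""
    trace = []
    previous = task_input
    for name, template in step_templates:
        # {original} is always the task input; {previous} is the last output.
        prompt = template.format(original=task_input, previous=previous)
        previous = complete_fn(prompt)
        trace.append({"step": name, "prompt": prompt, "output": previous})
    return previous, trace


# Placeholder steps for illustration only; swap in your own pipeline.
STEPS = [
    ("find", "Text: {original}\nList the phrases that could be shortened:"),
    ("fix", "Text: {original}\nPhrases: {previous}\nRewrite the text more concisely:"),
    ("verify", "Original: {original}\nRewrite: {previous}\nIs the meaning kept? Answer yes or no:"),
]


def make_stub():
    """Build a stub model that returns a numbered placeholder per call."""
    calls = []

    def stub_complete(prompt):
        calls.append(prompt)
        return f"stub output {len(calls)}"

    return stub_complete


final_output, trace = run_pipeline("An example input.", STEPS, make_stub())
for record in trace:
    print(record["step"], "->", record["output"])
```

Keeping the full `trace` makes it easier to see which intermediate step went wrong when the final output is poor, which is useful material for the reflection sections later in this notebook.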

In [None]:
# Write your code here.

# Writeup 1: Report & Reflection

Fill in the following three sections by reflecting on your results.

## Report the result

For each input you tried, summarize the input, the baseline output, the pipeline output, and which output you prefer and why. To answer "why", first think of some criteria you want to use for evaluating the output:

### Important criteria:
1. 

### Inputs and outputs:


- **Input**:
- **Baseline output**:
- **Pipeline output**:
- **Which output you like and why**:

---

- **Input**:
- **Baseline output**:
- **Pipeline output**:
- **Which output you like and why**:

---

...

## Reflect on prompting effectiveness

Write some paragraphs to describe: Did you find the pipeline prompting workflow effective? Why or why not?

## Envision possible improvements

Write some paragraphs to describe: What are some possible ways to further improve the pipeline or prompt design?

# Writeup 2 (Optional): InstructGPT vs. ChatGPT

If you also tried to use the [ChatGPT](https://chat.openai.com/chat) interface to complete your task, you might have thoughts on the following two questions!

## Chat interface vs. complete-the-sentence interface
Did your way of prompting change based on the interaction interface? What do you think are some pros and cons of the chat interface (which you used in ChatGPT) and the complete-the-sentence interface (which you used in this notebook)?

## Qualitative reflection on outputs
Provide some examples of your ChatGPT output. Did you notice any differences?


- **Input**:
- **InstructGPT output**:
- **ChatGPT output**:
- **Which output you like and why**:

---

- **Input**:
- **InstructGPT output**:
- **ChatGPT output**:
- **Which output you like and why**: