# Code Generation with GPT-3

## Imports and Globals

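To run this notebook, install the [OpenAI Python library](https://github.com/openai/openai-python). The cells below call `openai.Completion.create`, an interface that was removed in version 1.0 of the library, so install a pre-1.0 release with ```pip install "openai<1"```.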

In [None]:
import os
import openai

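# display and HTML are part of the public IPython.display module;
# importing them from IPython.core.display is deprecated.
from IPython.display import display, HTML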

To concatenate examples of specifications and programs into a prompt for GPT-3, we introduce a question token, an end-of-question token (EOQ), an answer token, and an end-of-answer token (EOA).

In [None]:
QUESTION_TOKEN = "Q: "
EOQ_TOKEN = "\n"
ANSWER_TOKEN = "A: "
EOA_TOKEN = "\n\n"

## Authentication

To authenticate please set the environment variable OPENAI_API_KEY to the API key that you received from OpenAI. More information on authentication can be found in the [OpenAI API docs](https://beta.openai.com/docs/api-reference/authentication).

In [None]:
openai.api_key = os.getenv("OPENAI_API_KEY")
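`os.getenv` returns `None` when the variable is unset, so a missing key only shows up later as a failed API call. A minimal sketch of an earlier check follows. The helper name `require_api_key` is hypothetical and not part of the OpenAI library.

```python
import os


def require_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Hypothetical helper: return the API key or fail with a clear message."""
    key = os.getenv(env_var)
    if not key:
        # Covers both an unset variable (None) and an empty string.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key
```

With this helper, the cell above could read `openai.api_key = require_api_key()`.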

## Data Processing

The following functions load the examples from the data directory and concatenate them into a single string.

In [None]:
def get_examples(language : str = "python", instances : list = None):
    """loads the examples from the data directory for a specific language
    
        Args:
            language: either html, python, or shell
            instances: specifies the examples, if None all examples for that language are loaded

        Returns:
            a list of specification-program pairs

        Raises:
            ValueError: if language is not html, python, or shell
    """
    if language == "html":
        filename = "document.html"
    elif language == "python":
        filename = "program.py"
    elif language == "shell":
        filename = "command.sh"
    else:
        raise ValueError(f"Unknown language {language}")

    examples = []
    data_path = os.path.join(os.getcwd(), "data", language)
    for ex in os.listdir(data_path):
        if instances and ex not in instances:
            continue
        ex_path = os.path.join(data_path, ex)

        # The files are read verbatim. The original .replace('"', '\"') call
        # was a no-op, because '\"' and '"' are the same Python string.
        spec_path = os.path.join(ex_path, "specification.txt")
        with open(spec_path, "r") as sf:
            spec = sf.read()

        prog_path = os.path.join(ex_path, filename)
        with open(prog_path, "r") as pf:
            prog = pf.read()

        examples.append((spec, prog))

    return examples
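The path handling in `get_examples` implies a directory layout of `data/<language>/<instance>/`, where each instance directory holds `specification.txt` and the program file for that language. The sketch below rebuilds that layout in a temporary directory to make it concrete. The instance name `fibonacci` appears later in the notebook, but the file contents here are invented for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # One instance directory for the "python" language.
    ex_dir = os.path.join(root, "data", "python", "fibonacci")
    os.makedirs(ex_dir)

    # Illustrative contents only; the real data files are not shown here.
    with open(os.path.join(ex_dir, "specification.txt"), "w") as sf:
        sf.write("compute the n-th fibonacci number")
    with open(os.path.join(ex_dir, "program.py"), "w") as pf:
        pf.write("def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)")

    # Collect every file path relative to the temporary root.
    layout = sorted(
        os.path.relpath(os.path.join(dirpath, name), root)
        for dirpath, _, names in os.walk(root)
        for name in names
    )

print(layout)
```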


In [None]:
def get_prompt(examples : list):
    """concatenates specification-program pairs

        Args:
            examples: a list of specification-program pairs
        
        Returns:
            a string of concatenated specification-program pairs
    """
    prompt = ""
    for spec, prog in examples:
        prompt += QUESTION_TOKEN + spec + EOQ_TOKEN + ANSWER_TOKEN + prog + EOA_TOKEN
    return prompt

In [None]:
#for debugging
#get_prompt(get_examples(language="html", instances=["traffic"]))

## HTML Code Generation

In addition to the examples, we provide GPT-3 with the following context:

In [None]:
HTML_CONTEXT = "We generate HTML documents from natural language descriptions.\n"
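To see how the context, the examples, and a new description fit together, the sketch below assembles a complete prompt in memory. The constants are repeated so the snippet runs on its own, and the example pair and the new description are invented stand-ins for the files in the data directory.

```python
QUESTION_TOKEN = "Q: "
EOQ_TOKEN = "\n"
ANSWER_TOKEN = "A: "
EOA_TOKEN = "\n\n"
HTML_CONTEXT = "We generate HTML documents from natural language descriptions.\n"

# Hypothetical specification-program pair, invented for illustration.
examples = [
    ("red button", '<button style="color:red">Click</button>'),
]

prompt = HTML_CONTEXT
for spec, prog in examples:
    prompt += QUESTION_TOKEN + spec + EOQ_TOKEN + ANSWER_TOKEN + prog + EOA_TOKEN

# The new description is left open: no EOQ or answer token follows it,
# so the model is expected to continue with the answer on its own.
prompt += QUESTION_TOKEN + "blue button"

print(prompt)
```

Because the prompt ends right after the new description, the completion is expected to begin with a newline and the `A:` marker, which the generation function in the next cell strips off.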

In [None]:
def html_code_gen(spec : str):
    """calls GPT-3 with context, examples, and a new description

        Args:
            spec: a string that describes an HTML document

        Returns:
            GPT-3 prediction
    """
    prompt = HTML_CONTEXT + get_prompt(get_examples(language="html", instances=["button", "stopwatch", "traffic"])) + QUESTION_TOKEN + spec
    response = openai.Completion.create(
        engine='davinci',
        prompt = prompt,
        temperature=0.1,
        max_tokens=512,
        top_p=0.5,
        frequency_penalty=1,
        presence_penalty=1,
        stop=[EOA_TOKEN]
    )
    prediction = response["choices"][0]["text"]
    return prediction.partition("A:")[2]
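The post-processing can be tried without calling the API. The sketch below uses an invented raw output, emulates the stop sequence by cutting at the first EOA token, and then applies the same `partition("A:")` step as the function above.

```python
EOA_TOKEN = "\n\n"

# Hypothetical raw model output, invented for illustration.
raw = "\nA: <button>Click me</button>\n\nQ: another description"

# Emulate the stop sequence: the API returns the text before the first EOA_TOKEN.
completion = raw.split(EOA_TOKEN, 1)[0]

# Keep what follows the first "A:" marker, as the return statement above does.
answer = completion.partition("A:")[2]
print(repr(answer))

# Without an "A:" marker, partition yields an empty string.
missing = "<button>Click me</button>".partition("A:")[2]
print(repr(missing))
```

Two details follow from this. The returned text keeps the space after `A:`, and a completion that omits the marker produces an empty string, not the raw text.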

Let's try it out!

In [None]:
prediction = html_code_gen("form to submit name, address, and phone")
prediction

In [None]:
prediction = html_code_gen("table with the highest-grossing films of all time")
prediction

In [None]:
prediction = html_code_gen("button that randomly changes its position when clicked")
prediction

Let's try to render the prediction!

In [None]:
display(HTML(prediction))

## Python Code Generation

In [None]:
PYTHON_CONTEXT = "We generate Python programs that implement a natural language specification.\n"

In [None]:
def python_code_gen(spec : str):
    """calls GPT-3 with context, examples, and a new specification

        Args:
            spec: a string that describes the Python program

        Returns:
            GPT-3 code prediction
    """
    prompt = PYTHON_CONTEXT + get_prompt(get_examples(language="python", instances=["fibonacci", "sin", "tensorflow"])) + QUESTION_TOKEN + spec
    response = openai.Completion.create(
        engine='davinci',
        prompt = prompt,
        temperature=0.1,
        max_tokens=512,
        top_p=0.5,
        frequency_penalty=1,
        presence_penalty=1,
        stop=[EOA_TOKEN]
    )
    prediction = response["choices"][0]["text"]
    return prediction.partition("A:")[2]
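The returned prediction is plain text, so one cheap check before running it is whether it parses as Python at all. The following is a minimal sketch using the standard `ast` module. The helper `is_valid_python` is hypothetical and not part of the notebook, and it checks syntax only, not behavior.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Hypothetical helper: True if the text parses as Python source."""
    try:
        # strip() removes the leading space left over after the "A:" marker,
        # which would otherwise be reported as an unexpected indent.
        ast.parse(source.strip())
    except SyntaxError:
        return False
    return True


print(is_valid_python(" def f(x):\n    return x"))
print(is_valid_python("def f(:"))
```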

In [None]:
python_code_gen("convert fahrenheit to celsius")

In [None]:
python_code_gen("check if an integer is prime")

In [None]:
python_code_gen("transpose a csv file")

## Shell Command Generation

In [None]:
SHELL_CONTEXT = "We generate Shell commands given a natural language description.\n"

In [None]:
def shell_cmd_gen(spec : str):
    """calls GPT-3 with context, examples, and a new description

        Args:
            spec: a string that describes the shell command

        Returns:
            GPT-3 command prediction
    """
    prompt = SHELL_CONTEXT + get_prompt(get_examples(language="shell", instances=["cipher", "count", "move"])) + QUESTION_TOKEN + spec
    response = openai.Completion.create(
        engine='davinci',
        prompt = prompt,
        temperature=0.1,
        max_tokens=128,
        top_p=0.5,
        frequency_penalty=1,
        presence_penalty=1,
        stop=[EOA_TOKEN]
    )
    prediction = response["choices"][0]["text"]
    return prediction.partition("A:")[2]
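A generated command is best inspected before it is executed. One way to do that, sketched below under the assumption that the prediction is a single POSIX-style command, is to tokenize it with the standard `shlex` module and look at the program name and arguments. The command string is an invented example of what a prediction might look like.

```python
import shlex

# Hypothetical prediction, invented for illustration.
command = " chmod +x script.py"

# shlex.split applies POSIX shell quoting rules without running anything.
tokens = shlex.split(command)
program, args = tokens[0], tokens[1:]

print(program)
print(args)
```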

In [None]:
shell_cmd_gen("remove all files in directory /tmp that were created or modified today")

In [None]:
shell_cmd_gen("make script.py executable")

In [None]:
shell_cmd_gen("extract the text of document.pdf to output.txt")