# Slackbot Example

In this notebook, you'll see how to train BeepGPT on your Slack history using only OpenAI's APIs and open-source Python libraries. No data science PhD required.

We'll train BeepGPT in three steps:
1. Build an initial set of training examples from a Slack Export
2. Use few-shot learning with ChatGPT to clean our training examples
3. Send our training data to OpenAI and create a fine-tuned model


First, let's install the libraries we will use below and initialize our OpenAI session with an API key.

In [None]:
%pip install -q backoff pandas pyarrow openai scikit-learn kaskada==0.6.0a4

In [None]:
import openai
import getpass

# Initialize OpenAI
openai.api_key = getpass.getpass('OpenAI: API Key')

## 1. Build an initial set of training examples from a Slack Export

The steps involved here include:
* 1.1 Convert the Slack Export into a format that can be consumed by Kaskada
* 1.2 Use Kaskada to break the Slack Export into a set of conversations
* 1.3 Generate the initial training examples

### 1.1 Convert the Slack Export into a format that can be consumed by Kaskada

Historical Slack messages can be exported by following the instructions on Slack's [Export your workspace data](https://slack.com/help/articles/201658943-Export-your-workspace-data) web page. We'll use these messages to teach BeepGPT about the members of your workspace.

The export from Slack is a zip archive of numerous folders and files. After uncompressing the archive, there is a folder for each public channel in your Slack workspace. Inside each folder is one JSON file per day, containing all the events from that day.
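To make that layout concrete, the sketch below builds a tiny stand-in export in a temporary directory and lists it. The channel names, dates, and message contents are invented for illustration, and a real export may contain additional files that are not shown here.

```python
import json
import os
import tempfile

# Hypothetical channels and messages, for illustration only.
fake_export = {
    "general": {
        "2023-01-01.json": [
            {"ts": "1672531200.000100", "user": "U1", "text": "hello"},
        ],
    },
    "random": {
        "2023-01-02.json": [
            {"ts": "1672617600.000200", "user": "U2", "text": "hi",
             "thread_ts": "1672617600.000200"},
        ],
    },
}

export_root = tempfile.mkdtemp()
for channel, days in fake_export.items():
    os.makedirs(os.path.join(export_root, channel))
    for file_name, events in days.items():
        with open(os.path.join(export_root, channel, file_name), "w") as f:
            json.dump(events, f)

# One folder per channel, one JSON file per day.
layout = {
    channel: sorted(os.listdir(os.path.join(export_root, channel)))
    for channel in sorted(os.listdir(export_root))
}
print(layout)
```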

We run a short Python script (using pandas) to concatenate all the data files into a single Parquet file.

Parquet files store data in columns instead of rows. Some benefits of Parquet include:
* Fast queries that can fetch specific column values without reading full row data
* Highly efficient column-wise compression

In [None]:
import pandas as pd
import os

def get_file_df(json_path):
    df = pd.read_json(json_path, precise_float=True)
    # keep only rows where `subtype` is null
    if "subtype" in df.columns:
        df = df[df["subtype"].isnull()]
    # only keep these columns
    df = df[df.columns.intersection(["ts", "user", "text", "thread_ts"])]
    return df

def get_channel_df(channel_path):
    dfs = []
    for root, dirs, files in os.walk(channel_path):
        for file in files:
            dfs.append(get_file_df(os.path.join(root, file)))
    return pd.concat(dfs, ignore_index=True)

def get_export_df(export_path):
    dfs = []
    for root, dirs, files in os.walk(export_path):
        for dir in dirs:
            df = get_channel_df(os.path.join(root, dir))
            # add channel column
            df["channel"] = dir
            dfs.append(df)
    return pd.concat(dfs, ignore_index=True)

Be sure to set the path to your Slack export before running the following command:

In [None]:
path_to_slack_export = "slack-export"

get_export_df(path_to_slack_export).to_parquet("messages.parquet")

### 1.2 Use Kaskada to break the Slack Export into a set of conversations

To do this, we will:
* 1.2.1 Start a Kaskada session, and load in the data
* 1.2.2 Break the data into *threads* and *non_threads*
* 1.2.3 Convert the *non_threads* into *threads*
* 1.2.4 Rejoin all the messages into a single timestream
* 1.2.5 Split the messages into conversations

#### 1.2.1 Start a Kaskada session, and load in the data

The following block should only be run once per session:

In [None]:
import pandas as pd
import kaskada as kd

# Initialize Kaskada with a local execution context.
kd.init_session()

# set pandas to display all floats with 6 decimal places
pd.options.display.float_format = '{:.6f}'.format

This block can be re-run as often as you would like to restart the process:

In [None]:
# if you want to load in your own Slack data, change this to the path of your output file from 1.1 above
# otherwise continue with `slack-generation/messages.parquet`, which contains generated slack data for 
# example purposes. See the `slack-generation/notebook.ipynb` notebook for more info.
input_file = "slack-generation/messages.parquet"

# Use the "ts" column as the time associated with each row, 
# and the "channel" column as the entity associated with each row.
# messages = await kd.sources.Parquet.create(
#     input_file,
#     time_column = "ts", 
#     key_column = "channel",
#     time_unit = "s"
# )

# There is currently a bug with parquet file loading, so we will use the jsonl file for now
messages = await kd.sources.JsonlFile.create(
    "slack-generation/messages.jsonl",
    time_column = "ts", 
    key_column = "channel",
    time_unit = "s"
)

The `messages` object is a Kaskada Timestream.  We can use the `preview()` method to export the first few rows as a pandas dataframe:

In [None]:
# View the first 5 events
messages.preview(5)

#### 1.2.2 Break the data into *threads* and *non_threads*

Before generating examples to fine-tune an LLM, we need to break up the Slack data into *"conversations"*. For our purposes, we define a *"conversation"* as either:
* All the messages in a thread
* A group of messages outside a thread, separated from the next group of messages by a gap of at least 10 minutes.

In the Slack Export data, messages in a thread all have the same `thread_ts` value, which matches the `ts` of the first message in the thread. Messages outside a thread (in the root of the channel), do not have a `thread_ts` set. Therefore, we can use this field to filter our data into 2 sets: *threads* and *non-threads*.

In [None]:
threads = messages.filter(messages.col("thread_ts").is_not_null())
threads.preview(5)

In [None]:
non_threads = messages.filter(messages.col("thread_ts").is_null())
non_threads.preview(5)

#### 1.2.3 Convert the *non_threads* into *threads*

Using the definition of a *"conversation"* for a non-thread from above, we can convert the *non-threads* data into *threads* by:
* separating the messages into groups, where there is at least 10 minutes between each group
* setting the `thread_ts` field on all messages in the group equal to the `ts` of the first message in the group

Note that all data in Kaskada is sorted by time. Therefore, the first message in a group is always the one that occurred earliest.
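The grouping rule can be sketched in plain Python, independent of Kaskada. This is only an illustration of the logic, not the Kaskada implementation; the function name `assign_thread_ts` and the sample timestamps are hypothetical.

```python
GAP_SECONDS = 600  # 10 minutes, matching the definition above

def assign_thread_ts(timestamps, gap=GAP_SECONDS):
    """Give each message the ts of the first message in its group.

    `timestamps` must be sorted ascending (seconds since epoch).
    """
    result = []
    group_start = None
    previous = None
    for ts in timestamps:
        # A new group starts at the first message or after a long enough gap.
        if previous is None or ts - previous > gap:
            group_start = ts
        result.append(group_start)
        previous = ts
    return result

# Hypothetical timestamps: two messages close together, then two more after a long pause.
print(assign_thread_ts([0, 120, 1000, 1300]))  # [0, 0, 1000, 1000]
```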

In [None]:
import pyarrow as pa
from datetime import timedelta

ts = non_threads.col("ts")
ts_since = ts.seconds_since_previous()

# ideally do : 
# is_new = ts_since > timedelta(minutes=10)
is_new = ts_since.cast(pa.int64()) > 600

# Eventually this will just be: `thread_ts = ts.first(window=kd.windows.Since(is_new, start="inclusive"))`
#
# However, the `Since()` window currently collects all the messages until the predicate is True, 
# then outputs, and starts re-collecting. But we need the opposite: when the predicate is True, 
# clear the output, then start collecting, and output. 
#
# In other words, `Since()` is currently exclusive on the start of the window, inclusive on the end. 
# But we need inclusive on the start and exclusive on the end.
# 
# The hack below does what we need until the `Since()` provides additional options for inclusivity
shifted_non_threads = non_threads.shift_by(timedelta(microseconds=0.001))
shifted_ts = shifted_non_threads.lag(1).col("ts").first(window=kd.windows.Since(is_new))
thread_ts = ts.if_(is_new).else_(shifted_ts)

# replace the `thread_ts` in the data with our generated value and filter out messages we no longer need
non_threads_threads = non_threads.extend({"thread_ts": thread_ts}).filter(ts.is_not_null().and_(thread_ts.is_not_null()))
non_threads_threads.preview(5)

#### 1.2.4 Rejoin all the messages into a single timestream

Now that both threads and non-threads have the `thread_ts` set, we merge the data back into a single timestream.

In [None]:
# Ideally we would just do the below, but there is a bug that currently prevents this from working
# joined = threads.else_(non_threads_threads)

# Until the bug is fixed, this gets us the same result
joined = kd.record({
    "ts": threads.col("ts").else_(non_threads_threads.col("ts")),
    "text": threads.col("text").else_(non_threads_threads.col("text")),
    "user" : threads.col("user").else_(non_threads_threads.col("user")),
    "thread_ts" : threads.col("thread_ts").else_(non_threads_threads.col("thread_ts")),
    "channel" : threads.col("channel").else_(non_threads_threads.col("channel")),
})

joined.preview(5)

#### 1.2.5 Split the messages into conversations

Now we create a new timestream from the `joined`, using the `with_key()` method. This updates the entity key for the timestream to be based on the combined value of the `channel` and the `thread_ts`. The effect of this is that each entity is now a unique conversation.

In [None]:
messages = joined.with_key(kd.record({
        "channel": joined.col("channel"),
        "thread": joined.col("thread_ts"),
    }))

messages.preview(5)

### 1.3 Generate the initial training examples

Here we:
* 1.3.1 Collect messages into groups on a per-conversation basis
* 1.3.2 Create single-token labels for all of the users
* 1.3.3 Format, clean, and output the initial examples

#### 1.3.1 Collect messages into groups on a per-conversation basis

We collect up lines from the conversation for outputting. On each row, we want the previous set of messages from the conversation, limiting to at most 5 messages:

In [None]:
# collect the previous 1 to 5 messages and the associated user for each message
conversation = messages.select("user", "text").collect(max=5, min=1).lag(1)

# add the conversation to the current row
examples = messages.extend({"conversation":conversation}).filter(conversation.is_not_null())

examples.preview(5)
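As a rough plain-Python analogue of this step, the sketch below pairs each message in one conversation with up to five messages that came before it. It only illustrates the intended result; `previous_messages` and the sample conversation are hypothetical and not part of Kaskada.

```python
from collections import deque

def previous_messages(messages, max_len=5):
    """Yield (message, up to `max_len` earlier messages) for one conversation."""
    window = deque(maxlen=max_len)
    for msg in messages:
        if window:  # skip the first message, which has no history
            yield msg, list(window)
        window.append(msg)

# Hypothetical conversation, for illustration only.
sample_conversation = [{"user": f"U{i}", "text": f"message {i}"} for i in range(7)]
pairs = list(previous_messages(sample_conversation))

# The last message sees the five messages immediately before it.
print(len(pairs), [m["user"] for m in pairs[-1][1]])
```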

#### 1.3.2 Create single-token labels for all of the users

The following script initializes a Scikit-Learn LabelEncoder, which we can use to ensure that each user is represented by a single "token", and that our training examples are formatted in a way that is easier for model fine-tuning.

In [None]:
from sklearn import preprocessing
import json

# Encode user ID labels
le = preprocessing.LabelEncoder()
le.fit(examples.to_pandas()["user"])
with open('labels_.json', 'w') as f:
    json.dump(le.classes_.tolist(), f)
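For intuition, the mapping that `LabelEncoder` produces can be reproduced in plain Python: the sorted unique values become the class list, and each value's label is its index in that list. The user IDs below are made up for illustration.

```python
# Hypothetical Slack user IDs, for illustration only.
users = ["U03ABC", "U01XYZ", "U03ABC", "U02DEF"]

# LabelEncoder stores the sorted unique values in `classes_`;
# a value's label is its index in that list.
classes = sorted(set(users))
to_label = {user: index for index, user in enumerate(classes)}

print(classes)                        # ['U01XYZ', 'U02DEF', 'U03ABC']
print([to_label[u] for u in users])   # [2, 0, 2, 1]
```

Because `labels_.json` stores the class list in this order, a predicted index can later be mapped back to a user ID by position.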

#### 1.3.3 Format, clean, and output the initial examples

The following script iterates over the full results and outputs the examples to a jsonl file: `examples.jsonl`

The examples will be used to teach the model which users are interested in a given conversation. Each example consists of a "prompt" containing the state of a conversation at a point in time and a "completion" containing the user that responded to the previous set of messages.

We will soon be releasing an update with UDF support, so that the following cleanup can be done inside Kaskada.

In [None]:
import json, re

def strip_links_and_users(line):
    return re.sub(r"<.*?>", '', line)

def strip_emoji(line):
    return re.sub(r":.*?:", '', line)

def clean_messages(messages):
    cleaned = []
    for msg in messages:
        text = strip_links_and_users(msg)
        text = strip_emoji(text)
        text = text.strip()
        if text == "" or text.find("```") >= 0:
            continue
        cleaned.append(text)
    return cleaned

prompt_suffix = "\n\n###\n\n"
max_prompt_len = 5000

# Format prompt for the OpenAI API
def format_prompt(messages):
    cleaned = clean_messages(messages)
    if len(cleaned) == 0:
        return None
    cleaned.reverse()
    prompt = "\n\n".join(cleaned)
    if len(prompt) > max_prompt_len:
        prompt = prompt[0:max_prompt_len]
    return prompt+prompt_suffix


# use the label mapping to transform the userId to a single token
def format_completion(user):
    return " " + le.transform([user])[0].astype(str) + " end"

with open('examples.jsonl', 'w') as out_file:
    last_prompt = ""
    for row in examples.run_iter(kind="row"):
        user = row["user"]
        non_user_messages = []
        for msg in row["conversation"]:
            if msg["user"] != user:
                non_user_messages.append(msg["text"])
        prompt = format_prompt(non_user_messages)
        if prompt and prompt != last_prompt:
            example = { "prompt": prompt, "completion": format_completion(user) }
            out_file.write(json.dumps(example) + "\n")
            last_prompt = prompt
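To see what the two cleanup regular expressions do, here is a small standalone check that reuses the same patterns on made-up input. The sample strings are hypothetical. The second call shows a caveat worth knowing: the emoji pattern removes any text between two colons, not only emoji names.

```python
import re

def strip_links_and_users(line):
    return re.sub(r"<.*?>", "", line)

def strip_emoji(line):
    return re.sub(r":.*?:", "", line)

sample = "thanks <@U123>, see <https://example.com|the docs> :tada:"
cleaned = strip_emoji(strip_links_and_users(sample)).strip()
print(repr(cleaned))  # 'thanks , see'

# Caveat: a time such as 10:30:00 loses its middle part.
with_time = strip_emoji("meeting at 10:30:00 tomorrow")
print(repr(with_time))  # 'meeting at 1000 tomorrow'
```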

## 2. Use few-shot learning with ChatGPT to clean our training examples

Before we send our fine-tuning examples to OpenAI to create a custom model, we should clean up our examples to ensure they provide enough signal for determining the interests of our users. 

Browsing through the generated examples, you may find that some of the messages only contain "fluff". These messages can degrade the accuracy of our final fine-tuned model. Furthermore, we probably don't want to alert users about conversations like these, so these become good negative examples for the training set.

The cleanup stage will comprise four steps:
* 2.1 Human-in-the-loop strong/weak message gathering
* 2.2 Strong message summarization, via ChatGPT few-shot learning
* 2.3 Example classification, via ChatGPT few-shot learning
* 2.4 Final training file creation

### 2.1 Human-in-the-loop strong/weak message gathering

First we need to look through the examples we generated to find some messages that provide a strong signal and some that provide a weak signal. To do this, you can manually look through the `examples.jsonl` file or use the included `human.py` script. Ideally we are looking for 50 examples of each type.

* A couple of examples of messages with strong signal:

    > I'm familiar with Jupyter's employment of CodeMirror, although I'm unsure about the specific tool used for the readme. Currently, I've been utilizing the Rust syntax highlighter for my code blocks. While it's not flawless, it does a reasonably good job of differentiating between functions and literals. Moreover, it highlights instances of 'let' in a unique color. Still, the idea of having a custom highlighter is quite appealing to me.

    > The issues stemming from heavy dependence on non-stable components in Kubernetes led to core special interest groups (sigs) being burdened with problems they didn't want to deal with initially. Once a component becomes stable (stable), it can remain in use consistently. However, adding new functionality to stable components involves a rigorous process. Interestingly, in the past couple of years, a new policy has been put in place—new elements can be added at v1beta1 level but require a commitment demonstrated through version updates; otherwise, they will be automatically deprecated.

* A few examples of messages with weak signal:

    > some very interesting ideas in here, thx for sharing

    > were there any issues with this? i'll start verifying a few things in a bit.

    > standup?



### 2.2 Strong message summarization, via ChatGPT few-shot learning

In step 2.3, we are going to use few-shot learning to mark our previously generated examples as having strong or weak signal. To do this, we are going to pass the strong & weak examples determined above on every API request. Based on those examples, we will let ChatGPT decide if a message has a strong or weak signal. 

The examples passed to ChatGPT in this way are very important, because it makes decisions based on what it can learn from these few examples. There is also a maximum number of tokens that can be passed to ChatGPT on each request. Therefore we should make sure these examples best represent the type of data that the model may see.

From Step 2.1, we see that the messages with strong signal are often quite long. Let's use ChatGPT with few-shot learning to summarize our "strong" examples before moving on to the next step. With summarized text, we should be able to pass more examples to the API and get better overall results.

To do this, we provide instructions to the ChatCompletion API via a set of messages. Each message object contains `role` and `content` properties. The `role` can be either `system`, `user`, or `assistant`. 

The first message should always be from the `system` role, and provide general instructions to the model of its function. 

Following this, message pairs of `user` and `assistant` should be added, where the `user` content is our example input and the `assistant` content is our expected response from ChatGPT. These are the "few-shot" learnings that ChatGPT uses to help it determine our desired output.

Finally, we append a final `user` message that contains the content we want the model to summarize.

See https://platform.openai.com/docs/guides/gpt for more info.
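The message layout described above (one `system` message, then `user`/`assistant` pairs, then the final `user` message) can be assembled with a small helper. `build_few_shot_messages` is a hypothetical name used only in this sketch, and the example strings are placeholders.

```python
def build_few_shot_messages(system_prompt, examples, query):
    """Assemble a chat `messages` list: system, then user/assistant pairs, then the query."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages

few_shot = build_few_shot_messages(
    "You summarize messages concisely.",
    [
        ("First long example message", "First short summary."),
        ("Second long example message", "Second short summary."),
    ],
    "The message we want summarized",
)
print([m["role"] for m in few_shot])
```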

In [None]:
file = open(f'examples_strong.jsonl', 'r')
out_file = open(f'examples_strong_summarized.jsonl', 'w')

prompt_suffix = "\n\n###\n\n"

while True:
    line = file.readline()

    if not line:
        break

    data = json.loads(line)

    prompt = data["prompt"].removesuffix(prompt_suffix)

    msgs = [
        {"role": "system", "content": "You are a helpful assistant that provides concise summaries of messages. The response must be significantly shorter than the input. The output should be written as if you were the original author."},
        {"role": "user", "content": "I'm familiar with Jupyter's employment of CodeMirror, although I'm unsure about the specific tool used for the readme. Currently, I've been utilizing the Rust syntax highlighter for my code blocks. While it's not flawless, it does a reasonably good job of differentiating between functions and literals. Moreover, it highlights instances of 'let' in a unique color. Still, the idea of having a custom highlighter is quite appealing to me."},
        {"role": "assistant", "content": "Familiar with Jupyter's use of CodeMirror, uncertain about readme tool. Using Rust syntax highlighter for code blocks, highlighting 'let' distinctly. Interested in a custom highlighter for better differentiation."},
        {"role": "user", "content": "The issues stemming from heavy dependence on non-stable components in Kubernetes led to core special interest groups (sigs) being burdened with problems they didn't want to deal with initially. Once a component becomes stable (stable), it can remain in use consistently. However, adding new functionality to stable components involves a rigorous process. Interestingly, in the past couple of years, a new policy has been put in place—new elements can be added at v1beta1 level but require a commitment demonstrated through version updates; otherwise, they will be automatically deprecated."},
        {"role": "assistant", "content": "Heavy reliance on unstable Kubernetes components burdened core SIGs initially. Stable components ensure consistent use, but adding new features is strict. New policy permits v1beta1 additions with commitment, else auto deprecation."},
        {"role": "user", "content": prompt},
    ]

    res = openai.ChatCompletion.create(
        model = "gpt-3.5-turbo",
        messages = msgs
    )

    prompt = res["choices"][0]["message"]["content"]

    data["prompt"] = prompt+prompt_suffix

    out_file.write(json.dumps(data) + "\n")

file.close()
out_file.close()

### 2.3 Example classification, via ChatGPT few-shot learning

Now that we have summarized *strong* examples, we will use few-shot learning again, to classify all our previously generated examples as having *strong* or *weak* signal. 

This time we will pull our few-shot examples from files instead of including them in the code directly. A few things to note:
* If you get an error about too-many tokens used, reduce the `max_count` of included examples, or go back to step 2.2 to further summarize your *strong* examples.
* This will cost a fair amount on OpenAI. A rough estimate is $50 per 10,000 examples.
* This can take a long time to run to completion. The ChatCompletion API limits the number of tokens used per minute. Running 10,000 examples can take 8 or more hours.

First, let's build up our "few-shots" message array, so we don't have to regenerate it each time.

In [None]:
strong = open(f'examples_strong_summarized.jsonl', 'r')
weak = open(f'examples_weak.jsonl', 'r')

messages = [{
    "role": "system",
    "content": "You are a helpful assistant. Your job is to determine if a prompt will be helpful for fine-tuning a model. All prompts start with 'start -->' and end with: '\\n\\n###\\n\\n'. You should respond 'yes' if you think the prompt has enough context to be helpful, or 'no' if not. No explanation is needed. You should only respond with 'yes' or 'no'."
}]

count = 0
max_count = 50
while True:

    strong_line = strong.readline()
    weak_line = weak.readline()
    count += 1

    if (not strong_line) or (not weak_line) or (count > max_count):
        break

    strong_data = json.loads(strong_line)
    weak_data = json.loads(weak_line)

    messages.append({"role": "user", "content": f'start -->{strong_data["prompt"]}'})
    messages.append({"role": "assistant", "content": "yes"})
    messages.append({"role": "user", "content": f'start -->{weak_data["prompt"]}'})
    messages.append({"role": "assistant","content": "no"})

strong.close()
weak.close()

Then we use those example messages to predict if a prompt will be helpful for training or not. 

Note that we use the `backoff` library to retry requests that have failed due to a rate-limit error. Even so, sometimes the process stalls and must be manually restarted. The code below appends to the output file instead of replacing it, so that the process can be restarted after an error occurs.

In [None]:
# only re-run this cell if you want to completely start over after an error occurs
starting_count = 0

total_count = 0
with open(f'examples.jsonl', 'r') as file:
    for line in file:
        total_count += 1

print(f'There are {total_count} lines in the input file.')

In [None]:
import json, openai, time, logging, backoff

# for debugging responses from the API, un-comment this
# logging.getLogger('backoff').addHandler(logging.StreamHandler())

file = open(f'examples.jsonl', 'r')
strong_file = open(f'examples_cleaned_strong.jsonl', 'a')
weak_file = open(f'examples_cleaned_weak.jsonl', 'a')

@backoff.on_exception(backoff.expo, (openai.error.RateLimitError, openai.error.ServiceUnavailableError))
def chat_with_backoff(**kwargs):
    time.sleep(1)
    try:
        return openai.ChatCompletion.create(**kwargs)
    except openai.error.InvalidRequestError:
        return None

count = 0
for line in file:
    count +=1

    # helpful for restarting after issue
    if count < starting_count:
        continue

    data = json.loads(line)

    prompt = data["prompt"]

    msgs = messages.copy()

    msgs.append({"role": "user", "content": f'start -->{prompt}'})

    res = chat_with_backoff(
        model = "gpt-3.5-turbo",
        messages = msgs
    )
    if not res:
        continue
    response = res["choices"][0]["message"]["content"]

    # for debugging responses from the API, un-comment this
    # print(f'Result was `{response}` for prompt: {prompt}')

    print(f'Currently processing line {count} of {total_count}')

    if response == "yes":
        strong_file.write(line)
        strong_file.flush()
    else:
        # for weak messages, re-write the completion as ` nil end`
        data["completion"] = " nil end"
        weak_file.write(json.dumps(data) + '\n') 
        weak_file.flush()

    starting_count = count

file.close()
strong_file.close()
weak_file.close()
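For readers unfamiliar with retry-with-backoff, the idea can be sketched with the standard library alone. This is a simplified stand-in and not the `backoff` library's implementation; `retry_with_backoff`, its parameters, and the simulated failure are all hypothetical. Unlike the decorator used above, this version caps the number of attempts, so a run that keeps failing raises an error rather than waiting indefinitely.

```python
import random
import time

def retry_with_backoff(func, retry_on, max_tries=5, base=1.0, factor=2.0, sleep=time.sleep):
    """Call `func` until it succeeds, waiting longer after each failure."""
    for attempt in range(max_tries):
        try:
            return func()
        except retry_on:
            if attempt == max_tries - 1:
                raise  # give up after the final attempt
            # Wait a random time of up to base * factor**attempt seconds.
            sleep(random.uniform(0, base * factor ** attempt))

# Hypothetical flaky call that fails twice before succeeding.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated rate limit")
    return "ok"

waits = []  # record the waits instead of actually sleeping
result = retry_with_backoff(flaky, TimeoutError, sleep=waits.append)
print(result, calls["n"], len(waits))
```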

### 2.4 Final training file creation

To create the final training file, we want to ensure that we have an equal number of *strong* and *weak* examples. We will use pandas to grab random examples from each set, combine them in a random order, and then save the final example set to disk.

In [None]:
import pandas as pd

strong_df = pd.read_json('examples_cleaned_strong.jsonl', lines=True, orient='records')
weak_df = pd.read_json('examples_cleaned_weak.jsonl', lines=True, orient='records')

min_length = min([len(strong_df.index), len(weak_df.index)])

strong_df = strong_df.sample(min_length)
weak_df = weak_df.sample(min_length)

combined = pd.concat([strong_df, weak_df])
combined = combined.sample(frac=1)
combined.to_json('final_examples.jsonl', lines=True, orient='records')
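The same balancing idea, written without pandas, looks roughly like this. It is a sketch under the assumption that each example is a dict with `prompt` and `completion` keys; the `balance` helper and the sample data are hypothetical.

```python
import random

def balance(strong, weak, seed=None):
    """Downsample the larger class, then shuffle the combined examples."""
    rng = random.Random(seed)
    n = min(len(strong), len(weak))
    combined = rng.sample(strong, n) + rng.sample(weak, n)
    rng.shuffle(combined)
    return combined

# Hypothetical examples: five strong, three weak.
strong_examples = [{"prompt": f"s{i}", "completion": " 1 end"} for i in range(5)]
weak_examples = [{"prompt": f"w{i}", "completion": " nil end"} for i in range(3)]

balanced = balance(strong_examples, weak_examples, seed=0)
weak_count = sum(e["completion"] == " nil end" for e in balanced)
print(len(balanced), weak_count)  # 6 3
```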

Before sending our examples for fine-tuning, we use a tool provided by OpenAI to perform some verification on our input data and then split the dataset into 2 files.  The tool does the following for us:

* makes sure all prompts end with same suffix
* removes examples that use too many tokens
* removes duplicated examples

Note: We aren't doing classification, so don't start a fine-tune as suggested by the output.

In [None]:
import openai
from openai import cli
from types import SimpleNamespace

args = SimpleNamespace(file='final_examples.jsonl', quiet=True)
cli.FineTune.prepare_data(args)

## 3. Send our training data to OpenAI and create a fine-tuned model

Finally, we'll send our fine-tuning examples to OpenAI to create a custom model.

To do this, we will do the following:
* 3.1 Upload training data
* 3.2 Create a fine-tuning job
* 3.3 Wait for the fine-tuning to start
* 3.4 Wait for the fine-tuning to finish
* 3.5 Try using the model

### 3.1 Upload training data

First we upload the verified final examples file from above to OpenAI. We need to make sure the file has successfully uploaded before moving onto the next step.

In [None]:
import time

training_file_name = 'final_examples_prepared_train.jsonl'

# start the file upload
training_file_id = cli.FineTune._get_or_upload(training_file_name, True)

# Poll and display the upload status until the file finishes
while True:
    time.sleep(2)
    file_status = openai.File.retrieve(training_file_id)["status"]
    print(f'Upload status: {file_status}')
    if file_status in ["succeeded", "failed", "processed"]:
        break

### 3.2 Create a fine-tuning job

We recommend using either the `curie` or `davinci` models for fine-tuning. We had good success with both of them. Note that the `curie` model is cheaper to use, but takes longer to train a helpful model.

There are many parameters to set when training a model. Use our recommendations or try your own. With `curie` we used 8 epochs, and with `davinci` we used 4. More help can be found here: https://platform.openai.com/docs/api-reference/completions

In [None]:
create_args = {
    "training_file": training_file_id,
    "model": "curie",
    "n_epochs": 8,
    "learning_rate_multiplier": 0.02,
    "suffix": "beep-gpt"
}

# Create the fine-tune job and retrieve the job ID
resp = openai.FineTune.create(**create_args)
job_id = resp["id"]

### 3.3 Wait for the fine-tuning to start

Note, it can take several hours for the fine-tuning to start. 

In [None]:
# Poll and display the fine-tuning status until it starts
while True:
    time.sleep(5)
    job_status = openai.FineTune.retrieve(id=job_id)["status"]
    print(f'Job status: {job_status}')
    if job_status in ["failed", "started", "succeeded"]:
        break

### 3.4 Wait for the fine-tuning to finish

Note, it can take a long time for the fine-tuning to finish. A training set of 2000 examples took about 2 hours to train on `curie` with 8 epochs.

Run the following code block periodically, until you see a `failed` or `succeeded` status.

In [None]:
job_details = openai.FineTune.retrieve(job_id)

print(f'Job status: {job_details["status"]}')
print(f'Job events: {job_details["events"]}')

if job_details["status"] == "succeeded":
    model_id = job_details["fine_tuned_model"]
    print(f'Successfully fine-tuned model with ID: {model_id}')

### 3.5 Try using the model

Using the validation file, we can try sending a few prompts to our new model and see if it recommends alerting any users.

In [None]:
# choose which row in the validation file to send
row = 6

valid_df = pd.read_json('final_examples_prepared_valid.jsonl', lines=True, orient='records')

prompt = valid_df['prompt'][row]
completion = valid_df['completion'][row]

# this is the text we send to the model for it to determine if we should alert a user
print(f'Prompt: {prompt}')

# this is the user (or nil) we would have expected for the response
print(f'Completion: {completion}')

# this is the response from the model. The `text` field contains the actual prediction. The `logprobs` array contains the log-probabilities of the 5 most likely matches.
print(f'Prediction:')

openai.Completion.create(model=model_id, prompt=prompt, max_tokens=1, n=1, logprobs=5, stop=" end", temperature=0)
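Log-probabilities are easier to read once converted to probabilities with `exp` and mapped back to user IDs. The sketch below assumes the response exposes `choices[0]["logprobs"]["top_logprobs"]` as a list of token-to-logprob dicts, and that each user label is returned as a single token. The `top_candidates` helper, the response fragment, and the class list are all hypothetical.

```python
import math

def top_candidates(response, classes):
    """Return (label, probability) pairs for the first generated token, most likely first."""
    top = response["choices"][0]["logprobs"]["top_logprobs"][0]
    ranked = sorted(top.items(), key=lambda item: item[1], reverse=True)
    out = []
    for token, logprob in ranked:
        token = token.strip()
        # Numeric tokens are indexes into the saved label list; anything else is kept as-is.
        if token.isdigit() and int(token) < len(classes):
            label = classes[int(token)]
        else:
            label = token
        out.append((label, math.exp(logprob)))
    return out

# Hypothetical response fragment and label list, for illustration only.
fake_response = {
    "choices": [
        {"text": " 1", "logprobs": {"top_logprobs": [{" 1": -0.1, " nil": -2.5, " 0": -4.0}]}}
    ]
}
user_classes = ["U01XYZ", "U02DEF"]  # e.g. loaded from labels_.json

candidates = top_candidates(fake_response, user_classes)
for label, prob in candidates:
    print(f"{label}: {prob:.3f}")
```

In practice, a threshold on the top probability could decide whether a user is alerted at all. The right threshold depends on your data.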