<font size="5">**OpenAI GPT Basic Example**</font>

This is a basic example of how to use the 
OpenAI GPT API in Python. This Jupyter
notebook was designed to run in Google Colab,
but can easily work on other platforms as well.

In this example we construct a prompt that uses
a number of variables describing a forest stand to 
shape the model's response. The prompt also gives the model 
an average value for basal area that it may comment on 
when drafting the stand description.

William Zipse, NJ Forest Service 2023

In [None]:
#Set API Key
#You will need an OpenAI API key to run this script!
#Use your own key where quoted in the API_KEY constant below.
#This will establish which key to charge for tokens!
#BE CAREFUL!!!
API_KEY =""

In [None]:
#solves problem with openai install in Colab
#This is a workaround for a known bug at this time
#Uncomment below if using Google Colab and openai won't install
#!pip install aiohttp

The code below installs the openai Python library
in the Google Colab environment.

In [None]:
#install OpenAI
#Run the line below if using Google Colab;
#comment it out if using a local install
!pip install --upgrade openai

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting openai
  Downloading openai-0.27.2-py3-none-any.whl (70 kB)
Collecting aiohttp
  Downloading aiohttp-3.8.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
Collecting aiosignal>=1.1.2
  Downloading aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Collecting yarl<2.0,>=1.0
  Downloading yarl-1.8.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (264 kB)
Collecting async-timeout<5.0,>=4.0.0a3
  Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)
Collecting multidict<7.0,>=4.5
  Downloading multid

In [None]:
#Taken from OpenAI documentation
#os is used if calling environment variables
import os
import openai

In [None]:
#Establish paths and connections

In [None]:
#This code is to set the API Key through environment variables
#uncomment below only if using the environment variables method for the API key!
#!export OPENAI_API_KEY="<OPENAI_API_KEY>"
#openai.api_key = os.getenv("OPENAI_API_KEY")
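
One caveat with the environment variable method: in a notebook, a `!` command runs in its own temporary shell, so a variable exported there is not visible to the Python process. A minimal sketch of an alternative, assuming the key is placed in `os.environ` from Python. The `load_api_key` helper and the placeholder value are hypothetical and not part of the original notebook.

```python
import os

def load_api_key(var_name="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Raises a clear error when the variable is missing or empty instead
    of silently passing None along to the API client.
    """
    key = os.environ.get(var_name, "")
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return key

# Hypothetical placeholder value; replace it with a real key, or set the
# variable outside the notebook before starting the kernel.
os.environ["OPENAI_API_KEY"] = "<OPENAI_API_KEY>"
api_key = load_api_key()
```

The returned value could then be assigned to `openai.api_key` in place of the `API_KEY` constant.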

In [None]:
#Set API Key from API_KEY constant set at the beginning of this notebook
openai.api_key = API_KEY

<font size="3">**Variables for the Prompt**</font>

The variables below contain the values that will be used in
the forest stand description by the GPT model. In this example
the values are set in the definitions below, however these 
variables could easily be changed in other implementations.

In [None]:
#define forest stand variables
qmd = 6.0
tpa = 311
ba = 95
ft = 'Pitch Pine/Oak'
area = 150

<font size="3">**Making the Prompt**</font>

The string variable called "p1" below contains the prompt to send to the openai API. 
Note that the variables defined previously are referred to in this prompt. Also 
note that prompts are preceded by "Prompt:" and responses are preceded by "Response:". 
In this case, the response after "Response:" is left blank for the model to complete.
You may give the model a series of defined prompts and responses prior to leaving a
response blank. The delimiters "Prompt:" and "Response:" are also passed as stop sequences in the call 
to openai.Completion.create() in the block after the definition of the variable "p1".

In [None]:
p1 = '''Prompt:
Generate a description of a forest stand where 
QMD is quadratic mean diameter in inches, TPA is trees per acre, BA is average tree basal 
area in square feet per acre, FT is forest type naming dominant tree species in the overstory, 
and area is the stand area in
acres. '''+'In this stand QMD = '+str(qmd)+', '+'TPA = '+str(tpa)+', BA = '+str(ba)+', FT = '+str(ft)+', area = '+str(area)+'''.
The statewide average BA is 111.0 square feet per acre. \n Response:'''
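
The same kind of prompt can be assembled with an f-string, which makes the separators between values easier to see. This is a minimal sketch and not the notebook's own method: `build_prompt` and its `statewide_ba` parameter are names introduced here for illustration, and the definitions are lightly condensed.

```python
def build_prompt(qmd, tpa, ba, ft, area, statewide_ba=111.0):
    """Build a single-question stand description prompt from stand values."""
    definitions = (
        "QMD is quadratic mean diameter in inches, TPA is trees per acre, "
        "BA is basal area in square feet per acre, FT is forest type naming "
        "dominant tree species in the overstory, and area is the stand area "
        "in acres."
    )
    values = (
        f"In this stand QMD = {qmd}, TPA = {tpa}, BA = {ba}, "
        f"FT = {ft}, area = {area}."
    )
    return (
        "Prompt:\n"
        f"Generate a description of a forest stand where {definitions} {values}\n"
        f"The statewide average BA is {statewide_ba} square feet per acre.\n"
        "Response:"
    )

# Same stand values as defined earlier in the notebook.
example_prompt = build_prompt(6.0, 311, 95, "Pitch Pine/Oak", 150)
print(example_prompt)
```

Wrapping the prompt in a function also makes it simple to generate one prompt per stand when the values come from a table instead of hard-coded variables.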

<font size="3">**Calling the OpenAI API**</font>

Below the variable "response" is defined by calling the 
openai.Completion.create() function.  The function returns 
a response that can be read like nested Python dictionaries.  Model parameters can be set here. 
See OpenAI documentation for details. Note the "prompt" parameter 
is set to "p1" above; max_tokens caps the length of the generated 
response (larger values may charge 
more tokens on the OpenAI account associated with the API key);
temperature can be set between 0 and 2. This is where the "stop" 
list is set as well. Here it contains "Prompt:" and "Response:", so 
generation halts if the model starts to write a new prompt or response.

In [None]:
#set model parameters here
response = openai.Completion.create(
  model="text-davinci-003",
  prompt= str(p1),
  temperature=1.1,
  max_tokens=200,
  top_p=1,
  frequency_penalty=0.0,
  presence_penalty=0.0,
  stop=["Prompt:", "Response:"]
)
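
The effect of the stop list can be imitated locally. The API ends generation when a stop sequence would be produced and leaves that sequence out of the returned text. A minimal sketch of that behavior on an ordinary string follows; `truncate_at_stop` is a hypothetical helper written for illustration, not part of the openai library, and the sample text is made up.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# Made-up text showing a model that keeps going and starts a new exchange.
raw = " The stand is 150 acres.\nPrompt: What is QMD?\nResponse: ..."
trimmed = truncate_at_stop(raw, ["Prompt:", "Response:"])
print(repr(trimmed))
```

Without the stop list, the model could continue past its answer and invent further prompt and response pairs until it reached max_tokens.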

<font size="3">**Locating the Text Response**</font>

Below the variable "answer" is set to the location of the 
text response from the "response" variable above.

In [None]:
answer = response['choices'][0]['text']
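
To make the indexing concrete, the sketch below walks through a stand-in dictionary laid out like a completion response. Every value in `mock_response` is made up for illustration; only the key layout is the point, and the real response carries additional fields.

```python
# Illustrative stand-in for the API response; all values are made up.
mock_response = {
    "object": "text_completion",
    "model": "text-davinci-003",
    "choices": [
        {
            "text": "\n\nExample stand description.",
            "index": 0,
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
}

# 'choices' is a list; with a single completion requested, take element 0.
first_choice = mock_response["choices"][0]

# The generated text often starts with blank lines, so strip them.
answer_text = first_choice["text"].strip()

# finish_reason is "length" when the response was cut off by max_tokens.
hit_limit = first_choice["finish_reason"] == "length"

print(answer_text, hit_limit)
```

Checking `finish_reason` is a quick way to tell whether a description ended naturally or was truncated.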

<font size="3">**Let's see the text response!**</font>

Here we print the "answer" variable generated by the model.

Now you can try changing the prompt and variables above and
see what kinds of results you can get.

In [None]:
print(answer)


This forest stand of Pitch Pine/Oak consists of an area of 150 acres, with an average QMD of 6.0 inches, 311 trees per acre and 95 square feet of basal area per acre. The overall basal area in this stand is below the statewide average of 111, making it one of the lower-density stands in the region.


**Let's try using our prompt in a slightly different way.**

In the example above we defined one text prompt in the variable p1. This prompt
contains a single question introduced by the string "Prompt:". We then follow it
with the string "Response:" and allow the model to generate the response.

This format works for relatively small prompts. Prompts and responses are measured in tokens (short pieces of words, not single characters), and the model can only process a limited number of them. It is also possible to construct a prompt containing a series of prompts and responses
before requesting a response from the model, which supplies more background
in a structured way. The prompt stored in the variable "p2" is an
example of defining a prompt sequence this way.
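
Tokens are pieces of words, and the prompt and the response draw on the same token budget. The sketch below gives a rough budget check, assuming the common rule of thumb of about four characters per token for English text. Both helper functions are hypothetical, the estimate is approximate, and `ASSUMED_CONTEXT_LIMIT` is an assumption that should be checked against the model documentation.

```python
import math

# Assumption: total tokens (prompt plus response) the model can handle.
ASSUMED_CONTEXT_LIMIT = 4097

def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate from character count (heuristic only)."""
    return math.ceil(len(text) / chars_per_token)

def fits_in_context(prompt, max_tokens, context_limit=ASSUMED_CONTEXT_LIMIT):
    """Check that the prompt estimate plus the response cap fits the limit."""
    return estimate_tokens(prompt) + max_tokens <= context_limit

sample_prompt = "Prompt: What does QMD stand for?\nResponse:"
print(estimate_tokens(sample_prompt), fits_in_context(sample_prompt, 200))
```

For an exact count, a tokenizer matched to the model would be needed; the character heuristic is only good for a quick sanity check.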

In [None]:
#setup a function if you want to pass many prompts
#to a model setup the same way for each prompt
def askGPT(p):
  response = openai.Completion.create(
    model="text-davinci-003",
    prompt= str(p),
    temperature=1.1,
    max_tokens=200,
    top_p=1,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    stop=["Prompt:", "Response:"])
  return response

**A prompt with prompts and responses**

Notice that although the prompt defined in the variable "p2" below conveys the 
same information as "p1", it spreads that information across multiple prompts and responses. This allows for the 
construction of longer prompts with more background information.

In [None]:
p2 = '''Prompt: What does QMD stand for?
Response: QMD is quadratic mean diameter in inches.\n
Prompt: What does TPA stand for?\n
Response: TPA stands for trees per acre.\n
Prompt: What does BA stand for?\n
Response: BA stands for average forest stand basal area in square feet per acre.\n
Prompt: What is the statewide average BA for reference?\n
Response: The statewide average BA is 111.0 square feet per acre.\n
Prompt: What is area?\n
Response: area is the size of the forest stand in acres.\n
Prompt: What is FT?\n
Response: FT is the forest type, which names the dominant tree species in the overstory.\n
Prompt: Generate a description of a forest stand.
'''+'In this stand QMD = '+str(qmd)+', '+'TPA = '+str(tpa)+', BA = '+str(ba)+', FT = '+str(ft)+', area = '+str(area)+'''.
Mention how the stand BA compares to the statewide BA.\n Response:'''
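
A prompt sequence like this can also be generated from a list of question and answer pairs, which avoids hand-editing a long string. This is a minimal sketch under that assumption; `build_few_shot_prompt` and the shortened example list are introduced here for illustration and are not part of the original notebook.

```python
def build_few_shot_prompt(examples, final_prompt):
    """Join (prompt, response) pairs and end with an open 'Response:'."""
    parts = []
    for question, reply in examples:
        parts.append(f"Prompt: {question}\nResponse: {reply}\n")
    # The last response is left blank for the model to complete.
    parts.append(f"Prompt: {final_prompt}\nResponse:")
    return "\n".join(parts)

examples = [
    ("What does QMD stand for?", "QMD is quadratic mean diameter in inches."),
    ("What does TPA stand for?", "TPA stands for trees per acre."),
]
few_shot = build_few_shot_prompt(
    examples, "Describe a stand with QMD = 6.0 and TPA = 311."
)
print(few_shot)
```

Keeping the background pairs in a list makes it easy to add or drop a definition without touching the final request.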

In [None]:
#call our function to run the model
response = askGPT(p2)

In [None]:
answer2 = response['choices'][0]['text']

Once again, the response left blank at the end of p2 is generated by the model.

In [None]:
print(answer2)

 This stand is 150 acres in size, with a Quadratic Mean Diameter (QMD) of 6.0 inches and Trees Per Acre (TPA) of 311. It is a Pitch Pine/Oak forest type with an Average Forest Stand Basal Area (BA) of 95 square feet per acre, which is 14.0 square feet per acre lower than the statewide average.
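
Numeric claims in generated text are worth checking against the inputs. With BA = 95 and a statewide average of 111.0, the difference is 16.0 square feet per acre, not the 14.0 stated in the response above. A minimal sketch of that check, using the values defined earlier in the notebook (`statewide_ba` is a name introduced here for illustration):

```python
# Values from the notebook: stand BA and the statewide average BA.
ba = 95
statewide_ba = 111.0

# Absolute and relative shortfall of the stand against the average.
difference = statewide_ba - ba
percent_below = difference / statewide_ba * 100

print(
    f"Stand BA is {difference:.1f} sq ft/acre "
    f"({percent_below:.1f}%) below the statewide average"
)
# prints: Stand BA is 16.0 sq ft/acre (14.4%) below the statewide average
```

Computing such figures in Python and passing them into the prompt is one way to keep the arithmetic out of the model's hands.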
