
# Making API Calls to OpenAI Models

This notebook outlines the basic setup for making API calls to OpenAI's models from within a Jupyter notebook.

It is meant as a "template" notebook: it lays out simple functionality that can be built upon later to make more complex systems.

## Step 1. Setup

Before we can start writing code to make the calls, we need to do some setup. This includes installing packages, importing packages, and giving ourselves model access with our API key.

### Installing necessary packages
Some packages need to be installed into our current environment before we're able to use them.
This cell will print out some short messages as it works to collect those packages for us.

In [1]:
# Install required packages
!pip install -q openai
!pip install -q PyPDF2

### Importing those packages
In this code, we'll read in a PDF using the PyPDF2 package and call OpenAI's models using the openai package.
Here, we import both so that we can use them.
We also import the getpass and os libraries, which let us paste our OpenAI API key into this notebook without displaying it on screen.

In [14]:
from openai import OpenAI
import PyPDF2
from getpass import getpass
import os

### Adding in our API key

The final setup step we need is to add our API key, so that OpenAI will give us access to their models.
When running the cell below, a text box should open where you can paste your API key in and hit "enter", granting access.

In [3]:
from getpass import getpass

os.environ['OPENAI_API_KEY'] = getpass()

··········
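If the cell above is re-run often, it can be convenient to prompt only when no key has been set yet. The following is a minimal sketch of that idea; the helper name `ensure_api_key` and its parameters are hypothetical and not part of the openai package.

```python
import os
from getpass import getpass

def ensure_api_key(env=os.environ, prompt=getpass, name="OPENAI_API_KEY"):
    """Return the key stored under `name`, prompting only if it is missing.

    `ensure_api_key` is a hypothetical helper for this notebook, not an
    OpenAI function. `env` and `prompt` are parameters so that the logic
    can be exercised without typing a real key.
    """
    if not env.get(name):
        # getpass hides what is typed, so the key never appears in the output
        env[name] = prompt("OpenAI API key: ")
    return env[name]

# In the notebook, this would replace the getpass cell:
# ensure_api_key()
```

Because the key is stored in `os.environ`, a later `OpenAI()` call can pick it up the same way as before.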


## Step 2. Set up PDF reading

To read in a PDF with Python, we use the PyPDF2 package. This package is built specifically to read in PDFs, so we can rely on its base functionality and avoid writing much code ourselves.

In this code cell, we define a function that uses PyPDF2 to read in the text from a PDF. It takes a path to the PDF and returns all of the PDF's text as a single string.

In [6]:
# Function to extract text from PDF
def extract_text_from_pdf(pdf_path):
    reader = PyPDF2.PdfReader(pdf_path) # The PdfReader class opens our PDF for us
    text = ""
    for page in reader.pages: # We loop over all of the pages in the PDF
        text += page.extract_text() or "" # Append each page's text; fall back to "" if a page yields none
    return text # The returned text is a string of all of the text in the PDF.

Now, we can use that function on a PDF.

This function assumes we have a path to our PDF, so the file needs to exist in our local environment.

In Google Colab, in the menu to the left, there should be an "upload" option. You can upload any PDF file directly from your computer into the session storage and then use the function on it.
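Since the function depends on the file being present, a quick check before reading can give a clearer error than a failed read. This is a small sketch using only the standard library; `check_pdf_path` is a hypothetical helper name.

```python
from pathlib import Path

def check_pdf_path(pdf_path):
    """Return pdf_path as a Path, or raise a clear error if the file is missing.

    Hypothetical helper: it only checks that the file exists, not that it
    is a valid PDF.
    """
    path = Path(pdf_path)
    if not path.is_file():
        raise FileNotFoundError(
            f"No file found at {path.resolve()}. Upload it to the session first."
        )
    return path

# Example use in the notebook (assumes 'sample.pdf' has been uploaded):
# pdf_path = check_pdf_path('sample.pdf')
```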

In [7]:
pdf_path = 'sample.pdf' # this can be a path to any named PDF file!
pdf_text = extract_text_from_pdf(pdf_path)

Now, we can look at what's in `pdf_text` to ensure it matches what was in our PDF.

In [10]:
print(pdf_text)

Sample PDFThis is a simple PDF ﬁle. Fun fun fun.Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Phasellus facilisis odio sed mi. Curabitur suscipit. Nullam vel nisi. Etiam semper ipsum ut lectus. Proin aliquam, erat eget pharetra commodo, eros mi condimentum quam, sed commodo justo quam ut velit. Integer a erat. Cras laoreet ligula cursus enim. Aenean scelerisque velit et tellus. Vestibulum dictum aliquet sem. Nulla facilisi. Vestibulum accumsan ante vitae elit. Nulla erat dolor, blandit in, rutrum quis, semper pulvinar, enim. Nullam varius congue risus. Vivamus sollicitudin, metus ut interdum eleifend, nisi tellus pellentesque elit, tristique accumsan eros quam et risus. Suspendisse libero odio, mattis sit amet, aliquet eget, hendrerit vel, nulla. Sed vitae augue. Aliquam erat volutpat. Aliquam feugiat vulputate nisl. Suspendisse quis nulla pretium ante pretium mollis. Proin velit ligula, sagittis at, egestas a, pulvinar quis, nisl.Pellentesque sit amet lectus. Praesent pulv
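Note that `extract_text_from_pdf` appends each page's text with no separator, so text from adjacent pages runs together in a multi-page PDF. A possible variation, sketched here under the assumption that each page object exposes `extract_text()` as PyPDF2's pages do, joins the pages with a newline and skips pages that yield no text. The name `join_page_texts` is hypothetical.

```python
def join_page_texts(pages, separator="\n"):
    """Join the text of each page with `separator`, skipping empty pages.

    `pages` is any iterable of objects with an extract_text() method,
    such as PyPDF2.PdfReader(path).pages.
    """
    parts = []
    for page in pages:
        page_text = page.extract_text() or ""  # treat a missing result as empty
        if page_text.strip():
            parts.append(page_text.strip())
    return separator.join(parts)

# Example use in the notebook:
# pdf_text = join_page_texts(PyPDF2.PdfReader('sample.pdf').pages)
```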

## Step 3. Make an API call to OpenAI to chat

Now, we'll set up sending a prompt to OpenAI's GPT-3.5 model. We could instead use GPT-4, provided our API key has access to it.

More details on the different models that can be called with the API:
https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4

Keep in mind that the more powerful the model, the more expensive! It's best to develop with gpt-3.5 whenever possible.
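API usage is billed by the number of tokens sent and received, so it helps to have a rough idea of how large a prompt is before sending it. The sketch below uses the common rule of thumb of about four characters per token for English text. This is only an approximation, not the model's actual tokenizer, and both function names are hypothetical.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate based on character count.

    The 4-characters-per-token ratio is a rule of thumb for English text,
    not an exact count.
    """
    if not text:
        return 0
    return max(1, len(text) // chars_per_token)

def truncate_to_budget(text, max_tokens, chars_per_token=4):
    """Cut text so its estimated size stays within max_tokens."""
    return text[: max_tokens * chars_per_token]

# Example use in the notebook:
# print(estimate_tokens(pdf_text))
# short_text = truncate_to_budget(pdf_text, max_tokens=1000)
```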

The basic setup for an API call with OpenAI looks like this:
```
completion = client.chat.completions.create(
  model="<model_name>",
  messages=[
    {"role": "system", "content": "<system prompt>"},
    {"role": "user", "content": "<user prompt>"}
  ]
)
```
Here, we specify which model we want with `model=...`.
The `messages` define the context we give to the model. This template uses two types of messages, `system` and `user`.

* The `system` prompt sets the behavior, guidelines, and constraints for the model. It provides initial context and instructions that the model should follow throughout the conversation.

* The `user` prompt is the input from the user. It's what the model generates a response to.

When adding in our PDF text, we'll include it in our prompt, which becomes the "user" message.
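As an illustration of that structure, the messages list can be built by a small function that places the document text and a question into the `user` message. This is a sketch only; `build_messages` and its parameter names are hypothetical.

```python
def build_messages(document_text, question,
                   system_prompt="You are a helpful assistant."):
    """Build the messages list for a chat call about a piece of text.

    Hypothetical helper: the document text and the question are combined
    into a single user message, separated by a blank line.
    """
    user_prompt = f"Here is some text from a PDF:\n\n{document_text}\n\n{question}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

# Example use in the notebook:
# messages = build_messages(pdf_text, "What language is this?")
```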

Let's make a prompt string that includes our PDF text that we can call on the model. The sample PDF that I used in this example was largely not English. Let's try asking the model about the language of the PDF.

In [11]:
prompt_string = "Here is some text from a pdf: " + pdf_text + "\n\nWhat language is this?" # The blank line keeps the question separate from the PDF text

Now, we'll call the OpenAI API with that prompt.

In [16]:
client = OpenAI() # this connects to OpenAI, and makes use of our API key we set earlier.

completion = client.chat.completions.create(
  model="gpt-3.5-turbo", # We'll use gpt-3.5 to experiment with. But this could be changed if necessary.
  messages=[
    {"role": "system", "content": "You are a helpful assistant."}, # We don't need much guidance here, this is a very simple system prompt.
    {"role": "user", "content": prompt_string} # Insert our prompt string here!
  ]
)

The `completion` object that we get back from the chat call contains a lot of information; more about that here: https://platform.openai.com/docs/api-reference/chat/create

For our purposes, let's just look at the response message.

In [17]:
print(completion.choices[0].message)

ChatCompletionMessage(content='The text you provided appears to be in Latin. Lorem ipsum is a commonly used placeholder text in the publishing and graphic design industries to simulate the appearance of written text.', role='assistant', function_call=None, tool_calls=None)
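The printed message is an object with several fields. If only the reply text is needed, it is available as the `content` attribute of the message. The sketch below wraps that access in a hypothetical helper, `response_text`, and uses a stand-in object so that it runs without an API call.

```python
from types import SimpleNamespace

def response_text(completion):
    """Return the text of the first choice, or "" if the content is None."""
    content = completion.choices[0].message.content
    return content or ""

# Stand-in with the same attribute layout as a real completion,
# used only so this example runs offline.
fake_completion = SimpleNamespace(
    choices=[SimpleNamespace(
        message=SimpleNamespace(content="Latin", role="assistant")
    )]
)
print(response_text(fake_completion))  # prints: Latin
```

With the real object from the call above, `print(response_text(completion))` would print just the reply string.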


Awesome! It looks like we successfully made the call, and the model accurately used information from the provided PDF text in its response.
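Since this notebook is meant as a template, one way to build on it is to wrap the call and the response handling in a single function. The following is a minimal sketch, assuming a client created with `OpenAI()` as in Step 3; the name `ask_model` and its default arguments are hypothetical.

```python
def ask_model(client, prompt, model="gpt-3.5-turbo",
              system_prompt="You are a helpful assistant."):
    """Send one prompt to a chat model and return the reply text.

    `client` is expected to behave like an openai.OpenAI() instance.
    """
    completion = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
        ],
    )
    return completion.choices[0].message.content

# Example use in the notebook (this makes a real, billed API call):
# answer = ask_model(OpenAI(), prompt_string)
# print(answer)
```

Passing the client in as an argument keeps the function easy to reuse with a different model or system prompt.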