### Using the OpenAI Library to Programmatically Access GPT-3.5-turbo!

In [1]:
!pip install openai cohere tiktoken -q

In order to get started, we'll need to provide our OpenAI API Key - detailed instructions can be found [here](https://github.com/AI-Maker-Space/Interactive-Dev-Environment-for-LLM-Development#-setting-up-keys-and-tokens)!

In [2]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]
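
If the key is already present in the environment, prompting for it on every run is unnecessary. The following is a minimal sketch, assuming a hypothetical helper named `resolve_api_key` (it is not part of the `openai` library) that prefers an existing `OPENAI_API_KEY` value and only falls back to an interactive prompt:

```python
import getpass
import os
from typing import Callable, Mapping


def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass.getpass,
) -> str:
    """Return the key from `env` if set, otherwise ask for it interactively.

    Hypothetical helper for illustration; `env` and `prompt` are injectable
    so the function can be exercised without a terminal.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Nothing in the environment, so fall back to a hidden prompt.
        key = prompt("Please enter your OpenAI API Key: ").strip()
    if not key:
        raise ValueError("No OpenAI API key provided.")
    return key
```

In the notebook this could be used as `os.environ["OPENAI_API_KEY"] = resolve_api_key()`.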

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/authentication?lang=python) if you get stuck!

Let's create our first `ChatCompletion` to kick things off!

There are three "roles" available to use:

- `system`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://help.openai.com/en/articles/7042661-chatgpt-api-transition-guide).

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects a list of message objects, each with a `role` and a `content` key - so we'll be sure to do that!

In [3]:
from openai import OpenAI

client = OpenAI()

In [4]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-9px9ZlhEAI0U42Hmau1Ly3amDTwoR', choices=[Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content='LangChain and LlamaIndex are two completely different entities with different purposes:\n\n- LangChain is a language learning platform that uses blockchain technology to provide users with a secure and decentralized way to access language learning resources, connect with tutors, and earn rewards for their language learning efforts. It aims to revolutionize the way people learn languages by making the process more efficient, accessible, and rewarding.\n\n- LlamaIndex, on the other hand, is a financial index that tracks the performance of assets within the cryptocurrency market. It provides investors and traders with insight into how specific cryptocurrencies are performing against each other and against other financial assets. LlamaIndex helps users make informed investment decisions by providing them with real-time data and analysis of the ma

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll build some helper functions to pretty-print the returned responses and to wrap our messages, saving us a few extra characters of code!

##### Helper Functions

In [6]:
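from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list, model: str = "gpt-4o-mini") -> ChatCompletion:
    # Send the list of message dicts and return the full response payload.
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

# Each helper wraps a string in the {"role": ..., "content": ...} dict the API expects.
def system_prompt(message: str) -> dict:
    return {"role": "system", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(message: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook.
    display(Markdown(message.choices[0].message.content))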

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [7]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

LangChain and LlamaIndex are both frameworks designed to enhance the capabilities of large language models (LLMs) by providing tools and abstractions for building applications that utilize these models. However, they focus on different aspects and functionalities:

### LangChain
- **Purpose**: LangChain is primarily designed to facilitate the integration of LLMs into applications by providing a structure for building end-to-end applications that leverage natural language understanding and generation.
- **Features**:
  - **Chain Abstractions**: LangChain allows developers to create chains of prompts that can be executed in sequence, making it easier to manage complex workflows.
  - **Agents**: It includes built-in functionality for creating agents that can take actions based on the model's outputs.
  - **Memory**: LangChain provides ways to add memory to applications, allowing them to remember past interactions for more coherent conversations.
  - **Integration**: It offers integrations with various data sources, APIs, and external services, enabling applications to pull in contextual data.
  - **Multi-Model Support**: LangChain can work with different models and provide tools to switch between them as needed.

### LlamaIndex (previously known as GPT Index)
- **Purpose**: LlamaIndex focuses on providing a structured way to connect LLMs to external data sources, enabling efficient retrieval and indexing of information for query answering.
- **Features**:
  - **Data Indexing**: It allows users to create indices of documents or datasets that can be efficiently queried by the model.
  - **Retrieval-Augmented Generation (RAG)**: LlamaIndex facilitates RAG, where models are able to retrieve relevant information from an indexed dataset in real-time to enhance their responses.
  - **Support for Various Data Types**: It can handle both structured and unstructured data, making it versatile for different applications.
  - **Efficiency**: The focus is on optimizing the retrieval process so that LLMs can provide answers based on up-to-date or large volumes of external information.

### Summary
In summary, **LangChain** focuses on building applications and workflows around LLMs, offering features like chaining and memory, while **LlamaIndex** emphasizes data retrieval and indexing to enhance the model's ability to answer questions based on external information. Depending on your project requirements—whether it's building a full application or improving data handling with LLMs—you might choose one or the other or even combine them for a comprehensive solution.

Let's focus on extending this a bit, and incorporate a `system` message as well!

Again, the API expects our messages to be in a list - so we'll be sure to set up a list of messages!

>REMINDER: The system message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the system prompt.

In [8]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

I don't have preferences, especially when I'm this hungry! But honestly, whether it's crushed or cubed ice, what I really need right now is some food! Can we talk about something tasty instead?

Let's try that same prompt again, but modify only our system prompt!

In [9]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I think both have their own charm! Crushed ice is perfect for getting that refreshing chill in drinks, while cubed ice looks nice and lasts longer in a glass. What about you? Do you have a favorite?

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [10]:
print(joyful_response)

ChatCompletion(id='chatcmpl-9pxLqD2LJVfoOfmGpsezFZ5XWJyuG', choices=[Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content='I think both have their own charm! Crushed ice is perfect for getting that refreshing chill in drinks, while cubed ice looks nice and lasts longer in a glass. What about you? Do you have a favorite?', role='assistant', function_call=None, tool_calls=None), logprobs=None)], created=1722169902, model='gpt-4o-mini-2024-07-18', object='chat.completion', system_fingerprint='fp_ba606877f9', usage=CompletionUsage(completion_tokens=44, prompt_tokens=30, total_tokens=74))
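
The fields in the printed payload can be read as plain attributes. The sketch below assumes a hypothetical helper, `summarize_response`, and uses a `SimpleNamespace` stand-in that mirrors the attribute layout of the payload above so that it runs without an API call; with a real response you would pass `joyful_response` instead.

```python
from types import SimpleNamespace


def summarize_response(response) -> dict:
    """Pull the commonly used fields out of a chat completion payload.

    Hypothetical helper: it only relies on the attributes visible in the
    printed payload (choices, model, usage).
    """
    choice = response.choices[0]
    return {
        "text": choice.message.content,
        "finish_reason": choice.finish_reason,
        "model": response.model,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }


# Stand-in object (an assumption for illustration) with the same layout
# and token counts as the payload printed above.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            index=0,
            message=SimpleNamespace(
                role="assistant",
                content="I think both have their own charm!",
            ),
        )
    ],
    model="gpt-4o-mini-2024-07-18",
    usage=SimpleNamespace(completion_tokens=44, prompt_tokens=30, total_tokens=74),
)

summary = summarize_response(fake_response)
print(summary["model"], summary["total_tokens"])
```

Token counts like these are useful for tracking usage across calls, and `finish_reason` tells you whether the model stopped naturally or was cut off.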


### Few-shot Prompting

Now that we have a basic handle on the `system` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try and "teach" the model (`gpt-4o-mini`, the default in our `get_response` helper) some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [11]:
list_of_prompts = [
    user_prompt("Please use the words 'stimple' and 'falbean' in a sentence.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple design of the chair complemented the room, while the falbean curtains added a vibrant touch of color.

As you can see, the model is unsure what to do with these made-up words.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [12]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

"The stimple furniture in the workshop was perfectly complemented by the falbean that made assembling the pieces a breeze."

As you can see, leveraging the `assistant` role makes for a stimple experience!
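
The alternating `user`/`assistant` pattern used above generalizes to any number of worked examples. Here is a minimal sketch, assuming a hypothetical helper called `few_shot_messages` that turns (question, answer) pairs plus a final question into the message list the endpoint expects:

```python
from typing import Dict, Iterable, List, Tuple


def few_shot_messages(
    examples: Iterable[Tuple[str, str]],
    final_question: str,
) -> List[Dict[str, str]]:
    """Build a few-shot message list from (user, assistant) example pairs.

    Hypothetical helper for illustration: each pair becomes a `user` turn
    followed by the `assistant` turn we want the model to imitate.
    """
    messages: List[Dict[str, str]] = []
    for question, answer in examples:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    # The real question always goes last, as a `user` turn.
    messages.append({"role": "user", "content": final_question})
    return messages


stimple_messages = few_shot_messages(
    examples=[
        (
            "Something that is 'stimple' is said to be good, well functioning, "
            "and high quality. An example of a sentence that uses the word "
            "'stimple' is:",
            "'Boy, that there is a stimple drill'.",
        )
    ],
    final_question=(
        "A 'falbean' is a tool used to fasten, tighten, or otherwise is a "
        "thing that rotates/spins. An example of a sentence that uses the "
        "words 'stimple' and 'falbean' is:"
    ),
)
```

The resulting list can be passed straight to `get_response(client, stimple_messages)`.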

### Chain of Thought Prompting

We'll head one level deeper and explore the world of Chain of Thought prompting (CoT).

This is a process by which we can encourage the LLM to handle slightly more complex tasks.

Let's look at a simple reasoning-based example without CoT.

In [13]:
reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

To determine if it matters which travel option Billy selects, we need to compare the total travel times for both options, factoring in the time zones.

First, we confirm the current local time in San Francisco. San Francisco is in the Pacific Time Zone (PT), which is UTC-7 during Daylight Saving Time. Since it's currently 1 PM local time in San Francisco, that means it is 4 PM EDT (Eastern Daylight Time) in New York.

Now, let's evaluate the two options:

1. **Option 1 (Flying + Bus):**
   - Flight time: 3 hours
   - Bus time: 2 hours
   - Total travel time: 3 hours + 2 hours = 5 hours
   - Departure time from San Francisco: 1 PM PT
   - Arrival time in New York (EDT): 
     - 1 PM PT + 5 hours = 6 PM PT
     - Convert 6 PM PT to EDT: 6 PM PT + 3 hours = 9 PM EDT

2. **Option 2 (Teleporter + Bus):**
   - Teleportation time: 0 hours
   - Bus time: 1 hour
   - Total travel time: 0 hours + 1 hour = 1 hour
   - Departure time from San Francisco: 1 PM PT
   - Arrival time in New York (EDT): 
     - 1 PM PT + 1 hour = 2 PM PT
     - Convert 2 PM PT to EDT: 2 PM PT + 3 hours = 5 PM EDT

Now to see if either option gets Billy home before 7 PM EDT:
- **Option 1 arrives at 9 PM EDT,** which is after 7 PM EDT.
- **Option 2 arrives at 5 PM EDT,** which is before 7 PM EDT.

Thus, yes, it does matter which travel option Billy selects: **Choosing the teleporter followed by the bus is the only option that gets Billy home before 7 PM EDT.**

As humans, we can reason through the problem and pick up on the potential "trick": 1PM *local time* in San Fran. is 4PM EDT, so the cumulative travel time of 5hrs. for the plane/bus option would not get Billy home in time. The response above does account for the time difference, but models do not always catch this kind of detail.
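
The time-zone arithmetic behind this problem can be checked directly. The sketch below assumes daylight saving time is in effect (PDT is UTC-7, EDT is UTC-4) and picks an arbitrary date, since the problem only gives times of day:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assuming daylight saving time is in effect.
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")

# The date is arbitrary (an assumption); only the times come from the problem.
departure = datetime(2024, 7, 1, 13, 0, tzinfo=PDT)  # 1PM local time
deadline = datetime(2024, 7, 1, 19, 0, tzinfo=EDT)   # 7PM EDT

# Total travel time for each option described in the problem.
options = {
    "fly + bus": timedelta(hours=3 + 2),
    "teleporter + bus": timedelta(hours=0 + 1),
}

# Add the travel time, then express the arrival in Eastern time.
arrivals = {
    name: (departure + duration).astimezone(EDT)
    for name, duration in options.items()
}
on_time = {name: arrival < deadline for name, arrival in arrivals.items()}

for name, arrival in arrivals.items():
    print(f"{name}: arrives {arrival:%I:%M %p %Z}, on time: {on_time[name]}")
```

Working in timezone-aware datetimes makes the three-hour offset explicit instead of leaving it to mental arithmetic.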

Let's see if we can leverage a simple CoT prompt to improve our model's performance on this task:

In [14]:
list_of_prompts = [
    user_prompt(reasoning_problem + " Think through your response step by step.")
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

To determine if it matters which travel option Billy selects, we first need to establish the current time based on where Billy is, calculate the total travel times for each option, and then see if he can make it home before 7 PM EDT.

### Step 1: Identify current local time and travel timezone
- It is currently 1 PM local time in San Francisco (PDT, which is UTC-7). 
- To convert this to EDT (Eastern Daylight Time, which is UTC-4):
  - 1 PM PDT + 3 hours = 4 PM EDT.

### Step 2: Determine the deadlines
- Billy needs to reach home before 7 PM EDT.

### Step 3: Calculate total travel time for each option
**Option 1: Fly then take a bus**
- Flight time: 3 hours
- Bus time: 2 hours
- Total travel time = 3 hours (flight) + 2 hours (bus) = 5 hours

Since it's currently 1 PM PDT:
- Departure time from San Francisco = 1 PM PDT
- 5 hours of travel means Billy would arrive at:
  - 1 PM + 5 hours = 6 PM PDT 
- To convert 6 PM PDT to EDT:
  - 6 PM PDT + 3 hours = 9 PM EDT

**Option 2: Teleport then take a bus**
- Teleport time: 0 hours
- Bus time: 1 hour
- Total travel time = 0 hours (teleport) + 1 hour (bus) = 1 hour

Since it's still 1 PM PDT:
- Departure time from San Francisco = 1 PM PDT
- 1 hour of travel means Billy would arrive at:
  - 1 PM + 1 hour = 2 PM PDT 
- To convert 2 PM PDT to EDT:
  - 2 PM PDT + 3 hours = 5 PM EDT

### Step 4: Compare arrival times with the deadline
- For Option 1 (Fly + bus): Arrival time = 9 PM EDT
- For Option 2 (Teleport + bus): Arrival time = 5 PM EDT

### Conclusion
Yes, it does matter which travel option Billy selects. 

- If he takes the **fly then bus option**, he arrives at **9 PM EDT**, which is **after** the deadline.
- If he takes the **teleport then bus option**, he arrives at **5 PM EDT**, which is **before** the deadline.

Therefore, Billy should choose the teleport option to reach home before 7 PM EDT.

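With the addition of a single phrase `"Think through your response step by step."` the model lays out its reasoning as explicit, labelled steps. In this run both responses reach the same conclusion, but spelling out the steps makes the answer much easier to verify.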
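
Since the CoT instruction is just text appended to the prompt, it is easy to factor out. Here is a minimal sketch, assuming a hypothetical helper named `with_cot` that returns a `user` message with the phrase appended:

```python
COT_SUFFIX = "Think through your response step by step."


def with_cot(prompt: str, suffix: str = COT_SUFFIX) -> dict:
    """Return a `user` message with a step-by-step instruction appended.

    Hypothetical helper for illustration; trailing whitespace on the prompt
    is stripped so the suffix always starts on its own paragraph.
    """
    return {"role": "user", "content": f"{prompt.rstrip()}\n\n{suffix}"}


cot_message = with_cot("Does it matter which travel option Billy selects?\n")
print(cot_message["content"])
```

Keeping the suffix in one constant makes it simple to compare responses with and without it, or to experiment with different wordings.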

### Conclusion

Now that you're accessing `gpt-3.5-turbo` through an API, developer style, let's move on to creating a simple application powered by `gpt-3.5-turbo`!

You can find the rest of the steps in [this](https://github.com/AI-Maker-Space/Beyond-ChatGPT/tree/main) repository!

This notebook was authored by [Chris Alexiuk](https://www.linkedin.com/in/csalexiuk/)