<a href="https://colab.research.google.com/github/zxie52/AI_workshop_2024Summer/blob/main/2_1_programmatic_llm_elements.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<a target="_blank" href="https://colab.research.google.com/github/vanderbilt-data-science/ai_summer/blob/main/2_1-programmatic-llm-elements.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Programmatic LLM Elements
> For Vanderbilt University AI Summer 2024. Prepared by Dr. Charreau Bell.

_Code versions applicable: May 13, 2024_

## Learning Outcomes:
* Participants will be able to explain the core messaging elements of programmatic interaction with ChatLLMs, specifically system messages, user messages, and assistant messages.
* Participants will be able to explain the programmatic requirements of conversational AI with completions-like APIs in relationship to context and chat history.
* Participants will be able to integrate additional parameters of LLM API calls to modify the default behavior of the LLMs.
* Participants will understand the usage of APIs within demonstrative applications.

### Setup
Here, we'll prepare the coding environment with packages and API keys. This notebook assumes Google Colab.

In [None]:
# install from pypi
! pip install openai gradio



In [None]:
# standard practice is to put all imports at the top, but for learning purposes they are distributed throughout the notebook
import os

In [None]:
# make available to python
from google.colab import userdata

Here, you need to make sure that you have added your OpenAI API key:
* Go to: https://platform.openai.com
* Make sure that you are logged in
* Go to API keys in the sidebar
* Follow the instructions to create a new key
* SAVE YOUR API KEY, AS YOU WILL NEVER SEE IT WRITTEN AGAIN
* Add it to the Colab sidebar with the name `OPENAI_API_KEY`

In [None]:
# set environment variable to be used by the OpenAI client
# the name passed to userdata.get must match the name saved in the Colab sidebar
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')
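
If you run this notebook outside Colab, or simply want to confirm that the key was picked up, a small check can fail early with a readable message. This is a minimal sketch, assuming the key lives in the `OPENAI_API_KEY` environment variable; the helper names `require_api_key` and `mask` are hypothetical and not part of the OpenAI library.

```python
import os

def require_api_key(name="OPENAI_API_KEY"):
    """Return the key stored in the environment, or fail early with a clear message."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your environment and rerun the setup cell")
    return key

def mask(key, visible=4):
    """Hide all but the last few characters so the key is never printed in full."""
    return "*" * max(len(key) - visible, 0) + key[-visible:]

# Example with a made-up value (not a real key)
print(mask("abcdef12"))
```

Calling `mask(require_api_key())` after the setup cell confirms that a key is present without exposing it in the notebook output.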

## Programmatic LLM API Elements
Resources:
* [OpenAI API Reference](https://platform.openai.com/docs/overview)
* [Direct Link to Completions Quickstart](https://platform.openai.com/docs/quickstart)

### Get completion: From OpenAI Quickstart

In [None]:
# Copy from Quickstart
from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)

print(completion.choices[0].message)


ChatCompletionMessage(content='Hello! How can I assist you today?', role='assistant', function_call=None, tool_calls=None)


In [None]:
# view structure of completion
completion.model_dump()

{'id': 'chatcmpl-9ORedycPNvmWXsUA3MZPd804xLr12',
 'choices': [{'finish_reason': 'stop',
   'index': 0,
   'logprobs': None,
   'message': {'content': 'Hello! How can I assist you today?',
    'role': 'assistant',
    'function_call': None,
    'tool_calls': None}}],
 'created': 1715613203,
 'model': 'gpt-3.5-turbo-0125',
 'object': 'chat.completion',
 'system_fingerprint': None,
 'usage': {'completion_tokens': 9, 'prompt_tokens': 19, 'total_tokens': 28}}

In [None]:
# obtain the string content of the completion
print(completion.choices[0].message.content)

# save into variable just for convenience
completion_1 = completion.choices[0].message.content
print(completion_1)

Hello! How can I assist you today?
Hello! How can I assist you today?


### Updating the conversational context from response
Tip: If developing locally, debuggers are your best friend!

In [None]:
completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."},
    {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."},
    # add updates to conversation here
    # completion_1 is the greeting from the first call, so it does not answer the poem request
    {"role": "assistant", "content": completion_1},
    {"role": "user", "content": "You didn't tell me anything about recursion"}
  ]
)

In [None]:
# view the response
completion.model_dump()

{'id': 'chatcmpl-9ORjiMg1t3xFSrsQNJTIHb1I9s4Q0',
 'choices': [{'finish_reason': 'stop',
   'index': 0,
   'logprobs': None,
   'message': {'content': "On wings of code, a tale unfolds,\nA concept deep, a story told.\nRecursion dances, in loops it twirls,\nUnleashing magic, in the coding world.\n\nA function calls itself, brave and bold,\nInto the abyss of tasks untold.\nLike a mirror reflecting its own gaze,\nRecursion loops in a mystic maze.\n\nWith each call, a problem is split,\nInto smaller pieces, bit by bit.\nUntil a base case, like a shining star,\nEnds the journey, near or far.\n\nInfinite patterns, in a loop's embrace,\nRecursion reveals its wondrous grace.\nA never-ending dance, a poetic rhyme,\nIn the realm of code, through space and time.",
    'role': 'assistant',
    'function_call': None,
    'tool_calls': None}}],
 'created': 1715613518,
 'model': 'gpt-3.5-turbo-0125',
 'object': 'chat.completion',
 'system_fingerprint': None,
 'usage': {'completion_tokens': 146, 'promp

In [None]:
# just the text
completion_2 = completion.choices[0].message.content
print(completion_2)


On wings of code, a tale unfolds,
A concept deep, a story told.
Recursion dances, in loops it twirls,
Unleashing magic, in the coding world.

A function calls itself, brave and bold,
Into the abyss of tasks untold.
Like a mirror reflecting its own gaze,
Recursion loops in a mystic maze.

With each call, a problem is split,
Into smaller pieces, bit by bit.
Until a base case, like a shining star,
Ends the journey, near or far.

Infinite patterns, in a loop's embrace,
Recursion reveals its wondrous grace.
A never-ending dance, a poetic rhyme,
In the realm of code, through space and time.


### On your own: Continuing the conversation
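Now, continue the conversation based on the previous response. Your new question should depend on the topic of the previous response without naming it explicitly, so that the model can only answer correctly if the chat history is included. For example, if the previous response had been about Python dictionaries, you might ask: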
* "Show me an example of a comprehension based on this."
* "How does this differ from a python list?"

View the response as a string, and not the total response object.

In [None]:
# Continue the conversation
completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."},
    {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."},
    {"role": "assistant", "content": completion_1},
    # the history must repeat the earlier turns as they were actually sent
    {"role": "user", "content": "You didn't tell me anything about recursion"},
    {"role": "assistant", "content": completion_2},
    {"role": "user", "content": "Can you give a code example on this?"}
  ]
)

# view the response as a string
print(completion.choices[0].message.content)

Of course! Here's a Python code snippet that demonstrates recursion with a function that calculates the factorial of a number:

```python
def factorial(n):
    # Base case: if n is 0 or 1, return 1
    if n == 0 or n == 1:
        return 1
    else:
        # Recursive call: n! = n * (n-1)!
        return n * factorial(n - 1)

# Calling the factorial function with n = 5
result = factorial(5)
print(result)  # Output: 120
```

In this example, the `factorial` function recursively calculates the factorial of a number `n` by multiplying `n` with the factorial of `n-1`. The recursion continues until it reaches the base case of `n = 0` or `n = 1`, where the function returns 1 to terminate the recursive calls.
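
The cells above all follow the same pattern: the API keeps no memory between calls, so every request resends the system message and each earlier user/assistant turn, followed by the new user message. A minimal sketch of that bookkeeping, with no API call involved; the helper name `build_messages` and the follow-up question are hypothetical.

```python
def build_messages(system_message, past_turns, new_user_message):
    """Assemble the full message list to send with the next request.

    past_turns is a list of (user_text, assistant_text) pairs, oldest first.
    """
    messages = [{"role": "system", "content": system_message}]
    for user_text, assistant_text in past_turns:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": new_user_message})
    return messages

# One completed turn plus a follow-up that only makes sense with the history attached
messages = build_messages(
    "You are a helpful assistant.",
    [("Hello!", "Hello! How can I assist you today?")],
    "Can you say that more formally?",
)
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

The resulting list can be passed directly as the `messages` argument of `client.chat.completions.create`.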


## APIs: An exploration using OpenAI's Chat Completions API
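APIs are contracts between developers and services: the API reference defines which parameters a request may include and what structure the response will have.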

### APIs: Example 1 - Limiting Tokens

In [None]:
conversation_messages = [
    {"role": "system", "content": "You are a fantastic storyteller and are an expert in telling long, engaging, highly descriptive, imaginative stories."},
    {"role": "user", "content": "Tell me a story about an extremely mischievous cat who finds creative places to hide whenever his mommy needs to cut his nails."}
]

full_token_completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=conversation_messages,
  # add other parameters
  max_tokens=None
)

full_token_completion.model_dump()

{'id': 'chatcmpl-9OSFhGUshN2RUTCCKWVivclJGJOn0',
 'choices': [{'finish_reason': 'stop',
   'index': 0,
   'logprobs': None,
   'message': {'content': 'Once upon a time, in a cozy little cottage at the edge of a bustling town, there lived a mischievous cat named Whiskers. Whiskers was a fluffy orange tabby with bright green eyes and a playful spirit that often got him into trouble. However, his most dreaded moment was when his mommy, Mrs. Jenkins, needed to trim his sharp claws.\n\nMrs. Jenkins loved Whiskers dearly and always tried to keep his claws trimmed to prevent any accidents around the house. But the moment Whiskers saw the nail clippers in her hand, he knew it was time to play his favorite game – the hiding game.\n\nAs soon as Whiskers caught sight of the nail clippers, he would dart off, his paws thudding softly on the wooden floors as he sought out the perfect hiding spot. Mrs. Jenkins would call out to him, "Whiskers, it\'s time for your nail trim," but Whiskers paid no mind

In [None]:
# show reason for ending
full_token_completion.choices[0].finish_reason

'stop'

In [None]:
# add in a specific value for the max_tokens parameter
max_token_completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=conversation_messages,
  # add other parameters
  max_tokens = 20
)

max_token_completion.model_dump()

{'id': 'chatcmpl-9OSI5ZGvDkgDwbp7XjrAD8J1l87M3',
 'choices': [{'finish_reason': 'length',
   'index': 0,
   'logprobs': None,
   'message': {'content': 'In a quaint little town nestled between rolling green hills and babbling brooks, there lived a cat',
    'role': 'assistant',
    'function_call': None,
    'tool_calls': None}}],
 'created': 1715615649,
 'model': 'gpt-3.5-turbo-0125',
 'object': 'chat.completion',
 'system_fingerprint': None,
 'usage': {'completion_tokens': 20, 'prompt_tokens': 59, 'total_tokens': 79}}

In [None]:
# show reason for ending
max_token_completion.choices[0].finish_reason

'length'
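
The `finish_reason` field is what distinguishes a complete answer (`'stop'`) from one cut off by `max_tokens` (`'length'`). A minimal sketch of checking it before using a response; the helper name `is_truncated` is hypothetical, and a stand-in object replaces a real API response so the example runs without a key.

```python
from types import SimpleNamespace

def is_truncated(completion, choice_index=0):
    """Return True if the chosen response stopped because it hit the token limit."""
    return completion.choices[choice_index].finish_reason == "length"

# Stand-in with the same attribute layout as the responses shown above (not a real API object)
fake_completion = SimpleNamespace(
    choices=[
        SimpleNamespace(
            index=0,
            finish_reason="length",
            message=SimpleNamespace(role="assistant", content="In a quaint little town"),
        )
    ]
)

if is_truncated(fake_completion):
    print("Response was cut off; consider raising max_tokens.")
```

With a live response, `is_truncated(max_token_completion)` would perform the same check on the completion from the cell above.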

### APIs: Example 2 - Drilldown Inputs
Sometimes you'll need to navigate deep into the API to get the information you need.

In [None]:
# Try completions with images
from openai import OpenAI

client = OpenAI()

# gpt-3.5-turbo does not accept image input, so a vision-capable model is assumed here (e.g., gpt-4o)
image_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    # image_url must be an object with a "url" key; passing the URL as a bare string
                    # raises BadRequestError (400): "expected an object, but got a string instead"
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

image_response.model_dump()
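
The API reference specifies `image_url` as a nested object with a `url` key rather than a bare string. A minimal sketch of a helper that builds a user message in that shape, with no API call; the helper name `image_message` is hypothetical, the example URL is a placeholder, and the model the message is sent to must support image input.

```python
def image_message(prompt, image_url):
    """Build a user message that pairs a text prompt with an image reference."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # the URL is wrapped in an object, as the API expects
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

# Placeholder URL for illustration only
message = image_message("What's in this image?", "https://example.com/boardwalk.jpg")
print(message["content"][1])
```

The returned dictionary can be placed directly in the `messages` list of a chat completions request.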

### Breakout Room: Chat Completions API (10 minutes)
In this breakout room, you're going to explore implementing different parameters of the Chat Completions API. It has lots of functionality, and you can explore it based on your interests and those of your lab/group partners.

In your breakout room:
1. Choose a group leader to share your screen, and choose a writer who will document the results of the exploration.

**[Creating Chat Completions](https://platform.openai.com/docs/api-reference/chat/create)**
  * Read over all of the parameters that can be used in the request. Other than messages and model, which 3 parameters seem most interesting/useful for your purposes? Why?
  * Choose a straightforward parameter of interest (e.g., temperature, seed, presence_penalty) and test it out with a simple prompt. What happens when you change the value of this parameter?
  * [If time] Repeat the above with another straightforward parameter of interest.
  * [Optional] Repeat the above with another parameter of interest, particularly if you are interested in `logprobs`, `response_format`, etc. We will work with tools another day.

**[The chat completion object](https://platform.openai.com/docs/api-reference/chat/object)**
  * What is the structure of choices? How would you access choice responses for n>1? Implement this below.
  * What is the system fingerprint? In what cases do you think it would be useful?

**[Extra Credit: The Moderation API](https://platform.openai.com/docs/api-reference/moderations)**
  * What is the purpose of the moderation API? How might it be useful in a chatbot context?
  * What models are available to do moderation? Do they appear to be GPT (LLM-based) models based on the models available and language of the API? Justify your answer and speculate in its implications/reasoning.
  * Examine the example request given in the API documentation and try it out here.

#### Sample Code Solutions

In [None]:
# chat completions implementation 1, object 1
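
One possible solution for the `n>1` question: when a request sets `n`, the response's `choices` list holds one entry per generated alternative, each with its own `index`, `message`, and `finish_reason`. A minimal sketch of collecting every alternative's text; the helper name `choice_texts` and the sample strings are hypothetical, and the stand-in object only mimics the attribute layout of a real response.

```python
from types import SimpleNamespace

def choice_texts(completion):
    """Return the text of every choice, ordered by each choice's index."""
    ordered = sorted(completion.choices, key=lambda choice: choice.index)
    return [choice.message.content for choice in ordered]

# With a live client, the request would look like this (requires an API key):
# completion = client.chat.completions.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": "Suggest a name for a cat."}],
#     n=2,
# )

# Stand-in response with two alternatives, deliberately listed out of order
fake_completion = SimpleNamespace(
    choices=[
        SimpleNamespace(index=1, finish_reason="stop",
                        message=SimpleNamespace(content="second alternative")),
        SimpleNamespace(index=0, finish_reason="stop",
                        message=SimpleNamespace(content="first alternative")),
    ]
)

for position, text in enumerate(choice_texts(fake_completion)):
    print(position, text)
```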


In [None]:
# using moderations API (call your variable `moderation` and the code below will work)
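
One possible solution: `client.moderations.create(input=...)` returns an object whose `results[0].categories` holds a boolean per category and whose `results[0].category_scores` holds the matching scores. A minimal sketch that lists the flagged categories; the helper names `get_moderation` and `flagged_categories` are hypothetical, and the stand-in classes only mimic the parts of the response used here.

```python
from types import SimpleNamespace

def get_moderation(client, text):
    """Send text to the moderation endpoint (needs an API key when actually called)."""
    return client.moderations.create(input=text)

def flagged_categories(moderation):
    """Return the sorted names of categories marked True in the first result."""
    categories = moderation.results[0].categories.model_dump()
    return sorted(name for name, is_flagged in categories.items() if is_flagged)

class FakeCategories:
    """Stand-in for the categories object; only model_dump() is imitated."""
    def __init__(self, values):
        self._values = values

    def model_dump(self):
        return dict(self._values)

# Made-up flags for illustration, not real moderation output
fake_moderation = SimpleNamespace(
    results=[SimpleNamespace(categories=FakeCategories({"harassment": True, "violence": False}))]
)
print(flagged_categories(fake_moderation))
```

With a live client, `moderation = get_moderation(client, "some text to check")` defines the `moderation` variable that the viewing helper expects.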


In [None]:
# moderations viewing helper
import pandas as pd
df = pd.DataFrame({'category':moderation.results[0].categories.model_dump(),
                   'score':moderation.results[0].category_scores.model_dump()})
df.index = df.index.str.replace('/', '_')
df.drop_duplicates().sort_values(by=['category', 'score'], ascending=False)

## Foundational behavior vs application behavior
The code above is the foundational programmatic access to LLMs (here, through OpenAI). Application behavior comes from the code you write around those calls, such as tracking the chat history and presenting a user interface.

### "Command line" interaction
Below is the functionality that keeps track of the history and makes the API calls.

In [None]:
# This is an example of a function that can help to update chat history
def get_assistant_response(user_message, llm_chat_history, update_chat_history=True, model_name = "gpt-3.5-turbo"):

    ## completions as normal
    completion = client.chat.completions.create(
        model=model_name,
        messages=llm_chat_history + [{'role': 'user', 'content': user_message}]
    )

    ## update chat history if desired
    if update_chat_history:
        new_messages = [{'role':'user', 'content': user_message},
                        {'role': 'assistant', 'content': completion.choices[0].message.content}]
        llm_chat_history.extend(new_messages)

    ## return the response and updated chat history
    return completion.choices[0].message.content, llm_chat_history

The code below demonstrates our initialization and our "user interface". Best practice is to separate functionality from appearance.

In [None]:
# create initial chat history
system_message = 'You are a helpful assistant. Be brief, succinct, and clear in your responses. Only answer what is asked.'
openai_chat_history = [{'role': 'system', 'content': system_message}]

# wait for first input
print('Begin chatting with the LLM!')
user_input = input("You: ")

# continue chatting until user types 'exit'
while user_input != "exit":
    print('User: ', user_input)
    response, openai_chat_history = get_assistant_response(user_input, openai_chat_history)
    print("Assistant:", response)
    user_input = input("You: ")

### UI Interaction
We can also use other programming elements (classes) to help us maintain state and to format the conversation for gradio, a streamlined library for developing UIs.

In [None]:
class OpenAIChatClient:
    def __init__(self, model="gpt-3.5-turbo", system_message="You are a helpful assistant."):
        self.client = OpenAI()  # assumes the API key is set in the OPENAI_API_KEY environment variable
        self.model = model

        self.system_message = system_message
        self.messages = [{"role": "system", "content": self.system_message}]  # Message list for OpenAI format
        self.conversations = []  # Message list for gradio format


    def add_user_message(self, text):
        self.messages.append({"role": "user", "content": text})
        return self.call_completions_api()

    def call_completions_api(self):
        # send the full OpenAI-format history with every request
        response = self.client.chat.completions.create(
            model=self.model,
            messages=self.messages
        )

        # record the reply in both the OpenAI format and the gradio (user, assistant) tuple format
        assistant_response = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": assistant_response})
        user_message = self.messages[-2]['content']
        self.conversations.append((user_message, assistant_response))

        return assistant_response

    def get_conversation(self):
        return self.messages

    def get_formatted_conversations(self):
        return self.conversations

    def pretty_print_conversation(self):
        return '\n'.join([f"{message['role'].capitalize()}: {message['content']}" for message in self.messages])

    def reset_conversation(self, system_message = None):

        # add a new system message if provided
        if system_message:
            self.system_message = system_message

        # reset the messages and conversations
        self.messages = [{"role": "system", "content": self.system_message}]
        self.conversations = []

We will now explore the behavior without the UI, and then with the UI.

In [None]:
# create OpenAI Client with history management
custom_chat_client = OpenAIChatClient()

In [None]:
# Simulate first interaction
_ = custom_chat_client.add_user_message("Hello!")  # placeholder prompt; substitute your own

# Do some formatting to make the printing prettier
print(custom_chat_client.pretty_print_conversation())


In [None]:
# Simulate second interaction
_ = custom_chat_client.add_user_message("Create a minimal working user interface with the Blocks implementation (not Interface) in Python.")

# Do some formatting to make the printing prettier
print(custom_chat_client.pretty_print_conversation())

#### Integrate with gradio
Gradio is another library whose API is new enough that generative AI tools often get it wrong. Here, I usually just copy/paste/adapt what I need from the API reference.

[Example ChatBot](https://www.gradio.app/guides/creating-a-chatbot-fast)

Basic story:
* Wherever you see `chatbot_history`, that basically represents the message history.
* Message history is formatted as a list of `(user_message, bot_message)` tuples. This is why the class above also stores the conversation in that format (`self.conversations`).
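
The relationship between the two formats can be sketched as a conversion from the OpenAI-style message list to the list of `(user_message, bot_message)` tuples. This is a minimal sketch, assuming strictly alternating user/assistant turns after the system message; the helper name `to_gradio_pairs` is hypothetical and not part of gradio.

```python
def to_gradio_pairs(messages):
    """Convert OpenAI-format messages into (user_message, bot_message) tuples.

    System messages are skipped, and a trailing user message without a reply is dropped.
    """
    pairs = []
    pending_user = None
    for message in messages:
        if message["role"] == "user":
            pending_user = message["content"]
        elif message["role"] == "assistant" and pending_user is not None:
            pairs.append((pending_user, message["content"]))
            pending_user = None
    return pairs

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I assist you today?"},
]
print(to_gradio_pairs(history))
```

The class above avoids this conversion by appending to both lists at the moment each reply arrives.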

In [None]:
# reset conversation
custom_chat_client.reset_conversation()

In [None]:
import gradio as gr

def respond(message, chat_history):
    # the client tracks history itself, so the chat_history argument is not used
    custom_chat_client.add_user_message(message)
    return '', custom_chat_client.get_formatted_conversations()

with gr.Blocks() as demo:
    chatbot_history = gr.Chatbot()
    msg_textbox = gr.Textbox()
    reset_button = gr.ClearButton([msg_textbox, chatbot_history])  # clears the UI components only; it does not reset custom_chat_client's history

    msg_textbox.submit(respond, inputs=[msg_textbox, chatbot_history], outputs=[msg_textbox, chatbot_history])

demo.launch()

# Homework
Congratulations! You've made it through an incredible firehose introduction to implementing LLMs programmatically! For homework, you're tasked with gaining more depth into several of the concepts.

## Required Exercises
### Learning more about LLM Platforms
Read and summarize the following short articles to gain more background on the OpenAI API:
* [OpenAI API Documentation Introduction](https://platform.openai.com/docs/introduction)
* [Text Generation](https://platform.openai.com/docs/guides/text-generation) (Note: You can skip `Completions API (Legacy)`)
* [Moderation](https://platform.openai.com/docs/guides/moderation/overview)

### Practice with the OpenAI API using Prompt Engineering
* Read this short section on [Prompt Engineering by AWS](https://catalog.us-east-1.prod.workshops.aws/workshops/a4bdb007-5600-4368-81c5-ff5b4154f518/en-US/050-prompt-engineering), focusing **ONLY** on the prompt patterns introduced.
* Use the OpenAI API to implement each of the patterns. Hint: This will look like the `client.chat.completions.create` cells that we have above, but with modified user messages.
* How might some of these patterns benefit you in your project efforts?

### Testing Your Understanding of LLM Concepts using a different ChatLLM API
#### Implement the 3-turn conversation with Google's Gemini API
Now, you will really test your mettle by implementing the 3-turn conversation at the beginning of this notebook using a completely different API: Google's Gemini API. Make sure to implement it as "multi-turn"; in other words, make sure that the full chat history is sent as context with each request. There will be differences between this API and the OpenAI implementation. Following the same steps as above, starting from authentication, implement the interaction with the prompts above. Below are some hints/steps to help if needed.

<details>
<summary>  Show Hints </summary>
<div style="margin-left: 20px;">
    <details>
        <summary>Step 1</summary>
        <p>Locate the Gemini API. You can find the platform overview <a href=https://ai.google.dev/>here</a>, or the API documentation <a href=https://ai.google.dev/gemini-api/docs>here</a></p>
    </details>
    <details>
        <summary>Step 2</summary>
        <p><strong> Create your API key.</strong> Read the <a href=https://ai.google.dev/gemini-api/docs/quickstart>Quick Start</a>. Note that this Quick Start also has a button where you can "Run in Google Colab". That's a great place to start. You can either follow the instructions in the Quick Start, or open the Colab notebook and follow the instructions from there.</p>
    </details>
    <details>
        <summary>Step 3</summary>
        <p> <strong> Set up your Colab environment with API keys. </strong> If you haven't already, open up the Colab notebook in the <a href=https://ai.google.dev/gemini-api/docs/quickstart>Quick Start</a> (or create your own Colab notebook and copy the cells, making sure you understand them). Don't forget to use the Colab key in the sidebar. That's where you can set your API key.</p>
    </details>
    <details>
        <summary>Step 4</summary>
        <p> <strong> Read the API overview and implement the steps, paying particular attention to the multi-turn conversation. </strong> Sidebars are your friend. Find the API Overview and begin reading to learn more and figure out what you need to do.</p>
    </details>
    <details>
        <summary>Step 5</summary>
        <p> Once you understand what you're doing, write/adapt the code so that it does what you expect. Verify the behavior.</p>
    </details>
    <details>
        <summary>Hint</summary>
        <p> If all else fails, you always have ChatGPT/Gemini/ChatLLMofChoice to help you! You have the code on the Gemini API page - use it in your context and ask questions about it to help you come to the answer.</p>
    </details>
</div>
</details>

#### Explore Differences
Now that you've implemented the functionality, describe the similarities/differences that you see in:
* API Key usage
* API implementation code structure (e.g., OpenAI used `client.chat.completions.create` - what about Google?)
* Model selection
* Usage of role-based prompts (i.e., "user", "system", etc)
* Implementation of chat history
* Overall programmatic feel
* Anything else you observe

## Optional Exercises
1. We briefly went over applications/interaction using the OpenAI API. Use ChatGPT/Colab AI/ChatLLMofChoice to make sure that you understand the code.
2. Ignore the class and function that were created for interaction. Use generative AI (or not) to help you create your own code to maintain chat history when needed.
3. Read more about [Gradio](https://www.gradio.app/) (you know what to do! Quickstart -> API Docs) and add other components to the user interface. You can start by using the `reset_button` and adding functionality to reset the chat LLM so it is just the default system message.