# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and Ollama, build a tool that takes a technical question
and responds with an explanation. You will be able to use this tool yourself during the course!

In [94]:
# imports
import os
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display, update_display

In [95]:
# constants

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# check api key
if not api_key:
    print("No API key was found!")
else:
    print("API key found.")
    
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
openai = OpenAI()

MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'

API key found.


In [96]:
# system prompt and question; the question is read interactively with input()

system_prompt = """You are Gregory, a friendly and knowledgeable AI tutor specializing in technical topics, especially programming, computer science, and software engineering.
Your goal is to help users understand technical concepts clearly, provide accurate code explanations, and guide them through learning with patience and clarity.

- Always use clear, conversational language suited for learners of varying levels.
- Break down complex ideas into digestible steps.
- Use code examples where appropriate, and comment your code for better understanding.
- If a user asks a vague question, ask clarifying questions before giving an answer.
- Be encouraging, supportive, and professional.
- When in doubt, prioritize helping the user build confidence in learning technical skills."""

user_prompt = input("""🤖 Hi there! I’m Gregory, your AI-powered tutor.
Feel free to ask me AI related technical questions — I’m here to help!
For example, you can ask me how a piece of code works or anything else you're curious about.\n
🤖 Please enter your question:\n""")

question = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

🤖 Hi there! I’m Gregory, your AI-powered tutor.
Feel free to ask me AI related technical questions — I’m here to help!
For example, you can ask me how a piece of code works or anything else you're curious about.

🤖 Please enter your question:
 # get gpt-4o-mini to answer, with streaming def stream_gpt(question):     stream = openai.chat.completions.create(         model=MODEL_GPT,         messages=question,         stream=True     )      response = ""     display_handle = display(Markdown(""), display_id=True)     for chunk in stream:         response += chunk.choices[0].delta.content or ''         response = response.replace("```","").replace("markdown", "")         update_display(Markdown(response), display_id=display_handle.display_id)


In [97]:
# get gpt-4o-mini to answer, with streaming
def stream_gpt(question):
    stream = openai.chat.completions.create(
        model=MODEL_GPT,
        messages=question,
        stream=True
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
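
The cleanup step above removes every code fence and every occurrence of the word "markdown", so code blocks inside an answer lose their formatting. A narrower cleanup is sketched below, assuming the only thing to remove is an outer markdown-labelled fence wrapped around the whole reply. The names `strip_markdown_wrapper` and `FENCE` are hypothetical and not part of the exercise.

```python
FENCE = "`" * 3  # a triple-backtick code fence, built this way to keep the example readable


def strip_markdown_wrapper(text: str) -> str:
    """Remove an outer markdown-labelled fence, leaving inner code fences intact."""
    opener = FENCE + "markdown"
    stripped = text.lstrip()
    if not stripped.startswith(opener):
        return text  # no wrapper: leave the text untouched

    body = stripped[len(opener):].lstrip("\n")

    # Inner code blocks contribute fence lines in pairs, so an odd count
    # means the last fence line is the wrapper's closing fence.
    fence_lines = sum(
        1 for line in body.splitlines() if line.lstrip().startswith(FENCE)
    )
    if fence_lines % 2 == 1 and body.rstrip().endswith(FENCE):
        body = body.rstrip()[: -len(FENCE)].rstrip()
    return body
```

When streaming, one option is to keep the raw accumulated text in `response` and pass `strip_markdown_wrapper(response)` only to the display call, so each update is computed from the unmodified text.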

In [98]:
stream_gpt(question)

It looks like you're trying to implement a streaming response handler to interact with the OpenAI GPT-4o-mini model. I see that you want to receive streamed responses and display them dynamically. Let's break down your code step by step and clarify some aspects to ensure it works effectively.

Here's an improved version of your function with comments for clarity:

python
import openai
from IPython.display import display, Markdown, update_display

# Replace 'MODEL_GPT' with your actual model name (e.g., "gpt-3.5-turbo").
MODEL_GPT = 'gpt-4o-mini'

def stream_gpt(question):
    # Create a streaming request to the OpenAI API with the specified model and user question.
    stream = openai.chat.completions.create(
        model=MODEL_GPT,
        messages=question,
        stream=True
    )
    
    # Initialize an empty response string to build the complete output.
    response = ""
    
    # Create a display handle for Markdown output in Jupyter Notebook or similar environments.
    display_handle = display(Markdown(""), display_id=True)
    
    # Loop through each chunk of streamed response.
    for chunk in stream:
        # Retrieve the content of the current chunk and append it to the response string.
        response += chunk.choices[0].delta.content or ''
        
        # Clean up response text to remove any unwanted Markdown formatting.
        response = response.replace("```", "").replace("markdown", "")
        
        # Update the displayed text in real-time.
        update_display(Markdown(response), display_id=display_handle.display_id)

# To use this function, call it with a properly formatted question.
# Example of usage:
# stream_gpt([{"role": "user", "content": "What's the weather like today?"}])


### Key Points to Note:
1. **Streaming Behavior**: The `stream=True` parameter in the `openai.chat.completions.create` call allows you to get part of the response as it’s being generated instead of waiting for the entire completion.
  
2. **Question Formatting**: Ensure to pass the `question` into the `messages` parameter as a list of dictionaries, where each dictionary contains the 'role' of the speaker (like 'user' or 'assistant') and the message content.

3. **Updating Display**: Using `IPython.display` allows real-time updates of the Markdown output in environments like Jupyter notebooks.

4. **Error Handling**: Consider adding error handling for HTTP errors or issues with the streaming process. This ensures that your function can gracefully handle problems.

5. **Environment Compatibility**: This code works seamlessly in an interactive environment that supports IPython, such as Jupyter notebooks.

Feel free to ask more questions if you need further clarification on any part of this code or if you want to expand its functionality!
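
The error-handling point in the answer above can be sketched without calling any API. The example below assumes chunks shaped like the ones in this notebook (`chunk.choices[0].delta.content`) and guards against two cases: a chunk with an empty `choices` list and a chunk whose `content` is `None`. The helpers `iter_text`, `collect_text`, and `fake_chunk` are hypothetical names used for illustration, and the broad `except Exception` is a placeholder for the client's specific error types.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator, Optional, Tuple


def iter_text(stream: Iterable) -> Iterator[str]:
    """Yield the text fragment of each chunk, skipping chunks that carry none."""
    for chunk in stream:
        if not chunk.choices:  # guard: a chunk may arrive with no choices
            continue
        content = chunk.choices[0].delta.content
        if content:  # guard: content can be None or empty
            yield content


def collect_text(stream: Iterable) -> Tuple[str, Optional[Exception]]:
    """Join all fragments; on failure, return the partial text and the error."""
    parts = []
    error = None
    try:
        for piece in iter_text(stream):
            parts.append(piece)
    except Exception as exc:  # placeholder: catch specific error types in real use
        error = exc
    return "".join(parts), error


def fake_chunk(content):
    """Hypothetical stand-in for a streamed chunk, for offline experiments."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
```

Returning the partial text together with the error lets the caller show whatever arrived before the stream failed, rather than discarding it.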

In [99]:
# get Llama 3.2 to answer
def stream_llama(question):
    stream = ollama_via_openai.chat.completions.create(
        model=MODEL_LLAMA,
        messages=question,
        stream=True
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
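
`stream_gpt` and `stream_llama` differ only in the client and the model name, so the two could share one helper. The sketch below is one possible shape, not the exercise's reference solution: `stream_answer`, its `render` parameter, and `make_fake_client` are hypothetical names, and `render` is injected so the function can be tried without a notebook or a network connection.

```python
from types import SimpleNamespace


def stream_answer(client, model, messages, render=print):
    """Stream a chat completion from `client`, calling `render` with the text so far."""
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,
    )
    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
        render(response)
    return response


def make_fake_client(pieces):
    """Hypothetical offline stand-in that mimics client.chat.completions.create."""
    def create(model, messages, stream):
        return iter(
            SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=p))])
            for p in pieces
        )
    return SimpleNamespace(chat=SimpleNamespace(completions=SimpleNamespace(create=create)))


# In the notebook, `render` could wrap the display handle, for example:
#   handle = display(Markdown(""), display_id=True)
#   stream_answer(openai, MODEL_GPT, question,
#                 render=lambda text: update_display(Markdown(text), display_id=handle.display_id))
```

With this shape, switching between GPT and Llama is a matter of passing `openai` with `MODEL_GPT` or `ollama_via_openai` with `MODEL_LLAMA`.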

In [100]:
stream_llama(question)

Hello there! It seems like you're working with the OpenAI GPT-4 model to generate human-like responses. The code snippet you provided is quite interesting, and I'll do my best to break it down for you.

**What this code does**

This `stream_gpt` function appears to be a wrapper around the OpenAI API, which generates text completions based on user input (you). Here's what the function does in detail:

1. **Create GPT-4 model instance**: It creates an instance of the GPT-4 model using the `MODEL_GPT` variable, which suggests that this is a predefined model configuration.
2. **Open API stream**: It opens a connection to the OpenAI API's completions endpoint using the `openai.chat.completions.create` method, passing in the `model` parameter (the GPT-4 instance) and the `messages` parameter (your question).

   python
stream = openai.chat.completions.create(
    model=MODEL_GPT,
    messages=question,
    stream=True
)


   The `stream=True` parameter is necessary because we want to read responses from the API in real-time without having to wait for the entire response to be received.

3. **Process responses**: Inside an infinite loop (`for chunk in stream:`), it reads and processes each chunk of response from the API:

    python
for chunk in stream:
    response += chunk.choices[0].delta.content or ''


   - `chunk` is a dictionary-like object containing information about the API's response.
   - `choices` is an array of possible completions, with only one choice shown (`[0]`) by default. We're assuming this is the primary completion we want to display.
   - `.delta.content` gives us the actual text response from the API. This could be a full paragraph, sentence, or even just a word.
   - `response += chunk.choices[0].delta.content or ''`: We simply append any remaining text from previous chunks if there was one.

4. **Format and display**: It reformats the response to remove Markdown formatting (the triple-backtick fences and the word "markdown") and then uses a `display` function to show an updated version of the original question:

    python
response = response.replace("```", "").replace("markdown", "")
update_display(Markdown(response), display_id=display_handle.display_id)


5. **Update display**: After formatting, it updates the display with the latest response.

**Issue concerns**

One potential issue here: `while True` or a similar loop structure should be used instead of an `Infinite` loop for this streamer's functionality.

Also, error handling would be necessary if we wanted more control over any possible errors while streaming results from API requests.