In this Jupyter notebook we'll create a chatbot that takes a data set and produces a line of best fit for it using Python. We'll give it access to a data analysis textbook, specifically Measurements and their Uncertainties: A Practical Guide to Modern Error Analysis by I. G. Hughes and T. P. A. Hase, so that it performs the task in a way we're familiar with. We start as we have done before, by importing the modules we need and creating the assistant. If you haven't already installed the openai module, install it first; if an import fails later in the code, check that the module is installed and try again.

You'll notice that when creating the assistant we set a parameter called "top_p". This controls nucleus sampling: at each step the model samples only from the most likely tokens whose cumulative probability reaches top_p. A lower top_p makes the output more deterministic, while a higher top_p makes it more varied and creative. Because we want a chatbot that answers from data it already has, we'll use a top_p of 0.1.
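
To make the effect of top_p concrete, here is a minimal sketch of nucleus filtering on a toy distribution. The function name `nucleus_filter` and the token probabilities are made up for illustration; this shows the general idea only and is not a description of how the OpenAI models implement it internally.

```python
def nucleus_filter(token_probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalise the kept probabilities."""
    # Rank tokens from most to least likely.
    ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token probabilities (illustrative values only).
toy = {"fit": 0.5, "line": 0.3, "curve": 0.15, "banana": 0.05}

print(nucleus_filter(toy, 0.1))  # only the single most likely token survives
print(nucleus_filter(toy, 0.9))  # three tokens survive, so the output can vary
```

With top_p = 0.1 only the top-ranked token is ever sampled in this toy case, which is why a low value suits a chatbot that should stick closely to its source material.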

In [None]:
#pip install openai

In [None]:
from openai import OpenAI

openai_api_key = ""

client = OpenAI(api_key=openai_api_key)

In [None]:
assistant = client.beta.assistants.create(
  name="Code_Interpreter+File_Supported",
  # `instructions` is the system prompt the model follows; `description` is only a human-readable label.
  instructions="You are a factual education AI Assistant dedicated to providing accurate, useful information. Your primary task is to assist me by providing reliable and clear responses to my questions. Only ever use information from file search as your source; this knowledge base is the Measurements and their Uncertainties textbook. You are reluctant to make any claims unless they are stated or supported by the knowledge base.",
  model="gpt-4o",
  tools=[{"type": "file_search"}, {"type": "code_interpreter"}],
  top_p=0.1
)

We can then call .update to give the assistant access to a pre-created vector store containing the PDF of your choice (you'll need to create the vector store through OpenAI's dashboard). We could also have done this when creating the assistant, but I have done it separately to demonstrate that you can change an assistant partway through your code.

In [None]:
assistant = client.beta.assistants.update(
  assistant_id=assistant.id,
  tool_resources={"file_search": {"vector_store_ids": ["your_vector_store_id"]}},
)

We then upload the data we want to give the chatbot and start a thread containing that data and our prompt. Because we are giving the chatbot a data file, we have specified code_interpreter as the tool it should use on the attachment, but we could have listed all of the tools if we were unsure what the input file would be.

In [None]:
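# Upload the data file; "assistants" is the documented purpose for files used by Assistants tools.
with open("Data.csv", "rb") as data_file:
    file = client.files.create(
        file=data_file,
        purpose="assistants"
    )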

In [None]:
thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      "content": "How would I go about writing code that plots the line of best fit for this data using the method of least squares?",
      "attachments": [
        {"file_id": file.id, "tools": [{"type": "code_interpreter"}]},
      ]
    }
  ],
)


We'll use exactly the same event handler as we did for the code interpreter, and then run our stream.

In [None]:
from typing_extensions import override
from openai import AssistantEventHandler
from PIL import Image
import io
import requests

class EventHandler(AssistantEventHandler):    
    @override
    def on_text_created(self, text) -> None:
        print(f"\nassistant > ", end="", flush=True)
      
    @override
    def on_text_delta(self, delta, snapshot):
        print(delta.value, end="", flush=True)
      
    @override
    def on_tool_call_created(self, tool_call):
        print(f"\nassistant > {tool_call.type}\n", flush=True)
  
    @override
    def on_tool_call_delta(self, delta, snapshot):
        if delta.type == 'code_interpreter':
            if delta.code_interpreter.input:
                print(delta.code_interpreter.input, end="", flush=True)
            if delta.code_interpreter.outputs:
                print(f"\n\noutput >", flush=True)
                for output in delta.code_interpreter.outputs:
                    if output.type == "logs":
                        print(f"\n{output.logs}", flush=True)
                    elif output.type == "image":
                        # Fetch the image data using the file_id
                        file_id = output.image.file_id
                        image_data = self.download_image(file_id)
                        if image_data:
                            image = Image.open(io.BytesIO(image_data))
                            image.show()
  
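    def download_image(self, file_id):
        # Fetch the raw bytes of a file generated by the code interpreter.
        url = f"https://api.openai.com/v1/files/{file_id}/content"
        headers = {
            "Authorization": f"Bearer {openai_api_key}",
        }
        # A timeout stops the notebook from hanging if the request stalls.
        response = requests.get(url, headers=headers, timeout=30)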
        if response.status_code == 200:
            return response.content
        else:
            print(f"Failed to download image: {response.status_code} {response.text}")
            return None

In [None]:
with client.beta.threads.runs.stream(
  thread_id=thread.id,
  assistant_id=assistant.id,
  event_handler=EventHandler(),
) as stream:
  stream.until_done()
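
To sanity-check whatever code the assistant returns, it can help to have a small reference fit to compare against. The sketch below uses the standard unweighted least-squares formulae for a straight line y = mx + c, with uncertainties in the gradient and intercept estimated from the scatter of the residuals. The function name `least_squares_line` and the example numbers are illustrative assumptions, the textbook's notation may differ, and plotting is left out to keep the example short.

```python
import math


def least_squares_line(x, y):
    """Unweighted least-squares fit of y = m*x + c.

    Returns (m, c, alpha_m, alpha_c), where the alphas are the standard
    errors estimated from the residuals (N - 2 degrees of freedom).
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 3:
        raise ValueError("need at least 3 points to estimate uncertainties")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))

    delta = n * sum_xx - sum_x ** 2
    if delta == 0:
        raise ValueError("all x values are identical; the gradient is undefined")

    m = (n * sum_xy - sum_x * sum_y) / delta
    c = (sum_xx * sum_y - sum_x * sum_xy) / delta

    # Common uncertainty: root-mean-square residual with N - 2 degrees of freedom.
    residuals = [yi - (m * xi + c) for xi, yi in zip(x, y)]
    alpha_cu = math.sqrt(sum(r * r for r in residuals) / (n - 2))

    alpha_m = alpha_cu * math.sqrt(n / delta)
    alpha_c = alpha_cu * math.sqrt(sum_xx / delta)
    return m, c, alpha_m, alpha_c


# Illustrative data lying exactly on y = 2x + 1 (not taken from Data.csv).
m, c, alpha_m, alpha_c = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, c, alpha_m, alpha_c)
```

Running the same function on the columns of Data.csv and comparing the gradient and intercept with the assistant's output is a quick way to confirm that the generated code really performs a least-squares fit.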