## Setup - Create Azure OpenAI Assistant - RAG Chat

This notebook demonstrates how to create an Assistant with the Azure OpenAI Assistants API for retrieval-augmented (RAG) chat. The Assistant is given a function tool, `retrieve_documents`, which it can call to search for source information before answering a user question.

After the Assistant is created, copy its ID from the final cell to test it in the app in this repo.

### Load required packages

In [7]:
import os
from openai import AzureOpenAI
from dotenv import load_dotenv

load_dotenv(override=True)

True

### Create Azure OpenAI Client

In [8]:
# Endpoint and key are read from the environment (loaded from .env above).
client = AzureOpenAI(
    azure_endpoint=os.environ["AOAI_ENDPOINT"],
    api_key=os.environ["AOAI_KEY"],
    api_version=os.getenv("AZURE_OPENAI_API_VERSION", "2024-02-15-preview"),
)
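
The client cell above raises a bare `KeyError` if `AOAI_ENDPOINT` or `AOAI_KEY` is unset. A minimal sketch of a pre-flight check is shown below; the helper name `missing_settings` is hypothetical and not part of this repo, and only the two variable names come from the cell above.

```python
import os

# Variable names taken from the client cell; the helper itself is illustrative.
REQUIRED_VARS = ("AOAI_ENDPOINT", "AOAI_KEY")


def missing_settings(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
```

Running this before constructing the client gives one message listing every missing setting, rather than failing on the first one.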


### Load or overwrite instructions

In [9]:
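# Read the system instructions; the context manager closes the file afterwards.
with open('assistant.txt', 'r', encoding='utf-8') as f:
    instructions = f.read()
print(instructions)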

You are a helpful AI Assistant that assists users in answering their questions about Peloton exercise equipment.

You have access to a tool called 'retrieve_documents' which allows you to dynamically search for peloton equipment information to address user questions.

You should always retrieve source information when answering a user question and answer ONLY with facts that you retrieve from your search.

If there is not enough information to answer the user's question, you should say you don't know. If asking a clarifying question would help, ask the question. 

For tabular information return it as an html table. Do not return markdown format. If the question is not in English, answer in the language used in the question.

Each source has a name followed by colon and the actual information, always include the source name for each fact you use in the response. Use square brackets to reference the source, for example [source_page.txt]. Don't combine sources, list each source separately

### Create Assistant

In [10]:
assistant = client.beta.assistants.create(
    name="RAG Chat Assistant (Peloton)",
    instructions=instructions,
    # A single function tool; the calling app is responsible for executing it.
    tools=[{
        "type": "function",
        "function": {
            "name": "retrieve_documents",
            "description": "Retrieve Peloton-related information based on keywords",
            "parameters": {
                "type": "object",
                "properties": {
                    "keywords": {
                        "type": "string",
                        "description": "Search terms to find relevant Peloton information"
                    },
                    "document_count": {
                        "type": "integer",
                        "description": "The number of documents to retrieve based on the search terms"
                    }
                },
                "required": ["keywords", "document_count"]
            },
            "strict": False
        }
    }],
    model="gpt-4o"  # Replace with the deployment name of your model.
)
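
Registering a function tool only declares its schema; the Assistant does not run `retrieve_documents` itself. When a run requests the tool, the calling app receives the function name and a JSON string of arguments, executes the search, and submits the result back to the run. The sketch below shows one way the app side might dispatch such a call. `SAMPLE_DOCS` is placeholder data standing in for a real search index, and `handle_tool_call` is a hypothetical helper, not the implementation used by the app in this repo. Results use the `name: content` form that the instructions above describe.

```python
import json

# Placeholder corpus (an assumption); a real app would query a search index.
SAMPLE_DOCS = {
    "bike_overview.txt": "placeholder text about the bike",
    "tread_overview.txt": "placeholder text about the tread",
}


def retrieve_documents(keywords, document_count):
    """Return up to document_count 'name: content' strings matching any keyword."""
    terms = [term.lower() for term in keywords.split()]
    hits = [
        f"{name}: {text}"
        for name, text in SAMPLE_DOCS.items()
        if any(term in text.lower() or term in name.lower() for term in terms)
    ]
    return hits[:document_count]


def handle_tool_call(name, arguments_json):
    """Dispatch one tool call; the arguments arrive as a JSON string."""
    if name != "retrieve_documents":
        raise ValueError(f"Unknown tool: {name}")
    args = json.loads(arguments_json)
    return "\n".join(retrieve_documents(args["keywords"], args["document_count"]))


if __name__ == "__main__":
    print(handle_tool_call(
        "retrieve_documents",
        '{"keywords": "tread", "document_count": 1}',
    ))
```

In an app, the string returned by `handle_tool_call` would typically be sent back as the tool output for the pending run, for example via `client.beta.threads.runs.submit_tool_outputs`.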

### Retrieve Assistant ID

In [11]:
assistant_id = assistant.id
print(assistant_id)

asst_peyaqamJC4QwvLZHeorfHoNl
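
To reuse the ID in the app, one option is to store it alongside the other settings in the `.env` file. The sketch below is a small pure-Python helper that adds or updates a `KEY=value` line in the text of such a file. The key name `ASSISTANT_ID` and the helper `upsert_env_line` are assumptions for illustration; check the app's configuration for the variable name it actually reads.

```python
def upsert_env_line(env_text, key, value):
    """Return env_text with key set to value, replacing an existing entry if present."""
    lines = env_text.splitlines()
    new_line = f"{key}={value}"
    for index, line in enumerate(lines):
        if line.startswith(f"{key}="):
            lines[index] = new_line
            break
    else:
        # No existing entry was found, so append a new one.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "asst_example" is a placeholder; pass assistant_id from the cell above.
    print(upsert_env_line("AOAI_KEY=your-key\n", "ASSISTANT_ID", "asst_example"), end="")
```

The function only transforms text, so reading and writing the actual `.env` file is left to the caller.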
