<a href='https://colab.research.google.com/github/is-leeroy-jenkins/Boo/blob/main/ipynb/GPT.ipynb' target='_parent'><img src='https://colab.research.google.com/assets/colab-badge.svg' alt='Open In Colab'/></a>

###### Load Dependencies

In [5]:
import os
from openai import OpenAI
from pathlib import Path
from playwright.sync_api import sync_playwright
import tiktoken
import base64
import requests
from sklearn.model_selection import train_test_split
import time
import numpy as np
from typing import List


# Responses API
- Supports text and image inputs, and text outputs.
- Create stateful interactions with the model, using the output of previous responses as input.
- Allow the model access to external systems and data using function calling.
___
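
A minimal sketch of the stateful interaction described above, assuming the `previous_response_id` parameter of `client.responses.create` is used to link turns. The helper name `ask_follow_up` is hypothetical and not part of the SDK.

```python
def ask_follow_up( client, previous_response, text, model='gpt-4o' ):
    '''Send a follow-up turn that continues from an earlier response.'''
    # previous_response_id tells the API to reuse the earlier turn as context
    return client.responses.create(
        model=model,
        previous_response_id=previous_response.id,
        input=text
    )

# Example usage (requires a configured OpenAI client):
# first = client.responses.create( model='gpt-4o', input='Tell me a joke.' )
# second = ask_follow_up( client, first, 'Explain why it is funny.' )
# print( second.output_text )
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object.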

#### Text Input

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Response
response = client.responses.create(
  model='gpt-4o',
  input='Tell me a three sentence bedtime story about a unicorn.'
)

print( response.output_text )


#### Image Input

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Response
response = client.responses.create(
    model='gpt-4o',
    input=[
        {
            'role': 'user',
            'content': [
                { 'type': 'input_text', 'text': 'what is in this image?' },
                {
                    'type': 'input_image',
                    'image_url': 'https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg'
                }
            ]
        }
    ]
)

print( response.output_text )

#### Search Web

In [8]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Response
response = client.responses.create(
    model='gpt-4o',
    tools=[{ 'type': 'web_search_preview' }],
    input='What was a positive news story from today?',
)

print( response.output_text )

On April 4, 2025, Russian President Vladimir Putin's investment envoy, Kirill Dmitriev, expressed optimism about improving U.S.-Russia relations following discussions with officials from President Donald Trump's administration. Dmitriev highlighted progress on issues such as Ukraine, Arctic cooperation, rare metals, and space exploration, including potential joint missions to Mars involving Elon Musk. He noted efforts to restore direct air travel and emphasized the importance of continued meetings to resolve outstanding issues. Dmitriev also credited recent U.S.-Russia talks in Saudi Arabia and support from U.S. envoy Steve Witkoff as instrumental in facilitating diplomatic progress, including ceasefires in energy sectors and ensuring safe navigation in the Black Sea. ([reuters.com](https://www.reuters.com/world/europe/putin-envoy-dmitriev-says-some-forces-trying-sow-discord-between-russia-us-2025-04-03/?utm_source=openai))

Additionally, the Texas women's basketball team advanced to t

#### Search File

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Response
response = client.responses.create(
    model="gpt-4o",
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": ["vs_1234567890"],
            "max_num_results": 20
        }
    ],
    input="What are the attributes of an ancient brown dragon?",
)

print( response.output_text )

#### Stream

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Response
response = client.responses.create(
  model="gpt-4o",
  instructions="You are a helpful assistant.",
  input="Hello!",
  stream=True
)

for event in response:
  print(event)
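
Printing every event is verbose. A minimal sketch for collecting only the generated text, assuming text arrives in events whose `type` is `response.output_text.delta` and whose `delta` attribute holds the new characters. The helper name `collect_output_text` is hypothetical.

```python
def collect_output_text( events ):
    '''Concatenate the text deltas from a stream of Responses API events.'''
    parts = [ ]
    for event in events:
        # Skip lifecycle events; keep only incremental text
        if getattr( event, 'type', None ) == 'response.output_text.delta':
            parts.append( event.delta )
    return ''.join( parts )

# Example usage with the streaming call above:
# print( collect_output_text( response ) )
```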

# Chat Completions API
- Generates a model response from a list of messages comprising a conversation.
- Parameter support can differ depending on the model used to generate the response
___

##### Ex. Completion Response
- `choices`
- `completion.choices[ 0 ].message`
- `response.choices[0].message.content`

In [None]:
[
        {
            "index": 0,
            "message":
            {
                "role": "assistant",
                "content": "Under the soft glow of the moon, Luna the unicorn danced through fields of twinkling stardust, leaving trails of dreams for every child asleep.",
                "refusal": null
             },
             "logprobs": null,
             "finish_reason": "stop"
        }
]

#### Create Completion

In [12]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Completion
completion = client.chat.completions.create(
	model='gpt-4o',
	messages=
	[
		{ 'role': 'system', 'content': 'You are a helpful assistant.' },
		{ 'role': 'user', 'content': 'What is an appropriation?' }
	]
)

print( completion.choices[ 0 ].message )

ChatCompletionMessage(content='An appropriation is an authorization by a legislative body, typically a parliament or congress, to allocate a specific amount of money for a particular purpose or project. This allocation allows government agencies or departments to incur obligations and make payments out of government funds. Appropriations are part of the budgeting process and play a critical role in ensuring that government operations and services are funded appropriately.\n\nThere are different types of appropriations, including:\n\n1. **Annual Appropriations**: These are regular budgetary allocations made each fiscal year to fund ongoing operations and activities.\n\n2. **Supplemental Appropriations**: These are additional funds allocated outside the regular budget cycle, often to address unforeseen expenditures or emergencies.\n\n3. **Continuing Appropriations**: Sometimes known as continuing resolutions, these are temporary funds provided to keep government operations running when t

#### Retrieve Completions


In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Retrieve First Completion (only completions created with store=True are listed)
completions = client.chat.completions.list( )
first_id = completions.data[ 0 ].id
first_completion = client.chat.completions.retrieve( completion_id=first_id )
print( first_completion )

#### View Message

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# List Messages of First Completion
completions = client.chat.completions.list( )
first_id = completions.data[ 0 ].id
first_completion = client.chat.completions.retrieve( completion_id=first_id )
messages = client.chat.completions.messages.list( completion_id=first_id )
print( messages )

#### List Chats

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# List Completions
completions = client.chat.completions.list( )
print( completions )


#### Upload File

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create File Request
file = client.files.create(
    file=open('draconomicon.pdf', 'rb'),
    purpose='user_data'
)

# Create Completion
completion = client.chat.completions.create(
    model='gpt-4o',
    messages=[
        {
            'role': 'user',
            'content': [
                {
                    'type': 'file',
                    'file': {
                        'file_id': file.id,
                    }
                },
                {
                    'type': 'text',
                    'text': 'What is the first dragon in the book?',
                },
            ]
        }
    ]
)

print(completion.choices[0].message.content)

#### Update Completion

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Update Completion Metadata
completions = client.chat.completions.list( )
first_id = completions.data[ 0 ].id
updated_completion = client.chat.completions.update( completion_id=first_id,
    metadata={ 'foo': 'bar' } )
print( updated_completion )

#### Delete Completion

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Delete First Completion
completions = client.chat.completions.list( )
first_id = completions.data[ 0 ].id
delete_response = client.chat.completions.delete( completion_id=first_id )
print( delete_response )

# Assistants API
- Can call models and use tools to perform tasks.
- An Assistant has instructions and can leverage models, tools, and files to respond to user queries.

##### Tools:

- Code Interpreter
- File Search
- Function calling
___

#### Create

In [None]:
# Create Client
client = OpenAI()
client.api_key = os.getenv( 'OPENAI_API_KEY' )

assistant = client.beta.assistants.create(
  name="Math Tutor",
  instructions="You are a personal math tutor. Write and run code to answer math questions.",
  tools=[{"type": "code_interpreter"}],
  model="gpt-4o",
)

#### Add Thread

In [None]:
# Create Client
client = OpenAI()
client.api_key = os.getenv( 'OPENAI_API_KEY' )

thread = client.beta.threads.create()

#### Add Message

In [None]:
# Create Client
client = OpenAI()
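client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Add a user message to the thread (example prompt; replace with your own)
message = client.beta.threads.messages.create(
  thread_id=thread.id,
  role='user',
  content='I need to solve the equation `3x + 11 = 14`. Can you help me?'
)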

#### Create Run

In [None]:
# Create Client
client = OpenAI()
client.api_key = os.getenv( 'OPENAI_API_KEY' )

run = client.beta.threads.runs.create_and_poll(
  thread_id=thread.id,
  assistant_id=assistant.id,
  instructions="Please address the user as Jane Doe. The user has a premium account."
)

if run.status == 'completed':
  messages = client.beta.threads.messages.list(
    thread_id=thread.id
  )
  print(messages)
else:
  print(run.status)

#### Stream Run

In [None]:
# Create Client
client = OpenAI()
client.api_key = os.getenv( 'OPENAI_API_KEY' )

from typing_extensions import override
from openai import AssistantEventHandler

# First, we create an EventHandler class to define
# how we want to handle the events in the response stream.

class EventHandler(AssistantEventHandler):
  @override
  def on_text_created(self, text) -> None:
    print(f"\nassistant > ", end="", flush=True)

  @override
  def on_text_delta(self, delta, snapshot):
    print(delta.value, end="", flush=True)

  def on_tool_call_created(self, tool_call):
    print(f"\nassistant > {tool_call.type}\n", flush=True)

  def on_tool_call_delta(self, delta, snapshot):
    if delta.type == 'code_interpreter':
      if delta.code_interpreter.input:
        print(delta.code_interpreter.input, end="", flush=True)
      if delta.code_interpreter.outputs:
        print(f"\n\noutput >", flush=True)
        for output in delta.code_interpreter.outputs:
          if output.type == "logs":
            print(f"\n{output.logs}", flush=True)

# Then, we use the `stream` SDK helper
# with the `EventHandler` class to create the Run
# and stream the response.

with client.beta.threads.runs.stream(
  thread_id=thread.id,
  assistant_id=assistant.id,
  instructions="Please address the user as Jane Doe. The user has a premium account.",
  event_handler=EventHandler(),
) as stream:
  stream.until_done()

# Speech API
- The maximum input length is 4096 characters.
- TTS models: tts-1, tts-1-hd or gpt-4o-mini-tts.
___
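
Because of the 4096-character limit noted above, longer text has to be split before it is sent. A minimal sketch of one way to do that, preferring sentence boundaries. The helper name `split_for_tts` and the splitting strategy are assumptions, not part of the API.

```python
import re

def split_for_tts( text, limit=4096 ):
    '''Split text into chunks of at most `limit` characters, preferring sentence ends.'''
    sentences = re.split( r'(?<=[.!?])\s+', text.strip( ) )
    chunks, current = [ ], ''
    for sentence in sentences:
        # Hard-split any single sentence that exceeds the limit on its own
        while len( sentence ) > limit:
            if current:
                chunks.append( current )
                current = ''
            chunks.append( sentence[ :limit ] )
            sentence = sentence[ limit: ]
        candidate = f'{current} {sentence}'.strip( )
        if len( candidate ) <= limit:
            current = candidate
        else:
            chunks.append( current )
            current = sentence
    if current:
        chunks.append( current )
    return chunks
```

Each chunk can then be passed as `input` to a separate speech request and the resulting audio files played in order.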

#### Create Audio
- Generates audio from the input text.

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Speech
speech_file_path = Path.cwd( ) / 'speech.mp3'  # __file__ is undefined in a notebook
prompt = 'The quick brown fox jumped over the lazy dog.'
with client.audio.speech.with_streaming_response.create( model='tts-1', voice='alloy',
    input=prompt ) as response:
    response.stream_to_file( speech_file_path )


#### Create Transcription
- Transcribes audio into the input language.

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Open Audio File 'speech.mp3'
audio_file = open( 'speech.mp3', 'rb' )
transcript = client.audio.transcriptions.create( model='whisper-1', file=audio_file )


#### Create Translation
- Translates audio into English.

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Translation
audio_file = open( 'speech.mp3', 'rb' )
transcript = client.audio.translations.create( model='whisper-1', file=audio_file )


#### Create Audio Output from Model

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Completion
completion = client.chat.completions.create(
    model='gpt-4o-audio-preview',
    modalities=[ 'text', 'audio' ],
    audio={ 'voice': 'alloy', 'format': 'wav' },
    messages=[
        {
            'role': 'user',
            'content': 'Is a golden retriever a good family dog?'
        }
    ]
)

print( completion.choices[ 0 ] )

# Convert to bytes
wav_bytes = base64.b64decode( completion.choices[ 0 ].message.audio.data )
with open( 'dog.wav', 'wb' ) as f:
    f.write( wav_bytes )

#### Send Audio Input to Model

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Fetch the audio file and convert it to a base64 encoded string
url = 'https://cdn.openai.com/API/docs/audio/alloy.wav'
response = requests.get(url)
response.raise_for_status()
wav_data = response.content
encoded_string = base64.b64encode(wav_data).decode('utf-8')

# Create Completion
completion = client.chat.completions.create(
    model='gpt-4o-audio-preview',
    modalities=['text', 'audio'],
    audio={'voice': 'alloy', 'format': 'wav'},
    messages=[
        {
            'role': 'user',
            'content': [
                {
                    'type': 'text',
                    'text': 'What is in this recording?'
                },
                {
                    'type': 'input_audio',
                    'input_audio': {
                        'data': encoded_string,
                        'format': 'wav'
                    }
                }
            ]
        },
    ]
)

print(completion.choices[0].message)

# Embedding API
- An embedding is a vector (list) of floating point numbers.
- The distance between two vectors measures their relatedness.
- Small distances suggest high relatedness and large distances suggest low relatedness.

##### Use Cases:

- **Search** (where results are ranked by relevance to a query string)
- **Clustering** (where text strings are grouped by similarity)
- **Recommendations** (where items with related text strings are recommended)
- **Anomaly Detection** (where outliers with little relatedness are identified)
- **Diversity Measurement** (where similarity distributions are analyzed)
- **Classification** (where text strings are classified by their most similar label)
___

#### Tiktoken Use
- Split a string into tokens with OpenAI's tokenizer

#### Count Tokens

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

def num_tokens_from_string( string: str, encoding_name: str ) -> int:
    '''Returns the number of tokens in a text string.'''
    encoding = tiktoken.get_encoding( encoding_name )
    num_tokens = len(encoding.encode(string))
    return num_tokens

num_tokens_from_string( 'tiktoken is great!', 'cl100k_base' )


#### Create Ada


In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

client.embeddings.create(
	model='text-embedding-ada-002',
	input='The food was delicious and the waiter...',
	encoding_format='float'
)


#### Create Small

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Embedding
def get_embedding( text, model='text-embedding-3-small' ):
    text = text.replace( '\n', ' ' )
    return client.embeddings.create( input = [text], model=model ).data[0].embedding


#### Create Large

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create Embedding
def get_embedding( text, model='text-embedding-3-large' ):
    text = text.replace( '\n', ' ' )
    return client.embeddings.create( input = [text], model=model ).data[0].embedding

### Reducing Dimensions
- The text-embedding-3 models let you shorten embeddings at request time, so one model can serve vector stores with different size limits.
- For example, if a vector data store only supports embeddings of up to 1024 dimensions, you can still use text-embedding-3-large and pass 1024 for the `dimensions` API parameter. This shortens the embedding from 3072 dimensions, trading some accuracy for a smaller vector.
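
A minimal sketch of the `dimensions` parameter mentioned above, assuming a text-embedding-3 model. The helper name `embed_with_dimensions` is hypothetical; the client is passed in so the function can be tried with any configured `OpenAI` instance.

```python
def embed_with_dimensions( client, text, dimensions=1024, model='text-embedding-3-large' ):
    '''Request an embedding that the API has already shortened to `dimensions` values.'''
    response = client.embeddings.create( model=model, input=text, dimensions=dimensions )
    return response.data[ 0 ].embedding

# Example usage (requires a configured OpenAI client):
# vector = embed_with_dimensions( client, 'Testing 123', dimensions=1024 )
# print( len( vector ) )
```

The cell below shows the manual alternative: truncate a full-length embedding, then re-normalize it.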

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

def normalize_l2( x ):
    x = np.array( x )
    if x.ndim == 1:
        norm = np.linalg.norm( x )
        if norm == 0:
            return x
        return x / norm
    else:
        norm = np.linalg.norm( x, 2, axis=1, keepdims=True )
        return np.where( norm == 0, x, x / norm )


response = client.embeddings.create( model='text-embedding-3-small',
	input='Testing 123', encoding_format='float' )

cut_dim = response.data[ 0 ].embedding[ :256 ]
norm_dim = normalize_l2( cut_dim )

print( norm_dim )


### Question & Answer
- Models are often not trained on data that contains the key facts you want available when answering a user query.
- One way to solve this, as shown below, is to put the additional information into the model's context window.
- This is effective in many use cases but leads to higher token costs.

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

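# Assign Variables (assumed values; the source does not define them)
GPT_MODEL = 'gpt-4o'
wikipedia_article_on_curling = '...'  # paste the article text here

# Create Query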
query = f'''
	Use the below article on the 2022 Winter Olympics to answer the subsequent question.
	If the answer cannot be found, write 'I don't know.'

	Article:
	\'\'\'
	{wikipedia_article_on_curling}
	\'\'\'

	Question: Which athletes won the gold medal in curling at the 2022 Winter Olympics?
'''

# Create Response
response = client.chat.completions.create(
    messages=[
        {'role': 'system', 'content': 'You answer questions about the 2022 Winter Olympics.'},
        {'role': 'user', 'content': query},
    ],
    model=GPT_MODEL,
    temperature=0,
)

print(response.choices[0].message.content)

### Text Search
- To retrieve the most relevant documents, use the cosine similarity between the embedding vectors of the query and each document, and return the highest scored documents.
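
The cell below calls a `cosine_similarity` helper that the notebook does not define. A minimal pure-Python sketch of it, plus a hypothetical `top_matches` helper that ranks documents by score. Both are illustrative stand-ins rather than the implementation the original notebook imported.

```python
import math

def cosine_similarity( a, b ):
    '''Return the cosine of the angle between two equal-length vectors.'''
    dot = sum( x * y for x, y in zip( a, b ) )
    norm_a = math.sqrt( sum( x * x for x in a ) )
    norm_b = math.sqrt( sum( y * y for y in b ) )
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / ( norm_a * norm_b )

def top_matches( query_embedding, document_embeddings, n=3 ):
    '''Return (index, score) pairs for the n documents most similar to the query.'''
    scored = [ ( i, cosine_similarity( query_embedding, e ) )
               for i, e in enumerate( document_embeddings ) ]
    return sorted( scored, key=lambda pair: pair[ 1 ], reverse=True )[ :n ]
```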


In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

def search_reviews(df, product_description, n=3, pprint=True):
    embedding = get_embedding(product_description, model='text-embedding-3-small')
    df['similarities'] = df.ada_embedding.apply(lambda x: cosine_similarity(x, embedding))
    res = df.sort_values('similarities', ascending=False).head(n)
    return res

res = search_reviews(df, 'delicious beans', n=3)


### Code Search
- Code search works similarly to embedding-based text search.
- We provide a method to extract Python functions from all the Python files in a given repository.
- Each function is then indexed by the **text-embedding-3-small** model.


In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

df['code_embedding'] = df['code'].apply(lambda x: get_embedding(x, model='text-embedding-3-small'))

def search_functions(df, code_query, n=3, pprint=True, n_lines=7):
    embedding = get_embedding(code_query, model='text-embedding-3-small')
    df['similarities'] = df.code_embedding.apply(lambda x: cosine_similarity(x, embedding))

    res = df.sort_values('similarities', ascending=False).head(n)
    return res

res = search_functions(df, 'Completions API tests', n=3)


#### Recommendations
- Because shorter distances between embedding vectors represent greater similarity, embeddings can be useful for recommendation.


In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

def recommendations_from_strings( strings: List[str], index_of_source_string: int,
    model='text-embedding-3-small' ) -> List[int]:
    '''Return nearest neighbors of a given string.'''
    # get embeddings for all strings
    embeddings = [ embedding_from_string( string, model=model ) for string in strings ]

    # get the embedding of the source string
    query_embedding = embeddings[ index_of_source_string ]

    # get distances between the source embedding and other embeddings (function from embeddings_utils.py)
    distances = distances_from_embeddings( query_embedding, embeddings, distance_metric='cosine' )

    # get indices of nearest neighbors (function from embeddings_utils.py)
    indices_of_nearest_neighbors = indices_of_nearest_neighbors_from_distances( distances )
    return indices_of_nearest_neighbors
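
The function above relies on two helpers from `embeddings_utils.py` that are not defined in this notebook. A minimal pure-Python sketch of what they might look like, assuming only the cosine metric is needed. These are stand-ins written for illustration, not the original implementations.

```python
import math
from typing import List

def distances_from_embeddings( query_embedding: List[float], embeddings: List[List[float]],
    distance_metric='cosine' ) -> List[float]:
    '''Return the cosine distance (1 - cosine similarity) from the query to each embedding.'''
    if distance_metric != 'cosine':
        raise ValueError( 'This sketch only implements the cosine metric.' )

    def _cosine_distance( a, b ):
        dot = sum( x * y for x, y in zip( a, b ) )
        norms = math.sqrt( sum( x * x for x in a ) ) * math.sqrt( sum( y * y for y in b ) )
        return 1.0 - dot / norms if norms else 1.0

    return [ _cosine_distance( query_embedding, e ) for e in embeddings ]

def indices_of_nearest_neighbors_from_distances( distances: List[float] ) -> List[int]:
    '''Return indices ordered from the smallest distance to the largest.'''
    return sorted( range( len( distances ) ), key=lambda i: distances[ i ] )
```

The source string itself is always at distance 0, so it appears first in the returned indices.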


#### Text Featurization
- An embedding can be used as a general free-text feature encoder within a machine learning model.
- Incorporating embeddings can improve the performance of a machine learning model when some of the relevant inputs are free text.
- An embedding can also be used as a categorical feature encoder within a ML model.

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

X_train, X_test, y_train, y_test = train_test_split( list( df.ada_embedding.values ), df.Score,
	test_size=0.2, random_state=42 )


# Tools API
- File Search augments the Assistant with knowledge from outside its model
- Code Interpreter allows Assistants to write and run Python code in a sandboxed execution environment
- Function calling allows you to describe functions to the Assistants API then call them
___

#### Check Weather

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

tools =[
    {
        'type': 'function',
        'function': {
            'name': 'get_weather',
            'description': 'Gets the current weather for a given location',
            'parameters': {
                'type': 'object',
                'properties': {
                    'location': {
                        'type': 'string',
                        'description': 'The city and state, e.g., San Francisco, CA',
                    },
                    'unit': {
                        'type': 'string',
                        'enum': ['celsius', 'fahrenheit']
                    }
                },
                'required': ['location']
            }
        }
    }
]

completion = client.chat.completions.create(
	model='gpt-4o',
	messages=[ { 'role': 'user', 'content': 'What is the weather like in Paris today?' } ],
	tools=tools
)

print( completion.choices[ 0 ].message.tool_calls )
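
The model only returns a request to call `get_weather`; your code has to run it. A minimal sketch of that step, assuming each tool call exposes `id`, `function.name`, and a JSON string in `function.arguments`. The `get_weather` body, `AVAILABLE_FUNCTIONS`, and `run_tool_calls` are hypothetical names introduced for illustration.

```python
import json

def get_weather( location, unit='celsius' ):
    '''Hypothetical local implementation; a real one would query a weather service.'''
    return { 'location': location, 'unit': unit, 'forecast': 'unknown' }

# Map tool names declared in `tools` to the local callables that implement them
AVAILABLE_FUNCTIONS = { 'get_weather': get_weather }

def run_tool_calls( tool_calls ):
    '''Execute each requested tool call and build the 'tool' messages to send back.'''
    messages = [ ]
    for call in tool_calls or [ ]:
        func = AVAILABLE_FUNCTIONS[ call.function.name ]
        arguments = json.loads( call.function.arguments )  # arguments arrive as a JSON string
        result = func( **arguments )
        messages.append( {
            'role': 'tool',
            'tool_call_id': call.id,
            'content': json.dumps( result )
        } )
    return messages
```

The returned messages can be appended to the conversation, after the assistant message that requested the calls, and sent in a second `chat.completions.create` request so the model can write its final answer.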


#### Send Email

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Assign Variables
_tools = [
	{
		'type': 'function',
		'function':
		{
			'name': 'send_email',
			'description': 'Send an email to a given recipient with a subject and message.',
			'parameters':
			{
				'type': 'object',
				'properties':
				{
					'to':
					{
						'type': 'string',
						'description': 'The recipient email address.'
					},
					'subject':
					{
						'type': 'string',
						'description': 'Email subject line.'
					},
					'body':
					{
						'type': 'string',
						'description': 'Body of the email message.'
					}
				},
				'required': [ 'to', 'subject', 'body' ],
				'additionalProperties': False
			},
			'strict': True
		}
	}
]

# Create Completion
completion = client.chat.completions.create(
	model='gpt-4o',
	messages=[ { 'role': 'user',
	             'content': 'Can you send an email to terryeppler@gmail.com saying hi?' } ],
	tools=_tools
)

print( completion.choices[ 0 ].message.tool_calls )

#### Search Documents

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Assign variables
_tools = [
	{
		'type': 'function',
		'function':
			{
				'name': 'search_knowledge_base',
				'description': 'Query a knowledge base to retrieve relevant info on a topic.',
				'parameters':
					{
						'type': 'object',
						'properties':
							{
								'query':
									{
										'type': 'string',
										'description': 'The user question or search query.'
									},
								'options':
									{
										'type': 'object',
										'properties':
											{
												'num_results':
													{
														'type': 'number',
														'description': 'Number of top results to return.'
													},
												'domain_filter':
													{
														'type':
														[ 'string', 'null' ],
														'description': 'Optional domain to narrow the search. Pass null if not needed.'
													},
												'sort_by':
													{
														'type':
														[ 'string', 'null' ],
														'enum':
														[ 'relevance', 'date', 'popularity', 'alphabetical' ],
														'description': 'How to sort results. Pass null if not needed.'
													}
											},
										'required':
										[ 'num_results', 'domain_filter', 'sort_by' ],
										'additionalProperties': False
									}
							},
						'required':
						[ 'query', 'options' ],
						'additionalProperties': False
					},
				'strict': True
			}
	}
]

_messages = [
	{
		'role': 'user',
		'content': 'Can you find information about ChatGPT in the AI knowledge base?'
	}
]

# Create completion
completion = client.chat.completions.create( model='gpt-4o', messages=_messages, tools=_tools )

print( completion.choices[ 0 ].message.tool_calls )

#### Search Web

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Assign variables
_model = 'gpt-4o-search-preview'
_message = [
        {
            'role': 'user',
            'content': 'What was a positive news story from today?',
        } 
]

# Create completion
completion = client.chat.completions.create( model=_model, web_search_options={},
    messages=_message )

print( completion.choices[ 0 ].message.content )

#### File Search

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Step 1: Create a new Assistant with File Search Enabled
assistant = client.beta.assistants.create(
  name='Financial Analyst Assistant',
  instructions='You are an expert financial analyst. Use your knowledge base to answer questions about audited financial statements.',
  model='gpt-4o',
  tools=[{'type': 'file_search'}],
)

# Step 2: Upload files and add them to a Vector Store
vector_store = client.vector_stores.create( name='Financial Statements' )
file_paths = [ 'edgar/goog-10k.pdf', 'edgar/brka-10k.txt' ]
file_streams = [ open( path, 'rb' ) for path in file_paths ]
file_batch = client.vector_stores.file_batches.upload_and_poll(
  vector_store_id=vector_store.id, files=file_streams
)

print( file_batch.status )
print( file_batch.file_counts )

# Step 3: Update the assistant to use the new Vector Store
assistant = client.beta.assistants.update(
  assistant_id=assistant.id,
  tool_resources={'file_search': {'vector_store_ids': [vector_store.id]}},
)

# Step 4: Create a thread
message_file = client.files.create(
  file=open('edgar/aapl-10k.pdf', 'rb'), purpose='assistants'
)

thread = client.beta.threads.create(
  messages=[
    {
      'role': 'user',
      'content': 'How many shares of AAPL were outstanding at the end of October 2023?',
      'attachments': [
        { 'file_id': message_file.id, 'tools': [{'type': 'file_search'}] }
      ],
    }
  ]
)

print( thread.tool_resources.file_search )

# Step 5: Create a run and check the output
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)

messages = list(client.beta.threads.messages.list(thread_id=thread.id, run_id=run.id))

message_content = messages[0].content[0].text
annotations = message_content.annotations
citations = []
for index, annotation in enumerate(annotations):
    message_content.value = message_content.value.replace(annotation.text, f'[{index}]')
    if file_citation := getattr(annotation, 'file_citation', None):
        cited_file = client.files.retrieve(file_citation.file_id)
        citations.append(f'[{index}] {cited_file.filename}')

print(message_content.value)
print('\n'.join(citations))



#### Use Browser

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Assign variables
_args = [ '--disable-extensions', '--disable-file-system' ]

with sync_playwright() as p:
    browser = p.chromium.launch( headless=False, chromium_sandbox=True,
        env={}, args=_args )
    
    page = browser.new_page( )
    page.set_viewport_size( {'width': 1024, 'height': 768} )
    page.goto( 'https://bing.com' )

    page.wait_for_timeout( 10000 )

#### Use Computer

In [None]:
# Create Client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Assign Variables
_model = 'computer-use-preview'
_reasoning = { 'generate_summary': 'concise', }
_truncation = 'auto'
_tools = [
	{
        'type': 'computer_use_preview',
        'display_width': 1024,
        'display_height': 768,
        'environment': 'browser' # other possible values: 'mac', 'windows', 'ubuntu'
    }
]

_input = [
	{
            'role': 'user',
            'content': 'Check the latest OpenAI news on bing.com.'
	}
]

# Create Response
response = client.responses.create( model=_model, tools=_tools, input=_input,
    reasoning=_reasoning, truncation=_truncation )

print( response.output )

#### Send Computer Use Request

- Send a request to create a Response with the **computer-use-preview** model equipped with the computer_use_preview tool.
- This request should include details about your environment, along with an initial input prompt.

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Assign variables
_model = 'computer-use-preview'
_tools = [
	{
        'type': 'computer_use_preview',
        'display_width': 1024,
        'display_height': 768,
        'environment': 'browser' # other possible values: 'mac', 'windows', 'ubuntu'
    }
]

_input = [
	{
            'role': 'user',
            'content': 'Check the latest OpenAI news on bing.com.'
	}
]

# Create response
response = client.responses.create( model=_model, tools=_tools, input=_input,
    reasoning={ 'generate_summary': 'concise', }, truncation='auto' )

print( response.output )

#### Execute Action

- Execute the corresponding actions on your computer or browser.
- How you map a computer call to actions through code depends on your environment.
- This code shows example implementations for the most common computer actions.

In [None]:
def handle_model_action( page, action ):
    '''
    
		Given a computer action (e.g., click, double_click, scroll, etc.),
		execute the corresponding operation on the Playwright page.
		
    '''
    action_type = action.type
    try:
        match action_type:
            case 'click':
                x, y = action.x, action.y
                button = action.button
                print( f"Action: click at ({x}, {y}) with button '{button}' " )
                # Not handling things like middle click, etc.
                if button != 'left' and button != 'right':
                    button = 'left'
                page.mouse.click( x, y, button=button )

            case 'scroll':
                x, y = action.x, action.y
                scroll_x, scroll_y = action.scroll_x, action.scroll_y
                print(f'Action: scroll at ({x}, {y}) with offsets (scroll_x={scroll_x}, scroll_y={scroll_y})')
                page.mouse.move( x, y )
                page.evaluate( f'window.scrollBy({scroll_x}, {scroll_y})' )

            case 'keypress':
                keys = action.keys
                for k in keys:
                    print( f"Action: keypress '{k}' " )
                    # A simple mapping for common keys; expand as needed.
                    if k.lower( ) == 'enter':
                        page.keyboard.press( 'Enter' )
                    elif k.lower( ) == 'space':
                        page.keyboard.press( ' ' )
                    else:
                        page.keyboard.press( k )

            case 'type':
                text = action.text
                print( f'Action: type text: {text}' )
                page.keyboard.type( text )

            case 'wait':
                print( 'Action: wait' )
                time.sleep( 2 )

            case 'screenshot':
                # Nothing to do as screenshot is taken at each turn
                print( 'Action: screenshot' )

            # Handle other actions here

            case _:
                print( f'Unrecognized action: {action}' )

    except Exception as e:
        print( f'Error handling action {action}: {e}' )
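
The keypress branch above maps only two key names by hand. A minimal sketch of a lookup table that could replace that if/elif chain is shown below. The lower-cased names on the left are assumptions about what a model might emit, and `KEY_MAP` and `normalize_key` are hypothetical names introduced for illustration; the values on the right are key names Playwright's keyboard accepts.

```python
# Hypothetical lookup table: lower-cased key names a model might emit (assumed)
# mapped to key names accepted by Playwright's page.keyboard.press().
KEY_MAP = {
    'enter': 'Enter',
    'space': ' ',
    'tab': 'Tab',
    'esc': 'Escape',
    'escape': 'Escape',
    'backspace': 'Backspace',
    'ctrl': 'Control',
    'alt': 'Alt',
    'shift': 'Shift',
}

def normalize_key( key ):
    '''
        Return the mapped key name for a model-supplied key,
        or the key unchanged when it is not in KEY_MAP.
    '''
    return KEY_MAP.get( key.lower( ), key )

print( [ normalize_key( k ) for k in [ 'ENTER', 'space', 'a' ] ] )
```

Inside the keypress loop, the call would then be `page.keyboard.press( normalize_key( k ) )`; extend the table as new key names show up in practice.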

#### Capture Screenshot

- After executing the action, capture the updated state of the environment as a screenshot; how you capture it also depends on your environment.

In [None]:
def get_screenshot( page ):
    '''

        Take a screenshot of the current viewport using Playwright and return the image bytes.

    '''
    return page.screenshot()

#### Repeating Loop

- Once you have the screenshot, you can send it back to the model as a **computer_call_output** to get the next action.
- Repeat these steps as long as you get a **computer_call** item in the response.

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

def computer_use_loop( instance, response ):
    '''
        Run the loop that executes computer actions until no 'computer_call' is found.
    '''
    while True:
        computer_calls = [ item for item in response.output if item.type == 'computer_call' ]
        if not computer_calls:
            print( 'No computer call found. Output from model:' )
            for item in response.output:
                print( item )
            break  # Exit when no computer calls are issued.

        # We expect at most one computer call per response.
        computer_call = computer_calls[ 0 ]
        last_call_id = computer_call.call_id
        action = computer_call.action

        # Execute the action (handle_model_action, defined above)
        handle_model_action( instance, action )
        time.sleep( 1 )  # Allow time for changes to take effect.

        # Take a screenshot after the action (get_screenshot, defined above)
        screenshot_bytes = get_screenshot( instance )
        screenshot_base64 = base64.b64encode( screenshot_bytes ).decode( 'utf-8' )

        # Send the screenshot back as a computer_call_output
        response = client.responses.create(
            model='computer-use-preview',
            previous_response_id=response.id,
            tools=[
                {
                    'type': 'computer_use_preview',
                    'display_width': 1024,
                    'display_height': 768,
                    'environment': 'browser'
                }
            ],
            input=[
                {
                    'call_id': last_call_id,
                    'type': 'computer_call_output',
                    'output': {
                        'type': 'input_image',
                        'image_url': f'data:image/png;base64,{screenshot_base64}'
                    }
                }
            ],
            truncation='auto'
        )

    return response
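
The input item sent back on each turn can be built by a small helper, which keeps the loop body short and makes the payload easy to inspect offline. This is a sketch only: `build_computer_call_output` is a hypothetical name, and the screenshot is embedded as a data URL, which starts with the `data:` scheme followed by the media type and the Base64 payload.

```python
import base64

def build_computer_call_output( call_id, screenshot_bytes ):
    '''
        Wrap raw PNG bytes in a computer_call_output item,
        with the image embedded as a Base64 data URL.
    '''
    encoded = base64.b64encode( screenshot_bytes ).decode( 'utf-8' )
    return {
        'call_id': call_id,
        'type': 'computer_call_output',
        'output': {
            'type': 'input_image',
            'image_url': f'data:image/png;base64,{encoded}'
        }
    }

# Placeholder bytes stand in for a real screenshot.
item = build_computer_call_output( 'call_123', b'\x89PNG' )
print( item[ 'output' ][ 'image_url' ] )
```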

#### Create VectorStore

In [None]:
# Create a vector store called 'Financial Statements'
vector_store = client.vector_stores.create(name='Financial Statements')

# Ready the files for upload to OpenAI
file_paths = ['edgar/goog-10k.pdf', 'edgar/brka-10k.txt']
file_streams = [open(path, 'rb') for path in file_paths]

# Use the upload and poll SDK helper to upload the files, add them to the vector store,
# and poll the status of the file batch for completion.
file_batch = client.vector_stores.file_batches.upload_and_poll(
  vector_store_id=vector_store.id, files=file_streams
)

# You can print the status and the file counts of the batch to see the result of this operation.
print(file_batch.status)
print(file_batch.file_counts)
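
The cell above leaves the opened file handles for the garbage collector to close. One possible variation, sketched here with the hypothetical helper `upload_batch`, uses `contextlib.ExitStack` so every handle is closed once the upload returns. It assumes a `client` object exposing the same `vector_stores.file_batches.upload_and_poll` call used above.

```python
from contextlib import ExitStack

def upload_batch( client, vector_store_id, file_paths ):
    '''
        Open every path, upload the batch, and close all handles afterwards,
        even if the upload raises.
    '''
    with ExitStack( ) as stack:
        streams = [ stack.enter_context( open( p, 'rb' ) ) for p in file_paths ]
        return client.vector_stores.file_batches.upload_and_poll(
            vector_store_id=vector_store_id, files=streams )

# Usage (requires a configured client and existing files):
# file_batch = upload_batch( client, vector_store.id, [ 'edgar/goog-10k.pdf', 'edgar/brka-10k.txt' ] )
```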

# Files API
- Upload up to 100 pages and 32MB of total content in a single request to the API, across multiple file inputs.
- Only models that support both text and image inputs, such as gpt-4o, gpt-4o-mini, or o1, can accept PDF files as input.
- Use the user_data purpose for files you plan to use as model inputs.
___
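
A size check before uploading can catch requests that would exceed the 32MB limit noted above. The sketch below is illustrative: `within_upload_limit` is a hypothetical helper, it assumes "32MB" means 32 × 1024 × 1024 bytes, and it does not check the 100-page limit, which would need a PDF library.

```python
import os

# Assumption: "32MB" is read as 32 * 1024 * 1024 bytes.
MAX_TOTAL_BYTES = 32 * 1024 * 1024

def total_size( paths ):
    '''Return the combined size in bytes of all files in paths.'''
    return sum( os.path.getsize( p ) for p in paths )

def within_upload_limit( paths, limit=MAX_TOTAL_BYTES ):
    '''Return True when the files together fit within the byte limit.'''
    return total_size( paths ) <= limit

# Usage (requires the file to exist):
# print( within_upload_limit( [ 'draconomicon.pdf' ] ) )
```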

#### Upload
- Upload a PDF using the Files API, then reference its file ID in an API request to the model.

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Upload the file first so its ID can be referenced in the message
with open( 'draconomicon.pdf', 'rb' ) as f:
    file = client.files.create( file=f, purpose='user_data' )

# Assign variables
_messages = [
        {
            'role': 'user',
            'content': [
                {
                    'type': 'file',
                    'file':
                    {
                        'file_id': file.id,
                    }
                },
                {
                    'type': 'text',
                    'text': 'What is the first dragon in the book?',
                },
            ]
        }
]

# Create completion
completion = client.chat.completions.create( model='gpt-4o', messages=_messages, )

print( completion.choices[ 0 ].message.content )

#### List

In [1]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

_files = client.files.list()

for i in _files:
    print( i )



#### Retrieve File

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Retrieve by file_id
client.files.retrieve( 'file-abc123' )


#### Retrieve Contents

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )
content = client.files.content( 'file-abc123' )


#### Upload

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

client.files.create( file=open('mydata.jsonl', 'rb'), purpose='fine-tune' )


### Base64-encoded files
- You can send PDF file inputs as Base64-encoded inputs as well.

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

with open( 'draconomicon.pdf', 'rb' ) as f:
    data = f.read( )

base64_string = base64.b64encode( data ).decode( 'utf-8' )
completion = client.chat.completions.create(
    model='gpt-4o',
    messages=[
        {
            'role': 'user',
            'content': [
                {
                    'type': 'file',
                    'file':
                    {
                        'filename': 'draconomicon.pdf',
                        'file_data': f'data:application/pdf;base64,{base64_string}',
                    }
                },
                {
                    'type': 'text',
                    'text': 'What is the first dragon in the book?',
                }
            ],
        },
    ],
)

print( completion.choices[ 0 ].message.content )

# Retrieval API
- The Retrieval API performs semantic search over your data, surfacing semantically similar results even when they share few or no keywords with the query.
- The Retrieval API is powered by vector stores, which serve as indices for your data.
___
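
To make "semantically similar" concrete, the toy sketch below ranks two documents against a query by cosine similarity between vectors. It is an illustration of the general idea only: the three-dimensional vectors are hand-made stand-ins for real embeddings, and nothing here describes how the Retrieval API scores results internally.

```python
import math

def cosine( a, b ):
    '''Return the cosine similarity of two equal-length vectors.'''
    dot = sum( x * y for x, y in zip( a, b ) )
    norm = math.sqrt( sum( x * x for x in a ) ) * math.sqrt( sum( y * y for y in b ) )
    return dot / norm

# Hand-made vectors standing in for real embeddings (illustrative only).
documents = {
    'Items may be sent back within 30 days.': [ 0.9, 0.1, 0.0 ],
    'Our office is closed on public holidays.': [ 0.1, 0.9, 0.1 ],
}

# Stands in for the embedding of 'What is the return policy?'
query_vector = [ 0.8, 0.2, 0.1 ]

ranked = sorted( documents, key=lambda d: cosine( query_vector, documents[ d ] ), reverse=True )
print( ranked[ 0 ] )
```

The top-ranked document never uses the words "return" or "policy", which is the behavior keyword search would miss.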

#### Create Upload

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

vector_store = client.vector_stores.create( name='Support FAQ' )
client.vector_stores.files.upload_and_poll(  vector_store_id=vector_store.id,
    file=open( 'customer_policies.txt', 'rb' ) )

#### Search Stores

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create search
user_query = 'What is the return policy?'
results = client.vector_stores.search( vector_store_id=vector_store.id, query=user_query, )

#### Semantic Search

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

results = client.vector_stores.search( vector_store_id=vector_store.id,
    query='How many woodchucks are allowed per passenger?', )

In [None]:


# Initialize the client (DataAPIClient is provided by the astrapy package)
from astrapy import DataAPIClient

data_client = DataAPIClient( 'YOUR_TOKEN' )
url = 'https://f9c2ce0a-d399-4670-8989-302798d48696-westus3.apps.astra.datastax.com'
db = data_client.get_database_by_api_endpoint( url )

print( f'Connected to Astra DB: {db.list_collection_names( )}' )

# Vector Store API
- Vector stores are the containers that power semantic search for the Retrieval API and the Assistants API file search tool.
- When you add a file to a vector store it will be automatically chunked, embedded, and indexed.
___
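
Chunking can be pictured as sliding a fixed-size window over the text with some overlap between neighbouring windows. The sketch below is a simplified illustration that splits on words rather than tokens; `chunk_words` and its parameters are hypothetical and do not reproduce the service's actual chunking settings.

```python
def chunk_words( words, chunk_size, overlap ):
    '''
        Split a list of words into windows of at most chunk_size words,
        where each window repeats the last `overlap` words of the previous one.
    '''
    if not 0 <= overlap < chunk_size:
        raise ValueError( 'overlap must be >= 0 and smaller than chunk_size' )
    step = chunk_size - overlap
    chunks = [ ]
    for start in range( 0, len( words ), step ):
        chunks.append( words[ start:start + chunk_size ] )
        # Stop once the current window reaches the end of the input.
        if start + chunk_size >= len( words ):
            break
    return chunks

words = 'one two three four five six seven'.split( )
for chunk in chunk_words( words, chunk_size=4, overlap=2 ):
    print( chunk )
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.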

### I. Vector Store Operations

#### Create

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create vector store
client.vector_stores.create( name='Support FAQ', file_ids=[ 'file_123' ] )

#### Retrieve

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Retrieve vector store
client.vector_stores.retrieve( vector_store_id='vs_123' )

#### Update

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Update vector store
client.vector_stores.update( vector_store_id='vs_123', name='Support FAQ' )

#### Delete

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Delete vector store
client.vector_stores.delete( vector_store_id='vs_123' )

#### List

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# List stores
client.vector_stores.list( )

### II. Vector File Operations

#### Create

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create files for vector store
client.vector_stores.files.create(
    vector_store_id='vs_123',
    file_id='file_123',
    attributes={
        'region': 'US',
        'category': 'Marketing',
        'date': 1672531200      # Jan 1, 2023
    }
)
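
The `date` attribute above is a Unix timestamp in seconds. A short sketch of how such a value can be derived with the standard library follows; `to_unix_seconds` is a hypothetical helper, and it assumes the date is meant as midnight UTC.

```python
from datetime import datetime, timezone

def to_unix_seconds( year, month, day ):
    '''Return the Unix timestamp (seconds) for midnight UTC on the given date.'''
    return int( datetime( year, month, day, tzinfo=timezone.utc ).timestamp( ) )

print( to_unix_seconds( 2023, 1, 1 ) )  # 1672531200
```

Storing dates as numbers keeps the attribute comparable, so a range condition on `date` stays a plain numeric comparison.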

#### Upload

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

client.vector_stores.files.upload_and_poll( vector_store_id='vs_123',
    file=open('customer_policies.txt', 'rb') )


#### Retrieve

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

client.vector_stores.files.retrieve( vector_store_id='vs_123', file_id='file_123' )


#### Update

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Update file
client.vector_stores.files.update(
    vector_store_id='vs_123',
    file_id='file_123',
    attributes={'key': 'value'}
)


#### Delete

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Delete file
client.vector_stores.files.delete( vector_store_id='vs_123', file_id='file_123' )


#### List

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# List files
client.vector_stores.files.list( vector_store_id='vs_123' )


### III. Vector Batch Operations

#### Create

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

client.vector_stores.file_batches.create_and_poll( vector_store_id='vs_123',
    file_ids=['file_123', 'file_456'] )


#### Retrieve

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

client.vector_stores.file_batches.retrieve( vector_store_id='vs_123', batch_id='vsfb_123' )


#### Cancel

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

client.vector_stores.file_batches.cancel( vector_store_id='vs_123', batch_id='vsfb_123' )


#### List

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

client.vector_stores.file_batches.list( vector_store_id='vs_123' )


# Assistants API
- Designed to help developers build powerful AI assistants capable of performing a variety of tasks.
- Build AI assistants within your own applications.
##### Supports the following tools:
1. Code Interpreter
2. File Search
3. Function calling.
___

#### Create assistant

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

my_assistant = client.beta.assistants.create(
    instructions='You are a personal math tutor. When asked a question, write and run Python code to answer the question.',
    name='Math Tutor',
    tools=[{'type': 'code_interpreter'}],
    model='gpt-4o',
)
print(my_assistant)


#### List Assistants

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

my_assistants = client.beta.assistants.list(
    order='desc',
    limit=20,
)
print(my_assistants.data)


#### Retrieve Assistant

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

my_assistant = client.beta.assistants.retrieve('asst_abc123')
print(my_assistant)


#### Update Assistant

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

my_updated_assistant = client.beta.assistants.update(
  'asst_abc123',
  instructions='You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.',
  name='HR Helper',
  tools=[{'type': 'file_search'}],
  model='gpt-4o'
)

print(my_updated_assistant)


#### Delete Assistant

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

response = client.beta.assistants.delete('asst_abc123')
print(response)


### I. Threads

#### Create

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

empty_thread = client.beta.threads.create()
print(empty_thread)


#### Retrieve

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

my_thread = client.beta.threads.retrieve('thread_abc123')
print(my_thread)


#### Update

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

my_updated_thread = client.beta.threads.update(
  'thread_abc123',
  metadata={
    'modified': 'true',
    'user': 'abc123'
  }
)
print(my_updated_thread)
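
Metadata is a flat set of string key-value pairs, so it can help to validate it locally before sending an update. The limits in this sketch (16 pairs, 64-character keys, 512-character values) reflect my understanding of the documented constraints and should be treated as assumptions to verify; `validate_metadata` is a hypothetical helper.

```python
# Assumed limits; verify against the current API documentation.
MAX_PAIRS = 16
MAX_KEY_LENGTH = 64
MAX_VALUE_LENGTH = 512

def validate_metadata( metadata ):
    '''Return a list of problems found in a metadata dict (empty when valid).'''
    problems = [ ]
    if len( metadata ) > MAX_PAIRS:
        problems.append( f'too many pairs: {len( metadata )} > {MAX_PAIRS}' )
    for key, value in metadata.items( ):
        if len( key ) > MAX_KEY_LENGTH:
            problems.append( f'key too long: {key!r}' )
        if not isinstance( value, str ):
            problems.append( f'value for {key!r} is not a string' )
        elif len( value ) > MAX_VALUE_LENGTH:
            problems.append( f'value too long for key {key!r}' )
    return problems

print( validate_metadata( { 'modified': 'true', 'user': 'abc123' } ) )  # []
```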


#### Delete

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

response = client.beta.threads.delete('thread_abc123')
print(response)


### II. Messages

#### Create

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Create message
thread_message = client.beta.threads.messages.create( 'thread_abc123',  role='user',
  content='How does AI work? Explain it in simple terms.', )

print(thread_message)


#### List

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# List messages in a thread
thread_messages = client.beta.threads.messages.list( 'thread_abc123' )
print(thread_messages.data)
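
Printing `thread_messages.data` dumps whole message objects. A minimal sketch for pulling out just the text is shown below. It assumes each message carries a `content` list whose text blocks have `type == 'text'` and a `text.value` string; `message_text` is a hypothetical helper, and the stand-in object lets the sketch run without an API call.

```python
from types import SimpleNamespace

def message_text( message ):
    '''Join the text of every text block in a message, skipping other block types.'''
    parts = [ block.text.value for block in message.content if block.type == 'text' ]
    return '\n'.join( parts )

# Stand-in object shaped like a thread message (assumed layout).
fake_message = SimpleNamespace( role='assistant', content=[
    SimpleNamespace( type='text', text=SimpleNamespace( value='AI learns patterns from data.' ) ),
    SimpleNamespace( type='image_file', image_file=SimpleNamespace( file_id='file-abc123' ) ),
] )

print( message_text( fake_message ) )
```

With a real response, the same helper could be applied as `[ message_text( m ) for m in thread_messages.data ]`.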


#### Retrieve

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Retrieve a message from a thread
message = client.beta.threads.messages.retrieve(
  message_id='msg_abc123',
  thread_id='thread_abc123',
)
print(message)


#### Update

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Update a message in a thread
message = client.beta.threads.messages.update(
  message_id='msg_abc12',
  thread_id='thread_abc123',
  metadata={
    'modified': 'true',
    'user': 'abc123',
  },

)
print(message)


#### Delete

In [None]:
# Create client
client = OpenAI( )
client.api_key = os.getenv( 'OPENAI_API_KEY' )

# Delete a message in a thread
deleted_message = client.beta.threads.messages.delete( message_id='msg_abc12',
    thread_id='thread_abc123', )

print(deleted_message)
