# What is the OpenAI API?

The OpenAI API provides a simple interface to state-of-the-art AI models for natural language processing, image generation, semantic search, and speech recognition. Typical uses include generating human-like responses to natural language prompts, creating vector embeddings for semantic search, and generating images from textual descriptions.

In [3]:
import os
from openai import OpenAI
from dotenv import load_dotenv

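load_dotenv()  # load variables from a local .env file into the environment

model_name = os.getenv("OPENAI_MODEL")  # model used by the examples below
api_key = os.getenv("OPENAI_API_KEY")
client = OpenAI(api_key=api_key)  # OpenAI() alone also reads OPENAI_API_KEY from the environment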

# capabilities

## [text generation](https://platform.openai.com/docs/guides/text-generation)

Text generation uses the Chat Completions endpoint, exposed as `client.chat.completions` in the OpenAI Python SDK and as `/v1/chat/completions` in the REST API.

In [None]:
# example prompt
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the best way to make a cup of coffee?"},
]

completion = client.chat.completions.create(
    model=model_name,
    messages=messages,
    max_tokens=150,
)

print(completion.choices[0].message.content)



The "best" way to make a cup of coffee can vary greatly depending on personal preference, as well as the type of coffee you enjoy. However, one popular and widely appreciated method is the pour-over technique, known for producing a clean and flavorful cup. Here’s a simple guide to making coffee using the pour-over method:

### Equipment and Ingredients Needed:
- Freshly roasted coffee beans
- Burr grinder (to ensure even grind size)
- Kettle (preferably with a gooseneck spout for precision)
- Pour-over dripper (e.g., Hario V60, Kalita Wave, or Chemex)
- Paper filter (choose one that fits your dripper)
- Digital kitchen scale
- Timer
- Mug


message usage

The `messages` argument is a list of dicts, where each dict is one message with a `role` and a `content`.

- content: a string, or an array of typed content parts
```python
# single content
{"role": "user", "content": "count down from 10"}
# multiple content parts: each part names its type and uses a matching key
{
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "what's in the picture?"
        },
        {
            "type": "image_url",
            "image_url": {"url": "https://***.jpg"}
        }
    ]
}
```



- role

| role | description |
|---|---|
|user| A message from the end user, like the input typed into a web UI or front end. |
|developer (system)| Instructions to the model that are prioritized ahead of user messages, following the chain of command. Previously called the system prompt. |
|assistant| A message generated by the model, perhaps in a previous generation request. Usually included to carry the conversation history. |
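
A minimal sketch of how these roles fit together in a multi-turn conversation. The helper `add_message` and the set `ALLOWED_ROLES` are hypothetical names introduced for illustration and are not part of the OpenAI SDK. The set only covers the roles listed in the table above.

```python
# Hypothetical helper for illustration; not part of the OpenAI SDK.
ALLOWED_ROLES = {"system", "developer", "user", "assistant"}


def add_message(messages, role, content):
    """Return a new message list with one more {"role", "content"} dict."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return messages + [{"role": role, "content": content}]


# Build a short conversation: instructions first, then alternating turns.
conversation = []
conversation = add_message(conversation, "system", "You are a helpful assistant.")
conversation = add_message(conversation, "user", "count down from 3")
# An earlier model reply is passed back as an assistant message so the
# model sees the conversation history on the next request.
conversation = add_message(conversation, "assistant", "3, 2, 1")
conversation = add_message(conversation, "user", "now count up")

for message in conversation:
    print(message["role"], "->", message["content"])
```

The resulting `conversation` list has the same shape as the `messages` list passed to `client.chat.completions.create` in the text generation example.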

## multimodality

### images

#### image2text
By combining a user prompt with image content in the same message (each part is a JSON-like dict), we can ask questions such as what is in a picture. The `detail` field selects low, high, or auto fidelity. Local images can also be sent by encoding them as ```Base64``` in the content.

```python
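import base64

# Placeholder inputs: substitute a real image URL and a real local file.
image_url = "https://example.com/image.jpg"
image_path = "image.jpg"

# Local images are sent inline as a Base64 data URL (JPEG assumed here).
with open(image_path, "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_url,
                        "detail": "high",  # optional
                    },
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}",
                        "detail": "auto",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])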
```

### audio

## structured output

### embeddings

Get embeddings by calling the embeddings API endpoint with the name of an embedding model.


#### fundamental usage of embeddings

1. use `client.embeddings.create` to generate an embedding vector for each text in the corpus, and for the query
2. the vectors can then be stored next to their texts, for example in a dataframe column, and compared against the query vector


In [None]:
embedding_model = "text-embedding-3-small"
client = OpenAI()

response = client.embeddings.create(
    model=embedding_model,
    input="Sample document",
    dimensions=10  # optional, defaults to 1536 for text-embedding-3-small and 3072 for text-embedding-3-large
)

print(len(response.data[0].embedding))

10


#### usage of embeddings

1. embedding-based search
2. text, code, and recommendation search: rank items by cosine similarity between embeddings
3. zero-shot learning
4. cold-start recommendations
5. clustering
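
The similarity measurement behind embedding-based search can be sketched as follows. The vectors are made-up 3-dimensional stand-ins for real embeddings, and `cosine_similarity` and `rank_by_similarity` are illustrative helper names, not SDK functions. The sketch assumes no vector is all zeros.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scores = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (made-up values).
doc_vecs = {
    "coffee": [1.0, 0.0, 0.0],
    "tea": [0.8, 0.6, 0.0],
    "bicycles": [0.0, 0.0, 1.0],
}
query_vec = [1.0, 0.1, 0.0]

ranking = rank_by_similarity(query_vec, doc_vecs)
print([doc_id for doc_id, _ in ranking])
```

With real data, each vector would come from `response.data[0].embedding` as in the cell above, and the highest-scoring documents would be returned as search results.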

### JSON structured output

benefits:
1. simpler prompting: no need to handcraft complex prompts to get a specific format
2. consistency: the output follows the requested format, so less validation and retrying is needed

usage:
1. formatted output
2. chain of thought

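To turn on JSON mode with the Chat Completions or Assistants API, set `response_format` to `{ "type": "json_object" }`. If you are using function calling, JSON mode is always turned on. JSON mode only guarantees syntactically valid JSON; the `CalendarEvent` cell below instead passes a Pydantic model as `response_format`, so the reply is parsed against that schema.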

In [34]:
from pydantic import BaseModel # pydantic is a library for python data validation. often used to define schemas.

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed
print(event)

name='Science Fair' date='Friday' participants=['Alice', 'Bob']
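
For comparison with the Pydantic-based cell above, plain JSON mode can be sketched as below. The helpers `build_json_mode_request` and `parse_json_reply` are hypothetical names, and `sample_reply` is a hand-written stand-in for a model reply so that the sketch runs without a network call. JSON mode does not enforce a schema, so the required keys are checked by hand here.

```python
import json


def build_json_mode_request(model, user_text):
    """Build keyword arguments for client.chat.completions.create in JSON mode."""
    return {
        "model": model,
        "messages": [
            # JSON mode expects the word "JSON" to appear somewhere in the messages.
            {"role": "system", "content": "Extract the event information. Reply in JSON."},
            {"role": "user", "content": user_text},
        ],
        "response_format": {"type": "json_object"},
    }


def parse_json_reply(content):
    """Parse the reply text and check the keys this example relies on."""
    data = json.loads(content)
    missing = {"name", "date", "participants"} - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


request = build_json_mode_request(
    "gpt-4o-2024-08-06",
    "Alice and Bob are going to a science fair on Friday.",
)
# With a configured client this would be sent as:
#   completion = client.chat.completions.create(**request)
#   event = parse_json_reply(completion.choices[0].message.content)

# Hand-written stand-in for a model reply, so the sketch runs offline.
sample_reply = '{"name": "Science Fair", "date": "Friday", "participants": ["Alice", "Bob"]}'
event = parse_json_reply(sample_reply)
print(event["participants"])
```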


### predicted outputs

Reduce latency for model responses where much of the response is known ahead of time.

In [None]:
from openai import OpenAI

code = """
class User {
  firstName: string = "";
  lastName: string = "";
  username: string = "";
}

export default User;
"""

refactor_prompt = """
Replace the "username" property with an "email" property. Respond only
with code, and with no markdown formatting.
"""

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": refactor_prompt
        },
        {
            "role": "user",
            "content": code
        }
    ],
    prediction={
        "type": "content",
        "content": code
    }
)

print(completion)
print(completion.choices[0].message.content)

ChatCompletion(id='chatcmpl-B7PyT1H43QdZbNHEXvBF88p752b45', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='class User {\n  firstName: string = "";\n  lastName: string = "";\n  email: string = "";\n}\n\nexport default User;', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1741107841, model='gpt-4o-2024-08-06', object='chat.completion', service_tier='default', system_fingerprint='fp_f9f4fb6dbf', usage=CompletionUsage(completion_tokens=45, prompt_tokens=66, total_tokens=111, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=16), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
class User {
  firstName: string = "";
  lastName: string = "";
  email: string = "";
}

export default User;


## Reasoning

### [Function calling](https://platform.openai.com/docs/guides/function-calling)

Extend an OpenAI model by giving it access to ```tools```:

|type| description|
|---|---|
|Function Calling| Developer-defined tools|
|Hosted Tools| OpenAI-hosted built-in tools (file search, code interpreter), available through the Assistants API|
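
A minimal sketch of the developer-defined side of function calling. The function `get_weather`, the `REGISTRY` mapping, and the helper `run_tool_call` are hypothetical and exist only for illustration. The tool call at the end is simulated, so the sketch runs without contacting the API.

```python
import json


# Hypothetical developer-defined function; a real one might call a weather service.
def get_weather(city):
    return {"city": city, "forecast": "sunny"}


# Tool description in the Chat Completions "tools" format: a name, a
# description, and a JSON Schema for the arguments.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(name, arguments_json):
    """Execute one requested tool call and return its result as a JSON string."""
    kwargs = json.loads(arguments_json)  # the model sends arguments as a JSON string
    result = REGISTRY[name](**kwargs)
    return json.dumps(result)


# With a configured client, tools=tools would be passed to
# client.chat.completions.create, and each entry of
# completion.choices[0].message.tool_calls would supply
# .function.name and .function.arguments.
# Simulated tool call, so the sketch runs offline:
output = run_tool_call("get_weather", '{"city": "Paris"}')
print(output)
```

In a full exchange, the returned string would be sent back to the model as a message with role `tool` and the matching `tool_call_id`, so the model can use the result in its final answer.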



### [Reasoning with test-time compute](https://platform.openai.com/docs/guides/reasoning)

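Reasoning models produce intermediate reasoning tokens in addition to the input and output tokens. These intermediate tokens are where reasoning performance is traded against reasoning cost: more of them can improve the answer but also add cost and latency. The models are fine-tuned with reinforcement learning to break a problem down into several reasoning steps before answering. The cost is controllable, for example through the optional `reasoning_effort` parameter shown commented out in the cell below. Reasoning models may give better results with high-level prompts than with precise step-by-step instructions.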
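
The token accounting can be sketched with a small helper. `split_completion_tokens` is a hypothetical name, the numbers are made up, and the sketch assumes that reasoning tokens are counted inside `completion_tokens`, as the `completion_tokens_details.reasoning_tokens` field of the usage object suggests.

```python
def split_completion_tokens(completion_tokens, reasoning_tokens):
    """Split completion tokens into hidden reasoning and visible output.

    Assumption: reasoning tokens are included in completion_tokens.
    """
    if reasoning_tokens > completion_tokens:
        raise ValueError("reasoning_tokens cannot exceed completion_tokens")
    return {
        "reasoning": reasoning_tokens,
        "visible": completion_tokens - reasoning_tokens,
    }


# Made-up usage numbers for illustration. With a real response they would
# come from response.usage.completion_tokens and
# response.usage.completion_tokens_details.reasoning_tokens.
breakdown = split_completion_tokens(completion_tokens=900, reasoning_tokens=640)
print(breakdown)
```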

In [9]:
prompt = """
Write a bash script that takes a matrix represented as a string with
format '[1,2],[3,4],[5,6]' and prints the transpose in the same format.
"""

response = client.chat.completions.create(
    model=model_name,
    # reasoning_effort="medium",  # optional
    messages=[
        {
            "role": "user",
            "content": prompt
        }
    ]
)

print(response.choices[0].message.content)

To transpose a matrix represented as a string in the format `[1,2],[3,4],[5,6]`, you can use a bash script that processes the string, transposes the rows and columns, and then outputs the transposed matrix in the same format. Below is a sample bash script to achieve this:

```bash
#!/bin/bash

# Function to parse the input string into an array of arrays (matrix)
parse_matrix() {
    local input="$1"
    input="${input#[}"  # Remove the leading '['
    input="${input%]}"  # Remove the trailing ']'

    # Extract each row and add it to the matrix array
    local matrix=()
    IFS=',' read -r -a rows <<< "${input//],/[}"
    for ((i = 0; i < ${#rows[@]}; i+=2)); do
        matrix+=("(${rows[i]},${rows[i+1]})")
    done

    echo "${matrix[@]}"
}

# Function to transpose the matrix
transpose_matrix() {
    local matrix=("$@")
    local transposed_matrix=()

    # Calculate number of rows and columns
    local num_rows=${#matrix[@]}
    local num_cols=$(echo "${matrix[0]}" | tr -cd ',' | wc
```

# Best practices