# Azure OpenAI Deep Dive


## OpenAI offerings

#### Language models
- Completion and Chat
- Embedding

#### Other models
- DALL-E
- Whisper


In [None]:
from dotenv import load_dotenv
load_dotenv()
import os 
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_KEY = os.getenv("AZURE_OPENAI_KEY")
AZURE_OPENAI_VERSION = os.getenv("AZURE_OPENAI_VERSION")
CHAT_MODEL = os.getenv("AZURE_OPENAI_MODEL_CHAT")
EMBEDDING_MODEL = os.getenv("AZURE_OPENAI_MODEL_EMBEDDING")



In [None]:
# Call Azure OpenAI through its REST API
import requests 
import json 
URL = f'{AZURE_OPENAI_ENDPOINT}/openai/deployments/{CHAT_MODEL}/chat/completions'
HEADERS = {
    "api-key": AZURE_OPENAI_KEY,
    "Content-Type": "application/json",
}
PARAMS = {
    "api-version": AZURE_OPENAI_VERSION,
}
DATA = {
    "messages":[
        {"role":"system", "content": "You are an assistant."},
        {"role":"user", "content": "Write 3 paragraphs about the history of Christmas."},
    ],
    "max_tokens": 800,
    "temperature": 0.7,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "top_p": 0.95,
    "stop": None,
}

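# Send the JSON-encoded payload; the API version is passed as a query parameter.
res = requests.post(URL, data=json.dumps(DATA), headers=HEADERS, params=PARAMS)
res.raise_for_status()  # fail early on HTTP errors such as 401 or 404
response = res.json()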


In [None]:
response

In [None]:
for result in response['prompt_filter_results']:
    print(f'prompt_index: {result["prompt_index"]}')
    content_filter = result['content_filter_results']
    for key in content_filter:
        print(f'  - {key}: {content_filter[key]}')
        

In [None]:
response['choices']

In [None]:
response['usage']
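
The cells above inspect `choices` and `usage` one at a time. As a minimal sketch, the helper below collects the fields most people need into one dictionary. The function name `summarize_chat_response` and the `sample_response` dictionary are illustrative assumptions: the sample is hand-written to mirror the shape of a chat completion response and is not real service output.

```python
def summarize_chat_response(response: dict) -> dict:
    """Pull the reply text, finish reason, and token counts out of a response dict."""
    choices = response.get("choices") or []
    first = choices[0] if choices else {}
    usage = response.get("usage") or {}
    return {
        "text": (first.get("message") or {}).get("content") or "",
        "finish_reason": first.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }


# Hypothetical sample that mimics the response shape; not real service output.
sample_response = {
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Hello!"},
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15},
}

summary = summarize_chat_response(sample_response)
print(summary)
```

In the notebook, the same helper could be called as `summarize_chat_response(response)` once the REST call has succeeded.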

## Using the OpenAI Python Package

In [None]:
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
    api_key=AZURE_OPENAI_KEY,
    api_version=AZURE_OPENAI_VERSION,
)

chat_completion = client.chat.completions.create(
    messages=[
        {"role":"system", "content": "You are an assistant."},
        {"role":"user", "content": "Write 3 paragraphs about the history of Christmas."},
    ],
    model=CHAT_MODEL,
    max_tokens=800,
    temperature=0.7,
    frequency_penalty=0,
    presence_penalty=0,
    top_p=0.95,
    stop=None,
)

chat_completion

In [None]:
chat_completion.choices[0].message

In [None]:
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
    api_key=AZURE_OPENAI_KEY,
    api_version=AZURE_OPENAI_VERSION,
)

response_stream = client.chat.completions.create(
    messages=[
        {"role":"system", "content": "You are an assistant."},
        {"role":"user", "content": "Write 3 paragraphs about the history of Christmas."},
    ],
    model=CHAT_MODEL,
    max_tokens=800,
    temperature=0.7,
    frequency_penalty=0,
    presence_penalty=0,
    top_p=0.95,
    stop=None,
    stream=True,
)

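counter = 0
for s in response_stream:
    # Some chunks may carry no choices; skip them so they do not affect the counter.
    if not s.choices:
        continue
    counter += 1
    print(s.choices[0].delta.content or '', end='', flush=True)
    # Break the line after every 10 chunks so the output stays readable.
    if counter % 10 == 0:
        print('')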
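
Printing chunks as they arrive is useful for display, but the full reply is often needed afterwards as one string. The sketch below shows one way to accumulate the streamed pieces. It assumes each chunk exposes `choices[0].delta.content`, as in the loop above; `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks are built with `SimpleNamespace` only so the example runs without calling the service.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text pieces of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # chunk without choices: nothing to append
        piece = chunk.choices[0].delta.content
        if piece:  # content can be None, for example on the final chunk
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text=None, with_choice=True):
    """Build a stand-in object with the same attribute layout as a stream chunk."""
    if not with_choice:
        return SimpleNamespace(choices=[])
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [
    fake_chunk(with_choice=False),
    fake_chunk("Christ"),
    fake_chunk("mas"),
    fake_chunk(None),
]
full_text = collect_stream(fake_stream)
print(full_text)
```

With a real stream, `collect_stream(response_stream)` would consume the iterator, so it replaces the printing loop rather than running after it.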
    

## Embedding and Vectorization

Humans understand that the word "apple", a picture of an apple, and the word "Apfel" (German for "apple") all refer to the same thing.

A computer, by default, treats these as three unrelated entities.

How can we teach a computer that the three are similar, and that they are different from the Apple iPhone?

### Similarity in mathematics

- Two numbers `a` and `b` are similar if `|a-b|` is small. 
- Two 2-D points `A=(a1,a2)` and `B=(b1,b2)` are similar if ...
- Two n-D points `A=(a1,...,an)` and `B=(b1,...,bn)` are similar if ...

There are multiple definitions of similarity in linear algebra.
Three of the most common ones are:
- Euclidean distance: `d(A,B) = SQRT((a1-b1)^2 + ... + (an-bn)^2)`
    - A and B are more similar when the distance is closer to 0.
- Dot product: `dot(A,B) = A . B = a1*b1 + ... + an*bn`
    - A and B are more similar when the dot product is larger.
- Cosine similarity: `cos_sim(A,B) = (A . B)/(||A|| * ||B||)`
    - A and B are more similar when the cosine similarity is closer to 1.
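
To make the three measures concrete, here is a minimal pure-Python sketch applied to small 2-D points. The function names and the points `A`, `B`, and `C` are chosen for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a, b):
    """Dot product: sum of the element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


A = [1.0, 2.0]
B = [2.0, 4.0]   # same direction as A, twice as long
C = [2.0, -1.0]  # perpendicular to A

print("euclidean(A, B):", euclidean(A, B))
print("dot_product(A, B):", dot_product(A, B))
print("cosine_similarity(A, B):", cosine_similarity(A, B))
print("cosine_similarity(A, C):", cosine_similarity(A, C))
```

`A` and `B` point in the same direction, so their cosine similarity is 1 (up to floating-point rounding) even though their Euclidean distance is not 0. This is why the choice of measure matters: cosine similarity ignores vector length, while Euclidean distance and the dot product do not.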

### Converting text to vectors

- A very difficult problem.
- Context-dependent.

### A new era of search

- Anything can be converted to vectors.
- There are models for converting: 
    - image to vector
    - video to vector
    - audio to vector
    - text in different languages to vector
- You can search across images and videos.
- You can input an image and get related videos and text articles.
- You can hum music and find soundtracks.
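
Once every item is a vector, search reduces to finding the stored vectors closest to the query vector. The sketch below is a brute-force version of that idea under stated assumptions: the 3-D vectors and file names in `toy_index` are made up for illustration (real embeddings come from a model and have many more dimensions), and `search` is a hypothetical helper, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, index, top_k=2):
    """Score every item against the query and return the top_k best matches."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(name, score) for score, name in scored[:top_k]]


# Made-up vectors standing in for embeddings of an image, a text, and a video.
toy_index = {
    "photo_of_apple.jpg": [0.9, 0.1, 0.0],
    "apfel_article.txt": [0.8, 0.2, 0.1],
    "sky_timelapse.mp4": [0.0, 0.1, 0.9],
}

query = [1.0, 0.0, 0.0]  # pretend this is the embedding of the word "apple"
results = search(query, toy_index)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A linear scan like this is fine for a handful of items. Larger collections typically rely on a vector database or an approximate nearest-neighbor index instead.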



In [None]:
res = client.embeddings.create(
    input=[
        "I ate an Apple.",
        "ich habe einen Apfel gegessen.",
        "Sky is blue.",
        "من یک سیب خوردم.",
        ],
    model=EMBEDDING_MODEL,
)

vector = res.data[0].embedding
print(f"vector length: {len(vector)}")
vector[:10]

In [None]:
apple, apfel, sky, sib = [r.embedding for r in res.data]

In [None]:
def dot(a,b):
    return sum([x*y for x,y in zip(a,b)])

print(f"dot(apple,apple): {dot(apple,apple)}")
print(f"dot(apple,apfel): {dot(apple,apfel)}")
print(f"dot(apple,sky): {dot(apple,sky)}")
print(f"dot(apfel,sky): {dot(apfel,sky)}")
print(f"dot(apple,sib): {dot(apple,sib)}")

# OpenAI embeddings are normalized to unit length, so the dot product and cosine similarity give the same value.
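
The reason is that cosine similarity divides the dot product by the two vector lengths, and dividing by 1 changes nothing. The sketch below checks this on small made-up vectors rather than real embeddings; `norm`, `normalize`, and `cosine` are illustrative helper names.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Length of a vector."""
    return math.sqrt(dot(a, a))


def normalize(a):
    """Scale a vector so that its length becomes 1."""
    n = norm(a)
    return [x / n for x in a]


def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))


raw_u = [3.0, 4.0]
raw_v = [4.0, 3.0]
u = normalize(raw_u)
v = normalize(raw_v)

print("norm(u):", norm(u))
print("dot(u, v):", dot(u, v))
print("cosine(raw_u, raw_v):", cosine(raw_u, raw_v))
```

The same check can be run on a real embedding: if `norm(apple)` prints a value very close to 1, the dot products printed above can be read directly as cosine similarities.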