## Text Generation - chat completions endpoint

In [None]:
from openai import OpenAI

client = OpenAI(api_key="OpenAI_API_Key")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is OpenAI API?"}]
)

# response from OpenAI is a ChatCompletion object.
print(response.choices[0].message.content)

### Generating and Transforming Text
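- `temperature`: controls randomness; ranges from 0 to 2 (default 1). Lower values give more focused output, higher values give more varied output.
- `max_tokens`: caps the length of the response in tokens; lower it for shorter responses or raise it to allow longer ones.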
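
As a minimal sketch of how these two parameters fit into a request, the helper below builds the keyword arguments for `client.chat.completions.create`. The name `build_chat_request` and its default values are illustrative assumptions, not part of the OpenAI library; the range check simply mirrors the 0 to 2 range noted above.

```python
def build_chat_request(prompt, temperature=1.0, max_tokens=100, model="gpt-4o-mini"):
    """Return keyword arguments for client.chat.completions.create (hypothetical helper)."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


# Low temperature for a less random text transformation, with a short reply cap.
request = build_chat_request(
    "Replace 'car' with 'plane' in: A car is a vehicle.",
    temperature=0,
    max_tokens=50,
)
print(request)

# With a configured client, the request could be sent as:
# response = client.chat.completions.create(**request)
```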

#### Cost
- Usage costs depend on the amount of input and output text
    - Models are priced per token
    - Input and output tokens can be priced differently
- Increasing `max_tokens` allows longer responses, which can increase cost
- Scoping a feature's cost often starts with a rough calculation of cost per unit of time (for example, per day).
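
A rough cost calculation of this kind might look like the sketch below. The prices, token counts, and request volume are placeholder assumptions chosen for illustration, not actual OpenAI pricing; `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_1m, output_price_per_1m):
    """Estimate the cost of one request in dollars, given per-million-token prices."""
    return (input_tokens * input_price_per_1m + output_tokens * output_price_per_1m) / 1_000_000


# Hypothetical prices in dollars per 1M tokens; check the current pricing page.
INPUT_PRICE = 0.15
OUTPUT_PRICE = 0.60

# Assume each request sends 500 input tokens and max_tokens caps the reply at 200.
per_request = estimate_cost(500, 200, INPUT_PRICE, OUTPUT_PRICE)

# Assume 10,000 requests per day to get a cost per unit of time.
per_day = per_request * 10_000

print(f"Estimated cost per request: ${per_request:.6f}")
print(f"Estimated cost per day: ${per_day:.2f}")
```

Because input and output prices are passed separately, the same helper covers models that price the two differently.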


### Single-turn tasks
- Text generation
- Text transformation
- Text classification

### Multi-turn conversations
Conversations that build on previous prompts and responses.

#### Roles
- System: controls the assistant's behavior
- User: instructs the assistant
- Assistant: responds to the user's instruction
    - can also be written by the user to provide examples
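
The sketch below shows one way the three roles could be combined into a conversation history. `add_turn` is a hypothetical helper and the message contents are made up for illustration; only the `role`/`content` message format comes from the request shown earlier.

```python
def add_turn(messages, role, content):
    """Append one message to the conversation history (hypothetical helper)."""
    if role not in {"system", "user", "assistant"}:
        raise ValueError(f"unknown role: {role}")
    messages.append({"role": role, "content": content})
    return messages


# The system message sets the assistant's behavior for the whole conversation.
messages = [{"role": "system", "content": "You are a concise Python tutor."}]

# A user/assistant pair written by hand acts as an example of the desired answers.
add_turn(messages, "user", "What does len() do?")
add_turn(messages, "assistant", "It returns the number of items in an object.")

# The new question is appended last, so the model sees the full history.
add_turn(messages, "user", "And what about sum()?")

for message in messages:
    print(f"{message['role']}: {message['content']}")

# With a configured client, the history could be sent and extended like this:
# response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
# add_turn(messages, "assistant", response.choices[0].message.content)
```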

## Text moderation - moderations endpoint
Text moderation is the process of identifying text that is inappropriate for the context it is being used in.  
Traditionally, this moderation was done by hand, with a team of moderators flagging content that breached usage rules. Manual moderation is extremely time-consuming and, if multiple moderators are involved, introduces a subjective element that may result in inconsistencies.  
More recently, moderation has been done by algorithms that detect and flag content containing particular words. Keyword pattern matching, although much quicker and able to run round the clock, can be a clumsy tool: it misses some malicious content while accidentally flagging perfectly good content, because it doesn't understand nuance or the context of the discussion.

To prevent the misuse of their own models, OpenAI have developed moderation models to flag content that breaches their usage policies.
- Identify violations of terms of use
- Differentiate violation type by category
    - Violence
    - Hate Speech

In [None]:
from openai import OpenAI

client = OpenAI(api_key="OpenAI_API_Key")

response = client.moderations.create(
    model="text-moderation-latest",
    input="I could kill for a hamburger."
)

# dump the response to a dictionary for easy readability
print(response.model_dump())

### Interpreting the results
- `categories`: `true`/`false` indicator of a violation for each category
- `category_scores`: float values for each category indicating the model's confidence of a violation
    - The scores range from 0 to 1. Higher values represent higher confidence, but they should not be interpreted as probabilities.
    - The model considers the whole sentence, not just individual keywords, to infer the statement's meaning before scoring it.
    - Determine appropriate thresholds for each use case.
- `flagged`: `true`/`false` indicator of overall violation i.e., whether the terms of use have been violated in any way.
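
One way to apply use-case-specific thresholds to `category_scores` is sketched below. The scores and thresholds are made-up values for illustration, and `flag_by_threshold` is a hypothetical helper; the input dictionary is assumed to have the shape of `response.model_dump()["results"][0]["category_scores"]`.

```python
def flag_by_threshold(category_scores, thresholds, default_threshold=0.5):
    """Return the categories whose score meets or exceeds its threshold."""
    return sorted(
        category
        for category, score in category_scores.items()
        if score >= thresholds.get(category, default_threshold)
    )


# Made-up scores for illustration, not real model output.
example_scores = {"violence": 0.32, "hate": 0.01, "harassment": 0.04}

# Hypothetical per-category thresholds: stricter for violence in this use case.
thresholds = {"violence": 0.2, "hate": 0.5}

print(flag_by_threshold(example_scores, thresholds))
```

Categories without an explicit threshold fall back to `default_threshold`, so a stricter or more lenient policy only needs to list the categories it cares about.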

## OpenAI's Whisper - audio endpoint

### Speech-to-text capabilities - audio.transcriptions endpoint
- Transcribe audio
- Translate and transcribe audio into English
- Supports `mp3`, `mp4`, `mpeg`, `mpga`, `m4a`, `wav`, and `webm` (25 MB limit)
- Example use: producing a meeting transcript

Sensitive and confidential audio should not be sent to the model.
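
Before uploading, the format and size limits listed above could be checked locally, as in this sketch. `check_audio_file` is a hypothetical helper, and treating 25 MB as 25 * 1024 * 1024 bytes is an assumption.

```python
import os

SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumed interpretation of the 25 MB limit


def check_audio_file(path):
    """Raise ValueError if the file type or size is outside the listed limits."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or 'none'}")
    size = os.path.getsize(path)
    if size > MAX_FILE_BYTES:
        raise ValueError(f"file is {size} bytes, over the {MAX_FILE_BYTES}-byte limit")
    return path


# Usage, assuming the recording exists on disk:
# check_audio_file("path/to/file/meeting_recording.mp3")
```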

In [None]:
# open the file in binary mode (typical for non-text files like audio, video and images);
# the with block closes the file once the request completes
# `client` is the OpenAI client created in an earlier cell
with open("path/to/file/meeting_recording.mp3", "rb") as audio_file:
    response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)

print(response.text)

### Translation capabilities - audio.translations endpoint
- Translate and transcribe audio
- Output is currently limited to an English transcript

### Prompts
- An optional `prompt` can be provided to the model
- It can improve response quality by:
    - providing an example of the desired style, e.g. retaining filler words
    - providing additional context about the transcript

In [None]:
prompt = "The transcript is about AI trends and ChatGPT."

# the with block closes the audio file once the request completes
with open("non-english_audio.m4a", "rb") as audio_file:
    response = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
        # prompt=prompt,  # optional
    )

print(response.text)

## Combine models

### Chaining
Chaining combines models by feeding the output of one model directly into another as input. We can chain multiple calls to the same model or use different models. If we chain two text models, we can ask the first to perform a task in one API call, then send its output back with an additional instruction in a second call.

One example is to transcribe an audio file and then apply OpenAI's text capabilities to the transcript.
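
A minimal sketch of that audio-to-text chain follows. `transcribe_and_summarize` is a hypothetical helper, and the summarization instruction and default chat model are assumptions; the two API calls are the ones used in the earlier cells.

```python
def transcribe_and_summarize(client, audio_path, chat_model="gpt-4o-mini"):
    """Chain two calls: transcribe an audio file, then summarize the transcript."""
    # Step 1: speech-to-text with the Whisper model.
    with open(audio_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1", file=audio_file
        ).text

    # Step 2: feed the transcript into a chat model with an extra instruction.
    instruction = f"Summarize this transcript in three bullet points:\n{transcript}"
    response = client.chat.completions.create(
        model=chat_model,
        messages=[{"role": "user", "content": instruction}],
    )
    return response.choices[0].message.content


# Usage, assuming a configured client and an existing recording:
# from openai import OpenAI
# client = OpenAI(api_key="OpenAI_API_Key")
# print(transcribe_and_summarize(client, "path/to/file/meeting_recording.mp3"))
```

Passing the client in as an argument keeps the function independent of how the API key is configured.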