# Going Beyond Text Completions

OpenAI has developed models not only for text generation but also for text moderation, audio transcription, and audio translation. You'll learn to use OpenAI's moderation models to detect violations of their terms of use, a crucial function in user-facing applications. You'll also discover how the Whisper model can transcribe and translate audio from different languages, which is valuable for automated meeting notes and caption generation.

## Text moderation
### Requesting moderation
Aside from text and chat completion models, OpenAI provides models with other capabilities, including text moderation. OpenAI's text moderation model evaluates prompts and responses to determine whether they violate OpenAI's usage policies, for example by inciting hate speech or promoting violence.

In this exercise, you'll test OpenAI's moderation functionality on a sentence that a traditional word-detection algorithm might have flagged as containing violent content.


- Check if "My favorite book is To Kill a Mockingbird." violates OpenAI’s policies using the Moderations endpoint.
- Print the category scores to see the results.

In [None]:
from openai import OpenAI

client = OpenAI(api_key="<OPENAI_API_TOKEN>")

# Create a request to the Moderation endpoint
response = client.moderations.create(
    model="text-moderation-latest",
    input="My favorite book is To Kill a Mockingbird."
)

# Print the category scores
print(response.results[0].category_scores)

### Examining moderation category scores
The same request you created in the last exercise to the Moderation endpoint has been run again, sending the sentence "My favorite book is To Kill a Mockingbird." to the model. The response from the API has been printed for you, and is available as response.

What is the correct interpretation of the category_scores here?


`ModerationCreateResponse(id='modr-BBRkPefIkB3xbchgGTDbJ96xDzy43', model='text-moderation-007', results=[Moderation(categories=Categories(harassment=False, harassment_threatening=False, hate=False, hate_threatening=False, illicit=None, illicit_violent=None, self_harm=False, self_harm_instructions=False, self_harm_intent=False, sexual=False, sexual_minors=False, violence=False, violence_graphic=False, self-harm=False, sexual/minors=False, hate/threatening=False, violence/graphic=False, self-harm/intent=False, self-harm/instructions=False, harassment/threatening=False), category_applied_input_types=None, category_scores=CategoryScores(harassment=5.243551186140394e-06, harassment_threatening=1.1516095810293336e-06, hate=4.767837526742369e-05, hate_threatening=3.2021056028952444e-08, illicit=None, illicit_violent=None, self_harm=9.466615438213921e-07, self_harm_instructions=5.426785065765216e-08, self_harm_intent=1.5536235764557205e-07, sexual=3.545879735611379e-06, sexual_minors=1.1304399549771915e-06, violence=0.0001064608441083692, violence_graphic=1.086988686438417e-05, self-harm=9.466615438213921e-07, sexual/minors=1.1304399549771915e-06, hate/threatening=3.2021056028952444e-08, violence/graphic=1.086988686438417e-05, self-harm/intent=1.5536235764557205e-07, self-harm/instructions=5.426785065765216e-08, harassment/threatening=1.1516095810293336e-06), flagged=False)])
`

Possible answers
1. The model believes that the sentence contains violent content, as the violence category is close to 0.
2. The model believes that there are no violations, as all categories are close to 0.
3. The model believes that the sentence contains hate speech, as the hate category is close to 0.
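
One way to check an interpretation of these scores is to rank them and compare each against a cutoff. The following is a minimal sketch, not OpenAI's method: the scores are a subset copied from the response printed above, while `rank_scores`, `categories_above`, and the `THRESHOLD` value of 0.5 are assumptions introduced for illustration.

```python
# Category scores copied from the sample response above (a subset);
# None marks categories that this response did not score.
category_scores = {
    "harassment": 5.243551186140394e-06,
    "hate": 4.767837526742369e-05,
    "illicit": None,
    "self_harm": 9.466615438213921e-07,
    "sexual": 3.545879735611379e-06,
    "violence": 0.0001064608441083692,
    "violence_graphic": 1.086988686438417e-05,
}

# Hypothetical cutoff chosen for illustration only; the API's own decision
# is reported in the response's flagged field.
THRESHOLD = 0.5


def rank_scores(scores):
    """Return (category, score) pairs sorted from highest to lowest score."""
    scored = [(name, value) for name, value in scores.items() if value is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def categories_above(scores, threshold):
    """Return the names of categories whose score exceeds the threshold."""
    return [name for name, value in rank_scores(scores) if value > threshold]


ranked = rank_scores(category_scores)
for name, value in ranked[:3]:
    print(f"{name}: {value:.6f}")
print("Above threshold:", categories_above(category_scores, THRESHOLD))
```

With a live response, `response.results[0].category_scores.model_dump()` should produce a comparable dictionary, assuming a recent version of the openai Python package, whose response objects are Pydantic models.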

## Speech-to-Text Transcription with Whisper

### Creating a podcast transcript
The OpenAI API Audio endpoint provides access to the Whisper model, which can be used for speech-to-text transcription and translation. In this exercise, you'll create a transcript from a DataFramed podcast episode with OpenAI Developer, Logan Kilpatrick.

If you'd like to hear more from Logan, check out the full ChatGPT and the OpenAI Developer Ecosystem podcast episode.


- Open the openai-audio.mp3 file.
- Create a transcription request to the Audio endpoint.
- Extract and print the transcript text from the response.

In [None]:
from openai import OpenAI

client = OpenAI(api_key="<OPENAI_API_TOKEN>")

# Open the openai-audio.mp3 file and create a transcript from it;
# the with block closes the file once the request completes
with open("openai-audio.mp3", "rb") as audio_file:
    response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)

# Extract and print the transcript text
print(response.text)
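
If you transcribe more than one file, the open, request, and extract steps can be wrapped in a small function. This is a sketch under stated assumptions: `transcribe_file` is a hypothetical helper name, and `fake_client` is a stand-in object that mimics only the `audio.transcriptions.create` call so the example runs without an API key or a real recording.

```python
import os
import tempfile
from types import SimpleNamespace


def transcribe_file(client, path, model="whisper-1"):
    """Open an audio file, send it for transcription, and return the text."""
    with open(path, "rb") as audio_file:
        response = client.audio.transcriptions.create(model=model, file=audio_file)
    return response.text


# Hypothetical stand-in for the OpenAI client so the sketch runs offline;
# it reports how many bytes it received instead of transcribing anything.
def _fake_create(model, file):
    return SimpleNamespace(text=f"[{model}] received {len(file.read())} bytes")


fake_client = SimpleNamespace(
    audio=SimpleNamespace(transcriptions=SimpleNamespace(create=_fake_create))
)

# Write a few placeholder bytes to a temporary file to act as the audio
with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as tmp:
    tmp.write(b"not real audio")
    tmp_path = tmp.name

demo_text = transcribe_file(fake_client, tmp_path)
os.remove(tmp_path)
print(demo_text)
```

With a real client, the call would look like `transcribe_file(client, "openai-audio.mp3")`.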

### Transcribing a non-English language
The Whisper model not only transcribes English but also performs well on speech in many other languages.
In this exercise, you'll create a transcript from audio.m4a, which contains speech in Portuguese.
 
- Open the audio.m4a file.
- Create a request to the Audio endpoint to transcribe audio.m4a.

In [None]:
from openai import OpenAI

client = OpenAI(api_key="<OPENAI_API_TOKEN>")

# Open the audio.m4a file and create a transcript from it
with open("audio.m4a", "rb") as audio_file:
    response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)

# Extract and print the transcript text
print(response.text)

## Speech Translation with Whisper

### Translating Portuguese
Whisper can not only transcribe audio in the language in which it was spoken but also translate it, producing an English transcript.

In this exercise, you'll return to the Portuguese audio, but this time, you'll translate it into English!


- Open the audio.m4a file.
- Create a translation request to the Audio endpoint.
- Extract and print the translated text from the response.

In [None]:
from openai import OpenAI

client = OpenAI(api_key="<OPENAI_API_TOKEN>")

# Open the audio.m4a file and create a translation from it
with open("audio.m4a", "rb") as audio_file:
    response = client.audio.translations.create(model="whisper-1", file=audio_file)

# Extract and print the translated text
print(response.text)

### Translating with prompts
The quality of Whisper's translation can vary depending on the language spoken, the audio quality, and the model's awareness of the subject matter. If you have any extra context about what is being spoken about, you can send it along with the audio to the model to give it a helping hand.

You've been provided with an audio file, audio.wav; you're not sure what language is spoken in it, but you do know it relates to a recent World Bank report. Because you don't know how well the model will perform on this unknown language, you opt to send the model this extra context to steer it in the right direction.

- Open the audio.wav file.
- Write a prompt that informs the model that the audio relates to a recent World Bank report, which will help the model produce an accurate translation.
- Create a request to the Audio endpoint to translate audio.wav using your prompt.

In [None]:
from openai import OpenAI

client = OpenAI(api_key="<OPENAI_API_TOKEN>")

# Write an appropriate prompt to help the model
prompt = "The audio relates to a recent World Bank report."

# Open the audio.wav file and create a translation from it using the prompt
with open("audio.wav", "rb") as audio_file:
    response = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
        prompt=prompt
    )

# Extract and print the translated text
print(response.text)
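
When you send context with several audio files, it can help to build the prompt from its parts rather than retyping it. The sketch below is illustrative only: `build_audio_prompt` is a hypothetical helper, and the key terms are made-up examples of vocabulary you might expect, not content taken from the audio.

```python
def build_audio_prompt(topic, terms=()):
    """Compose a short context prompt from a topic and optional key terms."""
    prompt = f"The audio relates to {topic}."
    if terms:
        prompt += " Terms that may appear: " + ", ".join(terms) + "."
    return prompt


# Hypothetical example terms; replace them with vocabulary you expect in the audio
prompt = build_audio_prompt(
    "a recent World Bank report",
    terms=["World Bank", "GDP", "inflation"],
)
print(prompt)
```

The resulting string can be passed as the `prompt` argument of the translation request.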

## Combining models

### Identifying audio language
You've learned that you're not limited to creating a single request: you can feed the output of one model as an input to another! This is called chaining, and it opens the door to more complex, multi-modal use cases.

In this exercise, you'll practice model chaining to identify the language used in an audio file. You'll do this by bringing together OpenAI's audio transcription functionality and its text models with only a few lines of code.

- Open the audio.wav file and assign to audio_file.
- Create a transcript from audio_file and assign to audio_response.
- Prompt a text model using the text from audio_response to discover the language used in audio.wav.

In [None]:
from openai import OpenAI

client = OpenAI(api_key="<OPENAI_API_TOKEN>")

# Open the audio.wav file and create a transcription request using audio_file
with open("audio.wav", "rb") as audio_file:
    audio_response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
transcript = audio_response.text

# Create a request to the API to identify the language spoken;
# the blank line keeps the instruction separate from the transcript
chat_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What language is the following text written in?\n\n" + transcript}]
)
print(chat_response.choices[0].message.content)
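
The chain can also be packaged as one function so that the data flow from the first model to the second is explicit. This is a sketch, assuming a client object that exposes the same two calls used in this exercise; `identify_language` is a hypothetical name, and `fake_client` with its canned Portuguese sentence is a stand-in that lets the example run offline.

```python
import io
from types import SimpleNamespace


def identify_language(client, audio_file, chat_model="gpt-4o-mini"):
    """Transcribe audio, then ask a chat model which language the transcript is in."""
    # Step 1: speech to text
    audio_response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
    transcript = audio_response.text

    # Step 2: feed the transcript into a chat request
    chat_response = client.chat.completions.create(
        model=chat_model,
        messages=[
            {
                "role": "user",
                "content": f"What language is the following text written in?\n\n{transcript}",
            }
        ],
    )
    return chat_response.choices[0].message.content


# Hypothetical stand-ins for the two API calls, returning canned values
def _fake_transcribe(model, file):
    return SimpleNamespace(text="Bom dia a todos.")


def _fake_chat(model, messages):
    # Echo the last line of the prompt to show that the transcript arrived
    last_line = messages[-1]["content"].splitlines()[-1]
    message = SimpleNamespace(content=f"stub saw: {last_line}")
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(
    audio=SimpleNamespace(transcriptions=SimpleNamespace(create=_fake_transcribe)),
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_chat)),
)

result = identify_language(fake_client, io.BytesIO(b"placeholder audio bytes"))
print(result)
```

The stub only echoes what it receives; with a real client, the second call returns the chat model's answer about the language.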

### Creating meeting summaries
Time for business! One time-consuming task that many find themselves doing day-to-day is taking meeting notes to summarize attendees, discussion points, next steps, etc.

In this exercise, you'll use AI to augment this task to not only save a substantial amount of time, but also to empower attendees to focus on the discussion rather than administrative tasks. You've been provided with a recording from DataCamp's Q2 Roadmap webinar, which summarizes what DataCamp will be releasing during that quarter. You'll chain the Whisper model with a text or chat model to discover which courses will be launched in Q2.

- Open the datacamp-q2-roadmap.mp3 file and assign to audio_file.
- Create a transcript from audio_file and assign to audio_response.
- Prompt a text model with the text in transcript, asking it to summarize the content into concise bullet points.

In [None]:
from openai import OpenAI

client = OpenAI(api_key="<OPENAI_API_TOKEN>")

# Open the datacamp-q2-roadmap.mp3 file and create a transcription request using audio_file
with open("datacamp-q2-roadmap.mp3", "rb") as audio_file:
    audio_response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
transcript = audio_response.text

# Create a request to the API to summarize the transcript into bullet points;
# the blank line keeps the instruction separate from the transcript
chat_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Summarize the following transcript into concise bullet points:\n\n" + transcript}
    ],
    max_tokens=100
)
print(chat_response.choices[0].message.content)
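
An alternative to concatenating the instruction and the transcript is to put the instruction in a system message and the transcript in a user message. The following is one possible arrangement, not the exercise's required solution: `build_summary_messages`, the `max_bullets` parameter, and the one-sentence placeholder transcript are all assumptions made for illustration.

```python
def build_summary_messages(transcript, max_bullets=5):
    """Build chat messages that keep the instructions separate from the transcript."""
    system = (
        "You summarize meeting transcripts. "
        f"Reply with at most {max_bullets} concise bullet points."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Transcript:\n{transcript}"},
    ]


# Hypothetical placeholder transcript, not taken from the actual recording
messages = build_summary_messages("Welcome to the Q2 roadmap webinar.", max_bullets=3)
for message in messages:
    print(message["role"], "->", message["content"])
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create`. Stating a bullet limit in the instruction may also help the summary fit within a small `max_tokens` budget, which would otherwise cut the response off mid-sentence.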