Skip to content

Conversation

msingh-openai
Copy link
Contributor

Summary

This document provides a step-by-step guide on how to use OpenAI’s GPT-4o to translate and dub audio files from one language to another, specifically focusing on translating English audio into Hindi. It outlines the key concepts of language and script, as well as the benefits of GPT-4o’s audio-in and audio-out modality, which simplifies the dubbing process by handling transcription and translation in one step.

Motivation

New audio-in and audio-out modality for GPT-4o.


For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

  • I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
  • I have conducted a self-review of my content based on the contribution guidelines:
    • Relevance: This content is related to building with OpenAI technologies and is useful to others.
    • Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
    • Spelling and Grammar: I have checked for spelling or grammatical mistakes.
    • Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
    • Correctness: The information I include is correct and all of my code executes successfully.
    • Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.

Copy link
Contributor

@ericning-o ericning-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments!

" \n",
"**- Language** refers to the spoken or written system of communication. For instance, Hindi and Marathi are different languages, but both use the Devanagari script. Similarly, English and French are different languages, but are written in Latin script. \n",
" \n",
"**- Script** refers to the set of characters or symbols used to write the language. For example, Serbian language traditionally written in Cyrillic Script, is also written in Latin script.\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would link to the wikipedia page on "Writing system" https://en.wikipedia.org/wiki/Writing_system

"\n",
"A note on semantics used in this Cookbook regarding **Language** and written **Script**. These words are generally used interchangeably, though it's important to understand the distinction, given the task at hand. \n",
" \n",
"**- Language** refers to the spoken or written system of communication. For instance, Hindi and Marathi are different languages, but both use the Devanagari script. Similarly, English and French are different languages, but are written in Latin script. \n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, would link to https://en.wikipedia.org/wiki/Language

" }\n",
"\n",
" # Construct the request data\n",
" data = {\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you're using the API directly? Why not use the python library since you made the users install it already?

Copy link
Contributor

@ericning-o ericning-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm overall! Let's add SDK when it gets fully released

@msingh-openai msingh-openai merged commit 38666c4 into main Oct 22, 2024
@msingh-openai msingh-openai deleted the msingh-openai-voice-solutions-gpt4o-audio branch October 22, 2024 21:32
joshagilend pushed a commit to joshagilend/openai-cookbook that referenced this pull request Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants