
# Project 6: Automatic Voice Message Translation System

In this project, we build an automatic voice message translation system. It uses OpenAI's Whisper API for speech-to-text and the ChatCompletion API for text translation, giving an end-to-end pipeline that turns a voice message into text in a chosen language.

## What You Will Learn

- **Whisper API for Speech Recognition**: Master the use of OpenAI's Whisper API to convert speech from voice messages into text.
- **ChatCompletion API for Translation**: Learn how to implement the ChatCompletion API to translate the transcribed text into the desired language.
- **Audio File Handling**: Develop skills for processing audio files in various formats within the Google Colab environment.

## Preparation Checklist

Before we start, make sure you have the following:

- An active Google Colab account.
- Basic to intermediate knowledge of Python programming.
- Familiarity with handling APIs and audio data.
- Access to OpenAI API keys with permissions to use Whisper and ChatCompletion APIs ([OpenAI](https://platform.openai.com/account/api-keys)).

## Time to Break Down Language Barriers

By the end of this project, you will be able to turn a voice message in a language Whisper supports into English text, and then translate that text into another language of your choice.



# 2. Importing libraries

In [None]:
!pip install openai

Collecting openai
  Downloading openai-1.0.1-py3-none-any.whl (153 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.25.1-py3-none-any.whl (75 kB)
Collecting httpcore (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.1-py3-none-any.whl (76 kB)
Collecting h11<0.15,>=0.13 (from httpcore->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: h11, httpcore, httpx, openai
ERROR: pip's dependency resolver does not currently

In [None]:
import os
import openai

from openai import OpenAI

# 3. Sending a first request to OpenAI API


### 3.1 Setting up API Key

In [None]:
os.environ["OPENAI_API_KEY"] = "sk-XXXXXXXXXXXXX"
client = OpenAI()
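
Writing the key directly into a cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming you would rather read the key from the environment and fall back to a hidden prompt, a small helper could look like this. `resolve_api_key` is a hypothetical name for illustration, not part of the `openai` package.

```python
import os
from getpass import getpass


def resolve_api_key(env=None, prompt=getpass):
    """Return the OpenAI API key from the environment, prompting only if it is missing.

    Hypothetical helper for illustration; not part of the openai package.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # getpass hides the typed value, so the key never appears in cell output.
        key = prompt("OpenAI API key: ").strip()
    if not key:
        raise ValueError("No OpenAI API key provided")
    return key
```

With this in place, `os.environ["OPENAI_API_KEY"] = resolve_api_key()` followed by `client = OpenAI()` would set up the client without storing the key in the notebook.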

# 4. Processing Audio files with Whisper

In [None]:
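# Open the audio in binary mode; the context manager closes the file after the upload.
with open("audio_file_whisper.mp3", "rb") as audio_file:
    # response_format="vtt" returns WebVTT text with timestamps.
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        response_format="vtt",
    )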
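
Before uploading, it can help to check the file locally so that a bad path or an oversized recording fails fast. The sketch below is an assumption-laden example: the suffix list and the 25 MB limit are taken from the OpenAI speech-to-text guide as of this writing and should be verified against the current documentation, and `check_audio_file` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed limits from the OpenAI speech-to-text guide; verify before relying on them.
SUPPORTED_SUFFIXES = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
MAX_BYTES = 25 * 1024 * 1024


def check_audio_file(path):
    """Raise ValueError if the file is missing, has an unexpected suffix, or is too large."""
    p = Path(path)
    if not p.is_file():
        raise ValueError(f"Audio file not found: {p}")
    if p.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported audio format: {p.suffix or '(none)'}")
    size = p.stat().st_size
    if size > MAX_BYTES:
        raise ValueError(f"File is {size} bytes; the assumed limit is {MAX_BYTES}")
    return p
```

Calling `check_audio_file("audio_file_whisper.mp3")` before opening the file would surface these problems without spending an API request.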

In [None]:
print(transcript)

WEBVTT

00:00:00.000 --> 00:00:07.280
I had this scenario where I was on an elevator with a senior EVP that I admire in a Fortune 500

00:00:07.280 --> 00:00:12.400
company I work at. I feel like I present myself well during the quick interaction. My question

00:00:12.400 --> 00:00:17.600
for you is, in a hypothetical situation, if you're on the elevator with a senior executive

00:00:17.600 --> 00:00:22.240
and you're one-on-one with him, how do you present yourself to them? What is your pitch? Yeah,

00:00:22.800 --> 00:00:27.440
this is a great question. So I remember when I used to work at Goldman Sachs years ago,

00:00:27.440 --> 00:00:31.920
I get nervous when I was in an elevator and there's a senior executive there. I remember

00:00:31.920 --> 00:00:37.600
one day there was this guy named Bob Steele, who was executive vice chairman of the firm at the

00:00:37.600 --> 00:00:40.640
time. And I was in the elevator with him and there's somebody else there too and I didn't

00:0
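
The timestamps in the WebVTT output are useful if you want to work with individual segments, for example to build subtitles. The following is a minimal sketch, assuming `transcript` is the plain WebVTT string printed above; `parse_vtt` is a hypothetical helper that handles only simple cues like these, not the full WebVTT format.

```python
import re

# Matches cue timing lines such as "00:00:07.280 --> 00:00:12.400".
CUE_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to seconds as a float."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_vtt(vtt_text):
    """Return a list of (start_seconds, end_seconds, text) tuples from simple WebVTT text."""
    cues = []
    # Cues are separated by blank lines; blocks without a timing line are skipped.
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.match(line.strip())
            if match:
                fields = match.groups()
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((_seconds(*fields[:4]), _seconds(*fields[4:]), text))
                break
    return cues
```

`parse_vtt(transcript)` would then give one tuple per cue, which can be filtered by time range or joined back into plain text.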

## Audio translation to English

In [None]:
# The translations endpoint returns English text for the audio.
with open("audio_file_whisper.mp3", "rb") as audio_file:
    transcript_translated = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
    )

In [None]:
transcript_translated.text

"I had this scenario where I was on an elevator with a senior EVP that I admire in a Fortune 500 company I work at. I feel like I present myself well during the quick interaction. My question for you is, in a hypothetical situation, if you're on the elevator with a senior executive and you're one-on-one with him, how do you present yourself to them? What is your pitch? Yeah, this is a great question. So I remember when I used to work at Goldman Sachs years ago, I get nervous when I was in an elevator and there's a senior executive there. I remember one day there was this guy named Bob Steele, who was executive vice chairman of the firm at the time. And I was in the elevator with him and there's somebody else there too, and I didn't know what to say. And so I said, beautiful day, isn't it? Right? And he kind of looked at me like, and they lose respect for you when you're, when you don't act yourself. Okay. So just kind of, I don't know, be yourself. You can just say something like, hey,

## Translating to any language using ChatGPT and Whisper



In [None]:
target_language = "serbian"
messages = [{"role": "system", "content": """I want you to act as an algorithm for translation to language {}. System will provide you with a text, and your only task is to translate it to {}. Never break character.""".format(target_language, target_language)}]
messages.append({"role": "user", "content": transcript_translated.text})

# NOTE: This model might be changed or deprecated in the future; use the most up-to-date one :)
chat_response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=0.9,
    max_tokens=2000,
)

In [None]:
print("Assistant:", chat_response.choices[0].message.content)

Assistant: Imao sam ovu situaciju gde sam se našao u liftu sa starijim izvršnim potpredsednikom koga veoma cenim u kompaniji Fortune 500 u kojoj radim. Osećam da sam se dobro predstavio tokom kratke interakcije. Moje pitanje za vas je, u hipotetičkoj situaciji, ako ste sami sa izvršnim direktorom u liftu, kako biste se predstavili? Šta biste rekli? Da, ovo je odlično pitanje. Sećam se kada sam nekada radio u Goldman Sachs pre mnogo godina, bilo me je nervozno kada bih bio u liftu sa nekim starijim izvršnim. Sećam se jednog dana kada sam bio u liftu sa čovekom po imenu Bob Stil, koji je tada bio izvršni potpredsednik kompanije. Tu je bio i neko drugi i nisam znao šta da kažem. Tako sam rekao, lep dan, zar ne? I on me je pogledao tako... i gubiš poštovanje kada se ne ponašaš svojstveno. Dakle, samo, ne znam, budi svoj. Možeš jednostavno reći nešto poput, "Hej, kako si? Jeste li videli utakmicu sinoć?" Što god. Nemoj biti napet kao što sam bio ja. Onda ćete se zapitati posle vožnje liftom
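
The two steps above (Whisper to English text, then a chat model to the target language) can be chained into one function. This is a sketch under stated assumptions rather than the notebook's exact method: `translate_voice_message` is a hypothetical name, the system prompt is a shortened variant of the one used above, and `temperature=0` is chosen here to make the translation more repeatable.

```python
def translate_voice_message(client, audio_path, target_language, chat_model="gpt-3.5-turbo"):
    """Translate a voice message: speech to English text, then English to target_language.

    Hypothetical helper; `client` is expected to be an openai.OpenAI instance.
    """
    # Step 1: Whisper's translations endpoint returns English text for the audio.
    with open(audio_path, "rb") as audio_file:
        english = client.audio.translations.create(
            model="whisper-1",
            file=audio_file,
        ).text

    # Step 2: ask the chat model to translate the English text.
    messages = [
        {
            "role": "system",
            "content": (
                f"Translate the user's text to {target_language}. "
                "Reply with the translation only."
            ),
        },
        {"role": "user", "content": english},
    ]
    response = client.chat.completions.create(
        model=chat_model,
        messages=messages,
        temperature=0,  # assumption: low randomness suits translation
    )
    return response.choices[0].message.content
```

For example, `translate_voice_message(client, "audio_file_whisper.mp3", "Serbian")` would run both API calls and return the translated text as a string.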