# Extract audio and post-process a Whisper transcription
## File uploads are currently limited to 25 MB and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm.
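
The limits in the heading above can be checked locally before an upload is attempted. The following is a minimal sketch: the helper name `check_upload` and both constants are illustrative and not part of the OpenAI SDK, and reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption.

```python
import os

# Limits quoted in the heading above. The names are illustrative, not SDK constants,
# and treating 25 MB as 25 * 1024 * 1024 bytes is an assumption.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def check_upload(path):
    """Return a list of problems that would block an upload (empty list means OK)."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if not os.path.exists(path):
        problems.append("file not found")
    elif os.path.getsize(path) > MAX_UPLOAD_BYTES:
        size_mb = os.path.getsize(path) / (1024 * 1024)
        problems.append(f"file is {size_mb:.1f} MB, above the 25 MB limit")
    return problems
```

Calling `check_upload(fileMp3)` once the MP3 has been written would flag an oversized or mistyped file before any API call is made.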

In [None]:
%pip install openai --upgrade
#%pip install python-dotenv  # for local testing
%pip install pydub
%pip install moviepy

In [14]:
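import os  # only needed for the local .env setup
import openai
from openai import OpenAI
#from dotenv import load_dotenv
from pydub import AudioSegment  # in case you need to shrink or cut audio files
from moviepy.editor import AudioFileClip  # MoviePy 1.x path; on MoviePy 2.x use `from moviepy import AudioFileClip`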

In [None]:
# For local use:
# load the key and organization from a .env file
# load_dotenv("secrets.env")
#openai.api_key = os.getenv("OPENAI_API_KEY")
#openai.organization = os.getenv("OPENAI_ORGANIZATION")

from google.colab import userdata
openai.api_key = userdata.get('OPENAI_API_KEY')
openai.organization = userdata.get('OPENAI_ORGANIZATION')
openai.whisper_model = "whisper-1"
# Verify that the credentials loaded correctly
print("OpenAI organization : " + openai.organization)


In [23]:
# Create a client instance for the OpenAI API
client = OpenAI(
    api_key=openai.api_key,
    organization=openai.organization
)


In [None]:
# Extract the audio track from an MP4 video
# For local use: inputFile = "../vids/interview.mp4", fileMp3 = "../vids/output.mp3"

inputFile = "/content/interview.mp4"
fileMp3 = "/content/output.mp3"

In [24]:

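def extract_audio_from_mp4(video_path, output_path):
    """Write the audio track of `video_path` to `output_path` at 64 kbps."""
    clip = AudioFileClip(video_path)
    try:
        clip.write_audiofile(output_path, bitrate="64k")
    finally:
        clip.close()  # release the underlying file reader

extract_audio_from_mp4(inputFile, fileMp3)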

MoviePy - Writing audio in ../vids/output.mp3
MoviePy - Done.
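
At a constant 64 kbps, as used above, an MP3 grows by about 8 KB per second, so a 25 MB upload covers roughly 54 minutes of audio. Longer recordings have to be split first, which is what the `pydub` import is for. The sketch below is one way to do it. The function names and the 90 % safety margin are assumptions made for illustration, and cutting at fixed offsets can split a word between two chunks.

```python
def max_chunk_ms(bitrate_kbps=64, limit_bytes=25 * 1024 * 1024, safety_percent=90):
    """Longest chunk (in ms) expected to stay under the upload limit at a constant bitrate."""
    usable_bytes = limit_bytes * safety_percent // 100
    # bytes * 8 = bits; bits / (kbps * 1000) = seconds; seconds * 1000 = ms
    return usable_bytes * 8 // bitrate_kbps


def chunk_ranges_ms(total_ms, chunk_ms):
    """Return (start, end) pairs in milliseconds that cover the whole recording."""
    return [(start, min(start + chunk_ms, total_ms)) for start in range(0, total_ms, chunk_ms)]


def split_audio(path, out_prefix, bitrate_kbps=64):
    """Split an audio file into MP3 chunks small enough to upload and return their paths."""
    from pydub import AudioSegment  # imported here so the helpers above work without pydub

    audio = AudioSegment.from_file(path)
    out_paths = []
    ranges = chunk_ranges_ms(len(audio), max_chunk_ms(bitrate_kbps))  # len() is in ms
    for index, (start, end) in enumerate(ranges):
        out_path = f"{out_prefix}_{index:03d}.mp3"
        audio[start:end].export(out_path, format="mp3", bitrate=f"{bitrate_kbps}k")
        out_paths.append(out_path)
    return out_paths
```

Each returned path can then be sent to the transcription or translation endpoint in turn and the resulting texts concatenated.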




## To transcribe the audio file, use this cell. If the speaker has a strong accent or the background is very noisy, the result can be messy; in that case, try the translation cell

In [25]:
translation = None
# Open the file in a context manager so it is closed after the upload
with open(fileMp3, "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model=openai.whisper_model,
        file=audio_file,
        response_format="text"
    )

print(transcript)

Καλησπέρα όλοι και καλώς ήρθατε στο νέο ΜΒΑ Ελεκτρικό Επιτροπή Προϊόντας Gen.AI Προϊόντας και επιχειρήσεις. Είμαι ο Θεός Ευγενίου και ελπίζομαι να σας δω την επόμενη εβδομάδα σε αυτό το δρόμο. Αυτό είναι μόνο ένα πρόγραμμα βίντεο που θα κάνουμε όλοι. Πρέπει να είναι λιγότερο από ένα λεπτό και να περάσω μερικές ερωτήσεις που σας έδωσα μέσω e-mail. Θα σας δώσω ένα παράδειγμα, θα περάσω τις ερωτήσεις μόνος μου. Το όνομά μου είναι Θεός Ευγενίου, τα κυριαρχικά δίκαια μου είναι τεχνογραφία και τεχνογραφία, και είμαι στο ΙΣΙΑΔ από το 2001. Δεύτερη ερώτηση. Ποια είναι οι δύο σημαντικές ρόλες, εργασίες, εργασίες που θέλω να δουλέψω στο επόμενο ή αν έχω ήδη εργασία, ποιο ρόλο θα συμμετέχω στο επόμενο. Σε αυτό το περίπτωσέ μου θα παραμείνω προφητείς στο ΙΣΙΑΔ στο επόμενο. Ποια είναι οι δύο σημαντικές ρόλες που θέλετε να αποκάλυψετε από το δρόμο σας. Ως εγώ, η κυριαρχική σημασία θα είναι η αν συλλογικά, με την συναλλαγή, θα έχουμε μια καλύτερη, πιο πρακτική, πιο προσεκτική και πιο σωστή οπτική του
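
The output above came back in Greek, which suggests the spoken language was auto-detected incorrectly (this is an inference from the output, not something the notebook verifies). The transcription endpoint accepts an optional `language` argument in ISO-639-1 form, so one thing to try is passing the language explicitly. A minimal sketch, where the helper name `transcribe_with_language` and the default `"en"` are assumptions:

```python
def transcribe_with_language(client, path, language="en", model="whisper-1"):
    """Transcribe `path`, passing an explicit ISO-639-1 language hint."""
    # `client` is expected to be an OpenAI client like the one created earlier.
    with open(path, "rb") as audio_file:
        return client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            language=language,  # skips automatic language detection
            response_format="text",
        )
```

For example, `print(transcribe_with_language(client, fileMp3))` would request an English transcription of the same file.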

## To get an English translation of the audio, use this cell
For the languages that can be translated, see the [supported languages](https://platform.openai.com/docs/guides/speech-to-text/supported-languages) section of the OpenAI documentation.

In [33]:
transcript = None
# Open the file in a context manager so it is closed after the upload
with open(fileMp3, "rb") as audio_file:
    translation = client.audio.translations.create(
        model=openai.whisper_model,
        file=audio_file,
        response_format="text"
    )

print(translation)

Hello everybody and welcome to the new MBA elective, Building Gen AI Products and Businesses. I'm Theos Evgeniou and I'm looking forward to seeing you next week in this course. This is just an intro video recording that all of us will do. It's supposed to be less than one minute and go through a few questions I sent you by email. Let me give you an example, I'm going to go through the questions myself. First question is name, main degree, key role before INSEAD. My name is Theos Evgeniou, my main degree is in engineering and computer science and I've been at INSEAD since 2001. Second question, what are the two potential roles, jobs, industries I wish to work on next or if I already have a job, what role do I join next? In my case, I will remain a professor at INSEAD next. What are the top two outcomes you wish to get out of the course? Well, hopefully you have more than two. From my side, the main objective success will look like if we collectively by exchanging with each other, we all

## Process the transcription / translation with a chat model and instructions

In [34]:
model = "gpt-4"  # other options: gpt-4-32k / gpt-3.5-turbo / gpt-3.5-turbo-16k

# Use the transcript if it exists, otherwise fall back to the translation
if transcript is not None:
    prompt = transcript
elif translation is not None:
    prompt = translation
else:
    prompt = ""  # Assign an empty string if both transcript and translation are None

print("Prompt from transcribe / translation : " + prompt)

Prompt from transcribe / translation : Hello everybody and welcome to the new MBA elective, Building Gen AI Products and Businesses. I'm Theos Evgeniou and I'm looking forward to seeing you next week in this course. This is just an intro video recording that all of us will do. It's supposed to be less than one minute and go through a few questions I sent you by email. Let me give you an example, I'm going to go through the questions myself. First question is name, main degree, key role before INSEAD. My name is Theos Evgeniou, my main degree is in engineering and computer science and I've been at INSEAD since 2001. Second question, what are the two potential roles, jobs, industries I wish to work on next or if I already have a job, what role do I join next? In my case, I will remain a professor at INSEAD next. What are the top two outcomes you wish to get out of the course? Well, hopefully you have more than two. From my side, the main objective success will look like if we collectivel

In [39]:
# Change this if you want a different point of view
promptSystem = """You are a counselor who has received this transcript from a student who is applying for an MBA.
 Describe precisely what is going on,
 extract the name of the person, write a short summary, and highlight the different pieces of information given by the person.
 You can also give advice on what you think about the presentation
 in the context of the MBA."""



In [40]:

# Use the `client` instance created earlier (there is no `ClientOpenAi` object)
response = client.chat.completions.create(
    model=model,
    n=1,
    temperature=0,  # range 0 to 2; 0 gives the most deterministic output, higher values are more creative
    messages=[
        {"role": "system", "content": promptSystem},
        {"role": "user", "content": prompt},
    ],
)

print(response.choices[0].message.content)

The speaker in the transcript is Theos Evgeniou, a professor at INSEAD. He is introducing a new MBA elective course, Building Gen AI Products and Businesses. He is using this video to answer a set of questions he had previously sent to his students via email.

Theos has a background in engineering and computer science and has been with INSEAD since 2001. He plans to continue his role as a professor at INSEAD. His main objective for the course is to provide a practical, up-to-date, and sober view of Gen AI and its implications for business and society. He hopes that some students will even go on to build products and businesses based on what they learn in the course, potentially with funding from partner venture capitalists.

Theos also mentions his coding experience, noting that while he has coded for a significant portion of his life, he is not particularly skilled at Python. He encourages students to share interesting articles or ideas about Gen AI, noting that he has shared several 
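
The post-processing step can also be wrapped in a small function that refuses to call the chat model when neither the transcription nor the translation cell has been run, instead of sending an empty prompt. This is a sketch only, and the name `summarize_transcript` is hypothetical:

```python
def summarize_transcript(client, text, system_prompt, model="gpt-4"):
    """Send `text` to a chat model with `system_prompt`; refuse empty input."""
    if not text or not text.strip():
        raise ValueError("nothing to summarize: run the transcription or translation cell first")
    # `client` is expected to be an OpenAI client like the one created earlier.
    response = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content
```

With the variables defined in this notebook, the call would be `summarize_transcript(client, prompt, promptSystem, model)`.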