<a href="https://colab.research.google.com/github/kmk4444/LLM/blob/main/generate_sound.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

- **Requirements.txt**

In [1]:
!touch requirements.txt
!echo python-dotenv >> requirements.txt
!echo openai >> requirements.txt
!echo streamlit >> requirements.txt
!echo assemblyai >> requirements.txt
!echo replicate >> requirements.txt

- **terminal / bash command**

In [2]:
pip install -r requirements.txt

Installing collected packages: websockets, watchdog, smmap, python-dotenv, h11, pydeck, httpcore, gitdb, httpx, gitpython, replicate, openai, assemblyai, streamlit
Successfully installed assemblyai-0.26.0 gitdb-4.0.11 gitpython-3.1.43 h11-0.14.0 httpcore-1.0.5 httpx-0.27.0 openai-1.23.6 pydeck-0.9.0b0 python-dotenv-1.0.1 replicate-0.25.2 smmap-5.0.1 streamlit-1.33.0 watchdog-4.0.0 websockets-12.0
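
To confirm that the packages from `requirements.txt` are present, the installed versions can also be queried from Python. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of the notebook.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Check the five packages listed in requirements.txt.
required = ["python-dotenv", "openai", "streamlit", "assemblyai", "replicate"]
for name, version in installed_versions(required).items():
    print(f"{name}: {version or 'not installed'}")
```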


**First, Generate Audio**
- `create_speech_from_text` uses the OpenAI API key to synthesize speech from text.
- `transcribe_with_whisper` uses the OpenAI API key to transcribe speech to text.
- `translate_with_whisper` uses the OpenAI API key to transcribe Turkish speech and translate it into English.
- `transcribe_with_conformer` uses the AssemblyAI API key to transcribe speech to text.
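
All four functions depend on an API key being available. A small guard that fails early with a clear message can make a missing key easier to diagnose than an authentication error from the API. This is only a sketch: `require_env` is a hypothetical helper, and the variable names `openai_apikey` and `assemblyai_apikey` are taken from the commented-out lines in the cell that follows.

```python
import os


def require_env(name, environ=None):
    """Return the value of the environment variable `name`, or raise if it is unset or empty."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value


# Example with an explicit mapping, so the sketch runs without real keys.
demo_env = {"openai_apikey": "sk-demo"}
print(require_env("openai_apikey", demo_env))
```

Passing the mapping explicitly keeps the helper easy to test; in the app it would be called with the default `os.environ`.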

In [3]:
%%writefile audio_ops.py
from openai import OpenAI
import assemblyai as aai
import streamlit as st
import os
from dotenv import load_dotenv

# Read the API keys from a .env file instead of hard-coding them.
load_dotenv()
my_key_openai = os.getenv("openai_apikey")
my_key_assemblyai = os.getenv("assemblyai_apikey")

client = OpenAI(
    api_key=my_key_openai
)


def create_speech_from_text(prompt, speech_file_name, voice_type="alloy"):
  AI_Response = client.audio.speech.create(
      model="tts-1",
      voice=voice_type,
      response_format="mp3",  # output format: aac, flac, mp3, or opus
      input=prompt
  )

  # Write the generated audio to the local path given by speech_file_name.
  AI_Response.stream_to_file(speech_file_name)
  return "Speech synthesis completed."

def transcribe_with_whisper(audio_file_name):
    # Open the audio in binary mode; the context manager closes it afterwards.
    with open(audio_file_name, "rb") as audio_file:
        AI_generated_transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,  # open file object, not a path
            language="tr"  # ISO-639-1 code of the spoken language (Turkish here)
        )

    return AI_generated_transcript.text


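def translate_with_whisper(audio_file_name):
    # Open the audio in binary mode; the context manager closes it afterwards.
    with open(audio_file_name, "rb") as audio_file:
        # The translations endpoint returns English text for the spoken audio.
        AI_generated_translation = client.audio.translations.create(
            model="whisper-1",
            file=audio_file
        )

    return AI_generated_translation.text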

def transcribe_with_conformer(audio_file_name):

    aai.settings.api_key = my_key_assemblyai
    transcriber = aai.Transcriber()

    AI_generated_text = transcriber.transcribe(audio_file_name)

    return AI_generated_text.text

# Create four tabs for the interface.
tab_TTS, tab_whisper, tab_translation, tab_conformer = st.tabs(
    [
     "Speech Synthesis with TTS",
     "Transcription with Whisper",
     "Translation with Whisper",
     "Transcription with Conformer"
     ]
)



# Tab for the TTS-1 model
with tab_TTS:
  st.subheader("Speech Synthesis with the TTS-1 Model")
  st.divider()

  prompt = st.text_input("Enter the text you want to have spoken", key="prompt_tts")
  voices = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"]  # available voices
  voice_type = st.selectbox(label="Your voice preference:", options=voices, key="voice_tts")  # the user picks a voice
  generate_btn = st.button("Synthesize Speech", key="button_tts")

  if generate_btn:
    status = create_speech_from_text(prompt=prompt, speech_file_name="speech.mp3", voice_type=voice_type)
    st.success(status)

    # Read the saved file back in binary mode ("rb") so Streamlit can play it.
    with open("speech.mp3", "rb") as audio_file:
      audio_bytes = audio_file.read()

    st.audio(data=audio_bytes, format="audio/mp3")
    st.balloons()

# Transcription with the Whisper model
with tab_whisper:
    st.subheader("Transcription with the Whisper Model")
    st.divider()

    selected_file = st.file_uploader("Select an audio file", type=["mp3"], key="file_whisper")

    if selected_file:
        # The upload lives in memory, so save it to disk for the path-based helper.
        audio_bytes = selected_file.getvalue()
        with open(selected_file.name, "wb") as audio_file:
            audio_file.write(audio_bytes)
        st.audio(data=audio_bytes, format="audio/mp3")  # let the user play the upload

    transcribe_btn = st.button("Convert to Text", key="button_whisper")

    # Only transcribe once a file has been uploaded.
    if transcribe_btn and selected_file:

        generated_text = transcribe_with_whisper(audio_file_name=selected_file.name)

        st.divider()
        st.info(f"TRANSCRIPTION: {generated_text}")
        st.balloons()

# Translation with the Whisper model
with tab_translation:

    st.subheader("Translation with the Whisper Model")
    st.divider()

    selected_file = st.file_uploader("Select an audio file", type=["mp3"], key="file_translation")

    if selected_file:
        # The upload lives in memory, so save it to disk for the path-based helper.
        audio_bytes = selected_file.getvalue()
        with open(selected_file.name, "wb") as audio_file:
            audio_file.write(audio_bytes)
        st.audio(data=audio_bytes, format="audio/mp3")

    translate_btn = st.button("Translate", key="button_translation")

    # Only translate once a file has been uploaded.
    if translate_btn and selected_file:

        translated_text = translate_with_whisper(audio_file_name=selected_file.name)

        st.divider()
        st.info(f"TRANSLATION: {translated_text}")
        st.balloons()

# Transcription with AssemblyAI's Conformer model
with tab_conformer:

    st.subheader("Transcription with the Conformer Model")
    st.divider()

    selected_file = st.file_uploader("Select an audio file", type=["mp3"], key="file_conformer")

    if selected_file:
        # The upload lives in memory, so save it to disk for the path-based helper.
        audio_bytes = selected_file.getvalue()
        with open(selected_file.name, "wb") as audio_file:
            audio_file.write(audio_bytes)
        st.audio(data=audio_bytes, format="audio/mp3")

    transcribe_btn = st.button("Convert to Text", key="button_conformer")

    # Only transcribe once a file has been uploaded.
    if transcribe_btn and selected_file:

        generated_text = transcribe_with_conformer(audio_file_name=selected_file.name)

        st.divider()
        st.info(f"TRANSCRIPTION: {generated_text}")
        st.balloons()


Writing audio_ops.py
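
The transcription and translation helpers in `audio_ops.py` take a file name, whereas `st.file_uploader` returns an in-memory file-like object. A temporary file is one way to hand such an upload to a path-based function. The sketch below assumes only that the upload exposes `read()`; `save_upload_to_temp` is a hypothetical helper, and `io.BytesIO` stands in for the uploaded file so the example runs without Streamlit.

```python
import io
import os
import tempfile


def save_upload_to_temp(uploaded, suffix=".mp3"):
    """Write a file-like object's bytes to a temporary file and return its path."""
    data = uploaded.read()
    # delete=False keeps the file on disk after the handle is closed.
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return tmp.name


fake_upload = io.BytesIO(b"not real audio, just demo bytes")  # stand-in for an upload
path = save_upload_to_temp(fake_upload)
print(path)
os.remove(path)  # the caller is responsible for cleanup
```

The returned path could then be passed as `audio_file_name` to any of the helpers and removed once the result is back.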


In [None]:
!npm install localtunnel
!streamlit run /content/audio_ops.py &>/content/logs.txt &
!npx localtunnel --port 8501

[K[?25h[37;40mnpm[0m [0m[30;43mWARN[0m [0m[35msaveError[0m ENOENT: no such file or directory, open '/content/package.json'
[0m[37;40mnpm[0m [0m[30;43mWARN[0m [0m[35menoent[0m ENOENT: no such file or directory, open '/content/package.json'
[0m[37;40mnpm[0m [0m[30;43mWARN[0m[35m[0m content No description
[0m[37;40mnpm[0m [0m[30;43mWARN[0m[35m[0m content No repository field.
[0m[37;40mnpm[0m [0m[30;43mWARN[0m[35m[0m content No README data
[0m[37;40mnpm[0m [0m[30;43mWARN[0m[35m[0m content No license field.
[0m
[K[?25h+ localtunnel@2.0.2
updated 1 package and audited 36 packages in 0.514s

3 packages are looking for funding
  run `npm fund` for details

found 2 moderate severity vulnerabilities
  run `npm audit fix` to fix them, or `npm audit` for details
npx: installed 22 in 2.111s
your url is: https://puny-jars-melt.loca.lt
^C
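
Because `streamlit run` is started in the background, the tunnel may come up before the app is listening on port 8501. A readiness check run between the two commands is one way to avoid opening the URL too early. This is a sketch under that assumption; `wait_for_port` is a hypothetical helper, not something the notebook defines.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# 8501 is the port passed to localtunnel above.
if wait_for_port("127.0.0.1", 8501, timeout=2.0):
    print("Streamlit is accepting connections.")
else:
    print("Nothing is listening on port 8501 yet.")
```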
