# NLP Project - Gradio

This document provides an overview of the folder structure for the NLP Project stored on Google Drive. It includes descriptions of the directories and the files they contain.

## Folder Overview

The main project folder is located at:
`MyDrive/NLP/Project/`

This folder contains all the necessary components for the project, including example files and model data.

### Subfolders and Files:


#### Examples Folder
* Located under:
`MyDrive/NLP/Project/examples/`

* This folder contains various example files used in the project:

  - **male.wav**: An audio file with a male voice sample.
  - **female.wav**: An audio file with a female voice sample.
  - **hf-logo.png**: The logo of Hugging Face, used in the project's UI.
  - **app_ui.png**: A screenshot or a design mockup of the application's user interface.
  - **ai-chat-logo.png**: The logo used for the AI chat component of the project.


You can download the entire project folder using: [Download](https://drive.google.com/drive/folders/1yNIIoMLkyeumj5J6PqWqLA1fMkOMn_dt?usp=sharing)

#### Model Folder
* Located under:
`MyDrive/NLP/Project/llama-8-finetuned-onlyEnglish/`

* This folder contains the `llama-8` model fine-tuned for English language tasks. Specific details about the model's configuration and training data are also included within this folder.

#### Additional Information

For more details about the project, including setup instructions or usage examples, please refer to the specific documentation files or contact the project maintainer directly.




In [1]:
!pip install -q -U trl transformers accelerate
!pip install -q datasets bitsandbytes
!pip install -q huggingface_hub
!pip install -q torch
!pip install -q soundfile
!pip install -q librosa
!pip install -q pydub
!pip install -q TTS
!pip install -q gradio
!pip install -q -U git+https://github.com/huggingface/peft.git

from huggingface_hub import login, logout
from transformers import BitsAndBytesConfig
from huggingface_hub import notebook_login
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import gradio as gr
import random
import time
import re
from peft import PeftConfig, PeftModel


[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m245.2/245.2 kB[0m [31m2.5 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m9.1/9.1 MB[0m [31m20.1 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m302.6/302.6 kB[0m [31m17.5 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m542.0/542.0 kB[0m [31m6.2 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m102.4/102.4 kB[0m [31m7.6 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m401.3/401.3 kB[0m [31m14.2 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m116.3/116.3 kB[0m [31m17.7 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m194.1/194.1 kB[0m [31m23.8 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━

In [2]:
from huggingface_hub import login, logout
from transformers import BitsAndBytesConfig
from huggingface_hub import notebook_login
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import gradio as gr
import random
import time

In [3]:
login("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")

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [4]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [5]:
import os

path = 'NLP/Project/'

os.chdir(f'//content/drive/MyDrive/{path}')
os.getcwd()

'/content/drive/MyDrive/NLP/Project'

## Whisper model

In [44]:
# Import required libraries
from transformers import pipeline
import librosa

whisper_model = pipeline("automatic-speech-recognition", model="openai/whisper-small")

# def transcribe_audio_from_path(file_path):
#     audio_input, _ = librosa.load(file_path, sr=16000)
#     transcription = whisper_model(audio_input)
#     return transcription["text"]

# use Audio input from gradio
def transcribe_audio_from_bytes(audio_input):
    try:
      transcription = whisper_model(audio_input)
      return transcription["text"]
    except Exception as e:
      return "Audio is a noise"

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


#### Example

In [68]:
audio_file = "./examples/male.wav"
transcription = transcribe_audio_from_bytes(audio_file)
print("Transcription:", transcription)

Transcription:  It is a pretty little spot there, a green grass plateau running along by the water's edge and overhung by willows.


## Xtts model

In [39]:
from TTS.api import TTS
from pydub import AudioSegment
from pydub.playback import play
import librosa


def text_to_audio(text, output_file="output.wav"):
    tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC", progress_bar=False, gpu=True)
    tts.tts_to_file(text=text, file_path=output_file)

    audio = AudioSegment.from_wav(output_file)
    audio = librosa.load(output_file, sr=16000)[0]
    return 16000, audio


def play_audio(file_path):
    audio = AudioSegment.from_wav(file_path)
    play(audio)

### Example

In [9]:
text = "I am 18 years old"
output_file = "output.wav"

# Converti il testo in audio
text_to_audio(text, output_file)

 > Downloading model to /root/.local/share/tts/tts_models--en--ljspeech--tacotron2-DDC
 > Model's license - apache 2.0
 > Check https://choosealicense.com/licenses/apache-2.0/ for more info.
 > Downloading model to /root/.local/share/tts/vocoder_models--en--ljspeech--hifigan_v2
 > Model's license - apache 2.0
 > Check https://choosealicense.com/licenses/apache-2.0/ for more info.
 > Using model: Tacotron2
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log
 | > min_level_db:-100
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:20
 | > fft_size:1024
 | > power:1.5
 | > preemphasis:0.0
 | > griffin_lim_iters:60
 | > signal_norm:False
 | > symmetric_norm:True
 | > mel_fmin:0
 | > mel_fmax:8000.0
 | > pitch_fmin:1.0
 | > pitch_fmax:640.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linea

(16000,
 array([-0.00356169, -0.00487513, -0.00605161, ...,  0.        ,
         0.        ,  0.        ], dtype=float32))

In [10]:
audio = AudioSegment.from_wav("output.wav")
play(audio)

In [11]:
from IPython.display import Audio
Audio('output.wav')

## LLAMA3

In [12]:
!ls ./llama-8-finetuned-onlyEnglish/

adapter_config.json  adapter_model.safetensors	README.md


In [13]:
import os

# Directory containing the model
model_dir = './llama-8-finetuned-onlyEnglish/'
print("Directory contents:", os.listdir(model_dir))


Directory contents: ['adapter_config.json', 'adapter_model.safetensors', 'README.md']


In [14]:
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
path_to_model = "./llama-8-finetuned-onlyEnglish/"


## For 4 bit quantization
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# model = AutoModelForCausalLM.from_pretrained(
#     model_id, quantization_config=quantization_config, device_map="auto"
# )

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto"
)

##############++++++++#################################################################
# model = PeftModel.from_pretrained(model, path_to_model)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]

config.json:   0%|          | 0.00/654 [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/23.9k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/4 [00:00<?, ?it/s]

model-00001-of-00004.safetensors:   0%|          | 0.00/4.98G [00:00<?, ?B/s]

model-00002-of-00004.safetensors:   0%|          | 0.00/5.00G [00:00<?, ?B/s]

model-00003-of-00004.safetensors:   0%|          | 0.00/4.92G [00:00<?, ?B/s]

model-00004-of-00004.safetensors:   0%|          | 0.00/1.17G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/187 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/51.0k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/9.09M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/73.0 [00:00<?, ?B/s]

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


### Example

This is just for ouput example:

```python
tokenizer.apply_chat_template(chat, add_generation_prompt=True, tokenize=False)
```

<br>

Use this to get tensors:

```python
tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
```

<br>
Example of chat history with the template applied:

```
 <|begin_of_text|><|start_header_id|>assistant<|end_header_id|>

 Hello, I am a chatbot. How can I help you?<|eot_id|><|start_header_id|>user<|end_header_id|>

 hello<|eot_id|><|start_header_id|>assistant<|end_header_id|>

 Interesting<|eot_id|><|start_header_id|>user<|end_header_id|>

 aa<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```


### LLMA3 chat function

In [34]:
def chat_with_model(messages, tokenizer, terminators, model):

    input_ids = tokenizer.apply_chat_template(

        messages, add_generation_prompt=True, return_tensors="pt"

    ).to(model.device)


    outputs = model.generate(
        input_ids,
        max_new_tokens=128,
        eos_token_id=terminators,
        do_sample=True,
        # temperature=0.6,
        # top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )


    response = outputs[0][input_ids.shape[-1] :]


    decoded_output = tokenizer.decode(response, skip_special_tokens=False)
    try:
      #################################*******************##########################
      # pattern = re.compile(r"<\|im_start\|>assistant\n(.*?)<\|im_end\|>", re.DOTALL)
      pattern = re.compile(r"(.*?)<\|im_end\|>", re.DOTALL)
      matches = pattern.findall(decoded_output)

      return matches[0].strip()


    except Exception as e:
      return decoded_output

## General util function

In [16]:
import wave
import io


def wave_header_chunk(frame_input=b"", channels=1, sample_width=2, sample_rate=24000):
    # This will create a wave header then append the frame input
    # It should be first on a streaming wav file
    # Other frames better should not have it (else you will hear some artifacts each chunk start)
    wav_buf = io.BytesIO()
    with wave.open(wav_buf, "wb") as vfout:
        vfout.setnchannels(channels)
        vfout.setsampwidth(sample_width)
        vfout.setframerate(sample_rate)
        vfout.writeframes(frame_input)

    wav_buf.seek(0)
    return wav_buf.read()

Input data:
``` python
chat_history = [
    [None, "Hello, I am a chatbot. How can I help you?"],
    ["hello", None],
    [None, "Interesting"],
    ("aa", None),
]
```

Output data:
``` python
 [{'role': 'assistant',
  'content': 'Hello, I am a chatbot. How can I help you?'},
 {'role': 'user', 'content': 'hello'},
 {'role': 'assistant', 'content': 'Interesting'},
 {'role': 'user', 'content': 'aa'}]
```

In [17]:
def prepare_chat_history(chat_history):
    chat = []

    chat.append({"role": "system",
             "content": "You are an assistant for answering questions. You are given the extracted parts of a long document and a question. Provide a conversational answer. If you don't know the answer, just say I do not know. Don't make up an answer."
             })

    for entry in chat_history:
        if entry[0] is None:
            chat.append({"role": "assistant", "content": entry[1]})
        else:
            chat.append({"role": "user", "content": entry[0]})

    return chat

In [18]:
def answer_the_question(chatbot_history, model, tokenizer, terminators):

    templated_chat = prepare_chat_history(chatbot_history)
    answer = chat_with_model(templated_chat, tokenizer, terminators, model)

    # answer = "Hello i am bot"

    chatbot_history.append((None, answer))


    return chatbot_history, answer

gr.ChatBot history works like this:

(None, message) -  this is for bot

(message, None) - this is for user

## Gradio

In [84]:
with gr.Blocks(theme=gr.themes.Soft()) as demo:

    # Define chatbot component
    chatbot = gr.Chatbot(
        value=[(None, "Hello, I am a chatbot. How can I help you?")],
        elem_id="chatbot",
        avatar_images=("examples/hf-logo.png", "examples/ai-chat-logo.png"),
        bubble_full_width=False
    )

    # Define generated audio playback component
    with gr.Row():
        sentence = gr.Textbox(visible=False)
        audio_playback = gr.Audio(
            value=None,
            label="Generated audio response",
            streaming=True,
            autoplay=True,
            interactive=False,
            show_label=True,
        )

    # Define text and audio record input components
    with gr.Row():
        txt_box = gr.Textbox(
            scale=2,
            show_label=True,
            placeholder="Enter text and press enter, or speak to your microphone",
            interactive=True,
            container=True,
            label="Text input for model",
        )

        with gr.Column():
            audio_record = gr.Audio(
                label="Upload Audio",
                type="filepath",
            )

    # Define chatbot voice component
    VOICES = ["female", "male"]
    with gr.Row():
        chatbot_voice = gr.Dropdown(
            label="Voice of the Chatbot",
            info="How should Chatbot talk like",
            choices=VOICES,
            multiselect=False,
            value=VOICES[0]
        )

    def add_text(chatbot_history, text):
        chatbot_history = [] if chatbot_history is None else chatbot_history
        chatbot_history = chatbot_history + [(text, None)]
        # block interactive to prevent user from typing while processing
        return chatbot_history, gr.update(value="", interactive=False)

    def add_audio(chatbot_history, audio):
        chatbot_history = [] if chatbot_history is None else chatbot_history

        # get result from whisper and strip it to delete begin and end space
        response = transcribe_audio_from_bytes(audio)
        text = response.strip()

        # text = "Hello I am bot"
        chatbot_history = chatbot_history + [(text, None)]
        return chatbot_history, gr.update(value="", interactive=False)

    def generate_answer(chatbot_history, chatbot_voice, initial_greeting=False):

        # Start by yielding an initial empty audio to set up autoplay
        # yield ("", chatbot_history, wave_header_chunk())

        chatbot_history, answer = answer_the_question(
            chatbot_history, model, tokenizer, terminators
        )

        print(chatbot_history)

        (audio_aswer) = text_to_audio(answer)

        return chatbot_history, answer, audio_aswer


    # Text message
    txt_msg = txt_box.submit(
        fn=add_text,
        inputs=[chatbot, txt_box],
        outputs=[chatbot, txt_box]
    ).then(
        fn=generate_answer,
        inputs=[chatbot, chatbot_voice],
        outputs=[chatbot, sentence, audio_playback],
    )

    txt_msg.then(
        fn=lambda: gr.update(interactive=True),
        inputs=None,
        outputs=[txt_box],
        queue=False,
    )

    # Audio message
    audio_msg = audio_record.play(
        fn=add_audio,
        inputs=[chatbot, audio_record],
        outputs=[chatbot, txt_box],
        queue=False,
    ).then(
        fn=generate_answer,
        inputs=[chatbot, chatbot_voice],
        outputs=[chatbot, sentence, audio_playback],
    )

    audio_msg.then(
        fn=lambda: (
            gr.update(interactive=True),
            gr.update(interactive=True, value=None),
        ),
        inputs=None,
        outputs=[txt_box, audio_record],
        queue=False,
    )

demo.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://8d5601f307a5a78f6c.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




## Testing

Input data:
``` python
chat_history = [
    [None, "Hello, I am a chatbot. How can I help you?"],
    ["hello", None],
    [None, "Interesting"],
    ("aa", None),
]
```

Output data:
``` python
 [{'role': 'assistant',
  'content': 'Hello, I am a chatbot. How can I help you?'},
 {'role': 'user', 'content': 'hello'},
 {'role': 'assistant', 'content': 'Interesting'},
 {'role': 'user', 'content': 'aa'}]
```

In [21]:
# [[None, 'Hello, I am a chatbot. How can I help you?'], ['hello', None], [None, 'interseting'], ('aa', None)]
chat_history = [
    [None, "Hello, I am a chatbot. How can I help you?"],
    ["hello", None],
    [None, "Interesting"],
    ("aa", None),
]

chat = []

chat.append({"role": "system",
             "content": "You are an assistant for answering questions. You are given the extracted parts of a long document and a question. Provide a conversational answer. If you don't know the answer, just say I do not know. Don't make up an answer."
             })

for entry in chat_history:
    if entry[0] is None:
        chat.append({"role": "assistant", "content": entry[1]})
    else:
        chat.append({"role": "user", "content": entry[0]})

print(chat)

[{'role': 'system', 'content': "You are an assistant for answering questions. You are given the extracted parts of a long document and a question. Provide a conversational answer. If you don't know the answer, just say I do not know. Don't make up an answer."}, {'role': 'assistant', 'content': 'Hello, I am a chatbot. How can I help you?'}, {'role': 'user', 'content': 'hello'}, {'role': 'assistant', 'content': 'Interesting'}, {'role': 'user', 'content': 'aa'}]


* This is just for ouput example

```python
tokenizer.apply_chat_template(chat, add_generation_prompt=True, tokenize=False)
```

* Use this to get tensors

```python
tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
```

* Example of chat history with the template applied:

```
 <|begin_of_text|><|start_header_id|>assistant<|end_header_id|>

 Hello, I am a chatbot. How can I help you?<|eot_id|><|start_header_id|>user<|end_header_id|>

 hello<|eot_id|><|start_header_id|>assistant<|end_header_id|>

 Interesting<|eot_id|><|start_header_id|>user<|end_header_id|>

 aa<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```


In [24]:
chat_history = [
    [None, "Hello, I am a chatbot. How can I help you?"],
    ["hello", None],
    [None, "Interesting"],
    ("My name is Filip", None),
]

In [25]:
templated_chat = prepare_chat_history(chat_history)
templated_chat

[{'role': 'system',
  'content': "You are an assistant for answering questions. You are given the extracted parts of a long document and a question. Provide a conversational answer. If you don't know the answer, just say I do not know. Don't make up an answer."},
 {'role': 'assistant',
  'content': 'Hello, I am a chatbot. How can I help you?'},
 {'role': 'user', 'content': 'hello'},
 {'role': 'assistant', 'content': 'Interesting'},
 {'role': 'user', 'content': 'My name is Filip'}]

In [26]:
input_ids = tokenizer.apply_chat_template(

        templated_chat, add_generation_prompt=True, tokenize=False

)

In [27]:
input_ids

"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are an assistant for answering questions. You are given the extracted parts of a long document and a question. Provide a conversational answer. If you don't know the answer, just say I do not know. Don't make up an answer.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nHello, I am a chatbot. How can I help you?<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nhello<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nInteresting<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nMy name is Filip<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

In [28]:
input_ids = tokenizer.apply_chat_template(

        templated_chat, add_generation_prompt=True, return_tensors="pt"

    ).to(model.device)

In [29]:
input_ids

tensor([[128000, 128006,   9125, 128007,    271,   2675,    527,    459,  18328,
            369,  36864,   4860,     13,   1472,    527,   2728,    279,  28532,
           5596,    315,    264,   1317,   2246,    323,    264,   3488,     13,
          40665,    264,   7669,   1697,   4320,     13,   1442,    499,   1541,
            956,   1440,    279,   4320,     11,   1120,   2019,    358,    656,
            539,   1440,     13,   4418,    956,   1304,    709,    459,   4320,
             13, 128009, 128006,  78191, 128007,    271,   9906,     11,    358,
           1097,    264,   6369,   6465,     13,   2650,    649,    358,   1520,
            499,     30, 128009, 128006,    882, 128007,    271,  15339, 128009,
         128006,  78191, 128007,    271,  85415, 128009, 128006,    882, 128007,
            271,   5159,    836,    374,  42378, 128009, 128006,  78191, 128007,
            271]], device='cuda:0')

In [30]:
outputs = model.generate(
        input_ids,
        max_new_tokens=128,
        eos_token_id=terminators,
        do_sample=True,
        # temperature=0.6,
        # top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
)

response = outputs[0][input_ids.shape[-1] :]

decoded_output = tokenizer.decode(response, skip_special_tokens=True)

In [31]:
decoded_output = tokenizer.decode(response, skip_special_tokens=True)
decoded_output

"Nice to meet you, Filip! What's on your mind? Do you have a question about something related to the extracted parts of a document? I'm here to help!"

In [None]:
import re
pattern = re.compile(r"<\|start_header_id\|>assistant<\|end_header_id\|>(.*?)<\|eot_id\|>", re.DOTALL)
matches = pattern.findall(decoded_output)
matches[-1].strip()

In [None]:
import re
pattern = re.compile(r"<\|im_start\|>assistant\n(.*?)<\|im_end\|>", re.DOTALL)
matches = pattern.findall(decoded_output)
matches[0].strip()

In [None]:
import re
pattern = re.compile(r"(.*?)<\|im_end\|>", re.DOTALL)
matches = pattern.findall(decoded_output)
matches[0].strip()

In [35]:
answer = chat_with_model(templated_chat, tokenizer, terminators, model)


In [36]:
answer

"Nice to meet you, Filip! What's on your mind? Do you have a question about something? I'm here to help!<|eot_id|>"

In [40]:
sr, audio_aswer = text_to_audio(answer)



 > tts_models/en/ljspeech/tacotron2-DDC is already downloaded.
 > vocoder_models/en/ljspeech/hifigan_v2 is already downloaded.
 > Using model: Tacotron2
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log
 | > min_level_db:-100
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:20
 | > fft_size:1024
 | > power:1.5
 | > preemphasis:0.0
 | > griffin_lim_iters:60
 | > signal_norm:False
 | > symmetric_norm:True
 | > mel_fmin:0
 | > mel_fmax:8000.0
 | > pitch_fmin:1.0
 | > pitch_fmax:640.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:None
 | > base:2.718281828459045
 | > hop_length:256
 | > win_length:1024
 > Model's reduction rate `r` is set to: 1
 > Vocoder Model: hifigan
 > Setting up Audio P

In [41]:
Audio(audio_aswer, rate=sr)

In [57]:
transcribe_audio_from_bytes("DSd")

'Audio is a noise'

In [60]:
!ls

app.ipynb  Fine-Tuning.ipynb  llama-8-finetuned-onlyEnglish  metricsv4.ipynb  tesk1.ipynb
examples   gradio.ipynb       llm.ipynb			     output.wav


In [65]:
path = "\examples"