## **TeeGuide Models** 






This notebook contains three models that will be deployed in our mobile app to provide guidance, advice, and schedules to our teenage customers.

1. The first model is a speech-to-text model built with the Hugging Face Transformers classes Speech2TextProcessor and Speech2TextForConditionalGeneration.

2. The second model is an NLP model that generates advice and guidance for teenagers. It is a T5 (Text-To-Text Transfer Transformer) model, fine-tuned with TensorFlow.

3. The third model is a text-to-speech model. It takes the output of the second model and converts it to speech using SpeechT5Processor, SpeechT5ForTextToSpeech, and SpeechT5HifiGan. The set_seed function is called so that the generated speech is reproducible.

**All three models have been thoroughly tested and function well in the notebook**.

## **Speech To Text**

In [48]:
import os

import librosa
import torch
from google.colab import drive
from transformers import Speech2TextProcessor, Speech2TextForConditionalGeneration

# Mount Google Drive
drive.mount('/content/drive')

# Change the working directory to the folder that contains the audio file
os.chdir('/content/drive/MyDrive/tee_guide')

# Load the speech-to-text model and processor
model = Speech2TextForConditionalGeneration.from_pretrained("facebook/s2t-small-librispeech-asr")
processor = Speech2TextProcessor.from_pretrained("facebook/s2t-small-librispeech-asr")

# Load your own audio file
audio_file_path = 'Record.mp3'
# Adjust chunk size according to your preference and memory constraints
chunk_size_seconds = 10

# Function to transcribe audio chunks
def transcribe_audio_chunk(audio_chunk):
    inputs = processor(audio_chunk, sampling_rate=16000, return_tensors="pt")
    generated_ids = model.generate(inputs["input_features"], attention_mask=inputs["attention_mask"])
    transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
    return transcription

# Load the audio file
audio_signal, sampling_rate = librosa.load(audio_file_path, sr=16000)

# Chunk size in samples; stepping through the signal by this size
# avoids an empty final chunk when the length is an exact multiple
chunk_size_samples = chunk_size_seconds * sampling_rate

# Transcribe each chunk and join the transcriptions into a single string
transcriptions = []
for start in range(0, len(audio_signal), chunk_size_samples):
    audio_chunk = audio_signal[start:start + chunk_size_samples]
    transcriptions.extend(transcribe_audio_chunk(audio_chunk))
transcription_string = ' '.join(transcriptions)

print("Transcription:", transcription_string)


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Some weights of Speech2TextForConditionalGeneration were not initialized from the model checkpoint at facebook/s2t-small-librispeech-asr and are newly initialized: ['model.encoder.embed_positions.weights', 'model.decoder.embed_positions.weights']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Transcription: i feel like there is never enough time for everything and it's dressed me out i struggle with time wasting activities and then i am stretched about not finishing my work 
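
A note on the chunking step: iterating over a fixed count of `num_chunks + 1` windows yields an empty trailing chunk whenever the signal length is an exact multiple of the chunk size. The following is a minimal sketch of the boundary arithmetic only, with no audio or model involved; `chunk_bounds` is a hypothetical helper name, not part of librosa or Transformers.

```python
def chunk_bounds(num_samples, chunk_size_samples):
    """Return (start, end) sample indices that cover the signal with no empty chunk."""
    if chunk_size_samples <= 0:
        raise ValueError("chunk_size_samples must be positive")
    return [
        (start, min(start + chunk_size_samples, num_samples))
        for start in range(0, num_samples, chunk_size_samples)
    ]


# 25 seconds of audio at 16 kHz, split into 10-second chunks
print(chunk_bounds(25 * 16000, 10 * 16000))
# [(0, 160000), (160000, 320000), (320000, 400000)]

# An exact multiple (20 seconds) produces two chunks, not three
print(chunk_bounds(20 * 16000, 10 * 16000))
# [(0, 160000), (160000, 320000)]
```

Note that cutting at fixed positions can split a word across two chunks, which may hurt transcription quality near the boundaries.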


**NLP Model: This model takes a teenager's description of a problem, such as anxiety or stress and its effect on daily life, as input. It then generates advice and suggests time management methods to help them deal with the problem.**






In [49]:
from google.colab import files

uploaded = files.upload()


Saving fin.csv to fin (2).csv


In [50]:
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
import pandas as pd
import numpy as np
import tensorflow as tf
from transformers import TFAutoModelForSeq2SeqLM, AutoTokenizer

# Load CSV data
data = pd.read_csv('fin.csv',nrows=98)
print(data.dtypes)
df = pd.DataFrame(data, columns=('Input','Output'))
print(df.head())
print(df.info())


# Download NLTK resources (stopwords and punkt tokenizer) if not already downloaded
nltk.download('stopwords')
nltk.download('punkt')





Input     object
Output    object
dtype: object
                                               Input  \
0  I get anxious when I have too many assignments...   
1  I feel like there's never enough time for ever...   
2  I struggle with time-wasting activities, and t...   
3  I often feel guilty when I take breaks. How ca...   
4  I struggle with setting priorities and end up ...   

                                              Output  
0  Breaking down your assignments into smaller ta...  
1  Time management is crucial. Use a planner or d...  
2  Identify and limit time-wasting habits. Set sp...  
3  Self-care is essential for productivity. Sched...  
4  Create a to-do list and categorize tasks by ur...  
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 98 entries, 0 to 97
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Input   98 non-null     object
 1   Output  98 non-null     object
dtypes: object(2)
memory usage: 1.7+ KB

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

## **Text Preprocessing Function**

In [51]:
# Build the English stop-word set once instead of on every call
STOP_WORDS = set(stopwords.words('english'))

# Define a function for text preprocessing
def preprocess_text(text):
    """Lowercase the text, tokenize it, and remove English stop words."""
    # Convert to lowercase and tokenize
    tokens = word_tokenize(text.lower())

    # Remove stop words
    tokens = [token for token in tokens if token not in STOP_WORDS]

    # Join the tokens back into a string
    return ' '.join(tokens)
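
When this kind of preprocessing is applied to the target column as well as the input column, the model is trained on targets that have no stop words, so the advice it generates reads telegraphically. The sketch below reproduces the effect without NLTK. The small stop-word set and the example sentence are assumptions made up for illustration; they are not NLTK's list or a row from fin.csv.

```python
# Hypothetical, tiny stop-word list; NLTK's English list is much longer.
DEMO_STOP_WORDS = {"i", "a", "an", "the", "is", "are", "for", "to", "and", "of", "when"}


def strip_stop_words(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return " ".join(w for w in text.lower().split() if w not in DEMO_STOP_WORDS)


# Made-up target sentence, used only to show the effect
target = "Taking breaks is necessary for maintaining focus and mental well-being."
print(strip_stop_words(target))
# taking breaks necessary maintaining focus mental well-being.
```

If fluent output sentences are wanted, one option is to preprocess only the input column and leave the targets as written.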

In [52]:
from transformers import TFAutoModelForSeq2SeqLM, AutoTokenizer
import tensorflow as tf
import pandas as pd

# Load the pretrained T5 tokenizer and model
model_name = 't5-base'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForSeq2SeqLM.from_pretrained(model_name)

# Apply text preprocessing to the input and output columns
data['Input'] = data['Input'].apply(preprocess_text)
data['Output'] = data['Output'].apply(preprocess_text)

# Tokenize the preprocessed input and output sequences
input_sequences = tokenizer(data['Input'].tolist(), padding=True, truncation=True, return_tensors="tf")
output_sequences = tokenizer(data['Output'].tolist(), padding=True, truncation=True, return_tensors="tf")



For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
All PyTorch model weights were used when initializing TFT5ForConditionalGeneration.

All the weights of TFT5ForConditionalGeneration were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFT5ForConditionalGeneration for predictions without further training.


In [53]:
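# Prepare inputs and labels for training.
# The labels are the target ids shifted by one position, and the decoder
# attention mask is trimmed by one position so both have the same length.
inputs = {
    "input_ids": input_sequences["input_ids"],
    "attention_mask": input_sequences["attention_mask"],
    "decoder_attention_mask": output_sequences["attention_mask"][:, :-1],
}
labels = output_sequences["input_ids"][:, 1:]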

# Compile the model
model.compile(optimizer="adam", loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=["accuracy"])
model.summary()

# Train the model
history = model.fit(inputs, labels, batch_size=2, epochs=10, validation_split=0.2)


Model: "tft5_for_conditional_generation_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 shared (Embedding)          multiple                  24674304  
                                                                 
 encoder (TFT5MainLayer)     multiple                  109628544 
                                                                 
 decoder (TFT5MainLayer)     multiple                  137949312 
                                                                 
Total params: 222903552 (850.31 MB)
Trainable params: 222903552 (850.31 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


**Testing The Model**

In [54]:
# Prediction function
def generate_output(input_text, model, tokenizer):
    # Tokenize the input sequence
    input_sequence = tokenizer(input_text, return_tensors="tf", padding=True, truncation=True)

    # Generate output with beam search using the trained model
    generated_output_ids = model.generate(
        input_sequence["input_ids"],
        attention_mask=input_sequence["attention_mask"],
        max_length=100,
        num_beams=5,
        length_penalty=0.6,
        no_repeat_ngram_size=2
    )

    # Decode and return the generated output
    generated_output_text = tokenizer.decode(generated_output_ids[0], skip_special_tokens=True)
    return generated_output_text
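
The `no_repeat_ngram_size=2` setting tells generation never to produce the same pair of consecutive tokens twice in one output. As a rough illustration of what that constraint rules out, the sketch below finds repeated n-grams in a word list. It works on whitespace-separated words rather than tokenizer ids, and `repeated_ngrams` is a hypothetical helper, not a Transformers function.

```python
def repeated_ngrams(tokens, n):
    """Return every n-gram that occurs again after its first appearance."""
    seen = set()
    repeats = []
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if gram in seen:
            repeats.append(gram)
        seen.add(gram)
    return repeats


text = "take breaks and take breaks often"
print(repeated_ngrams(text.split(), 2))
# [('take', 'breaks')]
```

A sequence like the one above could not be generated with `no_repeat_ngram_size=2`, because the bigram "take breaks" would appear twice.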

In [55]:
# Test examples
new_input_text1 = "I often feel guilty when I take breaks. How can I balance work and self-care without feeling bad?"
new_input_text2 = transcription_string  # the string generated from the audio

# Call the function with the different inputs
generated_output1 = generate_output(new_input_text1, model, tokenizer)
generated_output2 = generate_output(new_input_text2, model, tokenizer)  # used by the text-to-speech cell

# Print the generated output
print("Generated Output 1:", generated_output1)


Generated Output 1: take breaks necessary maintaining focus mental well-being.


## **Text To Speech Model**



In [57]:
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan, set_seed
import torch
import IPython.display as ipd

processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# Use the advice generated from the transcribed audio as the text to speak
txt = generated_output2
print(txt)

inputs = processor(text=txt, return_tensors="pt")
speaker_embeddings = torch.zeros((1, 512))

set_seed(555)  # Make deterministic
speech = model.generate(inputs["input_ids"], speaker_embeddings=speaker_embeddings, vocoder=vocoder)

# Play generated audio in Colab (the SpeechT5 vocoder produces 16 kHz audio)
ipd.Audio(speech.squeeze().numpy(), rate=16000)

spend time outside work. take breaks necessary maintaining focus mental well-being.
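
Since the models are meant for a mobile app, the generated waveform will probably need to be stored or sent as a file rather than played inline. The sketch below encodes float samples as a mono 16-bit WAV at 16 kHz, the sampling rate of the SpeechT5 vocoder, using only the standard library. `float_to_wav_bytes` is a hypothetical helper, and the sine tone is a stand-in for the model output.

```python
import io
import math
import struct
import wave


def float_to_wav_bytes(samples, sample_rate=16000):
    """Encode float samples in [-1, 1] as mono 16-bit PCM WAV bytes."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overflow
        frames += struct.pack("<h", int(clipped * 32767))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


# Stand-in signal: one second of a 440 Hz tone at 16 kHz
tone = [0.5 * math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
wav_bytes = float_to_wav_bytes(tone)
print(len(wav_bytes))  # 44-byte header + 16000 samples * 2 bytes = 32044
```

In the notebook, the same helper could be given `speech.squeeze().tolist()`, assuming the generated tensor holds samples in the range [-1, 1].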
