To use this notebook for testing, upload the files provided in the email as follows:
1. [Audio Emotional Analysis] No files are needed.
2. [Text Sentiment Analysis] No files are needed.
3. [Contextual Coherence Model] Create a folder named 'coherence_model' in the content folder, extract the files, and upload them into the coherence_model folder.
4. Upload the test file and set its name in the main pipeline (e.g. test_data_path = '/content/dialogue1.txt').

To use this notebook for training:

1. [Audio Emotional Analysis] No training is needed.
2. [Text Sentiment Analysis] No training is needed.
3. [Contextual Coherence Analysis] Upload 'dialogues_dataset.csv' (the contextual coherence analysis dataset) to the content folder, then in the main pipeline set dataset_path to '/content/dialogues_dataset.csv' and load_model = False.
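
The upload steps above can be checked before the main pipeline is run. The sketch below is a hypothetical helper (`missing_inputs` is not part of this notebook) that assumes the folder layout described above and simply reports which expected paths are absent.

```python
import os

def missing_inputs(content_dir, dialogue_file, training=False):
    """Return the expected paths that do not exist yet (hypothetical helper)."""
    # The dialogue file is read by the text sentiment step in both modes.
    expected = [os.path.join(content_dir, dialogue_file)]
    if training:
        # Training needs the coherence dataset described in the list above.
        expected.append(os.path.join(content_dir, "dialogues_dataset.csv"))
    else:
        # Testing needs the folder holding the fine-tuned coherence model.
        expected.append(os.path.join(content_dir, "coherence_model"))
    return [path for path in expected if not os.path.exists(path)]
```

In Colab this could be called as `missing_inputs('/content', 'dialogue1.txt')`; an empty list means every expected input was found.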

In [None]:
!pip install transformers datasets torchaudio librosa pandas


In [None]:
#For sentiment
!pip install transformers datasets torch scikit-learn



In [None]:
from transformers import AutoModelForAudioClassification, Wav2Vec2FeatureExtractor
import torch
import os
import librosa
import pandas as pd
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import numpy as np
from transformers import AutoTokenizer, MobileBertForSequenceClassification
import math

# Audio Processing For Emotional Analysis

In [None]:
class PretrainedEmotionModel:
    """
    Use the pre-trained DistilHuBERT model from Hugging Face for emotion classification.
    """
    def __init__(self, model_name):
        # Load the pre-trained DistilHuBERT model and feature extractor
        self.feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_name)
        self.model = AutoModelForAudioClassification.from_pretrained(model_name)

    def predict_all_labels(self, audio_file_path):
        """
        Predict all possible emotion labels with their respective confidence scores.
        """
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

        # Load and preprocess the audio
        speech_array, sampling_rate = librosa.load(audio_file_path, sr=16000)
        inputs = self.feature_extractor(
            speech_array, return_tensors="pt", sampling_rate=sampling_rate, padding=True
        )

        # Move inputs and model to the appropriate device (GPU or CPU)
        inputs = {key: value.to(device) for key, value in inputs.items()}
        self.model.to(device)

        # Make predictions
        with torch.no_grad():
            logits = self.model(**inputs).logits
            probabilities = torch.softmax(logits, dim=-1).squeeze().cpu().numpy()

        # Map label IDs to emotion labels
        id2label = self.model.config.id2label

        # Collect the predictions and their associated confidence scores
        results = []
        for label_id, confidence in enumerate(probabilities):
            emotion = id2label[label_id]
            results.append({
                'audio_file': os.path.basename(audio_file_path),
                'emotion': emotion.capitalize(),
                'confidence': confidence
            })

        # Sort results by confidence in descending order
        df = pd.DataFrame(results)
        df = df.sort_values(by='confidence', ascending=False).reset_index(drop=True)

        return df


class EmotionPipeline:
    """
    Orchestrates the workflow of downloading audio data from Google Drive and performing emotion classification.
    """
    def __init__(self, model_name="pollner/distilhubert-finetuned-ravdess"):
        self.model_name = model_name
        self.model = PretrainedEmotionModel(self.model_name)

    def authenticate_and_create_drive(self):
        """
        Authenticates the user and creates a PyDrive GoogleDrive instance.
        """
        auth.authenticate_user()
        gauth = GoogleAuth()
        gauth.credentials = GoogleCredentials.get_application_default()
        drive = GoogleDrive(gauth)
        return drive

    def download_audio_from_drive(self, drive, audio_file_id, destination_path):
        """
        Downloads an audio file from Google Drive using its file ID.
        """
        print(f"Downloading audio file from Google Drive with file ID: {audio_file_id}")
        downloaded = drive.CreateFile({'id': audio_file_id})
        downloaded.GetContentFile(destination_path)
        print(f"Downloaded audio file and saved as {destination_path}")

    def load_and_predict(self, audio_file_ids):
        """
        Downloads the audio files using their file IDs and predicts all possible labels.
        """
        drive = self.authenticate_and_create_drive()

        for audio_file_name, audio_file_id in audio_file_ids.items():
            destination_path = f"./{audio_file_name}"
            self.download_audio_from_drive(drive, audio_file_id, destination_path)

            # Perform prediction using the pre-trained model
            result_df = self.model.predict_all_labels(destination_path)
        # Only the predictions for the last file in audio_file_ids are returned
        return result_df
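
The post-processing in `predict_all_labels` (softmax over the logits, label lookup, sort by confidence) can be illustrated without loading the model. This is a dependency-free sketch, not the notebook's implementation: `rank_labels`, the logits, and the three-label mapping are made up for illustration.

```python
import math

def rank_labels(logits, id2label):
    """Softmax the logits and return (label, probability) pairs, highest first."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    ranked = [(id2label[i].capitalize(), e / total) for i, e in enumerate(exps)]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical logits and label mapping, for illustration only
example = rank_labels([2.0, 0.5, -1.0], {0: "neutral", 1: "sad", 2: "happy"})
for label, prob in example:
    print(f"{label:8s} {prob:.3f}")
```

The probabilities sum to 1, which is why the total confidence of the audio results is close to 1 when it is later used as a weight.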

# Text Processing with Sentiment Analysis

In [None]:
class TextSentimentAnalysisPipeline:
    def __init__(self, dataset_path, model_name='cambridgeltl/sst_mobilebert-uncased'):
        with open(dataset_path, 'r') as file:
            self.conversation = [line.strip() for line in file.readlines()]
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = MobileBertForSequenceClassification.from_pretrained(model_name)
        self.model.eval()  # Set model to evaluation mode

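    def extract_caller_text(self):
        # Extract lines spoken by the "Caller" and split them into sentences.
        # The prefix is sliced off explicitly: str.lstrip() removes a set of
        # characters, not a prefix, and would also eat leading letters of the text.
        prefix = '"Caller:'
        caller_lines = [line[len(prefix):].lstrip().rstrip('", ').split('. ') for line in self.conversation if line.startswith(prefix)]
        caller_lines = [sentence for sublist in caller_lines for sentence in sublist]
        return caller_lines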

    def predict_sentiments(self, texts):
        inputs = self.tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)
        outputs = self.model(**inputs)
        probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
        return probs.detach().numpy()

    def evaluate(self):
        # Extract all text spoken by the Caller
        caller_text = self.extract_caller_text()

        # Predict sentiments and get confidence scores
        result = []
        for turn in caller_text:
            prob = self.predict_sentiments(turn)
            result.append({
                'Sentence': turn,
                'Positive Confidence': prob[0, 2],
                'Negative Confidence': prob[0, 0],
                'Neutral Confidence': prob[0, 1]
            })
        result_df = pd.DataFrame(result)
        avg_pos = result_df['Positive Confidence'].mean()
        avg_neg = result_df['Negative Confidence'].mean()
        avg_neu = result_df['Neutral Confidence'].mean()
        sentiment_df = pd.DataFrame({'sentiment':['Positive', 'Negative', 'Neutral'], 'confidence':[avg_pos, avg_neg, avg_neu]})
        return sentiment_df
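
Extracting the caller's sentences depends on removing the `"Caller: ` prefix correctly. The sketch below uses a hypothetical helper and made-up dialogue lines to show prefix slicing, and contrasts it with `str.lstrip`, which treats its argument as a set of characters rather than a prefix.

```python
def extract_caller_sentences(lines, prefix='"Caller:'):
    """Return the caller's sentences from quoted dialogue lines (hypothetical helper)."""
    sentences = []
    for line in lines:
        line = line.strip()
        if not line.startswith(prefix):
            continue
        # Slice off the prefix, then drop the trailing quote, comma, and spaces.
        text = line[len(prefix):].lstrip().rstrip('", ')
        sentences.extend(part for part in text.split('. ') if part)
    return sentences

# Made-up sample lines in the same quoted format as the dialogue files
sample = [
    '"AI: Hello, how can I help?",',
    '"Caller: all right. Call me later",',
]
print(extract_caller_sentences(sample))  # ['all right', 'Call me later']
print(sample[1].lstrip('"Caller: '))      # ight. Call me later",
```

The second print shows the pitfall: the letters of "all r" all belong to the character set `"Caller: `, so `lstrip` removes them along with the prefix.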

# Contextual Coherence

In [None]:
# Import required libraries
from transformers import BigBirdForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments, DataCollatorWithPadding
import torch
from torch.utils.data import Dataset
import pandas as pd
from google.colab import files
import joblib
import os
import torch.nn.functional as F

In [None]:
# Step 1: Class for defining the custom dataset
class DialogueDataset(Dataset):
    def __init__(self, dataframe, tokenizer, max_length):
        self.dataframe = dataframe
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        context = self.dataframe.iloc[idx, 0]
        response = self.dataframe.iloc[idx, 1]
        label = self.dataframe.iloc[idx, 2]

        combined_text = context + " " + self.tokenizer.sep_token + " " + response
        encoding = self.tokenizer(
            combined_text,
            max_length=self.max_length,
            padding="max_length",
            truncation=True,
            return_tensors="pt",
        )

        input_ids = encoding["input_ids"].squeeze(0)
        attention_mask = encoding["attention_mask"].squeeze(0)

        return {
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "labels": torch.tensor(label, dtype=torch.long),
        }

# Step 2: Class for model training
class ModelTrainer:
    def __init__(self, train_dataset):
        self.tokenizer = AutoTokenizer.from_pretrained('google/bigbird-roberta-base')
        self.model = BigBirdForSequenceClassification.from_pretrained('google/bigbird-roberta-base')
        self.train_dataset = train_dataset
        self.training_args = self._setup_training_args()
        # Make every parameter tensor contiguous to avoid the non-contiguous tensor issue when saving
        for name, param in self.model.named_parameters():
            if not param.is_contiguous():
                param.data = param.data.contiguous()

    def _setup_training_args(self):
        # Set up training arguments, limiting to 1 epoch for quick testing
        return TrainingArguments(
            output_dir='./results',
            num_train_epochs=1,  # Quick testing with 1 epoch
            per_device_train_batch_size=8,
            learning_rate=2e-5,
            warmup_steps=500,
            weight_decay=0.01,
            logging_dir='./logs',
            logging_steps=50,
            save_total_limit=2,
            save_steps=200,
            evaluation_strategy="no",
        )

    def fine_tune_model(self):
        trainer = Trainer(
            model=self.model,
            args=self.training_args,
            train_dataset=self.train_dataset
        )
        trainer.train()
        return self.model

    def save_model(self, save_path):
        self.model.save_pretrained(save_path)
        self.tokenizer.save_pretrained(save_path)
        print(f"Model saved to {save_path}")

# Step 3: Class for coherence evaluation
class CoherenceEvaluator:
    def __init__(self, model_path):
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = BigBirdForSequenceClassification.from_pretrained(model_path)

    def tokenize_input(self, context, response):
        return self.tokenizer(context, response, return_tensors='pt', max_length=1024, truncation=True, padding='max_length')

    def compute_logits(self, inputs):
        outputs = self.model(**inputs)
        return outputs.logits

    def apply_softmax(self, logits):
        probabilities = F.softmax(logits, dim=1)
        return probabilities[0][1].item()

# Step 4: Main pipeline class to encapsulate the entire process
class CoherencePipeline:
    def __init__(self, dataset_path, model_save_path, load_model=False):
        self.file_path = dataset_path
        self.model_save_path = model_save_path
        self.load_model = load_model
        self.model_trainer = None
        self.coherence_evaluator = None

    def prepare_dataset(self):
        df = pd.read_csv(self.file_path)
        tokenizer = AutoTokenizer.from_pretrained('google/bigbird-roberta-base')
        train_dataset = DialogueDataset(df, tokenizer, max_length=256)
        return train_dataset

    def train_and_save_model(self, train_dataset):
        self.model_trainer = ModelTrainer(train_dataset)
        trained_model = self.model_trainer.fine_tune_model()
        self.model_trainer.save_model(self.model_save_path)
        return trained_model

    def evaluate_coherence(self):
        #file_name = list(self.file_path.keys())[0]
        with open(self.file_path, 'r') as file:
            dialogue = file.readlines()

        self.coherence_evaluator = CoherenceEvaluator(self.model_save_path)
        pairs = [(dialogue[i].strip(), dialogue[i + 1].strip()) for i in range(len(dialogue) - 1)]

        scores = []
        for context, response in pairs:
            inputs = self.coherence_evaluator.tokenize_input(context, response)
            logits = self.coherence_evaluator.compute_logits(inputs)
            score = self.coherence_evaluator.apply_softmax(logits)
            scores.append(score)

        # Create DataFrame to store results
        df_results = pd.DataFrame({
            'Pair Number': [f'Pair {i+1}' for i in range(len(pairs))],
            'Context': [pair[0] for pair in pairs],
            'Response': [pair[1] for pair in pairs],
            'Coherence Score': scores
        })

        # Calculate overall coherence score
        overall_score = sum(scores) / len(scores)
        df_results.loc['Overall'] = ['', '', 'Overall Coherence Score', overall_score]

        return df_results

    def run_pipeline(self):
        if self.load_model:
            # Use an existing model: a Hugging Face checkpoint or a local fine-tuned model
            if self.model_save_path.startswith('google/'):
                print(f'Using pretrained model from Hugging Face:{self.model_save_path}')
            else:
                if not os.path.exists(self.model_save_path):
                    raise FileNotFoundError(f"No fine-tuned model found at {self.model_save_path}. Please train the model first.")
                print(f"Using existing model from {self.model_save_path}")
        else:
            # Train and save a new model when load_model is False
            train_dataset = self.prepare_dataset()
            self.train_and_save_model(train_dataset)

        # Proceed to evaluate test data
        df_results = self.evaluate_coherence()
        print(df_results)
        return df_results
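
The pairing and averaging done in `evaluate_coherence` can be tried without the BigBird model. In the sketch below, `score_dialogue` and `toy_scorer` are hypothetical names; the scorer is a word-overlap stand-in for the model's softmax probability and is only meant to make the control flow visible. Unlike the method above, it returns 0.0 instead of dividing by zero when the dialogue has fewer than two lines, which is a choice made here.

```python
def score_dialogue(lines, scorer):
    """Score each consecutive (context, response) pair and average the scores."""
    turns = [line.strip() for line in lines]
    pairs = list(zip(turns, turns[1:]))  # (line i, line i + 1)
    scores = [scorer(context, response) for context, response in pairs]
    overall = sum(scores) / len(scores) if scores else 0.0
    return pairs, scores, overall

def toy_scorer(context, response):
    """Hypothetical stand-in for the model: share of response words seen in the context."""
    seen = set(context.lower().split())
    words = response.lower().split()
    return sum(word in seen for word in words) / len(words) if words else 0.0

pairs, scores, overall = score_dialogue(["a b", "b c", "c d"], toy_scorer)
print(len(pairs), scores, overall)
```

A dialogue with n lines yields n - 1 pairs, so every line except the first and last appears once as a response and once as a context.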

# Final Calculation


In [None]:
# import pandas as pd
# import math

# Function to map emotion to a score
def map_emotion_to_score(emotion):
    emotion_scores = {
        'Happy': 1,
        'Neutral': 0,
        'Calm': 0,
        'Angry': -1,
        'Disgust': -1,
        'Surprised': -1,
        'Fearful': -1,
        'Sad': -1}
    return emotion_scores.get(emotion, 0)

# Function to map sentiment to score
def map_sentiment_to_score(sentiment):
    sentiment_scores = {'Neutral':0, 'Negative':-1, 'Positive':1}
    return sentiment_scores.get(sentiment, 0)

# Function to apply sigmoid transformation and scale
def sigmoid_transform(x):
    x_sigmoid = 1 / (1 + math.exp(-x))
    x_scaled = x_sigmoid * 10
    return x_scaled

# Unified function to calculate the final score from any input format
def calculate_sentiment_score(df):
    if 'emotion' in df.columns:
        # Process DataFrame with emotions
        df['score'] = df.apply(lambda row: map_emotion_to_score(row['emotion']), axis=1)
        df['weighted_score'] = df['score'] * df['confidence']
    elif 'sentiment' in df.columns:
        # Process DataFrame with Sentiment, Confidence, and Score
        df['score'] = df.apply(lambda row: map_sentiment_to_score(row['sentiment']), axis = 1)
        df['weighted_score'] = df['score'] * df['confidence']
    else:
        raise ValueError("DataFrame format not recognized.")

    # Calculate weighted sum of scores
    weighted_sum = df['weighted_score'].sum()

    # Calculate total confidence
    total_confidence = df['confidence'].sum()

    # Compute the final raw score
    sentiment_score_raw = weighted_sum / total_confidence if total_confidence != 0 else 0

    # Apply sigmoid transformation to the final score
    sentiment_score = sigmoid_transform(sentiment_score_raw)

    return sentiment_score, total_confidence

def weighted_score(audio_score, audio_confidence, text_score, text_confidence):
    return (audio_score * audio_confidence + text_score * text_confidence) / (audio_confidence + text_confidence)

'''
# Example usage for both variations:

# Variation 1: Example DataFrame with emotion, level, and confidence
data_emotion = {
    'audio_file': ['audio1.mp3', 'audio1.mp3', 'audio1.mp3', 'audio1.mp3'],
    'emotion': ['Happy', 'Angry', 'Neutral', 'Sad'],
    'level': ['High', 'Medium', 'Unspecified', 'Low'],
    'confidence': [0.6, 0.2, 0.1, 0.1]
}
df_emotion = pd.DataFrame(data_emotion)

# Variation 2: Example DataFrame with sentiment and confidence
data_sentiment = {
    'sentiment': ['Neutral', 'Positive', 'Negative'],
    'confidence': [0.868819, 0.049960, 0.081221]
}
df_sentiment = pd.DataFrame(data_sentiment)

# Calculate the final score for both variations
final_score_emotion, emotion_confidence = calculate_sentiment_score(df_emotion)
final_score_sentiment, sentiment_confidence = calculate_sentiment_score(df_sentiment)

print(f"Final score (Emotion DataFrame): {final_score_emotion:.2f}")
print(f"Final score (Sentiment DataFrame): {final_score_sentiment:.2f}")
'''
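
As a worked example of the scoring functions above, the sketch below recomputes the scores from the confidence values printed for the first run (audio1.mp3 and dialogue1.txt) further down in this notebook. It uses plain dictionaries instead of DataFrames, and `scaled_score` is a hypothetical condensed version of `calculate_sentiment_score`, not a replacement for it.

```python
import math

EMOTION_SCORES = {"Happy": 1, "Neutral": 0, "Calm": 0, "Angry": -1,
                  "Disgust": -1, "Surprised": -1, "Fearful": -1, "Sad": -1}
SENTIMENT_SCORES = {"Positive": 1, "Neutral": 0, "Negative": -1}

def scaled_score(confidences, polarity):
    """Confidence-weighted polarity, passed through a sigmoid and scaled to 0-10."""
    total = sum(confidences.values())
    weighted = sum(polarity.get(label, 0) * conf for label, conf in confidences.items())
    raw = weighted / total if total else 0.0
    return 10 / (1 + math.exp(-raw)), total

# Confidence values copied from the printed output of the first run
audio = {"Neutral": 0.606270, "Sad": 0.158477, "Happy": 0.151101, "Calm": 0.070716,
         "Disgust": 0.005959, "Angry": 0.003035, "Surprised": 0.002574, "Fearful": 0.001867}
text = {"Positive": 0.415585, "Negative": 0.193286, "Neutral": 0.391129}

audio_score, audio_conf = scaled_score(audio, EMOTION_SCORES)
text_score, text_conf = scaled_score(text, SENTIMENT_SCORES)
satisfaction = (audio_score * audio_conf + text_score * text_conf) / (audio_conf + text_conf)
print(f"{audio_score:.2f} {text_score:.2f} {satisfaction:.2f}")  # 4.95 5.55 5.25
```

Because the raw score always lies between -1 and 1, the sigmoid-scaled score stays between roughly 2.69 and 7.31 and never reaches the ends of the 0-10 range.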


# Main Pipeline

In [None]:
# Main function to run the pipeline
def main():

  audio_model_name = "pollner/distilhubert-finetuned-ravdess"

  # Initialize the pipeline with the pre-trained model
  emotion_pipeline = EmotionPipeline(model_name=audio_model_name)

  # Provide the Google Drive file IDs of the audio files
  audio_file_ids = {
      'audio1.mp3': '108kPpEQeA_6RkQXmmLWDJXQzdiISlm0r'
  }

  # Download the audio files and perform emotion prediction
  audio_results_df = emotion_pipeline.load_and_predict(audio_file_ids)

  # Output the audio results
  print(audio_results_df)

  # Load sentiment analysis model to predict sentiment and confidence scores
  test_data_path = '/content/dialogue1.txt'
  pipeline = TextSentimentAnalysisPipeline(test_data_path)
  textresults_df = pipeline.evaluate()
  print(textresults_df)

  final_score_emotion, emotion_confidence = calculate_sentiment_score(audio_results_df)
  final_score_sentiment, sentiment_confidence = calculate_sentiment_score(textresults_df)

  print(f"Final score (Emotion DataFrame mingyao): {final_score_emotion:.2f}")
  print(f"Final score (Sentiment DataFrame bhavik): {final_score_sentiment:.2f}")

  user_satisfaction = weighted_score(final_score_emotion, emotion_confidence, final_score_sentiment, sentiment_confidence)
  print(f"Weighted score for user satisfaction: {user_satisfaction:.2f}")

  coherence_pipeline = CoherencePipeline(
    dataset_path = test_data_path, # Change to '/content/dialogues_dataset.csv' if you want to train
    model_save_path='./coherence_model', load_model=True)  # Set to False if you want to train
  coherence_result = coherence_pipeline.run_pipeline()
  coherence_score = coherence_result.loc['Overall', 'Coherence Score']

  final_score = 0.6*user_satisfaction+0.4*coherence_score*10
  print(f"Final score: {final_score:.2f}")

  #score = calculate_final_score(emotions, levels, confidences)
  #print(f"Final score: {score:.2f}")


if __name__ == "__main__":
  main()

Some weights of the model checkpoint at pollner/distilhubert-finetuned-ravdess were not used when initializing HubertForSequenceClassification: ['hubert.encoder.pos_conv_embed.conv.weight_g', 'hubert.encoder.pos_conv_embed.conv.weight_v']
- This IS expected if you are initializing HubertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing HubertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of HubertForSequenceClassification were not initialized from the model checkpoint at pollner/distilhubert-finetuned-ravdess and are newly initialized: ['hubert.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'hubert.encoder.pos_conv_em

Downloading audio file from Google Drive with file ID: 108kPpEQeA_6RkQXmmLWDJXQzdiISlm0r
Downloaded audio file and saved as ./audio1.mp3
   audio_file    emotion  confidence
0  audio1.mp3    Neutral    0.606270
1  audio1.mp3        Sad    0.158477
2  audio1.mp3      Happy    0.151101
3  audio1.mp3       Calm    0.070716
4  audio1.mp3    Disgust    0.005959
5  audio1.mp3      Angry    0.003035
6  audio1.mp3  Surprised    0.002574
7  audio1.mp3    Fearful    0.001867




  sentiment  confidence
0  Positive    0.415585
1  Negative    0.193286
2   Neutral    0.391129
Final score (Emotion DataFrame mingyao): 4.95
Final score (Sentiment DataFrame bhavik): 5.55
Weighted score for user satisfaction: 5.25
Using existing model from ./coherence_model
        Pair Number                                            Context  \
0            Pair 1  ["AI: Hi, my name is Lila. I'm Octivo's AI age...   
1            Pair 2  "Caller: Hey, nice to meet you. My name is Mic...   
2            Pair 3  "AI: Thank you for introducing yourself Michae...   
3            Pair 4  "Caller: Yeah, sure. I'm 27 but I feel like I ...   
4            Pair 5  "AI: I completely understand your hesitation a...   
5            Pair 6  "Caller: Ok, that's fair enough. So I'm earnin...   
6            Pair 7  "AI: Thank you for sharing your income range t...   
7            Pair 8  "Caller: I will retire at around 65 and I woul...   
Overall                                                   

In [None]:
# Main function to run the pipeline
def main():

  audio_model_name = "pollner/distilhubert-finetuned-ravdess"

  # Initialize the pipeline with the pre-trained model
  emotion_pipeline = EmotionPipeline(model_name=audio_model_name)

  # Provide the Google Drive file IDs of the audio files
  audio_file_ids = {
      'audio2.mp3': '13O1hKhYl5Uzlb0mIadH5hv5t_zSud664'
  }

  # Download the audio files and perform emotion prediction
  audio_results_df = emotion_pipeline.load_and_predict(audio_file_ids)

  # Output the audio results
  print(audio_results_df)

  # Load sentiment analysis model to predict sentiment and confidence scores
  test_data_path = '/content/dialogue2.txt'
  pipeline = TextSentimentAnalysisPipeline(test_data_path)
  textresults_df = pipeline.evaluate()
  print(textresults_df)

  final_score_emotion, emotion_confidence = calculate_sentiment_score(audio_results_df)
  final_score_sentiment, sentiment_confidence = calculate_sentiment_score(textresults_df)

  print(f"Final score (Emotion DataFrame mingyao): {final_score_emotion:.2f}")
  print(f"Final score (Sentiment DataFrame bhavik): {final_score_sentiment:.2f}")

  user_satisfaction = weighted_score(final_score_emotion, emotion_confidence, final_score_sentiment, sentiment_confidence)
  print(f"Weighted score for user satisfaction: {user_satisfaction:.2f}")

  coherence_pipeline = CoherencePipeline(
    dataset_path = test_data_path, # Change to '/content/dialogues_dataset.csv' if you want to train
    model_save_path='./coherence_model', load_model=True)  # Set to False if you want to train
  coherence_result = coherence_pipeline.run_pipeline()
  coherence_score = coherence_result.loc['Overall', 'Coherence Score']

  final_score = 0.6*user_satisfaction+0.4*coherence_score*10
  print(f"Final score: {final_score:.2f}")

  #score = calculate_final_score(emotions, levels, confidences)
  #print(f"Final score: {score:.2f}")


if __name__ == "__main__":
  main()


Downloading audio file from Google Drive with file ID: 13O1hKhYl5Uzlb0mIadH5hv5t_zSud664
Downloaded audio file and saved as ./audio2.mp3
   audio_file    emotion  confidence
0  audio2.mp3       Calm    0.949149
1  audio2.mp3        Sad    0.038235
2  audio2.mp3    Neutral    0.007554
3  audio2.mp3    Disgust    0.004204
4  audio2.mp3    Fearful    0.000314
5  audio2.mp3  Surprised    0.000238
6  audio2.mp3      Angry    0.000162
7  audio2.mp3      Happy    0.000145




  sentiment  confidence
0  Positive    0.234192
1  Negative    0.505983
2   Neutral    0.259824
Final score (Emotion DataFrame mingyao): 4.89
Final score (Sentiment DataFrame bhavik): 4.32
Weighted score for user satisfaction: 4.61
Using existing model from ./coherence_model
        Pair Number                                            Context  \
0            Pair 1  ["AI: Hello, I'm Claire, the receptionist at A...   
1            Pair 2  "Caller: Hi, my name is Michael. I have some i...   
2            Pair 3  "AI: Hi, Michael. I'd like to help you with th...   
3            Pair 4  "Caller: Yes. So my phone number is 0406000. Y...   
Overall                                                                  

                                                  Response  Coherence Score  
0        "Caller: Hi, my name is Michael. I have some i...         0.593523  
1        "AI: Hi, Michael. I'd like to help you with th...         0.758470  
2        "Caller: Yes. So my phone number is 

# Contextual Coherence (ignore - for record keeping)

In [None]:
# Import required libraries
from transformers import BigBirdForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments
import torch
from torch.utils.data import Dataset
import pandas as pd
from google.colab import files
import joblib
import os
import torch.nn.functional as F
from google.colab import drive
import gc

# Step 0: Safely handle mounting Google Drive
if not os.path.ismount('/content/drive'):
    drive.mount('/content/drive', force_remount=False)
    print("Drive mounted successfully.")
else:
    print("Google Drive is already mounted.")

# Clear cache and collect garbage to free memory
torch.cuda.empty_cache()
gc.collect()

# Step 1: Class for handling file operations and dataset management
class DatasetHandler:
    def __init__(self, save_directory):
        self.save_directory = save_directory
        self.train_dataset_file = os.path.join(self.save_directory, 'saved_train_dataset.pkl')
        self.dataframe = None

    def create_save_directory(self):
        if not os.path.exists(self.save_directory):
            os.makedirs(self.save_directory)
            print(f"Created directory: {self.save_directory}")

    def load_or_upload_dataset(self):
        if not os.path.exists(self.train_dataset_file):
            print("Training dataset file not found in Google Drive. Please upload the dataset.")
            uploaded = files.upload()
            file_name = list(uploaded.keys())[0]
            self.dataframe = pd.read_csv(file_name)
            joblib.dump(self.dataframe, self.train_dataset_file)
            print(f"Dataset saved to {self.train_dataset_file} in Google Drive.")
        else:
            # Load the dataset from Google Drive if it already exists
            self.dataframe = joblib.load(self.train_dataset_file)
            print(f"Dataset loaded from {self.train_dataset_file} in Google Drive.")
        return self.dataframe

# Step 2: Class for defining the custom dataset
class DialogueDataset(Dataset):
    def __init__(self, dataframe, tokenizer, max_length):
        self.dataframe = dataframe
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        context = self.dataframe.iloc[idx, 0]
        response = self.dataframe.iloc[idx, 1]
        label = self.dataframe.iloc[idx, 2]

        combined_text = context + " " + self.tokenizer.sep_token + " " + response
        encoding = self.tokenizer(
            combined_text,
            max_length=self.max_length,
            padding="max_length",
            truncation=True,
            return_tensors="pt",
        )

        input_ids = encoding["input_ids"].squeeze(0)
        attention_mask = encoding["attention_mask"].squeeze(0)

        return {
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "labels": torch.tensor(label, dtype=torch.long),
        }

# Custom Trainer class to handle non-contiguous tensor issue
class CustomTrainer(Trainer):
    def save_model(self, output_dir=None, _internal_call=False):
        # Make all tensors contiguous before saving
        for param in self.model.parameters():
            param.data = param.data.contiguous()
        super().save_model(output_dir, _internal_call=_internal_call)

# Step 3: Class for model training
class ModelTrainer:
    def __init__(self, model_name, train_dataset):
        self.model_name = model_name
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
        self.model = BigBirdForSequenceClassification.from_pretrained(self.model_name)
        self.train_dataset = train_dataset
        self.training_args = self._setup_training_args()

    def _setup_training_args(self):
        # Set up training arguments, limiting to 1 epoch for quick testing
        return TrainingArguments(
            output_dir='./results',
            num_train_epochs=1,  # Quick testing with 1 epoch
            per_device_train_batch_size=2,
            learning_rate=2e-5,
            warmup_steps=500,
            weight_decay=0.01,
            logging_dir='./logs',
            logging_steps=50,
            save_total_limit=2,
            save_steps=200,
            evaluation_strategy="no",
        )

    def fine_tune_model(self):
        trainer = CustomTrainer(
            model=self.model,
            args=self.training_args,
            train_dataset=self.train_dataset
        )
        trainer.train()
        return self.model

    def save_model(self, save_path):
        # Ensure all tensors are contiguous before saving
        for param in self.model.parameters():
            param.data = param.data.contiguous()
        self.model.save_pretrained(save_path)
        self.tokenizer.save_pretrained(save_path)
        print(f"Model saved to {save_path}")

# Step 4: Class for coherence evaluation with memory management
class CoherenceEvaluator:
    def __init__(self, model_path):
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = BigBirdForSequenceClassification.from_pretrained(model_path)

        # Move model to CPU to avoid GPU memory issues
        device = torch.device('cpu')
        self.model = self.model.to(device)

    def tokenize_input(self, context, response):
        return self.tokenizer(context, response, return_tensors='pt', max_length=1024, truncation=True, padding='max_length')

    def compute_logits(self, inputs):
        # Move inputs to the same device as the model
        inputs = {key: val.to(self.model.device) for key, val in inputs.items()}
        # Inference only: skip gradient tracking to save memory
        with torch.no_grad():
            outputs = self.model(**inputs)
        return outputs.logits

    def apply_softmax(self, logits):
        probabilities = F.softmax(logits, dim=1)
        # Probability of class index 1, assumed here to be the "coherent" label
        return probabilities[0][1].item()

# Step 5: Main pipeline class to encapsulate the entire process
class CoherencePipeline:
    def __init__(self, dataset_directory, model_name, model_save_path, train_model=True):
        self.dataset_directory = dataset_directory
        self.model_name = model_name
        self.model_save_path = model_save_path
        self.train_model = train_model
        self.dataset_handler = DatasetHandler(dataset_directory)
        self.model_trainer = None
        self.coherence_evaluator = None

    def prepare_dataset(self):
        self.dataset_handler.create_save_directory()
        df = self.dataset_handler.load_or_upload_dataset()
        tokenizer = AutoTokenizer.from_pretrained(self.model_name)
        train_dataset = DialogueDataset(df, tokenizer, max_length=256)
        return train_dataset

    def train_and_save_model(self, train_dataset):
        self.model_trainer = ModelTrainer(self.model_name, train_dataset)
        trained_model = self.model_trainer.fine_tune_model()
        self.model_trainer.save_model(self.model_save_path)
        return trained_model

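    def evaluate_coherence(self, dialogue_file_path):
        # Use the given file if it exists; otherwise fall back to a Colab upload prompt
        if os.path.exists(dialogue_file_path):
            file_name = dialogue_file_path
        else:
            print("Please upload the test file for evaluation:")
            uploaded = files.upload()
            file_name = list(uploaded.keys())[0]
        with open(file_name, 'r') as file:
            # Skip blank lines so they are not scored as empty turns
            dialogue = [line.strip() for line in file if line.strip()]

        if len(dialogue) < 2:
            raise ValueError("The dialogue file needs at least two non-empty lines to form a pair.")

        self.coherence_evaluator = CoherenceEvaluator(self.model_save_path)
        # Each pair is (turn i, turn i + 1)
        pairs = [(dialogue[i], dialogue[i + 1]) for i in range(len(dialogue) - 1)]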

        scores = []
        for context, response in pairs:
            inputs = self.coherence_evaluator.tokenize_input(context, response)
            logits = self.coherence_evaluator.compute_logits(inputs)
            score = self.coherence_evaluator.apply_softmax(logits)
            scores.append(score)

        # Create DataFrame to store results
        df_results = pd.DataFrame({
            'Pair Number': [f'Pair {i+1}' for i in range(len(pairs))],
            'Context': [pair[0] for pair in pairs],
            'Response': [pair[1] for pair in pairs],
            'Coherence Score': scores
        })

        # Calculate overall coherence score
        overall_score = sum(scores) / len(scores)
        df_results.loc['Overall'] = ['', '', 'Overall Coherence Score', overall_score]

        return df_results

    def run_pipeline(self):
        if self.train_model:
            # Train model if flag is set to True
            train_dataset = self.prepare_dataset()
            self.train_and_save_model(train_dataset)
        elif self.model_save_path.startswith("google/"):
            # Using a pretrained model straight from Hugging Face
            print(f"Using pretrained model from Hugging Face: {self.model_save_path}")
        elif not os.path.exists(self.model_save_path):
            # No fine-tuned model has been saved locally
            raise FileNotFoundError(f"No fine-tuned model found at {self.model_save_path}. Please train the model first.")
        else:
            print(f"Using existing model from {self.model_save_path}")

        # Proceed to evaluate test data and get DataFrame
        df_results = self.evaluate_coherence('your_dialogue_test_file.txt')
        print(df_results)
        return df_results

# Step 6: Run the pipeline
pipeline = CoherencePipeline(
    dataset_directory='/content/drive/MyDrive/Coherence_Model',
    model_name="google/bigbird-roberta-base",
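    # Note: the base checkpoint has no fine-tuned classification head, so scores from it
    # are not meaningful; point this to a fine-tuned model directory for real evaluation.
    model_save_path="google/bigbird-roberta-base",  # Pretrained model path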
    train_model=False  # Set to True if you want to train, False to use existing model
)

# Run the pipeline
df_results = pipeline.run_pipeline()
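
The evaluation step pairs each dialogue line with the line that follows it, scores every pair, and averages the scores. The sketch below shows only that pairing and averaging logic in plain Python. `score_pair` is a hypothetical word-overlap stand-in, not the BigBird classifier used above, and `make_pairs` and `overall_score` are illustrative names. The sketch also skips blank lines, which is an assumption about how the input file should be cleaned.

```python
def make_pairs(lines):
    """Pair each non-empty line with the line that follows it."""
    turns = [line.strip() for line in lines if line.strip()]
    return [(turns[i], turns[i + 1]) for i in range(len(turns) - 1)]


def score_pair(context, response):
    """Hypothetical stand-in scorer: fraction of response words seen in the context."""
    context_words = set(context.lower().split())
    response_words = response.lower().split()
    if not response_words:
        return 0.0
    return sum(w in context_words for w in response_words) / len(response_words)


def overall_score(scores):
    """Mean of the per-pair scores; raises if there are no pairs."""
    if not scores:
        raise ValueError("need at least one pair")
    return sum(scores) / len(scores)


# Made-up dialogue lines, including a blank line that is dropped
lines = ["AI: How old are you?\n", "\n", "Caller: I am 27.\n", "AI: Thank you.\n"]
pairs = make_pairs(lines)
scores = [score_pair(context, response) for context, response in pairs]
print(len(pairs), overall_score(scores))
```

A dialogue with n non-empty lines yields n - 1 pairs, so a file with fewer than two lines cannot be scored.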


Mounted at /content/drive
Drive mounted successfully.
Using pretrained model from Hugging Face: google/bigbird-roberta-base
Please upload the test file for evaluation:


Saving dialogue1.txt to dialogue1.txt


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Some weights of BigBirdForSequenceClassification were not initialized from the model checkpoint at google/bigbird-roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
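
The warning above means the classification head is randomly initialized when the base checkpoint is loaded directly. The two logits it produces are then likely to be small and close to each other, and softmax maps such logits to probabilities near 0.5, which would be consistent with the scores around 0.55 in the table below. A minimal sketch of that arithmetic in plain Python follows; the logit values are made up for illustration and are not taken from the model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for (class 0, class 1); not taken from the model
untrained_like = [-0.10, 0.15]   # small, nearly equal logits
confident_like = [-2.0, 3.0]     # well-separated logits

print(softmax(untrained_like)[1])
print(softmax(confident_like)[1])
```

Only after fine-tuning would the class-1 probability be expected to separate coherent from incoherent pairs.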


        Pair Number                                            Context  \
0            Pair 1  ["AI: Hi, my name is Lila. I'm Octivo's AI age...   
1            Pair 2  "Caller: Hey, nice to meet you. My name is Mic...   
2            Pair 3  "AI: Thank you for introducing yourself Michae...   
3            Pair 4  "Caller: Yeah, sure. I'm 27 but I feel like I ...   
4            Pair 5  "AI: I completely understand your hesitation a...   
5            Pair 6  "Caller: Ok, that's fair enough. So I'm earnin...   
6            Pair 7  "AI: Thank you for sharing your income range t...   
7            Pair 8  "Caller: I will retire at around 65 and I woul...   
Overall                                                                  

                                                  Response  Coherence Score  
0        "Caller: Hey, nice to meet you. My name is Mic...         0.566975  
1        "AI: Thank you for introducing yourself Michae...         0.552891  
2        "Caller: Yeah, s