# Step 1: Mental Health Chatbot Training Pipeline

This notebook performs a full end-to-end pipeline using all 8 specified datasets:
- 5 from Hugging Face
- 3 local CSV files

We are training:
1. RoBERTa for Emotion Detection (base model)
2. T5 for Response Generation (small base model)
3. T5 for Q&A Assistant

We also built a full logic flow and evaluate the chatbot's performance.

In [1]:
import random
import pandas as pd
from datasets import load_dataset, Dataset, concatenate_datasets
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          T5Tokenizer, T5ForConditionalGeneration,
                          Trainer, TrainingArguments,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

import nltk
from nltk.translate.bleu_score import sentence_bleu  
from nltk.translate.meteor_score import meteor_score  
from rouge_score import rouge_scorer
import sentencepiece as spm  





In [2]:
# Load local files
local1 = pd.read_csv("./data/mental_health_faq.csv")
local2 = pd.read_csv("./data/transformed_mental_health_chatbot.csv")
local3 = pd.read_csv("./data/Mental Health Chatbot Dataset - Friend mode and Professional mode Responses.csv")

# Unify function
def unify(ds, q_col, a_col):
    return ds.map(lambda x: {"question": str(x[q_col]).strip(), "answer": str(x[a_col]).strip()})

# Convert local files
local1_ds = unify(Dataset.from_pandas(local1), 'Question', 'Answer')
local2_ds = unify(Dataset.from_pandas(local2), 'Question', 'Answer')
local3_ds = unify(Dataset.from_pandas(local3), 'Prompt', 'Friend Response')  # or 'Professional Response'


Map:   0%|          | 0/98 [00:00<?, ? examples/s]

Map:   0%|          | 0/172 [00:00<?, ? examples/s]

Map:   0%|          | 0/205 [00:00<?, ? examples/s]

In [3]:
# Load online datasets
ds1 = load_dataset('tolu07/Mental_Health_FAQ', split='train')  # ['question', 'answer']
ds2 = load_dataset('Amod/mental_health_counseling_conversations', split='train')  # ['Context', 'Response']
ds3 = load_dataset('ruslanmv/ai-medical-chatbot', split='train')  # ['Patient', 'Doctor']
ds4 = load_dataset('lavita/ChatDoctor-HealthCareMagic-100k', split='train')  # ['instruction', 'input', 'output']
ds5 = load_dataset('heliosbrahma/mental_health_chatbot_dataset', split='train')  # ['text']

# Normalize online datasets
ds1 = unify(ds1, 'Questions', 'Answers')
ds2 = unify(ds2, 'Context', 'Response')
ds3 = unify(ds3, 'Patient', 'Doctor')  # Map 'Patient' to 'question' and 'Doctor' to 'answer'
ds4 = ds4.map(lambda x: {"question": (x['instruction'] + ' ' + x['input']).strip(), "answer": x['output'].strip()})
ds5 = ds5.map(lambda x: {"question": x['text'].split('\n')[0].strip(), "answer": x['text'].split('\n')[-1].strip()} if '\n' in x['text'] else {"question": x['text'].strip(), "answer": x['text'].strip()})

# Combine all
all_data = concatenate_datasets([ds1, ds2, ds3, ds4, ds5, local1_ds, local2_ds, local3_ds])  # Combine all datasets
all_data = all_data.filter(lambda x: bool(x['question']) and bool(x['answer']))
print('Total records in the combined dataset:', len(all_data))

print("Dataset loaded and unified successfully.")

Total records in the combined dataset: 373327
Dataset loaded and unified successfully.


## Step 2: Train RoBERTa for Emotion Detection

In [4]:
# Use subset of data for emotion labels (simulate labels for demo)
sample_df = pd.DataFrame(all_data[:3000])  # Use 3000 examples for quick demo
emotions = ['neutral', 'sadness', 'nervousness', 'anger', 'fear']
sample_df['label'] = [random.choice(emotions) for _ in range(len(sample_df))]

In [5]:
# Encode labels
label2id = {label: i for i, label in enumerate(emotions)}
id2label = {i: label for label, i in label2id.items()}
sample_df['label'] = sample_df['label'].map(label2id)

In [6]:
# Convert to Dataset
tokenizer_roberta = AutoTokenizer.from_pretrained("roberta-base")
model_roberta = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=len(label2id))

Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [7]:
# Tokenize the dataset for RoBERTa
def tokenize_roberta(examples):
    return tokenizer_roberta(examples['question'], truncation=True, padding="max_length")

emo_dataset = Dataset.from_pandas(sample_df)
emo_dataset = emo_dataset.map(tokenize_roberta, batched=True)
emo_train_test = emo_dataset.train_test_split(test_size=0.1)

training_args_roberta = TrainingArguments(
    output_dir="./roberta-emotion",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
)

Map:   0%|          | 0/3000 [00:00<?, ? examples/s]



In [8]:
# Initialize Trainer for RoBERTa
# trainer_roberta = Trainer(
#     model=model_roberta,
#     args=training_args_roberta,
#     train_dataset=emo_train_test["train"],
#     eval_dataset=emo_train_test["test"],
#     tokenizer=tokenizer_roberta,
# )

# trainer_roberta.train()




In [9]:
model_roberta.save_pretrained("./saved_models/roberta_emotion")
tokenizer_roberta.save_pretrained("./saved_models/roberta_emotion")

('./saved_models/roberta_emotion\\tokenizer_config.json',
 './saved_models/roberta_emotion\\special_tokens_map.json',
 './saved_models/roberta_emotion\\vocab.json',
 './saved_models/roberta_emotion\\merges.txt',
 './saved_models/roberta_emotion\\added_tokens.json',
 './saved_models/roberta_emotion\\tokenizer.json')

## Step 3: Train T5 for Response Generation

In [10]:
from datasets import concatenate_datasets

# Define unified datasets
datasets_list = []

# ==== Toggle Each Dataset On/Off ====
# Online
USE_DS1 = False   # tolu07/Mental_Health_FAQ
USE_DS2 = False  # Amod/mental_health_counseling_conversations
USE_DS3 = False  # ruslanmv/ai-medical-chatbot
USE_DS4 = False  # lavita/ChatDoctor-HealthCareMagic-100k
USE_DS5 = True  # heliosbrahma/mental_health_chatbot_dataset

# Local
USE_LOCAL1 = False  # mental_health_faq.csv
USE_LOCAL2 = False  # transformed_mental_health_chatbot.csv
USE_LOCAL3 = False  # Mental Health Chatbot Dataset - Friend/Pro mode

# === Load and append each if enabled ===
if USE_DS1:
    datasets_list.append(ds1)
if USE_DS2:
    datasets_list.append(ds2)
if USE_DS3:
    datasets_list.append(ds3)
if USE_DS4:
    datasets_list.append(ds4)
if USE_DS5:
    datasets_list.append(ds5)

if USE_LOCAL1:
    datasets_list.append(local1_ds)
if USE_LOCAL2:
    datasets_list.append(local2_ds)
if USE_LOCAL3:
    datasets_list.append(local3_ds)

# Combine selected datasets
all_data = concatenate_datasets(datasets_list)
all_data = all_data.filter(lambda x: bool(x['question']) and bool(x['answer']))
print("Total examples loaded:", len(all_data))


Total examples loaded: 165


In [11]:
# Initialize T5 for response generation

tokenizer_t5 = T5Tokenizer.from_pretrained("t5-small")  # or t5-tiny if you have it
model_t5 = T5ForConditionalGeneration.from_pretrained("t5-small")

You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565


In [12]:
def preprocess_t5_batched(batch):
    def clean_text(text):
        return str(text).replace("HUMAN>:", "").replace("HUMAN>", "").replace("USER:", "").replace("<ASSISTANT>:", "").strip()

    inputs = ["question: " + clean_text(q) for q in batch['question']]
    targets = [clean_text(a) for a in batch['answer']]

    model_inputs = tokenizer_t5(inputs, max_length=128, truncation=True, padding="max_length")

    with tokenizer_t5.as_target_tokenizer():
        labels = tokenizer_t5(targets, max_length=128, truncation=True, padding="max_length")

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs



# Remove existing tokenized columns before applying preprocess_t5
t5_eval_dataset_cleaned = t5_split["test"].remove_columns(["input_ids", "attention_mask", "labels"]).map(preprocess_t5)


NameError: name 't5_split' is not defined

In [None]:
all_data = all_data.remove_columns([col for col in all_data.column_names if col not in ['question', 'answer']])
t5_dataset = all_data.map(preprocess_t5_batched, batched=True)
t5_split = t5_dataset.train_test_split(test_size=0.1)



training_args_t5 = Seq2SeqTrainingArguments(
    output_dir="./t5-response",
    evaluation_strategy="epoch",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    predict_with_generate=True,
    num_train_epochs=1,
    fp16=True
)

In [None]:
# Initialize Seq2SeqTrainer for T5
trainer_t5 = Seq2SeqTrainer(
    model=model_t5,
    args=training_args_t5,
    train_dataset=t5_split["train"],
    eval_dataset=t5_split["test"],
    tokenizer=tokenizer_t5,
)

trainer_t5.train()

  trainer_t5 = Seq2SeqTrainer(


Epoch,Training Loss,Validation Loss
1,No log,7.619164


TrainOutput(global_step=10, training_loss=7.349626922607422, metrics={'train_runtime': 42.6945, 'train_samples_per_second': 3.466, 'train_steps_per_second': 0.234, 'total_flos': 5007646654464.0, 'train_loss': 7.349626922607422, 'epoch': 1.0})

In [None]:
# Load local data  
local1 = pd.read_csv('./data/mental_health_faq.csv')  # ['Questions', 'Answers']  
local2 = pd.read_csv('./data/transformed_mental_health_chatbot.csv')  # ['question', 'answer']  
local3 = pd.read_csv('./data/Mental Health Chatbot Dataset - Friend mode and Professional mode Responses.csv')  # ['Prompt', 'Friend Response']  
  
# Function to standardize datasets to have 'question' and 'answer' keys  
def unify(ds, q_col, a_col):  
    return ds.map(lambda x: {"question": str(x[q_col]).strip(), "answer": str(x[a_col]).strip()})  
  
# Convert local data to datasets  
local1_ds = unify(Dataset.from_pandas(local1), 'Question', 'Answer')  
local2_ds = unify(Dataset.from_pandas(local2), 'Question', 'Answer')  
local3_ds = unify(Dataset.from_pandas(local3), 'Prompt', 'Friend Response')  
  
# Load online datasets and fix columns  
ds1 = load_dataset('tolu07/Mental_Health_FAQ', split='train')  # ['question', 'answer']  
ds2 = load_dataset('Amod/mental_health_counseling_conversations', split='train')  # ['Context', 'Response']  
ds3 = load_dataset('ruslanmv/ai-medical-chatbot', split='train')  # ['Context', 'Response']  
ds4 = load_dataset('lavita/ChatDoctor-HealthCareMagic-100k', split='train')  # ['instruction', 'input', 'output']  
ds5 = load_dataset('heliosbrahma/mental_health_chatbot_dataset', split='train')  # ['text']  
  
# Normalize columns  
ds1 = unify(ds1, 'Questions', 'Answers')    
ds2 = unify(ds2, 'Context', 'Response')  
ds3 = unify(ds3, 'Patient', 'Doctor')  # Map 'Patient' to 'question' and 'Doctor' to 'answer'  
ds4 = ds4.map(lambda x: {"question": (x['instruction'] + ' ' + x['input']).strip(), "answer": x['output'].strip()})  
ds5 = ds5.map(lambda x: {"question": x['text'].split('\n')[0].strip(), "answer": x['text'].split('\n')[-1].strip()} if '\n' in x['text'] else {"question": x['text'].strip(), "answer": x['text'].strip()})  
  
# Combine everything  
all_data = concatenate_datasets([ds1, ds2, ds3, ds4, ds5, local1_ds, local2_ds, local3_ds])  
all_data = all_data.filter(lambda x: bool(x['question']) and bool(x['answer']))  
  
# Print out the total number of records  
print('Total records in the combined dataset:', len(all_data))  


Map:   0%|          | 0/98 [00:00<?, ? examples/s]

Map:   0%|          | 0/172 [00:00<?, ? examples/s]

Map:   0%|          | 0/205 [00:00<?, ? examples/s]

Total records in the combined dataset: 373327


In [None]:
model_t5.save_pretrained("./saved_models/t5_response")
tokenizer_t5.save_pretrained("./saved_models/t5_response")

('./saved_models/t5_response\\tokenizer_config.json',
 './saved_models/t5_response\\special_tokens_map.json',
 './saved_models/t5_response\\spiece.model',
 './saved_models/t5_response\\added_tokens.json')

## Step 4: Train T5 for Q&A Assistant

In [None]:
from datasets import concatenate_datasets

# Define unified datasets
datasets_list = []

# ==== Toggle Each Dataset On/Off ====
# Online
USE_DS1 = False   # tolu07/Mental_Health_FAQ
USE_DS2 = False  # Amod/mental_health_counseling_conversations
USE_DS3 = False  # ruslanmv/ai-medical-chatbot
USE_DS4 = False  # lavita/ChatDoctor-HealthCareMagic-100k
USE_DS5 = True  # heliosbrahma/mental_health_chatbot_dataset
# Local
USE_LOCAL1 = False  # mental_health_faq.csv
USE_LOCAL2 = False  # transformed_mental_health_chatbot.csv
USE_LOCAL3 = False  # Mental Health Chatbot Dataset - Friend/Pro mode

# === Load and append each if enabled ===
if USE_DS1:
    datasets_list.append(ds1)
if USE_DS2:
    datasets_list.append(ds2)
if USE_DS3:
    datasets_list.append(ds3)
if USE_DS4:
    datasets_list.append(ds4)
if USE_DS5:
    datasets_list.append(ds5)

if USE_LOCAL1:
    datasets_list.append(local1_ds)
if USE_LOCAL2:
    datasets_list.append(local2_ds)
if USE_LOCAL3:
    datasets_list.append(local3_ds)

# Combine selected datasets
all_data = concatenate_datasets(datasets_list)
all_data = all_data.filter(lambda x: bool(x['question']) and bool(x['answer']))
print("Total examples loaded:", len(all_data))

Total examples loaded: 165


In [None]:
model_qa = T5ForConditionalGeneration.from_pretrained("t5-small")
qa_dataset = all_data.map(preprocess_t5_batched, batched=True)

qa_split = qa_dataset.train_test_split(test_size=0.1)

In [None]:
all_data = all_data.remove_columns([col for col in all_data.column_names if col not in ['question', 'answer']])
t5_dataset = all_data.map(preprocess_t5_batched, batched=True)
t5_split = t5_dataset.train_test_split(test_size=0.1)



training_args_t5 = Seq2SeqTrainingArguments(
    output_dir="./t5-response",
    evaluation_strategy="epoch",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    predict_with_generate=True,
    num_train_epochs=1,
    fp16=True
)



In [None]:
training_args_qa = Seq2SeqTrainingArguments(
    output_dir="./t5-qa",
    evaluation_strategy="epoch",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=1,
    predict_with_generate=True,
    fp16=True
)

In [None]:
trainer_qa = Seq2SeqTrainer(
    model=model_qa,
    args=training_args_qa,
    train_dataset=qa_split["train"],
    eval_dataset=qa_split["test"],
    tokenizer=tokenizer_t5,
)

trainer_qa.train()

  trainer_qa = Seq2SeqTrainer(


Epoch,Training Loss,Validation Loss
1,No log,7.0899


TrainOutput(global_step=10, training_loss=7.720272827148437, metrics={'train_runtime': 43.6613, 'train_samples_per_second': 3.39, 'train_steps_per_second': 0.229, 'total_flos': 5007646654464.0, 'train_loss': 7.720272827148437, 'epoch': 1.0})

In [None]:
model_qa.save_pretrained("./saved_models/t5_qa")
tokenizer_t5.save_pretrained("./saved_models/t5_qa")  # same tokenizer

('./saved_models/t5_qa\\tokenizer_config.json',
 './saved_models/t5_qa\\special_tokens_map.json',
 './saved_models/t5_qa\\spiece.model',
 './saved_models/t5_qa\\added_tokens.json')

## Step 5: Combined Chatbot Logic

In [None]:
def chatbot(user_input):
    # Emotion Detection
    emo_inputs = tokenizer_roberta(user_input, return_tensors="pt")
    emo_outputs = model_roberta(**emo_inputs)
    emotion = id2label[int(emo_outputs.logits.argmax(dim=1))]

    # Support message
    support_msg = "I'm here for you." if emotion != "neutral" else "Letâ€™s talk more about how you're feeling."

    # Response generation
    response_input = tokenizer_t5("question: " + user_input, return_tensors="pt").input_ids
    response_output = model_t5.generate(response_input, max_length=100)
    response_text = tokenizer_t5.decode(response_output[0], skip_special_tokens=True)

    # Q&A answer
    qa_output = model_qa.generate(response_input, max_length=100)
    answer_text = tokenizer_t5.decode(qa_output[0], skip_special_tokens=True)

    print(f"Detected Emotion: {emotion}")
    print(f"Empathetic Support: {support_msg}")
    print(f"Generated Response: {response_text}")
    print(f"Factual Q&A: {answer_text}")


## Step 6: Evaluate

In [None]:
def evaluate_model(model, tokenizer, dataset, sample_size=5, batch_size=4):
    from evaluate import load as load_metric
    from bert_score import score
    import torch
    from torch.nn import CrossEntropyLoss
    from transformers import DataCollatorForSeq2Seq
    from tqdm import tqdm

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.eval()
    model.to(device)

    # â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€
    # Step 1: Sample Predictions
    print("\nðŸ”¹ Generating sample predictions...")
    test_data = dataset.select(range(sample_size))
    references = [ex['answer'] for ex in test_data]
    predictions = []

    for ex in tqdm(test_data, desc="Generating outputs"):
        input_text = "question: " + str(ex['question']).strip()
        encoded = tokenizer(input_text, return_tensors="pt", padding="max_length", truncation=True, max_length=128)
        input_ids = encoded['input_ids'].to(device)

        output_ids = model.generate(input_ids=input_ids, max_length=100)
        generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

        predictions.append(generated_text)

    # â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€
    # Step 2: ROUGE Evaluation
    print("\nðŸ”¹ ROUGE Evaluation:")
    rouge = load_metric("rouge")
    valid_predictions = [pred if pred.strip() else "N/A" for pred in predictions]
    rouge_scores = rouge.compute(predictions=valid_predictions, references=references)
    for key, value in rouge_scores.items():
        print(f"{key}: {value:.4f}")

    # â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€
    # Step 3: BERTScore Evaluation
    print("\nðŸ”¹ BERTScore Evaluation:")
    P, R, F1 = score(predictions, references, lang="en", verbose=False)
    print(f"Precision: {P.mean().item():.4f}")
    print(f"Recall:    {R.mean().item():.4f}")
    print(f"F1 Score:  {F1.mean().item():.4f}")

    # â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€
    
    # Ensure dataset has 'input_ids' and 'labels'
    if not all(k in dataset.column_names for k in ['input_ids', 'labels']):
        raise ValueError("Dataset must be tokenized with 'input_ids' and 'labels' before evaluating perplexity.")


    # Step 4: Perplexity on full dataset
    print("\nðŸ”¹ Perplexity Calculation:")
    data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model)
    test_dataloader = torch.utils.data.DataLoader(
    dataset.remove_columns([col for col in dataset.column_names if col not in ['input_ids', 'attention_mask', 'labels']]),
    batch_size=batch_size,
    collate_fn=data_collator
)


    total_loss = 0
    total_tokens = 0

    with torch.no_grad():
        for batch in tqdm(test_dataloader, desc="Evaluating perplexity"):
            input_ids = batch["input_ids"].to(device)
            labels = batch["labels"].to(device)

            outputs = model(input_ids=input_ids, labels=labels)
            logits = outputs.logits

            shift_logits = logits[..., :-1, :].contiguous()
            shift_labels = labels[..., 1:].contiguous()

            loss_fct = CrossEntropyLoss(ignore_index=-100, reduction='sum')
            loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))

            total_loss += loss.item()
            total_tokens += (shift_labels != -100).sum().item()

    perplexity = torch.exp(torch.tensor(total_loss / total_tokens))
    print(f"Perplexity: {perplexity.item():.4f}")

    # â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€
    # Step 5: Print Sample Predictions
    print("\nðŸ”¹ Sample Predictions:")
    for ref, pred in zip(references, predictions):
        print(f"REF:  {ref}")
        print(f"PRED: {pred}")
        print("-" * 60)


In [None]:
# STEP 1: Clean the test set â€” only keep necessary columns
test_clean = t5_split["test"].remove_columns(
    [col for col in t5_split["test"].column_names if col not in ['question', 'answer']]
)

# STEP 2: Tokenize the test set using the CORRECT batched function
t5_eval_dataset = test_clean.map(preprocess_t5_batched, batched=True)

# STEP 3: Run evaluation
evaluate_model(model_t5, tokenizer_t5, t5_eval_dataset)



ðŸ”¹ Generating sample predictions...


Generating outputs: 100%|â–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆ| 5/5 [00:02<00:00,  1.86it/s]



ðŸ”¹ ROUGE Evaluation:
rouge1: 0.0460
rouge2: 0.0063
rougeL: 0.0460
rougeLsum: 0.0458

ðŸ”¹ BERTScore Evaluation:


Some weights of RobertaModel were not initialized from the model checkpoint at roberta-large and are newly initialized: ['pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Precision: 0.6570
Recall:    0.6294
F1 Score:  0.6424

ðŸ”¹ Perplexity Calculation:


Evaluating perplexity: 100%|â–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆ| 5/5 [00:01<00:00,  3.21it/s]

Perplexity: 1177.8364

ðŸ”¹ Sample Predictions:
REF:  Remember, each person's journey to mental health is unique, so it's crucial to be patient with yourself and not compare your progress to others. Recovery is possible, and with the right support and dedication, you can lead a fulfilling life.
PRED: HUMAN>:
------------------------------------------------------------
REF:  <ASSISTANT>: Substance abuse can simply be defined as a pattern of harmful use of any substance for mood-altering purposes. Medline's medical encyclopedia defines drug abuse as "the use of illicit drugs or the abuse of prescription or over-the-counter drugs for purposes other than those for which they are indicated or in a manner or in quantities other than directed.
PRED: substance abuse
------------------------------------------------------------
REF:  Remember, while it's essential to educate yourself, seeking professional help from a licensed mental health practitioner is crucial for personalized advice and trea


