
In [1]:
from transformers import pipeline

pipe = pipeline("translation", model="google-t5/t5-small")



In [2]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-small")

In [3]:
import torch

In [4]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

In [5]:
model_name = "google-t5/t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)

In [6]:
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)

In [7]:
%pip install datasets



In [8]:
from datasets import load_dataset

ds = load_dataset("Amod/mental_health_counseling_conversations")

Generating train split:   0%|          | 0/3512 [00:00<?, ? examples/s]

In [9]:
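# T5's tokenizer already defines a dedicated pad token ("<pad>"), so it is not
# overridden with the EOS token here.

def tokenize_function(examples):
    # Tokenize questions (Context) and answers (Response) separately, padding or
    # truncating each to exactly 128 tokens.
    context_tokens = tokenizer(examples['Context'], padding="max_length", truncation=True, max_length=128)
    response_tokens = tokenizer(examples['Response'], padding="max_length", truncation=True, max_length=128)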

    return {
        "context_input_ids": context_tokens["input_ids"],
        "context_attention_mask": context_tokens["attention_mask"],
        "response_input_ids": response_tokens["input_ids"],
        "response_attention_mask": response_tokens["attention_mask"]
    }


In [10]:
tokenized_ds = ds.map(tokenize_function, batched=True)


In [11]:
# Index the row first so only one example is read, then add a batch dimension.
attention_mask = torch.tensor(tokenized_ds['train'][0]['context_attention_mask']).unsqueeze(0).to(device)


In [27]:
# Index the row first so only one example is read, then add a batch dimension.
input_ids = torch.tensor(tokenized_ds['train'][0]['context_input_ids']).unsqueeze(0).to(device)


In [31]:
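# Beam search: deterministic decoding that tracks the 10 best partial sequences.
output_beam = model.generate(
    input_ids,
    attention_mask=attention_mask,
    pad_token_id=tokenizer.pad_token_id,
    max_length=300,
    num_beams=10,
    do_sample=False,
    no_repeat_ngram_size=3,  # never emit the same 3-gram twice
)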

generated_text_beam = tokenizer.decode(output_beam[0], skip_special_tokens=True)
print("\nGenerated Text (Beam Search):")
print(generated_text_beam)


Generated Text (Beam Search):
.. I never tried or contemplated suicide. I've always wanted to fix my issues, but I never get around to it. How can I change my feeling of being worthless to everyone?
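The beam search call above sets no_repeat_ngram_size = 3, which prevents any 3-gram from appearing twice in the output. The following is a simplified sketch of that idea on plain Python lists, not the transformers implementation; the function name banned_next_tokens is hypothetical and the whitespace-split words stand in for real token ids.

```python
def banned_next_tokens(generated, ngram_size):
    """Return the tokens that would complete an n-gram already present in `generated`."""
    if ngram_size < 1 or len(generated) < ngram_size - 1:
        return set()
    # The last (n - 1) tokens are the prefix of the n-gram the next token would complete.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        # If an earlier n-gram starts with the same prefix, its last token is banned.
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# Toy example: words stand in for token ids (an assumption for readability).
tokens = "my feeling of being worthless to everyone my feeling".split()
print(banned_next_tokens(tokens, 3))
```

In this toy sequence the last two words are "my feeling", and "my feeling of" has already occurred, so "of" is the only word that would be blocked at the next step.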


In [32]:
# Top-p (nucleus) sampling: stochastic decoding, so each run can give different text.
output = model.generate(
    input_ids,
    attention_mask=attention_mask,
    pad_token_id=tokenizer.pad_token_id,
    max_length=300,
    do_sample=True,
    top_p=0.9,
)

generated_text_top = tokenizer.decode(output[0], skip_special_tokens=True)
print("Generated Text (Top-p Sampling):")
print(generated_text_top)


Generated Text (Top-p Sampling):
.. I've never tried to fix my issues but never managed to fix my issues. I don't get around to suicide..... to all? Can I change my feeling of being worthless to everyone?...What a way to change my feeling of being worthless to everyone?
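The top_p=0.9 argument above restricts sampling at each step to the smallest set of most likely tokens whose combined probability reaches 0.9. A minimal sketch of that filtering step is shown here on a hand-made distribution; the function name top_p_filter and the toy probabilities are assumptions for illustration, and the real library operates on logits tensors rather than dictionaries.

```python
def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their total mass reaches top_p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, mass = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        mass += p
        if mass >= top_p:
            break  # the nucleus is complete; everything after this is discarded
    # Renormalize so the surviving probabilities sum to 1 again.
    return {token: p / mass for token, p in kept}


# Hypothetical next-token distribution over four tokens.
toy_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
nucleus = top_p_filter(toy_probs, top_p=0.9)
print(sorted(nucleus))
```

With these toy numbers the first two tokens only cover 0.8 of the mass, so the third is needed to reach 0.9 and the least likely token "d" is dropped before sampling.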


In [19]:
%pip install rouge_score



In [20]:
%pip install evaluate



In [21]:
from evaluate import load
rouge = load("rouge")



In [33]:
reference_text =  "I'm going through some things with my feelings and myself. I barely sleep and I do nothing but think about how I'm worthless and how I shouldn't be here. I've never tried or contemplated suicide. I've always wanted to fix my issues, but I never get around to it.  How can I change my feeling of being worthless to everyone? Here are few things that you could do to make a positive change of you feel about yourself.List all the things that you want to do in life and then start working on one thing at a time. Give yourself credit for even the smallest accomplishment and don't forget to celebrate the fact that you've tried."


In [34]:
def display_rouge_scores(rouge_scores):
    """Print each ROUGE metric on its own line, rounded to four decimals."""
    for rouge_type, score in rouge_scores.items():
        print(f"{rouge_type.upper()}: {score:.4f}")

# Score the top-p sample against the reference text.
rouge_result_top_p = rouge.compute(predictions=[generated_text_top], references=[reference_text])

print("\nROUGE Scores (Top-p Sampling):")
display_rouge_scores(rouge_result_top_p)


ROUGE Scores (Top-p Sampling):
ROUGE1: 0.4235
ROUGE2: 0.2262
ROUGEL: 0.3176
ROUGELSUM: 0.3176


In [35]:
rouge_result_beam = rouge.compute(predictions=[generated_text_beam], references=[reference_text])
print("\nROUGE Scores (Beam Search):")
display_rouge_scores(rouge_result_beam)


ROUGE Scores (Beam Search):
ROUGE1: 0.4103
ROUGE2: 0.3896
ROUGEL: 0.4103
ROUGELSUM: 0.4103
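To make the ROUGE-1 and ROUGE-2 numbers above easier to interpret, here is a rough sketch of ROUGE-N as n-gram overlap between a prediction and a reference. It assumes lowercasing and whitespace tokenization, so its values will not exactly match those from the rouge metric used above, which applies its own tokenizer; the function name rouge_n is hypothetical.

```python
from collections import Counter


def rouge_n(prediction, reference, n=1):
    """Return (precision, recall, f1) of n-gram overlap between two strings."""
    def ngrams(text):
        tokens = text.lower().split()  # simplifying assumption: whitespace tokens
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction), ngrams(reference)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both texts.
    overlap = sum((pred & ref).values())
    precision = overlap / max(sum(pred.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = rouge_n("the cat sat", "the cat sat on the mat", n=1)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A short prediction that copies part of the reference verbatim gets high precision but low recall. That may be part of why the beam search output, which closely echoes the end of the input, has nearly identical ROUGE-1 and ROUGE-2 scores.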
