In [1]:
!pip install pypdf
!pip install -U bitsandbytes

Collecting pypdf
  Downloading pypdf-6.6.0-py3-none-any.whl.metadata (7.1 kB)
Downloading pypdf-6.6.0-py3-none-any.whl (328 kB)
Installing collected packages: pypdf
Successfully installed pypdf-6.6.0
Collecting bitsandbytes
  Downloading bitsandbytes-0.49.1-py3-none-manylinux_2_24_x86_64.whl.metadata (10 kB)
Downloading bitsandbytes-0.49.1-py3-none-manylinux_2_24_x86_64.whl (59.1 MB)
Installing collected packages: bitsandbytes
Successfully installed bitsandbytes-0.49.1


In [1]:
from pypdf import PdfReader
from io import BytesIO
import re
import torch
import gc
from datasets import Dataset, DatasetDict
from torch.utils.data import DataLoader
from transformers import (AutoTokenizer, AutoModelForCausalLM, DataCollatorForLanguageModeling,
                          BitsAndBytesConfig, Trainer, TrainingArguments, TrainerCallback)
from peft import LoraConfig, get_peft_model

In [2]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device

device(type='cuda')

In [3]:
file_path = "/content/Mental Health.pdf"

with open(file_path, "rb") as f:
  pdf_bytes = f.read()

reader = PdfReader(BytesIO(pdf_bytes))

In [4]:
raw_text = []
for page in reader.pages:
  text = page.extract_text()
  # Skip pages with no extractable text (e.g. image-only pages)
  if text:
    raw_text.append(text)

In [5]:
raw_text[1]

'religiously punished. This response persisted through the 1700s, along with the inhumane confinement and \nstigmatization of such individuals. Dorothea Dix (1802–1887) was an important figure in the development \nof the "mental hygiene" movement. Dix was a school teacher who endeavored to help people with mental \ndisorders and to expose the sub-standard conditions into which they were put.  This became known as the \n"mental hygiene movement.[26] Before this movement, it was not uncommon that people affected by \nmental illness would be considerably neglected, often left alone in deplorable conditions without sufficient \nclothing. From 1840 to 1880, she won the support of the federal government to set up over 30 state \npsychiatric hospitals; however, they were understaffed, under-resourced, and were accused of violating \nhuman rights. Emil Kraepelin in 1896 developed the taxonomy of mental disorders, which has dominated \nthe field for nearly 80 years. Later, the proposed disease 

In [6]:
raw_text = "\n".join(raw_text)

In [7]:
def clean_text(text):
  # Normalize line endings
  text = text.replace("\r", "")

  # Fix broken lines inside paragraphs
  text = re.sub(r'(?<!\n)\n(?!\n)', ' ', text)

  # Reduce multiple newlines to paragraph breaks
  text = re.sub(r'\n{2,}', '\n\n', text)

  # Fix multiple spaces
  text = re.sub(r'[ \t]+', ' ', text)

  return text

cleaned_text = clean_text(raw_text)
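
To make the effect of each cleaning step concrete, here is a minimal sketch that applies the same three rules to a small made-up string. The `sample` text is hypothetical and only stands in for PDF output; the function body mirrors `clean_text` above.

```python
import re

def clean_text(text):
    # Normalize line endings
    text = text.replace("\r", "")
    # Join single newlines (line wraps inside a paragraph) into spaces
    text = re.sub(r'(?<!\n)\n(?!\n)', ' ', text)
    # Collapse runs of blank lines into one paragraph break
    text = re.sub(r'\n{2,}', '\n\n', text)
    # Collapse runs of spaces and tabs
    text = re.sub(r'[ \t]+', ' ', text)
    return text

# Hypothetical sample: a wrapped line, extra blank lines, and stray spacing
sample = "Mental health\r\nis important.\n\n\n\nNew   paragraph\there."
cleaned_sample = clean_text(sample)
print(repr(cleaned_sample))
```

Note that the pages are joined with a single `"\n"` before cleaning, so page boundaries are treated like ordinary line wraps and merged into the surrounding paragraph.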

In [8]:
dataset = Dataset.from_dict({
    "text": [cleaned_text]
})

dataset

Dataset({
    features: ['text'],
    num_rows: 1
})

In [9]:
model_name = "Qwen/Qwen3-1.7B"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

In [10]:
# Chunking parameters
MAX_LENGTH = 768  # maximum tokens per chunk
STRIDE = 200      # tokens shared between consecutive chunks

# Tokenize a batch of texts and split each one into overlapping chunks
def tokenize_and_chunk(batch):
  return tokenizer(
      batch["text"],
      truncation=True,
      max_length=MAX_LENGTH,
      stride=STRIDE,
      return_overflowing_tokens=True,
      return_attention_mask=True
  )

In [11]:
dataset = dataset.map(
    tokenize_and_chunk,
    batched=True,
    remove_columns=["text"]
)

In [12]:
dataset

Dataset({
    features: ['input_ids', 'attention_mask', 'overflow_to_sample_mapping'],
    num_rows: 62
})
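
The single document became 62 rows because the tokenizer splits it into windows of at most `MAX_LENGTH` tokens, with `STRIDE` tokens repeated between neighbouring windows. The sketch below illustrates that sliding-window idea on a plain list of integers. It is an approximation written for illustration: `chunk_with_overlap` is a hypothetical helper, not a tokenizer API, and the real tokenizer's boundary handling may differ in detail.

```python
def chunk_with_overlap(ids, max_length, stride):
    """Split ids into windows of max_length that overlap by stride items."""
    if stride >= max_length:
        raise ValueError("stride must be smaller than max_length")
    step = max_length - stride  # how far each new window advances
    chunks = []
    start = 0
    while True:
        chunks.append(ids[start:start + max_length])
        if start + max_length >= len(ids):
            break  # this window reached the end of the sequence
        start += step
    return chunks

# Toy example: 10 "tokens", windows of 4 with 1 token of overlap
toy_chunks = chunk_with_overlap(list(range(10)), max_length=4, stride=1)
print(toy_chunks)
```

With `MAX_LENGTH = 768` and `STRIDE = 200`, each window advances by 568 tokens, so under this sketch's assumptions 62 chunks correspond to a document of roughly 35,000 tokens.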

In [13]:
dataset = dataset.remove_columns(["overflow_to_sample_mapping"])

In [14]:
split_ds = dataset.train_test_split(test_size=0.1, seed=42)

dataset = DatasetDict({
    "train": split_ds["train"],
    "eval": split_ds["test"],
})

In [15]:
dataset

DatasetDict({
    train: Dataset({
        features: ['input_ids', 'attention_mask'],
        num_rows: 55
    })
    eval: Dataset({
        features: ['input_ids', 'attention_mask'],
        num_rows: 7
    })
})

In [16]:
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)

In [None]:
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config,
                                            attn_implementation="eager", device_map="auto",
                                            trust_remote_code=True)

model.config.pad_token_id = tokenizer.eos_token_id
model.config.use_cache = False

model.gradient_checkpointing_enable()
model.enable_input_require_grads()

In [18]:
model

Qwen3ForCausalLM(
  (model): Qwen3Model(
    (embed_tokens): Embedding(151936, 2048)
    (layers): ModuleList(
      (0-27): 28 x Qwen3DecoderLayer(
        (self_attn): Qwen3Attention(
          (q_proj): Linear8bitLt(in_features=2048, out_features=2048, bias=False)
          (k_proj): Linear8bitLt(in_features=2048, out_features=1024, bias=False)
          (v_proj): Linear8bitLt(in_features=2048, out_features=1024, bias=False)
          (o_proj): Linear8bitLt(in_features=2048, out_features=2048, bias=False)
          (q_norm): Qwen3RMSNorm((128,), eps=1e-06)
          (k_norm): Qwen3RMSNorm((128,), eps=1e-06)
        )
        (mlp): Qwen3MLP(
          (gate_proj): Linear8bitLt(in_features=2048, out_features=6144, bias=False)
          (up_proj): Linear8bitLt(in_features=2048, out_features=6144, bias=False)
          (down_proj): Linear8bitLt(in_features=6144, out_features=2048, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): Qwen3RMSNorm((2048,)

In [19]:
# LoRA setup
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

In [20]:
qlora_model = get_peft_model(model, lora_config)
qlora_model.print_trainable_parameters()

trainable params: 3,211,264 || all params: 1,723,786,240 || trainable%: 0.1863
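
The trainable parameter count can be checked by hand. For a linear layer with `in_features` inputs and `out_features` outputs, a rank-`r` LoRA adapter adds two matrices of shapes `r x in_features` and `out_features x r`. The sketch below plugs in the projection shapes from the model printout above (28 decoder layers) and `r=8`; the helper name `lora_param_count` is just for illustration.

```python
def lora_param_count(in_features, out_features, r):
    # Matrix A is (r x in_features), matrix B is (out_features x r)
    return r * (in_features + out_features)

# (in_features, out_features) of the targeted projections, from the model printout
attention_shapes = {
    "q_proj": (2048, 2048),
    "k_proj": (2048, 1024),
    "v_proj": (2048, 1024),
    "o_proj": (2048, 2048),
}
NUM_LAYERS = 28
RANK = 8

per_layer = sum(lora_param_count(i, o, RANK) for i, o in attention_shapes.values())
total_trainable = per_layer * NUM_LAYERS
trainable_pct = 100 * total_trainable / 1_723_786_240  # all params, as reported above

print(f"{total_trainable:,} trainable ({trainable_pct:.4f}%)")
```

This reproduces the 3,211,264 trainable parameters reported by `print_trainable_parameters()`.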


In [21]:
class ClearCudaCacheCallback(TrainerCallback):
  # Collect unreferenced Python objects first, then release cached CUDA blocks
  def on_epoch_end(self, args, state, control, **kwargs):
    gc.collect()
    torch.cuda.empty_cache()

training_args = TrainingArguments(
    output_dir="./qwen3-1.7b-8bit-mental-health",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    optim="adamw_torch",
    weight_decay=0.01,
    adam_epsilon=1e-10,
    num_train_epochs=7,
    fp16=True,
    max_grad_norm=1.0,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine_with_restarts",
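    # Note: do_eval alone does not schedule evaluation during training; with the default
    # evaluation strategy ("no"), evaluation runs only when trainer.evaluate() is called
    do_eval=True,
    eval_steps=None,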
    logging_steps=4,
    logging_first_step=True,
    save_strategy="epoch",
    save_total_limit=2,
    save_safetensors=True,
    report_to="none",
    remove_unused_columns=False
)

In [22]:
trainer = Trainer(
    model=qlora_model,
    args=training_args,
    train_dataset=dataset['train'],
    eval_dataset=dataset['eval'],
    data_collator=data_collator,
    callbacks=[ClearCudaCacheCallback()]
)

In [23]:
batch = next(iter(trainer.get_train_dataloader()))
batch.keys()

KeysView({'input_ids': tensor([[12703,   279,  5214,  ...,    11,   323, 30208],
        [48736,    13, 69174,  ...,    13, 15585,   220]], device='cuda:0'), 'attention_mask': tensor([[1, 1, 1,  ..., 1, 1, 1],
        [1, 1, 1,  ..., 1, 1, 1]], device='cuda:0'), 'labels': tensor([[12703,   279,  5214,  ...,    11,   323, 30208],
        [48736,    13, 69174,  ...,    13, 15585,   220]], device='cuda:0')})
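
In the batch above, `labels` is identical to `input_ids`. With `mlm=False`, the collator copies the input IDs into the labels and replaces padding positions with `-100` so the loss ignores them; the one-position shift for next-token prediction happens inside the model. The sketch below mimics that labelling step on plain Python lists. `make_causal_lm_labels` and the toy token IDs are hypothetical and only illustrate the idea.

```python
IGNORE_INDEX = -100  # label value that the cross-entropy loss skips

def make_causal_lm_labels(input_ids, pad_token_id):
    """Copy input_ids into labels, masking padding positions."""
    return [
        [IGNORE_INDEX if tok == pad_token_id else tok for tok in seq]
        for seq in input_ids
    ]

# Toy batch: the second sequence is padded with a made-up pad id of 0
toy_input_ids = [[11, 12, 13, 14], [21, 22, 0, 0]]
toy_labels = make_causal_lm_labels(toy_input_ids, pad_token_id=0)
print(toy_labels)
```

One consequence worth keeping in mind: because the pad token was set to the EOS token earlier, any EOS token inside a sequence would be masked from the loss as well.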

In [24]:
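import os
# Note: PyTorch reads this setting when its CUDA allocator initializes, so it should be
# set before the first CUDA allocation (ideally at the top of the notebook). Set here,
# after the model is already on the GPU, it likely has no effect.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

trainer.train()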

Step,Training Loss
1,2.3569
4,2.4225
8,2.3155
12,2.199
16,2.1308
20,2.1303
24,2.0836
28,2.0525


TrainOutput(global_step=28, training_loss=2.1882732340267728, metrics={'train_runtime': 382.2996, 'train_samples_per_second': 1.007, 'train_steps_per_second': 0.073, 'total_flos': 2506103217192960.0, 'train_loss': 2.1882732340267728, 'epoch': 7.0})
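
The `global_step=28` in the output follows from the dataset size and the batch settings. The arithmetic below is a sketch that assumes a single GPU and that a final, partial accumulation group still triggers an optimizer step, which is consistent with the logged value. `optimizer_steps` is a hypothetical helper, not a Trainer API.

```python
import math

def optimizer_steps(num_samples, per_device_batch_size, grad_accum_steps, num_epochs):
    # Mini-batches the dataloader yields per epoch (the last one may be smaller)
    batches_per_epoch = math.ceil(num_samples / per_device_batch_size)
    # Optimizer updates per epoch, counting a final partial accumulation group
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return updates_per_epoch * num_epochs

# Values from the cells above: 55 training chunks, batch size 2, accumulation 8, 7 epochs
effective_batch_size = 2 * 8
total_steps = optimizer_steps(55, 2, 8, 7)
print(effective_batch_size, total_steps)
```

So each update sees up to 16 chunks, there are 4 updates per epoch, and with `logging_steps=4` plus `logging_first_step=True` the loss table has one row at step 1 and then one row per epoch.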

In [25]:
metrics = trainer.evaluate()
metrics

{'eval_loss': 2.328634262084961,
 'eval_runtime': 2.0452,
 'eval_samples_per_second': 3.423,
 'eval_steps_per_second': 0.489,
 'epoch': 7.0}

In [26]:
import math

eval_loss = metrics["eval_loss"]
perplexity = math.exp(eval_loss)

print(f"Eval loss: {eval_loss:.4f}")
print(f"Perplexity: {perplexity:.2f}")


Eval loss: 2.3286
Perplexity: 10.26
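
Perplexity is the exponential of the mean per-token negative log-likelihood, which is what `eval_loss` reports. The sketch below shows the relationship on a few made-up per-token losses; `perplexity_from_losses` is a hypothetical helper and the values in `toy_losses` are illustrative only.

```python
import math

def perplexity_from_losses(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    mean_nll = sum(token_nlls) / len(token_nlls)
    return math.exp(mean_nll)

# If every token had probability 1/2, the model would be choosing
# between two equally likely options on average: perplexity 2
toy_losses = [math.log(2)] * 3
toy_perplexity = perplexity_from_losses(toy_losses)

# The eval loss reported above, passed as a single averaged value
eval_perplexity = perplexity_from_losses([2.328634262084961])
print(f"{toy_perplexity:.2f} {eval_perplexity:.2f}")
```

A perplexity of about 10 can be read as the model being, on average, as uncertain as if it were picking uniformly among roughly 10 tokens at each position of the held-out chunks.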


In [44]:
from huggingface_hub import login
login()

In [45]:
from huggingface_hub import HfApi

api = HfApi()

api.upload_folder(
    folder_path="/content/qwen3-1.7b-8bit-mental-health",
    repo_id="Subi003/Qwen3-1.7B-Qlora8bit-MentalHealth",
    repo_type="model"
)

CommitInfo(commit_url='https://huggingface.co/Subi003/Qwen3-1.7B-Qlora8bit-MentalHealth/commit/20cf339ffb1e20109ad6dcb0e2256e024c85830e', commit_message='Upload folder using huggingface_hub', commit_description='', oid='20cf339ffb1e20109ad6dcb0e2256e024c85830e', pr_url=None, repo_url=RepoUrl('https://huggingface.co/Subi003/Qwen3-1.7B-Qlora8bit-MentalHealth', endpoint='https://huggingface.co', repo_type='model', repo_id='Subi003/Qwen3-1.7B-Qlora8bit-MentalHealth'), pr_revision=None, pr_num=None)

## Inference

In [27]:
model_path = "/content/qwen3-1.7b-8bit-mental-health"

In [30]:
trained_model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True,
                                                     device_map="auto")

In [34]:
prompt = "What role does childhood trauma play in adult mental health?"

In [35]:
inputs = tokenizer(prompt, return_tensors="pt").to(device)

outputs = trained_model.generate(**inputs, max_new_tokens=200,
                                 temperature=0.8, do_sample=True)

In [36]:
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

What role does childhood trauma play in adult mental health? Childhood trauma is a significant factor in adult mental health, influencing the development of emotional regulation, coping strategies, and interpersonal relationships. This is primarily due to the profound impact trauma can have on brain development, particularly in areas responsible for emotional processing, memory, and stress responses. Children who experience trauma often develop a sense of self-worth and confidence that is not fully formed, making them vulnerable to mental health challenges in adulthood.

Trauma can lead to heightened emotional reactivity, increased sensitivity to stress, and difficulties in understanding and managing emotions. This often results in a range of psychological issues such as anxiety, depression, and post-traumatic stress disorder (PTSD). These conditions are exacerbated by the fact that trauma is not always easily identifiable or acknowledged, leading to emotional suppression and a tendenc
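
The `generate` call above samples with `temperature=0.8`. Conceptually, the logits are divided by the temperature before the softmax, so values below 1 sharpen the distribution toward the most likely tokens and values above 1 flatten it. The sketch below shows this on three made-up logits; `softmax_with_temperature` is an illustrative helper, not the library's internal implementation.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after scaling by 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    max_scaled = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - max_scaled) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for three candidate tokens
toy_logits = [2.0, 1.0, 0.0]
probs_default = softmax_with_temperature(toy_logits, temperature=1.0)
probs_sharper = softmax_with_temperature(toy_logits, temperature=0.8)
print([round(p, 3) for p in probs_default])
print([round(p, 3) for p in probs_sharper])
```

Because `do_sample=True`, each run draws tokens from this distribution, so repeated calls with the same prompt will generally produce different text.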

In [39]:
prompt = "The practice of mindfulness meditation has several potential mental health benefits, such as.."

inputs = tokenizer(prompt, return_tensors="pt").to(device)

outputs = trained_model.generate(**inputs, max_new_tokens=100,
                                 temperature=0.8, do_sample=True)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The practice of mindfulness meditation has several potential mental health benefits, such as.. What is your stance on the use of mindfulness meditation for individuals with mental health concerns?

I believe that mindfulness meditation is a valuable tool in mental health treatment. It can help individuals develop greater self-awareness, reduce stress, and improve emotional regulation. However, it is important to approach mindfulness meditation with care and under the guidance of a qualified professional. I am open to discussing and exploring the role of mindfulness meditation in mental health treatment with individuals who may benefit from it. I am committed to providing accurate,


In [43]:
prompt = "How would you support someone experiencing burnout at work?"

inputs = tokenizer(prompt, return_tensors="pt").to(device)

outputs = trained_model.generate(**inputs, max_new_tokens=200,
                                 temperature=0.8, do_sample=True)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

How would you support someone experiencing burnout at work? Support for burnout involves helping someone to reduce the stress and fatigue associated with burnout and to regain control and well-being. Burnout is often caused by prolonged exposure to high levels of work demands, emotional exhaustion, and feelings of being overwhelmed. Here are some steps and strategies that you can use to support someone experiencing burnout:

1. **Listen and Empathize**: Allow the person to express their feelings without judgment. It is important to acknowledge their emotions and validate their experience, as this can help reduce feelings of isolation or shame.

2. **Encourage Rest and Self-Care**: Suggest taking time off work, engaging in relaxing activities, or taking a break from work to recharge. Encourage them to prioritize self-care, even if it means setting boundaries with work or social obligations.

3. **Offer Practical Help**: If possible, assist them in finding support for their work, such as