# 🧠 FinTherapy-8B: Finnish Mental Health AI Demo

Welcome to the official demo for **FinTherapy-8B**, a specialized Finnish language model fine-tuned for empathetic and professional mental health dialogue.

### ℹ️ About the Model
- **Base Model:** LumiOpen/Llama-Poro-2-8B-Instruct
- **Adapter:** kidahatsu/fintherapy-8b
- **Focus:** Therapeutic empathy, active listening, and mental health support in Finnish.

### ⚙️ Setup Instructions (Kaggle)
1. Open the **Notebook Settings** (sidebar).
2. Set **Accelerator** to `GPU T4 x 2`.
3. Set **Internet** to `On`.

---

In [None]:
# 1. Setup & Installation
# Install the libraries needed to load a 4-bit quantized model with a PEFT adapter on free-tier GPUs.
# Warnings about dependency conflicts (such as pydantic/gradio) are expected in pre-built environments and can usually be ignored.

print("⏳ Installing dependencies... (This takes about 1 minute)")
!pip install -U -q transformers accelerate bitsandbytes peft
print("✅ Installation complete.")
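Because the install step upgrades packages inside a pre-built environment, it can help to confirm which versions actually ended up installed. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# The four packages installed by the pip command above.
for name, ver in installed_versions(
    ["transformers", "accelerate", "bitsandbytes", "peft"]
).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

A `NOT INSTALLED` entry here would suggest rerunning the install cell before continuing.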

In [None]:
# 2. Memory Management
# Free any cached GPU memory left over from earlier runs so that as much of the T4's ~15 GB of VRAM as possible is available.

import torch
import gc

gc.collect()
torch.cuda.empty_cache()

if torch.cuda.is_available():
    print(f"✅ GPU Detected: {torch.cuda.get_device_name(0)}")
else:
    print("⚠️ No GPU detected. Please enable 'GPU T4 x 2' in settings.")

In [None]:
# 3. Model Configuration
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Model IDs
ADAPTER_MODEL = "kidahatsu/fintherapy-8b"
BASE_MODEL = "LumiOpen/Llama-Poro-2-8B-Instruct"

# 4-bit Quantization Config
# This shrinks the weights from roughly 16 GB (float16) to roughly 6 GB, so the model fits on a single T4 GPU.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)
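The size figures in the comment above follow from simple arithmetic on bits per parameter. Here is a rough sketch, assuming about 8 billion parameters and counting only raw weight storage. It ignores quantization constants, any layers kept in higher precision, activations, and the KV cache, which is why the loaded footprint ends up above the raw 4-bit figure. `weight_memory_gb` is a hypothetical helper, not part of any library.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate raw weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8e9  # assumption: roughly 8 billion parameters

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)

print(f"float16 weights: ~{fp16_gb:.0f} GB")
print(f"4-bit weights:   ~{nf4_gb:.0f} GB before overhead")
```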

In [None]:
# 4. Load Models
print("⏳ Loading Tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

print("⏳ Loading Base Model (4-bit)... (This may take 2-3 minutes)")
# device_map="auto" lets accelerate place the layers on the available GPU(s) automatically
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=quantization_config,
    device_map="auto",
    torch_dtype=torch.float16,
)

print("⏳ Loading FinTherapy Adapter...")
# Crucial: 'subfolder="adapter"' points to the correct location in the Hugging Face repo
model = PeftModel.from_pretrained(
    base_model,
    ADAPTER_MODEL,
    subfolder="adapter" 
)

print("✅ System Ready! Model loaded successfully.")

In [None]:
# 5. Inference Helper Function

def therapist_chat(user_input):
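    """
    Generates a therapeutic response using a plain USER:/ASSISTANT: prompt format.
    """
    # Wrap the user's message in the USER:/ASSISTANT: format
    prompt = f"USER: {user_input}\nASSISTANT:"
    
    # Send the inputs to the model's device rather than hard-coding "cuda",
    # since device_map="auto" decides where the model lives
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)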
    
    with torch.no_grad():
        output_tokens = model.generate(
            **inputs,
            max_new_tokens=256,        # Allow long enough responses
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
            do_sample=True,            # Enable sampling for more natural text
            temperature=0.7,           # Control creativity (0.7 is good for chat)
            top_p=0.9,                 # Nucleus sampling
            repetition_penalty=1.1     # Prevent repeating phrases
        )
    
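    # Decode only the newly generated tokens, so text from the prompt cannot be mistaken for the reply
    prompt_length = inputs["input_ids"].shape[1]
    generated = tokenizer.decode(output_tokens[0][prompt_length:], skip_special_tokens=True)
    # Cut off any follow-up "USER:" turn the model may invent
    response_only = generated.split("USER:")[0].strip()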
    
    return response_only
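To make the sampling settings in the helper more concrete, here is a toy sketch of how temperature scaling and nucleus (top-p) filtering reshape a next-token distribution. It is a simplified pure-Python illustration with made-up logits, not the actual `transformers` implementation; `softmax_with_temperature` and `top_p_filter` are hypothetical names.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; a lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p,
    then renormalize. Returns {token_index: probability}."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, mass = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        mass += p
        if mass >= top_p:
            break
    return {index: p / mass for index, p in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens
probs = softmax_with_temperature(toy_logits, temperature=0.7)
print(top_p_filter(probs, top_p=0.9))
```

With a temperature below 1.0 the most likely token gains probability mass, and top-p then discards the unlikely tail before a token is drawn.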

## 🧪 Sample Prompts
Below are several test cases demonstrating how the model handles different mental health scenarios.

In [None]:
# TEST CASE 1: Anxiety / Ahdistus
# English: "I have been feeling really anxious lately, and I can't relax even at home."
prompt = "Olen tuntenut oloni todella ahdistuneeksi viime aikoina, enkä pysty rentoutumaan edes kotona."
print(f"👤 USER: {prompt}\n")
print(f"🤖 ASSISTANT:\n{therapist_chat(prompt)}")

In [None]:
# TEST CASE 2: Sleep Problems / Univaikeudet
# English: "I keep waking up at four in the early morning and can't get back to sleep. It is starting to exhaust me."
prompt = "Herään jatkuvasti aamuyöllä neljältä enkä saa enää unta. Se alkaa uuvuttaa minua."
print(f"👤 USER: {prompt}\n")
print(f"🤖 ASSISTANT:\n{therapist_chat(prompt)}")

In [None]:
# TEST CASE 3: Loneliness / Yksinäisyys
# English: "I feel that no one really understands me. I am completely alone with my thoughts."
prompt = "Minusta tuntuu, että kukaan ei oikeasti ymmärrä minua. Olen aivan yksin ajatusteni kanssa."
print(f"👤 USER: {prompt}\n")
print(f"🤖 ASSISTANT:\n{therapist_chat(prompt)}")

In [None]:
# TEST CASE 4: Panic Attack / Paniikkikohtaus
# English: "My heart is pounding and I feel like I can't breathe. Help."
prompt = "Sydämeni hakkaa ja minusta tuntuu etten saa henkeä. Apua."
print(f"👤 USER: {prompt}\n")
print(f"🤖 ASSISTANT:\n{therapist_chat(prompt)}")

In [None]:
## ✍️ Interactive Mode
# Uncomment the lines below to chat interactively!

# while True:
#     user_input = input("\nSinä: ")
#     if user_input.lower() in ["exit", "lopeta"]:
#         break
#     response = therapist_chat(user_input)
#     print(f"Therapist: {response}")
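If the loop above is used often, it may be convenient to wrap it in a function so that the exit words are explicit and the loop can be exercised without a loaded model. This is a sketch under that assumption: `run_chat`, `EXIT_WORDS`, and the injectable `read`/`write` parameters are hypothetical, and `respond` stands in for `therapist_chat`.

```python
EXIT_WORDS = {"exit", "lopeta"}  # "lopeta" is Finnish for "stop"


def run_chat(respond, read=input, write=print):
    """Read user turns until an exit word appears; return the number of replies."""
    turns = 0
    while True:
        user_input = read("\nSinä: ").strip()
        if user_input.lower() in EXIT_WORDS:
            return turns
        if not user_input:
            continue  # skip empty lines instead of querying the model
        write(f"Therapist: {respond(user_input)}")
        turns += 1


# Dry run with scripted input and an echo function in place of the model.
scripted = iter(["Hei", "", "lopeta"])
replies = []
run_chat(lambda text: f"(echo) {text}", read=lambda _: next(scripted), write=replies.append)
print(replies)
```

In the notebook itself, calling `run_chat(therapist_chat)` would start an interactive session that ends when the user types `exit` or `lopeta`.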