# EDA and Fine-Tuning Mental-BERT with QLoRA

This notebook performs exploratory data analysis (EDA) on therapeutic transcript data and demonstrates fine-tuning Mental-BERT with QLoRA for sentence-level construct classification.

## Sections
- Data loading and inspection
- Sentence segmentation and label distribution
- QLoRA configuration and training
- Evaluation and qualitative review
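
As a preview of the QLoRA configuration step, the following is a minimal sketch of how a 4-bit quantized base model with LoRA adapters might be assembled using `transformers` and `peft`. The Hub id, the number of labels, and every hyperparameter below are assumptions for illustration, not values taken from the project; `QLORA_PARAMS` and `build_qlora_model` are hypothetical names.

```python
# Hypothetical hyperparameters; none of these values come from the project.
QLORA_PARAMS = {
    "model_id": "mental/mental-bert-base-uncased",  # assumed Hub id for Mental-BERT
    "num_labels": 4,                                # assumed number of constructs
    "r": 8,                                         # LoRA rank
    "lora_alpha": 16,                               # LoRA scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["query", "value"],           # BERT self-attention projections
}


def build_qlora_model(params=None):
    """Load the base classifier in 4-bit precision and attach LoRA adapters."""
    # Heavy dependencies are imported lazily so this cell can be read and
    # imported on a machine without a GPU stack installed.
    import torch
    from peft import (
        LoraConfig,
        TaskType,
        get_peft_model,
        prepare_model_for_kbit_training,
    )
    from transformers import (
        AutoModelForSequenceClassification,
        BitsAndBytesConfig,
    )

    params = params or QLORA_PARAMS

    # 4-bit NF4 quantization with double quantization, as used in QLoRA.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumes bfloat16-capable hardware
    )
    model = AutoModelForSequenceClassification.from_pretrained(
        params["model_id"],
        num_labels=params["num_labels"],
        quantization_config=bnb_config,
    )
    model = prepare_model_for_kbit_training(model)

    # Only the low-rank adapters (and the classification head) are trained.
    lora_config = LoraConfig(
        task_type=TaskType.SEQ_CLS,
        r=params["r"],
        lora_alpha=params["lora_alpha"],
        lora_dropout=params["lora_dropout"],
        target_modules=params["target_modules"],
    )
    return get_peft_model(model, lora_config)
```

Calling `build_qlora_model()` requires a CUDA-capable environment with `bitsandbytes` installed and access to the base checkpoint. The returned model can then be passed to a standard training loop.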

# Methodological Alignment Summary

This notebook and the associated codebase implement the AQUA-inspired, sentence-level qualitative analysis pipeline as described in the project methodology. Key features include:

- **Sentence-level graph neural network (GNN) analysis** using Mental-BERT embeddings, capturing semantic and temporal relationships between therapeutic exchanges.
- **Parameter-efficient fine-tuning** of Mental-BERT using QLoRA for domain adaptation on expert-coded transcripts.
- **Transparent, interpretable, and reproducible pipeline**: All steps (preprocessing, embedding, graph construction, clustering, classification, evaluation) are modular, auditable, and logged.
- **Graph construction**: Nodes represent sentences; edges encode semantic similarity and temporal adjacency, supporting transparent community detection.
- **Community detection and theme discovery**: The Louvain algorithm is used for unsupervised theme extraction.
- **Classification**: Combines Mental-BERT predictions, graph structure, and community-level aggregation for robust construct assignment.
- **Audit trail**: All decisions, confidence scores, and rationales are logged for full reproducibility and review.
- **Evaluation**: Cohen’s kappa, precision/recall/F1, and modularity metrics are computed to benchmark performance and reliability.
- **Visualization**: Graph communities and similar sentences are visualized for qualitative review.
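
The graph construction step listed above can be sketched roughly as follows, using only the standard library. This is an illustration under stated assumptions rather than the project's implementation: the function names, the similarity threshold of 0.7, the temporal edge weight of 0.5, and the toy 2-d vectors (standing in for Mental-BERT sentence embeddings) are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_sentence_graph(embeddings, sim_threshold=0.7, temporal_weight=0.5):
    """Return {(i, j): weight} with i < j for an undirected sentence graph.

    Semantic edges link any pair whose cosine similarity meets the threshold.
    Temporal edges link consecutive sentences; if both apply, the weights add.
    """
    edges = {}
    # Semantic similarity edges between all sentence pairs.
    for i, j in combinations(range(len(embeddings)), 2):
        sim = cosine(embeddings[i], embeddings[j])
        if sim >= sim_threshold:
            edges[(i, j)] = sim
    # Temporal adjacency edges between neighbouring sentences.
    for i in range(len(embeddings) - 1):
        edges[(i, i + 1)] = edges.get((i, i + 1), 0.0) + temporal_weight
    return edges


# Toy vectors standing in for real sentence embeddings (hypothetical data).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = build_sentence_graph(toy_embeddings)
for (i, j), weight in sorted(graph.items()):
    print(f"sentence {i} <-> sentence {j}: weight {weight:.3f}")
```

In this toy example, sentences 0 and 1 and sentences 2 and 3 are linked by both similarity and adjacency, while sentences 1 and 2 are linked by adjacency alone. A weighted edge list of this shape could then be loaded into a graph library such as `networkx` and passed to a Louvain implementation for community detection.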

Together, these features align the codebase with the principles of transparency, interpretability, and reproducibility outlined in the project documentation.
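
For the agreement metric named in the evaluation step, the sketch below computes Cohen's kappa between expert codes and model predictions: observed agreement corrected for the agreement expected by chance. The function name and the example labels are hypothetical. In practice a library routine such as scikit-learn's `cohen_kappa_score` would likely be used instead.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)

    # Observed agreement: fraction of items both raters labelled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Degenerate case (both raters used one identical label throughout);
        # treating it as perfect agreement is a convention chosen here.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical expert codes and model predictions for four sentences.
expert_codes = ["A", "A", "B", "B"]
model_codes = ["A", "A", "B", "A"]
kappa = cohen_kappa(expert_codes, model_codes)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on three of four sentences (observed agreement 0.75) while chance agreement is 0.5, giving a kappa of 0.5.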