This repository contains the code for CCMH (Cross-Condition Mental Health), an intelligent decision support system for mental health text analysis using Blind Source Separation (BSS) methods.
Title: CCMH: An Intelligent System for Cross-Condition Mental Health Text Analysis via Semantic Dictionary Learning
Status: Submitted to Expert Systems With Applications
Authors: Muhammad Usman Khalid, Shafiq ur Rehman, Malik Muhammad Nauman, Hatoon S. AlSagri, Sheikh Naeem Shafqat
This project uses the Reddit Mental Health Dataset by Low et al. (2020).
Download: https://zenodo.org/records/3941387
Citation:
Low, D. M., Rumker, L., Talkar, T., Torous, J., Cecchi, G., & Ghosh, S. S. (2020).
Natural Language Processing Reveals Vulnerable Mental Health Support Groups and
Heightened Health Anxiety on Reddit During COVID-19: Observational Study.
Journal of Medical Internet Research, 22(10), e22635.
- MATLAB R2020a or later
- Python 3.8+ (for sentence embeddings)
- sentence-transformers library (pip install sentence-transformers)
- Required MATLAB toolboxes:
- Statistics and Machine Learning Toolbox
- Signal Processing Toolbox
- Clone this repository:
git clone https://github.com/usmankhalid06/CCMH.git
cd CCMH
- Download the dataset from Zenodo
- Extract the dataset to your working directory
- Install Python dependencies:
pip install sentence-transformers pandas numpy
CCMH/
├── script_Sentence_Transformer_preCovid.m # Main analysis pipeline
├── clean_reddit_post.m # Text preprocessing
├── get_sentence_embeddings.m # Sentence transformer interface
├── find_K_multiple_criteria.m # Dictionary size selection (AIC/BIC)
├── my_KSVD.m # K-SVD algorithm
├── my_ODL.m # Online Dictionary Learning
├── my_ACSD.m # Adaptive Consistent Sequential DL
├── SDL.m # Shared Dictionary Learning (proposed)
├── my_sparse_encode.m # Sparse coding implementation
└── README.md
% Clean and preprocess Reddit posts
cleaned_text = clean_reddit_post(raw_posts);
% Generate 384-dimensional embeddings using all-MiniLM-L6-v2
embeddings = get_sentence_embeddings(cleaned_text);
% Execute complete pipeline
script_Sentence_Transformer_preCovid
This will:
- Load preprocessed data
- Determine optimal dictionary size (K) using AIC/BIC
- Learn dictionaries using all four algorithms
- Perform statistical validation
- Generate figures
- my_KSVD.m - K-SVD dictionary learning
- my_ODL.m - Online dictionary learning with LARS
- my_ACSD.m - Adaptive consistent sequential dictionary learning
- SDL.m - Shared dictionary learning (our proposed method)
K-SVD and ODL call mexOMP and mexLasso from the SPAMS toolbox, which must be downloaded separately from https://thoth.inrialpes.fr/people/mairal/spams/.
- clean_reddit_post.m - Text preprocessing (remove HTML, URLs, formatting)
- get_sentence_embeddings.m - Generate sentence transformer embeddings
- find_K_multiple_criteria.m - Model selection (AIC, BIC, variance explained)
- my_sparse_encode.m - Sparse coding with adaptive L1 regularization
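The preprocessing done by clean_reddit_post.m (removing HTML, URLs, and formatting) can be illustrated with a small Python sketch. This is not a port of the MATLAB function: the name `clean_post` and every regular expression below are assumptions chosen for illustration, and the repository's actual rules may differ.

```python
import html
import re

# Hypothetical patterns; clean_reddit_post.m may use different rules.
_TAG = re.compile(r"<[^>]+>")                       # HTML tags
_MD_LINK = re.compile(r"\[([^\]]*)\]\((?:[^)]*)\)")  # [text](url) -> text
_URL = re.compile(r"(?:https?://|www\.)\S+")        # bare URLs
_MARKUP = re.compile(r"[*_~`>#]+")                  # Markdown emphasis, quotes, headers
_SPACE = re.compile(r"\s+")


def clean_post(text: str) -> str:
    """Strip HTML, URLs, and Markdown formatting from one post."""
    text = html.unescape(text)        # decode entities such as &amp;
    text = _TAG.sub(" ", text)        # drop HTML tags
    text = _MD_LINK.sub(r"\1", text)  # keep the link text, drop the target
    text = _URL.sub(" ", text)        # drop remaining URLs
    text = _MARKUP.sub(" ", text)     # drop formatting characters
    return _SPACE.sub(" ", text).strip()
```

Markdown links are resolved before bare URLs are removed so that the visible link text survives. Note that stripping `_` also splits identifiers such as `my_name`, which may or may not be desirable for a given corpus.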
- Dictionary size (K): Determined by 70% variance explained criterion
- Sparsity (λ): 20 for cross-condition analysis, algorithm-specific for training
- Iterations: 30 for all dictionary learning methods
The analysis generates:
- Learned dictionary atoms for each algorithm
- Activation matrices (11 conditions × K atoms)
- Cross-algorithm validation metrics
- Condition clustering visualizations
- Discriminative atom analysis
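One way to picture the activation matrix (conditions × K atoms) is as a per-condition summary of the sparse codes of individual posts. The sketch below assumes the summary is the mean absolute coefficient per atom. That aggregation rule is an assumption made for illustration, not something this README specifies, and `condition_activations` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def condition_activations(
    codes: Sequence[Sequence[float]], labels: Sequence[str]
) -> Dict[str, List[float]]:
    """Mean absolute sparse coefficient per atom, grouped by condition label.

    codes[i] is the length-K sparse code of post i; labels[i] is its condition.
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for code, label in zip(codes, labels):
        if label not in sums:
            sums[label] = [0.0] * len(code)
        for j, value in enumerate(code):
            sums[label][j] += abs(value)
        counts[label] += 1
    return {label: [s / counts[label] for s in row] for label, row in sums.items()}
```

With 11 condition labels and codes of length K, the returned dictionary holds 11 rows of K values, matching the shape described above.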
To reproduce paper results:
% Ensure dataset is in path
addpath('path/to/reddit/data');
% Run main script
script_Sentence_Transformer_preCovid
% Results will be saved in figures/ directory
For questions or issues, please contact:
- Muhammad Usman Khalid: mukhalid@imamu.edu.sa
- Corresponding Author: malik.nauman@ubd.edu.bn
This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-DDRSP2504).
If you use this code, please cite:
@article{khalid2025ccmh,
title={CCMH: An Intelligent System for Cross-Condition Mental Health Text Analysis via Semantic Dictionary Learning},
author={Khalid, Muhammad Usman and Rehman, Shafiq ur and Nauman, Malik Muhammad and AlSagri, Hatoon S. and Shafqat, Sheikh Naeem},
journal={Expert Systems With Applications},
year={2025},
note={Submitted}
}