# Mental-Health Text Classification: Inference On New Data

This notebook demonstrates how to use the models trained in the notebook `train_deep_learning_models.ipynb` for inference on new posts/comments.

Main goals of this notebook:
- Load all trained models directly from the Hugging Face Hub.
- Set up a quick inference pipeline to make predictions on new text data.
- Provide a link to a deployed Hugging Face app that performs the same task online.

This notebook is organized as follows:

1.  **Packages Loading**: Load necessary packages.
2.  **Models Loading**: Define the Hugging Face Hub repository IDs of the three trained models.
3.  **Inference Pipeline Setup**: Prepare the inference pipeline used for making predictions on new posts/comments locally.
4.  **Infer on New Data**: Infer locally the mental health state of a person from their posts/comments.
5.  **Deployed Gradio App**: Link to the app deployed on Hugging Face with the best trained model.

### **1. Packages Loading**

In [1]:
from transformers import pipeline

### **2. Models Loading**

In [2]:
REPO_ID_ROBERTA = "paragonadey/mh-text-classifier-roberta-base"
REPO_ID_DEBERTA = "paragonadey/mh-text-classifier-deberta-v3-base_v2"
REPO_ID_MODERBERT = "paragonadey/mh-text-classifier-moderbert-large_v2"

### **3. Inference Pipeline Setup**

In [3]:
def create_inference_pipeline(repo_id):
    """
    Creates an inference pipeline for a given model repository ID.

    Args:
        repo_id (str): The repository ID of the model on Hugging Face Hub.

    Returns:
        pipeline: The Hugging Face inference pipeline.
    """
    # Create the pipeline specifying the task, model, and tokenizer.
    text_classifier = pipeline(
        "text-classification",
        model=repo_id,
        tokenizer=repo_id,
    )

    # The pipeline uses `id2label` to map each predicted class index to a label name.
    # The mapping below is the one produced in the notebook `train_deep_learning_models.ipynb`.
    id2label = {
        0: 'EDAnonymous',
        1: 'addiction',
        2: 'adhd',
        3: 'alcoholism',
        4: 'anxiety',
        5: 'autism',
        6: 'bipolarreddit',
        7: 'bpd',
        8: 'depression',
        9: 'healthanxiety',
        10: 'lonely',
        11: 'normal',
        12: 'ptsd',
        13: 'schizophrenia',
        14: 'socialanxiety',
        15: 'suicidewatch'
    }

    # Set the id2label mapping on the pipeline's model config so that the
    # pipeline reports the label names used during training.
    # Every branch of the previous hasattr checks ended up applying this same
    # mapping, so a direct assignment is sufficient.
    text_classifier.model.config.id2label = id2label

    return text_classifier
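
To make the role of `id2label` concrete, here is a minimal, dependency-free sketch of how a row of logits can be turned into a `{'label': ..., 'score': ...}` result. It assumes a single-label classifier whose scores come from a softmax, which is understood to be the default behaviour of the text-classification pipeline for such models. The three-entry `TOY_ID2LABEL` mapping and the logit values are hypothetical and only serve the illustration; they are not the real 16-class mapping or real model outputs.

```python
import math

# Hypothetical toy mapping for illustration (not the real 16-class mapping).
TOY_ID2LABEL = {0: "anxiety", 1: "depression", 2: "normal"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_prediction(logits, id2label):
    """Return the top label and its probability in the pipeline's output shape."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
    return {"label": id2label[best], "score": probs[best]}


# Made-up logits: the second class has the largest value.
prediction = logits_to_prediction([0.2, 2.5, -1.0], TOY_ID2LABEL)
print(prediction["label"])
```

If the mapping stored in the model config did not match the one used at training time, the index lookup in the last step would return the wrong name even though the scores themselves were unchanged.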

### **4. Infer on New Data**

In [4]:
my_depression_sentence = "I am feeling very down and have no motivation."
my_suicide_sentence = "Need to kill some people, then myself."
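
Hugging Face pipelines also accept a list of strings and return one result per input, so several sentences can be classified in a single call. The sketch below is only an illustration: `classify_batch` and `fake_classifier` are hypothetical names, and the stand-in classifier exists solely so the example runs without downloading a model. With a real pipeline, the object returned by `create_inference_pipeline` would be passed instead.

```python
def classify_batch(classifier, texts):
    """Run the classifier once on a list of texts and pair each text with its prediction."""
    texts = list(texts)
    predictions = classifier(texts)  # a pipeline returns one dict per input text
    return list(zip(texts, predictions))


def fake_classifier(texts):
    """Hypothetical stand-in that mimics the pipeline's output shape with a constant label."""
    return [{"label": "normal", "score": 1.0} for _ in texts]


pairs = classify_batch(fake_classifier, ["first post", "second post"])
for text, prediction in pairs:
    print(f"{text!r} -> {prediction['label']}")
```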

In [5]:
roberta_classifier = create_inference_pipeline(REPO_ID_ROBERTA)

print(f"My Depression Sentence: {my_depression_sentence}")
print(f"RoBERTa Prediction On Depression Sentence: {roberta_classifier(my_depression_sentence)[0]}")

print(10*'-------')

print(f"My Suicide Sentence: {my_suicide_sentence}")
print(f"RoBERTa Prediction On Suicide Sentence: {roberta_classifier(my_suicide_sentence)[0]}")

Device set to use cpu


My Depression Sentence: I am feeling very down and have no motivation.
RoBERTa Prediction On Depression Sentence: {'label': 'depression', 'score': 0.9756641387939453}
----------------------------------------------------------------------
My Suicide Sentence: Need to kill some people, then myself.
RoBERTa Prediction On Suicide Sentence: {'label': 'suicidewatch', 'score': 0.686688244342804}


In [6]:
deberta_classifier = create_inference_pipeline(REPO_ID_DEBERTA)

print(f"My Depression Sentence: {my_depression_sentence}")
print(f"DeBERTa Prediction On Depression Sentence: {deberta_classifier(my_depression_sentence)[0]}")

print(10*'-------')

print(f"My Suicide Sentence: {my_suicide_sentence}")
print(f"DeBERTa Prediction On Suicide Sentence: {deberta_classifier(my_suicide_sentence)[0]}")

Device set to use cpu


My Depression Sentence: I am feeling very down and have no motivation.
DeBERTa Prediction On Depression Sentence: {'label': 'depression', 'score': 0.9577535390853882}
----------------------------------------------------------------------
My Suicide Sentence: Need to kill some people, then myself.
DeBERTa Prediction On Suicide Sentence: {'label': 'suicidewatch', 'score': 0.7814931273460388}


In [7]:
moderbert_classifier = create_inference_pipeline(REPO_ID_MODERBERT)

print(f"My Depression Sentence: {my_depression_sentence}")
print(f"ModerBERT Prediction On Depression Sentence: {moderbert_classifier(my_depression_sentence)[0]}")

print(10*'-------')

print(f"My Suicide Sentence: {my_suicide_sentence}")
print(f"ModerBERT Prediction On Suicide Sentence: {moderbert_classifier(my_suicide_sentence)[0]}")

Device set to use cpu


My Depression Sentence: I am feeling very down and have no motivation.
ModerBERT Prediction On Depression Sentence: {'label': 'depression', 'score': 0.9607046842575073}
----------------------------------------------------------------------
My Suicide Sentence: Need to kill some people, then myself.
ModerBERT Prediction On Suicide Sentence: {'label': 'suicidewatch', 'score': 0.44293347001075745}
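
The last prediction above has a score of about 0.44, which means more than half of the probability mass is spread over other labels. In such cases it can help to inspect several candidate labels rather than only the first one. The text-classification pipeline accepts a `top_k` argument for this purpose. The sketch below wraps that call in a hypothetical helper, `top_predictions`. The `fake_classifier` stand-in and its scores are invented placeholders so the example runs without a model; they are not results from any of the three models.

```python
def top_predictions(classifier, text, k=3):
    """Return the k highest-scoring labels for one text, best first."""
    results = classifier(text, top_k=k)
    # Sort defensively so the best label always comes first.
    return sorted(results, key=lambda r: r["score"], reverse=True)


def fake_classifier(text, top_k=1):
    """Hypothetical stand-in with placeholder scores (not real model output)."""
    scores = [
        {"label": "suicidewatch", "score": 0.5},
        {"label": "depression", "score": 0.3},
        {"label": "ptsd", "score": 0.2},
    ]
    return scores[:top_k]


candidates = top_predictions(fake_classifier, "some new post", k=2)
for candidate in candidates:
    print(candidate["label"], candidate["score"])
```

With a real pipeline, `top_predictions(moderbert_classifier, my_suicide_sentence)` would be the corresponding call.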


### **5. Deployed Gradio App**

- The best model, `paragonadey/mh-text-classifier-roberta-base`, was used to build a Gradio web application for inference.

- You can try it out online here: [https://huggingface.co/spaces/paragonadey/mental-health-text-classifier](https://huggingface.co/spaces/paragonadey/mental-health-text-classifier)