
MoodMate: Emotion-Aware NLP Microservice for Mental Wellness Support

MoodMate is an NLP-based microservice that detects emotional states in short user-generated text and generates brief, empathetic support messages. The system demonstrates how Small Language Models (SLMs) can be used effectively to build emotion-aware applications under realistic computational and time constraints.

The project addresses the challenge of extracting emotional intent from natural language and responding in a contextually appropriate manner, using open-source instruction-tuned language models. MoodMate is implemented as a lightweight, fully local microservice, emphasizing efficiency, reproducibility, and privacy.

The sections below present the design, implementation, and evaluation of the system.


🎯 Problem Definition and Motivation

Understanding emotional cues in text is a fundamental problem in Natural Language Processing, with applications in mental wellness support, human–computer interaction, and affective computing. While large language models offer strong performance, they are often impractical due to computational cost and deployment constraints.

This project investigates the following question:

To what extent can small, open-source language models perform emotion classification and generate emotionally appropriate responses in a resource-constrained setting?

MoodMate explores this question by combining supervised emotion classification with conditional text generation, demonstrating that SLMs can deliver meaningful emotional awareness without reliance on large proprietary models.


🧩 System Overview

MoodMate is composed of two primary NLP components exposed through a RESTful API:

  1. Emotion Classification: user input text is classified into one of seven predefined macro-emotions.
  2. Support Message Generation: based on the detected emotion, the system generates a short, empathetic response tailored to the emotional category.

The system follows a modular pipeline:

User Text → Emotion Classifier (SLM) → Emotion Label → Response Generator (SLM) → Support Message

This separation allows independent evaluation and replacement of each NLP component.
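
As a minimal sketch, the flow reduces to two function calls. The names below are illustrative stand-ins for the actual model calls, not the repository's API:

```python
# Minimal sketch of the two-stage pipeline. The bodies are stubs; in MoodMate,
# stage 1 is the LoRA fine-tuned Qwen classifier and stage 2 is Phi-3.5 Mini
# Instruct. Function names here are illustrative, not taken from the repo.

def classify_emotion(text: str) -> str:
    """Stage 1: map raw text to one of the 7 macro-emotions."""
    return "anxiety"  # stub; a real call would run the classifier SLM

def generate_support(text: str, emotion: str) -> str:
    """Stage 2: produce a short empathetic message conditioned on the label."""
    return f"It sounds like you're feeling {emotion}. Be kind to yourself."  # stub

def respond(text: str) -> str:
    emotion = classify_emotion(text)          # Emotion Classifier (SLM)
    return generate_support(text, emotion)    # Response Generator (SLM)

print(respond("I feel overwhelmed with everything lately."))
```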


📌 Features

  • Emotion detection using 7 macro-emotions
  • Supportive text generation using a small LLM
  • RESTful API built with FastAPI
  • Lightweight, privacy-friendly, and fully local
  • Dataset preprocessing pipeline included
  • Reproducible training and evaluation setup

πŸ“ Repository Structure

```
moodmate/
├── api/                 # FastAPI application code
├── training/            # Model training scripts and notebooks
├── evaluation/          # Evaluation scripts
├── data/                # Raw and processed datasets
│   ├── raw/
│   └── processed/
├── models/              # Model weights / LoRA adapters (not committed)
├── docs/                # Documentation material
├── README.md
└── requirements.txt     # Python dependencies for the project
```


📊 Dataset

This project uses the GoEmotions dataset (Google Research).
The original dataset contains 28 fine-grained emotion labels, which are mapped down to 7 macro-emotions for improved accuracy and usability.

Macro-Emotion Mapping

| Macro-Emotion | Original Labels |
| --- | --- |
| joy | joy, excitement, amusement, pride |
| sadness | sadness, disappointment, grief |
| anger | anger, annoyance, frustration |
| anxiety | fear, nervousness, worry |
| love | love, caring, gratitude |
| surprise | surprise |
| neutral | neutral, confusion, curiosity, realization |
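
In code, the mapping is a dictionary lookup. A minimal sketch of how a preprocessing step might apply it (the dict name and the neutral fallback are illustrative, not taken from the repository's scripts):

```python
# Macro-emotion mapping from the table above (illustrative; the actual
# preprocessing code lives in this repository's data pipeline).
FINE_TO_MACRO = {
    "joy": "joy", "excitement": "joy", "amusement": "joy", "pride": "joy",
    "sadness": "sadness", "disappointment": "sadness", "grief": "sadness",
    "anger": "anger", "annoyance": "anger", "frustration": "anger",
    "fear": "anxiety", "nervousness": "anxiety", "worry": "anxiety",
    "love": "love", "caring": "love", "gratitude": "love",
    "surprise": "surprise",
    "neutral": "neutral", "confusion": "neutral",
    "curiosity": "neutral", "realization": "neutral",
}

def to_macro(label: str) -> str:
    # Fall back to neutral for any unmapped fine-grained label (assumption).
    return FINE_TO_MACRO.get(label, "neutral")
```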

Processed Dataset

The processed dataset is provided at:

data/processed/goemotions_final.csv

This enables full reproducibility without rerunning preprocessing steps.


🧠 Models

Emotion Classifier

  • Qwen 1.5 (0.5B), fine-tuned using LoRA
  • Predicts one of the 7 macro-emotions

Support Message Generator

  • Phi-3.5 Mini Instruct
  • Generates short, empathetic responses conditioned on emotion

Trained model weights are not included in this repository due to size constraints.

To run the API locally:

  1. Train the model using the provided training scripts, or
  2. Download the LoRA adapter and place it in: models/qwen_emotion_lora/

βš™οΈ Installation

Requirements

  • Python 3.9+
  • Virtual environment recommended
  • CPU/GPU execution supported

Setup

```bash
git clone https://github.com/nyarderr/moodmate.git
cd moodmate

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

🚀 Usage

Start the API

From the project root:

```bash
uvicorn api.main:app
```

Once running, open the interactive API docs at http://127.0.0.1:8000/docs

Analyze Emotion

Endpoint: /analyze_emotion

Request:

```json
{
  "text": "I feel overwhelmed with everything lately."
}
```

Response:

```json
{
  "emotion": "anxiety"
}
```

Generate Support Message

Endpoint: /generate_support

Request:

```json
{
  "text": "I feel overwhelmed with everything lately.",
  "emotion": "anxiety"
}
```

Response:

```json
{
  "support_message": "It’s understandable to feel overwhelmed. Try taking things one step at a time and be kind to yourself."
}
```
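
For a scripted client, the same calls can be made with Python's requests package (an extra dependency; POST with a JSON body is assumed from the schemas above):

```python
# Example client for both endpoints. Assumes the API is running locally on the
# default port and that each endpoint accepts POST with a JSON body.
import requests

BASE = "http://127.0.0.1:8000"
text = "I feel overwhelmed with everything lately."

emotion = requests.post(f"{BASE}/analyze_emotion", json={"text": text}).json()["emotion"]
message = requests.post(
    f"{BASE}/generate_support", json={"text": text, "emotion": emotion}
).json()["support_message"]
print(emotion, "->", message)
```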

🧠 Model Artifacts

Due to size constraints, trained model weights are not included in this repository.

  • The emotion classifier was fine-tuned in Google Colab using LoRA.
  • Only LoRA adapter weights are required for inference.
  • Model loading instructions are provided in the API code.
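
A minimal loading sketch with transformers and peft, assuming the adapter sits in models/qwen_emotion_lora/ and the base checkpoint is the public Qwen/Qwen1.5-0.5B (neither is verified against the API code):

```python
# Sketch: attach the LoRA adapter to the frozen base model for inference.
# Base model ID and adapter path are assumptions based on this README.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen1.5-0.5B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, "models/qwen_emotion_lora/")
model.eval()  # adapter weights are applied on top of the frozen base weights
```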

The API supports concurrent requests at the server level, but inference is serialized per worker because language model inference is compute-bound. In production, this would be scaled horizontally using multiple workers or replicas (e.g. `uvicorn api.main:app --workers 4`).


πŸ‹οΈ Model Training

This project fine-tunes Qwen (a decoder-only language model) using LoRA (Low-Rank Adaptation) to classify text into the 7 macro-emotions defined during dataset preprocessing.

🔹 Model Used

  • Base Model: Qwen 1.5 (0.5B)
  • Fine-Tuning Method: LoRA
  • Task Type: Instruction-style causal language modeling
  • Objective: Train Qwen to generate the correct emotion label from a prompt.

Prompt (input):

```
Instruction: Identify the emotion of the following text.
Text: I’m really stressed about tomorrow.
Emotion:
```

Target (expected output): anxiety

The model learns to predict the correct emotion after the "Emotion:" token.
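
A sketch of this template as a helper function (the function name is illustrative, not the repository's):

```python
def build_prompt(text: str) -> str:
    """Build the instruction-style prompt described above."""
    return (
        "Instruction: Identify the emotion of the following text.\n"
        f"Text: {text}\n"
        "Emotion:"
    )

# At training time the gold label is appended after "Emotion:";
# at inference time the model generates the label itself.
```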


🧩 Why LoRA?

LoRA allows efficient fine-tuning by updating only ~1–2% of the model parameters.

Benefits:

  • Fits on free Google Colab GPU (T4)
  • Produces a small adapter file
  • Leaves original Qwen weights unchanged
  • Faster training and lower memory use
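
A representative peft configuration; the hyperparameters (rank, alpha, dropout, target modules) are plausible defaults, not values copied from the training notebook:

```python
# Representative LoRA setup with peft (hyperparameters are assumptions).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B")
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                   # rank of the low-rank update
    lora_alpha=16,                         # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections (assumed)
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()         # typically ~1-2% of all parameters
```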

📊 Training Data Format

The processed dataset contains:

| text | label |
| --- | --- |
| "I'm overwhelmed with school." | anxiety |
| "This is amazing!" | joy |

During tokenization, the prompt and label are combined into a single sequence:

```
Instruction: Identify the emotion of the following text.
Text: <input text>
Emotion: <label>
```

To prevent the model from learning to reproduce the prompt itself, all prompt tokens are masked with -100 (the index ignored by the cross-entropy loss), so the loss is computed only on the label tokens.
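
A sketch of that tokenization step, reusing the build_prompt helper from the Model Training section (assumes a HuggingFace tokenizer; not copied from the notebook):

```python
def tokenize_example(tokenizer, text: str, label: str) -> dict:
    """Tokenize prompt + label; mask prompt positions so loss sees only the label."""
    prompt_ids = tokenizer(build_prompt(text), add_special_tokens=False)["input_ids"]
    label_ids = tokenizer(" " + label, add_special_tokens=False)["input_ids"]
    return {
        "input_ids": prompt_ids + label_ids,
        # -100 is ignored by the cross-entropy loss, so prompt tokens contribute
        # nothing to training and only the label tokens are learned.
        "labels": [-100] * len(prompt_ids) + label_ids,
    }
```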


βš™οΈ Training Pipeline Overview

The fine-tuning workflow includes:

  1. Load processed dataset (7 macro-emotion labels).
  2. Construct instruction-style prompts for each example.
  3. Tokenize the combined prompt and label into model input.
  4. Mask prompt tokens using -100 so only label tokens influence training.
  5. Apply LoRA adapters to the base Qwen model.
  6. Fine-tune using HuggingFace Trainer, updating only LoRA layers.
  7. Save the trained LoRA adapter for use in the API.
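
Steps 5-7 compress to a few lines with the HuggingFace Trainer. A sketch continuing the names from the earlier snippets (model and tokenizer from the LoRA setup, tokenize_example from the masking section); hyperparameters are illustrative, not the notebook's actual values:

```python
from transformers import Trainer, TrainingArguments, DataCollatorForSeq2Seq

# `tokenized_train` is assumed to be the dataset mapped through tokenize_example.
args = TrainingArguments(
    output_dir="models/qwen_emotion_lora",
    per_device_train_batch_size=8,        # illustrative hyperparameters
    num_train_epochs=3,
    learning_rate=2e-4,
)
trainer = Trainer(
    model=model,                          # Qwen wrapped with LoRA adapters (step 5)
    args=args,
    train_dataset=tokenized_train,
    # Pads inputs, and pads labels with -100 so the masking survives batching.
    data_collator=DataCollatorForSeq2Seq(tokenizer, padding=True),
)
trainer.train()                           # only LoRA layers receive gradients (step 6)
model.save_pretrained("models/qwen_emotion_lora")   # saves the adapter only (step 7)
```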

🚀 Reproducible Training

Training can be reproduced using the notebook: training/train_qwen_lora.ipynb

The notebook:

  • Loads Qwen in 4-bit mode for memory efficiency
  • Applies LoRA configuration
  • Tokenizes dataset using the masking strategy
  • Runs fine-tuning
  • Saves LoRA weights

You may then upload the adapter to the HuggingFace Hub or place it in: models/qwen_emotion_lora/ (the path expected by the API, per the Models section).
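
The 4-bit load in the first bullet above is typically done with bitsandbytes through transformers; a sketch with assumed quantization settings (the notebook's exact settings may differ):

```python
# Sketch of 4-bit base-model loading (settings assumed, not copied from the
# notebook). Requires the bitsandbytes package and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # assumed quantization type
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B",                     # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",                       # e.g. a free Colab T4 GPU
)
```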


📈 Evaluation and Results

The emotion classification model was evaluated on a held-out test set of 500 samples from the processed GoEmotions dataset.

Quantitative Performance

  • Overall Accuracy: 0.70
  • Macro F1-score: 0.59
  • Weighted F1-score: 0.69

| Emotion | Precision | Recall | F1-score |
| --- | --- | --- | --- |
| anger | 0.63 | 0.55 | 0.59 |
| anxiety | 0.60 | 0.35 | 0.44 |
| joy | 0.64 | 0.69 | 0.66 |
| love | 0.71 | 0.72 | 0.71 |
| neutral | 0.74 | 0.84 | 0.79 |
| sadness | 0.75 | 0.27 | 0.40 |
| surprise | 0.33 | 0.26 | 0.29 |
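
Metrics of this kind can be recomputed with scikit-learn once predictions are collected; a minimal sketch (tiny stand-in values for y_true/y_pred so the snippet runs):

```python
from sklearn.metrics import accuracy_score, classification_report

# In practice y_true/y_pred come from running the classifier on the 500-sample
# held-out test set; stand-in values are used here purely for illustration.
y_true = ["anxiety", "joy", "neutral", "sadness"]
y_pred = ["anxiety", "joy", "neutral", "neutral"]

print("accuracy:", accuracy_score(y_true, y_pred))
# Per-class precision/recall/F1 plus macro and weighted averages.
print(classification_report(y_true, y_pred, digits=2))
```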

Observations

  • The model performs strongest on neutral, love, and joy, which are well-represented in the dataset.
  • Lower performance on anxiety and surprise is expected due to limited sample size and semantic overlap with other emotions.
  • Overall results demonstrate that LoRA fine-tuning of a small language model can achieve reasonable performance for multi-class emotion detection in resource-constrained settings.

📚 Acknowledgements

  • GoEmotions Dataset (Google Research)
  • Qwen & Phi models (Open-source)
  • HuggingFace Transformers library
