# Music Emotion Volatility Prediction


Predicting emotional volatility in music using acoustic features and machine learning.

## 🎵 Overview

This research investigates whether the emotional volatility of music (how much listeners' emotional responses vary) can be predicted from acoustic features, and whether this prediction differs from predicting average emotional responses.

While most music emotion recognition focuses on mean responses, this study explores the often-overlooked dimension of emotional variance, revealing which songs evoke consistent emotions versus those producing highly varied listener experiences.

### Key Question

Can we predict which songs are emotionally "robust" (consistent across listeners) versus "fragile" (highly variable responses) using only acoustic features?

## 📊 Dataset

### PMEmo (Personalized Music Emotion Dataset)

- 767 songs with comprehensive acoustic features
- 6,373 initial acoustic features per song
- Listener ratings across multiple contexts
- Emotion labels on Arousal and Valence dimensions

Dataset link: https://www.kaggle.com/datasets/deshadsithsara/pmemo-2018-dataset
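
The four prediction targets are per-song summary statistics of listener ratings: the mean captures the average response, and the standard deviation captures volatility. A minimal sketch of how such targets can be built with pandas (the column names and values here are illustrative, not the actual PMEmo schema):

```python
import pandas as pd

# Illustrative ratings table: one row per (song, listener) annotation
ratings = pd.DataFrame({
    "song_id": [1, 1, 1, 2, 2, 2],
    "arousal": [0.8, 0.7, 0.9, 0.2, 0.8, 0.5],
    "valence": [0.6, 0.6, 0.5, 0.1, 0.9, 0.4],
})

# Aggregate per song: the mean gives the average emotional response,
# the standard deviation gives the emotional volatility
targets = ratings.groupby("song_id").agg(
    mean_A=("arousal", "mean"), std_A=("arousal", "std"),
    mean_V=("valence", "mean"), std_V=("valence", "std"),
)
```

In this toy table, song 2 draws more scattered arousal ratings than song 1, so its std_A is higher even though both songs have well-defined means.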

## 🔬 Research Hypothesis

### H1: Emotional Volatility Is Predictable

Acoustic structure predicts variance of emotion better than, or at least comparably to, mean emotion.

**Spoiler:** This hypothesis was not supported, leading to valuable insights about the nature of music emotion.

πŸ› οΈ Methodology

### Feature Engineering

- **Static Features:** 6,373 acoustic features per song
- **Dynamic Features:** Temporal statistics derived from acoustic properties

### Feature Selection Pipeline

#### Stage 1: Unsupervised Pruning

1. **Variance Filtering:** Removed 279 near-constant features (variance < 1e-6)
2. **Collinearity Reduction:** Eliminated 2,254 redundant features (correlation > 0.95)
3. **Result:** 6,373 → 3,840 features
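
The repository's `feature_selection.py` is not reproduced here, but the two pruning steps can be sketched as follows (thresholds match those above; `prune_features` is an illustrative helper, not the repo's API):

```python
import numpy as np
import pandas as pd

def prune_features(X, var_threshold=1e-6, corr_threshold=0.95):
    """Unsupervised pruning: drop near-constant columns, then redundant ones."""
    # Step 1: variance filtering removes near-constant features
    X = X.loc[:, X.var() > var_threshold]
    # Step 2: collinearity reduction scans the upper triangle of the absolute
    # correlation matrix and drops the later column of each correlated pair
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > corr_threshold).any()]
    return X.drop(columns=to_drop)

# Toy check: one constant column and one duplicated (perfectly correlated) column
X = pd.DataFrame({
    "const": [1.0, 1.0, 1.0, 1.0],
    "a": [1.0, 2.0, 3.0, 4.0],
    "a_copy": [2.0, 4.0, 6.0, 8.0],  # corr(a, a_copy) = 1.0
    "b": [4.0, 1.0, 3.0, 2.0],
})
X_pruned = prune_features(X)  # keeps only "a" and "b"
```

Dropping the later column of each correlated pair is one common convention; the counts above (279 and 2,254 removed) depend on which member of each pair is kept.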

#### Stage 2: Mutual Information Ranking

- Measured dependency between features and targets
- Applied Pareto selection (top 20% of total MI)
- Features selected per target:
  - mean_A: 135 features
  - mean_V: 172 features
  - std_A: 184 features
  - std_V: 144 features

**Critical Finding:** Zero intersection among all four targets: predicting mean versus variance requires fundamentally different acoustic information.
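
One plausible reading of the Pareto cutoff (keep the top-ranked features that together account for 20% of the total mutual information) can be sketched with scikit-learn; `pareto_select` is a hypothetical helper and the cumulative-MI reading is an assumption, not the repo's documented rule:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def pareto_select(X, y, frac=0.20, random_state=0):
    """Rank features by mutual information with the target and keep the
    smallest top-ranked set whose MI sums to `frac` of the total."""
    mi = mutual_info_regression(X, y, random_state=random_state)
    order = np.argsort(mi)[::-1]  # highest-MI features first
    k = int(np.searchsorted(np.cumsum(mi[order]), frac * mi.sum())) + 1
    return set(order[:k].tolist())

# Toy check: y depends only on feature 0, so it alone carries almost all the MI
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0]
selected = pareto_select(X, y)
```

Running this once per target and intersecting the four sets (`set.intersection(*sets)`) is how a zero-overlap finding like the one above can be checked.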

### Model Training

- **Algorithm:** ElasticNet regression
- **Split:** 70% train / 20% validation / 10% test
- **Preprocessing:** Standardization after the split (prevents data leakage)
- **Hyperparameter Tuning:** Alpha and L1 ratio optimized on the validation set
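
The recipe above can be sketched end-to-end on synthetic data (the hyperparameter grid and data are illustrative; the repo's actual scripts are not reproduced here):

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = X[:, 0] * 2.0 - X[:, 1] + rng.normal(scale=0.1, size=300)

# 70/20/10 split: carve off 30%, then split it 2:1 into validation/test
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, test_size=0.30, random_state=42)
X_val, X_te, y_val, y_te = train_test_split(X_tmp, y_tmp, test_size=1/3, random_state=42)

# Standardize AFTER splitting: fit the scaler on training data only
scaler = StandardScaler().fit(X_tr)
X_tr_s, X_val_s, X_te_s = map(scaler.transform, (X_tr, X_val, X_te))

# Tune alpha and l1_ratio by validation R², then report held-out test R²
best = max(
    ((a, l1) for a in (0.001, 0.01, 0.1) for l1 in (0.1, 0.5, 0.9)),
    key=lambda p: r2_score(
        y_val, ElasticNet(alpha=p[0], l1_ratio=p[1]).fit(X_tr_s, y_tr).predict(X_val_s)),
)
model = ElasticNet(alpha=best[0], l1_ratio=best[1]).fit(X_tr_s, y_tr)
test_r2 = r2_score(y_te, model.predict(X_te_s))
```

Fitting the scaler on the training split alone is the leakage guard the bullet list refers to: validation and test statistics never influence the transform.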

## 📈 Results

### Static Feature Dataset

| Target | Train R² | Validation R² | Test R² | Interpretation |
|--------|----------|---------------|---------|----------------|
| mean_A | 0.797 | 0.780 | 0.707 | ✅ Excellent: mean arousal highly predictable |
| mean_V | 0.551 | 0.562 | 0.493 | ⚠️ Moderate: mean valence predictable but weaker |
| std_A | 0.234 | 0.039 | 0.220 | ❌ Low: arousal variance difficult to predict |
| std_V | 0.107 | 0.029 | -0.028 | ❌ Unpredictable: valence variance not captured |

### Dynamic Feature Dataset

| Target | Test R² | Best Alpha | Best L1 Ratio |
|--------|---------|------------|---------------|
| mean_A | 0.657 | 0.001 | 0.1 |
| mean_V | 0.506 | 0.001 | 0.1 |
| std_A | 0.211 | 0.001 | 0.9 |
| std_V | -0.009 | 0.01 | 0.7 |

πŸ” Key Findings

### 1. Acoustic Features Predict Mean, Not Variance

- Acoustic features predict mean arousal very well (test R² = 0.707)
- Acoustic features moderately predict mean valence (test R² = 0.493)
- Acoustic features fail to predict emotional volatility (R² ≈ 0 or negative)
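
A test R² at or below zero, as for std_V, means the model does no better than predicting a single constant for every song. scikit-learn's `r2_score` makes this concrete (toy numbers, chosen only to illustrate the metric):

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([0.2, 0.5, 0.3, 0.8])

# Predicting the mean of y_true for every sample gives R² = 0 exactly
baseline = np.full_like(y_true, y_true.mean())
r2_baseline = r2_score(y_true, baseline)

# Predictions worse than that constant baseline push R² below zero
worse = np.array([0.9, 0.1, 0.9, 0.1])
r2_worse = r2_score(y_true, worse)
```

Unlike a squared correlation, this R² is unbounded below, so a negative test score is a meaningful verdict: the acoustic model carries no usable signal about valence variance.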

### 2. Different Features for Different Targets

The zero intersection in feature selection reveals that:

- Features that predict average emotion are completely different from those selected for variance
- Mean emotion is largely determined by acoustic structure
- Emotional volatility appears to arise from listener-specific factors

### 3. Implications for Music Recommendation

- Current systems relying solely on acoustic features can predict general emotional tone
- They cannot predict how individual listeners will respond differently
- True personalization requires both acoustic properties and listener-specific context

## 💡 Conclusion

Emotional volatility appears to arise from listener-specific factors (personal history, context, mood, cultural background) rather than from the acoustic structure of the music itself.

This research demonstrates that:

- Some aspects of musical emotion are acoustically determined (mean responses)
- Others emerge from the interaction between music and individual listeners (emotional variance)

The same frequencies, rhythms, and beats create a consistent baseline vibe, but whether you personally connect with that vibe depends on who you are and where you've been.

## 🚀 Installation & Usage

```bash
# Clone the repository
git clone https://github.com/deshadspace/music-emotion-volatility.git
cd music-emotion-volatility

# Install dependencies
pip install -r requirements.txt

# Run feature selection pipeline
python feature_selection.py --dataset pmemo --variance-threshold 1e-6 --correlation-threshold 0.95

# Train models
python train.py --model elasticnet --split 70-20-10 --features static

# Evaluate results
python evaluate.py --model-path models/elasticnet_static.pkl
```

πŸ“ Project Structure

```text
music-emotion-volatility/
├── data/
│   ├── pmemo/                  # PMEmo dataset
│   └── processed/              # Processed features
├── src/
│   ├── feature_selection.py   # Feature pruning & MI ranking
│   ├── train.py               # Model training pipeline
│   ├── evaluate.py            # Model evaluation
│   └── utils.py               # Helper functions
├── notebooks/
│   ├── EDA.ipynb              # Exploratory data analysis
│   ├── Feature_Analysis.ipynb # Feature importance analysis
│   └── Results.ipynb          # Visualization of results
├── models/                     # Trained models
├── results/                    # Plots and metrics
├── requirements.txt
└── README.md
```

## 🔧 Requirements

```text
python>=3.8
numpy>=1.21.0
pandas>=1.3.0
scikit-learn>=1.0.0
scipy>=1.7.0
matplotlib>=3.4.0
seaborn>=0.11.0
```

## 📚 Citation

If you use this code or findings in your research, please cite:

```bibtex
@misc{senevirathne2025musicemotion,
  title={Music Emotion Volatility Prediction: A Feature-Based Analysis},
  author={Senevirathne, Deshad},
  year={2025},
  howpublished={\url{https://github.com/deshadspace/music-emotion-volatility}}
}
```

## 🎯 Future Work

- Incorporate listener metadata and demographic information
- Explore deep learning architectures for temporal modeling
- Investigate cultural and contextual factors in emotional variance
- Develop hybrid models combining acoustic and listener-specific features

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## 👤 Author

Deshad Senevirathne

πŸ™ Acknowledgments

- PMEmo dataset creators for providing comprehensive music emotion data
- The music information retrieval research community
- Open-source contributors to scikit-learn and the Python ecosystem

⭐ If you found this research interesting, please consider giving it a star!
