FraAmato/MMM-paper

MMM - Clustering Multivariate Longitudinal Mixed-type Data

This repository contains the code for the paper "MMM - Clustering Multivariate Longitudinal Mixed-type Data".

About

This work introduces the Mixture of Mixed-Matrices (MMM) model for clustering multivariate longitudinal data with mixed variable types (continuous, ordinal, binary, nominal, count). The model assumes underlying latent continuous variables and uses matrix-variate normal distributions to handle temporal dependencies without conditional independence assumptions.

Preprint: https://hal.science/hal-04807626v1 (HAL), https://arxiv.org/abs/2509.12166v1 (arXiv)
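The latent-variable idea described above can be illustrated with a small base-R sketch. It is not the repository's implementation: the helper `rmatnorm`, the covariance matrices `Sigma` and `Psi`, and the thresholds are all hypothetical choices made for illustration. It draws one latent matrix (variables × time points) from a matrix-variate normal distribution and then discretizes two rows to mimic a binary and an ordinal variable.

```r
set.seed(1)

p     <- 3   # number of variables (rows); illustrative
T_len <- 5   # number of time points (columns); illustrative

# Assumed covariance structures, chosen only for this sketch:
# compound symmetry across variables, AR(1)-type correlation across time.
Sigma <- diag(p) * 0.5 + 0.5
rho   <- 0.6
Psi   <- rho^abs(outer(seq_len(T_len), seq_len(T_len), "-"))

# Hypothetical helper: one draw from a matrix-variate normal with mean M,
# row covariance Sigma and column covariance Psi.
# chol() returns the upper factor R with t(R) %*% R equal to its argument.
rmatnorm <- function(M, Sigma, Psi) {
  Z <- matrix(rnorm(length(M)), nrow(M), ncol(M))
  M + t(chol(Sigma)) %*% Z %*% chol(Psi)
}

M <- matrix(0, p, T_len)
Y <- rmatnorm(M, Sigma, Psi)   # latent continuous matrix

# Observed mixed-type matrix: row 1 stays continuous, row 2 is cut at an
# arbitrary threshold to give a binary variable, row 3 is cut into three
# ordered levels (thresholds are arbitrary).
X <- Y
X[2, ] <- as.integer(Y[2, ] > 0)
X[3, ] <- cut(Y[3, ], breaks = c(-Inf, -0.5, 0.5, Inf), labels = FALSE)

print(round(X, 2))
```

Because the whole latent matrix is drawn jointly, the observed rows stay correlated across both variables and time, which is the point of avoiding a conditional independence assumption.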

Repository Structure

├── README.md
├── Real_data/                  # S&P 500 real data analysis               
│   ├── Data.zip                # Compressed raw data - UNZIP BEFORE USE
│   ├── Fitting & Analysis/     # Model fitting and results analysis scripts
│   ├── Images/                 # Generated plots and visualizations
│   └── Results/                # Fitted model outputs (.RData)
├── renv/                       # renv package library (auto-managed)
├── renv.lock                   # Lockfile with exact package versions
├── SETUP.md                    # Detailed environment setup instructions
├── Simulations/                # Simulation studies
│   ├── Data/                   # Synthetic datasets and generation scripts
│   └── Results/                # Simulation outputs and performance metrics
└── Software_tools/             # Core algorithm implementations
    ├── EM_mixed_par.R          # Main MMM algorithm (MCMC-EM)
    ├── EM_MMN.R                # EM algorithm for Matrix-variate Normal mixtures
    └── PerfEval.R              # Performance evaluation functions

Reproducing the Environment

This project uses renv to ensure reproducible package dependencies. To set up the exact environment:

# 1. Install renv (if not already installed)
install.packages("renv")

# 2. Restore all dependencies
renv::restore()

For detailed setup instructions, troubleshooting, and system requirements, see SETUP.md.

Important Setup Note

⚠️ Before running any code: The data in the Real_data/ folder is compressed. You must unzip Data.zip before running the analysis scripts.
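The archive can also be extracted from within R. The snippet below is a minimal sketch that assumes the working directory is the repository root and that the scripts expect the extracted files inside `Real_data/`; adjust `exdir` if they look elsewhere.

```r
# Path to the compressed data, relative to the repository root (assumed).
zip_path <- file.path("Real_data", "Data.zip")

if (file.exists(zip_path)) {
  # Extract next to the archive; the target folder is an assumption.
  unzip(zip_path, exdir = "Real_data")
} else {
  message("Data.zip not found; run this from the repository root.")
}
```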

Core MMM Implementation

The main MMM algorithm is implemented in Software_tools/EM_mixed_par.R. This file contains the MCMC-EM algorithm for fitting the Mixture of Mixed-Matrices model.
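One way to inspect what this file provides, without assuming any function names, is to source it into a separate environment and list the objects it defines. This is only a sketch: it assumes the working directory is the repository root and that the renv environment has been restored, since the script may load packages when sourced.

```r
# Source the implementation into its own environment so nothing
# in the global workspace is overwritten.
mmm_env  <- new.env()
src_path <- file.path("Software_tools", "EM_mixed_par.R")

if (file.exists(src_path)) {
  source(src_path, local = mmm_env)
  print(ls(mmm_env))   # names of the functions and objects defined by the file
} else {
  message("EM_mixed_par.R not found; run this from the repository root.")
}
```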

Data Description

S&P 500 Data (Real_data/Data.zip)

  • LogReturns: Continuous (yearly log-returns)
  • Grades: Ordinal (Underperform/Neutral/Buy from Bank of America)
  • Dividends: Binary (dividend paid or not)
  • Volume: Count (millions of shares traded)
  • Period: 2019-2023 (330 companies × 5 years)

Simulation Data (Simulations/Data/)

  • Mixed-type matrices: continuous, ordinal (5 levels), binary, count
  • Sample sizes: N ∈ {100, 500, 1000}
  • True clusters: K = 2
  • Noise scenarios: 0%, 10%, 20%
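The simulation settings listed above can be laid out as a design grid. The sketch below assumes that sample sizes and noise levels are fully crossed, which the list suggests but does not state; the column names are illustrative.

```r
# Assumed full crossing of sample size and noise proportion.
design <- expand.grid(
  N     = c(100, 500, 1000),
  noise = c(0, 0.10, 0.20)
)
design$K <- 2   # true number of clusters, as stated above

print(design)
nrow(design)    # number of scenarios under the full-crossing assumption
```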
