HiCD: Hyperbolic Insight through Decomposed Educational Graphs for Long-Tailed Cognitive Diagnosis

Overview

HiCD is a hyperbolic geometry–based cognitive diagnosis model designed to address long-tail sparsity and semantic heterogeneity in educational graphs.
Unlike traditional Euclidean or single-graph methods, HiCD segments the educational graph into semantically distinct subgraphs and embeds students, exercises, and concepts into curvature-adaptive hyperbolic spaces. This preserves hierarchical and power-law structures while improving diagnostic accuracy and robustness.

Key Features

Hyperbolic Embedding for Long-Tail Sparsity: Learns representations in hyperbolic (Lorentz) space to preserve hierarchical and tree-like relations in sparse educational graphs.
Curvature-Aware Graph Decomposition: Decomposes the educational graph into three semantically distinct subgraphs (correct, incorrect, and exercise–concept) and assigns adaptive curvature to each.
Multi-Level Fusion Mechanism: Integrates semantic and structural embeddings across subgraphs through attention and knowledge-aware fusion.
Hyperbolic Diagnostic Function: Defines prediction functions directly in hyperbolic space using either Möbius subtraction or Lorentzian distance.
Comprehensive Evaluation & Robustness: Reports state-of-the-art results on two datasets (ASSIST-0910 and Junyi), especially in sparse and long-tail scenarios.

Problem Statement

Educational data are characterized by high sparsity and semantic heterogeneity:

Long-tail Sparsity:

Most students and concepts are linked to very few exercises, which makes representation learning unstable.

Heterogeneous Semantics:

Different edge types (e.g., correct/incorrect responses, exercise–concept links) exhibit distinct structures and distributions, which a single embedding space cannot represent faithfully.

Traditional cognitive diagnosis (CD) models (IRT, NCDM, RCD, SVGCD) do not fully capture this structural diversity, which leads to biased or distorted embeddings. HiCD addresses both challenges through hyperbolic geometry and graph decomposition.

Solution: HiCD Framework

HiCD addresses the above issues through three coordinated modules:

Graph Decomposition

Splits the educational graph into three subgraphs:
- (a) students’ correct graph 𝐺𝑐𝑟,
- (b) students’ incorrect graph 𝐺𝑖𝑐𝑟,
- (c) exercise–concept association graph 𝐺𝑒𝑐.
Each subgraph captures unique semantic and structural relationships.
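
The decomposition described above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the `Log` record, the `decompose` function, and the dictionary form of the Q-matrix are hypothetical names and layouts chosen for the example.

```python
from collections import namedtuple

# Hypothetical record layout: one row per student response.
Log = namedtuple("Log", ["student", "exercise", "correct"])


def decompose(logs, q_matrix):
    """Split response logs and a Q-matrix into three edge lists.

    Assumption: `correct` is 1 for a right answer and 0 for a wrong one,
    and `q_matrix` maps each exercise id to the concept ids it covers.
    """
    g_cr = [(log.student, log.exercise) for log in logs if log.correct == 1]
    g_icr = [(log.student, log.exercise) for log in logs if log.correct == 0]
    g_ec = [(ex, c) for ex, concepts in q_matrix.items() for c in concepts]
    return g_cr, g_icr, g_ec


logs = [Log(0, 10, 1), Log(0, 11, 0), Log(1, 10, 0), Log(1, 12, 1)]
q_matrix = {10: [100], 11: [100, 101], 12: [101]}
g_cr, g_icr, g_ec = decompose(logs, q_matrix)
```

Each edge list can then be handed to its own encoder, so the correct, incorrect, and exercise–concept relations never share a single adjacency structure.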

Hyperbolic Representation Learning

Each subgraph is mapped into a Lorentz-model hyperbolic space with an estimated curvature 𝑘𝑖.
Message passing is performed in the tangent space (via GraphSAGE), and the results are projected back to hyperbolic space.
Multi-level fusion integrates semantic information across subgraphs through attention and Q-matrix–based enhancement.
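
The map-aggregate-project pattern can be illustrated with the standard exponential and logarithmic maps at the origin of the Lorentz model. The sketch below is a simplification under stated assumptions: it uses a plain mean over a node and its neighbours instead of a learned GraphSAGE layer, it treats `c > 0` as the magnitude of the (negative) curvature, and all function names are hypothetical.

```python
import math


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def expmap0(u, c=1.0):
    """Map a tangent vector u at the origin onto the hyperboloid <x, x> = -1/c."""
    sqrt_c = math.sqrt(c)
    norm = math.sqrt(sum(v * v for v in u))
    if norm < 1e-12:
        return [1.0 / sqrt_c] + [0.0] * len(u)
    scale = math.sinh(sqrt_c * norm) / (sqrt_c * norm)
    return [math.cosh(sqrt_c * norm) / sqrt_c] + [scale * v for v in u]


def logmap0(x, c=1.0):
    """Inverse of expmap0: pull a hyperboloid point back to the tangent space."""
    sqrt_c = math.sqrt(c)
    space = x[1:]
    norm = math.sqrt(sum(v * v for v in space))
    if norm < 1e-12:
        return [0.0] * len(space)
    dist = math.acosh(max(sqrt_c * x[0], 1.0)) / sqrt_c
    return [dist * v / norm for v in space]


def aggregate(node, neighbors, c=1.0):
    """Mean-aggregate in the tangent space, then project back.

    Assumption: an unweighted mean stands in for a learned aggregator.
    """
    tangent = [logmap0(p, c) for p in [node] + neighbors]
    mean = [sum(col) / len(tangent) for col in zip(*tangent)]
    return expmap0(mean, c)


student = expmap0([0.3, -0.2], c=0.5)
exercise = expmap0([1.0, 0.5], c=0.5)
updated = aggregate(student, [exercise], c=0.5)
```

Because the aggregation happens in a flat tangent space, ordinary vector operations apply; only the final `expmap0` call needs to know the curvature of the subgraph.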

Hyperbolic Diagnosis

Two geometric diagnostic functions are defined:
- (a) Möbius Subtraction (HiCD-sub) — models ability–difficulty difference in the Poincaré ball,
- (b) Lorentz Distance (HiCD-dist) — computes hyperbolic distance directly in Lorentz space for stability.
The Fermi–Dirac decoder converts geometric distances into probabilistic predictions.
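
As a rough illustration of the distance-based variant, the sketch below computes the Lorentz geodesic distance and passes it through a Fermi–Dirac function. The radius `r`, temperature `t`, the `lift` helper, and the choice to feed the plain (not squared) distance to the decoder are assumptions made for the example, not values or conventions confirmed by the repository.

```python
import math


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lift(u, c=1.0):
    """Place spatial coordinates u on the hyperboloid <x, x> = -1/c."""
    return [math.sqrt(1.0 / c + sum(v * v for v in u))] + list(u)


def lorentz_distance(x, y, c=1.0):
    """Geodesic distance between two hyperboloid points."""
    arg = max(-c * lorentz_inner(x, y), 1.0)  # clamp for numerical safety
    return math.acosh(arg) / math.sqrt(c)


def fermi_dirac(dist, r=2.0, t=1.0):
    """Turn a distance into a probability; r and t are hypothetical defaults."""
    return 1.0 / (math.exp((dist - r) / t) + 1.0)


student = lift([0.1, 0.2])
near_exercise = lift([0.1, 0.25])
far_exercise = lift([2.0, -1.5])

p_near = fermi_dirac(lorentz_distance(student, near_exercise))
p_far = fermi_dirac(lorentz_distance(student, far_exercise))
```

Under this decoder a smaller distance yields a higher predicted probability, and a distance equal to `r` maps to exactly 0.5.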

Project Structure

EduCWO/
├── model/                      # Model implementations
│   ├── HiCD/                   # Main Hyperbolic Cognitive Diagnosis (HiCD) model
│   ├── manifolds/              # Lorentz and Poincaré ball manifolds
│   └── __init__.py
│
├── scripts/                    # Scripts for data processing, training, and evaluation
│   ├── data/                   # Dataset directory
│   │   ├── rawdata/            # Original input datasets
│   │   │   ├── assist0910/     # ASSIST-0910 dataset (raw)
│   │   │   └── junyi/          # Junyi dataset (raw)
│   └── run.py
│
├── utils/                      # Helper functions and visualization tools
│
└── README.md                   # Project description and usage guide

Datasets

Dataset       #Students   #Exercises   #Concepts   #Logs     Sparsity
ASSIST-0910   2,493       17,676       123         267,423   99.39%
Junyi         10,000      734          734         408,057   94.44%

ASSIST-0910: Response data from ASSISTments 2009–2010.
Junyi: Real-world learning logs from the Junyi Academy platform.
Each dataset includes a Q matrix specifying exercise–concept mappings.
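
The sparsity column can be reproduced from the other columns. The short check below assumes sparsity is defined as one minus the fraction of observed student–exercise pairs and that each log is a distinct pair; the `sparsity` function is a hypothetical helper.

```python
def sparsity(n_students, n_exercises, n_logs):
    """Fraction of the student-by-exercise matrix that has no response log."""
    return 1.0 - n_logs / (n_students * n_exercises)


assist = sparsity(2493, 17676, 267423)
junyi = sparsity(10000, 734, 408057)
print(f"ASSIST-0910: {assist:.2%}")
print(f"Junyi:       {junyi:.2%}")
```

Under these assumptions both results agree with the table to two decimal places.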

Evaluation Metrics

AUC: Area under the ROC curve
ACC: Accuracy with 0.5 probability threshold
RMSE: Root mean square error
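
For reference, the three metrics can be computed as follows. This is a dependency-free sketch rather than the project's evaluation code; the pairwise AUC is quadratic in the number of samples, and treating a probability of exactly 0.5 as a positive prediction is an assumption.

```python
import math


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of responses whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def rmse(y_true, y_prob):
    """Root mean square error between labels and predicted probabilities."""
    sq = sum((p - y) ** 2 for y, p in zip(y_true, y_prob))
    return math.sqrt(sq / len(y_true))


y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.4, 0.6, 0.3, 0.2]
scores = (auc(y_true, y_prob), accuracy(y_true, y_prob), rmse(y_true, y_prob))
```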

Experimental Results

HiCD improves consistently over strong baselines on both datasets.

Highlights:

On ASSIST-0910, HiCD-dist achieves AUC = 0.7972 and ACC = 0.7531.
On Junyi, HiCD-dist attains AUC = 0.8207, outperforming SVGCD by 1.6%.
HiCD consistently improves diagnostic robustness under extreme sparsity.

Key Innovations

1. Graph Decomposition

HiCD decomposes the educational interaction graph into three semantically distinct subgraphs: students’ correct graph 𝐺𝑐𝑟, students’ incorrect graph 𝐺𝑖𝑐𝑟, and exercise–concept association graph 𝐺𝑒𝑐.
This decomposition isolates different behavioral and structural semantics: the correct and incorrect graphs capture distinct cognitive behaviors, while the exercise–concept graph reflects the underlying knowledge structure.
By learning from these specialized subgraphs, HiCD enhances representation diversity and captures fine-grained relational patterns that are often overlooked in unified graph formulations.

2. Curvature-Aware Mapping and Multi-Level Fusion

To effectively integrate heterogeneous semantics, HiCD embeds each subgraph into a hyperbolic space with an adaptively estimated curvature, aligning the geometric properties with the relational complexity of each subgraph. This curvature-aware mapping preserves hierarchical and non-Euclidean characteristics, improving the fidelity of embeddings under long-tail sparsity.
HiCD introduces a multi-level fusion mechanism that integrates representations from multiple subgraphs through three stages:
- Behavioral Fusion — merges student behaviors from correct and incorrect response graphs.
- Attention Fusion — balances the relative contributions of different subgraphs using attention weighting.
- Knowledge-Aware Enhancement — aligns exercise representations with related concepts based on the Q-matrix.
This unified fusion framework ensures that the final embeddings jointly encode behavioral, structural, and conceptual information, yielding robust and interpretable student–exercise representations.
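
One way to picture the fusion stages is shown below. The sketch works on tangent-space (Euclidean) vectors and is only an illustration: the dot-product attention against a `query` vector, the mixing weight `alpha`, and both function names are hypothetical, and the actual model may combine the views differently.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(views, query):
    """Weight each view by the softmax of its dot product with `query`.

    Assumption: `query` stands in for a learned attention vector.
    """
    scores = [sum(a * b for a, b in zip(v, query)) for v in views]
    weights = softmax(scores)
    fused = [
        sum(w * v[i] for w, v in zip(weights, views))
        for i in range(len(query))
    ]
    return fused, weights


def knowledge_enhance(exercise_vec, concept_vecs, alpha=0.5):
    """Blend an exercise vector with the mean of its Q-matrix concepts."""
    mean = [sum(col) / len(concept_vecs) for col in zip(*concept_vecs)]
    return [(1 - alpha) * e + alpha * m for e, m in zip(exercise_vec, mean)]


# Student views from the correct and incorrect graphs (toy values).
student_cr = [0.4, 0.1]
student_icr = [-0.2, 0.3]
student, weights = attention_fuse([student_cr, student_icr], query=[1.0, 0.0])

# Exercise vector aligned with the two concepts it covers (toy values).
exercise = knowledge_enhance([0.5, 0.5], [[0.1, 0.9], [0.3, 0.7]], alpha=0.5)
```

With this query the correct-graph view receives the larger weight; in a trained model the attention parameters would decide that balance.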

3. Hyperbolic Diagnosis

HiCD defines the diagnostic process directly within hyperbolic space, where student and exercise embeddings are compared through geometric distances that naturally reflect their hierarchical relations.
Two complementary formulations — Möbius subtraction and Lorentz distance — are employed under different hyperbolic models to measure the relational strength between entities.
The resulting geometric distances are then decoded into cognitive response probabilities through the Fermi–Dirac function, achieving smooth probabilistic outputs while maintaining geometric interpretability.
This design allows HiCD to perform cognitive diagnosis in a geometrically consistent and probabilistically stable manner.
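
The Möbius-subtraction formulation relies on the standard Möbius addition of the Poincaré ball. The sketch below implements that textbook operation with curvature magnitude `c`; how HiCD maps the resulting difference vector to a score is not shown here, and the function names and toy vectors are hypothetical.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def mobius_add(x, y, c=1.0):
    """Möbius addition in the Poincaré ball of radius 1/sqrt(c)."""
    xy, x2, y2 = dot(x, y), dot(x, x), dot(y, y)
    denom = 1.0 + 2.0 * c * xy + c * c * x2 * y2
    coef_x = 1.0 + 2.0 * c * xy + c * y2
    coef_y = 1.0 - c * x2
    return [(coef_x * a + coef_y * b) / denom for a, b in zip(x, y)]


def mobius_sub(x, y, c=1.0):
    """Möbius subtraction, defined as adding the negated second point."""
    return mobius_add(x, [-b for b in y], c)


def poincare_distance(x, y, c=1.0):
    """Geodesic distance derived from the norm of (-x) Möbius-plus y."""
    diff = mobius_add([-a for a in x], y, c)
    norm = math.sqrt(dot(diff, diff))
    return 2.0 / math.sqrt(c) * math.atanh(math.sqrt(c) * norm)


ability = [0.30, 0.10]      # toy student embedding inside the unit ball
difficulty = [0.25, -0.05]  # toy exercise embedding inside the unit ball
gap = mobius_sub(ability, difficulty)
```

Unlike ordinary vector subtraction, the result always stays inside the ball, so it remains a valid hyperbolic point for downstream layers.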
