A Glass Box System 2 model that implements dual-pathway reasoning (System 1/System 2) with causal feature extraction for robust bug detection in code patches. This architecture is designed to learn causal representations that generalize across distribution shifts.
- Dual-Pathway Architecture: Combines fast intuitive processing (System 1) with deliberate analysis (System 2) via learned metacognitive gating
- Causal Feature Extraction: Uses Variational Information Bottleneck (VIB) to separate causal code features from spurious confounders
- Theoretical Guarantees: Formal proofs for causal convergence, abstraction hierarchy, calibration bounds, and adaptive compute
- OOD Robustness: Demonstrated generalization across different code repositories and organizations
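The VIB mechanism in the second bullet can be made concrete with the standard reparameterization trick and KL penalty. This is a minimal NumPy sketch of the mechanism, not the project's implementation; the function names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def vib_sample(mu, log_var):
    # Reparameterization: z = mu + sigma * eps, eps ~ N(0, I),
    # so sampling stays differentiable w.r.t. mu and log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # KL(q(z|x) || N(0, I)): the bottleneck penalty that limits how
    # much information z retains about the input, pushing each
    # encoder to keep only its assigned (causal or spurious) signal.
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

z = vib_sample(np.zeros(8), np.zeros(8))                   # one stochastic code
penalty = kl_to_standard_normal(np.zeros(8), np.zeros(8))  # zero at the prior
```

In the model this penalty would be applied separately to `z_causal` and `z_spurious`, each with its own weight.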
**In-Distribution (Within-Organization)**

| Model | F1 Harmonic | ROC-AUC | Balanced Acc |
|---|---|---|---|
| Ours (GCS2-V5) | 0.500 ± 0.014 | 0.776 ± 0.018 | 0.663 ± 0.011 |
| CodeBERT | 0.512 ± 0.011 | 0.823 ± 0.006 | 0.714 ± 0.016 |
| GraphCodeBERT | 0.532 ± 0.008 | 0.840 ± 0.003 | 0.712 ± 0.007 |
**Out-of-Distribution (Cross-Organization)**

| Model | F1 Harmonic | ROC-AUC | ECE | Relative Drop |
|---|---|---|---|---|
| Ours (GCS2-V5) | 0.500 ± 0.014 | 0.776 ± 0.018 | 0.148 ± 0.044 | ~0% |
| CodeBERT | 0.490 ± 0.024 | 0.759 ± 0.018 | - | ~4% |
| GraphCodeBERT | 0.515 ± 0.013 | 0.772 ± 0.009 | 0.232 ± 0.015 | ~3% |
```
                  Glass Box System 2 (V5)

  ┌─────────────┐        ┌─────────────┐
  │    Code     │        │ Confounders │
  │   (Patch)   │        │  (11 dim)   │
  └──────┬──────┘        └──────┬──────┘
         │                      │
         ▼                      ▼
  ┌─────────────┐        ┌─────────────┐
  │    GNN +    │        │  Spurious   │
  │  CodeBERT   │        │   Encoder   │
  └──────┬──────┘        └──────┬──────┘
         │                      │
         ▼                      ▼
  ┌─────────────┐        ┌─────────────┐
  │ VIB Causal  │        │VIB Spurious │
  │  z_causal   │        │ z_spurious  │
  └──────┬──────┘        └──────┬──────┘
         │                      │
         └─────────┬────────────┘
                   │
                   ▼
           ┌───────────────┐
           │ Learned Fusion│  z_fused = (1-λ)·z_c + λ·z_s
           │   Gate (λ)    │
           └───────┬───────┘
                   │
        ┌──────────┴──────────┐
        │                     │
        ▼                     ▼
   ┌─────────┐           ┌─────────┐
   │System 1 │           │System 2 │
   │ (Fast)  │           │  (GRU)  │
   └────┬────┘           └────┬────┘
        │                     │
        └──────────┬──────────┘
                   │
                   ▼
           ┌───────────────┐
           │ Metacognitive │  p = (1-γ)·p₁ + γ·p₂
           │   Gate (γ)    │
           └───────┬───────┘
                   │
                   ▼
             [Prediction]
```
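The two gates shown in the diagram are plain convex combinations; a minimal numeric sketch with illustrative fixed values (in the model, λ and γ are learned):

```python
import numpy as np

def fuse(z_causal, z_spurious, lam):
    # Learned fusion gate from the diagram: z_fused = (1-λ)·z_c + λ·z_s.
    # A small λ keeps the fused representation close to the causal code.
    return (1.0 - lam) * z_causal + lam * z_spurious

def metacognitive_gate(p1, p2, gamma):
    # Final prediction: p = (1-γ)·p₁ + γ·p₂. γ near 1 defers to the
    # deliberate System 2 head; γ near 0 trusts the fast System 1 head.
    return (1.0 - gamma) * p1 + gamma * p2

z_fused = fuse(np.array([0.2, 0.8]), np.array([0.6, 0.4]), lam=0.25)
p = metacognitive_gate(p1=0.3, p2=0.9, gamma=0.5)  # an even blend of both heads
```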
```
glass-box-system2/
├── src/
│   ├── __init__.py
│   ├── model.py              # GraphCausalSystem2 architecture
│   ├── data_engineering.py   # Data loading, labeling, preprocessing
│   ├── training.py           # Training loop with theorem validation
│   └── baselines/
│       ├── codebert.py       # CodeBERT baseline
│       ├── graphcodebert.py  # GraphCodeBERT baseline
│       └── causal_methods.py # IRM, GroupDRO, VREx
├── configs/
│   └── default.yaml          # Default hyperparameters
├── experiments/
│   ├── within_org.py         # Within-organization experiments
│   └── cross_org.py          # Cross-organization experiments
├── results/
│   ├── ood/                  # Out-of-distribution results
│   └── baselines/            # Baseline method results
│       ├── codebert/         # CodeBERT results per seed
│       └── graphcodebert/    # GraphCodeBERT results per seed
├── docs/
│   ├── theorems.md           # Mathematical proofs summary
│   └── labeling.md           # Labeling methodology
├── scripts/
│   ├── run_training.py       # Main training script
│   └── run_ood_eval.py       # OOD evaluation script
├── requirements.txt
├── setup.py
└── README.md
```
Our model is grounded in four key theorems (see docs/theorems.md for full proofs):
**Theorem 1 (Causal Convergence).** Under weak independence assumptions, the model recovers causal features with error bounded by:

E[(f_c(x) - f̂_c(x))²] ≤ O(V log n / n) + O(1/λ) + r·√(d_x·d_c)·L²
**Theorem 2 (Abstraction Hierarchy).** The residual architecture provides exponential representational capacity:

R(L) = O(L^d) or O((d+1)^L) regions
**Theorem 3 (Calibration Bound).** The metacognitive gating achieves improved calibration:

ECE(p̂) ≤ min{ECE(p₁), ECE(p₂)} + O(√(log(1/δ)/n))
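ECE here is the usual binned calibration error; a standard estimator is shown below for reference (not the project's exact evaluation code):

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    # Binned ECE: bucket predictions by confidence, then take the
    # bin-weighted gap between mean confidence and empirical accuracy.
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs > lo) & (probs <= hi)  # first bin excludes exact 0
        if mask.any():
            gap = abs(labels[mask].mean() - probs[mask].mean())
            ece += mask.mean() * gap
    return ece

# A perfectly calibrated batch: confidence 0.9, empirical accuracy 9/10.
ece = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
```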
**Theorem 4 (Adaptive Compute).** Reasoning depth scales with problem difficulty:

T* = Ω(log(1/ε_target) / (1 - d))
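Read operationally, the bound says the number of System 2 steps grows with target precision. A toy calculation, under my reading that d is the per-step error-contraction factor (and ignoring the hidden constant):

```python
import math

def required_depth(eps_target, d):
    # T* = Ω(log(1/ε_target) / (1 - d)): tighter targets (smaller ε)
    # and weaker contraction (d closer to 1) both demand more steps.
    return math.ceil(math.log(1.0 / eps_target) / (1.0 - d))

easy = required_depth(0.1, 0.2)   # loose target, strong contraction
hard = required_depth(0.01, 0.5)  # tight target, weak contraction
```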
Installation:

```bash
git clone https://github.com/YOUR_USERNAME/glass-box-system2.git
cd glass-box-system2
pip install -e .
```

Quickstart:

```python
from src.model import create_model
from src.data_engineering import load_and_label_data, System2Dataset
from src.training import train_system2

# Load data
samples, labels, analyses, prs = load_and_label_data('data/train.json')

# Create dataset
dataset = System2Dataset(samples, labels, tokenizer, pdg_extractor)

# Create model
model = create_model(
    model_name="microsoft/codebert-base",
    num_features=11,
    hidden_dim=256,
    bottleneck_dim=128
)

# Train
model, history, results = train_system2(
    model, train_loader, val_loader, test_loader,
    num_epochs=10, lr=3e-5
)
```

OOD robustness evaluation:

```python
from src.training import dual_robustness_test

# Test robustness to distribution shift
robustness_results = dual_robustness_test(
    model, ood_loader, device,
    strengths=[0.0, 0.1, 0.25, 0.5]
)
```

Multi-seed training run:

```bash
python scripts/run_training.py \
    --data_path data/django_train.json \
    --seeds 42 123 456 789 1024 \
    --epochs 10 \
    --output_dir results/
```

Baseline runs:

```bash
python scripts/run_baselines.py \
    --model codebert \
    --data_path data/django_train.json \
    --ood_path data/optimum.json
```

| Feature | Description | Range |
|---|---|---|
| `auth_hash` | Author identity (hashed) | [0, 1] |
| `sin_hour` | sin(hour/24 × 2π) temporal | [0, 1] |
| `cos_hour` | cos(hour/24 × 2π) temporal | [0, 1] |
| `complexity` | Patch length / 10000 | [0, 1] |
| `centrality` | AST PageRank centrality | [0, 1] |
| `time_to_merge` | Days to merge / 30 | [0, 1] |
| `code_churn` | (adds+dels)/files / 1000 | [0, 1] |
| `non_code_ratio` | Comments/strings ratio | [0, 1] |
| `review_depth` | Review count / 10 | [0, 1] |
| `files_changed` | File count / 50 | [0, 1] |
| `degree` | AST graph degree | [0, 1] |
We use a semantic-focused distant supervision approach that combines:
- PR-level signals: Issue links, fix keywords, test coupling
- Code-level signals: Semantic pattern detection (identifier swaps, wrong calls, etc.)
- Strict intersection: Requires BOTH semantic evidence AND PR intent
See docs/labeling.md for detailed statistics.
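The strict-intersection rule amounts to a boolean AND over the two signal families; the signal names below are illustrative, not the actual schema (see docs/labeling.md):

```python
def label_buggy(pr_signals, code_signals):
    # Distant supervision: label a patch buggy only when BOTH a
    # PR-level fix intent AND semantic code-level evidence fire.
    pr_intent = any(pr_signals.get(k) for k in
                    ("links_issue", "fix_keyword", "test_coupling"))
    code_evidence = any(code_signals.values())
    return pr_intent and code_evidence

label_buggy({"fix_keyword": True}, {"identifier_swap": True})  # True
label_buggy({"fix_keyword": True}, {})  # False: PR intent alone is not enough
```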
This project is licensed under the MIT License - see LICENSE for details.
- CodeBERT and GraphCodeBERT from Microsoft Research
- Django, HuggingFace, and other open-source projects for evaluation data