A Glass Box System 2 model that implements dual-pathway reasoning (System 1/System 2) with causal feature extraction for robust bug detection in code patches. This architecture is designed to learn causal representations that generalize across distribution shifts.
- Dual-Pathway Architecture: Combines fast intuitive processing (System 1) with deliberate analysis (System 2) via learned metacognitive gating
- Causal Feature Extraction: Uses Variational Information Bottleneck (VIB) to separate causal code features from spurious confounders
- Theoretical Guarantees: Formal proofs for causal convergence, abstraction hierarchy, calibration bounds, and adaptive compute
- OOD Robustness: Demonstrated generalization across different code repositories and organizations
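The VIB mechanism in the second bullet can be made concrete with the standard reparameterization trick and KL penalty. This is a minimal NumPy sketch of the mechanism, not the project's implementation; the function names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def vib_sample(mu, log_var):
    # Reparameterization: z = mu + sigma * eps, eps ~ N(0, I),
    # so sampling stays differentiable w.r.t. mu and log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # KL(q(z|x) || N(0, I)): the bottleneck penalty that limits how
    # much information z retains about the input, pushing each
    # encoder to keep only its assigned (causal or spurious) signal.
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

z = vib_sample(np.zeros(8), np.zeros(8))                   # one stochastic code
penalty = kl_to_standard_normal(np.zeros(8), np.zeros(8))  # zero at the prior
```

In the model this penalty would be applied separately to `z_causal` and `z_spurious`, each with its own weight.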
**In-Distribution (Within-Organization)**

| Model | F1 Harmonic | ROC-AUC | Balanced Acc |
|---|---|---|---|
| Ours (GCS2-V5) | 0.500 ± 0.014 | 0.776 ± 0.018 | 0.663 ± 0.011 |
| CodeBERT | 0.512 ± 0.011 | 0.823 ± 0.006 | 0.714 ± 0.016 |
| GraphCodeBERT | 0.532 ± 0.008 | 0.840 ± 0.003 | 0.712 ± 0.007 |
**Out-of-Distribution (Cross-Organization)**

| Model | F1 Harmonic | ROC-AUC | ECE | Relative Drop |
|---|---|---|---|---|
| Ours (GCS2-V5) | 0.500 ± 0.014 | 0.776 ± 0.018 | 0.148 ± 0.044 | ~0% |
| CodeBERT | 0.490 ± 0.024 | 0.759 ± 0.018 | - | ~4% |
| GraphCodeBERT | 0.515 ± 0.013 | 0.772 ± 0.009 | 0.232 ± 0.015 | ~3% |
```
                  Glass Box System 2 (V5)

  ┌─────────────┐        ┌─────────────┐
  │    Code     │        │ Confounders │
  │   (Patch)   │        │  (11 dim)   │
  └──────┬──────┘        └──────┬──────┘
         │                      │
         ▼                      ▼
  ┌─────────────┐        ┌─────────────┐
  │    GNN +    │        │  Spurious   │
  │  CodeBERT   │        │   Encoder   │
  └──────┬──────┘        └──────┬──────┘
         │                      │
         ▼                      ▼
  ┌─────────────┐        ┌─────────────┐
  │ VIB Causal  │        │VIB Spurious │
  │  z_causal   │        │ z_spurious  │
  └──────┬──────┘        └──────┬──────┘
         │                      │
         └─────────┬────────────┘
                   │
                   ▼
           ┌───────────────┐
           │ Learned Fusion│  z_fused = (1-λ)·z_c + λ·z_s
           │   Gate (λ)    │
           └───────┬───────┘
                   │
        ┌──────────┴──────────┐
        │                     │
        ▼                     ▼
   ┌─────────┐           ┌─────────┐
   │System 1 │           │System 2 │
   │ (Fast)  │           │  (GRU)  │
   └────┬────┘           └────┬────┘
        │                     │
        └──────────┬──────────┘
                   │
                   ▼
           ┌───────────────┐
           │ Metacognitive │  p = (1-γ)·p₁ + γ·p₂
           │   Gate (γ)    │
           └───────┬───────┘
                   │
                   ▼
             [Prediction]
```
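The two gates shown in the diagram are plain convex combinations; a minimal numeric sketch with illustrative fixed values (in the model, λ and γ are learned):

```python
import numpy as np

def fuse(z_causal, z_spurious, lam):
    # Learned fusion gate from the diagram: z_fused = (1-λ)·z_c + λ·z_s.
    # A small λ keeps the fused representation close to the causal code.
    return (1.0 - lam) * z_causal + lam * z_spurious

def metacognitive_gate(p1, p2, gamma):
    # Final prediction: p = (1-γ)·p₁ + γ·p₂. γ near 1 defers to the
    # deliberate System 2 head; γ near 0 trusts the fast System 1 head.
    return (1.0 - gamma) * p1 + gamma * p2

z_fused = fuse(np.array([0.2, 0.8]), np.array([0.6, 0.4]), lam=0.25)
p = metacognitive_gate(p1=0.3, p2=0.9, gamma=0.5)  # an even blend of both heads
```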
```
glass-box-system2/
├── src/
│   ├── __init__.py
│   ├── model.py              # GraphCausalSystem2 architecture
│   ├── data_engineering.py   # Data loading, labeling, preprocessing
│   ├── training.py           # Training loop with theorem validation
│   └── baselines/
│       ├── codebert.py       # CodeBERT baseline
│       ├── graphcodebert.py  # GraphCodeBERT baseline
│       └── causal_methods.py # IRM, GroupDRO, VREx
├── configs/
│   └── default.yaml          # Default hyperparameters
├── experiments/
│   ├── within_org.py         # Within-organization experiments
│   └── cross_org.py          # Cross-organization experiments
├── results/
│   ├── ood/                  # Out-of-distribution results
│   └── baselines/            # Baseline method results
│       ├── codebert/         # CodeBERT results per seed
│       └── graphcodebert/    # GraphCodeBERT results per seed
├── docs/
│   ├── theorems.md           # Mathematical proofs summary
│   └── labeling.md           # Labeling methodology
├── scripts/
│   ├── run_training.py       # Main training script
│   └── run_ood_eval.py       # OOD evaluation script
├── requirements.txt
├── setup.py
└── README.md
```
Our model is grounded in four key theorems (see docs/theorems.md for full proofs):
**Theorem 1 (Causal Convergence).** Under weak independence assumptions, the model recovers causal features with error bounded by:

E[(f_c(x) - f̂_c(x))²] ≤ O(V log n / n) + O(1/λ) + r·√(d_x·d_c)·L²
**Theorem 2 (Abstraction Hierarchy).** The residual architecture provides exponential representational capacity:

R(L) = O(L^d) or O((d+1)^L) regions
**Theorem 3 (Calibration Bound).** The metacognitive gating achieves improved calibration:

ECE(p̂) ≤ min{ECE(p₁), ECE(p₂)} + O(√(log(1/δ)/n))
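ECE here is the usual binned calibration error; a standard estimator is shown below for reference (not the project's exact evaluation code):

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    # Binned ECE: bucket predictions by confidence, then take the
    # bin-weighted gap between mean confidence and empirical accuracy.
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs > lo) & (probs <= hi)  # first bin excludes exact 0
        if mask.any():
            gap = abs(labels[mask].mean() - probs[mask].mean())
            ece += mask.mean() * gap
    return ece

# A perfectly calibrated batch: confidence 0.9, empirical accuracy 9/10.
ece = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
```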
**Theorem 4 (Adaptive Compute).** Reasoning depth scales with problem difficulty:

T* = Ω(log(1/ε_target) / (1 - d))
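Read operationally, the bound says the number of System 2 steps grows with target precision. A toy calculation, under my reading that d is the per-step error-contraction factor (and ignoring the hidden constant):

```python
import math

def required_depth(eps_target, d):
    # T* = Ω(log(1/ε_target) / (1 - d)): tighter targets (smaller ε)
    # and weaker contraction (d closer to 1) both demand more steps.
    return math.ceil(math.log(1.0 / eps_target) / (1.0 - d))

easy = required_depth(0.1, 0.2)   # loose target, strong contraction
hard = required_depth(0.01, 0.5)  # tight target, weak contraction
```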
Installation:

```bash
git clone https://github.com/YOUR_USERNAME/glass-box-system2.git
cd glass-box-system2
pip install -e .
```

Quickstart:

```python
from src.model import create_model
from src.data_engineering import load_and_label_data, System2Dataset
from src.training import train_system2

# Load data
samples, labels, analyses, prs = load_and_label_data('data/train.json')

# Create dataset
dataset = System2Dataset(samples, labels, tokenizer, pdg_extractor)

# Create model
model = create_model(
    model_name="microsoft/codebert-base",
    num_features=11,
    hidden_dim=256,
    bottleneck_dim=128
)

# Train
model, history, results = train_system2(
    model, train_loader, val_loader, test_loader,
    num_epochs=10, lr=3e-5
)
```

OOD robustness evaluation:

```python
from src.training import dual_robustness_test

# Test robustness to distribution shift
robustness_results = dual_robustness_test(
    model, ood_loader, device,
    strengths=[0.0, 0.1, 0.25, 0.5]
)
```

Multi-seed training run:

```bash
python scripts/run_training.py \
    --data_path data/django_train.json \
    --seeds 42 123 456 789 1024 \
    --epochs 10 \
    --output_dir results/
```

Baseline runs:

```bash
python scripts/run_baselines.py \
    --model codebert \
    --data_path data/django_train.json \
    --ood_path data/optimum.json
```

| Feature | Description | Range |
|---|---|---|
| `auth_hash` | Author identity (hashed) | [0, 1] |
| `sin_hour` | sin(hour/24 × 2π) temporal | [0, 1] |
| `cos_hour` | cos(hour/24 × 2π) temporal | [0, 1] |
| `complexity` | Patch length / 10000 | [0, 1] |
| `centrality` | AST PageRank centrality | [0, 1] |
| `time_to_merge` | Days to merge / 30 | [0, 1] |
| `code_churn` | (adds+dels)/files / 1000 | [0, 1] |
| `non_code_ratio` | Comments/strings ratio | [0, 1] |
| `review_depth` | Review count / 10 | [0, 1] |
| `files_changed` | File count / 50 | [0, 1] |
| `degree` | AST graph degree | [0, 1] |
We use a semantic-focused distant supervision approach that combines:
- PR-level signals: Issue links, fix keywords, test coupling
- Code-level signals: Semantic pattern detection (identifier swaps, wrong calls, etc.)
- Strict intersection: Requires BOTH semantic evidence AND PR intent
See docs/labeling.md for detailed statistics.
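The strict-intersection rule amounts to a boolean AND over the two signal families; the signal names below are illustrative, not the actual schema (see docs/labeling.md):

```python
def label_buggy(pr_signals, code_signals):
    # Distant supervision: label a patch buggy only when BOTH a
    # PR-level fix intent AND semantic code-level evidence fire.
    pr_intent = any(pr_signals.get(k) for k in
                    ("links_issue", "fix_keyword", "test_coupling"))
    code_evidence = any(code_signals.values())
    return pr_intent and code_evidence

label_buggy({"fix_keyword": True}, {"identifier_swap": True})  # True
label_buggy({"fix_keyword": True}, {})  # False: PR intent alone is not enough
```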
This project is licensed under the MIT License - see LICENSE for details.
- CodeBERT and GraphCodeBERT from Microsoft Research
- Django, HuggingFace, and other open-source projects for evaluation data