# SemanticGNN-Physics: Complete Pipeline

**Graph Neural Networks for Physics Equation Discovery**

## What This Notebook Does:
1. 🔬 Parse 400 physics equations (AST processing)
2. 🌐 Build knowledge graph (639 nodes, 4,557 bridges)  
3. 🧠 Train SuperPhysicsGNN (52K parameters)
4. 📊 Analyze physics discoveries

## Expected Results:
- **Quick version** (1000 epochs): AUC 0.9548 ± 0.0071
- **Full version** (3000 epochs): AUC 0.9706 ± 0.0036
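
The AUC figures above read as a mean and a spread over repeated runs. The sketch below shows one way such a summary could be computed. It assumes the spread is the sample standard deviation across the 5 seeds used by the training cell, which the notebook does not state. The helper names `roc_auc` and `summarize_runs` are hypothetical and are not taken from the training script.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """Pairwise (rank-based) AUC: the probability that a positive edge outscores a negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    # A positive scoring above a negative counts as 1, a tie counts as 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def summarize_runs(per_seed_aucs):
    """Return (mean, sample standard deviation) of the per-seed AUC values."""
    return mean(per_seed_aucs), stdev(per_seed_aucs)


# Toy labels and scores for illustration only; these are not results from this notebook.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(f"toy AUC = {toy_auc:.2f}")
```

The pairwise form is quadratic in the number of edges, which is acceptable for a test set of a few hundred edges; a sort-based version would be preferable for larger graphs.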

## Runtime:
- Quick: ~15 minutes
- Full: ~45 minutes

**Ready? Click "Run All" to start! 🚀**

In [14]:
# ===============================================
# SEMANTICGNN-PHYSICS: COMPLETE PIPELINE
# ===============================================
print("🚀 SemanticGNN-Physics Pipeline Starting...")

# Install requirements
!pip install torch-geometric

print("✅ Setup completed!")

🚀 SemanticGNN-Physics Pipeline Starting...
✅ Setup completed!


In [11]:
# ===============================================
# STEP 1: AST PROCESSING
# ===============================================

# Run AST parser
!python ast_parser.py

🔬 AST ANALYZER - FINAL 100% VERSION WITH DEBUG
✅ Loaded 400 equations from final_physics_database.json

🧪 TESTING problematic equations...

📝 Testing: γ * m * c^2
   After Greek: gamma * m * c^2
   Final cleaned: gamma * m * c**2
   Result: ✅ SUCCESS
   Parsed as: c**2*gamma*m

📝 Testing: f_0 * sqrt((1 + β)/(1 - β))
   After Greek: f_0 * sqrt((1 + beta)/(1 - beta))
   Final cleaned: f_0 * sqrt((1 + beta)/(1 - beta))
   Result: ✅ SUCCESS
   Parsed as: f_0*sqrt((beta + 1)/(1 - beta))

📝 Testing: sqrt(γ * R * T / M)
   After Greek: sqrt(gamma * R * T / M)
   Final cleaned: sqrt(gamma * R * T / M)
   Result: ✅ SUCCESS
   Parsed as: sqrt(R*T*gamma/M)

📝 Testing: N_0 * exp(-λ * t)
   After Greek: N_0 * exp(-lamda * t)
   Final cleaned: N_0 * exp(-lamda * t)
   Result: ✅ SUCCESS
   Parsed as: N_0*exp(-lamda*t)

📝 Testing: I * ∫(dl × B)
   After Greek: I * ∫(dl × B)
   Final cleaned: I * Integral(dl * B, x)
   Result: ✅ SUCCESS
   Parsed as: I*Integral(B*dl, x)

📝 Testing: ∫(dQ / T)
   After G
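
The log above suggests a two-stage cleanup before parsing: Greek letters become ASCII names, then `^` becomes `**`. The sketch below reproduces only those two stages with the standard library. It is not the code in `ast_parser.py`; the mapping table and the function names are hypothetical, and the integral handling (`∫` to `Integral(...)`) shown in the log is left out. The "Parsed as" lines look like SymPy output, but to stay dependency-free this sketch only checks syntax with Python's `ast` module.

```python
import ast

# Hypothetical mapping covering only the symbols that appear in the log above.
# "lamda" is deliberate: "lambda" is a reserved word in Python.
GREEK_TO_ASCII = {"γ": "gamma", "β": "beta", "λ": "lamda"}


def clean_equation(text: str) -> str:
    """Replace Greek letters with ASCII names, then turn ^ into Python's ** operator."""
    for symbol, name in GREEK_TO_ASCII.items():
        text = text.replace(symbol, name)
    return text.replace("^", "**")


def is_parseable(expression: str) -> bool:
    """Return True if the cleaned string is a syntactically valid Python expression."""
    try:
        ast.parse(expression, mode="eval")
        return True
    except SyntaxError:
        return False


for raw in ["γ * m * c^2", "N_0 * exp(-λ * t)"]:
    cleaned = clean_equation(raw)
    print(f"{raw!r} -> {cleaned!r} (parseable: {is_parseable(cleaned)})")
```

Mapping `λ` to `lambda` instead would fail at the syntax check, which is a plausible reason for the `lamda` spelling in the log.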

In [12]:
# ===============================================
# STEP 2: GRAPH CONSTRUCTION
# ===============================================

# Run graph builder
!python graph_builder.py

GRAPH3EV - PHYSICS KNOWLEDGE GRAPH BUILDER v3.1
(Post Peer-Review Edition with Literature-Based Weights)

📚 Loading equations from final_physics_database.json...
  Found equations under 'laws' key
  Database version: 3.0
  Total laws in DB: 400
✓ Loaded 400 equations

🔍 Optimizing hyperparameters with grid search...
✓ Optimal hyperparameters: α=0.100, β=0.100, γ=0.800

🔨 Building base graph structure...
Processing equations: 100% 400/400 [00:00<00:00, 89813.79it/s]
✓ Created 400 equation nodes
✓ Created 239 concept nodes
✓ Created 2692 concept-equation edges

🌉 Creating equation bridges with normalized weights...
Finding equation bridges: 100% 399/399 [00:00<00:00, 2945.72it/s]
✓ Found 10422 equation bridges
  - Same-branch: 2746
  - Cross-branch: 7676
✓ Kept 4557 significant bridges
  Same-branch threshold: 0.898 (top 25%)
  Cross-branch threshold: 0.461 (top 50%)

GRAPH CONSTRUCTION SUMMARY
Total nodes: 639
  - Equation nodes: 400
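
The bridge step in the log keeps the top 25% of same-branch bridges and the top 50% of cross-branch bridges by weight. A minimal sketch of that kind of per-group filtering follows. The `Bridge` record and both function names are hypothetical, and how `graph_builder.py` computes weights or treats ties is not shown in the log; this version keeps every bridge at or above the cutoff weight, so ties at the threshold survive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bridge:
    source: int        # index of the first equation node
    target: int        # index of the second equation node
    weight: float      # normalized bridge weight
    same_branch: bool  # True if both equations belong to the same physics branch


def keep_top_fraction(bridges, fraction):
    """Keep bridges whose weight reaches the cutoff of the top `fraction`; return (kept, threshold)."""
    if not bridges:
        return [], None
    ranked = sorted(bridges, key=lambda b: b.weight, reverse=True)
    n_keep = max(1, int(len(ranked) * fraction))
    threshold = ranked[n_keep - 1].weight
    # Using >= keeps ties at the threshold instead of cutting them arbitrarily.
    return [b for b in ranked if b.weight >= threshold], threshold


def filter_bridges(bridges, same_fraction=0.25, cross_fraction=0.50):
    """Apply a separate top-fraction cutoff to same-branch and cross-branch bridges."""
    same = [b for b in bridges if b.same_branch]
    cross = [b for b in bridges if not b.same_branch]
    kept_same, _ = keep_top_fraction(same, same_fraction)
    kept_cross, _ = keep_top_fraction(cross, cross_fraction)
    return kept_same + kept_cross
```

Separate cutoffs per group keep cross-branch links from being crowded out when same-branch weights are systematically higher, as the two thresholds in the log (0.898 vs 0.461) suggest they are.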
  

In [13]:
# ===============================================
# STEP 3: SUPERGNN TRAINING & EVALUATION
# ===============================================

# Run SuperGNN training: 5 seeds, 1000 epochs
!python gnn_trainer_1000.py

print("For 3000-epoch training, replace the command above with: !python gnn_trainer_3000.py")

SUPER GNN PRO - ULTIMATE PHYSICS KNOWLEDGE DISCOVERY

🎲 RUN #1/5 - SEED: 42
🎲 RUN - SEED: 42

🖥️ Device: cuda

📊 Loading graph: knowledge_graph_outputs/physics_knowledge_graph_v2_20250807_101448.pt

📈 Graph Statistics:
   - Total nodes: 639
   - Total edges: 5903
   - Node features: 12
   - Edge attributes: torch.Size([11806, 3])
   - Equations: 400
   - Concepts: 239
   This will exploit the 4557 equation bridges!

🔄 Data Split:
   - Train edges: 4668
   - Val edges: 583
   - Test edges: 584

🧠 SUPER GNN Architecture:
   - Type: Multi-Head GAT with Edge Weights
   - Attention Heads: 4 → 2 → 1
   - Hidden Dimensions: 64 → 32 → 16
   - Total Parameters: 52,065
   - Edge Attribute Dimensions: 3
   - Uses Physics-Aware Decoder: Yes
2025-08-07 10:14:54,987 - INFO - Model on device: cuda
2025-08-07 10:14:54,987 - INFO - Total parameters: 52,065

🚀 Starting SUPER GNN training...
   This will exploit the 4557 equation bridges!
2025-08-07 10:14:54,990 - INFO - Starting Super GNN training for 1
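
The split sizes in the log (4668 / 583 / 584) sum to 5,835 edges and are consistent with an 80/10/10 split in which the remainder goes to the test set. The sketch below shows such a split using only the standard library. It is an assumption about the scheme, not the code in `gnn_trainer_1000.py`, and the function name and parameters are hypothetical.

```python
import random


def split_edges(edges, train_fraction=0.8, val_fraction=0.1, seed=42):
    """Shuffle edges reproducibly and cut them into train/val/test lists."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)  # local RNG, so the global random state is untouched
    n_train = int(len(shuffled) * train_fraction)
    n_val = int(len(shuffled) * val_fraction)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, so no edge is dropped by rounding
    return train, val, test


# Placeholder edge list with the same count as the split in the log above.
placeholder_edges = [(i, i + 1) for i in range(5835)]
train, val, test = split_edges(placeholder_edges)
print(len(train), len(val), len(test))
```

Seeding the shuffle per run is what makes a multi-seed evaluation repeatable: the same seed yields the same partition.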