# Level-3 Neuro-Symbolic AI: Logic-Aware Gating PoC

## What are the NSAI Levels?

**Level-2 (Pure Statistical Learning)**
- The model learns patterns from training data using only statistical correlations
- No awareness of logical constraints or domain rules
- Can confidently predict logically invalid intent combinations
- Example: Predicting `execute` when that action is explicitly forbidden for the current context

**Level-2.5 (Post-hoc Logic Filtering)**
- The Level-2 model predicts freely
- **After** the model produces probabilities, we apply logical constraints:
  - Zero out suppressed intents
  - Renormalize remaining probabilities
- The model's internal representations are unchanged
- Logic acts as a post-processing filter, not a learning signal
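
The post-hoc step can be sketched in a few lines. This is a minimal illustration in plain Python so that it runs without the notebook's state; the helper name `posthoc_filter` and the probability values are assumptions made for the example, not values from the dataset.

```python
INTENTS = ['investigate', 'execute', 'summarize', 'ops']

def posthoc_filter(probs, suppressed):
    """Zero the suppressed intents, then renormalize what is left."""
    kept = [0.0 if intent in suppressed else p
            for intent, p in zip(INTENTS, probs)]
    total = sum(kept)
    if total == 0.0:
        raise ValueError('all probability mass was on suppressed intents')
    return [p / total for p in kept]

# Hypothetical L2 output that puts most of its mass on a forbidden intent.
raw_probs = [0.30, 0.50, 0.15, 0.05]
filtered = posthoc_filter(raw_probs, suppressed={'execute'})
top_intent = INTENTS[filtered.index(max(filtered))]
print(top_intent, [round(p, 3) for p in filtered])
```

The top prediction moves from `execute` to `investigate`, but nothing about the model that produced `raw_probs` has changed.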

**Level-3 (Logic-Aware Gating)**
- Logical constraints are embedded **inside the model's forward pass**
- Before the output layer produces final probabilities, a logic gate:
  - Receives a mask indicating which intents are allowed/suppressed
  - Strongly suppresses logits for forbidden intents **before softmax**
  - Ensures invalid intents never become top predictions
- The model learns to work **with** constraints, not against them
- Logic shapes the prediction formation process itself
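
A minimal sketch of such a gate, assuming an additive mask with a large negative constant (the same idea as the `mask_value` used later in this notebook). The function name `gated_softmax` and the logit values are illustrative assumptions.

```python
import math

def gated_softmax(logits, allowed_mask, mask_value=-1e9):
    """Softmax over logits after pushing forbidden entries toward -infinity."""
    gated = [z + (1 - m) * mask_value for z, m in zip(logits, allowed_mask)]
    peak = max(gated)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in gated]
    total = sum(exps)
    return [e / total for e in exps]

logits = [1.0, 3.0, 0.5, -1.0]   # hypothetical raw scores; index 1 ('execute') is highest
allowed_mask = [1, 0, 1, 1]      # 1 = allowed, 0 = suppressed
probs = gated_softmax(logits, allowed_mask)
top_index = probs.index(max(probs))
print(top_index, [round(p, 3) for p in probs])
```

Although the raw score for index 1 is the largest, the gate leaves it with zero probability, so the top prediction falls on an allowed intent.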

## Why gating "inside forward pass" is Level-3

The critical difference:
- **L2.5**: Logic corrects `model(x) → probs` **after** the fact
- **L3**: Logic participates in `model(x, constraints) → probs` **during** computation

This means:
- The model's loss function sees constraint-aware predictions during training
- Gradients flow through the gated outputs
- The model learns representations that align with logical structure
- Invalid predictions are structurally prevented, not just filtered out afterward
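
To make the gradient claim concrete, the sketch below checks the gradient of the cross-entropy loss with respect to the logits when the gate is part of the computation. It assumes the gold label is an allowed intent; all names and values are illustrative. The analytic form `probs - one_hot(gold)` is compared against central finite differences.

```python
import math

def gated_softmax(logits, mask, mask_value=-1e9):
    gated = [z + (1 - m) * mask_value for z, m in zip(logits, mask)]
    peak = max(gated)
    exps = [math.exp(z - peak) for z in gated]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, mask, gold):
    return -math.log(gated_softmax(logits, mask)[gold])

logits = [1.0, 3.0, 0.5, -1.0]   # hypothetical scores
mask = [1, 0, 1, 1]              # index 1 is suppressed
gold = 0                         # gold label is an allowed intent

# Analytic gradient w.r.t. each logit: probs - one_hot(gold)
probs = gated_softmax(logits, mask)
analytic = [p - (1.0 if i == gold else 0.0) for i, p in enumerate(probs)]

# Central finite differences as an independent check
eps = 1e-5
numeric = []
for i in range(len(logits)):
    up = list(logits)
    down = list(logits)
    up[i] += eps
    down[i] -= eps
    diff = cross_entropy(up, mask, gold) - cross_entropy(down, mask, gold)
    numeric.append(diff / (2 * eps))

print([round(g, 4) for g in analytic])
```

The suppressed logit receives zero gradient, so the training signal is spent only on separating the allowed intents from one another.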

This notebook demonstrates all three levels on the same dataset.

In [1]:
# Imports
import os
import sys
import ast
import numpy as np
import pandas as pd
from typing import List, Dict, Any
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
import warnings
warnings.filterwarnings('ignore')

# Determinism
np.random.seed(42)

# Canonical intents (do not change)
INTENTS = ['investigate', 'execute', 'summarize', 'ops']
INTENT_TO_IDX = {intent: i for i, intent in enumerate(INTENTS)}
IDX_TO_INTENT = {i: intent for i, intent in enumerate(INTENTS)}

print("✓ Imports complete")
print(f"Canonical intents: {INTENTS}")

✓ Imports complete
Canonical intents: ['investigate', 'execute', 'summarize', 'ops']


In [8]:
# Cell 1: Load Level-3 Dataset

def find_repo_root(start_dir=None):
    d = start_dir or os.getcwd()
    while True:
        if os.path.exists(os.path.join(d, 'requirements.txt')) or os.path.exists(os.path.join(d, '.git')):
            return d
        parent = os.path.dirname(d)
        if parent == d:
            return os.getcwd()
        d = parent

repo_root = find_repo_root()
sys.path.insert(0, repo_root)

# Locate L3 dataset
l3_path = None
for candidate in [os.path.join(repo_root, 'l3', 'data', 'level3_intents.csv'),
                 os.path.join(repo_root, 'level3', 'data', 'level3_intents.csv')]:
    if os.path.exists(candidate):
        l3_path = candidate
        break

if l3_path is None:
    raise FileNotFoundError(
        'Level-3 dataset not found at l3/data/level3_intents.csv '
        'or level3/data/level3_intents.csv'
    )

df = pd.read_csv(l3_path)
print(f"✓ Loaded {len(df)} records from {l3_path}")

# Map dataset intent names to canonical INTENTS
# The data uses 'execution', 'summarization' instead of 'execute', 'summarize'
INTENT_MAPPING = {
    'investigate': 'investigate',
    'execution': 'execute',
    'summarization': 'summarize',
    'execute': 'execute',  # in case data already uses correct name
    'summarize': 'summarize',
    # 'out_of_scope' is filtered out (not in canonical INTENTS)
    # 'ops' is in canonical but not in current data (that's okay)
}

# Apply intent mapping to gold_intent
df['gold_intent'] = df['gold_intent'].map(lambda x: INTENT_MAPPING.get(x, x))

# Check distribution after mapping
print(f"\nIntent distribution after mapping to canonical names:")
print(df['gold_intent'].value_counts())

# Parse list columns safely
def parse_list(x):
    if isinstance(x, (list, tuple)):
        return list(x)
    if pd.isna(x):
        return []
    s = str(x).strip()
    if s == '':
        return []
    try:
        v = ast.literal_eval(s)
        if isinstance(v, (list, tuple)):
            return list(v)
    except Exception:
        pass
    return [item.strip() for item in s.split(',') if item.strip()]

def parse_and_map_intents(x):
    """Parse list and map intent names to canonical."""
    items = parse_list(x)
    return [INTENT_MAPPING.get(item, item) for item in items]

for col in ['allowed_intents', 'suppressed_intents']:
    if col in df.columns:
        df[col] = df[col].apply(parse_and_map_intents)

# Validate dataset
required_cols = ['utterance', 'gold_intent', 'allowed_intents', 'suppressed_intents']
missing = [c for c in required_cols if c not in df.columns]
if missing:
    raise ValueError(f'Missing required columns: {missing}')

# Filter to only canonical intents (after mapping)
before_filter = len(df)
df = df[df['gold_intent'].isin(INTENTS)].copy()
filtered_count = before_filter - len(df)

if filtered_count > 0:
    print(f"\n⚠ Filtered out {filtered_count} records with non-canonical intents (e.g., 'out_of_scope')")

print(f"\n✓ Using {len(df)} records with canonical intents")
print(f"Distribution by canonical intent:")
intent_counts = df['gold_intent'].value_counts()
for intent in INTENTS:
    count = intent_counts.get(intent, 0)
    print(f"  {intent}: {count} samples")

# Ensure allowed_intents is non-empty (use all if empty)
df['allowed_intents'] = df['allowed_intents'].apply(
    lambda x: x if len(x) > 0 else INTENTS.copy()
)

# Validation check: count every canonical intent, including ones absent from the data
min_samples_per_class = intent_counts.reindex(INTENTS, fill_value=0).min()
if min_samples_per_class == 0:
    print(f"\n⚠ WARNING: Some canonical intents have 0 samples!")
    print(f"   The PoC will work but won't demonstrate all 4 intent classes.")
elif min_samples_per_class < 10:
    print(f"\n⚠ WARNING: Minimum samples per class is {min_samples_per_class}")
    print(f"   This is low but sufficient for PoC demonstration.")

print(f"\nColumns: {list(df.columns)}")
print(f"\nSample record:")
print(f"  Utterance: {df.iloc[0]['utterance']}")
print(f"  Gold intent: {df.iloc[0]['gold_intent']}")
print(f"  Allowed: {df.iloc[0]['allowed_intents']}")
print(f"  Suppressed: {df.iloc[0]['suppressed_intents']}")

✓ Loaded 614 records from c:\git\nsai_poc\level3\data\level3_intents.csv

Intent distribution after mapping to canonical names:
gold_intent
out_of_scope    169
execute         150
investigate     149
summarize       146
Name: count, dtype: int64

⚠ Filtered out 169 records with non-canonical intents (e.g., 'out_of_scope')

✓ Using 445 records with canonical intents
Distribution by canonical intent:
  investigate: 149 samples
  execute: 150 samples
  summarize: 146 samples
  ops: 0 samples

Columns: ['utterance', 'gold_intent', 'facts', 'active_constraints', 'allowed_intents', 'suppressed_intents']

Sample record:
  Utterance: why is server host123 cpu high
  Gold intent: investigate
  Allowed: ['investigate', 'summarize', 'ops']
  Suppressed: ['execute']
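
The sample record above can be turned into the binary mask that later cells feed to the logic gate. The sketch below is a standalone simplification with a hypothetical helper name (`to_mask`); it treats an intent as allowed only if it appears in `allowed` and not in `suppressed`.

```python
INTENTS = ['investigate', 'execute', 'summarize', 'ops']

def to_mask(allowed, suppressed):
    """1 where an intent is allowed and not suppressed, else 0."""
    return [1 if (intent in allowed and intent not in suppressed) else 0
            for intent in INTENTS]

# Values copied from the sample record printed above.
mask = to_mask(allowed=['investigate', 'summarize', 'ops'],
               suppressed=['execute'])
print(mask)  # [1, 0, 1, 1]
```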


In [None]:
# Cell 2: Minimal Baseline Model (L2)

# Simple TF-IDF + Logistic Regression as L2 baseline
from sklearn.linear_model import LogisticRegression

# Prepare data
X_text = df['utterance'].values
y_labels = df['gold_intent'].map(INTENT_TO_IDX).values

# Check class distribution
unique, counts = np.unique(y_labels, return_counts=True)
print(f"Class distribution in dataset:")
for intent_idx, count in zip(unique, counts):
    print(f"  {IDX_TO_INTENT[intent_idx]}: {count} samples")

# Check if we have enough samples for stratified split
min_samples = counts.min()
use_stratify = min_samples >= 2  # Need at least 2 samples per class for stratified split

# Train/test split (80/20)
if use_stratify:
    X_train_text, X_test_text, y_train, y_test, train_idx, test_idx = train_test_split(
        X_text, y_labels, np.arange(len(df)), test_size=0.2, random_state=42, stratify=y_labels
    )
    print(f"\n✓ Using stratified split")
else:
    X_train_text, X_test_text, y_train, y_test, train_idx, test_idx = train_test_split(
        X_text, y_labels, np.arange(len(df)), test_size=0.2, random_state=42
    )
    print(f"\n⚠ A class has fewer than 2 samples; using random split (not stratified)")

# Vectorize
vectorizer = TfidfVectorizer(max_features=500, ngram_range=(1, 2))
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)

# Check training set class distribution
train_unique, train_counts = np.unique(y_train, return_counts=True)
print(f"\nTraining set class distribution:")
for intent_idx, count in zip(train_unique, train_counts):
    print(f"  {IDX_TO_INTENT[intent_idx]}: {count} samples")

# Train L2 baseline model
l2_model = LogisticRegression(max_iter=200, random_state=42)
l2_model.fit(X_train, y_train)

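# Get L2 predictions on test set.
# predict_proba returns one column per class seen during fit (l2_model.classes_),
# so scatter them into a (n_test, len(INTENTS)) array; unseen intents such as 'ops' get 0.
l2_probs_test = np.zeros((X_test.shape[0], len(INTENTS)))
l2_probs_test[:, l2_model.classes_] = l2_model.predict_proba(X_test)
l2_preds_test = np.argmax(l2_probs_test, axis=1)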

# Store in dataframe for later comparison
test_df = df.iloc[test_idx].copy().reset_index(drop=True)
test_df['l2_probs'] = list(l2_probs_test)
test_df['l2_pred_idx'] = l2_preds_test
test_df['l2_pred_intent'] = [IDX_TO_INTENT[i] for i in l2_preds_test]

l2_accuracy = (l2_preds_test == y_test).mean()

print(f"\n✓ L2 baseline trained on {X_train.shape[0]} samples")
print(f"✓ Test set: {X_test.shape[0]} samples")
print(f"✓ L2 accuracy on test: {l2_accuracy:.2%}")
print(f"\nL2 represents: Pure statistical learning with no constraint awareness")

Class distribution in dataset:
  investigate: 149 samples
  execute: 150 samples
  summarize: 146 samples

✓ Using stratified split

Training set class distribution:
  investigate: 119 samples
  execute: 120 samples
  summarize: 117 samples


TypeError: sparse array length is ambiguous; use getnnz() or shape[0]
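
One detail matters for the later cells: scikit-learn's `predict_proba` returns one column per class seen during `fit`, ordered as in the model's `classes_` attribute. Because `ops` has no samples in this dataset, a classifier fit on it yields three columns, while the masking code indexes all four canonical intents. The sketch below shows one way to align the two; `classes_seen` and `prob_row` are made-up stand-ins rather than real model output.

```python
INTENTS = ['investigate', 'execute', 'summarize', 'ops']

def align_to_canonical(prob_row, classes_seen, num_classes=len(INTENTS)):
    """Place each probability at its canonical index; unseen classes get 0.0."""
    full = [0.0] * num_classes
    for class_idx, p in zip(classes_seen, prob_row):
        full[class_idx] = p
    return full

classes_seen = [0, 1, 2]      # stand-in for classes_ ('ops', index 3, never seen)
prob_row = [0.2, 0.7, 0.1]    # stand-in for one row of predict_proba output
full_row = align_to_canonical(prob_row, classes_seen)
print(full_row)  # [0.2, 0.7, 0.1, 0.0]
```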

In [None]:
# Cell 3: L2.5 (Post-hoc Masking)

def apply_l25_masking(probs: np.ndarray, allowed: List[str], suppressed: List[str]) -> np.ndarray:
    """
    Apply L2.5 post-hoc logic:
    - Zero out suppressed intents
    - If allowed_intents specified, zero out all others
    - Renormalize
    """
    masked_probs = probs.copy()
    
    # Zero suppressed
    for intent in suppressed:
        if intent in INTENT_TO_IDX:
            masked_probs[INTENT_TO_IDX[intent]] = 0.0
    
    # Zero non-allowed (if allowed list is not everything)
    if set(allowed) != set(INTENTS):
        for intent in INTENTS:
            if intent not in allowed:
                masked_probs[INTENT_TO_IDX[intent]] = 0.0
    
    # Renormalize
    total = masked_probs.sum()
    if total > 0:
        masked_probs = masked_probs / total
    else:
        # Fallback: uniform over allowed intents that are canonical and not suppressed
        masked_probs = np.zeros(len(INTENTS))
        valid = {i for i in allowed if i in INTENT_TO_IDX and i not in suppressed}
        for intent in valid:
            masked_probs[INTENT_TO_IDX[intent]] = 1.0 / len(valid)
    
    return masked_probs

# Apply L2.5 to all test predictions
l25_probs_list = []
for idx, row in test_df.iterrows():
    l2_probs = row['l2_probs']
    allowed = row['allowed_intents']
    suppressed = row['suppressed_intents']
    l25_probs = apply_l25_masking(l2_probs, allowed, suppressed)
    l25_probs_list.append(l25_probs)

l25_probs_array = np.array(l25_probs_list)
l25_preds = np.argmax(l25_probs_array, axis=1)

test_df['l25_probs'] = list(l25_probs_array)
test_df['l25_pred_idx'] = l25_preds
test_df['l25_pred_intent'] = [IDX_TO_INTENT[i] for i in l25_preds]

l25_accuracy = (l25_preds == y_test).mean()

print(f"✓ L2.5 post-hoc masking applied to {len(test_df)} test samples")
print(f"✓ L2.5 accuracy on test: {l25_accuracy:.2%}")
print(f"\nL2.5 represents: Logic applied AFTER model inference (post-processing filter)")

In [None]:
# Cell 4: Level-3 Model with Logic-Aware Gating

class Level3LogicGatedClassifier:
    """
    A simple classifier with logic-aware gating INSIDE the forward pass.
    
    Architecture:
    - TF-IDF features → Linear layer → Logits
    - Logic gate: mask logits BEFORE softmax
    - Softmax → Final probabilities
    
    The key difference from L2:
    - forward() accepts both features AND allowed_mask
    - Gating happens inside forward, not after
    """
    
    def __init__(self, input_dim: int, num_classes: int, mask_value: float = -1e9):
        self.num_classes = num_classes
        self.mask_value = mask_value
        
        # Simple linear layer (weights + bias)
        self.W = np.random.randn(input_dim, num_classes) * 0.01
        self.b = np.zeros(num_classes)
    
    def forward(self, X, allowed_mask):
        """
        Forward pass with logic-aware gating.
        
        Args:
            X: feature matrix (n_samples, input_dim)
            allowed_mask: binary mask (n_samples, num_classes)
                         1 = allowed, 0 = suppressed
        
        Returns:
            probs: probability distribution (n_samples, num_classes)
        """
        # Compute logits
        if hasattr(X, 'toarray'):  # sparse matrix
            X = X.toarray()
        logits = X @ self.W + self.b  # shape: (n_samples, num_classes)
        
        # CRITICAL: Apply logic gate BEFORE softmax
        # Mask suppressed intents with large negative value
        masked_logits = logits + (1 - allowed_mask) * self.mask_value
        
        # Softmax (numerically stable)
        exp_logits = np.exp(masked_logits - np.max(masked_logits, axis=1, keepdims=True))
        probs = exp_logits / np.sum(exp_logits, axis=1, keepdims=True)
        
        return probs
    
    def fit(self, X, y, allowed_masks, epochs=50, lr=0.01):
        """
        Simple gradient descent training.
        
        Args:
            X: features
            y: true labels (indices)
            allowed_masks: binary masks for each sample
            epochs: training iterations
            lr: learning rate
        """
        if hasattr(X, 'toarray'):
            X = X.toarray()
        
        n_samples = X.shape[0]
        
        for epoch in range(epochs):
            # Forward pass with logic gating
            probs = self.forward(X, allowed_masks)
            
            # Compute loss (cross-entropy) on the gated probabilities
            rows = np.arange(n_samples)
            loss = -np.mean(np.log(probs[rows, y] + 1e-10))

            # Gradient of the loss w.r.t. the logits: (probs - one_hot(y)) / n_samples.
            # The gate adds a constant to each logit, so it does not change this form.
            grad_logits = probs.copy()
            grad_logits[rows, y] -= 1
            grad_logits = grad_logits / n_samples

            # Backpropagate to weights and bias
            grad_W = X.T @ grad_logits
            grad_b = np.sum(grad_logits, axis=0)
            
            self.W -= lr * grad_W
            self.b -= lr * grad_b
            
            if epoch % 10 == 0:
                print(f"  Epoch {epoch}/{epochs}, Loss: {loss:.4f}")
        
        print(f"✓ Training complete")

# Prepare allowed_masks for training and test
def create_allowed_mask(allowed: List[str], suppressed: List[str]) -> np.ndarray:
    """Create binary mask: 1 for allowed intents, 0 for suppressed."""
    mask = np.ones(len(INTENTS))
    
    # Zero out suppressed
    for intent in suppressed:
        if intent in INTENT_TO_IDX:
            mask[INTENT_TO_IDX[intent]] = 0
    
    # If allowed list specified, zero non-allowed
    if set(allowed) != set(INTENTS):
        for intent in INTENTS:
            if intent not in allowed:
                mask[INTENT_TO_IDX[intent]] = 0
    
    return mask

# Build masks for train and test
train_df_full = df.iloc[train_idx].copy()
train_masks = np.array([
    create_allowed_mask(row['allowed_intents'], row['suppressed_intents'])
    for _, row in train_df_full.iterrows()
])

test_masks = np.array([
    create_allowed_mask(row['allowed_intents'], row['suppressed_intents'])
    for _, row in test_df.iterrows()
])

# Train L3 model
print("\n🔧 Training Level-3 model with logic-aware gating...")
l3_model = Level3LogicGatedClassifier(input_dim=X_train.shape[1], num_classes=len(INTENTS))
l3_model.fit(X_train, y_train, train_masks, epochs=50, lr=0.1)

# Get L3 predictions on test set
l3_probs_test = l3_model.forward(X_test, test_masks)
l3_preds_test = np.argmax(l3_probs_test, axis=1)

test_df['l3_probs'] = list(l3_probs_test)
test_df['l3_pred_idx'] = l3_preds_test
test_df['l3_pred_intent'] = [IDX_TO_INTENT[i] for i in l3_preds_test]

l3_accuracy = (l3_preds_test == y_test).mean()

print(f"\n✓ L3 accuracy on test: {l3_accuracy:.2%}")
print(f"\nL3 represents: Logic gating INSIDE forward pass (constraint-aware learning)")

In [None]:
# Cell 5: Comparison Metrics (L2 vs L2.5 vs L3)

def is_violating(pred_intent: str, allowed: List[str], suppressed: List[str]) -> bool:
    """Check if predicted intent violates constraints."""
    if pred_intent in suppressed:
        return True
    if set(allowed) != set(INTENTS) and pred_intent not in allowed:
        return True
    return False

# Compute violation rates
test_df['l2_violates'] = test_df.apply(
    lambda row: is_violating(row['l2_pred_intent'], row['allowed_intents'], row['suppressed_intents']),
    axis=1
)
test_df['l25_violates'] = test_df.apply(
    lambda row: is_violating(row['l25_pred_intent'], row['allowed_intents'], row['suppressed_intents']),
    axis=1
)
test_df['l3_violates'] = test_df.apply(
    lambda row: is_violating(row['l3_pred_intent'], row['allowed_intents'], row['suppressed_intents']),
    axis=1
)

l2_violation_rate = test_df['l2_violates'].mean()
l25_violation_rate = test_df['l25_violates'].mean()
l3_violation_rate = test_df['l3_violates'].mean()

# Intent flip rates
l2_to_l25_flips = (test_df['l2_pred_intent'] != test_df['l25_pred_intent']).sum()
l2_to_l3_flips = (test_df['l2_pred_intent'] != test_df['l3_pred_intent']).sum()
l25_to_l3_flips = (test_df['l25_pred_intent'] != test_df['l3_pred_intent']).sum()

# Gold agreement
test_df['l2_correct'] = test_df['l2_pred_intent'] == test_df['gold_intent']
test_df['l25_correct'] = test_df['l25_pred_intent'] == test_df['gold_intent']
test_df['l3_correct'] = test_df['l3_pred_intent'] == test_df['gold_intent']

l2_acc = test_df['l2_correct'].mean()
l25_acc = test_df['l25_correct'].mean()
l3_acc = test_df['l3_correct'].mean()

# Print comparison table
print("="*70)
print("LEVEL COMPARISON: L2 vs L2.5 vs L3")
print("="*70)
print(f"\nTest set size: {len(test_df)} samples\n")

print(f"{'Metric':<35} {'L2':>10} {'L2.5':>10} {'L3':>10}")
print("-"*70)
print(f"{'Constraint Violation Rate':<35} {l2_violation_rate:>9.1%} {l25_violation_rate:>9.1%} {l3_violation_rate:>9.1%}")
print(f"{'Gold Intent Accuracy':<35} {l2_acc:>9.1%} {l25_acc:>9.1%} {l3_acc:>9.1%}")
print("")
print(f"Intent flips from L2 → L2.5: {l2_to_l25_flips} ({l2_to_l25_flips/len(test_df):.1%})")
print(f"Intent flips from L2 → L3:   {l2_to_l3_flips} ({l2_to_l3_flips/len(test_df):.1%})")
print(f"Intent flips from L2.5 → L3: {l25_to_l3_flips} ({l25_to_l3_flips/len(test_df):.1%})")
print("\n" + "="*70)

print("\n📊 Key Observations:")
print(f"  • L2 violation rate shows baseline constraint-unawareness")
print(f"  • L2.5 should have zero top-1 violations (suppressed probabilities are zeroed before argmax)")
print(f"  • L3 should have EXACTLY zero violations (structural prevention)")
print(f"  • Accuracy differences reveal trade-offs between constraint adherence and prediction quality")

In [None]:
# Cell 6: Side-by-side Examples (Teaching Moment)

# Find examples where L2 violated but L2.5/L3 corrected
violations = test_df[test_df['l2_violates'] == True].copy()

if len(violations) > 0:
    print("="*90)
    print("CONCRETE EXAMPLES: Where L2 violated constraints")
    print("="*90)
    
    # Show first 5 violations (deterministic)
    sample = violations.head(5)
    
    for idx, row in sample.iterrows():
        print(f"\n{'─'*90}")
        print(f"Utterance: {row['utterance'][:70]}..." if len(row['utterance']) > 70 else f"Utterance: {row['utterance']}")
        print(f"Gold intent: {row['gold_intent']}")
        print(f"Suppressed: {row['suppressed_intents']}")
        if set(row['allowed_intents']) != set(INTENTS):
            print(f"Allowed: {row['allowed_intents']}")
        print()
        
        # L2 prediction
        l2_probs_dict = {INTENTS[i]: row['l2_probs'][i] for i in range(len(INTENTS))}
        l2_top3 = sorted(l2_probs_dict.items(), key=lambda x: x[1], reverse=True)[:3]
        print(f"L2 (baseline):")
        print(f"  Top-1: {row['l2_pred_intent']} {'❌ VIOLATES' if row['l2_violates'] else '✓'}")
        print(f"  Probs: {', '.join([f'{k}={v:.3f}' for k, v in l2_top3])}")
        
        # L2.5 prediction
        l25_probs_dict = {INTENTS[i]: row['l25_probs'][i] for i in range(len(INTENTS))}
        l25_top3 = sorted(l25_probs_dict.items(), key=lambda x: x[1], reverse=True)[:3]
        print(f"\nL2.5 (post-hoc masking):")
        print(f"  Top-1: {row['l25_pred_intent']} {'❌ VIOLATES' if row['l25_violates'] else '✓ fixed'}")
        print(f"  Probs: {', '.join([f'{k}={v:.3f}' for k, v in l25_top3])}")
        
        # L3 prediction
        l3_probs_dict = {INTENTS[i]: row['l3_probs'][i] for i in range(len(INTENTS))}
        l3_top3 = sorted(l3_probs_dict.items(), key=lambda x: x[1], reverse=True)[:3]
        print(f"\nL3 (logic-aware gating):")
        print(f"  Top-1: {row['l3_pred_intent']} {'❌ VIOLATES' if row['l3_violates'] else '✓ prevented'}")
        print(f"  Probs: {', '.join([f'{k}={v:.3f}' for k, v in l3_top3])}")
    
    print(f"\n{'─'*90}")
else:
    print("\n✓ L2 produced no constraint violations on this test set.")
    print("(This can happen if the dataset's constraints align naturally with training patterns)")

print("\n📚 Teaching Takeaway:")
print("  • L2: Can confidently predict INVALID intents (suppressed or disallowed)")
print("  • L2.5: Fixes violations AFTER prediction by zeroing and renormalizing")
print("  • L3: Never forms invalid predictions; logic gates prevent them structurally")

# Level-3 Conclusion

## What L3 proved in this PoC

**Logic can be embedded inside the model's forward pass.** By applying a logic gate before the softmax activation, we structurally prevent the model from predicting suppressed intents. This is fundamentally different from post-hoc filtering.

**L3 violation rate should be exactly zero.** Unlike L2 (which can violate freely) and L2.5 (which corrects after the fact), L3 models cannot produce invalid top-1 predictions by construction. The logic gate masks suppressed logits with large negative values before softmax, ensuring they receive near-zero probability.
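
The zero-violation claim can be checked empirically with a small randomized test. This is a sketch under stated assumptions: the logits and masks are random stand-ins, the gate is the additive `-1e9` mask described above, and every mask keeps at least one allowed intent (if every intent is suppressed, no valid prediction exists and the guarantee does not apply).

```python
import math
import random

def gated_softmax(logits, mask, mask_value=-1e9):
    gated = [z + (1 - m) * mask_value for z, m in zip(logits, mask)]
    peak = max(gated)
    exps = [math.exp(z - peak) for z in gated]
    total = sum(exps)
    return [e / total for e in exps]

random.seed(0)
trials = 1000
violations = 0
for _ in range(trials):
    logits = [random.uniform(-5.0, 5.0) for _ in range(4)]
    mask = [random.randint(0, 1) for _ in range(4)]
    if sum(mask) == 0:
        mask[random.randrange(4)] = 1  # keep at least one allowed intent
    probs = gated_softmax(logits, mask)
    top = probs.index(max(probs))
    if mask[top] == 0:
        violations += 1

print(violations)  # 0
```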

**The model learns with constraint awareness.** During training, gradients flow through the logic-gated outputs. This means:
- The model's loss function only sees valid predictions
- The model learns representations that work within the logical constraints
- Invalid reasoning paths are not reinforced during learning

## What L3 did not prove

**This is a minimal PoC, not a production system.** We used:
- A simple linear classifier (TF-IDF + logistic regression equivalent)
- Basic gradient descent training
- Binary masks (allowed/suppressed only)

Real Level-3 systems would involve:
- More sophisticated architectures (transformers, graph neural networks)
- Richer logical constraints (temporal dependencies, multi-step reasoning)
- Differentiable logic layers that can learn constraint parameters

**We did not prove L3 always improves accuracy.** Constraint enforcement can reduce the model's flexibility, potentially lowering accuracy on edge cases. The trade-off between constraint adherence and predictive performance depends on:
- How well constraints align with the true data distribution
- Whether the model has enough capacity to learn valid patterns
- The quality and coverage of the constraint specifications
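
The first point can be made concrete with the extreme case in which a constraint suppresses the gold label itself. In the sketch below (hypothetical logits, same additive gate and the same `1e-10` epsilon used in this notebook's `fit`), the gated model cannot predict the gold intent whatever its weights are, and the per-sample loss sits at its ceiling of `-log(1e-10)`, roughly 23.03.

```python
import math

def gated_softmax(logits, mask, mask_value=-1e9):
    gated = [z + (1 - m) * mask_value for z, m in zip(logits, mask)]
    peak = max(gated)
    exps = [math.exp(z - peak) for z in gated]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.2, 4.0, -0.3, 0.1]   # hypothetical scores strongly favoring index 1
mask = [1, 0, 1, 1]              # index 1 is suppressed
gold = 1                         # the gold label is the suppressed intent

probs = gated_softmax(logits, mask)
loss = -math.log(probs[gold] + 1e-10)
top = probs.index(max(probs))
print(top, round(loss, 2))
```

Samples like this count as errors for L3 (and for L2.5) even when the unconstrained model would have been right, which is why constraint quality bounds achievable accuracy.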

**We did not demonstrate constraint learning.** In this PoC, constraints were provided as fixed masks. True neuro-symbolic AI might:
- Learn constraint parameters from data
- Discover latent logical structure
- Adapt constraints based on context

## Why this is structurally different from L2.5

**Timing of logic application:**
- **L2.5**: `model(x) → raw_probs` → `apply_logic(raw_probs) → final_probs`
- **L3**: `model(x, constraints) → final_probs` (logic inside forward)
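
It is worth noting what the timing difference does and does not change. For a fixed set of logits, zeroing and renormalizing after softmax gives the same distribution as masking before softmax, so a single inference call looks identical under both schemes. The sketch below (illustrative names and values) checks this numerically; the practical difference lies in which probabilities the training loss is computed from.

```python
import math

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def posthoc(logits, mask):
    """L2.5 style: softmax first, then zero and renormalize."""
    kept = [p * m for p, m in zip(softmax(logits), mask)]
    total = sum(kept)
    return [p / total for p in kept]

def gated(logits, mask, mask_value=-1e9):
    """L3 style: mask the logits, then softmax."""
    return softmax([z + (1 - m) * mask_value for z, m in zip(logits, mask)])

logits = [1.0, 3.0, 0.5, -1.0]   # hypothetical scores from one shared model
mask = [1, 0, 1, 1]
l25_probs = posthoc(logits, mask)
l3_probs = gated(logits, mask)
max_gap = max(abs(a - b) for a, b in zip(l25_probs, l3_probs))
print(max_gap < 1e-12)  # True
```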

**Gradient flow:**
- **L2.5**: Gradients flow through unconstrained predictions; logic is a post-process that never enters the training loss
- **L3**: Gradients flow through constrained predictions; logic participates in learning

**Representation learning:**
- **L2.5**: Model learns features without constraint awareness; may waste capacity on invalid patterns
- **L3**: Model learns features that align with constraints; representations are structurally informed by logic

**Architectural commitment:**
- **L2.5**: Logic is external; can be added/removed without retraining
- **L3**: Logic is embedded; model architecture explicitly includes constraint handling

---

## Final Verdict

In [None]:
# Cell 7: Final Verdict

print("="*70)
print("LEVEL-3 POC COMPLETE")
print("="*70)
print()
print("We demonstrated:")
print("  ✓ L2: Pure statistical learning (no constraint awareness)")
print("  ✓ L2.5: Post-hoc logic filtering (constraints applied AFTER inference)")
print("  ✓ L3: Logic-aware gating (constraints embedded INSIDE forward pass)")
print()
print("Key architectural distinction:")
print("  L2.5 corrects invalid outputs after they form")
print("  L3 prevents invalid outputs from forming")
print()
print("This is the foundation of neuro-symbolic AI:")
print("  Logic is not a post-processing step")
print("  Logic is a structural component of the model")
print("="*70)