# IVNTR Bilevel Learning: Neural Predicate Invention

This notebook explores the core IVNTR algorithm implemented in `predicators/approaches/bilevel_learning_approach.py`. IVNTR (bIleVel learNing from TRansitions) learns neural predicates that enable symbolic planning from demonstration data.

## The Core Algorithm

IVNTR implements a **bilevel learning approach** where:

1. **Upper Level (Symbolic)**: Searches over candidate predicate effects and constructs NSRTs
2. **Lower Level (Neural)**: Learns predicate classifiers with neural networks of **any** architecture
3. **Alternating Process**: The two levels guide each other through iterative refinement

For each predicate, IVNTR alternates between:
- **Effect Discovery**: Searching for high-level symbolic effects the predicate should have
- **Grounding Learning**: Training neural networks to classify when the predicate holds
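
The alternation can be pictured with a toy sketch. Everything in it is hypothetical and not taken from `bilevel_learning_approach.py`: `toy_lower_level_loss` stands in for neural training by counting transitions that a candidate effect vector cannot explain, and `toy_upper_level_search` brute-forces every binary add-effect candidate instead of running IVNTR's actual search.

```python
from itertools import product

# Hypothetical toy data: (action index, did the relevant low-level feature change?)
TOY_TRANSITIONS = [(0, True), (1, False), (2, False), (0, True), (1, False)]
NUM_ACTIONS = 3


def toy_lower_level_loss(effect_vector, transitions):
    """Stand-in for neural training: fraction of transitions left unexplained."""
    mismatches = 0
    for action_idx, feature_changed in transitions:
        predicts_change = effect_vector[action_idx] != 0
        if predicts_change != feature_changed:
            mismatches += 1
    return mismatches / len(transitions)


def toy_upper_level_search(transitions, num_actions):
    """Enumerate candidate effect vectors and keep the one with the lowest loss."""
    best_vector, best_loss = None, float("inf")
    for candidate in product([0, 1], repeat=num_actions):
        if not any(candidate):
            continue  # skip the all-zero candidate in this toy
        loss = toy_lower_level_loss(candidate, transitions)
        if loss < best_loss:
            best_vector, best_loss = candidate, loss
    return best_vector, best_loss


best_vector, best_loss = toy_upper_level_search(TOY_TRANSITIONS, NUM_ACTIONS)
print(best_vector, best_loss)
```

In this toy, the lower-level loss is the only signal the upper level uses to rank candidates, which is the sense in which the two levels guide each other.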

## Tutorial Focus: "HasChemX" Predicate

In this tutorial, we'll examine the simplest predicate invention process using the "HasChemX" predicate:
- **Purpose**: Determines if an object has been shot with Chemical X
- **Input**: Continuous feature vectors of satellites and objects (position, chemical states, ...)
- **Output**: Binary classification (has ChemX or not)
- **Role**: Required precondition for camera readings

## What We'll Cover

1. **Dummy NSRTs**: Starting point with incomplete effect knowledge.
2. **Demonstration Data**: Training examples from oracle trajectories.
3. **Neural Learning Process**: Given a perfect action-effect (AE) vector, how to train a neural network (single iteration).
4. **Symbolic Search Process**: When the AE vector is unknown, how to use the neural loss to guide the search (multiple iterations).
5. **Advanced Highlights**: Binary predicates, variable bindings, quantifiers, and more.

Let's dive in!

## 1. Setting Up the Bilevel Learning Environment

We'll recreate the bilevel learning setup from `bilevel_learning_approach.py`, focusing on the key components that IVNTR needs for predicate invention.

In [57]:
import sys
import os
import logging

# Add the project root to path
sys.path.append('..')

# Set FastDownward path
FD_EXEC_PATH = os.path.join(os.path.dirname(os.path.abspath('.')), 'ext', 'downward')
os.environ['FD_EXEC_PATH'] = FD_EXEC_PATH

# Import IVNTR components
from predicators.envs.satellites import SatellitesEnv
from predicators.ground_truth_models import get_dummy_nsrts, get_gt_options
from predicators.datasets.demo_only import _generate_demonstrations
from predicators.approaches import create_approach
from predicators import utils
from predicators.settings import CFG

config_path = os.path.abspath("example_gt_vec.yaml") 
log_file_path = os.path.abspath("example_gt_vec_learning.log")
neupi_save_path = os.path.abspath("saved_neural")

print("✅ All imports successful!")

# Test the configuration section
utils.reset_config({
    "device": "cpu",
    "env": "satellites",
    "approach": "ivntr",
    "neupi_pred_config": config_path,
    "neupi_gt_ae_matrix": True,
    "excluded_predicates": "ViewClear,IsCalibrated,HasChemX,HasChemY,Sees",
    "neupi_do_normalization": True,
    "num_train_tasks": 50,
    "num_test_tasks": 20,
    "seed": 0,
    "bilevel_plan_without_sim": False,
    "exclude_domain_feat": None,
    "log_file": log_file_path,
})

handlers = [logging.StreamHandler()]
handlers.append(logging.FileHandler(CFG.log_file, mode='w'))
logging.basicConfig(level=logging.INFO,
                    format="%(message)s",
                    handlers=handlers,
                    force=True)

CFG.seed = 42
CFG.num_train_tasks = 50  # Generate 50 demonstration trajectories
CFG.satellites_num_sat_train = [2, 3]
CFG.satellites_num_obj_train = [2, 3]
CFG.timeout = 10.0
CFG.demonstrator = "oracle"
CFG.max_initial_demos = 50

print("✅ IVNTR Bilevel Learning Setup Complete!")
print(f"Environment: {CFG.env}")
print(f"Training tasks: {CFG.num_train_tasks}")
print(f"Demonstration trajectories: {CFG.max_initial_demos}")

✅ All imports successful!
✅ IVNTR Bilevel Learning Setup Complete!
Environment: satellites
Training tasks: 50
Demonstration trajectories: 50


## 2. Understanding Dummy NSRTs

IVNTR starts with **dummy NSRTs** - symbolic operators with incomplete knowledge. These are defined in `predicators/ground_truth_models/satellites/dummy_nsrts.py` and serve as the starting point for predicate invention.

### Key Characteristics of Dummy NSRTs:
- **Known preconditions**: Static predicates we can observe (e.g., `HasCamera`, `ShootsChemX`)
- **Unknown effects**: Dynamic predicates we need to learn (e.g., `HasChemX`, `IsCalibrated`)
- **Placeholder structure**: Provides the symbolic scaffolding for learning
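
As a rough mental model, a dummy operator can be thought of as a record whose preconditions are filled in and whose effects may be empty. The `ToyOperator` class below is a hypothetical stand-in written for this tutorial; it is not the NSRT class from `predicators`, and atoms are plain strings rather than real predicate objects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyOperator:
    """Hypothetical stand-in for an NSRT; not the predicators class."""
    name: str
    parameters: tuple
    preconditions: frozenset = field(default_factory=frozenset)
    add_effects: frozenset = field(default_factory=frozenset)

    def has_unknown_effects(self):
        """An operator with no add effects still needs its effects learned."""
        return len(self.add_effects) == 0


calibrate = ToyOperator(
    name="Calibrate",
    parameters=("?sat", "?obj"),
    preconditions=frozenset({"CalibrationTarget(?sat, ?obj)"}),
)
take_camera_reading = ToyOperator(
    name="TakeCameraReading",
    parameters=("?sat", "?obj"),
    preconditions=frozenset({"HasCamera(?sat)"}),
    add_effects=frozenset({"CameraReadingTaken(?sat, ?obj)"}),
)

# Operators whose effects are still missing are the ones IVNTR has to complete.
to_learn = [op.name for op in (calibrate, take_camera_reading)
            if op.has_unknown_effects()]
print(to_learn)
```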

Let's examine these dummy NSRTs and understand what IVNTR needs to learn.

In [58]:
# Test the dummy NSRTs section
# Create environment and load dummy NSRTs
env = SatellitesEnv(use_gui=False)
train_tasks = env.get_train_tasks()

# Get ground truth options (needed for dummy NSRTs)
options = get_gt_options(env.get_name())

# Load dummy NSRTs (the incomplete starting point)
# get_dummy_nsrts takes: env_name, predicates_to_keep, options_to_keep
dummy_nsrts = get_dummy_nsrts(env.get_name(), env.predicates, options)

print(f"🧩 Dummy NSRTs: {len(dummy_nsrts)} operators\n")

# Test just the first few NSRTs to verify structure
for nsrt in sorted(list(dummy_nsrts)[:3], key=lambda x: x.name):
    print(f"📋 {nsrt.name}")
    print(f"   Parameters: {[str(p) for p in nsrt.parameters]}")
    
    if nsrt.preconditions:
        print(f"   Known Preconditions:")
        for pre in nsrt.preconditions:
            print(f"     ✓ {pre}")
    else:
        print(f"   Known Preconditions: None")
    
    if nsrt.add_effects:
        print(f"   Known Add Effects:")
        for eff in nsrt.add_effects:
            print(f"     + {eff}")
    else:
        print(f"   Known Add Effects: None (TO BE LEARNED!)")
    
    print()

print("✅ Dummy NSRTs loaded successfully!")

🧩 Dummy NSRTs: 8 operators

📋 Calibrate
   Parameters: ['?sat:satellite', '?obj:object']
   Known Preconditions:
     ✓ CalibrationTarget(?sat:satellite, ?obj:object)
   Known Add Effects: None (TO BE LEARNED!)

📋 ShootChemX
   Parameters: ['?sat:satellite', '?obj:object']
   Known Preconditions:
     ✓ ShootsChemX(?sat:satellite)
   Known Add Effects: None (TO BE LEARNED!)

📋 TakeCameraReading
   Parameters: ['?sat:satellite', '?obj:object']
   Known Preconditions:
     ✓ HasCamera(?sat:satellite)
   Known Add Effects:
     + CameraReadingTaken(?sat:satellite, ?obj:object)

✅ Dummy NSRTs loaded successfully!


## 3. Generating Demonstration Data

IVNTR learns from demonstration trajectories collected using the oracle approach. These trajectories contain the training signal for both effect discovery and neural predicate learning.

Let's generate 50 demonstration trajectories using the process from `03_demo_trajectories.ipynb`.

In [59]:
# Generate demonstrations for every training task
print("🎬 Generating Demonstration Trajectories...\n")

# This replicates the _generate_demonstrations function from demo_only.py
# Note: We need to provide all required arguments
training_tasks = [task.task for task in train_tasks]  # All training tasks (50 here)
dataset = _generate_demonstrations(
    env,
    training_tasks,
    options,          # Set of ground truth options
    0,               # train_tasks_start_idx 
    False            # annotate_with_gt_ops
)

print(f"✅ Dataset Generated Successfully!")
print(f"   Total trajectories: {len(dataset.trajectories)}")
print(f"   All demonstrations: {all(traj.is_demo for traj in dataset.trajectories)}")

# Analyze the demonstration data
total_steps = sum(len(traj.actions) for traj in dataset.trajectories)
avg_length = total_steps / len(dataset.trajectories) if dataset.trajectories else 0
successful_demos = sum(1 for traj in dataset.trajectories 
                      if traj.train_task_idx is not None)

print(f"\n📊 Dataset Statistics:")
print(f"   Total action steps: {total_steps}")
print(f"   Average trajectory length: {avg_length:.1f} steps")
print(f"   Successful demonstrations: {successful_demos}/{len(dataset.trajectories)}")

print("✅ Demonstration generation successful!")

# Sample a trajectory to examine structure (if we have any)
if dataset.trajectories:
    sample_traj = dataset.trajectories[0]
    sample_task = train_tasks[sample_traj.train_task_idx]

    print(f"\n🔍 Sample Trajectory Analysis:")
    print(f"   Task index: {sample_traj.train_task_idx}")
    print(f"   Length: {len(sample_traj.states)} states, {len(sample_traj.actions)} actions")
    print(f"   Goal: {sample_task.goal}")
    print(f"   Action sequence preview:")

    for i, action in enumerate(sample_traj.actions[:5]):
        if hasattr(action, '_option') and action._option:
            print(f"     Step {i}: {action.get_option().name} with params {action.get_option().params}")
        else:
            print(f"     Step {i}: Raw action {action.arr[:4]}...")

    if len(sample_traj.actions) > 5:
        print(f"     ... ({len(sample_traj.actions)-5} more actions)")
else:
    print("\n⚠️ No trajectories generated - check configuration")

print("\n💡 This demonstration data provides the training signal for IVNTR's bilevel learning!")

Using 8 NSRTs: {NSRT-TakeInfraredReading:
    Parameters: [?sat:satellite, ?obj:object]
    Preconditions: [HasChemY(?obj:object), HasInfrared(?sat:satellite), IsCalibrated(?sat:satellite), Sees(?sat:satellite, ?obj:object)]
    Add Effects: [InfraredReadingTaken(?sat:satellite, ?obj:object)]
    Delete Effects: []
    Ignore Effects: []
    Option Spec: UseInfraRed(?sat:satellite, ?obj:object), NSRT-ShootChemY:
    Parameters: [?sat:satellite, ?obj:object]
    Preconditions: [Sees(?sat:satellite, ?obj:object), ShootsChemY(?sat:satellite)]
    Add Effects: [HasChemY(?obj:object)]
    Delete Effects: []
    Ignore Effects: []
    Option Spec: ShootChemY(?sat:satellite, ?obj:object), NSRT-MoveTo:
    Parameters: [?sat:satellite, ?obj:object]
    Preconditions: [ViewClear(?sat:satellite)]
    Add Effects: [Sees(?sat:satellite, ?obj:object)]
    Delete Effects: [ViewClear(?sat:satellite)]
    Ignore Effects: []
    Option Spec: MoveTo(?sat:satellite, ?obj:object), NSRT-TakeCameraReading:
 

HasChemX(0) ([Type(name='object')]): [0, 0, 0, 1, 0, 0, 0, 0]
IsCalibrated(0) ([Type(name='satellite')]): [1, 0, 0, 0, 0, 0, 0, 0]
GeigerReadingTaken(0, 0) ([Type(name='satellite'), Type(name='object')]): [0, 0, 0, 0, 0, 0, 1, 0]
[CogMan] Reset called.


🎬 Generating Demonstration Trajectories...



[CogMan] Finishing episode.

✅ Dataset Generated Successfully!
   Total trajectories: 50
   All demonstrations: True

📊 Dataset Statistics:
   Total action steps: 637
   Average trajectory length: 12.7 steps
   Successful demonstrations: 50/50
✅ Demonstration generation successful!

🔍 Sample Trajectory Analysis:
   Task index: 0
   Length: 11 states, 10 actions
   Goal: {CameraReadingTaken(sat0:satellite, obj2:object), InfraredReadingTaken(sat1:satellite, obj2:object)}
   Action sequence preview:
     Step 0: MoveTo with params [0.0521853  0.65423506]
     Step 1: Calibrate with params []
     Step 2: ShootChemY with params []
     Step 3: UseInfraRed with params []
     Step 4: MoveTo with params [0.7846738 0.7001005]
     ... (5 more actions)

💡 This demonstration data provides the training signal for IVNTR's bilevel learning!


## 4. Neural Predicate Training: "IsCalibrated" Example

Now we'll dive into the core of IVNTR's neural predicate learning using the "IsCalibrated" predicate as our example. This predicate is crucial because:

- **Purpose**: Determines whether a satellite has been calibrated and is ready to take readings
- **Input**: Satellite features (position, calibration state, instrument type)
- **Output**: Binary classification (calibrated or not)
- **Role**: Required precondition for all reading operations

We'll demonstrate how IVNTR learns this predicate given a known **effect vector** that specifies which actions add/delete the predicate.
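
To make "binary classification from continuous features" concrete, here is a deliberately tiny sketch. It is not IVNTR's model: the configuration logged by this notebook mentions an MLP trained with AdamW, whereas this example swaps in a single sigmoid unit trained by plain gradient descent on made-up one-dimensional data, so that it runs with the standard library alone.

```python
import math

# Hypothetical toy data: one scalar feature per satellite, label 1 if "calibrated".
features = [0.05, 0.10, 0.20, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_single_unit(xs, ys, lr=1.0, epochs=2000):
    """Fit w, b of sigmoid(w * x + b) by gradient descent on cross-entropy."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean cross-entropy with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train_single_unit(features, labels)
# Threshold the predicted probability at 0.5 to obtain a truth value.
predictions = [int(sigmoid(w * x + b) > 0.5) for x in features]
print(predictions)
```

The thresholding step at the end is what turns a continuous network output into a symbolic predicate value that a planner can use.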

In [60]:
# Step 1: Create the IVNTR Bilevel Learning Approach
print("🧠 Creating IVNTR Bilevel Learning Approach...\n")

# Create the approach using the same process as main.py
approach = create_approach(
    CFG.approach,          # "ivntr"
    env.predicates,        # Initial predicates 
    options,               # Ground truth options
    env.types,             # Environment types
    env.action_space,      # Action space
    training_tasks         # Training tasks
)

print(f"✅ Approach created: {type(approach).__name__}")
print(f"   Approach name: {approach.get_name()}")

# Step 2: Examine the sorted options for effect vector understanding
print(f"\n📦 Action Options in IVNTR (sorted order):")
sorted_options = approach._sorted_options
for i, option in enumerate(sorted_options):
    print(f"   {i:2d}: {option.name}")

print(f"\n🎯 Understanding Effect Vectors:")
print(f"   Effect vectors are {len(sorted_options)}-dimensional binary vectors")
print(f"   Each position corresponds to an action option above")
print(f"   Value 1 = action ADDS the predicate")
print(f"   Value 0 = action has no effect on the predicate") 
print(f"   Value -1 = action DELETES the predicate")

print(f"\n💡 This ordering is crucial for interpreting effect vectors in example_gt_vec.yaml!")

Options (Clusters): Calibrate, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): MoveAway, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): MoveTo, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): ShootChemX, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): ShootChemY, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): UseCamera, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): UseGeiger, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): UseInfraRed, Arguments: [Type(name='satellite'), Type(name='object')]
Using: satellite_0
Predicate Type TOBE Invented: neural_u_p1
Learning Conifg: {'name': 'neural_u_p1', 'types': ['satellite'], 'gt': [[1, 0, 0, 0, 0, 0, 0, 0]], 'ent_idx': [0], 'architecture': {'type': 'MLP', 'layer_size': 32, 'initializer': 'xavier'}, 'optimizer': {'type': 'AdamW', 'kwargs': {'l

🧠 Creating IVNTR Bilevel Learning Approach...

✅ Approach created: BilevelLearningApproach
   Approach name: ivntr

📦 Action Options in IVNTR (sorted order):
    0: Calibrate
    1: MoveAway
    2: MoveTo
    3: ShootChemX
    4: ShootChemY
    5: UseCamera
    6: UseGeiger
    7: UseInfraRed

🎯 Understanding Effect Vectors:
   Effect vectors are 8-dimensional binary vectors
   Each position corresponds to an action option above
   Value 1 = action ADDS the predicate
   Value 0 = action has no effect on the predicate
   Value -1 = action DELETES the predicate

💡 This ordering is crucial for interpreting effect vectors in example_gt_vec.yaml!


### Understanding the Effect Vector for IsCalibrated

The effect vector `[1, 0, 0, 0, 0, 0, 0, 0]` from `example_gt_vec.yaml` tells us:

1. **One action (position 0, Calibrate) ADDS IsCalibrated** (value 1)
2. **All other actions (positions 1-7) have no effect** on IsCalibrated (value 0)

No entry is -1, so no action deletes IsCalibrated:
- **Calibrate action** makes IsCalibrated true
- **Use actions** (UseGeiger, UseCamera, UseInfraRed) require IsCalibrated but leave it unchanged; for example, the `TakeInfraredReading` NSRT printed earlier lists it as a precondition and has no delete effects
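
Reading an effect vector by hand is error-prone, so a small decoder helps. The helper `decode_effect_vector` below is hypothetical (it does not exist in `predicators`); it only applies the value convention printed by the notebook (1 adds, 0 no effect, -1 deletes) to the sorted option order shown above.

```python
# Option order copied from the sorted list printed by the notebook.
SORTED_OPTION_NAMES = [
    "Calibrate", "MoveAway", "MoveTo", "ShootChemX",
    "ShootChemY", "UseCamera", "UseGeiger", "UseInfraRed",
]


def decode_effect_vector(effect_vector, option_names):
    """Split an effect vector into the options that add or delete the predicate."""
    if len(effect_vector) != len(option_names):
        raise ValueError("effect vector and option list must have equal length")
    adds = [name for name, v in zip(option_names, effect_vector) if v == 1]
    deletes = [name for name, v in zip(option_names, effect_vector) if v == -1]
    return adds, deletes


# IsCalibrated vector from example_gt_vec.yaml.
adds, deletes = decode_effect_vector([1, 0, 0, 0, 0, 0, 0, 0], SORTED_OPTION_NAMES)
print("IsCalibrated adds:", adds, "deletes:", deletes)

# HasChemX vector as printed in the demonstration log.
chem_adds, chem_deletes = decode_effect_vector(
    [0, 0, 0, 1, 0, 0, 0, 0], SORTED_OPTION_NAMES)
print("HasChemX adds:", chem_adds, "deletes:", chem_deletes)
```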

### The Learning Process

Given this effect vector, IVNTR will:
1. **Generate training labels** using the effect vector and demonstration trajectories
2. **Train a neural network** to predict IsCalibrated from satellite features  
3. **Validate the classifier** on held-out data
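
Step 1 can be illustrated with a sketch of how an effect vector might turn executed actions into supervision. This is one plausible reading, not a description of the actual implementation: the function `transition_targets` is hypothetical, it assumes an add effect implies the predicate was false before the action, and it ignores which satellite each action was applied to.

```python
def transition_targets(option_sequence, effect_by_option):
    """Map each executed option to a (before, after) target for the predicate.

    (None, None) means the value is not pinned down, only that it is
    unchanged across the step.
    """
    targets = []
    for option_name in option_sequence:
        effect = effect_by_option.get(option_name, 0)
        if effect == 1:
            targets.append((0, 1))        # assumed: false before, true after
        elif effect == -1:
            targets.append((1, 0))        # assumed: true before, false after
        else:
            targets.append((None, None))  # unchanged across this step
    return targets


# Non-zero entries of the vector [1, 0, 0, 0, 0, 0, 0, 0], keyed by option name.
effect_by_option = {"Calibrate": 1}
# First four steps of the sample trajectory shown earlier in the notebook.
sequence = ["MoveTo", "Calibrate", "ShootChemY", "UseInfraRed"]
targets = transition_targets(sequence, effect_by_option)
for name, target in zip(sequence, targets):
    print(name, target)
```

Only the Calibrate step yields a hard label pair here; the remaining steps contribute a consistency constraint rather than a fixed truth value.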

Let's see this process in action!

In [61]:
# Test the updated effect vector analysis
print("🔍 Analyzing Updated IsCalibrated Effect Vector...\n")

# From updated example_gt_vec.yaml: gt_ae_vecs: [[1, 0, 0, 0, 0, 0, 0, 0]]
# This shows the Calibrate action ADDS IsCalibrated
updated_effect_vector = [1, 0, 0, 0, 0, 0, 0, 0]

print(f"📋 Updated IsCalibrated Effect Vector: {updated_effect_vector}")
print(f"\n📊 Effect Distribution Analysis:")

# Analyze each action's effect on IsCalibrated with the updated vector
for i, (option, effect) in enumerate(zip(sorted_options, updated_effect_vector)):
    effect_str = {1: "ADDS", 0: "no effect", -1: "DELETES"}[effect]
    if effect != 0:
        print(f"   {i:2d}: {option.name:<15} → {effect_str} IsCalibrated ⭐")
    else:
        print(f"   {i:2d}: {option.name:<15} → {effect_str} IsCalibrated")

print(f"\n🎯 Key Insight: Calibrate action (position 0) ADDS IsCalibrated!")
print(f"   This makes perfect sense - calibration sets the satellite as ready")
print(f"   All other actions have no effect on IsCalibrated state")

🔍 Analyzing Updated IsCalibrated Effect Vector...

📋 Updated IsCalibrated Effect Vector: [1, 0, 0, 0, 0, 0, 0, 0]

📊 Effect Distribution Analysis:
    0: Calibrate       → ADDS IsCalibrated ⭐
    1: MoveAway        → no effect IsCalibrated
    2: MoveTo          → no effect IsCalibrated
    3: ShootChemX      → no effect IsCalibrated
    4: ShootChemY      → no effect IsCalibrated
    5: UseCamera       → no effect IsCalibrated
    6: UseGeiger       → no effect IsCalibrated
    7: UseInfraRed     → no effect IsCalibrated

🎯 Key Insight: Calibrate action (position 0) ADDS IsCalibrated!
   This makes perfect sense - calibration sets the satellite as ready
   All other actions have no effect on IsCalibrated state


In [62]:
# Test the neural learning process
print("🧠 Starting Neural Predicate Learning Process...\n")

print("📋 Learning Process Overview:")
print("   1. Generate training data from demonstration trajectories")
print("   2. Setup input fields for neural predicates") 
print("   3. Initialize Action-Effect (AE) matrix constraints")
print("   4. Compute input normalizers for stable training")
print("   5. Train neural networks using the effect vector as ground truth")
print("   6. Validate learned predicates on held-out data")

# Call the main neural learning method (line 2312 from bilevel_learning_approach.py)
print(f"\n🚀 Calling approach.learn_neural_predicates(dataset)...")
learning_trajectories, init_atom_traj = approach.learn_neural_predicates(dataset)

print(f"✅ Neural Learning Completed Successfully!")
print(f"   Learning trajectories: {len(learning_trajectories)}")
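# Keep the label in both branches so the output stays readable when the list is empty
if init_atom_traj:
    print(f"   Initial atom trajectory: {type(init_atom_traj)} with {len(init_atom_traj)} entries")
else:
    print("   Initial atom trajectory: None")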

Constructing NeuPi Data...


🧠 Starting Neural Predicate Learning Process...

📋 Learning Process Overview:
   1. Generate training data from demonstration trajectories
   2. Setup input fields for neural predicates
   3. Initialize Action-Effect (AE) matrix constraints
   4. Compute input normalizers for stable training
   5. Train neural networks using the effect vector as ground truth
   6. Validate learned predicates on held-out data

🚀 Calling approach.learn_neural_predicates(dataset)...


100%|██████████| 50/50 [00:00<00:00, 150.00it/s]
Low-level feat not changed for Predicate HasChemX in Row 0
Low-level feat not changed for Predicate HasChemX in Row 0
Low-level feat not changed for Predicate HasChemY in Row 0
Low-level feat not changed for Predicate HasChemY in Row 0
Low-level feat not changed for Predicate HasChemX in Row 1
Low-level feat not changed for Predicate HasChemX in Row 1
Low-level feat not changed for Predicate HasChemY in Row 1
Low-level feat not changed for Predicate HasChemY in Row 1
Low-level feat not changed for Predicate HasChemX in Row 2
Low-level feat not changed for Predicate HasChemX in Row 2
Low-level feat not changed for Predicate HasChemY in Row 2
Low-level feat not changed for Predicate HasChemY in Row 2
Low-level feat not changed for Predicate neural_u_p1 in Row 3
Low-level feat not changed for Predicate neural_u_p1 in Row 3
Low-level feat not changed for Predicate IsCalibrated in Row 3
Low-level feat not changed for Predicate IsCalibrated in

✅ Neural Learning Completed Successfully!
   Learning trajectories: 50
   Initial atom trajectory: <class 'list'> with 50 entries


## 5. Examining Learned Neural Predicate Results

After neural learning completes, IVNTR stores detailed information about the learned predicates. Let's examine the results and understand what was learned.

In [None]:
# Point to log file for detailed learning process
print("📋 Detailed Learning Log File")
print(f"   Log file path: {log_file_path}")
print(f"   You can examine the neural learning process in detail by checking this log file")
print(f"   The log contains training progress, loss values, and model statistics")

📋 Detailed Learning Log File
   Log file path: /Users/libowen/Documents/Research/RSS2025/code/IVNTR/docs/example_gt_vec_learning.log
   You can examine the neural learning process in detail by checking this log file
   The log contains training progress, loss values, and model statistics

🧠 Learned Neural Predicate Information:
   Source: approach.learned_ae_pred_info (line 312 in bilevel_learning_approach.py)

📊 Learned Predicate Details:

🔍 Predicate: neural_u_p1
   Entity indices: []
   Ground truth effect vectors: []
   Learning status: Learned
   Provided status: Not provided
   📈 Learning Scores: No scores available
   🧠 Model Weights: No trained models saved

🔍 Predicate: CameraReadingTaken
   Entity indices: [[0, 0]]
   Ground truth effect vectors: N/A
   Learning status: Learned
   Provided status: Provided
   📈 Learning Scores: No scores available
   🧠 Model Weights: No trained models saved

🔍 Predicate: GeigerReadingTaken
   Entity indices: [[0, 0]]
   Ground truth effect ve