# IVNTR Bilevel Learning: Neural Predicate Invention

This notebook explores the core IVNTR algorithm implemented in `predicators/approaches/bilevel_learning_approach.py`. IVNTR (bIleVel learNing from TRansitions) learns neural predicates that enable symbolic planning from demonstration data.

## The Core Algorithm

IVNTR implements a **bilevel learning approach** where:

1. **Upper Level (Symbolic)**: Searches over candidate predicate effects and constructs NSRTs (Neuro-Symbolic Relational Transition models)
2. **Lower Level (Neural)**: Learns predicate classifiers with neural networks of **any** architecture
3. **Alternating Process**: The two levels guide each other through iterative refinement

For each predicate, IVNTR alternates between:
- **Effect Discovery**: Searching for high-level symbolic effects the predicate should have
- **Grounding Learning**: Training neural networks to classify when the predicate holds

## Tutorial Focus: "IsCalibrated" Predicate

In this tutorial, we'll examine the simplest predicate invention process using the "IsCalibrated" predicate:
- **Purpose**: Determines if a satellite has been calibrated and is ready to take readings
- **Input**: Continuous feature vectors of satellites (position, calibration state, instrument type, ...)
- **Output**: Binary classification (calibrated or not)
- **Role**: Required precondition for camera, geiger, and infrared readings
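
To make the input/output contract concrete, here is a minimal sketch of such a classifier: a one-hidden-layer MLP that maps a feature vector to a probability and thresholds it at 0.5. The 4-dimensional feature vector, the random (untrained) weights, and the names `init_mlp`, `predicate_probability`, and `is_calibrated` are assumptions for illustration; the real architecture is set by the predicate config file.

```python
import math
import random


def init_mlp(in_dim, hidden_dim=32, seed=0):
    """Create weights for a one-hidden-layer MLP (Xavier-style uniform init)."""
    rng = random.Random(seed)

    def xavier(fan_in, fan_out):
        bound = math.sqrt(6.0 / (fan_in + fan_out))
        return [[rng.uniform(-bound, bound) for _ in range(fan_in)]
                for _ in range(fan_out)]

    return {
        "w1": xavier(in_dim, hidden_dim), "b1": [0.0] * hidden_dim,
        "w2": xavier(hidden_dim, 1), "b2": [0.0],
    }


def predicate_probability(params, features):
    """Forward pass: features -> ReLU hidden layer -> sigmoid probability."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(params["w1"], params["b1"])]
    logit = sum(w * h for w, h in zip(params["w2"][0], hidden)) + params["b2"][0]
    return 1.0 / (1.0 + math.exp(-logit))


def is_calibrated(params, features, threshold=0.5):
    """The predicate holds if the predicted probability exceeds the threshold."""
    return predicate_probability(params, features) > threshold


# Hypothetical 4-dimensional satellite feature vector (not the real layout).
params = init_mlp(in_dim=4)
features = [0.3, 0.7, 1.0, 0.0]
prob = predicate_probability(params, features)
print(f"P(IsCalibrated) = {prob:.3f}, holds = {is_calibrated(params, features)}")
```

Because the weights are untrained, the printed probability is meaningless; the point is only the shape of the mapping from continuous features to a binary truth value.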

## What We'll Cover

1. **Dummy NSRTs**: Starting point with incomplete effect knowledge.
2. **Demonstration Data**: Training examples from oracle trajectories.
3. **Neural Learning Process**: Given the correct action-effect (AE) vector, how to train a neural network (single iteration).
4. **Symbolic Search Process**: When the AE vector is unknown, how to use the neural loss to guide the search (multiple iterations).
5. **Advanced In-depth Highlight**: Binary predicates, variable bindings, quantifiers, and more.

Let's dive in!

## 1. Setting Up the Bilevel Learning Environment

We'll recreate the bilevel learning setup from `bilevel_learning_approach.py`, focusing on the key components that IVNTR needs for predicate invention.

In [1]:
import sys
import os
import logging

# Add the project root to path
sys.path.append('..')

# Set FastDownward path
FD_EXEC_PATH = os.path.join(os.path.dirname(os.path.abspath('.')), 'ext', 'downward')
os.environ['FD_EXEC_PATH'] = FD_EXEC_PATH

# Import IVNTR components
from predicators.envs.satellites import SatellitesEnv
from predicators.ground_truth_models import get_dummy_nsrts, get_gt_options
from predicators.datasets.demo_only import _generate_demonstrations
from predicators.approaches import create_approach
from predicators import utils
from predicators.settings import CFG

config_path = os.path.abspath("example_gt_vec_good.yaml") 
log_file_path = os.path.abspath("example_gt_vec_learning.log")
neupi_save_path = os.path.abspath("saved_neural")

print("✅ All imports successful!")

# Test the configuration section
utils.reset_config({
    "device": "cpu",
    "env": "satellites",
    "approach": "ivntr",
    "neupi_pred_config": config_path,
    "neupi_gt_ae_matrix": True,
    "excluded_predicates": "ViewClear,IsCalibrated,HasChemX,HasChemY,Sees",
    "neupi_do_normalization": True,
    "num_train_tasks": 50,
    "num_test_tasks": 20,
    "seed": 0,
    "bilevel_plan_without_sim": False,
    "exclude_domain_feat": None,
    "log_file": log_file_path,
})

handlers = [logging.StreamHandler()]
handlers.append(logging.FileHandler(CFG.log_file, mode='w'))
logging.basicConfig(level=logging.INFO,
                    format="%(message)s",
                    handlers=handlers,
                    force=True)

CFG.seed = 42
CFG.num_train_tasks = 50  # Generate 50 demonstration trajectories
CFG.satellites_num_sat_train = [2, 3]
CFG.satellites_num_obj_train = [2, 3]
CFG.timeout = 10.0
CFG.demonstrator = "oracle"
CFG.max_initial_demos = 50

print("✅ IVNTR Bilevel Learning Setup Complete!")
print(f"Environment: {CFG.env}")
print(f"Training tasks: {CFG.num_train_tasks}")
print(f"Demonstration trajectories: {CFG.max_initial_demos}")



✅ All imports successful!
✅ IVNTR Bilevel Learning Setup Complete!
Environment: satellites
Training tasks: 50
Demonstration trajectories: 50


## 2. Understanding Dummy NSRTs

IVNTR starts with **dummy NSRTs** - symbolic operators with incomplete knowledge. These are defined in `predicators/ground_truth_models/satellites/dummy_nsrts.py` and serve as the starting point for predicate invention.

### Key Characteristics of Dummy NSRTs:
- **Known preconditions**: Static predicates we can observe (e.g., `HasCamera`, `ShootsChemX`)
- **Unknown effects**: Dynamic predicates we need to learn (e.g., `IsCalibrated`, `HasChemX`, `HasChemY`)
- **Placeholder structure**: Provides the symbolic scaffolding for learning

**Note**: While multiple predicates can be learned, this tutorial focuses specifically on the `IsCalibrated` predicate as our primary example.
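
As a mental model, a dummy NSRT can be pictured as an operator record whose preconditions are partly filled in and whose effects are still empty. The sketch below uses a hypothetical `SketchOperator` dataclass (not the `NSRT` class from `predicators`) and plain strings in place of real atoms.

```python
from dataclasses import dataclass, field


@dataclass
class SketchOperator:
    """Hypothetical stand-in for an NSRT with known and unknown parts."""
    name: str
    parameters: list
    preconditions: set = field(default_factory=set)
    add_effects: set = field(default_factory=set)
    delete_effects: set = field(default_factory=set)

    def missing_effects(self):
        """True while no add or delete effect is known."""
        return not self.add_effects and not self.delete_effects


calibrate = SketchOperator(
    name="Calibrate",
    parameters=["?sat:satellite", "?obj:object"],
    preconditions={"CalibrationTarget(?sat, ?obj)"},
)
before = calibrate.missing_effects()
print(before)  # True: the effect still has to be learned

# Once a predicate such as IsCalibrated has been invented, the gap is filled.
calibrate.add_effects.add("IsCalibrated(?sat)")
after = calibrate.missing_effects()
print(after)  # False
```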

Let's examine these dummy NSRTs and understand what IVNTR needs to learn.

In [2]:
# Test the dummy NSRTs section
# Create environment and load dummy NSRTs
env = SatellitesEnv(use_gui=False)
train_tasks = env.get_train_tasks()

# Get ground truth options (needed for dummy NSRTs)
options = get_gt_options(env.get_name())

# Load dummy NSRTs (the incomplete starting point)
# get_dummy_nsrts takes: env_name, predicates_to_keep, options_to_keep
dummy_nsrts = get_dummy_nsrts(env.get_name(), env.predicates, options)

print(f"🧩 Dummy NSRTs: {len(dummy_nsrts)} operators\n")

# Test just the first few NSRTs to verify structure
for nsrt in sorted(list(dummy_nsrts)[:3], key=lambda x: x.name):
    print(f"📋 {nsrt.name}")
    print(f"   Parameters: {[str(p) for p in nsrt.parameters]}")
    
    if nsrt.preconditions:
        print(f"   Known Preconditions:")
        for pre in nsrt.preconditions:
            print(f"     ✓ {pre}")
    else:
        print(f"   Known Preconditions: None")
    
    if nsrt.add_effects:
        print(f"   Known Add Effects:")
        for eff in nsrt.add_effects:
            print(f"     + {eff}")
    else:
        print(f"   Known Add Effects: None (TO BE LEARNED!)")
    
    print()

print("✅ Dummy NSRTs loaded successfully!")



🧩 Dummy NSRTs: 8 operators

📋 Calibrate
   Parameters: ['?sat:satellite', '?obj:object']
   Known Preconditions:
     ✓ CalibrationTarget(?sat:satellite, ?obj:object)
   Known Add Effects: None (TO BE LEARNED!)

📋 MoveTo
   Parameters: ['?sat:satellite', '?obj:object']
   Known Preconditions: None
   Known Add Effects: None (TO BE LEARNED!)

📋 TakeInfraredReading
   Parameters: ['?sat:satellite', '?obj:object']
   Known Preconditions:
     ✓ HasInfrared(?sat:satellite)
   Known Add Effects:
     + InfraredReadingTaken(?sat:satellite, ?obj:object)

✅ Dummy NSRTs loaded successfully!


## 3. Generating Demonstration Data

IVNTR learns from demonstration trajectories collected using the oracle approach. These trajectories contain the training signal for both effect discovery and neural predicate learning.
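
The learning signal lives in individual transitions, so it helps to picture a trajectory as a list of `(state, action, next_state)` triples. The sketch below assumes a trajectory stores one more state than actions; `SketchTrajectory` and the string placeholders are hypothetical and stand in for the real trajectory objects.

```python
from dataclasses import dataclass


@dataclass
class SketchTrajectory:
    """Hypothetical trajectory: N actions connect N + 1 states."""
    states: list
    actions: list

    def transitions(self):
        """Yield (state, action, next_state) triples in order."""
        assert len(self.states) == len(self.actions) + 1
        for t, action in enumerate(self.actions):
            yield self.states[t], action, self.states[t + 1]


traj = SketchTrajectory(states=["s0", "s1", "s2", "s3"],
                        actions=["MoveTo", "Calibrate", "UseCamera"])
triples = list(traj.transitions())
for state, action, next_state in triples:
    print(f"{state} --{action}--> {next_state}")
```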

Let's generate 50 demonstration trajectories using the process from `03_demo_trajectories.ipynb`.

In [3]:
# Generate demonstrations for every training task
print("🎬 Generating Demonstration Trajectories...\n")

# This calls the _generate_demonstrations function from demo_only.py
# Note: We need to provide all required arguments
training_tasks = [task.task for task in train_tasks]  # Unwrap all training tasks
dataset = _generate_demonstrations(
    env,
    training_tasks,
    options,          # Set of ground truth options
    0,               # train_tasks_start_idx 
    False            # annotate_with_gt_ops
)

print(f"✅ Dataset Generated Successfully!")
print(f"   Total trajectories: {len(dataset.trajectories)}")
print(f"   All demonstrations: {all(traj.is_demo for traj in dataset.trajectories)}")

# Analyze the demonstration data
total_steps = sum(len(traj.actions) for traj in dataset.trajectories)
avg_length = total_steps / len(dataset.trajectories) if dataset.trajectories else 0
successful_demos = sum(1 for traj in dataset.trajectories 
                      if traj.train_task_idx is not None)

print(f"\n📊 Dataset Statistics:")
print(f"   Total action steps: {total_steps}")
print(f"   Average trajectory length: {avg_length:.1f} steps")
print(f"   Successful demonstrations: {successful_demos}/{len(dataset.trajectories)}")

print("✅ Demonstration generation successful!")

# Sample a trajectory to examine structure (if we have any)
if dataset.trajectories:
    sample_traj = dataset.trajectories[0]
    sample_task = train_tasks[sample_traj.train_task_idx]

    print(f"\n🔍 Sample Trajectory Analysis:")
    print(f"   Task index: {sample_traj.train_task_idx}")
    print(f"   Length: {len(sample_traj.states)} states, {len(sample_traj.actions)} actions")
    print(f"   Goal: {sample_task.goal}")
    print(f"   Action sequence preview:")

    for i, action in enumerate(sample_traj.actions[:5]):
        if hasattr(action, '_option') and action._option:
            print(f"     Step {i}: {action.get_option().name} with params {action.get_option().params}")
        else:
            print(f"     Step {i}: Raw action {action.arr[:4]}...")

    if len(sample_traj.actions) > 5:
        print(f"     ... ({len(sample_traj.actions)-5} more actions)")
else:
    print("\n⚠️ No trajectories generated - check configuration")

print("\n💡 This demonstration data provides the training signal for IVNTR's bilevel learning!")

Using 8 NSRTs: {NSRT-ShootChemX:
    Parameters: [?sat:satellite, ?obj:object]
    Preconditions: [Sees(?sat:satellite, ?obj:object), ShootsChemX(?sat:satellite)]
    Add Effects: [HasChemX(?obj:object)]
    Delete Effects: []
    Ignore Effects: []
    Option Spec: ShootChemX(?sat:satellite, ?obj:object), NSRT-TakeInfraredReading:
    Parameters: [?sat:satellite, ?obj:object]
    Preconditions: [HasChemY(?obj:object), HasInfrared(?sat:satellite), IsCalibrated(?sat:satellite), Sees(?sat:satellite, ?obj:object)]
    Add Effects: [InfraredReadingTaken(?sat:satellite, ?obj:object)]
    Delete Effects: []
    Ignore Effects: []
    Option Spec: UseInfraRed(?sat:satellite, ?obj:object), NSRT-MoveTo:
    Parameters: [?sat:satellite, ?obj:object]
    Preconditions: [ViewClear(?sat:satellite)]
    Add Effects: [Sees(?sat:satellite, ?obj:object)]
    Delete Effects: [ViewClear(?sat:satellite)]
    Ignore Effects: []
    Option Spec: MoveTo(?sat:satellite, ?obj:object), NSRT-TakeGeigerReading:
 

🎬 Generating Demonstration Trajectories...



[CogMan] Finishing episode.
[CogMan] Reset called.

✅ Dataset Generated Successfully!
   Total trajectories: 50
   All demonstrations: True

📊 Dataset Statistics:
   Total action steps: 637
   Average trajectory length: 12.7 steps
   Successful demonstrations: 50/50
✅ Demonstration generation successful!

🔍 Sample Trajectory Analysis:
   Task index: 0
   Length: 11 states, 10 actions
   Goal: {InfraredReadingTaken(sat1:satellite, obj2:object), CameraReadingTaken(sat0:satellite, obj2:object)}
   Action sequence preview:
     Step 0: MoveTo with params [0.0521853  0.65423506]
     Step 1: Calibrate with params []
     Step 2: ShootChemY with params []
     Step 3: UseInfraRed with params []
     Step 4: MoveTo with params [0.7846738 0.7001005]
     ... (5 more actions)

💡 This demonstration data provides the training signal for IVNTR's bilevel learning!


## 4. Neural Predicate Training: "IsCalibrated" Example

Now we'll dive into the core of IVNTR's neural predicate learning, using the "IsCalibrated" predicate as our example. To recap its profile:

- **Purpose**: Determines if a satellite has been calibrated and is ready to take readings
- **Input**: Satellite features (position, calibration state, instrument type)
- **Output**: Binary classification (calibrated or not)
- **Role**: Required precondition for all reading operations (camera, geiger, infrared)

We'll demonstrate how IVNTR learns this predicate given a known **effect vector** that specifies which actions add/delete the predicate.

In [4]:
# Step 1: Create the IVNTR Bilevel Learning Approach
print("🧠 Creating IVNTR Bilevel Learning Approach...\n")

# Create the approach using the same process as main.py
approach = create_approach(
    CFG.approach,          # "ivntr"
    env.predicates,        # Initial predicates 
    options,               # Ground truth options
    env.types,             # Environment types
    env.action_space,      # Action space
    training_tasks         # Training tasks
)

print(f"✅ Approach created: {type(approach).__name__}")
print(f"   Approach name: {approach.get_name()}")

# Step 2: Examine the sorted options for effect vector understanding
print(f"\n📦 Action Options in IVNTR (sorted order):")
sorted_options = approach._sorted_options
for i, option in enumerate(sorted_options):
    print(f"   {i:2d}: {option.name}")

print(f"\n🎯 Understanding Effect Vectors:")
print(f"   Effect vectors are {len(sorted_options)}-dimensional ternary vectors")
print(f"   Each position corresponds to an action option above")
print(f"   Value 1 = action ADDS the predicate")
print(f"   Value 0 = action has no effect on the predicate") 
print(f"   Value -1 = action DELETES the predicate")

print(f"\n💡 This ordering is crucial for interpreting effect vectors in example_gt_vec.yaml!")

Options (Clusters): Calibrate, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): MoveAway, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): MoveTo, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): ShootChemX, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): ShootChemY, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): UseCamera, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): UseGeiger, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): UseInfraRed, Arguments: [Type(name='satellite'), Type(name='object')]
Using: satellite_0
Predicate Type TOBE Invented: neural_u_p1
Learning Conifg: {'name': 'neural_u_p1', 'types': ['satellite'], 'gt': [[1, 0, 0, 0, 0, 0, 0, 0]], 'ent_idx': [0], 'architecture': {'type': 'MLP', 'layer_size': 32, 'initializer': 'xavier'}, 'optimizer': {'type': 'AdamW', 'kwargs': {'l

🧠 Creating IVNTR Bilevel Learning Approach...

✅ Approach created: BilevelLearningApproach
   Approach name: ivntr

📦 Action Options in IVNTR (sorted order):
    0: Calibrate
    1: MoveAway
    2: MoveTo
    3: ShootChemX
    4: ShootChemY
    5: UseCamera
    6: UseGeiger
    7: UseInfraRed

🎯 Understanding Effect Vectors:
   Effect vectors are 8-dimensional ternary vectors
   Each position corresponds to an action option above
   Value 1 = action ADDS the predicate
   Value 0 = action has no effect on the predicate
   Value -1 = action DELETES the predicate

💡 This ordering is crucial for interpreting effect vectors in example_gt_vec.yaml!


### Understanding the Effect Vector for IsCalibrated

The effect vector `[1, 0, 0, 0, 0, 0, 0, 0]` from the predicate config loaded above (`example_gt_vec_good.yaml`) tells us:

1. **Position 0 (Calibrate action) ADDS IsCalibrated** (value 1)
2. **All other actions (positions 1-7) have no effect** on IsCalibrated (value 0)

This effect vector correctly captures the semantics of calibration:
- **Calibrate action** sets a satellite as calibrated and ready for readings
- **All other actions** (movement, shooting chemicals, taking readings) don't affect calibration status
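
To see what the vector prescribes on real data, we can look up each executed action in it. The sketch below does this for the first five actions of the sample trajectory printed earlier; `expected_changes` is a hypothetical helper, and it ignores which satellite each action is applied to.

```python
OPTION_NAMES = ["Calibrate", "MoveAway", "MoveTo", "ShootChemX",
                "ShootChemY", "UseCamera", "UseGeiger", "UseInfraRed"]
EFFECT_VECTOR = [1, 0, 0, 0, 0, 0, 0, 0]


def expected_changes(action_names, effect_vector, option_names):
    """Map each executed action to the predicate change the vector prescribes."""
    effect_of = dict(zip(option_names, effect_vector))
    return [effect_of[name] for name in action_names]


# First five actions of the sample trajectory
actions = ["MoveTo", "Calibrate", "ShootChemY", "UseInfraRed", "MoveTo"]
changes = expected_changes(actions, EFFECT_VECTOR, OPTION_NAMES)
print(changes)  # [0, 1, 0, 0, 0]
```

Only the `Calibrate` step is expected to flip the predicate from false to true; on every other step its truth value should stay the same.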

### The Learning Process

Given this effect vector, IVNTR will:
1. **Generate training labels** using the effect vector and demonstration trajectories
2. **Train a neural network** to predict IsCalibrated from satellite features  
3. **Validate the classifier** on held-out data
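
One way to picture steps 1 and 2 is a per-transition loss that compares the classifier's output before and after an action with the change the effect vector prescribes. The loss below is an assumption made for illustration (cross-entropy for add/delete, squared difference for no effect); it is not taken from the IVNTR code, and `bce` and `transition_loss` are hypothetical names.

```python
import math


def bce(p, target, eps=1e-7):
    """Binary cross-entropy of probability p against a 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def transition_loss(p_before, p_after, effect):
    """Loss for one transition given the prescribed effect (+1, 0, or -1)."""
    if effect == 1:    # add: predicate false before, true after
        return bce(p_before, 0) + bce(p_after, 1)
    if effect == -1:   # delete: predicate true before, false after
        return bce(p_before, 1) + bce(p_after, 0)
    return (p_after - p_before) ** 2  # no effect: truth value should not change


consistent = transition_loss(p_before=0.05, p_after=0.95, effect=1)
inconsistent = transition_loss(p_before=0.95, p_after=0.05, effect=1)
print(f"consistent: {consistent:.3f}, inconsistent: {inconsistent:.3f}")
```

A classifier whose predictions agree with the prescribed change gets a small loss, and one that contradicts it gets a large loss. That gap is what makes the loss usable as a training signal.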

Let's see this process in action!

In [5]:
# Test the updated effect vector analysis
print("🔍 Analyzing Updated IsCalibrated Effect Vector...\n")

# From updated example_gt_vec.yaml: gt_ae_vecs: [[1, 0, 0, 0, 0, 0, 0, 0]]
# This shows the Calibrate action ADDS IsCalibrated
updated_effect_vector = [1, 0, 0, 0, 0, 0, 0, 0]

print(f"📋 Updated IsCalibrated Effect Vector: {updated_effect_vector}")
print(f"\n📊 Effect Distribution Analysis:")

# Analyze each action's effect on IsCalibrated with the updated vector
for i, (option, effect) in enumerate(zip(sorted_options, updated_effect_vector)):
    effect_str = {1: "ADDS", 0: "no effect", -1: "DELETES"}[effect]
    if effect != 0:
        print(f"   {i:2d}: {option.name:<15} → {effect_str} IsCalibrated ⭐")
    else:
        print(f"   {i:2d}: {option.name:<15} → {effect_str} IsCalibrated")

print(f"\n🎯 Key Insight: Calibrate action (position 0) ADDS IsCalibrated!")
print(f"   This makes perfect sense - calibration sets the satellite as ready")
print(f"   All other actions have no effect on IsCalibrated state")

🔍 Analyzing Updated IsCalibrated Effect Vector...

📋 Updated IsCalibrated Effect Vector: [1, 0, 0, 0, 0, 0, 0, 0]

📊 Effect Distribution Analysis:
    0: Calibrate       → ADDS IsCalibrated ⭐
    1: MoveAway        → no effect IsCalibrated
    2: MoveTo          → no effect IsCalibrated
    3: ShootChemX      → no effect IsCalibrated
    4: ShootChemY      → no effect IsCalibrated
    5: UseCamera       → no effect IsCalibrated
    6: UseGeiger       → no effect IsCalibrated
    7: UseInfraRed     → no effect IsCalibrated

🎯 Key Insight: Calibrate action (position 0) ADDS IsCalibrated!
   This makes perfect sense - calibration sets the satellite as ready
   All other actions have no effect on IsCalibrated state


In [6]:
# Test the neural learning process
print("🧠 Starting Neural Predicate Learning Process...\n")

print("📋 Learning Process Overview:")
print("   1. Generate training data from demonstration trajectories")
print("   2. Setup input fields for neural predicates") 
print("   3. Initialize Action-Effect (AE) matrix constraints")
print("   4. Compute input normalizers for stable training")
print("   5. Train neural networks using the effect vector as ground truth")
print("   6. Validate learned predicates on held-out data")

# Call the main neural learning method (line 2312 from bilevel_learning_approach.py)
print(f"\n🚀 Calling approach.learn_neural_predicates(dataset)...")
learning_trajectories, init_atom_traj = approach.learn_neural_predicates(dataset)

print(f"✅ Neural Learning Completed Successfully!")
print(f"   Learning trajectories: {len(learning_trajectories)}")
if init_atom_traj:
    print(f"   Initial atom trajectory: {type(init_atom_traj)} with {len(init_atom_traj)} entries")
else:
    print("   Initial atom trajectory: None")

Constructing NeuPi Data...


🧠 Starting Neural Predicate Learning Process...

📋 Learning Process Overview:
   1. Generate training data from demonstration trajectories
   2. Setup input fields for neural predicates
   3. Initialize Action-Effect (AE) matrix constraints
   4. Compute input normalizers for stable training
   5. Train neural networks using the effect vector as ground truth
   6. Validate learned predicates on held-out data

🚀 Calling approach.learn_neural_predicates(dataset)...


100%|██████████| 50/50 [00:00<00:00, 106.17it/s]
Low-level feat not changed for Predicate HasChemX in Row 0
Low-level feat not changed for Predicate HasChemX in Row 0
Low-level feat not changed for Predicate HasChemY in Row 0
Low-level feat not changed for Predicate HasChemY in Row 0
Low-level feat not changed for Predicate HasChemX in Row 1
Low-level feat not changed for Predicate HasChemX in Row 1
Low-level feat not changed for Predicate HasChemY in Row 1
Low-level feat not changed for Predicate HasChemY in Row 1
Low-level feat not changed for Predicate HasChemX in Row 2
Low-level feat not changed for Predicate HasChemX in Row 2
Low-level feat not changed for Predicate HasChemY in Row 2
Low-level feat not changed for Predicate HasChemY in Row 2
Low-level feat not changed for Predicate neural_u_p1 in Row 3
Low-level feat not changed for Predicate neural_u_p1 in Row 3
Low-level feat not changed for Predicate IsCalibrated in Row 3
Low-level feat not changed for Predicate IsCalibrated in

✅ Neural Learning Completed Successfully!
   Learning trajectories: 50
   Initial atom trajectory: <class 'list'> with 50 entries


## 6. Examining Learned Neural Predicate Results

After neural learning completes, IVNTR stores detailed information about the learned predicates. Let's examine the results and understand what was learned.

In [65]:
# Point to log file for detailed learning process
print("📋 Detailed Learning Log File")
print(f"   Log file path: {log_file_path}")
print(f"   You can examine the neural learning process in detail by checking this log file")
print(f"   The log contains training progress, loss values, and model statistics")

📋 Detailed Learning Log File
   Log file path: /Users/libowen/Documents/Research/RSS2025/code/IVNTR/docs/example_gt_vec_learning.log
   You can examine the neural learning process in detail by checking this log file
   The log contains training progress, loss values, and model statistics


## 7. Neural Informed Symbolic Search

In practice, we **don't have the correct effect vector** when inventing new predicates - that's the whole point of predicate invention! A key insight from IVNTR is that searching the potential effect space can be **guided by neural learning**.

### The Challenge

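- **Effect space is exponential**: With 8 actions and three possible effects per action, there are |{0, 1, -1}^8| = 3^8 = 6,561 possible effect vectors
- **Naive search does not scale**: Every candidate requires training a neural network, so we can't try every combination
- **Neural guidance**: Bad effect vectors lead to poor neural learning performance, which gives the search a signal to follow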
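
The size of the effect space is easy to verify by enumerating it with the standard library:

```python
from itertools import product

NUM_ACTIONS = 8

# Every assignment of -1 (delete), 0 (no effect), or 1 (add) to the 8 actions
candidates = list(product((-1, 0, 1), repeat=NUM_ACTIONS))
print(len(candidates))                         # 6561
print(3 ** NUM_ACTIONS)                        # 6561
print((1, 0, 0, 0, 0, 0, 0, 0) in candidates)  # True
```

Enumerating the vectors is cheap. The cost comes from scoring them, since each score requires a training run.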

### The IVNTR Solution

IVNTR uses **neural learning loss as a guidance signal** for symbolic search:
1. Try a candidate effect vector
2. Train a neural network using that vector's labels  
3. Measure the learning performance (validation loss, accuracy)
4. Use this score to guide search toward better effect vectors
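
The four steps can be sketched on toy data. In the sketch below the "training" step is replaced by a stand-in: a fixed threshold on a single scalar feature plays the classifier, and the score is the fraction of transitions the candidate vector mislabels. The data, `observed_change`, and `score_candidate` are all hypothetical; real IVNTR scores come from neural network training.

```python
def observed_change(before, after, threshold=0.5):
    """Change in a thresholded scalar feature: +1, 0, or -1."""
    return int(after > threshold) - int(before > threshold)


def score_candidate(effect_vector, transitions):
    """Stand-in for a training loss: fraction of transitions mislabeled."""
    wrong = sum(effect_vector[action] != observed_change(before, after)
                for action, before, after in transitions)
    return wrong / len(transitions)


# (action index, feature before, feature after); index 0 = Calibrate, 2 = MoveTo
transitions = [(2, 0.0, 0.0), (0, 0.0, 1.0), (4, 1.0, 1.0),
               (7, 1.0, 1.0), (2, 1.0, 1.0), (0, 0.0, 1.0)]

candidates = {
    "Calibrate adds": [1, 0, 0, 0, 0, 0, 0, 0],
    "MoveTo adds": [0, 0, 1, 0, 0, 0, 0, 0],
    "nothing changes": [0] * 8,
}
scores = {name: score_candidate(vec, transitions)
          for name, vec in candidates.items()}
best = min(scores, key=scores.get)  # lower is better
for name, score in scores.items():
    print(f"{name:<16} {score:.3f}")
print("best:", best)
```

On this toy data the vector that matches the observed feature changes scores 0, while the wrong vectors are penalized on every transition they mislabel.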

Let's demonstrate this by trying a **deliberately bad effect vector** and seeing how IVNTR detects it's wrong.

In [7]:
# Test the bad effect vector approach
print("🔍 Neural Informed Symbolic Search: Bad Effect Vector Example\n")

# Update configuration to use bad effect vector
bad_config_path = os.path.abspath("example_gt_vec_bad.yaml") 
bad_log_file_path = os.path.abspath("example_bad_vec_learning.log")

print("📋 Updating Configuration for Bad Effect Vector Experiment:")
print(f"   Config file: {bad_config_path}")
print(f"   Log file: {bad_log_file_path}")

# Reset configuration with bad effect vector
utils.reset_config({
    "device": "cpu",
    "env": "satellites",
    "approach": "ivntr",
    "neupi_pred_config": bad_config_path,
    "neupi_gt_ae_matrix": True,
    "excluded_predicates": "ViewClear,IsCalibrated,HasChemX,HasChemY,Sees",
    "neupi_do_normalization": True,
    "num_train_tasks": 50,
    "num_test_tasks": 20,
    "seed": 0,
    "bilevel_plan_without_sim": False,
    "exclude_domain_feat": None,
    "log_file": bad_log_file_path,
})

# Reinitialize logging for bad vector experiment
handlers = [logging.StreamHandler()]
handlers.append(logging.FileHandler(CFG.log_file, mode='w'))
logging.basicConfig(level=logging.INFO,
                    format="%(message)s",
                    handlers=handlers,
                    force=True)

CFG.seed = 42
CFG.num_train_tasks = 50
CFG.satellites_num_sat_train = [2, 3]
CFG.satellites_num_obj_train = [2, 3]
CFG.timeout = 10.0
CFG.demonstrator = "oracle"
CFG.max_initial_demos = 50

# Analyze the bad effect vector
bad_effect_vector = [0, 0, 1, 0, 0, 0, 0, 0]  # MoveTo action adds IsCalibrated (wrong!)
print(f"\n🚨 Bad Effect Vector: {bad_effect_vector}")
print(f"   This says: MoveTo action ADDS IsCalibrated")
print(f"   Reality: Calibrate action should ADD IsCalibrated")
print(f"   This is completely wrong semantically!")

print("\n✅ Configuration updated for bad effect vector experiment")

🔍 Neural Informed Symbolic Search: Bad Effect Vector Example

📋 Updating Configuration for Bad Effect Vector Experiment:
   Config file: /Users/libowen/Documents/Research/RSS2025/code/IVNTR/docs/example_gt_vec_bad.yaml
   Log file: /Users/libowen/Documents/Research/RSS2025/code/IVNTR/docs/example_bad_vec_learning.log

🚨 Bad Effect Vector: [0, 0, 1, 0, 0, 0, 0, 0]
   This says: MoveTo action ADDS IsCalibrated
   Reality: Calibrate action should ADD IsCalibrated
   This is completely wrong semantically!

✅ Configuration updated for bad effect vector experiment


In [8]:
# Create new approach with bad effect vector and test neural learning
print("🧠 Creating New IVNTR Approach with Bad Effect Vector...\n")

# Create fresh environment and tasks (reuse existing dataset)
env_bad = SatellitesEnv(use_gui=False)
train_tasks_bad = env_bad.get_train_tasks()
training_tasks_bad = [task.task for task in train_tasks_bad]
options_bad = get_gt_options(env_bad.get_name())

# Create the approach with bad configuration
approach_bad = create_approach(
    CFG.approach,
    env_bad.predicates,
    options_bad,
    env_bad.types,
    env_bad.action_space,
    training_tasks_bad
)

print(f"✅ Bad Effect Vector Approach Created: {approach_bad.get_name()}")

# Use the same demonstration dataset (reuse for consistency)
print(f"\n🎬 Using Existing Demonstration Dataset...")
print(f"   Dataset trajectories: {len(dataset.trajectories)}")

# Run neural learning with bad effect vector
print(f"\n🚨 Running Neural Learning with BAD Effect Vector...")
print(f"   Expected: Poor learning performance due to wrong labels")
print(f"   Effect vector: [0, 0, 1, 0, 0, 0, 0, 0] (MoveTo adds IsCalibrated)")

learning_trajectories_bad, init_atom_traj_bad = approach_bad.learn_neural_predicates(dataset)

print(f"\n✅ Bad Effect Vector Learning Completed!")
print(f"   This will show poor learning performance in the guidance scores")

Options (Clusters): Calibrate, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): MoveAway, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): MoveTo, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): ShootChemX, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): ShootChemY, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): UseCamera, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): UseGeiger, Arguments: [Type(name='satellite'), Type(name='object')]
Options (Clusters): UseInfraRed, Arguments: [Type(name='satellite'), Type(name='object')]
Using: satellite_0
Predicate Type TOBE Invented: neural_u_p1
Learning Conifg: {'name': 'neural_u_p1', 'types': ['satellite'], 'gt': [[0, 0, 1, 0, 0, 0, 0, 0]], 'ent_idx': [0], 'architecture': {'type': 'MLP', 'layer_size': 32, 'initializer': 'xavier'}, 'optimizer': {'type': 'AdamW', 'kwargs': {'l

🧠 Creating New IVNTR Approach with Bad Effect Vector...

✅ Bad Effect Vector Approach Created: ivntr

🎬 Using Existing Demonstration Dataset...
   Dataset trajectories: 50

🚨 Running Neural Learning with BAD Effect Vector...
   Expected: Poor learning performance due to wrong labels
   Effect vector: [0, 0, 1, 0, 0, 0, 0, 0] (MoveTo adds IsCalibrated)


100%|██████████| 50/50 [00:00<00:00, 72.95it/s] 
Low-level feat not changed for Predicate HasChemX in Row 0
Low-level feat not changed for Predicate HasChemX in Row 0
Low-level feat not changed for Predicate HasChemY in Row 0
Low-level feat not changed for Predicate HasChemY in Row 0
Low-level feat not changed for Predicate HasChemX in Row 1
Low-level feat not changed for Predicate HasChemX in Row 1
Low-level feat not changed for Predicate HasChemY in Row 1
Low-level feat not changed for Predicate HasChemY in Row 1
Low-level feat not changed for Predicate HasChemX in Row 2
Low-level feat not changed for Predicate HasChemX in Row 2
Low-level feat not changed for Predicate HasChemY in Row 2
Low-level feat not changed for Predicate HasChemY in Row 2
Low-level feat not changed for Predicate neural_u_p1 in Row 3
Low-level feat not changed for Predicate neural_u_p1 in Row 3
Low-level feat not changed for Predicate IsCalibrated in Row 3
Low-level feat not changed for Predicate IsCalibrated in


✅ Bad Effect Vector Learning Completed!
   This will show poor learning performance in the guidance scores


In [9]:
# Analyze the guidance scores from bad effect vector learning
print("📊 Analyzing Guidance Scores from Bad Effect Vector Learning\n")

# Read guidance scores from the log file
print("📋 Reading AE Guidance Scores from Log File...")
try:
    with open(bad_log_file_path, 'r') as f:
        log_content = f.read()

    # Find the "Learned AE Guidance (Lower better):" section
    guidance_start = log_content.find("Learned AE Guidance (Lower better):")
    if guidance_start != -1:
        # Extract the guidance section (next few lines after the marker)
        guidance_section = log_content[guidance_start:guidance_start+180]  # Get reasonable chunk
        guidance_lines = guidance_section.split('\n')

        print("🎯 Found AE Guidance Scores in Log File:")
        print()
        for i, line in enumerate(guidance_lines):
            if i == 0 or line.strip():  # Print header and non-empty lines
                print(f"   {line}")
            if i > 15:  # Limit output to avoid too much text
                break
        print()

        print("🚨 Key Interpretation:")
        print("   - Lower guidance scores = better neural learning performance")
        print("   - Higher guidance scores = worse neural learning performance")
        print("   - Bad effect vector [0,0,1,0,0,0,0,0] should show some poor scores")
        print("   - These scores guide IVNTR's tree search algorithm")

    else:
        print("   ⚠️ 'Learned AE Guidance (Lower better):' not found in log file")
        print("   This might mean the learning process hasn't completed yet")

except FileNotFoundError:
    print(f"   ⚠️ Log file not found: {bad_log_file_path}")
    print("   Run the bad effect vector learning first to generate the log")
except Exception as e:
    print(f"   ⚠️ Error reading log file: {e}")

print("\n💡 Next Step: These guidance scores feed into the tree search algorithm")
print("   at predicators/approaches/bilevel_learning_approach.py::L1683")

📊 Analyzing Guidance Scores from Bad Effect Vector Learning

📋 Reading AE Guidance Scores from Log File...
🎯 Found AE Guidance Scores in Log File:

   Learned AE Guidance (Lower better): tensor([7.5286e-01, 7.4332e-01, 9.9220e-01, 2.3026e-09, 2.3026e-09, 7.4895e-01,
           8.2120e-01, 4.6737e-01])
   Learning Done in 60.71571612358

🚨 Key Interpretation:
   - Lower guidance scores = better neural learning performance
   - Higher guidance scores = worse neural learning performance
   - Bad effect vector [0,0,1,0,0,0,0,0] should show some poor scores
   - These scores guide IVNTR's tree search algorithm

💡 Next Step: These guidance scores feed into the tree search algorithm
   at predicators/approaches/bilevel_learning_approach.py::L1683
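
The cell above prints the guidance scores as raw log text. To work with them as numbers, the tensor can be parsed out of the log. The snippet below is a minimal sketch, assuming the log line has the `tensor([...])` layout shown in the output above; `parse_guidance_scores` is a hypothetical helper, not part of the repository.

```python
import re

MARKER = "Learned AE Guidance (Lower better):"
# Matches plain and scientific-notation floats such as 7.5286e-01.
FLOAT_RE = re.compile(r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?")


def parse_guidance_scores(log_text, marker=MARKER):
    """Return the scores that follow `marker` as a list of floats.

    Hypothetical helper: assumes the scores are printed as tensor([...]).
    """
    start = log_text.find(marker)
    if start == -1:
        return []
    # The tensor may wrap across lines, so read up to its closing bracket
    # instead of slicing a fixed number of characters.
    open_idx = log_text.find("[", start)
    if open_idx == -1:
        return []
    close_idx = log_text.find("]", open_idx)
    if close_idx == -1:
        return []
    return [float(tok) for tok in FLOAT_RE.findall(log_text[open_idx + 1:close_idx])]


# Sample text copied from the cell output above.
sample_log = (
    "Learned AE Guidance (Lower better): tensor([7.5286e-01, 7.4332e-01, "
    "9.9220e-01, 2.3026e-09, 2.3026e-09, 7.4895e-01,\n"
    "        8.2120e-01, 4.6737e-01])\n"
    "Learning Done in 60.71571612358"
)
scores = parse_guidance_scores(sample_log)
print(len(scores))               # 8
print(scores.index(max(scores)))  # 2, the entry with the worst (highest) score
```

Reading up to the closing bracket avoids depending on a fixed character window, which can cut the tensor short if the log format changes.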


### Tree Search Algorithm Integration

The guidance scores computed above feed into IVNTR's **tree search algorithm**, which explores candidate effect vectors systematically.

**Implementation Reference**: 
- **File**: `predicators/approaches/bilevel_learning_approach.py`
- **Method**: Tree search logic around **line 1683**
- **Function**: Uses guidance scores to expand/prune search tree nodes

### How It Works

1. **Initialize Search Tree**: Start with candidate effect vectors
2. **Neural Evaluation**: For each candidate, train neural network and compute guidance score  
3. **Tree Expansion**: Expand candidates with low guidance scores (lower is better), prune those with high scores
4. **Iterative Refinement**: Continue until convergence or resource limits
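
The steps above can be sketched as a toy best-first search. This is an illustration under stated assumptions, not the code around line 1683: `best_first_effect_search` and `toy_loss` are hypothetical, each candidate gets a single scalar score, and children are generated by changing one entry of the effect vector. In the real pipeline, `evaluate` would train a neural classifier and return its guidance score.

```python
import heapq
from itertools import count


def best_first_effect_search(num_ops, evaluate, max_expansions=50, tol=1e-6):
    """Toy best-first search over effect vectors with entries in {-1, 0, 1}.

    `evaluate(vector)` is a hypothetical stand-in for training a classifier
    under that effect vector and returning a scalar loss (lower is better).
    """
    start = (0,) * num_ops
    tie = count()  # tie-breaker so equal scores pop in insertion order
    frontier = [(evaluate(start), next(tie), start)]
    seen = {start}
    best_score, best_vec = frontier[0][0], start
    for _ in range(max_expansions):
        if not frontier:
            break
        # Always expand the candidate with the lowest score so far.
        score, _, vec = heapq.heappop(frontier)
        if score < best_score:
            best_score, best_vec = score, vec
        if score <= tol:
            break
        # Children differ from the parent in exactly one entry.
        for i in range(num_ops):
            for value in (-1, 0, 1):
                if value == vec[i]:
                    continue
                child = vec[:i] + (value,) + vec[i + 1:]
                if child not in seen:
                    seen.add(child)
                    heapq.heappush(frontier, (evaluate(child), next(tie), child))
    return best_vec, best_score


# Hypothetical ground truth used only to make the toy loss computable.
target = (0, 1, -1, 0)


def toy_loss(vec):
    """Fraction of entries that disagree with the hidden target vector."""
    return sum(a != b for a, b in zip(vec, target)) / len(target)


best_vec, best_score = best_first_effect_search(4, toy_loss)
print(best_vec, best_score)  # (0, 1, -1, 0) 0.0
```

In this sketch, high-scoring candidates are never explicitly deleted; they stay at the back of the priority queue and are effectively pruned once the expansion budget runs out.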

This completes our exploration of IVNTR's neural-informed symbolic search - the key insight that neural learning performance can guide symbolic effect discovery!

## 8. Advanced Understanding

So far, we have gone through how IVNTR:
- Trains neural networks with effect vector labels.
- Performs symbolic effect search with neural guidance.

To invent all the predicates in the Satellites domain, see `scripts/train/satellites/satellites_biplan.sh` and the predicate learning configuration in `predicators/config/satellites/pred.yaml`.

Next, we highlight some more advanced implementation details of predicate learning.

### Binary Predicates

`IsCalibrated(?sat)` is a unary predicate that takes the feature vector of a single entity (an object, here a satellite) as input. For binary predicates, the input is the concatenation of the two entities' feature vectors. For details, see `predicators/approaches/bilevel_learning_approach.py::gen_graph_data`.
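
As a minimal sketch of the concatenation idea, the snippet below builds one input row per ordered entity pair. The helper `pairwise_inputs` and the feature values are hypothetical; the actual data layout produced by `gen_graph_data` may differ.

```python
from itertools import product


def pairwise_inputs(features_a, features_b):
    """Concatenate every (a, b) feature pair into one classifier input row.

    Hypothetical helper for illustration only.
    """
    return [list(a) + list(b) for a, b in product(features_a, features_b)]


# Made-up features: 2 entities with 3 features, 3 entities with 2 features.
first_args = [[0.1, 0.2, 1.0], [0.5, 0.6, 0.0]]
second_args = [[0.9, 0.9], [0.3, 0.4], [0.7, 0.1]]

rows = pairwise_inputs(first_args, second_args)
print(len(rows), len(rows[0]))  # 6 5
print(rows[0])                  # [0.1, 0.2, 1.0, 0.9, 0.9]
```

Each row has the first argument's features followed by the second argument's, so the classifier input dimension is the sum of the two feature dimensions.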

### Variable Bindings

The IVNTR algorithm implicitly assumes that a given predicate appears at most once in the effect set of an operator (as an add effect, a delete effect, or no effect). However, there are cases where a single predicate appears twice. For example, in the `Blocks` domain, `Clear(?block)` appears in both the add effects and the delete effects of `UnStack`:
```
UnStack::
    ?x0:block, ?x1:block, ?x2:robot
    pre-cond:
        HandEmpty(?x2),
        On(?x0,?x1),
        Clear(?x0)
    add-eff:
        Holding(?x2,?x0),
        Clear(?x1)
    del-eff:
        HandEmpty(?x2),
        On(?x0,?x1),
        Clear(?x0)
```
In this case, `Clear(?block)` can't be fully captured by [0,1,-1] effect vectors, because a single entry cannot be both an add effect and a delete effect. To tackle this, each predicate is additionally associated with a variable binding index, and we only learn and evaluate the specified variable binding (that is, which variable in the operator is bound to the predicate). The code uses `ent_idx` (default 0) for this.

Specifically, we invent two predicates to capture the original `Clear(?block)`: `Clear1(?block)_0` (bound only to the first block variable in any operator) and `Clear2(?block)_1` (bound only to the second block variable in any operator). This way, [0,1,-1] effect vectors can still represent the effect distribution (put another way, the variable binding is itself part of the effect search).
For more details, try the `Blocks` domain and read the code at `predicators/approaches/bilevel_learning_approach.py::ent_idx`.
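
To make the binding idea concrete, the sketch below looks up the effect-vector entry of `Clear` in `UnStack` for each binding index. The operator data is copied from the listing above; `effect_entry` and the constants are hypothetical and only mimic the role `ent_idx` plays.

```python
ADD, DELETE, NONE = 1, -1, 0

# UnStack parameters and its Clear effects, copied from the operator above.
unstack_params = ["?x0", "?x1", "?x2"]
param_types = {"?x0": "block", "?x1": "block", "?x2": "robot"}
add_effects = [("Clear", "?x1")]
delete_effects = [("Clear", "?x0")]


def effect_entry(pred_name, pred_type, ent_idx):
    """Effect-vector entry for `pred_name` bound to the ent_idx-th variable
    of type `pred_type` in UnStack. Hypothetical helper for illustration."""
    typed_vars = [v for v in unstack_params if param_types[v] == pred_type]
    var = typed_vars[ent_idx]
    if (pred_name, var) in add_effects:
        return ADD
    if (pred_name, var) in delete_effects:
        return DELETE
    return NONE


entry_first_block = effect_entry("Clear", "block", 0)   # binds ?x0
entry_second_block = effect_entry("Clear", "block", 1)  # binds ?x1
print(entry_first_block, entry_second_block)  # -1 1
```

With one fixed binding per predicate, each operator contributes exactly one entry, so the two bindings of `Clear` no longer collide in a single slot.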


### Negation and Quantifiers

Invented predicates can be augmented with negation (`Not`) and quantifiers (`ForAll` and `Exist`); the quantifiers can also be applied to different variables. The augmented predicates are added to the predicate pool before predicate selection. For details, see L1725 of `predicators/approaches/bilevel_learning_approach.py`.
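
A minimal sketch of how such augmentations can wrap a base classifier is shown below. Everything here is hypothetical: the state is a plain dict, `is_calibrated` is a threshold stand-in for a learned neural classifier, and the wrapper names do not correspond to the repository's classes.

```python
def is_calibrated(state, sat):
    """Stand-in for a learned unary classifier (hypothetical threshold)."""
    return state[sat]["calibrated"] > 0.5


def negate(pred):
    """Not: true exactly when the base predicate is false."""
    return lambda state, *objs: not pred(state, *objs)


def for_all(pred):
    """ForAll over the single variable of a unary predicate."""
    return lambda state: all(pred(state, obj) for obj in state)


def exists(pred):
    """Existential quantifier over the single variable of a unary predicate."""
    return lambda state: any(pred(state, obj) for obj in state)


# Made-up state with two satellites.
state = {"sat0": {"calibrated": 1.0}, "sat1": {"calibrated": 0.0}}

not_calibrated = negate(is_calibrated)
all_calibrated = for_all(is_calibrated)
some_calibrated = exists(is_calibrated)

print(not_calibrated(state, "sat1"))  # True
print(all_calibrated(state))          # False
print(some_calibrated(state))         # True
```

Quantifying over the only variable of a unary predicate yields a predicate with no arguments; for a binary predicate, the quantifier could be applied to either variable, which is why several augmented variants can enter the pool.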

### Predicate Selection

After we obtain the pool of neural predicates (low-loss classifiers), we conduct a hill-climbing search similar to [previous work](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=CMcsygMAAAAJ&citation_for_view=CMcsygMAAAAJ:J_g5lzvAfSwC). The core algorithm evaluates a set of predicates by:

1. Using the predicates to learn an operator set.
2. Using the operator set to do task planning.
3. Comparing the length of the generated task plan with the demonstration length, and additionally capturing search statistics (num_nodes_expanded / created).

The implementation is in `predicators/approaches/bilevel_learning_approach.py::_select_predicates_by_score_search` and `predicators/predicate_search_score_functions.py::_OperatorBeliefScoreFunction`.
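
The selection loop can be sketched as a greedy hill climb over predicate subsets. This is a simplified illustration, not the repository's search: `hill_climb_select`, `toy_score`, and the predicate names are hypothetical, and the toy score replaces the real operator-learning and planning evaluation with a made-up penalty.

```python
def hill_climb_select(candidates, score_fn):
    """Greedily add the predicate that lowers the score most; stop when
    no single addition improves it. Lower scores are better."""
    selected = frozenset()
    best = score_fn(selected)
    while True:
        options = [(score_fn(selected | {c}), c)
                   for c in sorted(candidates - selected)]
        if not options:
            break
        score, choice = min(options)
        if score >= best:
            break
        selected, best = selected | {choice}, score
    return selected, best


# Hypothetical pool: two predicates that planning needs, one that it does not.
needed = {"P_useful_1", "P_useful_2"}
pool = {"P_useful_1", "P_useful_2", "P_noise"}


def toy_score(pred_set):
    """Made-up score: heavy penalty per missing needed predicate, plus a
    small penalty per selected predicate to discourage useless ones."""
    return 10 * len(needed - pred_set) + 0.1 * len(pred_set)


chosen, final_score = hill_climb_select(pool, toy_score)
print(sorted(chosen))  # ['P_useful_1', 'P_useful_2']
```

The small size penalty plays the role of a simplicity preference: once every useful predicate is selected, adding `P_noise` only raises the score, so the search stops.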
