# Calling External Python Functions in pyRDDLGym

This introductory notebook shows how to write external Python functions and call them from within RDDL domain description files.

In [1]:
%pip install --quiet --upgrade pip
%pip install --quiet pyRDDLGym rddlrepository

Note: you may need to restart the kernel to use updated packages.


Import the required packages:

In [2]:
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
import numpy as np
import pyRDDLGym
from pyRDDLGym.core.policy import RandomAgent
from rddlrepository.core.manager import RDDLRepoManager

Next, define the RDDL domain and instance descriptions. The `cpfs` block of the domain contains the external function call:

In [3]:
domain_text = """
domain my_domain {

    types {
        obj : object;
    };

    pvariables {
        state(obj) : { state-fluent, real, default = 0.0 };
        action(obj) : { action-fluent, real, default = 0.0 };
    };
    
    cpfs {
        state'(?o) = $MyFunctionCall[?o](state(_), action(_));
    };
    
    reward = sum_{?o : obj} pow[state'(?o) - 4, 2];
    
    action-preconditions {
        forall_{?o : obj} [action(?o) >= -10 ^ action(?o) <= 10];
    };
}
"""

instance_text = """
non-fluents nf_simple {
    domain = my_domain;
    objects {
        obj : { o1, o2 };
    };
}

instance simple_inst {
    domain = my_domain;
    non-fluents = nf_simple;
    max-nondef-actions = pos-inf;
    horizon = 5;
    discount = 1.0;
}
"""

# register the domain and instance with rddlrepository
manager = RDDLRepoManager(rebuild=True)
manager.register_domain(
    "ExternalFuncDomain", "standalone", domain_text,
    desc="domain with external function call", viz=None)
problem_info = manager.get_problem("ExternalFuncDomain_standalone")
problem_info.register_instance("1", instance_text)
# rebuild the repository so that the new domain and instance can be looked up
RDDLRepoManager(rebuild=True)

Domain <ExternalFuncDomain> was successfully registered in rddlrepository with context <standalone>.
Instance <1> was successfully registered in rddlrepository for domain <ExternalFuncDomain_standalone>.


<rddlrepository.core.manager.RDDLRepoManager at 0x1922af6d5e0>

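The line `state'(?o) = $MyFunctionCall[?o](state(_), action(_));` in the `cpfs` block calls an external Python function registered under the name `MyFunctionCall`. The function receives the state and action vectors (the `_` placeholder passes the values for all objects at once) and returns the next-state vector, whose entries are assigned to `state'(?o)`.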

Next, define the external function in Python. Its signature must match the call in the RDDL description: here, two arguments, the state vector and the action vector. The function body can be any computation that Python can execute, including control flow such as loops or recursion, external packages, and ML frameworks, which makes external calls a way to extend the functionality of RDDL:

In [4]:
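def my_function(state_vec, action_vec):
    # state_vec and action_vec hold one entry per object of type obj
    vec = state_vec + action_vec
    # halve the whole vector until its largest absolute entry is at most 1
    while np.max(np.abs(vec)) > 1.0:
        vec = vec / 2.0
    return vec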
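Because the function is plain Python, it can be checked on its own before it is wired into the simulator. The sketch below repeats the function body and feeds it the actions from the first two steps of the rollout output printed at the end of this notebook, starting from the default state of 0.0. The `(o1, o2)` ordering of the arrays is an assumption made for illustration; the function itself does not depend on the order.

```python
import numpy as np

def my_function(state_vec, action_vec):
    # same body as the notebook cell above
    vec = state_vec + action_vec
    while np.max(np.abs(vec)) > 1.0:
        vec = vec / 2.0
    return vec

# Actions and resulting states copied from the logged rollout, ordered (o1, o2).
actions = [
    np.array([-6.7809906005859375, -0.6362280249595642]),  # step 0
    np.array([9.153914451599121, -3.7849085330963135]),    # step 1
]
logged_states = [
    np.array([-0.8476238250732422, -0.07952850311994553]),  # after step 0
    np.array([0.5191431641578674, -0.2415273147635162]),    # after step 1
]

state = np.array([0.0, 0.0])  # default state value from the domain
matches = []
for step, (action, logged) in enumerate(zip(actions, logged_states)):
    state = my_function(state, action)
    # compare with a small tolerance to absorb floating-point differences
    matches.append(bool(np.allclose(state, logged, atol=1e-6)))
    print(step, state, matches[-1])
```

At step 0 the vector is halved three times, so both entries are divided by 8 even though the `o2` entry already lies within [-1, 1]: the loop condition looks at the largest absolute entry of the whole vector.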

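Finally, instruct pyRDDLGym to use the above function when compiling and simulating the domain and instance, by mapping the RDDL function name `MyFunctionCall` to the Python callable in the `python_functions` entry of `backend_kwargs`: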

In [5]:
env = pyRDDLGym.make("ExternalFuncDomain_standalone", "1", 
                     backend_kwargs={'python_functions': {'MyFunctionCall': my_function}})
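Since the Python signature has to match the RDDL call, it can help to check the mapping before handing it to `pyRDDLGym.make`. The following is a minimal sketch of such a check; `check_arity` is a hypothetical helper written for this illustration, not part of pyRDDLGym, and it is independent of any validation pyRDDLGym itself may perform. The `my_function` shown here is only a stand-in with the same two-argument signature.

```python
import inspect

def check_arity(functions, expected_args):
    """Hypothetical helper: verify that each registered callable accepts
    the expected number of positional arguments."""
    for name, fn in functions.items():
        if not callable(fn):
            raise TypeError(f"{name} is not callable")
        try:
            # Signature.bind raises TypeError if the argument count does not fit
            inspect.signature(fn).bind(*([None] * expected_args[name]))
        except TypeError as exc:
            raise TypeError(
                f"{name} cannot be called with "
                f"{expected_args[name]} positional arguments"
            ) from exc

def my_function(state_vec, action_vec):
    # stand-in with the same signature as the notebook's my_function
    return state_vec

python_functions = {"MyFunctionCall": my_function}

# $MyFunctionCall[?o](state(_), action(_)) passes two arguments.
check_arity(python_functions, {"MyFunctionCall": 2})
```

The call returns silently when the signatures fit and raises `TypeError` otherwise, for example if a one-argument function were registered under `MyFunctionCall`.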

Let's execute the environment with a random policy:

In [6]:
agent = RandomAgent(action_space=env.action_space, num_actions=env.max_allowed_actions)
agent.evaluate(env, episodes=1, verbose=True)

initial state = 
     state___o1 = 0.0  state___o2 = 0.0 
-----------------------------------------------------------------------------------------------------------------------------------------------
step   = 0
action = 
     action___o2 = -0.6362280249595642  action___o1 = -6.7809906005859375 
state  = 
     state___o1 = -0.8476238250732422   state___o2 = -0.07952850311994553 
reward = 40.142009557185794
done   = False
-----------------------------------------------------------------------------------------------------------------------------------------------
step   = 1
action = 
     action___o2 = -3.7849085330963135  action___o1 = 9.153914451599121   
state  = 
     state___o1 = 0.5191431641578674   state___o2 = -0.2415273147635162 
reward = 30.10691827351391
done   = False
-----------------------------------------------------------------------------------------------------------------------------------------------
step   = 2
action = 
     action___o2 = -5.282477855682373  actio

{'mean': np.float64(169.65103546343275),
 'median': np.float64(169.65103546343275),
 'min': np.float64(169.65103546343275),
 'max': np.float64(169.65103546343275),
 'std': np.float64(0.0)}
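
The logged rewards can be checked by hand against the reward expression `sum_{?o : obj} pow[state'(?o) - 4, 2]` from the domain. The sketch below recomputes the step-0 reward from the next-state values shown in the log. It is a plain-Python check and does not use the simulator.

```python
# Next-state values logged at step 0.
next_state = {
    "o1": -0.8476238250732422,
    "o2": -0.07952850311994553,
}

# reward = sum over objects of (state' - 4)^2
reward = sum((value - 4.0) ** 2 for value in next_state.values())

logged_reward = 40.142009557185794
print(reward, abs(reward - logged_reward) < 1e-6)
```

Because the external function keeps every state entry within [-1, 1], each object's term lies between 9 and 25, which is consistent with the per-step rewards in the log.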