3. Interface 🛠️

John Yang edited this page Jun 27, 2023 · 1 revision

This section describes the features that the IntercodeEnv base class takes care of, along with recommended usage for building on top of the gym API class definition.

The diagram below visualizes the function call workflow for constructing a custom Intercode environment.

Class Definition

The IntercodeEnv interface exposes the following methods. Those that every IntercodeEnv subclass must implement are marked with the warning emoji (⚠️).

  • __init__(self, data_path: str, image_name: str, **kwargs)
    • Create data loader (IntercodeDataLoader instance) from dataset
    • Use given docker image name to create + connect with a docker container
    • Initialize logging handler
    • Keyword Arguments (**kwargs)
      • verbose (bool): If true, logging is enabled and environment interactions are shown to standard output
      • traj_dir (str): If a valid path is provided, task episode summaries are saved to the given directory (generated by save_trajectory)
      • preprocess (callable): If provided, this function is run before every task episode. It is a way to provide task-level customization of the execution environment.
  • reset(self, index: int = None) -> Tuple[str, Dict]
    • Retrieve task record from data loader
    • Call reset_container
    • Reset task level logger, instance variables
  • step(self, action: str) -> Tuple[str, int, bool, Dict]
    • Log (action, observation)
    • Invoke exec_action on action argument
    • If action is submit, invoke get_reward and save_trajectory
  • save_trajectory(self)
    • Creates .json file that saves information per task episode.
  • close(self)
    • Safely exit or stop any resources (e.g. the docker container) used by the environment
  • exec_action(self, action: str) -> None ⚠️
    • Handles logic for executing action within the docker container context
  • get_reward(self) -> Tuple[float, Dict] ⚠️
    • Handles reward calculation of the actions within the context of the task episode with respect to the gold command(s).
  • reset_container(self) -> None ⚠️
    • Handles resetting of the execution container (e.g. resetting the file system to its original state)
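
As a rough illustration of the contract above, the sketch below shows a subclass that fills in the three ⚠️ methods. To keep it runnable without Docker or the intercode package, it uses a stand-in base class (SketchEnv, a hypothetical name, not the real IntercodeEnv) that mirrors only the reset/step call flow described in this list. The toy EchoEnv subclass, its dict-based "container", and the float reward are all assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Tuple


class SketchEnv(ABC):
    """Hypothetical stand-in for IntercodeEnv; mirrors only the call flow listed above."""

    def __init__(self, records: List[Dict[str, str]]):
        self.records = records  # stands in for the data loader
        self.record: Dict[str, str] = {}
        self.observation: Optional[str] = None
        self.trajectory: List[Tuple[str, Optional[str]]] = []

    def reset(self, index: int = 0) -> Tuple[str, Dict]:
        self.record = self.records[index]  # retrieve the task record
        self.reset_container()             # restore the execution context
        self.observation, self.trajectory = None, []
        return self.record["query"], {}

    def step(self, action: str) -> Tuple[Optional[str], float, bool, Dict]:
        if action == "submit":
            reward, info = self.get_reward()
            return self.observation, reward, True, info
        self.exec_action(action)
        self.trajectory.append((action, self.observation))
        return self.observation, 0.0, False, {}

    @abstractmethod
    def exec_action(self, action: str) -> None: ...

    @abstractmethod
    def get_reward(self) -> Tuple[float, Dict]: ...

    @abstractmethod
    def reset_container(self) -> None: ...


class EchoEnv(SketchEnv):
    """Toy subclass: the 'container' is a dict and each action is stored as the answer."""

    def reset_container(self) -> None:
        self.state: Dict[str, str] = {}

    def exec_action(self, action: str) -> None:
        self.state["answer"] = action
        self.observation = f"stored: {action}"

    def get_reward(self) -> Tuple[float, Dict]:
        correct = self.state.get("answer") == self.record["gold"]
        return (1.0 if correct else 0.0), {"answer": self.state.get("answer")}


env = EchoEnv([{"query": "What is 2 + 2?", "gold": "4"}])
query, _ = env.reset(0)
env.step("4")
obs, reward, done, info = env.step("submit")
```

A real subclass would replace the dict with calls into the Docker container and compare the container's state or output against the gold command(s) in get_reward.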

The following visualization conveys how each of these methods is invoked and how they relate to one another.

Notable Features

IntercodeEnv takes care of several things under the hood:

Execution Context: Intercode relies on Docker to construct flexible and expressive code environments. Docker images package an application together with its dependencies and can be custom built from a Dockerfile; Intercode uses them because they cover a wide range of operating systems, languages, dependencies, and packages. Given the name of a docker image, the Intercode base class creates and connects to a container that serves as a sandboxed execution environment. The container variable is exposed so that subclasses can define how to:

  • (in exec_action) Execute an action within the container
  • (in get_reward) Get information from the container for reward calculation purposes
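
A minimal sketch of the first bullet, assuming the container variable is a Container object from the Docker SDK for Python, whose exec_run method returns an (exit_code, output) pair. The helper name run_in_container and the FakeContainer stand-in are hypothetical; the stand-in exists only so the example runs without a Docker daemon.

```python
from typing import Any, Tuple


def run_in_container(container: Any, command: str) -> Tuple[int, str]:
    """Run a shell command through a docker-py style container handle."""
    # Container.exec_run returns (exit_code, output); output is bytes by default.
    exit_code, output = container.exec_run(["/bin/sh", "-c", command])
    return exit_code, output.decode("utf-8", errors="replace").strip()


class FakeContainer:
    """Hypothetical stand-in that echoes the command instead of executing it."""

    def exec_run(self, cmd):
        return 0, f"ran: {cmd[-1]}\n".encode()


exit_code, observation = run_in_container(FakeContainer(), "ls /tmp")
```

Inside exec_action, a helper like this would typically store the decoded output as the observation; inside get_reward, the same call could read files or state from the container for comparison against the gold command.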

Datasets: A dataset can be fed into Intercode for training and evaluation purposes. Currently, Intercode can handle datasets for tasks that involve either 1. Natural Language Query to Code or 2. Natural Language Query to Answer. Assuming the dataset:

  1. Is a .pkl, .json, .csv, or .tsv file, and
  2. Contains the fields query and gold,

Intercode takes care of dataset validation and loads a new dataset record for each task episode upon a reset() call. The data_loader variable is exposed to subclasses as a way to access the IntercodeDataLoader class instance.
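
The validation described here can be sketched roughly as follows. This is not the data loader's actual implementation: load_records is a hypothetical helper that uses only the standard library and assumes each file holds a list of records (for .pkl and .json, a list of dicts).

```python
import csv
import json
import pickle
import tempfile
from pathlib import Path
from typing import Dict, List

REQUIRED_FIELDS = {"query", "gold"}


def load_records(path: str) -> List[Dict]:
    """Load a dataset file and check that every record has the required fields."""
    suffix = Path(path).suffix.lower()
    if suffix == ".json":
        with open(path, encoding="utf-8") as f:
            records = json.load(f)
    elif suffix in (".csv", ".tsv"):
        delimiter = "," if suffix == ".csv" else "\t"
        with open(path, newline="", encoding="utf-8") as f:
            records = list(csv.DictReader(f, delimiter=delimiter))
    elif suffix == ".pkl":
        with open(path, "rb") as f:
            records = pickle.load(f)  # only unpickle files you trust
    else:
        raise ValueError(f"Unsupported dataset format: {suffix}")

    for i, record in enumerate(records):
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            raise ValueError(f"Record {i} is missing fields: {sorted(missing)}")
    return records


# Usage: write a one-record JSON dataset to a temporary directory and load it.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "tasks.json"
    path.write_text(
        json.dumps([{"query": "List files in /tmp", "gold": "ls /tmp"}]),
        encoding="utf-8",
    )
    records = load_records(str(path))
```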

Logging: Intercode also logs interactions with the environment to standard output and, if the traj_dir argument is provided, saves episode-level interaction trajectories as .json files to that directory.
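
To make the trajectory behaviour concrete, here is a hedged sketch of what saving one episode might look like. The function name save_trajectory_sketch, the JSON layout, the file naming, and the logger name are all assumptions for illustration; the real save_trajectory may record different fields.

```python
import json
import logging
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple

logger = logging.getLogger("intercode_sketch")  # hypothetical logger name


def save_trajectory_sketch(
    traj_dir: Optional[str],
    task_index: int,
    query: str,
    turns: List[Tuple[str, str]],
    reward: float,
) -> Optional[Path]:
    """Write one episode summary as JSON; do nothing when traj_dir is not set."""
    if not traj_dir:
        return None
    out_dir = Path(traj_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    summary = {
        "query": query,
        "turns": [{"action": a, "observation": o} for a, o in turns],
        "reward": reward,
    }
    out_path = out_dir / f"episode_{task_index}.json"  # file naming is an assumption
    out_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    logger.info("Saved trajectory to %s", out_path)
    return out_path


# Usage: save a single-turn episode to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_trajectory_sketch(tmp, 0, "List files", [("ls", "a.txt")], 1.0)
    loaded = json.loads(saved.read_text(encoding="utf-8"))
```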