Command IR

Simon816 edited this page Sep 14, 2019 · 5 revisions

Introduction

Command IR is an Intermediate Representation for Minecraft commands.

It follows design choices similar to those of LLVM IR.

The IR is designed to be readable by both machines and humans. It is not intended to be written manually, given its verbosity. Rather, a higher-level tool should be used to generate the IR.

Command Block Assembly outputs Command IR for both the ASM and C languages. A new language is being drafted that will also use Command IR. Besides programming languages, a tool has been developed that takes a datapack and converts it to Command IR. This allows arbitrary datapacks to be optimized. The tool is still a work in progress and has not been released yet.

The IR is primarily designed to be used from the Python interface, but it also has a textual representation that can be parsed. The full grammar is specified in grammar.lark (written in the Lark grammar format).

Key syntax:

  • $foo denotes a variable called foo.
  • @func refers to a function called func.
  • :label refers to a label (basic block) called label.
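The three prefixes can be told apart by their first character alone. The following is a minimal Python sketch of that idea; `classify` and `SIGILS` are hypothetical names made up for illustration and are not part of the real parser, for which grammar.lark remains the authoritative definition.

```python
# Hypothetical helper: classify an IR operand by its leading sigil.
# This is NOT the real parser; grammar.lark is the authoritative grammar.
SIGILS = {"$": "variable", "@": "function", ":": "label"}


def classify(token):
    """Return (kind, name) for a token such as '$foo', '@func' or ':label'."""
    kind = SIGILS.get(token[:1])
    if kind is None or len(token) < 2:
        raise ValueError(f"not a sigil-prefixed name: {token!r}")
    return kind, token[1:]


if __name__ == "__main__":
    for tok in ("$foo", "@func", ":label"):
        print(tok, "->", classify(tok))
```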

An entire command program can be represented in the IR (except for auxiliary data such as loot tables), without needing additional context.

Several basic optimizers have been written and more are possible. It is also hoped that optimizers can be written to specifically optimize commands and not just scoreboard arithmetic.

Outline

There are 4 primary components:

  • Preamble - Specifies meta information and definitions. There is a global preamble and a per-function preamble.
  • Function - A container of Basic Blocks. Functions can be called from other functions.
  • Basic Block - A linear sequence of instructions with no branching. The last instruction is either a branch to another block or a return from the function.
  • Instruction - A single execution unit. Some instructions are "virtual", meaning they are executed by the compiler; all others are executed at runtime.
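To make the containment relationship concrete, here is a rough Python model of the four components. The class names, fields and the `terminator` flag are assumptions chosen for this sketch and do not reflect the actual Python interface. The instruction strings are taken from the hello world program in the next section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instruction:
    text: str                 # textual form, e.g. 'ret'
    terminator: bool = False  # assumed flag: True if it ends a basic block


@dataclass
class BasicBlock:
    label: str
    instructions: List[Instruction] = field(default_factory=list)

    def is_well_formed(self):
        """Exactly one terminator, and it must be the last instruction."""
        if not self.instructions:
            return False
        *body, last = self.instructions
        return last.terminator and not any(i.terminator for i in body)


@dataclass
class Function:
    name: str
    preamble: List[str] = field(default_factory=list)
    blocks: List[BasicBlock] = field(default_factory=list)


hello = Function(
    name="hello",
    preamble=["$all_players = selector a", "extern"],
    blocks=[BasicBlock("begin", [
        Instruction("$message = text"),
        Instruction('text_append $message, "Hello, "'),
        Instruction('text_append $message, "World!"'),
        Instruction("text_send $message, $all_players"),
        Instruction("ret", terminator=True),
    ])],
)

if __name__ == "__main__":
    print(all(block.is_well_formed() for block in hello.blocks))
```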

Hello World

Here is a basic hello world program:

function hello {
    preamble {
        $all_players = selector a
        extern
    }

    begin:
    $message = text
    text_append $message, "Hello, "
    text_append $message, "World!"
    text_send $message, $all_players
    ret
}

Using the Datapack Definition file hello.dpd:

[Datapack]
namespace=hello
place location=0 56 0

MCC can be invoked to generate a datapack hello.zip:

python main.py hello.ir hello.dpd

The resulting datapack has a function: hello.zip/data/hello/functions/hello.mcfunction

tellraw @a [{"text":"Hello, "},{"text":"World!"}]
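Note how the two text_append instructions end up as two components of a single tellraw command. The Python sketch below reproduces that output string from a list of fragments. It only illustrates the shape of the result and says nothing about how MCC builds it internally; the `tellraw` helper is a made-up name.

```python
import json


def tellraw(selector, parts):
    """Join text fragments into a single tellraw command string."""
    components = [{"text": part} for part in parts]
    # Compact separators reproduce the spacing of the command shown above.
    payload = json.dumps(components, separators=(",", ":"))
    return f"tellraw {selector} {payload}"


if __name__ == "__main__":
    print(tellraw("@a", ["Hello, ", "World!"]))
```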

Instructions

There are many instructions available in Command IR. For the full list and associated documentation, see the Instruction Reference.

The IR is completely type-safe, i.e. a valid program must use types compatible with each instruction.

Concepts

Global entity

The global entity is an armor stand that is summoned automatically when the datapack loads.

The entity is not used by users directly, but it is used internally for several purposes.

Stack

A global stack exists to handle actions typically performed using a stack, such as function calls.

The stack is stored in the NBT of the global entity.

Stack Frames

The elements on the global stack are not individual variables. Instead, the stack holds "stack frames".

A stack frame is simply an NBT list of a known size. All elements in the list are known by the compiler so they can be referenced directly by index.

To get a clear picture, here is a diagram showing what the stack would look like with the call sequence A -> B -> C:

---------------------
| Stack Frame for C | <-- Top of stack
---------------------
| Stack Frame for B |
---------------------
| Stack Frame for A |
---------------------

This is in fact a simplification: there can be a secondary stack frame holding parameters, saved registers and return values.

Let's say function A calls function B, where B has formal parameters (p0, p1) and returns two values (r0, r1). Let's also say that function A has a variable in register R0 which needs preserving.

This is what the stack looks like once execution has moved to function B:

-------------------------------
| Frame[Local variables in B] | <-- Top of stack
-------------------------------
| Frame[p0, p1, r0, r1, R0]   |
-------------------------------
| Frame[Local variables in A] |
-------------------------------

B is aware of the frame sandwiched between the two local variable frames and can read/write to variables inside it. A is responsible for setting up and destroying the intermediate frame. B will write its return values into r0 and r1 which A copies somewhere (also restoring R0) before destroying the frame.
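The calling convention described above can be simulated with plain Python lists. This is a toy model, not generated code: the slot order of the intermediate frame, the values 3, 4 and 99, and the arithmetic B performs are all invented for the example.

```python
# Toy model: the global stack is a list of frames, each frame a fixed-size
# list whose slots are addressed by index (as with the NBT lists above).
stack = []
registers = {"R0": 0}

# Assumed slot layout of the intermediate frame: [p0, p1, r0, r1, R0]
P0, P1, RET0, RET1, SAVED_R0 = range(5)


def function_b():
    stack.append([0])                          # push B's local variable frame
    shared = stack[-2]                         # frame sandwiched below B's locals
    registers["R0"] = shared[P0] + shared[P1]  # B is free to clobber R0
    shared[RET0] = registers["R0"]             # r0 = p0 + p1
    shared[RET1] = shared[P0] * shared[P1]     # r1 = p0 * p1
    stack.pop()                                # B discards its own locals


def function_a():
    stack.append([0, 0])                         # A's local variable frame
    registers["R0"] = 99                         # live value that must survive
    stack.append([3, 4, 0, 0, registers["R0"]])  # A sets up the frame
    function_b()
    shared = stack.pop()                         # A destroys the frame...
    stack[-1][0] = shared[RET0]                  # ...after copying r0 and r1
    stack[-1][1] = shared[RET1]
    registers["R0"] = shared[SAVED_R0]           # ...and restoring R0
    return tuple(stack.pop())


if __name__ == "__main__":
    print(function_a(), registers["R0"])
```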

Variables

Variables are stored in one of two locations: in NBT or in a scoreboard objective. NBT variables are stored in the stack frame as described above.

The compiler prefers to store variables in scoreboard objectives because of their fast storage/retrieval as well as being able to perform arithmetic directly.

All normal variables are stored on the global entity. The exception is "entity local" variables.

Where a variable is stored is determined by the allocator. Note: The allocator is still a work in progress and is subject to change.

Scoreboard objectives are treated like registers in real hardware. Local variables are assigned to registers at compile time. Unlike real hardware, the number of registers is flexible. Currently the allocator creates a fixed set of 4 registers.

To allow recursion, local variables cannot be globally allocated in scoreboard objectives.

Because registers are re-used by other functions, their state must be preserved during a function call. This is described in the stack frame section above.

If there are more variables than registers, or if the variable type is not compatible with a scoreboard objective, then the variable is stored in a fixed-size NBT list in the stack frame.

There are currently two types of values a variable can hold: i32 and nbt.

i32 is a 32-bit signed integer (NBT type TAG_Int, and the only type native to scoreboard objectives).

nbt is any NBT structure. This can only be stored in a stack frame variable. There are no type guarantees for the NBT structure itself: the program writer must keep track of the structure of each NBT variable.
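The placement rules in this section can be summarised as a small decision procedure. The sketch below uses a greedy first-come-first-served policy, which is purely an assumption: the real allocator is a work in progress and may choose differently. `allocate` and `NUM_REGISTERS` are illustrative names.

```python
NUM_REGISTERS = 4  # the fixed register count mentioned above


def allocate(variables):
    """Map (name, type) pairs to ('register', n) or ('nbt', slot).

    Assumed policy: i32 variables take registers in declaration order
    until none are left; everything else spills to the stack frame.
    """
    placement = {}
    next_register = 0
    next_slot = 0
    for name, var_type in variables:
        if var_type == "i32" and next_register < NUM_REGISTERS:
            placement[name] = ("register", next_register)
            next_register += 1
        else:
            # Either no registers are left or the type is not scoreboard-compatible.
            placement[name] = ("nbt", next_slot)
            next_slot += 1
    return placement


if __name__ == "__main__":
    demo = [("a", "i32"), ("b", "nbt"), ("c", "i32"),
            ("d", "i32"), ("e", "i32"), ("f", "i32")]
    for name, where in allocate(demo).items():
        print(name, where)
```

In this run, b is placed in NBT because of its type, while f is placed in NBT only because the four registers are already taken.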

Entity Local

Entity locals are variables that are not singletons, i.e. they are not attached to the global entity. They are ordinary scoreboard objectives and therefore cannot be stored in NBT.

The first step is to define a scoreboard objective:

$local_var = objective "local_var", NULL

To access (read or write) the variable, an entity needs to be specified. For example, to get the value of local_var on the command sender (@s):

$sender = selector s
$sender_local_var = entity_local_access $local_var, $sender

Now the value can be read/written like any other variable. Note that the selector is evaluated on every access, so the entity it resolves to may change, e.g. when the sender is changed by /execute as.

$sender_local_var += 10
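The effect of re-evaluating the selector can be seen in a small Python model. Everything here is a stand-in: `EntityLocal`, the `context` dict playing the role of @s, and the entity names are invented, and treating a missing score as 0 is a simplification.

```python
class EntityLocal:
    """A score on one objective, addressed through a selector."""

    def __init__(self, scores, selector):
        self.scores = scores      # entity name -> score for one objective
        self.selector = selector  # callable, re-evaluated on every access

    def get(self):
        # Simplification: an entity without a score reads as 0.
        return self.scores.get(self.selector(), 0)

    def add(self, amount):
        entity = self.selector()
        self.scores[entity] = self.scores.get(entity, 0) + amount


if __name__ == "__main__":
    local_var = {}                    # plays the role of the objective
    context = {"sender": "alice"}     # who @s currently resolves to
    sender_local_var = EntityLocal(local_var, lambda: context["sender"])

    sender_local_var.add(10)          # like: $sender_local_var += 10
    context["sender"] = "bob"         # like running under /execute as bob
    sender_local_var.add(10)
    print(local_var)
```

The same variable updates a different entity's score once the sender changes, because it stores the selector rather than a fixed entity.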

Special command block

A command block is placed in the world when the datapack is loaded. This command block is an unconditional repeating block. It initially has no command set.

The intended use-case for the command block is as a synchronisation primitive.

It may be desirable to delay execution of code for one tick (to let the rest of the game run).

To wait a tick, at the end of a basic block, put set_command_block :callback. This will write a command into the command block such that the basic block callback is invoked on the next tick. Because it's a repeating command block, callback should call clear_command_block to clear itself from the command block.

With this primitive, the async/await pattern can be implemented with the use of a stack. The push_function and set_command_block_from_stack instructions enable this.

This feature can also be used as a form of Dynamic dispatch and potentially for Coroutines if a tick delay is not a problem.
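The wait-one-tick pattern can be sketched as a tiny tick loop in Python. The `World` class and the two block functions are hypothetical; only the names set_command_block and clear_command_block come from the instructions described above.

```python
class World:
    def __init__(self):
        self.command_block = None  # repeating block, initially no command
        self.log = []

    def set_command_block(self, callback):
        self.command_block = callback

    def clear_command_block(self):
        self.command_block = None

    def tick(self):
        """One game tick: the repeating block runs its command, if any."""
        if self.command_block is not None:
            self.command_block(self)


def first_half(world):
    world.log.append("before wait")
    world.set_command_block(second_half)  # last instruction of the block


def second_half(world):
    world.clear_command_block()  # otherwise it would run again every tick
    world.log.append("after wait")


if __name__ == "__main__":
    world = World()
    first_half(world)
    for _ in range(3):
        world.tick()
    print(world.log)
```

Although three ticks are simulated, the second half runs only once because it clears the command block on entry.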

Position Utility Entity

This entity is a second armor stand spawned alongside the global entity, but is designed for end-user usage. It is called the position utility because the primary use-case is moving it to a position in the world (maybe to set a block at some dynamic location). It is always available as a variable $pos_util of type EntityRef.

Event Handlers

There are two types of event handlers: tag-based and advancement-based. The two tag-based events are minecraft:tick, which fires every game tick, and minecraft:load, which fires when the datapack is reloaded.

Advancement event handlers rely on the fact that advancements can trigger functions when granted. Refer to the Advancements page on the Minecraft Wiki to see what conditions are available for different advancements.

Here is an example definition of an event handler:

preamble {
    $event = event "minecraft:placed_block"
    add_event_condition $event, "item.item", "minecraft:stone"
    event_handler @on_placed_stone, $event
}

function on_placed_stone {
    begin:
    ...
}

This will call the on_placed_stone function whenever a player places a stone block.
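For orientation, an advancement whose only reward is a function call has roughly the shape built by the Python sketch below. How the compiler actually translates add_event_condition is not documented here, so expanding the dotted path "item.item" into nested objects is an assumption, as are the criterion name "event" and the "hello" namespace.

```python
import json


def build_advancement(trigger, conditions, reward_function):
    """Build an advancement dict that runs a function when granted."""
    nested = {}
    for dotted_path, value in conditions.items():
        node = nested
        *parents, leaf = dotted_path.split(".")
        for key in parents:
            node = node.setdefault(key, {})  # descend, creating levels as needed
        node[leaf] = value
    return {
        "criteria": {
            # "event" is a hypothetical criterion name.
            "event": {"trigger": trigger, "conditions": nested},
        },
        "rewards": {"function": reward_function},
    }


if __name__ == "__main__":
    advancement = build_advancement(
        "minecraft:placed_block",
        {"item.item": "minecraft:stone"},
        "hello:on_placed_stone",  # assumed namespace
    )
    print(json.dumps(advancement, indent=2))
```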

Running Command IR

Don't want to install Python locally? Try the online demo at https://www.simon816.com/minecraft/assembler/ and select "Command IR" in the language drop-down.

See the MCC README for how to run and configure MCC.

Command IR files must have a filename ending in .ir for MCC to recognise them.

Command IR is designed to output commands compatible with Minecraft 1.14. If no NBT stack commands are emitted then it may work on 1.13 too.