PIR (Parrot Intermediate Representation) is a way to program the parrot virtual machine that is easier to use than PASM (Parrot Assembler). PASM notation is like any other assembler-like format and can be used directly, but it is more verbose and gives too much power to the user. PIR abstracts common operations and conventions into a syntax that more closely resembles a high-level language. PIR allows the programmer to write code that more naturally expresses their intent without worrying about setting up the exact details that PASM requires to function properly.

This article shows the basics of programming in PIR. More advanced topics will appear in later articles.

In order to test the PIR and PASM code in this article, a parrot virtual machine is needed (henceforth just "parrot"). Parrot is available from http://parrot.org. Just download the latest release, or check out the current development version from the Git repository. The programs in this article were tested with Parrot 0.8.1.

Parrot is very easy to compile on unix-like and Microsoft Windows operating systems: just run perl Configure.pl && make in the root directory of the parrot source and, if everything works correctly, a parrot executable should appear. At the time of writing, the make install target does not work properly, so this and other articles assume that the parrot executable is invoked from the parrot root directory.

If you do not want to compile your own Parrot you can download a pre-compiled binary from http://www.parrot.org/source.html.

Before we get started with the examples, here's a quick overview of parrot's architecture.

Parrot is a register-based virtual machine. It provides four types of registers:

  • I: integer
  • N: floating-point number
  • S: string
  • P: PMC (Polymorphic Container)

In order to designate a register in PASM, use the character indicating the type (I, N, S or P) and the register number. For instance, in order to use register 10 of type integer, you'd write I10. In this series of articles, we will mainly focus on programming PIR.

In PIR, you would type the $ character in front of the register, to indicate a virtual register. For instance, the integer registers are $I0, $I1 and so on. The PMC registers hold arbitrary data objects and are parrot's mechanism for implementing more complex behavior than the ones that can be expressed using the other 3 register types alone.

A virtual register is mapped to an actual register by the register allocator. You can use as many registers as you want, and the register allocator will allocate them as needed.

PMCs will be covered in more detail in a future article. Examples in this article will focus on the first 3 register types.

Let me start with a simple and typical example:
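A minimal sketch of such a program; the greeting text is my own choice and the subroutine name main is arbitrary:

```pir
# hello.pir: the smallest useful PIR program
.sub main :main
    print "Hello world!\n"
.end
```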

To run it, save the code in a hello.pir file and pass it to the parrot virtual machine:

   ./parrot hello.pir

Note that I am using a relative path to parrot given that I didn't install it into the system.

The keywords starting with a dot (.sub and .end) are PIR directives. They are used together to define subroutines. The name of the subroutine follows the .sub keyword. The keyword that starts with a colon (:main) is a pragma that tells parrot that this is the main body of the program and that it should start by executing this subroutine. By the way, I could write .sub foo :main and Parrot would use the foo subroutine as the main body of the program. The actual name of the subroutine does not matter as long as it has the :main pragma. If you don't specify the :main pragma on any subroutine, then parrot will start executing the first subroutine in the source file. The full set of pragmas is defined in "pdds/pdd19_pir.pod" in docs.

Before going into more details about subroutines and calling conventions, let's compare some PIR syntax to the equivalent PASM.

If I want to add two integer registers using PASM I would use the Parrot set opcode to put values into registers, and the add opcode to add them, like this:
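A short PASM sketch of this; the values 5 and 3 and the choice of registers are only illustrative:

```pasm
    set I1, 5
    set I2, 3
    add I0, I1, I2    # I0 now holds 5 + 3
    print I0
    print "\n"
    end               # a PASM program has to stop explicitly
```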

PIR includes infix operators for these common opcodes. I could write this same code as
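A sketch of the same addition in PIR, again with illustrative values:

```pir
.sub main :main
    $I1 = 5
    $I2 = 3
    $I0 = $I1 + $I2    # compiles down to the add opcode
    print $I0
    print "\n"
.end
```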

As you would expect, there are the four arithmetic operators, as well as six comparison operators, which return a boolean value:
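A sketch listing them on integer registers; the operand values are arbitrary and each result simply overwrites $I0:

```pir
.sub main :main
    $I1 = 7
    $I2 = 2

    # arithmetic
    $I0 = $I1 + $I2
    $I0 = $I1 - $I2
    $I0 = $I1 * $I2
    $I0 = $I1 / $I2     # integer division, since both operands are integers

    # comparisons: $I0 becomes 1 (true) or 0 (false)
    $I0 = $I1 == $I2
    $I0 = $I1 != $I2
    $I0 = $I1 <  $I2
    $I0 = $I1 <= $I2
    $I0 = $I1 >  $I2
    $I0 = $I1 >= $I2
.end
```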

I can also use the short accumulation-like operators, like +=.
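For instance, as a small sketch:

```pir
.sub main :main
    $I0 = 5
    $I0 += 3     # same as $I0 = $I0 + 3
    $I0 *= 2     # same as $I0 = $I0 * 2
    print $I0
    print "\n"
.end
```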

Another PIR perk is that local variable names may be declared and used instead of register names. For that I just need to declare the variable using the .local keyword with any of the four data types available in PIR: int, string, num and pmc:
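A sketch with a few named variables; the names first, second, total and label are my own and carry no special meaning:

```pir
.sub main :main
    .local int first, second, total
    .local string label

    first  = 5
    second = 3
    total  = first + second
    label  = "total: "

    print label
    print total
    print "\n"
.end
```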

Note that all registers, both numbered and named, are consolidated by the Parrot register allocator, assigning these "virtual registers" to actual registers as needed. The register allocator even coalesces two virtual names onto the same physical register when it can prove that they have non-overlapping lifetimes, so there is no need to be stingy with register names. To see the actual registers used, use pbc_disassemble on the *.pbc output. You can generate a Parrot Byte Code (PBC) file as follows:

   ./parrot -o foo.pbc --output-pbc foo.pir

Then, use pbc_disassemble in order to disassemble it:

   ./pbc_disassemble foo.pbc

Branches are another place where PIR simplifies PASM. Basically, when I want to test a condition and jump to another place in the code, I would write the following PASM code:
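A sketch built around the le (less or equal) opcode; the register values and the printed messages are illustrative:

```pasm
    set I1, 2
    set I2, 3
    le I1, I2, LESS_EQ      # branch if I1 <= I2
    print "greater\n"
    end
LESS_EQ:
    print "less or equal\n"
    end
```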

That is, if I1 is less than or equal to I2, jump to the label LESS_EQ. In PIR I would write it in a more legible way:
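A sketch of the PIR form, with illustrative values and messages:

```pir
.sub main :main
    $I1 = 2
    $I2 = 3
    if $I1 <= $I2 goto LESS_EQ
    print "greater\n"
    goto DONE
  LESS_EQ:
    print "less or equal\n"
  DONE:
    .return ()
.end
```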

PIR includes the unless keyword as well.
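A sketch of unless, which branches when the condition is false:

```pir
.sub main :main
    $I0 = 0
    unless $I0 goto IS_ZERO     # taken because $I0 is 0
    print "non-zero\n"
    .return ()
  IS_ZERO:
    print "zero\n"
.end
```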

Subroutines can easily be created using the .sub keyword shown before. If you do not need parameters, it is just as simple as I show in the following code:
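A minimal sketch, with a main subroutine that calls a parameterless hello:

```pir
.sub main :main
    hello()
.end

.sub hello
    print "Hello World\n"
.end
```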

Now, I want to make my hello subroutine a little more useful, such that I can greet other people. For that I will use the .param keyword to define the parameters hello can handle:
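A sketch of that version; the parameter name person and the names passed in are placeholders:

```pir
.sub main :main
    hello("leo")
    hello("chip")
.end

.sub hello
    .param string person
    print "Hello "
    print person
    print "\n"
.end
```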

If I need more parameters I just need to add more .param lines.

To return values from PIR subroutines I use the .return keyword, followed by one or more arguments, just like this:
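A sketch using a hypothetical subroutine named increment that returns a single value:

```pir
.sub main :main
    $I0 = increment(5)
    print $I0
    print "\n"
.end

.sub increment
    .param int a
    inc a            # the inc opcode adds 1 in place
    .return (a)
.end
```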

The calling subroutine can accept these values. If you want to retrieve only one value (or only the first value, in case multiple values are returned), write this:
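As a sketch, with a hypothetical square subroutine on the other end of the call:

```pir
.sub main :main
    $I0 = square(4)      # a single result goes on the left of =
    print $I0
    print "\n"
.end

.sub square
    .param int n
    $I0 = n * n
    .return ($I0)
.end
```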

To accept multiple values from such a function, use a parenthesized results list:
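A sketch with a hypothetical sum_and_diff subroutine that returns two integers:

```pir
.sub main :main
    ($I0, $I1) = sum_and_diff(7, 2)
    print $I0
    print " "
    print $I1
    print "\n"
.end

.sub sum_and_diff
    .param int x
    .param int y
    $I0 = x + y
    $I1 = x - y
    .return ($I0, $I1)
.end
```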

Now, for a slightly more complicated example, let me show how I would code a factorial subroutine:
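One way to write it, as a sketch; the name fact and the label RECURSE are my own choices:

```pir
.sub main :main
    $I0 = fact(5)
    print "5! = "
    print $I0
    print "\n"
.end

.sub fact
    .param int n
    if n > 1 goto RECURSE
    .return (1)              # base case: 0! and 1! are both 1
  RECURSE:
    $I0 = n - 1
    $I1 = fact($I0)          # recursive call
    $I1 *= n
    .return ($I1)
.end
```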

This example also shows that PIR subroutines may be recursive just as in a high-level language.

Like some other languages, such as Python and Perl, PIR supports named arguments.

As before, I need to use .param for each named argument, but I also need to specify a flag indicating that the parameter is named:
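A sketch using the :named flag; the subroutine name func is a placeholder, and the call in main is only there to make the example complete:

```pir
.sub main :main
    func("foo" => 10)
.end

.sub func
    .param int a :named("foo")    # caller says "foo", the body says a
    print a
    print "\n"
.end
```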

The subroutine will receive an integer named "foo", and inside of the subroutine that integer will be known as "a".

When calling the function, I need to pass the names of the arguments. For that there are two syntaxes:
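Both forms side by side, as a sketch against a hypothetical func that takes one named integer:

```pir
.sub main :main
    func(10 :named("foo"))    # flag syntax
    func("foo" => 10)         # arrow syntax, same meaning
.end

.sub func
    .param int a :named("foo")
    print a
    print "\n"
.end
```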

Note that with named arguments, you may rearrange the order of your parameters at will.

This subroutine may be called in any of the following ways:
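As a sketch, suppose a subroutine foo (a hypothetical name, as are the parameter names) declares three named parameters; the calls in main are then equivalent:

```pir
.sub main :main
    foo("name" => "Fred", "age" => 35, "gender" => "m")
    foo("gender" => "m", "name" => "Fred", "age" => 35)
    foo("m" :named("gender"), 35 :named("age"), "Fred" :named("name"))
.end

.sub foo
    .param string who   :named("name")
    .param int    years :named("age")
    .param string sex   :named("gender")
    print who
    print " "
    print years
    print " "
    print sex
    print "\n"
.end
```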

and any other permutation you can think of as long as you use the named argument syntax. Note that any positional parameters must be passed before the named parameters. So, the following is allowed:
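A sketch of the allowed ordering, with a hypothetical greet subroutine that takes one positional and one named parameter:

```pir
.sub main :main
    greet("Fred", "age" => 35)
.end

.sub greet
    .param string person                 # positional parameter first
    .param int    years :named("age")    # named parameter afterwards
    print person
    print " is "
    print years
    print "\n"
.end
```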

Whereas the following is not:
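A sketch of the rejected form; the subroutine name broken is hypothetical, and this block is intentionally not valid PIR:

```pir
.sub broken
    .param int a :named("name")
    .param int b        # not allowed: positional parameter after a named one
.end
```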

It's also possible to use named syntax when returning values from subroutines. In the .return directive I write:

and when calling the function, I will do:
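Putting both halves together as a sketch; func is a placeholder name, and I am assuming the arrow form is accepted on the result side just as it is for arguments:

```pir
.sub main :main
    ("foo" => $I0, "bar" => $I1) = func()
    print $I0
    print " "
    print $I1
    print "\n"
.end

.sub func
    .return ("bar" => 20, "foo" => 10)    # order does not matter, names do
.end
```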

$I0 will then hold 10 and $I1 will hold 20, as expected.

To conclude this first article on PIR and to let you test what you learned, let me show you how to do input in PASM (and hence also in PIR). There is a read opcode to read from standard input. Just pass it a string register or variable where you wish the characters read to be placed and the number of characters you wish to read:
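A sketch using the two-argument form of read described above, as available in the Parrot release this article targets; the prompt text is my own, and the string read may still carry the trailing newline:

```pir
.sub main :main
    print "Your name? "
    read $S1, 100        # read up to 100 characters from standard input
    print "Hello, "
    print $S1
.end
```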

This line will read 100 characters (or until the end of the line) and put the read string into $S1. In case you need a number, just assign the string to the correct register type:
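A sketch of that conversion, assuming the user types a whole number:

```pir
.sub main :main
    print "Enter a number: "
    read $S1, 100
    $I1 = $S1            # string to integer conversion
    $I1 += 1
    print "Plus one: "
    print $I1
    print "\n"
.end
```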

With the PIR syntax shown in this article you should be able to start writing simple programs. In the next article we will look into the available Polymorphic Containers (PMCs) and how they can be used.

Alberto Simões

  • Jonathan Scott Duff