Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Failed to load latest commit information.|
IMCC 0.0.9.5 imcc is the Intermediate Code Compiler for Parrot. The language it compiles is currently termed Parrot Intermediate Language (PIR). Why? Writing a compiler is a large undertaking. We are trying to take some of the load off of potential language designers, including the designers of the Perl6 compiler. We can provide a common back-end for Parrot that does: Register Allocation and Spillage Constant folding and expression evaluation Instruction selection Optimization Bytecode generation This way, language designers can get right to work on Tokenizing, parsing, type checking AST/DAG production Then they can simply feed PIR to imcc which will compile directly to Parrot bytecode, altogether skipping the reference assembler. So far, all the compiler does (besides translating the PIR to pasm) is register allocation and spilling. I like Steve Muchnick's MIR language, and I'm taking a few things from it. Presently you can write code with unlimited symbolics or named locals and imcc will translate to pasm. I expect the IR compiler to be FAST, simple, and maintainable, and never develop featuritis; however I want it to be adequate for all languages targetting parrot. Did I mention that it needs to be FAST? Register Allocation The allocator uses graph-coloring and du-chains to assign registers to lexicals and symbolic temporaries. One weakness of the allocator is the lack of branch analysis. A brute force method is used in the du-chain computation where we assume any symbol is live from the time it was first used until either the last time it was used or the last branch instruction. This is being replaced with directed graphs of basic blocks and flow analysis. Optimization We break the instructions into a directed graph of basic blocks. The plan is to translate to SSA form to make optimizations easier. Why C and Bison? Until Perl6 compiles itself (and does it fast), a Bison parser is the easiest to maintain. An additional, important benefit, is C-based parsers are pretty darn fast. Currently assembling Parrot on the fly is still relatively slow. More documentation s. the docs subdir Language Reference Variables, Registers and Constants Variables are simple identifiers, anything that is NOT a register or constant is a legal identifier. The syntax will have to change a bit to support Perl's non-alpha identifiers, which will probably spell renaming the registers. You may use an infinite number of typed temporaries. S=String, I=Int, N=Float, P=Object(or PMC) $S0 = 1 $S1 = $S0 + 2 ... You may also define lexicals with the .local directive below and use them by name. .local int i .local int j i = 4 j = i * i Assigning to a constant is illegal syntax. Directives For this reference, I use the following legend: reg = A symbolic temporary ($S0, $I25) var = A named variable or symbol (i, foo, myArray) IDENTIFIER = An optionally quoted name lval = reg | var rval = reg | var | const lval is allowed as targets of instructions, but not constants, so we refer to operands on the right side as rvals. lvals are also used on the right hand side to indicate that the item can be a const or non-const token. These are tokens that might not directly translate to a specific opcode, and are intentionally left generic. Some of them aren't ops at all, but instructions to the compiler. .sym <type> <IDENTIFIER> .sub <IDENTIFIER> .end .local <type> <IDENTIFIER> .return '(' <rval> ')' .param <type> <lval> .emit .eom Instructions Assignments, calls, branches, etc. Simplified forms of lower level operations. Typically is 1-to-1 or 1-to-2 instruction to Parrot ratio. lval = rval lval = rval + rval lval = rval - rval lval = rval * rval lval = rval / rval lval = rval % rval lval = rval << rval lval = rval >> rval <S.lval> = <S.rval> . <S.rval> For array and string access you can use: <S.lval|P.lval> [ rval ] = rval lval = <S.lval|P.lval> [ rval ] <P.lval> = new <TYPE> dec <lval> inc <lval> end goto <LABEL> if <expression> goto <LABEL> print rval Instructions not known to imcc are looked up in parrot's op_info_table and must have the proper amount and types of arguments. The best way to learn PIR quickly is to study the output of other compilers that use it (Perl6 and Cola), and consult the source file imcc.y. Please mail email@example.com with bug-reports or patches. Original Author: Melvin Smith <firstname.lastname@example.org>, <email@example.com> Active Maintainer: Leopold Toetsch <firstname.lastname@example.org> Contributing Authors: Angel Faus <email@example.com> ... CFG, life analysis Sean O'Rourke <firstname.lastname@example.org> ... anyop, iANY Leopold Toetsch <email@example.com> ... major rewrite numerous bugfixes/cleanup/rewrite optimizer.c run parrot code inside imcc Juergen Boemmels <firstname.lastname@example.org> Macro preprocessor