"Cola" - A compiler for the Parrot/Perl6 VM

  V0.0.8.1

  I started this compiler (I call it Cola) both to teach myself
  how to write a compiler (OK, I admit I squeaked through Compiler
  Design in school because I was too busy playing online MUDs and
  basketball) and to help out a project that I love, Perl6/Parrot!

  To really start having fun with the Parrot runtime I wanted
  a language similar to C/C++/C#/Java.

  The Cola parser is LALR, developed with flex/bison. It targets
  an intermediate language (PIR?) which can be found in
  parrot/languages/imcc or alongside Cola on CPAN.
  IMCC does the register allocation/spill control, optimization
  and various other dirty things, before generating machine
  instructions. Currently the only target is Parrot.

Where to Get the Latest Compiler

  Cola is included in the Parrot distribution;
  see http://www.parrotcode.org or http://dev.perl.org/perl6.
  The two distributions may diverge at some point, so you can
  also grab the latest tarball from CPAN:

    http://cpan.org/authors/id/M/ME/MELVIN

The Syntax

  Compatible (eventually, maybe) C# syntax.

  The easiest way to see what it currently looks like is
  to read the examples.

  I aim to eventually achieve C# source level compliance, targeted
  to the Parrot runtime. Whether or not we do on the fly bytecode
  interfacing with .NET later is a whole different rhinoceros.


Supported Constructs

  For some quick samples, see the cola/examples/ subdirectory.

  Statements: FOR, WHILE, IF, ELSE, BREAK, CONTINUE, RETURN.

    All of these are limited in their current version but follow
    typical C-ish rules (break breaks out of current loop, continue
    shortcuts to the next iteration). return may return a value
    from anywhere inside a method.

    0.0.4 adds logical operators.
    Conditional expressions may now be compound and use logical
    operators (&& and ||).
    CAUTION: Assignments inside conditionals such as...

        if((i = 15) == 15)

    are still not supported. This sort of expression will be fixed
    in the next update.

  Types: int, float, string (classes and arrays coming soon)
    Currently these map directly to the Parrot primitive types.
    Very simple type coercion and checking is supported.

  Variable Declarations:
    For now declare all your variables at the top of each method.
    Block scoping will be done soon.
    You may initialize your variables in the declaration.

  Expressions: Parens, +, -, *, /, %, ++ and -- are supported in
    any combination or complexity, including both the pre- and
    post-increment forms. You may use the + operator on strings
    for concatenation.

    The following works:

        Console.WriteLine("Hello Mr " + name + "\n");
    
    There is no support for concatenation with non-strings yet.
    I need to whip up a few conversion routines.
 
  Comparisons: <, >, <=, >=, !=, ==

  Bitwise operators: <<, >>, |, &, ^, and ~ are now supported in 0.0.4

  Conditional expressions: Commonly known as the ternary conditional...
    As I understand the Java and C# language spec,
    conditional expressions must be on the right hand side of an
    assignment or anywhere that uses the return value, however you
    cannot use ternaries standalone as a statement. So in Cola you can
    say:
           max = i > j ? i : j;
    or:
           max = i > j ? foo() : bar();

    But it is not proper (although C allows it) to say:

           i > j ? printf("max=i\n") : printf("max=j\n");

    If you think I have misread the C# grammar spec, please email me.

  Boolean: The boolean ! operator isn't yet supported.
 
  Objects and Methods:
    Classic C++/Java style, recursion supported. Return types supported.
    Will be adding variable argument support per the C# spec.

    Member variables or "fields" aren't yet supported. You
    can use const definitions, but member variables
    are incomplete pending a little more work on Parrot.
    It's possible to do them now with PerlHash or PerlArray,
    but I'm working on something faster for strictly typed,
    non-dynamic languages.

    Currently instance methods are the same as class methods. So
    you could do:

        Console.WriteLine("");

    or:

        Console c = new Console();
        c.WriteLine("");

    The compiler doesn't yet differentiate between static or class
    methods and instance methods, and whether you are calling them
    as such.

    OOP is really just faked for now, enough for people to write
    in a high level language for Parrot. Like everything, it is a work
    in progress. All objects are simply PerlString references for now.

  Current Builtin Subroutines

    Since I have yet to implement class importing, the system routines
    are just plain wrappers around the Parrot ops. Currently they are:

       strlen, substr, strchop, ord, puts, puti, putf, gets, sleep

    See gen.c and main() for the current kludgy way to patch in more wrappers.

    In calc.cola I also did a sample implementation of a string to int
    conversion called StrToInt().
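
    The idea behind such a conversion can be sketched as follows. This
    is not the code from calc.cola; it is a minimal Python illustration
    that limits itself to operations with obvious counterparts among
    the builtins above (len for strlen, indexing for substr, ord). The
    name str_to_int and the rule of stopping at the first non-digit
    are assumptions made for the example.

```python
def str_to_int(text):
    """Convert a decimal string such as "-123" to an int, one character at a time."""
    result = 0
    sign = 1
    start = 0
    # An optional leading minus sign flips the result.
    if len(text) > 0 and text[0] == "-":
        sign = -1
        start = 1
    for i in range(start, len(text)):
        # ord() gives the character code; subtracting ord("0") gives the digit value.
        digit = ord(text[i]) - ord("0")
        if digit < 0 or digit > 9:
            break  # assumed behaviour: stop at the first non-digit
        result = result * 10 + digit
    return sign * result


print(str_to_int("123"), str_to_int("-45"))
```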


  Arrays:
    Parrot now has a substr with replace op so we can emulate arrays
    on top of it. It is a hack but it works.
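
    One way such an emulation might work is to store every element in
    a fixed-width slot of a single string. The Python sketch below only
    illustrates the idea; the slot width, the helper names and the
    decimal text encoding are all assumptions, and substr_replace is a
    stand-in model of the op, not its actual Parrot signature.

```python
SLOT = 8  # hypothetical fixed width, in characters, of one element


def substr_replace(s, offset, length, replacement):
    """Model of a substr-with-replace op: return s with one slice swapped out."""
    return s[:offset] + replacement + s[offset + length:]


def array_new(count):
    """Create an 'array' of count zeros, each right-aligned in its own slot."""
    return "0".rjust(SLOT) * count


def array_get(arr, index):
    """Read the slot at index and parse it back into an int."""
    return int(arr[index * SLOT:(index + 1) * SLOT])


def array_set(arr, index, value):
    """Return a new string with the slot at index overwritten by value."""
    text = str(value).rjust(SLOT)
    if len(text) > SLOT:
        raise ValueError("value does not fit in one slot")
    return substr_replace(arr, index * SLOT, SLOT, text)


arr = array_new(3)
arr = array_set(arr, 1, 42)
arr = array_set(arr, 2, -7)
print([array_get(arr, i) for i in range(3)])  # prints [0, 42, -7]
```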

What you currently CANNOT do even though the Parser may eat it...
  Statement lists: i = j, j = k;
  Nested assigns: i = (j = 0);
  Fancy For loops: for(i = 0; j++, j < 4; i++)
  Empty For headers: for(;;) // use while(1)
    
Using the Compiler

  You will need Flex and Bison installed to build the compiler.
  These are free, more modern replacements for the classic lex
  and yacc. The grammars should work with standard lex/yacc
  but I've not tested this lately.

  If for some odd reason you have Perl and Parrot but no Flex/Bison,
  send me an email <melvin.smith@mindspring.com> and I might take pity
  on you and send you the generated parser C files.

  Build colac by typing:

        make

  Build imcc:

        cd ../imcc ; make ; cp imcc ../cola

  Usage:

        colac examples/mandelbrot.cola

  Then copy "a.pasm" to your Parrot directory, assemble it and run
  as usual. In case you need help there too, try:

        assemble.pl a.pasm > a.pbc
        parrot a.pbc

  Currently colac is a short Perl pre-processor that includes
  classes for any import statements (using System;).
  If you have trouble with colac you can just use colacc which
  is the raw compiler which ignores 'using' directives.
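
  The kind of expansion described above might look roughly like the
  sketch below. The real colac is a Perl script and this is not its
  code; the function name expand_usings, the in-memory library
  dictionary and the choice to place included classes before the
  program text are all assumptions made for illustration.

```python
import re

# Matches a whole line of the form "using Name;".
USING_RE = re.compile(r"^\s*using\s+(\w+)\s*;\s*$")


def expand_usings(source, library):
    """Replace each 'using Name;' line with the class source stored under Name."""
    body, included, seen = [], [], set()
    for line in source.splitlines():
        match = USING_RE.match(line)
        if match:
            name = match.group(1)
            if name not in seen:  # include each class only once
                seen.add(name)
                included.append(library[name])
        else:
            body.append(line)
    # Assumed layout: included classes first, then the original program.
    return "\n".join(included + body)


library = {"System": "class Console { /* hypothetical class source */ }"}
program = "using System;\nclass Hello { }"
print(expand_usings(program, library))
```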

  Also, if you look in "a.imc" you will see an intermediate
  language. Debugging the compiler is easier by looking at the
  intermediate code. This language can be piped through imcc to
  re-generate the .pasm file, although this is already done by default.

  Currently the compiler is very limited with a few warnings and
  a few simple type coercions.
  
  If you try to do something fancy, the grammar might accept
  it but the code will probably come out wrong or simply crash the
  compiler. Read all the samples before trying anything.
 
Intermediate Code


  Please see the README in the parrot/languages/imcc directory.

  For a nice sample of intermediate code, compile mandelbrot.cola or
  calc.cola which actually does a limited form of parsing with Parrot!


Register Allocation

  Done via graph-coloring. Cola emits intermediate code that uses
  named locals/globals and symbolic temporary registers. IMCC
  handles the allocation and spilling.

  See imcc/README
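
  For readers new to the technique, here is a minimal Python sketch
  of graph-coloring allocation. It is a simplified greedy stand-in,
  not IMCC's actual algorithm: symbolic registers are nodes, an edge
  means two registers are live at the same time, and a node that
  cannot get a free register is marked as spilled. The function name,
  the example graph and the "spill" marker are assumptions.

```python
def color_registers(interference, num_registers):
    """Greedy coloring: map each symbolic register to a number, or to 'spill'."""
    # Visit the most constrained nodes (highest degree) first; ties by name.
    order = sorted(interference, key=lambda n: (-len(interference[n]), n))
    assignment = {}
    for node in order:
        # Registers already taken by neighbours that have been assigned.
        taken = {assignment[n] for n in interference[node] if n in assignment}
        free = [r for r in range(num_registers) if r not in taken]
        assignment[node] = free[0] if free else "spill"
    return assignment


# Hypothetical interference graph for four symbolic temporaries.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(color_registers(graph, 3))
print(color_registers(graph, 2))
```

  With three registers every temporary in this graph gets one; with
  two, one of the mutually interfering temporaries has to be spilled.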

Optimization

  Very limited. No explicit optimization phase yet.

  You might see dumb code generated in the form of:

    set I1, I0
    set I2, I1

  or..

    branch LABEL34
    LABEL34: ...

  generated between basic blocks or in situations where
  the generator is currently dumb.
 
  The plan is to convert intermediate code to SSA (Static
  Single Assignment) form before doing various optimization
  passes.
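
  Until then, the two patterns shown above are the sort of thing a
  simple peephole pass could clean up. The Python sketch below is
  only an illustration of that idea and is not part of Cola or IMCC;
  it assumes one instruction per line and the function name peephole
  is made up. It drops a branch to the label on the very next line,
  and forwards the source of a copy into the copy that immediately
  follows it. Removing the first copy entirely would need liveness
  information, which this sketch does not have.

```python
import re

# Matches a plain register copy such as "set I1, I0".
SET_RE = re.compile(r"^set (\w+), (\w+)$")


def peephole(lines):
    """Apply two local rewrites to a list of PASM-like instruction lines."""
    out = []
    for i, line in enumerate(lines):
        text = line.strip()
        nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
        # Drop "branch L" when the very next line starts with the label "L:".
        if text.startswith("branch ") and nxt.split(" ")[0] == text.split()[1] + ":":
            continue
        # Forward a copy: "set B, A" followed by "set C, B" becomes "set C, A".
        cur = SET_RE.match(text)
        prev = SET_RE.match(out[-1].strip()) if out else None
        if cur and prev and cur.group(2) == prev.group(1):
            line = "set %s, %s" % (cur.group(1), prev.group(2))
        out.append(line)
    return out


code = ["set I1, I0", "set I2, I1", "branch LABEL34", "LABEL34:", "print I2"]
print(peephole(code))
```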

Coming Soon
  Some form of a printf()
  Full array support
    

Gaping Holes

  Arrays
    Strings can be treated as arrays, that's it for now.
    Will be adding full array support soon.
 
  Class/Struct instantiation
    Lotta work involved here, I won't go there for now.
    Parrot currently does not have an adequate bytecode "class format"
    in which to write symbol table information. This will change soon.

  Object Methods and Field references
    Again, a lot of work, but on the list. I'm sort of faking
    method calls for now, no instance calls.