Adjusted spacing and naming for Kazoo tutorial.
chriswailes committed May 14, 2015
1 parent 1e4c028 commit 1860fe7
Showing 49 changed files with 717 additions and 717 deletions.
2 changes: 1 addition & 1 deletion kazoo/chapter 1/Chapter1.md
@@ -67,4 +67,4 @@ rule(/./, :comment)

When attempting to match a substring of the input, RLTK lexers only use the rules that are defined for their current state. The first rule says that when the lexer encounters a `#` it should enter the `:comment` state. The second rule says that if we encounter a newline we should pop the current state off of the state stack, but *only* if we are already in the `:comment` state. Lastly, we add a rule that will discard any single character of input while in the `:comment` state. Since this rule is specified after the newline rule we will never discard a newline.
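
As a rough illustration of the state-stack behaviour described above, here is a minimal sketch in plain Ruby that does not use RLTK at all. The `RULES` table and the `tiny_lex` helper are hypothetical names invented for this example, and rule selection is simplified to "the first rule that matches in the current state", which may differ from how RLTK chooses between competing rules.

```ruby
require 'strscan'

# Each rule is [pattern, state it applies in, action].
# The action receives the matched text and the state stack,
# and returns a token or nil (nil means "discard").
RULES = [
  [/\s/,     :default, ->(_t, _s) { nil }],
  [/[a-z]+/, :default, ->(t, _s)  { [:IDENT, t] }],
  [/#/,      :default, ->(_t, s)  { s.push(:comment); nil }],
  [/\n/,     :comment, ->(_t, s)  { s.pop; nil }],
  [/./,      :comment, ->(_t, _s) { nil }]
]

def tiny_lex(input)
  scanner = StringScanner.new(input)
  states  = [:default]
  tokens  = []

  until scanner.eos?
    # Only rules defined for the state on top of the stack are candidates.
    rule = RULES.find { |pattern, state, _| state == states.last && scanner.match?(pattern) }
    raise "No rule matches at position #{scanner.pos}" unless rule

    text  = scanner.scan(rule[0])
    token = rule[2].call(text, states)
    tokens << token if token
  end

  tokens
end

p tiny_lex("foo # bar baz\nqux")
# prints [[:IDENT, "foo"], [:IDENT, "qux"]]
```

Everything between the `#` and the newline is consumed by the `:comment` rules, so only the two identifiers outside the comment survive.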

And that finishes our lexer for now! The full code for this chapter can be found in the "`examples/kazoo/chapter 1`" directory. Continue on to the [next chapter](file.Chapter2.html) to see how we use RLTK to define AST nodes for Kazoo.
And that finishes our lexer for now! The full code for this chapter can be found in the "`kazoo/chapter 1`" directory. Continue on to the [next chapter](https://github.com/chriswailes/compiler-examples/blob/master/kazoo/chapter%202/Chapter2.md) to see how we use RLTK to define AST nodes for Kazoo.
40 changes: 20 additions & 20 deletions kazoo/chapter 1/klexer.rb
@@ -1,7 +1,7 @@
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Ruby Language Toolkit
# Date: 2011/05/09
# Description: This file defines a simple lexer for the Kazoo language.
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Compiler Examples
# Date: 2011/05/09
# Description: This file defines a simple lexer for the Kazoo language.

# RLTK Files
require 'rltk/lexer'
@@ -12,31 +12,31 @@ class Lexer < RLTK::Lexer
rule(/\s/)

# Keywords
rule(/def/) { :DEF }
rule(/extern/) { :EXTERN }
rule(/def/) { :DEF }
rule(/extern/) { :EXTERN }

# Operators and delimiters.
rule(/\(/) { :LPAREN }
rule(/\)/) { :RPAREN }
rule(/;/) { :SEMI }
rule(/,/) { :COMMA }
rule(/\+/) { :PLUS }
rule(/-/) { :SUB }
rule(/\*/) { :MUL }
rule(/\//) { :DIV }
rule(/</) { :LT }
rule(/\(/) { :LPAREN }
rule(/\)/) { :RPAREN }
rule(/;/) { :SEMI }
rule(/,/) { :COMMA }
rule(/\+/) { :PLUS }
rule(/-/) { :SUB }
rule(/\*/) { :MUL }
rule(/\//) { :DIV }
rule(/</) { :LT }

# Identifier rule.
rule(/[A-Za-z][A-Za-z0-9]*/) { |t| [:IDENT, t] }

# Numeric rules.
rule(/\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\.\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\d+\.\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\.\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\d+\.\d+/) { |t| [:NUMBER, t.to_f] }

# Comment rules.
rule(/#/) { push_state :comment }
rule(/\n/, :comment) { pop_state }
rule(/#/) { push_state :comment }
rule(/\n/, :comment) { pop_state }
rule(/./, :comment)
end
end
2 changes: 1 addition & 1 deletion kazoo/chapter 2/Chapter2.md
@@ -63,4 +63,4 @@ class Function < RLTK::ASTNode
end
```

In the [next chapter](file.Chapter3.html) we will write a parser that takes input from our lexer and uses the AST node definitions to build an AST from our input. The full code for this chapter can be found in the "`examples/kazoo/chapter 2`" directory.
In the [next chapter](https://github.com/chriswailes/compiler-examples/blob/master/kazoo/chapter%203/Chapter3.md) we will write a parser that takes input from our lexer and uses the AST node definitions to build an AST from our input. The full code for this chapter can be found in the "`kazoo/chapter 2`" directory.
8 changes: 4 additions & 4 deletions kazoo/chapter 2/kast.rb
@@ -1,7 +1,7 @@
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Ruby Language Toolkit
# Date: 2011/05/09
# Description: This file defines a simple AST for the Kazoo language.
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Compiler Examples
# Date: 2011/05/09
# Description: This file defines a simple AST for the Kazoo language.

# RLTK Files
require 'rltk/ast'
2 changes: 1 addition & 1 deletion kazoo/chapter 3/Chapter3.md
@@ -126,4 +126,4 @@ loop do
end
```

The driver doesn't do much yet, but in the [next chapter](file.Chapter4.html) we will add support for translating our AST into LLVM intermediate representation. The full code listing for this chapter can be found in the "`examples/kazoo/chapter 3`" directory.
The driver doesn't do much yet, but in the [next chapter](https://github.com/chriswailes/compiler-examples/blob/master/kazoo/chapter%204/Chapter4.md) we will add support for translating our AST into LLVM intermediate representation. The full code listing for this chapter can be found in the "`kazoo/chapter 3`" directory.
8 changes: 4 additions & 4 deletions kazoo/chapter 3/kast.rb
@@ -1,7 +1,7 @@
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Ruby Language Toolkit
# Date: 2011/05/09
# Description: This file defines a simple AST for the Kazoo language.
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Compiler Examples
# Date: 2011/05/09
# Description: This file defines a simple AST for the Kazoo language.

# RLTK Files
require 'rltk/ast'
8 changes: 4 additions & 4 deletions kazoo/chapter 3/kazoo.rb
@@ -1,9 +1,9 @@
#!/usr/bin/ruby

# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Ruby Language Toolkit
# Date: 2011/05/09
# Description: This file is the driver for the Kazoo tutorial.
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Compiler Examples
# Date: 2011/05/09
# Description: This file is the driver for the Kazoo tutorial.

# Tutorial Files
require './klexer'
40 changes: 20 additions & 20 deletions kazoo/chapter 3/klexer.rb
@@ -1,7 +1,7 @@
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Ruby Language Toolkit
# Date: 2011/05/09
# Description: This file defines a simple lexer for the Kazoo language.
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Compiler Examples
# Date: 2011/05/09
# Description: This file defines a simple lexer for the Kazoo language.

# RLTK Files
require 'rltk/lexer'
@@ -12,31 +12,31 @@ class Lexer < RLTK::Lexer
rule(/\s/)

# Keywords
rule(/def/) { :DEF }
rule(/extern/) { :EXTERN }
rule(/def/) { :DEF }
rule(/extern/) { :EXTERN }

# Operators and delimiters.
rule(/\(/) { :LPAREN }
rule(/\)/) { :RPAREN }
rule(/;/) { :SEMI }
rule(/,/) { :COMMA }
rule(/\+/) { :PLUS }
rule(/-/) { :SUB }
rule(/\*/) { :MUL }
rule(/\//) { :DIV }
rule(/</) { :LT }
rule(/\(/) { :LPAREN }
rule(/\)/) { :RPAREN }
rule(/;/) { :SEMI }
rule(/,/) { :COMMA }
rule(/\+/) { :PLUS }
rule(/-/) { :SUB }
rule(/\*/) { :MUL }
rule(/\//) { :DIV }
rule(/</) { :LT }

# Identifier rule.
rule(/[A-Za-z][A-Za-z0-9]*/) { |t| [:IDENT, t] }

# Numeric rules.
rule(/\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\.\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\d+\.\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\.\d+/) { |t| [:NUMBER, t.to_f] }
rule(/\d+\.\d+/) { |t| [:NUMBER, t.to_f] }

# Comment rules.
rule(/#/) { push_state :comment }
rule(/\n/, :comment) { pop_state }
rule(/#/) { push_state :comment }
rule(/\n/, :comment) { pop_state }
rule(/./, :comment)
end
end
40 changes: 20 additions & 20 deletions kazoo/chapter 3/kparser.rb
@@ -1,7 +1,7 @@
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Ruby Language Toolkit
# Date: 2011/05/09
# Description: This file defines a simple parser for the Kazoo language.
# Author: Chris Wailes <chris.wailes@gmail.com>
# Project: Compiler Examples
# Date: 2011/05/09
# Description: This file defines a simple parser for the Kazoo language.

# RLTK Files
require 'rltk/parser'
@@ -18,34 +18,34 @@ class Parser < RLTK::Parser
production(:input, 'statement SEMI') { |s, _| s }

production(:statement) do
clause('e') { |e| e }
clause('ex') { |e| e }
clause('p') { |p| p }
clause('f') { |f| f }
clause('e') { |e| e }
clause('ex') { |e| e }
clause('p') { |p| p }
clause('f') { |f| f }
end

production(:e) do
clause('LPAREN e RPAREN') { |_, e, _| e }

clause('NUMBER') { |n| Number.new(n) }
clause('IDENT') { |i| Variable.new(i) }
clause('NUMBER') { |n| Number.new(n) }
clause('IDENT') { |i| Variable.new(i) }

clause('e PLUS e') { |e0, _, e1| Add.new(e0, e1) }
clause('e SUB e') { |e0, _, e1| Sub.new(e0, e1) }
clause('e MUL e') { |e0, _, e1| Mul.new(e0, e1) }
clause('e DIV e') { |e0, _, e1| Div.new(e0, e1) }
clause('e LT e') { |e0, _, e1| LT.new(e0, e1) }
clause('e PLUS e') { |e0, _, e1| Add.new(e0, e1) }
clause('e SUB e') { |e0, _, e1| Sub.new(e0, e1) }
clause('e MUL e') { |e0, _, e1| Mul.new(e0, e1) }
clause('e DIV e') { |e0, _, e1| Div.new(e0, e1) }
clause('e LT e') { |e0, _, e1| LT.new(e0, e1) }

clause('IDENT LPAREN args RPAREN') { |i, _, args, _| Call.new(i, args) }
clause('.IDENT LPAREN .args RPAREN') { |i, args| Call.new(i, args) }
end

list(:args, :e, :COMMA)

production(:ex, 'EXTERN p_body') { |_, p| p }
production(:p, 'DEF p_body') { |_, p| p }
production(:f, 'p e') { |p, e| Function.new(p, e) }
production(:ex, 'EXTERN p_body') { |_, p| p }
production(:p, 'DEF p_body') { |_, p| p }
production(:f, 'p e') { |p, e| Function.new(p, e) }

production(:p_body, 'IDENT LPAREN arg_defs RPAREN') { |name, _, arg_names, _| Prototype.new(name, arg_names) }
production(:p_body, '.IDENT LPAREN .arg_defs RPAREN') { |name, arg_names| Prototype.new(name, arg_names) }

list(:arg_defs, :IDENT, :COMMA)

24 changes: 12 additions & 12 deletions kazoo/chapter 4/Chapter4-old.md
@@ -4,9 +4,9 @@ In this chapter we will be translating the AST that our parser builds into LLVM

## Code Generation Setup

In order to generate LLVM IR, we need to perform some simple setup to get started. We need to tell LLVM that we will be working on an x86 platform by making a call to the {RLTK::CG::LLVM.init} method. This will go at the top of our 'kjit.rb' file which will hold most of our code for this chapter. Here is a quick outline of the file:
In order to generate LLVM IR, we need to perform some simple setup to get started. We need to tell LLVM that we will be working on an x86 platform by making a call to the {RCGTK::LLVM.init} method. This will go at the top of our 'kjit.rb' file which will hold most of our code for this chapter. Here is a quick outline of the file:

RLTK::CG::LLVM.init(:X86)
RCGTK::LLVM.init(:X86)

class JIT
attr_reader :module
@@ -27,9 +27,9 @@ In order to generate LLVM IR, we need to perform some simple setup to get starte
end
end

The translate methods will emit IR for that AST node along with all the things it depends on, and they all return an LLVM Value object. The {RLTK::CG::Value} class represents a "Static Single Assignment (SSA) register" or "SSA value" in LLVM. The most distinct aspect of SSA values is that their value is computed as the related instruction executes, and it does not get a new value until (and if) the instruction re-executes. In other words, there is no way to "change" an SSA value. For more information you can read the Wikipedia article on [Static Single Assignment](http://en.wikipedia.org/wiki/Static_single_assignment_form) - the concepts are really quite natural once you grok them.
The translate methods will emit IR for that AST node along with all the things it depends on, and they all return an LLVM Value object. The {RCGTK::Value} class represents a "Static Single Assignment (SSA) register" or "SSA value" in LLVM. The most distinct aspect of SSA values is that their value is computed as the related instruction executes, and it does not get a new value until (and if) the instruction re-executes. In other words, there is no way to "change" an SSA value. For more information you can read the Wikipedia article on [Static Single Assignment](http://en.wikipedia.org/wiki/Static_single_assignment_form) - the concepts are really quite natural once you grok them.
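
To make the "no way to change an SSA value" point concrete, here is a small sketch in plain Ruby. It does not use LLVM or the bindings from this tutorial; the `SSAEmitter` class is a hypothetical name made up for illustration, and it only builds text that resembles LLVM IR. The point is that every instruction defines one fresh value, and later instructions refer to that value by name rather than overwriting it.

```ruby
# Hypothetical mini-emitter: each instruction defines exactly one new value.
class SSAEmitter
  attr_reader :lines

  def initialize
    @lines   = []
    @counter = 0
  end

  # Emit an instruction and return the fresh name that holds its result.
  # A name is never reused, so an existing value can never be reassigned.
  def emit(op, lhs, rhs)
    name = "%t#{@counter}"
    @counter += 1
    @lines << "#{name} = #{op} double #{lhs}, #{rhs}"
    name
  end
end

ssa  = SSAEmitter.new
sum  = ssa.emit('fadd', '%a', '%b')  # sum is "%t0"
prod = ssa.emit('fmul', sum, sum)    # prod is "%t1" and reads %t0 twice
puts ssa.lines
```

Computing a "new version" of a quantity always means emitting another instruction and using the name it returns.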

The last bit of setup is to tell the JIT to create a new {RLTK::CG::Module}, {RLTK::CG::Builder}, and symbol table when it is initialized.
The last bit of setup is to tell the JIT to create a new {RCGTK::Module}, {RCGTK::Builder}, and symbol table when it is initialized.

def initialize
# IR building objects.
@@ -38,7 +38,7 @@ The last bit of setup is to tell the JIT to create a new {RLTK::CG::Module}, {RL
@st = Hash.new
end

The {RLTK::CG::Module} class is the LLVM construct that contains all of the functions and global variables in a chunk of code. In many ways, it is the top-level structure that the LLVM IR uses to contain code. The {RLTK::CG::Builder} object is a helper object that makes it easy to generate LLVM instructions. Instances of the Builder class keep track of the current place to insert instructions and have methods to create new instructions. The symbol table (`@st`) keeps track of which values are defined in the current scope and what their LLVM representation is. In this form of Kazoo, the only things that can be referenced are function parameters. As such, function parameters will be in this map when generating code for their function body.
The {RCGTK::Module} class is the LLVM construct that contains all of the functions and global variables in a chunk of code. In many ways, it is the top-level structure that the LLVM IR uses to contain code. The {RCGTK::Builder} object is a helper object that makes it easy to generate LLVM instructions. Instances of the Builder class keep track of the current place to insert instructions and have methods to create new instructions. The symbol table (`@st`) keeps track of which values are defined in the current scope and what their LLVM representation is. In this form of Kazoo, the only things that can be referenced are function parameters. As such, function parameters will be in this map when generating code for their function body.

With these basics in place, we can start talking about how to generate code for each expression.

@@ -47,9 +47,9 @@ With these basics in place, we can start talking about how to generate code for
Generating LLVM code for expression nodes is very straightforward. First we'll do numeric literals:

when Number
RLTK::CG::Double.new(node.value)
RCGTK::Double.new(node.value)

In the LLVM IR, floating point constants are represented with the {RLTK::CG::Float} and {RLTK::CG::Double} classes. This code simply creates and returns a {RLTK::CG::Double} constant.
In the LLVM IR, floating point constants are represented with the {RCGTK::Float} and {RCGTK::Double} classes. This code simply creates and returns a {RCGTK::Double} constant.

References to variables are also quite simple using LLVM. In the simple version of Kazoo, we assume that the variable has already been emitted somewhere and its value is available. In practice, the only values that can be in the symbol table are function arguments. This code simply checks to see that the specified name is in the table (if not, an unknown variable is being referenced) and returns the value for it. In future chapters we'll add support for loop induction variables in the symbol table, and for local variables.
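
The lookup described above can be sketched with an ordinary Ruby `Hash`. This is only an illustration under assumptions: `lookup_variable` is a hypothetical helper name, the `symbol_table` argument stands in for the JIT's `@st`, and a symbol is used where the real table would hold an LLVM value.

```ruby
# Hypothetical helper: `symbol_table` plays the role of the JIT's @st hash.
def lookup_variable(symbol_table, name)
  # Hash#fetch runs the block only when the key is missing,
  # which is exactly the "unknown variable" case.
  symbol_table.fetch(name) { raise "Unknown variable '#{name}'." }
end

st = { 'x' => :llvm_value_for_x }  # placeholder for a real LLVM value
lookup_variable(st, 'x')           # returns :llvm_value_for_x
```

Referencing a name that was never registered, such as `lookup_variable(st, 'y')`, raises a `RuntimeError` instead of silently returning `nil`.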

@@ -119,19 +119,19 @@ Code generation for prototypes and functions must handle a number of details, wh
if fun = @module.functions[node.name]
if fun.blocks.size != 0
raise "Redefinition of function #{node.name}."

elsif fun.params.size != node.arg_names.length
raise "Redefinition of function #{node.name} with different number of arguments."
end
else
fun = @module.functions.add(node.name, RLTK::CG::DoubleType, Array.new(node.arg_names.length, RLTK::CG::DoubleType))
fun = @module.functions.add(node.name, RCGTK::DoubleType, Array.new(node.arg_names.length, RCGTK::DoubleType))
end

The first thing this code does is check to see if a function has already been declared with the specified name. If such a function has been seen before, it then checks to make sure it has the same argument list and a zero-length body. If this function name has not been seen before, a new function is created. The `Array.new(...)` call produces an array that tells LLVM the type of each of the function's arguments.

This code allows function redefinition in two cases: first, we want to allow 'extern'ing a function more than once, as long as the prototypes for the externs match (since all arguments have the same type, we just have to check that the number of arguments match). Second, we want to allow 'extern'ing a function and then defining a body for it. This is useful when defining mutually recursive functions.
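
The redefinition rules can be exercised without LLVM. The sketch below is an assumption-laden stand-in: `FunDecl` and `declare` are hypothetical names, a plain `Hash` replaces `@module.functions`, and a boolean `has_body` replaces the `fun.blocks.size != 0` check.

```ruby
# Hypothetical stand-in for an LLVM function: name, parameter count, body flag.
FunDecl = Struct.new(:name, :arity, :has_body)

# Return the existing declaration if it is compatible, or create a new one.
def declare(functions, name, arity)
  if (fun = functions[name])
    raise "Redefinition of function #{name}." if fun.has_body
    raise "Redefinition of function #{name} with different number of arguments." if fun.arity != arity

    fun
  else
    functions[name] = FunDecl.new(name, arity, false)
  end
end

functions = {}
first  = declare(functions, 'foo', 2)  # first extern creates the declaration
second = declare(functions, 'foo', 2)  # matching extern reuses it
first.equal?(second)                   # returns true
```

Calling `declare(functions, 'foo', 3)` afterwards raises because the argument counts differ, and any further declaration of `foo` raises once `has_body` has been set to `true`.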

The last bit of code for prototypes loops over all of the arguments in the function, setting the name of the LLVM Argument objects to match, and registering the arguments in the symbol table for future use by the `translate_expression` method. Once this is set up, it returns the {RLTK::CG::Function Function} object to the caller. Note that we don't check for conflicting argument names here (e.g. "extern foo(a, b, a)"). Doing so would be very straight-forward with the mechanics we have already used above.
The last bit of code for prototypes loops over all of the arguments in the function, setting the name of the LLVM Argument objects to match, and registering the arguments in the symbol table for future use by the `translate_expression` method. Once this is set up, it returns the {RCGTK::Function Function} object to the caller. Note that we don't check for conflicting argument names here (e.g. "extern foo(a, b, a)"). Doing so would be very straight-forward with the mechanics we have already used above.
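
For the conflicting-argument-name check that the tutorial leaves out, one possible approach is sketched below. The helper name `check_arg_names` is hypothetical and is not part of the tutorial's code; it works on a plain array of names such as `node.arg_names`.

```ruby
# Hypothetical helper: reject prototypes like "extern foo(a, b, a)".
def check_arg_names(arg_names)
  # Collect every name that appears more than once, listed once each.
  duplicates = arg_names.select { |name| arg_names.count(name) > 1 }.uniq
  raise "Duplicate argument name(s): #{duplicates.join(', ')}" unless duplicates.empty?

  arg_names
end

check_arg_names(%w[a b c])  # returns ["a", "b", "c"]
```

A call such as `check_arg_names(%w[a b a])` raises a `RuntimeError` that names `a` as the duplicate.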

# Name each of the function parameters.
returning(fun) do
@@ -157,7 +157,7 @@ Once the insertion point is set up, we call the `translate_expression` method fo
fun.blocks.append('entry', nil, @builder, self) do |jit|
ret jit.translate_expression(node.body)
end

# Verify the function and return it.
returning(fun) { fun.verify }

@@ -217,4 +217,4 @@ Here is how you declare an external function and then call it:

When you quit the current demo, it dumps out the IR for the entire module generated. Here you can see the big picture with all the functions referencing each other.

This wraps up the fourth chapter of the Kazoo tutorial. In the [next chapter](file.Chapter5.html) we'll describe how to add JIT compilation and optimization support to this so we can actually start running code! The full code listing for this chapter can be found in the "`examples/kazoo/chapter 4`" directory.
This wraps up the fourth chapter of the Kazoo tutorial. In the [next chapter](https://github.com/chriswailes/compiler-examples/blob/master/kazoo/chapter%205/Chapter5-old.md) we'll describe how to add JIT compilation and optimization support to this so we can actually start running code! The full code listing for this chapter can be found in the "`kazoo/chapter 4`" directory.