Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Moved some folders around; finished the README.md file, and started a…
…dding the tutorial text to their respective .md files.
1 parent
63d6aa0
commit 163db55
Showing
50 changed files
with
299 additions
and
3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
# Kazoo - Chapter 1: The Lexer | ||
|
||
When it comes to implementing a language, the first thing needed is the ability to process a text file and recognize what it says. The traditional way to do this is to use a *lexer* (aka *scanner*) to break the input up into *tokens*. Each token returned by the lexer includes a token code and potentially some metadata (e.g. the numeric value of a number). | ||
|
||
It is pretty simple to create a lexer using RLTK: | ||
|
||
class Lexer < RLTK::Lexer | ||
|
||
Next we will define a set of rules inside the class definition that will tell the lexer how to convert strings into tokens. The first rule we will add simply discards any whitespace encountered. | ||
|
||
rule(/\s/) | ||
|
||
This is simply a call to the {RLTK::Lexer.rule} class method. The first argument is the regular expression that describes the substrings of the input that match this rule. Most of the rules we will define have actions associated with them that tell the lexer how to emit {RLTK::Token tokens}, but because we simply want to discard any whitespace we can leave the action blank. The default action associated with a rule returns `nil`, and any action that returns `nil` emits no token. | ||
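To make the "an action that returns `nil` emits no token" idea concrete, here is a minimal plain-Ruby sketch. It does not use RLTK or reproduce its internals: the `RULES` table and `emitted_tokens` helper are hypothetical names invented for this illustration, and the sketch simply takes the first rule that matches, which may differ from how RLTK chooses between rules.

```ruby
require 'strscan'

# Hypothetical stand-in for a rule table: [pattern, action] pairs.
# A missing action (nil) behaves like an action that returns nil.
RULES = [
  [/\s/,  nil],
  [/def/, ->(_text) { :DEF }],
  [/\(/,  ->(_text) { :LPAREN }],
  [/\)/,  ->(_text) { :RPAREN }]
].freeze

# Walks the input and collects only the non-nil action results.
def emitted_tokens(input)
  scanner = StringScanner.new(input)
  tokens  = []
  until scanner.eos?
    # Pick the first rule whose pattern matches at the current position.
    pattern, action = RULES.find { |(pat, _)| scanner.match?(pat) }
    raise ArgumentError, "no rule matches at offset #{scanner.pos}" unless pattern

    text   = scanner.scan(pattern)
    result = action&.call(text)
    tokens << result unless result.nil? # nil result: nothing is emitted
  end
  tokens
end
```

Calling `emitted_tokens("def ( )")` in this sketch yields only the three symbols for `def`, `(`, and `)`; the spaces are consumed but produce nothing.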
|
||
Rules for keywords and operators can be added very simply: | ||
|
||
# Keywords | ||
rule(/def/) { :DEF } | ||
rule(/extern/) { :EXTERN } | ||
|
||
# Operators and delimiters. | ||
rule(/\(/) { :LPAREN } | ||
rule(/\)/) { :RPAREN } | ||
rule(/;/) { :SEMI } | ||
rule(/,/) { :COMMA } | ||
rule(/=/) { :ASSIGN } | ||
rule(/\+/) { :PLUS } | ||
rule(/-/) { :SUB } | ||
rule(/\*/) { :MUL } | ||
rule(/\//) { :DIV } | ||
rule(/</) { :LT } | ||
|
||
Each rule has an associated action that returns a single symbol when evaluated. These symbols are used to construct RLTK::Token objects that have their `type` attribute set to the returned symbol. The token objects also contain information about their source file, line number, line offset, and length. | ||
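As a rough picture of the position information described above, the following plain-Ruby sketch computes a line number, line offset, and length for a match. `SketchToken` and `token_at` are hypothetical names for this illustration and are not RLTK classes or methods; the 1-based line numbering and 0-based offset are assumptions, not a statement about RLTK's conventions.

```ruby
# Hypothetical token record; the field names are illustrative only.
SketchToken = Struct.new(:type, :value, :line_number, :line_offset, :length)

# Builds a token for the substring of `source` that begins at index `start`.
def token_at(type, source, start, length)
  preceding   = source[0, start]
  line_number = preceding.count("\n") + 1            # assumed 1-based
  last_break  = preceding.rindex("\n")
  line_offset = last_break ? start - last_break - 1 : start # assumed 0-based
  SketchToken.new(type, nil, line_number, line_offset, length)
end
```

For the input `"def\nextern"`, the `extern` keyword starts at index 4, so `token_at(:EXTERN, "def\nextern", 4, 6)` places it on line 2 at offset 0 under these assumptions.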
|
||
In Kazoo we will use an identifier token type (`:IDENT`) to indicate a variable or function name. We would like these identifiers to start with a letter, but after that they may contain letters or digits. A rule to capture these identifiers can be defined as follows: | ||
|
||
# Identifier rule. | ||
rule(/[A-Za-z][A-Za-z0-9]*/) { |t| [:IDENT, t] } | ||
|
||
Here we specify an action that takes a single parameter, `t`, which will contain the text matched by the rule’s regular expression. The rule’s action then returns an array where the first element is the token’s type, and the second element is the token’s value. | ||
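The identifier pattern can be tried out on its own in plain Ruby. In this sketch the `\A` anchor stands in for "match at the lexer's current position", and `IDENT_PATTERN` and `ident_token` are hypothetical names used only here, not part of RLTK.

```ruby
# The pattern from the identifier rule, anchored to the start of the input.
IDENT_PATTERN = /\A[A-Za-z][A-Za-z0-9]*/

# Returns [:IDENT, text] when the input begins with an identifier, else nil.
def ident_token(input)
  match = IDENT_PATTERN.match(input)
  match && [:IDENT, match[0]]
end
```

With this helper, `ident_token("foo42 + 1")` picks up `foo42`, while `ident_token("42foo")` returns `nil` because an identifier may not start with a digit. An underscore ends the match as well, since the character class does not include it.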
|
||
Next, we’re going to add rules for matching numbers. These rules are very similar to the identifier rule, in that they take their matched text and use it to give the generated token a value, but in this case we convert the text from a string to a float using Ruby’s built-in `to_f` method. | ||
|
||
# Numeric rules. | ||
rule(/\d+/) { |t| [:NUMBER, t.to_f] } | ||
rule(/\.\d+/) { |t| [:NUMBER, t.to_f] } | ||
rule(/\d+\.\d+/) { |t| [:NUMBER, t.to_f] } | ||
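The three numeric patterns overlap (`3.14` begins with something `\d+` also matches), so it helps to see them side by side. The plain-Ruby sketch below tries all three and keeps the longest match before calling `to_f`. Choosing the longest match is an assumption made for this illustration, and `NUMBER_PATTERNS` and `number_token` are hypothetical names, not RLTK API.

```ruby
# The three numeric patterns, anchored to the start of the input.
NUMBER_PATTERNS = [/\A\d+/, /\A\.\d+/, /\A\d+\.\d+/].freeze

# Returns [:NUMBER, float] for the longest numeric prefix, or nil if none.
def number_token(input)
  candidates = NUMBER_PATTERNS.map { |pattern| input[pattern] }.compact
  text       = candidates.max_by(&:length)
  text && [:NUMBER, text.to_f]
end
```

Under this sketch `number_token("42")`, `number_token(".5")`, and `number_token("3.14)")` all produce `:NUMBER` tokens whose values are the floats `"42".to_f`, `".5".to_f`, and `"3.14".to_f` respectively.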
|
||
Not many people are going to need to leave comments in Kazoo code, but comments are a good way to show off some of the more advanced functionality in the {RLTK::Lexer} class, so let's think about the behavior we want from the lexer when it encounters a comment (which starts with a # in Kazoo). As we are only going to support line comments, the lexer should discard all input after a # until it encounters a newline. To achieve this behavior we will use the lexer's state stack. | ||
|
||
# Comment rules. | ||
rule(/#/) { push_state :comment } | ||
rule(/\n/, :comment) { pop_state } | ||
rule(/./, :comment) | ||
|
||
When attempting to match a substring of the input, RLTK lexers only use the rules that are defined for their current state. The first rule says that when the lexer encounters a # it should enter the `:comment` state. The second rule says that if we encounter a newline we should pop the current state off of the state stack, but *only* if we are already in the `:comment` state. Lastly, we add a rule that discards any other single character while in the `:comment` state. Since this rule is specified after the newline rule (and `.` does not match a newline in Ruby unless the `m` flag is set), we will never discard a newline. | ||
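The state-stack behavior described above can be imitated in a few lines of plain Ruby. This is a sketch of the idea only, not RLTK's implementation: `strip_comments` is a hypothetical helper, and instead of emitting tokens it just returns the text that survives outside of comments.

```ruby
require 'strscan'

# Returns the input with everything from a # to the end of its line removed.
def strip_comments(input)
  scanner = StringScanner.new(input)
  states  = [:default] # the top of the stack is the current state
  kept    = +''
  until scanner.eos?
    if states.last == :comment
      if scanner.scan(/\n/)
        states.pop          # the newline ends the comment
      else
        scanner.scan(/./)   # discard one character of the comment
      end
    elsif scanner.scan(/#/)
      states.push(:comment) # enter the comment state
    else
      kept << scanner.getch # ordinary input outside of a comment
    end
  end
  kept
end
```

As in the rules above, the newline that ends a comment is consumed by the state-popping branch, so `strip_comments("a # note\nb")` returns `"a b"`.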
|
||
And that finishes our lexer for now! The full code for this chapter can be found in the `examples/kazoo/chapter 1` directory. Continue on to the [next chapter](file.Chapter2.html) to see how we use RLTK to define AST nodes for Kazoo. |
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Kazoo - Chapter 2 | ||
|
||
To be filled in later. |
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Kazoo - Chapter 3 | ||
|
||
To be filled in later. |
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Kazoo - Chapter 4 | ||
|
||
To be filled in later. |
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Kazoo - Chapter 5 | ||
|
||
To be filled in later. |
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Kazoo - Chapter 6 | ||
|
||
To be filled in later. |
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Kazoo - Chapter 7 | ||
|
||
To be filled in later. |
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Kazoo - Chapter 8 | ||
|
||
To be filled in later. |
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.