Examples

Vinícius Garcia edited this page May 28, 2018 · 8 revisions

This page is still incomplete. Most of the examples will be copied from the jSpy project, which is a modern interpreter built using this framework.

If you are interested, the jSpy project is a good place to see some real-world examples.

Creating statements

Procedural programming languages are often described in terms of statements. Each line of code is usually a statement which might describe an expression such as a = 2 * b or a special structure such as if (a) then b.

In the examples below we will create 3 types of statements:

  1. An expression statement for evaluating simple expressions.
  2. A code block statement that will describe a group of statements (useful for describing function bodies and if then else code blocks).
  3. An example of a named statement such as if (a) then b. This type can be extended and adapted to describe loops, function declarations, variable declarations, etc.

A Base Statement Class

It is useful to be able to keep all statements in a single array. We will therefore define an abstract base class for all statements, which makes this possible and also makes it easier to define the specific types of statements below:

class Statement {
 protected:
  virtual void _compile(const char* code, const char** rest,
                        TokenMap parent_scope) = 0;
  virtual packToken _exec(TokenMap scope) const = 0;

 public:
  virtual ~Statement() {}
  void compile(const char* code, const char** rest = 0,
               TokenMap parent_scope = &TokenMap::empty) {
    return _compile(code, rest, parent_scope);
  }

  packToken exec(TokenMap scope) const { return _exec(scope); }

  // This will allow each statement to be copied:
  virtual Statement* clone() const = 0;
};

With this implementation, all our statements have 2 public functions: compile() and exec(). These rely on 2 protected pure virtual functions, _compile() and _exec(), which we will need to implement for each of our child classes and which are called by compile() and exec() respectively.

Note: You might have noticed that the functions compile() and exec() look redundant next to _compile() and _exec(). The reason I did this is that I wanted default arguments for these functions, and default arguments on virtual functions are error-prone: they are selected by the static type of the caller, not by the overriding class. So I created a non-virtual wrapper for each of the 2 pure virtual functions.
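To illustrate the note above, here is a minimal sketch of how default arguments interact with virtual functions. The classes `Base` and `Derived` and the helper functions are hypothetical and are not part of the framework; they only show that a default declared on a virtual function follows the static type, while a non-virtual wrapper gives every subclass the same default.

```cpp
#include <string>

// Hypothetical classes, used only for this illustration.
struct Base {
  virtual ~Base() {}

  // Default argument on a virtual function:
  // the default is picked from the static type of the caller.
  virtual std::string name(const std::string& prefix = "base:") const {
    return prefix + "Base";
  }

  // Non-virtual wrapper: one default, shared by every subclass.
  std::string describe(const std::string& prefix = "stmt:") const {
    return _describe(prefix);
  }

 protected:
  virtual std::string _describe(const std::string& prefix) const {
    return prefix + "Base";
  }
};

struct Derived : Base {
  std::string name(const std::string& prefix = "derived:") const override {
    return prefix + "Derived";
  }

 protected:
  std::string _describe(const std::string& prefix) const override {
    return prefix + "Derived";
  }
};

// Both helpers call through a reference whose static type is Base.
std::string viaBaseName(const Base& b) { return b.name(); }
std::string viaBaseDescribe(const Base& b) { return b.describe(); }
```

Called on a `Derived` object, `viaBaseName()` runs `Derived::name()` but with the default from `Base`, whereas `viaBaseDescribe()` behaves the same no matter which type the caller sees.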

The _compile() method should describe how this statement should be parsed, i.e. how to break the input text into meaningful structures that can be used later during the evaluation stage.

The _exec() method should describe the evaluation rules of this statement, e.g. an IF statement would execute only one of the 2 code blocks each time depending on the result of the boolean expression.

Note: The exec() functions in this example return a packToken for simplicity. However, it might be useful to return a special class such as "returnState", so that the value produced by a "return" or "throw" statement can be told apart from the value of an ordinary expression.
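As a rough idea of what such a class could look like, here is a minimal sketch. The name `ReturnState`, its fields and the helper `runBlock()` are assumptions made for this illustration, and the value is a plain `int` only to keep the sketch independent of packToken.

```cpp
#include <vector>

// Hypothetical stand-in for the "returnState" idea.
struct ReturnState {
  enum Type { NORMAL, RETURN, THROW };
  Type type;
  int value;
  ReturnState(int value = 0, Type type = NORMAL) : type(type), value(value) {}
};

// Walks the results of a block in order and stops at the first one
// that is not NORMAL, so that "return" and "throw" can propagate.
ReturnState runBlock(const std::vector<ReturnState>& results) {
  ReturnState rv;
  for (const ReturnState& r : results) {
    rv = r;
    if (rv.type != ReturnState::NORMAL) break;
  }
  return rv;
}
```

With a type like this, a block can stop executing as soon as one of its statements reports something other than NORMAL, instead of always running to the end.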

Expression Statement

Now that we have a base class to extend, we can create our very first statement type, the class ExpStatement:

class ExpStatement : public Statement {
  calculator expr;

 private:
  void _compile(const char* code, const char** rest, TokenMap parent_scope);
  packToken _exec(TokenMap scope) const;

 public:
  ExpStatement() {}
  ExpStatement(const char* code, const char** rest = 0,
               TokenMap parent_scope = &TokenMap::empty) {
    _compile(code, rest, parent_scope);
  }
  virtual Statement* clone() const {
    return new ExpStatement(*this);
  }
};

/* * * * * ExpStatement Class .cpp File * * * * */

void ExpStatement::_compile(const char* code, const char** rest,
                            TokenMap parent_scope) {
  // The string ";}\n" lists the delimiters I want for my programming language.
  // Feel free to change it to something more adequate for your own language.
  expr.compile(code, parent_scope, ";}\n", &code);

  // Skip the delimiter character:
  if (*code && *code != '}') ++code;

  if (rest) *rest = code;
}

packToken ExpStatement::_exec(TokenMap scope) const {
  return expr.eval(scope);
}
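The delimiter handling in _compile() is easy to get wrong, so here is a small sketch of just the pointer movement. The helper `endOfStatement()` is hypothetical: it does not parse anything and only mimics where `rest` ends up, assuming the expression contains no nested delimiters.

```cpp
#include <cstring>

// Hypothetical helper that mimics only the pointer handling of
// ExpStatement::_compile().
const char* endOfStatement(const char* code, const char* delims = ";}\n") {
  // Stop at the first delimiter or at the end of the string.
  // The explicit *code test matters: strchr() also matches the final '\0'.
  while (*code && !std::strchr(delims, *code)) ++code;

  // Consume ';' or '\n', but leave '}' for the enclosing block to see.
  if (*code && *code != '}') ++code;

  return code;
}
```

Leaving the '}' in place is what later allows an enclosing block to detect where it ends.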

Creating a BlockStatement

A BlockStatement is useful for describing more complex statements. The main reason for this is that code blocks are used by many other types of statements, e.g.:

  • IF statements usually have two code blocks: the TRUE block and the FALSE block.
  • FOR/WHILE statements usually have 1 code block that will be executed repeatedly.
  • FUNCTION declaration statements can be described as a code block with arguments and a return value.
  • Etc.

The code for this class is slightly more complex than for a normal statement, mainly because it should be easy to add new statements in the future:

class BlockStatement : public Statement {
 public:
  typedef std::map<std::string, Statement* (*)()> statementMap_t;

  // Associate each type of statement with a keyword:
  static statementMap_t& statementMap() {
    static statementMap_t map;
    return map;
  }

  // Use this to register new statements on statementMap().
  template<typename T>
  static Statement* factory() { return new T(); }

 private:
  typedef std::vector<Statement*> codeBlock_t;
  codeBlock_t list;

 private:
  void cleanList(codeBlock_t* list);

 private:
  void _compile(const char* code, const char** rest, TokenMap parent_scope);
  packToken _exec(TokenMap scope) const;

  Statement* buildStatement(const char** source, TokenMap scope);

 public:
  BlockStatement() {}

  // Implement The Big 3, for safely copying:
  BlockStatement(const BlockStatement& other);
  ~BlockStatement();
  BlockStatement& operator=(const BlockStatement& other);

  virtual Statement* clone() const {
    return new BlockStatement(*this);
  }
};

/* * * * * BlockStatement Class in .cpp File: * * * * */

// Decide what type of statement to build:
Statement* BlockStatement::buildStatement(const char** source, TokenMap scope) {
  const char* code = *source;

  // If it is a block statement:
  if (*code == '{') {
    Statement* stmt = new BlockStatement();
    stmt->compile(code, source, scope);
    return stmt;
  }

  // Parse the first word of the text:
  std::string name = rpnBuilder::parseVar(code);

  // Check if it is a reserved word:
  statementMap_t& stmt_map = statementMap();
  auto it = stmt_map.find(name);
  if (it != stmt_map.end()) {
    // If it is parse it and return:
    Statement* stmt = it->second();
    stmt->compile(code+name.size(), source, scope);
    return stmt;
  }

  // Return a normal statement:
  return new ExpStatement(code, source, scope);
}

void BlockStatement::cleanList(codeBlock_t* list) {
  for(auto stmt : *list) {
    delete stmt;
  }

  list->clear();
}

BlockStatement::BlockStatement(const BlockStatement& other) {
  for(const Statement* stmt : other.list) {
    list.push_back(stmt->clone());
  }
}

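BlockStatement& BlockStatement::operator=(const BlockStatement& other) {
  // Guard against self-assignment, which would otherwise delete
  // the statements before they could be cloned:
  if (this == &other) return *this;

  cleanList(&list);
  for(const Statement* stmt : other.list) {
    list.push_back(stmt->clone());
  }
  return *this;
}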

BlockStatement::~BlockStatement() {
  cleanList(&list);
}

void BlockStatement::_compile(const char* code, const char** rest,
                              TokenMap parent_scope) {
  // Make sure the list is empty:
  cleanList(&list);

  while (isspace(*code)) ++code;

  if (*code == '{') {

    // Find the next non-blank character:
    ++code;
    while (isspace(*code)) ++code;

    // Parse each statement of the block:
    while (*code && *code != '}') {
      // Ignore empty statements:
      if (strchr(";\n", *code)) {
        ++code;
      } else {
        list.push_back(buildStatement(&code, parent_scope));
      }

      // Discard blank spaces:
      while (isspace(*code)) ++code;
    }

    if (*code == '}') {
      ++code;
    } else {
      throw syntax_error("Missing a '}' somewhere on the code!");
    }
  } else {
    list.push_back(buildStatement(&code, parent_scope));
  }

  if (rest) *rest = code;
}

packToken BlockStatement::_exec(TokenMap scope) const {
  // Returned value:
  packToken rv;
  for(const auto stmt : list) {
    // In a more complete implementation, `rv` should
    // be checked for "return" or "throw" behaviors.
    rv = stmt->exec(scope);
  }

  return rv;
}

Now it should be possible to evaluate a list of statements, e.g.:

  TokenMap scope;
  BlockStatement code;
  code.compile("{ a = 2; b = 3; c = a+b; }");
  code.exec(scope);
  std::cout << scope["a"] << std::endl; // 2
  std::cout << scope["b"] << std::endl; // 3
  std::cout << scope["c"] << std::endl; // 5
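The copy constructor, destructor and assignment operator of BlockStatement exist because the class owns the statements in its list. As an alternative sketch of the same deep-copy idea, assuming C++11 smart pointers are acceptable, the ownership can be delegated to std::unique_ptr. The classes `Node`, `Leaf` and `Group` are hypothetical stand-ins for Statement, ExpStatement and BlockStatement.

```cpp
#include <memory>
#include <vector>

// Hypothetical minimal hierarchy, standing in for Statement.
struct Node {
  virtual ~Node() {}
  virtual Node* clone() const = 0;
  virtual int value() const = 0;
};

struct Leaf : Node {
  int v;
  explicit Leaf(int v) : v(v) {}
  Node* clone() const override { return new Leaf(*this); }
  int value() const override { return v; }
};

// Owns its children, like BlockStatement owns its statements.
class Group : public Node {
  std::vector<std::unique_ptr<Node>> list;

 public:
  Group() {}

  // Deep copy: clone each child instead of copying the pointer.
  Group(const Group& other) {
    for (const auto& child : other.list) {
      list.push_back(std::unique_ptr<Node>(child->clone()));
    }
  }

  // Copy-and-swap: taking `other` by value makes self-assignment safe.
  Group& operator=(Group other) {
    list.swap(other.list);
    return *this;
  }

  // No destructor is needed: unique_ptr deletes the children.

  // Takes ownership of `child`.
  void add(Node* child) { list.push_back(std::unique_ptr<Node>(child)); }

  Node* clone() const override { return new Group(*this); }

  // Sum of the children, only so that a copy can be observed.
  int value() const override {
    int sum = 0;
    for (const auto& child : list) sum += child->value();
    return sum;
  }
};
```

A copy made this way is independent of the original: adding a child to one `Group` does not change the value of the other.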

Creating Named Statements

Named statements are statements with a special meaning that begin with a reserved word, such as try { ... } catch { ... }, if (...) { ... } else { ... } or function f(...) { ... }.

Since this is only an example, we will describe how to create only an if statement. The class definition is not very complex: it needs to hold 2 BlockStatements and one expression, e.g.:

class IfStatement : public Statement {
  calculator cond;
  BlockStatement _then;
  BlockStatement _else;

 private:
  void _compile(const char* code, const char** rest, TokenMap parent_scope);
  packToken _exec(TokenMap scope) const;

 public:
  IfStatement() {}
  virtual Statement* clone() const {
    return new IfStatement(*this);
  }
};

/* * * * * IfStatement Class in .cpp File: * * * * */

void IfStatement::_compile(const char* code, const char** rest,
                           TokenMap parent_scope) {

  while (isspace(*code)) ++code;

  if (*code != '(') {
    throw syntax_error("Expected '(' after `if` statement!");
  }

  // Parse the condition:
  cond.compile(code+1, parent_scope, ")", &code);

  if (*code != ')') {
    throw syntax_error("Missing ')' after `if` statement!");
  }

  _then.compile(code+1, &code, parent_scope);

  while (isspace(*code)) ++code;

  // Check if the next word is else:
  static const char* str = "else";
  for (int i = 0; i < 4; ++i) {
    if (str[i] != code[i]) {
      if (rest) *rest = code;
      return;
    }
  }
  if (isalnum(code[4]) || code[4] == '_') {
    if(rest) *rest = code;
    return;
  }

  _else.compile(code+4, &code, parent_scope);

  if(rest) *rest = code;
}

packToken IfStatement::_exec(TokenMap scope) const {
  if (cond.eval(scope).asBool()) {
    return _then.exec(scope);
  } else {
    return _else.exec(scope);
  }
}

After that, you need to register this new statement type with the BlockStatement class so it knows how to parse it, e.g.:

struct MyStartup {
  // The constructor runs once, when the global object below is created:
  MyStartup() {
    auto& statementMap = BlockStatement::statementMap();

    statementMap["if"] = BlockStatement::factory<IfStatement>;
  }
} my_startup;
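The registration above relies on two C++ details: the constructor of a global object runs before main(), and a function-local static is constructed on first use, so the map already exists when that constructor touches it. Here is a self-contained sketch of the same pattern. The names `Stmt`, `IfStmt`, `WhileStmt`, `registry()`, `Registrar` and `build()` are hypothetical stand-ins and are not part of the framework.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins for Statement and its subclasses.
struct Stmt {
  virtual ~Stmt() {}
  virtual std::string kind() const = 0;
};
struct IfStmt : Stmt {
  std::string kind() const override { return "if"; }
};
struct WhileStmt : Stmt {
  std::string kind() const override { return "while"; }
};

typedef std::map<std::string, Stmt* (*)()> registry_t;

// Function-local static: constructed on first use, so it is safe to
// call this from the constructor of another static object.
registry_t& registry() {
  static registry_t map;
  return map;
}

template <typename T>
Stmt* factory() { return new T(); }

// The constructor of this global object runs before main().
struct Registrar {
  Registrar() {
    registry()["if"] = factory<IfStmt>;
    registry()["while"] = factory<WhileStmt>;
  }
} registrar;

// Returns a new statement for a keyword, or nullptr if the word is
// not reserved. The caller owns the returned object.
Stmt* build(const std::string& keyword) {
  registry_t::const_iterator it = registry().find(keyword);
  return it == registry().end() ? nullptr : it->second();
}
```

Adding a new statement type then only requires one more line in the registrar; the lookup code does not change.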

Now it should be possible to compile if statements among normal expressions, e.g.:

  TokenMap scope;
  BlockStatement code;
  code.compile(
      "{\n"
      "  a = 2;\n"
      "  if (a < 2) {\n"
      "    b = True;\n"
      "  } else {\n"
      "    b = False;\n"
      "  }\n"
      "}");
  code.exec(scope);
  std::cout << scope["a"] << std::endl; // 2
  std::cout << scope["b"] << std::endl; // False
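The check for the else keyword inside IfStatement::_compile() is likely to be needed again for other named statements (e.g. catch after try). One possible way to factor it out is sketched below. The helper `startsWithKeyword()` is hypothetical and is not provided by the framework.

```cpp
#include <cctype>
#include <cstring>

// Hypothetical helper: true if `code` starts with `word` and the word
// ends there, so "else" matches "else {" but not "elsewhere".
bool startsWithKeyword(const char* code, const char* word) {
  const std::size_t len = std::strlen(word);

  // strncmp() stops at the end of `code`, so short inputs are safe.
  if (std::strncmp(code, word, len) != 0) return false;

  // Cast to unsigned char: isalnum() is undefined for negative values.
  const unsigned char next = static_cast<unsigned char>(code[len]);
  return !(std::isalnum(next) || next == '_');
}
```

The extra test on the character after the word is what keeps an identifier such as elsewhere from being mistaken for the keyword.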