The pocketlang compiler will read the source (a string) and generate bytecode (an array of bytes) which can be interpreted by the pocket VM.
The compilation is a two-step process:
Lexing: Make tokens from the source string (see below)
Parsing: Generate a parse tree (in multi-pass compilers) or bytecode (in single-pass compilers) according to the language grammar (this will be updated in a new ref issue).
Pocketlang is a single-pass compiler, but for the sake of simplicity we'll look at lexing and parsing separately, even though in practice a token is parsed immediately after it is lexed.
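To make the single-pass idea concrete, here is a minimal structural sketch. The names and signatures below (compileSource, compileStatement, isAtEnd) are assumptions made for illustration only, not pocketlang's actual compiler API; only lexToken() is a function mentioned in this write-up.

  // Minimal sketch of a single-pass compile loop: each statement is parsed
  // and its bytecode emitted as soon as its tokens are lexed, so no
  // separate parse tree is ever built.
  #include <stdbool.h>

  typedef struct Compiler Compiler;             // opaque for this sketch

  void lexToken(Compiler* compiler);            // lex the next token
  bool isAtEnd(Compiler* compiler);             // hypothetical end-of-source check
  void compileStatement(Compiler* compiler);    // hypothetical: parse tokens, emit bytecode

  static void compileSource(Compiler* compiler) {
    lexToken(compiler);                         // prime the first token
    while (!isAtEnd(compiler)) {
      compileStatement(compiler);               // consume tokens as they are lexed
    }
  }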
Tokens
Tokens are the result of the lexing process. A token has a type (TokenType) and optionally a value. Here is the simplified pocketlang token type and token declaration. (the source reference).
enum TokenType {
  TK_LPARAN,  // '('   -- no token value
  TK_RPARAN,  // ')'   -- no token value
  TK_NAME,    // foo   -- token value = name of the identifier
  TK_STRING,  // "foo" -- token value = the string literal
  ...
};

struct Token {
  TokenType type;
  Var value;
};
Lexing
The compiler will read the source string (sequence of characters) and make a sequence of corresponding tokens. These tokens then will be used to generate bytecode by the compiler.
The tokens can be classified into the following groups (a small worked example follows the list):
separators - ( ) { } [ ] . .. , ;
operators - + - * / = & | ^
literals - 42 false "foo" 0b10110
keywords - def import if break
names - foo bar print input
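As a rough worked example, a line like x = foo(42, "bar") would lex into a token stream along these lines. Only TK_NAME, TK_STRING, TK_LPARAN and TK_RPARAN come from the enum shown above; the other token names here are illustrative guesses, not pocketlang's actual names.

  x = foo(42, "bar")

  TK_NAME    -- value = "x"
  TK_EQ      -- illustrative name for '='
  TK_NAME    -- value = "foo"
  TK_LPARAN
  TK_NUMBER  -- illustrative name; value = 42
  TK_COMMA   -- illustrative name for ','
  TK_STRING  -- value = "bar"
  TK_RPARAN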
Except for keywords, any identifier will be tokenized as a name token (TK_NAME); whether it's a built-in function, a variable, an imported module, etc. is determined later by the parser. Names that aren't defined are semantic errors thrown by the parser; the lexer only cares about whether it can make a valid token out of the characters. Here are some lexing errors:
Non-terminated string: x = "foo
Invalid literal: 0b123456abc
And every other place where the lexError() function is called.
Each classification of token types is tokenized by a different lexer helper function (source). Those helpers are all encapsulated by the lexToken(Compiler* compiler) function (source).
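To show what such a dispatch can look like, here is a small self-contained toy lexer in C. It is not pocketlang's implementation, and every token name other than TK_LPARAN, TK_RPARAN, TK_NAME and TK_STRING is made up for this toy; the structure (a switch on the first character that falls through to number/name handling, with an error result for a non-terminated string) is only meant to mirror the shape described above.

  #include <ctype.h>
  #include <stdio.h>

  typedef enum {
    TK_LPARAN, TK_RPARAN, TK_EQ, TK_NAME, TK_NUMBER, TK_STRING, TK_ERROR, TK_EOF
  } TokenType;

  typedef struct {
    TokenType type;
    const char* start;    // start of the lexeme in the source
    int length;           // length of the lexeme
  } Token;

  typedef struct {
    const char* current;  // next character to be consumed
  } Lexer;

  static Token makeToken(Lexer* lexer, TokenType type, const char* start) {
    Token token = { type, start, (int)(lexer->current - start) };
    return token;
  }

  // Lex a single token, dispatching on its first character.
  static Token lexToken(Lexer* lexer) {
    while (*lexer->current == ' ' || *lexer->current == '\n') lexer->current++;
    const char* start = lexer->current;
    char c = *lexer->current;
    if (c == '\0') return makeToken(lexer, TK_EOF, start);
    lexer->current++;

    switch (c) {
      case '(': return makeToken(lexer, TK_LPARAN, start);
      case ')': return makeToken(lexer, TK_RPARAN, start);
      case '=': return makeToken(lexer, TK_EQ, start);
      case '"':
        while (*lexer->current != '"') {
          if (*lexer->current == '\0')                 // ran out of source:
            return makeToken(lexer, TK_ERROR, start);  // non-terminated string
          lexer->current++;
        }
        lexer->current++;                              // consume the closing '"'
        return makeToken(lexer, TK_STRING, start);
      default:
        if (isdigit((unsigned char)c)) {               // number literal
          while (isdigit((unsigned char)*lexer->current)) lexer->current++;
          return makeToken(lexer, TK_NUMBER, start);
        }
        if (isalpha((unsigned char)c) || c == '_') {   // name / identifier
          while (isalnum((unsigned char)*lexer->current) || *lexer->current == '_')
            lexer->current++;
          return makeToken(lexer, TK_NAME, start);
        }
        return makeToken(lexer, TK_ERROR, start);      // unknown character
    }
  }

  int main(void) {
    Lexer lexer = { "x = foo(42) \"bar\"" };
    for (Token t = lexToken(&lexer); t.type != TK_EOF; t = lexToken(&lexer)) {
      printf("type=%d lexeme='%.*s'\n", t.type, t.length, t.start);
    }
    return 0;
  }

Running it prints one line per token. The TK_ERROR result in the '"' branch corresponds to the non-terminated string error above; in pocketlang that situation is reported through lexError() rather than an error token.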
Use this thread to discuss the lexing process of pocketlang