Lexer does not crash in case of unrecognized characters.
We added a special ERROR token so that the lexer keeps tokenizing the input
even when it contains unrecognized characters.
Baris Aktemur committed Nov 2, 2017
1 parent 6881dca commit b319c09
Showing 2 changed files with 6 additions and 0 deletions.
4 changes: 4 additions & 0 deletions Deve/README.md
@@ -43,4 +43,8 @@ Sample run:
- : token list =
[LET; NAME "x"; EQUALS; INT 5; IN; IF; NAME "x"; STAR; NAME "dummy"; THEN;
INT 321; ELSE; EOF]
+# allTokens "x % abc Upper ??@ hey";;
+- : token list =
+[NAME "x"; ERROR '%'; NAME "abc"; ERROR 'U'; NAME "pper"; ERROR '?';
+ERROR '?'; ERROR '@'; NAME "hey"; EOF]
```
2 changes: 2 additions & 0 deletions Deve/lexer.ml
@@ -10,6 +10,7 @@ type token = INT of int
| PLUS | STAR | MINUS | SLASH
| LET | EQUALS | IN
| IF | THEN | ELSE
+| ERROR of char
| EOF
;;

@@ -50,6 +51,7 @@ let rec tokenize chars =
tokenizeInt rest (digitToInt c)
| c::rest when isLowercaseLetter(c) ->
tokenizeName rest (charToString c)
+| c::rest -> (ERROR c)::(tokenize rest)

and tokenizeInt chars n =
match chars with
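
The effect of the catch-all `ERROR` case can be tried in isolation with a minimal, self-contained sketch. This is not the repository's `lexer.ml`: the token type is trimmed to four constructors, keywords and operators are omitted, whitespace handling is assumed, and the helper names (`is_digit`, `is_lowercase`, `explode`, `tokenize_int`, `tokenize_name`, `all_tokens`) are hypothetical stand-ins for whatever the real file defines.

```ocaml
(* Hypothetical, trimmed-down token type: only the constructors needed here. *)
type token =
  | INT of int
  | NAME of string
  | ERROR of char
  | EOF

(* Assumed character classes; the real lexer may accept more. *)
let is_digit c = '0' <= c && c <= '9'
let is_lowercase c = 'a' <= c && c <= 'z'

(* Convert a string to a list of characters. *)
let explode s = List.init (String.length s) (String.get s)

let rec tokenize chars =
  match chars with
  | [] -> [EOF]
  (* Assumption: whitespace is skipped silently. *)
  | (' ' | '\t' | '\n') :: rest -> tokenize rest
  | c :: rest when is_digit c ->
      tokenize_int rest (Char.code c - Char.code '0')
  | c :: rest when is_lowercase c ->
      tokenize_name rest (String.make 1 c)
  (* Catch-all: wrap the unrecognized character and keep tokenizing. *)
  | c :: rest -> ERROR c :: tokenize rest

(* Accumulate consecutive digits into a single INT token. *)
and tokenize_int chars n =
  match chars with
  | c :: rest when is_digit c ->
      tokenize_int rest (n * 10 + (Char.code c - Char.code '0'))
  | _ -> INT n :: tokenize chars

(* Accumulate consecutive lowercase letters into a single NAME token. *)
and tokenize_name chars name =
  match chars with
  | c :: rest when is_lowercase c ->
      tokenize_name rest (name ^ String.make 1 c)
  | _ -> NAME name :: tokenize chars

let all_tokens s = tokenize (explode s)
```

Because the catch-all case is the last arm of the match, it only fires for characters that no earlier arm accepts, so the pattern match is exhaustive and the lexer returns an `ERROR` token instead of raising `Match_failure`.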