# Lab 2

## Lexical Analyzer for JSON

In this lab, we will build a lexical analyzer for simple JSON files using ANTLR.

The learning objectives:

1. Understand the workflow of using ANTLR tools.
2. Be able to build lexical analyzers using ANTLR.
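
Before turning to ANTLR, it may help to see what a lexical analyzer does in the small. The following is a minimal hand-written sketch, not the lab's solution and not how ANTLR works internally. The names `Tok` and `tokenize` are hypothetical, and the scanner only covers punctuation, strings, and integers. It labels tokens in the same style as the output at the end of this notebook (quoted literals for punctuation, `String` and `Number` otherwise).

```kotlin
// A hand-written scanner for a small JSON subset. Tok and tokenize are
// illustrative names only; they are not part of ANTLR or of the lab's code.
data class Tok(val type: String, val text: String)

fun tokenize(src: String): List<Tok> {
    val out = mutableListOf<Tok>()
    var i = 0
    while (i < src.length) {
        val c = src[i]
        when {
            // Whitespace separates tokens but is not itself a token.
            c.isWhitespace() -> i++
            // Single-character punctuation: the type is the quoted literal.
            c in "{}[]:," -> { out += Tok("'$c'", c.toString()); i++ }
            // String: scan to the closing quote, stepping over escaped characters.
            c == '"' -> {
                var j = i + 1
                while (j < src.length && src[j] != '"') {
                    j += if (src[j] == '\\') 2 else 1
                }
                require(j < src.length) { "unterminated string starting at index $i" }
                out += Tok("String", src.substring(i, j + 1))
                i = j + 1
            }
            // Number: optional minus sign followed by digits (integers only, for brevity).
            c == '-' || c.isDigit() -> {
                var j = i + 1
                while (j < src.length && src[j].isDigit()) j++
                require(src[j - 1].isDigit()) { "digit expected after '-' at index $i" }
                out += Tok("Number", src.substring(i, j))
                i = j
            }
            else -> throw IllegalArgumentException("unexpected character '$c' at index $i")
        }
    }
    // Mark the end of input explicitly, as ANTLR does with its EOF token.
    out += Tok("EOF", "<EOF>")
    return out
}
```

For example, `tokenize("""{"enrollment": 89}""")` returns six tokens whose types are `'{'`, `String`, `':'`, `Number`, `'}'`, and `EOF`. With ANTLR, the same rules are written declaratively in a grammar file and the scanning loop is generated for you.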

<p style="border: 2px solid red; text-align: center; padding: 20px; font-size: 150%;"> For this lab, you will need to use Terminal for part of your work. </p>

You will need to restart the kernel whenever you update the ANTLR-generated Java classes.

We have provided a Makefile to help you recompile your ANTLR code. Run it from the Terminal:

```
$ make build
```

In [1]:
@file:DependsOn("/data/shared/antlr-4.9.1-complete.jar")
@file:DependsOn(".")

Let's create a simple JSON file.

In [2]:
import java.io.File

val jsonSample = """
    {
        "name": "CSCI 4020U",
        "title": "Compilers",
        "enrollment": 89,
        "lectures": [
            {
                "day": "Monday",
                "time": "11-12:30"
            },
            {
                "day": "Thursday",
                "time": "11-12:30"
            }
        ]
    }
""".trimIndent()

File("./sample.json").writeText(jsonSample)

We load the generated lexer, along with the ANTLR runtime classes it depends on.

In [3]:
import org.antlr.v4.runtime.*
import lab2.JsonLexer

Construct an ANTLR character stream from the sample file.

In [4]:
// CharStreams replaces the deprecated ANTLRFileStream class.
val input = CharStreams.fromFileName("./sample.json")

Construct a lexer that processes the character stream.

In [5]:
val lexer = JsonLexer(input)
// Replace the default console listener with one that aborts on the first error.
lexer.removeErrorListeners()
lexer.addErrorListener(object: BaseErrorListener() {
  override fun syntaxError(recognizer: Recognizer<*,*>,
               offendingSymbol: Any?,
               line: Int,
               pos: Int,
               msg: String,
               e: RecognitionException?) {
      // e may be null, so fall back to the message text.
      throw Exception("${e ?: msg} at line:${line}, char:${pos}")
  }
})

Obtain the token stream from the lexer. The stream is lazy: the lexer does not scan the input character stream until tokens are requested, which happens when `fill()` is called.

In [6]:
val tokens = CommonTokenStream(lexer)

In [9]:
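// Display every token extracted by the lexer.
// fill() forces the lexer to scan the entire input.

try {
    tokens.fill()
    // The vocabulary replaces the deprecated tokenNames array.
    val vocabulary = lexer.vocabulary
    tokens.tokens.forEach { token: Token ->
        // getDisplayName gives the literal (e.g. '{') if the token has one,
        // otherwise the symbolic name (e.g. String), and "EOF" for end of input.
        val typeName: String = vocabulary.getDisplayName(token.type)
        println("\"${token.text}\" is a ${typeName}")
    }
} catch (e: Exception) {
    println(e)
}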


"{" is a '{'
""name"" is a String
":" is a ':'
""CSCI 4020U"" is a String
"," is a ','
""title"" is a String
":" is a ':'
""Compilers"" is a String
"," is a ','
""enrollment"" is a String
":" is a ':'
"89" is a Number
"," is a ','
""lectures"" is a String
":" is a ':'
"[" is a '['
"{" is a '{'
""day"" is a String
":" is a ':'
""Monday"" is a String
"," is a ','
""time"" is a String
":" is a ':'
""11-12:30"" is a String
"}" is a '}'
"," is a ','
"{" is a '{'
""day"" is a String
":" is a ':'
""Thursday"" is a String
"," is a ','
""time"" is a String
":" is a ':'
""11-12:30"" is a String
"}" is a '}'
"]" is a ']'
"}" is a '}'
"<EOF>" is a EOF
