# Chapter 3 Lexical Analysis

The lexical analyzer (scanner) can be viewed as a subroutine or coroutine of the parser. Keeping it separate from the parser has several benefits:
* Simpler design – The parser can concentrate on syntax analysis, the more complex task, without dealing with individual characters.
* Efficiency – The parser works on tokens rather than characters, so the parse tree is much smaller, which saves time and space.
* Portability – The same parser can be used for lexically different versions of a programming language.

<h3><center><i>Scanner and Parser</i></center></h3>

![Scanner and Parser](res/03/3-1.png)
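
The coroutine view can be sketched with a Python generator: the parser asks for the next token and the scanner resumes where it left off. This is only a minimal illustration. The token classes (`NUM`, `ID`, `OP`), the names `scanner` and `parse_sum`, and the toy grammar are assumptions made for this example and are not taken from the figure.

```python
import re
from typing import Iterator, Tuple

Token = Tuple[str, str]  # (kind, lexeme); hypothetical token shape

# Hypothetical token classes, chosen only for this illustration.
TOKEN_RE = re.compile(r"\s*(?:(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[-+*/=;]))")


def scanner(source: str) -> Iterator[Token]:
    """Yield one token each time the parser asks for the next one."""
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = m.end()
        yield (m.lastgroup, m.group(m.lastgroup))


def parse_sum(tokens: Iterator[Token]) -> int:
    """Toy parser for NUM ('+' NUM)*; it sees tokens, never raw characters."""
    kind, lexeme = next(tokens)
    if kind != "NUM":
        raise SyntaxError("expected a number")
    total = int(lexeme)
    for kind, lexeme in tokens:
        if (kind, lexeme) != ("OP", "+"):
            raise SyntaxError("expected '+'")
        kind, lexeme = next(tokens)
        if kind != "NUM":
            raise SyntaxError("expected a number")
        total += int(lexeme)
    return total


print(list(scanner("x = 4;")))
print(parse_sum(scanner("1 + 20 + 3")))
```

Because the parser only pulls tokens from an iterator, the scanner could be swapped for one that recognizes a lexically different dialect without changing `parse_sum`.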

## Input Buffering

**Input buffering** – The scanner accesses input symbols through a buffer that holds part of the source program, which is stored in secondary storage.

#### Example: Double Buffering

A double buffer can be implemented by using a **circular queue**.

Consider the following program:
```
...
int x;
x += 4;
double d;
d = 9.401;
```

The keyword `double` appears in the program but is cut off at the end of buffer 2. Once the `l` of `double` has been read, buffer 1 is reloaded so that the `e` of `double` can be read. A similar situation arises for the constant `9.401`.

<h3><center><i>Double Buffering Example</i></center></h3>

<img src="./res/03/3-2.png" width="700px" alt="Double Buffering Example"/>
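
The reload behaviour can be sketched as follows. This is a simplified illustration, assuming two equal-sized halves that are refilled alternately. The class name `DoubleBuffer`, the `half_size` parameter, and the `reloads` counter are hypothetical and exist only for this example. Switching with `1 - self.active` plays the role of the wrap-around in a circular queue.

```python
import io


class DoubleBuffer:
    """Two fixed-size halves refilled alternately from a character stream."""

    def __init__(self, stream, half_size=8):
        self.stream = stream
        self.n = half_size
        # Load the first half up front; the second is filled on demand.
        self.halves = [self.stream.read(self.n), ""]
        self.active = 0    # which half the forward pointer is in
        self.pos = 0       # offset inside the active half
        self.reloads = 0   # number of refills (for illustration only)

    def next_char(self):
        """Return the next character, or "" at end of input."""
        current = self.halves[self.active]
        if self.pos == len(current):
            if len(current) < self.n:
                return ""  # a short half means the stream is exhausted
            # End of the active half: refill the other half and switch to it.
            other = 1 - self.active
            self.halves[other] = self.stream.read(self.n)
            self.reloads += 1
            self.active, self.pos = other, 0
            if not self.halves[other]:
                return ""
        ch = self.halves[self.active][self.pos]
        self.pos += 1
        return ch


buf = DoubleBuffer(io.StringIO("double d;"), half_size=4)
chars = []
while (ch := buf.next_char()) != "":
    chars.append(ch)
print("".join(chars), buf.reloads)
```

With `half_size=4`, the lexeme `double` is split as `doub` and `le`, so the scanner only sees the whole keyword because the second half is loaded while the first is still in memory.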

## Error Recovery

**Error recovery** – Recover from errors in order to find more errors. Examples include:
* Panic mode recovery, i.e., simply ignore some successive characters until a well-formed token is found.
* Deletion, insertion, replacement or transposition of a few characters, e.g., 102.o8 → 102.08
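
Panic mode recovery can be sketched as below: when no token matches, the scanner skips characters until one can start again, records the skipped text, and continues so that later errors are also reported. The token classes and the name `scan_with_panic_mode` are assumptions made for this illustration.

```python
import re

# Hypothetical token classes for this sketch only.
PANIC_TOKEN_RE = re.compile(
    r"(?P<NUM>\d+(?:\.\d+)?)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[-+*/=;])|(?P<WS>\s+)"
)


def scan_with_panic_mode(source):
    """Return (tokens, errors); errors are (position, skipped text) pairs."""
    tokens, errors = [], []
    pos = 0
    while pos < len(source):
        m = PANIC_TOKEN_RE.match(source, pos)
        if m:
            if m.lastgroup != "WS":  # whitespace is matched but not emitted
                tokens.append((m.lastgroup, m.group()))
            pos = m.end()
            continue
        # Panic mode: ignore successive characters until a token can start.
        start = pos
        while pos < len(source) and not PANIC_TOKEN_RE.match(source, pos):
            pos += 1
        errors.append((start, source[start:pos]))
    return tokens, errors


tokens, errors = scan_with_panic_mode("x = 4 @# 5;")
print(tokens)
print(errors)
```

A single pass over `a @ b $ c` reports both bad characters, which is the point of recovering rather than stopping at the first error.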

## Automated vs. Manual Design

* Automated: use a scanner generator such as LEX.
* Manual: design the scanner by hand from a DFA.

<h3><center><i>Scanner Transformation Process</i></center></h3>

<img src="./res/03/3-3.png" width="500px" alt="Scanner Transformation Process"/>

#### Example: DFA → Scanner

Transform the given DFA to a scanner:

<img src="./res/03/3-4.png" width="400px" alt="DFA Example"/>

##### Pseudocode - one block for each state

```
State-A:
    read(c)
    if (c == 'a') then goto State-B
    else if (c == 'b') then goto State-B
    else ERROR()
State-B:
    read(c)
    if (c == 'a') then goto State-D
    else if (c == 'b') then goto State-B
    else if (c == $) then ACCEPT()
    else ERROR()
State-C:
    read(c)
    if (c == 'a') then goto State-A
    else if (c == 'b') then goto State-D
    else ERROR()
State-D:
    read(c)
    if (c == 'a') then goto State-B
    else if (c == 'b') then goto State-A
    else ERROR()
```
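
The same scanner can be written in a table-driven style instead of one block per state. The sketch below copies its transitions from the pseudocode above as written, not from the figure, and treats the end of the input string as the `$` marker. The names `TRANSITIONS`, `ACCEPTING`, and `accepts` are hypothetical.

```python
# Transition table transcribed from the pseudocode: state -> {symbol: next state}.
TRANSITIONS = {
    "A": {"a": "B", "b": "B"},
    "B": {"a": "D", "b": "B"},
    "C": {"a": "A", "b": "D"},
    "D": {"a": "B", "b": "A"},
}
ACCEPTING = {"B"}  # only State-B accepts on end of input


def accepts(text, start="A"):
    """Run the DFA over text; end of the string stands in for '$'."""
    state = start
    for c in text:
        nxt = TRANSITIONS[state].get(c)
        if nxt is None:
            return False  # corresponds to ERROR() in the pseudocode
        state = nxt
    return state in ACCEPTING


for word in ["a", "ab", "aa", "aaa", "ac"]:
    print(word, accepts(word))
```

Keeping the transitions in a table means a change to the DFA only touches data, whereas the one-block-per-state form requires editing control flow.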

#### Example: Big DFA

Here is a DFA applicable to Programming Assignment 1.

![Big DFA](res/03/DFA.png)