# Chapter 2: Assemblers

Mrs. Sunita M Dol (Aher),
Assistant Professor,
Computer Science and Engineering Department,
Walchand Institute of Technology, Solapur, Maharashtra

#### 2. Assemblers

- Elements of Assembly Language Programming
- A simple Assembly Scheme
- Pass Structure of Assemblers
- Design of a two pass Assembler
- A Single Pass Assembler for IBM PC

- An assembly language is a machine-dependent, low-level programming language that is specific to a particular computer system.
- Compared to the machine language of a computer, it provides three basic features:
  - Mnemonic operation codes
    - Eliminate the need to memorize numeric operation codes.
  - Symbolic operands
    - Symbolic names can be used in place of numeric addresses.
  - Data declarations
    - Data can be declared in a convenient notation, e.g. -5 or 10.5.



Figure: Assembler

Statement Format

```
[Label] <Opcode> <operand spec> [<operand spec>....]
```

1. Label: optional.
2. Opcode: a symbolic (mnemonic) operation code.
3. Operand spec: <operand spec> has the syntax <symbolic name> [+<displacement>] [(<index register>)], e.g.
   - AREA
   - AREA+5
   - AREA(4)
   - AREA+5(4)
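The operand spec syntax above can be checked mechanically. The following is a minimal sketch, assuming symbolic names are alphanumeric and that the displacement and index register are written as decimal numbers; the function name `parse_operand_spec` is hypothetical and not part of the original material.

```python
import re

# Pattern for <symbolic name>[+<displacement>][(<index register>)].
# Assumption: names are alphanumeric; displacement and index are decimal.
OPERAND_SPEC = re.compile(
    r"^\s*(?P<name>[A-Za-z][A-Za-z0-9]*)"      # symbolic name
    r"(?:\s*\+\s*(?P<disp>\d+))?"              # optional +<displacement>
    r"(?:\s*\(\s*(?P<index>\d+)\s*\))?\s*$"    # optional (<index register>)
)

def parse_operand_spec(text):
    """Split an operand spec into (name, displacement, index register)."""
    match = OPERAND_SPEC.match(text)
    if match is None:
        raise ValueError(f"not a valid operand spec: {text!r}")
    disp = int(match.group("disp")) if match.group("disp") else 0
    index = int(match.group("index")) if match.group("index") else None
    return match.group("name"), disp, index

for spec in ["AREA", "AREA+5", "AREA(4)", "AREA+5(4)"]:
    print(spec, "->", parse_operand_spec(spec))
```

A missing displacement is reported as 0 and a missing index register as `None`, so later stages can tell "no index register" apart from "index register 0".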

- A Simple Assembly Language
  - Each statement has two operands
    - The first operand is always a register (AREG, BREG, CREG and DREG)
    - The second operand refers to a memory word.

| Instruction opcode | Assembly mnemonic | Remarks                  |
|--------------------|-------------------|--------------------------|
| 00                 | STOP              | Stop execution           |
| 01                 | ADD               | Op1 ← Op1 + Op2          |
| 02                 | SUB               | Op1 ← Op1 - Op2          |
| 03                 | MULT              | Op1 ← Op1 * Op2          |
| 04                 | MOVER             | CPU Reg ← Memory operand |
| 05                 | MOVEM             | Memory ← CPU Reg         |
| 06                 | COMP              | Sets condition code      |
| 07                 | BC                | Branch on condition      |
| 08                 | DIV               | Op1 ← Op1 / Op2          |
| 09                 | READ              | Operand 2 ← input value  |
| 10                 | PRINT             | Output ← Operand 2       |

Figure: Mnemonic Operation Codes

Instruction Format



Figure: Instruction Format

- Assembly Language to Machine Language
  - Find address of variables and labels.
  - Replace Symbolic address by numeric address.
  - Replace Symbolic opcode by machine opcode.
  - Reserve storage for data.

|        | START | 101          |
|--------|-------|--------------|
|        | READ  | N            |
|        | MOVER | BREG, ONE    |
|        | MOVEM | BREG, TERM   |
| AGAIN  | MULT  | BREG, TERM   |
|        | MOVER | CREG, TERM   |
|        | ADD   | CREG, ONE    |
|        | MOVEM | CREG, TERM   |
|        | COMP  | CREG, N      |
|        | BC    | LE, AGAIN    |
|        | MOVEM | BREG, RESULT |
|        | PRINT | RESULT       |
|        | STOP  |              |
| N      | DS    | 1            |
| RESULT | DS    | 1            |
| ONE    | DC    | '1'          |
| TERM   | DS    | 1            |
|        | END   |              |

Figure: Sample program to find n!

| Label  | Opcode | Operands     | LC   | Machine code |
|--------|--------|--------------|------|--------------|
|        | START  | 101          |      |              |
|        | READ   | N            | 101) | + 09 0 113   |
|        | MOVER  | BREG, ONE    | 102) | + 04 2 115   |
|        | MOVEM  | BREG, TERM   | 103) | + 05 2 116   |
| AGAIN  | MULT   | BREG, TERM   | 104) | + 03 2 116   |
|        | MOVER  | CREG, TERM   | 105) | + 04 3 116   |
|        | ADD    | CREG, ONE    | 106) | + 01 3 115   |
|        | MOVEM  | CREG, TERM   | 107) | + 05 3 116   |
|        | COMP   | CREG, N      | 108) | + 06 3 113   |
|        | BC     | LE, AGAIN    | 109) | + 07 2 104   |
|        | MOVEM  | BREG, RESULT | 110) | + 05 2 114   |
|        | PRINT  | RESULT       | 111) | + 10 0 114   |
|        | STOP   |              | 112) | + 00 0 000   |
| N      | DS     | 1            | 113) |              |
| RESULT | DS     | 1            | 114) |              |
| ONE    | DC     | '1'          | 115) | + 00 0 001   |
| TERM   | DS     | 1            | 116) |              |
|        | END    |              |      |              |

| Variable | Address |
|----------|---------|
| AGAIN    | 104     |
| N        | 113     |
| RESULT   | 114     |
| ONE      | 115     |
| TERM     | 116     |

Figure: After LC Processing

Machine Code

| LC  | Opcode | Register | Address |
|-----|--------|----------|---------|
| 101 | 09     | 0        | 113     |
| 102 | 04     | 2        | 115     |
| 103 | 05     | 2        | 116     |
| 104 | 03     | 2        | 116     |
| 105 | 04     | 3        | 116     |
| 106 | 01     | 3        | 115     |
| 107 | 05     | 3        | 116     |
| 108 | 06     | 3        | 113     |
| 109 | 07     | 2        | 104     |
| 110 | 05     | 2        | 114     |
| 111 | 10     | 0        | 114     |
| 112 | 00     | 0        | 000     |
| 113 |        |          |         |
| 114 |        |          |         |
| 115 | 00     | 0        | 001     |
| 116 |        |          |         |

```
        START   101
        READ    X
        READ    Y
        MOVER   AREG, X
        ADD     AREG, Y
        MOVEM   AREG, RESULT
        PRINT   RESULT
        STOP
X       DS      1
Y       DS      1
RESULT  DS      1
        END
```

Figure: Sample program to find X+Y



| Variable | Address |
|----------|---------|
| X        | 108     |
| Y        | 109     |
| RESULT   | 110     |

Figure: After LC Processing

#### Machine Code

| LC  | Opcode | Register | Address |
|-----|--------|----------|---------|
| 101 | 09     | 0        | 108     |
| 102 | 09     | 0        | 109     |
| 103 | 04     | 1        | 108     |
| 104 | 01     | 1        | 109     |
| 105 | 05     | 1        | 110     |
| 106 | 10     | 0        | 110     |
| 107 | 00     | 0        | 000     |
| 108 |        |          |         |
| 109 |        |          |         |
| 110 |        |          |         |

- Assembly Language Statements
  - Imperative statement
    - Indicates an action to be performed during execution of the assembled program.
    - E.g. MOVER, ADD, MULT.
  - Declaration statement
    - Reserves memory for a variable (DS) or defines a constant (DC).

```
[Label] DS <constant>      e.g.  X DS 5
[Label] DC '<value>'       e.g.  X DC '3'
```

- Assembler Directives
  - Instruct the assembler to perform certain actions during the assembly of a program.

```
START <constant>
```

#### Literals and Constants

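- A literal is an operand written as ='<value>'.
- A literal cannot be changed during program execution.
- A literal is safer than a constant declared with DC: its location has no name in the program, so no statement can overwrite it.
- Literals appear as part of the instruction.
- e.g. MOVER AREG, ='5'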

- Assembler Directives
  - START <constant>: the first word of the target program generated by the assembler should be placed in the memory word with address <constant>, e.g. START 100.
  - END [<operand spec>]: indicates the end of the source program. The optional <operand spec> gives the address of the instruction at which execution of the program should begin.

Advantages of Assembly Language

| Label  | Opcode | Operands     | LC  | Machine code |
|--------|--------|--------------|-----|--------------|
|        | START  | 101          |     |              |
|        | READ   | X            | 101 | + 09 0 109   |
|        | READ   | Y            | 102 | + 09 0 110   |
|        | MOVER  | AREG, X      | 103 | + 04 1 109   |
|        | ADD    | AREG, Y      | 104 | + 01 1 110   |
|        | DIV    | AREG, TWO    | 105 | + 08 1 112   |
|        | MOVEM  | AREG, RESULT | 106 | + 05 1 111   |
|        | PRINT  | RESULT       | 107 | + 10 0 111   |
|        | STOP   |              | 108 | + 00 0 000   |
| X      | DS     | 1            | 109 |              |
| Y      | DS     | 1            | 110 |              |
| RESULT | DS     | 1            | 111 |              |
| TWO    | DC     | '2'          | 112 | + 00 0 002   |
|        | END    |              |     |              |

- Design Specification for an Assembler
  - Four step approach
    - Identify the information necessary to perform a task.
    - Design a suitable data structure to record the information.
    - Determine the processing necessary to obtain and maintain the information.
    - Determine the processing necessary to perform the task.

- Synthesis Phase
  - Consider instruction MOVER AREG, X

```
                                  LC     Opcode  Address
        START   101
        READ    X                 101)   09      108
        READ    Y                 102)   09      109
        MOVER   AREG, X           103)   04      108
        ADD     AREG, Y           104)   01      109
        MOVEM   AREG, RESULT      105)   05      110
        PRINT   RESULT            106)   10      110
        STOP                      107)   00      000
X       DS      1                 108)
Y       DS      1                 109)
RESULT  DS      1                 110)
        END
```

#### Synthesis Phase

- To synthesize the machine instruction corresponding to the statement MOVER AREG, X, we must have the following information:
  - The address of the memory word with which the name X is associated.
  - The machine operation code corresponding to the mnemonic MOVER.
- The first item depends on the source program, so it must be made available by the analysis phase. The second item does not depend on the source program; it depends only on the assembly language.

- Synthesis Phase
  - Two data structures are used during the synthesis phase:
    - Symbol table: each entry holds a symbolic name and its address.
    - Mnemonics table: each entry holds a mnemonic and its machine opcode.

#### Analysis Phase

- The primary function of the analysis phase is to build the symbol table.
- For this purpose it must determine the addresses with which the symbolic names in a program are associated.
- To determine the address of any program element, we must fix the addresses of all program elements preceding it. This function is called memory allocation.
- To implement memory allocation, a data structure called the location counter (LC) is introduced.
- The LC is initialized to the constant specified in the START statement.
- The LC always contains the address of the next memory word in the target program.



Figure: Overview of Two Pass Assembler

#### Analysis Phase

- Isolate the label, mnemonic opcode and operand fields of a statement.
- If a label is present, enter the pair (symbol, <LC contents>) in a new entry of the symbol table.
- Check the validity of the mnemonic opcode through a look-up in the mnemonics table.
- Perform LC processing, i.e. update the value contained in LC by considering the opcode and operands of the statement.
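The analysis steps listed above can be sketched as follows for the n! sample program. This is an illustrative sketch, not the assembler's actual design: it assumes every imperative statement and every DC occupies one memory word, that `DS n` reserves n words, and that a first field which is not a known opcode is a label. The names `build_symbol_table` and `KNOWN_OPCODES` are hypothetical.

```python
# Mnemonics table of the simple assembly language: mnemonic -> machine opcode.
MNEMONICS = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
             "COMP": 6, "BC": 7, "DIV": 8, "READ": 9, "PRINT": 10}
KNOWN_OPCODES = set(MNEMONICS) | {"DS", "DC", "START", "END"}

def build_symbol_table(source_lines):
    """Analysis phase sketch: return {symbol: address} via LC processing."""
    symtab = {}
    lc = 0
    for line in source_lines:
        fields = line.replace(",", " ").split()
        if not fields:
            continue
        # Isolate the label: a first field that is not an opcode is a label.
        label = fields.pop(0) if fields[0] not in KNOWN_OPCODES else None
        # Check the validity of the mnemonic opcode.
        if not fields or fields[0] not in KNOWN_OPCODES:
            raise ValueError(f"invalid mnemonic in: {line!r}")
        opcode, operands = fields[0], fields[1:]
        if opcode == "START":
            lc = int(operands[0])      # LC starts at the START constant
            continue
        if opcode == "END":
            break
        if label is not None:
            symtab[label] = lc         # enter the pair (symbol, <LC contents>)
        # LC processing: DS n reserves n words; anything else takes one word
        # (an assumption of this sketch).
        lc += int(operands[0]) if opcode == "DS" else 1
    return symtab

FACTORIAL = """
        START 101
        READ  N
        MOVER BREG, ONE
        MOVEM BREG, TERM
AGAIN   MULT  BREG, TERM
        MOVER CREG, TERM
        ADD   CREG, ONE
        MOVEM CREG, TERM
        COMP  CREG, N
        BC    LE, AGAIN
        MOVEM BREG, RESULT
        PRINT RESULT
        STOP
N       DS    1
RESULT  DS    1
ONE     DC    '1'
TERM    DS    1
        END
""".splitlines()

print(build_symbol_table(FACTORIAL))
```

The resulting table agrees with the "After LC Processing" table for the n! program: AGAIN at 104, N at 113, RESULT at 114, ONE at 115 and TERM at 116.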

- Synthesis phase
  - Obtain the machine opcode corresponding to the mnemonic from the mnemonics table.
  - Obtain the address of a memory operand from the symbol table.
  - Synthesize a machine instruction or the machine form of a constant, as the case may be.
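A minimal sketch of these synthesis steps for the X+Y program, assuming the symbol table has already been built by the analysis phase. The register codes AREG=1, BREG=2, CREG=3 and the condition codes LT=1, LE=2, ANY=6 are read off the machine code shown in the examples; DREG=4 and the helper names `synthesize` and `format_word` are assumptions made for illustration.

```python
MNEMONICS = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
             "COMP": 6, "BC": 7, "DIV": 8, "READ": 9, "PRINT": 10}
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}  # DREG=4 is assumed
CONDITIONS = {"LT": 1, "LE": 2, "ANY": 6}                 # only codes seen in the examples

def synthesize(opcode, operands, symtab):
    """Return (machine opcode, register field, address) for one statement."""
    code = MNEMONICS[opcode]               # look-up in the mnemonics table
    if not operands:                       # e.g. STOP
        return code, 0, 0
    if len(operands) == 1:                 # e.g. READ X, PRINT RESULT
        return code, 0, symtab[operands[0]]
    first, second = operands
    if first in REGISTERS:
        reg = REGISTERS[first]
    elif first in CONDITIONS:              # BC carries a condition code here
        reg = CONDITIONS[first]
    else:
        raise ValueError(f"unknown register or condition: {first!r}")
    return code, reg, symtab[second]       # address comes from the symbol table

def format_word(word):
    """Render a machine word in the '+ oo r aaa' layout used in the figures."""
    return "+ {:02d} {} {:03d}".format(*word)

symtab = {"X": 108, "Y": 109, "RESULT": 110}
print(format_word(synthesize("MOVER", ["AREG", "X"], symtab)))
print(format_word(synthesize("PRINT", ["RESULT"], symtab)))
```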

#### Pass Structure of Assembler

- Two pass translation
  - Two pass translation of an assembly language program can handle forward references easily.
  - LC processing is performed in the first pass, and symbols defined in the program are entered into the symbol table.
  - The second pass synthesizes the target form using the address information found in the symbol table.
  - The first pass constructs an intermediate representation of the source program for use by the second pass.
  - This representation consists of two main components:
    - data structures, e.g. the symbol table, and
    - a processed form of the source program.
#### Pass Structure of Assembler



Figure: Overview of Two Pass Assembler

- Single pass translation
  - LC processing and construction of the symbol table proceed as in two pass translation.
  - The problem of forward references is tackled using a technique called backpatching.
  - The operand field of an instruction containing a forward reference is left blank initially.
  - The address of the forward-referenced symbol is put into this field when its definition is encountered.
  - The need to insert the operand's address at a later stage is recorded by adding an entry to the table of incomplete instructions (TII).
  - Each TII entry is a pair (<instruction address>, <symbol>).
  - By the time the END statement is processed, the symbol table contains the addresses of all symbols defined in the source program, and TII contains information describing all forward references.
  - The assembler can now process each entry in TII to complete the concerned instruction.
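Backpatching with a TII can be sketched as below. This is a simplified illustration, not a complete single pass assembler: statements are given as pre-split tuples, only the mnemonics needed for the small example are included, and all names (`assemble_single_pass`, `PROGRAM`) are hypothetical.

```python
MNEMONICS = {"STOP": 0, "ADD": 1, "MOVER": 4}   # subset needed for the example
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3}

def assemble_single_pass(statements):
    """Return (code, symtab, tii); code maps address -> [opcode, reg, operand]."""
    symtab, tii, code, lc = {}, [], {}, 0
    for label, opcode, operands in statements:
        if opcode == "START":
            lc = int(operands[0])
            continue
        if opcode == "END":
            break
        if label:
            symtab[label] = lc
        if opcode == "DC":
            code[lc] = [0, 0, int(operands[0].strip("'"))]
        elif opcode == "STOP":
            code[lc] = [0, 0, 0]
        else:
            reg, symbol = operands
            address = symtab.get(symbol)        # None marks a forward reference
            if address is None:
                tii.append((lc, symbol))        # remember the incomplete instruction
            code[lc] = [MNEMONICS[opcode], REGISTERS[reg], address]
        lc += 1
    # END reached: every symbol is defined, so complete the TII entries.
    for address, symbol in tii:
        code[address][2] = symtab[symbol]
    return code, symtab, tii

PROGRAM = [
    (None, "START", ["100"]),
    (None, "MOVER", ["AREG", "X"]),
    (None, "ADD", ["BREG", "ONE"]),
    (None, "ADD", ["CREG", "TEN"]),
    (None, "STOP", []),
    ("X", "DC", ["'5'"]),
    ("ONE", "DC", ["'1'"]),
    ("TEN", "DC", ["'10'"]),
    (None, "END", []),
]

code, symtab, tii = assemble_single_pass(PROGRAM)
print(tii)
print(code[100], code[101], code[102])
```

The sketch patches all entries at END for simplicity; an assembler could equally patch an instruction as soon as the symbol's definition is encountered.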

Single Pass Translation Example

|     | START | 100         |
|-----|-------|-------------|
|     | MOVER | AREG, X     |
|     | ADD   | BREG, ONE   |
|     | ADD   | CREG, TEN   |
|     | STOP  |             |
| X   | DC    | '5'         |
| ONE | DC    | '1'         |
| TEN | DC    | '10'        |
|     | END   |             |

Single Pass Translation Example

| Label | Opcode | Operands  | LC  | Machine code |
|-------|--------|-----------|-----|--------------|
|       | START  | 100       |     |              |
|       | MOVER  | AREG, X   | 100 | 04 1         |
|       | ADD    | BREG, ONE | 101 | 01 2         |
|       | ADD    | CREG, TEN | 102 | 01 3         |
|       | STOP   |           | 103 | 00 0 000     |
| X     | DC     | '5'       | 104 |              |
| ONE   | DC     | '1'       | 105 |              |
| TEN   | DC     | '10'      | 106 |              |
|       | END    |           |     |              |

| Instruction address | Symbol making a forward reference |
|---------------------|-----------------------------------|
| 100                 | X                                 |
| 101                 | ONE                               |
| 102                 | TEN                               |

Figure: TII (Table of Incomplete Instructions)

| Symbol | Address |
|--------|---------|
| X      | 104     |
| ONE    | 105     |
| TEN    | 106     |

Figure: Symbol Table

Single Pass Translation Example

| Label | Opcode | Operands  | LC  | Opcode | Register | Address |
|-------|--------|-----------|-----|--------|----------|---------|
|       | START  | 100       |     |        |          |         |
|       | MOVER  | AREG, X   | 100 | 04     | 1        | 104     |
|       | ADD    | BREG, ONE | 101 | 01     | 2        | 105     |
|       | ADD    | CREG, TEN | 102 | 01     | 3        | 106     |
|       | STOP   |           | 103 | 00     | 0        | 000     |
| X     | DC     | '5'       | 104 |        |          |         |
| ONE   | DC     | '1'       | 105 |        |          |         |
| TEN   | DC     | '10'      | 106 |        |          |         |
|       | END    |           |     |        |          |         |

### Pass I

1. Separate the symbol, mnemonic opcode and operand fields.
2. Build the symbol table.
3. Perform LC processing.
4. Construct the intermediate representation (IR).

### Pass II

1. Process the IR to synthesize the target program.

- Advanced Assembler Directives
  - ORIGIN:
    - The syntax is

```
ORIGIN <address spec>
```

- Here <address spec> is an <operand spec> or a <constant>.
- This directive indicates that LC should be set to the address given by <address spec>.
- ORIGIN is useful when the target program does not consist of consecutive memory words.

| Stmt | Label | Opcode | Operands   | LC   | Machine code |
|------|-------|--------|------------|------|--------------|
| 1    |       | START  | 200        |      |              |
| 2    |       | MOVER  | AREG, ='5' | 200) | + 04 1 211   |
| 3    |       | MOVEM  | AREG, A    | 201) | + 05 1 217   |
| 4    | LOOP  | MOVER  | AREG, A    | 202) | + 04 1 217   |
| 5    |       | MOVER  | CREG, B    | 203) | + 04 3 218   |
| 6    |       | ADD    | CREG, ='1' | 204) | + 01 3 212   |
| 7    |       | ...    |            |      |              |
| 12   |       | BC     | ANY, NEXT  | 210) | + 07 6 214   |
| 13   |       | LTORG  |            |      |              |
|      |       |        | ='5'       | 211) | + 00 0 005   |
|      |       |        | ='1'       | 212) | + 00 0 001   |
| 14   |       |        |            |      |              |
| 15   | NEXT  | SUB    | AREG, ='1' | 214) | + 02 1 219   |
| 16   |       | BC     | LT, BACK   | 215) | + 07 1 202   |
| 17   | LAST  | STOP   |            | 216) | + 00 0 000   |
| 18   |       | ORIGIN | LOOP+2     |      |              |
| 19   |       | MULT   | CREG, B    | 204) | + 03 3 218   |
| 20   |       | ORIGIN | LAST+1     |      |              |
| 21   | A     | DS     | 1          | 217) |              |
| 22   | BACK  | EQU    | LOOP       |      |              |
| 23   | B     | DS     | 1          | 218) |              |
| 24   |       | END    |            |      |              |
| 25   |       |        | ='1'       | 219) |              |

Statement 18 sets LC to the value 204, since the symbol LOOP is associated with the address 202.

Figure: An assembly language program illustrating ORIGIN

- Advanced Assembler Directives
  - EQU:
    - The syntax is

```
<symbol> EQU <address spec>
```

- Here <address spec> is an <operand spec> or a <constant>.
- The EQU statement defines the symbol to represent <address spec>.
- No LC processing is implied.
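The difference between ORIGIN and EQU can be sketched as follows: both evaluate an address spec, but ORIGIN changes LC while EQU only adds a symbol table entry. The sketch assumes an address spec is either a decimal constant or `<symbol>+<displacement>`, and the helper names `eval_address_spec` and `apply_directive` are hypothetical. The symbol values LOOP=202 and LAST=216 are taken from the ORIGIN example program.

```python
def eval_address_spec(spec, symtab):
    """Evaluate <constant> or <symbol>[+<displacement>] to an address."""
    name, _, disp = spec.partition("+")
    name = name.strip()
    base = int(name) if name.isdigit() else symtab[name]
    return base + (int(disp) if disp.strip() else 0)

def apply_directive(label, opcode, operand, symtab, lc):
    """Process ORIGIN or EQU and return the (possibly changed) LC."""
    value = eval_address_spec(operand, symtab)
    if opcode == "ORIGIN":
        return value                 # LC is set to the given address
    if opcode == "EQU":
        symtab[label] = value        # symbol is defined; LC is untouched
        return lc
    raise ValueError(f"not an ORIGIN/EQU statement: {opcode!r}")

symtab = {"LOOP": 202, "LAST": 216}
lc = 217
lc = apply_directive(None, "ORIGIN", "LOOP+2", symtab, lc)   # statement 18
print("LC after ORIGIN LOOP+2:", lc)
lc = apply_directive(None, "ORIGIN", "LAST+1", symtab, lc)   # statement 20
print("LC after ORIGIN LAST+1:", lc)
lc = apply_directive("BACK", "EQU", "LOOP", symtab, lc)      # statement 22
print("BACK =", symtab["BACK"], "LC =", lc)
```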



Figure: An assembly language program illustrating ORIGIN

### Advanced Assembler Directives

- LTORG:
  - The LTORG statement permits a programmer to specify where literals should be placed.
  - By default, the assembler places the literals after the END statement.
  - At every LTORG statement, and also at the END statement, the assembler allocates memory to the literals of a literal pool.
  - The pool contains all literals used in the program since the start of the program or since the last LTORG statement.
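Literal pool handling can be sketched with a literal table and a pool table. This is an illustrative sketch under stated assumptions: each literal occupies one word, a literal is entered once per pool, and the pool table keeps a trailing entry marking where the next pool would start. The class name `LiteralPools` and its methods are hypothetical. The literal uses and LC values come from the ORIGIN example program (literals in statements 2, 6 and 15, LTORG at LC 211, END at LC 219).

```python
class LiteralPools:
    """Sketch of LITTAB and POOLTAB maintenance for LTORG and END."""

    def __init__(self):
        self.littab = []     # entries are [literal, address]
        self.pooltab = [0]   # LITTAB index of the first literal of each pool

    def use(self, literal):
        """Record a literal in the current pool unless it is already there."""
        start = self.pooltab[-1]
        if literal not in [lit for lit, _ in self.littab[start:]]:
            self.littab.append([literal, None])

    def allocate(self, lc):
        """At LTORG or END: give addresses to the current pool, open a new one."""
        for entry in self.littab[self.pooltab[-1]:]:
            entry[1] = lc
            lc += 1          # assumption: one word per literal
        self.pooltab.append(len(self.littab))
        return lc            # LC after the pool

pools = LiteralPools()
pools.use("='5'")            # statement 2
pools.use("='1'")            # statement 6
lc = pools.allocate(211)     # LTORG in statement 13
pools.use("='1'")            # statement 15, goes into the new pool
pools.allocate(219)          # END statement
print(pools.littab)
print(pools.pooltab)
```

The same literal ='1' appears twice in the literal table because each use belongs to a different pool and is allocated a separate address.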

| Stmt | Label | Opcode | Operands   | LC   | Machine code |
|------|-------|--------|------------|------|--------------|
| 1    |       | START  | 200        |      |              |
| 2    |       | MOVER  | AREG, ='5' | 200) | + 04 1 211   |
| 3    |       | MOVEM  | AREG, A    | 201) | + 05 1 217   |
| 4    | LOOP  | MOVER  | AREG, A    | 202) | + 04 1 217   |
| 5    |       | MOVER  | CREG, B    | 203) | + 04 3 218   |
| 6    |       | ADD    | CREG, ='1' | 204) | + 01 3 212   |
| 7    |       | ...    |            |      |              |
| 12   |       | BC     | ANY, NEXT  | 210) | + 07 6 214   |
| 13   |       | LTORG  |            |      |              |
|      |       |        | ='5'       | 211) | + 00 0 005   |
|      |       |        | ='1'       | 212) | + 00 0 001   |
| 14   |       |        |            |      |              |
| 15   | NEXT  | SUB    | AREG, ='1' | 214) | + 02 1 219   |
| 16   |       | BC     | LT, BACK   | 215) | + 07 1 202   |
| 17   | LAST  | STOP   |            | 216) | + 00 0 000   |
| 18   |       | ORIGIN | LOOP+2     |      |              |
| 19   |       | MULT   | CREG, B    | 204) | + 03 3 218   |
| 20   |       | ORIGIN | LAST+1     |      |              |
| 21   | A     | DS     | 1          | 217) |              |
| 22   | BACK  | EQU    | LOOP       |      |              |
| 23   | B     | DS     | 1          | 218) |              |
| 24   |       | END    |            |      |              |
| 25   |       |        | ='1'       | 219) |              |

The literals ='5' and ='1' are added to the literal pool in statements 2 and 6 respectively. The first LTORG statement allocates the addresses 211 and 212 to the values '5' and '1'. A new literal pool is now started. The value '1' is put into this pool in statement 15.

Figure: An assembly language program illustrating ORIGIN

| Label | Opcode | Operands   | LC  | Machine code |
|-------|--------|------------|-----|--------------|
|       | START  | 200        |     |              |
|       | MOVER  | AREG, ='5' | 200 | +04 1 205    |
|       | MOVEM  | AREG, X    | 201 | +05 1 214    |
| L1    | MOVER  | BREG, ='2' | 202 | +04 2 206    |
|       | ORIGIN | L1+3       |     |              |
|       | LTORG  |            |     |              |
|       |        | ='5'       | 205 | +00 0 005    |
|       |        | ='2'       | 206 | +00 0 002    |
| NEXT  | ADD    | AREG, ='1' | 207 | +01 1 210    |
|       | SUB    | BREG, ='2' | 208 | +02 2 211    |
|       | BC     | LT, BACK   | 209 | +07 1 202    |
|       | LTORG  |            |     |              |
|       |        | ='1'       | 210 | +00 0 001    |
|       |        | ='2'       | 211 | +00 0 002    |
| BACK  | EQU    | L1         |     |              |
|       | ORIGIN | NEXT+5     |     |              |
|       | MULT   | CREG, ='4' | 212 | +03 3 215    |
|       | STOP   |            | 213 | +00 0 000    |
| X     | DS     | 1          | 214 |              |
|       | END    |            |     |              |
|       |        | ='4'       | 215 | +00 0 004    |

- Pass I uses the following data structures:
  1. Machine opcode table (OPTAB)
  2. Symbol table (SYMTAB)
  3. Literal table (LITTAB)
  4. Pool table (POOLTAB)

- Pass I Data Structures
  - OPTAB
    - An OPTAB entry contains the fields
      - Mnemonic opcode
      - Class: indicates whether the opcode corresponds to an imperative statement (IS), a declaration statement (DL) or an assembler directive (AD).
      - Mnemonic info: for an imperative statement, this field contains the pair (machine opcode, instruction length); otherwise it contains the id of a routine that handles the declaration or directive statement.
  - SYMTAB
    - A SYMTAB entry contains the fields address and length.
  - LITTAB
    - A LITTAB entry contains the fields literal and address.
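One possible in-memory shape for OPTAB is sketched below. The field layout follows the description above; the instruction length of 1 and the routine ids (`"handle_ds"`, `"handle_start"` and so on) are placeholders invented for illustration, not values from the original material.

```python
from typing import NamedTuple

class OptabEntry(NamedTuple):
    klass: str     # "IS", "DL" or "AD"
    info: object   # (machine opcode, instruction length) for IS, else a routine id

# A few sample entries; lengths and routine ids are assumptions.
OPTAB = {
    "MOVER": OptabEntry("IS", (4, 1)),
    "ADD": OptabEntry("IS", (1, 1)),
    "DS": OptabEntry("DL", "handle_ds"),
    "DC": OptabEntry("DL", "handle_dc"),
    "START": OptabEntry("AD", "handle_start"),
    "LTORG": OptabEntry("AD", "handle_ltorg"),
}

def describe(mnemonic):
    """Look a mnemonic up in OPTAB and summarize what Pass I learns from it."""
    entry = OPTAB.get(mnemonic)
    if entry is None:
        return f"{mnemonic}: invalid mnemonic"
    if entry.klass == "IS":
        opcode, length = entry.info
        return f"{mnemonic}: imperative, opcode {opcode:02d}, length {length}"
    return f"{mnemonic}: class {entry.klass}, handled by routine {entry.info}"

for m in ["MOVER", "DS", "LTORG", "JUMP"]:
    print(describe(m))
```

Keeping the instruction length in the entry lets Pass I advance LC without knowing anything else about the instruction.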

The tables below show the contents of SYMTAB, LITTAB and POOLTAB after Pass I has processed the program above.

| Symbol | Address |
|--------|---------|
| L1     | 202     |
| NEXT   | 207     |
| BACK   | 202     |
| X      | 214     |

| Literal | Address |
|---------|---------|
| ='5'    | 205     |
| ='2'    | 206     |
| ='1'    | 210     |
| ='2'    | 211     |
| ='4'    | 215     |

| Pool Table |
|------------|
| 0          |
| 2          |
| 4          |
| 5          |
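The relationship between the two tables above can be illustrated with a short sketch. It assumes that each POOLTAB entry is the 0-based LITTAB index at which a pool starts and that the last entry only marks the end of the final pool; the function name `literal_pools` is hypothetical.

```python
# LITTAB and POOLTAB as tabulated above (literal numbers counted from 0).
LITTAB = [("='5'", 205), ("='2'", 206), ("='1'", 210), ("='2'", 211), ("='4'", 215)]
POOLTAB = [0, 2, 4, 5]  # assumption: the last entry marks the end of the last pool


def literal_pools(littab, pooltab):
    """Split LITTAB into pools: pool i covers littab[pooltab[i]:pooltab[i + 1]]."""
    return [littab[start:end] for start, end in zip(pooltab, pooltab[1:])]


pools = literal_pools(LITTAB, POOLTAB)
for number, pool in enumerate(pools, start=1):
    print(f"pool {number}: {pool}")
```

Under these assumptions the first two pools are the ones allocated by the two LTORG statements, and the third is the pool allocated at END.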

## **Algorithm for PASS-I**

- 1. loc\_cntr := 0; (default value)
  pooltab\_ptr := 1; POOLTAB[1] := 1;
  littab\_ptr := 1;
- 2. While next statement is not an END statement
  - a) If a label is present then
    - this\_label := symbol in label field;
    - Enter (this\_label, loc\_cntr) in SYMTAB.
  - b) If an LTORG statement then
    - i) Process literals LITTAB[POOLTAB[pooltab\_ptr]] ... LITTAB[littab\_ptr-1] to allocate memory and put the address in the address field. Update loc\_cntr accordingly.
    - ii) pooltab\_ptr := pooltab\_ptr + 1;
    - iii) POOLTAB[pooltab\_ptr] := littab\_ptr;
  - c) If a START or ORIGIN statement then
    - loc\_cntr := value specified in operand field;
  - d) If an EQU statement then
    - i) this\_addr := value of \<address spec\>;
    - ii) Correct the SYMTAB entry for this\_label to (this\_label, this\_addr).
  - e) If a declaration statement then
    - i) code := code of the declaration statement;
    - ii) size := size of memory area required by DC/DS;
    - iii) loc\_cntr := loc\_cntr + size;
    - iv) Generate IC '(DL, code) ...'.
  - f) If an imperative statement then
    - i) code := machine opcode from OPTAB;
    - ii) loc\_cntr := loc\_cntr + instruction length from OPTAB;
    - iii) If operand is a literal then
      - this\_literal := literal in operand field;
      - LITTAB[littab\_ptr] := this\_literal;
      - littab\_ptr := littab\_ptr + 1;
    - else (i.e. operand is a symbol)
      - this\_entry := SYMTAB entry number of operand;
      - Generate IC '(IS, code)(S, this\_entry)';
- 3. (Processing of END statement)
  - a) Perform step 2(b).
  - b) Generate IC '(AD, 02)'.
  - c) Go to Pass II.
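The Pass I steps can be sketched in Python as follows. This is a minimal sketch rather than a definitive implementation. It makes several assumptions: statements arrive already split into (label, opcode, operands) triples, table indices are counted from 0 instead of 1, every instruction is one word long, the codes for the condition codes between LT and ANY are assumed, and END is handled inside the loop. The names `pass_one`, `value` and `allocate_pool` are hypothetical.

```python
import re

# mnemonic -> (statement class, code, length in words); a subset of the
# chapter's opcodes, with every instruction assumed to occupy one word.
OPTAB = {
    "STOP": ("IS", 0, 1), "ADD": ("IS", 1, 1), "SUB": ("IS", 2, 1),
    "MULT": ("IS", 3, 1), "MOVER": ("IS", 4, 1), "MOVEM": ("IS", 5, 1),
    "BC": ("IS", 7, 1), "DC": ("DL", 1, 0), "DS": ("DL", 2, 0),
    "START": ("AD", 1, 0), "END": ("AD", 2, 0), "ORIGIN": ("AD", 3, 0),
    "EQU": ("AD", 4, 0), "LTORG": ("AD", 5, 0),
}
# Registers 1-4 and condition codes 1-6 (the middle codes are an assumption).
FIRST_OPERAND = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4,
                 "LT": 1, "LE": 2, "EQ": 3, "GT": 4, "GE": 5, "ANY": 6}


def pass_one(program):
    """program: list of (label, opcode, operands) string triples."""
    symtab, littab, pooltab, ic = {}, [], [0], []
    loc = 0

    def value(expr):
        # Evaluate <constant>, <symbol> or <symbol>+/-<constant>.
        base, sign, disp = re.fullmatch(r"(\w+)(?:([+-])(\d+))?", expr).groups()
        result = int(base) if base.isdigit() else symtab[base]
        return result + int(sign + disp) if sign else result

    def allocate_pool():
        # Step 2(b): give addresses to the literals of the current pool.
        nonlocal loc
        for entry in littab[pooltab[-1]:]:
            entry[1] = loc
            loc += 1
        pooltab.append(len(littab))

    for label, opcode, operands in program:
        cls, code, length = OPTAB[opcode]
        if label:
            symtab[label] = loc                       # step 2(a)
        if opcode in ("LTORG", "END"):                # steps 2(b) and 3
            allocate_pool()
            ic.append(f"(AD,{code:02d})")
        elif opcode in ("START", "ORIGIN"):           # step 2(c)
            loc = value(operands)
            ic.append(f"(AD,{code:02d}) (C,{loc})")
        elif opcode == "EQU":                         # step 2(d)
            symtab[label] = value(operands)
        elif cls == "DL":                             # step 2(e)
            loc += 1 if opcode == "DC" else int(operands)
            constant = operands.strip("'")
            ic.append(f"(DL,{code:02d}) (C,{constant})")
        else:                                         # step 2(f)
            loc += length
            parts = [p.strip() for p in operands.split(",")] if operands else []
            unit = f"(IS,{code:02d})"
            if parts:
                unit += f" {FIRST_OPERAND[parts[0]]}"
                if parts[1].startswith("="):
                    littab.append([parts[1], None])
                    unit += f" (L,{len(littab) - 1})"
                else:
                    symtab.setdefault(parts[1], None)  # forward reference
                    unit += f" (S,{list(symtab).index(parts[1])})"
            ic.append(unit)
    return symtab, littab, pooltab, ic


PROGRAM = [
    ("", "START", "200"), ("", "MOVER", "AREG, ='5'"), ("", "MOVEM", "AREG, X"),
    ("L1", "MOVER", "BREG, ='2'"), ("", "ORIGIN", "L1+3"), ("", "LTORG", ""),
    ("NEXT", "ADD", "AREG, ='1'"), ("", "SUB", "BREG, ='2'"),
    ("", "BC", "LT, BACK"), ("", "LTORG", ""), ("BACK", "EQU", "L1"),
    ("", "ORIGIN", "NEXT+5"), ("", "MULT", "CREG, ='4'"), ("", "STOP", ""),
    ("X", "DS", "1"), ("", "END", ""),
]
symtab, littab, pooltab, ic = pass_one(PROGRAM)
print(symtab, littab, pooltab, sep="\n")
```

Run on the program with LTORG, ORIGIN and EQU shown earlier, the sketch reproduces the addresses in the symbol, literal and pool tables of that example. Because symbols are entered on first use, the order of SYMTAB entries can differ from the tabulated order.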



























| <b>Mnemonic Opcode</b> | Class | Mnemonic Info |
|------------------------|-------|---------------|
| MOVER                  | IS    | (04,1)        |
| DS                     | DL    | R#7           |
| START                  | AD    | R#11          |
|                        |       |               |
|                        | •     |               |

**OP Table** 

START 100
MOVER AREG, A
ADD AREG, B
MOVEM AREG, C
STOP
A DC '5'
B DC '8'
C DS 1
END



| Symbol | Address | Length |
|--------|---------|--------|
| A      | 104     | 1      |
| B      | 105     | 1      |
| C      | 106     | 1      |

## **Symbol Table**























Figure: Pass I processing of the statement `A DC '5'` with Location\_Counter = 104. A label is present, so This\_Label = A (the symbolic name in the label field) and (This\_Label, Location\_Counter) is entered in the Symbol Table, giving the entry A with address 104 and length 1.

**Symbol Table** 

























Source Program
START 100
MOVER AREG, A
ADD AREG, B
MOVEM AREG, C
STOP
A DC '5'
B DC '8'
C DS 1
END

Intermediate Code
(AD, 01) (C, 100)
(IS, 04) AREG, A
(IS, 01) AREG, B
(IS, 05) AREG, C
(IS, 00)
(DL, 01) (C, 5)
(DL, 01) (C, 8)
(DL, 02) (C, 1)
(AD, 02)

| Symbol | Address | Length |
|--------|---------|--------|
| A      | 104     | 1      |
| B      | 105     | 1      |
| C      | 106     | 1      |

**Symbol Table** 

| Mnemonic Opcode | Class | Mnemonic Info |
|-----------------|-------|---------------|
| MOVER           | IS    | (04,1)        |
| DS              | DL    | R#7           |
| START           | AD    | R#11          |

```
START 100
MOVER AREG, ='5'
ADD AREG, B
MOVEM AREG, C
STOP
B DC '8'
C DS 1
END
```



| Literal_Table_Pointer | Literal | Address |
|-----------------------|---------|---------|
| 1                     | ='5'    | 106     |
| 2                     |         |         |

**Literal Table**

| POOL_Table_Pointer | Literal_Table_Pointer |
|--------------------|-----------------------|
| 1                  | 1                     |
| 2                  | 2                     |

**POOL Table**

| Symbol | Address | Length |
|--------|---------|--------|
| B      | 104     | 1      |
| C      | 105     | 1      |

**Symbol Table**























Pass I state for the remaining statements: Literal\_Table\_Pointer = 2 and POOL\_Table\_Pointer = 1 throughout, while Location\_Counter takes the values 104, 105 and 106 in successive steps.






Source Program
START 100
MOVER AREG, ='5'
ADD AREG, B
MOVEM AREG, C
STOP
B DC '8'
C DS 1
END

Intermediate Code
(AD, 01) (C, 100)
(IS, 04) AREG, (L,01)
(IS, 01) AREG, B
(IS, 05) AREG, C
(IS, 00)
(DL, 01) (C, 8)
(DL, 02) (C, 1)
(AD, 02)


- The intermediate code consists of a set of IC units, each IC unit consisting of the following three fields:
  - Address
  - Representation of the mnemonic opcode
  - Representation of operands

| Address | Opcode | Operands |
|---------|--------|----------|

#### Mnemonic field

The mnemonic field contains a pair of the form

(statement class, code)

where statement class can be one of the following:

- IS: Imperative statement
- DL: Declaration statement
- AD: Assembler directive

For an imperative statement, code is the instruction opcode in the machine language. For declaration statements and assembler directives, code is an ordinal number within the class, as listed below.

#### Declaration statement

| Statement | Code |
|-----------|------|
| DC        | 01   |
| DS        | 02   |

#### **Assembler Directives**

| Directive | Code |
|-----------|------|
| START     | 01   |
| END       | 02   |
| ORIGIN    | 03   |
| EQU       | 04   |
| LTORG     | 05   |

- Intermediate code for imperative statements
  - The first operand is represented by a single digit:
    - 1-4 for the registers AREG-DREG
    - 1-6 for the condition codes LT-ANY
  - The second operand, which is a memory operand, is represented by

```
(operand class, code)
```

where operand class is one of C, S and L (constant, symbol and literal respectively).

- Code field of the second operand
  - For a constant, the code field contains the internal representation of the constant itself.
  - For a symbol or a literal, the code field contains the ordinal number of the operand's entry in SYMTAB or LITTAB.

#### Variant-I

- In Variant-I, two kinds of entries may exist in SYMTAB at any time: entries for defined symbols and entries for forward references.

#### Variant-II

- For declarative statements and assembler directives, processing of the operand fields is essential to support LC processing.
- For imperative statements, the operand field is processed only to identify literal references.
- Symbolic references in the source statement are not processed at all during Pass-I

- Comparison of Variant-I and Variant-II
  - Variant-I of the intermediate code appears to require extra work in Pass-I.
  - IC is quite compact in Variant-I
  - Variant-II reduces the work of pass-I by transferring the burden of operand processing from Pass-I to Pass-II of the assembler.
  - IC is less compact in Variant-II.

- Comparison of Variant-I and Variant-II
  - Memory requirement using Variant-I and Variant-II.

Figure: Memory requirements of Pass-I and Pass-II (data structures and work area) under Variant-I and under Variant-II.

- Processing of Declaration & Assembler Directives
  - DC: A DC statement must be represented in IC
  - START or ORIGIN: It is not necessary to retain START and ORIGIN statement in IC if the IC contains an address field.
  - LTORG: The IC for a literal can be made identical to the IC for a DC statement so that no special processing is required in Pass-II.

# **Algorithm for PASS-II**

- 1. code\_area\_address := address of code\_area;
  pooltab\_ptr := 1;
  loc\_cntr := 0;
- 2. While next statement is not an END statement
  - a) Clear machine\_code\_buffer;
  - b) If an LTORG statement then
    - i) Process literals in LITTAB[POOLTAB[pooltab\_ptr]] ... LITTAB[POOLTAB[pooltab\_ptr+1] - 1] similar to processing of constants in a DC statement, i.e. assemble the literals in machine\_code\_buffer;
    - ii) size := size of memory area required for literals;
    - iii) pooltab\_ptr := pooltab\_ptr + 1;
  - c) If a START or ORIGIN statement then
    - i) loc\_cntr := value specified in operand field;
    - ii) size := 0;
  - d) If a declaration statement then
    - i) If a DC statement then assemble the constant in machine\_code\_buffer;
    - ii) size := size of memory area required by DC/DS;
  - e) If an imperative statement then
    - i) Get the operand address from SYMTAB or LITTAB.
    - ii) Assemble the instruction in machine\_code\_buffer.
    - iii) size := size of instruction;
  - f) If size <> 0 then
    - i) Move the contents of machine\_code\_buffer to the address code\_area\_address + loc\_cntr;
    - ii) loc\_cntr := loc\_cntr + size;
- 3. (Processing of an END statement)
  - a) Perform steps 2(b) and 2(f).
  - b) Write code\_area into the output file.
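The Pass II steps can be sketched in Python as follows, using the chapter's literal example (`MOVER AREG, ='5'`). This is a sketch under stated assumptions: the intermediate code is held as hypothetical (class, code, register, operand) tuples, table indices are counted from 0, the code area is modelled as a dictionary keyed by address, and END is handled inside the loop rather than as a separate step. The names `word` and `pass_two` are hypothetical.

```python
def word(opcode, reg, address):
    """Format one machine word as sign, opcode, register, address."""
    return f"+{opcode:02d} {reg} {address:03d}"


def pass_two(ic, symtab, littab, pooltab):
    code_area = {}          # address -> assembled word
    loc, pool = 0, 0
    for cls, code, reg, operand in ic:
        buffer = []                                   # machine_code_buffer
        if cls == "AD" and code in (2, 5):            # END or LTORG: step 2(b)
            for literal, _address in littab[pooltab[pool]:pooltab[pool + 1]]:
                buffer.append(word(0, 0, int(literal.strip("='"))))
            pool += 1
        elif cls == "AD" and code in (1, 3):          # START or ORIGIN: step 2(c)
            loc = operand[1]
        elif cls == "DL" and code == 1:               # DC: assemble the constant
            buffer.append(word(0, 0, operand[1]))
        elif cls == "DL":                             # DS: only reserve storage
            loc += operand[1]
        elif cls == "IS":                             # step 2(e)
            if operand is None:
                address = 0
            elif operand[0] == "S":
                address = symtab[operand[1]]
            else:
                address = littab[operand[1]][1]
            buffer.append(word(code, reg, address))
        for assembled in buffer:                      # step 2(f)
            code_area[loc] = assembled
            loc += 1
    return code_area


IC = [
    ("AD", 1, 0, ("C", 100)),   # START 100
    ("IS", 4, 1, ("L", 0)),     # MOVER AREG, ='5'
    ("IS", 1, 1, ("S", "B")),   # ADD AREG, B
    ("IS", 5, 1, ("S", "C")),   # MOVEM AREG, C
    ("IS", 0, 0, None),         # STOP
    ("DL", 1, 0, ("C", 8)),     # B DC '8'
    ("DL", 2, 0, ("C", 1)),     # C DS 1
    ("AD", 2, 0, None),         # END
]
SYMTAB = {"B": 104, "C": 105}
LITTAB = [("='5'", 106)]
POOLTAB = [0, 1]

code_area = pass_two(IC, SYMTAB, LITTAB, POOLTAB)
for address in sorted(code_area):
    print(f"{address}) {code_area[address]}")
```

The word for the literal is placed at address 106, after the word reserved for C, which agrees with the address recorded for `='5'` in the literal table.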

























## **Source Program**

```
START 100
MOVER AREG, A
ADD AREG, B
MOVEM AREG, C
STOP
A DC '5'
B DC '8'
C DS 1
END
```



| Symbol | Address | Length |
|--------|---------|--------|
| A      | 104     | 1      |
| B      | 105     | 1      |
| C      | 106     | 1      |

## **Symbol Table**





Intermediate Code
(AD, 01) (C, 100)
(IS, 04) AREG, A
(IS, 01) AREG, B
(IS, 05) AREG, C
(IS, 00)
(DL, 01) (C, 5)
(DL, 01) (C, 8)
(DL, 02) (C, 1)





Pass II trace for this program. Code\_Area\_Address = 0 and POOL\_Table\_Pointer = 1 throughout, and Size = 1 at every step. Each IC unit is assembled in the machine code buffer (MCB) and then moved to the code area (CAA) at the address given by Location\_Counter. For a DC statement the constant is assembled in the MCB, and for DC/DS Size is the size of the memory area required.

| IC unit          | Location\_Counter | MCB        |
|------------------|-------------------|------------|
| (IS, 04) AREG, A | 100               | + 04 1 104 |
| (IS, 01) AREG, B | 101               | + 01 1 105 |
| (IS, 05) AREG, C | 102               | + 05 1 106 |
| (IS, 00)         | 103               | + 00 0 000 |
| (DL, 01) (C, 5)  | 104               | + 00 0 005 |
| (DL, 01) (C, 8)  | 105               | + 00 0 008 |
| (DL, 02) (C, 1)  | 106               | (empty)    |

Contents of the code area after Pass II:

| Address | Machine code |
|---------|--------------|
| 100)    | + 04 1 104   |
| 101)    | + 01 1 105   |
| 102)    | + 05 1 106   |
| 103)    | + 00 0 000   |
| 104)    | + 00 0 005   |
| 105)    | + 00 0 008   |
| 106)    |              |

**CAA**









| Mnemonic Opcode | Class | Mnemonic Info |
|-----------------|-------|---------------|
| MOVER           | IS    | (04,1)        |
| DS              | DL    | R#7           |
| START           | AD    | R#11          |

```
START 100
MOVER AREG, ='5'
ADD AREG, B
MOVEM AREG, C
STOP
B DC '8'
C DS 1
END
```



| Literal_Table_Pointer | Literal | Address |
|-----------------------|---------|---------|
| 1                     | ='5'    | 106     |
| 2                     |         |         |

**Literal Table**

| POOL_Table_Pointer | Literal_Table_Pointer |
|--------------------|-----------------------|
| 1                  | 1                     |
| 2                  | 2                     |

**POOL Table**

| Symbol | Address | Length |
|--------|---------|--------|
| B      | 104     | 1      |
| C      | 105     | 1      |

**Symbol Table**








Intermediate Code
(AD, 01) (C, 100)
(IS, 04) AREG, (L,01)
(IS, 01) AREG, B
(IS, 05) AREG, C
(IS, 00)
(DL, 01) (C, 8)
(DL, 02) (C, 1)





Pass II trace for this program. Code\_Area\_Address = 0 and Size = 1 at every step. POOL\_Table\_Pointer = 1 until the literal pool is processed at the END statement, after which it is 2.

| IC unit               | Location\_Counter | MCB        |
|-----------------------|-------------------|------------|
| (IS, 04) AREG, (L,01) | 100               | + 04 1 106 |
| (IS, 01) AREG, B      | 101               | + 01 1 104 |
| (IS, 05) AREG, C      | 102               | + 05 1 105 |
| (IS, 00)              | 103               | + 00 0 000 |
| (DL, 01) (C, 8)       | 104               | + 00 0 008 |
| (DL, 02) (C, 1)       | 105               | (empty)    |
| END: literal ='5'     | 106               | + 00 0 005 |

Contents of the code area after Pass II:

| Address | Machine code |
|---------|--------------|
| 100)    | + 04 1 106   |
| 101)    | + 01 1 104   |
| 102)    | + 05 1 105   |
| 103)    | + 00 0 000   |
| 104)    | + 00 0 008   |
| 105)    |              |
| 106)    | + 00 0 005   |

**CAA**

Register codes used while assembling the instructions:

| Register | Code |
|----------|------|
| AREG     | 1    |
| BREG     | 2    |
| CREG     | 3    |
| DREG     | 4    |

| A<br>L1<br>D<br>L2 | ADD                                         | 100<br>3<br>AREG, B<br>AREG, C<br>AREG, D<br>A+1<br>D |
|--------------------|---------------------------------------------|-------------------------------------------------------|
| СВ                 | ORIGIN<br>DC<br>ORIGIN<br>STOP<br>DC<br>END | <b>'</b> 5'                                           |



| Symbol | Address | Length |
|--------|---------|--------|
| A      | 100     |        |
| L1     | 103     |        |
| D      | 101     |        |
| L2     | 106     |        |
| C      | 99      |        |
| B      | 108     |        |

# **Listing and Error Reporting**

| Sr. No. | Label | Statement                                      | Address |
|---------|-------|------------------------------------------------|---------|
| 1       |       | START 200                                      |         |
| 2       |       | MOVER AREG,A                                   | 200     |
| 3       |       | :                                              |         |
| 9       |       | MVER BREG,A                                    | 207     |
|         |       | \*\*ERROR\*\* INVALID OPCODE                   |         |
| 10      |       | ADD BREG,B                                     | 208     |
| 14      | A     | DS 1                                           | 209     |
| 15      |       | :                                              |         |
| 21      | A     | DC '5'                                         | 227     |
|         |       | \*\*ERROR\*\* DUPLICATE DEFINITION OF SYMBOL A |         |
| 22      |       | :                                              |         |
| 35      |       | END                                            |         |
|         |       | \*\*ERROR\*\* UNDEFINED SYMBOL B IN STATEMENT 10 |       |

| Sr. No. | Label | Statement                                          | Address | Instruction |
|---------|-------|----------------------------------------------------|---------|-------------|
| 1       |       | START 200                                          |         |             |
| 2       |       | MOVER AREG,A                                       | 200     | +04 1 209   |
| 3       |       | :                                                  |         |             |
| 9       |       | MVER BREG,A                                        | 207     | + 2 209     |
|         |       | \*\*ERROR\*\* INVALID OPCODE                       |         |             |
| 10      |       | ADD BREG,B                                         | 208     | +01 2       |
|         |       | \*\*ERROR\*\* UNDEFINED SYMBOL B IN OPERAND FIELD  |         |             |
| 14      | A     | DS 1                                               | 209     |             |
| 15      |       | :                                                  |         |             |
| 21      | A     | DC '5'                                             | 227     | +00 0 005   |
|         |       | \*\*ERROR\*\* DUPLICATE DEFINITION OF SYMBOL A     |         |             |
| 22      |       | :                                                  |         |             |
| 35      |       | END                                                |         |             |
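The three kinds of errors in these listings can be detected with simple table checks. The sketch below is illustrative only: the statement tuples, the reduced opcode set and the function name `check` are assumptions, and undefined symbols are reported after the last statement because a symbol may be defined later than it is used.

```python
# Assumed opcode set: just the mnemonics that appear in the listing.
VALID_OPCODES = {"START", "MOVER", "ADD", "DS", "DC", "END"}


def check(statements):
    """statements: list of (number, label, opcode, symbol operand or None)."""
    defined, used, errors = {}, [], []
    for number, label, opcode, symbol in statements:
        if opcode not in VALID_OPCODES:
            errors.append((number, "INVALID OPCODE"))
        if label:
            if label in defined:
                errors.append((number, f"DUPLICATE DEFINITION OF SYMBOL {label}"))
            else:
                defined[label] = number
        if symbol:
            used.append((number, symbol))
    # A symbol may be defined after its use, so this check comes last.
    for number, symbol in used:
        if symbol not in defined:
            errors.append((number, f"UNDEFINED SYMBOL {symbol} IN STATEMENT {number}"))
    return errors


STATEMENTS = [
    (1, "", "START", None),
    (2, "", "MOVER", "A"),
    (9, "", "MVER", "A"),
    (10, "", "ADD", "B"),
    (14, "A", "DS", None),
    (21, "A", "DC", None),
    (35, "", "END", None),
]
errors = check(STATEMENTS)
for number, message in errors:
    print(f"statement {number}: **ERROR** {message}")
```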

# Some Organizational Issues



#### **Architecture of Intel 8088**

The CPU contains the following registers:

- Data registers AX, BX, CX and DX
- Index registers SI and DI
- Stack pointer registers SP and BP
- Segment registers Code, Stack, Data and Extra

Data Registers

| Register | High byte | Low byte |
|----------|-----------|----------|
| AX       | AH        | AL       |
| BX       | BH        | BL       |
| CX       | CH        | CL       |
| DX       | DH        | DL       |

Base Registers

| Register |
|----------|
| SP       |
| BP       |

Index Registers

| Register |
|----------|
| SI       |
| DI       |

Segment Registers

| Register |
|----------|
| CODE     |
| STACK    |
| DATA     |
| EXTRA    |

# **Addressing Modes**

| Addressing Mode   | Example              | Remarks                                                               |
|-------------------|----------------------|-----------------------------------------------------------------------|
| Immediate         | MOV SUM, 1234H       | Data = 1234H                                                          |
| Register          | MOV SUM, AX          | AX contains data                                                      |
| Direct            | MOV SUM, [1234H]     | Data Displacement = 1234H                                             |
| Indirect          | MOV SUM, [BX]        | Data Displacement = (BX)                                              |
| Register Indirect | MOV SUM, CS:[BX]     | Segment override: Segment Base = (CS), Data Displacement = (BX)       |
| Based             | MOV SUM, 12H[BX]     | Data Displacement = 12H + (BX)                                        |
| Indexed           | MOV SUM, 34H[SI]     | Data Displacement = 34H + (SI)                                        |
| Based and Indexed | MOV SUM, 56H[SI][BX] | Data Displacement = 56H + (SI) + (BX)                                 |
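The data displacement formulas in the Remarks column can be worked through numerically. In the sketch below the register contents are made-up values chosen for illustration and the function name `displacement` is hypothetical. The result is masked to 16 bits because 8088 offsets are 16 bits wide.

```python
def displacement(constant=0, base=None, index=None, registers=None):
    """Data displacement = constant + (base register) + (index register)."""
    registers = registers or {}
    total = constant
    if base is not None:
        total += registers[base]
    if index is not None:
        total += registers[index]
    return total & 0xFFFF   # offsets are 16 bits wide and wrap around


# Hypothetical register contents, chosen only for this example.
REGS = {"BX": 0x0100, "SI": 0x0020}

direct = displacement(0x1234)                                             # [1234H]
indirect = displacement(base="BX", registers=REGS)                        # [BX]
based = displacement(0x12, base="BX", registers=REGS)                     # 12H[BX]
indexed = displacement(0x34, index="SI", registers=REGS)                  # 34H[SI]
based_indexed = displacement(0x56, base="BX", index="SI", registers=REGS)  # 56H[SI][BX]

for name, value in [("direct", direct), ("indirect", indirect), ("based", based),
                    ("indexed", indexed), ("based and indexed", based_indexed)]:
    print(f"{name}: {value:04X}H")
```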

**Statement Format:** 

[Label:] opcode operand(s) ; comment string

- Assembler Directives
  - ORG
  - EQU
  - END

Declarations:

```
DB   (e.g. A DB 25 ; reserve a byte and initialize it)
DW   (e.g. B DW ? ; reserve a word, no initialization)
DD   (e.g. A DD 6 DUP(0) ; 6 double words, all 0's)
DQ
DT
```
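The storage reserved by each declaration determines the offsets of the names that follow it. The sketch below assumes the usual unit sizes for these directives (DB 1 byte, DW 2, DD 4, DQ 8, DT 10); the function name `allocate` and the (name, directive, count) tuples are hypothetical. The declarations mirror the DATA segment of the chapter's sample program, which starts at ORG 1.

```python
# Bytes reserved per element by each declaration directive.
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8, "DT": 10}


def allocate(declarations, origin=0):
    """Return ({name: offset}, next free offset) for (name, directive, count) items."""
    offsets, offset = {}, origin
    for name, directive, count in declarations:
        offsets[name] = offset
        offset += SIZES[directive] * count
    return offsets, offset


# COUNT DB ?  and  STRING DW 50 DUP(?)  placed after ORG 1.
offsets, next_free = allocate([("COUNT", "DB", 1), ("STRING", "DW", 50)], origin=1)
print(offsets, next_free)
```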

EQU and PURGE

e.g.

```
XYZ DB ?
ABC EQU XYZ    ; ABC represents the name XYZ
PURGE ABC      ; ABC no longer stands for XYZ
ABC EQU 25     ; ABC now stands for '25'
```

- SEGMENT, ENDS and ASSUME
  - The SEGMENT and ENDS directives demarcate the segments in an assembly program.
  - ASSUME tells the assembler that it can assume the address of the indicated segment to be present in \<register\>. Its syntax is

ASSUME \<register\> : \<segment name\>

e.g.

```
SAMPLE_DATA SEGMENT
ARRAY       DW 100 DUP(?)
SUM         DW 0
SAMPLE_DATA ENDS
SAMPLE_CODE SEGMENT
            ASSUME DS: SAMPLE_DATA
HERE:       MOV AX, SAMPLE_DATA
            MOV DS, AX
            MOV AX, SUM
            -----
SAMPLE_CODE ENDS
            END HERE
```

PROC, ENDP, NEAR and FAR

e.g.

```
SAMPLE_CODE SEGMENT
CALCULATE   PROC FAR
            RET
CALCULATE   ENDP
SAMPLE_CODE ENDS
PGM         SEGMENT
            CALL CALCULATE
PGM         ENDS
            END
```

#### PUBLIC and EXTRN

- PUBLIC: when a symbolic name declared in one assembly module is to be accessible in other modules, it is specified in a PUBLIC statement.
- EXTRN: another module wishing to use this name must specify it in an EXTRN statement.

- Analytic operators
  - SEG
  - OFFSET
  - TYPE
  - SIZE
  - LENGTH

- Synthetic operator
  - PTR creates new memory operand with the same segment and offset addresses as an existing operand but having a different type.
  - THIS performs the special function of creating a new memory operand with the same address as the next memory byte available for allocation

#### Example

Using PTR:

```
XYZ      DW  312
NEW_NAME EQU BYTE PTR XYZ
LOOP:    CMP AX, 234
         JMP LOOP
FAR_LOOP EQU FAR PTR LOOP
         JMP FAR_LOOP
```

Using THIS:

```
NEW_NAME EQU THIS BYTE
XYZ      DW  312
FAR_LOOP EQU THIS FAR
LOOP:    CMP AX, 234
         JMP LOOP
         JMP FAR_LOOP
```

| Sr. No. | Label  | Opcode  | Operands           | Offset |
|---------|--------|---------|--------------------|--------|
| 1       | CODE   | SEGMENT |                    |        |
| 2       |        | ASSUME  | CS:CODE,DS:DATA    |        |
| 3       |        | MOV     | AX,DATA            | 0      |
| 4       |        | MOV     | DS,AX              | 3      |
| 5       |        | MOV     | CX,LENGTH STRING   | 5      |
| 6       |        | MOV     | COUNT,0000         | 8      |
| 7       |        | MOV     | SI,OFFSET STRING   | 11     |
| 8       |        | ASSUME  | ES:DATA,DS:NOTHING |        |
| 9       |        | MOV     | AX,DATA            | 14     |
| 10      |        | MOV     | ES,AX              | 17     |
| 11      | COMP:  | CMP     | [SI],'A'           | 19     |
| 12      |        | JNE     | NEXT               | 22     |
| 13      |        | MOV     | COUNT,1            | 24     |
| 14      | NEXT:  | INC     | SI                 | 27     |
| 15      |        | DEC     | CX                 | 29     |
| 16      |        | JNE     | COMP               | 30     |
| 17      | CODE   | ENDS    |                    |        |
| 18      | DATA   | SEGMENT |                    |        |
| 19      |        | ORG     | 1                  |        |
| 20      | COUNT  | DB      | ?                  | 1      |
| 21      | STRING | DW      | 50 DUP(?)          | 2      |
| 22      | DATA   | ENDS    |                    |        |
| 23      |        | END     |                    |        |

# Mnemonics table(MOT)

| Mnemonic opcode (6) | Machine opcode (2) | Alignment/format info (1) | Routine id (4) |
|---------------------|--------------------|---------------------------|----------------|
| JNE                 | 75H                | 00H                       | R2             |

### Symbol Table (SYMTAB)



# Segment register table array (SRTAB\_ARRAY)

| Segment register (1) | Segment name (2) |         |
|----------------------|------------------|---------|
| 00 (ES)              | 23               | SRTAB#1 |
| :                    |                  | SRTAB#2 |

## Forward reference table(FRT)

| Pointer (2) | SRTAB# (1) | Instruction address (2) | Usage code (1) | Source stmt# (2) |
|-------------|------------|-------------------------|----------------|------------------|

### Cross Reference Table (CRT)

| Pointer (2) | Source stmt# (2) |
|-------------|------------------|

SYMTAB entries:

| Symbol | Defined? | Segment name? | Type | Offset | Owner segment | Length & size | FRT pointer | CRT pointer (first) | CRT pointer (last) |
|--------|----------|---------------|------|--------|---------------|---------------|-------------|---------------------|--------------------|
| CODE   | Y        | Y             |      |        |               |               |             |                     |                    |
| DATA   | N        | Y             |      |        |               |               |             |                     |                    |
| COMP   | Y        | N             | -1   | 19     | 1             |               | -           |                     |                    |
| NEXT   | Y        | N             | -1   | 27     | 1             |               | -           |                     |                    |
| COUNT  | N        | N             |      |        |               |               | → FRT       |                     |                    |
| STRING | N        | N             |      |        |               |               | → FRT       |                     |                    |

FRT entries:

| SRTAB# | Instruction address | Usage code | Source stmt# |
|--------|---------------------|------------|--------------|
| #1     | 0008                | D          | 0006         |
| #2     | 0024                | D          | 0013         |
| #1     | 0005                | L          | 0005         |
| #1     | 0011                | F          | 0007         |

CRT entries:

| Source stmt# |
|--------------|
| 16           |
| 12           |

SRTAB\_ARRAY entries:

| Segment register | Segment name |         |
|------------------|--------------|---------|
| 01(CS)           | 1            | SRTAB#1 |
| 11(DS)           | 2            | SRTAB#1 |
| 00(ES)           | 2            | SRTAB#2 |
| 01(CS)           | 1            | SRTAB#2 |

# Single Pass Assembler

```
1) code_area_address := address of code_area;
   srtab_no := 1;
   LC := 0;
   stmt_no := 1;
   SYMTAB_segment_entry := 0;
   Clear ERRTAB, SRTAB_ARRAY.
2) While next statement is not an END statement
   a) Clear machine_code_buffer.
   b) If label is present then this_label := symbol in the label field;
```

#### c) If an EQU statement

- i) this\_address := value of operand expression;
- ii) Make an entry for this\_label in SYMTAB with

```
offset := this_address;
defined := 'yes';
owner_segment := owner_segment of operand symbol;
source_stmt# := stmt_no;
```

- iii) Enter stmt\_no in the CRT list of the label in the operand field.
- iv) Process forward references to this\_label;
- v) size := 0;
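
The EQU step can be sketched in Python as follows. This is a minimal illustration, assuming the operand expression has the form symbol + displacement, that the operand symbol is already defined, and that SYMTAB and the CRT are plain dictionaries. `process_equ` and the field names are hypothetical. Processing of forward references to the new label (step iv) is left out.

```python
def process_equ(label, operand_symbol, displacement, symtab, crt, stmt_no):
    """Define `label` as operand_symbol + displacement; return the statement size."""
    target = symtab[operand_symbol]      # assumption: operand symbol is defined
    this_address = target["offset"] + displacement
    symtab[label] = {
        "defined": True,
        "offset": this_address,
        "owner_segment": target["owner_segment"],
        "source_stmt": stmt_no,
    }
    # Step iii: the EQU statement counts as a reference to the operand symbol.
    crt.setdefault(operand_symbol, []).append(stmt_no)
    return 0                             # step v: an EQU occupies no memory
```

The new label inherits the owner segment of the operand symbol, so later addressability checks treat both names alike.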

#### d) If an ASSUME statement

- i) Copy the SRTAB in SRTAB\_ARRAY[srtab\_no] into SRTAB\_ARRAY[srtab\_no + 1];
- ii) srtab\_no := srtab\_no + 1;
- iii) this\_register := register mentioned in the statement.
- iv) this\_segment := entry number of SYMTAB entry of the segment appearing in the operand field.
- v) Make the entry (this\_register; this\_segment) in the SRTAB\_ARRAY[srtab\_no]. (This overwrites an existing entry for this register.)
- vi) size := 0;
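
The ASSUME step amounts to copy, increment, overwrite. A minimal sketch, assuming SRTAB\_ARRAY is a dictionary from SRTAB number to a register-to-segment mapping. `process_assume` is a hypothetical name, and the register names stand in for the numeric register codes.

```python
def process_assume(srtab_array, srtab_no, this_register, this_segment):
    """Apply one ASSUME operand and return the new srtab_no.

    srtab_array maps an SRTAB number to {segment register: SYMTAB entry number}.
    """
    # Steps i and ii: start the new SRTAB as a copy of the current one.
    srtab_array[srtab_no + 1] = dict(srtab_array.get(srtab_no, {}))
    srtab_no += 1
    # Step v: overwrite any existing entry for this register.
    srtab_array[srtab_no][this_register] = this_segment
    return srtab_no


srtab_array = {1: {"CS": 1, "DS": 2}}
new_no = process_assume(srtab_array, 1, "ES", 2)
```

Earlier SRTABs are kept rather than updated in place because each FRT entry records the SRTAB number in effect at the referencing statement.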

#### e) If a SEGMENT statement

- i) Make an entry for this\_label in SYMTAB.
- ii) Set the segment name? field of the entry to true;
- iii) SYMTAB\_segment\_entry := entry no in SYMTAB;
- iv) LC := 0;
- v) size := 0;

#### f) If an ENDS statement then

```
SYMTAB_segment_entry := 0;
```

#### g) If a declaration statement

- i) Align LC according to the specification in the operand field.
- ii) Assemble the constant(s), if any, in the machine\_code\_buffer
- iii) size := size of memory area required;
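
Steps i and iii of a declaration statement can be sketched as below. This is a hedged illustration: it assumes alignment means rounding LC up to a multiple of a boundary, and that only DB (1 byte) and DW (2 bytes) occur. `align`, `declaration_size` and `UNIT_SIZE` are hypothetical names.

```python
UNIT_SIZE = {"DB": 1, "DW": 2}  # bytes per element (assumed subset of directives)


def align(lc: int, boundary: int) -> int:
    """Round lc up to the next multiple of boundary (boundary >= 1)."""
    return (lc + boundary - 1) // boundary * boundary


def declaration_size(directive: str, count: int = 1) -> int:
    """Size in bytes of `count` elements declared with DB or DW."""
    return UNIT_SIZE[directive] * count
```

For the sample program, `COUNT DB ?` needs `declaration_size("DB")` bytes and `STRING DW 50 Dup(?)` needs `declaration_size("DW", 50)` bytes.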

#### h) If an imperative statement

- i) If operand is a symbol symb then enter stmt\_no in the CRT list of symb.
- ii) If operand symbol is already defined then
  - Check its alignment & addressability.
  - Generate the address specification (segment register, offset) for the symbol using its SYMTAB entry and SRTAB\_ARRAY[srtab\_no].

  else
  - Make an entry for symbol in SYMTAB.
  - defined := 'no';
  - Enter (srtab\_no, LC, usage code, stmt\_no) in FRT.
- iii) Assemble the instruction in machine\_code\_buffer.
- iv) size := size of the instruction;
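
Steps i and ii for a symbolic operand can be sketched as follows. This is an illustration under stated assumptions: SYMTAB, FRT and CRT are dictionaries keyed by symbol name, alignment checking is omitted, and `reference_symbol` is a hypothetical name.

```python
def reference_symbol(name, symtab, srtab_array, srtab_no, frt, crt,
                     lc, usage_code, stmt_no):
    """Handle one symbolic operand of an imperative statement.

    Returns (segment register, offset) if the symbol is already defined;
    otherwise records a forward reference and returns None.
    """
    crt.setdefault(name, []).append(stmt_no)              # step i
    entry = symtab.get(name)
    if entry is not None and entry["defined"]:            # step ii, defined case
        # Addressability: some segment register must map to the owner segment.
        for register, segment in srtab_array[srtab_no].items():
            if segment == entry["owner_segment"]:
                return register, entry["offset"]
        raise ValueError(f"{name} is not addressable in statement {stmt_no}")
    symtab.setdefault(name, {"defined": False})           # step ii, else case
    frt.setdefault(name, []).append((srtab_no, lc, usage_code, stmt_no))
    return None
```

With COMP defined at offset 19 in segment 1 and CS mapped to segment 1, a reference to COMP yields a CS-relative address, while a reference to the still undefined COUNT only adds an FRT entry.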

#### i) If size <> 0 then

- i) If label is present then

```
Make an entry for this_label in SYMTAB.
owner_segment := SYMTAB_segment_entry;
defined := 'yes';
offset := LC;
source_stmt# := stmt_no;
```

- ii) Move contents of machine\_code\_buffer to the address code\_area\_address;
- iii) code\_area\_address := code\_area\_address + size;
- iv) Process forward references to the symbol. Check for alignment & addressability errors. Enter errors in ERRTAB.
- v) List the statement with errors contained in ERRTAB.
- vi) Clear ERRTAB.
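
Defining a label and then processing its forward references could look like the sketch below. It assumes dictionary-based tables, returns the needed fix-ups as a list of (instruction address, segment register, offset) tuples instead of patching the code area, and skips alignment checks. `define_label` is a hypothetical name.

```python
def define_label(name, symtab, frt, srtab_array, errtab,
                 segment_entry, lc, stmt_no):
    """Define `name` at (segment_entry, lc) and resolve its forward references."""
    entry = symtab.setdefault(name, {})
    entry.update(defined=True, owner_segment=segment_entry,
                 offset=lc, source_stmt=stmt_no)
    patches = []
    for srtab_no, address, usage_code, ref_stmt in frt.pop(name, []):
        # Use the SRTAB that was in effect at the referencing statement.
        registers = [reg for reg, seg in srtab_array[srtab_no].items()
                     if seg == segment_entry]
        if registers:
            patches.append((address, registers[0], lc))
        else:
            errtab.append(f"stmt {ref_stmt}: {name} is not addressable")
    return patches
```

Each forward reference is checked against the SRTAB recorded in its FRT entry, which is why the same symbol can resolve through different segment registers at different statements.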

- 3) (Processing of END statement)
  - a) Report undefined symbols from SYMTAB.
  - b) Produce cross reference listing.
  - c) Write code\_area into the output file.
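
Steps a and b of END processing can be sketched as follows, assuming SYMTAB and the CRT are dictionaries keyed by symbol name. The function names and the listing layout are hypothetical.

```python
def undefined_symbols(symtab):
    """Step a: names that were referenced but never defined."""
    return sorted(name for name, entry in symtab.items()
                  if not entry.get("defined"))


def cross_reference_listing(symtab, crt):
    """Step b: one line per symbol: name, defining stmt#, referencing stmt#s."""
    lines = []
    for name in sorted(crt):
        defined_at = symtab.get(name, {}).get("source_stmt", "-")
        references = ", ".join(str(stmt) for stmt in crt[name])
        lines.append(f"{name:<8}{str(defined_at):<6}{references}")
    return lines
```

A symbol whose SYMTAB entry still has defined = 'no' at this point was only ever entered through a forward reference, which is exactly what step a reports.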



































