# Degree in Computer Engineering

# Computer Architecture, Academic Year 2022/23

# Validation, Block 3



#### **Author:**

Rafael Osuna Ventura

Lab group: 2

# <u>Contents</u>

| Section                                                                                                         |
|-----------------------------------------------------------------------------------------------------------------|
| 1. Exercise 1                                                                                                   |
| 1.2 WinDLX                                                                                                      |
| 1.3 SuperDLX                                                                                                    |
| 1.3.1 Without Branch Prediction                                                                                 |
| 1.3.2 With Branch Prediction                                                                                    |
| 1.4 Differences in results between WinDLX, SuperDLX without branch predictor and SuperDLX with branch predictor |
| 2. Exercise 2                                                                                                   |
| 2.1 Number of cycles for A, B, C, D and E, with and without branch predictor                                    |
| 2.1.A: Fetch, decode and retirement (2, 2, 2)                                                                   |
| 2.1.B: Fetch, decode and retirement (4, 4, 4)                                                                   |
| 2.1.C: Fetch, decode and retirement (6, 6, 6)                                                                   |
| 2.1.D: Fetch, decode and retirement (3, 5, 5)                                                                   |
| 2.1.E: Fetch, decode and retirement (1, 4, 6)                                                                   |
| 2.2 Graph illustrating the behaviour                                                                            |
| 2.3 Description of the results for A, B and C                                                                   |
| 2.4 Description of the results for D and E                                                                      |
| 2.5 Description of the influence of the branch predictor on the results                                         |
| 3. Exercise 3                                                                                                   |
| 3.1: Number of cycles for A, B and C, with and without branch predictor                                         |
| 3.1.A: Instruction queue, instruction window and reorder buffer (8, 8, 8)                                       |
| 3.1.B: Instruction queue, instruction window and reorder buffer (15, 15, 15)                                    |
| 3.1.C: Instruction queue, instruction window and reorder buffer (25, 25, 25)                                    |
| 3.2: Graph illustrating the behaviour                                                                           |
| 3.3: Description of the results for the three options                                                           |
| 3.4: Description of the influence of the branch predictor on the results                                        |
| 4. The best SuperDLX processor in terms of value for money                                                      |
| 4.1: Number of cycles                                                                                           |
| 4.2: Why this configuration                                                                                     |

#### 1. Exercise 1

#### 1.2 WinDLX:

**WinDLX** simulates a pipelined processor, that is, one able to overlap the execution of several instructions: the next instruction starts while earlier ones are still in progress. Because the overlapped instructions are in different stages, each uses different hardware resources at any given moment.

Each instruction is divided into 5 stages, and the rate at which the pipeline advances is dictated by the slowest stage. The stages are:

1. Instruction fetch (IF)
2. Instruction decode (ID)
3. Execution (EX)
4. Data memory access (MEM)
5. Write-back (WB)
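
As a rough illustration of why overlapping the stages pays off, the sketch below models an idealized pipeline: the first instruction needs one cycle per stage, and each later instruction completes one cycle after its predecessor, plus any stall cycles. The function names and the formula are a textbook simplification introduced here for illustration; they are not WinDLX's internal accounting.

```python
def pipeline_cycles(n_instructions: int, n_stages: int = 5, stall_cycles: int = 0) -> int:
    """Cycles for an idealized in-order pipeline (textbook model, an assumption)."""
    if n_instructions <= 0:
        return 0
    # Fill the pipeline once, then complete one instruction per cycle, plus stalls.
    return n_stages + (n_instructions - 1) + stall_cycles


def unpipelined_cycles(n_instructions: int, n_stages: int = 5) -> int:
    """Cycles if each instruction went through every stage before the next one started."""
    return n_instructions * n_stages


if __name__ == "__main__":
    n = 1000  # hypothetical instruction count
    print("pipelined:  ", pipeline_cycles(n))
    print("unpipelined:", unpipelined_cycles(n))
```

Under this model the pipelined machine approaches one instruction per cycle for long programs, and every stall cycle adds directly to the total.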

Branch stalls are reduced as follows. For an unconditional jump, the instruction fetched right after the jump should not be executed, so it must be guaranteed that it does not modify any register or write to memory. For a conditional branch (a control hazard), if the branch is not taken, execution simply continues; otherwise the instruction already fetched is discarded and the pipeline goes back to the IF stage.

Number of cycles: **1524**, with 307 stalls due to RAW hazards and 0 due to WAW hazards, with forwarding (bypass paths) enabled.

```
1524 Cycle(s) executed.
     ID executed by 1104 Instruction(s).
     2 Instruction(s) currently in Pipeline.
Hardware configuration:
     Memory size: 32768 Bytes
     faddEX-Stages: 1, required Cycles: 2
     fmulEX-Stages: 1, required Cycles: 5
     fdivEX-Stages: 1, required Cycles: 19
     Forwarding enabled.
Stalls:
     RAW stalls: 307 (20.14% of all Cycles), thereof:
          LD stalls: 100 (32.57% of RAW stalls)
          Branch/Jump stalls: 0 (0.00% of RAW stalls)
          Floating point stalls: 207 (67.43% of RAW stalls)
     WAW stalls: 0 (0.00% of all Cycles)
     Structural stalls: 1 (0.06% of all Cycles)
     Control stalls: 108 (7.10% of all Cycles)
     Trap stalls: 2 (0.13% of all Cycles)
     Total: 418 Stall(s) (27.43% of all Cycles)
Conditional Branches):
     Total: 391 (35.42% of all Instructions), thereof:
          taken: 108 (27.62% of all cond. Branches)
          not taken: 283 (72.38% of all cond. Branches)
Load-/Store-Instructions:
     Total: 100 (9.06% of all Instructions), thereof:
          Loads: 100 (100.00% of Load-/Store-Instructions)
          Stores: 0 (0.00% of Load-/Store-Instructions)
Floating point stage instructions:
     Total: 218 (19.75% of all Instructions), thereof:
          Additions: 218 (100.00% of Floating point stage inst.)
          Multiplications: 0 (0.00% of Floating point stage inst.)
          Divisions: 0 (0.00% of Floating point stage inst.)
     Traps: 1 (0.09% of all Instructions)
```
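
The statistics above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the cycles per instruction (CPI) and the stall shares from the reported counts; the variable and function names are illustrative. The per-category percentages printed by WinDLX may differ from a plain rounding by 0.01, so only the counts are taken from the listing.

```python
# Figures copied from the WinDLX statistics listing.
cycles = 1524
instructions = 1104  # "ID executed by 1104 Instruction(s)"
stalls = {"RAW": 307, "WAW": 0, "Structural": 1, "Control": 108, "Trap": 2}


def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


total_stalls = sum(stalls.values())
cpi = cycles / instructions

if __name__ == "__main__":
    print(f"CPI = {cpi:.3f}")
    print(f"total stalls = {total_stalls} ({share(total_stalls, cycles)} % of all cycles)")
    for name, count in stalls.items():
        print(f"  {name}: {count} ({share(count, cycles)} % of all cycles)")
```

The control stalls (108) equal the number of taken conditional branches (108), which is consistent with one lost cycle per taken branch in this run.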

#### 1.3 SuperDLX

**SuperDLX** simulates a superscalar processor, that is, one that can execute several instructions per clock cycle. To do so, it uses multiple pipelines so that several instructions can start executing independently of one another. It has 6 stages:

- Fetch.
- Decode.
- Dispatch.
- Execute.
- Writeback.
- Retirement.

Branch predictors aim to reduce the hazards caused by branches. To do so, they prefetch and execute the instructions on the predicted path before the branch is resolved; that is, instructions are executed speculatively, without yet knowing whether they belong to the correct program order.
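
To make the idea of prediction concrete, here is a minimal sketch of a textbook 2-bit saturating-counter predictor. It is a generic illustration only: the class name, the initial state, and the per-address table are assumptions made here, and nothing in this report states that SuperDLX implements this particular scheme.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter per branch address (illustrative only)."""

    def __init__(self) -> None:
        # branch address -> state 0..3; states 0-1 predict not taken, 2-3 predict taken
        self.counters: dict[int, int] = {}

    def predict(self, pc: int) -> bool:
        # Unseen branches start in state 1 (weakly not taken); this is an assumption.
        return self.counters.get(pc, 1) >= 2

    def update(self, pc: int, taken: bool) -> None:
        state = self.counters.get(pc, 1)
        self.counters[pc] = min(3, state + 1) if taken else max(0, state - 1)


def count_mispredictions(outcomes, pc: int = 0x100) -> int:
    """Replay a sequence of actual outcomes for one branch and count wrong guesses."""
    predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


if __name__ == "__main__":
    loop_branch = [True] * 9 + [False]  # hypothetical loop: taken 9 times, then exit
    print(count_mispredictions(loop_branch))
```

A loop-closing branch is mispredicted only while the counter warms up and at the loop exit, which is why loop-heavy code benefits from prediction.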

#### 1.3.1 Without Branch Prediction:

| General Information              | Value  | Relative measure              |
|----------------------------------|--------|-------------------------------|
| Number of Cycles                 | 1876   |                               |
| Instructions Fetched             | 1103   |                               |
| Instructions Decoded             | 1103   | 100 % of total Fetched        |
| Instructions Issued              | 1103   | 100 % of total Fetched        |
| Issued Integers                  | 308    | 27.923 % of total Issued      |
| Issued Floating Points           | 795    | 72.076 % of total Issued      |
| Instructions Committed           | 1103   | 100 % of total Fetched        |
| Committed Integers               | 308    | 27.923 % of total Committed   |
| Committed Floating Points        | 795    | 72.076 % of total Committed   |
| Writes to Registers              | 712    | 64.551 % of total Committed   |
| Useless Writes                   | 0      | 0 % of total Writes           |
| Fetch rate                       | 0.5879 | Instructions / Cycle          |
| Decode rate                      | 0.5879 | Instructions / Cycle          |
| Issue rate                       | 0.5879 | Instructions / Cycle          |
| Commit rate                      | 0.5879 | Instructions / Cycle          |
| Loads Blocked by Stores          | 0      | 0 % of Total Loads            |
| Number of branches               | 391    |                               |
| Branches Taken                   | 108    | 27.621 %                      |
| Branches Untaken                 | 283    | 72.378 %                      |
| Fetch Stalls                     | 1385   | 73.827 % of Total Cycle Count |
| Decode Stalls                    | 1177   | 62.739 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0      | 0 % of Total Stalls           |

#### 1.3.2 With Branch Prediction:

Number of cycles: 621

| General Information              | Value  | Relative measure             |
|----------------------------------|--------|------------------------------|
| Number of Cycles                 | 621    |                              |
| Instructions Fetched             | 1347   |                              |
| Instructions Decoded             | 1173   | 87.082 % of total Fetched    |
| Instructions Issued              | 1144   | 84.929 % of total Fetched    |
| Issued Integers                  | 338    | 25.092 % of total Issued     |
| Issued Floating Points           | 806    | 59.836 % of total Issued     |
| Instructions Committed           | 1103   | 81.885 % of total Fetched    |
| Committed Integers               | 308    | 22.865 % of total Committed  |
| Committed Floating Points        | 795    | 59.020 % of total Committed  |
| Writes to Registers              | 712    | 64.551 % of total Committed  |
| Useless Writes                   | 0      | 0 % of total Writes          |
| Fetch rate                       | 2.1690 | Instructions / Cycle         |
| Decode rate                      | 1.8888 | Instructions / Cycle         |
| Issue rate                       | 1.8421 | Instructions / Cycle         |
| Commit rate                      | 1.7761 | Instructions / Cycle         |
| Loads Blocked by Stores          | 0      | 0 % of Total Loads           |
| Number of branches               | 110    |                              |
| Branches Taken                   | 110    | 100 %                        |
| Branches Untaken                 | 0      | 0 %                          |
| Fetch Stalls                     | 27     | 4.3478 % of Total Cycle Count |
| Decode Stalls                    | 33     | 5.3140 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0      | 0 % of Total Stalls          |
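
The per-cycle rates in these statistics are simply instruction counts divided by the cycle count. A minimal check using the figures of the two SuperDLX runs above (the dictionary layout and function name are illustrative assumptions):

```python
def per_cycle(count: int, cycles: int) -> float:
    """Instructions per cycle for a given instruction count."""
    return count / cycles


# Counts taken from the SuperDLX statistics of sections 1.3.1 and 1.3.2.
runs = {
    "without predictor": {"cycles": 1876, "fetched": 1103, "committed": 1103},
    "with predictor": {"cycles": 621, "fetched": 1347, "committed": 1103},
}

if __name__ == "__main__":
    for name, run in runs.items():
        commit_rate = per_cycle(run["committed"], run["cycles"])
        not_committed = run["fetched"] - run["committed"]
        print(f"{name}: commit rate {commit_rate:.4f} instr/cycle, "
              f"{not_committed} fetched instructions never committed")
```

With the predictor enabled, more instructions are fetched than committed; the difference corresponds to instructions that were fetched but never committed.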

# 1.4 Differences in results between WinDLX, SuperDLX without branch predictor and SuperDLX with branch predictor

- In WinDLX, since the code has neither reordering nor loop unrolling applied, and it executes many conditional branches (on top of the fact that the code could probably be written more efficiently), the expected number of cycles is high, and the measurement confirms it (1524 cycles).
- In SuperDLX without branch predictor, being able to issue, execute and retire several instructions at once, together with the independent handling of integer and floating-point data (except for loads and stores), would suggest a lower total cycle count. That did not happen: the run took 1876 cycles versus 1524 in WinDLX. This confirms that my code is not efficient; reordering or unrolling it beforehand could reduce the cycle count.
- In **SuperDLX with branch predictor**, the reduction in total cycles is far more significant (621 cycles, roughly 41 % of what the program needs in WinDLX), since the code, by the nature of its purpose, contains many conditional and unconditional branches, and the predictor anticipates them.
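
The comparison can be expressed as speedups relative to WinDLX. The sketch below only divides the cycle counts reported in this exercise; the helper name `speedup` is introduced here for illustration, and equal clock periods are assumed so that cycle counts are directly comparable.

```python
# Cycle counts reported in Exercise 1.
cycles = {
    "WinDLX": 1524,
    "SuperDLX without predictor": 1876,
    "SuperDLX with predictor": 621,
}


def speedup(baseline_cycles: int, new_cycles: int) -> float:
    """Speedup of the new run over the baseline, assuming the same clock period."""
    return baseline_cycles / new_cycles


if __name__ == "__main__":
    base = cycles["WinDLX"]
    for name, count in cycles.items():
        print(f"{name}: {count} cycles, speedup vs WinDLX = {speedup(base, count):.2f}")
```

A value below 1 means the configuration is slower than WinDLX, which is the case for SuperDLX without the predictor.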

# 2. Exercise 2

# 2.1 Number of cycles for A, B, C, D and E, with and without branch predictor

# 2.1.A: Fetch, decode and retirement (2, 2, 2)

#### 2.1.A.1: Without branch predictor

Number of cycles: 1876

| General Information              | Value  | Relative measure              |
|----------------------------------|--------|-------------------------------|
| Number of Cycles                 | 1876   |                               |
| Instructions Fetched             | 1103   |                               |
| Instructions Decoded             | 1103   | 100 % of total Fetched        |
| Instructions Issued              | 1103   | 100 % of total Fetched        |
| Issued Integers                  | 308    | 27.923 % of total Issued      |
| Issued Floating Points           | 795    | 72.076 % of total Issued      |
| Instructions Committed           | 1103   | 100 % of total Fetched        |
| Committed Integers               | 308    | 27.923 % of total Committed   |
| Committed Floating Points        | 795    | 72.076 % of total Committed   |
| Writes to Registers              | 712    | 64.551 % of total Committed   |
| Useless Writes                   | 0      | 0 % of total Writes           |
| Fetch rate                       | 0.5879 | Instructions / Cycle          |
| Decode rate                      | 0.5879 | Instructions / Cycle          |
| Issue rate                       | 0.5879 | Instructions / Cycle          |
| Commit rate                      | 0.5879 | Instructions / Cycle          |
| Loads Blocked by Stores          | 0      | 0 % of Total Loads            |
| Number of branches               | 391    |                               |
| Branches Taken                   | 108    | 27.621 %                      |
| Branches Untaken                 | 283    | 72.378 %                      |
| Fetch Stalls                     | 1177   | 62.739 % of Total Cycle Count |
| Decode Stalls                    | 1177   | 62.739 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0      | 0 % of Total Stalls           |

#### 2.1.A.2: With branch predictor

| General Information              | Value  | Relative measure              |
|----------------------------------|--------|-------------------------------|
| Number of Cycles                 | 621    |                               |
| Instructions Fetched             | 1195   |                               |
| Instructions Decoded             | 1173   | 98.158 % of total Fetched     |
| Instructions Issued              | 1144   |                               |
| Issued Integers                  | 338    | 28.284 % of total Issued      |
| Issued Floating Points           | 806    | 67.447 % of total Issued      |
| Instructions Committed           | 1103   | 92.301 % of total Fetched     |
| Committed Integers               | 308    |                               |
| Committed Floating Points        | 795    |                               |
| Writes to Registers              | 712    | 64.551 % of total Committed   |
| Useless Writes                   | 0      | 0 % of total Writes           |
| Fetch rate                       | 1.9243 | Instructions / Cycle          |
| Decode rate                      | 1.8888 | Instructions / Cycle          |
| Issue rate                       | 1.8421 | Instructions / Cycle          |
| Commit rate                      | 1.7761 | Instructions / Cycle          |
| Loads Blocked by Stores          | 0      | 0 % of Total Loads            |
| Number of branches               | 110    |                               |
| Branches Taken                   | 110    | 100 %                         |
| Branches Untaken                 | 0      | 0 %                           |
| Fetch Stalls                     | 22     | 3.5426 % of Total Cycle Count |
| Decode Stalls                    | 33     | 5.3140 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0      | 0 % of Total Stalls           |

# 2.1.B: Fetch, decode and retirement (4, 4, 4)

# 2.1.B.1: Without branch predictor

Number of cycles: **1865**

| General Information              | Value  | Relative measure              |
|----------------------------------|--------|-------------------------------|
| Number of Cycles                 | 1865   |                               |
| Instructions Fetched             | 1103   |                               |
| Instructions Decoded             | 1103   | 100 % of total Fetched        |
| Instructions Issued              | 1103   | 100 % of total Fetched        |
| Issued Integers                  | 308    | 27.923 % of total Issued      |
| Issued Floating Points           | 795    | 72.076 % of total Issued      |
| Instructions Committed           | 1103   | 100 % of total Fetched        |
| Committed Integers               | 308    | 27.923 % of total Committed   |
| Committed Floating Points        | 795    | 72.076 % of total Committed   |
| Writes to Registers              | 712    | 64.551 % of total Committed   |
| Useless Writes                   | 0      | 0 % of total Writes           |
| Fetch rate                       | 0.5914 | Instructions / Cycle          |
| Decode rate                      | 0.5914 | Instructions / Cycle          |
| Issue rate                       | 0.5914 | Instructions / Cycle          |
| Commit rate                      | 0.5914 | Instructions / Cycle          |
| Loads Blocked by Stores          | 0      | 0 % of Total Loads            |
| Number of branches               | 391    |                               |
| Branches Taken                   | 108    | 27.621 %                      |
| Branches Untaken                 | 283    | 72.378 %                      |
| Fetch Stalls                     | 1374   | 73.672 % of Total Cycle Count |
| Decode Stalls                    | 1374   | 73.672 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0      | 0 % of Total Stalls           |

# 2.1.B.2: With branch predictor

| General Information              | Value  | Relative measure              |
|----------------------------------|--------|-------------------------------|
| Number of Cycles                 | 474    |                               |
| Instructions Fetched             | 1441   |                               |
| Instructions Decoded             | 1284   | 89.104 % of total Fetched     |
| Instructions Issued              | 1187   | 82.373 % of total Fetched     |
| Issued Integers                  | 368    | 25.537 % of total Issued      |
| Issued Floating Points           | 819    | 56.835 % of total Issued      |
| Instructions Committed           | 1103   | 76.544 % of total Fetched     |
| Committed Integers               | 308    | 21.374 % of total Committed   |
| Committed Floating Points        | 795    | 55.170 % of total Committed   |
| Writes to Registers              | 712    | 64.551 % of total Committed   |
| Useless Writes                   | 0      | 0 % of total Writes           |
| Fetch rate                       | 3.0400 | Instructions / Cycle          |
| Decode rate                      | 2.7088 | Instructions / Cycle          |
| Issue rate                       | 2.5042 | Instructions / Cycle          |
| Commit rate                      | 2.3270 | Instructions / Cycle          |
| Loads Blocked by Stores          | 0      | 0 % of Total Loads            |
| Number of branches               | 112    |                               |
| Branches Taken                   | 112    | 100 %                         |
| Branches Untaken                 | 0      | 0 %                           |
| Fetch Stalls                     | 25     | 5.2742 % of Total Cycle Count |
| Decode Stalls                    | 36     | 7.5949 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0      | 0 % of Total Stalls           |

#### 2.1.C: Fetch, decode and retirement (6, 6, 6)

#### 2.1.C.1: Without branch predictor

Number of cycles: 1865



#### 2.1.C.2: With branch predictor

| General Information              | Value  | Relative measure              |
|----------------------------------|--------|-------------------------------|
| Number of Cycles                 | 466    |                               |
| Instructions Fetched             | 1472   |                               |
| Instructions Decoded             | 1296   | 88.043 % of total Fetched     |
| Instructions Issued              | 1194   | 81.114 % of total Fetched     |
| Issued Integers                  | 374    | 25.407 % of total Issued      |
| Issued Floating Points           | 820    | 55.706 % of total Issued      |
| Instructions Committed           | 1103   | 74.932 % of total Fetched     |
| Committed Integers               | 308    | 20.923 % of total Committed   |
| Committed Floating Points        | 795    |                               |
| Writes to Registers              | 712    | 64.551 % of total Committed   |
| Useless Writes                   | 0      | 0 % of total Writes           |
| Fetch rate                       | 3.1587 | Instructions / Cycle          |
| Decode rate                      |        | Instructions / Cycle          |
| Issue rate                       | 2.5622 | Instructions / Cycle          |
| Commit rate                      | 2.3669 | Instructions / Cycle          |
| Loads Blocked by Stores          | 0      | 0 % of Total Loads            |
| Number of branches               | 114    |                               |
| Branches Taken                   | 114    | 100 %                         |
| Branches Untaken                 | 0      | 0 %                           |
| Fetch Stalls                     | 25     |                               |
| Decode Stalls                    | 46     | 9.8712 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0      | 0 % of Total Stalls           |

# 2.1.D: Fetch, decode and retirement (3, 5, 5)

# 2.1.D.1: Without branch predictor

Number of cycles: **1865**

| General Information              | Value  | Relative measure              |
|----------------------------------|--------|-------------------------------|
| Number of Cycles                 | 1865   |                               |
| Instructions Fetched             | 1103   |                               |
| Instructions Decoded             | 1103   | 100 % of total Fetched        |
| Instructions Issued              | 1103   | 100 % of total Fetched        |
| Issued Integers                  | 308    | 27.923 % of total Issued      |
| Issued Floating Points           | 795    | 72.076 % of total Issued      |
| Instructions Committed           | 1103   | 100 % of total Fetched        |
| Committed Integers               | 308    | 27.923 % of total Committed   |
| Committed Floating Points        | 795    | 72.076 % of total Committed   |
| Writes to Registers              | 712    | 64.551 % of total Committed   |
| Useless Writes                   | 0      | 0 % of total Writes           |
| Fetch rate                       | 0.5914 | Instructions / Cycle          |
| Decode rate                      | 0.5914 | Instructions / Cycle          |
| Issue rate                       | 0.5914 | Instructions / Cycle          |
| Commit rate                      | 0.5914 | Instructions / Cycle          |
| Loads Blocked by Stores          | 0      | 0 % of Total Loads            |
| Number of branches               | 391    |                               |
| Branches Taken                   | 108    | 27.621 %                      |
| Branches Untaken                 | 283    | 72.378 %                      |
| Fetch Stalls                     | 1275   | 68.364 % of Total Cycle Count |
| Decode Stalls                    | 1275   | 68.364 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0      | 0 % of Total Stalls           |

# 2.1.D.2: With branch predictor

| General Information              | Value  | Relative measure              |
|----------------------------------|--------|-------------------------------|
| Number of Cycles                 | 474    |                               |
| Instructions Fetched             | 1330   |                               |
| Instructions Decoded             | 1258   | 94.586 % of total Fetched     |
| Instructions Issued              | 1176   | 88.421 % of total Fetched     |
| Issued Integers                  | 358    | 26.917 % of total Issued      |
| Issued Floating Points           | 818    | 61.503 % of total Issued      |
| Instructions Committed           | 1103   | 82.932 % of total Fetched     |
| Committed Integers               | 308    | 23.157 % of total Committed   |
| Committed Floating Points        | 795    | 59.774 % of total Committed   |
| Writes to Registers              | 712    | 64.551 % of total Committed   |
| Useless Writes                   | 0      | 0 % of total Writes           |
| Fetch rate                       | 2.8059 | Instructions / Cycle          |
| Decode rate                      | 2.6540 | Instructions / Cycle          |
| Issue rate                       | 2.4810 | Instructions / Cycle          |
| Commit rate                      | 2.3270 | Instructions / Cycle          |
| Loads Blocked by Stores          | 0      | 0 % of Total Loads            |
| Number of branches               | 111    |                               |
| Branches Taken                   | 111    | 100 %                         |
| Branches Untaken                 | 0      | 0 %                           |
| Fetch Stalls                     | 25     | 5.2742 % of Total Cycle Count |
| Decode Stalls                    | 36     | 7.5949 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0      | 0 % of Total Stalls           |

# 2.1.E: Fetch, decode, and commit (1, 4, 6)

# 2.1.E.1: Without branch predictor

Number of cycles: 2085

| Metric | Value | Detail |
|---|---|---|
| Number of Cycles | 2085 | |
| Instructions Fetched | 1103 | |
| Instructions Decoded | 1103 | 100 % of total Fetched |
| Instructions Issued | 1103 | 100 % of total Fetched |
| Integers (issued) | 308 | 27,923 % of total Issued |
| Floating Points (issued) | 795 | 72,076 % of total Issued |
| Instructions Committed | 1103 | 100 % of total Fetched |
| Integers (committed) | 308 | 27,923 % of total Committed |
| Floating Points (committed) | 795 | 72,076 % of total Committed |
| Writes to Registers | 712 | 64,551 % of total Committed |
| Useless Writes | 0 | 0 % of total Writes |
| Per Cycle Rate: Fetch | 0,5290 | Instructions / Cycle |
| Per Cycle Rate: Decode | 0,5290 | Instructions / Cycle |
| Per Cycle Rate: Issue | 0,5290 | Instructions / Cycle |
| Per Cycle Rate: Commit | 0,5290 | Instructions / Cycle |
| Loads Blocked by Stores | 0 | 0 % of Total Loads |
| Number of branches | 391 | |
| Taken | 108 | 27,621 % |
| Untaken | 283 | 72,378 % |
| Fetch Stalls | 982 | % of Total Cycle Count |
| Decode Stalls | 982 | 47,09 % of Total Cycle Count |
| Fetch stalls due to full buffers | | % of Total Stalls |

# 2.1.E.2: With branch predictor

| Metric | Value | Detail |
|---|---|---|
| Number of Cycles | 1148 | |
| Instructions Fetched | 1129 | |
| Instructions Decoded | 1118 | 99,025 % of total Fetched |
| Instructions Issued | 1107 | 98,051 % of total Fetched |
| Integers (issued) | 311 | 27,546 % of total Issued |
| Floating Points (issued) | 796 | 70,504 % of total Issued |
| Instructions Committed | 1103 | 97,697 % of total Fetched |
| Integers (committed) | 308 | 27,280 % of total Committed |
| Floating Points (committed) | 795 | 70,416 % of total Committed |
| Writes to Registers | 712 | 64,551 % of total Committed |
| Useless Writes | 0 | 0 % of total Writes |
| Per Cycle Rate: Fetch | 0,9834 | Instructions / Cycle |
| Per Cycle Rate: Decode | 0,9738 | Instructions / Cycle |
| Per Cycle Rate: Issue | 0,9642 | Instructions / Cycle |
| Per Cycle Rate: Commit | 0,9608 | Instructions / Cycle |
| Loads Blocked by Stores | 0 | 0 % of Total Loads |
| Number of branches | 108 | |
| Taken | 108 | 100 % |
| Untaken | 0 | |
| Fetch Stalls | 19 | 1,6550 % of Total Cycle Count |
| Decode Stalls | 30 | 2,613 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0 | % of Total Stalls |

#### 2.2 Chart illustrating the behavior



#### 2.3 Description of the results for A, B, and C

- **A (2, 2, 2):** With the fetch, decode, and commit widths close to the minimum, the program runs inefficiently. This configuration defeats much of the purpose of a superscalar processor, which is to process several instructions at once.
- **B (4, 4, 4):** The unit capacities are now more reasonable and also balanced, so performance improves.
- **C (6, 6, 6):** Although the capacities increase considerably while keeping the same proportion, performance is very close to that of option B. This suggests that the extra units sit idle behind a bottleneck, wasting resources.

#### 2.4 Description of the results for D and E

- **D (3, 5, 5):** The capacities are now somewhat unbalanced: 3 fetch units, close to the 4 of option B, and 5 decode and commit units, close to the 6 of option C. Performance is exactly the same as in option B, since option C already showed that the additional resources stay idle.
- **E (1, 4, 6):** The capacities are completely unbalanced: 1 fetch unit (below the 2 of option A), 4 decode units (as in option B), and 6 commit units (as in option C). Performance drops sharply, back to the same values as option A. This leads me to conclude that the capacity of the fetch unit is essential to avoid bottlenecks and keep the processor efficient.

#### 2.5 Influence of the branch predictor on the results

Given the nature and purpose of the program, it executes many conditional and unconditional branches, so the branch predictor proved decisive in improving performance in all five options. Without it, options B, C, and D perform identically. With the predictor enabled, and the cycle count therefore lower, option C turns out to be slightly more efficient than B and D, although not significantly so.
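The effect of the predictor can be quantified from the figures in section 2.1. The following is a minimal sketch, assuming the 1103 committed instructions and the cycle counts reported in the tables for options D and E; the helper names `ipc` and `speedup` are illustrative and are not part of SuperDLX.

```python
# Committed instructions, as reported in every table of section 2.1.
COMMITTED = 1103

# Cycle counts taken from the tables for options D and E.
cycles = {
    "D with predictor": 474,
    "E without predictor": 2085,
    "E with predictor": 1148,
}


def ipc(committed: int, n_cycles: int) -> float:
    """Committed instructions per cycle (the 'Commit' per-cycle rate)."""
    return committed / n_cycles


def speedup(cycles_before: int, cycles_after: int) -> float:
    """Ratio of cycle counts: how many times faster the second run is."""
    return cycles_before / cycles_after


for name, n in cycles.items():
    print(f"{name}: {ipc(COMMITTED, n):.4f} instructions/cycle")

e_speedup = speedup(cycles["E without predictor"], cycles["E with predictor"])
print(f"Speedup of E when the predictor is enabled: {e_speedup:.2f}x")
```

For option E this gives about 0.53 and 0.96 committed instructions per cycle, in line with the commit rates in the tables, which corresponds to a speedup of roughly 1.8x.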

# 3. Exercise 3

# 3.1: Number of cycles for A, B, and C, with and without branch predictor

#### 3.1.A: Instruction queue, instruction window, and reorder buffer (8, 8, 8)

#### 3.1.A.1: Without branch predictor

Number of cycles: 1865

| Metric | Value | Detail |
|---|---|---|
| Number of Cycles | 1865 | |
| Instructions Fetched | 1103 | |
| Instructions Decoded | 1103 | 100 % of total Fetched |
| Instructions Issued | 1103 | 100 % of total Fetched |
| Integers (issued) | 308 | 27,923 % of total Issued |
| Floating Points (issued) | 795 | 72,076 % of total Issued |
| Instructions Committed | 1103 | 100 % of total Fetched |
| Integers (committed) | 308 | % of total Committed |
| Floating Points (committed) | 795 | 72,076 % of total Committed |
| Writes to Registers | | % of total Committed |
| Useless Writes | 0 | 0 % of total Writes |
| Per Cycle Rate: Fetch | 0,5914 | Instructions / Cycle |
| Per Cycle Rate: Decode | 0,5914 | Instructions / Cycle |
| Per Cycle Rate: Issue | 0,5914 | Instructions / Cycle |
| Per Cycle Rate: Commit | 0,5914 | Instructions / Cycle |
| Loads Blocked by Stores | 0 | % of Total Loads |
| Number of branches | 391 | |
| Taken | 108 | 27,621 % |
| Untaken | 283 | |
| Fetch Stalls | 1374 | 73,672 % of Total Cycle Count |
| Decode Stalls | | |
| Fetch stalls due to full buffers | 0 | % of Total Stalls |

#### 3.1.A.2: With branch predictor

| Metric | Value | Detail |
|---|---|---|
| Number of Cycles | 644 | |
| Instructions Fetched | 1244 | |
| Instructions Decoded | 1164 | 93,569 % of total Fetched |
| Instructions Issued | 1141 | 91,720 % of total Fetched |
| Integers (issued) | 334 | 26,848 % of total Issued |
| Floating Points (issued) | 807 | 64,871 % of total Issued |
| Instructions Committed | 1103 | 88,665 % of total Fetched |
| Integers (committed) | 308 | 24,758 % of total Committed |
| Floating Points (committed) | 795 | 63,906 % of total Committed |
| Writes to Registers | | |
| Useless Writes | | |
| Per Cycle Rate: Fetch | 1,9316 | Instructions / Cycle |
| Per Cycle Rate: Decode | 1,8074 | Instructions / Cycle |
| Per Cycle Rate: Issue | 1,7717 | Instructions / Cycle |
| Per Cycle Rate: Commit | 1,7127 | Instructions / Cycle |
| Loads Blocked by Stores | 0 | 0 % of Total Loads |
| Number of branches | 110 | |
| Taken | 110 | |
| Untaken | 0 | |
| Fetch Stalls | 130 | 20,186 % of Total Cycle Count |
| Decode Stalls | 148 | 22,981 % of Total Cycle Count |
| Fetch stalls due to full buffers | 105 | 80,769 % of Total Stalls |

# 3.1.B: Instruction queue, instruction window, and reorder buffer (15, 15, 15)

# 3.1.B.1: Without branch predictor

Number of cycles: 1865

| Metric | Value | Detail |
|---|---|---|
| Number of Cycles | 1865 | |
| Instructions Fetched | 1103 | |
| Instructions Decoded | 1103 | 100 % of total Fetched |
| Instructions Issued | 1103 | 100 % of total Fetched |
| Integers (issued) | 308 | 27,923 % of total Issued |
| Floating Points (issued) | 795 | 72,076 % of total Issued |
| Instructions Committed | 1103 | 100 % of total Fetched |
| Integers (committed) | 308 | 27,923 % of total Committed |
| Floating Points (committed) | 795 | 72,076 % of total Committed |
| Writes to Registers | 712 | 64,551 % of total Committed |
| Useless Writes | 0 | 0 % of total Writes |
| Per Cycle Rate: Fetch | 0,5914 | Instructions / Cycle |
| Per Cycle Rate: Decode | 0,5914 | Instructions / Cycle |
| Per Cycle Rate: Issue | 0,5914 | Instructions / Cycle |
| Per Cycle Rate: Commit | 0,5914 | Instructions / Cycle |
| Loads Blocked by Stores | 0 | % of Total Loads |
| Number of branches | 391 | |
| Taken | 108 | 27,621 % |
| Untaken | 283 | 72,378 % |
| Fetch Stalls | 1374 | 73,672 % of Total Cycle Count |
| Decode Stalls | 1374 | 73,672 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0 | % of Total Stalls |

#### 3.1.B.2: With branch predictor

| Metric | Value | Detail |
|---|---|---|
| Number of Cycles | 466 | |
| Instructions Fetched | 1386 | |
| Instructions Decoded | 1227 | 88,528 % of total Fetched |
| Instructions Issued | 1164 | 83,982 % of total Fetched |
| Integers (issued) | 345 | 24,891 % of total Issued |
| Floating Points (issued) | 819 | 59,090 % of total Issued |
| Instructions Committed | 1103 | 79,581 % of total Fetched |
| Integers (committed) | 308 | 22,222 % of total Committed |
| Floating Points (committed) | 795 | 57,359 % of total Committed |
| Writes to Registers | 712 | 64,551 % of total Committed |
| Useless Writes | 0 | 0 % of total Writes |
| Per Cycle Rate: Fetch | 2,9742 | Instructions / Cycle |
| Per Cycle Rate: Decode | 2,6330 | Instructions / Cycle |
| Per Cycle Rate: Issue | 2,4978 | Instructions / Cycle |
| Per Cycle Rate: Commit | 2,3669 | Instructions / Cycle |
| Loads Blocked by Stores | 0 | 0 % of Total Loads |
| Number of branches | 111 | |
| Taken | 111 | 100 % |
| Untaken | 0 | 0 % |
| Fetch Stalls | 23 | 4,9356 % of Total Cycle Count |
| Decode Stalls | 44 | 9,442 % of Total Cycle Count |
| Fetch stalls due to full buffers | 0 | 0 % of Total Stalls |

# 3.1.C: Instruction queue, instruction window, and reorder buffer (25, 25, 25)

# 3.1.C.1: Without branch predictor

Number of cycles: 1865

| Metric | Value | Detail |
|---|---|---|
| Number of Cycles | 1865 | |
| Instructions Fetched | 1103 | |
| Instructions Decoded | 1103 | 100 % of total Fetched |
| Instructions Issued | 1103 | 100 % of total Fetched |
| Integers (issued) | 308 | 27,923 % of total Issued |
| Floating Points (issued) | 795 | 72,076 % of total Issued |
| Instructions Committed | 1103 | 100 % of total Fetched |
| Integers (committed) | 308 | 27,923 % of total Committed |
| Floating Points (committed) | 795 | |
| Writes to Registers | | |
| Useless Writes | 0 | 0 |
| Per Cycle Rate: Fetch | 0,5914 | Instructions / Cycle |
| Per Cycle Rate: Decode | 0,5914 | Instructions / Cycle |
| Per Cycle Rate: Issue | 0,5914 | Instructions / Cycle |
| Per Cycle Rate: Commit | 0,5914 | Instructions / Cycle |
| Loads Blocked by Stores | 0 | |
| Number of branches | 391 | |
| Taken | 108 | 27,621 % |
| Untaken | 283 | |
| Fetch Stalls | 1374 | |
| Decode Stalls | 1373 | |
| Fetch stalls due to full buffers | | % of Total Stalls |

# 3.1.C.2: With branch predictor

| Metric | Value | Detail |
|---|---|---|
| Number of Cycles | 642 | |
| Instructions Fetched | 1435 | |
| Instructions Decoded | 1181 | 82,299 % of total Fetched |
| Instructions Issued | 1143 | 79,651 % of total Fetched |
| Integers (issued) | 338 | 23,554 % of total Issued |
| Floating Points (issued) | 805 | 56,097 % of total Issued |
| Instructions Committed | 1103 | 76,864 % of total Fetched |
| Integers (committed) | 308 | 21,463 % of total Committed |
| Floating Points (committed) | 795 | 55,400 % of total Committed |
| Writes to Registers | 712 | 64,551 % of total Committed |
| Useless Writes | 0 | 0 % of total Writes |
| Per Cycle Rate: Fetch | 2,2352 | Instructions / Cycle |
| Per Cycle Rate: Decode | | Instructions / Cycle |
| Per Cycle Rate: Issue | 1,7803 | Instructions / Cycle |
| Per Cycle Rate: Commit | 1,7180 | Instructions / Cycle |
| Loads Blocked by Stores | 0 | 0 % of Total Loads |
| Number of branches | 110 | |
| Taken | 110 | 100 % |
| Untaken | 0 | |
| Fetch Stalls | 112 | 17,445 % of Total Cycle Count |
| Decode Stalls | 131 | 20,404 % of Total Cycle Count |
| Fetch stalls due to full buffers | 87 | 77,678 % of Total Stalls |

#### 3.2: Chart illustrating the behavior



#### 3.3: Description of the results for the three options

- **A (8, 8, 8):** Although performance is very similar across the three options, this one is slightly less efficient because the capacity of the units is low enough to be insufficient at times.
- **B (15, 15, 15):** The capacity of each unit is already very close to that of the original configuration, and so is the performance.
- **C (25, 25, 25):** The capacity of the units is much larger than in the original configuration, yet performance is exactly the same as in that configuration. This leads to the conclusion that resources are wasted because the units stay idle.

#### 3.4: Influence of the branch predictor on the results

As in exercise 2, the branch predictor is decisive for the efficiency of the program. The slight performance differences mentioned in section 3.3 only show up when the predictor is enabled. When it is disabled, all three configurations take 1865 cycles, so the proposed configurations make no visible difference.
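The gain from the predictor in exercise 3 can be computed from the cycle counts in section 3.1. This is a minimal sketch under stated assumptions: the 1865-cycle figure without predictor is the one reported for all three options, the 466 and 642 figures are read from the tables for B and C, and the 644-cycle figure for A is inferred from its 130 fetch stalls being reported as 20,186 % of the cycle count. The `speedup` helper is illustrative only.

```python
# Cycle count without branch predictor, identical for options A, B and C.
CYCLES_WITHOUT_PREDICTOR = 1865

# Cycle counts with the predictor enabled.
cycles_with_predictor = {
    "A (8, 8, 8)": 644,      # assumption: inferred as 130 stalls / 0.20186
    "B (15, 15, 15)": 466,
    "C (25, 25, 25)": 642,
}


def speedup(cycles_before: int, cycles_after: int) -> float:
    """Ratio of cycle counts: how many times faster the second run is."""
    return cycles_before / cycles_after


speedups = {
    name: speedup(CYCLES_WITHOUT_PREDICTOR, n)
    for name, n in cycles_with_predictor.items()
}

for name, s in speedups.items():
    print(f"{name}: {s:.2f}x faster with the predictor")
```

With these inputs the speedups come out at about 2.9x for A, 4.0x for B, and 2.9x for C.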

# 4. The best SuperDLX processor in terms of quality/price

The goal is always the lowest possible number of clock cycles for our program, which in this case, as observed above, is 466 clock cycles in total.

| Component                          | CAPACITY        | COST |
|------------------------------------|-----------------|------|
| Fetch                              | 5               | 10w  |
| Decode                             | 5               | 10w  |
| Commit                             | 5               | 10w  |
| Integer reorder buffer             | 12              | 6w   |
| Floating-point reorder buffer      | 16              | 8w   |
| Integer instruction window         | 6               | 6w   |
| Floating-point instruction window  | 8               | 8w   |
| Instruction queue                  | 6               | 6w   |
| Data buffer size                   | 2 load, 2 store | 1w   |
| Branch predictor                   | Enabled         | 35w  |

Total: 100w -> 466 cycles
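The total can be checked by adding up the cost column. The sketch below assumes the costs listed in the table; the dictionary keys are illustrative labels, and treating 100w as the reference total is an assumption taken from the "Total" line, not a documented SuperDLX limit.

```python
# Cost in "w" of each component, copied from the configuration table.
costs_w = {
    "fetch": 10,
    "decode": 10,
    "commit": 10,
    "integer reorder buffer": 6,
    "floating-point reorder buffer": 8,
    "integer instruction window": 6,
    "floating-point instruction window": 8,
    "instruction queue": 6,
    "data buffer": 1,
    "branch predictor": 35,
}

REFERENCE_TOTAL_W = 100  # assumption: the total stated below the table

total_w = sum(costs_w.values())
predictor_share = costs_w["branch predictor"] / total_w

print(f"Total cost: {total_w}w")
print(f"Branch predictor share of the cost: {predictor_share:.0%}")
```

The sum matches the stated 100w, and the branch predictor alone accounts for 35 % of it.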

# 4.1: Number of cycles

| Metric | Value | Detail |
|---|---|---|
| Number of Cycles | 466 | |
| Instructions Fetched | 1297 | |
| Instructions Decoded | 1232 | 94,988 % of total Fetched |
| Instructions Issued | 1162 | 89,591 % of total Fetched |
| Integers (issued) | 344 | 26,522 % of total Issued |
| Floating Points (issued) | 818 | % of total Issued |
| Instructions Committed | 1103 | 85,042 % of total Fetched |
| Integers (committed) | 308 | % of total Committed |
| Floating Points (committed) | 795 | 61,295 % of total Committed |
| Writes to Registers | 712 | 64,551 % of total Committed |
| Useless Writes | 0 | 0 % of total Writes |
| Per Cycle Rate: Fetch | 2,7832 | Instructions / Cycle |
| Per Cycle Rate: Decode | | Instructions / Cycle |
| Per Cycle Rate: Issue | 2,4935 | Instructions / Cycle |
| Per Cycle Rate: Commit | 2,3669 | Instructions / Cycle |
| Loads Blocked by Stores | 0 | % of Total Loads |
| Number of branches | 111 | |
| Taken | | 100 % |
| Fetch Stalls | 27 | 5,7939 % of Total Cycle Count |
| Decode Stalls | 37 | 7,938 % of Total Cycle Count |
| Fetch stalls due to full buffers | 3 | 11,111 % of Total Stalls |

#### 4.2: Why this configuration

To decide which configuration to choose, we compare both charts and pick, for each one, the case with the best quality/price ratio, that is, the one that delivers the highest performance at the lowest possible cost.

For the fetch, decode, and commit stages I chose a balanced option (5, 5, 5). The original configuration that SuperDLX provides wastes resources, whereas this configuration has less capacity but uses it more efficiently. To that end, I assigned an instruction queue of 6 entries, a data buffer of 4 entries (2 load, 2 store), a reorder buffer of 12 entries for integers and 16 for floating point, and an instruction window of 6 entries for integers and 8 for floating point.

The branch predictor must be included because it is a decisive factor for the efficiency of the program, drastically reducing the number of cycles.