[Figure: block diagram. An L1 Instruction Cache sits at the top and feeds several identical processing blocks. Each processing block contains an L0 Instruction Cache, a Warp Scheduler (32 thread/clk), a Dispatch Unit (32 thread/clk), a Register File (16,384 x 32-bit), 8 FP64 units, 16 INT units, 16 FP32 units, 2 Tensor Cores, 8 LD/ST units, and 1 SFU. Below the processing blocks are a 128KB L1 Data Cache / Shared Memory and the Tex units.]
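The per-block unit counts shown in the diagram can be scaled up to totals for the whole multiprocessor with a little arithmetic. The sketch below is illustrative only: the names `UNITS_PER_BLOCK`, `totals`, and `register_file_kib` are hypothetical, and the figure of four processing blocks is an assumption based on how many times the diagram repeats the block layout, not a number stated in the text.

```python
# Per-block unit counts as read off the diagram (one processing block).
UNITS_PER_BLOCK = {
    "FP64": 8,
    "INT": 16,
    "FP32": 16,
    "TENSOR_CORE": 2,
    "LD/ST": 8,
    "SFU": 1,
}

REGISTERS_PER_BLOCK = 16_384  # "Register File (16,384 x 32-bit)"
BYTES_PER_REGISTER = 4        # one 32-bit register is 4 bytes


def totals(num_blocks: int) -> dict[str, int]:
    """Scale the per-block unit counts to num_blocks identical blocks."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    return {name: count * num_blocks for name, count in UNITS_PER_BLOCK.items()}


def register_file_kib(num_blocks: int) -> int:
    """Total register file capacity in KiB across num_blocks blocks."""
    return num_blocks * REGISTERS_PER_BLOCK * BYTES_PER_REGISTER // 1024


if __name__ == "__main__":
    blocks = 4  # assumption: the diagram appears to repeat the block four times
    for name, count in totals(blocks).items():
        print(f"{name}: {count}")
    print(f"Register file: {register_file_kib(blocks)} KiB")
```

Under the four-block assumption, each block's register file holds 64 KiB, so the blocks together hold 256 KiB. Changing `blocks` adapts the tally to a different block count.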