### 311651052 吳挺宇, Institute of Electrophysics, second-year M.S. student

3x3 convolution kernel without clock gating.

The Verilog code for the design without clock gating was written according to the figure below and the test pattern provided by the teaching assistant.



### 1. RTL Verification

### 2. Synthesis

#### Timing:

```
add_0_root_add_0_root_add_184_8/SUM[18] (Convolution_DW01_add_0)      0.00     968.11 f
U266/Y (NAND2xp5_ASAP7_75t_R)                                        10.61     978.73 r
Out_OFM_reg[18]/D (ASYNC_DFFHx1_ASAP7_75t_R)                          0.00     978.73 r
data arrival time                                                              978.73

clock clk (rise edge)                                              1000.00    1000.00
clock network delay (ideal)                                           0.00    1000.00
Out_OFM_reg[18]/CLK (ASYNC_DFFHx1_ASAP7_75t_R)                        0.00    1000.00 r
library setup time                                                  -21.20     978.80
data required time                                                             978.80
-------------------------------------------------------------------------------------
data required time                                                             978.80
data arrival time                                                             -978.73
-------------------------------------------------------------------------------------
slack (MET)                                                                      0.08
```
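The slack in the report can be recomputed from its own entries. The short Python sketch below does that arithmetic with the values printed above; the variable names are illustrative only. Because the report prints each value rounded to two decimals, a recomputation from the printed numbers can differ from the reported slack by about 0.01, which is presumably why it gives 0.07 rather than 0.08.

```python
# Values copied from the timing report above (same time unit as the report).
clock_period = 1000.00   # clock clk (rise edge)
clock_network = 0.00     # clock network delay (ideal)
library_setup = 21.20    # library setup time
data_arrival = 978.73    # data arrival time at Out_OFM_reg[18]/D

# Setup check: data must arrive before (clock edge - setup time).
data_required = clock_period + clock_network - library_setup
slack = data_required - data_arrival

print(f"data required time = {data_required:.2f}")
print(f"slack              = {slack:.2f}")
```

A positive slack means the setup constraint is met at this clock period, which matches the `slack (MET)` line.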

#### Area:

```
Number of ports:                          830
Number of nets:                          4145
Number of cells:                         2970
Number of combinational cells:           2784
Number of sequential cells:               171
Number of macros/black boxes:               0
Number of buf/inv:                        551
Number of references:                      31

Combinational area:               3971.825261
Buf/Inv area:                      386.778245
Noncombinational area:            1037.162873
Macro/Black Box area:                0.000000
Net Interconnect area:      undefined  (No wire load specified)

Total cell area:                  5008.988134
Total area:                 undefined
```

### 3. Gate Level Simulation



#### Power measurement:

```
                        Internal   Switching  Leakage    Total
Power Group             Power      Power      Power      Power    (     %)  Attrs
clock_network           0.0000     0.0000     0.0000     0.0000   ( 0.00%)
register                2.206e-04  2.461e-06  4.946e-08  2.231e-04 (81.22%)
combinational           2.205e-05  2.931e-05  2.284e-07  5.159e-05 (18.78%)
sequential              0.0000     0.0000     0.0000     0.0000   ( 0.00%)
memory                  0.0000     0.0000     0.0000     0.0000   ( 0.00%)
io_pad                  0.0000     0.0000     0.0000     0.0000   ( 0.00%)
black_box               0.0000     0.0000     0.0000     0.0000   ( 0.00%)

  Net Switching Power  = 3.177e-05   (11.57%)
  Cell Internal Power  = 2.426e-04   (88.33%)
  Cell Leakage Power   = 2.779e-07   ( 0.10%)
    Intrinsic Leakage  = 2.779e-07
    Gate Leakage       =    0.0000
Total Power            = 2.747e-04  (100.00%)

X Transition Power     = 9.712e-07
Glitching Power        = 7.453e-07

Peak Power             = 8.693e-04
Peak Time              =      1168
```

### Waveform:

3x3 convolution kernel with clock gating:

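When In\_IFM\_[n] equals 0 and in\_valid = 1, clk\_gate\_[n] is used to control Enable\_[n]. In the same way, weight\_valid = 1 is used to control clk\_gate\_Weight. The corresponding waveforms are shown in the figures that follow.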
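To illustrate why gating on zero inputs reduces clocking activity, the Python sketch below counts how many clock edges reach one input register with and without gating. It is a behavioural toy model, not the Verilog design: the rule "block the edge when in\_valid = 1 and the input sample is 0" is an assumed reading of the description above, and `count_clock_events` and the sample stream are hypothetical.

```python
def count_clock_events(samples, in_valid, gated):
    """Count the clock edges delivered to one input register over a stream."""
    events = 0
    for value, valid in zip(samples, in_valid):
        # Assumed gating rule: block the edge when the input is valid and zero.
        blocked = gated and valid == 1 and value == 0
        if not blocked:
            events += 1
    return events


# Hypothetical input stream for a single In_IFM register, one sample per cycle.
samples = [0, 3, 0, 0, 7, 0, 1, 0]
in_valid = [1, 1, 1, 1, 1, 1, 1, 1]

ungated_events = count_clock_events(samples, in_valid, gated=False)
gated_events = count_clock_events(samples, in_valid, gated=True)

print("clock edges without gating:", ungated_events)
print("clock edges with gating:   ", gated_events)
```

In this stream five of the eight samples are zero, so the gated register sees three clock edges instead of eight. The more zeros the input feature map contains, the fewer edges reach the register under this assumed rule.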

#### In\_IFM\_[n]:



#### weight\_valid:



#### Values read from the input:



### 1. RTL Verification


### 2. Synthesis

#### Timing:

```
...                                                                            911.71 f
add_0_root_add_0_root_add_458_8/U1_17/CON (FAx1_ASAP7_75t_R)                   932.01 r
add_0_root_add_0_root_add_458_8/U7/Y (INVx1_ASAP7_75t_R)             15.56     947.57 f
add_0_root_add_0_root_add_458_8/U4/Y (XOR2xp5_ASAP7_75t_R)           19.97     967.55 f
add_0_root_add_0_root_add_458_8/SUM[18] (Convolution_DW01_add_0)      0.00     967.55 f
U5/Y (NAND2xp5_ASAP7_75t_R)                                          11.33     978.87 r
Out_OFM_reg[18]/D (ASYNC_DFFHx1_ASAP7_75t_R)                          0.00     978.87 r
data arrival time                                                              978.87

clock clk (rise edge)                                              1000.00    1000.00
clock network delay (ideal)                                           0.00    1000.00
Out_OFM_reg[18]/CLK (ASYNC_DFFHx1_ASAP7_75t_R)                        0.00    1000.00 r
library setup time                                                  -20.96     979.04
data required time                                                             979.04
-------------------------------------------------------------------------------------
data required time                                                             979.04
data arrival time                                                             -978.87
-------------------------------------------------------------------------------------
slack (MET)                                                                      0.16
```

#### Area:

```
Number of ports:                          830
Number of nets:                          4383
Number of cells:                         3208
Number of combinational cells:           3003
Number of sequential cells:               190
Number of macros/black boxes:               0
Number of buf/inv:                        597
Number of references:                      34

Combinational area:               4259.926064
Buf/Inv area:                      419.437445
Noncombinational area:            1138.406391
Macro/Black Box area:                0.000000
Net Interconnect area:      undefined  (No wire load specified)

Total cell area:                  5398.332455
Total area:                 undefined
```

### 3. Gate Level Simulation

#### Power measurement:

```
Attributes
    Including register clock pin internal power
    User defined power group

                        Internal   Switching  Leakage    Total
Power Group             Power      Power      Power      Power    (     %)  Attrs
clock_network           4.265e-06  1.222e-06  4.081e-09  5.490e-06 ( 4.15%)
register                6.001e-05  3.434e-06  5.473e-08  6.350e-05 (47.97%)
combinational           2.821e-05  3.491e-05  2.690e-07  6.338e-05 (47.88%)
sequential              0.0000     0.0000     0.0000     0.0000   ( 0.00%)
memory                  0.0000     0.0000     0.0000     0.0000   ( 0.00%)
io_pad                  0.0000     0.0000     0.0000     0.0000   ( 0.00%)
black_box               0.0000     0.0000     0.0000     0.0000   ( 0.00%)

  Net Switching Power  = 3.956e-05   (29.89%)
  Cell Internal Power  = 9.249e-05   (69.87%)
  Cell Leakage Power   = 3.278e-07   ( 0.25%)
    Intrinsic Leakage  = 3.278e-07
    Gate Leakage       =    0.0000
Total Power            = 1.324e-04  (100.00%)

X Transition Power     = 9.426e-07
Glitching Power        = 1.232e-06

Peak Power             = 8.940e-04
Peak Time              =      3048
```

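- 1. Area comparison with and without clock gating: the total cell area of the design with clock gating (5398.33) is about 7.77% larger than that of the design without clock gating (5008.99).
- 2. Comparing the two designs at a 1 ns clock period, the total power is 274.7 $\mu$W without clock gating and 132.4 $\mu$W with clock gating. Adding clock gating therefore increases the area, but because clk toggles less often, the total power decreases.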
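The two comparisons can be recomputed directly from the area and power reports. The Python sketch below uses the total cell areas, the total powers, and the `register` power-group totals from the reports in this document; the variable names are illustrative only.

```python
# Total cell area from the two area reports.
area_without = 5008.988134
area_with = 5398.332455

# Total power and register-group power (in W) from the two power reports.
power_without = 2.747e-04
power_with = 1.324e-04
register_without = 2.231e-04
register_with = 6.350e-05

area_increase_pct = (area_with - area_without) / area_without * 100
power_saving_pct = (power_without - power_with) / power_without * 100
register_saving_pct = (register_without - register_with) / register_without * 100

print(f"area increase:         {area_increase_pct:.2f}%")    # about 7.77%
print(f"total power saving:    {power_saving_pct:.1f}%")     # about 51.8%
print(f"register power saving: {register_saving_pct:.1f}%")  # about 71.5%
```

Most of the saving comes from the `register` group, which is consistent with the explanation that the gated registers are clocked less often. The `clock_network` and `combinational` groups are slightly higher in the gated design.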