Name: Pei Chi Chen

ASU ID: 1234481926

# **Block Diagram:**



Figure 1: Block Diagram

# **Design Decision:**

### **Transformation Block:**

Pipeline timing diagram: for each segment of the vector, the stages load data, multiply, partial sum, accumulate partial sum, and total overlap across successive cycles. Annotations mark the feature address increment, the weight address increment, the write_dot_prod signal, and the extra spacing caused by a nop.

The dot product between each pair of 96-element vectors is divided into three equal segments of length 32. Each segment passes through 4 stages. First, the block loads the feature data, then it performs element-wise multiplication of the feature and weight data. Next, the multiplier block sums the partial products in groups of 8 (4 groups in total). Finally, the partial sums and carries are added to the accumulated value stored in the dot\_product registers.
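The four stages can be illustrated with a small behavioural sketch. This is not the RTL; the function name `segmented_dot_product` and the constants are assumptions introduced here, and it models only the arithmetic (load, multiply, sum in groups of 8, accumulate), not the pipelining or the carry-save form of the partial sums.

```python
SEGMENT_LEN = 32  # from the text: three segments of length 32
GROUP_SIZE = 8    # from the text: partial products are summed in groups of 8


def segmented_dot_product(feature, weight):
    """Accumulate a dot product one 32-element segment at a time (hypothetical model)."""
    assert len(feature) == len(weight)
    assert len(feature) % SEGMENT_LEN == 0
    dot_product = 0  # stands in for the dot_product accumulator register
    for start in range(0, len(feature), SEGMENT_LEN):
        # Stage 1: load one segment of feature data (and the matching weights).
        f_seg = feature[start:start + SEGMENT_LEN]
        w_seg = weight[start:start + SEGMENT_LEN]
        # Stage 2: element-wise multiplication.
        products = [f * w for f, w in zip(f_seg, w_seg)]
        # Stage 3: sum each group of 8 products (4 groups per segment).
        partial_sums = [sum(products[i:i + GROUP_SIZE])
                        for i in range(0, SEGMENT_LEN, GROUP_SIZE)]
        # Stage 4: add the partial sums to the accumulated value.
        dot_product += sum(partial_sums)
    return dot_product
```

For a 96-element vector the loop body runs three times, once per segment, and the result matches a direct dot product.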



Figure 2: Transformation FSM during Transformation stage



Figure 3: Multiplier during Transformation stage



Figure 4: Address block during Transformation stage

The CSA sub-module consists of two 4-input CSAs operating in parallel. The 8 outputs, which are the sums and shifted carries, then go through another CSA sub-module.
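As a reminder of how carry-save addition keeps a sum and a shifted carry instead of propagating carries, here is a minimal sketch on non-negative integers. The helper names `csa3`, `csa4`, and `reduce8` are hypothetical, and building the 4-input CSA from two 3:2 stages is an assumption; the actual sub-module may be wired differently and may produce a different number of intermediate words.

```python
def csa3(a, b, c):
    """3:2 carry-save adder: returns (sum, shifted carry) with sum + carry == a + b + c."""
    s = a ^ b ^ c                               # bitwise sum without carry propagation
    carry = ((a & b) | (a & c) | (b & c)) << 1  # majority bits, shifted left by one
    return s, carry


def csa4(a, b, c, d):
    """4-input CSA, assumed here to be two chained 3:2 stages."""
    s1, c1 = csa3(a, b, c)
    return csa3(s1, c1, d)


def reduce8(values):
    """Reduce 8 operands: two 4-input CSAs in parallel, then one more level, then a final add."""
    assert len(values) == 8
    s_lo, c_lo = csa4(*values[:4])  # first parallel CSA
    s_hi, c_hi = csa4(*values[4:])  # second parallel CSA
    s, c = csa4(s_lo, c_lo, s_hi, c_hi)
    return s + c                    # single carry-propagate addition at the end
```

The point of the structure is that only the last step needs a full carry-propagate adder; every earlier level is a fixed-depth bitwise operation.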

### **Combination Block:**

The combination FSM has six states: Start, Read_1, Write_1, Read_2, Write_2, and Done.

Assume coo_in is (x, y). The Read_1 state sums the x<sup>th</sup> row of the FMWM matrix with the y<sup>th</sup> row of the corresponding ADJ_FM_WM memory, and the result is ready to be written after the Write_1 stage. Since the adjacency matrix is symmetric, the Read_2 and Write_2 stages perform the same operation on the y<sup>th</sup> row of FMWM, summing it with the x<sup>th</sup> row of the ADJ_FM_WM memory.
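The two read/write passes per COO entry can be sketched behaviourally as below. The function name `combine`, the list-of-lists representation, and 0-based row indices are assumptions made for illustration; the sketch models only the row additions, not the memory interface or the FSM timing.

```python
def combine(fm_wm, coo_edges):
    """Accumulate FMWM rows into ADJ_FM_WM for each COO entry (x, y) (hypothetical model)."""
    width = len(fm_wm[0])
    adj_fm_wm = [[0] * width for _ in fm_wm]  # ADJ_FM_WM memory, assumed to start at zero
    for x, y in coo_edges:
        # Read_1 / Write_1: add row x of FMWM to row y of ADJ_FM_WM.
        adj_fm_wm[y] = [acc + v for acc, v in zip(adj_fm_wm[y], fm_wm[x])]
        # Read_2 / Write_2: symmetric update, add row y of FMWM to row x of ADJ_FM_WM.
        adj_fm_wm[x] = [acc + v for acc, v in zip(adj_fm_wm[x], fm_wm[y])]
    return adj_fm_wm
```

Because each entry updates both rows, only one direction of every undirected edge needs to appear in the COO list under this model.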



Figure 5: Combination block during Combination stage

### **ARG\_MAX Block:**

The argmax block reads in a row from ADJ\_FM\_WM and writes to the register after the write signal is asserted in the Write stage. After 6 rows are processed, the FSM enters the Done stage.
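A behavioural sketch of the per-row argmax follows. The name `argmax_rows` and the tie-breaking rule (lowest index wins) are assumptions; the report does not state how ties are resolved in hardware.

```python
def argmax_rows(adj_fm_wm):
    """Return the index of the largest element in each row (hypothetical model)."""
    results = []  # stands in for the output register written in the Write stage
    for row in adj_fm_wm:
        best = 0
        for i, value in enumerate(row):
            # Strict comparison: on a tie the earlier index is kept (assumed behaviour).
            if value > row[best]:
                best = i
        results.append(best)
    return results
```

With the 6 rows described above, the outer loop would run 6 times before the FSM reaches Done.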



Figure 6: ARG\_MAX block during ARG\_MAX stage

# **Total latency:**

Time: $5.9702 \times 10^{-5}$ ms = 59.7 ns at a clock frequency of 1667 MHz
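As a quick consistency check on these numbers (assuming the whole interval runs at this single clock), the clock period and the approximate cycle count work out as:

$$
T_{clk} = \frac{1}{1667\ \text{MHz}} \approx 600\ \text{ps}, \qquad
\frac{59.702\ \text{ns}}{0.6\ \text{ns per cycle}} \approx 99.5\ \text{cycles}
$$

The 600 ps period agrees with the clock period quoted in the Power section.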



Figure 7: post synthesis simulation



Figure 8: post APR simulation

## **Power:**

321.7 mW at a clock period of 600 ps

```
Total Power

Total Internal Power: 95.52620009 29.6939%

Total Switching Power: 226.17582910 70.3057%

Total Leakage Power: 0.00129508 0.0004%

Total Power: 321.70332444

-------
Ended Static Power Report Generation: (cpu=0:00:00, real=0:00:00, mem(process/total)=1630.79MB/1630.79MB)
```

## **Area:**

0.0932 mm<sup>2</sup> (total chip area)

```
Floorplan/Placement Information
Total area of Standard cells: 87057.996 um^2
Total area of Standard cells(Subtracting Physical Cells): 26610.949 um^2
Total area of Macros: 0.000 um^2
Total area of Blockages: 0.000 um^2
Total area of Pad cells: 0.000 um^2
Total area of Core: 87079.225 um^2
Total area of Chip: 93151.877 um^2
Effective Utilization: 1.0000e+00
Number of Cell Rows: 273
```

### **Innovus Layout:**



# **Innovus Density:**

Density: 32.467% = 0.32467

Innovus timing summary (setup mode; path groups: all, reg2reg, default): WNS 0.061 ns, 0 violating paths. The DRV section (real and total violations, given as number of nets (terms) and worst violation) shows 0 (0) in every legible entry except one 205 (205) entry in the Total column.

## **Gate Count:**

Number of gates: 37662

```
innovus 3> reportGateCount
Gate area 0.6998 um^2
[0] GCN Gates=37662 Cells=14170 Area=26357.8 um^2
```

# **DRC Results:**

## **Innovus Connectivity:**

```
******* Start: VERIFY CONNECTIVITY ******
Start Time: Sat May 3 04:24:46 2025
Design Name: GCN
Database Units: 4000
Design Boundary: (0.0000, 0.0000) (305.4240, 304.9920)
Error Limit = 1000; Warning Limit = 50
Check all nets
**** 04:24:46 **** Processed 5000 nets.
**** 04:24:46 **** Processed 10000 nets.
Begin Summary
  Found no problems or warnings.
End Summary
End Time: Sat May 3 04:24:47 2025
Time Elapsed: 0:00:01.0
****** End: VERIFY CONNECTIVITY ******
  Verification Complete : 0 Viols. 0 Wrngs.
  (CPU Time: 0:00:00.8 MEM: -0.070M)
innovus 5>
```

## **Innovus DRC:**

No DRC violations were found.

```
 VERIFY DRC ...... Sub-Area: {62.208 183.168 124.416 244.224} 17 of 25
 VERIFY DRC ...... Sub-Area : 17 complete 0 Viols.
 VERIFY DRC ...... Sub-Area: {124.416 183.168 186.624 244.224} 18 of 25
 VERIFY DRC ...... Sub-Area : 18 complete 0 Viols.
 VERIFY DRC ...... Sub-Area: {186.624 183.168 248.832 244.224} 19 of 25
 VERIFY DRC ...... Sub-Area : 19 complete 0 Viols.
 VERIFY DRC ...... Sub-Area: {248.832 183.168 305.424 244.224} 20 of 25
 VERIFY DRC ...... Sub-Area : 20 complete 0 Viols.
 VERIFY DRC ...... Sub-Area: {0.000 244.224 62.208 304.992} 21 of 25
 VERIFY DRC ...... Sub-Area : 21 complete 0 Viols.
 VERIFY DRC ...... Sub-Area: {62.208 244.224 124.416 304.992} 22 of 25
 VERIFY DRC ...... Sub-Area : 22 complete 0 Viols.
 VERIFY DRC ...... Sub-Area: {124.416 244.224 186.624 304.992} 23 of 25
 VERIFY DRC ...... Sub-Area : 23 complete 0 Viols.
 VERIFY DRC ...... Sub-Area: {186.624 244.224 248.832 304.992} 24 of 25
 VERIFY DRC ...... Sub-Area : 24 complete 0 Viols.
 VERIFY DRC ...... Sub-Area: {248.832 244.224 305.424 304.992} 25 of 25
 VERIFY DRC ...... Sub-Area : 25 complete 0 Viols.
 Verification Complete : 0 Viols.
*** End Verify DRC (CPU: 0:00:48.5 ELAPSED TIME: 49.00 MEM: 336.7M) ***
innovus 4>
```

## **Virtuoso DRC:**



## **LVS Results:**



# **Timing Report:**

#### Worst Case Setup:

```
# Generated by:   Cadence Innovus 17.12-s095_1
# OS:             Linux x86_64(Host ID nc-asu6-l02.apporto.com)
# Generated on:   Fri May 2 18:25:52 2025
# Design:         GCN
# Command:        optDesign -postroute -setup
Path 1: MET Setup Check with Pin Transformation/Multi_partial_product_reg_reg_28_7_/CLK
Endpoint:   Transformation/Multi_partial_product_reg_reg_28_7_/D (v) checked with leading edge of 'clk'
Beginpoint: Transformation/Multi_cnt_reg_0_/QN (^) triggered by leading edge of 'clk'
Path Groups: {reg2reg}
Analysis View: default_setup_view
Other End Arrival Time         79.800
- Setup                         6.103
+ Phase Shift                 600.000
= Required Time               673.687
- Arrival Time                612.227
= Slack Time                   61.460
    Clock Rise Edge                 0.000
    + Drive Adjustment              5.200
    + Source Insertion Delay      -94.574
    = Beginpoint Arrival Time     -89.374
    Timing Path:
    Pin | Edge | Net | Delay | Arrival | Required |
```

#### Worst Case Hold:

| Field         | Value                                         |
|---------------|-----------------------------------------------|
| Generated by  | Cadence Innovus 17.12-s095_1                  |
| OS            | Linux x86_64(Host ID nc-asu6-l02.apporto.com) |
| Generated on  | Fri May 2 18:25:22 2025                       |
| Design        | GCN                                           |
| Command       | optDesign -postroute -hold                    |

Path 1: VIOLATED Hold Check with Pin Transformation/Mem_memory_reg_11_3_/CLK

| Item                      | Value                                                                        |
|---------------------------|------------------------------------------------------------------------------|
| Endpoint                  | Transformation/Mem_memory_reg_11_3_/D (v) checked with leading edge of 'clk' |
| Beginpoint                | data_in[423] (v) triggered by leading edge of '@'                            |
| Path Groups               | {clk}                                                                        |
| Analysis View             | default_hold_view                                                            |
| Other End Arrival Time    |                                                                              |
| + Hold                    | 13.435                                                                       |
| + Phase Shift             | 0.000                                                                        |
| = Required Time           | 21.932                                                                       |
| Arrival Time              | 21.700                                                                       |
| Slack Time                | -0.232                                                                       |
| Clock Rise Edge           | 0.000                                                                        |
| + Input Delay             | 0.000                                                                        |
| = Beginpoint Arrival Time | 0.000                                                                        |

Timing Path:

| Pin                    | Edge | Net                 | Cell                  | Delay | Arrival Time | Required Time |
|------------------------|------|---------------------|-----------------------|-------|--------------|---------------|
| data_in[423]           | v    | data_in_11_3        |                       |       | 0.000        | 0.232         |
| Transformation/U3034/B | v    | data_in_11_3        | NAND2xp33_ASAP7_75t_R | 3.500 | 3.500        | 3.732         |
| Transformation/U3034/Y | ^    | Transformation/n375 | NAND2xp33_ASAP7_75t_R | 8.900 | 12.400       | 12.632        |
| Transformation/U3035/B | ^    | Transformation/n375 | OAI21xp33_ASAP7_75t_R | 0.000 | 12.400       | 12.632        |