#### Baseline design from HW4

(Speed (clock speed), throughput, area / gate count / FPGA cost, power.) The figure below shows the Direct Form FIR architecture.



Simulation settings: `` `timescale 1ns/10ps ``, `` `define CYCLE 6.0 ``

1. Clock speed: 6.0 ns clock period (`CYCLE` = 6.0)
2. Throughput: 2.6 Gbits/sec
3. Area

| Hierarchical cell | Global cell area: absolute total | Global cell area: percent total | Local cell area: combinational | Local cell area: noncombinational | Local cell area: black boxes | Design |
|---|---|---|---|---|---|---|
| FIR | 226142.9036 | 100.0 | 1420.7238 | 53788.9079 | 0.0000 | FIR |
| add_0_root_add_0_root_add_150_27 | | | 3471.1830 | 0.0000 | 0.0000 | FIR_DW01_add_27 |
| add_10_root_add_0_root_add_150_27 | 4435.3062 | | 4435.3062 | 0.0000 | 0.0000 | FIR_DW01_add_39 |
| add_11_root_add_0_root_add_150_27 | 4431.9114 | 2.0 | 4431.9114 | 0.0000 | 0.0000 | FIR_DW01_add_29 |
| add_12_root_add_0_root_add_150_27 | 4654.2708 | 2.1 | 4654.2708 | 0.0000 | 0.0000 | FIR_DW01_add_33 |
| add_13_root_add_0_root_add_150_27 | 3771.6228 | 1.7 | 3771.6228 | 0.0000 | 0.0000 | FIR_DW01_add_45 |
| add_14_root_add_0_root_add_150_27 | 2763.3672 | | 2763.3672 | 0.0000 | 0.0000 | FIR_DW01_add_55 |
| add_15_root_add_0_root_add_150_27 | 2780.3412 | | 2780.3412 | 0.0000 | 0.0000 | FIR_DW01_add_54 |
| add_16_root_add_0_root_add_150_27 | | | 3707.1216 | 0.0000 | 0.0000 | FIR_DW01_add_37 |
| add_17_root_add_0_root_add_150_27 | 3819.1500 | | | 0.0000 | 0.0000 | FIR_DW01_add_50 |
| add_18_root_add_0_root_add_150_27 | 3739.3722 | 1.7 | 3739.3722 | 0.0000 | 0.0000 | FIR_DW01_add_42 |
| add_19_root_add_0_root_add_150_27 | 3951.5472 | 1.7 | 3951.5472 | 0.0000 | 0.0000 | FIR_DW01_add_38 |
| add_1_root_add_0_root_add_150_27 | 5053.1598 | | 5053.1598 | 0.0000 | 0.0000 | FIR_DW01_add_31 |
| add_20_root_add_0_root_add_150_27 | 4148.4456 | 1.8 | 4148.4456 | 0.0000 | 0.0000 | FIR_DW01_add_35 |
| add_21_root_add_0_root_add_150_27 | 3848.0058 | 1.7 | 3848.0058 | 0.0000 | 0.0000 | FIR_DW01_add_48 |
| add_22_root_add_0_root_add_150_27 | | | 3725.7930 | 0.0000 | 0.0000 | FIR_DW01_add_44 |
| add_23_root_add_0_root_add_150_27 | 3793.6890 | | 3793.6890 | 0.0000 | 0.0000 | FIR_DW01_add_49 |
| add_24_root_add_0_root_add_150_27 | 4175.6040 | | 4175.6040 | 0.0000 | 0.0000 | FIR_DW01_add_28 |
| add_25_root_add_0_root_add_150_27 | 3659.5944 | 1.6 | 3659.5944 | 0.0000 | 0.0000 | FIR_DW01_add_32 |
| add_26_root_add_0_root_add_150_27 | 3790.2942 | 1.7 | 3790.2942 | 0.0000 | 0.0000 | FIR_DW01_add_46 |
| add_2_root_add_0_root_add_150_27 | 4603.3488 | 2.0 | 4603.3488 | 0.0000 | 0.0000 | FIR_DW01_add_41 |
| add_3_root_add_0_root_add_150_27 | 4671.2448 | 2.1 | 4671.2448 | 0.0000 | 0.0000 | FIR_DW01_add_40 |
| add_4_root_add_0_root_add_150_27 | 4691.6136 | 2.1 | 4691.6136 | 0.0000 | 0.0000 | FIR_DW01_add_30 |
| add_5_root_add_0_root_add_150_27 | 4810.4316 | | 4810.4316 | 0.0000 | 0.0000 | FIR_DW01_add_34 |
| add_6_root_add_0_root_add_150_27 | | | 4372.5024 | 0.0000 | 0.0000 | FIR_DW01_add_43 |
| add_7_root_add_0_root_add_150_27 | 3240.3366 | | 3240.3366 | 0.0000 | 0.0000 | FIR_DW01_add_56 |
| add_8_root_add_0_root_add_150_27 | 4457.3724 | | 4457.3724 | 0.0000 | 0.0000 | FIR_DW01_add_36 |
| add_9_root_add_0_root_add_150_27 | 4251.9870 | 1.9 | 4251.9870 | 0.0000 | 0.0000 | FIR_DW01_add_47 |
| mult_113 | 1497.1068 | 0.7 | 1497.1068 | 0.0000 | 0.0000 | FIR_DW_mult_tc_4 |
| mult_114 | 1189.8774 | 0.7 | 1189.8774 | 0.0000 | 0.0000 | FIR_DW_mult_tc_5 |
| mult_115 | 2210.0148 | 1.0 | 2210.0148 | 0.0000 | 0.0000 | FIR_DW_mult_tc_5 |
| mult_116 | 1946.9178 | 0.9 | 1946.9178 | 0.0000 | 0.0000 | FIR_DW_mult_tc_5 |
| mult_117 | 1308.6954 | 0.6 | 1308.6954 | 0.0000 | 0.0000 | FIR_DW_mult_tc_5 |
| mult_118 | 3292.9561 | | | 0.0000 | 0.0000 | FIR_DW_mult_tc_3 |
| mult_119 | 2065.7358 | 0.9 | 2065.7358 | 0.0000 | 0.0000 | FIR_DW_mult_tc_4 |
| mult_120 | 2062.3410 | 0.9 | 2062.3410 | 0.0000 | 0.0000 | FIR_DW_mult_tc_3 |
| mult_121 | 2496.8754 | 1.1 | 2496.8754 | 0.0000 | 0.0000 | FIR_DW_mult_tc_4 |
| mult_122 | 2495.1780 | | 2495.1780 | 0.0000 | 0.0000 | FIR_DW_mult_tc_4 |
| mult_123 | 3062.1096 | | 3062.1096 | 0.0000 | 0.0000 | FIR_DW_mult_tc_2 |
| mult_124 | | | 2564.7714 | 0.0000 | 0.0000 | FIR_DW_mult_tc_3 |
| mult_125 | 2498.5728 | | 2498.5728 | 0.0000 | 0.0000 | FIR_DW_mult_tc_3 |
| mult_126 | 2298.2796 | 1.0 | 2298.2796 | 0.0000 | 0.0000 | FIR_DW_mult_tc_6 |
| mult_127 | 2274.5160 | 1.0 | 2274.5160 | 0.0000 | 0.0000 | FIR_DW_mult_tc_6 |
| mult_128 | 2474.8092 | | 2474.8092 | 0.0000 | 0.0000 | FIR_DW_mult_tc_3 |
| mult_129 | 2554.5870 | | 2554.5870 | 0.0000 | 0.0000 | FIR_DW_mult_tc_3 |
| mult_130 | 3060.4122 | 1.4 | 3060.4122 | 0.0000 | 0.0000 | FIR_DW_mult_tc_2 |
| mult_131 | 2490.0858 | | 2490.0858 | 0.0000 | 0.0000 | FIR_DW_mult_tc_4 |
| mult_132 | 2503.6650 | | 2503.6650 | 0.0000 | 0.0000 | FIR_DW_mult_tc_4 |
| mult_133 | 2089.4994 | 0.9 | 2089.4994 | 0.0000 | 0.0000 | FIR_DW_mult_tc_3 |
| mult_134 | 2084.4072 | 0.9 | 2084.4072 | 0.0000 | 0.0000 | FIR_DW_mult_tc_4 |
| mult_135 | 3466.0909 | 1.5 | 3466.0909 | 0.0000 | 0.0000 | FIR_DW_mult_tc_3 |
| mult_136 | 1295.1162 | 0.6 | 1295.1162 | 0.0000 | 0.0000 | FIR_DW_mult_tc_5 |
| mult_137 | 1943.5230 | 0.5 | 1943.5230 | 0.0000 | 0.0000 | FIR_DW_mult_tc_5 |
| mult_138 | 2220.1992 | 1.0 | 2220.1992 | 0.0000 | 0.0000 | FIR_DW_mult_tc_5 |
| mult_139 | 1183.0878 | 0.5 | 1183.0878 | 0.0000 | 0.0000 | |
| mult_140 | 1485.2250 | 0.7 | 1485.2250 | 0.0000 | 0.0000 | FIR_DW_mult_tc_4 |

#### 4. Power

```
Loading db file '/home/m111/m111064503/cad/CBDK/CBDK_IC_Contest_v2.1/SynopsysDC/db/slow.db'
Information: Propagating switching activity (high effort zero delay simulation). (PWR-6)
Warning: Design has unannotated primary inputs. (PWR-414)
Warning: Design has unannotated sequential cell outputs. (PWR-415)

Library(s) Used:

    slow (File: /home/m111/m111064503/cad/CBDK/CBDK_IC_Contest_v2.1/SynopsysDC/db/slow.db)

Operating Conditions: slow   Library: slow
Wire Load Model Mode: top

Design        Wire Load Model            Library
FIR           tsmc13_wl10                slow

Global Operating Voltage = 1.08
Power-specific unit information :
    Voltage Units = 1V
    Capacitance Units = 1.000000pf
    Time Units = 1ns
    Dynamic Power Units = 1mW    (derived from V,C,T units)
    Leakage Power Units = 1pW

  Cell Internal Power  =   9.4328 mW   (71%)
  Net Switching Power  =   3.8478 mW   (29%)

Total Dynamic Power    =  13.2806 mW  (100%)

Cell Leakage Power     = 195.8756 uW

                 Internal       Switching         Leakage
                 Power          Power             Power
io_pad           0.0000         0.0000            0.0000
memory           0.0000         0.0000            0.0000
black_box        0.0000         0.0000            0.0000
clock_network    0.0000         0.0000            0.0000
register         7.7387         0.3109        4.9670e+07
sequential       0.0000         0.0000            0.0000
combinational    1.6941         3.5369        1.4621e+08
Total            9.4328 mW      3.8478 mW     1.9588e+08 pW
```
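
The total power quoted for the direct-form design in the comparison table of section (6), 13.4765 mW, appears to be the sum of the dynamic and leakage figures in this log. A minimal Python check of that arithmetic (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the power log above.
dynamic_mw = 13.2806      # Total Dynamic Power, in mW
leakage_uw = 195.8756     # Cell Leakage Power, in uW

# Convert leakage to mW before adding.
total_mw = dynamic_mw + leakage_uw / 1000.0
print(f"total power = {total_mw:.4f} mW")
```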

#### 5. timing report:

```
Report : timing
        -path full
        -delay max
        -max_paths 1
Design : FIR
Version: R-2020.09-SP5
Date   : Wed Jan  4 03:49:23 2023

0.00
0.50
0.50 r
0.92 r
3.54 r
3.73 f
3.90 r
4.01 f
4.23 r
4.34 f
4.43 r
4.62 r
```

# Final-Project-Report

The figure below shows the Folding FIR architecture:



### \*The process library used is tsmc13\_neg.v

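#### (1) Target specifications:

>Area: the HW4 synthesis report listed above shows a very large cell area, about 220,000 in total. The area is this large because the design contains 28 multipliers and adders, so I mainly use folding to reduce it. After folding, the original 28 multipliers and adders are replaced by a single multiplier and a single adder. I expect the area to shrink by more than a factor of 2.

>Speed: the clock period is 6 ns, and the critical path consists of one multiplier followed by one adder. I insert a pipeline register between the multiplier and the adder so that the critical path contains only the multiplier. I expect this to shorten the clock period by 1 ns.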

#### (2) Fixed-point simulation

(a)

#### (i) Floating-point coefficients:

- -0.002964894749042235391756072715452319244
- 0.005015861152363371849860484985583752859
- 0.020667893101735827776632703489667619579
- 0.025148888112610387479683993205981096253
- 0.002316865275779686830781578521509800339
- -0.025779624933500097649918814113334519789
- -0.015946518131217442965086306116972991731
- 0.029216940020626670782011302662795060314
- 0.042349361155728501571182675888849189505
- -0.02019719422775444889195384234881203156
- -0.085611751306674921391248744839685969055
- -0.021255965443284792482092626642042887397
- 0.191877715214637978302647525197244249284
- 0.391355121984242604327164372080005705357
- 0.391355121984242604327164372080005705357
- 0.191877715214637978302647525197244249284
- -0.021255965443284792482092626642042887397
- -0.085611751306674921391248744839685969055
- -0.02019719422775444889195384234881203156
- 0.042349361155728501571182675888849189505
- 0.029216940020626670782011302662795060314
- -0.015946518131217442965086306116972991731
- -0.025779624933500097649918814113334519789
- 0.002316865275779686830781578521509800339
- 0.025148888112610387479683993205981096253
- 0.020667893101735827776632703489667619579
- 0.005015861152363371849860484985583752859
- -0.002964894749042235391756072715452319244

(ii)

[Screenshot: MATLAB Filter Designer (untitled.fda) with the magnitude response plot, frequency axis in kHz. Legible settings: design method FIR Equiripple, Order: 27, Density Factor: 20, Fpass: 1500; the values 8000 and 2000 also appear in the frequency specification fields.]



(b)



The result starts to converge at a precision of about 8 bits. I chose a precision of 15 bits.

→ SNR loss = 0 dB
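
The effect of the chosen precision on the coefficients themselves can be sketched in Python. This is an illustration only: it assumes "precision" means the number of fractional bits, it rounds to the nearest step, and it measures a coefficient-level SNR (coefficient energy over rounding-error energy), which is not necessarily the SNR metric plotted in the report. The names `quantize` and `coefficient_snr_db` are hypothetical.

```python
import math

# First 14 coefficients from the list above, shortened to double precision.
# The filter is symmetric, so the remaining 14 are the mirror image.
HALF = [
    -0.002964894749042235, 0.005015861152363372, 0.020667893101735828,
    0.025148888112610387, 0.002316865275779687, -0.025779624933500098,
    -0.015946518131217443, 0.029216940020626671, 0.042349361155728502,
    -0.020197194227754449, -0.085611751306674921, -0.021255965443284792,
    0.191877715214637978, 0.391355121984242604,
]
COEFFS = HALF + HALF[::-1]


def quantize(coeffs, frac_bits):
    """Round each coefficient to an integer with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return [round(c * scale) for c in coeffs]


def coefficient_snr_db(coeffs, frac_bits):
    """Coefficient energy over rounding-error energy, in dB."""
    scale = 1 << frac_bits
    quantized = quantize(coeffs, frac_bits)
    signal = sum(c * c for c in coeffs)
    noise = sum((c - q / scale) ** 2 for c, q in zip(coeffs, quantized))
    return 10 * math.log10(signal / noise)


for bits in (4, 8, 12, 15):
    print(bits, round(coefficient_snr_db(COEFFS, bits), 1))
```

Rounding preserves the symmetry of the taps, so the quantized filter keeps its linear phase.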

## (3) Presim:

## (4) Gate-level simulation:



## (5) Synthesis report

#### 1. area report:

Cell area drops from about 220,000 to about 70,000, roughly 3 times smaller, which meets the target of more than a 2× reduction.

```
    slow (File: /home/m111/m111064503/cad/CBDK/CBDK_IC_Contest_v2.1/SynopsysDC/db/slow.db)

Number of ports:
Number of nets:
Number of cells:
Number of combinational cells:
Number of sequential cells:
Number of macros/black boxes:
Number of buf/inv:
Number of references:

Combinational area:
Buf/Inv area:
Noncombinational area:
Macro/Black Box area:
Net Interconnect area:

Total cell area:
Total area:

Hierarchical area distribution

                    Global cell area           Local cell area
Hierarchical cell   Absolute     Percent   Combi-       Noncombi-    Black-
                    Total        Total     national     national     boxes    Design
FIR                 70058.4870   100.0     51449.8914   18608.5956   0.0000   FIR
-----
Total                                                   18608.5956
```

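#### 2. timing report:

Simulation settings: `` `timescale 1ns/10ps ``, `` `define CYCLE 4.88 ``

The timing report shows that the critical path now contains only the multiplier, so the pipeline register between the multiplier and the adder was inserted successfully. 6 − 4.88 = 1.12 ns, which also meets the target of shortening the clock period by 1 ns.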

```
Operating Conditions: slow   Library: slow
Wire Load Model Mode: top

  Des/Clust/Port     Wire Load Model       Library
  FIR                tsmc13_wl10           slow

  Point                                       Incr       Path
  ...
  data arrival time                                      5.12

  clock clk (rise edge)                       4.88       4.88
  clock network delay (ideal)                 0.50       5.38
  clock uncertainty                          -0.10       5.28
  D_in_reg[11]/CK (DFFRX2)                    0.00       5.28 r
  library setup time                         -0.16       5.12
  data required time                                     5.12

  data required time                                     5.12
  data arrival time                                     -5.12

  slack (MET)                                            0.00
```
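
The zero slack follows from the capture-side figures in the report. A minimal Python sketch of that arithmetic, assuming the usual setup check (required time = clock edge + network delay − uncertainty − setup time); the 0.16 ns setup time is inferred from the 5.28 → 5.12 step in the Path column, and the variable names are illustrative:

```python
# Capture-side values from the timing report, in ns.
clock_period = 4.88
network_delay = 0.50
uncertainty = 0.10
setup_time = 0.16         # inferred: 5.28 - 5.12
data_arrival = 5.12

# Latest time the data may arrive at D_in_reg[11].
data_required = clock_period + network_delay - uncertainty - setup_time
slack = data_required - data_arrival
print(f"required = {data_required:.2f} ns, slack = {slack:.2f} ns")
```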

#### 3. power report:



#### 4. Throughput = 117 Mbits/sec
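
Both throughput figures in this report can be reproduced from the clock periods. The sketch below assumes 16-bit samples, one output per cycle for the direct form, and 28 cycles per output for the folded design; these assumptions are not stated in the report but match its numbers. The function name `throughput_bps` is hypothetical.

```python
def throughput_bps(word_bits, clock_period_ns, cycles_per_sample):
    """Output bits per second for a filter emitting one word every N cycles."""
    return word_bits / (clock_period_ns * 1e-9 * cycles_per_sample)


# Assumed: 16-bit words; direct form yields one output per clock cycle.
direct = throughput_bps(16, 6.0, 1)
# Assumed: the folded design needs 28 cycles per output (one per tap).
folded = throughput_bps(16, 4.88, 28)

print(f"direct: {direct / 1e9:.2f} Gbits/sec")
print(f"folded: {folded / 1e6:.1f} Mbits/sec")
```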

## (6) Comparison before and after the redesign

| Design | Direct Form FIR (before) | Folding FIR (after) |
|---------------------------|----------------|---------------|
| Clock period (ns) | 6.0 | 4.88 |
| Hardware cost (cell area) | 226142 | 70058 |
| Throughput | 2.6 Gbits/sec | 117 Mbits/sec |
| Power | 13.4765 mW | 3.2837 mW |
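
The improvement factors can be computed directly from the reported figures. A small Python sketch, using the total cell areas from the two area reports and the power and clock values from the table; the dictionary keys are illustrative names:

```python
# Values copied from the synthesis reports and the comparison table.
direct = {"period_ns": 6.0, "area": 226142.9036, "power_mw": 13.4765}
folded = {"period_ns": 4.88, "area": 70058.4870, "power_mw": 3.2837}

area_ratio = direct["area"] / folded["area"]
power_ratio = direct["power_mw"] / folded["power_mw"]
period_saving_ns = direct["period_ns"] - folded["period_ns"]

print(f"area reduced by a factor of {area_ratio:.2f}")
print(f"power reduced by a factor of {power_ratio:.2f}")
print(f"clock period shortened by {period_saving_ns:.2f} ns")
```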

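#### (7) Design approach

The original design is a Direct Form FIR, which needs 28 multipliers and adders and therefore takes too much area. I use folding so that only one multiplier and one adder are needed, but this means the sampling rate drops to 1/28 of the original and the latency becomes longer. A register `index` controls the schedule. When index=28, the buffer takes in a new input. When index=0, the multiplexer uses index to select d\_input[0] and tap[0], and the multiplier computes D\_in=d\_input[0]\*tap[0]; after a one-stage delay the product is accumulated, temp\_output=temp\_output+D\_in. When index=1, the multiplexer selects d\_input[1] and tap[1], the multiplier computes D\_in=d\_input[1]\*tap[1], and after the delay the product is accumulated in the same way, temp\_output=temp\_output+D\_in. This continues tap by tap until index=28, when dout is output.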
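
The folding schedule can be checked against the direct form with a small behavioral model. The Python sketch below is not the report's RTL: the function names `direct_form_fir` and `folded_fir` are hypothetical, the pipeline register between multiplier and adder is modeled by the `product` variable, and the exact cycle at which the real design loads a sample or drives dout may differ. It only illustrates that one multiplier and one accumulator, stepped by an `index` counter, produce the same outputs as 28 parallel taps.

```python
def direct_form_fir(samples, taps):
    """Reference model: all taps are multiplied and summed for every sample."""
    delay = [0] * len(taps)
    outputs = []
    for x in samples:
        delay = [x] + delay[:-1]          # shift the new sample into the delay line
        outputs.append(sum(d * t for d, t in zip(delay, taps)))
    return outputs


def folded_fir(samples, taps):
    """Folded model: one multiply and one accumulate per index step."""
    n = len(taps)
    delay = [0] * n
    outputs = []
    for x in samples:
        delay = [x] + delay[:-1]          # buffer takes in the new input
        temp_output = 0                   # accumulator
        product = 0                       # pipeline register after the multiplier
        for index in range(n + 1):
            temp_output += product        # add the product from the previous step
            # The multiplexer selects one sample/tap pair; the last step only flushes.
            product = delay[index] * taps[index] if index < n else 0
        outputs.append(temp_output)       # result is complete at index == n
    return outputs


taps = [3, -1, 4, 1, -5, 9, 2, -6]
samples = [1, 0, -2, 7, 5, -3, 0, 0, 4, 1]
print(folded_fir(samples, taps) == direct_form_fir(samples, taps))
```

The short integer tap set keeps the comparison exact; the same functions accept the 28 quantized coefficients of the actual filter.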