

## 吾陽科技有限公司





# **Outline**

- Part I
  - Design Considerations
  - Design Entity
    - MegaCore
    - Parallel
    - PipeLine
- Part II
  - Project Setting



#### Part - I

# **YUV CONVERSION**



## YUV Conversion Algorithm

- RGB to YUV Conversion Equations
  - Y = (0.257\*R) + (0.504\*G) + (0.098\*B) + 16
  - Cr = V = (0.439\*R) - (0.368\*G) - (0.071\*B) + 128
  - Cb = U = -(0.148\*R) - (0.291\*G) + (0.439\*B) + 128
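As a reference for the hardware versions that follow, the three equations can be written as a small floating-point model. This is a minimal sketch in Python rather than HDL; the function name `rgb_to_yuv_float` is hypothetical, and the signs are an assumption taken from the usual studio-swing form (negative G and B terms for Cr, negative R and G terms for Cb).

```python
def rgb_to_yuv_float(r, g, b):
    """Floating-point reference for the conversion equations (8-bit inputs)."""
    y = 0.257 * r + 0.504 * g + 0.098 * b + 16
    v = 0.439 * r - 0.368 * g - 0.071 * b + 128   # Cr
    u = -0.148 * r - 0.291 * g + 0.439 * b + 128  # Cb
    return y, u, v


# Black, white, and pure red as quick sanity inputs.
for rgb in [(0, 0, 0), (255, 255, 255), (255, 0, 0)]:
    y, u, v = rgb_to_yuv_float(*rgb)
    print(rgb, round(y, 2), round(u, 2), round(v, 2))
```

Gray inputs (R = G = B) leave both chroma channels at the 128 offset because the chroma coefficients in each equation sum to zero.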





## Design Considerations

- Floating Point Operation
  - Y = (0.257\*R) + (0.504\*G) + (0.098\*B) + 16
- Speed
  - High Speed DSP Block
  - Pipe-Line
- Area
  - Optimization
- Power
- Verification
- Testing





### **Fixed Point Conversion**

#### 8-bit Fixed Point Excel Table

|    | A                   | B  | C        | D  | E        | F  | G        | H | I        |
|----|---------------------|----|----------|----|----------|----|----------|---|----------|
| 1  | 256                 | 8  |          |    |          |    |          |   |          |
| 2  | Floating Point      |    | R        |    | G        |    | B        |   |          |
| 3  | Y                   | 1  | 0.257    | 1  | 0.504    | 1  | 0.098    | 1 | 16       |
| 4  | U                   | 1  | 0.439    | -1 | 0.368    | -1 | 0.071    | 1 | 128      |
| 5  | V                   | -1 | 0.148    | -1 | 0.291    | 1  | 0.439    | 1 | 128      |
| 6  |                     |    |          |    |          |    |          |   |          |
| 7  | Fixed Point (\*256) |    | R        |    | G        |    | B        |   |          |
| 8  | Y                   | 1  | 66       | 1  | 130      | 1  | 26       | 1 | 4096     |
| 9  | U                   | 1  | 113      | -1 | 95       | -1 | 19       | 1 | 32768    |
| 10 | V                   | -1 | 38       | -1 | 75       | 1  | 113      | 1 | 32768    |
| 11 |                     |    |          |    |          |    |          |   |          |
| 12 | Fixed Point - hex   |    | R        |    | G        |    | B        |   |          |
| 13 | Y                   | 1  | 00000042 | 1  | 00000082 | 1  | 0000001A | 1 | 00001000 |
| 14 | U                   | 1  | 00000071 | -1 | 0000005F | -1 | 00000013 | 1 | 00008000 |
| 15 | V                   | -1 | 00000026 | -1 | 0000004B | 1  | 00000071 | 1 | 00008000 |

- Floating Point to Fixed Point
  - Value<sub>fixed</sub> = Value<sub>fp</sub> \* 2<sup>n</sup>, with n = 8 here (cell A1 = 2<sup>8</sup> = 256, cell B1 = 8)
- Row labels
  - In this table the U row carries the 0.439\*R coefficients and the V row the 0.439\*B coefficients, the reverse of the Cb = U, Cr = V naming in the conversion equations; the HDL signals w_ud and w_vd follow the table.
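The scaling rule can be checked with a few lines of Python. This is a sketch under one assumption that the slides do not state: the table values (for example 0.504 becoming 130 rather than 129) are consistent with rounding up after scaling, so the hypothetical helper `to_fixed` uses `math.ceil`.

```python
import math

FRACTION_BITS = 8  # n in Value_fixed = Value_fp * 2**n


def to_fixed(value, n=FRACTION_BITS):
    """Scale by 2**n and round up (assumed rounding rule, matches the table)."""
    return math.ceil(value * (1 << n))


coefficients = [0.257, 0.504, 0.098, 0.439, 0.368, 0.071, 0.148, 0.291]
for c in coefficients:
    fixed = to_fixed(c)
    print(f"{c:.3f} -> {fixed:3d} = 0x{fixed:02X}")

# The offsets scale the same way: 16 -> 0x1000, 128 -> 0x8000.
print(hex(to_fixed(16)), hex(to_fixed(128)))
```

Rounding to nearest instead of rounding up would change several coefficients by one count, so the rounding rule is worth fixing before comparing against the spreadsheet.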



# Fixed Point Conversion, Cont.

- An example with R = 12, G = 123, B = 234
  - Percentage Error = Error / Return

|    | A              | B               | C       | D       | E       |
|----|----------------|-----------------|---------|---------|---------|
| 20 |                | R = 12          | G = 123 | B = 234 |         |
| 21 | Fixed Point    | Fixed Operation | Return  | Error   | Error % |
| 22 | Y              | 26962           | 105     | 0.99    | 0.94%   |
| 23 | U              | 17993           | 70      | 1.39    | 1.99%   |
| 24 | V              | 49529           | 193     | 0.16    | 0.08%   |
| 25 | Floating Point |                 |         |         |         |
| 26 | Y              |                 | 104.01  |         |         |
| 27 | U              |                 | 71.39   |         |         |
| 28 | V              |                 | 193.16  |         |         |
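The example can be reproduced with integer arithmetic. The sketch below is an illustration, not the spreadsheet itself: the helper `fixed_row` is hypothetical, the U/V labels follow the coefficient table (U is the 0.439\*R row), and the V row uses the signs of the `w_vd` HDL assignment (minus on both the R and G terms).

```python
def fixed_row(offset, cr, cg, cb, r, g, b):
    """Integer accumulate, then drop the 8 fraction bits by shifting right."""
    acc = offset + cr * r + cg * g + cb * b
    return acc, acc >> 8


r, g, b = 12, 123, 234
acc_y, y_fixed = fixed_row(4096, 66, 130, 26, r, g, b)
acc_u, u_fixed = fixed_row(32768, 113, -95, -19, r, g, b)
acc_v, v_fixed = fixed_row(32768, -38, -75, 113, r, g, b)

y_float = 0.257 * r + 0.504 * g + 0.098 * b + 16
u_float = 0.439 * r - 0.368 * g - 0.071 * b + 128
v_float = -0.148 * r - 0.291 * g + 0.439 * b + 128

rows = [("Y", acc_y, y_fixed, y_float),
        ("U", acc_u, u_fixed, u_float),
        ("V", acc_v, v_fixed, v_float)]
for name, acc, fixed, ref in rows:
    err = abs(fixed - ref)
    print(f"{name}: acc={acc} fixed={fixed} float={ref:.2f} "
          f"error={err:.2f} ({100 * err / fixed:.2f}%)")
```

All three results stay within the 8-bit output range, which is a quick way to catch a sign mistake in a coefficient row.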



### Design Entity - MegaCore

Fixed Point Operation

```
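-- Coefficients are the floating-point values scaled by 2^8; offsets are 16*256 and 128*256.
w_yd <= X"1000" + (X"42"*r_rd) + (X"82"*r_gd) + (X"1A"*r_bd);  -- 4096 + 66*R + 130*G + 26*B
w_ud <= X"8000" + (X"71"*r_rd) - (X"5F"*r_gd) - (X"13"*r_bd);  -- 32768 + 113*R - 95*G - 19*B
w_vd <= X"8000" - (X"26"*r_rd) - (X"4B"*r_gd) + (X"71"*r_bd);  -- 32768 - 38*R - 75*G + 113*B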
```

- Data Width
  - Adder
  - Multiplication
- Timing Analysis
  - Between two registers
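For the data-width question, the worst-case range of each accumulator can be enumerated directly. This is a sketch that assumes 8-bit unsigned inputs (0 to 255), which the slides imply but do not state; the helper `acc_range` is hypothetical.

```python
def acc_range(offset, coeffs):
    """Min and max of offset + sum(c_i * x_i) with every x_i in 0..255."""
    lo = offset + sum(c * 255 for c in coeffs if c < 0)
    hi = offset + sum(c * 255 for c in coeffs if c > 0)
    return lo, hi


rows = {
    "w_yd": (0x1000, [0x42, 0x82, 0x1A]),
    "w_ud": (0x8000, [0x71, -0x5F, -0x13]),
    "w_vd": (0x8000, [-0x26, -0x4B, 0x71]),
}
for name, (offset, coeffs) in rows.items():
    lo, hi = acc_range(offset, coeffs)
    print(f"{name}: {lo}..{hi} needs {hi.bit_length()} bits, "
          f"8-bit result {lo >> 8}..{hi >> 8}")

# Largest single product: an 8-bit coefficient times an 8-bit sample.
print("max product bits:", (0x82 * 255).bit_length())
```

Under these assumptions every final sum is non-negative and fits in 16 unsigned bits, so the upper byte of a 16-bit result can be taken as the 8-bit output.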





## Exercise I – MegaCore

Use MegaCore to Implement





### Exercise I – MegaCore, Cont.

Use 7 multiplication blocks



- Summary
  - **300 LEs**
  - 109.03 MHz (9.172 ns)

| Flow Summary                       |                                           |
|------------------------------------|-------------------------------------------|
| Flow Status                        | Successful - Mon Jul 19 11:08:34 2010     |
| Quartus II Version                 | 10.0 Build 218 06/27/2010 SJ Full Version |
| Revision Name                      | miatc3x                                   |
| Top-level Entity Name              | yuv_converter                             |
| Family                             | Cyclone III                               |
| Device                             | EP3C25F256C8                              |
| Timing Models                      | Final                                     |
| Met timing requirements            | Yes                                       |
| Total logic elements               | 300 / 24,624 ( 1 % )                      |
| Total combinational functions      | 300 / 24,624 ( 1 % )                      |
| Dedicated logic registers          | 48 / 24,624 ( < 1 % )                     |
| Total registers                    | 48                                        |
| Total pins                         | 50 / 157 ( 32 % )                         |
| Total virtual pins                 | 0                                         |
| Total memory bits                  | 0 / 608,256 ( 0 % )                       |
| Embedded Multiplier 9-bit elements | 0 / 132 ( 0 % )                           |
| Total PLLs                         | 0 / 4 ( 0 % )                             |

Timing Analyzer Summary

|     | Type                         | Slack | Required Time | Actual Time                      | From            | To              |
|-----|------------------------------|-------|---------------|----------------------------------|-----------------|-----------------|
| 1   | Worst-case tsu               | N/A   | None          | 3.230 ns                         | yuvc_i_gdata[5] | r_gd[5]         |
| 2   | Worst-case tco               | N/A   | None          | 10.546 ns                        | r_ud[0]         | yuvc_o_udata[0] |
| 3   | Worst-case th                | N/A   | None          | 0.231 ns                         | yuvc_i_bdata[5] | r_bd[5]         |
| 4   | Clock Setup: 'yuvc_i_clock'  | N/A   | None          | 109.03 MHz ( period = 9.172 ns ) | r_rd[4]         | r_vd[7]         |
| 5   | Total number of failed paths |       |               |                                  |                 |                 |



## Exercise I – MegaCore, Cont.

#### Simulation





## Exercise II – Parallel Operations

### Improve HDL Description

```
-- Group the terms so that two partial sums are formed side by side, then combined once.
w_yd <= (X"1000" + (X"42"*r_rd)) + ((X"82"*r_gd) + (X"1A"*r_bd));
w_ud <= (X"8000" + (X"71"*r_rd)) - ((X"5F"*r_gd) + (X"13"*r_bd));
w_vd <= (X"8000" + (X"71"*r_bd)) - ((X"26"*r_rd) + (X"4B"*r_gd));
```
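The regrouping changes only the order of evaluation, not the result. The following Python sketch compares the left-to-right form with the grouped form over a grid of inputs. The function names `serial` and `grouped` are hypothetical, and the comment about adder levels describes the expressions as written; a synthesis tool may restructure them.

```python
OFFSET_Y, OFFSET_C = 0x1000, 0x8000


def serial(r, g, b):
    """Left-to-right evaluation: three add/subtract steps in series per channel."""
    y = ((OFFSET_Y + 0x42 * r) + 0x82 * g) + 0x1A * b
    u = ((OFFSET_C + 0x71 * r) - 0x5F * g) - 0x13 * b
    v = ((OFFSET_C - 0x26 * r) - 0x4B * g) + 0x71 * b
    return y, u, v


def grouped(r, g, b):
    """Two independent partial sums, then one final add/subtract (two levels)."""
    y = (OFFSET_Y + 0x42 * r) + (0x82 * g + 0x1A * b)
    u = (OFFSET_C + 0x71 * r) - (0x5F * g + 0x13 * b)
    v = (OFFSET_C + 0x71 * b) - (0x26 * r + 0x4B * g)
    return y, u, v


samples = range(0, 256, 15)  # 0, 15, ..., 255
mismatches = sum(
    serial(r, g, b) != grouped(r, g, b)
    for r in samples for g in samples for b in samples
)
print("mismatches:", mismatches)
```

Because both partial sums of the grouped form can be computed at the same time, the longest path contains two add/subtract levels instead of three.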



## Exercise II – Parallel Operations, Cont.

- Summary
  - **286 LEs**
  - 110.95 MHz (9.013 ns)

| Flow Status                        | Successful - Mon Jul 19 12:47:51 2010     |
|------------------------------------|-------------------------------------------|
| Quartus II Version                 | 10.0 Build 218 06/27/2010 SJ Full Version |
| Revision Name                      | miatc3x                                   |
| Top-level Entity Name              | yuv_converter                             |
| Family                             | Cyclone III                               |
| Device                             | EP3C25F256C8                              |
| Timing Models                      | Final                                     |
| Met timing requirements            | Yes                                       |
| Total logic elements               | 286 / 24,624 ( 1 % )                      |
| Total combinational functions      | 286 / 24,624 ( 1 % )                      |
| Dedicated logic registers          | 48 / 24,624 ( < 1 % )                     |
| Total registers                    | 48                                        |
| Total pins                         | 50 / 157 ( 32 % )                         |
| Total virtual pins                 | 0                                         |
| Total memory bits                  | 0 / 608,256 ( 0 % )                       |
| Embedded Multiplier 9-bit elements | 0 / 132 ( 0 % )                           |
| Total PLLs                         | 0/4(0%)                                   |

Timing Analyzer Summary

|     | Type                         | Slack | Required Time | Actual Time                      | From            | To              |
|-----|------------------------------|-------|---------------|----------------------------------|-----------------|-----------------|
| 1   | Worst-case tsu               | N/A   | None          | 3.265 ns                         | yuvc_i_gdata[6] | r_gd[6]         |
| 2   | Worst-case tco               | N/A   | None          | 10.584 ns                        | r_ud[4]         | yuvc_o_udata[4] |
| 3   | Worst-case th                | N/A   | None          | 0.205 ns                         | yuvc_i_rdata[2] | r_rd[2]         |
| 4   | Clock Setup: 'yuvc_i_clock'  | N/A   | None          | 110.95 MHz ( period = 9.013 ns ) | r_bd[4]         | r_ud[7]         |
| 5   | Total number of failed paths |       |               |                                  |                 |                 |



## Exercise II – Parallel Operations, Cont.

#### RTL Viewer





## Exercise II – Parallel Operations, Cont.

#### Simulation





# Exercise III – PipeLine1

One-stage pipeline
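The effect of one added register rank can be illustrated with a cycle-based Python model of the Y channel. This is only a sketch under stated assumptions: the slides do not show where the pipeline registers sit, so placing them between the multipliers and the adders is a guess, and the class `PipelinedY` and its register `r_prod` are hypothetical names.

```python
class PipelinedY:
    """Y channel with one assumed register rank between multipliers and adders."""

    def __init__(self):
        self.r_in = (0, 0, 0)    # input registers (r_rd, r_gd, r_bd)
        self.r_prod = (0, 0, 0)  # hypothetical pipeline registers for the products
        self.r_yd = 0            # output register

    def clock(self, r, g, b):
        """One rising clock edge: update downstream registers first."""
        self.r_yd = (0x1000 + sum(self.r_prod)) >> 8
        rr, gg, bb = self.r_in
        self.r_prod = (0x42 * rr, 0x82 * gg, 0x1A * bb)
        self.r_in = (r, g, b)
        return self.r_yd


dut = PipelinedY()
pixels = [(12, 123, 234), (255, 255, 255), (0, 0, 0), (0, 0, 0)]
for cycle, pixel in enumerate(pixels):
    print(cycle, pixel, dut.clock(*pixel))
```

In this model a pixel's result appears two calls after the pixel is presented: each register-to-register path now contains either the multipliers or the adders, not both, at the cost of one extra cycle of latency.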





# Exercise III - PipeLine1, Cont.

- Summary
  - **289 LEs**
  - 154.37 MHz (6.478 ns)

| Flow Summary                       |                                           |
|------------------------------------|-------------------------------------------|
| Flow Status                        | Successful - Mon Jul 19 13:19:02 2010     |
| Quartus II Version                 | 10.0 Build 218 06/27/2010 SJ Full Version |
| Revision Name                      | miatc3x                                   |
| Top-level Entity Name              | yuv_converter                             |
| Family                             | Cyclone III                               |
| Device                             | EP3C25F256C8                              |
| Timing Models                      | Final                                     |
| Met timing requirements            | Yes                                       |
| Total logic elements               | 289 / 24,624 ( 1 % )                      |
| Total combinational functions      | 288 / 24,624 ( 1 % )                      |
| Dedicated logic registers          | 136 / 24,624 ( < 1 % )                    |
| Total registers                    | 136                                       |
| Total pins                         | 50 / 157 ( 32 % )                         |
| Total virtual pins                 | 0                                         |
| Total memory bits                  | 0 / 608,256 ( 0 % )                       |
| Embedded Multiplier 9-bit elements | 0 / 132 ( 0 % )                           |
| Total PLLs                         | 0/4(0%)                                   |

Timing Analyzer Summary

|    | Type                         | Slack | Required Time | Actual Time                      | From            | To              |
|----|------------------------------|-------|---------------|----------------------------------|-----------------|-----------------|
| 1  | Worst-case tsu               | N/A   | None          | 2.886 ns                         | yuvc_i_gdata[5] | r_gd[5]         |
| 2  | Worst-case tco               | N/A   | None          | 10.497 ns                        | r_ud[4]         | yuvc_o_udata[4] |
| 3  | Worst-case th                | N/A   | None          | 0.349 ns                         | yuvc_i_gdata[2] | r_gd[2]         |
| 4  | Clock Setup: 'yuvc_i_clock'  | N/A   | None          | 154.37 MHz ( period = 6.478 ns ) | r_rd[1]         | r_vop2[15]      |
| 5  | Total number of failed paths |       |               |                                  |                 |                 |



## Exercise III - PipeLine1, Cont.

#### Simulation





# Exercise IV – PipeLine2

### Two-stage pipeline



[Project Name] /yuv\_dsp4



# Exercise IV – PipeLine2, Cont.

### Summary

| Flow Summary                       |                                           |
|------------------------------------|-------------------------------------------|
| Flow Status                        | Successful - Mon Jul 19 14:14:56 2010     |
| Quartus II Version                 | 10.0 Build 218 06/27/2010 SJ Full Version |
| Revision Name                      | miatc3x                                   |
| Top-level Entity Name              | yuv_converter                             |
| Family                             | Cyclone III                               |
| Device                             | EP3C25F256C8                              |
| Timing Models                      | Final                                     |
| Met timing requirements            | Yes                                       |
| Total logic elements               | 323 / 24,624 ( 1 % )                      |
| Total combinational functions      | 289 / 24,624 ( 1 % )                      |
| Dedicated logic registers          | 253 / 24,624 ( 1 % )                      |
| Total registers                    | 253                                       |
| Total pins                         | 50 / 157 ( 32 % )                         |
| Total virtual pins                 | 0                                         |
| Total memory bits                  | 0 / 608,256 ( 0 % )                       |
| Embedded Multiplier 9-bit elements | 0 / 132 ( 0 % )                           |
| Total PLLs                         | 0/4(0%)                                   |

Timing Analyzer Summary

|     | Type                         | Slack | Required Time | Actual Time                      | From            | To              |
|-----|------------------------------|-------|---------------|----------------------------------|-----------------|-----------------|
| 1   | Worst-case tsu               | N/A   | None          | 3.314 ns                         | yuvc_i_rdata[4] | r_rd[4]         |
| 2   | Worst-case tco               | N/A   | None          | 10.547 ns                        | r_vd[1]         | yuvc_o_vdata[1] |
| 3   | Worst-case th                | N/A   | None          | 0.242 ns                         | yuvc_i_gdata[2] | r_gd[2]         |
| 4   | Clock Setup: 'yuvc_i_clock'  | N/A   | None          | 232.40 MHz ( period = 4.303 ns ) | r_bd[6]         | r_ump3[12]      |
| 5   | Total number of failed paths |       |               |                                  |                 |                 |



## Exercise IV – PipeLine2, Cont.

#### Simulation





## Exercise V – Without Multiplication

### Use Adder to Implement

| 31 | Fixed Point - bin | R           | G           | B           |
|----|-------------------|-------------|-------------|-------------|
| 32 | Y                 | 1 01000010  | 1 10000010  | 1 00011010  |
| 33 | U                 | 1 01110001  | -1 01011111 | -1 00010011 |
| 34 | V                 | -1 00100110 | -1 01001011 | 1 01110001  |

```
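-- Y products by shift-and-add; assumes 8-bit inputs and 16-bit w_ymp* signals, so every operand is padded to 16 bits.
w_ymp1 <= ("00"&r_rd&"000000") + ("0000000"&r_rd&"0");  -- 66*R  = 64*R + 2*R
w_ymp2 <= ("0"&r_gd&"0000000") + ("0000000"&r_gd&"0");  -- 130*G = 128*G + 2*G
w_ymp3 <= (("0000"&r_bd&"0000") + ("00000"&r_bd&"000")) + ("0000000"&r_bd&"0");  -- 26*B = 16*B + 8*B + 2*B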
```
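The shift-and-add decomposition follows directly from the binary form of each coefficient: every 1 bit contributes one shifted copy of the input. A minimal Python sketch (the helper names `set_bit_positions` and `shift_add_multiply` are hypothetical) checks this for the three Y coefficients:

```python
def set_bit_positions(constant):
    """Bit positions that are 1 in the constant, lowest first."""
    return [i for i in range(constant.bit_length()) if (constant >> i) & 1]


def shift_add_multiply(constant, x):
    """Multiply by a constant using only left shifts and additions."""
    return sum(x << i for i in set_bit_positions(constant))


for name, c in [("0x42", 0x42), ("0x82", 0x82), ("0x1A", 0x1A)]:
    shifts = set_bit_positions(c)
    ok = all(shift_add_multiply(c, x) == c * x for x in range(256))
    print(f"{name}: shifts {shifts}, adders {len(shifts) - 1}, matches multiply: {ok}")
```

The number of adders per coefficient is the number of 1 bits minus one, so coefficients with many set bits (such as 0x5F) cost more logic than sparse ones (such as 0x42).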



# Exercise V – Without Multiplication, Cont.

### Summary

| Flow Status                        | Successful - Mon Jul 19 16:18:36 2010     |
|------------------------------------|-------------------------------------------|
| Quartus II Version                 | 10.0 Build 218 06/27/2010 SJ Full Version |
| Revision Name                      | miatc3x                                   |
| Top-level Entity Name              | yuv_converter                             |
| Family                             | Cyclone III                               |
| Device                             | EP3C25F256C8                              |
| Timing Models                      | Final                                     |
| Met timing requirements            | Yes                                       |
| Total logic elements               | 284 / 24,624 ( 1 % )                      |
| Total combinational functions      | 284 / 24,624 ( 1 % )                      |
| Dedicated logic registers          | 48 / 24,624 ( < 1 % )                     |
| Total registers                    | 48                                        |
| Total pins                         | 50 / 157 ( 32 % )                         |
| Total virtual pins                 | 0                                         |
| Total memory bits                  | 0 / 608,256 ( 0 % )                       |
| Embedded Multiplier 9-bit elements | 0 / 132 ( 0 % )                           |
| Total PLLs                         | 0/4(0%)                                   |

Timing Analyzer Summary

|                         | Type                         | Slack | Required Time | Actual Time                      | From            | To              |
|-------------------------|------------------------------|-------|---------------|----------------------------------|-----------------|-----------------|
| 1                       | Worst-case tsu               | N/A   | None          | 3.470 ns                         | yuvc_i_bdata[3] | r_bd[3]         |
| 2                       | Worst-case tco               | N/A   | None          | 9.581 ns                         | r_yd[6]         | yuvc_o_ydata[6] |
| 3                       | Worst-case th                | N/A   | None          | 0.435 ns                         | yuvc_i_gdata[2] | r_gd[2]         |
| 4                       | Clock Setup: 'yuvc_i_clock'  | N/A   | None          | 97.65 MHz ( period = 10.241 ns ) | r_gd[1]         | r_ud[7]         |
| 5                       | Total number of failed paths |       |               |                                  |                 |                 |



## Exercise V – Without Multiplication, Cont.

#### Simulation Result





## Note: