

# Design and Implementation of an LDPC-based FEC encoder/decoder suitable for Storage devices

#### 15. November 2018

**Ruhr University of Bochum** 

Chair for Embedded Systems for Information Technology

Henry Bathke



- 1 Motivation
- 2 LDPC Codes
- 3 Encoding
- 4 Decoding
- 5 Performance
- 6 Results
- 7 Conclusion and Outlook

- 1 Motivation
- 2 LDPC Codes
- 3 Encoding
- 4 Decoding
- 5 Performance
- 6 Results
- 7 Conclusion and Outlook

- Build a Hardware encoder and decoder suitable for storage devices
- Create a system for testing the encoder and decoder

- 1 Motivation
- 2 LDPC Codes
- 3 Encoding
- 4 Decoding
- 5 Performance
- 6 Results
- 7 Conclusion and Outlook

## Low Density Parity Check (LDPC) Codes



- Multiple parity check equations
- Each bit is contained in multiple equations
- With these equations we can correct errors

## Low Density Parity Check (LDPC) Codes

- Multiple parity check equations
- Each bit is contained in multiple equations
- With these equations we can correct errors
- We construct the check matrix with the check equations



## Quasi Cyclic Low Density Parity Check (LDPC)

- High complexity of LDPC codes
- Reduce complexity by adding structure to the PCM
- Split PCM into submatrices
- Only allow shifted version of a submatrix

$$\mathbf{B} = \begin{bmatrix} -1 & 0 & 2 & -1 \\ 0 & 1 & -1 & -1 \\ -1 & 1 & 0 & 2 \end{bmatrix}$$

- 1 Motivation
- 2 LDPC Codes
- 3 Encoding
- 4 Decoding
- 5 Performance
- 6 Results
- 7 Conclusion and Outlook

- Is usually done with generator matrix
- The generator matrix is dense due to the inversion
- With long codes the dense matrix multiplication is large
- Use transforms on the PCM to convert it into a more desireable form



- $\blacksquare$  Reach minimum gap g by doing only row and column permutations
- Only need an inverted matrix of size  $g \times g$



- Only large sparse matrix multiplication and back substitution
- One small dense matrix multiplication

| Operation                                                                                                                       | Туре                              |
|---------------------------------------------------------------------------------------------------------------------------------|-----------------------------------|
| $\mathbf{A}\mathbf{s}^{T}$                                                                                                      | sparse multiplication             |
| $T^{-1}\mathbf{A}\mathbf{s}^{T}$                                                                                                | sparse back substitution          |
| $-\mathbf{E}\mathbf{T}^{-1}\mathbf{A}\mathbf{s}^{T}$                                                                            | sparse multiplication             |
| $\mathbf{C} \mathbf{s}^{T}$                                                                                                     | sparse multiplication             |
| $\left(-\boldsymbol{E}\boldsymbol{T}^{-1}\boldsymbol{A}\boldsymbol{s}^{T}\right)+\left(\boldsymbol{C}\boldsymbol{s}^{T}\right)$ | vector addition                   |
| $\phi^{-1}\left(-\boldsymbol{E}\boldsymbol{T}^{-1}\boldsymbol{A}\boldsymbol{s}^{T}+\boldsymbol{C}\boldsymbol{s}^{T} ight)$      | dense $g \times g$ multiplication |

- Implemented as combinatorial logic
- Connects to the other modules with an axi stream bus
- Repacking is needed as bit counts dont evenly divide
- Encoder is generated from the QC PCM with Python scripts

For wifi ldpc with rate 0.5 it looks like this



- 1 Motivation
- 2 LDPC Codes
- 3 Encoding
- 4 Decoding
- 5 Performance
- 6 Results
- 7 Conclusion and Outlook

- I implemented a message passing decoder
- Messages are passed along the edges on the tanner graph
- Computations are done on the nodes



## **Decoding**

$$r_{mn} = \left(\prod_{n' \in M(m) \setminus n} \operatorname{sign}(q_{n'm})\right) \min_{n' \in M(m) \setminus n} (|q_{n'm}|)$$

$$L_n = y_n + \sum_{m \in N(n)} r_{m'n}$$

$$q_{nm} = y_n + \sum_{m' \in N(n) \setminus m} r_{m'n}$$

- We have to implement these equations
- Can exploit similarities

## **Decoding**

$$r_{mn} = \left(\prod_{n' \in M(m) \setminus n} \operatorname{sign}(q_{n'm})\right) \min_{n' \in M(m) \setminus n} (|q_{n'm}|)$$

$$L_n = y_n + \sum_{m \in N(n)} r_{m'n}$$

$$q_{nm} = L_n - r_{mn}$$

- two pass algorithm
- intermediate values stored im memory



## RUB

## **Hardware Decoding**











4

## **Hardware Decoding**

8

#### Running minimum

Shifted input

7 5 9 2 4 2 8

8

#### Running minimum

Shifted input

5 9 2 4 2 8

#### Running minimum

5 9

8

Shifted input

4 2 8

#### Running minimum

4 1 6 3 8 7 5 9 2 4 2 8

Shifted input

- 1 Motivation
- 2 LDPC Codes
- 3 Encoding
- 4 Decoding
- 5 Performance
- 6 Results
- 7 Conclusion and Outlook

- Used a ZedBoard
- Testing requires an encoder, channel, decoder setup
- AWGN channel
- Whole performance test is implemented on Hardware
- ARM cpu calculates test parameters and reads the error counts



## **Performance Testing**

**Block Design Implementation** 



## **Performance Testing**

**Block Design Implementation** 



- 1 Motivation
- 2 LDPC Codes
- 3 Encoding
- 4 Decoding
- 5 Performance
- 6 Results
- 7 Conclusion and Outlook

#### **Simulation Results**



#### Simulation Results



- Decoder expects LLR
- Convert channel output to LLR











#### **Hardware Results**

Normalized Min Sum



- 1 Motivation
- 2 LDPC Codes
- 3 Encoding
- 4 Decoding
- 5 Performance
- 6 Results
- 7 Conclusion and Outlook

#### Conclusion

- LDPC codes can be implemented on hardware
- My Implementation has problematic error floor

#### Outlook

- Look at power consumption
- Optimize area utilization
- Utilize reconfiguration / change instructions

## Questions?