# SD-FEC Evaluation

> This notebook offers an environment to explore the Soft Decision Forward Error Correction (SD-FEC) IPs in RFSoC using the ZCU111 board. Based on work by Andy Dow (Xilinx, Edinburgh), it allows us to play with a set of configurable blocks including:
>
>  1. A data source with support for BPSK, QPSK, QAM-16, and QAM-64 modulation schemes
>
>  2. A pair of SD-FEC encoder/decoder blocks with configurable LDPC codes
>
>  3. An additive white Gaussian noise (AWGN) channel model with parameterisable noise power
>
> We'll quickly get some classic bit error rate curves from the hardware, then investigate how these change with different modulation schemes and LDPC codes, and finish with a look at some performance metrics.

## Contents

 * [SD-FEC Evaluation](#SD-FEC-Evaluation)
   + [SD-FEC refresher](#SD-FEC-refresher)
   + [Loading the design](#Loading-the-design)
   + [Getting a simple BER curve](#Getting-a-simple-BER-curve)
   + [Comparing modulation schemes](#Comparing-modulation-schemes)
   + [Comparing LDPC codes](#Comparing-LDPC-codes)
   + [A note on performance](#A-note-on-performance)
   + [Summary](#Summary)

## SD-FEC refresher

The ZCU111 has 8 SD-FEC integrated blocks that we can use to enable our RF systems to function under non-ideal, noisy environments.

The SD-FEC blocks support Low Density Parity Check (LDPC) decoding and encoding, as well as the turbo code decoding used in LTE. We'll focus on LDPC codes for now since we can encode *and* decode these using an SD-FEC block. These codes are configurable from software, as we'll see [later](#Comparing-LDPC-codes).

An LDPC code is defined by a sparse parity check matrix. Let's take a look at a graphical representation of what this means:

![](assets/ldpc_fourney.svg)

Here the row of `=` blocks represents the original data bits, the `+` blocks represent the parity bits, and the code dictates the number of blocks and their interconnects.
Note that most data bits contribute to multiple parity bits. Upon detecting error(s), multiple parity bits can be used to iteratively retrieve the original data. This iterative decode process can terminate early if we detect a valid codeword.
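
To make the parity-check idea concrete, here is a minimal sketch using a toy (7,4) Hamming code instead of a real LDPC code. Its matrix is tiny and not actually low density, and the names `H` and `syndrome` are illustrative only (they are not part of the SD-FEC driver). A word is a valid codeword exactly when every parity check sums to zero modulo 2, which is the same test an iterative decoder can use to terminate early.

```python
import numpy as np

# Toy parity-check matrix for a (7,4) Hamming code (illustrative, not an LDPC code).
# Each row is one parity check (a `+` block); each column is one code bit.
H = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
])

def syndrome(word):
    """Evaluate every parity check on a 7-bit word; all zeros means 'valid'."""
    return H.dot(word) % 2

codeword = np.array([1, 1, 1, 0, 0, 0, 0])   # satisfies all three checks

received = codeword.copy()
received[4] ^= 1                             # flip one bit to model a channel error

print('clean word   :', syndrome(codeword))
print('corrupted word:', syndrome(received))
```

The flipped bit takes part in two of the three checks, so two checks fail at once. Real LDPC decoders exploit this kind of overlap, with far larger and sparser matrices and with soft (probabilistic) inputs.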

For some further reading on LDPC codes, take a look at Bernhard M.J. Leiner's excellent [tutorial](http://www.bernh.net/media/download/papers/ldpc.pdf). You might want to save this reading for later though — some of our upcoming SD-FEC tests take a few minutes to execute!

## Loading the design

We'll first load the bitstream and our supporting Python library.

In [None]:
from rfsoc_sdfec import SdFecOverlay, ModType
ol = SdFecOverlay()

This design includes a complete datapath with a pair of SD-FEC encoding/decoding blocks, as pictured below.
![](assets/sd-fec-eval.svg)

Let's have a look at what we can do with this design. Take a look at the most important method we expose, `run_block`:

In [None]:
ol.run_block?

To run a block of data through the signal path, we must supply configurations for the source, SD-FEC, and channel model. Let's take the time to define a set of default parameters.

In [None]:
base_params = lambda : dict(
    source_params = dict(
        mod_type   = ModType.BPSK,
        zero_data  = False,
        num_blocks = 5000,
    ),
    fec_params = dict(
        code_name    = 'docsis_short',
        max_iter     = 8,
        term_on_pass = True,
    ),
    channel_params = dict(
        snr       = 5.0,
        skip_chan = False,
    ),
)

## Getting a simple BER curve

First of all, let's try to run a single block of data through the signal path. We ask `base_params` for a set of parameters, and pass it to the overlay.

In [None]:
ol.run_block(**base_params())

To be clear, we've just used two of the SD-FEC blocks present in the RFSoC! We've asked the overlay to push 5000 blocks of random data through an SD-FEC encoder, through a noisy channel using BPSK modulation, and back through an SD-FEC decoder. The size of each block depends on the LDPC code selected but in this case, we've just sent $\approx$40 Mb through the data path.
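
For intuition about what the channel does to the raw signal, here is a small software sketch of the unprotected part of that path: random bits, BPSK mapping, additive white Gaussian noise, and a hard decision. It involves no FEC and no hardware. The function name `simulate_bpsk_awgn` is hypothetical, and the SNR is assumed to mean signal power over noise power for unit-power symbols with real-valued noise, which may not match the overlay's own definition.

```python
import numpy as np

def simulate_bpsk_awgn(snr_db, num_bits=200_000, seed=0):
    """Estimate the raw (uncoded) BER of BPSK over an AWGN channel.

    Assumption: SNR = signal power / noise power, with unit-power symbols
    and real-valued Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, size=num_bits)

    symbols = 1.0 - 2.0 * bits                        # map 0 -> +1, 1 -> -1
    noise_std = np.sqrt(1.0 / 10 ** (snr_db / 10))    # noise power = 1 / SNR
    received = symbols + rng.normal(0.0, noise_std, size=num_bits)

    decided = (received < 0).astype(int)              # hard decision
    return np.mean(decided != bits)

raw_ber = simulate_bpsk_awgn(5.0)
print(f'Raw BER at 5 dB (simulated, uncoded): {raw_ber:.4f}')
```

Under these assumptions a few percent of bits arrive corrupted at 5 dB, which is the kind of raw error rate the decoder has to clean up.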

There are many statistics we could look at. These include:
  * Bit Error Rate (BER) and Frame Error Rate (FER) of the final signal *after* SD-FEC decoding
  
  * BER and FER of the raw received signal *before* SD-FEC decoding
  
  * Throughput of the SD-FEC encoding and decoding in Gb/s
  
  * Average iterations needed for SD-FEC decoding (remember, the decoder can exit early)
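
As a reminder of how the first two metrics are defined, the sketch below computes BER and FER from arrays of sent and received bits. The helper `error_rates` is hypothetical and the tiny arrays are made up. The overlay reports these statistics itself, so this is only to pin down the definitions.

```python
import numpy as np

def error_rates(sent, received):
    """Return (BER, FER) for 2-D bit arrays with one row per block (frame)."""
    errors = sent != received
    ber = errors.mean()                # fraction of all bits that differ
    fer = errors.any(axis=1).mean()    # fraction of frames with at least one error
    return ber, fer

# Four frames of eight bits, with three bit errors spread over two frames
sent = np.zeros((4, 8), dtype=int)
received = sent.copy()
received[1, 2] = 1
received[1, 5] = 1
received[3, 0] = 1

ber, fer = error_rates(sent, received)
print(f'BER = {ber}, FER = {fer}')
```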
  
Let's now run a set of tests, sweeping the SNR of the channel from low (noisy channel) to high (clean channel), and see how the bit error rate is affected.

In [None]:
import numpy  as np            # Math functions
import pandas as pd            # DataFrame for storing results
import tqdm.notebook # Progress bars

# Define a progress bar helper
bar = lambda itr, desc: tqdm.notebook.tqdm(
    itr,
    desc=desc,
    bar_format='{desc:<5}{bar} {percentage:3.0f}% {r_bar}'
)

params = base_params()
rows = []  # One entry of stats per run

for snr in bar(np.arange(3, 5.5, step=0.25), 'SNR Loop'):
    params['channel_params']['snr'] = snr
    rows.append(ol.run_block(**params))

# DataFrame.append was removed in pandas 2.0, so build the table in one go
results = pd.DataFrame(rows)

We can inspect the results as a table (with the `pandas` library). 

In [None]:
results

Now let's plot this using `plotly`, hopefully getting that classic BER curve!

In [None]:
import plotly.express as px

px.line(
    results, x='snr', y='ber',                                    # Data config
    labels = {'snr': 'SNR (dB)', 'ber': 'Bit error probability'}, # Label config
    template ='log_plot', height=400                              # Appearance
)

The plot shows that as the SNR increases (our signal gets less noisy), a bit is less likely to be corrupted, so the bit error rate decreases.
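
For comparison, the textbook error rate of *uncoded* BPSK over an AWGN channel can be computed in closed form. The sketch below assumes the SNR is signal power over noise power with real-valued noise, giving $\text{BER} = \tfrac{1}{2}\,\mathrm{erfc}\left(\sqrt{\text{SNR}/2}\right)$. The overlay may define SNR differently, so treat the numbers as a rough reference and not as a prediction of the hardware results. The function name `bpsk_uncoded_ber` is hypothetical.

```python
import math

def bpsk_uncoded_ber(snr_db):
    """Theoretical BER of uncoded BPSK over AWGN.

    Assumption: SNR = signal power / noise power with real-valued noise,
    so BER = Q(sqrt(SNR)) = 0.5 * erfc(sqrt(SNR / 2)).
    """
    snr = 10 ** (snr_db / 10)
    return 0.5 * math.erfc(math.sqrt(snr / 2))

for snr_db in (3.0, 4.0, 5.0):
    print(f'{snr_db:.1f} dB -> uncoded BER {bpsk_uncoded_ber(snr_db):.4f}')
```

The uncoded curve falls only gently over this SNR range, which gives a baseline against which to judge the LDPC-decoded curve measured on the board.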

## Comparing modulation schemes

The next step is to run the BER vs SNR test for different modulation schemes and compare the results. Here we send over 200 different 40 Mb blocks with IP configuration and stats recovery in between.
Because this test will take just over 3 minutes to run, now would be a good time to take a short break. You could also read a little more about the [LDPC codes](http://www.bernh.net/media/download/papers/ldpc.pdf) we're using here... and at least we're not waiting on a software implementation of the same codes!

In [None]:
params = base_params()
rows = []  # One entry of stats per run
mod_schemes = [ModType.BPSK, ModType.QPSK, ModType.QAM16, ModType.QAM64]

for mod_type in bar(mod_schemes, 'Modulation type loop'):
    params['source_params']['mod_type'] = mod_type
    for snr in bar(np.arange(3, 16, step=0.25), f'{mod_type.name} SNR Loop'):
        params['channel_params']['snr'] = snr
        rows.append(ol.run_block(**params))

# DataFrame.append was removed in pandas 2.0, so build the table in one go
results = pd.DataFrame(rows)

results.to_csv('assets/ber_data.csv', mode='w', index=False)

Let's plot the results as before, but giving each modulation scheme a line with a unique colour. Note that we're only plotting BER test results that are statistically significant(ish), i.e. we ignore runs with fewer than a minimum number of bits in error.
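
The filter itself is an ordinary `pandas` query on the `_bit_errors` column. Here is a minimal sketch on a made-up table (the numbers are invented purely for illustration and are not measurements):

```python
import pandas as pd

# Invented values for illustration only; real rows come from ol.run_block
toy = pd.DataFrame({
    'snr':         [3.0, 4.0, 5.0, 6.0],
    '_bit_errors': [12000, 350, 4, 0],
    'ber':         [3e-3, 8.75e-5, 1e-6, 0.0],
})

# Keep only runs where enough bit errors were observed to trust the estimate
significant = toy.query('_bit_errors > 5')
print(significant)
```

Dropping these rows also removes runs with a BER of exactly zero, which cannot be drawn on a logarithmic axis.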

In [None]:
px.line(results.query('_bit_errors>5'), x='snr', y='ber', color='mod_type',
        labels = {'snr': 'SNR (dB)', 'ber': 'Bit error probability'},
        category_orders={"mod_type": ['BPSK', 'QPSK', 'QAM16', 'QAM64']},
        range_y = (-4.5, -0.4), template='log_plot', height=400
        )

Notice that the legend in this plot is interactive! You can use it to select which modulation schemes are visible (single click an entry to hide it; double click to hide all others). We can see from the graph that QAM-64 needs the highest SNR to meet a fixed/acceptable BER, followed by QAM-16 and finally QPSK & BPSK.

This matches our intuition: in general, the more complex modulation schemes are used to transmit more information in a given bandwidth.  Consequently, they are more susceptible to errors in the presence of noise.
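
One way to quantify this intuition is to compare how closely the constellation points are packed once every scheme is scaled to the same average symbol power. The sketch below is a standard textbook calculation, not something taken from this design. The helper `min_distance` is hypothetical, and how the hardware normalises SNR across schemes is not stated here, so the comparison is only indicative.

```python
import math

def min_distance(bits_per_symbol):
    """Minimum distance between points of a unit-average-power constellation.

    bits_per_symbol = 1 is BPSK; even values are square QAM
    (2 -> QPSK, 4 -> QAM-16, 6 -> QAM-64).
    """
    if bits_per_symbol == 1:
        return 2.0                                   # points at -1 and +1
    levels_per_axis = 2 ** (bits_per_symbol // 2)
    # Average power of a square grid with odd-integer levels (±1, ±3, ...)
    avg_power = 2 * (levels_per_axis ** 2 - 1) / 3
    return 2 / math.sqrt(avg_power)

for name, bits in [('BPSK', 1), ('QPSK', 2), ('QAM16', 4), ('QAM64', 6)]:
    print(f'{name:<6} {bits} bit(s)/symbol, min distance {min_distance(bits):.3f}')
```

The more bits each symbol carries, the closer its neighbours sit, so a smaller noise excursion is enough to push a received sample into the wrong decision region.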

Let's continue by looking at some of the other statistics available to us. We'll plot four subplots showing different stats vs SNR.

In [None]:
from plotly import subplots
import plotly.offline as po

sub_plot = lambda results, y_field: px.line(
    results,  x='snr', y=y_field, color='mod_type',
    category_orders={"mod_type": ['BPSK', 'QPSK', 'QAM16', 'QAM64']},
    template='log_plot'
)

traces = [('Bit Error Rate'    , 'Error probability'        , 'ber'           , 'log'      , '_bit_errors>5', 1, 1,  False),
          ('Average Iterations', 'Iterations'               , 'dec_avg_iters' , 'linear'   , None           , 1, 2,  False),
          ('Frame Error Rate'  , 'Error Probability'        , 'fer'           , 'log'      , '_bit_errors>5', 2, 1,  False),
          ('Decoder Throughput', 'Decoder Throughput (Gb/s)', 'dec_throughput', 'linear'   , None           , 2, 2,  True )]
#          Plot title             Y-axis title                Y data field      Y-axis type  Query filter     Plot#  Show legend? 

def matrix_plot(sub_plot, traces):
    fig = subplots.make_subplots(rows=2, cols=2, subplot_titles=list(map(lambda s:s[0], traces)), print_grid=False)

    for _, y_title, y_field, y_scale, query, index_v, index_h, legend in traces:
        trace_dataset = results if query is None else results.query(query)
        for trace in sub_plot(trace_dataset, y_field).data:
            trace.showlegend = legend
            subplot_name = str(index_h+2*(index_v-1))
            x_axis = getattr(fig.layout, 'xaxis'+subplot_name)
            x_axis.title = 'SNR (dB)'
            y_axis = getattr(fig.layout, 'yaxis'+subplot_name)
            y_axis.type=y_scale
            y_axis.exponentformat = 'power' if y_scale == 'log' else 'none'
            y_axis.title=y_title
            fig.append_trace(trace, index_v, index_h)
            
    fig.layout.template = 'log_plot'
    fig.layout.height = 500
    po.iplot(fig)

matrix_plot(sub_plot, traces)

There are three patterns worth noting here:

  1. The bit error rate has a direct effect on the frame error rate (this at least shows that the errors are evenly distributed between frames)
  
  2. The average number of iterations starts at our maximum but drops as SNR increases
  
  3. The decoder throughput *in this design* depends on a couple of factors, including:
      * The average number of iterations — also influenced by the SNR.
      
      * The modulation scheme (QAM-64 transmits 6 bits of information with each symbol whereas BPSK only transmits 1 bit). Our channel model takes in symbols at a fixed rate, so the QPSK and BPSK curves are actually limited by the channel model and not by the FEC decoder. The QAM-16 curve, however, *is* limited by the FEC decoder block.

The SNR of our signal not only impacts the BER we can achieve, but also the maximum throughput of the system. At the risk of being a bit too gimmicky, let's plot this relationship as a 3D scatter plot.

In [None]:
px.scatter_3d(results, x='snr', y='ber', z='dec_throughput', color='mod_type',
              labels = {'snr': 'SNR (dB)', 'ber': 'Bit error probability', 'dec_throughput': 'Throughput (Gb/s)'},
              category_orders={"mod_type": ['BPSK', 'QPSK', 'QAM16', 'QAM64']},
              template='log_plot', height=500)

## Comparing LDPC codes

One final parametric sweep we might want to look at is for testing different LDPC codes. We expect to see some trade-offs between throughput and BER performance.

Let's have a quick look at how to configure these LDPC codes with the PYNQ SdFec driver. This driver is a Python wrapper around the [existing baremetal driver](https://github.com/Xilinx/embeddedsw/tree/release-2018.3/XilinxProcessorIPLib/drivers/sd_fec), with a few extra conveniences. Because of the way we parse bitstream metadata, the SdFec driver can extract all LDPC code parameters that have been preloaded in Vivado. We can now set up different codes by name rather than large C structures.

We can ask the SdFec driver for a full list of LDPC codes preloaded in this design. There are many codes so let's only look at the first 5 or so:

In [None]:
ol.sd_fec_dec.available_ldpc_params()[:5]

Now let's put a new code at the start of the SD-FEC decoder's look-up tables.

In [None]:
ol.sd_fec_dec.add_ldpc_params(0, 0, 0, 0, 'wifi802_11_cr1_2_1296')

We can iterate a test over a subset of the available codes.

In [None]:
params = base_params()
params['fec_params']['max_iter'] = 16
params['source_params']['mod_type'] = ModType.QAM16
rows = []  # One entry of stats per run
ldpc_codes = ['docsis_short', 'wifi802_11_cr2_3_1296', '5g_graph1_set1_l46_p32']

# See ol.sd_fec_dec.available_ldpc_params() for a full list of included LDPC codes
for ldpc_code in bar(ldpc_codes, 'Code loop'):
    params['fec_params']['code_name'] = ldpc_code
    for snr in bar(np.arange(0, 10, step=0.5), f'{ldpc_code} SNR loop'):
        params['channel_params']['snr'] = snr
        rows.append(ol.run_block(**params))

# DataFrame.append was removed in pandas 2.0, so build the table in one go
results = pd.DataFrame(rows)

Let's generate some plots with a unique colour for each LDPC code.

In [None]:
sub_plot = lambda results, y_field: px.line(
    results,  x='snr', y=y_field, color='code_name',
    template='log_plot', height = 500
)

matrix_plot(sub_plot, traces)

The selected codes show dramatic differences in performance! Note in particular the trade-off between throughput and bit error rate, a classic balancing act. The 5G code has a substantially lower throughput at these SNR values but is the clear winner in terms of error correction.

## A note on performance

Note that the encoder typically has a much higher throughput than the decoder. This design includes a FIFO that buffers some of the encoded data, but if it becomes full the encoder IP is throttled. Therefore, to measure the throughput of the encoder, the number of codeblocks run through the system should be limited such that the encoded data FIFO does not fill. Generally, limiting the number of blocks to 100 will ensure this.

Also note that the channel model throughput is limited by the modulation type selected. The maximum throughput supported is using QAM-64 modulation (6 bits per symbol). With the channel model's 4 symbol wide input, this gives a maximum throughput of:
$$ 4 \times 6\ bits \times 300\ MHz = 7.2\ Gb/s $$
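
The same arithmetic gives the channel-model ceiling for the other modulation schemes. This is a small sketch using the 4-symbol input width and 300 MHz clock quoted above; the constant and function names are illustrative only.

```python
SYMBOLS_PER_CYCLE = 4      # channel model input width, in symbols
CLOCK_HZ = 300e6           # channel model clock frequency
BITS_PER_SYMBOL = {'BPSK': 1, 'QPSK': 2, 'QAM16': 4, 'QAM64': 6}

def channel_limit_gbps(mod_name):
    """Maximum channel-model throughput in Gb/s for a modulation scheme."""
    return SYMBOLS_PER_CYCLE * BITS_PER_SYMBOL[mod_name] * CLOCK_HZ / 1e9

for mod_name in BITS_PER_SYMBOL:
    print(f'{mod_name:<6} {channel_limit_gbps(mod_name):.1f} Gb/s')
```

These ceilings are lowest for BPSK and QPSK, since each symbol carries the fewest bits.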

Let's run a small test and inspect the encoder and decoder throughputs.

In [None]:
params = base_params()
params['source_params']['mod_type'] = ModType.QAM64
params['source_params']['num_blocks'] = 100
params['channel_params']['snr'] = 16.0

results = ol.run_block(**params)
enc_tp = results['enc_throughput']
dec_tp = results['dec_throughput']
print(f'Encoder throughput: {enc_tp} Gb/s \t Decoder throughput: {dec_tp} Gb/s')

It's also good to note that these throughput stats for the 'docsis_short' code agree quite closely with the [official documentation](https://www.xilinx.com/support/documentation/ip_documentation/pl/sd-fec-throughput-latency.html#toc8).

## Summary

In this notebook we've:

  * Used PYNQ to interact with the SD-FEC blocks present on the RFSoC
  * Looked at profiling the performance of the SD-FEC encoder and decoder blocks
  * Taken an example SD-FEC design and demonstrated the benefits of Python productivity:
    + Performed parametric sweeps of SNR, modulation scheme, and LDPC codes...
    + with interactive visualisations of the results...
    + helping to learn about the relationship between the parameters and different performance metrics

This design is open source and available [here](https://github.com/Xilinx/SDFEC-PYNQ).