ANATOLIY MARTYNYUK

# Custom SEE Tolerant Aurora Gearbox

#### Overview

- The YARR rx receives serial data from the RD53B across multiple lanes, with a 2-bit "delimiter" marking the boundaries of individual data blocks.
- Single Event Effects (SEEs) can corrupt the data on the output of the RD53B so that bits are dropped, added, or flipped within the data stream, resulting in a loss of synchronization between the RD53B and YARR.
- The YARR rx has a sync loss detection and recovery scheme; however, its inefficiencies cause unnecessary data loss, and it can be improved.
- The new recovery scheme aims to remove redundancy and complexity, as well as add parallelism, to improve sync recovery performance.
- The new recovery scheme ranges from moderate improvements at no cost in resources to drastic improvements with an investment in device resources.

### 64b/66b Communication

- The RD53B and YARR DAQ communicate via the 64b/66b Aurora encoding. Up to four channels transmit data simultaneously.
- Aurora encoding packs 64 bits of scrambled data together with 2 header bits, which can be either "01" or "10".
- The header bits are used to determine alignment and are expected to be seen every 66 bits.
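The framing described above can be sketched in a few lines of Python. This is only an illustrative model, not the YARR firmware: `make_block` and `header_is_valid` are hypothetical helper names, and random bits stand in for the scrambled payload because the scrambler itself is not modeled.

```python
import random

HEADERS = ((0, 1), (1, 0))  # the two valid 2-bit headers, "01" and "10"

def make_block(header, payload):
    """Prepend a 2-bit header to a 64-bit payload, giving one 66-bit block."""
    if tuple(header) not in HEADERS:
        raise ValueError("header must be 01 or 10")
    if len(payload) != 64:
        raise ValueError("payload must be 64 bits")
    return list(header) + list(payload)

def header_is_valid(bits, offset):
    """A header is valid when its two bits differ (01 or 10)."""
    return bits[offset] != bits[offset + 1]

rng = random.Random(0)  # assumption: random bits stand in for scrambled data
stream = []
for n in range(4):
    payload = [rng.randint(0, 1) for _ in range(64)]
    stream += make_block(HEADERS[n % 2], payload)

# An aligned receiver finds a valid header at every multiple of 66 bits.
aligned = all(header_is_valid(stream, i) for i in range(0, len(stream), 66))
print(len(stream), aligned)  # 264 True
```

Note that a valid-looking pair at some offset does not prove alignment, since any two adjacent payload bits differ about half the time.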



### Modeling SEEs

Single Event Effects (SEEs) are radiation-induced disturbances that can result in bit flips or glitches.

- SEUs (Single Event Upsets): A bit flip in a memory register, modeled as a flip of one or more transmitted bits.
- SETs (Single Event Transients): A voltage glitch on a line, which results in either a captured glitch (bit flip) or bits being added to or dropped from the transmission.
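As a minimal sketch of this fault model, the three kinds of corruption can be written as operations on a list of bits. The function names are hypothetical and the model is deliberately simple: one event touches one bit position.

```python
def flip_bit(bits, i):
    """SEU or captured SET: invert the bit at index i; stream length is unchanged."""
    out = list(bits)
    out[i] ^= 1
    return out

def drop_bit(bits, i):
    """SET that swallows a bit: the stream becomes one bit shorter."""
    return list(bits[:i]) + list(bits[i + 1:])

def add_bit(bits, i, value):
    """SET that inserts a spurious bit before index i: the stream grows by one bit."""
    return list(bits[:i]) + [value] + list(bits[i:])

block = [0, 1] + [0] * 64  # one 66-bit block: header "01" plus a dummy payload
print(len(flip_bit(block, 10)), len(drop_bit(block, 10)), len(add_bit(block, 10, 1)))
# 66 65 67
```

Only the drop and add cases change the stream length, which is why they always shift every later header.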



## Consequences of SEEs for the YARR

- Once a data block is corrupted by an SEE, it is lost and cannot be recovered.
- Bit adds, bit drops, and certain bit flips result in header misalignment.
- An effective solution minimizes the downtime and blocks lost during alignment resynchronization.

| rx expectation        | h | h | 64 bits scrambled data | h    | h |       | 64 bits scrambled data |
|-----------------------|---|---|------------------------|------|---|-------|------------------------|
| rx w/ header bit flip | h | h | 64 bits scrambled data | ~h   | h | X     | 64 bits scrambled data |
| rx w/ drop            | h | h | 63 bits scrambled data | h    |   |       | 64 bits scrambled data |
| rx w/ add             | h | h | 65 bits scrambled data |      | h | ( h ) | 64 bits scrambled data |
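The drop and add rows of the table can be reproduced with a small label-based sketch: every stream position is tagged `h` (header bit) or `d` (data bit), and the receiver's view is whatever sits where it expects the next header. The helper names and the position of the fault (index 10) are arbitrary choices for illustration.

```python
BLOCK_BITS = 66

def labeled_stream(n_blocks):
    """Tag each position: 'h' for a header bit, 'd' for a payload bit."""
    return (["h", "h"] + ["d"] * 64) * n_blocks

def rx_view(labels, block_index):
    """What the receiver finds where it expects the header of `block_index`."""
    start = block_index * BLOCK_BITS
    return tuple(labels[start:start + 2])

clean = labeled_stream(3)
dropped = clean[:10] + clean[11:]          # one payload bit of block 0 is lost
added = clean[:10] + ["d"] + clean[10:]    # one spurious bit lands in block 0

print(rx_view(clean, 1))    # ('h', 'h')
print(rx_view(dropped, 1))  # ('h', 'd')
print(rx_view(added, 1))    # ('d', 'h')
print(rx_view(dropped, 2))  # ('h', 'd')  the misalignment persists
```

A single dropped or added bit therefore misaligns every following block until the receiver resynchronizes, whereas a flipped header bit affects only the block it occurs in.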

### Original Resync Scheme

- Utilizes two bit slipping mechanisms simultaneously to shift header bits back into correct alignment.
- Slips in only a single direction.
- Checks a single bit pair at a time for the header bits during recovery.
- Offers no tolerance for header bit flips: it starts slipping immediately, even when alignment is still correct and there is nothing to resolve.
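To see why slipping in a single direction is costly, consider a rough count of the slips needed to reach a displaced header. This is a sketch under stated assumptions, not a model of the actual YARR logic: it assumes each slip moves the sampling window one bit later in the stream, and it ignores false matches on payload bits. With the opposite slip direction, the roles of the two cases are swapped.

```python
BLOCK_BITS = 66

def slips_needed(header_shift):
    """Slips a one-directional slipper needs to reach a displaced header.

    header_shift > 0: the header now arrives later than expected.
    header_shift < 0: the header now arrives earlier than expected.
    Assumption: each slip moves the window one bit later, wrapping at 66.
    """
    return header_shift % BLOCK_BITS

for shift in (+1, -1, +2, -2):
    print(shift, slips_needed(shift))
# 1 1
# -1 65
# 2 2
# -2 64
```

A one-bit displacement in the unfavourable direction forces the slipper to walk through nearly the whole 66-bit block.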

#### **Blocks Lost During Resync**



# Free Improved Resync Scheme

- Reduced to a single effective bit slip mechanism.
- Offers limited tolerance: single bit flips are allowed, but the scheme fails fast otherwise.
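One possible reading of "limited tolerance" is sketched below. The rule is an assumption for illustration, not the rule used in the firmware: a single invalid header is tolerated as a probable bit flip, and a second consecutive invalid header declares sync loss. The function name and the `tolerated` parameter are hypothetical.

```python
def sync_lost_at(header_valid_flags, tolerated=1):
    """Return the block index at which sync loss is declared, or None.

    Assumption: up to `tolerated` consecutive invalid headers are forgiven;
    one more consecutive invalid header triggers resynchronization.
    """
    consecutive_bad = 0
    for index, valid in enumerate(header_valid_flags):
        if valid:
            consecutive_bad = 0
        else:
            consecutive_bad += 1
            if consecutive_bad > tolerated:
                return index
    return None

# An isolated header bit flip is ignored.
print(sync_lost_at([True, True, False, True, True]))  # None
# Two bad headers in a row trigger resync at the second one.
print(sync_lost_at([True, False, False, True]))       # 2
```

With `tolerated=0` the function behaves like a scheme with no tolerance, reacting to the first invalid header.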

#### **Blocks Lost During Resync**



## Fully Parallel Search Resync Scheme

- No bit slipping mechanism whatsoever.
- Checks 67 bits for 66 possible header locations simultaneously.
- Reduced tolerance, but a higher number of consecutive valid blocks is required to declare synchronization.
- At most 34 blocks can be lost regardless of bits dropped or added.
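The idea of checking every header position at once can be sketched in software as follows. A 67-bit window contains 66 adjacent bit pairs, one per candidate offset, and an offset stays a candidate only while its pair is a valid header in every block. The names, the random stand-in payload, and the choice of 32 required blocks are assumptions for illustration; the hardware would evaluate all 66 comparisons in the same clock cycle rather than in a loop.

```python
import random

BLOCK_BITS = 66

def candidate_mask(window):
    """For a 67-bit window, flag each of the 66 offsets whose bit pair is 01 or 10."""
    if len(window) != BLOCK_BITS + 1:
        raise ValueError("window must hold 67 bits")
    return [window[i] != window[i + 1] for i in range(BLOCK_BITS)]

def parallel_search(bits, required_blocks):
    """Offsets whose header pair stayed valid for `required_blocks` consecutive blocks."""
    alive = [True] * BLOCK_BITS
    for b in range(required_blocks):
        start = b * BLOCK_BITS
        mask = candidate_mask(bits[start:start + BLOCK_BITS + 1])
        alive = [a and m for a, m in zip(alive, mask)]
    return [i for i, a in enumerate(alive) if a]

# Build a misaligned stream: 17 stray bits, then 33 well-formed blocks.
rng = random.Random(1)
true_offset = 17
bits = [rng.randint(0, 1) for _ in range(true_offset)]
for _ in range(33):
    header = rng.choice(((0, 1), (1, 0)))
    bits += list(header) + [rng.randint(0, 1) for _ in range(64)]

survivors = parallel_search(bits, required_blocks=32)
print(true_offset in survivors, survivors)
```

The true offset always survives. A wrong offset survives each block with probability of about one half on random data, so requiring more consecutive valid blocks makes a false lock rapidly less likely.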

#### **Blocks Lost During Resync**



### Resource Utilization of Schemes

### ABSOLUTE RESOURCE UTILIZATION OF RESYNC SCHEMES



#### % RESOURCE UTILIZATION WITH KINTEX 7 T160 TARGET FPGA



### Further Development

#### Parameterizable Partially Parallel Search:

- A tradeoff can be made in the amount of parallel search: enough to maximize common-case recovery while minimizing the additional resources required to support the search.
- In this variation, bit slips and parallel search are used together, with the number of bits searched in parallel equal to the number of bits slipped per bit slip.
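The tradeoff can be made concrete with a little arithmetic. Assuming the search windows tile the 66 candidate offsets without overlap (an assumption, since the design is still proposed), a parallel width of `P` positions needs at most `ceil(66 / P) - 1` slips to bring the true header into the window. The function name is hypothetical.

```python
import math

BLOCK_BITS = 66

def worst_case_slips(parallel_width):
    """Upper bound on slips when each slip advances by `parallel_width` bits.

    Assumption: windows tile the 66 offsets, so ceil(66 / width) windows cover
    every offset, and the first window needs no slip.
    """
    if not 1 <= parallel_width <= BLOCK_BITS:
        raise ValueError("parallel_width must be between 1 and 66")
    return math.ceil(BLOCK_BITS / parallel_width) - 1

for width in (1, 2, 6, 11, 33, 66):
    print(width, worst_case_slips(width))
# 1 65
# 2 32
# 6 10
# 11 5
# 33 1
# 66 0
```

A width of 1 corresponds to pure bit slipping and a width of 66 to the fully parallel search; the values in between trade comparators for recovery time.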

#### Fail Fast Header Seekers:

- Rather than search a consecutive block of bits, have seekers assigned to evenly spaced bit positions to which they rotate whenever their current position results in an invalid header.
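One way to read the seeker idea is sketched below. The rotation rule is an assumption, since the proposal does not specify it: each seeker checks one offset per block and steps to the next offset as soon as the pair it is watching is not a valid header. Names, seeker count, and the random stand-in payload are all illustrative.

```python
import random

BLOCK_BITS = 66

def run_seekers(bits, n_seekers, n_blocks):
    """Advance evenly spaced seekers block by block and return their final offsets."""
    spacing = BLOCK_BITS // n_seekers
    positions = [k * spacing for k in range(n_seekers)]
    for b in range(n_blocks):
        base = b * BLOCK_BITS
        for k, pos in enumerate(positions):
            if bits[base + pos] == bits[base + pos + 1]:  # invalid header pair
                positions[k] = (pos + 1) % BLOCK_BITS     # fail fast: move on
    return positions

# Misaligned stream: 40 stray bits, then well-formed 66-bit blocks.
rng = random.Random(2)
true_offset = 40
n_blocks = 400
bits = [rng.randint(0, 1) for _ in range(true_offset)]
for _ in range(n_blocks + 1):
    header = rng.choice(((0, 1), (1, 0)))
    bits += list(header) + [rng.randint(0, 1) for _ in range(64)]

final = run_seekers(bits, n_seekers=6, n_blocks=n_blocks)
print(true_offset in final, final)
```

A seeker that reaches the true offset never leaves it, because the header pair there is always valid. With six seekers the nearest one starts at most 10 positions short of any offset, so the distance to cover is much smaller than for a single slipper.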

#### Error Tolerance Algorithms:

- At a relatively low cost in resources, various error tolerance schemes can strike a balance: avoiding pointless and costly searches triggered by a bit flip while remaining vigilant to bit adds and bit drops.
- The resources used to track tolerance and evaluate whether sync is achieved can be merged with the parallel header checking to reduce resources.