---
layout: post
title: CRC calculations and compile time metaprogramming in Mojo
categories: [mojo]
date: "2024-05-99"
author: "Ferdinand Schenck"
draft: true
excerpt: . 
---

In my [last post](https://fnands.com/blog/2024/mojo-png-parsing/) on parsing PNG images in Mojo, I very briefly mentioned cyclic redundancy checks, and posted a rather cryptic-looking function which I claimed was a bit inefficient. 

In this post I want to follow up on that a bit, and delve into the compile-time metaprogramming side of Mojo to see how we can speed up these calculations.   

But first, let's go through a bit of background so we know what we're dealing with. 

## Cyclic redundancy checks

CRCs are error-detecting codes that are often used to detect corruption of data in digital files, PNG files being one example. In a PNG, the CRC32 is calculated over the type and data fields of each chunk and appended to the end of the chunk, so that whoever reads the file can verify that the data they read is the same as the data that was written.  
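
As a rough illustration of that check, here is a minimal sketch in Python (rather than Mojo, so that the standard library's `zlib.crc32` can stand in for the CRC32 calculation). The helper name `chunk_crc_ok` is made up for this example, and it assumes a chunk laid out as a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte big-endian CRC.

```python
import struct
import zlib


def chunk_crc_ok(chunk: bytes) -> bool:
    """Return True if the CRC stored at the end of a PNG chunk matches its contents."""
    # First 4 bytes: big-endian length of the data field.
    (length,) = struct.unpack(">I", chunk[:4])
    # The CRC covers the 4-byte chunk type plus the data, but not the length.
    type_and_data = chunk[4 : 8 + length]
    (stored_crc,) = struct.unpack(">I", chunk[8 + length : 12 + length])
    return zlib.crc32(type_and_data) == stored_crc


# The IEND chunk that closes a PNG file: zero-length data, type "IEND".
iend = bytes.fromhex("0000000049454e44ae426082")
print(chunk_crc_ok(iend))

# Corrupting a single byte of the chunk type makes the check fail.
corrupted = bytes.fromhex("0000000049454e45ae426082")
print(chunk_crc_ok(corrupted))
```

The sketch does no bounds checking, so a truncated chunk would need extra handling in real code.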

A CRC check technically does "long division in the ring of polynomials of binary coefficients ($\Bbb{F}_2[x]$)" 😳.   

It's not as complicated as it sounds. I found the [Wikipedia article on Polynomial long division](https://en.wikipedia.org/wiki/Polynomial_long_division) to be helpful, and if you want an in-depth explanation then [this post](http://www.sunshine2k.de/articles/coding/crc/understanding_crc.html) by Bastian Molkenthin really digs deep into the details of implementing CRCs. 

But what you need to know is that addition and subtraction in $\Bbb{F}_2$ are both just XOR, so polynomial long division of binary numbers boils down to a series of shifts and XORs, and XOR is very efficient to calculate in hardware. 
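
To make that concrete, here is a small Python sketch (Python rather than Mojo, purely for illustration) of polynomial long division over $\Bbb{F}_2$, with each polynomial stored as a bit pattern. The function name `gf2_remainder` is my own choice for this example, not something from a library.

```python
def gf2_remainder(dividend: int, divisor: int) -> int:
    """Remainder of polynomial long division over F_2, with polynomials stored as bit patterns."""
    if divisor == 0:
        raise ValueError("divisor must be non-zero")
    divisor_bits = divisor.bit_length()
    # Keep going while the remainder's degree is at least the divisor's degree.
    while dividend.bit_length() >= divisor_bits:
        # Line the divisor up with the leading 1 of the remainder...
        shift = dividend.bit_length() - divisor_bits
        # ...and "subtract" it, which over F_2 is just XOR.
        dividend ^= divisor << shift
    return dividend


# Adding and subtracting single bits mod 2 both agree with XOR.
for a in (0, 1):
    for b in (0, 1):
        assert (a + b) % 2 == (a - b) % 2 == a ^ b

# (x^3 + x^2 + 1) divided by (x^2 + 1) leaves the remainder x, i.e. 0b10.
print(bin(gf2_remainder(0b1101, 0b101)))
```

Note that there are no carries or borrows anywhere: every step is a shift and an XOR, which is what makes this so cheap.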

The simplest example of a cyclic redundancy check is the [parity bit](https://en.wikipedia.org/wiki/Parity_bit), AKA CRC-1. The parity bit is used to detect whether an error has occurred while transmitting a byte-long message (it can be used for longer messages, but probably shouldn't be). 

In the formalism of CRCs, it can be calculated by successively XOR-ing your message with the relevant *generator polynomial*. For larger CRCs the choice of generator polynomial can get quite involved, but for CRC-1 it is $x + 1$, expressed in binary as 11. Notice that the generator polynomial is always one bit longer than the CRC it produces: a generator of degree $n$ has $n + 1$ bits and yields an $n$-bit CRC.  


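Take the 4-bit messages 1001 and 1011 as examples. Summing their bits mod 2 gives the parity directly:

```
1+0+0+1 (mod 2) = 0
1+0+1+1 (mod 2) = 1
```

The same results fall out of the division: at each step, XOR the remainder with the generator 11, shifted so that it lines up with the remainder's leading 1, until only the last bit is left.

```
1001 XOR 1100 = 0101
0101 XOR 0110 = 0011
0011 XOR 0011 = 0000

1011 XOR 1100 = 0111
0111 XOR 0110 = 0001
```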
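
The steps above can be reproduced with a few lines of Python (again Python rather than Mojo, as an illustrative sketch). The names `GENERATOR` and `crc1_steps` are hypothetical, and the function assumes a fixed message width of 4 bits unless told otherwise.

```python
GENERATOR = 0b11  # x + 1, the CRC-1 generator polynomial


def crc1_steps(message: int, width: int = 4) -> list[int]:
    """Intermediate remainders when dividing a width-bit message by x + 1."""
    steps = []
    remainder = message
    # Walk from the most significant bit down to bit 1; bit 0 is left as the parity.
    for position in range(width - 1, 0, -1):
        if remainder & (1 << position):
            # Align the generator's leading 1 with this bit and XOR it away.
            remainder ^= GENERATOR << (position - 1)
            steps.append(remainder)
    return steps


for message in (0b1001, 0b1011):
    trace = [format(step, "04b") for step in crc1_steps(message)]
    # Counting the set bits gives the parity directly, for comparison.
    parity = bin(message).count("1") % 2
    print(format(message, "04b"), trace, "parity:", parity)
```

For both messages the last entry of the trace equals the parity obtained by simply counting the set bits.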


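```mojo
from math.bit import bitreverse

fn CRC32(owned data: List[SIMD[DType.int8, 1]]) -> SIMD[DType.uint32, 1]:
    # Start with all bits of the register set.
    var crc32: UInt32 = 0xffffffff
    for byte in data:
        # Reverse the bits of the byte and XOR it into the top 8 bits of the register.
        crc32 = (bitreverse(byte[]).cast[DType.uint32]() << 24) ^ crc32
        for i in range(8):
            # If the top bit is set, shift it out and XOR in the generator polynomial.
            if crc32 & 0x80000000 != 0:
                crc32 = (crc32 << 1) ^ 0x04c11db7
            else:
                crc32 = crc32 << 1

    # Invert all bits, then reverse them to get the final CRC.
    return bitreverse(crc32^0xffffffff)
```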
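
One way to sanity-check the logic of this function is to port it to Python, where `zlib.crc32` is available as a reference. The sketch below is my own line-by-line transcription, not part of the Mojo code: `reverse_bits` is a hypothetical stand-in for `bitreverse`, and the explicit `& 0xFFFFFFFF` masks emulate the 32-bit wraparound that `UInt32` gives for free.

```python
import zlib


def reverse_bits(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of `value` (stand-in for Mojo's bitreverse)."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def crc32_bitwise(data: bytes) -> int:
    """Bit-by-bit CRC-32 mirroring the structure of the Mojo function."""
    crc = 0xFFFFFFFF
    for byte in data:
        # XOR the bit-reversed byte into the top 8 bits of the register.
        crc ^= reverse_bits(byte, 8) << 24
        for _ in range(8):
            if crc & 0x80000000:
                # Top bit set: shift it out and XOR in the generator polynomial.
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    # Invert all bits, then reverse them to get the final CRC.
    return reverse_bits(crc ^ 0xFFFFFFFF, 32)


message = b"123456789"
print(hex(crc32_bitwise(message)), hex(zlib.crc32(message)))
```

Both values should come out as `0xcbf43926`, the standard CRC-32 check value for the ASCII string `123456789`.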