qubyte · qubyte · Mar 13, 2022
diff --git a/content/posts/2022-03-13.md b/content/posts/2022-03-13.md
@@ -0,0 +1,332 @@
+---
+{
+  "datetime": "2022-03-13T11:30:00Z",
+  "updatedAt": null,
+  "draft": false,
+  "title": "PNGs from the ground up for QR codes",
+  "description": "I decided to learn how to make PNG images from scratch for QR codes.",
+  "tags": [
+    "JavaScript",
+    "npmjs"
+  ]
+}
+---
+In this post I talk about building PNGs from scratch, optimized for QR
+codes.
+
+At work, we have `npm audit` as part of our CI pipeline. We have it configured to
+ignore advisories which can safely be ignored, but eventually I like them to be
+resolved. When a package is slow to fix the issue or is no longer being
+maintained it makes my brain itch.
+
+One such package we use takes some content and encodes it as a QR code in a PNG
+image. I like to lower the `node_modules` footprint when I can, and a good
+replacement module is [`qrcode-svg`][qrcode-svg] since it has no dependencies of
+its own.
+
+The resultant QR code looks the same in the browser, but there's one big
+difference... As the QR code is embedded in an SVG it's much larger! Grids of
+data like heatmaps, photographs, and of course QR codes are better represented
+as pixels, so raster formats like PNG are a better fit.
+
+The `qrcode-svg` module makes the grid of data representing a QR code available,
+so the grid can be used to produce a raster image. I _could_ use another package
+to do that encoding, but it struck me that QR codes have some interesting
+properties which may provide size optimizations when encoding them as PNGs:
+
+- They have only two colours.
+- They're square (this fact turns out not to be useful).
+- Each block can be a single pixel.
+
+So I looked up the PNG spec and decided to see if I could make my own!
+
+## What's in a PNG?
+
+For this experiment I used the [PNG RFC] (request for comments) document. RFC
+documents can be pretty dense to make sense of, but this one _defines_ PNGs, so
+it's an excellent reference.
+
+A PNG turns out to be composed of a preamble of eight bytes, and "chunks". The
+preamble is a fixed set of eight bytes which software inspecting the file can
+read to know that the file is a PNG.
+
+The chunks all follow the same essential layout:
+
+- The first four bytes are the length of the data in the chunk in bytes, encoded
+  as a 32-bit unsigned network-order (big-endian) integer. This does not include
+  the length itself, the type, or the CRC check (see below). Just the data!
+- The next four bytes are the type of the chunk as ASCII characters. e.g. `IHDR`
+  for the header chunk.
+- The following bytes (not a fixed length) are the data of the chunk. The form
+  of these data depends on the chunk type.
+- The final four bytes are a "Cyclic Redundancy Check" (CRC) performed on the
+  type and data bytes. These amount to a checksum for the chunk.
+
+So that's `4B + 4B + nB + 4B`. For the avoidance of any doubt, when I use "byte"
+here, it is to mean eight bits.
+
+These chunks are reminiscent of chunk encoded HTTP bodies in that the chunk
+starting with a length lets streaming software collect the stated number of
+bytes before passing it on to more involved logic. The CRC provides a way of
+checking the integrity of individual chunks.
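+
+To make the layout concrete, here's a minimal sketch of reading chunks back
+out of a PNG held in a `Uint8Array`. The `readChunks` name and the shape of the
+objects it returns are my own for illustration, and it assumes well-formed
+input (it does no CRC verification or bounds checking).
+
+```javascript
+// Hypothetical helper: walk the chunks of a PNG held in a Uint8Array.
+function readChunks(png) {
+  const view = new DataView(png.buffer, png.byteOffset, png.byteLength);
+  const chunks = [];
+  let offset = 8; // Skip the eight preamble bytes.
+
+  while (offset < png.length) {
+    // 4 bytes of length, in network order.
+    const length = view.getUint32(offset, false);
+    // 4 bytes of ASCII type.
+    const type = String.fromCharCode(...png.subarray(offset + 4, offset + 8));
+    // n bytes of data (a view, not a copy).
+    const data = png.subarray(offset + 8, offset + 8 + length);
+    // 4 bytes of CRC, in network order.
+    const crc = view.getUint32(offset + 8 + length, false);
+
+    chunks.push({ type, length, data, crc });
+    offset += 4 + 4 + length + 4;
+  }
+
+  return chunks;
+}
+
+// The preamble followed by an empty IEND chunk. Not a complete PNG, but
+// enough to exercise the reader.
+const bytes = Uint8Array.of(
+  137, 80, 78, 71, 13, 10, 26, 10,
+  0, 0, 0, 0, 73, 69, 78, 68, 174, 66, 96, 130
+);
+const chunks = readChunks(bytes);
+```
+
+Because every chunk states its own length up front, the reader never needs to
+understand a chunk type to skip over it.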
+
+There are all sorts of chunk types defined in the specification, and the spec
+allows for custom chunks to be defined. The simplest possible PNG is composed of
+the preamble and three chunks:
+
+- Preamble: Not a chunk, just fixed bytes. Must come first.
+- `IHDR`: The header chunk, which comes immediately after the preamble.
+- `IDAT`: The data chunk, which contains pixel data. The form of this depends on
+  the header chunk. There may be more than one of these, but there must be at
+  least one, and it must come after the header chunk.
+- `IEND`: The end chunk. This is empty of data, and must be the last chunk.
+
+Like the preamble bytes, the `IEND` chunk is fixed since it contains no data to
+vary, so it can be hard-coded into software writing a PNG.
+
+The `IHDR` chunk is required immediately after the preamble. This chunk is a
+header which encodes data about how colours are encoded in the PNG, the width,
+the height, and some fixed bytes. The `IHDR` chunk data is always 13 bytes
+long:
+
+- The first 4 bytes are the width of the PNG in pixels, encoded as a 32-bit
+  unsigned network ordered integer.
+- The next 4 are the height, also encoded as a 32-bit unsigned network ordered
+  integer.
+- 1 byte for the bit depth.
+- 1 byte for colour type.
+- 1 byte for compression.
+- 1 byte for filter kind.
+- 1 byte for interlace.
+
+Since we're dealing with a QR code, pixels can have only one of two colours, and
+so the bit depth is set to 1 (one bit is all I need to encode two values). A
+basic QR code is black and white, so I can choose colour type 0 (greyscale) with a
+bit depth of 1 for those[^][though the input needs to be flipped for black on white].
+If I want to use two different colours then I can instead choose colour type 3
+(palette) with a bit depth of 1. Type 0 is very compact. Type 3 requires the
+addition of a `PLTE` chunk to encode the RGB values to correspond to (in this
+case) a 0 or a 1. This chunk comes between the `IHDR` and `IDAT` chunks.
+
+Compression and filter kind are fixed 0 values. They're provided for forward
+compatibility. For a QR code interlacing makes no sense (assuming one pixel per
+module) so that's set to 0 too.
+
+One thing I wanted for this experiment was for my PNG code to work in the
+browser as well as Node. As a Node engineer I often reach for `Buffer` to handle
+bytes of data. Fortunately, `Buffer` is really just some gloss over
+`Uint8Array` (a typed array of unsigned 8-bit integers), which are _almost_ as
+convenient for working with bytes. Buffers have convenience functions for
+converting bytes to and from hex and base64, which `Uint8Array` lacks.
+
+Fortunately `Uint8Array` does have some very useful features. One thing the PNG
+spec requires is that 32-bit integers are inserted into the byte array in
+network order. At first, I used some old-school [bit shifting] to address the
+bytes of each 32-bit integer, which works but littered the code with pretty
+opaque for-loops. For the specific example of building the header data mentioned
+above I built a function which uses a data view to do this:
+
+```javascript
+function makeHeaderData(width, height, isBlackAndWhite) {
+  const IHDRData = Uint8Array.of(
+    0, 0, 0, 0, // The width will go here.
+    0, 0, 0, 0, // The height will go here.
+    1, // bit depth (two possible pixel colors)
+    isBlackAndWhite ? 0 : 3, // 0 is grayscale, 3 is palette.
+    0, // compression
+    0, // filter
+    0 // interlace (off)
+  );
+
+  // Width and height are set in network byte order.
+  const view = new DataView(IHDRData.buffer);
+  view.setUint32(0, width, false);
+  view.setUint32(4, height, false);
+
+  return IHDRData;
+}
+```
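+
+For comparison, this is roughly what the bit shifting alternative looks like.
+It's a sketch rather than the code I originally wrote, and `writeUint32BE` is
+a name I've made up for illustration.
+
+```javascript
+// Hypothetical helper: write a 32-bit unsigned integer in network order.
+function writeUint32BE(bytes, offset, value) {
+  bytes[offset] = (value >>> 24) & 0xff; // most significant byte first
+  bytes[offset + 1] = (value >>> 16) & 0xff;
+  bytes[offset + 2] = (value >>> 8) & 0xff;
+  bytes[offset + 3] = value & 0xff; // least significant byte last
+}
+
+const shifted = new Uint8Array(4);
+writeUint32BE(shifted, 0, 300);
+
+// The DataView equivalent.
+const viewed = new Uint8Array(4);
+new DataView(viewed.buffer).setUint32(0, 300, false);
+
+// Both arrays now hold 0, 0, 1, 44 (300 is 0x0000012C).
+```
+
+Both do the same job, but `setUint32` states the intent (including the byte
+order) in a single call.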
+
+The `DataView` opens up some low level helpers which let me set bytes of the
+`ArrayBuffer` that the `Uint8Array` wraps. This is the _data_ for the header
+chunk, and not the chunk itself. The data must be wrapped up to form a chunk:
+
+```javascript
+function makeChunk(name, data) {
+  // Allocate the memory for the complete chunk.
+  const chunk = new Uint8Array(4 + name.length + data.length + 4);
+
+  // A view lets us insert things in network order easily.
+  const view = new DataView(chunk.buffer);
+
+  // Offsets for readability.
+  const nameOffset = 4;
+  const dataOffset = nameOffset + name.length;
+  const crcOffset = dataOffset + data.length;
+
+  // Set the length bytes.
+  view.setUint32(0, data.length, false);
+
+  // Set the name (ascii) bytes.
+  for (let i = 0, len = name.length; i < len; i++) {
+    chunk[nameOffset + i] = name.charCodeAt(i);
+  }
+
+  // Set the data bytes.
+  chunk.set(data, dataOffset);
+
+  // Calculate the CRC from the name and data bytes.
+  // subarray gets a view on the chunk. It does not copy.
+  const crc = crc32(chunk.subarray(nameOffset, crcOffset));
+
+  // Set the CRC bytes.
+  view.setUint32(crcOffset, crc, false);
+
+  return chunk;
+}
+```
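+
+The `crc32` function used above isn't shown in this post. A minimal sketch of
+one, assuming the usual table-driven approach with the reflected polynomial
+`0xEDB88320` that the PNG spec's CRC corresponds to, could look like this. The
+`CRC_TABLE` name is my own.
+
+```javascript
+// Precompute one table entry per possible byte value.
+const CRC_TABLE = new Uint32Array(256);
+
+for (let n = 0; n < 256; n++) {
+  let c = n;
+
+  for (let k = 0; k < 8; k++) {
+    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
+  }
+
+  CRC_TABLE[n] = c;
+}
+
+function crc32(bytes) {
+  let crc = 0xffffffff;
+
+  for (const byte of bytes) {
+    crc = CRC_TABLE[(crc ^ byte) & 0xff] ^ (crc >>> 8);
+  }
+
+  // The final >>> 0 converts the result to an unsigned 32-bit integer.
+  return (crc ^ 0xffffffff) >>> 0;
+}
+
+// The type bytes of an empty IEND chunk: "IEND" in ASCII.
+const iendCrc = crc32(Uint8Array.of(73, 69, 78, 68));
+// iendCrc is 0xAE426082, which is bytes 174, 66, 96, 130 in network order.
+```
+
+The check at the end is a handy sanity test, since an `IEND` chunk has no data
+and so its CRC is always the same four bytes.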
+
+Next up is the palette chunk. This is named `PLTE` and takes triplets of bytes
+as RGB values for pixels. For example, a white background with blue pixels has
+the palette `0xFFFFFF0000FF`. In the special case of black and white QR codes
+we can skip this chunk, because the `IHDR` chunk already contains all the
+information (the bit-depth and the colour type of 0).
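+
+As a small sketch of the palette data for the two-colour case (the
+`makePaletteData` name is hypothetical), entry 0 is the background and entry 1
+is the colour of the QR code modules:
+
+```javascript
+// Hypothetical helper: build PLTE data from two [r, g, b] triplets.
+// A 0 bit in the image data selects the first entry, a 1 bit the second.
+function makePaletteData(backgroundRgb, colorRgb) {
+  return Uint8Array.of(...backgroundRgb, ...colorRgb);
+}
+
+// White background, blue modules.
+const palette = makePaletteData([255, 255, 255], [0, 0, 255]);
+// palette holds 255, 255, 255, 0, 0, 255.
+```
+
+The data length of a `PLTE` chunk must be divisible by 3, which two full
+triplets satisfy.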
+
+The hard part is all in the `IDAT` chunk. A PNG may have many of these, each
+having a CRC (handy for transferring large PNGs over a network). QR codes
+needn't be large though, so I chose to encode the image data in a single `IDAT`
+chunk.
+
+Each row of image data is formatted as a scan-line. The first byte of each
+scan-line is a filter byte. Filters try to lower the data required to encode the
+row by using surrounding bytes (depending on the filter mode). They get quite
+complex, and for this use case are of limited effectiveness. Thankfully the 0
+filter does no transformation, so I set the first byte of each scan-line to 0.
+
+The rest of the scan-line depends on the bit depth set in the header. When using
+a greyscale or colour palette, each pixel is represented by the same number of
+bits as the bit depth. The bit depth here is one, so each pixel takes just one
+bit! Scan-lines always end on a byte, which means that when the pixel width of
+the image is not divisible by 8 there will be trailing unused bits in the final
+byte of each scan-line. In other words, the bits need to be rounded up to the
+nearest byte for each row of pixels.
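+
+As a worked example of that rounding, take an image 21 pixels wide (the width
+of the smallest QR code at one pixel per module, without a quiet zone). The
+`scanLineLayout` helper is hypothetical and only does the arithmetic.
+
+```javascript
+// Hypothetical helper: byte counts for one scan-line at a bit depth of 1.
+function scanLineLayout(width) {
+  const pixelBytes = Math.ceil(width / 8);
+
+  return {
+    bytesPerRow: 1 + pixelBytes, // filter byte + packed pixels
+    unusedBits: pixelBytes * 8 - width // trailing bits in the last byte
+  };
+}
+
+const layout = scanLineLayout(21);
+// layout.bytesPerRow is 4 and layout.unusedBits is 3.
+```
+
+So 21 pixels need three bytes (24 bits) of pixel data, leaving three unused
+bits at the end of each row, plus one filter byte.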
+
+Manipulating bits is fiddly. I had to fall back on my early days of C coding
+with bit shifts.
+
+```javascript
+function buildScanLines(data, width, height) {
+  // A bit depth of 1 allows 8 pixels to be packed into one byte. When the
+  // width is not divisible by 8, the last byte will have trailing low bits.
+  // The first byte of the scanline is the filter byte.
+  const nBytesPerRow = 1 + Math.ceil(width / 8);
+
+  // All the scanlines are inserted into one Uint8Array.
+  const buffer = new Uint8Array(nBytesPerRow * height);
+
+  for (let scanline = 0; scanline < height; scanline++) {
+    const offset = nBytesPerRow * scanline;
+
+    // The filter byte.
+    buffer[offset] = 0;
+
+    for (let n = 0; n < nBytesPerRow - 1; n++) {
+      for (let i = 0; i < 8; i++) {
+        if (data[scanline][n * 8 + i]) {
+          // Set bits in the same order as the row (the leftmost pixel goes
+          // in the highest bit). 1 << k is the same as 2 ** k.
+          buffer[offset + n + 1] += 1 << (7 - i);
+        }
+      }
+    }
+  }
+
+  return buffer;
+}
+```
+
+Once the scan-lines are built they're compressed. The library I've built accepts
+a function to deflate the data passed to it. In Node such a function is provided
+out of the box by the `zlib` core module. For the browser an alternative like
+[`pako`][pako] can be used. To keep the footprint low, the library I built
+doesn't _require_ a deflate function to be passed in. When one isn't it falls
+back to using some fixed bytes to make valid chunks without actually compressing
+them (sort of no-op deflation). Figuring this out was another interesting task,
+and took a lot of tinkering! The result is pleasingly compact. Playing around
+with the result has shown it to be fairly effective on its own. Since the data
+is already quite dense (we're packing pixels into bits, after all), proper
+deflation yields limited results and so the default is pretty good.
+
+```javascript
+function adler32(buffer) {
+  let a = 1;
+  let b = 0;
+
+  for (const byte of buffer) {
+    a = (a + byte) % 65521;
+    b = (b + a) % 65521;
+  }
+
+  return b * 65536 + a;
+}
+
+// Make valid deflated bytes without actual compression.
+function deflate(buffer) {
+  const deflated = Uint8Array.of(
+    0x78, 0x01, // zlib header: deflate, 32K window, fastest level flag
+    // One block with all the data.
+    0b001, // 1 - final block, 00 - no compression, ...rest: junk
+    0, 0, // length (16 bits, so at most 65535 bytes of data)
+    0, 0, // one's complement of length
+    ...buffer,
+    0, 0, 0, 0 // adler32
+  );
+  const view = new DataView(deflated.buffer);
+
+  view.setUint16(3, buffer.length, true);
+  view.setUint16(5, ~buffer.length, true);
+  view.setUint32(deflated.length - 4, adler32(buffer), false);
+
+  return deflated;
+}
+```
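+
+For comparison, this is a sketch of real compression in Node using
+`deflateSync` from the `zlib` core module. How a function like this gets handed
+to the library isn't shown here, and the all-zero scan-lines are made-up input
+for illustration.
+
+```javascript
+const { deflateSync, inflateSync } = require('node:zlib');
+
+// Four scan-lines of a hypothetical 21 pixel wide image, all zero bits.
+const scanLines = new Uint8Array(4 * 4);
+
+// deflateSync emits a zlib stream (header, deflate data, Adler-32), which is
+// the same container the hand-rolled function above produces.
+const compressed = deflateSync(scanLines);
+
+// Inflating gives back the original bytes.
+const restored = inflateSync(compressed);
+```
+
+Note that it's `deflateSync` and not `deflateRawSync` which is wanted here,
+since the raw variant leaves off the zlib header and checksum.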
+
+The deflated result is the data of the `IDAT` chunk. If I decided to use more than
+one `IDAT` chunk, then I could split the compressed scan-lines on any byte I like.
+In other words, the compression of scan-lines always happens first.
+
+The complete PNG is the concatenation of:
+
+- preamble
+- `IHDR`
+- `PLTE` (when not using black and white)
+- `IDAT`
+- `IEND`
+
+In code that looks like:
+
+```javascript
+// These are fixed.
+const PREAMBLE = Uint8Array.of(137, 80, 78, 71, 13, 10, 26, 10);
+const IEND = Uint8Array.of(0, 0, 0, 0, 73, 69, 78, 68, 174, 66, 96, 130);
+
+// When using greyscale the black and white values are the wrong way around,
+// so they must be inverted.
+if (isBlackAndWhite) {
+  // Mutates data! In the library this is already a copy.
+  invertData(data);
+}
+
+const png = Uint8Array.of(
+  ...PREAMBLE,
+  ...makeChunk('IHDR', makeHeaderData(width, height, isBlackAndWhite)),
+  ...(isBlackAndWhite ? [] : makeChunk('PLTE', [...backgroundRgb, ...colorRgb])),
+  ...makeChunk('IDAT', deflate(buildScanLines(data, width, height))),
+  ...IEND
+);
+```
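+
+The `invertData` function isn't shown above. A minimal sketch, assuming the
+data is an array of rows in which each cell is truthy for a dark module, might
+be:
+
+```javascript
+// Hypothetical sketch: flip every cell of a grid of rows, in place.
+function invertData(data) {
+  for (const row of data) {
+    for (let i = 0; i < row.length; i++) {
+      row[i] = row[i] ? 0 : 1;
+    }
+  }
+}
+
+const grid = [
+  [1, 0, 1],
+  [0, 0, 1]
+];
+
+invertData(grid);
+// grid is now [[0, 1, 0], [1, 1, 0]].
+```
+
+The flip is needed because in greyscale a 0 is black and a 1 is white, whereas
+the grid uses a truthy value for a dark module.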
+
+[PNG RFC]: https://datatracker.ietf.org/doc/html/rfc2083
+[bit shifting]: https://en.wikipedia.org/wiki/Bitwise_operations_in_C#Shift_operators
+[qrcode-svg]: https://npmjs.com/package/qrcode-svg
+[pako]: https://npmjs.com/package/pako