Adds PNG post. #495

332 changes: 332 additions & 0 deletions content/posts/2022-03-13.md
@@ -0,0 +1,332 @@
---
{
"datetime": "2022-03-13T11:30:00Z",
"updatedAt": null,
"draft": false,
"title": "PNGs from the ground up for QR codes",
"description": "I decided to learn how to make PNG images from scratch for QR codes.",
"tags": [
"JavaScript",
"npmjs"
]
}
---
In this post I talk about building PNGs from scratch, optimized for QR
codes.

At work, we have `npm audit` as part of our CI pipeline. We have it configured to
ignore advisories which can safely be ignored, but eventually I like them to be
resolved. When a package is slow to fix the issue or is no longer being
maintained it makes my brain itch.

One such package we use takes some content and encodes it as a QR code in a PNG
image. I like to lower the `node_modules` footprint when I can, and a good
replacement module is [`qrcode-svg`][qrcode-svg] since it has no dependencies of
its own.

The resultant QR code looks the same in the browser, but there's one big
difference... As the QR code is embedded in an SVG it's much larger! Grids of
data like heatmaps, photographs, and of course QR codes are better represented
as pixels, so raster formats like PNG are a better fit.

The `qrcode-svg` module makes the grid of data representing a QR code available,
so the grid can be used to produce a raster image. I _could_ use another package
to do that encoding, but it struck me that QR codes have some interesting
properties which may provide size optimizations when encoding them as PNGs:

- They have only two colours.
- They're square (this fact turns out not to be useful).
- Each block can be a single pixel.

So I looked up the PNG spec and decided to see if I could make my own!

## What's in a PNG?

For this experiment I used the [PNG RFC] (request for comments) document. RFC
documents can be pretty dense to make sense of, but this one _defines_ PNGs, so
it's an excellent reference.

A PNG turns out to be composed of a preamble of eight bytes, and "chunks". The
preamble is a fixed set of eight bytes which software inspecting the file can
read to know that the file is a PNG.
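
As a minimal sketch, a check for those eight bytes might look like the following. The helper name `isPng` is hypothetical and is not part of any library mentioned here; the byte values are the fixed PNG signature.

```javascript
// The eight fixed signature bytes at the start of every PNG.
const PNG_SIGNATURE = Uint8Array.of(137, 80, 78, 71, 13, 10, 26, 10);

// Hypothetical helper: true when `bytes` starts with the PNG signature.
function isPng(bytes) {
  if (bytes.length < PNG_SIGNATURE.length) {
    return false;
  }

  return PNG_SIGNATURE.every((value, index) => bytes[index] === value);
}
```

Only the first eight bytes are inspected, so this says nothing about whether the chunks that follow are valid.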

The chunks all follow the same essential layout:

- The first four bytes are the length of the data in the chunk in bytes, encoded
as a 32-bit unsigned network-order (big-endian) integer. This does not include
the length itself, the type, or the CRC check (see below). Just the data!
- The next four bytes are the type of the chunk as ASCII characters. e.g. `IHDR`
for the header chunk.
- The following bytes (not a fixed length) are the data of the chunk. The form
of these data depends on the chunk type.
- The final four bytes are a "Cyclic Redundancy Check" (CRC) performed on the
type and data bytes. These amount to a checksum for the chunk.

So that's `4B + 4B + nB + 4B`. For the avoidance of any doubt, when I use "byte"
here, it is to mean eight bits.

These chunks are reminiscent of HTTP bodies sent with chunked transfer encoding:
because each chunk starts with its length, streaming software can collect the
stated number of bytes before passing them on to more involved logic. The CRC
provides a way of checking the integrity of individual chunks.
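
To make the layout concrete, here is a rough sketch of walking the chunks of a complete PNG held in a `Uint8Array`. The `readChunks` helper is hypothetical, it assumes the whole file is already in memory, and it does not verify the CRC values it reads.

```javascript
// Hypothetical helper: walks the chunks in a complete PNG byte array.
function readChunks(png) {
  const view = new DataView(png.buffer, png.byteOffset, png.byteLength);
  const chunks = [];
  let offset = 8; // Skip the eight preamble bytes.

  while (offset < png.length) {
    // Length of the data only, in network order.
    const length = view.getUint32(offset, false);
    // Four ASCII characters naming the chunk type.
    const type = String.fromCharCode(...png.subarray(offset + 4, offset + 8));
    // subarray is a view on the data. It does not copy.
    const data = png.subarray(offset + 8, offset + 8 + length);
    // The CRC of the type and data bytes, in network order.
    const crc = view.getUint32(offset + 8 + length, false);

    chunks.push({ type, data, crc });
    offset += 4 + 4 + length + 4;
  }

  return chunks;
}
```

A truncated file makes the `DataView` reads throw a `RangeError`, which is about the least that a real reader should do.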

There are all sorts of chunk types defined in the specification, and the spec
allows for custom chunks to be defined. The simplest possible PNG is composed of
the chunks:

- Preamble: Not a chunk, just fixed bytes. Must come first.
- `IHDR`: The header chunk, which comes immediately after the preamble.
- `IDAT`: The data chunk, which contains pixel data. The form of this depends on
the header chunk. There may be more than one of these, but there must be at
least one, and it must come after the header chunk.
- `IEND`: The end chunk. This is empty of data, and must be the last chunk.

Like the preamble bytes, the `IEND` chunk is fixed since it contains no data to
vary, so it can be hard-coded into software writing a PNG.

The `IHDR` chunk is required immediately after the preamble. This chunk is a
header which encodes the width, the height, how colours are encoded in the PNG,
and some fixed bytes. The `IHDR` chunk data is always 13 bytes long:

- The first 4 bytes are the width of the PNG in pixels, encoded as a 32-bit
unsigned network-order integer.
- The next 4 are the height in pixels, also encoded as a 32-bit unsigned
network-order integer.
- 1 byte for the bit depth.
- 1 byte for colour type.
- 1 byte for compression.
- 1 byte for filter kind.
- 1 byte for interlace.

Since we're dealing with a QR code, pixels can have only one of two colours, and
so the bit depth is set to 1 (one bit is all I need to encode two values). A
basic QR code is black and white, so I can choose colour type 0 (greyscale) with
a bit depth of 1 for those (though the input needs to be flipped for black on
white, since a 0 is black in greyscale).
If I want to use two different colours then I can instead choose colour type 3
(palette) with a bit depth of 1. Type 0 is very compact. Type 3 requires the
addition of a `PLTE` chunk to encode the RGB values to correspond to (in this
case) a 0 or a 1. This chunk comes between the `IHDR` and `IDAT` chunks.

Compression and filter kind are fixed 0 values. They're provided for forward
compatibility. For a QR code interlacing makes no sense (assuming one pixel per
module) so that's set to 0 too.

One thing I wanted for this experiment was for my PNG code to work in the
browser as well as Node. As a Node engineer I often reach for `Buffer` to handle
bytes of data. Fortunately, `Buffer` is really just some gloss over
`Uint8Array` (a typed array of unsigned 8-bit integers), which is _almost_ as
convenient for working with bytes. Buffers have convenience functions for
converting bytes to and from hex and base64, which `Uint8Array` lacks.
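
Those conveniences are not hard to approximate. The sketch below uses two hypothetical helpers, `toHex` and `toBase64`, and assumes a global `btoa` is available (as it is in browsers and in recent versions of Node).

```javascript
// Hypothetical stand-in for Buffer's toString('hex').
function toHex(bytes) {
  // Array.from is used rather than bytes.map, because mapping a typed
  // array would coerce the strings back into numbers.
  return Array.from(bytes, (byte) => byte.toString(16).padStart(2, '0')).join('');
}

// Hypothetical stand-in for Buffer's toString('base64').
// Assumes a global btoa function exists.
function toBase64(bytes) {
  let binary = '';

  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }

  return btoa(binary);
}
```

The base64 form is handy in the browser, since it can be dropped straight into a `data:image/png;base64,` URL.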

Fortunately `Uint8Array` does have some very useful features. One thing the PNG
spec requires is that 32-bit integers are inserted into the byte array in
network order. At first, I used some old-school [bit shifting] to address the
bytes of each 32-bit integer, which works but littered the code with pretty
opaque for-loops. For the specific example of building the header data mentioned
above I built a function which uses a data view to do this:

```javascript
function makeHeaderData(width, height, isBlackAndWhite) {
  const IHDRData = Uint8Array.of(
    0, 0, 0, 0, // The width will go here.
    0, 0, 0, 0, // The height will go here.
    1, // bit depth (two possible pixel colors)
    isBlackAndWhite ? 0 : 3, // 0 is grayscale, 3 is palette.
    0, // compression
    0, // filter
    0 // interlace (off)
  );

  // Width and height are set in network byte order.
  const view = new DataView(IHDRData.buffer);
  view.setUint32(0, width, false);
  view.setUint32(4, height, false);

  return IHDRData;
}
```
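
For comparison, one way the bit-shifting approach could be written is sketched here. The `writeUint32BE` name is hypothetical, and this is not necessarily how the earlier version of the code did it.

```javascript
// Hypothetical helper: writes a 32-bit unsigned integer in network order
// using shifts and masks rather than a DataView.
function writeUint32BE(bytes, offset, value) {
  bytes[offset] = (value >>> 24) & 0xff; // most significant byte first
  bytes[offset + 1] = (value >>> 16) & 0xff;
  bytes[offset + 2] = (value >>> 8) & 0xff;
  bytes[offset + 3] = value & 0xff; // least significant byte last
}
```

It produces the same bytes as `view.setUint32(offset, value, false)`, but the intent is much less obvious at the call site.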

The `DataView` opens up some low level helpers which let me set bytes of the
`ArrayBuffer` that the `Uint8Array` wraps. This is the _data_ for the header
chunk, and not the chunk itself. The data must be wrapped up to form a chunk:

```javascript
function makeChunk(name, data) {
  // Allocate the memory for the complete chunk.
  const chunk = new Uint8Array(4 + name.length + data.length + 4);

  // A view lets us insert things in network order easily.
  const view = new DataView(chunk.buffer);

  // Offsets for readability.
  const nameOffset = 4;
  const dataOffset = nameOffset + name.length;
  const crcOffset = dataOffset + data.length;

  // Set the length bytes.
  view.setUint32(0, data.length, false);

  // Set the name (ascii) bytes.
  for (let i = 0, len = name.length; i < len; i++) {
    chunk[nameOffset + i] = name.charCodeAt(i);
  }

  // Set the data bytes.
  chunk.set(data, dataOffset);

  // Calculate the CRC from the name and data bytes.
  // subarray gets a view on the chunk. It does not copy.
  const crc = crc32(chunk.subarray(nameOffset, crcOffset));

  // Set the CRC bytes.
  view.setUint32(crcOffset, crc, false);

  return chunk;
}
```
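
The `crc32` function used by `makeChunk` isn't shown in this post. A table-driven sketch of the CRC-32 that PNG uses (the reflected polynomial `0xEDB88320`) might look like the following. It is an illustration rather than the library's actual implementation, and `makeCrcTable` is a hypothetical helper name.

```javascript
// Hypothetical helper: precomputes the CRC of every possible byte value.
function makeCrcTable() {
  const table = new Uint32Array(256);

  for (let n = 0; n < 256; n++) {
    let c = n;

    for (let k = 0; k < 8; k++) {
      // When the low bit is set, shift and mix in the polynomial.
      c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
    }

    table[n] = c >>> 0;
  }

  return table;
}

const CRC_TABLE = makeCrcTable();

function crc32(bytes) {
  // The running value starts as all ones.
  let crc = 0xffffffff;

  for (const byte of bytes) {
    crc = CRC_TABLE[(crc ^ byte) & 0xff] ^ (crc >>> 8);
  }

  // Invert the bits, and force the result to be unsigned.
  return (crc ^ 0xffffffff) >>> 0;
}
```

A quick sanity check is that the CRC of the ASCII bytes of `IEND` comes out as `0xAE426082`, which matches the last four bytes of every PNG.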

Next up is the palette chunk. This is named `PLTE` and takes triplets of bytes
as RGB values for pixels. For example, a white background with blue pixels has
the palette `0xFFFFFF0000FF`: white for 0 bits, then blue for 1 bits. In the
special case of black and white QR codes we can skip this chunk, because the
`IHDR` chunk already contains all the information (the bit depth and the colour
type of 0).
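
As a small sketch, the data for a two-colour `PLTE` chunk is just the two RGB triplets placed end to end. The `makePaletteData` helper is hypothetical. The first triplet is palette index 0 (the colour used for 0 bits), and the second is index 1.

```javascript
// Hypothetical helper: builds PLTE chunk data for a two-colour image.
// Index 0 is the colour of 0 bits, and index 1 is the colour of 1 bits.
function makePaletteData(backgroundRgb, colorRgb) {
  return Uint8Array.of(...backgroundRgb, ...colorRgb);
}

const white = [0xff, 0xff, 0xff];
const blue = [0x00, 0x00, 0xff];

// Six bytes: ff ff ff 00 00 ff
const paletteData = makePaletteData(white, blue);
```

With a bit depth of 1 there are only two palette entries to fill, so the chunk data is never more than six bytes.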

The hard part is all in the `IDAT` chunk. A PNG may have many of these, each
having a CRC (handy for transferring large PNGs over a network). QR codes
needn't be large though, so I chose to encode the image data in a single `IDAT`
chunk.

Each row of image data is formatted as a scan-line. The first byte of each
scan-line is a filter byte. Filters try to lower the data required to encode the
row by using surrounding bytes (depending on the filter mode). They get quite
complex, and for this use case are of limited effectiveness. Thankfully the 0
filter does no transformation, so I set the first byte of each scan-line to 0.

The rest of the scan-line depends on the bit depth set in the header. When using
a greyscale or colour palette, each pixel is represented by the same number of
bits as the bit depth. The bit depth here is one, so each pixel takes just one
bit! Scan-lines always end on a byte, which means that when the pixel width of
the image is not divisible by 8 there will be trailing unused bits in the final
byte of each scan-line. In other words, the bits need to be rounded up to the
nearest byte for each row of pixels.

Manipulating bits is fiddly. I had to fall back on my early days of C coding
with bit shifts.

```javascript
function buildScanLines(data, width, height) {
  // A bit depth of 1 allows 8 pixels to be packed into one byte. When the
  // width is not divisible by 8, the last byte will have trailing low bits.
  // The first byte of the scanline is the filter byte.
  const nBytesPerRow = 1 + Math.ceil(width / 8);

  // All the scanlines are inserted into one Uint8Array.
  const buffer = new Uint8Array(nBytesPerRow * height);

  for (let scanline = 0; scanline < height; scanline++) {
    const offset = nBytesPerRow * scanline;

    // The filter byte.
    buffer[offset] = 0;

    for (let n = 0; n < nBytesPerRow - 1; n++) {
      for (let i = 0; i < 8; i++) {
        if (data[scanline][n * 8 + i]) {
          // Flip bits in the same order as the row.
          // 1 << n is the same as 2 ** n.
          buffer[offset + n + 1] += 1 << (7 - i);
        }
      }
    }
  }

  return buffer;
}
```
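
To see the packing on a single row, here is a worked example using a hypothetical `packRow` helper. It is a simplified sketch of the same idea for one scan-line, not a replacement for `buildScanLines`.

```javascript
// Hypothetical helper: packs one row of 0/1 pixels into a scan-line.
function packRow(row) {
  const scanline = new Uint8Array(1 + Math.ceil(row.length / 8));

  // scanline[0] is the filter byte, which stays 0 (no filter).
  row.forEach((pixel, x) => {
    if (pixel) {
      // The leftmost pixel of each group of 8 goes in the highest bit.
      scanline[1 + (x >> 3)] |= 0x80 >> (x & 7);
    }
  });

  return scanline;
}

// Ten pixels need two data bytes, leaving six unused trailing bits.
const packed = packRow([1, 0, 1, 1, 0, 0, 0, 0, 1, 1]);
// packed holds 0, 0b10110000, 0b11000000
```

The first eight pixels fill one byte, and the remaining two sit in the high bits of the next byte with zeros after them.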

Once the scan-lines are built they're compressed. The library I've built accepts
a function to deflate the data passed to it. In Node such a function is provided
out of the box by the `zlib` core module. For the browser an alternative like
[`pako`][pako] can be used. To keep the footprint low, the library I built
doesn't _require_ a deflate function to be passed in. When one isn't, it falls
back to using some fixed bytes to make a valid zlib stream without actually
compressing the data (sort of no-op deflation). Figuring this out was another
interesting task, and took a lot of tinkering! The result is pleasingly compact.
Playing around with the result has shown it to be fairly effective on its own.
Since the data is already quite dense (we're packing pixels into bits, after
all), proper deflation yields limited results and so the default is pretty good.

```javascript
function adler32(buffer) {
  let a = 1;
  let b = 0;

  for (const byte of buffer) {
    a = (a + byte) % 65521;
    b = (b + a) % 65521;
  }

  return b * 65536 + a;
}

// Make valid deflated bytes without actual compression.
function deflate(buffer) {
  const deflated = Uint8Array.of(
    0x78, 0x01, // no compression
    // One block with all the data.
    0b001, // 1 - final block, 00 - no compression, ...rest: junk
    0, 0, // length
    0, 0, // one's complement of length
    ...buffer,
    0, 0, 0, 0 // adler32
  );
  const view = new DataView(deflated.buffer);

  view.setUint16(3, buffer.length, true);
  view.setUint16(5, ~buffer.length, true);
  view.setUint32(deflated.length - 4, adler32(buffer), false);

  return deflated;
}
```
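
Reading the bytes back is a useful way to check the layout. The sketch below uses a hypothetical `inflateStored` helper which only understands a single uncompressed block like the one produced above, and which skips verifying the trailing Adler-32 checksum.

```javascript
// Hypothetical helper: undoes the no-op deflation for one stored block.
function inflateStored(deflated) {
  const view = new DataView(
    deflated.buffer,
    deflated.byteOffset,
    deflated.byteLength
  );

  // Read as one big-endian number, the two zlib header bytes must be a
  // multiple of 31.
  if (view.getUint16(0, false) % 31 !== 0) {
    throw new Error('Bad zlib header');
  }

  // Lowest bit: final block flag. Next two bits: block type (00 is stored).
  if ((deflated[2] & 0b110) !== 0) {
    throw new Error('Not a stored block');
  }

  // The length and its one's complement are little-endian.
  const length = view.getUint16(3, true);
  const complement = view.getUint16(5, true);

  if ((length ^ complement) !== 0xffff) {
    throw new Error('Length check failed');
  }

  // The data starts after 2 header bytes, 1 block byte, and 4 length bytes.
  return deflated.subarray(7, 7 + length);
}

// Three data bytes (1, 2, 3) wrapped by hand in the same layout.
const stored = Uint8Array.of(
  0x78, 0x01, // zlib header
  0b001, // final block, no compression
  3, 0, // length
  0xfc, 0xff, // one's complement of length
  1, 2, 3, // data
  0x00, 0x0d, 0x00, 0x07 // adler32
);
const restored = inflateStored(stored);
```

Because the length field is only 16 bits, a single stored block tops out at 65535 bytes of data, which is far more than a QR code at one pixel per module needs.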

The deflated result is the data of the `IDAT` chunk. If I decided to use more
than one `IDAT` chunk, I could split the compressed scan-lines on any byte I
like. In other words, the compression of scan-lines always happens first.

The complete PNG is the concatenation of:

- preamble
- `IHDR`
- `PLTE` (when not using black and white)
- `IDAT`
- `IEND`

In code that looks like:

```javascript
// These are fixed.
const PREAMBLE = Uint8Array.of(137, 80, 78, 71, 13, 10, 26, 10);
const IEND = Uint8Array.of(0, 0, 0, 0, 73, 69, 78, 68, 174, 66, 96, 130);

// When using greyscale the black and white values are the wrong way around,
// so they must be inverted.
if (isBlackAndWhite) {
  // Mutates data! In the library this is already a copy.
  invertData(data);
}

const png = Uint8Array.of(
  ...PREAMBLE,
  ...makeChunk('IHDR', makeHeaderData(width, height, isBlackAndWhite)),
  ...(isBlackAndWhite ? [] : makeChunk('PLTE', [...backgroundRgb, ...colorRgb])),
  ...makeChunk('IDAT', deflate(buildScanLines(data, width, height))),
  ...IEND
);
```

[PNG RFC]: https://datatracker.ietf.org/doc/html/rfc2083
[bit shifting]: https://en.wikipedia.org/wiki/Bitwise_operations_in_C#Shift_operators
[qrcode-svg]: https://npmjs.com/package/qrcode-svg
[pako]: https://npmjs.com/package/pako