SPECS: SCoA Specifications

mounaiban edited this page Jan 21, 2024 · 59 revisions

This page is about the codec used by CAPT printers made up to the mid-2000s. For information on the distinct Hi-SCoA compression codec, please see parts 1.4 and 3 of the SPECS file in the tree.

Overview

SCoA (expanded to Smart Compression Architecture in some marketing materials) is a compression codec for 1-bit (bi-level) monochrome images which makes use of Run-Length Encoding (RLE) and Delta Encoding.

Images begin with one or more "key" or "seed" lines, each of which encodes a whole line. Key lines are RLE-compressed: runs of contiguous repeated bytes are replaced by a single byte with a repeat count. Lines are processed one byte at a time, with each byte representing eight 1-bit pixels (this format is identical to Netpbm P4).
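As a rough illustration of the byte layout described above, the following Python sketch packs a row of 1-bit pixels into bytes, most significant bit first, as Netpbm P4 does. The function name pack_row is hypothetical. Because the handling of widths that are not a multiple of eight is still unknown for SCoA, the sketch simply rejects such rows rather than guessing at a padding rule.

```python
def pack_row(pixels):
    """Pack a row of 0/1 pixels into bytes, first pixel in the top bit (as in P4)."""
    if len(pixels) % 8:
        # Padding behaviour is an open question for SCoA, so this sketch refuses.
        raise ValueError("row width must be a multiple of eight in this sketch")
    out = bytearray()
    for i in range(0, len(pixels), 8):
        byte = 0
        for bit in pixels[i:i + 8]:
            byte = (byte << 1) | (bit & 1)
        out.append(byte)
    return bytes(out)


# Four black pixels, then twelve white ones, become the two bytes 0xF0 0x00.
packed = pack_row([1, 1, 1, 1, 0, 0, 0, 0] + [0] * 8)
```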

Delta encoding is applied on subsequent lines. Segments of unchanged bytes from the previous line are reduced to a single reference. Changed bytes are encoded with RLE. Lines can be ended early with an End-of-Line (EOL) opcode that fills the rest of the line with bytes from the previous line. Lines can be repeated with a lone EOL opcode.

The compressed stream is terminated with an End-of-Page opcode. Unlike lines, pages cannot end early: all lines on a page must be encoded. If the content on a page doesn't reach the bottom, or if the page is blank, the first blank line must be encoded as a key line, and each subsequent line must be encoded as a lone EOL opcode.
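The blank-page rule can be sketched in Python as follows. This is only an illustration under several assumptions: the function names are hypothetical, the byte values come from the opcode table further down this page (which is not yet fully verified), the key line is assumed to be closed by its own EOL, and runs of a single byte are avoided because the minimum R() count may be 2.

```python
EOL, EOP = 0x41, 0x42


def repeat_ops(count, value):
    """Encode P(0) + R(count, value), split into runs of at most 255 bytes."""
    out = bytearray()
    while count > 0:
        n = min(count, 255)
        if count - n == 1:
            n -= 1  # leave a final run of 2 rather than 1
        if n >= 8:
            # 0b101XXXXX 0b10YYYWWW C with WWW = 0: P(0) + R(0bXXXXXYYY, C)
            out += bytes([0xA0 | (n >> 3), 0x80 | ((n & 7) << 3), value])
        else:
            # 0b01YYYXXX C with XXX = 0: P(0) + R(0bYYY, C)
            out += bytes([0x40 | (n << 3), value])
        count -= n
    return bytes(out)


def encode_blank_page(line_bytes, n_lines):
    """Blank key line, then one lone EOL per remaining line, then EOP."""
    stream = bytearray(repeat_ops(line_bytes, 0x00))
    stream.append(EOL)                      # assumption: closes the key line
    stream += bytes([EOL]) * (n_lines - 1)  # each later line repeats the one above
    stream.append(EOP)
    return bytes(stream)


page = encode_blank_page(16, 3)
```

For a 16-byte line and three lines, this yields the seven bytes a2 80 00 41 41 41 42.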

No SCoA colour devices are known to exist. In a product brochure for the LBP2410, Canon claimed that full colour support was only introduced with the newer and distinct Hi-SCoA codec.

Opcodes

Note: The SCoA format is not yet completely charted. Information in this section may be subject to change.

Syntax

Opcode fields are defined at the bit level, but every opcode is aligned to start and end on a byte boundary. This allows the bit stream to be processed as a byte stream. Multi-byte opcodes are read in big-endian order.

Data Opcodes

These opcodes encode data in a compressed form, and are shown in base-2 (0b). There are three parts to data opcodes: the operation, counts and data. Operations and counts are interleaved in the first bytes of the opcode. The data always comes next. All counts are unsigned integers.

One-byte opcode example

0b00XXXYYY S0..Sn

The operation in this case is 0b00. The first count is the 3-bit value 0bXXX and the second count is the 3-bit value 0bYYY. The multi-byte string S0 to Sn is the data.
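A minimal Python sketch of how the three fields of such a byte might be separated with shifts and masks. The function name split_opcode_byte is hypothetical, and the field order follows the 0b00XXXYYY layout of the example above.

```python
def split_opcode_byte(b):
    """Split one opcode byte into (operation, first count, second count)."""
    operation = b >> 6           # top two bits
    first = (b >> 3) & 0b111     # 0bXXX in the example layout
    second = b & 0b111           # 0bYYY in the example layout
    return operation, first, second


# 0b00010011 has operation 0b00, first count 2 and second count 3.
fields = split_opcode_byte(0b00010011)
```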

Two-byte opcode examples

0b100WWWWW 0b00YYYXXX S0..Sn: operation 0b10000, 8-bit count 0bWWWWWXXX and 3-bit count 0bYYY, data bytes S0 to Sn

0b101WWWWW 0b00YYYXXX C: operation 0b10100, 8-bit count 0bWWWWWXXX and 3-bit count 0bYYY, datum byte C

Three-byte opcode example

0b100UUUUU 0b101WWWWW 0b11XXXYYY C: operation 0b10010111, 8-bit count 0bUUUUUYYY, 8-bit count 0bWWWWWXXX, datum byte C
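The way an 8-bit count is assembled from a prefix byte and the following byte can be sketched in Python as below. The function name long_counts is hypothetical, and the layout assumed is the two-byte form 0b100WWWWW 0b00YYYXXX (or 0b101WWWWW 0b00YYYXXX) shown above, where the five prefix bits supply the upper part of the count and 0bXXX the lower part.

```python
def long_counts(b0, b1):
    """Return (prefix, operation, 8-bit count 0bWWWWWXXX, 3-bit count 0bYYY)."""
    prefix = b0 >> 5             # 0b100 or 0b101
    upper = b0 & 0b11111         # 0bWWWWW
    operation = b1 >> 6
    yyy = (b1 >> 3) & 0b111
    xxx = b1 & 0b111
    return prefix, operation, (upper << 3) | xxx, yyy


# 0b10000001 0b01010010: the 8-bit count is 0b00001010 (10), the 3-bit count is 2.
parts = long_counts(0b10000001, 0b01010010)
```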

Control Opcodes

Control opcodes affect the decompressor's behaviour and are shown in base-16 (0x). These commands have no counts and cannot be compressed (although having a compressible EOL would have further increased efficiency).

Opcode Table

Please note that the opcodes have not yet been thoroughly verified.

There are three data decoding operations:

  • P(n): Copy n bytes from the previous line, at the same offset/position as the current line
  • R(n, C): Repeat, n times, the single byte C
  • N(n, S0...Sn): Write n new uncompressed bytes S0 to Sn.

The + operator herein concatenates the results of the operations.
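The three operations can be sketched as small Python helpers that append to the line being built. This is an illustration only: the signatures are hypothetical (the output buffer and previous line are passed explicitly, and N takes the bytes themselves rather than a separate count).

```python
def P(out, prev, n):
    """Copy n bytes from the previous line, starting at the current offset."""
    pos = len(out)
    out.extend(prev[pos:pos + n])


def R(out, n, c):
    """Repeat the single byte c, n times."""
    out.extend(bytes([c]) * n)


def N(out, data):
    """Write new uncompressed bytes as they are."""
    out.extend(data)


prev = bytes(range(8))        # previous line: 00 01 02 03 04 05 06 07
line = bytearray()
P(line, prev, 2)              # 00 01
R(line, 3, 0xFF)              # FF FF FF
N(line, b"\xAA")              # AA
P(line, prev, 2)              # 06 07, taken from offsets 6 and 7
```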

| Opcode | Operation | Canonical Name (TBC) | Operation Description |
|---|---|---|---|
| 0b00YYYXXX S0..Sn | P(0bXXX) + N(0bYYY, S0..Sn) | CopyThenRaw | 0bXXX (0-7) bytes from previous line, then 0bYYY (1-7) uncompressed bytes S0 to Sn. |
| 0b01YYYXXX C | P(0bXXX) + R(0bYYY, C) | CopyThenRepeat | 0bXXX (0-7) bytes from previous line, then 0bYYY (1?-7) repeats of C (minimum R() count may be 2, not 1; please see this comment in #33 in the original repo). |
| 0b11XXXYYY C S0..Sn | R(0bXXX, C) + N(0bYYY, S0..Sn) | RepeatThenRaw | 0bXXX (1-7) repeats of C, then 0bYYY (1-7) uncompressed bytes S0 to Sn. |
| 0b100WWWWW 0b00YYYXXX S0..Sn | P(0bWWWWWXXX) + N(0bYYY, S0..Sn) | CopyThenRawLong | 0bWWWWWXXX (8-255) bytes from previous line, then 0bYYY (1-7) uncompressed bytes S0 to Sn. |
| 0b100WWWWW 0b01YYYXXX C | P(0bWWWWWXXX) + R(0bYYY, C) | CopyThenRepeatLong | 0bWWWWWXXX (8-255) bytes from previous line, then 0bYYY (1-7) repeats of C. |
| 0b101WWWWW 0b00XXXYYY C S0..Sn | R(0bWWWWWXXX, C) + N(0bYYY, S0..Sn) | RepeatThenRawLong | 0bWWWWWXXX (8-255) repeats of C, then 0bYYY (1-7) uncompressed bytes S0 to Sn. |
| 0b101XXXXX 0b01WWWYYY C S0..Sn | R(0bWWW, C) + N(0bXXXXXYYY, S0..Sn) | RepeatThenRaw | 0bWWW (1-7) repeats of C, then 0bXXXXXYYY (8-255) uncompressed bytes S0 to Sn. |
| 0b101XXXXX 0b10YYYWWW C | P(0bWWW) + R(0bXXXXXYYY, C) | CopyThenRepeatLong | 0bWWW (0-7) bytes from the previous line, then 0bXXXXXYYY (8-255) repeats of C. |
| 0b101XXXXX 0b11YYYWWW S0..Sn | P(0bWWW) + N(0bXXXXXYYY, S0..Sn) | CopyThenRawLong | 0bWWW (0-7) bytes from the previous line, then 0bXXXXXYYY (8-255) uncompressed bytes S0 to Sn. |
| 0b100UUUUU 0b101XXXXX 0b10YYYWWW C | P(0bUUUUUWWW) + R(0bXXXXXYYY, C) | CopyThenRepeatLong | 0bUUUUUWWW (8-255) bytes from previous line, then 0bXXXXXYYY (8-255) repeats of C. |
| 0b100UUUUU 0b101XXXXX 0b11YYYWWW S0..Sn | P(0bUUUUUWWW) + N(0bXXXXXYYY, S0..Sn) | CopyThenRawLong | 0bUUUUUWWW (8-255) bytes from previous line, then 0bXXXXXYYY (8-255) uncompressed bytes S0 to Sn. |
| 0x40 | NOP | NOP | Dummy non-op. |
| 0x41 | EOL | EOL | End of line. Fill the rest of the current line with bytes from the previous line, starting at the same offset as on the current line. |
| 0x42 | EOP | EOP | End of page/picture. Don't decompress anything past this point. |
| 0x9f/0b10011111 | n + 248 | Extend | Add 248 to the byte count for P()+N() and P()+R() commands. Can be used N times in a row for 248 * N bytes. Identical to the first byte of P(n)+N(m, S0..Sm) and P(n)+R(m, C) where n is from 248 to 255. |
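To show how the table might be turned into a decoder, here is a Python sketch. It is not the Studycapt decoder and has not been checked against printer output. Its assumptions are: the line width in bytes is known in advance, every line is terminated by an EOL opcode, the line above the first line is all zeros, 0x9f is always treated as Extend (which gives the same result as reading it as a prefix with a count of 248 to 255), and data running past the end of a line is left as-is. The names parse and decode are hypothetical.

```python
NOP, EOL, EOP, EXTEND = 0x40, 0x41, 0x42, 0x9F


def parse(stream, i):
    """Return (p, r, n, next_i): copy count, repeat count (None if the opcode
    has no R() part), raw count, and the index of the first data byte."""
    b0 = stream[i]
    mid, low = (b0 >> 3) & 7, b0 & 7
    if b0 >> 6 == 0b00:                       # 0b00YYYXXX: P(XXX) + N(YYY)
        return low, None, mid, i + 1
    if b0 >> 6 == 0b01:                       # 0b01YYYXXX: P(XXX) + R(YYY)
        return low, mid, 0, i + 1
    if b0 >> 6 == 0b11:                       # 0b11XXXYYY: R(XXX) + N(YYY)
        return 0, mid, low, i + 1
    up, b1 = b0 & 31, stream[i + 1]           # 0b100..... or 0b101..... prefix
    kind, mid, low = b1 >> 6, (b1 >> 3) & 7, b1 & 7
    if b0 >> 5 == 0b101:
        if kind == 0b00:                      # R(up:mid) + N(low)
            return 0, (up << 3) | mid, low, i + 2
        if kind == 0b01:                      # R(mid) + N(up:low)
            return 0, mid, (up << 3) | low, i + 2
        if kind == 0b10:                      # P(low) + R(up:mid)
            return low, (up << 3) | mid, 0, i + 2
        return low, None, (up << 3) | mid, i + 2   # P(low) + N(up:mid)
    if kind == 0b00:                          # 0b100 prefix: P(up:low) + N(mid)
        return (up << 3) | low, None, mid, i + 2
    if kind == 0b01:                          # 0b100 prefix: P(up:low) + R(mid)
        return (up << 3) | low, mid, 0, i + 2
    if b1 >> 5 != 0b101 or stream[i + 2] >> 6 < 0b10:
        raise ValueError("opcode not in the table")
    b2 = stream[i + 2]                        # three-byte forms
    p, big = (up << 3) | (b2 & 7), ((b1 & 31) << 3) | ((b2 >> 3) & 7)
    return (p, big, 0, i + 3) if b2 >> 6 == 0b10 else (p, None, big, i + 3)


def decode(stream, line_bytes):
    """Decode a SCoA byte stream into a list of lines (bytes objects)."""
    lines, prev, cur = [], bytes(line_bytes), bytearray()
    i = extend = 0
    while i < len(stream):
        b = stream[i]
        if b in (NOP, EOL, EOP, EXTEND):
            i += 1
            if b == EXTEND:
                extend += 248
            elif b == EOL:
                cur += prev[len(cur):]        # fill the rest from the line above
                prev, cur = bytes(cur), bytearray()
                lines.append(prev)
            elif b == EOP:
                break
            continue
        p, r, n, i = parse(stream, i)
        cur += prev[len(cur):len(cur) + p + extend]
        extend = 0
        if r is not None:
            cur += bytes([stream[i]]) * r
            i += 1
        cur += stream[i:i + n]
        i += n
    return lines


# Key line of four raw bytes, a repeated line, then a delta line, then EOP.
demo = bytes([0x20, 0xDE, 0xAD, 0xBE, 0xEF, EOL, EOL, 0x51, 0xFF, EOL, EOP])
rows = decode(demo, 4)
```

In the demo stream, the third line copies one byte from the line above, writes two 0xFF bytes, and lets EOL fill in the last byte, giving de ff ff ef.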

Notes

Opcode Names

Canonical names were taken from a disassembly of the captfilter command from the original Canon driver. Copy is currently understood as "copy from previous line", and Raw is currently understood as "uncompressed".

Tracking the Previous Line

The decoder should keep track of its position on the previous line. Every operation advances the position by its count, whether or not it uses the contents of the previous line. For example:

P(7) + N(2, [0xBA, 0xBE]) + P(7) + R(17, 0xCC) + P(7)

Copies 7 bytes from the previous line,

Skips the next 2 bytes and inserts [0xBA, 0xBE] instead,

Copies another 7 from the previous line,

Skips the next 17 and inserts the same amount of 0xCC bytes instead, and finally,

Copies yet another 7 from the previous line.
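The same sequence can be traced in Python. In this sketch the position on the previous line is simply the length of the line built so far. The function name apply_ops and the tuple encoding of the operations are hypothetical, and the 40-byte previous line is made up for the demonstration.

```python
def apply_ops(prev, ops):
    """Apply P/N/R operations in order, tracking the offset on the previous line."""
    cur = bytearray()
    for op in ops:
        pos = len(cur)                       # offset shared by both lines
        if op[0] == "P":
            cur += prev[pos:pos + op[1]]     # copy from the same offset
        elif op[0] == "N":
            cur += op[1]                     # new bytes replace the old ones
        else:                                # "R"
            cur += bytes([op[2]]) * op[1]
    return bytes(cur)


prev = bytes(range(40))                      # bytes 0x00 to 0x27
ops = [("P", 7), ("N", b"\xBA\xBE"), ("P", 7), ("R", 17, 0xCC), ("P", 7)]
line = apply_ops(prev, ops)
```

The second P(7) picks up at offset 9 and the last one at offset 33, so the result is 40 bytes long, the same as the previous line.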

The behaviour of P() on the first line is unknown. As such, it is advisable to assume an imaginary "previous line" consisting entirely of zero (0x00) bytes before the first line of the compressed image.

Extending P() Byte Count With the 0x9f Opcode

The P() byte count in P()+N() and P()+R() may be extended beyond 255 bytes by placing one or more 0x9f commands at the start of the opcode. For example, 0x9f 0b10000001 0b01010010 C copies 258 bytes (248 + 10) from the previous line, followed by two repeats of C. Likewise, 0x9f 0x9f 0b10000001 0b01010010 C does the same with 506 bytes (2 * 248 + 10). Only two 0x9f's are necessary to reach the end of the line on an 8.5 inch wide page at 600 dpi (5100 pixels, or about 638 bytes).
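The arithmetic can be sketched in Python from the encoder's side. The function name copy_then_repeat is hypothetical, and the choice to use as few 0x9f bytes as possible is an assumption, since captfilter's actual strategy is not documented here.

```python
def copy_then_repeat(copy, repeat, c):
    """Encode P(copy) + R(repeat, c), adding 0x9f bytes when copy exceeds 255."""
    if not 1 <= repeat <= 7:
        raise ValueError("this sketch only handles short repeat counts")
    # Fewest 0x9f bytes that bring the remaining count down to 255 or less.
    extends = 0 if copy <= 255 else (copy - 255 + 247) // 248
    rest = copy - 248 * extends
    out = bytearray([0x9F] * extends)
    if rest >= 8:
        out.append(0x80 | (rest >> 3))                # 0b100WWWWW
    out.append(0x40 | (repeat << 3) | (rest & 7))     # 0b01YYYXXX
    out.append(c)
    return bytes(out)


one = copy_then_repeat(258, 2, 0xCC)   # 9f 81 52 cc
two = copy_then_repeat(506, 2, 0xCC)   # 9f 9f 81 52 cc
```

Both results match the byte sequences given in the examples above, with C set to 0xCC.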

Unknown Cases

It is yet to be known how captfilter or printers handle the following:

  • Data that run past the end of line. captfilter's encoder is careful to keep lines shorter than the line size. Should excess bytes be discarded or carried over to the next line?

  • Input images with a width that is not a multiple of eight. Where should padding be added, or should the image be rejected outright?

  • Output of P() opcodes on the first line of the compressed image. What would printers output?

Support

Captdriver does not yet support SCoA compression, but there is ongoing work to implement support.

Decoder

An experimental working SCoA decoder is available from the Studycapt repository. Instructions on its usage may be found in README.md of the source tree.

References

Nicolas Boichat. LBP 810 and 1120 Driver SPECS file. Repository maintained by Alexander Sakharuk.

Canon (2003-12-01). Laser Shot LBP-2410 Colour Laser Printer. ICAN0275. SHA256: 250b5113a5986daf90ad2a44df683fa7afafb468a9635d4a8f1e86733b5d608b

Further Reading

PackBits Compression, described in detail in Section 9 of the TIFF 6.0 Specification (1992-06-03). SCoA appears to have been influenced by PackBits, which similarly divides the data into uncompressed and compressed regions.

pbm - Netpbm bi-level image format. See The Layout.

Mode 9 Compression as specified in Chapter 2 Section 6.3.8 of the Brother Printer Technical Reference Guide implements similar delta coding techniques on Brother printers.

  • Alternate download link: Brother HL-2132 Manuals (click on the Download link to the Command Reference Guide for Software Developers). The manual is also linked from information pages for most other laser printer products on the Brother website.

Acknowledgements

This document is based on findings made by Nicolas Boichat and documented in the source files of the LBP810 and 1120 driver.
