Pure Go H.264/AVC IDR Decoder & Bitstream Generator

A pure Go H.264/AVC decoder for IDR (and P_Skip) frames with both CABAC and CAVLC entropy coding, plus a bitstream generator for producing valid H.264 test content with IDR and empty P-frames from grid patterns at 16x16 or 8x8 block granularity. It can also be used to extend video with extra frames without a change of SPS/PPS.

This is not a general-purpose video encoder — it does not accept arbitrary pixel input or perform motion estimation. The encoder produces I_16x16 DC prediction frames from grid patterns (one color per block), with proper AC residual encoding when 8x8 blocks create step patterns at 4x4 sub-block boundaries. This is useful for generating test bitstreams, color bars, frame counters, and reference content for decoder verification.

All processing is currently 8-bit 4:2:0 only (no 10-bit or 4:2:2/4:4:4 support).

Pixel-perfect match with FFmpeg IDR decoding across 41+ golden test cases covering varied content, profiles, QP ranges, scaling matrices, deblocking, resolutions, and both entropy coding modes.

Build & Test

go build ./...
go test ./...

CLI Tools

hi264dec — Decode H.264 IDR frames from raw .264 or MP4

The input format is auto-detected from the file extension: .mp4 and .m4v are read as MP4, anything else as a raw Annex-B bitstream. The output format is detected from the output file extension: .png, .jpg/.jpeg, .y4m, or .yuv.
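
The extension rule above can be sketched in a few lines of Go. This is a minimal illustration, not the actual hi264dec code: the function names inputFormat and outputFormat are hypothetical, and the case-insensitive matching is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// inputFormat returns "mp4" for .mp4/.m4v and "annexb" for everything else.
// Case-insensitive matching is an assumption of this sketch.
func inputFormat(path string) string {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".mp4", ".m4v":
		return "mp4"
	default:
		return "annexb"
	}
}

// outputFormat maps an output file extension to a format name.
func outputFormat(path string) (string, error) {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".png":
		return "png", nil
	case ".jpg", ".jpeg":
		return "jpeg", nil
	case ".y4m":
		return "y4m", nil
	case ".yuv":
		return "yuv", nil
	default:
		return "", fmt.Errorf("unsupported output extension %q", filepath.Ext(path))
	}
}

func main() {
	fmt.Println(inputFormat("input.m4v"))
	format, err := outputFormat("frames.jpeg")
	fmt.Println(format, err)
}
```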

# Raw Annex-B .264 input
go run ./cmd/hi264dec input.264 output.png             # PNG output
go run ./cmd/hi264dec input.264 output.jpg             # JPEG output
go run ./cmd/hi264dec input.264 output.y4m             # Y4M output
go run ./cmd/hi264dec input.264 output.yuv             # raw YUV (auto-adds _WxH_yuv420p suffix)

# MP4 input
go run ./cmd/hi264dec input.mp4 output.png             # decode first IDR frame
go run ./cmd/hi264dec -n 5 input.mp4 frames.png        # extract 5 IDR frames (frames_0000.png, ...)
go run ./cmd/hi264dec -n 3 input.mp4 output.y4m        # 3 frames in single Y4M file

# Options
go run ./cmd/hi264dec -no-deblock input.264 output.yuv # skip deblocking filter
go run ./cmd/hi264dec -q 95 input.264 output.jpg       # JPEG quality (default 85)
go run ./cmd/hi264dec -colorspace bt709 input.264 output.png  # override color space
go run ./cmd/hi264dec input.264                        # decode only, print info

# Decode IDR + P_Skip frames (for hi264gen-produced streams)
go run ./cmd/hi264dec -idr-and-skip -n 10 input.264 frames.png

hi264gen — H.264 bitstream generator for test content

Generates valid H.264 bitstreams from grid-based patterns. Each character in a grid maps to one block (16x16 by default, or 8x8 with @8x8 directive) filled with a single flat color, encoded as I_16x16 with DC prediction. This is not a general-purpose encoder — it produces test content from color patterns, not from arbitrary video frames.

Output format is auto-detected from the file extension, or set explicitly with -f:

| Extension / -f | Format | Notes |
|---|---|---|
| .264 / 264 | Annex-B | Raw H.264 bitstream |
| .mp4 / mp4 | Fragmented MP4 | fMP4/CMAF with configurable fps and fragment duration |
| .y4m / y4m | Y4M | YUV4MPEG2 container |
| .yuv / yuv | Raw YUV | 4:2:0 planar (auto-adds _WxH_yuv420p suffix) |
| .png / png | PNG | Raw grid output (no H.264 encoding) |
| .jpg / jpg | JPEG | Raw grid output (-q for quality, default 85) |

Use -o - to write to stdout (requires -f to set the format).

For H.264 output, hi264gen supports both CAVLC (Baseline profile) and CABAC (Main profile) entropy coding. Multi-frame sequences use P_Skip frames between IDR keyframes to copy the reference frame unchanged, which makes the stream much smaller than an all-IDR sequence. Image formats (YUV, Y4M, PNG, JPEG) output the grid pattern directly without H.264 encoding, which is useful as reference images for encode-decode chain verification.

# Grid-only: single IDR frame from grid pattern (frame size = grid size)
go run ./cmd/hi264gen -gi examples/sweden.gridimg -o sweden.264
go run ./cmd/hi264gen -gi examples/sweden.gridimg -cabac -o sweden_cabac.264
go run ./cmd/hi264gen -gp "xy,yx" -gc x=235,128,128 -gc y=16,128,128 -o checker.264
go run ./cmd/hi264gen -gp "ab" -gc a=255,0,0 -gc b=0,0,255 -rgb -qp 20 -no-deblock -o test.264

# Text overlay: frame counter on solid background
go run ./cmd/hi264gen -w 176 -h 80 -n 10 -text "%03d" -o counter.264

# Timestamp overlay
go run ./cmd/hi264gen -w 512 -h 240 -n 75 -fps 25 -text "%mm:%ss.%ff" -o timestamp.264

# With P_Skip frames (IDR every 50 frames, P_Skip copies between, CAVLC)
go run ./cmd/hi264gen -w 1280 -h 720 -n 121 -text "%03d" -idr-interval 50 -o counter.264

# With CABAC P_Skip frames (Main profile)
go run ./cmd/hi264gen -w 1280 -h 720 -n 121 -text "%03d" -cabac -idr-interval 50 -o counter.264

# Fragmented MP4 output (25 fps default, fragment every 25 frames)
go run ./cmd/hi264gen -w 176 -h 80 -n 50 -text "%03d" -o counter.mp4

# MP4 with custom framerate and fragment duration
go run ./cmd/hi264gen -w 320 -h 240 -n 75 -text "%03d" -fps 30 -frag-dur 30 -o counter.mp4

# Tiled: grid pattern tiled to fill custom dimensions, with optional text overlay
go run ./cmd/hi264gen -gi examples/checker4x4.gridimg -w 176 -h 80 -n 10 -text "%03d" -o counter.264

# SMPTE color bars with counter overlay
go run ./cmd/hi264gen -smpte -w 176 -h 80 -n 10 -text "%03d" -o smpte.264

# SMPTE bars with text background box and explicit scale
go run ./cmd/hi264gen -smpte -w 352 -h 288 -n 1 -text "%02d" -text-scale 3 -text-bg 0,0,0 -o smpte_big.264

# Multi-line text overlay (use \n to separate lines)
go run ./cmd/hi264gen -smpte -w 320 -h 240 -n 75 -fps 25 -text '%03d\n%mm:%ss.%ff' -o multiline.mp4

# Fixed bytes per picture (pad with H.264 filler NALUs for CBR-like streams)
go run ./cmd/hi264gen -smpte -w 176 -h 80 -bpp 5000 -o padded.264
go run ./cmd/hi264gen -w 320 -h 240 -n 50 -text "%03d" -bpp 8000 -o cbr_counter.mp4

# Target bitrate instead of bytes per picture (-kbps converts to bpp using -fps)
go run ./cmd/hi264gen -w 320 -h 240 -n 50 -text "%03d" -kbps 1000 -o cbr_counter.mp4

# Pipe to stdout (requires -f to specify format)
go run ./cmd/hi264gen -smpte -w 320 -h 240 -n 100 -text "%03d" -f 264 -o - | ffplay -i -
go run ./cmd/hi264gen -smpte -w 320 -h 240 -n 100 -text "%03d" -f mp4 -o - | ffplay -i -

# PNG/JPEG image as background (downsampled to block resolution)
go run ./cmd/hi264gen -gi photo.png -o photo.264                                        # native resolution
go run ./cmd/hi264gen -gi photo.png -w 320 -h 240 -o photo_scaled.264                   # scale to cover
go run ./cmd/hi264gen -gi photo.jpg -8x8 -o photo_8x8.264                               # 8x8 block detail
go run ./cmd/hi264gen -gi photo.png -w 320 -h 240 -text "%03d" -n 10 -o counter.mp4     # scale + text
go run ./cmd/hi264gen -gi photo.png -o roundtrip.png                                     # raw PNG output

# Raw image output (no H.264 encoding, useful as decoder reference)
go run ./cmd/hi264gen -gi examples/sweden.gridimg -o sweden.png
go run ./cmd/hi264gen -gi examples/sweden.gridimg -o sweden.yuv
go run ./cmd/hi264gen -gi examples/sweden.gridimg -q 95 -o sweden.jpg
go run ./cmd/hi264gen -w 176 -h 80 -n 5 -text "%03d" -o output.y4m
go run ./cmd/hi264gen -w 176 -h 80 -n 5 -text "%03d" -o frame_%03d.png

# Color space: generate BT.709 stream (VUI signaled in SPS)
go run ./cmd/hi264gen -gi examples/sweden.gridimg -colorspace bt709 -o sweden_709.264

# Full-range BT.709
go run ./cmd/hi264gen -smpte -w 320 -h 240 -colorspace bt709 -full-range -o smpte_709.264

Flags:

| Flag | Description | Default |
|---|---|---|
| -gi | Grid image file (.gridimg, .png, .jpg, .jpeg) | |
| -gp | Inline grid pattern (e.g. "xy,yx") | |
| -gc | Grid color mapping (repeatable, e.g. x=235,128,128 YCbCr or RGB with -rgb) | |
| -f | Output format (264, mp4, y4m, yuv, png, jpg); required with -o - | auto-detect |
| -rgb | Treat -gc values as RGB instead of YCbCr | off |
| -smpte | Use built-in 75% SMPTE color bars pattern | off |
| -w | Frame width in pixels | grid width |
| -h | Frame height in pixels | grid height |
| -n | Number of frames | 1 |
| -text | Text overlay pattern (e.g. "%03d", "%mm:%ss.%ff", \n for newlines) | |
| -text-scale | Text scale factor (0 = auto-fit) | 0 |
| -text-bg | Text background box color (R,G,B) | none |
| -fg | Foreground color (R,G,B) | |
| -bg | Background color (R,G,B) | |
| -qp | Quantization parameter | 26 |
| -cabac | Use CABAC entropy coding (Main profile) | off (CAVLC) |
| -no-deblock | Disable deblocking filter | off |
| -q | JPEG quality | 85 |
| -idr-interval | Frames between IDR keyframes (0 = all-IDR) | 0 |
| -bpp | Bytes per picture (filler NAL padding) | 0 (off) |
| -kbps | Target bitrate in kbit/s (converted to bpp using -fps) | 0 (off) |
| -colorspace | Color space (bt601/bt709/bt2020) | bt601 |
| -full-range | Full-range YCbCr (0-255) | off (limited) |
| -fps | MP4 framerate | 25 |
| -frag-dur | MP4 fragment duration in frames | 25 |
| -o | Output file (- for stdout) | |

Text supports A-Z 0-9 and punctuation ! # % + - . / : = ? [ ] _ ( ) plus space. Lowercase input is auto-uppercased.
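
To illustrate what the -rgb, -colorspace and -full-range options imply for color values, here is a sketch of a standard 8-bit RGB-to-YCbCr conversion using the usual Kr/Kb luma coefficients for BT.601, BT.709 and BT.2020. The function names and the rounding are assumptions of this sketch; the tool's own conversion may round differently.

```go
package main

import (
	"fmt"
	"math"
)

// lumaCoeffs returns the Kr and Kb luma coefficients of a color space.
func lumaCoeffs(cs string) (kr, kb float64, err error) {
	switch cs {
	case "bt601":
		return 0.299, 0.114, nil
	case "bt709":
		return 0.2126, 0.0722, nil
	case "bt2020":
		return 0.2627, 0.0593, nil
	}
	return 0, 0, fmt.Errorf("unknown color space %q", cs)
}

// clamp8 rounds to the nearest integer and clamps to 0..255.
func clamp8(v float64) uint8 {
	return uint8(math.Max(0, math.Min(255, math.Round(v))))
}

// rgbToYCbCr converts 8-bit RGB to 8-bit {Y, Cb, Cr}.
// Limited range maps Y to 16..235 and chroma to 16..240.
func rgbToYCbCr(r, g, b uint8, cs string, fullRange bool) ([3]uint8, error) {
	kr, kb, err := lumaCoeffs(cs)
	if err != nil {
		return [3]uint8{}, err
	}
	rf, gf, bf := float64(r)/255, float64(g)/255, float64(b)/255
	y := kr*rf + (1-kr-kb)*gf + kb*bf
	pb := (bf - y) / (2 * (1 - kb)) // blue-difference, -0.5..0.5
	pr := (rf - y) / (2 * (1 - kr)) // red-difference, -0.5..0.5
	if fullRange {
		return [3]uint8{clamp8(255 * y), clamp8(128 + 255*pb), clamp8(128 + 255*pr)}, nil
	}
	return [3]uint8{clamp8(16 + 219*y), clamp8(128 + 224*pb), clamp8(128 + 224*pr)}, nil
}

func main() {
	for _, cs := range []string{"bt601", "bt709", "bt2020"} {
		ycc, err := rgbToYCbCr(255, 0, 0, cs, false)
		fmt.Println(cs, ycc, err)
	}
}
```

With limited range, white and black come out as 235,128,128 and 16,128,128, the same values used for the checkerboard example earlier in this section.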

Constant bitrate testing with -bpp / -kbps

The -bpp flag pads each picture to an exact byte count using H.264 filler data NAL units (NAL type 12, per spec section 7.3.2.7). This is useful for testing bitrate-sensitive scenarios such as ABR ladder switching, buffer management, and segment size constraints.
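
As a rough sketch of the padding idea, the following builds an Annex-B filler data NAL unit (header byte 0x0C for NAL type 12, a run of 0xFF bytes, then the 0x80 trailing-bits byte) and appends it to a picture to reach a target size. The helper names fillerNALU and padPicture are hypothetical, and the 4-byte start code is an assumption; hi264gen may lay out its padding differently, and MP4 output would use length prefixes rather than start codes.

```go
package main

import (
	"errors"
	"fmt"
)

// minFillerSize is the smallest filler NAL unit in this sketch:
// 4-byte start code + 1 header byte + 1 trailing-bits byte.
const minFillerSize = 6

// fillerNALU builds an Annex-B filler data NAL unit of exactly size bytes.
func fillerNALU(size int) ([]byte, error) {
	if size < minFillerSize {
		return nil, errors.New("filler NAL unit needs at least 6 bytes")
	}
	b := make([]byte, size)
	copy(b, []byte{0, 0, 0, 1}) // start code
	b[4] = 0x0C                 // nal_ref_idc = 0, nal_unit_type = 12
	for i := 5; i < size-1; i++ {
		b[i] = 0xFF // ff_byte
	}
	b[size-1] = 0x80 // rbsp_trailing_bits
	return b, nil
}

// padPicture appends one filler NAL unit so the result is exactly target bytes.
func padPicture(pic []byte, target int) ([]byte, error) {
	gap := target - len(pic)
	if gap == 0 {
		return pic, nil
	}
	if gap < 0 {
		return nil, fmt.Errorf("picture is %d bytes, above the %d byte target", len(pic), target)
	}
	filler, err := fillerNALU(gap)
	if err != nil {
		return nil, err
	}
	out := make([]byte, 0, target)
	out = append(out, pic...)
	return append(out, filler...), nil
}

func main() {
	pic := make([]byte, 1200) // stand-in for an encoded picture
	out, err := padPicture(pic, 5000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(out))
}
```

Because the filler payload consists only of 0xFF bytes, it never needs emulation prevention bytes.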

Alternatively, use -kbps to specify the target bitrate directly in kbit/s — it is converted to bytes per picture using the current -fps value: bpp = kbps * 1000 / 8 / fps. The two flags are mutually exclusive.

Conversely, a given -bpp value corresponds to a bitrate of bpp * 8 * fps / 1000 kbit/s. For example, -bpp 5000 at 25 fps gives 1000 kbit/s (equivalent to -kbps 1000). An error is returned if a frame's encoded slice already exceeds the target; in that case, use a higher QP or a larger -bpp/-kbps value.
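
The two conversions can be written out directly. This is a small sketch of the arithmetic only; the function names are hypothetical, and truncating the result to whole bytes is an assumption about rounding.

```go
package main

import "fmt"

// bppFromKbps converts a target bitrate to bytes per picture:
// bpp = kbps * 1000 / 8 / fps (truncated to whole bytes in this sketch).
func bppFromKbps(kbps int, fps float64) int {
	return int(float64(kbps) * 1000 / 8 / fps)
}

// kbpsFromBpp converts bytes per picture back to kbit/s:
// kbps = bpp * 8 * fps / 1000.
func kbpsFromBpp(bpp int, fps float64) float64 {
	return float64(bpp) * 8 * fps / 1000
}

func main() {
	fmt.Println(bppFromKbps(1000, 25)) // bytes per picture for 1000 kbit/s at 25 fps
	fmt.Println(kbpsFromBpp(5000, 25)) // kbit/s for 5000 bytes per picture at 25 fps
}
```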

A practical pattern is to use different background colors or patterns for different bitrate tiers so the current quality level is visually obvious during playback:

# 500 kbit/s tier — green background
go run ./cmd/hi264gen -w 320 -h 240 -n 50 -text "%03d" -bg 0,128,0 -kbps 500 -o low.mp4

# 1500 kbit/s tier — blue background
go run ./cmd/hi264gen -w 640 -h 360 -n 50 -text "%03d" -bg 0,0,200 -kbps 1500 -o mid.mp4

# 3000 kbit/s tier — red background
go run ./cmd/hi264gen -w 1280 -h 720 -n 50 -text "%03d" -bg 200,0,0 -kbps 3000 -o high.mp4

This makes it easy to verify that an ABR player switches between the correct renditions — you can tell which bitrate tier is active just by looking at the background color.

Image File Format

The .gridimg format combines color definitions and a grid layout in one file:

# Comments start with #
@rgb
@bt709
# Colors: char=v1,v2,v3 (YCbCr by default, RGB with @rgb directive or -rgb flag)
B=0,106,167
Y=254,204,0

BBBBBYYBBBBBBBBB
BBBBBYYBBBBBBBBB
YYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYY
BBBBBYYBBBBBBBBB
BBBBBYYBBBBBBBBB

Each character in the grid maps to one block. By default each character is a 16x16 macroblock; with the @8x8 directive, each character maps to an 8x8 block (4 characters per macroblock, enabling finer spatial detail with proper AC residual encoding). Supported directives: @rgb (treat values as RGB), @8x8 (8x8 block granularity), @bt601/@bt709/@bt2020 (color space for RGB-to-YCbCr conversion). See examples/ for complete examples.
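
To make the file layout concrete, here is a minimal parser sketch for the format as described above. It is not the library's parser: the GridImage type, the parseGridImg function and the classification rules (a line whose second character is "=" is a color, any other non-comment, non-directive line is a grid row) are assumptions of this example.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// GridImage is a hypothetical in-memory form of a .gridimg file.
type GridImage struct {
	Directives map[string]bool // e.g. "rgb", "8x8", "bt709"
	Colors     map[byte][3]int // grid character -> three component values
	Rows       []string        // grid rows, one character per block
}

// parseGridImg reads comments, @directives, char=v1,v2,v3 colors and grid rows.
func parseGridImg(src string) (*GridImage, error) {
	g := &GridImage{Directives: map[string]bool{}, Colors: map[byte][3]int{}}
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "" || strings.HasPrefix(line, "#"):
			// blank line or comment: nothing to do
		case strings.HasPrefix(line, "@"):
			g.Directives[line[1:]] = true
		case len(line) > 2 && line[1] == '=':
			parts := strings.Split(line[2:], ",")
			if len(parts) != 3 {
				return nil, fmt.Errorf("color %q needs three values", line)
			}
			var c [3]int
			for i, p := range parts {
				v, err := strconv.Atoi(strings.TrimSpace(p))
				if err != nil || v < 0 || v > 255 {
					return nil, fmt.Errorf("bad component in %q", line)
				}
				c[i] = v
			}
			g.Colors[line[0]] = c
		default:
			if len(g.Rows) > 0 && len(line) != len(g.Rows[0]) {
				return nil, fmt.Errorf("ragged grid row %q", line)
			}
			g.Rows = append(g.Rows, line)
		}
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	// Every grid character must have a color definition.
	for _, row := range g.Rows {
		for i := 0; i < len(row); i++ {
			if _, ok := g.Colors[row[i]]; !ok {
				return nil, fmt.Errorf("undefined color %q", row[i])
			}
		}
	}
	return g, nil
}

// PixelSize returns the frame size implied by the grid: 16 pixels per
// character by default, 8 with the @8x8 directive.
func (g *GridImage) PixelSize() (w, h int) {
	if len(g.Rows) == 0 {
		return 0, 0
	}
	bs := 16
	if g.Directives["8x8"] {
		bs = 8
	}
	return len(g.Rows[0]) * bs, len(g.Rows) * bs
}

func main() {
	src := "@rgb\nB=0,106,167\nY=254,204,0\n\nBBYB\nYYYY\nBBYB\n"
	g, err := parseGridImg(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	w, h := g.PixelSize()
	fmt.Println(w, h)
}
```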

Example Patterns

The examples/ directory contains several .gridimg files:

| File | Description | Size (MBs) |
|---|---|---|
| sweden.gridimg | Swedish flag with official NCS colors | 16x10 |
| france.gridimg | French tricolore | 9x6 |
| japan.gridimg | Japanese flag (Hinomaru) | 12x8 |
| rainbow_stripe.gridimg | Vertical rainbow (6 colors) | 6x2 |
| checker4x4.gridimg | Red/cyan checkerboard | 4x4 |
| gradient5.gridimg | 5-shade gray gradient | 5x3 |
| dark_saturated.gridimg | Extreme chroma values | 4x4 |
| logo.gridimg | hi264 logo: SMPTE bars with text | 48x27 |

# Encode to H.264
go run ./cmd/hi264gen -gi examples/sweden.gridimg -o sweden.264

# Decode to PNG
go run ./cmd/hi264dec sweden.264 sweden.png

# Generate reference PNG for comparison (raw output, no H.264)
go run ./cmd/hi264gen -gi examples/sweden.gridimg -o expected.png

# Cross-verify with FFmpeg (raw YUV)
go run ./cmd/hi264dec sweden.264 sweden.yuv
ffmpeg -i sweden.264 -pix_fmt yuv420p -f rawvideo ff.yuv
cmp sweden.yuv ff.yuv  # should be identical

# Run all encoder verification tests
bash tools/verify_hi264gen.sh

Library Usage

The pkg/ packages provide a public API for use as a Go library. Implementation details are in internal/ and not accessible to external callers.

import (
    "github.com/Eyevinn/hi264/pkg/decoder"
    "github.com/Eyevinn/hi264/pkg/encode"
    "github.com/Eyevinn/hi264/pkg/yuv"
)

// Decode an Annex-B byte stream (e.g. .264 file contents)
dec := decoder.New()
frame, err := dec.DecodeAnnexB(data)

// Decode AVC-format data (4-byte length-prefixed NALUs, e.g. from MP4 samples)
frame, err = dec.DecodeAVC(sampleData)

// Decode multi-frame stream (IDR + P_Skip)
frames, err := dec.DecodeAllAnnexB(data)

// Generate H.264 test bitstream from grid pattern
p := encode.EncodeParams{Width: 320, Height: 240, QP: 26}
sps, _ := encode.GenerateSPS(p)
pps, _ := encode.GeneratePPS(p)
idr, _ := encode.GenerateIDR(p, grid, colors, 0)

// Generate from PlaneGrid (supports 8x8 block granularity)
plane, _ := yuv.GridToPlaneGridBS(grid, colors, 8)
idr, _ = encode.GenerateIDRFromPlane(p, plane, 0)
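
For readers unfamiliar with the two framings mentioned above, the sketch below splits an AVC-format sample (4-byte big-endian length before each NALU) and re-frames it as an Annex-B byte stream with start codes. It uses only the standard library; the helper names splitAVCSample and toAnnexB are hypothetical and are not part of the hi264 API.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// splitAVCSample splits a sample made of 4-byte length-prefixed NALUs.
// The returned slices share memory with sample.
func splitAVCSample(sample []byte) ([][]byte, error) {
	var nalus [][]byte
	for pos := 0; pos < len(sample); {
		if len(sample)-pos < 4 {
			return nil, errors.New("truncated NALU length field")
		}
		n := int(binary.BigEndian.Uint32(sample[pos:]))
		pos += 4
		if n < 0 || n > len(sample)-pos {
			return nil, errors.New("NALU length exceeds sample size")
		}
		nalus = append(nalus, sample[pos:pos+n])
		pos += n
	}
	return nalus, nil
}

// toAnnexB prefixes each NALU with a 4-byte start code.
func toAnnexB(nalus [][]byte) []byte {
	var out []byte
	for _, n := range nalus {
		out = append(out, 0, 0, 0, 1)
		out = append(out, n...)
	}
	return out
}

func main() {
	// Two toy NALUs: an IDR slice header byte plus one payload byte,
	// and a one-byte filler NALU header.
	sample := []byte{0, 0, 0, 2, 0x65, 0x88, 0, 0, 0, 1, 0x0C}
	nalus, err := splitAVCSample(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, n := range nalus {
		if len(n) > 0 {
			fmt.Printf("NAL type %d, %d bytes\n", n[0]&0x1f, len(n))
		}
	}
	fmt.Println(len(toAnnexB(nalus)))
}
```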

Performance tip: When writing encoded slices to a file, wrap the io.Writer in a bufio.Writer to avoid a syscall per frame. This can reduce write overhead by ~87% for multi-frame sequences.
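
A minimal sketch of that tip, using only the standard library: frames are written through a bufio.Writer and flushed once at the end. The countingWriter type and writeFrames function are hypothetical helpers that exist only to show how many writes reach the underlying writer; in real use the destination would be an *os.File.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// countingWriter stands in for a file and counts the writes it receives.
type countingWriter struct {
	calls int
	bytes int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	c.bytes += len(p)
	return len(p), nil
}

// writeFrames writes all frames through a buffered writer and flushes once.
func writeFrames(w io.Writer, frames [][]byte) error {
	bw := bufio.NewWriter(w) // default buffer size is 4096 bytes
	for _, f := range frames {
		if _, err := bw.Write(f); err != nil {
			return err
		}
	}
	// Flush is required, otherwise buffered data never reaches w.
	return bw.Flush()
}

func main() {
	frames := make([][]byte, 10)
	for i := range frames {
		frames[i] = make([]byte, 100) // stand-in for a small P_Skip slice
	}
	cw := &countingWriter{}
	if err := writeFrames(cw, frames); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cw.calls, cw.bytes)
}
```

Forgetting the final Flush is the usual pitfall with this pattern, so it is worth returning its error as done here.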

Appending frames to an existing bitstream

This example parses SPS/PPS from an existing H.264 bitstream, then appends a black IDR frame and a P_Skip frame that are compatible with the original parameter sets:

import (
    "github.com/Eyevinn/mp4ff/avc"
    "github.com/Eyevinn/hi264/pkg/encode"
    "github.com/Eyevinn/hi264/pkg/yuv"
)

// Parse parameter sets from the existing bitstream
nalus := avc.ExtractNalusFromByteStream(existingStream)
spsMap := make(map[uint32]*avc.SPS)
var sps *avc.SPS
var pps *avc.PPS
for _, nalu := range nalus {
    if len(nalu) < 1 {
        continue
    }
    naluType := nalu[0] & 0x1f
    switch naluType {
    case 7: // SPS
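        parsed, err := avc.ParseSPSNALUnit(nalu, true)
        if err != nil {
            continue // skip an unparsable SPS instead of dereferencing nil
        }
        sps = parsed
        spsMap[sps.ParameterID] = sps
    case 8: // PPS
        if parsed, err := avc.ParsePPSNALUnit(nalu, spsMap); err == nil {
            pps = parsed
        }
    }
}
if sps == nil || pps == nil {
    // Replace with proper error handling in real code
    panic("no usable SPS/PPS found in the existing stream")
}

// Create a single-color black grid matching the frame dimensions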
w := int(sps.Width)
h := int(sps.Height)
blackY := uint8(16)  // limited range black
if sps.VUI != nil && sps.VUI.VideoFullRangeFlag {
    blackY = 0       // full range black
}
grid, colors := yuv.SolidGrid(w, h, yuv.Color{Y: blackY, Cb: 128, Cr: 128})

// Encode a black IDR frame using parameters matching the existing SPS/PPS
p := encode.EncodeParams{
    Width:  w,
    Height: h,
    QP:     26,
    CABAC:  pps.EntropyCodingModeFlag,
}
idrSlice, _ := encode.GenerateIDR(p, grid, colors, 0)

// Encode a P_Skip slice (copies the IDR frame unchanged)
pSkipSlice, _ := encode.EncodePSkipSlice(sps, pps, 1, 0)

// Append to the original stream
stream := append(existingStream, idrSlice...)
stream = append(stream, pSkipSlice...)

Architecture

pkg/decoder/       — Public: top-level decoder API (DecodeAnnexB, DecodeAVC, etc.)
pkg/encode/        — Public: bitstream generator API (flat-color I_16x16 IDR + P_Skip)
pkg/frame/         — Public: Frame type (decoded output)
pkg/yuv/           — Public: Grid, ColorMap, PlaneGrid (encode input), YUV/Y4M/PNG output
internal/cabac/    — Internal: CABAC arithmetic decoder and encoder engines
internal/cavlc/    — Internal: CAVLC bitstream reader, VLC tables, residual decoder
internal/context/  — Internal: Context model initialization (1024 contexts)
internal/slice/    — Internal: Slice data parsing, MB type decoding, residual decoding
internal/transform/— Internal: Inverse quantization and transform (4x4, 8x8, DC)
internal/pred/     — Internal: Intra prediction modes (4x4, 8x8, 16x16, chroma)
cmd/hi264dec/      — CLI: decode H.264 from raw .264 or MP4 containers
cmd/hi264gen/      — CLI: generate H.264 bitstreams or raw images from grid patterns
examples/          — Example grid image files
tools/             — Test generation and verification scripts
testdata/          — Golden H.264 bitstreams for regression testing

Dependencies

Support

Join our community on Slack where you can post any questions regarding any of our open source projects. Eyevinn's consulting business can also offer you:

  • Further development of this component
  • Customization and integration of this component into your platform
  • Support and maintenance agreement

Contact sales@eyevinn.se if you are interested.

About Eyevinn Technology

Eyevinn Technology is an independent consulting firm specialized in video and streaming. We are independent in the sense that we are not commercially tied to any platform or technology vendor. As our way to innovate and push the industry forward, we develop proof-of-concepts and tools. We share what we learn and the code we write with the industry through blogs and by open sourcing our code.

Want to know more about Eyevinn and what it is like to work here? Contact us at work@eyevinn.se!
