blockchain-zc-parser

A zero-copy, allocation-free parser for Bitcoin blockchain binary data written in Rust, designed for high-throughput indexers, analytics engines, and embedded environments.

A low-level Bitcoin binary parser focused on performance, memory locality, and predictable streaming behavior.

Features


| Feature | Details |
| --- | --- |
| Zero-copy | All parsed structures borrow `&'a [u8]` directly from the input: no `memcpy`, no `String`, no `Vec`. |
| No alloc | Compatible with `#![no_std]` targets. Use in embedded devices, WASM, kernel modules. |
| Streaming | `BlockTxIter` and `TransactionParser` process transactions lazily via closures and never load an entire block into structured memory. |
| Fast | Parsing an 80-byte block header requires only ~10 integer reads from a contiguous buffer. Block file iteration is a tight loop over magic bytes and size fields. |
| Safe | `unsafe` is used only inside `cursor.rs` for pointer arithmetic after explicit bounds checks. Every `unsafe` block is annotated. |

Comparison

| Feature | blockchain-zc-parser | rust-bitcoin |
| --- | --- | --- |
| Zero-copy | ✅ | Partial |
| Alloc-free parsing | ✅ | ❌ |
| Streaming block iteration | ✅ | ❌ |
| Full protocol model | ❌ | ✅ |

Supported formats

- Block headers (80 bytes, Bitcoin protocol)
- Legacy and SegWit (BIP 141) transactions
- Bitcoin script pattern matching: P2PKH, P2SH, P2WPKH, P2WSH, P2TR, P2PK, OP_RETURN, bare multisig
- `blkNNNNN.dat` raw block files written by Bitcoin Core
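
Most of the script types listed above are fixed byte templates, so they can be recognised from length and opcode checks alone. The following is a minimal sketch of that idea and not the crate's actual `script.rs`: the `Kind` enum and `classify` function are hypothetical names introduced here, and P2PK and bare multisig are left out for brevity.

```rust
/// Hypothetical classification result; each payload borrows from the script.
#[derive(Debug, PartialEq)]
pub enum Kind<'a> {
    /// 20-byte public key hash.
    P2pkh(&'a [u8]),
    /// 20-byte script hash.
    P2sh(&'a [u8]),
    /// 20-byte version-0 witness program.
    P2wpkh(&'a [u8]),
    /// 32-byte version-0 witness program.
    P2wsh(&'a [u8]),
    /// 32-byte version-1 witness program (Taproot output key).
    P2tr(&'a [u8]),
    /// Everything after the OP_RETURN opcode.
    OpReturn(&'a [u8]),
    Unknown,
}

/// Classifies a scriptPubKey by matching its byte template.
pub fn classify(script: &[u8]) -> Kind<'_> {
    match script {
        // OP_DUP OP_HASH160 PUSH20 <hash> OP_EQUALVERIFY OP_CHECKSIG
        [0x76, 0xa9, 0x14, hash @ .., 0x88, 0xac] if hash.len() == 20 => Kind::P2pkh(hash),
        // OP_HASH160 PUSH20 <hash> OP_EQUAL
        [0xa9, 0x14, hash @ .., 0x87] if hash.len() == 20 => Kind::P2sh(hash),
        // OP_0 PUSH20 <program>
        [0x00, 0x14, prog @ ..] if prog.len() == 20 => Kind::P2wpkh(prog),
        // OP_0 PUSH32 <program>
        [0x00, 0x20, prog @ ..] if prog.len() == 32 => Kind::P2wsh(prog),
        // OP_1 PUSH32 <key>
        [0x51, 0x20, key @ ..] if key.len() == 32 => Kind::P2tr(key),
        // OP_RETURN <anything>
        [0x6a, rest @ ..] => Kind::OpReturn(rest),
        _ => Kind::Unknown,
    }
}
```

Because every variant holds a sub-slice of the input, classification performs no allocation and the hash bytes are never copied.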

Quick start

[dependencies]
blockchain-zc-parser = "0.1"

Parse a block header

use blockchain_zc_parser::{Cursor, BlockHeader};

fn parse(raw_80_bytes: &[u8]) -> blockchain_zc_parser::ParseResult<()> {
    let mut cursor = Cursor::new(raw_80_bytes);
    let header = BlockHeader::parse(&mut cursor)?;

    println!("version   = {}", header.version);
    println!("timestamp = {}", header.timestamp);
    println!("nonce     = {:#010x}", header.nonce);
    println!("prev_hash = {}", header.prev_block);   // Display impl, no alloc

    // header.prev_block borrows its 32 bytes from raw_80_bytes; nothing is copied.
    Ok(())
}
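
The fields printed above sit at fixed offsets inside the 80-byte header, which is why parsing needs only a handful of little-endian reads. The sketch below illustrates that layout using only the standard library. `HeaderView` and `parse_header` are hypothetical names and do not reflect the crate's `BlockHeader` internals.

```rust
use std::convert::TryInto;

/// Hypothetical borrowed view of an 80-byte block header.
#[derive(Debug)]
pub struct HeaderView<'a> {
    pub version: i32,
    pub prev_block: &'a [u8; 32],
    pub merkle_root: &'a [u8; 32],
    pub timestamp: u32,
    pub bits: u32,
    pub nonce: u32,
}

fn le_u32(bytes: &[u8]) -> u32 {
    // Callers always pass exactly four bytes, so the conversion cannot fail.
    u32::from_le_bytes(bytes.try_into().expect("4-byte slice"))
}

/// Returns `None` if fewer than 80 bytes are available; extra bytes are ignored.
pub fn parse_header(raw: &[u8]) -> Option<HeaderView<'_>> {
    let raw: &[u8; 80] = raw.get(..80)?.try_into().ok()?;
    Some(HeaderView {
        version: le_u32(&raw[0..4]) as i32,
        // The two hashes are borrowed in place rather than copied out.
        prev_block: raw[4..36].try_into().ok()?,
        merkle_root: raw[36..68].try_into().ok()?,
        timestamp: le_u32(&raw[68..72]),
        bits: le_u32(&raw[72..76]),
        nonce: le_u32(&raw[76..80]),
    })
}
```

The two hash fields are references into the input, so the view cannot outlive the buffer it was parsed from; the compiler enforces this through the `'a` lifetime.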

Example: parse a raw block file

Download a raw block (example: Bitcoin genesis block):

curl -L \
"https://mempool.space/api/block/000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f/raw" \
-o genesis.bin

Run the example parser:

cargo run --example parse_block -- genesis.bin

Summary-only mode (no per-transaction printing):

cargo run --example parse_block -- --summary genesis.bin

Limit printed transactions:

cargo run --example parse_block -- --limit-tx 5 genesis.bin

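Print a specific transaction index (the genesis block contains only its coinbase transaction, so an index such as 100 requires a larger block):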

cargo run --example parse_block -- --tx 100 genesis.bin

Raw block vs `blkNNNNN.dat`

There are two different binary formats you may encounter:

Raw block (`.bin`, RPC, mempool API)

This is the pure Bitcoin block payload:

[80-byte header]
[varint tx_count]
[transactions...]

It contains no magic bytes and no size prefix.
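
The `varint tx_count` field uses Bitcoin's CompactSize encoding: values below `0xfd` occupy one byte, while the tags `0xfd`, `0xfe`, and `0xff` announce a following 2-, 4-, or 8-byte little-endian integer. As a rough illustration, independent of the crate's own API, the layout above could be split like this. `read_compact_size` and `split_raw_block` are hypothetical helper names, and the sketch does not reject non-canonical encodings.

```rust
use std::convert::TryInto;

/// Decodes a CompactSize integer from the front of `data`.
/// Returns the value and the number of bytes consumed, or `None` if `data` is too short.
pub fn read_compact_size(data: &[u8]) -> Option<(u64, usize)> {
    let (&tag, rest) = data.split_first()?;
    match tag {
        0xfd => Some((u16::from_le_bytes(rest.get(..2)?.try_into().ok()?) as u64, 3)),
        0xfe => Some((u32::from_le_bytes(rest.get(..4)?.try_into().ok()?) as u64, 5)),
        0xff => Some((u64::from_le_bytes(rest.get(..8)?.try_into().ok()?), 9)),
        small => Some((small as u64, 1)),
    }
}

/// Splits a raw block into its 80-byte header, transaction count, and transaction bytes.
pub fn split_raw_block(raw: &[u8]) -> Option<(&[u8], u64, &[u8])> {
    let header = raw.get(..80)?;
    // `raw` holds at least 80 bytes here, so these index operations cannot panic.
    let (tx_count, used) = read_compact_size(&raw[80..])?;
    Some((header, tx_count, &raw[80 + used..]))
}
```

All three returned pieces are sub-slices of `raw`, so splitting a block this way costs a few bounds checks and no copies.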

You typically obtain it via:

curl -L \
"https://mempool.space/api/block/<blockhash>/raw" \
-o block.bin

This format can be parsed directly with:

let (header, iter) = BlockTxIter::new(raw_block_bytes)?;

Bitcoin Core `blkNNNNN.dat`

Files in your local Bitcoin Core data directory:

~/.bitcoin/blocks/blk00000.dat

Each file contains multiple blocks, each prefixed by:

[4-byte magic][4-byte little-endian size][raw block]

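To parse these files, use `BlkFileIter`:

use blockchain_zc_parser::BlockTxIter;
use blockchain_zc_parser::block::{BlkFileIter, MAINNET_MAGIC};

fn scan_file(file_bytes: &[u8]) -> blockchain_zc_parser::ParseResult<()> {
    let mut it = BlkFileIter::new(file_bytes, MAINNET_MAGIC);
    // Each item is one raw block with the magic and size prefix already stripped.
    while let Some(raw_block) = it.next_block()? {
        let (_header, mut tx_iter) = BlockTxIter::new(raw_block)?;
        // process block...
    }
    Ok(())
}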
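
To make the record framing concrete, here is a standard-library-only sketch of walking `[magic][size][raw block]` entries. It is not how `BlkFileIter` is implemented: the `Records` type is a hypothetical stand-in, and stopping silently on a magic mismatch (for example at zero padding) is an assumption of this sketch.

```rust
use std::convert::TryInto;

/// Mainnet network magic as it appears on disk.
pub const MAINNET: [u8; 4] = [0xf9, 0xbe, 0xb4, 0xd9];

/// Hypothetical walker over `[magic][size][raw block]` records.
pub struct Records<'a> {
    rest: &'a [u8],
    magic: [u8; 4],
}

impl<'a> Records<'a> {
    pub fn new(file: &'a [u8], magic: [u8; 4]) -> Self {
        Records { rest: file, magic }
    }
}

impl<'a> Iterator for Records<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<&'a [u8]> {
        let data = self.rest;
        // Stop at end of data, on a magic mismatch, or when the declared
        // size runs past the end of the buffer.
        let prefix = data.get(..8)?;
        if prefix[..4] != self.magic {
            return None;
        }
        let size = u32::from_le_bytes(prefix[4..8].try_into().ok()?) as usize;
        let end = 8usize.checked_add(size)?;
        let block = data.get(8..end)?;
        self.rest = &data[end..];
        Some(block)
    }
}
```

Each yielded slice is a window into the original file buffer, which is what makes this pattern a natural fit for memory-mapped files.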

Why this crate exists

Most blockchain parsers:

- Allocate `Vec`s for every transaction
- Copy script bytes into owned buffers
- Build large in-memory representations
- Optimize for convenience over throughput

blockchain-zc-parser takes the opposite approach:

- Every structure borrows directly from `&[u8]`
- No heap allocation in parsing paths
- Streaming transaction iteration
- Minimal and auditable `unsafe`
- Designed for memory-mapped `blkNNNNN.dat` processing

If you care about throughput, memory locality, and predictable performance, this crate is built for you.

When not to use this crate

If you need:

- Full Bitcoin protocol validation
- Address encoding/decoding
- PSBT, descriptors, miniscript
- Wallet functionality

Use rust-bitcoin instead.

Important

If you pass a blkNNNNN.dat file directly to BlockTxIter::new, parsing will fail because the file contains magic bytes and size prefixes.

The parse_block example automatically detects and unwraps the first block from a blkNNNNN.dat file if necessary.

Why zero-copy matters

Bitcoin blocks are commonly 1–2 MB in serialized size (consensus caps them at 4 million weight units) and may contain thousands of transactions.

A traditional parser typically:

- Allocates `Vec`s for inputs and outputs
- Copies script bytes into owned buffers
- Builds full in-memory representations

blockchain-zc-parser avoids all of this.

Every parsed structure borrows directly from the original &[u8] buffer. No heap allocations. No memcpy. No string building.

This has several practical consequences:

- High throughput (hundreds of MB/s on modern CPUs)
- Very low memory usage
- Suitable for streaming, indexers, and embedded environments
- Works naturally with memory-mapped files (mmap)

For indexers and blockchain analytics pipelines, this allows processing entire block files with near-sequential memory access patterns.

Stream transactions from a block

use blockchain_zc_parser::{BlockTxIter, script::ScriptType};

fn scan_block(raw_block: &[u8]) -> blockchain_zc_parser::ParseResult<u64> {
    let (_header, mut iter) = BlockTxIter::new(raw_block)?;
    let mut total_satoshis: u64 = 0;

    while iter.next_tx(
        |_input| Ok(()),              // called for every TxInput
        |output| {                    // called for every TxOutput
            total_satoshis += output.value;
            if let ScriptType::P2WPKH { pubkey_hash } = output.script_pubkey.script_type() {
                // pubkey_hash: &[u8; 20] — zero-copy pointer into raw_block
                println!("  P2WPKH output to {:?}", pubkey_hash);
            }
            Ok(())
        },
    )? {}

    Ok(total_satoshis)
}
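
The closure-driven shape of `next_tx` can be illustrated without the crate. The sketch below walks a run of serialized outputs (8-byte little-endian value, script length, script bytes) and passes each one to a callback as a borrowed view, collecting nothing. `OutputView` and `for_each_output` are hypothetical names, and script lengths are assumed to fit in a single byte to keep the example short.

```rust
use std::convert::TryInto;

/// Hypothetical borrowed output view; `script` points into the input buffer.
pub struct OutputView<'a> {
    pub value: u64,
    pub script: &'a [u8],
}

/// Walks `count` serialized outputs and hands each one to `on_output`.
/// Returns the unparsed remainder, or `None` on truncated input.
pub fn for_each_output<'a, F>(
    mut data: &'a [u8],
    count: usize,
    mut on_output: F,
) -> Option<&'a [u8]>
where
    F: FnMut(OutputView<'a>),
{
    for _ in 0..count {
        let value = u64::from_le_bytes(data.get(..8)?.try_into().ok()?);
        let len = *data.get(8)? as usize;
        if len >= 0xfd {
            // Multi-byte CompactSize lengths are out of scope for this sketch.
            return None;
        }
        let script = data.get(9..9 + len)?;
        on_output(OutputView { value, script });
        data = &data[9 + len..];
    }
    Some(data)
}
```

Only one `OutputView` exists at a time, so memory use stays constant regardless of how many outputs the buffer holds.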

Iterate over a Bitcoin Core `blkNNNNN.dat` file

use blockchain_zc_parser::block::{BlkFileIter, MAINNET_MAGIC};

fn count_blocks(file_bytes: &[u8]) -> usize {
    let mut iter = BlkFileIter::new(file_bytes, MAINNET_MAGIC);
    let mut count = 0;
    // Note: this loop also ends on the first Err; match on it explicitly to surface parse errors.
    while let Ok(Some(_raw_block)) = iter.next_block() {
        count += 1;
    }
    count
}

Architecture

src/
├── lib.rs          — crate root, re-exports
├── cursor.rs       — zero-copy Cursor<'a> over &[u8]  ← start here
├── error.rs        — ParseError enum, no_std compatible
├── hash.rs         — Hash32<'a> / Hash20<'a> wrappers
├── script.rs       — Script<'a>, ScriptType, instruction iterator
├── transaction.rs  — TxInput, TxOutput, OutPoint, TransactionParser
└── block.rs        — BlockHeader, BlockTxIter, BlkFileIter

The Cursor type is the single entry point for all parsing. It advances a usize offset into a &'a [u8] and returns sub-slices with lifetime 'a — identical to the original input. No unsafe code exists outside this file.
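
The key property described above is that returned sub-slices carry the input lifetime `'a` rather than the lifetime of the cursor borrow. A minimal sketch of that pattern follows, written in fully safe Rust; `SliceCursor` is a hypothetical type and deliberately not named `Cursor`, since the crate's real implementation uses `get_unchecked` after its own bounds check.

```rust
use std::convert::TryInto;

/// Hypothetical bounds-checked cursor; every returned slice borrows for `'a`.
pub struct SliceCursor<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> SliceCursor<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        SliceCursor { data, pos: 0 }
    }

    /// Returns the next `n` bytes and advances, or `None` if fewer remain.
    /// On failure the position is left unchanged.
    pub fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let end = self.pos.checked_add(n)?; // guard against offset overflow
        let slice = self.data.get(self.pos..end)?; // bounds-checked slicing
        self.pos = end;
        Some(slice)
    }

    /// Reads a little-endian `u32` and advances by four bytes.
    pub fn read_u32_le(&mut self) -> Option<u32> {
        Some(u32::from_le_bytes(self.take(4)?.try_into().ok()?))
    }

    /// Number of bytes not yet consumed.
    pub fn remaining(&self) -> usize {
        self.data.len() - self.pos
    }
}
```

Because `take` returns `&'a [u8]`, a parsed slice stays valid after the cursor itself has been dropped, as long as the underlying buffer is alive.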

Core design principles

- Zero-copy first: data is never duplicated.
- Streaming over materialization: process blocks incrementally.
- `no_std` compatible: works outside of full OS environments.
- Explicit safety: all `unsafe` is documented and bounded.
- Performance transparency: benchmarked and reproducible.

Benchmarks

Run on an Apple M2 Pro (single-core, Rust stable 1.88 at time of measurement, --release):

| Benchmark | Throughput |
| --- | --- |
| `block_header/parse_80_bytes` | ~1.1 GB/s |
| `transaction/parse/coinbase` | ~860 MB/s |
| `transaction/parse/p2pkh_2out` | ~740 MB/s |
| `block/streaming_iter/tx_count=1000` | ~695 MB/s |

Run yourself:

cargo bench
# HTML report: target/criterion/report/index.html

`no_std` usage

Disable the default feature set (which enables std):

[dependencies]
blockchain-zc-parser = { version = "0.1", default-features = false }

With default-features = false:

- All `std::error::Error` impls are removed.
- `BlockHeader::block_hash()` (requires SHA-256) is removed; compute the hash yourself by applying SHA-256 twice to `header.raw` (for example with `sha2::Sha256`).
- Everything else works identically.

Minimum supported Rust version (MSRV)

Rust 1.88+ (edition 2021). The crate uses only stable Rust features.

Safety

The only unsafe code lives in src/cursor.rs:

// SAFETY: `end` was checked to be ≤ data.len() on the line above.
let slice = unsafe { self.data.get_unchecked(self.pos..end) };

All other code is safe Rust. The crate passes cargo miri test (run it yourself with cargo +nightly miri test).

Contributing

Pull requests are welcome. Please:

- Run:

   cargo test
   cargo clippy --all-targets --all-features -- -D warnings

- Add a unit test for any new parsing logic.
- Keep `unsafe` blocks minimal and documented.

License

This project is licensed under the Apache License 2.0 — see the LICENSE file for details.
