Optimization: optional variable-time secp256k1 signing #19

@koko1123

Summary

Add an optional variable-time secp256k1 signing path with precomputed generator tables to narrow the 4.09x gap against alloy/k256-rs.

Current Performance

| Benchmark | eth.zig | alloy.rs | Gap |
|---|---|---|---|
| secp256k1_sign | 112,061 ns | 27,372 ns | 4.09x loss |
| secp256k1_sign_recover | 254,525 ns | 119,700 ns | 2.13x loss |

This is our largest performance gap.

Root Cause

eth.zig uses Zig's stdlib std.crypto.ecc.Secp256k1, which performs constant-time scalar multiplication to prevent timing side channels. alloy uses k256-rs, which uses variable-time operations with precomputed lookup tables.

The constant-time approach processes all 256 bits of the scalar regardless of its value. The variable-time approach skips zero windows and adds precomputed multiples of the generator point instead of computing them on the fly.
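
To make the distinction concrete, here is a minimal sketch of the two ways a precomputed table can be read. It uses plain `u64` entries as stand-ins for curve points, and both function names are hypothetical; neither is taken from eth.zig or the Zig stdlib. A production constant-time selection would rely on vetted primitives rather than this illustration.

```zig
const std = @import("std");

/// Variable-time lookup: a single indexed load. The address that is read
/// depends on `index`, which is what cache-timing attacks can observe.
fn lookupVariableTime(table: []const u64, index: usize) u64 {
    return table[index];
}

/// Constant-time style lookup: read every entry and keep the matching one
/// with a mask, so the memory access pattern does not depend on `index`.
fn lookupConstantTime(table: []const u64, index: usize) u64 {
    var result: u64 = 0;
    for (table, 0..) |entry, i| {
        const bit: u64 = @intFromBool(i == index);
        // mask is all ones when i == index and all zeros otherwise.
        const mask = @as(u64, 0) -% bit;
        result |= entry & mask;
    }
    return result;
}
```

The constant-time version does work proportional to the table size on every lookup, which is part of the cost the variable-time path avoids.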

Security Context

Constant-time (current): Required for hot wallets, hardware wallets, and any scenario where an attacker could measure signing latency and use it to extract the private key.

Variable-time (proposed): Suitable for:

  • Trading bots (signing happens locally, no network timing exposure)
  • MEV searchers (latency matters more than side-channel resistance)
  • Batch signing (offline contexts)

NOT suitable for:

  • User-facing wallets
  • HSMs or key management systems
  • Any context where signing latency is observable by adversaries

Proposed Approach

1. Precomputed Generator Table

Build a table of small multiples of G for every 4-bit window, table[w][i] = i * 2^(4w) * G for w in 0..63 and i in 0..15, at comptime or as a static:

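const std = @import("std");
const Secp256k1 = std.crypto.ecc.Secp256k1;
const AffinePoint = Secp256k1.AffineCoordinates;

const WINDOW_SIZE = 4; // Process 4 bits at a time
const TABLE_SIZE = 1 << WINDOW_SIZE; // 16 entries per window
const NUM_WINDOWS = 256 / WINDOW_SIZE; // 64 windows

/// Precomputed table: table[w][i] = i * 2^(w*WINDOW_SIZE) * G
/// A container-level initializer is evaluated at comptime; this one likely needs
/// a large @setEvalBranchQuota, otherwise fill the table once at init.
const generator_table: [NUM_WINDOWS][TABLE_SIZE]AffinePoint = precomputeTable();

/// Method names (identityElement, add, dbl, affineCoordinates) are assumed to
/// match std.crypto.ecc.Secp256k1; check them against the Zig version in use.
fn precomputeTable() [NUM_WINDOWS][TABLE_SIZE]AffinePoint {
    var table: [NUM_WINDOWS][TABLE_SIZE]AffinePoint = undefined;
    var base = Secp256k1.basePoint; // 2^(w*WINDOW_SIZE) * G for the current window
    for (0..NUM_WINDOWS) |w| {
        var acc = Secp256k1.identityElement; // running sum i * base (projective)
        table[w][0] = AffinePoint.identityElement;
        for (1..TABLE_SIZE) |i| {
            acc = acc.add(base);
            table[w][i] = acc.affineCoordinates();
        }
        // Advance base by 2^WINDOW_SIZE
        for (0..WINDOW_SIZE) |_| base = base.dbl();
    }
    return table;
}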

2. Windowed Scalar Multiplication

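/// Variable-time scalar multiplication using precomputed tables.
/// WARNING: Not constant-time. Do not use in side-channel-sensitive contexts.
/// `scalar` is assumed to be big-endian. Secp256k1 is std.crypto.ecc.Secp256k1.
pub fn fastBaseMul(scalar: [32]u8) Secp256k1 {
    var result = Secp256k1.identityElement;
    for (0..NUM_WINDOWS) |w| {
        // Extract 4-bit window from scalar. extractWindow is a helper (not
        // defined in this issue) that returns WINDOW_SIZE bits starting at the
        // given bit offset, counted from the least significant bit.
        const bits = extractWindow(scalar, w * WINDOW_SIZE, WINDOW_SIZE);
        if (bits != 0) { // Variable-time skip!
            // Table entries are affine, so use the mixed addition.
            result = result.addMixed(generator_table[w][bits]);
        }
    }
    return result;
}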
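
The table layout and the windowed loop can be checked without any curve arithmetic. The sketch below swaps the curve for a toy group (integers modulo a small prime under addition) so that it runs on its own; `MODULUS`, `G`, `groupAdd`, `buildTable`, `windowedMul` and `doubleAndAdd` are hypothetical names introduced for illustration, and the `extractWindow` body is one possible reading of the helper used above, assuming a big-endian scalar and windows that never straddle a byte.

```zig
const std = @import("std");

const WINDOW_SIZE = 4;
const TABLE_SIZE = 1 << WINDOW_SIZE;
const NUM_WINDOWS = 256 / WINDOW_SIZE;

// Toy group standing in for curve points: integers mod a prime under addition.
// Not cryptographic; it only exercises the indexing logic.
const MODULUS: u64 = 1_000_000_007;
const G: u64 = 123_456_789;

fn groupAdd(a: u64, b: u64) u64 {
    return (a + b) % MODULUS;
}

/// Returns `width` bits (width < 8) of a big-endian scalar, starting at bit
/// `offset` counted from the least significant bit. Assumes the window stays
/// inside one byte, which holds for width 4 and offsets that are multiples of 4.
fn extractWindow(scalar: [32]u8, offset: usize, width: usize) usize {
    const byte = scalar[31 - offset / 8];
    const shift: u3 = @intCast(offset % 8);
    const mask = (@as(u8, 1) << @as(u3, @intCast(width))) - 1;
    return (byte >> shift) & mask;
}

/// table[w][i] = i * 2^(w*WINDOW_SIZE) * G in the toy group.
fn buildTable() [NUM_WINDOWS][TABLE_SIZE]u64 {
    var table: [NUM_WINDOWS][TABLE_SIZE]u64 = undefined;
    var base = G;
    for (0..NUM_WINDOWS) |w| {
        table[w][0] = 0; // identity element
        for (1..TABLE_SIZE) |i| table[w][i] = groupAdd(table[w][i - 1], base);
        // Advance base by 2^WINDOW_SIZE via repeated doubling.
        for (0..WINDOW_SIZE) |_| base = groupAdd(base, base);
    }
    return table;
}

/// Windowed multiplication: one table addition per non-zero window.
fn windowedMul(table: *const [NUM_WINDOWS][TABLE_SIZE]u64, scalar: [32]u8) u64 {
    var result: u64 = 0;
    for (0..NUM_WINDOWS) |w| {
        const bits = extractWindow(scalar, w * WINDOW_SIZE, WINDOW_SIZE);
        if (bits != 0) result = groupAdd(result, table[w][bits]);
    }
    return result;
}

/// Reference: bit-by-bit double-and-add, most significant bit first.
fn doubleAndAdd(scalar: [32]u8) u64 {
    var result: u64 = 0;
    for (scalar) |byte| {
        for (0..8) |j| {
            result = groupAdd(result, result);
            const bit_index: u3 = @intCast(7 - j);
            if (((byte >> bit_index) & 1) == 1) result = groupAdd(result, G);
        }
    }
    return result;
}
```

Comparing `windowedMul` against `doubleAndAdd` on a few scalars is a cheap way to catch endianness or window-offset mistakes before moving to real curve points.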

3. API Design

pub const SignOptions = struct {
    /// Use variable-time signing for maximum speed.
    /// WARNING: Introduces timing side-channels. Only use for
    /// trading bots, offline signing, or contexts where signing
    /// latency is not observable by adversaries.
    fast_variable_time: bool = false,
};

pub fn sign(private_key: [32]u8, msg_hash: [32]u8, options: SignOptions) !Signature {
    if (options.fast_variable_time) {
        return signFast(private_key, msg_hash);
    }
    return signConstantTime(private_key, msg_hash);
}

Expected Gain

| Approach | Expected speedup | Notes |
|---|---|---|
| 4-bit windowed with precomputed table | 2-3x | Standard technique, well-understood |
| wNAF (width-w Non-Adjacent Form) | 3-4x | More complex, better for larger windows |
| GLV endomorphism + windowed | 3.5-4x | Exploits curve structure, close to k256-rs |

Realistic target: 2-3x improvement on secp256k1_sign (from ~112 µs down to roughly 56-37 µs), narrowing the gap to ~1.4-2x.

Important Considerations

  • The API MUST default to constant-time (fast_variable_time: false)
  • Document the security tradeoff clearly in doc comments and README
  • Consider putting variable-time code behind a build flag so it's not even compiled by default
  • The precomputed table as laid out above is ~64 KB (64 windows x 16 entries x 64 bytes per affine point), which is acceptable for server/desktop but might matter for embedded
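
One way the first and third points could fit together is sketched below. The constant `enable_variable_time` is a hypothetical stand-in for a value that would normally come from build options, `error.VariableTimeDisabled` is an invented error name, and the signing functions are stubs that only record which path was taken. Because the flag is comptime-known, only the taken branch of the inner `if` is analyzed, so `signFast` should not be compiled when the flag is off.

```zig
const std = @import("std");

// Hypothetical stand-in for a build option wired up in build.zig.
const enable_variable_time = false;

const SignError = error{VariableTimeDisabled};

pub const SignOptions = struct {
    /// Defaults to the constant-time path.
    fast_variable_time: bool = false,
};

// Placeholder types and stub signers; real ones would produce a signature.
const Path = enum { constant_time, variable_time };
const Signature = struct { path: Path };

fn signConstantTime(private_key: [32]u8, msg_hash: [32]u8) Signature {
    _ = private_key;
    _ = msg_hash;
    return .{ .path = .constant_time };
}

fn signFast(private_key: [32]u8, msg_hash: [32]u8) Signature {
    _ = private_key;
    _ = msg_hash;
    return .{ .path = .variable_time };
}

pub fn sign(private_key: [32]u8, msg_hash: [32]u8, options: SignOptions) SignError!Signature {
    if (options.fast_variable_time) {
        if (enable_variable_time) {
            return signFast(private_key, msg_hash);
        } else {
            // Fail loudly instead of silently falling back to another path.
            return error.VariableTimeDisabled;
        }
    }
    return signConstantTime(private_key, msg_hash);
}
```

Returning an error when the fast path is requested but not built in is a design choice for this sketch; silently falling back to constant-time signing would also be safe, but it hides a misconfiguration from the caller.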
