@codeflash-ai codeflash-ai bot commented Oct 24, 2025

📄 18% (0.18x) speedup for _int64 in src/deepgram/extensions/telemetry/proto_encoder.py

⏱️ Runtime: 68.0 microseconds → 57.8 microseconds (best of 186 runs)

📝 Explanation and details

The optimized code achieves an 18% speedup (68.0 → 57.8 microseconds) through two key optimizations targeting the common case of small integer values in protobuf encoding:

1. Single-byte varint preallocation: Added _SINGLE_BYTE_VARINTS tuple containing pre-computed bytes objects for values 0-127. This eliminates the overhead of bytearray() allocation, bit manipulation loops, and bytes() conversion for small values - which are extremely common in protobuf field keys and small integers.

2. Inlined key computation in _key(): Instead of calling _varint(), the function now directly computes the key value and uses the fast path for single-byte results, avoiding function call overhead.

Why this works: Protobuf field numbers are typically small (1-15 are most common), and the wire type is always 0-5, so the combined key value (field_number << 3) | wire_type is less than 128, and therefore a single byte, whenever the field number is 15 or below. The test results show this optimization is most effective for small field numbers and values:

  • test_int64_basic_zero(): 58.6% faster
  • test_int64_basic_field_number_zero(): 62.8% faster
  • test_basic_zero_value(): 75.4% faster

The optimization does not help larger values, which still require the full varint encoding loop, and some of them regress (test_basic_larger_positive_value() is 17% slower). The overall benefit comes from the high frequency of small values in typical protobuf usage patterns.
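
To see which inputs can take the single-byte path at all, it helps to compute how many bytes each varint needs. The helpers below, `varint_len` and `hits_fast_path`, are hypothetical names introduced only for this illustration; they assume standard base-128 varint encoding and the 64-bit masking of negative values described in the test comments.

```python
def varint_len(value: int) -> int:
    """Bytes needed to encode a value as a base-128 varint after masking to 64 bits."""
    value &= (1 << 64) - 1  # negatives become their two's-complement form
    # Each byte carries 7 payload bits; zero still takes one byte.
    return max(1, (value.bit_length() + 6) // 7)


def hits_fast_path(field_number: int, value: int, wire_type: int = 0) -> tuple:
    """Return (key fits in one byte, value fits in one byte)."""
    key = (field_number << 3) | wire_type
    return varint_len(key) == 1, varint_len(value) == 1


print(hits_fast_path(1, 0))     # (True, True)
print(hits_fast_path(1, 300))   # (True, False)
print(hits_fast_path(31, 255))  # (False, False)
print(varint_len(-1))           # 10
```

Under these assumptions a negative int64 always encodes to 10 bytes, so its value can never use the lookup table, which is consistent with the near-zero change reported for the negative-value tests.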

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 30 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 1 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

# imports
import pytest
from deepgram.extensions.telemetry.proto_encoder import _int64

# unit tests

# --- Basic Test Cases ---

def test_int64_basic_zero():
    # Test encoding zero value
    # field_number=1, value=0
    codeflash_output = _int64(1, 0) # 1.74μs -> 1.10μs (58.6% faster)

def test_int64_basic_small_positive():
    # Test encoding small positive integer
    # field_number=2, value=5
    codeflash_output = _int64(2, 5) # 1.55μs -> 1.03μs (51.0% faster)

def test_int64_basic_small_negative():
    # Test encoding small negative integer
    # field_number=3, value=-1
    # In protobuf, negative numbers are encoded as their 2's complement
    codeflash_output = _int64(3, -1) # 3.34μs -> 3.29μs (1.52% faster)

def test_int64_basic_field_number_zero():
    # Test encoding with field_number=0
    # field_number=0, value=123
    codeflash_output = _int64(0, 123) # 1.65μs -> 1.01μs (62.8% faster)

def test_int64_basic_field_number_max():
    # Test encoding with field number 15, the largest whose key still fits in one byte
    # field_number=15, value=1
    codeflash_output = _int64(15, 1) # 1.52μs -> 1.04μs (45.8% faster)

# --- Edge Test Cases ---





def test_int64_edge_value_exceeds_64bit():
    # Test encoding value larger than 64 bits (should be truncated to 64 bits)
    value = (1 << 70) + 123
    codeflash_output = _int64(1, value & ((1 << 64) - 1)); expected = codeflash_output # 3.00μs -> 1.70μs (76.7% faster)
    codeflash_output = _int64(1, value) # 2.74μs -> 3.46μs (20.7% slower)

def test_int64_edge_value_is_none():
    # Test encoding None value should raise TypeError
    with pytest.raises(TypeError):
        _int64(1, None) # 2.65μs -> 2.10μs (26.4% faster)


def test_int64_edge_value_not_int():
    # Test encoding with non-integer value should raise TypeError
    with pytest.raises(TypeError):
        _int64(1, "not an int") # 3.85μs -> 2.60μs (48.0% faster)

def test_int64_edge_field_number_not_int():
    # Test encoding with non-integer field_number should raise TypeError
    with pytest.raises(TypeError):
        _int64("not an int", 123) # 1.50μs -> 1.53μs (2.35% slower)

# --- Large Scale Test Cases ---






#------------------------------------------------
from __future__ import annotations

# imports
import pytest  # used for our unit tests
from deepgram.extensions.telemetry.proto_encoder import _int64

# unit tests

# --- BASIC TEST CASES ---

def test_basic_small_positive_values():
    # Field 1, value 1
    # key: (1 << 3) | 0 = 8 -> varint(8) = b'\x08'
    # value: varint(1) = b'\x01'
    codeflash_output = _int64(1, 1) # 3.05μs -> 1.69μs (80.7% faster)
    # Field 2, value 123
    codeflash_output = _int64(2, 123) # 683ns -> 513ns (33.1% faster)
    # Field 15, value 42
    codeflash_output = _int64(15, 42) # 569ns -> 465ns (22.4% faster)

def test_basic_zero_value():
    # Field 3, value 0
    codeflash_output = _int64(3, 0) # 1.81μs -> 1.03μs (75.4% faster)

def test_basic_larger_positive_value():
    # Field 1, value 300
    # varint(300) = b'\xac\x02'
    codeflash_output = _int64(1, 300) # 2.37μs -> 2.86μs (17.0% slower)
    # Field 7, value 16384
    # varint(16384) = b'\x80\x80\x01'
    codeflash_output = _int64(7, 16384) # 1.25μs -> 1.25μs (0.000% faster)

def test_basic_multiple_fields():
    # Field 5, value 1000
    # varint(1000) = b'\xe8\x07'
    codeflash_output = _int64(5, 1000) # 1.92μs -> 1.91μs (0.366% faster)
    # Field 31, value 255
    # varint(255) = b'\xff\x01'
    codeflash_output = _int64(31, 255) # 1.06μs -> 1.27μs (16.1% slower)

# --- EDGE TEST CASES ---

def test_edge_negative_values():
    # Field 1, value -1
    # varint(-1 & ((1 << 64) - 1)) = varint(18446744073709551615)
    # varint(18446744073709551615) = b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\x01'
    codeflash_output = _int64(1, -1) # 3.29μs -> 3.22μs (2.08% faster)
    # Field 2, value -123456
    # varint(-123456 & ((1 << 64) - 1)) = varint(18446744073709428160)
    codeflash_output = _int64(2, -123456) # 1.92μs -> 1.75μs (9.66% faster)

def test_edge_maximum_int64():
    # Field 1, value 2**63 - 1
    max_int64 = 2**63 - 1
    codeflash_output = _int64(1, max_int64) # 2.73μs -> 2.70μs (1.11% faster)

def test_edge_minimum_int64():
    # Field 1, value -2**63
    min_int64 = -2**63
    # -2**63 & ((1 << 64) - 1) = 9223372036854775808
    codeflash_output = _int64(1, min_int64) # 3.07μs -> 3.09μs (0.615% slower)

def test_edge_field_number_zero():
    # Field number 0, value 123
    # key: (0 << 3) | 0 = 0 -> varint(0) = b'\x00'
    codeflash_output = _int64(0, 123) # 1.68μs -> 1.04μs (60.8% faster)


def test_edge_value_zero_and_field_zero():
    # Both field and value are zero
    codeflash_output = _int64(0, 0) # 2.92μs -> 1.60μs (82.7% faster)

def test_edge_non_integer_input():
    # Should raise TypeError if field_number or value is not int
    with pytest.raises(TypeError):
        _int64('a', 1) # 1.68μs -> 1.60μs (4.88% faster)
    with pytest.raises(TypeError):
        _int64(1, 'b') # 2.52μs -> 1.80μs (39.8% faster)
    with pytest.raises(TypeError):
        _int64(1.5, 2) # 801ns -> 761ns (5.26% faster)
    with pytest.raises(TypeError):
        _int64(2, 3.7) # 1.43μs -> 1.10μs (30.0% faster)

# --- LARGE SCALE TEST CASES ---





def test_large_scale_field_and_value_extremes():
    # Test with max field number and max/min values
    max_field = 2**29 - 1
    max_value = 2**64 - 1
    min_value = -2**63
    codeflash_output = _int64(max_field, max_value) # 5.47μs -> 5.51μs (0.853% slower)
    codeflash_output = _int64(max_field, min_value) # 2.58μs -> 2.66μs (2.93% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from deepgram.extensions.telemetry.proto_encoder import _int64

def test__int64():
    _int64(0, 0)
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| codeflash_concolic_7zeygj7s/tmpbo5k38g4/test_concolic_coverage.py::test__int64 | 1.69μs | 1.11μs | 52.8%✅ |

To edit these changes, run `git checkout codeflash/optimize-_int64-mh4izvm6`, commit your edits, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 24, 2025 07:24
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 24, 2025